CN113673325A - Multi-feature character emotion recognition method - Google Patents

Multi-feature character emotion recognition method

Info

Publication number
CN113673325A
CN113673325A
Authority
CN
China
Prior art keywords
input
dictionary
feature
sparse
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110793285.9A
Other languages
Chinese (zh)
Other versions
CN113673325B (en)
Inventor
钟谭媛
陈志�
李玲娟
岳文静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202110793285.9A priority Critical patent/CN113673325B/en
Publication of CN113673325A publication Critical patent/CN113673325A/en
Application granted granted Critical
Publication of CN113673325B publication Critical patent/CN113673325B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/28 Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-feature character emotion recognition method. The method first extracts local spatio-temporal features of the face and body in a video using a 3D convolutional neural network, then performs dictionary learning on the extracted feature vectors with the MOD algorithm under the framework of a sparse coding tree to obtain sparse codes, and finally trains SVM classifiers at the nodes of the sparse coding tree with the sparse codes as input, classifying repeatedly until an emotion representation of a single category is output. The invention adapts well to different scenes, has strong generalization ability, and also improves the accuracy of emotion recognition for people in videos with heavy occlusion.

Description

Multi-feature character emotion recognition method
Technical Field
The invention relates to the technical field of feature recognition, and in particular to a multi-feature character emotion recognition method.
Background
Emotion recognition is a rapidly developing and actively researched application area of computer vision, drawing on a range of related disciplines such as pattern recognition, machine learning, psychology and medicine. In recent years, emotion recognition has become an important research topic in computer vision and human-computer interaction, with significant theoretical and practical application value.
Emotion recognition for people in a video mainly involves the following techniques:
(1) 3D convolutional neural network (C3D): the extracted features encapsulate information about objects, scenes and actions in the video, making them useful for different tasks without fine-tuning the model for each task. C3D is a good descriptor: it is generic, compact, simple and efficient. The method uses a 3D convolutional network to extract local spatio-temporal features of the face and body of the person in the video, greatly improving efficiency and effectiveness;
(2) sparse coding tree: each node uses a node-specific dictionary and classifier to direct the input vector to its child nodes, which in turn have their own specialized dictionaries and classifiers, enabling progressively more accurate classification;
(3) MOD dictionary learning: a dictionary learning method (the Method of Optimal Directions) that iteratively updates the dictionary atoms during training so that the residual of the sparse representation keeps decreasing until a convergence condition is satisfied, finally producing a dictionary with good discriminative power;
(4) support vector machine (SVM): used to train the classifiers.
Based on the above, the invention provides a multi-feature character emotion recognition method based on facial expressions and body movements, aiming to improve the accuracy of emotion recognition for people in videos.
Disclosure of Invention
The purpose of the invention is as follows: the invention provides a multi-feature character emotion recognition method that first extracts local features of the face and body in a video using a 3D convolutional neural network, then performs dictionary learning on the extracted feature vectors with the MOD algorithm under the framework of a sparse coding tree to obtain sparse codes, and finally trains SVM classifiers at the nodes of the sparse coding tree with the sparse codes as input to complete emotion classification and recognition.
The technical scheme is as follows: to achieve the above purpose, the invention adopts the following technical scheme:
a multi-feature character emotion recognition method comprises the following steps:
Step S1, the user inputs a video; all frames of the video are traversed with a sampling stride of 1 frame to create multiple 16-frame clips; these 16-frame clips serve as the input to the 3D convolutional neural network;
Step S2, a 3D convolutional neural network is used to extract local features of the facial expressions and body movements of the people in the video; for each input, a 7 × 7 × 512 feature map is constructed at the conv5b layer, the spatial position of each feature is extracted separately, and the values at each spatial position are concatenated along the 512 channels to obtain the final local features of the input; the input video thus yields 7 × 7 = 49 local features in total, each of which is a 512-dimensional vector;
Step S3, for the input final local features, dictionary learning is performed at the root node of the sparse coding tree using the MOD algorithm; the MOD objective function is:

    min_{D, ω_i} Σ_i ||x_i − D ω_i||_2^2   subject to   ||ω_i||_0 ≤ T_0

where D = [g_1, g_2, …, g_n]^T denotes the dictionary matrix and g_i is a dictionary atom; x_i is an input feature vector; ω_i denotes the sparse coefficient vector of x_i over the dictionary atoms g_i; and T_0 is the maximum number of non-zero elements in the sparse representation coefficients;
Step S3.1, the training sample set is X = [x_1, x_2, …, x_N];
Step S3.2, initialize the dictionary: randomly construct an initial dictionary D^(0) ∈ R^(n×m) and normalize the columns of D^(0);
s3.3, approximating a solution by using a tracking algorithm to obtain a sparse coefficient omegaiThe following were used:
Figure BDA0003161868540000023
s3.4, according to the sparse coefficient matrix W(k)The dictionary is updated as follows:
Figure BDA0003161868540000024
Step S3.5, when the representation error ||X − D^(k) W^(k)||_F^2 is less than 10^(−6), stop the iteration and output the final dictionary D;
Step S4, learn classifiers using a support vector machine (SVM) and train the sparse coding tree; specifically,
s4.1, initializing a root node of the sparse coding tree into an active node a; at a, encoding the input local features into sparse codes using the dictionary D output at step S3; carrying out coarse classification on the coded input features at the active node a by adopting a Support Vector Machine (SVM) classifier;
s4.2, classifying according to a branch rule based on the rough classification label; and when branching to the next-level child node, taking the child node as the next active node a, repeating the sparse coding and coarse classification steps until all emotion classifications are finished, and outputting a final result.
Further, the branching rule in step S4.2 specifically includes:
When the coarse classification result consists of 2 or more confusable classes, the current node hands these samples over to a newly trained child node, which further sub-classifies the coarse result; finally, a recognition result containing only a single class is output.
Further, the 3D convolutional neural network in step S2 is configured to capture temporal and spatial feature information in the video, and comprises 8 convolutional layers, 5 pooling layers, 2 fully-connected layers and 1 softmax output layer; the 3D convolution kernels of all layers have size 3 × 3 × 3 and stride 1; the first pooling layer has size 1 × 2 and stride 1, and the remaining pooling layers have size 2 × 2 and stride 2; the input video is resized to 128 × 171 and cut into non-overlapping 16-frame clips that serve as the network input.
Further, in step S2, conv5b is the last convolutional layer of the 3D convolutional neural network and is used for feature visualization; its feature maps have a spatial size of 7 × 7 and 512 channels, and the layer produces two such feature maps.
Beneficial effects: compared with the prior art, the above technical scheme of the invention has the following technical effects:
the method comprises the steps of extracting local features of facial expressions and body actions of characters in videos, then performing dictionary learning on extracted feature vectors by using an MOD algorithm under the framework of a sparse coding tree to obtain sparse codes, finally training an SVM classifier at nodes of the sparse coding tree by using the sparse codes as input, continuously classifying, and finally outputting emotion representations of single categories; the invention can be well suitable for different scenes, has stronger generalization capability and can also improve the accuracy of human mood identification in the video of a multi-shielding environment.
(1) The invention uses a 3D convolutional neural network to extract local features, which encapsulate information about objects, scenes and actions in the video, greatly improving efficiency and effectiveness.
(2) The invention uses a sparse coding tree together with the MOD algorithm, classifying repeatedly at the nodes of the sparse coding tree and continuously reducing the error through MOD iterations, so that person emotion recognition is completed more accurately.
(3) The invention uses both facial expression features and body movement features as cues for emotion recognition, which improves recognition accuracy when a person's face or body is occluded in the video and strengthens generalization ability.
Drawings
FIG. 1 is a flow chart of a multi-feature character emotion recognition method provided by the present invention;
FIG. 2 is a schematic diagram of a sparse coding tree according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to an embodiment and the accompanying drawings.
The multi-feature character emotion recognition method shown in fig. 1 comprises the following steps:
and step S1, inputting a video by a user, traversing all frames of the video by using the sampling step length of 1 frame, and creating a plurality of clips with the length of 16 frames as the input of the 3D convolutional neural network.
Step S2: a 3D convolutional neural network is used to extract local features of the facial expressions and body movements of the people in the video. The 3D convolutional neural network captures temporal and spatial feature information in the video and comprises 8 convolutional layers, 5 pooling layers, 2 fully-connected layers and 1 softmax output layer; the 3D convolution kernels of all layers have size 3 × 3 × 3 and stride 1; the first pooling layer has size 1 × 2 and stride 1, and the remaining pooling layers have size 2 × 2 and stride 2; the video is resized to 128 × 171 and cut into non-overlapping 16-frame clips that serve as the network input.
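For concreteness, a minimal PyTorch sketch of a C3D-style backbone with the layer counts, kernel sizes and pooling arrangement described above is given here. The channel widths (64/128/256/512), the 4096-unit fully-connected layers, the 112 × 112 input crop and the pooling padding follow the commonly used C3D configuration and are assumptions rather than details stated in this embodiment.

```python
import torch
import torch.nn as nn

class C3DFeatures(nn.Module):
    """C3D-style backbone: 8 conv layers, 5 pooling layers, 2 FC layers, softmax output.

    All 3D convolution kernels are 3x3x3 with stride 1; the first pooling layer keeps
    the temporal dimension (kernel 1x2x2), the remaining pooling layers halve all
    dimensions (kernel 2x2x2, stride 2). Channel widths follow the usual C3D choices
    and are an assumption here.
    """

    def __init__(self, num_classes=4):
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(nn.Conv3d(cin, cout, 3, stride=1, padding=1),
                                 nn.ReLU(inplace=True))
        self.conv1a = block(3, 64)
        self.pool1 = nn.MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2))
        self.conv2a = block(64, 128)
        self.pool2 = nn.MaxPool3d(2, 2)
        self.conv3a, self.conv3b = block(128, 256), block(256, 256)
        self.pool3 = nn.MaxPool3d(2, 2)
        self.conv4a, self.conv4b = block(256, 512), block(512, 512)
        self.pool4 = nn.MaxPool3d(2, 2)
        self.conv5a, self.conv5b = block(512, 512), block(512, 512)
        self.pool5 = nn.MaxPool3d(2, 2, padding=(0, 1, 1))
        self.fc6 = nn.Linear(512 * 1 * 4 * 4, 4096)
        self.fc7 = nn.Linear(4096, 4096)
        self.fc8 = nn.Linear(4096, num_classes)   # softmax output layer

    def forward(self, x, return_conv5b=False):
        # x: (batch, 3, 16, 112, 112) clips; 112x112 crops of the resized frames (an assumption)
        x = self.pool1(self.conv1a(x))
        x = self.pool2(self.conv2a(x))
        x = self.pool3(self.conv3b(self.conv3a(x)))
        x = self.pool4(self.conv4b(self.conv4a(x)))
        x = self.conv5b(self.conv5a(x))           # (batch, 512, 2, 7, 7): two 7x7x512 feature maps
        if return_conv5b:
            return x                              # used for the local features of step S2
        x = torch.flatten(self.pool5(x), 1)
        x = torch.relu(self.fc7(torch.relu(self.fc6(x))))
        return torch.softmax(self.fc8(x), dim=1)
```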
For each input, a 7 × 7 × 512 feature map is constructed at the conv5b layer; the spatial position of each feature is extracted separately and the values at each spatial position are concatenated along the 512 channels to obtain the final local features of the input. The input video thus yields 7 × 7 = 49 local features in total, each of which is a 512-dimensional vector. conv5b is the last convolutional layer of the 3D convolutional neural network and is used for feature visualization; its feature maps have a spatial size of 7 × 7 and 512 channels, and the layer produces two such feature maps.
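The flattening of a conv5b activation into the 7 × 7 = 49 local 512-dimensional vectors could then look like the sketch below; averaging over the two temporal feature maps is an assumption, since this embodiment does not state how the two maps are combined.

```python
import numpy as np

def conv5b_to_local_features(conv5b):
    """Turn one conv5b activation into 49 local 512-dimensional feature vectors.

    `conv5b` is assumed to have shape (512, T, 7, 7). The T temporal maps are
    averaged (an assumption), then the values at each of the 7 x 7 spatial
    positions are taken along the 512 channels, giving a (49, 512) array.
    """
    conv5b = np.asarray(conv5b)
    c, t, h, w = conv5b.shape            # (512, T, 7, 7)
    pooled = conv5b.mean(axis=1)         # (512, 7, 7)
    return pooled.reshape(c, h * w).T    # (49, 512): one 512-d vector per spatial position
```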
Step S3: dictionary learning is performed on the input final local features at the root node of the sparse coding tree using the MOD algorithm. The MOD objective function is:

    min_{D, ω_i} Σ_i ||x_i − D ω_i||_2^2   subject to   ||ω_i||_0 ≤ T_0

where D = [g_1, g_2, …, g_n]^T denotes the dictionary matrix and g_i is a dictionary atom; x_i is an input feature vector; ω_i denotes the sparse coefficient vector of x_i over the dictionary atoms g_i; and T_0 is the maximum number of non-zero elements in the sparse representation coefficients. Specifically,
Step S3.1: the training sample set is X = [x_1, x_2, …, x_N];
Step S3.2: initialize the dictionary; randomly construct an initial dictionary D^(0) ∈ R^(n×m) and normalize the columns of D^(0);
s3.3, approximating a solution by using a tracking algorithm to obtain a sparse coefficient omegaiThe following were used:
Figure BDA0003161868540000043
s3.4, according to the sparse coefficient matrix W(k)The dictionary is updated as follows:
Figure BDA0003161868540000051
Step S3.5: when the representation error ||X − D^(k) W^(k)||_F^2 is less than 10^(−6), stop the iteration and output the final dictionary D.
In this step, the MOD algorithm alternately updates the sparse coefficient matrix W^(k) and the dictionary matrix D over the iterations. First, a pursuit algorithm is used to approximate the solution, which updates the sparse coefficients; the dictionary is then updated from the input local features and the sparse coefficient matrix W^(k); once the change in the sparse coefficient matrix W^(k) is small enough, the final dictionary D is obtained. A compact sketch of this loop is given below.
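The following numpy/scikit-learn sketch illustrates the MOD loop of step S3. Orthogonal matching pursuit stands in for the pursuit algorithm, and the dictionary size, sparsity level and iteration cap are illustrative parameters; none of these specific choices are fixed by this embodiment.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def mod_dictionary_learning(X, n_atoms=128, T0=5, tol=1e-6, max_iter=50, seed=0):
    """Method of Optimal Directions (MOD) dictionary learning.

    X : (n_features, n_samples) matrix whose columns are the local features x_i.
    Returns the learned dictionary D (n_features, n_atoms) and the sparse codes W.
    """
    rng = np.random.default_rng(seed)
    n_features, _ = X.shape

    # S3.2: random initial dictionary D(0) with unit-norm columns
    D = rng.standard_normal((n_features, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)

    prev_err = np.inf
    for _ in range(max_iter):
        # S3.3: sparse coding step, pursuit with at most T0 non-zero coefficients per sample
        W = orthogonal_mp(D, X, n_nonzero_coefs=T0)            # (n_atoms, n_samples)

        # S3.4: MOD dictionary update D = X W^T (W W^T)^(-1), pseudo-inverse for stability
        D = X @ W.T @ np.linalg.pinv(W @ W.T)
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12

        # S3.5: stop once the representation error (or its change) is below the tolerance
        err = np.linalg.norm(X - D @ W, 'fro') ** 2
        if err < tol or abs(prev_err - err) < tol:
            break
        prev_err = err
    return D, W
```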
And step S4, learning a classifier by using a Support Vector Machine (SVM), and training the sparse coding tree. As shown in fig. 2:
s4.1, initializing a root node of the sparse coding tree into an active node a; at a, encoding the input local features into sparse codes using the dictionary D output at step S3; and carrying out coarse classification on the coded input features at the active node a by adopting a Support Vector Machine (SVM) classifier. The SVM classifier is a rough classifier aiming at four emotions (anger, distraction, heart injury and neutrality) under study; at this time, the classifier only performs rough classification on the input data, and further classification is transmitted to the child node, so the classification is called rough classification.
Step S4.2: classify according to the branch rule based on the coarse classification labels. The branch rule is specifically as follows:
When the coarse classification result consists of 2 or more confusable classes, the current node hands these samples over to a newly trained child node, which further sub-classifies the coarse result; finally, a recognition result containing only a single class is output. For example, when the output of the coarse classifier contains only the class "angry", that class is output as a final result, while the remaining samples labelled "happy", "sad" and "neutral" are directed to the next new child node, where an SVM classifier is trained again.
When branching to a next-level child node, that child node becomes the next active node a, and the sparse coding and classification steps are repeated until all emotions are classified, after which the final result is output.
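As an illustration of steps S4.1 and S4.2, the sketch below trains one node of the sparse coding tree and recursively spawns a child node for the classes the node still confuses. `SparseCodeTreeNode` is a hypothetical name; the sketch reuses the `mod_dictionary_learning` function from the step S3 sketch, uses orthogonal matching pursuit for node-level sparse coding, takes a linear SVM as the node classifier, and treats per-class training accuracy as a toy notion of "confusable"; all of these are assumptions rather than details fixed by this embodiment.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp
from sklearn.svm import LinearSVC

class SparseCodeTreeNode:
    """One active node of the sparse coding tree (simplified sketch).

    Each node holds its own MOD-learned dictionary and an SVM trained on the sparse
    codes of the samples routed to it; classes the node still confuses are handed
    down to a newly trained child node, following the branch rule of step S4.2.
    """

    def __init__(self, n_atoms=128, T0=5, max_depth=3):
        self.n_atoms, self.T0, self.max_depth = n_atoms, T0, max_depth
        self.dictionary, self.svm, self.child = None, None, None

    def _encode(self, X):
        # Sparse-code the (n_samples, 512) local features with this node's dictionary.
        return orthogonal_mp(self.dictionary, X.T, n_nonzero_coefs=self.T0).T

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        # Node-specific dictionary, reusing the MOD sketch from step S3.
        self.dictionary, _ = mod_dictionary_learning(X.T, n_atoms=self.n_atoms, T0=self.T0)
        codes = self._encode(X)
        self.svm = LinearSVC().fit(codes, y)                    # coarse classification at this node

        # Branch rule: classes that are not yet cleanly separated go to a new child node.
        pred = self.svm.predict(codes)
        confused = [c for c in np.unique(y) if np.mean(pred[y == c] == c) < 1.0]
        if len(confused) >= 2 and self.max_depth > 0:
            mask = np.isin(y, confused)
            self.child = SparseCodeTreeNode(self.n_atoms, self.T0, self.max_depth - 1)
            self.child.fit(X[mask], y[mask])
        return self

    def predict(self, X):
        X = np.asarray(X)
        pred = self.svm.predict(self._encode(X))
        if self.child is not None:
            # Samples whose coarse label belongs to a confusable class are refined in the child.
            mask = np.isin(pred, self.child.svm.classes_)
            if mask.any():
                pred[mask] = self.child.predict(X[mask])
        return pred
```

With four emotion labels such as angry, happy, sad and neutral, `SparseCodeTreeNode().fit(features, labels)` would keep the classes the root SVM already distinguishes at the root and grow child nodes only for the remaining confusable ones, mirroring the example above.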
The above describes only preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the invention, and such modifications are also intended to fall within the scope of the invention.

Claims (4)

1. A multi-feature character emotion recognition method is characterized by comprising the following steps:
step S1, the user inputs a video; all frames of the video are traversed with a sampling stride of 1 frame to create multiple 16-frame clips; these 16-frame clips serve as the input to the 3D convolutional neural network;
step S2, a 3D convolutional neural network is used to extract local features of the facial expressions and body movements of the people in the video; for each input, a 7 × 7 × 512 feature map is constructed at the conv5b layer, the spatial position of each feature is extracted separately, and the values at each spatial position are concatenated along the 512 channels to obtain the final local features of the input; the input video thus yields 7 × 7 = 49 local features in total, each of which is a 512-dimensional vector;
step S3, for the input final local features, dictionary learning is performed at the root node of the sparse coding tree using the MOD algorithm; the MOD objective function is:

    min_{D, ω_i} Σ_i ||x_i − D ω_i||_2^2   subject to   ||ω_i||_0 ≤ T_0

where D = [g_1, g_2, …, g_n]^T denotes the dictionary matrix and g_i is a dictionary atom; x_i is an input feature vector; ω_i denotes the sparse coefficient vector of x_i over the dictionary atoms g_i; and T_0 is the maximum number of non-zero elements in the sparse representation coefficients;
step S3.1, the training sample set is X = [x_1, x_2, …, x_N];
step S3.2, initialize the dictionary: randomly construct an initial dictionary D^(0) ∈ R^(n×m) and normalize the columns of D^(0);
s3.3, approximating a solution by using a tracking algorithm to obtain a sparse coefficient omegaiThe following were used:
Figure FDA0003161868530000013
s3.4, according to the sample X and the sparse coefficient matrix W(k)The dictionary is updated as follows:
Figure FDA0003161868530000014
step S3.5, when the representation error ||X − D^(k) W^(k)||_F^2 is less than 10^(−6), stop the iteration and output the final dictionary D;
step S4, learn classifiers using a support vector machine (SVM) and train the sparse coding tree; specifically,
s4.1, initializing a root node of the sparse coding tree into an active node a; at a, encoding the input local features into sparse codes using the dictionary D output at step S3; carrying out coarse classification on the coded input features at the active node a by adopting a Support Vector Machine (SVM) classifier;
s4.2, classifying according to a branch rule based on the rough classification label; and when branching to the next-level child node, taking the child node as the next active node a, repeating the sparse coding and coarse classification steps until all emotion classifications are finished, and outputting a final result.
2. The method for multi-feature character emotion recognition according to claim 1, wherein the branching rule in step S4.2 specifically includes:
When the coarse classification result consists of 2 or more confusable classes, the current node hands these samples over to a newly trained child node, which further sub-classifies the coarse result; finally, a recognition result containing only a single class is output.
3. The multi-feature character emotion recognition method according to claim 1, wherein the 3D convolutional neural network in step S2 is used to capture temporal and spatial feature information in the video and comprises 8 convolutional layers, 5 pooling layers, 2 fully-connected layers and 1 softmax output layer; the 3D convolution kernels of all layers have size 3 × 3 × 3 and stride 1; the first pooling layer has size 1 × 2 and stride 1, and the remaining pooling layers have size 2 × 2 and stride 2; the input video is resized to 128 × 171 and cut into non-overlapping 16-frame clips that serve as the network input.
4. The multi-feature character emotion recognition method according to claim 1, wherein in step S2, conv5b is the last convolutional layer of the 3D convolutional neural network and is used for feature visualization; its feature maps have a spatial size of 7 × 7 and 512 channels, and the layer produces two such feature maps.
CN202110793285.9A 2021-07-14 2021-07-14 Multi-feature character emotion recognition method Active CN113673325B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110793285.9A CN113673325B (en) 2021-07-14 2021-07-14 Multi-feature character emotion recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110793285.9A CN113673325B (en) 2021-07-14 2021-07-14 Multi-feature character emotion recognition method

Publications (2)

Publication Number Publication Date
CN113673325A true CN113673325A (en) 2021-11-19
CN113673325B CN113673325B (en) 2023-08-15

Family

ID=78539262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110793285.9A Active CN113673325B (en) 2021-07-14 2021-07-14 Multi-feature character emotion recognition method

Country Status (1)

Country Link
CN (1) CN113673325B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114565964A (en) * 2022-03-03 2022-05-31 网易(杭州)网络有限公司 Emotion recognition model generation method, recognition method, device, medium and equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784293A (en) * 2017-11-13 2018-03-09 中国矿业大学(北京) A kind of Human bodys' response method classified based on global characteristics and rarefaction representation
CN108319891A (en) * 2017-12-07 2018-07-24 国网新疆电力有限公司信息通信公司 Face feature extraction method based on sparse expression and improved LDA
US20190042952A1 (en) * 2017-08-03 2019-02-07 Beijing University Of Technology Multi-task Semi-Supervised Online Sequential Extreme Learning Method for Emotion Judgment of User
CN109711283A (en) * 2018-12-10 2019-05-03 广东工业大学 A kind of joint doubledictionary and error matrix block Expression Recognition algorithm
CN112699774A (en) * 2020-12-28 2021-04-23 深延科技(北京)有限公司 Method and device for recognizing emotion of person in video, computer equipment and medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190042952A1 (en) * 2017-08-03 2019-02-07 Beijing University Of Technology Multi-task Semi-Supervised Online Sequential Extreme Learning Method for Emotion Judgment of User
CN107784293A (en) * 2017-11-13 2018-03-09 中国矿业大学(北京) A kind of Human bodys' response method classified based on global characteristics and rarefaction representation
CN108319891A (en) * 2017-12-07 2018-07-24 国网新疆电力有限公司信息通信公司 Face feature extraction method based on sparse expression and improved LDA
CN109711283A (en) * 2018-12-10 2019-05-03 广东工业大学 A kind of joint doubledictionary and error matrix block Expression Recognition algorithm
CN112699774A (en) * 2020-12-28 2021-04-23 深延科技(北京)有限公司 Method and device for recognizing emotion of person in video, computer equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
汪伟鸣; 邵洁: "融合面部表情和肢体动作特征的情绪识别" (Emotion recognition fusing facial expression and body movement features), 电视技术, no. 01 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114565964A (en) * 2022-03-03 2022-05-31 网易(杭州)网络有限公司 Emotion recognition model generation method, recognition method, device, medium and equipment

Also Published As

Publication number Publication date
CN113673325B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
Fathallah et al. Facial expression recognition via deep learning
CN108596039B (en) Bimodal emotion recognition method and system based on 3D convolutional neural network
Ghosh et al. Learning human motion models for long-term predictions
Zhang et al. Spatial–temporal recurrent neural network for emotion recognition
Baradel et al. Glimpse clouds: Human activity recognition from unstructured feature points
Shao et al. Feature learning for image classification via multiobjective genetic programming
CN106782602B (en) Speech emotion recognition method based on deep neural network
CN110532861B (en) Behavior recognition method based on framework-guided multi-mode fusion neural network
Liu et al. Facial expression recognition based on fusion of multiple Gabor features
CN107122752B (en) Human body action comparison method and device
CN111950455B (en) Motion imagery electroencephalogram characteristic identification method based on LFFCNN-GRU algorithm model
CN112667080A (en) Electroencephalogram signal unmanned platform intelligent control method based on deep convolution countermeasure network
CN110309861A (en) A kind of multi-modal mankind's activity recognition methods based on generation confrontation network
CN112949647B (en) Three-dimensional scene description method and device, electronic equipment and storage medium
CN113749657B (en) Brain electricity emotion recognition method based on multi-task capsule
CN112784929B (en) Small sample image classification method and device based on double-element group expansion
Wang et al. A deep clustering via automatic feature embedded learning for human activity recognition
CN112200110A (en) Facial expression recognition method based on deep interference separation learning
CN111523367B (en) Intelligent facial expression recognition method and system based on facial attribute analysis
CN115273236A (en) Multi-mode human gait emotion recognition method
Ullah et al. Emotion recognition from occluded facial images using deep ensemble model
CN113673325A (en) Multi-feature character emotion recognition method
Rawat et al. A novel convolutional neural network-gated recurrent unit approach for image captioning
CN111950592B (en) Multi-modal emotion feature fusion method based on supervised least square multi-class kernel canonical correlation analysis
Albrici et al. G2-VER: Geometry guided model ensemble for video-based facial expression recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant