CN111914594B - Group emotion recognition method based on motion characteristics - Google Patents

Group emotion recognition method based on motion characteristics

Info

Publication number
CN111914594B
CN111914594B (application CN201910383943.XA)
Authority
CN
China
Prior art keywords: time, features, emotion recognition, network, level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910383943.XA
Other languages
Chinese (zh)
Other versions
CN111914594A (en)
Inventor
卿粼波
许盛宇
吴晓红
何小海
滕奇志
周文俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN201910383943.XA
Publication of CN111914594A
Application granted
Publication of CN111914594B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53: Recognition of crowd images, e.g. recognition of crowd congestion
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Abstract

The invention provides a group emotion recognition method based on motion characteristics, which analyzes emotion in scene video sequences using a multi-channel group emotion recognition network. The method comprises the following steps: constructing a multi-channel group emotion recognition network; extracting low-level motion features of different time segments in parallel; rearranging and fusing the low-level features extracted by each channel along the time dimension; and obtaining global high-level features through a 3D residual module to realize group emotion recognition. The method avoids the bias and long runtime of manual feature extraction, giving it stronger adaptability. In addition, the multi-channel network extracts features from the long video sequence in temporal order, fully exploiting the temporal correlation between frames; rearranging and fusing the low-level temporal features along the time dimension reduces the coupling between features and improves the accuracy and efficiency of group emotion recognition.

Description

Group emotion recognition method based on motion characteristics
Technical Field
The invention relates to emotion recognition in the field of deep learning, and in particular to a group emotion recognition method based on motion characteristics.
Background
Crowd emotion analysis judges the emotional state of a crowd by analyzing its behavior, clothing, and similar cues. Video is ubiquitous in real life, for example in unmanned aerial vehicle surveillance, network-shared video, and 3D video. By analyzing the emotion of the crowd in a video, the crowd's emotion and its changes can be tracked dynamically, which gives video emotion recognition a wide application prospect.
Group emotion recognition has mainly analyzed the emotions of people in a scene when the subjects are close to the camera. However, in an era of rapid development, analyzing only clearly visible faces and the emotions of small groups no longer fully satisfies the need to perceive people's emotional states. Research therefore needs not only to move from the face of an individual to a group, but also from small groups to large-scale crowds far from the camera. As the world's population grows year by year, large-scale gatherings and crowd events become more frequent, making emotion analysis of crowds particularly important.
Traditional crowd emotion recognition algorithms mainly use shallow algorithms to extract motion features between video frames. Shallow algorithms (support vector machines, single-layer neural networks, etc.) require manually extracted features; given a limited number of samples and computing units, a shallow structure struggles to express the features of a complex model effectively, and its generalization ability is clearly insufficient when the studied object carries rich meaning, so shallow structures have inherent limitations. Existing research on crowds focuses mainly on behavior, with little work on crowd emotion, even though the basic type of group movement can reflect a representative mood of the group. These traditional algorithms also tend to extract overly simple features, so the resulting analysis is not deep enough. Only a small body of related work fully exploits the advantages of deep learning to extract group motion features automatically, enrich the features, and analyze group emotion in video.
Disclosure of Invention
The invention aims to provide a group emotion recognition method based on motion characteristics. It combines deep learning with group emotion in video, introduces a 3D residual convolutional neural network structure, and analyzes temporal features in group videos to obtain the motion states of the people in the video and, from them, their emotion information.
For convenience of explanation, the following concepts are first introduced:
Convolutional Neural Network (CNN): a multilayer feedforward neural network designed from the inspiration of the visual neural mechanism. Each layer is composed of multiple two-dimensional planes, each neuron on a plane works independently, and the network mainly comprises feature extraction layers and feature mapping layers.
3D Residual Module: to avoid the difficulty of directly learning an identity mapping, the stacked layers are fitted to the residual function F(x) = H(x) - x rather than to H(x) itself; the main idea is to subtract the identical part of the signal and highlight small variations. Replacing the 2D convolutions in a residual module with 3D convolutions yields the 3D residual module.
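The residual idea above can be sketched in a few lines of NumPy. This is an illustrative toy, not the patented module: it uses a single-channel, naive 3D convolution and fixed kernels, whereas the actual network uses learned multi-channel 3D convolutions. The function names are hypothetical. The zero-kernel case shows why residual learning is attractive: when F(x) = 0, the block reduces exactly to the identity mapping.

```python
import numpy as np

def conv3d_same(x, k):
    """Naive 'same'-padded single-channel 3D convolution, for illustration only."""
    p = k.shape[0] // 2
    xp = np.pad(x, p)
    out = np.zeros_like(x)
    D, H, W = x.shape
    for d in range(D):
        for h in range(H):
            for w in range(W):
                out[d, h, w] = np.sum(
                    xp[d:d + k.shape[0], h:h + k.shape[1], w:w + k.shape[2]] * k
                )
    return out

def residual_block_3d(x, k1, k2):
    """y = x + F(x): the stacked layers learn only the residual F(x) = H(x) - x."""
    f = np.maximum(conv3d_same(x, k1), 0.0)  # first 3D conv -> ReLU
    f = conv3d_same(f, k2)                   # second 3D conv
    return x + f                             # identity shortcut

x = np.random.rand(4, 8, 8)   # a small (time, height, width) volume
k = np.zeros((3, 3, 3))       # zero kernels => F(x) = 0
y = residual_block_3d(x, k, k)
print(np.allclose(y, x))      # True: the block degenerates to the identity
```

Note how the shortcut requires the convolution to preserve the input shape ('same' padding); the real 3D residual module satisfies the same constraint so that x and F(x) can be added elementwise.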
The invention specifically adopts the following technical scheme:
a group emotion recognition method based on motion characteristics is characterized by comprising the following steps:
a. dividing the long video sequence in time sequence, and respectively extracting low-level motion characteristics of each segment by channels;
b. analyzing low-level motion characteristics in the group video by using a 3D residual convolutional neural network;
c. rearranging and fusing the motion characteristics of the multi-channel network in the step a in the time dimension, and analyzing global high-level characteristics;
the method mainly comprises the following steps:
(1) preprocessing the group scene video sequence and uniformly scaling it to a resolution of 112 × 112;
(2) dividing the video sequence to be analyzed into 4 short videos and taking the first 4 frames of each as network input, obtaining low-level motion features at different points in the time sequence;
(3) introducing a multi-channel group emotion recognition network (channels Channel1 to Channel4) based on a 3D residual convolutional neural network, and extracting the low-level motion features of the corresponding time segment of each short video;
(4) recombining and fusing the acquired low-level motion features in the time dimension through a fusion module, sending the combined global low-level features into a 3D residual module, analyzing the global high-level features of the long video, and finally classifying to obtain the group emotion.
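Steps (1) and (2) above amount to simple array slicing. The sketch below, under the assumption that a preprocessed video is stored as a (frames, height, width, channels) array, shows one plausible way to cut a long sequence into 4 equal segments and take the first 4 frames of each as the input clip for the corresponding network channel; the function name and the equal-split policy are illustrative, not prescribed by the patent text.

```python
import numpy as np

def sample_segments(video, n_segments=4, n_frames=4):
    """Split a long video (T, H, W, C) into n_segments equal parts in time
    order and take the first n_frames of each part, one clip per channel."""
    T = video.shape[0]
    seg_len = T // n_segments
    clips = [video[i * seg_len : i * seg_len + n_frames] for i in range(n_segments)]
    return np.stack(clips)  # (n_segments, n_frames, H, W, C)

video = np.random.rand(64, 112, 112, 3)  # already scaled to 112 x 112, step (1)
clips = sample_segments(video)
print(clips.shape)  # (4, 4, 112, 112, 3)
```

Each of the 4 clips is then fed to its own channel, so only 16 of the 64 frames are processed while the clips still span the whole sequence, which is the data compression claimed in the benefits below.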
The invention has the beneficial effects that:
(1) It fully exploits the self-learning ability of deep learning: the machine learns image features automatically, which avoids the bias and inefficiency of manually selected features and gives stronger adaptability.
(2) The original long video sequence is divided into short segments in temporal order, compressing the data volume while keeping global information and improving network speed and computational efficiency.
(3) A 3D convolutional neural network replaces the 2D convolutional neural network for feature extraction, fully retaining the temporal information between frames, and the 3D residual module optimizes the performance and efficiency of the network.
(4) The motion features extracted by the channels are rearranged and fused in the time dimension; correlated features from the 4 channels are fused together, reducing the coupling between features, fully mining the temporal correlation of the motion features, and improving the network's group emotion analysis.
(5) Combining deep learning with emotion analysis of group scenes addresses the low accuracy of traditional methods and increases the research value.
Drawings
Fig. 1 is a composition diagram of the motion-feature group emotion recognition network based on a 3D convolutional neural network.
Fig. 2 illustrates how the low-level motion features extracted by the multiple channels are rearranged and fused in the time dimension.
Detailed Description
The present invention is described in further detail with reference to the drawings and examples. It should be noted that the following examples only illustrate the invention and should not be construed as limiting its scope; insubstantial modifications and adaptations made to the invention by those skilled in the art based on the above disclosure still fall within the scope of the invention.
The group emotion recognition method based on the motion characteristics specifically comprises the following steps:
(1) A mixed data set combining the CUHK crowd data set, the UCF data set, the Web data set, and the PET2009 data set is used. Each long video in the data set is divided into 4 short videos, each short video is split into groups of 4 frames and recombined, forming multiple recombined short video sequences for training and thereby expanding the training set.
(2) The model is first pre-trained on the Kinetics human motion video data set. The expanded short video data set is then fed in batches into the 4 channels of the network, and the motion features of each time segment are extracted to obtain the corresponding low-level motion features.
(3) The low-level motion features from the 4 channels are recombined by the short-video spatio-temporal feature fusion module: the features acquired by each channel are first split into 4 feature segments, and correlated feature segments are then stacked together to obtain the recombined global low-level feature.
(4) The fused global low-level features are sent into the subsequent 3D residual module for further training to obtain global high-level features of the long video, which are finally classified into a group emotion. Back propagation optimizes the network parameters according to the classification results until the optimal network model is obtained.
(5) The test set is fed into the network to verify the performance of the model.
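The recombination fusion in step (3), written in claim 1 as (4 × (C × H × W)) → ((C × 4) × H × W), can be sketched with a transpose and a reshape. The assumption that each channel outputs a (C, H, W) feature tensor and the function name are illustrative, not from the patent:

```python
import numpy as np

def fuse_time_rearrange(feats):
    """feats: (4, C, H, W), the low-level features from the 4 channels.
    For each feature index i, the 4 channels' i-th maps are stacked in time
    order (one 4 x H x W block per index), and the C blocks are concatenated
    sequentially, yielding a fused feature of shape (4*C, H, W)."""
    n_channels, C, H, W = feats.shape
    # transposing makes same-index maps from all channels adjacent before the flatten
    return feats.transpose(1, 0, 2, 3).reshape(n_channels * C, H, W)

feats = np.arange(4 * 2 * 3 * 3, dtype=float).reshape(4, 2, 3, 3)
fused = fuse_time_rearrange(feats)
print(fused.shape)  # (8, 3, 3)
```

The interleaved layout places temporally correlated maps next to each other, so the subsequent 3D residual module convolves across all 4 time segments at once; this is the reduced feature coupling the description claims.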

Claims (4)

1. A group emotion recognition method based on motion characteristics is characterized by comprising the following steps:
a. dividing the long video sequence in temporal order and extracting the low-level motion features of each segment channel by channel;
b. analyzing low-level motion characteristics in the group video by using a 3D residual convolutional neural network;
c. rearranging and fusing the motion characteristics of the multi-channel network in the step a in the time dimension, and analyzing global high-level characteristics;
the method mainly comprises the following steps:
(1) preprocessing the group scene video sequence and uniformly scaling it to a resolution of 112 × 112;
(2) dividing the video sequence to be analyzed into 4 short videos and taking the first 4 frames of each as network input, obtaining low-level motion features at different points in the time sequence;
(3) introducing a multi-channel group emotion recognition network based on a 3D residual convolutional neural network, and extracting the low-level motion features of the corresponding time segments of the 4 short videos with 4 channels sharing weight parameters;
(4) recombining and fusing the acquired low-level motion features (4 × (C × H × W)) in the time dimension through a fusion module: the feature maps of the i-th (i ∈ [0, C]) layer in the 4 channels are spliced sequentially in time order to obtain C feature blocks of 4 × H × W, and the feature blocks are then combined sequentially into a fused feature ((C × 4) × H × W); the combined global low-level features are then sent into a 3D residual module, the global high-level features of the long video are analyzed, and classification finally yields the group emotion.
2. The group emotion recognition method based on motion characteristics as claimed in claim 1, wherein step (2) adopts an average frame extraction method: the video sequence to be analyzed is first divided into 4 short videos, and the first 4 frames of each short video are then taken; the video sequence is thus compressed while retaining a degree of global information, improving computational efficiency.
3. The group emotion recognition method based on motion features as claimed in claim 1, wherein in step (3), a 3D convolutional neural network is used instead of a 2D convolutional neural network for feature extraction, so that time sequence information between frames is fully retained, and the performance and efficiency of the network are optimized by using a 3D residual module.
4. The group emotion recognition method based on motion features as claimed in claim 1, wherein the motion features extracted from the 4 channels respectively in step (4) are rearranged and fused in the time dimension, the features with correlation in the 4 channels are fused together, the coupling between the features is reduced, the correlation of the motion features in the time dimension is fully mined, and the group emotion analysis performance by the network is improved.
CN201910383943.XA, priority and filing date 2019-05-08, Group emotion recognition method based on motion characteristics, Active, CN111914594B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910383943.XA CN111914594B (en) 2019-05-08 2019-05-08 Group emotion recognition method based on motion characteristics


Publications (2)

Publication Number Publication Date
CN111914594A CN111914594A (en) 2020-11-10
CN111914594B (en) 2022-07-01

Family

ID=73242780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910383943.XA Active CN111914594B (en) 2019-05-08 2019-05-08 Group emotion recognition method based on motion characteristics

Country Status (1)

Country Link
CN (1) CN111914594B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699774A (en) * 2020-12-28 2021-04-23 深延科技(北京)有限公司 Method and device for recognizing emotion of person in video, computer equipment and medium
CN112699785B (en) * 2020-12-29 2022-06-07 中国民用航空飞行学院 Group emotion recognition and abnormal emotion detection method based on dimension emotion model

Citations (4)

Publication number Priority date Publication date Assignee Title
CN107368798A (en) * 2017-07-07 2017-11-21 四川大学 A kind of crowd's Emotion identification method based on deep learning
CN107958260A (en) * 2017-10-27 2018-04-24 四川大学 A kind of group behavior analysis method based on multi-feature fusion
US10089556B1 (en) * 2017-06-12 2018-10-02 Konica Minolta Laboratory U.S.A., Inc. Self-attention deep neural network for action recognition in surveillance videos
CN109299700A (en) * 2018-10-15 2019-02-01 南京地铁集团有限公司 Subway group abnormality behavioral value method based on crowd density analysis

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US8195598B2 (en) * 2007-11-16 2012-06-05 Agilence, Inc. Method of and system for hierarchical human/crowd behavior detection
CN107169426B (en) * 2017-04-27 2020-03-31 广东工业大学 Crowd emotion abnormality detection and positioning method based on deep neural network


Non-Patent Citations (2)

Title
卿粼波 et al. Group emotion recognition based on a multi-stream CNN-LSTM network. Application Research of Computers, 2018. *
张严浩. Group behavior analysis based on structured cognitive computing. China Doctoral Dissertations Full-text Database, Information Science and Technology, 2018. *

Also Published As

Publication number Publication date
CN111914594A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
US11010600B2 (en) Face emotion recognition method based on dual-stream convolutional neural network
CN108764072B (en) Blood cell subtype image classification method based on multi-scale fusion
JP7412847B2 (en) Image processing method, image processing device, server, and computer program
CN110837842A (en) Video quality evaluation method, model training method and model training device
CN111914594B (en) Group emotion recognition method based on motion characteristics
CN105160678A (en) Convolutional-neural-network-based reference-free three-dimensional image quality evaluation method
CN110084202A (en) A kind of video behavior recognition methods based on efficient Three dimensional convolution
CN112132197A (en) Model training method, image processing method, device, computer equipment and storage medium
CN108921037B (en) Emotion recognition method based on BN-Inception double-flow network
US20210056357A1 (en) Systems and methods for implementing flexible, input-adaptive deep learning neural networks
CN110472622B (en) Video processing method and related device, image processing method and related device
WO2021184754A1 (en) Video comparison method and apparatus, computer device and storage medium
CN112132797B (en) Short video quality screening method
CN110225368A (en) A kind of video locating method, device and electronic equipment
CN113392781A (en) Video emotion semantic analysis method based on graph neural network
CN111914600A (en) Group emotion recognition method based on space attention model
CN110110812B (en) Stream depth network model construction method for video motion recognition
Mansour et al. Design of integrated artificial intelligence techniques for video surveillance on iot enabled wireless multimedia sensor networks
CN113657272B (en) Micro video classification method and system based on missing data completion
CN114360018A (en) Rendering method and device of three-dimensional facial expression, storage medium and electronic device
CN111401116A (en) Bimodal emotion recognition method based on enhanced convolution and space-time LSTM network
CN112508121B (en) Method and system for sensing outside of industrial robot
Chen et al. Design and implementation of video analytics system based on edge computing
CN109002808A (en) A kind of Human bodys' response method and system
WO2023217138A1 (en) Parameter configuration method and apparatus, device, storage medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant