CN113920568A - Face and human body posture emotion recognition method based on video image


Info

Publication number
CN113920568A
Authority
CN
China
Prior art keywords
face
emotion
human
human body
body posture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111285381.9A
Other languages
Chinese (zh)
Inventor
秦瑾 (Qin Jin)
席明 (Xi Ming)
焦勇 (Jiao Yong)
秦煜婷 (Qin Yuting)
毛智勇 (Mao Zhiyong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Wanwei Information Technology Co Ltd
Original Assignee
China Telecom Wanwei Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Wanwei Information Technology Co Ltd filed Critical China Telecom Wanwei Information Technology Co Ltd
Priority to CN202111285381.9A
Publication of CN113920568A
Pending legal-status Critical Current

Classifications

    • G06F18/214 (Pattern recognition): Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/2415 (Pattern recognition): Classification techniques based on parametric or probabilistic models, e.g. likelihood ratio or false acceptance rate versus false rejection rate
    • G06F18/254 (Pattern recognition): Fusion techniques of classification results, e.g. of results related to the same input data
    • G06F18/256 (Pattern recognition): Fusion of classification results relating to different input data, e.g. multimodal recognition
    • G06N3/044 (Neural networks): Recurrent networks, e.g. Hopfield networks
    • G06N3/08 (Neural networks): Learning methods

Abstract

The invention belongs to the technical field of computer vision recognition, and specifically relates to a face and human body posture emotion recognition method based on video images. The method comprises the following steps: video image acquisition, video frame sequencing, face detection, human body posture detection, image preprocessing, face emotion feature extraction, human body posture emotion feature extraction, face emotion recognition, human body posture emotion recognition, softmax classification, weighted averaging and emotion classification. Multiple rounds of experiments show that the multi-modal method achieves clearly higher accuracy than single-modal emotion recognition, and that assigning a relatively larger weight to the face features and a relatively smaller weight to the posture features yields higher recognition accuracy; that is, face information plays the leading role in emotion recognition, while human body posture plays an auxiliary role.

Description

Face and human body posture emotion recognition method based on video image
Technical Field
The invention belongs to the technical field of computer vision recognition, and specifically relates to a face and human body posture emotion recognition method based on video images.
Background
Emotion is an intuitive reflection and high-level summary of subjective feeling, inner mental activity and external behavior, and plays an important role in everyday human interaction. Emotion recognition has broad application prospects in medicine, education and safe driving, and is one of the research hotspots in the field of computer vision.
A face image contains rich physiological feature information, such as gender, age and emotion, and is one of the main research directions in biometric recognition. Human body posture likewise carries rich physiological feature information; for example, people in different emotional states show clearly different posture characteristics, which makes it possible to recognize emotion from such information. At present, single-modal face emotion recognition suffers from low accuracy. Emotion recognition based on physiological signals is objective and yields relatively good results, but acquiring physiological signals requires dedicated equipment, is difficult to carry out and gives a poor user experience. Moreover, physiological signal acquisition requires deliberately holding a given emotion, which easily causes stiff facial expressions and body muscles. Taking photography as an example, a certain smile is usually expected to ensure photo quality, but for reasons such as emotional distress, facial tension or loss of emotional control, the face and body cannot always be controlled on demand; to work around this, the photographer asks subjects to say "eggplant" (the Chinese counterpart of "say cheese") to simulate a smiling state before pressing the shutter. Facial emotions are varied, however, and a simulated state cannot be prescribed for every expression. In practice, the most natural emotion is usually the one captured by recording or snapping video over a period of time. Yet most existing recognition techniques operate on static images, and recognition in the dynamic setting remains largely unexplored.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a face and human body posture emotion recognition method based on video images, which effectively overcomes the above defects and significantly improves recognition accuracy compared with single-modal emotion recognition methods.
The technical scheme adopted by the invention to solve these technical problems is as follows:
A face and human body posture emotion recognition method based on video images comprises the following steps:
video image acquisition, video frame sequencing, face detection, human body posture detection, image preprocessing, face emotion feature extraction, human body posture emotion feature extraction, face emotion recognition, human body posture emotion recognition, softmax classification, weighted averaging and emotion classification;
Video image acquisition uses a camera, mobile phone or similar device to capture pedestrian video, converts the continuous video into frame-sequence images, and removes frames that contain no pedestrians or only incomplete pedestrians; face detection locates faces in the pedestrian frames according to the geometric contour of the face; human body posture detection locates the human posture in the pedestrian frames, the posture comprising the head inclination angle, the arm swing amplitude and speed, and the leg swing amplitude and speed; image preprocessing enhances the image through light compensation, applies a gray-level transform to reduce computation and storage, applies geometric correction to crop and align the face and posture images respectively, and applies filtering and sharpening to highlight edge detail and remove the influence of noise; face emotion features are extracted with a ConvLSTM neural network model on the basis of facial key points and key regions, where the key points comprise the eyebrows, eyes, nose and mouth, and the key regions comprise the forehead, cheeks, eye bags, upper and lower jaw and lips; human body posture emotion features are likewise extracted with a ConvLSTM neural network model on the basis of the posture key points, covering the head inclination angle and the swing amplitude and speed of the arms and legs; the softmax classification layer converts each recognition output into a probabilistic result; and the multi-modal recognition results are combined by weighted averaging to produce the final output.
A dynamic pedestrian video is acquired through video image acquisition and converted into an image frame sequence, and frames in which the face is unclear or the human posture is incomplete are discarded; specifically, the pedestrian video is captured by a surveillance camera or a mobile phone, converted into an image frame sequence, and frames without pedestrian information or with incomplete pedestrians are filtered out.
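As a minimal sketch of this acquisition and screening step (not part of the patent text), the following Python/OpenCV fragment converts a video into a frame sequence and drops unusable frames; the sampling interval and the variance-of-Laplacian blur threshold are illustrative assumptions, since the patent does not fix concrete values.

```python
import cv2

def video_to_frames(video_path, sample_every=5, blur_threshold=100.0):
    """Convert a pedestrian video into a frame sequence, keeping every
    sample_every-th frame and dropping frames that are too blurry
    (variance-of-Laplacian test). Both parameters are assumptions."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # A low Laplacian variance indicates an unclear frame.
            if cv2.Laplacian(gray, cv2.CV_64F).var() >= blur_threshold:
                frames.append(frame)
        idx += 1
    cap.release()
    return frames
```

A pedestrian detector would then be run over the returned frames to filter out images without pedestrian information or with incomplete pedestrians.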
Face detection and human body posture detection are then performed separately on the screened image frame sequence; the detected face and posture images are preprocessed to remove the influence of noise on emotion feature extraction. Specifically, faces are detected in the screened pedestrian frames according to the geometric contour of the face, and postures according to the sequence of human posture key points; the detected face and posture images are classified and preprocessed, where the preprocessing comprises light compensation to enhance the image, a gray-level transform to reduce computation and storage, geometric correction to crop and align the face and posture images respectively, and filtering and sharpening to highlight edge detail and remove noise, in preparation for the feature extraction stage.
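The patent names these preprocessing operations but not their concrete operators; the sketch below is an assumption-laden illustration, using CLAHE as a stand-in for light compensation and unsharp masking for the filtering and sharpening step.

```python
import cv2

def preprocess(image_bgr, box, out_size=(112, 112)):
    """Illustrative preprocessing chain: geometric crop, gray-level
    transform, CLAHE as a stand-in for light compensation, and an
    unsharp mask to highlight edge detail. Operator choices and the
    output size are assumptions, not taken from the patent."""
    x, y, w, h = box
    crop = image_bgr[y:y + h, x:x + w]                  # geometric correction (crop)
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)       # reduce computation/storage
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lit = clahe.apply(gray)                             # light compensation
    blur = cv2.GaussianBlur(lit, (0, 0), sigmaX=2.0)    # smooth away noise
    sharp = cv2.addWeighted(lit, 1.5, blur, -0.5, 0)    # unsharp mask: edge detail
    return cv2.resize(sharp, out_size)
```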
Face emotion features are extracted from the preprocessed face image sequence through a ConvLSTM neural network model: the feature information is drawn from the facial key points and key regions, and a face emotion recognition classification result is obtained through the softmax classification layer.
Human body posture emotion features are extracted from the preprocessed posture images through a ConvLSTM neural network model, and a posture emotion recognition classification result is obtained at the softmax classification layer.
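The patent does not give the architecture of the ConvLSTM branches; the following Keras sketch shows one plausible layout for a single branch (face or posture) ending in a four-class softmax layer, with the sequence length and all layer sizes assumed.

```python
from tensorflow.keras import layers, models

def build_convlstm_branch(seq_len=16, height=112, width=112, n_classes=4):
    """Minimal ConvLSTM sketch for one modality branch. Input: a
    sequence of preprocessed single-channel frames. Output: softmax
    probabilities over happy/angry/sad/neutral. Layer sizes are
    illustrative assumptions, not taken from the patent."""
    return models.Sequential([
        layers.Input(shape=(seq_len, height, width, 1)),
        layers.ConvLSTM2D(32, kernel_size=3, padding="same",
                          return_sequences=True),
        layers.BatchNormalization(),
        layers.ConvLSTM2D(64, kernel_size=3, padding="same",
                          return_sequences=False),
        layers.GlobalAveragePooling2D(),
        layers.Dense(n_classes, activation="softmax"),
    ])

face_branch = build_convlstm_branch()
face_branch.compile(optimizer="adam", loss="categorical_crossentropy",
                    metrics=["accuracy"])
```

The same construction would be trained separately on face sequences and on posture sequences to give the two pre-trained recognition models.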
The emotion recognition result obtained from the face images and the result obtained from the human body posture are combined by weighted averaging to give an overall emotion recognition result, and the emotion recognition category is output.
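This decision-level fusion reduces to a weighted average of the two softmax distributions followed by an argmax; a minimal sketch, with the face weight left as a free parameter per the ranges given below:

```python
import numpy as np

def fuse(face_probs, pose_probs, w_face=0.6):
    """Weighted-average fusion of the two branch outputs. w_face is
    the weight of the face result (20-80% per the patent, preferably
    40-60%); the posture result receives the remainder. Returns the
    fused distribution and the index of the winning emotion class."""
    face_probs = np.asarray(face_probs, dtype=float)
    pose_probs = np.asarray(pose_probs, dtype=float)
    fused = w_face * face_probs + (1.0 - w_face) * pose_probs
    return fused, int(np.argmax(fused))
```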
The weight of the face emotion recognition result and the weight of the human posture emotion recognition result sum to 100%; the face weight is 20-80%, and the remainder is the weight of the posture result.
Preferably, the weight of the face emotion recognition result is 40-60%, with the remainder assigned to the posture result.
The invention first collects a dynamic pedestrian video and converts it into a frame sequence. The face images and posture images are then processed in separate channels. From the preprocessed face images, a ConvLSTM neural network model extracts face emotion features via the key points and key regions, a pre-trained emotion recognition model recognizes the face emotion, and a face emotion recognition result is obtained through the softmax classification layer. From the preprocessed posture image sequence, a ConvLSTM neural network model extracts emotion feature information such as the head inclination, the arm swing amplitude and speed and the leg swing amplitude and speed via the posture key-point sequence; a pre-trained posture emotion recognition model recognizes the posture emotion, and a posture emotion recognition result is obtained at the softmax classification layer. The two results are then combined by weighted averaging to yield the final recognition result.
To improve emotion recognition accuracy, the invention adopts a multi-modal method combining face images and human body posture, which effectively mitigates the low recognition rate of single-modal emotion recognition. To reduce the complexity of network parameter tuning, the same ConvLSTM neural network model is used in both channels for feature extraction and emotion recognition, which extracts deep, fine-grained image features more effectively. To avoid the loss of local information caused by feature-level fusion, emotion recognition results are obtained separately from the face images and the human posture and then combined by weighted averaging; this reduces both the information loss and the complexity of designing a feature-fusion network, and effectively improves recognition accuracy. In face image feature extraction, the invention extracts face emotion features through facial key points and key regions, chiefly texture and expression information that reflects facial emotion: the key points mainly comprise the eyebrows, eyes, nose and mouth, and the key regions mainly comprise the forehead, cheeks, eye bags, upper and lower jaw and lips. In posture feature extraction, emotion feature information such as the head inclination angle and the swing amplitude and speed of the arms and legs is extracted from the posture key-point sequence.
Using a single ConvLSTM neural network model for both feature extraction and emotion recognition effectively reduces network design complexity and inconsistent parameter settings. Weighted averaging of the face-based and posture-based recognition results to obtain the final result effectively reduces the local information loss caused by feature fusion and thereby improves emotion recognition accuracy.
The emotion output of the invention is a four-way classification: happy, angry, sad and neutral. Multiple rounds of experiments show that multi-modal emotion recognition is clearly superior in accuracy to single-modal recognition, and that assigning a relatively larger weight to the face features and a relatively smaller weight to the posture features yields higher accuracy; that is, face information plays the leading role in emotion recognition, while human body posture plays an auxiliary role.
Drawings
FIG. 1 is a general flow diagram of the present invention.
Detailed Description
A face and human body posture emotion recognition method based on video images comprises the following steps:
video image acquisition, video frame sequencing, face detection, human body posture detection, image preprocessing, face emotion feature extraction, human body posture emotion feature extraction, face emotion recognition, human body posture emotion recognition, softmax classification, weighted averaging and emotion classification;
Video image acquisition uses a camera, mobile phone or similar device to capture pedestrian video, converts the continuous video into frame-sequence images, and removes frames that contain no pedestrians or only incomplete pedestrians; face detection locates faces in the pedestrian frames according to the geometric contour of the face; human body posture detection locates the human posture in the pedestrian frames, the posture comprising the head inclination angle, the arm swing amplitude and speed, and the leg swing amplitude and speed; image preprocessing enhances the image through light compensation, applies a gray-level transform to reduce computation and storage, applies geometric correction to crop and align the face and posture images respectively, and applies filtering and sharpening to highlight edge detail and remove the influence of noise; face emotion features are extracted with a ConvLSTM neural network model on the basis of facial key points and key regions, where the key points comprise the eyebrows, eyes, nose and mouth, and the key regions comprise the forehead, cheeks, eye bags, upper and lower jaw and lips; human body posture emotion features are likewise extracted with a ConvLSTM neural network model on the basis of the posture key points, covering the head inclination angle and the swing amplitude and speed of the arms and legs; the softmax classification layer converts each recognition output into a probabilistic result; and the multi-modal recognition results are combined by weighted averaging to produce the final output.
A dynamic pedestrian video is acquired through video image acquisition and converted into an image frame sequence, and frames in which the face is unclear or the human posture is incomplete are discarded; specifically, the pedestrian video is captured by a surveillance camera or a mobile phone, converted into an image frame sequence, and frames without pedestrian information or with incomplete pedestrians are filtered out.
Face detection and human body posture detection are then performed separately on the screened image frame sequence: faces are detected according to the geometric contour of the face and postures according to the sequence of human posture key points. The detected face and posture images are classified and preprocessed, where the preprocessing comprises light compensation to enhance the image, a gray-level transform to reduce computation and storage, geometric correction to crop and align the face and posture images respectively, and filtering and sharpening to highlight edge detail and remove the influence of noise, in preparation for the feature extraction stage.
Face emotion features are extracted from the preprocessed face image sequence through a ConvLSTM neural network model: the feature information is drawn from the facial key points and key regions, and a face emotion recognition classification result is obtained through the softmax classification layer.
Human body posture emotion features are extracted from the preprocessed posture images through a ConvLSTM neural network model, and a posture emotion recognition classification result is obtained at the softmax classification layer.
The emotion recognition result obtained from the face images and the result obtained from the human body posture are combined by weighted averaging to give an overall emotion recognition result, and the emotion recognition category is output.
The weight of the face emotion recognition result and the weight of the human posture emotion recognition result sum to 100%; the face weight is 20-80%, and the remainder is the weight of the posture result.
Preferably, the weight of the face emotion recognition result is 40-60%, with the remainder assigned to the posture result.
In actual use, face emotion features are extracted from the preprocessed face image sequence through a ConvLSTM neural network model. The feature information is extracted through the facial key points and key regions: the key points mainly comprise the eyebrows, eyes, nose and mouth, and the key regions mainly comprise the forehead, cheeks, eye bags, upper and lower jaw and lips. A pre-trained face emotion recognition model performs emotion recognition on the extracted features, and a classification result is obtained through the softmax classification layer. Concretely, the preprocessed face images pass through the ConvLSTM network model for feature extraction, the pre-trained emotion estimation model predicts the emotion category, the softmax layer outputs a probability for each class, and the class with the maximum probability is taken as the prediction. For example, if a face image with a happy emotion is input, the output might be happy: 70%, neutral: 20%, angry: 8%, sad: 2%, and the overall emotion is recognized as happy.
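That worked example corresponds to taking the argmax of the face branch's softmax output; a tiny illustration (the class ordering is an assumption):

```python
import numpy as np

# Softmax output from the face branch in the example above.
labels = ["happy", "neutral", "angry", "sad"]
face_probs = np.array([0.70, 0.20, 0.08, 0.02])
print(labels[int(np.argmax(face_probs))])  # -> happy
```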
Human body posture emotion features are extracted from the preprocessed posture images through a ConvLSTM neural network model: emotion feature information such as the head inclination angle and the swing amplitude and speed of the arms and legs is extracted from the posture key-point sequence. A pre-trained posture emotion recognition model performs emotion recognition on these features, and a classification result is obtained at the softmax classification layer, with the maximum-probability class taken as the prediction. For example, for an input with a head inclination of about 45 degrees, an arm swing amplitude of about 30 cm at high speed and a leg swing amplitude of about 50 cm at high speed, the output might be angry: 35%, neutral: 33%, happy: 28%, sad: 4%, and the overall emotion is recognized as angry.
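The posture cues named here can be computed from the key-point tracks; the sketch below assumes pixel-space coordinates and a neck/head key-point pair (e.g. an OpenPose-style layout), neither of which is fixed by the patent.

```python
import numpy as np

def head_inclination_deg(neck_xy, head_xy):
    """Angle of the neck-to-head vector from the vertical, in degrees.
    Image y grows downward, so dy is negated to measure against 'up'."""
    v = np.asarray(head_xy, dtype=float) - np.asarray(neck_xy, dtype=float)
    return float(np.degrees(np.arctan2(abs(v[0]), -v[1])))

def swing_amplitude_and_speed(track_xy, fps=25.0):
    """Peak-to-peak extent (diagonal of the track's bounding box) and
    mean speed of a limb key point across the frame sequence, e.g. a
    wrist for arm swing or an ankle for leg swing."""
    pts = np.asarray(track_xy, dtype=float)
    amplitude = float(np.linalg.norm(pts.max(axis=0) - pts.min(axis=0)))
    step_lengths = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    return amplitude, float(step_lengths.mean() * fps)
```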
The face-based emotion recognition result and the posture-based result are combined by weighted averaging to obtain the overall emotion recognition result, and the emotion category is output. The two results are obtained independently from the face images and the human posture. To verify the effectiveness of weighted averaging, each method was run 10 times and the results averaged, comparing two schemes. Scheme one fuses the face features and posture features before recognition, giving a final accuracy of 86.60%. Scheme two verifies the weight ratio over 10 experiments: with 50% weight on the face features and 50% on the posture features the final accuracy is 83.20%; with 60% on the face and 40% on the posture it is 89.20%; with 40% on the face and 60% on the posture it is 78.60%.
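The weight comparison reported above can be reproduced on any labelled validation set with a simple sweep; a sketch in which the array shapes and candidate weights are assumptions:

```python
import numpy as np

def sweep_face_weight(face_probs, pose_probs, labels,
                      weights=(0.4, 0.5, 0.6)):
    """Accuracy of late fusion over candidate face weights, mirroring
    the 40%/50%/60% comparison above. face_probs and pose_probs are
    (n_samples, n_classes) softmax outputs; labels are ground-truth
    class indices of shape (n_samples,)."""
    accuracy = {}
    for w in weights:
        fused = w * face_probs + (1.0 - w) * pose_probs
        accuracy[w] = float((fused.argmax(axis=1) == labels).mean())
    return accuracy
```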
As shown in FIG. 1, the method comprises the steps of video acquisition, video frame sequencing, face detection, human posture detection, image preprocessing, face emotion feature extraction, human posture emotion feature extraction, face emotion recognition, human posture emotion recognition, softmax classification, weighted averaging and emotion classification.
The pedestrian video is captured by the video acquisition device and converted into frame-sequence images. To process the face and posture images effectively, frames containing no face or no human posture are discarded, and unclear frames are filtered out.
The screened video frames are processed simultaneously in separate channels: one channel performs face detection and the other performs human body posture detection. The detected face and posture images are preprocessed, where light compensation enhances image quality; the gray-level transform reduces storage without degrading quality; geometric correction aligns and corrects the position of the face and posture in the image; and filtering and sharpening support more accurate target localization and highlight image detail.
Face emotion features and posture emotion features are extracted from the preprocessed images in a ConvLSTM neural network model through the two channels. Face emotion feature extraction performs feature mapping from the labelled facial key points and key regions, where the key points mainly comprise the eyebrows, eyes, nose and mouth, and the key regions mainly comprise the forehead, cheeks, eye bags, upper and lower jaw and lips. Posture emotion extraction draws its features from the head inclination angle and the swing amplitude and speed of the arms and legs in the human key-point sequence.
Emotion recognition is performed on the extracted face and posture emotion features by calling the pre-trained face and posture emotion recognition models, and the recognition results are obtained through the softmax classification layer.
The face emotion recognition result and the posture emotion recognition result are combined by weighted averaging to obtain the final recognition result, and the emotion category is output.

Claims (8)

1. A face and human body posture emotion recognition method based on video images, characterized by comprising the following steps:
video image acquisition, video frame sequencing, face detection, human body posture detection, image preprocessing, face emotion feature extraction, human body posture emotion feature extraction, face emotion recognition, human body posture emotion recognition, softmax classification, weighted averaging and emotion classification;
wherein video image acquisition uses a camera, mobile phone or similar device to capture pedestrian video, converts the continuous video into frame-sequence images, and removes frames that contain no pedestrians or only incomplete pedestrians; face detection locates faces in the pedestrian frames according to the geometric contour of the face; human body posture detection locates the human posture in the pedestrian frames, the posture comprising the head inclination angle, the arm swing amplitude and speed, and the leg swing amplitude and speed; image preprocessing enhances the image through light compensation, applies a gray-level transform to reduce computation and storage, applies geometric correction to crop and align the face and posture images respectively, and applies filtering and sharpening to highlight edge detail and remove the influence of noise; face emotion features are extracted with a ConvLSTM neural network model on the basis of facial key points and key regions, where the key points comprise the eyebrows, eyes, nose and mouth, and the key regions comprise the forehead, cheeks, eye bags, upper and lower jaw and lips; human body posture emotion features are likewise extracted with a ConvLSTM neural network model on the basis of the posture key points, covering the head inclination angle and the swing amplitude and speed of the arms and legs; the softmax classification layer converts each recognition output into a probabilistic result; and the multi-modal recognition results are combined by weighted averaging to produce the final output.
2. The face and human body posture emotion recognition method based on video images according to claim 1, characterized in that: a dynamic pedestrian video is acquired through video image acquisition and converted into an image frame sequence, and frames in which the face is unclear or the human posture is incomplete are discarded; the pedestrian video is captured by a surveillance camera or a mobile phone, converted into an image frame sequence, and frames without pedestrian information or with incomplete pedestrians are filtered out.
3. The face and human body posture emotion recognition method based on video images according to claim 1, characterized in that: face detection and human body posture detection are performed separately on the screened image frame sequence, faces being detected according to the geometric contour of the face and postures according to the sequence of human posture key points; the detected face and posture images are classified and preprocessed, the preprocessing comprising light compensation to enhance the image, a gray-level transform to reduce computation and storage, geometric correction to crop and align the face and posture images respectively, and filtering and sharpening to highlight edge detail and remove the influence of noise, in preparation for the feature extraction stage.
4. The face and human body posture emotion recognition method based on video images according to claim 1, characterized in that: face emotion features are extracted from the preprocessed face image sequence through a ConvLSTM neural network model; the feature information is extracted through the facial key points and key regions, and a face emotion recognition classification result is obtained through the softmax classification layer.
5. The face and human body posture emotion recognition method based on video images according to claim 1, characterized in that: human body posture emotion features are extracted from the preprocessed posture images through a ConvLSTM neural network model, and a posture emotion recognition classification result is obtained at the softmax classification layer.
6. The face and human body posture emotion recognition method based on video images according to claim 1, characterized in that: the emotion recognition result obtained from the face images and the result obtained from the human body posture are combined by weighted averaging to give an overall emotion recognition result, and the emotion recognition category is output.
7. The face and human body posture emotion recognition method based on video images according to claim 6, characterized in that: the sum of the weight of the face emotion recognition result and the weight of the human posture emotion recognition result is 100%, the weight of the face emotion recognition result is 20-80%, and the remainder is the weight of the human posture emotion recognition result.
8. The face and human body posture emotion recognition method based on video images according to claim 7, characterized in that: the weight of the face emotion recognition result is 40-60%, and the remainder is the weight of the human posture emotion recognition result.
CN202111285381.9A 2021-11-02 2021-11-02 Face and human body posture emotion recognition method based on video image Pending CN113920568A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111285381.9A CN113920568A (en) 2021-11-02 2021-11-02 Face and human body posture emotion recognition method based on video image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111285381.9A CN113920568A (en) 2021-11-02 2021-11-02 Face and human body posture emotion recognition method based on video image

Publications (1)

Publication Number Publication Date
CN113920568A (en) 2022-01-11

Family

ID=79244933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111285381.9A Pending CN113920568A (en) 2021-11-02 2021-11-02 Face and human body posture emotion recognition method based on video image

Country Status (1)

Country Link
CN (1) CN113920568A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114550088A (en) * 2022-02-22 2022-05-27 北京城建设计发展集团股份有限公司 Multi-camera fused passenger identification method and system and electronic equipment
CN115049016A (en) * 2022-07-20 2022-09-13 聚好看科技股份有限公司 Model driving method and device based on emotion recognition
CN117036877A (en) * 2023-07-18 2023-11-10 六合熙诚(北京)信息科技有限公司 Emotion recognition method and system for facial expression and gesture fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination