CN116649980B - Emotion monitoring method, system, equipment and storage medium based on artificial intelligence


Info

Publication number
CN116649980B
CN116649980B (application number CN202310663752.5A)
Authority
CN
China
Prior art keywords
emotion
target object
heart rate
monitoring information
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310663752.5A
Other languages
Chinese (zh)
Other versions
CN116649980A (en)
Inventor
伍胡宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202310663752.5A
Publication of CN116649980A
Application granted
Publication of CN116649980B
Legal status: Active

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B5/0205: Simultaneously evaluating both cardiovascular conditions and different types of body conditions, e.g. heart and respiratory condition
    • A61B5/16: Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
    • A61B5/165: Evaluating the state of mind, e.g. depression, anxiety
    • A61B5/117: Identification of persons
    • A61B5/1171: Identification of persons based on the shapes or appearances of their bodies or parts thereof
    • A61B5/1176: Recognition of faces
    • A61B5/48: Other medical applications
    • A61B5/4803: Speech analysis specially adapted for diagnostic purposes
    • A61B5/72: Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235: Details of waveform analysis
    • A61B5/7264: Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267: Classification of physiological signals or data involving training the classification device
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation
    • G06V40/168: Feature extraction; Face representation
    • G06V40/172: Classification, e.g. identification
    • G06V40/174: Facial expression recognition
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Medical Informatics (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • Surgery (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Physiology (AREA)
  • Psychiatry (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Signal Processing (AREA)
  • Cardiology (AREA)
  • Evolutionary Computation (AREA)
  • Fuzzy Systems (AREA)
  • Hospice & Palliative Care (AREA)
  • Pulmonology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Child & Adolescent Psychology (AREA)
  • Developmental Disabilities (AREA)
  • Educational Technology (AREA)
  • Social Psychology (AREA)
  • Psychology (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of artificial intelligence, and specifically discloses an emotion monitoring method, system, equipment and storage medium based on artificial intelligence. The invention uses artificial intelligence technology to fuse multiple factors so as to judge the emotion condition of a target object efficiently, comprehensively and accurately, and generates emotion monitoring information, thereby realizing refined emotion monitoring of the target object.

Description

Emotion monitoring method, system, equipment and storage medium based on artificial intelligence
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to an emotion monitoring method, system, equipment and storage medium based on artificial intelligence.
Background
With the development of science and technology, artificial intelligence is gradually being applied to people's production and daily life, for example to recognize human emotions. Emotion refers to a strong affective state that arises subjectively and is usually accompanied by psychological changes. Negative emotions such as anger and sadness can harm people's physical and mental health, especially for the elderly and infants, for example by reducing cognitive ability, aggravating deteriorating physiological conditions and inducing disease. Monitoring and adjusting emotions in time can avoid the effects of negative emotions and is therefore beneficial to physical and mental health. Existing technical schemes that use artificial intelligence to monitor an individual's emotional state perform single-angle emotion analysis from either the individual's voice or the individual's images alone; the information elements on which the analysis and judgment rely are one-sided, the accuracy of the analysis and judgment still needs to be improved, only coarse emotion classifications can be recognized, and no more detailed emotion state evaluation is performed.
Disclosure of Invention
The invention aims to provide an emotion monitoring method, system, equipment and storage medium based on artificial intelligence to solve the problems existing in the prior art.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
in a first aspect, there is provided an artificial intelligence based emotion monitoring method comprising:
acquiring heart rate monitoring information, voice monitoring information and image monitoring information of a target object;
determining heart rate parameters of the target object according to the heart rate monitoring information;
performing audio preprocessing and audio feature extraction on the voice monitoring information to obtain voice spectrum features of a target object, inputting the voice spectrum features into a deep learning-based voice emotion recognition model to perform voice emotion recognition, and obtaining a first emotion recognition result;
image preprocessing is carried out on the image monitoring information to obtain a face image of the target object, face characteristic parameters are extracted from the face image of the target object, and the face characteristic parameters are input into a face emotion recognition model based on deep learning to carry out face emotion recognition, so that a second emotion recognition result is obtained;
determining a first emotion grade according to the first emotion recognition result, determining a second emotion grade according to the second emotion recognition result, and judging the emotion classification of the target object according to the heart rate parameter, the first emotion recognition result and the second emotion recognition result;
according to the emotion classification of the target object, a corresponding emotion evaluation calculation model is called, and heart rate parameters, a first emotion grade and a second emotion grade are substituted into the emotion evaluation calculation model to calculate, so that emotion evaluation scores of the target object are obtained;
and generating corresponding emotion monitoring information according to the emotion classification and emotion evaluation score of the target object, and sending the emotion monitoring information to the object terminal so as to enable the object terminal to visually display the emotion monitoring information.
In one possible design, the determining the heart rate parameter of the target object according to the heart rate monitoring information includes: and extracting a plurality of heart rate monitoring values monitored in a set time period from the heart rate monitoring information, and taking the average value, the median value or the maximum value of each heart rate monitoring value as the heart rate parameter of the target object.
In one possible design, the performing audio preprocessing and audio feature extraction on the voice monitoring information to obtain a voice spectrum feature of the target object includes:
noise reduction processing is carried out on the voice monitoring information to obtain the voice information of the target object;
pre-emphasis processing is carried out on the voice information through a high-pass filter, so that the voice information after pre-emphasis is obtained;
carrying out framing treatment on the pre-emphasized voice information to obtain a plurality of signal frames;
windowing is carried out on each signal frame, and each windowed signal frame is obtained;
performing fast Fourier transform processing on each windowed signal frame to obtain a frequency spectrum parameter corresponding to each windowed signal frame;
and importing each spectrum parameter into a Mel filter bank for operation processing, carrying out logarithmic operation and discrete cosine transform processing on the output parameters of the Mel filter bank to obtain the Mel cepstrum coefficients, and taking the Mel cepstrum coefficients as the voice spectrum features of the target object.
In one possible design, the image preprocessing is performed on the image monitoring information to obtain a face image of the target object, and the extracting the face feature parameter from the face image of the target object includes:
extracting an initial face image from the image monitoring information, and carrying out gray scale normalization processing, image denoising processing and image enhancement processing on the initial face image to obtain a face image of a target object;
and detecting the facial feature points of the facial image of the target object by adopting a Dlib facial detection method to obtain facial feature parameters.
In one possible design, the speech emotion recognition model is obtained by training a long short-term memory (LSTM) network model through a first training set, wherein the first training set comprises a plurality of Mel cepstrum coefficient training samples marked with corresponding emotion classification and emotion level labels; the face emotion recognition model is obtained by training a convolutional neural network model through a second training set, and the second training set comprises a plurality of facial feature parameter training samples marked with corresponding emotion classification and emotion grade labels.
In one possible design, the first emotion recognition result includes a first emotion classification and a first emotion level, the second emotion recognition result includes a second emotion classification and a second emotion level, the first emotion level is determined according to the first emotion recognition result, the second emotion level is determined according to the second emotion recognition result, and the emotion classification of the target object is determined according to the heart rate parameter, the first emotion recognition result and the second emotion recognition result, including:
extracting a first emotion grade and a first emotion classification from the first emotion recognition result, and extracting a second emotion grade and a second emotion classification from the second emotion recognition result;
and carrying out emotion judgment by adopting a preset emotion judgment rule and based on the heart rate parameter, the first emotion classification and the second emotion classification to obtain the emotion classification of the target object.
In one possible design, the emotion evaluation calculation model is a formula (not reproduced in this text) in which A represents the first emotion level, B represents the second emotion level, H represents the heart rate parameter, i represents the emotion classification number, P_i represents the emotion evaluation score of the target object under emotion classification i, D_i represents the reference emotion score of the target object set under emotion classification i, α_i represents the first emotion level coefficient set under emotion classification i, β_i represents the second emotion level coefficient set under emotion classification i, λ_i represents the heart rate coefficient set under emotion classification i, q represents a set constant, and S represents the standard heart rate of the target object.
In a second aspect, an artificial intelligence based emotion monitoring system is provided, comprising an acquisition unit, a determination unit, a first recognition unit, a second recognition unit, a decision unit, a calculation unit and a pushing unit, wherein:
the acquisition unit is used for acquiring heart rate monitoring information, voice monitoring information and image monitoring information of the target object;
the determining unit is used for determining heart rate parameters of the target object according to the heart rate monitoring information;
the first recognition unit is used for carrying out audio preprocessing and audio feature extraction on the voice monitoring information to obtain voice spectrum features of the target object, inputting the voice spectrum features into a deep learning-based voice emotion recognition model to carry out voice emotion recognition to obtain a first emotion recognition result;
the second recognition unit is used for carrying out image preprocessing on the image monitoring information to obtain a face image of the target object, extracting face characteristic parameters from the face image of the target object, inputting the face characteristic parameters into a face emotion recognition model based on deep learning to carry out face emotion recognition, and obtaining a second emotion recognition result;
the judging unit is used for determining a first emotion grade according to the first emotion recognition result, determining a second emotion grade according to the second emotion recognition result and judging the emotion classification of the target object according to the heart rate parameter, the first emotion recognition result and the second emotion recognition result;
the computing unit is used for calling a corresponding emotion evaluation computing model according to the emotion classification of the target object, substituting the heart rate parameter, the first emotion level and the second emotion level into the emotion evaluation computing model for computing, and obtaining an emotion evaluation score of the target object;
and the pushing unit is used for generating corresponding emotion monitoring information according to the emotion classification and emotion evaluation score of the target object and sending the emotion monitoring information to the object terminal so as to enable the object terminal to visually display the emotion monitoring information.
In a third aspect, there is provided an artificial intelligence based emotion monitoring device comprising:
a memory for storing instructions;
and a processor for reading the instructions stored in the memory and executing the method according to any one of the above first aspects according to the instructions.
In a fourth aspect, there is provided a computer readable storage medium having instructions stored thereon which, when run on a computer, cause the computer to perform the method of any of the first aspects. Also provided is a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of any of the first aspects.
Beneficial effects: the method obtains heart rate monitoring information, voice monitoring information and image monitoring information of a target object and determines a heart rate parameter, performs voice emotion recognition using the voice monitoring information to obtain a first emotion recognition result, performs facial emotion recognition using the image monitoring information to obtain a second emotion recognition result, then fuses the heart rate parameter, the first emotion recognition result and the second emotion recognition result to determine the emotion classification and emotion evaluation score of the target object, and finally generates corresponding emotion monitoring information according to the emotion classification and emotion evaluation score of the target object and pushes it to the object terminal, so that the emotion condition of the target object can be monitored. Through artificial intelligence technology, the invention realizes voice and facial emotion recognition of the target object, and performs emotion classification and emotion scoring based on the heart rate parameter and the voice and facial emotion recognition results, so that multiple factors are fused to judge the emotion condition of the target object efficiently, comprehensively and accurately and refined emotion monitoring information is generated.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of steps of a method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a system according to an embodiment of the present invention;
fig. 3 is a schematic diagram of the apparatus according to an embodiment of the present invention.
Detailed Description
It should be noted that the description of these examples is for aiding in understanding the present invention, but is not intended to limit the present invention. Specific structural and functional details disclosed herein are merely representative of example embodiments of the invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein.
It should be appreciated that the terms first, second, etc. are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance. Although the terms first, second, etc. may be used herein to describe various features, these features should not be limited by these terms. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments of the present invention.
In the following description, specific details are provided to provide a thorough understanding of example embodiments. However, it will be understood by those of ordinary skill in the art that the example embodiments may be practiced without these specific details. For example, a system may be shown in block diagrams in order to avoid obscuring the examples with unnecessary detail. In other embodiments, well-known processes, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.
Example 1:
the embodiment provides an emotion monitoring method based on artificial intelligence, which can be applied to a corresponding monitoring server, as shown in fig. 1, and comprises the following steps:
s1, heart rate monitoring information, voice monitoring information and image monitoring information of a target object are obtained.
In specific implementation, to monitor the emotion of the target object, heart rate monitoring information, voice monitoring information and image monitoring information of the target object need to be acquired first. The heart rate monitoring information comprises the heart rate monitoring results of the target object over a period of time and can be collected by a smart watch or smart band worn on the target object's wrist. The voice monitoring information comprises the target object's speech over a period of time and can likewise be collected by a smart watch or smart band worn on the wrist; after the speech is collected, semantic desensitization processing can be applied so that only the audio feature information of the voice is retained, protecting the target object's everyday privacy. The image monitoring information comprises face images of the target object and can be collected by a monitoring camera or by the smart watch worn on the target object's wrist.
S2, determining heart rate parameters of the target object according to the heart rate monitoring information.
In specific implementation, after the heart rate monitoring information of the target object is obtained, the heart rate parameter of the target object can be determined according to the heart rate monitoring information, for example, a plurality of heart rate monitoring values monitored in a set period of time can be extracted from the heart rate monitoring information, and an average value, a median value or a maximum value of each heart rate monitoring value is taken as the heart rate parameter of the target object.
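As a minimal illustration of this step (the function name and the example readings are assumptions; the patent only specifies taking the mean, median or maximum over a set time period), the heart rate parameter could be computed as follows:

    import statistics

    def heart_rate_parameter(hr_values, mode="mean"):
        # hr_values: heart rate readings monitored within the set time period
        if mode == "mean":
            return statistics.mean(hr_values)
        if mode == "median":
            return statistics.median(hr_values)
        if mode == "max":
            return max(hr_values)
        raise ValueError("mode must be 'mean', 'median' or 'max'")

    # Example: readings sampled over a 60-second window
    # heart_rate_parameter([72, 75, 74, 78, 80], mode="median")  -> 75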
S3, performing audio preprocessing and audio feature extraction on the voice monitoring information to obtain voice spectrum features of the target object, and inputting the voice spectrum features into a deep learning-based voice emotion recognition model to perform voice emotion recognition to obtain a first emotion recognition result.
In the implementation, after the voice monitoring information of the target object is obtained, audio preprocessing and audio feature extraction are required to be performed on the voice monitoring information to obtain the voice spectrum feature of the target object, and the method comprises the following steps:
and carrying out noise reduction processing on the voice monitoring information to obtain the voice information of the target object so as to eliminate the interference influence of environmental noise.
Pre-emphasis processing is performed on the voice information through a high-pass filter to obtain the pre-emphasized voice information. Pre-emphasis boosts the high-frequency part and flattens the spectrum of the signal, compensating for the high-frequency components suppressed by the vocal cords and lips during sound production, so that the high-frequency formants are highlighted.
Framing is performed on the pre-emphasized voice information to obtain a plurality of signal frames.
Windowing is performed on each signal frame to obtain the windowed signal frames. Windowing multiplies each signal frame by a corresponding window function, such as a rectangular, Hamming or Hanning window, to reduce the signal discontinuities that may appear at the two ends of each frame.
A fast Fourier transform is performed on each windowed signal frame to obtain the spectrum parameters corresponding to each windowed signal frame. Because the characteristics of a signal are difficult to observe from its time-domain waveform, the signal is converted into an energy distribution in the frequency domain, where different energy distributions represent the characteristics of different speech sounds; the fast Fourier transform of each windowed signal frame yields the spectrum of each frame.
Each spectrum parameter is fed into a Mel filter bank for processing, a logarithmic operation and a discrete cosine transform are applied to the filter bank outputs to obtain the Mel cepstrum coefficients (MFCCs), and the Mel cepstrum coefficients are taken as the voice spectrum features of the target object. Mel cepstrum coefficients are stable, give a high recognition rate and effectively characterize human speech.
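A minimal sketch of this pipeline is given below; the frame length, hop size, FFT size and filter count are illustrative assumptions, and librosa is used here only to build the Mel filter bank:

    import numpy as np
    import librosa
    from scipy.fftpack import dct

    def mfcc_features(signal, sr=16000, frame_len=0.025, frame_step=0.01,
                      n_fft=512, n_mels=26, n_ceps=13, pre_emph=0.97):
        # Pre-emphasis: first-order high-pass filter that boosts high frequencies
        emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])
        # Framing into overlapping short-time frames
        flen, fstep = int(frame_len * sr), int(frame_step * sr)
        n_frames = 1 + max(0, (len(emphasized) - flen) // fstep)
        frames = np.stack([emphasized[i * fstep:i * fstep + flen] for i in range(n_frames)])
        # Windowing (Hamming) to suppress discontinuities at the frame edges
        frames = frames * np.hamming(flen)
        # Fast Fourier transform -> power spectrum of each frame
        power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
        # Mel filter bank, logarithm and discrete cosine transform -> MFCCs
        fbank = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=n_mels)
        mel_energy = np.maximum(power @ fbank.T, 1e-10)
        return dct(np.log(mel_energy), type=2, axis=1, norm="ortho")[:, :n_ceps]

The resulting matrix of shape (number of frames, number of coefficients) is the voice spectrum feature passed to the speech emotion recognition model.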
After the above processing yields the voice spectrum features of the target object, the voice spectrum features can be input into a deep learning-based speech emotion recognition model to perform speech emotion recognition and obtain a first emotion recognition result. The speech emotion recognition model can be obtained by training a long short-term memory (LSTM) network model with a first training set, where the first training set comprises a plurality of Mel cepstrum coefficient training samples labelled with the corresponding emotion classification and emotion level; the emotion classifications can include surprise, fear, disgust, anger, happiness, sadness and the like, and the emotion levels can be set according to the actual situation. After training with the first training set, the speech emotion recognition model outputs, for input voice spectrum features, a corresponding first emotion recognition result comprising a first emotion classification and a first emotion level.
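As a hedged illustration of such a model (the TensorFlow/Keras framework, the layer sizes and the two-headed classification/level output are assumptions not fixed by the description):

    import tensorflow as tf

    def build_speech_emotion_model(n_frames, n_ceps, n_classes, n_levels):
        # Input: a sequence of MFCC vectors, one per signal frame
        inputs = tf.keras.Input(shape=(n_frames, n_ceps))
        x = tf.keras.layers.LSTM(128, return_sequences=True)(inputs)
        x = tf.keras.layers.LSTM(64)(x)
        x = tf.keras.layers.Dropout(0.3)(x)
        # Two heads: emotion classification and emotion level
        class_out = tf.keras.layers.Dense(n_classes, activation="softmax", name="emotion_class")(x)
        level_out = tf.keras.layers.Dense(n_levels, activation="softmax", name="emotion_level")(x)
        model = tf.keras.Model(inputs, [class_out, level_out])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model

    # Trained on the first training set: MFCC samples labelled with emotion class and level, e.g.
    # model.fit(x_train, {"emotion_class": y_class, "emotion_level": y_level}, epochs=30)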
S4, performing image preprocessing on the image monitoring information to obtain a face image of the target object, extracting face characteristic parameters from the face image of the target object, and inputting the face characteristic parameters into a face emotion recognition model based on deep learning to perform face emotion recognition to obtain a second emotion recognition result.
In the specific implementation, after the image monitoring information of the target object is obtained, the image monitoring information is required to be subjected to image preprocessing to obtain a face image of the target object, and then the face characteristic parameters are extracted from the face image of the target object, and the process comprises the following steps:
and extracting an initial face image from the image monitoring information, and carrying out gray scale normalization processing, image denoising processing and image enhancement processing on the initial face image to obtain the face image of the target object. The gray scale normalization processing of the initial face image can effectively remove the influence of light on the subsequent face image recognition, and a histogram equalization, histogram specification or gray scale mean variance normalization method can be adopted. The image denoising can be finished by adopting a Gaussian denoising method, so that the face image is clearer. Image enhancement can be accomplished using wavelet transform methods: decomposing a face image into components with different sizes, positions and directions; then amplifying the components to be emphasized, and reducing unnecessary components; and finally, obtaining the enhanced face image by utilizing wavelet inversion.
Facial feature points are detected on the face image of the target object using the Dlib face detection method to obtain the facial feature parameters. The Dlib library packages the corresponding machine learning algorithms and enables accurate face detection and recognition. Facial feature points generally include points on the eyebrows, eyes, mouth, nose and chin; 68 facial feature points can be collected from a face image with the face detection model provided by Dlib, and the relationships among these feature points are determined to form the facial feature parameters.
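For illustration, the preprocessing and 68-point landmark extraction might look as follows; the predictor file name refers to dlib's publicly distributed 68-landmark model, which must be downloaded separately, and the OpenCV preprocessing choices are assumptions:

    import cv2
    import dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    def facial_feature_points(image_path):
        gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        if gray is None:
            raise FileNotFoundError(image_path)
        gray = cv2.equalizeHist(gray)             # grayscale normalization (histogram equalization)
        gray = cv2.GaussianBlur(gray, (3, 3), 0)  # Gaussian denoising
        faces = detector(gray, 1)
        if not faces:
            return None
        shape = predictor(gray, faces[0])
        # 68 (x, y) landmarks covering eyebrows, eyes, nose, mouth and jaw line
        return [(shape.part(i).x, shape.part(i).y) for i in range(68)]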
After the facial feature parameters are obtained, they are input into a deep learning-based facial emotion recognition model to perform facial emotion recognition and obtain a second emotion recognition result. The facial emotion recognition model is obtained by training a convolutional neural network model with a second training set, where the second training set comprises a plurality of facial feature parameter training samples labelled with the corresponding emotion classification and emotion level; the emotion classifications can include surprise, fear, disgust, anger, happiness, sadness and the like, and the emotion levels can be set according to the actual situation. After training with the second training set, the facial emotion recognition model outputs, for input facial feature parameters, a corresponding second emotion recognition result comprising a second emotion classification and a second emotion level.
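A similar sketch for the facial emotion recognition model, treating the 68 landmark coordinates as a short one-dimensional sequence for the convolutional layers (an assumption; the description does not fix the network layout):

    import tensorflow as tf

    def build_face_emotion_model(n_classes, n_levels):
        # Input: 68 facial landmarks, each an (x, y) coordinate pair
        inputs = tf.keras.Input(shape=(68, 2))
        x = tf.keras.layers.Conv1D(32, 3, activation="relu")(inputs)
        x = tf.keras.layers.Conv1D(64, 3, activation="relu")(x)
        x = tf.keras.layers.GlobalAveragePooling1D()(x)
        x = tf.keras.layers.Dense(64, activation="relu")(x)
        class_out = tf.keras.layers.Dense(n_classes, activation="softmax", name="emotion_class")(x)
        level_out = tf.keras.layers.Dense(n_levels, activation="softmax", name="emotion_level")(x)
        model = tf.keras.Model(inputs, [class_out, level_out])
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
        return model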
S5, determining a first emotion grade according to the first emotion recognition result, determining a second emotion grade according to the second emotion recognition result, and judging the emotion classification of the target object according to the heart rate parameter, the first emotion recognition result and the second emotion recognition result.
In a specific implementation, after the first emotion recognition result and the second emotion recognition result are obtained, the first emotion level and first emotion classification are extracted from the first emotion recognition result, and the second emotion level and second emotion classification are extracted from the second emotion recognition result. Emotion judgment is then performed using a preset emotion judgment rule based on the heart rate parameter, the first emotion classification and the second emotion classification to obtain the emotion classification of the target object. For example, the rule may be configured to determine the heart rate interval in which the heart rate parameter falls, compare it against a preset heart-rate-interval to emotion-classification reference table, and combine the result with the first and second emotion classifications to decide the emotion classification of the target object.
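A rule-based sketch under assumed interval bounds is shown below; the actual heart-rate-interval to emotion-classification reference table and the conflict-resolution policy are configuration choices not fixed by the description:

    # Hypothetical reference table: heart rate interval (bpm) -> plausible emotion classifications
    HR_REFERENCE = [
        ((0, 75),   {"calm", "sadness"}),
        ((75, 95),  {"happiness", "surprise"}),
        ((95, 250), {"anger", "fear"}),
    ]

    def judge_emotion_classification(hr, first_class, second_class):
        # If speech and facial recognition agree, take their common classification
        if first_class == second_class:
            return first_class
        # Otherwise prefer the candidate consistent with the heart rate interval
        for (lo, hi), classes in HR_REFERENCE:
            if lo <= hr < hi:
                if first_class in classes:
                    return first_class
                if second_class in classes:
                    return second_class
        # Fall back to the facial result when the table does not resolve the conflict
        return second_class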
S6, according to the emotion classification of the target object, a corresponding emotion evaluation calculation model is called, and heart rate parameters, first emotion grades and second emotion grades are substituted into the emotion evaluation calculation model to calculate, so that emotion evaluation scores of the target object are obtained.
In specific implementation, after determining the emotion classification of the target object, a corresponding emotion evaluation calculation model is called according to the emotion classification of the target object, and the heart rate parameter, the first emotion level and the second emotion level are substituted into the emotion evaluation calculation model to obtain the emotion evaluation score of the target object. The emotion evaluation calculation model can be comprehensively expressed as a formula (not reproduced in this text) in which A represents the first emotion level, B represents the second emotion level, H represents the heart rate parameter, i represents the emotion classification number, P_i represents the emotion evaluation score of the target object under emotion classification i, D_i represents the reference emotion score of the target object set under emotion classification i, α_i represents the first emotion level coefficient set under emotion classification i, β_i represents the second emotion level coefficient set under emotion classification i, λ_i represents the heart rate coefficient set under emotion classification i, q represents a set constant, and S represents the standard heart rate of the target object.
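Since the formula itself is not reproduced in this text, the following sketch is purely an assumed linear combination consistent with the listed symbols, not the patent's actual model; all parameter values are hypothetical:

    # Hypothetical per-classification parameters: D_i, alpha_i, beta_i, lambda_i
    EVAL_PARAMS = {
        "anger":   {"D": 60.0, "alpha": 5.0, "beta": 5.0, "lam": 0.5},
        "sadness": {"D": 55.0, "alpha": 4.0, "beta": 4.0, "lam": 0.4},
    }

    def emotion_score(i, A, B, H, S=70.0, q=10.0):
        # Assumed form: reference score plus weighted emotion levels plus a
        # heart-rate deviation term normalised by the set constant q
        p = EVAL_PARAMS[i]
        return p["D"] + p["alpha"] * A + p["beta"] * B + p["lam"] * (H - S) / q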
S7, generating corresponding emotion monitoring information according to the emotion classification and emotion evaluation score of the target object, and sending the emotion monitoring information to the object terminal so that the object terminal can visually display the emotion monitoring information.
In specific implementation, after the emotion evaluation score of the target object is calculated, corresponding emotion monitoring information can be generated according to the emotion classification and emotion evaluation score of the target object. For example, a corresponding emotion expression is matched and retrieved from an expression library according to the emotion classification, a corresponding prompt message is matched and retrieved from a prompt information library according to the emotion evaluation score, and the emotion classification, emotion evaluation score, emotion expression and prompt message are assembled into a corresponding emotion monitoring message template to obtain the emotion monitoring information. Finally, the emotion monitoring information is sent to the object terminal, which visually displays it so that the people at the object terminal can learn the emotion condition of the target object.
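A small sketch of assembling the pushed message (the expression library, prompt threshold and template wording are all assumptions):

    EXPRESSION_LIBRARY = {"anger": "[angry]", "sadness": "[sad]", "happiness": "[happy]"}

    def build_emotion_monitoring_message(emotion_class, score):
        expression = EXPRESSION_LIBRARY.get(emotion_class, "")
        if score >= 80:
            prompt = "The emotion is intense; please pay attention to the subject in time."
        else:
            prompt = "The emotion is within the normal range."
        # Fill the emotion monitoring message template for display on the object terminal
        return f"{expression} Current emotion: {emotion_class}, evaluation score: {score:.1f}. {prompt}"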
Example 2:
the embodiment provides an emotion monitoring system based on artificial intelligence, as shown in fig. 2, including an acquisition unit, a determination unit, a first recognition unit, a second recognition unit, a determination unit, a calculation unit and a pushing unit, wherein:
the acquisition unit is used for acquiring heart rate monitoring information, voice monitoring information and image monitoring information of the target object;
the determining unit is used for determining heart rate parameters of the target object according to the heart rate monitoring information;
the first recognition unit is used for carrying out audio preprocessing and audio feature extraction on the voice monitoring information to obtain voice spectrum features of the target object, inputting the voice spectrum features into a deep learning-based voice emotion recognition model to carry out voice emotion recognition to obtain a first emotion recognition result;
the second recognition unit is used for carrying out image preprocessing on the image monitoring information to obtain a face image of the target object, extracting face characteristic parameters from the face image of the target object, inputting the face characteristic parameters into a face emotion recognition model based on deep learning to carry out face emotion recognition, and obtaining a second emotion recognition result;
the judging unit is used for determining a first emotion grade according to the first emotion recognition result, determining a second emotion grade according to the second emotion recognition result and judging the emotion classification of the target object according to the heart rate parameter, the first emotion recognition result and the second emotion recognition result;
the computing unit is used for calling a corresponding emotion evaluation computing model according to the emotion classification of the target object, substituting the heart rate parameter, the first emotion level and the second emotion level into the emotion evaluation computing model for computing, and obtaining an emotion evaluation score of the target object;
and the pushing unit is used for generating corresponding emotion monitoring information according to the emotion classification and emotion evaluation score of the target object and sending the emotion monitoring information to the object terminal so as to enable the object terminal to visually display the emotion monitoring information.
Example 3:
this embodiment provides an artificial intelligence based emotion monitoring device, as shown in fig. 3, including, at a hardware level:
a data interface for establishing data connections between the processor and the corresponding information acquisition end and between the processor and the object terminal;
a memory for storing instructions;
and a processor for reading the instructions stored in the memory and executing the artificial intelligence based emotion monitoring method of embodiment 1 according to the instructions.
Optionally, the device further comprises an internal bus. The processor and memory and data interfaces may be interconnected by an internal bus, which may be an ISA (Industry Standard Architecture ) bus, a PCI (Peripheral Component Interconnect, peripheral component interconnect standard) bus, or an EISA (Extended Industry Standard Architecture ) bus, among others. The buses may be classified as address buses, data buses, control buses, etc.
The Memory may include, but is not limited to, random access Memory (Random Access Memory, RAM), read Only Memory (ROM), flash Memory (Flash Memory), first-in first-out Memory (First Input First Output, FIFO), and/or first-in last-out Memory (First In Last Out, FILO), etc. The processor may be a general-purpose processor including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but also digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
Example 4:
the present embodiment provides a computer-readable storage medium having instructions stored thereon that, when executed on a computer, cause the computer to perform the artificial intelligence based emotion monitoring method of embodiment 1. The computer readable storage medium refers to a carrier for storing data, and may include, but is not limited to, a floppy disk, an optical disk, a hard disk, a flash Memory, and/or a Memory Stick (Memory Stick), etc., where the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable system.
This embodiment also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the artificial intelligence based emotion monitoring method of embodiment 1. Wherein the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable system.
Finally, it should be noted that: the foregoing description is only of the preferred embodiments of the invention and is not intended to limit the scope of the invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. An artificial intelligence based emotion monitoring method, comprising:
acquiring heart rate monitoring information, voice monitoring information and image monitoring information of a target object;
determining heart rate parameters of a target object according to heart rate monitoring information, wherein the heart rate parameters comprise extracting a plurality of heart rate monitoring values monitored in a set time period from the heart rate monitoring information, and taking an average value, a median value or a maximum value of each heart rate monitoring value as the heart rate parameters of the target object;
performing audio preprocessing and audio feature extraction on the voice monitoring information to obtain voice spectrum features of a target object, inputting the voice spectrum features into a deep learning-based voice emotion recognition model to perform voice emotion recognition, and obtaining a first emotion recognition result;
image preprocessing is carried out on the image monitoring information to obtain a face image of the target object, face characteristic parameters are extracted from the face image of the target object, and the face characteristic parameters are input into a face emotion recognition model based on deep learning to carry out face emotion recognition, so that a second emotion recognition result is obtained;
determining a first emotion grade according to the first emotion recognition result, determining a second emotion grade according to the second emotion recognition result, and judging the emotion classification of the target object according to the heart rate parameter, the first emotion recognition result and the second emotion recognition result;
according to the emotion classification of the target object, a corresponding emotion evaluation calculation model is called, heart rate parameters, a first emotion grade and a second emotion grade are substituted into the emotion evaluation calculation model for calculation, so that the emotion evaluation score of the target object is obtained, and the emotion evaluation calculation model is a formula (not reproduced in this text) wherein A represents the first emotion level, B represents the second emotion level, H represents the heart rate parameter, i represents the emotion classification number, P_i represents the emotion evaluation score of the target object under emotion classification i, D_i represents the reference emotion score of the target object set under emotion classification i, α_i represents the first emotion level coefficient set under emotion classification i, β_i represents the second emotion level coefficient set under emotion classification i, λ_i represents the heart rate coefficient set under emotion classification i, q represents a set constant, and S represents the standard heart rate of the target object;
and generating corresponding emotion monitoring information according to the emotion classification and emotion evaluation score of the target object, and sending the emotion monitoring information to the object terminal so as to enable the object terminal to visually display the emotion monitoring information.
2. The emotion monitoring method based on artificial intelligence according to claim 1, wherein the audio preprocessing and audio feature extraction are performed on the speech monitoring information to obtain speech spectrum features of the target object, and the method comprises:
noise reduction processing is carried out on the voice monitoring information to obtain the voice information of the target object;
pre-emphasis processing is carried out on the voice information through a high-pass filter, so that the voice information after pre-emphasis is obtained;
carrying out framing treatment on the pre-emphasized voice information to obtain a plurality of signal frames;
windowing is carried out on each signal frame, and each windowed signal frame is obtained;
performing fast Fourier transform processing on each windowed signal frame to obtain a frequency spectrum parameter corresponding to each windowed signal frame;
and importing each spectrum parameter into a Mel filter bank for operation processing, carrying out logarithmic operation and discrete cosine transform processing on the output parameters of the Mel filter bank to obtain the Mel cepstrum coefficients, and taking the Mel cepstrum coefficients as the voice spectrum features of the target object.
3. The emotion monitoring method based on artificial intelligence according to claim 1, wherein the image preprocessing is performed on the image monitoring information to obtain a face image of the target object, and extracting the facial feature parameters from the face image of the target object comprises:
extracting an initial face image from the image monitoring information, and carrying out gray scale normalization processing, image denoising processing and image enhancement processing on the initial face image to obtain a face image of a target object;
and detecting the facial feature points of the facial image of the target object by adopting a Dlib facial detection method to obtain facial feature parameters.
4. The emotion monitoring method based on artificial intelligence according to claim 1, wherein the speech emotion recognition model is obtained by training a long short-term memory network model through a first training set, and the first training set comprises a plurality of Mel cepstrum coefficient training samples marked with corresponding emotion classification and emotion level labels; the face emotion recognition model is obtained by training a convolutional neural network model through a second training set, and the second training set comprises a plurality of facial feature parameter training samples marked with corresponding emotion classification and emotion grade labels.
5. The artificial intelligence based emotion monitoring method of claim 4, wherein the first emotion recognition result includes a first emotion classification and a first emotion level, the second emotion recognition result includes a second emotion classification and a second emotion level, the first emotion level is determined according to the first emotion recognition result, the second emotion level is determined according to the second emotion recognition result, and the emotion classification of the target object is determined according to the heart rate parameter, the first emotion recognition result, and the second emotion recognition result, comprising:
extracting a first emotion grade and a first emotion classification from the first emotion recognition result, and extracting a second emotion grade and a second emotion classification from the second emotion recognition result;
and carrying out emotion judgment by adopting a preset emotion judgment rule and based on the heart rate parameter, the first emotion classification and the second emotion classification to obtain the emotion classification of the target object.
6. The emotion monitoring system based on artificial intelligence is characterized by comprising an acquisition unit, a determination unit, a first identification unit, a second identification unit, a judgment unit, a calculation unit and a pushing unit, wherein:
the acquisition unit is used for acquiring heart rate monitoring information, voice monitoring information and image monitoring information of the target object;
the determining unit is used for determining the heart rate parameter of the target object according to the heart rate monitoring information, and comprises the steps of extracting a plurality of heart rate monitoring values monitored in a set time period from the heart rate monitoring information, and taking the average value, the median value or the maximum value of each heart rate monitoring value as the heart rate parameter of the target object;
the first recognition unit is used for carrying out audio preprocessing and audio feature extraction on the voice monitoring information to obtain voice spectrum features of the target object, inputting the voice spectrum features into a deep learning-based voice emotion recognition model to carry out voice emotion recognition to obtain a first emotion recognition result;
the second recognition unit is used for carrying out image preprocessing on the image monitoring information to obtain a face image of the target object, extracting face characteristic parameters from the face image of the target object, inputting the face characteristic parameters into a face emotion recognition model based on deep learning to carry out face emotion recognition, and obtaining a second emotion recognition result;
the judging unit is used for determining a first emotion grade according to the first emotion recognition result, determining a second emotion grade according to the second emotion recognition result and judging the emotion classification of the target object according to the heart rate parameter, the first emotion recognition result and the second emotion recognition result;
a computing unit, configured to invoke a corresponding emotion evaluation computation model according to the emotion classification of the target object, and substitute the heart rate parameter, the first emotion level and the second emotion level into the emotion evaluation computation model to perform computation, so as to obtain an emotion evaluation score of the target object, where the emotion evaluation computation model is a formula (not reproduced in this text) in which A represents the first emotion level, B represents the second emotion level, H represents the heart rate parameter, i represents the emotion classification number, P_i represents the emotion evaluation score of the target object under emotion classification i, D_i represents the reference emotion score of the target object set under emotion classification i, α_i represents the first emotion level coefficient set under emotion classification i, β_i represents the second emotion level coefficient set under emotion classification i, λ_i represents the heart rate coefficient set under emotion classification i, q represents a set constant, and S represents the standard heart rate of the target object;
and the pushing unit is used for generating corresponding emotion monitoring information according to the emotion classification and emotion evaluation score of the target object and sending the emotion monitoring information to the object terminal so as to enable the object terminal to visually display the emotion monitoring information.
7. An artificial intelligence based emotion monitoring device, comprising:
a memory for storing instructions;
a processor for reading instructions stored in said memory and performing the method according to any one of claims 1-5 in accordance with the instructions.
8. A computer readable storage medium having instructions stored thereon which, when run on a computer, cause the computer to perform the method of any of claims 1-5.
CN202310663752.5A 2023-06-06 2023-06-06 Emotion monitoring method, system, equipment and storage medium based on artificial intelligence Active CN116649980B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310663752.5A CN116649980B (en) 2023-06-06 2023-06-06 Emotion monitoring method, system, equipment and storage medium based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310663752.5A CN116649980B (en) 2023-06-06 2023-06-06 Emotion monitoring method, system, equipment and storage medium based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN116649980A CN116649980A (en) 2023-08-29
CN116649980B true CN116649980B (en) 2024-03-26

Family

ID=87714947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310663752.5A Active CN116649980B (en) 2023-06-06 2023-06-06 Emotion monitoring method, system, equipment and storage medium based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN116649980B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2015200496A1 (en) * 2010-08-31 2015-02-19 Forbes Consulting Group, Llc Methods and systems for assessing psychological characteristics
CN107066514A (en) * 2017-01-23 2017-08-18 深圳亲友科技有限公司 The Emotion identification method and system of the elderly
CN108307037A (en) * 2017-12-15 2018-07-20 努比亚技术有限公司 Terminal control method, terminal and computer readable storage medium
CN112329431A (en) * 2019-08-01 2021-02-05 中国移动通信集团上海有限公司 Audio and video data processing method and device and storage medium
CN112085420A (en) * 2020-09-29 2020-12-15 中国银行股份有限公司 Emotion level determination method, device and equipment
CN112509561A (en) * 2020-12-03 2021-03-16 中国联合网络通信集团有限公司 Emotion recognition method, device, equipment and computer readable storage medium
CN115953225A (en) * 2023-02-20 2023-04-11 湖南喜宝达信息科技有限公司 Commodity recommendation method and device based on user emotion

Also Published As

Publication number Publication date
CN116649980A (en) 2023-08-29

Similar Documents

Publication Publication Date Title
WO2020173133A1 (en) Training method of emotion recognition model, emotion recognition method, device, apparatus, and storage medium
Davletcharova et al. Detection and analysis of emotion from speech signals
US9123342B2 (en) Method of recognizing gender or age of a speaker according to speech emotion or arousal
Asgari et al. Inferring clinical depression from speech and spoken utterances
Prasomphan Improvement of speech emotion recognition with neural network classifier by using speech spectrogram
JP2017156854A (en) Speech semantic analysis program, apparatus and method for improving comprehension accuracy of context semantic through emotion classification
Muckenhirn et al. Understanding and Visualizing Raw Waveform-Based CNNs.
CN107577991B (en) Follow-up data processing method and device, storage medium and computer equipment
CN116563829A (en) Driver emotion recognition method and device, electronic equipment and storage medium
CN110717410A (en) Voice emotion and facial expression bimodal recognition system
JP2018169506A (en) Conversation satisfaction degree estimation device, voice processing device and conversation satisfaction degree estimation method
WO2020000523A1 (en) Signal processing method and apparatus
Alghifari et al. On the use of voice activity detection in speech emotion recognition
Warule et al. Significance of voiced and unvoiced speech segments for the detection of common cold
Kuang et al. Simplified inverse filter tracked affective acoustic signals classification incorporating deep convolutional neural networks
Kothalkar et al. Automatic screening to detect’at risk’child speech samples using a clinical group verification framework
CN116649980B (en) Emotion monitoring method, system, equipment and storage medium based on artificial intelligence
CN116048282B (en) Data processing method, system, device, equipment and storage medium
Dubey et al. Sinusoidal model-based hypernasality detection in cleft palate speech using CVCV sequence
Loizou An automated integrated speech and face imageanalysis system for the identification of human emotions
Radha et al. Automated detection and severity assessment of dysarthria using raw speech
CN112699236B (en) Deepfake detection method based on emotion recognition and pupil size calculation
Nasir et al. Complexity in Prosody: A Nonlinear Dynamical Systems Approach for Dyadic Conversations; Behavior and Outcomes in Couples Therapy.
CN114299925A (en) Method and system for obtaining importance measurement index of dysphagia symptom of Parkinson disease patient based on voice
Ardiana et al. Gender Classification Based Speaker’s Voice using YIN Algorithm and MFCC

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant