CN111145785A - Emotion recognition method and device based on voice


Info

Publication number
CN111145785A
Authority
CN
China
Prior art keywords
Gaussian mixture model
anger
voice
emotion
Prior art date
Legal status
Pending
Application number
CN201811285508.5A
Other languages
Chinese (zh)
Inventor
张冲
叶荣华
刘松
韦梁
Current Assignee
Guangzhou Lingpai Technology Co Ltd
Original Assignee
Guangzhou Lingpai Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Lingpai Technology Co Ltd
Priority to CN201811285508.5A
Publication of CN111145785A


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G10L25/63: Speech or voice analysis techniques specially adapted for estimating an emotional state

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Child & Adolescent Psychology (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses a speech-based emotion recognition method and device. The method comprises the following steps: 1) separately collecting human emotional speech data for happiness, anger and sadness; 2) applying the PCA algorithm to reduce the dimensionality of the happiness, anger and sadness speech data respectively; 3) performing endpoint detection on the dimension-reduced happiness, anger and sadness speech data, extracting three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building a Gaussian mixture model on these parameters, training one Gaussian mixture model for each of the three emotions, and establishing an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model; 4) collecting the speech segment to be recognized. The speech-based emotion recognition method has high recognition accuracy.

Description

Emotion recognition method and device based on voice
Technical Field
The invention relates to a method and a device for emotion recognition based on voice.
Background
Emotion is a state that integrates human feelings, thoughts and behaviors; it comprises a person's psychological response to external or internal stimulation, together with the physiological response that accompanies it, and it plays an important role in human communication. Emotion is ubiquitous in people's daily work and life. In medical care, if the emotional state of a patient, particularly a patient with an expression disorder, can be known, different care measures can be taken according to that emotion, improving the quality of care. In product development, if the emotional state of the user during product use can be identified, the user experience becomes known, product functions can be improved, and products better suited to user needs can be designed. In human-machine interaction systems, interaction becomes more friendly and natural if the system can recognize the emotional state of the human. The analysis and recognition of emotion is an important interdisciplinary research subject spanning neuroscience, psychology, cognitive science, computer science and artificial intelligence, so methods for recognizing emotion are needed across many fields.
Disclosure of Invention
The invention aims to provide a speech-based emotion recognition method and device with high recognition accuracy.
To solve the above problems, the invention adopts the following technical solution:
A speech-based emotion recognition method includes the following steps:
1) separately collecting human emotional speech data for happiness, anger and sadness;
2) applying the PCA algorithm to reduce the dimensionality of the happiness, anger and sadness speech data respectively;
3) performing endpoint detection on the dimension-reduced happiness, anger and sadness speech data, extracting three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building a Gaussian mixture model on these parameters, training one Gaussian mixture model for each of the three emotions, and establishing an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model (a training sketch follows this list);
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing and endpoint detection to the collected speech segment, extracting the same three characteristic parameters, building a contrast Gaussian mixture model on these parameters, and matching it against the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model in the emotional speech database respectively;
6) if the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger- and sadness-Gaussian mixture models in the emotional speech database exceeds the set threshold, judging the speech segment to carry the same emotion as that model (a matching sketch follows the preferred parameters below).
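The patent gives no implementation for steps 1) to 3); the following is a minimal sketch of the training stage, assuming Python with librosa and scikit-learn, a hypothetical data/<emotion>/*.wav layout, and an 8-component diagonal-covariance mixture. Formant extraction is deferred to a later sketch, and librosa's trim stands in for the endpoint detection named above.

```python
# Minimal sketch of the training stage (steps 1-3). librosa, scikit-learn,
# the data/<emotion>/*.wav layout and all model sizes are assumptions.
import glob
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def frame_features(path, sr=16000):
    """Per-frame MFCC + zero-crossing-rate features for one recording."""
    y, _ = librosa.load(path, sr=sr)
    y, _ = librosa.effects.trim(y, top_db=30)            # crude endpoint detection
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # (13, n_frames)
    zcr = librosa.feature.zero_crossing_rate(y)          # (1, n_frames)
    return np.vstack([mfcc, zcr]).T                      # (n_frames, 14)

emotion_models = {}
for emotion in ("happiness", "anger", "sadness"):
    feats = np.vstack([frame_features(f)
                       for f in glob.glob(f"data/{emotion}/*.wav")])
    emotion_models[emotion] = GaussianMixture(
        n_components=8, covariance_type="diag", random_state=0).fit(feats)
```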
Preferably, the threshold is set between 38% and 70%.
Preferably, the emotion recognition method further includes step 7): labeling the judged contrast Gaussian mixture model with the corresponding emotion (happiness, anger or sadness) and updating the labeled model into the emotional speech database.
Preferably, the duration of the speech segment is 2-6 s.
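Steps 5) and 6) decide by an "overlap rate" that the patent never defines. As a hedged stand-in, the sketch below scores the unknown segment's features under each stored Gaussian mixture model and softmax-normalises the average log-likelihoods into a pseudo overlap rate in [0, 1]; frame_features is the helper from the training sketch, and the 0.45 default matches the threshold of Example 1.

```python
# Sketch of steps 5-6 with an assumed stand-in for the undefined "overlap
# rate": softmax-normalised average log-likelihoods under each stored GMM.
import numpy as np

def recognise(path, emotion_models, threshold=0.45):
    feats = frame_features(path)        # reuse helper from the training sketch
    names = list(emotion_models)
    avg_loglik = np.array([emotion_models[n].score(feats) for n in names])
    probs = np.exp(avg_loglik - avg_loglik.max())
    probs /= probs.sum()                # pseudo "overlap rate" per emotion
    best = int(np.argmax(probs))
    return names[best] if probs[best] > threshold else None
```

With three candidate models a uniform score is about 33%, so thresholds in the claimed 38-70% range force a clearly dominant model before an emotion is reported.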
The invention also provides a speech-based emotion recognition device, which comprises:
a sound collection module for collecting the speech segment to be recognized;
an audio processing module for reducing the dimensionality of the collected happiness, anger and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion and pre-emphasis preprocessing to the collected speech segment;
a data processing module for performing endpoint detection, extracting the three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building Gaussian mixture models on these parameters, and training one Gaussian mixture model for each of the happiness, anger and sadness emotional speech data sets;
and a database module for storing the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model.
Preferably, the audio processing module comprises an analog-to-digital converter, an audio output device, an anti-aliasing filter and a pre-emphasis circuit.
Preferably, the audio processing module and the database module are both physically connected with the data processing module.
The invention has the beneficial effects that: to meet the needs of speech emotion recognition, three standard emotional speech databases are first established, setting the reference for recognition. Sound files are collected for the three emotions, characteristic parameters such as Mel-frequency cepstrum coefficients, formants and zero-crossing rate are extracted, and a Gaussian mixture model is built for each emotion group, so that the speech to be recognized can be compared against each emotion separately. Modeling happiness, anger and sadness separately simplifies the process compared with traditional modeling, and the one-by-one comparison effectively improves recognition efficiency, solving the technical problems of current speech emotion recognition: a complex processing flow, high implementation difficulty, low accuracy and low efficiency.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described here show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a speech-based emotion recognition method of the present invention.
Fig. 2 is a connection block diagram of a speech-based emotion recognition apparatus according to the present invention.
In the figure:
1. a sound collection module; 2. an audio processing module; 3. a data processing module; 4. a database module; 5. an analog-to-digital converter; 6. an audio output device; 7. an anti-aliasing filter; 8. a pre-emphasis circuit.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Example 1
As shown in fig. 1, a speech-based emotion recognition method includes the steps of:
1) separately collecting human emotional speech data for happiness, anger and sadness;
2) applying the PCA algorithm to reduce the dimensionality of the happiness, anger and sadness speech data respectively;
3) performing endpoint detection on the dimension-reduced happiness, anger and sadness speech data, extracting three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building a Gaussian mixture model on these parameters, training one Gaussian mixture model for each of the three emotions, and establishing an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing and endpoint detection to the collected speech segment, extracting the same three characteristic parameters, building a contrast Gaussian mixture model on these parameters, and matching it against the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model in the emotional speech database respectively;
6) if the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger- and sadness-Gaussian mixture models in the emotional speech database exceeds the set threshold, judging the speech segment to carry the same emotion as that model.
In this embodiment, the threshold is set to 45%.
In this embodiment, the duration of the speech segment is 2 s.
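Within the step-5 preprocessing chain, anti-aliasing filtering and analog-to-digital conversion are handled by the hardware modules described below, while pre-emphasis is commonly realised digitally as a first-order high-pass filter. A minimal sketch, assuming the conventional 0.97 coefficient (the patent does not fix one):

```python
# Pre-emphasis as y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is a common
# convention, not a value taken from the patent.
import numpy as np

def pre_emphasis(x, alpha=0.97):
    """Boost high frequencies of signal x before feature extraction."""
    return np.append(x[0], x[1:] - alpha * x[:-1])
```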
As shown in fig. 2, the embodiment further provides a speech-based emotion recognition device, which comprises:
a sound collection module 1 for collecting the speech segment to be recognized;
an audio processing module 2 for reducing the dimensionality of the collected happiness, anger and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion and pre-emphasis preprocessing to the collected speech segment;
a data processing module 3 for performing endpoint detection, extracting the three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building Gaussian mixture models on these parameters, and training one Gaussian mixture model for each of the happiness, anger and sadness emotional speech data sets;
and a database module 4 for storing the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model.
In this embodiment, the audio processing module 2 includes an analog-to-digital converter 5, an audio output device 6, an anti-aliasing filter 7, and a pre-emphasis circuit 8.
In this embodiment, the audio processing module 2 and the database module 4 are both physically connected to the data processing module 3.
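The embodiments name endpoint detection without prescribing an algorithm; a classic choice the data processing module 3 could use is the double-threshold test on short-time energy and zero-crossing rate, sketched here with illustrative 25 ms frames and 10 ms hops at 16 kHz, and assumed thresholds.

```python
# Possible endpoint detection via short-time energy and zero-crossing rate;
# frame sizes and thresholds are illustrative assumptions.
import numpy as np

def detect_endpoints(x, frame_len=400, hop=160,
                     energy_thresh=1e-4, zcr_thresh=0.3):
    """Return (start, end) sample indices of the speech region, or None."""
    voiced = []
    for i in range(0, len(x) - frame_len, hop):
        frame = x[i:i + frame_len]
        energy = float(np.mean(frame ** 2))
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
        voiced.append(energy > energy_thresh and zcr < zcr_thresh)
    hits = [k for k, v in enumerate(voiced) if v]
    if not hits:
        return None
    return hits[0] * hop, hits[-1] * hop + frame_len
```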
Example 2
As shown in fig. 1, a speech-based emotion recognition method includes the steps of:
1) separately collecting human emotional speech data for happiness, anger and sadness;
2) applying the PCA algorithm to reduce the dimensionality of the happiness, anger and sadness speech data respectively;
3) performing endpoint detection on the dimension-reduced happiness, anger and sadness speech data, extracting three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building a Gaussian mixture model on these parameters, training one Gaussian mixture model for each of the three emotions, and establishing an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing and endpoint detection to the collected speech segment, extracting the same three characteristic parameters, building a contrast Gaussian mixture model on these parameters, and matching it against the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model in the emotional speech database respectively;
6) if the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger- and sadness-Gaussian mixture models in the emotional speech database exceeds the set threshold, judging the speech segment to carry the same emotion as that model.
In this embodiment, the threshold is set to 70%.
In this embodiment, the emotion recognition method further includes step 7): labeling the judged contrast Gaussian mixture model with the corresponding emotion (happiness, anger or sadness) and updating the labeled model into the emotional speech database.
In this embodiment, the duration of the speech segment is 6 s.
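The PCA dimensionality reduction of step 2) could be realised with scikit-learn as below; retaining 95% of the variance is an illustrative choice, since the patent fixes no output dimension.

```python
# Sketch of step 2's PCA reduction; the 95% retained-variance target is an
# assumption, as the patent does not fix an output dimension.
from sklearn.decomposition import PCA

def reduce_dim(features, variance=0.95):
    """Fit PCA on (n_samples, n_features) data, keeping `variance` of it."""
    pca = PCA(n_components=variance)
    reduced = pca.fit_transform(features)
    return reduced, pca   # keep the fitted PCA to transform test data identically
```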
As shown in fig. 2, the embodiment further provides a speech-based emotion recognition device, which comprises:
a sound collection module 1 for collecting the speech segment to be recognized;
an audio processing module 2 for reducing the dimensionality of the collected happiness, anger and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion and pre-emphasis preprocessing to the collected speech segment;
a data processing module 3 for performing endpoint detection, extracting the three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building Gaussian mixture models on these parameters, and training one Gaussian mixture model for each of the happiness, anger and sadness emotional speech data sets;
and a database module 4 for storing the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model.
In this embodiment, the audio processing module 2 includes an analog-to-digital converter 5, an audio output device 6, an anti-aliasing filter 7, and a pre-emphasis circuit 8.
In this embodiment, the audio processing module 2 and the database module 4 are both physically connected to the data processing module 3.
Example 3
As shown in fig. 1, a speech-based emotion recognition method includes the steps of:
1) separately collecting human emotional speech data for happiness, anger and sadness;
2) applying the PCA algorithm to reduce the dimensionality of the happiness, anger and sadness speech data respectively;
3) performing endpoint detection on the dimension-reduced happiness, anger and sadness speech data, extracting three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building a Gaussian mixture model on these parameters, training one Gaussian mixture model for each of the three emotions, and establishing an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing and endpoint detection to the collected speech segment, extracting the same three characteristic parameters, building a contrast Gaussian mixture model on these parameters, and matching it against the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model in the emotional speech database respectively;
6) if the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger- and sadness-Gaussian mixture models in the emotional speech database exceeds the set threshold, judging the speech segment to carry the same emotion as that model.
In this embodiment, the threshold is set to 60%.
In this embodiment, the emotion recognition method further includes step 7): labeling the judged contrast Gaussian mixture model with the corresponding emotion (happiness, anger or sadness) and updating the labeled model into the emotional speech database.
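Step 7) is not detailed further; one possible realisation, assuming the emotional speech database is a pickled dict mapping each emotion label to a list of models, is:

```python
# Hypothetical realisation of step 7): tag the judged contrast GMM with its
# emotion and append it to a pickled database; the storage format is assumed.
import pickle

def update_database(db_path, contrast_gmm, emotion):
    with open(db_path, "rb") as fh:
        database = pickle.load(fh)        # assumed layout: {emotion: [gmm, ...]}
    database.setdefault(emotion, []).append(contrast_gmm)
    with open(db_path, "wb") as fh:
        pickle.dump(database, fh)
```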
In this embodiment, the duration of the speech segment is 5 s.
As shown in fig. 2, the embodiment further provides a speech-based emotion recognition device, which comprises:
a sound collection module 1 for collecting the speech segment to be recognized;
an audio processing module 2 for reducing the dimensionality of the collected happiness, anger and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion and pre-emphasis preprocessing to the collected speech segment;
a data processing module 3 for performing endpoint detection, extracting the three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building Gaussian mixture models on these parameters, and training one Gaussian mixture model for each of the happiness, anger and sadness emotional speech data sets;
and a database module 4 for storing the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model.
In this embodiment, the audio processing module 2 includes an analog-to-digital converter 5, an audio output device 6, an anti-aliasing filter 7, and a pre-emphasis circuit 8.
In this embodiment, the audio processing module 2 and the database module 4 are both physically connected to the data processing module 3.
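The earlier sketches deferred the formant parameter extracted by the data processing module 3. A standard estimate takes the angles of the complex roots of a linear-prediction polynomial; librosa's lpc and the order-12 model for 16 kHz speech are conventional assumptions, not values from the patent.

```python
# Formant estimation via LPC root-finding; model order and sample rate are
# conventional assumptions for 16 kHz speech, not values from the patent.
import numpy as np
import librosa

def formants(frame, sr=16000, order=12, n_formants=3):
    """Estimate the first few formant frequencies (Hz) of one speech frame."""
    a = librosa.lpc(frame.astype(np.float64), order=order)
    roots = [r for r in np.roots(a) if np.imag(r) > 0]   # keep upper half-plane
    freqs = sorted(np.angle(r) * sr / (2 * np.pi) for r in roots)
    return freqs[:n_formants]
```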
The above description covers only embodiments of the present invention, but the scope of the present invention is not limited thereto; any change or substitution that can be conceived without creative effort by those skilled in the art shall fall within the scope of the present invention.

Claims (7)

1. A speech-based emotion recognition method, characterized by comprising the following steps:
1) separately collecting human emotional speech data for happiness, anger and sadness;
2) applying the PCA algorithm to reduce the dimensionality of the happiness, anger and sadness speech data respectively;
3) performing endpoint detection on the dimension-reduced happiness, anger and sadness speech data, extracting three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building a Gaussian mixture model on these parameters, training one Gaussian mixture model for each of the three emotions, and establishing an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing and endpoint detection to the collected speech segment, extracting the same three characteristic parameters, building a contrast Gaussian mixture model on these parameters, and matching it against the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model in the emotional speech database respectively;
6) if the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger- and sadness-Gaussian mixture models in the emotional speech database exceeds the set threshold, judging the speech segment to carry the same emotion as that model.
2. A speech-based emotion recognition method as claimed in claim 1, wherein the threshold is set between 38% and 70%.
3. A speech-based emotion recognition method as claimed in claim 2, further comprising step 7): labeling the judged contrast Gaussian mixture model with the corresponding emotion (happiness, anger or sadness) and updating the labeled model into the emotional speech database.
4. A speech-based emotion recognition method as claimed in claim 3, wherein the duration of the speech segment is 2-6 s.
5. A speech-based emotion recognition device, characterized by comprising:
a sound collection module for collecting the speech segment to be recognized;
an audio processing module for reducing the dimensionality of the collected happiness, anger and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion and pre-emphasis preprocessing to the collected speech segment;
a data processing module for performing endpoint detection, extracting the three characteristic parameters (Mel-frequency cepstrum coefficients, formants and zero-crossing rate), building Gaussian mixture models on these parameters, and training one Gaussian mixture model for each of the happiness, anger and sadness emotional speech data sets;
and a database module for storing the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model.
6. A speech-based emotion recognition device as claimed in claim 5, wherein the audio processing module comprises an analog-to-digital converter, an audio output device, an anti-aliasing filter and a pre-emphasis circuit.
7. A speech-based emotion recognition device as claimed in claim 6, wherein the audio processing module and the database module are both physically connected with the data processing module.
CN201811285508.5A (priority date 2018-11-02, filed 2018-11-02): Emotion recognition method and device based on voice. Status: Pending. Published as CN111145785A.

Priority Applications (1)

CN201811285508.5A, priority date 2018-11-02, filing date 2018-11-02: Emotion recognition method and device based on voice

Publications (1)

Publication number: CN111145785A; publication date: 2020-05-12

Family ID: 70515079

Family application: CN201811285508.5A (pending): Emotion recognition method and device based on voice

Country: CN


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101419800A (en) * 2008-11-25 2009-04-29 浙江大学 Emotional speaker recognition method based on frequency spectrum translation
CN101937678A (en) * 2010-07-19 2011-01-05 东南大学 Judgment-deniable automatic speech emotion recognition method for fidget
CN102881284A (en) * 2012-09-03 2013-01-16 江苏大学 Unspecific human voice and emotion recognition method and system
CN103544963A (en) * 2013-11-07 2014-01-29 东南大学 Voice emotion recognition method based on core semi-supervised discrimination and analysis
CN103854645A (en) * 2014-03-05 2014-06-11 东南大学 Speech emotion recognition method based on punishment of speaker and independent of speaker
CN105609116A (en) * 2015-12-23 2016-05-25 东南大学 Speech emotional dimensions region automatic recognition method
CN107305773A (en) * 2016-04-15 2017-10-31 美特科技(苏州)有限公司 Voice mood discrimination method
CN105976809A (en) * 2016-05-25 2016-09-28 中国地质大学(武汉) Voice-and-facial-expression-based identification method and system for dual-modal emotion fusion
CN106570496A (en) * 2016-11-22 2017-04-19 上海智臻智能网络科技股份有限公司 Emotion recognition method and device and intelligent interaction method and device
CN107393525A (en) * 2017-07-24 2017-11-24 湖南大学 A kind of fusion feature is assessed and the speech-emotion recognition method of multilayer perceptron
CN107845390A (en) * 2017-09-21 2018-03-27 太原理工大学 A kind of Emotional speech recognition system based on PCNN sound spectrograph Fusion Features
CN108053840A (en) * 2017-12-29 2018-05-18 广州势必可赢网络科技有限公司 Emotion recognition method and system based on PCA-BP

Similar Documents

Publication Publication Date Title
Bertero et al. A first look into a convolutional neural network for speech emotion detection
CN108564942B (en) Voice emotion recognition method and system based on adjustable sensitivity
Ramakrishnan et al. Speech emotion recognition approaches in human computer interaction
CN103366618B (en) Scene device for Chinese learning training based on artificial intelligence and virtual reality
CN111462841B (en) Intelligent depression diagnosis device and system based on knowledge graph
Li et al. Speech emotion recognition using 1d cnn with no attention
CN113241096B (en) Emotion monitoring device and method
Kim et al. Emotion recognition using physiological and speech signal in short-term observation
Hema et al. Emotional speech recognition using cnn and deep learning techniques
Alghifari et al. On the use of voice activity detection in speech emotion recognition
CN112232127A (en) Intelligent speech training system and method
Baird et al. Emotion recognition in public speaking scenarios utilising an lstm-rnn approach with attention
CN109074809B (en) Information processing apparatus, information processing method, and computer-readable storage medium
Kabir et al. Procuring mfccs from crema-d dataset for sentiment analysis using deep learning models with hyperparameter tuning
MacIntyre et al. Pushing the envelope: Evaluating speech rhythm with different envelope extraction techniques
Hamidi et al. Emotion recognition from Persian speech with neural network
Subramanian et al. Audio emotion recognition by deep neural networks and machine learning algorithms
CN108766462B (en) Voice signal feature learning method based on Mel frequency spectrum first-order derivative
Jia et al. Two-level discriminative speech emotion recognition model with wave field dynamics: A personalized speech emotion recognition method
CN111145785A (en) Emotion recognition method and device based on voice
CN115757860A (en) Music emotion label generation method based on multi-mode fusion
Narain et al. Modeling real-world affective and communicative nonverbal vocalizations from minimally speaking individuals
Rheault et al. Multimodal techniques for the study of affect in political videos
CN111883178A (en) Double-channel voice-to-image-based emotion recognition method
Kexin et al. Research on Emergency Parking Instruction Recognition Based on Speech Recognition and Speech Emotion Recognition

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 2020-05-12)