CN111145785A - Emotion recognition method and device based on voice - Google Patents
- Publication number
- CN111145785A (application No. CN201811285508.5A)
- Authority
- CN
- China
- Prior art keywords
- gaussian mixture
- mixture model
- anger
- voice
- emotion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Hospice & Palliative Care (AREA)
- Psychiatry (AREA)
- General Health & Medical Sciences (AREA)
- Child & Adolescent Psychology (AREA)
- Signal Processing (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Feedback Control In General (AREA)
Abstract
The invention discloses a speech-based emotion recognition method and device. The method comprises the following steps: 1) separately collecting speech data expressing the emotions of happiness, anger, and sadness; 2) applying a PCA algorithm to reduce the dimensionality of each emotional speech data set; 3) performing endpoint detection on the dimensionality-reduced data, extracting three feature parameters (Mel-frequency cepstral coefficients, formants, and zero-crossing rate), fitting Gaussian mixture models to these parameters so as to train one Gaussian mixture model per emotion, and building an emotional speech database consisting of the happiness-, anger-, and sadness-Gaussian mixture models; 4) collecting the speech segment to be recognized. The speech-based emotion recognition method achieves high recognition accuracy.
Description
Technical Field
The invention relates to a method and a device for emotion recognition based on voice.
Background
Emotion is a state that integrates human feelings, thoughts, and behaviors, and it plays an important role in interpersonal communication. It comprises a person's psychological response to external or internal stimuli, together with the physiological response that accompanies it, and it is ever-present in daily work and life. In medical care, if the emotional state of a patient, particularly a patient with an expression disorder, can be determined, care measures can be adapted to that state and the quality of care improved. In product development, if the emotional state of users while using a product can be identified, user experience can be understood, product functions improved, and products better suited to user needs designed. In human-machine interaction systems, interaction becomes friendlier and more natural if the system can recognize the emotional state of the user. The analysis and recognition of emotion is an important interdisciplinary research topic spanning neuroscience, psychology, cognitive science, computer science, and artificial intelligence, and emotion recognition methods are therefore valuable across these fields.
Disclosure of Invention
The invention aims to provide a speech-based emotion recognition method and device with high recognition accuracy.
To solve the above problems, the invention adopts the following technical solution:
a speech-based emotion recognition method includes the following steps:
1) separately collecting speech data expressing the emotions of happiness, anger, and sadness;
2) applying a PCA algorithm to reduce the dimensionality of each of the happiness, anger, and sadness emotional speech data sets;
3) performing endpoint detection on the dimensionality-reduced happiness, anger, and sadness speech data and extracting three feature parameters, namely Mel-frequency cepstral coefficients, formants, and zero-crossing rate; fitting Gaussian mixture models to these parameters so as to train one Gaussian mixture model for each emotion; and building an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model, and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing, and endpoint detection to the collected speech segment; extracting the same three feature parameters; fitting a contrast Gaussian mixture model to them; and matching it against the happiness-, anger-, and sadness-Gaussian mixture models in the emotional speech database;
6) when the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger-, or sadness-Gaussian mixture models in the database exceeds the set threshold, judging the speech segment to express the same emotion as that model.
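Steps 3) to 6) can be sketched as follows. This is a minimal illustration, assuming scikit-learn's `GaussianMixture` and synthetic stand-in feature frames in place of real MFCC/formant/zero-crossing features; the patent's "overlap rate" comparison is approximated here by picking the model with the highest average log-likelihood, the usual GMM classification rule, rather than the exact matching procedure described above.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def train_emotion_models(features_by_emotion, n_components=4):
    # One GMM per emotion, fitted on that emotion's feature frames
    models = {}
    for emotion, frames in features_by_emotion.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=0)
        gmm.fit(frames)                      # frames: (n_frames, n_features)
        models[emotion] = gmm
    return models

def classify(models, frames):
    # Average per-frame log-likelihood under each emotion model
    scores = {e: m.score(frames) for e, m in models.items()}
    return max(scores, key=scores.get)

# Synthetic stand-ins for happiness / anger / sadness feature frames
train = {
    "happiness": rng.normal(0.0, 1.0, (200, 14)),
    "anger":     rng.normal(4.0, 1.0, (200, 14)),
    "sadness":   rng.normal(-4.0, 1.0, (200, 14)),
}
models = train_emotion_models(train)
probe = rng.normal(4.0, 1.0, (50, 14))       # resembles the "anger" cluster
print(classify(models, probe))               # expected: anger
```

In this sketch the database of step 3) is simply the `models` dictionary; a real implementation would persist the fitted parameters.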
Preferably, the threshold is set in the range of 38% to 70%.
Preferably, the emotion recognition method further includes step 7): labeling the judged contrast Gaussian mixture model with the corresponding emotion of happiness, anger, or sadness, and updating the labeled contrast Gaussian mixture model into the emotional speech database.
Preferably, the duration of the speech segment is 2 to 6 s.
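The pre-emphasis preprocessing named in step 5) is conventionally a first-order high-pass filter, y[n] = x[n] - a*x[n-1]. The patent does not specify the coefficient, so the common default a = 0.97 below is an assumption.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    # y[n] = x[n] - alpha * x[n-1], with the first sample passed through
    signal = np.asarray(signal, dtype=float)
    return np.concatenate(([signal[0]], signal[1:] - alpha * signal[:-1]))

x = np.array([1.0, 1.0, 1.0, 1.0])
print(pre_emphasis(x))   # [1.   0.03 0.03 0.03]
```

The filter boosts high frequencies relative to low ones, which flattens the spectral tilt of speech before feature extraction.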
The invention also provides a speech-based emotion recognition device, which comprises
the sound collection module, used for collecting the speech segment to be recognized;
the audio processing module, used for reducing the dimensionality of the collected happiness, anger, and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion, and pre-emphasis preprocessing to the collected speech segment;
the data processing module, used for performing endpoint detection, extracting the three feature parameters of Mel-frequency cepstral coefficients, formants, and zero-crossing rate, fitting Gaussian mixture models to these parameters, and training one Gaussian mixture model for each of the happiness, anger, and sadness emotional speech sets;
and the database module is used for storing the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model.
Preferably, the audio processing module comprises an analog-to-digital converter, an audio output device, an anti-aliasing filter and a pre-emphasis circuit.
Preferably, the audio processing module and the database module are both physically connected with the data processing module.
The invention has the beneficial effects that: to meet the needs of speech-based recognition, three standard emotional speech databases are first established as recognition references. For each of the three emotions, the corresponding sound files are collected, feature parameters such as Mel-frequency cepstral coefficients, formants, and zero-crossing rate are extracted, and a Gaussian mixture model is built for each emotion group, so that the speech to be recognized can be compared against each model separately. Building separate models for happiness, anger, and sadness simplifies the process compared with conventional modeling, and the one-by-one comparison effectively improves recognition efficiency and accuracy, solving the technical problems of the complex processing, high implementation difficulty, low accuracy, and low efficiency of current speech emotion recognition.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a speech-based emotion recognition method of the present invention.
Fig. 2 is a connection block diagram of a speech-based emotion recognition apparatus according to the present invention.
In the figure:
1. a sound collection module; 2. an audio processing module; 3. a data processing module; 4. a database module; 5. an analog-to-digital converter; 6. an audio output device; 7. an anti-aliasing filter; 8. a pre-emphasis circuit.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Example 1
As shown in fig. 1, a speech-based emotion recognition method includes the steps of:
1) separately collecting speech data expressing the emotions of happiness, anger, and sadness;
2) applying a PCA algorithm to reduce the dimensionality of each of the happiness, anger, and sadness emotional speech data sets;
3) performing endpoint detection on the dimensionality-reduced happiness, anger, and sadness speech data and extracting three feature parameters, namely Mel-frequency cepstral coefficients, formants, and zero-crossing rate; fitting Gaussian mixture models to these parameters so as to train one Gaussian mixture model for each emotion; and building an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model, and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing, and endpoint detection to the collected speech segment; extracting the same three feature parameters; fitting a contrast Gaussian mixture model to them; and matching it against the happiness-, anger-, and sadness-Gaussian mixture models in the emotional speech database;
6) when the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger-, or sadness-Gaussian mixture models in the database exceeds the set threshold, judging the speech segment to express the same emotion as that model.
In the present embodiment, the set value of the threshold is 45%.
In this embodiment, the duration of the speech segment is 2 s.
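Of the three feature parameters, the zero-crossing rate is the simplest to compute: the fraction of adjacent sample pairs whose signs differ. A minimal sketch (per-frame windowing omitted, the 8 kHz sampling rate an illustrative assumption) is:

```python
import numpy as np

def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs with differing sign
    signs = np.sign(frame)
    signs[signs == 0] = 1          # treat exact zeros as positive
    return np.mean(signs[1:] != signs[:-1])

t = np.arange(0, 1.0, 1 / 8000.0)
tone = np.sin(2 * np.pi * 100 * t)     # 100 Hz tone sampled at 8 kHz
zcr = zero_crossing_rate(tone)
# A 100 Hz sine crosses zero ~200 times per second, i.e. ~200/8000 per sample
print(round(zcr, 3))                   # 0.025
```

Higher-energy fricatives and angry speech tend to raise this value, which is why it carries some emotional information alongside the spectral features.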
As shown in fig. 2, the embodiment further provides a speech-based emotion recognition apparatus, which includes
the sound collection module 1, used for collecting the speech segment to be recognized;
the audio processing module 2, used for reducing the dimensionality of the collected happiness, anger, and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion, and pre-emphasis preprocessing to the collected speech segment;
the data processing module 3, used for performing endpoint detection, extracting the three feature parameters of Mel-frequency cepstral coefficients, formants, and zero-crossing rate, fitting Gaussian mixture models to these parameters, and training one Gaussian mixture model for each of the happiness, anger, and sadness emotional speech sets;
and the database module 4 is used for storing a happiness-Gaussian mixture model, an anger-Gaussian mixture model and a sadness-Gaussian mixture model.
In this embodiment, the audio processing module 2 includes an analog-to-digital converter 5, an audio output device 6, an anti-aliasing filter 7, and a pre-emphasis circuit 8.
In this embodiment, the audio processing module 2 and the database module 4 are both physically connected to the data processing module 3.
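A software-level sketch of how the four modules of the apparatus could be wired together is shown below. Every name is an illustrative assumption, the "feature" is a toy short-time energy, and the hardware components (microphone, anti-aliasing filter, ADC) are stubbed with synthetic data.

```python
import numpy as np

def sound_collection():                      # sound collection module (1)
    # Stand-in for a 2 s clip captured at 8 kHz
    return np.random.default_rng(1).normal(size=16000)

def audio_processing(signal, alpha=0.97):    # audio processing module (2)
    # Pre-emphasis stands in for the filtering/conversion chain
    return np.concatenate(([signal[0]], signal[1:] - alpha * signal[:-1]))

def data_processing(signal, frame_len=160):  # data processing module (3)
    n = len(signal) // frame_len
    frames = signal[:n * frame_len].reshape(n, frame_len)
    return np.mean(frames ** 2, axis=1)      # toy feature: short-time energy

database = {}                                # database module (4)

clip = sound_collection()
features = data_processing(audio_processing(clip))
database["probe"] = features
print(features.shape)                        # (100,)
```

The point of the separation is the same as in the patent's block diagram: acquisition, signal conditioning, feature extraction/modeling, and storage each sit behind a narrow interface.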
Example 2
As shown in fig. 1, a speech-based emotion recognition method includes the steps of:
1) separately collecting speech data expressing the emotions of happiness, anger, and sadness;
2) applying a PCA algorithm to reduce the dimensionality of each of the happiness, anger, and sadness emotional speech data sets;
3) performing endpoint detection on the dimensionality-reduced happiness, anger, and sadness speech data and extracting three feature parameters, namely Mel-frequency cepstral coefficients, formants, and zero-crossing rate; fitting Gaussian mixture models to these parameters so as to train one Gaussian mixture model for each emotion; and building an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model, and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing, and endpoint detection to the collected speech segment; extracting the same three feature parameters; fitting a contrast Gaussian mixture model to them; and matching it against the happiness-, anger-, and sadness-Gaussian mixture models in the emotional speech database;
6) when the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger-, or sadness-Gaussian mixture models in the database exceeds the set threshold, judging the speech segment to express the same emotion as that model.
In the present embodiment, the set value of the threshold is 70%.
In this embodiment, the emotion recognition method further includes step 7): labeling the judged contrast Gaussian mixture model with the corresponding emotion of happiness, anger, or sadness, and updating the labeled contrast Gaussian mixture model into the emotional speech database.
In this embodiment, the duration of the speech segment is 6 s.
As shown in fig. 2, the embodiment further provides a speech-based emotion recognition apparatus, which includes
the sound collection module 1, used for collecting the speech segment to be recognized;
the audio processing module 2, used for reducing the dimensionality of the collected happiness, anger, and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion, and pre-emphasis preprocessing to the collected speech segment;
the data processing module 3, used for performing endpoint detection, extracting the three feature parameters of Mel-frequency cepstral coefficients, formants, and zero-crossing rate, fitting Gaussian mixture models to these parameters, and training one Gaussian mixture model for each of the happiness, anger, and sadness emotional speech sets;
and the database module 4 is used for storing a happiness-Gaussian mixture model, an anger-Gaussian mixture model and a sadness-Gaussian mixture model.
In this embodiment, the audio processing module 2 includes an analog-to-digital converter 5, an audio output device 6, an anti-aliasing filter 7, and a pre-emphasis circuit 8.
In this embodiment, the audio processing module 2 and the database module 4 are both physically connected to the data processing module 3.
Example 3
As shown in fig. 1, a speech-based emotion recognition method includes the steps of:
1) separately collecting speech data expressing the emotions of happiness, anger, and sadness;
2) applying a PCA algorithm to reduce the dimensionality of each of the happiness, anger, and sadness emotional speech data sets;
3) performing endpoint detection on the dimensionality-reduced happiness, anger, and sadness speech data and extracting three feature parameters, namely Mel-frequency cepstral coefficients, formants, and zero-crossing rate; fitting Gaussian mixture models to these parameters so as to train one Gaussian mixture model for each emotion; and building an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model, and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing, and endpoint detection to the collected speech segment; extracting the same three feature parameters; fitting a contrast Gaussian mixture model to them; and matching it against the happiness-, anger-, and sadness-Gaussian mixture models in the emotional speech database;
6) when the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger-, or sadness-Gaussian mixture models in the database exceeds the set threshold, judging the speech segment to express the same emotion as that model.
In the present embodiment, the set value of the threshold is 60%.
In this embodiment, the emotion recognition method further includes step 7): labeling the judged contrast Gaussian mixture model with the corresponding emotion of happiness, anger, or sadness, and updating the labeled contrast Gaussian mixture model into the emotional speech database.
In this embodiment, the duration of the speech segment is 5 s.
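The endpoint detection used in steps 3) and 5) is not specified further in the patent. A common short-time-energy approach, sketched here with an assumed frame length and threshold, finds the first and last frames whose energy exceeds a threshold and treats everything in between as speech:

```python
import numpy as np

def endpoint_detect(signal, frame_len=160, threshold=0.01):
    # Split into non-overlapping frames and compute mean-square energy
    n = len(signal) // frame_len
    frames = signal[:n * frame_len].reshape(n, frame_len)
    energy = np.mean(frames ** 2, axis=1)
    active = np.flatnonzero(energy > threshold)
    if active.size == 0:
        return None                       # no speech found
    # Sample indices of the first and one-past-last active frame
    return active[0] * frame_len, (active[-1] + 1) * frame_len

# Silence, a burst of signal, then silence again
sig = np.concatenate([np.zeros(800), 0.5 * np.ones(1600), np.zeros(800)])
print(endpoint_detect(sig))   # (800, 2400)
```

Production systems typically combine energy with the zero-crossing rate (already extracted by this method) to keep low-energy fricatives at utterance edges.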
As shown in fig. 2, this embodiment further provides a speech-based emotion recognition apparatus, which includes
the sound collection module 1, used for collecting the speech segment to be recognized;
the audio processing module 2, used for reducing the dimensionality of the collected happiness, anger, and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion, and pre-emphasis preprocessing to the collected speech segment;
the data processing module 3, used for performing endpoint detection, extracting the three feature parameters of Mel-frequency cepstral coefficients, formants, and zero-crossing rate, fitting Gaussian mixture models to these parameters, and training one Gaussian mixture model for each of the happiness, anger, and sadness emotional speech sets;
and the database module 4 is used for storing a happiness-Gaussian mixture model, an anger-Gaussian mixture model and a sadness-Gaussian mixture model.
In this embodiment, the audio processing module 2 includes an analog-to-digital converter 5, an audio output device 6, an anti-aliasing filter 7, and a pre-emphasis circuit 8.
In this embodiment, the audio processing module 2 and the database module 4 are both physically connected to the data processing module 3.
The invention has the beneficial effects that: to meet the needs of speech-based recognition, three standard emotional speech databases are first established as recognition references. For each of the three emotions, the corresponding sound files are collected, feature parameters such as Mel-frequency cepstral coefficients, formants, and zero-crossing rate are extracted, and a Gaussian mixture model is built for each emotion group, so that the speech to be recognized can be compared against each model separately. Building separate models for happiness, anger, and sadness simplifies the process compared with conventional modeling, and the one-by-one comparison effectively improves recognition efficiency and accuracy, solving the technical problems of the complex processing, high implementation difficulty, low accuracy, and low efficiency of current speech emotion recognition.
The above description covers only embodiments of the present invention, but the scope of the invention is not limited thereto; any change or substitution that can be conceived without inventive effort by those skilled in the art should be included within the scope of the present invention.
Claims (7)
1. A speech-based emotion recognition method is characterized by comprising the following steps:
1) separately collecting speech data expressing the emotions of happiness, anger, and sadness;
2) applying a PCA algorithm to reduce the dimensionality of each of the happiness, anger, and sadness emotional speech data sets;
3) performing endpoint detection on the dimensionality-reduced happiness, anger, and sadness speech data and extracting three feature parameters, namely Mel-frequency cepstral coefficients, formants, and zero-crossing rate; fitting Gaussian mixture models to these parameters so as to train one Gaussian mixture model for each emotion; and building an emotional speech database consisting of the happiness-Gaussian mixture model, the anger-Gaussian mixture model, and the sadness-Gaussian mixture model;
4) collecting the speech segment to be recognized;
5) applying anti-aliasing filtering, analog-to-digital conversion, pre-emphasis preprocessing, and endpoint detection to the collected speech segment; extracting the same three feature parameters; fitting a contrast Gaussian mixture model to them; and matching it against the happiness-, anger-, and sadness-Gaussian mixture models in the emotional speech database;
6) when the overlap rate between the contrast Gaussian mixture model and one of the happiness-, anger-, or sadness-Gaussian mixture models in the database exceeds the set threshold, judging the speech segment to express the same emotion as that model.
2. A speech-based emotion recognition method as claimed in claim 1, wherein: the threshold is set in the range of 38% to 70%.
3. A speech-based emotion recognition method as claimed in claim 2, wherein: the method further includes step 7): labeling the judged contrast Gaussian mixture model with the corresponding emotion of happiness, anger, or sadness, and updating the labeled model into the emotional speech database.
4. A speech-based emotion recognition method as claimed in claim 3, wherein: the duration of the speech segment is 2 to 6 s.
5. A speech-based emotion recognition apparatus, characterized in that: comprises that
the sound collection module, used for collecting the speech segment to be recognized;
the audio processing module, used for reducing the dimensionality of the collected happiness, anger, and sadness emotional speech data, and for applying anti-aliasing filtering, analog-to-digital conversion, and pre-emphasis preprocessing to the collected speech segment;
the data processing module, used for performing endpoint detection, extracting the three feature parameters of Mel-frequency cepstral coefficients, formants, and zero-crossing rate, fitting Gaussian mixture models to these parameters, and training one Gaussian mixture model for each of the happiness, anger, and sadness emotional speech sets;
and the database module is used for storing the happiness-Gaussian mixture model, the anger-Gaussian mixture model and the sadness-Gaussian mixture model.
6. A speech based emotion recognition apparatus as claimed in claim 5, wherein: the audio processing module comprises an analog-to-digital converter, an audio output device, an anti-aliasing filter and a pre-emphasis circuit.
7. A speech based emotion recognition apparatus as claimed in claim 6, wherein: and the audio processing module and the database module are both physically connected with the data processing module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811285508.5A CN111145785A (en) | 2018-11-02 | 2018-11-02 | Emotion recognition method and device based on voice |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811285508.5A CN111145785A (en) | 2018-11-02 | 2018-11-02 | Emotion recognition method and device based on voice |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111145785A true CN111145785A (en) | 2020-05-12 |
Family
ID=70515079
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811285508.5A Pending CN111145785A (en) | 2018-11-02 | 2018-11-02 | Emotion recognition method and device based on voice |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111145785A (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101419800A (en) * | 2008-11-25 | 2009-04-29 | 浙江大学 | Emotional speaker recognition method based on frequency spectrum translation |
CN101937678A (en) * | 2010-07-19 | 2011-01-05 | 东南大学 | Judgment-deniable automatic speech emotion recognition method for fidget |
CN102881284A (en) * | 2012-09-03 | 2013-01-16 | 江苏大学 | Unspecific human voice and emotion recognition method and system |
CN103544963A (en) * | 2013-11-07 | 2014-01-29 | 东南大学 | Voice emotion recognition method based on core semi-supervised discrimination and analysis |
CN103854645A (en) * | 2014-03-05 | 2014-06-11 | 东南大学 | Speech emotion recognition method based on punishment of speaker and independent of speaker |
CN105609116A (en) * | 2015-12-23 | 2016-05-25 | 东南大学 | Speech emotional dimensions region automatic recognition method |
CN105976809A (en) * | 2016-05-25 | 2016-09-28 | 中国地质大学(武汉) | Voice-and-facial-expression-based identification method and system for dual-modal emotion fusion |
CN106570496A (en) * | 2016-11-22 | 2017-04-19 | 上海智臻智能网络科技股份有限公司 | Emotion recognition method and device and intelligent interaction method and device |
CN107305773A (en) * | 2016-04-15 | 2017-10-31 | 美特科技(苏州)有限公司 | Voice mood discrimination method |
CN107393525A (en) * | 2017-07-24 | 2017-11-24 | 湖南大学 | A kind of fusion feature is assessed and the speech-emotion recognition method of multilayer perceptron |
CN107845390A (en) * | 2017-09-21 | 2018-03-27 | 太原理工大学 | A kind of Emotional speech recognition system based on PCNN sound spectrograph Fusion Features |
CN108053840A (en) * | 2017-12-29 | 2018-05-18 | 广州势必可赢网络科技有限公司 | Emotion recognition method and system based on PCA-BP |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200512 |