CN107545905A - Emotion recognition method based on sound characteristics - Google Patents
Emotion recognition method based on sound characteristics
- Publication number
- CN107545905A CN107545905A CN201710720391.8A CN201710720391A CN107545905A CN 107545905 A CN107545905 A CN 107545905A CN 201710720391 A CN201710720391 A CN 201710720391A CN 107545905 A CN107545905 A CN 107545905A
- Authority
- CN
- China
- Prior art keywords
- voice signal
- emotion recognition
- emotion
- word
- recognition result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention provides an emotion recognition method based on sound characteristics, comprising: a sound recording module reads a voice signal; a sound pre-processing module identifies the language of the voice signal that was read and splits the signal sentence by sentence, obtaining a pre-processed, language-tagged voice signal; an acoustic processing module extracts speech feature parameters from the pre-processed voice signal, computed by a preset method chosen according to its language tag; an emotion processing module obtains an emotion recognition result for each sentence from the language tag and the extracted speech feature parameters, the emotion recognition result being expressed as probabilities; an emotion post-processing module collects the emotion recognition result of every sentence of the voice signal, adjusts the results in a preset way, and obtains the emotion recognition result of the voice signal. The method provided by the invention can improve the accuracy of emotion recognition.
Description
Technical field
The present invention relates to the field of emotion recognition, and in particular to an emotion recognition method based on sound characteristics.
Background art
With the development of science and technology, natural language processing, and speech recognition in particular, has been applied in more and more industries, such as mobile phone voice assistants and self-service voice systems. In these services, improving the ability to recognize the emotion in speech is an important way to improve service quality.
In other fields, adding an emotion recognition function to a voice communication tool can help both parties of a call understand each other's mood in time and promote communication. In distance teaching, recognizing the learner's emotion lets the teacher or the system adjust the teaching method and pace, or give emotional guidance, when the learner shows anxiety or dissatisfaction caused by difficult or incomprehensible material. In voice navigation, when the emotion recognition function detects that the driver is emotionally unstable, the system can issue a reminder or automatically adjust driving parameters to prevent accidents.
Existing emotion recognition methods based on sound characteristics, whether they use the vector-separable Mahalanobis distance discriminant method, principal component analysis, neural networks, or hidden Markov models, usually consider only the content of a single utterance. Moreover, because of differences between language cultures, they tend to perform emotion recognition for a single language only. As a result, the accuracy of speech emotion recognition is not high enough.
Summary of the invention
To solve the above problems, the invention provides an emotion recognition method based on sound characteristics that can improve the accuracy of emotion recognition.
An embodiment of the invention provides an emotion recognition method based on sound characteristics, comprising:
A sound recording module reads a voice signal;
A sound pre-processing module identifies the language of the voice signal that was read and splits the signal sentence by sentence, obtaining a pre-processed, language-tagged voice signal;
An acoustic processing module extracts speech feature parameters from the pre-processed voice signal, computed by a preset method chosen according to its language tag;
An emotion processing module obtains an emotion recognition result for each sentence from the language tag and the extracted speech feature parameters of the pre-processed voice signal, the emotion recognition result being expressed as probabilities;
An emotion post-processing module collects the emotion recognition result of every sentence of the voice signal, adjusts the results in a preset way, and obtains the emotion recognition result of the voice signal.
Preferably, the speech features further include: prosodic features, including rising tone, falling tone, accent, and stress.
Preferably, the per-sentence emotion recognition result is computed using principal component analysis, the Gaussian mixture model method, or a hidden Markov model.
Preferably, the per-sentence emotion recognition result is obtained as follows: the distance between each sentence and each emotion is computed with the vector-separable Mahalanobis distance discriminant method; by a preset method, the distance values are converted into probability values such that a smaller distance yields a larger probability and all probabilities sum to 1; these probabilities are taken as the sentence's emotion recognition result.
Preferably, adjusting the emotion recognition results in a preset way includes: computing, by a first formula, the joint probability after some adjustment of the emotion recognition results, and applying the adjustment with the highest joint probability. The first formula is:
P = K(θ) · α^(n−i) · (1−α)^i
where K(θ) is the probability value corresponding to the number of emotions contained in the voice signal, obtained from samples by statistics and preset as a monotonically decreasing function; θ is the number of emotions the voice signal contains; α is the accuracy of the per-sentence emotion recognition; n is the number of sentences in the voice signal; and i is the number of sentences whose emotion recognition result is adjusted.
Preferably, the emotion recognition method based on sound characteristics further includes: a speech-to-text module reads the voice signal and converts it into text; a text emotion recognition module segments the converted text into words and looks them up in an emotion word database, the emotion word database storing the words corresponding to each emotion; when the number of words in the text corresponding to a certain emotion exceeds a preset threshold while the numbers of words corresponding to the other emotions are below the threshold, the emotion of the voice signal is identified as that emotion.
Preferably, the sound recording module reading the voice signal includes: the sound recording module reads the voice signal and checks its length; when the length exceeds a preset threshold, the sound recording module segments the voice signal so that the length of each segment does not exceed the threshold.
The emotion recognition method based on sound characteristics provided by the invention can improve the accuracy of emotion recognition.
Other features and advantages of the invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained by the structures particularly pointed out in the written description, the claims, and the accompanying drawings.
The technical solution of the invention is described in further detail below through the drawings and embodiments.
Brief description of the drawings
The drawings provide a further understanding of the invention and form a part of the specification. Together with the embodiments, they serve to explain the invention and are not to be construed as limiting it. In the drawings:
Fig. 1 is a flowchart of an emotion recognition method based on sound characteristics in an embodiment of the invention.
Detailed description of the embodiments
The preferred embodiments of the invention are described below with reference to the drawings. It should be understood that the preferred embodiments described here only illustrate and explain the invention and are not intended to limit it.
An embodiment of the invention provides an emotion recognition method based on sound characteristics which, as shown in Fig. 1, includes:
A sound recording module reads a voice signal;
A sound pre-processing module identifies the language of the voice signal that was read and splits the signal sentence by sentence, obtaining a pre-processed, language-tagged voice signal;
An acoustic processing module extracts speech feature parameters from the pre-processed voice signal, computed by a preset method chosen according to its language tag;
An emotion processing module obtains an emotion recognition result for each sentence from the language tag and the extracted speech feature parameters of the pre-processed voice signal, the emotion recognition result being expressed as probabilities;
An emotion post-processing module collects the emotion recognition result of every sentence of the voice signal, adjusts the results in a preset way, and obtains the emotion recognition result of the voice signal.
By identifying the language of the voice signal in advance, sound characteristics specific to that language can be used to recognize the emotion in the sound, which increases the accuracy of emotion recognition on the voice signal. Meanwhile, the emotion post-processing module treats the emotion of a whole stretch of sound as one entity, further improving the accuracy of emotion recognition on the voice signal.
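As a rough sketch of the module pipeline described above (the patent gives no implementation; every function and class name here is a hypothetical placeholder), the data flow can be expressed as:

```python
# Hypothetical sketch of the patent's module pipeline. The callables
# (detect_language, split_sentences, ...) stand in for the sound
# pre-processing, acoustic processing, emotion processing, and emotion
# post-processing modules; none of them are specified by the patent.
from dataclasses import dataclass, field

@dataclass
class Sentence:
    samples: list                               # raw audio for one sentence
    features: dict = field(default_factory=dict)
    emotion_probs: dict = field(default_factory=dict)

def recognize_emotion(signal, detect_language, split_sentences,
                      extract_features, classify, post_process):
    lang = detect_language(signal)              # language tag for the whole signal
    sentences = [Sentence(s) for s in split_sentences(signal)]
    for s in sentences:
        # feature extraction method is selected by the language tag
        s.features = extract_features(s.samples, lang)
        # per-sentence result, expressed as emotion -> probability
        s.emotion_probs = classify(s.features, lang)
    # post-processing adjusts the per-sentence results as a whole
    return post_process([s.emotion_probs for s in sentences])
```

The point of the structure is that the language tag travels with the signal, so both feature extraction and classification can be language-specific.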
In one embodiment of the invention, the speech features further include prosodic features: rising tone, falling tone, accent, and stress. Incorporating prosodic features makes the speech features more complete, which makes higher-accuracy emotion recognition easier to achieve.
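As an illustration of one prosodic cue, a least-squares slope over a sentence's pitch contour can crudely separate rising from falling tone. The patent does not specify how prosodic features are computed, so this sketch and its threshold are purely assumptions:

```python
def tone_direction(pitch_contour, eps=1.0):
    """Classify a sentence-level pitch contour (in Hz) as rising, falling,
    or level using a least-squares slope. A toy stand-in for real prosodic
    analysis; eps (Hz per frame) is an arbitrary illustrative threshold."""
    n = len(pitch_contour)
    if n < 2:
        return "level"
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(pitch_contour) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, pitch_contour))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den                      # Hz per frame
    if slope > eps:
        return "rising"
    if slope < -eps:
        return "falling"
    return "level"
```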
In one embodiment of the invention, the per-sentence emotion recognition result is computed using principal component analysis, the Gaussian mixture model method, or a hidden Markov model. These methods directly yield an emotion recognition result expressed as probabilities, which is convenient for the emotion post-processing module to treat the emotion of a whole stretch of sound as one entity, increasing the accuracy of emotion recognition on the voice signal.
In one embodiment of the invention, the per-sentence emotion recognition result is obtained as follows: the distance between each sentence and each emotion is computed with the vector-separable Mahalanobis distance discriminant method; by a preset method, the distance values are converted into probability values such that a smaller distance yields a larger probability and all probabilities sum to 1; these probabilities are taken as the sentence's emotion recognition result.
This method also yields an emotion recognition result expressed as probabilities, which is convenient for the emotion post-processing module to treat the emotion of a whole stretch of sound as one entity, increasing the accuracy of emotion recognition on the voice signal.
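The patent leaves the distance-to-probability conversion as an unspecified "preset method". One plausible choice that satisfies both stated constraints (smaller distance gives larger probability; all probabilities sum to 1) is a softmax over negative distances; the softmax form and the beta scale are illustrative assumptions, not the patent's method:

```python
import math

def distances_to_probs(distances, beta=1.0):
    """Convert per-emotion distances to probabilities via softmax(-beta * d),
    so that smaller distances map to larger probabilities and the resulting
    values sum to 1. `distances` maps emotion name -> Mahalanobis distance."""
    weights = {emo: math.exp(-beta * d) for emo, d in distances.items()}
    total = sum(weights.values())
    return {emo: w / total for emo, w in weights.items()}
```

Any monotonically decreasing normalized mapping (e.g. inverse-distance weights) would meet the same two constraints.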
In one embodiment of the invention, adjusting the emotion recognition results in a preset way includes: computing, by a first formula, the joint probability after some adjustment of the emotion recognition results, and applying the adjustment with the highest joint probability. The first formula is:
P = K(θ) · α^(n−i) · (1−α)^i
where K(θ) is the probability value corresponding to the number of emotions contained in the voice signal, obtained from samples by statistics and preset as a monotonically decreasing function; θ is the number of emotions the voice signal contains; α is the accuracy of the per-sentence emotion recognition; n is the number of sentences in the voice signal; and i is the number of sentences whose emotion recognition result is adjusted.
The emotion post-processing module weighs the probability of an emotional change within a stretch of sound against the probability that a per-sentence emotion recognition result is wrong, and recognizes the emotion of the whole stretch of sound as one entity, further improving the accuracy of emotion recognition on the voice signal.
In one embodiment of the invention, the emotion recognition method based on sound characteristics further includes: a speech-to-text module reads the voice signal and converts it into text; a text emotion recognition module segments the converted text into words and looks them up in an emotion word database, the emotion word database storing the words corresponding to each emotion; when the number of words in the text corresponding to a certain emotion exceeds a preset threshold while the numbers of words corresponding to the other emotions are below the threshold, the emotion of the voice signal is identified as that emotion.
By converting the sound to text, more accurate emotion recognition can be achieved when the words show an obvious emotional tendency.
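The text branch above can be sketched as a lexicon count with the double threshold condition (one emotion above the threshold, all others below it). The lexicon contents and threshold value are placeholders, not data from the patent:

```python
def text_emotion(words, emotion_lexicon, threshold=3):
    """Return an emotion only when its word count exceeds `threshold`
    while every other emotion's count stays below it, as the text branch
    requires. `emotion_lexicon` maps emotion name -> set of words."""
    counts = {emo: sum(1 for w in words if w in vocab)
              for emo, vocab in emotion_lexicon.items()}
    above = [emo for emo, c in counts.items() if c > threshold]
    below = [emo for emo, c in counts.items() if c < threshold]
    if len(above) == 1 and len(below) == len(counts) - 1:
        return above[0]
    return None   # no decisive textual evidence; fall back to the acoustic result
```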
In one embodiment of the invention, the sound recording module reading the voice signal includes: the sound recording module reads the voice signal and checks its length; when the length exceeds a preset threshold, the sound recording module segments the voice signal so that the length of each segment does not exceed the threshold.
Limiting the length of each segment of sound bounds the amount of computation the emotion post-processing module spends recognizing the emotion of a stretch of sound as one entity, improving the speed of emotion recognition on the voice signal.
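A minimal sketch of the length check and segmentation, assuming length is measured in samples; the patent does not specify the unit or where segment boundaries fall:

```python
def segment_signal(samples, max_len):
    """Split a recording into consecutive chunks of at most max_len samples,
    matching the sound recording module's length check. Cutting at fixed
    sample positions (rather than at pauses) is a simplification."""
    return [samples[i:i + max_len] for i in range(0, len(samples), max_len)]
```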
The emotion recognition method based on sound characteristics provided by the invention can improve the accuracy of emotion recognition.
The invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a particular way, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them.
Claims (7)
- 1. An emotion recognition method based on sound characteristics, characterized by comprising: a sound recording module reads a voice signal; a sound pre-processing module identifies the language of the voice signal that was read and splits the signal sentence by sentence, obtaining a pre-processed, language-tagged voice signal; an acoustic processing module extracts speech feature parameters from the pre-processed voice signal, computed by a preset method chosen according to its language tag; an emotion processing module obtains an emotion recognition result for each sentence from the language tag and the extracted speech feature parameters of the pre-processed voice signal, the emotion recognition result being expressed as probabilities; an emotion post-processing module collects the emotion recognition result of every sentence of the voice signal, adjusts the results in a preset way, and obtains the emotion recognition result of the voice signal.
- 2. The method of claim 1, characterized in that the speech features further include: prosodic features, including rising tone, falling tone, accent, and stress.
- 3. The method of claim 1, characterized in that the per-sentence emotion recognition result is computed using principal component analysis, the Gaussian mixture model method, or a hidden Markov model.
- 4. The method of claim 1, characterized in that the per-sentence emotion recognition result is obtained as follows: the distance between each sentence and each emotion is computed with the vector-separable Mahalanobis distance discriminant method; by a preset method, the distance values are converted into probability values such that a smaller distance yields a larger probability and all probabilities sum to 1; these probabilities are taken as the sentence's emotion recognition result.
- 5. The method of claim 1, characterized in that adjusting the emotion recognition results in a preset way includes: computing, by a first formula, the joint probability after some adjustment of the emotion recognition results, and applying the adjustment with the highest joint probability, the first formula being P = K(θ) · α^(n−i) · (1−α)^i, where K(θ) is the probability value corresponding to the number of emotions contained in the voice signal, obtained from samples by statistics and preset as a monotonically decreasing function; θ is the number of emotions the voice signal contains; α is the accuracy of the per-sentence emotion recognition; n is the number of sentences in the voice signal; and i is the number of sentences whose emotion recognition result is adjusted.
- 6. The method of claim 1, characterized by further comprising: a speech-to-text module reads the voice signal and converts it into text; a text emotion recognition module segments the converted text into words and looks them up in an emotion word database, the emotion word database storing the words corresponding to each emotion; when the number of words in the text corresponding to a certain emotion exceeds a preset threshold while the numbers of words corresponding to the other emotions are below the threshold, the emotion of the voice signal is identified as that emotion.
- 7. The method of claim 1, characterized in that the sound recording module reading the voice signal includes: the sound recording module reads the voice signal and checks its length; when the length exceeds a preset threshold, the sound recording module segments the voice signal so that the length of each segment does not exceed the threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710720391.8A CN107545905B (en) | 2017-08-21 | 2017-08-21 | Emotion recognition method based on sound characteristics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710720391.8A CN107545905B (en) | 2017-08-21 | 2017-08-21 | Emotion recognition method based on sound characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107545905A true CN107545905A (en) | 2018-01-05 |
CN107545905B CN107545905B (en) | 2021-01-05 |
Family
ID=60958751
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710720391.8A Expired - Fee Related CN107545905B (en) | 2017-08-21 | 2017-08-21 | Emotion recognition method based on sound characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107545905B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108682419A (en) * | 2018-03-30 | 2018-10-19 | 京东方科技集团股份有限公司 | Sound control method and equipment, computer readable storage medium and equipment |
CN110660412A (en) * | 2018-06-28 | 2020-01-07 | Tcl集团股份有限公司 | Emotion guiding method and device and terminal equipment |
CN112447170A (en) * | 2019-08-29 | 2021-03-05 | 北京声智科技有限公司 | Security method and device based on sound information and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102005010285A1 (en) * | 2005-03-01 | 2006-09-07 | Deutsche Telekom Ag | Speech recognition involves speech recognizer which uses different speech models for linguistic analysis and an emotion recognizer is also present for determining emotional condition of person |
CN102142253A (en) * | 2010-01-29 | 2011-08-03 | 富士通株式会社 | Voice emotion identification equipment and method |
CN104504027A (en) * | 2014-12-12 | 2015-04-08 | 北京国双科技有限公司 | Method and device for automatically selecting webpage content |
CN105320960A (en) * | 2015-10-14 | 2016-02-10 | 北京航空航天大学 | Voting based classification method for cross-language subjective and objective sentiments |
CN106297825A (en) * | 2016-07-25 | 2017-01-04 | 华南理工大学 | A kind of speech-emotion recognition method based on integrated degree of depth belief network |
- 2017
- 2017-08-21 CN CN201710720391.8A patent/CN107545905B/en not_active Expired - Fee Related
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102005010285A1 (en) * | 2005-03-01 | 2006-09-07 | Deutsche Telekom Ag | Speech recognition involves speech recognizer which uses different speech models for linguistic analysis and an emotion recognizer is also present for determining emotional condition of person |
CN102142253A (en) * | 2010-01-29 | 2011-08-03 | 富士通株式会社 | Voice emotion identification equipment and method |
CN104504027A (en) * | 2014-12-12 | 2015-04-08 | 北京国双科技有限公司 | Method and device for automatically selecting webpage content |
CN105320960A (en) * | 2015-10-14 | 2016-02-10 | 北京航空航天大学 | Voting based classification method for cross-language subjective and objective sentiments |
CN106297825A (en) * | 2016-07-25 | 2017-01-04 | 华南理工大学 | A kind of speech-emotion recognition method based on integrated degree of depth belief network |
Non-Patent Citations (2)
Title |
---|
余伶俐 (Yu Lingli) et al.: "A survey of emotional feature analysis and recognition in speech signals", Journal of Circuits and Systems (《电路与系统学报》) *
姜晓庆 (Jiang Xiaoqing) et al.: "Prosodic feature analysis and emotion recognition of multilingual emotional speech", Acta Acustica (《声学学报》) *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108682419A (en) * | 2018-03-30 | 2018-10-19 | 京东方科技集团股份有限公司 | Sound control method and equipment, computer readable storage medium and equipment |
CN110660412A (en) * | 2018-06-28 | 2020-01-07 | Tcl集团股份有限公司 | Emotion guiding method and device and terminal equipment |
CN112447170A (en) * | 2019-08-29 | 2021-03-05 | 北京声智科技有限公司 | Security method and device based on sound information and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN107545905B (en) | 2021-01-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111048062B (en) | Speech synthesis method and apparatus | |
KR20210082153A (en) | Method and system for generating synthesis voice for text via user interface | |
CN109036377A (en) | A kind of phoneme synthesizing method and device | |
US11676572B2 (en) | Instantaneous learning in text-to-speech during dialog | |
CN110914898A (en) | System and method for speech recognition | |
Liu et al. | Mongolian text-to-speech system based on deep neural network | |
Caponetti et al. | Biologically inspired emotion recognition from speech | |
CN107221344A (en) | A kind of speech emotional moving method | |
CN107545905A (en) | Emotion identification method based on sound property | |
CN112509550A (en) | Speech synthesis model training method, speech synthesis device and electronic equipment | |
Meng et al. | Synthesizing English emphatic speech for multimodal corrective feedback in computer-aided pronunciation training | |
Laurinčiukaitė et al. | Lithuanian Speech Corpus Liepa for development of human-computer interfaces working in voice recognition and synthesis mode | |
Sheikhan | Generation of suprasegmental information for speech using a recurrent neural network and binary gravitational search algorithm for feature selection | |
US20140074468A1 (en) | System and Method for Automatic Prediction of Speech Suitability for Statistical Modeling | |
Lee et al. | Korean dialect identification based on intonation modeling | |
Wu et al. | Generating emphatic speech with hidden Markov model for expressive speech synthesis | |
CN115359778A (en) | Confrontation and meta-learning method based on speaker emotion voice synthesis model | |
Krug et al. | Articulatory synthesis for data augmentation in phoneme recognition | |
CN107886938A (en) | Virtual reality guides hypnosis method of speech processing and device | |
Gharavian et al. | Combined classification method for prosodic stress recognition in Farsi language | |
Heba et al. | Lexical emphasis detection in spoken French using F-Banks and neural networks | |
Houidhek et al. | Dnn-based speech synthesis for arabic: modelling and evaluation | |
CN117711444B (en) | Interaction method, device, equipment and storage medium based on talent expression | |
James et al. | Exploring prosodic features modelling for secondary emotions needed for empathetic speech synthesis | |
Wusu-Ansah | Emotion recognition from speech: An implementation in MATLAB |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20210105 Termination date: 20210821 |