CN102201237A - Emotional speaker identification method based on reliability detection of fuzzy support vector machine - Google Patents

Emotional speaker identification method based on reliability detection of fuzzy support vector machine Download PDF

Info

Publication number
CN102201237A
CN102201237A CN201110121720XA CN201110121720A
Authority
CN
China
Prior art keywords
speaker
support vector
vector machine
component
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201110121720XA
Other languages
Chinese (zh)
Other versions
CN102201237B (en)
Inventor
杨莹春
陈力
吴朝晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201110121720XA priority Critical patent/CN102201237B/en
Publication of CN102201237A publication Critical patent/CN102201237A/en
Application granted granted Critical
Publication of CN102201237B publication Critical patent/CN102201237B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses an emotional speaker identification method based on reliability detection with a fuzzy support vector machine, which comprises the following steps: extracting speech frame features and combining them with the corresponding weights in a universal background model (UBM) to form background-model component features; taking the obtained component weights as fuzzy membership degrees and establishing a fuzzy support vector machine model on each UBM component; carrying out reliability detection with the fuzzy support vector machine models to obtain reliable features; and scoring the reliable features to identify the speaker. The method improves the robustness of the speaker identification system and its speaker identification performance.

Description

Emotional speaker identification method based on reliability detection of fuzzy support vector machine
Technical Field
The invention relates to signal processing and pattern recognition, in particular to an emotional speaker recognition method based on reliability feature detection of a fuzzy support vector machine.
Background
Speaker recognition is a technology that identifies a speaker's identity from his or her voice by means of signal processing and pattern recognition, and mainly comprises two steps: speaker model training and speech testing.
Currently, the main features adopted for speaker recognition include Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and perceptually weighted linear prediction coefficients (PLP). The main algorithms for speaker recognition include vector quantization (VQ), the universal background model method (GMM-UBM), the support vector machine (SVM), and so on. Among these, GMM-UBM is widely applied across the whole speaker recognition field.
In emotional speaker recognition, the training speech is typically neutral speech, because in real applications a user typically provides only neutrally spoken speech to train his or her model. At test time, however, the speech may carry various emotions, such as happiness or sadness. Conventional speaker recognition systems cannot cope with this mismatch between training and testing conditions, so emotional speaker recognition must address the performance degradation caused by the inconsistent emotional states of speakers between the training and testing stages.
Experimental observation shows that differences in a speaker's vocal state under different emotions change the spatial distribution of the speech features. Compared with a model trained on neutral speech, emotional speech features are therefore mismatched and can be regarded as unreliable; removing them in the test stage helps improve the recognition performance of the system.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an emotional speaker recognition method based on the reliability characteristic detection of a fuzzy support vector machine, which reduces the mismatch degree of a model by eliminating the emotional voice characteristics in the tested voice, thereby improving the robustness of a speaker recognition system and improving the performance of speaker recognition.
In order to solve the technical problems, the technical scheme of the invention is as follows:
The emotional speaker identification method based on reliability detection with a fuzzy support vector machine comprises the following steps:
1) Extracting the voice component characteristics, and combining the voice component characteristics with corresponding weights in the UBM model to form general background model component characteristics;
2) using the general background model component characteristics obtained in the step 1) as fuzzy membership, and establishing a fuzzy support vector machine model under the general background model component;
3) carrying out reliability detection on the fuzzy support vector machine model in the step 2) so as to obtain reliable characteristics;
4) computing the reliable characteristics of the step 3) to identify the speaker.
As an alternative: the extraction of the voice component features comprises the following steps:
1) collecting voice signals and carrying out signal preprocessing on the voice signals;
2) extracting the characteristics of the preprocessed voice signals;
the feature extraction selects a feature extraction method based on a Mel cepstrum coefficient and/or a feature extraction method based on a linear prediction cepstrum coefficient;
the pretreatment comprises the following steps in sequence:
sampling and quantization, zero-drift removal, pre-emphasis, and windowing.
As an alternative: the method for forming the general background model component characteristics comprises the following steps:
1) randomly dividing the collected voice signals into a development library and an evaluation library;
2) selecting all the voices in the development library, extracting their features, and training a universal background model by the EM algorithm;
3) calculating, for each test voice, its weight on each Gaussian component of the universal background model;
4) combining the results of step 2) and step 3) to form the universal background model component features.
As an alternative: the fuzzy support vector machine model is a two-class reliable-versus-unreliable classifier on each Gaussian component, wherein the positive samples of the classifier are selected from the neutral voices in the development library and the negative samples from the emotional voices in the development library.
As an alternative: the reliability detection with the fuzzy support vector machine comprises the following steps:
1) computing the reliability score of the test speech feature x_t on each Gaussian component by the formula s_m(x_t) = w_m · φ(x_t) + b_m, wherein w_m and b_m are the parameters of the classification plane under the m-th Gaussian component;
2) computing the weighted reliability score of x_t under all Gaussian components by the formula S(x_t) = Σ_m γ_m(x_t) s_m(x_t), wherein γ_m(x_t) is the weight feature;
3) judging from the result of step 2) whether the feature is reliable: if so, taking it as a reliable feature; otherwise, rejecting it.
As an alternative: the speaker identification from the reliable features comprises the following steps:
1) training a Gaussian mixture model for each speaker, wherein each adaptive speaker model is obtained by the maximum a posteriori probability method;
2) obtaining the likelihood of the test speech feature x_t in the s-th speaker model by the formula p(x_t | λ_s) = Σ_m w_m N(x_t; μ_m^s, Σ_m^s), and obtaining the score of the whole test sentence by the formula L_s = Σ_{t: S(x_t) > θ} log p(x_t | λ_s), wherein θ is the threshold of feature reliability detection set in the experiment and N(·) is the probability density of the Gaussian distribution;
3) identifying the speaker with the largest score in step 2), s* = argmax_s L_s, wherein s* represents the speaker identity.
The invention has the beneficial effects that: by removing unreliable features which are seriously influenced by emotional changes in the speech paragraphs, the robustness of the speaker recognition system is improved, and the performance of the system for recognizing the speaker is improved.
Drawings
FIG. 1 is a basic schematic diagram of a method for detecting emotion speaker recognition based on the reliability of a fuzzy support vector machine.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, the emotional speaker recognition method based on reliability detection with the fuzzy support vector machine mainly comprises four steps:
1) Extracting the voice component characteristics, and combining the voice component characteristics with corresponding weights in the UBM model to form general background model component characteristics;
2) establishing a fuzzy support vector machine model UCFSVM under the general background model component by taking the general background model component characteristics obtained in the step 1) as fuzzy membership;
3) performing reliability detection with the fuzzy support vector machine model UCFSVM of step 2) and judging the size of the reliability score S(x_t) to obtain the reliable features;
4) computing the reliable characteristics of the step 3) to identify the speaker.
The general background model component feature extraction comprises the following steps:
collecting voice signals, and carrying out signal preprocessing on the voice signals, wherein the preprocessing comprises the steps of sampling quantization, zero-drift removal, pre-emphasis and windowing.
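By way of illustration only (not part of the claimed method), the preprocessing chain just described can be sketched in NumPy; the pre-emphasis coefficient, frame length, and hop size below are illustrative assumptions rather than values from the patent:

```python
import numpy as np

def preprocess(signal, alpha=0.97, frame_len=256, hop=128):
    """Zero-drift removal, pre-emphasis, framing and Hamming windowing."""
    # Zero-drift (DC offset) removal: subtract the signal mean
    x = signal - np.mean(signal)
    # Pre-emphasis: y[n] = x[n] - alpha * x[n-1], boosts high frequencies
    x = np.append(x[0], x[1:] - alpha * x[:-1])
    # Split into overlapping frames and apply a Hamming window to each
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)
    return np.stack([x[i * hop : i * hop + frame_len] * window
                     for i in range(n_frames)])

frames = preprocess(np.random.randn(4000))
print(frames.shape)  # (30, 256) with the defaults above
```

Sampling and quantization are performed by the recording hardware; the sketch therefore starts from an already-digitized signal.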
The features of the preprocessed speech are extracted by one or both of the feature extraction method based on Mel cepstral coefficients (MFCC) and the feature extraction method based on linear prediction cepstral coefficients (LPCC).
For each segment of speech, a feature sequence X = {x_1, x_2, ..., x_T} is obtained, wherein each frame feature x_t is a D-dimensional vector and T represents the total number of feature frames in the sentence.
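As one concrete illustration of MFCC-style extraction (a simplified sketch — the sampling rate, filterbank size, and number of coefficients are assumptions, and a production front end would differ in detail):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frames, sr=8000, n_filt=26, n_ceps=13):
    """Per-frame MFCCs: power spectrum -> mel filterbank -> log -> DCT-II."""
    n_fft = frames.shape[1]
    spec = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft   # power spectrum
    # Triangular mel filterbank between 0 Hz and Nyquist
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filt + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filt, n_fft // 2 + 1))
    for m in range(1, n_filt + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_e = np.log(spec @ fbank.T + 1e-10)       # log filterbank energies
    # DCT-II decorrelates the energies; keep the first n_ceps coefficients
    n = np.arange(n_filt)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_filt)))
    return log_e @ dct.T

feats = mfcc(np.random.randn(30, 256))
print(feats.shape)  # (30, 13): T frames, D-dimensional features
```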
All the training speech for the universal background model is used to train an M-component UBM with the EM algorithm. The features x_t of each test voice are then weighted on each of the M Gaussian components. Suppose the UBM parameters are λ = {w_m, μ_m, Σ_m}, m = 1, ..., M, wherein w_m, μ_m and Σ_m respectively represent the weight, mean and variance. Then the posterior probability that a feature x_t belongs to the m-th Gaussian component can be expressed as:

γ_m(x_t) = w_m N(x_t; μ_m, Σ_m) / Σ_{k=1}^{M} w_k N(x_t; μ_k, Σ_k),

wherein N(·) represents the probability density of the Gaussian distribution.

This posterior probability can also be understood as the weight of the feature on the corresponding UBM component, and combining the original feature with these weights forms the new universal background model component feature.

The feature formed in step (1) above thus contains the M newly constructed weight features, which serve both as the fuzzy membership degrees when training the fuzzy support vector machines and as the importance weights of each Gaussian component when computing the reliability score.
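The posterior component weight just described can be computed with a short self-contained sketch (diagonal-covariance Gaussians; the UBM parameters below are random stand-ins, not trained values, and the sketch is not part of the claimed method):

```python
import numpy as np

def log_gauss(X, mu, var):
    """Log density of diagonal-covariance Gaussians per (frame, component)."""
    # X: (T, D); mu, var: (M, D) -> returns (T, M)
    const = -0.5 * np.sum(np.log(2 * np.pi * var), axis=1)      # (M,)
    diff = X[:, None, :] - mu[None, :, :]                        # (T, M, D)
    return const - 0.5 * np.sum(diff ** 2 / var[None, :, :], axis=2)

def posterior_weights(X, w, mu, var):
    """gamma_m(x_t) = w_m N(x_t; mu_m) / sum_k w_k N(x_t; mu_k)."""
    log_p = np.log(w)[None, :] + log_gauss(X, mu, var)           # (T, M)
    log_p -= log_p.max(axis=1, keepdims=True)                    # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
T, M, D = 30, 8, 13
gamma = posterior_weights(rng.normal(size=(T, D)),
                          np.full(M, 1.0 / M),
                          rng.normal(size=(M, D)),
                          np.full((M, D), 1.0))
print(gamma.shape)  # (30, 8); each row sums to 1
```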
Establishing the fuzzy support vector machine models under the universal background model components:

On the basis of the GMM-UBM, a two-class reliable-versus-unreliable fuzzy support vector machine is trained for each Gaussian component. Neutral features are considered reliable and emotional features unreliable; the positive samples are selected from the neutral speech of the development library and the negative samples from its emotional speech. The fuzzy membership of each sample is the weight feature mentioned in step (1).

The fuzzy support vector machine is trained as follows. For a training sample set with membership labels {(x_i, y_i, s_i)}, i = 1, ..., N, each training datum x_i is labeled y_i = -1 if it is emotional (unreliable) speech and y_i = +1 if it is neutral speech. The problem of finding the optimal hyperplane is then equivalent to:

minimize (1/2)||w||^2 + C Σ_i s_i ξ_i
subject to y_i (w · φ(x_i) + b) ≥ 1 - ξ_i, ξ_i ≥ 0,

wherein C is a constant, φ maps x_i from the input space to the feature space, the membership s_i represents the degree to which x_i belongs to its class, and w and b respectively represent the linear coefficient and offset of the classification hyperplane w · φ(x) + b = 0. This problem can be solved using the theory of solving linear inequalities (Chun-Fu Lin, Sheng-De Wang. Fuzzy Support Vector Machines. IEEE Transactions on Neural Networks, 13(2):464-471, March 2002).
The above formula can be converted into its dual form; solving the dual together with the Karush-Kuhn-Tucker (KKT) conditions yields the classification plane parameters w_m and b_m under each Gaussian component.
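A minimal illustration of the fuzzy SVM idea — scaling each sample's error penalty by its membership s_i — can be sketched with a linear kernel and subgradient descent on the primal objective (the toy data, learning rate, and iteration count are assumptions; this is not the patent's dual-form solver):

```python
import numpy as np

def train_fuzzy_svm(X, y, s, C=1.0, lr=0.01, epochs=200):
    """Linear FSVM: min 0.5||w||^2 + C * sum_i s_i * max(0, 1 - y_i(w.x_i + b))."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                     # samples violating the margin
        # Subgradient of the membership-weighted hinge loss
        grad_w = w - C * ((s * y)[active] @ X[active]) if active.any() else w
        grad_b = -C * np.sum((s * y)[active]) if active.any() else 0.0
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = np.random.default_rng(1)
# Toy data: neutral (+1) vs emotional (-1) clusters with fuzzy memberships s
X = np.vstack([rng.normal(+2, 1, size=(50, 2)), rng.normal(-2, 1, size=(50, 2))])
y = np.r_[np.ones(50), -np.ones(50)]
s = rng.uniform(0.5, 1.0, size=100)              # fuzzy membership degrees
w, b = train_fuzzy_svm(X, y, s)
acc = np.mean(np.sign(X @ w + b) == y)
print(acc)  # well-separated toy clusters give high training accuracy
```

Samples with small membership s_i contribute less to the penalty term, so outliers or ambiguous frames pull the hyperplane less — exactly the role the UBM posterior weights play here.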
feature reliability detection based on fuzzy support vector machine
For testing speech features
Figure 677811DEST_PATH_IMAGE056
It is necessary to calculate a score that is a reliable feature and if the reliability score is too low, it is rejected. The score is calculated in two steps: firstly, the reliability score of the feature on a fuzzy support vector machine under a single Gaussian component of a general background model is obtained
Figure 288921DEST_PATH_IMAGE057
. Secondly, calculating the weighted sum of the reliability scores of the features on the fuzzy support vector machine under all Gaussian components of the general background model, and expressing the weighted sum as follows:
Figure 439279DEST_PATH_IMAGE058
wherein,
Figure 290602DEST_PATH_IMAGE059
indicating the feature inThe weight on the gaussian component of the signal,
Figure 955119DEST_PATH_IMAGE060
the meanings of (A) are as indicated above. The score can be used to determine whether it is a reliable feature, and if the score is greater than a threshold, it is considered a reliable feature, otherwise, it is rejected.
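By way of illustration, combining the per-component classifier scores with the posterior weights to make the reliability decision can be sketched as follows (linear per-component classifiers with random stand-in parameters W and B, and an illustrative threshold):

```python
import numpy as np

def reliability_scores(X, gamma, W, B):
    """S(x_t) = sum_m gamma_m(x_t) * (w_m . x_t + b_m) over M per-component FSVMs."""
    # X: (T, D); gamma: (T, M); W: (M, D); B: (M,)
    per_component = X @ W.T + B        # (T, M): s_m(x_t) for every frame/component
    return np.sum(gamma * per_component, axis=1)

rng = np.random.default_rng(2)
T, M, D = 30, 8, 13
X = rng.normal(size=(T, D))
gamma = rng.dirichlet(np.ones(M), size=T)       # posterior weights, rows sum to 1
W, B = rng.normal(size=(M, D)), rng.normal(size=M)
S = reliability_scores(X, gamma, W, B)
theta = 0.0                                      # reliability threshold (illustrative)
reliable = S > theta                             # frames kept for speaker scoring
print(S.shape)  # (30,): one weighted reliability score per frame
```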
Reliable feature score calculation

After the reliable-feature detection of step (3) above, the score of the whole sentence is computed.

First, a Gaussian mixture model is trained for each speaker; the adaptive speaker models are obtained by the maximum a posteriori probability (MAP) method.

Second, for the s-th speaker model, the likelihood score of the test speech feature x_t is computed as:

p(x_t | λ_s) = Σ_{m=1}^{M} w_m N(x_t; μ_m^s, Σ_m^s).

For a whole test sentence, the score is computed as:

L_s = Σ_{t: S(x_t) > θ} log p(x_t | λ_s),

wherein θ is the reliability threshold set in the experiment: if the reliability score of a frame is greater than the threshold, its feature score is retained; otherwise it is rejected.

Finally, the target speaker of the sentence is taken as the speaker with the highest score:

s* = argmax_s L_s.
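By way of illustration, the whole-sentence scoring with reliability gating and the final maximum-score decision can be sketched as follows (diagonal-covariance speaker GMMs with random stand-in parameters, not MAP-adapted models):

```python
import numpy as np

def gmm_loglik(X, w, mu, var):
    """log p(x_t | lambda_s) per frame for one diagonal-covariance GMM."""
    const = -0.5 * np.sum(np.log(2 * np.pi * var), axis=1)
    diff = X[:, None, :] - mu[None, :, :]
    log_p = np.log(w)[None, :] + const - 0.5 * np.sum(diff ** 2 / var, axis=2)
    mx = log_p.max(axis=1, keepdims=True)        # log-sum-exp for stability
    return (mx + np.log(np.exp(log_p - mx).sum(axis=1, keepdims=True))).ravel()

def identify(X, S, theta, speaker_models):
    """Score only frames whose reliability S(x_t) exceeds theta; pick the argmax."""
    keep = S > theta
    scores = [gmm_loglik(X[keep], *model).sum() for model in speaker_models]
    return int(np.argmax(scores))

rng = np.random.default_rng(3)
T, M, D, n_spk = 30, 4, 13, 5
X = rng.normal(size=(T, D))
S = rng.normal(size=T)                      # stand-in reliability scores
models = [(np.full(M, 0.25), rng.normal(size=(M, D)), np.full((M, D), 1.0))
          for _ in range(n_spk)]
winner = identify(X, S, theta=0.0, speaker_models=models)
print(winner)  # index of the highest-scoring speaker model
```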
Results of the experiment
The database used in the experiment was the Chinese emotional speech database MASC, recorded in a quiet environment with an Olympus DM-20 recording pen. The database contains 68 native Chinese speakers, 45 male and 23 female. Each speaker utters speech in 5 emotional states: neutral, angry, happy, panic, and sad. Each speaker reads 2 paragraphs in the neutral state, and speaks 5 words and 20 sentences, each 3 times, in each emotional state.
The experiment was performed on an IBM server with the following configuration: an E5420 CPU at 2.5 GHz and 4 GB of memory.
In the experiment, the voices of the first 18 speakers were used as the development library: the neutral paragraphs of these 18 speakers were used to train the UBM, and their sentence utterances under the 5 emotions were used to train the fuzzy support vector machine models. The remaining 50 speakers constitute the evaluation set; for each of them a GMM speaker model is adapted from the UBM using his or her neutral paragraphs. All sentences under the five emotional states are used for testing, for a total of 15,000 test utterances (50 speakers x 5 emotions x 20 sentences x 3 repetitions). The experiment simulates the speaker identification process; the experimental results, together with the GMM-UBM baseline, are shown in Table 1.
Table 1. Comparison of the proposed method with the baseline

Emotion   Baseline   Proposed method
Neutral   96.23%     95.50%
Angry     31.50%     37.60%
Happy     33.57%     39.47%
Panic     35.00%     39.77%
Sad       61.43%     63.63%
Average   51.55%     55.19%
The experimental results show that the method effectively detects the reliable features in the sentences, and the recognition accuracy improves under each emotional state, with the overall identification accuracy improving by 3.64 percentage points. The method is very helpful for improving the performance and robustness of a speaker recognition system.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and refinements without departing from the concept of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (6)

1. The emotional speaker identification method based on reliability detection of the fuzzy support vector machine is characterized by comprising the following steps:
1) Extracting the voice component characteristics, and combining the voice component characteristics with corresponding weights in the UBM model to form general background model component characteristics;
2) using the general background model component characteristics obtained in the step 1) as fuzzy membership, and establishing a fuzzy support vector machine model under the general background model component;
3) carrying out reliability detection by using the fuzzy support vector machine model in the step 2) to obtain reliable characteristics;
4) computing the reliable characteristics of the step 3) to identify the speaker.
2. The method for emotion speaker recognition based on reliability detection of fuzzy support vector machine as claimed in claim 1, wherein said extracting speech component features comprises the steps of:
1) collecting voice signals and carrying out signal preprocessing on the voice signals;
2) extracting the characteristics of the preprocessed voice signals;
the feature extraction selects a feature extraction method based on a Mel cepstrum coefficient and/or a feature extraction method based on a linear prediction cepstrum coefficient;
the pretreatment comprises the following steps in sequence:
sampling and quantization, zero-drift removal, pre-emphasis, and windowing.
3. The method for emotion speaker recognition based on reliability detection of fuzzy support vector machine as claimed in claim 1, wherein said forming of general background model component features comprises the steps of:
1) randomly dividing the collected voice signals into a development library and an evaluation library;
2) selecting all the voices in the development library, extracting their features, and training a universal background model by the EM algorithm;
3) calculating posterior probability on each Gaussian model component of the general background for each frame of voice as weight;
4) combining the step 2) and the step 3) to form the general background model component characteristics.
4. The method as claimed in claim 2, wherein the fuzzy support vector machine model is a two-class neutral-versus-emotional classifier on each Gaussian component, and the positive samples of the classifier are selected as the neutral speech and the negative samples as the emotional speech.
5. The method for emotional speaker recognition based on reliability detection of the fuzzy support vector machine according to any one of claims 1 to 4, wherein the reliability detection with the fuzzy support vector machine comprises the following steps:
1) computing the reliability score of the test speech feature x_t on each Gaussian component by the formula s_m(x_t) = w_m · φ(x_t) + b_m, wherein w_m and b_m are the parameters of the classification plane under the m-th Gaussian component;
2) computing the weighted sum of the reliability scores over all Gaussian components by the formula S(x_t) = Σ_m γ_m(x_t) s_m(x_t), wherein γ_m(x_t) is the weight feature;
3) judging from the result of step 2) whether the feature is reliable: if the score is greater than the set threshold, taking the feature as a reliable feature; otherwise, rejecting it.
6. The method for recognizing the emotional speaker based on reliability detection of the fuzzy support vector machine according to any one of claims 1 to 4, wherein computing the universal background model component features to identify the speaker comprises the following steps:
1) training a Gaussian mixture model for each speaker, wherein each adaptive speaker model is obtained by the maximum a posteriori probability method;
2) obtaining the likelihood of the test speech feature x_t in the s-th speaker model by the formula p(x_t | λ_s) = Σ_m w_m N(x_t; μ_m^s, Σ_m^s), and obtaining the score of the whole test sentence by the formula L_s = Σ_{t: S(x_t) > θ} log p(x_t | λ_s), wherein θ is the threshold of feature reliability detection set in the experiment and N(·) is the probability density of the Gaussian distribution;
3) identifying the speaker with the largest score in step 2), s* = argmax_s L_s, wherein s* represents the speaker identity.
CN201110121720XA 2011-05-12 2011-05-12 Emotional speaker identification method based on reliability detection of fuzzy support vector machine Expired - Fee Related CN102201237B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110121720XA CN102201237B (en) 2011-05-12 2011-05-12 Emotional speaker identification method based on reliability detection of fuzzy support vector machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110121720XA CN102201237B (en) 2011-05-12 2011-05-12 Emotional speaker identification method based on reliability detection of fuzzy support vector machine

Publications (2)

Publication Number Publication Date
CN102201237A true CN102201237A (en) 2011-09-28
CN102201237B CN102201237B (en) 2013-03-13

Family

ID=44661863

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110121720XA Expired - Fee Related CN102201237B (en) 2011-05-12 2011-05-12 Emotional speaker identification method based on reliability detection of fuzzy support vector machine

Country Status (1)

Country Link
CN (1) CN102201237B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779510A (en) * 2012-07-19 2012-11-14 东南大学 Speech emotion recognition method based on feature space self-adaptive projection
CN102930297A (en) * 2012-11-05 2013-02-13 北京理工大学 Emotion recognition method for enhancing coupling hidden markov model (HMM) voice-vision fusion
CN102968990A (en) * 2012-11-15 2013-03-13 江苏嘉利德电子科技有限公司 Speaker identifying method and system
CN103258537A (en) * 2013-05-24 2013-08-21 安宁 Method utilizing characteristic combination to identify speech emotions and device thereof
CN103258532A (en) * 2012-11-28 2013-08-21 河海大学常州校区 Method for recognizing Chinese speech emotions based on fuzzy support vector machine
CN106504772A (en) * 2016-11-04 2017-03-15 东南大学 Speech-emotion recognition method based on weights of importance support vector machine classifier
CN107886942A (en) * 2017-10-31 2018-04-06 东南大学 A kind of voice signal emotion identification method returned based on local punishment random spectrum
CN108922564A (en) * 2018-06-29 2018-11-30 北京百度网讯科技有限公司 Emotion identification method, apparatus, computer equipment and storage medium
CN110047491A (en) * 2018-01-16 2019-07-23 中国科学院声学研究所 A kind of relevant method for distinguishing speek person of random digit password and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1758332A (en) * 2005-10-31 2006-04-12 浙江大学 Speaker recognition method based on MFCC linear emotion compensation
CN101178897A (en) * 2007-12-05 2008-05-14 浙江大学 Speaking man recognizing method using base frequency envelope to eliminate emotion voice
JP2008146054A (en) * 2006-12-06 2008-06-26 Korea Electronics Telecommun Speaker information acquisition system using speech feature information on speaker, and method thereof

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1758332A (en) * 2005-10-31 2006-04-12 浙江大学 Speaker recognition method based on MFCC linear emotion compensation
JP2008146054A (en) * 2006-12-06 2008-06-26 Korea Electronics Telecommun Speaker information acquisition system using speech feature information on speaker, and method thereof
CN101178897A (en) * 2007-12-05 2008-05-14 浙江大学 Speaking man recognizing method using base frequency envelope to eliminate emotion voice

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHENYU SHAN ET AL: "Scores selection for emotional speaker recognition", 《ADVANCES IN BIOMETRICS THIRD INTERNATIONAL CONFERENCE, ICB 2009》 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779510A (en) * 2012-07-19 2012-11-14 东南大学 Speech emotion recognition method based on feature space self-adaptive projection
CN102930297A (en) * 2012-11-05 2013-02-13 北京理工大学 Emotion recognition method for enhancing coupling hidden markov model (HMM) voice-vision fusion
CN102930297B (en) * 2012-11-05 2015-04-29 北京理工大学 Emotion recognition method for enhancing coupling hidden markov model (HMM) voice-vision fusion
CN102968990A (en) * 2012-11-15 2013-03-13 江苏嘉利德电子科技有限公司 Speaker identifying method and system
CN102968990B (en) * 2012-11-15 2015-04-15 朱东来 Speaker identifying method and system
CN103258532B (en) * 2012-11-28 2015-10-28 河海大学常州校区 A kind of Chinese speech sensibility recognition methods based on fuzzy support vector machine
CN103258532A (en) * 2012-11-28 2013-08-21 河海大学常州校区 Method for recognizing Chinese speech emotions based on fuzzy support vector machine
CN103258537A (en) * 2013-05-24 2013-08-21 安宁 Method utilizing characteristic combination to identify speech emotions and device thereof
CN106504772A (en) * 2016-11-04 2017-03-15 东南大学 Speech-emotion recognition method based on weights of importance support vector machine classifier
CN106504772B (en) * 2016-11-04 2019-08-20 东南大学 Speech-emotion recognition method based on weights of importance support vector machine classifier
CN107886942A (en) * 2017-10-31 2018-04-06 东南大学 A kind of voice signal emotion identification method returned based on local punishment random spectrum
CN107886942B (en) * 2017-10-31 2021-09-28 东南大学 Voice signal emotion recognition method based on local punishment random spectral regression
CN110047491A (en) * 2018-01-16 2019-07-23 中国科学院声学研究所 A kind of relevant method for distinguishing speek person of random digit password and device
CN108922564A (en) * 2018-06-29 2018-11-30 北京百度网讯科技有限公司 Emotion identification method, apparatus, computer equipment and storage medium
CN108922564B (en) * 2018-06-29 2021-05-07 北京百度网讯科技有限公司 Emotion recognition method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN102201237B (en) 2013-03-13

Similar Documents

Publication Publication Date Title
CN102201237B (en) Emotional speaker identification method based on reliability detection of fuzzy support vector machine
CN105632501B (en) A kind of automatic accent classification method and device based on depth learning technology
US9355642B2 (en) Speaker recognition method through emotional model synthesis based on neighbors preserving principle
TWI395201B (en) Method and system for identifying emotional voices
CN104008754B (en) Speech emotion recognition method based on semi-supervised feature selection
CN111128128B (en) Voice keyword detection method based on complementary model scoring fusion
CN107886968B (en) Voice evaluation method and system
CN104464724A (en) Speaker recognition method for deliberately pretended voices
CN106910495A (en) Audio classification system and method applied to abnormal sound detection
Saleem et al. Forensic speaker recognition: A new method based on extracting accent and language information from short utterances
Khan et al. An intelligent system for spoken term detection that uses belief combination
Taşpınar et al. Identification of the english accent spoken in different countries by the k-nearest neighbor method
Zheng et al. An improved speech emotion recognition algorithm based on deep belief network
Elbarougy Speech emotion recognition based on voiced emotion unit
CN114220419A (en) Voice evaluation method, device, medium and equipment
CN110246509A (en) A kind of stack denoising self-encoding encoder and deep neural network structure for voice lie detection
CN114373453B (en) Voice keyword detection method based on motion trail and distinguishing information
Gupta et al. Deep learning and sociophonetics: Automatic coding of rhoticity using neural networks
Mızrak et al. Gender Detection by Acoustic Characteristics of Sound with Machine Learning Algorithms
CN111554273B (en) Method for selecting amplified corpora in voice keyword recognition
CN114495990A (en) Speech emotion recognition method based on feature fusion
CN113539238B (en) End-to-end language identification and classification method based on cavity convolutional neural network
Chandrakala et al. Combination of generative models and SVM based classifier for speech emotion recognition
Wu et al. Phone set construction based on context-sensitive articulatory attributes for code-switching speech recognition
Li et al. High performance automatic mispronunciation detection method based on neural network and TRAP features.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130313