CN102201237A: Emotional speaker identification method based on reliability detection of fuzzy support vector machine
- Publication number: CN102201237A (application CN201110121720XA, filed 2011-05-12; granted as CN102201237B on 2013-03-13)
- Authority: CN (China)
- Legal status: Granted; the patent later lapsed through non-payment of annual fees
Abstract
The invention discloses an emotional speaker identification method based on reliability detection with a fuzzy support vector machine, which comprises the following steps: extracting speech component features and combining them with the corresponding weights in a universal background model (UBM) to form background model component features; taking the obtained background model component features as fuzzy memberships and establishing a fuzzy support vector machine model under each universal background model component; carrying out reliability detection with the fuzzy support vector machine models to obtain the reliable features; and scoring the reliable features to identify the speaker. The method improves the robustness of the speaker identification system and its speaker identification performance.
Description
Technical Field
The invention relates to signal processing and pattern recognition, in particular to an emotional speaker recognition method based on reliability feature detection of a fuzzy support vector machine.
Background
Speaker recognition is a technology that identifies a speaker from his or her voice using signal processing and pattern recognition methods. It mainly comprises two steps: speaker model training and speech testing.
Currently, the main features adopted for speaker recognition include Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and perceptually weighted linear prediction coefficients (PLP). The main algorithms for speaker recognition include vector quantization (VQ), the universal background model method (GMM-UBM), support vector machines (SVM), and so on. Among these, the GMM-UBM method is widely applied throughout the speaker recognition field.
In emotional speaker recognition, the training speech is typically neutral speech, because in real-world applications a user usually provides only neutrally spoken speech to train his or her model. At test time, however, the speech may carry various emotions, such as happiness or sadness. Conventional speaker recognition systems cannot handle this mismatch between training and testing conditions, so emotional speaker recognition must solve the performance degradation caused by inconsistent speaker emotion between the training and testing stages.
Experimental observation shows that differences in a speaker's vocal state under different emotions cause differences in the spatial distribution of the speech features. Compared with a model trained on neutral speech, emotional speech features are therefore mismatched and can be regarded as unreliable features; removing them at the test stage helps to improve the recognition performance of the system.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an emotional speaker recognition method based on reliability feature detection with a fuzzy support vector machine, which reduces the model mismatch by eliminating the emotional speech features in the test speech, thereby improving the robustness of the speaker recognition system and its speaker recognition performance.
In order to solve the technical problems, the technical scheme of the invention is as follows:
the emotional speaker identification method based on reliability detection of the fuzzy support vector machine comprises the following steps:
1) extracting the speech component features, and combining them with the corresponding weights in the universal background model (UBM) to form UBM component features;
2) taking the UBM component features obtained in step 1) as fuzzy memberships, and establishing a fuzzy support vector machine model under each UBM component;
3) carrying out reliability detection with the fuzzy support vector machine models of step 2) to obtain the reliable features;
4) scoring the reliable features of step 3) to identify the speaker.
As an alternative: the extraction of the voice component features comprises the following steps:
1) collecting voice signals and carrying out signal preprocessing on the voice signals;
2) extracting the characteristics of the preprocessed voice signals;
the feature extraction selects a feature extraction method based on a Mel cepstrum coefficient and/or a feature extraction method based on a linear prediction cepstrum coefficient;
the preprocessing comprises the following steps in sequence:
sampling and quantization, zero-drift (DC offset) removal, pre-emphasis, and windowing; a sketch of these steps follows below.
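The patent fixes only the order of these preprocessing steps, not their parameters. The following is a minimal sketch, assuming typical values that the patent does not specify: a pre-emphasis coefficient of 0.97 and 25 ms Hamming-windowed frames with a 10 ms shift.

```python
import numpy as np

def preprocess(signal, fs, alpha=0.97, frame_ms=25, shift_ms=10):
    """Zero-drift removal, pre-emphasis and Hamming windowing.
    alpha, frame_ms and shift_ms are assumed typical values."""
    signal = signal - np.mean(signal)                  # zero-drift (DC) removal
    emphasized = np.append(signal[0],
                           signal[1:] - alpha * signal[:-1])  # pre-emphasis
    frame_len = int(fs * frame_ms / 1000)              # samples per frame
    shift = int(fs * shift_ms / 1000)                  # samples per frame shift
    window = np.hamming(frame_len)
    n_frames = 1 + (len(emphasized) - frame_len) // shift  # assumes len >= frame_len
    return np.stack([emphasized[i * shift: i * shift + frame_len] * window
                     for i in range(n_frames)])        # shape (n_frames, frame_len)
```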
As an alternative: the method for forming the general background model component characteristics comprises the following steps:
1) randomly dividing the collected voice signals into a development library and an evaluation library;
2) selecting all voices in the development library, extracting features and passing the voices throughThe method trains a general background model;
3) calculating the weight of each test voice on each Gaussian model of the general background respectively;
4) combining the step 2) and the step 3) to form the component characteristics of the general background model
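A minimal sketch of steps 2) and 3), assuming EM training with diagonal covariances via scikit-learn's GaussianMixture; the mixture size M below is an arbitrary choice, not a value fixed by the patent.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_ubm(dev_features, n_components=64, seed=0):
    """EM-train a universal background model on pooled development features."""
    ubm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          max_iter=100, random_state=seed)
    ubm.fit(dev_features)                  # dev_features: (N, d) frame features
    return ubm

def ubm_component_features(ubm, frames):
    """Pair each frame x_t with its posterior weights P(i | x_t)."""
    weights = ubm.predict_proba(frames)    # (T, M); each row sums to 1
    return frames, weights
```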
As an alternative: the fuzzy support vector machine model consists of two-class reliable-versus-unreliable fuzzy support vector machine classifiers, one on each Gaussian component, wherein the positive samples of the classifiers are selected from the neutral speech in the development library and the negative samples from the emotional speech in the development library.
As an alternative: the reliability detection with the fuzzy support vector machine comprises the following steps:
1) calculating the score of the feature on the fuzzy support vector machine classifier of each Gaussian component;
2) calculating the weighted sum of these scores over all Gaussian components;
3) judging from the result obtained in step 2) whether the feature is a reliable feature: if so, it is retained as a reliable feature; otherwise it is rejected.
As an alternative: identifying the speaker from the reliable features comprises the following steps:
1) training a Gaussian mixture model for each speaker, the speaker model being adapted by the maximum a posteriori (MAP) probability method;
2) obtaining the likelihood of the test feature $x_t$ on the $s$-th speaker model $\lambda_s$ by the formula $p(x_t \mid \lambda_s) = \sum_{i=1}^{M} w_i\, p_i(x_t \mid \lambda_s)$, and obtaining the score of the whole test sentence by the formula $S(X, \lambda_s) = \sum_{t:\, D(x_t) > \theta} \log p(x_t \mid \lambda_s)$;
the above-mentioned $\theta$ being the feature-reliability detection threshold set in the experiment, and $p_i$ the probability density of the $i$-th Gaussian component;
3) identifying the speaker with the largest score in step 2), $\hat{s} = \arg\max_s S(X, \lambda_s)$.
The invention has the beneficial effects that: by removing the unreliable features in the speech passages that are seriously affected by emotional change, the robustness of the speaker recognition system and its speaker recognition performance are improved.
Drawings
FIG. 1 is a basic schematic diagram of the emotional speaker recognition method based on reliability detection of a fuzzy support vector machine.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, the emotional speaker recognition method based on reliability detection of the fuzzy support vector machine mainly comprises four steps:
1) extracting the speech component features, and combining them with the corresponding weights in the UBM to form universal background model component features;
2) taking the universal background model component features obtained in step 1) as fuzzy memberships, and establishing a fuzzy support vector machine model under each UBM component (UCFSVM);
3) performing reliability detection with the UCFSVM models of step 2) and judging the magnitude of the score to obtain the reliable features;
4) scoring the reliable features of step 3) to identify the speaker.
The general background model component feature extraction comprises the following steps:
Voice signals are collected and preprocessed; the preprocessing comprises the steps of sampling and quantization, zero-drift (DC offset) removal, pre-emphasis, and windowing.
Features are extracted from the preprocessed speech using one or both of the feature extraction method based on Mel cepstral coefficients (MFCC) and the feature extraction method based on linear prediction cepstral coefficients (LPCC).
For each segment of speech, a feature sequence $X = \{x_1, \dots, x_T\}$ is obtained, where each frame feature $x_t$ is a $d$-dimensional vector and $T$ represents the total number of feature frames in the sentence.
All the speech used for training the UBM is passed through the EM algorithm to train the UBM. Each test speech feature $x_t$ is then weighted on each of the $M$ Gaussian components. Suppose the UBM parameters are $\lambda = \{w_i, \mu_i, \Sigma_i\}_{i=1}^{M}$, where $w_i$, $\mu_i$ and $\Sigma_i$ respectively represent the weight, mean and variance of the $i$-th component. Then the posterior probability that feature $x_t$ belongs to the $i$-th Gaussian component can be expressed as:

$$P(i \mid x_t) = \frac{w_i\, p_i(x_t)}{\sum_{j=1}^{M} w_j\, p_j(x_t)}$$
The posterior probability can also be understood as the weight with which the feature belongs to the $i$-th component; combining the original feature with this weight forms the new universal background model component feature.
The features formed in step (1) above therefore include both the original feature $x_t$ and the newly constructed weights $P(i \mid x_t)$. The weights play the role of fuzzy membership degrees when training the fuzzy support vector machine, and of importance weights of the individual Gaussian components when computing the reliability.
Establishing a fuzzy support vector machine model under the general background model component:
On the basis of the UBM, a two-class reliable-versus-unreliable fuzzy support vector machine model is trained for each Gaussian component. Neutral features are considered reliable features and emotional features unreliable features; the positive samples are selected from the neutral speech of the development library and the negative samples from its emotional speech. The fuzzy membership of each sample is the weight feature mentioned in step (1).
The method for training the fuzzy support vector machine is as follows. Consider a training sample set with membership marks $\{(x_k, y_k, s_k)\}_{k=1}^{l}$,

where each training datum $x_k$ is regarded as unreliable speech if it is emotional speech, with corresponding label $y_k = -1$; if it is neutral speech, it is labeled $y_k = +1$.

The problem of optimizing the hyperplane is equivalent to:

$$\min_{w, b, \xi}\ \frac{1}{2}\|w\|^2 + C \sum_{k=1}^{l} s_k \xi_k \quad \text{s.t.}\quad y_k\big(w^\top \phi(x_k) + b\big) \ge 1 - \xi_k,\ \ \xi_k \ge 0,$$

where $C$ is a constant, $\phi$ denotes the mapping that takes $x_k$ from the input space to the feature space, the membership $s_k$ represents the degree to which the corresponding datum $x_k$ belongs to a given class, and $w$, $b$ respectively represent the linear coefficient and offset of the classification hyperplane. This problem can be solved using the theory of solving linear inequality constrained optimization (Chun-Fu Lin and Sheng-De Wang, "Fuzzy Support Vector Machines," IEEE Transactions on Neural Networks, 13(2):464-471, March 2002).

The above formula can be converted to its dual form:

$$\max_{\alpha}\ \sum_{k=1}^{l} \alpha_k - \frac{1}{2} \sum_{k=1}^{l} \sum_{j=1}^{l} \alpha_k \alpha_j y_k y_j K(x_k, x_j) \quad \text{s.t.}\quad \sum_{k=1}^{l} y_k \alpha_k = 0,\ \ 0 \le \alpha_k \le s_k C.$$

Meanwhile, according to the Karush-Kuhn-Tucker (KKT) conditions,

$$w = \sum_{k=1}^{l} \alpha_k y_k\, \phi(x_k), \qquad \alpha_k \big( y_k (w^\top \phi(x_k) + b) - 1 + \xi_k \big) = 0,$$

and solving these two formulas yields the classification-surface parameters $w_i$ and $b_i$ under each Gaussian component.
Feature reliability detection based on the fuzzy support vector machine

For each test speech feature $x_t$, a score measuring how reliable the feature is must be calculated, and the feature is rejected if its reliability score is too low. The score is calculated in two steps. First, the reliability score $f_i(x_t)$ of the feature on the fuzzy support vector machine under a single Gaussian component of the universal background model is obtained. Second, the weighted sum of the reliability scores on the fuzzy support vector machines under all Gaussian components of the universal background model is calculated, expressed as:

$$D(x_t) = \sum_{i=1}^{M} P(i \mid x_t)\, f_i(x_t)$$

where $P(i \mid x_t)$ denotes the weight of the feature on the $i$-th Gaussian component and $f_i$ has the meaning given above. This score determines whether the feature is reliable: if the score is greater than a threshold, it is considered a reliable feature; otherwise it is rejected.
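A minimal sketch of this two-step score, reusing the per-component classifiers and UBM posteriors from the sketches above; the threshold is a tuning parameter chosen on development data.

```python
import numpy as np

def reliability_scores(fsvms, frames, weights):
    """D(x_t) = sum_i P(i|x_t) * f_i(x_t): fuzzy-SVM decision scores
    weighted by the UBM posteriors.

    fsvms:   list of M per-component classifiers
    weights: (T, M) posteriors P(i | x_t)
    """
    per_component = np.column_stack(
        [clf.decision_function(frames) for clf in fsvms])  # (T, M)
    return np.sum(weights * per_component, axis=1)          # (T,)

def select_reliable(frames, scores, threshold):
    """Keep only the frames whose reliability score exceeds the threshold."""
    return frames[scores > threshold]
```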
Reliable feature score calculation
After the reliable feature detection in step (3) above, the score of the whole sentence needs to be calculated.
First, a Gaussian mixture model of each speaker needs to be trained; the speaker model is adapted from the UBM by the maximum a posteriori (MAP) probability method.
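The patent does not give the MAP update equations; the following is a minimal sketch of the common mean-only adaptation, with an assumed relevance factor $r = 16$.

```python
import numpy as np

def map_adapt_means(ubm, speaker_frames, r=16.0):
    """Mean-only MAP adaptation of the UBM to one speaker's neutral speech."""
    post = ubm.predict_proba(speaker_frames)           # (T, M) posteriors
    n_i = post.sum(axis=0)                             # soft counts per component
    ml_means = (post.T @ speaker_frames) / np.maximum(n_i, 1e-10)[:, None]
    alpha = (n_i / (n_i + r))[:, None]                 # adaptation coefficients
    return alpha * ml_means + (1 - alpha) * ubm.means_  # adapted means, (M, d)
```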
Second, for the $s$-th speaker model $\lambda_s$, the likelihood score of the test speech feature $x_t$ is obtained by the following formula:

$$p(x_t \mid \lambda_s) = \sum_{i=1}^{M} w_i\, p_i(x_t \mid \lambda_s)$$

For a whole test sentence, the score is calculated as:

$$S(X, \lambda_s) = \sum_{t:\, D(x_t) > \theta} \log p(x_t \mid \lambda_s)$$

where $\theta$ is the reliability threshold set in the experiment: if the reliability score of a feature is greater than the threshold, its score is retained; otherwise it is rejected.

Finally, the speaker with the highest score, $\hat{s} = \arg\max_s S(X, \lambda_s)$, is selected as the target speaker of the sentence.
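A minimal sketch of this final stage, assuming each speaker model is a diagonal-covariance GMM given as a (weights, means, variances) triple.

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

def sentence_score(gmm_weights, means, variances, reliable_frames):
    """S(X, lambda_s): summed log-likelihood over the reliable frames only."""
    log_dens = np.stack(
        [multivariate_normal.logpdf(reliable_frames, mean=m, cov=np.diag(v))
         for m, v in zip(means, variances)], axis=1)        # (T, M)
    # log p(x_t | lambda_s) = logsumexp_i [log w_i + log p_i(x_t)]
    return logsumexp(log_dens + np.log(gmm_weights), axis=1).sum()

def identify(speaker_models, reliable_frames):
    """Return the index of the speaker model with the highest sentence score."""
    scores = [sentence_score(w, mu, var, reliable_frames)
              for (w, mu, var) in speaker_models]
    return int(np.argmax(scores))
```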
Results of the experiment
The database used in the experiment is the Chinese emotional speech database MASC, recorded in a quiet environment with an Olympus DM-20 recording pen. The database contains 68 native Chinese speakers, 45 male and 23 female. Each speaker produces utterances in 5 emotional states: neutral, anger, happiness, panic, and sadness. Each speaker reads 2 paragraphs in the neutral state, and speaks 5 words and 20 sentences, 3 times each, under every emotion.
The experiment was performed on an IBM server configured with an E5420 CPU (2.5 GHz main frequency) and 4 GB of memory.
In the experiment, the voices of the first 18 speakers serve as the development library: the neutral paragraphs of these 18 speakers are used to train the UBM, and their sentence utterances under the 5 emotions are used to train the fuzzy support vector machine models. The remaining 50 speakers constitute the evaluation set; for each of them, a speaker model is adapted from the UBM using his or her neutral paragraphs. All sentences under the five emotional states are used for testing, giving 15,000 test sentences (50 speakers × 5 emotions × 20 sentences × 3 repetitions). The experiment simulates the speaker identification process; the results, together with those of the GMM-UBM baseline, are shown in Table 1.
Table 1: Comparison of the proposed method with the baseline experiment

| Emotion | Baseline method | Proposed method |
| --- | --- | --- |
| Neutral | 96.23% | 95.50% |
| Anger | 31.50% | 37.60% |
| Happiness | 33.57% | 39.47% |
| Panic | 35.00% | 39.77% |
| Sadness | 61.43% | 63.63% |
| Average | 51.55% | 55.19% |
The experimental results show that the method effectively detects the reliable features in the sentences: the recognition accuracy improves markedly under every emotional state except neutral, where it drops only slightly, and the overall identification accuracy improves by 3.64 percentage points. The method is therefore very helpful for improving the performance and robustness of a speaker recognition system.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and refinements without departing from the concept of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (6)
1. An emotional speaker identification method based on reliability detection of a fuzzy support vector machine, characterized by comprising the following steps:
1) extracting the speech component features, and combining them with the corresponding weights in the universal background model (UBM) to form universal background model component features;
2) taking the universal background model component features obtained in step 1) as fuzzy memberships, and establishing a fuzzy support vector machine model under each universal background model component;
3) carrying out reliability detection with the fuzzy support vector machine models of step 2) to obtain the reliable features;
4) scoring the reliable features of step 3) to identify the speaker.
2. The method for emotion speaker recognition based on reliability detection of fuzzy support vector machine as claimed in claim 1, wherein said extracting speech component features comprises the steps of:
1) collecting voice signals and carrying out signal preprocessing on the voice signals;
2) extracting the characteristics of the preprocessed voice signals;
the feature extraction selects a feature extraction method based on a Mel cepstrum coefficient and/or a feature extraction method based on a linear prediction cepstrum coefficient;
the preprocessing comprises the following steps in sequence:
sampling and quantization, zero-drift (DC offset) removal, pre-emphasis, and windowing.
3. The method for emotion speaker recognition based on reliability detection of fuzzy support vector machine as claimed in claim 1, wherein said forming of general background model component features comprises the steps of:
1) randomly dividing the collected voice signals into a development library and an evaluation library;
2) selecting all the voices in the development library, extracting their features, and training a universal background model from them by the expectation-maximization (EM) algorithm;
3) calculating, for each frame of voice, the posterior probability on each Gaussian component of the universal background model as its weight;
4) combining the step 2) and the step 3) to form the general background model component characteristics.
4. The method as claimed in claim 2, wherein the fuzzy support vector machine model consists of two-class neutral-versus-emotional feature classifiers on each Gaussian component, the positive samples of the two-class classifiers being selected from the neutral speech in the development library and the negative samples from the emotional speech in the development library.
5. The method for emotional speaker recognition based on reliability detection of a fuzzy support vector machine according to any one of claims 1 to 4, characterized in that the reliability detection with the fuzzy support vector machine comprises the following steps:
1) calculating the score of the feature on the fuzzy support vector machine classifier of each Gaussian component, the above-mentioned $w_i$ and $b_i$ being the parameters of the classification plane under each Gaussian component;
2) calculating the weighted sum of the emotion scores of all Gaussian components, weighted by the posterior of the feature on each component;
3) judging from the result obtained in step 2) whether the feature is a reliable feature: if the result is smaller than the set threshold, the feature is taken as a reliable feature; otherwise it is rejected.
6. The method for emotional speaker recognition based on reliability detection of a fuzzy support vector machine according to any one of claims 1 to 4, characterized in that identifying the speaker from the universal background model component features comprises the following steps:
1) training a Gaussian mixture model for each speaker, the speaker model being adapted by the maximum a posteriori (MAP) probability method;
2) obtaining the likelihood of the test feature $x_t$ on the $s$-th speaker model by the formula $p(x_t \mid \lambda_s) = \sum_{i=1}^{M} w_i\, p_i(x_t \mid \lambda_s)$, and obtaining the score of the whole test sentence by the formula $S(X, \lambda_s) = \sum_{t:\, D(x_t) > \theta} \log p(x_t \mid \lambda_s)$;
the above-mentioned $\theta$ being the feature-reliability detection threshold set in the experiment, and $p_i$ the probability density of the Gaussian distribution;
3) the above-mentioned $\hat{s} = \arg\max_s S(X, \lambda_s)$ representing the identified speaker.