CN110047516A - A speech emotion recognition method based on gender perception - Google Patents

A speech emotion recognition method based on gender perception

Info

Publication number
CN110047516A
Authority
CN
China
Prior art keywords
gender
feature
perception
layer
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910186313.3A
Other languages
Chinese (zh)
Inventor
王龙标
党建武
张林娟
郭丽丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201910186313.3A priority Critical patent/CN110047516A/en
Publication of CN110047516A publication Critical patent/CN110047516A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 - Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L25/27 - Speech or voice analysis techniques characterised by the analysis technique
    • G10L25/30 - Speech or voice analysis techniques characterised by the analysis technique using neural networks
    • G10L25/48 - Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques specially adapted for particular use for comparison or discrimination
    • G10L25/63 - Speech or voice analysis techniques specially adapted for particular use for comparison or discrimination for estimating an emotional state

Abstract

The present invention discloses a speech emotion recognition method based on gender perception, which uses gender-aware features built from gender information: distributed gender features and gender-driven features. The gender-aware features are fused with the spectrogram into combined features, and a CNN-BLSTM network learns high-level deep features from the combined features for emotion classification. The main steps are speech segmentation, feature preparation, feature fusion, feature extraction, and classification. Compared with existing features, the gender-aware features of the invention make effective use of gender information, and the gender-aware speech emotion recognition method effectively improves the accuracy of speech emotion recognition.

Description

A speech emotion recognition method based on gender perception
Technical field
The present invention belongs to the field of speech emotion recognition, and specifically relates to the use of gender features and to a feature fusion method for speech emotion recognition.
Background art
Human-computer interaction now takes many forms, most notably dialogue systems and intelligent voice assistants. Emotion carries important semantic information, and speech emotion recognition is believed to help machines understand user intent. Accurately recognizing the user's mood enables good interactivity and improves the user experience. However, natural communication between humans and machines still faces many difficulties and true human-computer interaction has not yet been achieved, so speech emotion recognition remains a major challenge.
Many studies have found that gender differences affect emotional expression, which suggests that gender information can help speech emotion recognition. It has been found that building separate, gender-dependent speech emotion recognition systems achieves higher classification accuracy than incorporating gender information into a single emotion recognition system, and gender information has been widely used in speech emotion recognition tasks. However, simple encodings such as one-hot coding cannot exploit gender information effectively, so adding gender information in this way improves recognition accuracy only slightly.
To address these problems, we propose two new gender-aware features: the distributed gender feature and the gender-driven feature. The distributed gender feature describes the distribution of males and females as well as individual differences; the gender-driven feature is extracted from the acoustic signal by a DNN. Each gender-aware feature is fused with the spectrogram, and final classification is performed by a CNN-BLSTM model. The advantages are:
1) Traditional speech emotion recognition does not use gender information efficiently; the proposed gender-aware features make effective use of it.
2) The new gender-aware features not only indicate male/female identity but also reflect individual differences and part of the speaker's acoustic characteristics, improving the utilization of gender information.
3) The gender-aware features are effectively fused with the spectrogram, and emotion classification with a CNN-BLSTM model effectively improves the accuracy of emotion recognition.
Summary of the invention
The technical problem solved by the present invention is to propose gender-aware features that can make effective use of gender information: distributed gender features and gender-driven features. The gender-aware features are fused with the spectrogram into combined features, and a CNN-BLSTM network learns high-level deep features from the combined features for emotion classification. The specific technical solution is as follows:
Step 1, speech segmentation: the utterance-level speech signal is divided into speech segments of fixed length.
Step 2, feature preparation
1) Spectrogram extraction: a short-time Fourier transform is applied to each speech segment to obtain a raw spectrogram S of size a × b;
2) Gender-aware feature extraction: distributed gender features and gender-driven features;
2-1) Distributed gender feature extraction: random values of fixed dimension are first set for males and females as the male and female templates. To reflect individual differences, a random variable is added to the fixed gender template. Finally, the distributed gender feature of males, DGF_M, varies within the range m-k, and the distributed gender feature of females, DGF_F, varies within the range k-z;
2-2) Gender-driven feature extraction: x-dimensional acoustic features are first extracted from each speech segment. To give the features gender-discriminative power, a deep neural network (DNN) is used to extract a y-dimensional bottleneck feature from the acoustic features as the gender-driven feature GDF.
Step 3, feature fusion
The raw spectrogram S extracted in step 2, 1) and the DGF from step 2, 2-1) are fused into the combined feature F1. The combined feature vector F1 of the j-th segment of the i-th utterance can be expressed as:
F1_ij = [S_ij, DGF_ij]   (1)
The raw spectrogram S extracted in step 2, 1) and the GDF from step 2, 2-2) are fused into the combined feature F2. The combined feature vector F2 of the j-th segment of the i-th utterance can be expressed as:
F2_ij = [S_ij, GDF_ij]   (2)
Step 4, feature extraction. A CNN is used to extract hierarchical features from each combined feature.
Step 5, classification. The hierarchical features obtained in step 4 are arranged chronologically into utterance-level features and fed into a BLSTM network to learn contextual temporal dependencies, completing utterance-level emotion classification over 7 emotions: neutral, sad, fearful, happy, angry, bored, and disgusted.
A DNN is used in step 2 to obtain the gender-driven feature. The specific construction steps of the gender-driven feature are as follows:
1) The input of the DNN is the x-dimensional acoustic feature.
2) Three hidden layers h1, h2, h3 are set, where h2 has fewer hidden units than h1 and h3; h2 is also called the bottleneck layer.
3) The true gender label is used as the teacher signal for training the DNN. The DNN is trained by backpropagating the derivative of the cost function; the cost function is the cross-entropy between the target output and the actual output for each training example.
After the DNN is trained, the output of hidden layer h2 is the bottleneck feature, i.e., the gender-driven feature.
The gender-aware speech emotion recognition method of the present invention is based on a CNN-BLSTM model, configured as follows:
The CNN has two convolutional layers and two max-pooling layers. The first convolutional layer has n1 convolution kernels of size k1 × k1. The pooling size of the first pooling layer is p1 × p1. The second convolutional layer has n2 convolution kernels of size k2 × k2. The pooling size of the second pooling layer is p2 × p2. A flatten layer converts the two-dimensional feature maps into a one-dimensional vector. After the flatten layer, a fully connected layer with s hidden units maps the features to s dimensions. The BLSTM has two hidden layers, each with u hidden units.
Beneficial effects
Compared with existing features, the gender-aware features of the invention make effective use of gender information. The gender-aware speech emotion recognition method effectively improves the accuracy of speech emotion recognition.
Brief description of the drawings
Fig. 1 is the architecture of the gender-driven-feature-based speech emotion recognition model of the present invention;
Fig. 2 is the structure of the DNN model used to extract the gender-driven feature.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings.
To verify the present invention, experiments were carried out on the Emo-DB database. Emo-DB contains 535 sentences covering 7 emotions: sad, happy, fearful, neutral, angry, bored, and disgusted.
Fig. 1 shows the architecture of the gender-driven-feature-based speech emotion recognition model of the present invention. As shown in Fig. 1, the method mainly consists of the following five steps.
Step 1, speech segmentation. The utterance-level speech signal is divided into speech segments of fixed length. Each segment is 265 ms long and contains 25 frames, with a frame length of 25 ms and a frame shift of 10 ms. Segmenting the Emo-DB emotion database in this way yields more than 50,000 speech segments for the experiments; the longest sentence is divided into 349 speech segments.
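As an illustration, a minimal segmentation sketch in Python (numpy) is given below. The sampling rate and the hop between successive segments are assumptions, since the patent specifies only the segment layout (25 frames of 25 ms with a 10 ms shift); the function name `segment_utterance` is hypothetical.

```python
import numpy as np

def segment_utterance(wave, sr, n_frames=25, frame_ms=25,
                      shift_ms=10, seg_hop_frames=1):
    """Split an utterance-level waveform into fixed-length segments.

    Each segment spans n_frames analysis frames, i.e.
    25 + (25 - 1) * 10 = 265 ms. The hop between segments
    (seg_hop_frames, in frames) is an assumption.
    """
    frame_len = int(sr * frame_ms / 1000)                 # samples per frame
    frame_shift = int(sr * shift_ms / 1000)               # samples per frame shift
    seg_len = frame_len + (n_frames - 1) * frame_shift    # 265 ms in samples
    seg_shift = seg_hop_frames * frame_shift

    segments = [wave[s:s + seg_len]
                for s in range(0, len(wave) - seg_len + 1, seg_shift)]
    return np.stack(segments) if segments else np.empty((0, seg_len))
```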
Step 2, feature preparation.
1) Spectrogram extraction: a short-time Fourier transform is applied to each speech segment to obtain a raw spectrogram S of size 25 × 129;
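A sketch of the segment-level spectrogram extraction follows. The Hamming window, the log compression, and the 256-point FFT (which yields the 129 frequency bins reported above, assuming 25 ms frames at an 8 kHz sampling rate) are assumptions; the patent does not state the sampling rate, window, or FFT size.

```python
import numpy as np

def segment_spectrogram(segment, sr=8000, n_frames=25,
                        frame_ms=25, shift_ms=10, n_fft=256):
    """STFT of one fixed-length segment, returning a (25, 129) array
    under the assumed parameters (25 ms frames, 10 ms shift, 256-point FFT)."""
    frame_len = int(sr * frame_ms / 1000)       # 200 samples at 8 kHz
    frame_shift = int(sr * shift_ms / 1000)     # 80 samples
    window = np.hamming(frame_len)

    spec = np.empty((n_frames, n_fft // 2 + 1))
    for i in range(n_frames):
        frame = segment[i * frame_shift: i * frame_shift + frame_len]
        spec[i] = np.abs(np.fft.rfft(frame * window, n=n_fft))
    return np.log(spec + 1e-8)                  # log-magnitude (assumption)
```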
2) Gender-aware feature extraction: distributed gender features and gender-driven features.
2-1) Distributed gender feature extraction: 32-dimensional random values are first set for males and females as the male and female templates. To reflect individual differences, a random variable is added to the fixed gender template. Finally, the distributed gender feature of males, DGF_M, varies within the range 0-0.5, and the distributed gender feature of females, DGF_F, varies within the range 0.5-1.
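A minimal numpy sketch of the distributed gender feature is shown below. The uniform noise model, its scale, and the fixed seed are assumptions; the patent states only that a 32-dimensional random template per gender is perturbed by a random variable so that male values stay within 0-0.5 and female values within 0.5-1.

```python
import numpy as np

rng = np.random.default_rng(0)                    # fixed seed (assumption)
DIM = 32

# Fixed gender templates: one 32-dimensional random vector per gender.
male_template = rng.uniform(0.10, 0.40, DIM)      # lies inside [0, 0.5]
female_template = rng.uniform(0.60, 0.90, DIM)    # lies inside [0.5, 1]

def distributed_gender_feature(gender, noise_scale=0.05):
    """Template plus a small random perturbation (individual difference);
    clipping keeps male features in [0, 0.5] and female features in [0.5, 1]."""
    noise = rng.uniform(-noise_scale, noise_scale, DIM)
    if gender == "male":
        return np.clip(male_template + noise, 0.0, 0.5)
    return np.clip(female_template + noise, 0.5, 1.0)
```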
2-2) Gender-driven feature extraction: the openSMILE toolkit is first used to extract 384-dimensional acoustic features from each speech segment. The 384-dimensional acoustic feature set is the one defined by the INTERSPEECH 2009 Emotion Challenge and consists of 32 low-level descriptors (LLDs) and their statistics. The 32 LLDs include the zero-crossing rate, root-mean-square energy, fundamental frequency, harmonics-to-noise ratio of the autocorrelation function, Mel-frequency cepstral coefficients, and so on.
To give the features gender-discriminative power, a deep neural network (DNN) is used to extract a 32-dimensional bottleneck feature from the acoustic features as the gender-driven feature. Fig. 2 shows the structure of the DNN model used to extract the bottleneck feature. The input of the DNN is the 384-dimensional acoustic feature. The three hidden layers h1, h2, h3 have 1024, 32, and 1024 hidden units, respectively. The output layer of the DNN uses the true gender label as the teacher signal for training. After the DNN is trained, the output of hidden layer h2 is the 32-dimensional gender-driven feature.
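The bottleneck DNN could be sketched in Keras as follows. The ReLU activations, the Adam optimizer, and the two-unit softmax output are assumptions; the patent fixes only the 1024-32-1024 hidden layer sizes, the gender teacher signal, and the cross-entropy cost.

```python
from tensorflow.keras import layers, Model

def build_gender_dnn(input_dim=384):
    """1024-32-1024 DNN; hidden layer h2 is the 32-dim bottleneck."""
    x_in = layers.Input(shape=(input_dim,))
    h1 = layers.Dense(1024, activation="relu")(x_in)
    h2 = layers.Dense(32, activation="relu", name="bottleneck")(h1)  # gender-driven feature
    h3 = layers.Dense(1024, activation="relu")(h2)
    out = layers.Dense(2, activation="softmax")(h3)                  # male / female
    model = Model(x_in, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

dnn = build_gender_dnn()
# dnn.fit(acoustic_features, gender_onehot, epochs=..., batch_size=...)

# After training, the bottleneck output serves as the 32-dim GDF:
gdf_extractor = Model(dnn.input, dnn.get_layer("bottleneck").output)
# gdf = gdf_extractor.predict(acoustic_features)   # shape (N, 32)
```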
Step 3, feature fusion.
The DGF from step 2, 2-1) is repeated 25 times along the time dimension to form a segment-level DGF of size 25 × 32. The raw spectrogram of size 25 × 129 extracted in step 2, 1) is fused with the segment-level DGF into the segment-level combined feature F1, whose size is 25 × 161.
Similarly, the GDF from step 2, 2-2) is repeated 25 times along the time dimension to form a segment-level GDF of size 25 × 32. The raw spectrogram of size 25 × 129 extracted in step 2, 1) is fused with the segment-level GDF into the segment-level combined feature F2, whose size is 25 × 161.
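Under the sizes stated above, the segment-level fusion can be sketched in numpy as follows; the function name `fuse` is hypothetical.

```python
import numpy as np

def fuse(spectrogram, gender_feature):
    """Tile a 32-dim gender-aware feature along time and concatenate.

    spectrogram:    (25, 129) segment spectrogram
    gender_feature: (32,)     DGF or GDF for this segment
    returns:        (25, 161) combined feature F1 or F2
    """
    tiled = np.tile(gender_feature, (spectrogram.shape[0], 1))   # (25, 32)
    return np.concatenate([spectrogram, tiled], axis=1)          # (25, 161)
```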
Step 4, feature extraction.Using CNN respectively from assemblage characteristic F1、F2Middle extraction level characteristics.There are two convolution by CNN Layer and two maximum pond layers.First convolutional layer has 32 convolution kernels, and convolution size is 5 × 5, activation primitive relu.First The pond size of a pond layer is 2 × 2.Second convolutional layer has 64 convolution kernels, and convolution size is 5 × 5, and activation primitive is relu.The size of second pond layer is 2 × 2.After flattening layer, being fully connected with 1024 hidden units is used Layer ties up the Feature Mapping of study to 1024.The hidden unit of output layer is 7, activation primitive softmax.It is obtained after full articulamentum Take the level characteristics in subsequent classification.
Step 5, classification. The hierarchical features obtained in step 4 are arranged chronologically into utterance-level feature sequences and fed into a BLSTM network to learn contextual temporal dependencies, completing utterance-level classification of the seven emotions: neutral, sad, fearful, happy, angry, bored, and disgusted. The BLSTM has two hidden layers, each with 1024 hidden units.
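The utterance-level BLSTM stage might be sketched as follows; handling variable-length utterances with a `None` time dimension and a masking layer is an implementation assumption.

```python
from tensorflow.keras import layers, Model

def build_utterance_blstm(feat_dim=1024, n_classes=7):
    """Two bidirectional LSTM layers (1024 units each) over the
    chronological sequence of segment-level CNN features."""
    seq_in = layers.Input(shape=(None, feat_dim))    # (n_segments, 1024)
    x = layers.Masking()(seq_in)                     # skip zero-padded segments
    x = layers.Bidirectional(layers.LSTM(1024, return_sequences=True))(x)
    x = layers.Bidirectional(layers.LSTM(1024))(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    model = Model(seq_in, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

blstm = build_utterance_blstm()
# blstm.fit(segment_feature_sequences, utterance_emotion_onehot, ...)
```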
Table 1. Results of fusing gender-aware features with the spectrogram on the Emo-DB database

ID | Feature | Size | Weighted accuracy | Unweighted accuracy
1 | Spectrogram | 25 × 129 | 86.73% | 86.40%
2 | Spectrogram + one-hot gender feature | 25 × 131 | 86.92% | 86.24%
3 | Spectrogram + distributed gender feature | 25 × 161 | 88.97% | 88.31%
4 | Spectrogram + gender-driven feature | 25 × 161 | 92.71% | 92.62%
Table 1 reports the weighted accuracy and unweighted accuracy of speech emotion classification with different features in the gender-aware speech emotion recognition model. From Table 1 we draw the following conclusions. 1) Fusing the one-hot gender feature (male 01, female 10) with the spectrogram does not yield good classification results, because the one-hot gender feature has only 2 dimensions, which is essentially negligible, so the CNN cannot learn the information carried by the added one-hot gender feature. 2) Compared with using the spectrogram alone, adding the distributed gender feature and the gender-driven feature to the gender-aware speech emotion recognition system reduces the relative error in unweighted accuracy by 14.04% and 45.74%, respectively. 3) The gender-driven feature achieves better classification accuracy than the distributed gender feature, because the gender-driven feature not only indicates gender but also reflects the speaker's real individual differences and acoustic information, whereas the distributed gender feature reflects only the speaker's gender. The results show that the speech emotion recognition method based on gender perception improves the accuracy of speech emotion classification, demonstrating that the present invention is effective.
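For reference, the quoted relative error reductions follow directly from the unweighted accuracies in Table 1 (error = 1 - UA, so 0.1360 = 1 - 0.8640, 0.1169 = 1 - 0.8831, 0.0738 = 1 - 0.9262):

$$\frac{0.1360 - 0.1169}{0.1360} \approx 14.04\%, \qquad \frac{0.1360 - 0.0738}{0.1360} \approx 45.74\%$$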

Claims (4)

1. A speech emotion recognition method based on gender perception, characterized in that: first, gender-aware features are built from gender information, namely distributed gender features and gender-driven features; then, each gender-aware feature is fused with the spectrogram into a combined feature, and a CNN-BLSTM network learns high-level deep features from the combined feature and performs emotion classification.
2. The speech emotion recognition method based on gender perception according to claim 1, characterized in that the specific steps are as follows:
Step 1, speech segmentation: the utterance-level speech signal is divided into speech segments of fixed length;
Step 2, feature preparation
1) spectrogram extraction: a short-time Fourier transform is applied to each speech segment to obtain a raw spectrogram S of size a × b;
2) gender-aware feature extraction: distributed gender features and gender-driven features;
2-1) distributed gender feature extraction: random values of fixed dimension are first set for males and females as the male and female templates; a random variable is added to the fixed gender template; finally, the distributed gender feature of males, DGF_M, varies within the range m-k, and the distributed gender feature of females, DGF_F, varies within the range k-z;
2-2) gender-driven feature extraction: x-dimensional acoustic features are first extracted from each speech segment; a deep neural network (DNN) extracts a y-dimensional bottleneck feature from the acoustic features as the gender-driven feature GDF;
Step 3, feature fusion
the raw spectrogram S extracted in step 2, 1) and the DGF from step 2, 2-1) are fused into the combined feature F1; the combined feature vector F1 of the j-th segment of the i-th utterance can be expressed as:
F1_ij = [S_ij, DGF_ij]   (1)
the raw spectrogram S extracted in step 2, 1) and the GDF from step 2, 2-2) are fused into the combined feature F2; the combined feature vector F2 of the j-th segment of the i-th utterance can be expressed as:
F2_ij = [S_ij, GDF_ij]   (2)
Step 4, feature extraction
a CNN is used to extract hierarchical features from each combined feature;
Step 5, classification
the hierarchical features obtained in step 4 are arranged chronologically into utterance-level features and fed into a BLSTM network to learn contextual temporal dependencies, completing utterance-level emotion classification.
3. The speech emotion recognition method based on gender perception according to claim 1, characterized in that the specific construction steps of the gender-driven feature in step 2 are as follows:
1) the input of the DNN is the x-dimensional acoustic feature;
2) three hidden layers h1, h2, h3 are set, where h2 has fewer hidden units than h1 and h3, and h2 is also called the bottleneck layer;
3) the true gender label is used as the teacher signal for training the DNN; the DNN is trained by backpropagating the derivative of the cost function, where the cost function is the cross-entropy between the target output and the actual output for each training example;
after the DNN is trained, the output of hidden layer h2 is the bottleneck feature, i.e., the gender-driven feature.
4. The speech emotion recognition method based on gender perception according to claim 1, characterized in that the configuration of the CNN-BLSTM model in steps 4 and 5 is as follows:
the CNN has two convolutional layers and two max-pooling layers:
the first convolutional layer has n1 convolution kernels of size k1 × k1;
the pooling size of the first pooling layer is p1 × p1; the second convolutional layer has n2 convolution kernels of size k2 × k2;
the pooling size of the second pooling layer is p2 × p2;
after the flatten layer, a fully connected layer with s hidden units maps the features to s dimensions;
the BLSTM has two hidden layers, each with u hidden units.
CN201910186313.3A 2019-03-12 2019-03-12 A kind of speech-emotion recognition method based on gender perception Pending CN110047516A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910186313.3A CN110047516A (en) 2019-03-12 2019-03-12 A kind of speech-emotion recognition method based on gender perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910186313.3A CN110047516A (en) 2019-03-12 2019-03-12 A kind of speech-emotion recognition method based on gender perception

Publications (1)

Publication Number Publication Date
CN110047516A true CN110047516A (en) 2019-07-23

Family

ID=67274783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910186313.3A Pending CN110047516A (en) 2019-03-12 2019-03-12 A kind of speech-emotion recognition method based on gender perception

Country Status (1)

Country Link
CN (1) CN110047516A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013120467A (en) * 2011-12-07 2013-06-17 National Institute Of Advanced Industrial & Technology Device and method for extracting signal features
CN108010514A (en) * 2017-11-20 2018-05-08 四川大学 A kind of method of speech classification based on deep neural network
CN109272993A (en) * 2018-08-21 2019-01-25 中国平安人寿保险股份有限公司 Recognition methods, device, computer equipment and the storage medium of voice class
CN109389992A (en) * 2018-10-18 2019-02-26 天津大学 A kind of speech-emotion recognition method based on amplitude and phase information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Linjuan Zhang et al., "Gender-Aware CNN-BLSTM for Speech Emotion Recognition", ICANN 2018: Artificial Neural Networks and Machine Learning *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110555379A (en) * 2019-07-30 2019-12-10 华南理工大学 human face pleasure degree estimation method capable of dynamically adjusting features according to gender
CN110555379B (en) * 2019-07-30 2022-03-25 华南理工大学 Human face pleasure degree estimation method capable of dynamically adjusting features according to gender
CN111402927A (en) * 2019-08-23 2020-07-10 南京邮电大学 Speech emotion recognition method based on segmented spectrogram and dual-Attention
CN110619889B (en) * 2019-09-19 2022-03-15 Oppo广东移动通信有限公司 Sign data identification method and device, electronic equipment and storage medium
CN110619889A (en) * 2019-09-19 2019-12-27 Oppo广东移动通信有限公司 Sign data identification method and device, electronic equipment and storage medium
CN110675893A (en) * 2019-09-19 2020-01-10 腾讯音乐娱乐科技(深圳)有限公司 Song identification method and device, storage medium and electronic equipment
CN110675893B (en) * 2019-09-19 2022-04-05 腾讯音乐娱乐科技(深圳)有限公司 Song identification method and device, storage medium and electronic equipment
CN110728997A (en) * 2019-11-29 2020-01-24 中国科学院深圳先进技术研究院 Multi-modal depression detection method and system based on context awareness
CN110728997B (en) * 2019-11-29 2022-03-22 中国科学院深圳先进技术研究院 Multi-modal depression detection system based on context awareness
CN111899766A (en) * 2020-08-24 2020-11-06 南京邮电大学 Speech emotion recognition method based on optimization fusion of depth features and acoustic features
CN111899766B (en) * 2020-08-24 2023-04-14 南京邮电大学 Speech emotion recognition method based on optimization fusion of depth features and acoustic features
CN112712824A (en) * 2021-03-26 2021-04-27 之江实验室 Crowd information fused speech emotion recognition method and system
CN112927723A (en) * 2021-04-20 2021-06-08 东南大学 High-performance anti-noise speech emotion recognition method based on deep neural network
CN113593526A (en) * 2021-07-27 2021-11-02 哈尔滨理工大学 Speech emotion recognition method based on deep learning

Similar Documents

Publication Publication Date Title
CN110047516A (en) A kind of speech-emotion recognition method based on gender perception
CN105427858B (en) Realize the method and system that voice is classified automatically
CN107993665B (en) Method for determining role of speaker in multi-person conversation scene, intelligent conference method and system
CN106228977B (en) Multi-mode fusion song emotion recognition method based on deep learning
CN110634491B (en) Series connection feature extraction system and method for general voice task in voice signal
CN102982809B (en) Conversion method for sound of speaker
CN109036465B (en) Speech emotion recognition method
CN109241255A (en) A kind of intension recognizing method based on deep learning
CN109065032B (en) External corpus speech recognition method based on deep convolutional neural network
WO2023273170A1 (en) Welcoming robot conversation method
CN109460737A (en) A kind of multi-modal speech-emotion recognition method based on enhanced residual error neural network
CN109119072A (en) Civil aviaton's land sky call acoustic model construction method based on DNN-HMM
CN110097894A (en) A kind of method and system of speech emotion recognition end to end
CN106847309A (en) A kind of speech-emotion recognition method
CN108763326A (en) A kind of sentiment analysis model building method of the diversified convolutional neural networks of feature based
CN109065033A (en) A kind of automatic speech recognition method based on random depth time-delay neural network model
CN107492382A (en) Voiceprint extracting method and device based on neutral net
CN110390955A (en) A kind of inter-library speech-emotion recognition method based on Depth Domain adaptability convolutional neural networks
CN102800314A (en) English sentence recognizing and evaluating system with feedback guidance and method of system
CN104538027B (en) The mood of voice social media propagates quantization method and system
CN110148408A (en) A kind of Chinese speech recognition method based on depth residual error
CN111583964A (en) Natural speech emotion recognition method based on multi-mode deep feature learning
CN102237083A (en) Portable interpretation system based on WinCE platform and language recognition method thereof
Wang et al. Research on speech emotion recognition technology based on deep and shallow neural network
CN107293290A (en) The method and apparatus for setting up Speech acoustics model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190723

WD01 Invention patent application deemed withdrawn after publication