CN103151039A - Speaker age identification method based on SVM (Support Vector Machine) - Google Patents
Speaker age identification method based on SVM (Support Vector Machine)
- Publication number
- CN103151039A (application number CN2013100494454A)
- Authority
- CN
- China
- Prior art keywords
- voice
- speech
- voice signal
- characteristic parameter
- vector machine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a speaker age identification method based on an SVM (Support Vector Machine) classifier. The method comprises the following steps: a voice library storing voice signals of speakers of different ages is established; the voice signals in the voice library are preprocessed; voice feature parameters are extracted from the preprocessed voice signals; SVM training is performed on the extracted voice feature parameters to obtain an SVM model; and the SVM model is used to predict the voice feature parameters X of the voice to be identified, where the output of each SVM is logically judged during prediction and the age class with the most votes is taken as the most probable age class, giving the final age identification result. The method fills, to a certain degree, a gap in the prior art regarding research on speaker age identification, allows the speaker's age to be judged better, and has broad application prospects in settings such as man-machine interaction, criminal investigation, games and entertainment.
Description
Technical field
The present invention relates to the field of pattern recognition, and in particular to a speaker age bracket recognition method based on a support vector machine (Support Vector Machine, SVM).
Background
At present, research on speech recognition, speaker identification and related topics is relatively mature. Solutions have also been proposed for related problems built on this foundation, such as Chinese speech emotion recognition, speaker gender recognition, and audio classification and recognition. Recognition of a speaker's age bracket, however, has received almost no attention, even though it is useful in many settings. In an interactive system, a machine that recognizes the speaker's age bracket can answer with speech suited to that age bracket, making the man-machine interaction feel friendlier. In criminal investigation, the age bracket of a suspect identified from an audio recording can narrow the search for the target. The speaker age bracket recognition method proposed by the present invention can therefore provide a foundation for developing applications in these settings.
In general, a person's life can be roughly divided into the following stages: childhood (0 to 11 years), adolescence (12 to 17 years), young adulthood (18 to 34 years), middle age (35 to 50 years) and old age (over 50 years). As a person ages and passes through these stages, his or her voice gradually changes, while the voices of people in the same age bracket share common characteristics. The present invention builds on this property, namely that the speech of speakers in each age bracket carries characteristics of that age bracket.
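As a non-limiting illustration, the age brackets listed above can be written as a small lookup table. The sketch below assumes integer ages in years and assumes that the brackets are numbered 1 to 5 in the order listed, matching the class labels {1, 2, 3, 4, 5} used in the embodiment; the names `AGE_BRACKETS` and `age_to_label` are hypothetical and do not appear in the original text.

```python
# Age brackets as listed in the description. The numbering 1..5 is an
# assumption that follows the order in which the brackets are listed.
AGE_BRACKETS = [
    (1, "childhood", 0, 11),
    (2, "adolescence", 12, 17),
    (3, "young adulthood", 18, 34),
    (4, "middle age", 35, 50),
    (5, "old age", 51, None),  # "over 50 years": no upper bound
]


def age_to_label(age):
    """Return the class label (1..5) for an integer age in years."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for label, _name, low, high in AGE_BRACKETS:
        if age >= low and (high is None or age <= high):
            return label
    raise ValueError("no bracket found for age %r" % age)
```

Such a mapping would only be used to label the recordings in the speech library; the classifier itself never sees the numeric age.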
The SVM classification method performs well in recognition applications such as audio classification, speaker gender recognition and image recognition, so the present invention adopts an SVM model for classification. Among speech feature parameters, the Mel-frequency cepstral coefficients (MFCC) are acoustic features derived from the auditory properties of the human ear. The pitch that the ear perceives is not a simple linear function of the sound's frequency. Studies show that below 1 kHz the ear's perception of frequency is approximately linear, whereas above 1 kHz it is approximately linear on a logarithmic frequency axis. MFCCs are cepstral parameters extracted in the Mel-scale frequency domain. They de-emphasize the high-frequency components of the speech spectrum and are robust to noise, so they are used as the feature parameters for training and recognition with the SVM classifier.
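The relationship between frequency and the Mel scale described above can be illustrated with a short sketch. The original text does not give a formula; the conversion below, with the constants 2595 and 700, is one commonly used convention and is an assumption here, as are the function names `hz_to_mel` and `mel_to_hz`.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to Mel (a common convention, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
```

Under this convention 1 kHz maps to roughly 1000 Mel, and equal steps in Mel correspond to increasingly wide steps in Hz at higher frequencies, which is the approximately logarithmic behaviour described above.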
Summary of the invention
The object of the present invention is to judge a speaker's age bracket by combining an SVM classifier with the MFCC feature parameters of the speech signal, for use wherever such a judgment is needed. Specifically, speech feature parameters that can distinguish speakers' age brackets are extracted, and an SVM is trained and used to identify the age bracket to which a speaker belongs.
To achieve this object, the present invention proposes a speaker age bracket recognition method based on a support vector machine (SVM), comprising the following steps:
Step 1: establish a speech library storing speech signals of speakers from a plurality of different age brackets;
Step 2: preprocess the speech signals in the speech library;
Step 3: extract speech feature parameters from the preprocessed speech signals;
Step 4: perform support vector machine training on the extracted speech feature parameters to obtain a support vector machine model;
Step 5: use the support vector machine model obtained in step 4 to predict the speech feature parameters X of the speech to be identified; during prediction, the output of each support vector machine is passed through a logical decision, and the class with the most votes is selected as the most probable age bracket, giving the final age bracket recognition result.
In summary, the invention provides a method for recognizing a speaker's age bracket. Because there is currently almost no research on speaker age bracket recognition, the invention has broad application prospects, for example in man-machine interaction, criminal investigation, online chat and entertainment. The invention uses a support vector machine classifier together with characteristic feature parameters of the speech signal to identify the speaker's age bracket. The MFCC feature parameters extracted in the method match the characteristics of human hearing and, after training, can effectively distinguish speakers of different age brackets. They are also robust to noise and are very widely used in the field of speaker identification. The SVM classifier can also reduce the dimensionality of the feature parameters and gives good classification results in recognition applications. The invention trains an SVM on the MFCC parameters of speech from different age brackets and then predicts the class of the speech parameters under test, which allows the speaker's age bracket to be judged well. At the boundaries between age brackets, however, a speaker's voice changes only slowly over time, so speech near the edge of a bracket is harder to identify. In addition, the voice characteristics of individual speakers may not match those typical of their age bracket, which also makes recognition harder. Overall, the average recognition rate of the invention across all age brackets is estimated to exceed 70%.
Description of drawings
Fig. 1 is a flow chart of the SVM-based speaker age bracket recognition method of the present invention;
Fig. 2 is a flow chart of SVM training according to an embodiment of the invention;
Fig. 3 is a diagram of SVM decision and recognition according to an embodiment of the invention.
Embodiment
To make the object, technical solution and advantages of the invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Fig. 1 is a flow chart of the SVM-based speaker age bracket recognition method of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step 1: establish a speech library storing speech signals of speakers from a plurality of different age brackets, the speech signals being segmented into phrase-level units;
In this step, a voice recorder or other recording equipment is first used to collect the speech of speakers from different age brackets. The recordings may uniformly use a 16 kHz sampling rate, 16-bit quantization and a single channel (mono). In one embodiment of the invention, 20 speakers (10 male and 10 female) are recorded for each age bracket. The reading material consists of classic prose such as "Moonlight on the Lotus Pond" and "The Retreating Figure", each read once. The recorded speech is then cut into speech segments, one phrase per segment.
Step 2: preprocess the speech signals in the speech library;
In this step, the preprocessing further comprises the following steps:
Step 21: sample and quantize the speech signal;
Step 22: to remove the effect of lip and nostril radiation and boost the high-frequency part of the signal, apply pre-emphasis to the quantized speech signal using the following filter:
$$H(z) = 1 - 0.9375\,z^{-1}$$
where z is the z-transform variable and H(z) is the transfer function of the pre-emphasis filter; in the time domain this corresponds to y(n) = x(n) - 0.9375 x(n-1), with x(n) the quantized speech signal and y(n) the pre-emphasized speech signal;
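The pre-emphasis of step 22 may be sketched in the time domain as a first-order difference. This is a minimal illustration only: the function name `pre_emphasis` is hypothetical, and the treatment of the first sample (the sample before the start of the signal is taken as zero) is an assumption, since the original text does not specify it.

```python
def pre_emphasis(samples, alpha=0.9375):
    """Apply y[n] = x[n] - alpha * x[n-1] to a sequence of samples.

    The sample preceding the first one is assumed to be zero, so the
    first output equals the first input.
    """
    out = []
    previous = 0.0
    for x in samples:
        out.append(x - alpha * previous)
        previous = x
    return out
```

A constant (low-frequency) input is strongly attenuated after the first sample, whereas a rapidly alternating input is amplified, which is the high-frequency boost that the step describes.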
Step 23: because each speech signal is a phrase and there are pauses between the words within a phrase, an endpoint detection method based on energy and zero-crossing rate is used to remove the silent segments from each speech signal.
The endpoint detection method uses a two-stage decision and further comprises the following steps:
Step 231: divide the speech signal into short-time frames to obtain a plurality of speech frames; the frame length is 20 ms, which at a sampling rate of 16 kHz is 320 samples;
Step 232: compute the short-time energy and short-time zero-crossing rate of each speech frame;
Step 233: set a relatively high decision threshold E1 based on the average energy of all speech frames and compare the short-time energy of each speech frame with E1 to obtain a preliminary estimate of the speech endpoints; the actual endpoints lie outside the time interval bounded by the intersections of E1 with the short-time energy envelope;
Step 234: set a somewhat lower decision threshold E2 based on the average energy of the background noise and refine the endpoints found in step 233;
Step 235: set a threshold Z1 based on the average zero-crossing rate of the background noise and, starting from the endpoints found in step 234, detect the unvoiced consonants at the start of the speech and the trailing sounds at its end, finally obtaining the endpoints of the speech segments and silent segments.
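Steps 231 to 235 may be sketched as follows. This is a simplified illustration under several assumptions: frames do not overlap, a single speech segment is sought per signal, and the thresholds E1, E2 and Z1 are passed in as parameters (`e_high`, `e_low`, `z_thresh`) because the original text does not state how they are derived from the averages. All function names are hypothetical.

```python
def split_frames(samples, frame_len=320):
    """Split into non-overlapping frames (20 ms at 16 kHz = 320 samples).

    A trailing partial frame is dropped.
    """
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / max(len(frame) - 1, 1)


def detect_endpoints(frames, e_high, e_low, z_thresh):
    """Return (first, last) frame indices of the speech, or None."""
    energies = [short_time_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]
    # Stage 1 (E1): frames whose energy clearly exceeds the high threshold.
    loud = [i for i, e in enumerate(energies) if e > e_high]
    if not loud:
        return None
    start, end = loud[0], loud[-1]
    last = len(frames) - 1
    # Stage 2 (E2): extend outward while energy stays above the low threshold.
    while start > 0 and energies[start - 1] > e_low:
        start -= 1
    while end < last and energies[end + 1] > e_low:
        end += 1
    # Zero-crossing check (Z1): include low-energy, high-ZCR frames at the
    # edges, which may be unvoiced consonants or trailing sounds.
    while start > 0 and zcrs[start - 1] > z_thresh:
        start -= 1
    while end < last and zcrs[end + 1] > z_thresh:
        end += 1
    return start, end
```

Frames outside the returned range would be treated as silence and discarded. A practical implementation would likely also handle several speech segments per phrase, which this sketch does not attempt.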
Step 3: extract speech feature parameters from the preprocessed speech signals;
In this step, the speech feature parameters are MFCCs; in one embodiment of the invention, the MFCC vector is, for example, 12-dimensional. The extraction of the speech feature parameters may comprise the following steps:
Step 31: divide the frequency range of the speech signal using a bank of triangular Mel filters;
Step 32: take the weighted sum of all signal amplitudes within the bandwidth of each triangular Mel filter as the output of that filter;
Step 33: take the logarithm of the outputs of all filters;
Step 34: apply a discrete cosine transform to the result of step 33 to obtain the MFCCs.
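Steps 31 to 34 may be sketched for a single frame as below. Several details are assumptions that the original text does not specify: the magnitude spectrum is obtained with a plain discrete Fourier transform and no window, the Mel conversion uses the common 2595/700 convention, the filter bank has 24 filters, and coefficient 0 of the cosine transform is discarded so that 12 coefficients remain. All function names are hypothetical.

```python
import cmath
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (adequate for one frame)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Step 31: triangular filters with centres equally spaced in Mel."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) * n_fft / sample_rate
             for i in range(num_filters + 2)]  # fractional bin positions
    bank = []
    for m in range(1, num_filters + 1):
        left, centre, right = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for k in range(n_fft // 2 + 1):
            if left < k <= centre:
                weights.append((k - left) / (centre - left))
            elif centre < k < right:
                weights.append((right - k) / (right - centre))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def mfcc(frame, sample_rate=16000, num_filters=24, num_coeffs=12):
    spectrum = magnitude_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Step 32: weighted sum of spectral amplitudes within each filter band.
    outputs = [sum(w * s for w, s in zip(weights, spectrum)) for weights in bank]
    # Step 33: logarithm of each filter output (floored to avoid log(0)).
    log_out = [math.log(max(o, 1e-10)) for o in outputs]
    # Step 34: DCT-II of the log outputs, keeping coefficients 1..num_coeffs.
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, v in enumerate(log_out))
            for c in range(1, num_coeffs + 1)]
```

Because coefficient 0 is dropped, multiplying the whole frame by a constant gain leaves the returned coefficients essentially unchanged in this sketch. How the per-frame coefficients are combined into one feature vector per phrase is not stated in the original and is not addressed here.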
Step 4: perform SVM training on the extracted speech feature parameters to obtain an SVM model;
As shown in Fig. 2, the SVM training comprises the following steps:
Step 41: take the extracted MFCC speech feature parameters of each age bracket as feature vectors;
Step 42: add class labels to the speech feature parameters of each age bracket; in one embodiment of the invention there are five age brackets (childhood, adolescence, young adulthood, middle age and old age), i.e. five classes of data, which are assigned the five class labels {1, 2, 3, 4, 5} respectively;
Step 43: normalize the feature vectors by scaling them proportionally into the range [-1, +1];
Step 44: train on the normalized feature vectors of each age bracket to obtain a set of support vector machines. Training may, for example, use the svmtrain tool of the LIBSVM package developed by Prof. Chih-Jen Lin of National Taiwan University and colleagues (see C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011.). Because one embodiment of the invention uses the "one-versus-one" method for the five-class classification, the training result contains 10 classifiers. The kernel function used in the SVM is the radial basis function (RBF) kernel:
$$K(X, X_i) = \exp\left(-\gamma \lVert X - X_i \rVert^2\right)$$
where the parameter γ is set to 0.001, and X and X_i are input feature vectors.
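The normalization of step 43, the kernel above and the count of 10 pairwise classifiers may be illustrated with the short sketch below. It is not the LIBSVM implementation: the per-dimension minimum/maximum scaling is an assumption about how the proportional scaling is carried out, and the names `fit_scaler`, `scale`, `rbf_kernel`, `LABELS` and `PAIRS` are hypothetical.

```python
import math
from itertools import combinations


def fit_scaler(vectors):
    """Record the (min, max) of each dimension over the training vectors."""
    return [(min(column), max(column)) for column in zip(*vectors)]


def scale(vector, ranges):
    """Map each dimension linearly so that [min, max] becomes [-1, +1].

    A dimension that is constant in the training data is mapped to 0.
    """
    scaled = []
    for value, (low, high) in zip(vector, ranges):
        if high == low:
            scaled.append(0.0)
        else:
            scaled.append(-1.0 + 2.0 * (value - low) / (high - low))
    return scaled


def rbf_kernel(x, xi, gamma=0.001):
    """K(X, X_i) = exp(-gamma * ||X - X_i||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * squared_distance)


# One-versus-one training builds one binary classifier per pair of labels:
# 5 * 4 / 2 = 10 pairs for the five age brackets.
LABELS = [1, 2, 3, 4, 5]
PAIRS = list(combinations(LABELS, 2))
```

The ranges returned by `fit_scaler` would be kept alongside the trained model so that the same scaling can be applied to the speech to be identified.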
Step 5: as shown in Fig. 3, use the SVM model obtained in step 4 to predict the speech feature parameters X of the speech to be identified, for example with the svmpredict tool of LIBSVM. During prediction, the output of each support vector machine is passed through a logical decision, and the class with the most votes is selected as the most probable age bracket, giving the final age bracket recognition result.
Before the feature parameters X of the speech to be identified are predicted, step 5 also normalizes them, i.e. scales them by the same ratio used for the training parameters into the range [-1, +1].
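The voting of step 5 may be sketched as follows. The sketch assumes that each pairwise classifier is available as a function that returns one of its two labels for a given (already normalized) feature vector; this interface, the function name `predict_one_vs_one` and the tie-breaking rule (the smallest label among those with the most votes) are assumptions for illustration and are not taken from the original text or from LIBSVM.

```python
from collections import Counter


def predict_one_vs_one(x, pairwise_classifiers):
    """Majority vote over pairwise classifiers.

    pairwise_classifiers maps a label pair (a, b) to a function that
    takes the feature vector x and returns either a or b.
    """
    votes = Counter()
    for (a, b), decide in pairwise_classifiers.items():
        winner = decide(x)
        if winner not in (a, b):
            raise ValueError("classifier for %r returned %r" % ((a, b), winner))
        votes[winner] += 1
    if not votes:
        raise ValueError("no classifiers supplied")
    most = max(votes.values())
    # Tie-break (assumption): the smallest label among the most-voted ones.
    return min(label for label, count in votes.items() if count == most)
```

With five age brackets, a class can receive at most four votes, one from each of the pairwise classifiers in which it takes part.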
The specific embodiments described above further illustrate the object, technical solution and beneficial effects of the invention. It should be understood that the above are only specific embodiments of the invention and do not limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (10)
1. A speaker age bracket recognition method based on a support vector machine (SVM), characterized in that the method comprises the following steps:
Step 1: establish a speech library storing speech signals of speakers from a plurality of different age brackets;
Step 2: preprocess the speech signals in the speech library;
Step 3: extract speech feature parameters from the preprocessed speech signals;
Step 4: perform support vector machine training on the extracted speech feature parameters to obtain a support vector machine model;
Step 5: use the support vector machine model obtained in step 4 to predict the speech feature parameters X of the speech to be identified; during prediction, the output of each support vector machine is passed through a logical decision, and the class with the most votes is selected as the most probable age bracket, giving the final age bracket recognition result.
2. The method according to claim 1, characterized in that the speech signals are segmented into phrase-level units.
3. The method according to claim 1, characterized in that, in step 2, the preprocessing further comprises the following steps:
Step 21: sample and quantize the speech signal;
Step 22: apply pre-emphasis to the quantized speech signal;
Step 23: use an endpoint detection method based on energy and zero-crossing rate to remove the silent segments from each speech signal.
4. The method according to claim 3, characterized in that the pre-emphasis is expressed as:
$$H(z) = 1 - 0.9375\,z^{-1},$$
where z is the z-transform variable and H(z) is the transfer function of the pre-emphasis filter applied to the speech signal.
5. The method according to claim 3, characterized in that detecting the silent segments with the endpoint detection method comprises the following steps:
Step 231: divide the speech signal into short-time frames to obtain a plurality of speech frames;
Step 232: compute the short-time energy and short-time zero-crossing rate of each speech frame;
Step 233: set a relatively high decision threshold E1 based on the average energy of all speech frames and compare the short-time energy of each speech frame with E1 to obtain a preliminary estimate of the speech endpoints;
Step 234: set a somewhat lower decision threshold E2 based on the average energy of the background noise and refine the endpoints found in step 233;
Step 235: set a threshold Z1 based on the average zero-crossing rate of the background noise and, starting from the endpoints found in step 234, detect the unvoiced consonants at the start of the speech and the trailing sounds at its end, finally obtaining the endpoints of the speech segments and silent segments.
6. The method according to claim 5, characterized in that the frame length is 20 ms and the sampling rate of the speech signal is 16 kHz, i.e. 320 samples per frame.
7. The method according to claim 1, characterized in that the speech feature parameters are Mel-frequency cepstral coefficients (MFCC).
8. The method according to claim 7, characterized in that the extraction of the speech feature parameters comprises the following steps:
Step 31: divide the frequency range of the speech signal using a bank of triangular Mel filters;
Step 32: take the weighted sum of all signal amplitudes within the bandwidth of each triangular Mel filter as the output of that filter;
Step 33: take the logarithm of the outputs of all filters;
Step 34: apply a discrete cosine transform to the result of step 33 to obtain the MFCCs.
9. The method according to claim 1, characterized in that the support vector machine training further comprises:
Step 41: take the extracted speech feature parameters of each age bracket as feature vectors;
Step 42: add class labels to the speech feature parameters of each age bracket;
Step 43: normalize the feature vectors by scaling them proportionally into the range [-1, +1];
Step 44: train on the normalized feature vectors of each age bracket to obtain a set of support vector machines.
10. The method according to claim 1, characterized in that step 5 further comprises, before the feature parameters X of the speech to be identified are predicted, a step of normalizing the speech feature parameters to be identified by scaling them into the range [-1, +1].
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013100494454A CN103151039A (en) | 2013-02-07 | 2013-02-07 | Speaker age identification method based on SVM (Support Vector Machine) |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013100494454A CN103151039A (en) | 2013-02-07 | 2013-02-07 | Speaker age identification method based on SVM (Support Vector Machine) |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103151039A (en) | 2013-06-12 |
Family
ID=48549062
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2013100494454A Pending CN103151039A (en) | 2013-02-07 | 2013-02-07 | Speaker age identification method based on SVM (Support Vector Machine) |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103151039A (en) |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103905650A (en) * | 2014-04-28 | 2014-07-02 | 深圳市中兴移动通信有限公司 | Mobile terminal and method for regulating call volume based on voice recognition |
CN104700843A (en) * | 2015-02-05 | 2015-06-10 | 海信集团有限公司 | Method and device for identifying ages |
CN104714633A (en) * | 2013-12-12 | 2015-06-17 | 华为技术有限公司 | Method and terminal for terminal configuration |
CN105529027A (en) * | 2015-12-14 | 2016-04-27 | 百度在线网络技术(北京)有限公司 | Voice identification method and apparatus |
CN105845143A (en) * | 2016-03-23 | 2016-08-10 | 广州势必可赢网络科技有限公司 | Speaker confirmation method and speaker confirmation system based on support vector machine |
CN105872792A (en) * | 2016-03-25 | 2016-08-17 | 乐视控股(北京)有限公司 | Voice-based service recommending method and device |
CN105957520A (en) * | 2016-07-04 | 2016-09-21 | 北京邮电大学 | Voice state detection method suitable for echo cancellation system |
CN106599110A (en) * | 2016-11-29 | 2017-04-26 | 百度在线网络技术(北京)有限公司 | Artificial intelligence-based voice search method and device |
CN107170457A (en) * | 2017-06-29 | 2017-09-15 | 深圳市泰衡诺科技有限公司 | Age recognition methods, device and terminal |
CN107358949A (en) * | 2017-05-27 | 2017-11-17 | 芜湖星途机器人科技有限公司 | Robot sounding automatic adjustment system |
CN107886955A (en) * | 2016-09-29 | 2018-04-06 | 百度在线网络技术(北京)有限公司 | A kind of personal identification method, device and the equipment of voice conversation sample |
CN108281138A (en) * | 2017-12-18 | 2018-07-13 | 百度在线网络技术(北京)有限公司 | Age discrimination model training and intelligent sound exchange method, equipment and storage medium |
CN108573712A (en) * | 2017-03-13 | 2018-09-25 | 北京贝塔科技股份有限公司 | Voice activity detection model generation method and system and voice activity detection method and system |
CN108694954A (en) * | 2018-06-13 | 2018-10-23 | 广州势必可赢网络科技有限公司 | A kind of Sex, Age recognition methods, device, equipment and readable storage medium storing program for executing |
CN108877773A (en) * | 2018-06-12 | 2018-11-23 | 广东小天才科技有限公司 | A kind of audio recognition method and electronic equipment |
CN109166591A (en) * | 2018-08-29 | 2019-01-08 | 昆明理工大学 | A kind of classification method based on audio frequency characteristics signal |
CN109448756A (en) * | 2018-11-14 | 2019-03-08 | 北京大生在线科技有限公司 | A kind of voice age recognition methods and system |
CN109859744A (en) * | 2017-11-29 | 2019-06-07 | 宁波方太厨具有限公司 | A kind of sound end detecting method applied in range hood |
CN109945900A (en) * | 2019-03-11 | 2019-06-28 | 南京智慧基础设施技术研究院有限公司 | A kind of distributed optical fiber sensing method |
CN110211566A (en) * | 2019-06-08 | 2019-09-06 | 安徽中医药大学 | A kind of classification method of compressed sensing based hepatolenticular degeneration disfluency |
CN110782915A (en) * | 2019-10-31 | 2020-02-11 | 广州艾颂智能科技有限公司 | Waveform music component separation method based on deep learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080235019A1 (en) * | 2007-03-23 | 2008-09-25 | Verizon Business Network Services, Inc. | Age determination using speech |
CN102426835A (en) * | 2011-08-30 | 2012-04-25 | 华南理工大学 | Method for identifying local discharge signals of switchboard based on support vector machine model |
Non-Patent Citations (2)
Title |
---|
DAVOOD MAHMOODI ET AL.: "Age Estimation Based on Speech Features and Support Vector Machine", 《2011 3RD COMPUTER SCIENCE AND ELECTRONIC ENGINEERING CONFERENCE (CEEC)》 * |
易克初等: "《语音信号处理》", 30 June 2000, 国防工业出版社 * |
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104714633A (en) * | 2013-12-12 | 2015-06-17 | 华为技术有限公司 | Method and terminal for terminal configuration |
CN103905650A (en) * | 2014-04-28 | 2014-07-02 | 深圳市中兴移动通信有限公司 | Mobile terminal and method for regulating call volume based on voice recognition |
CN104700843A (en) * | 2015-02-05 | 2015-06-10 | 海信集团有限公司 | Method and device for identifying ages |
US10650809B2 (en) | 2015-12-14 | 2020-05-12 | Baidu Online Network Technology (Beijing) Co., Ltd. | Speech recognition method and device |
CN105529027A (en) * | 2015-12-14 | 2016-04-27 | 百度在线网络技术(北京)有限公司 | Voice identification method and apparatus |
CN105529027B (en) * | 2015-12-14 | 2019-05-31 | 百度在线网络技术(北京)有限公司 | Audio recognition method and device |
WO2017101450A1 (en) * | 2015-12-14 | 2017-06-22 | 百度在线网络技术(北京)有限公司 | Voice recognition method and device |
CN105845143A (en) * | 2016-03-23 | 2016-08-10 | 广州势必可赢网络科技有限公司 | Speaker confirmation method and speaker confirmation system based on support vector machine |
CN105872792A (en) * | 2016-03-25 | 2016-08-17 | 乐视控股(北京)有限公司 | Voice-based service recommending method and device |
CN105957520A (en) * | 2016-07-04 | 2016-09-21 | 北京邮电大学 | Voice state detection method suitable for echo cancellation system |
CN105957520B (en) * | 2016-07-04 | 2019-10-11 | 北京邮电大学 | A kind of voice status detection method suitable for echo cancelling system |
CN107886955A (en) * | 2016-09-29 | 2018-04-06 | 百度在线网络技术(北京)有限公司 | A kind of personal identification method, device and the equipment of voice conversation sample |
CN106599110A (en) * | 2016-11-29 | 2017-04-26 | 百度在线网络技术(北京)有限公司 | Artificial intelligence-based voice search method and device |
CN108573712A (en) * | 2017-03-13 | 2018-09-25 | 北京贝塔科技股份有限公司 | Voice activity detection model generation method and system and voice activity detection method and system |
CN108573712B (en) * | 2017-03-13 | 2020-07-28 | 北京贝塔科技股份有限公司 | Voice activity detection model generation method and system and voice activity detection method and system |
CN107358949A (en) * | 2017-05-27 | 2017-11-17 | 芜湖星途机器人科技有限公司 | Robot sounding automatic adjustment system |
CN107170457A (en) * | 2017-06-29 | 2017-09-15 | 深圳市泰衡诺科技有限公司 | Age recognition methods, device and terminal |
CN109859744B (en) * | 2017-11-29 | 2021-01-19 | 宁波方太厨具有限公司 | Voice endpoint detection method applied to range hood |
CN109859744A (en) * | 2017-11-29 | 2019-06-07 | 宁波方太厨具有限公司 | A kind of sound end detecting method applied in range hood |
CN108281138A (en) * | 2017-12-18 | 2018-07-13 | 百度在线网络技术(北京)有限公司 | Age discrimination model training and intelligent sound exchange method, equipment and storage medium |
CN108281138B (en) * | 2017-12-18 | 2020-03-31 | 百度在线网络技术(北京)有限公司 | Age discrimination model training and intelligent voice interaction method, equipment and storage medium |
CN108877773A (en) * | 2018-06-12 | 2018-11-23 | 广东小天才科技有限公司 | A kind of audio recognition method and electronic equipment |
CN108877773B (en) * | 2018-06-12 | 2020-07-24 | 广东小天才科技有限公司 | Voice recognition method and electronic equipment |
CN108694954A (en) * | 2018-06-13 | 2018-10-23 | 广州势必可赢网络科技有限公司 | A kind of Sex, Age recognition methods, device, equipment and readable storage medium storing program for executing |
CN109166591A (en) * | 2018-08-29 | 2019-01-08 | 昆明理工大学 | A kind of classification method based on audio frequency characteristics signal |
CN109448756A (en) * | 2018-11-14 | 2019-03-08 | 北京大生在线科技有限公司 | A kind of voice age recognition methods and system |
CN109945900A (en) * | 2019-03-11 | 2019-06-28 | 南京智慧基础设施技术研究院有限公司 | A kind of distributed optical fiber sensing method |
CN110211566A (en) * | 2019-06-08 | 2019-09-06 | 安徽中医药大学 | A kind of classification method of compressed sensing based hepatolenticular degeneration disfluency |
CN110782915A (en) * | 2019-10-31 | 2020-02-11 | 广州艾颂智能科技有限公司 | Waveform music component separation method based on deep learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103151039A (en) | Speaker age identification method based on SVM (Support Vector Machine) | |
CN102723078B (en) | Emotion speech recognition method based on natural language comprehension | |
CN105161093B (en) | A kind of method and system judging speaker's number | |
CN105405439B (en) | Speech playing method and device | |
CN105206271A (en) | Intelligent equipment voice wake-up method and system for realizing method | |
CN105632501A (en) | Deep-learning-technology-based automatic accent classification method and apparatus | |
CN112102850B (en) | Emotion recognition processing method and device, medium and electronic equipment | |
CN111063341A (en) | Method and system for segmenting and clustering multi-person voice in complex environment | |
CN106548775B (en) | Voice recognition method and system | |
WO2012075641A1 (en) | Device and method for pass-phrase modeling for speaker verification, and verification system | |
CN103700370A (en) | Broadcast television voice recognition method and system | |
CN108922541A (en) | Multidimensional characteristic parameter method for recognizing sound-groove based on DTW and GMM model | |
CN111524527A (en) | Speaker separation method, device, electronic equipment and storage medium | |
CN111370030A (en) | Voice emotion detection method and device, storage medium and electronic equipment | |
CN111192659A (en) | Pre-training method for depression detection and depression detection method and device | |
CN113192535B (en) | Voice keyword retrieval method, system and electronic device | |
CN110428853A (en) | Voice activity detection method, Voice activity detection device and electronic equipment | |
CN110827853A (en) | Voice feature information extraction method, terminal and readable storage medium | |
CN105869622B (en) | Chinese hot word detection method and device | |
CN115171731A (en) | Emotion category determination method, device and equipment and readable storage medium | |
CN113823323A (en) | Audio processing method and device based on convolutional neural network and related equipment | |
CN109074809B (en) | Information processing apparatus, information processing method, and computer-readable storage medium | |
CN109065026B (en) | Recording control method and device | |
Ghaemmaghami et al. | Complete-linkage clustering for voice activity detection in audio and visual speech | |
CN114708869A (en) | Voice interaction method and device and electric appliance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20130612 |