CN103151039A - Speaker age identification method based on SVM (Support Vector Machine) - Google Patents

Info

Publication number
CN103151039A
Authority
CN
China
Prior art keywords
voice
speech
voice signal
characteristic parameter
vector machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013100494454A
Other languages
Chinese (zh)
Inventor
熊刚
孔庆杰
朱菁
王飞跃
赵红霞
朱凤华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Cloud Computing Industry Technology Innovation and Incubation Center of CAS
Original Assignee
Institute of Automation of Chinese Academy of Science
Cloud Computing Industry Technology Innovation and Incubation Center of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science and Cloud Computing Industry Technology Innovation and Incubation Center of CAS
Priority to CN2013100494454A
Publication of CN103151039A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a speaker age-group identification method based on an SVM (Support Vector Machine) classifier. The method comprises the following steps: a voice library storing voice signals of speakers of different age groups is established; the voice signals in the voice library are preprocessed; voice feature parameters are extracted from the preprocessed voice signals; SVM training is performed on the extracted voice feature parameters to obtain an SVM model; and, using the SVM model, the voice feature parameters X of the voice to be identified are predicted. During prediction, the output of each SVM is logically judged and the age class with the most votes is taken as the most probable age class, giving the final age identification result. The method fills, to a certain degree, a gap in the prior art on speaker age identification, allows the speaker's age group to be judged better, and has broad application prospects in man-machine interaction, criminal investigation, games, entertainment and similar settings.

Description

A speaker age-group recognition method based on a support vector machine (SVM)
Technical field
The present invention relates to pattern recognition technology, and in particular to a speaker age-group recognition method based on a support vector machine (Support Vector Machine, SVM).
Background
Research on speech recognition, speaker identification and related areas is now relatively mature. Solutions have also been proposed for other problems built on this foundation, such as Chinese speech emotion recognition, speaker gender identification, and audio classification and recognition. Speaker age-group recognition, however, has received almost no attention, even though it is useful in many settings. In an interactive system, a machine that recognizes the speaker's age group can answer with a machine voice suited to that age group, which makes the interaction feel friendlier. In criminal investigation, the suspect's age group identified from an audio recording can narrow the search. The speaker age-group recognition method proposed by the present invention can therefore provide a basis for developing applications in such settings.
A person's life can be roughly divided into the following stages: childhood (0 to 11 years), adolescence (12 to 17 years), youth (18 to 34 years), middle age (35 to 50 years) and old age (over 50 years). As a person ages and passes through these stages, the voice changes gradually, and the voices of people in the same age group share common characteristics. The present invention builds on this property, namely that the speech of speakers in each age group carries the characteristics of that age group.
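The age stages listed above can be written down as a small lookup, which is also a convenient way to assign class labels to recorded speakers. This is only an illustrative sketch: the function name `age_to_stage`, the English stage names and the use of labels 1 to 5 in list order are assumptions, not part of the patent text.

```python
# Hypothetical helper: map an age in years to one of the five stages.
# Labels 1..5 follow the order in which the stages are listed (an assumption).
AGE_STAGES = [
    (0, 11, 1, "childhood"),
    (12, 17, 2, "adolescence"),
    (18, 34, 3, "youth"),
    (35, 50, 4, "middle age"),
]


def age_to_stage(age):
    """Return (label, stage name) for a non-negative integer age in years."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for low, high, label, name in AGE_STAGES:
        if low <= age <= high:
            return label, name
    return 5, "old age"  # everything over 50


print(age_to_stage(16))
print(age_to_stage(51))
```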
The SVM classification method performs well in recognition applications such as audio classification, speaker gender identification and image recognition, so the present invention adopts an SVM model for classification. Among speech feature parameters, the Mel-frequency cepstral coefficient (MFCC) is an acoustic feature derived from the auditory properties of the human ear, since the pitch the ear perceives is not simply linear in the frequency of the sound. Studies show that the ear's perception of frequency is approximately linear below 1 kHz and approximately linear on a logarithmic frequency axis above 1 kHz. MFCCs are cepstral parameters extracted in the Mel-scale frequency domain; they de-emphasize the high-frequency components of the speech spectrum and are robust to noise, and are therefore used as the feature parameters for SVM training and recognition.
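The patent does not give a formula for the Mel scale. A minimal sketch, assuming the widely used conversion mel = 2595 · log10(1 + f / 700), shows the behaviour described here: equal steps in Hz cover fewer and fewer mels as the frequency rises. The function names are illustrative.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595 * log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# A 500 Hz step near 0 Hz spans more mels than the same step near 4 kHz.
low_step = hz_to_mel(500.0) - hz_to_mel(0.0)
high_step = hz_to_mel(4500.0) - hz_to_mel(4000.0)
print(low_step > high_step)
```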
Summary of the invention
The objective of the invention is to determine a speaker's age group by combining an SVM classifier with the MFCC feature parameters of the voice signal, for use wherever such a judgment is needed. Specifically, voice-signal feature parameters that can distinguish speaker age groups are extracted, and an SVM is trained on them and used to identify the age group to which a speaker belongs.
To achieve the above objective, the speaker age-group recognition method based on a support vector machine proposed by the present invention comprises the following steps:
Step 1: establish a voice library storing voice signals of speakers of a plurality of different age groups;
Step 2: preprocess the voice signals in the voice library;
Step 3: extract speech feature parameters from the preprocessed voice signals;
Step 4: perform support vector machine training based on the extracted speech feature parameters to obtain a support vector machine model;
Step 5: using the support vector machine model trained in Step 4, predict the speech feature parameters X of the voice to be identified; during prediction, the output of each support vector machine is passed through a logical decision and the class with the most votes is selected as the most probable age group, giving the final age-group recognition result.
In summary, the invention provides a method for recognizing a speaker's age group. Because there is at present essentially no research on speaker age-group recognition, the invention has broad application prospects, for example in man-machine interaction, criminal investigation, online chat and entertainment. The invention uses a support vector machine classifier together with characteristic feature parameters of the voice signal to identify the speaker's age group. The MFCC feature parameters extracted in the method match human hearing characteristics and, after training, can effectively distinguish speakers of different age groups. MFCCs are also robust to noise and are very widely used in speaker identification. The SVM classifier can also reduce the dimensionality of the feature parameters and classifies well in recognition applications. The invention trains an SVM on the MFCC parameters of speech from different age groups and then predicts the class of the speech parameters under test, which allows the speaker's age group to be judged reasonably well. However, because a speaker's voice changes slowly over time, speech near the boundary between age groups is harder to recognize. In addition, an individual speaker's voice may not match the characteristics typical of the corresponding age group, which also makes recognition harder. Taking these factors into account, the average recognition rate of the invention across all age groups is estimated to exceed 70%.
Brief description of the drawings
Fig. 1 is a flowchart of the speaker age-group recognition method based on a support vector machine according to the present invention;
Fig. 2 is a flowchart of SVM training according to an embodiment of the invention;
Fig. 3 is a diagram of SVM decision and recognition according to an embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Fig. 1 is a flowchart of the speaker age-group recognition method based on a support vector machine according to the present invention. As shown in Fig. 1, the method comprises the following steps:
Step 1: establish a voice library storing voice signals of speakers of a plurality of different age groups, where the voice signals are in units of phrases;
In this step, a voice recorder or other recording device is first used to collect the speech of speakers of different age groups. The recordings can uniformly be 16 kHz, 16-bit, mono. In an embodiment of the invention, 20 speakers (10 male and 10 female) are recorded for each age group; the reading material consists of classic prose such as "Moonlight on the Lotus Pond" and "The Figure Viewed from Behind", each read once. The recordings are then cut into speech segments in units of phrases.
Step 2: preprocess the voice signals in the voice library;
In this step, the preprocessing further comprises the following steps:
Step 21: sample and quantize the voice signal;
Step 22: to remove the effect of lip and nostril radiation and boost the high-frequency part of the signal, apply pre-emphasis to the quantized voice signal using the following filter:
$H(z) = 1 - 0.9375\,z^{-1}$
where H(z) is the transfer function of the pre-emphasis filter and z is the z-transform variable; in the time domain, each output sample is y[n] = x[n] - 0.9375 x[n-1], where x is the quantized voice signal and y is the voice signal obtained after pre-emphasis;
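In the time domain, the pre-emphasis filter H(z) = 1 - 0.9375 z^{-1} is a first-order difference, y[n] = x[n] - 0.9375 x[n-1]. The following is a minimal sketch of that step; the function name and the choice to pass the first sample through unchanged are assumptions, since the patent does not say how the first sample is handled.

```python
def pre_emphasis(samples, alpha=0.9375):
    """Apply y[n] = x[n] - alpha * x[n-1] to a sequence of samples.

    The first sample has no predecessor and is passed through unchanged
    (an assumption made for this sketch).
    """
    if not samples:
        return []
    out = [float(samples[0])]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


# A constant (low-frequency) signal is strongly attenuated,
# while an alternating (high-frequency) signal is boosted.
print(pre_emphasis([1, 1, 1, 1]))
print(pre_emphasis([1, -1, 1, -1]))
```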
Step 23: because the voice signals are in units of phrases and there are pauses between the words of a phrase, an endpoint detection method based on energy and zero-crossing rate is used to remove the silent segments from each voice signal.
The endpoint detection method uses a two-stage decision and further comprises the following steps:
Step 231: divide the voice signal into short-time frames to obtain a plurality of speech frames; the frame length is 20 ms, which at the 16 kHz sampling rate is 320 samples;
Step 232: compute the short-time energy and short-time zero-crossing rate of each speech frame;
Step 233: set a higher decision threshold E1 according to the average energy of all speech frames and compare the short-time energy of each speech frame with E1 to obtain preliminary speech endpoints; the true endpoints lie outside the time interval bounded by the intersections of E1 with the short-time energy envelope;
Step 234: set a somewhat lower decision threshold E2 according to the average energy of the background noise and, starting from the preliminary result of Step 233, determine the speech endpoints;
Step 235: set a threshold Z1 according to the average zero-crossing rate of the background noise and, starting from the endpoints found in Step 234, identify the unvoiced consonants at the front of the speech and the trailing sounds at its end, finally obtaining the endpoints of the speech segments and the silent segments.
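Steps 231 to 235 can be sketched roughly as follows. The patent does not specify the frame shift, how E1, E2 and Z1 are scaled from the averages, or how far the zero-crossing search may extend, so this sketch assumes non-overlapping frames, takes the three thresholds as arguments, and limits the zero-crossing extension with a hypothetical `max_ext` parameter. It also finds a single speech segment per signal; removing pauses inside a phrase would need the same search repeated per segment.

```python
def frame_signal(samples, frame_len=320):
    """Split into non-overlapping frames (20 ms at 16 kHz = 320 samples)."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Number of sign changes between neighbouring samples in one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def detect_endpoints(energies, zcrs, e1, e2, z1, max_ext=25):
    """Return (start_frame, end_frame) of the speech, or None if no frame exceeds e1."""
    above = [i for i, e in enumerate(energies) if e > e1]
    if not above:
        return None
    # Step 233: preliminary endpoints from the higher energy threshold E1.
    start, end = above[0], above[-1]
    # Step 234: move outwards while the energy stays above the lower threshold E2.
    while start > 0 and energies[start - 1] > e2:
        start -= 1
    while end < len(energies) - 1 and energies[end + 1] > e2:
        end += 1
    # Step 235: move further outwards while the zero-crossing rate exceeds Z1,
    # which picks up unvoiced consonants and trailing sounds.
    steps = 0
    while start > 0 and zcrs[start - 1] > z1 and steps < max_ext:
        start -= 1
        steps += 1
    steps = 0
    while end < len(zcrs) - 1 and zcrs[end + 1] > z1 and steps < max_ext:
        end += 1
        steps += 1
    return start, end


energies = [0.1, 0.5, 3.0, 9.0, 8.0, 2.0, 0.4, 0.1]
zcrs = [2, 30, 5, 4, 4, 5, 3, 2]
print(detect_endpoints(energies, zcrs, e1=5.0, e2=1.0, z1=20))
```

In this toy input, frame 1 has low energy but a high zero-crossing rate, so only the third stage pulls it into the speech segment.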
Step 3: extract speech feature parameters from the preprocessed voice signals;
In this step, the speech feature parameters are MFCCs; in an embodiment of the invention, the MFCCs are, for example, 12-dimensional. The extraction of the speech feature parameters can comprise the following steps:
Step 31: divide the frequency range of the voice signal into a series of triangular Mel filters;
Step 32: take the weighted sum of all signal amplitudes within the bandwidth of each triangular Mel filter as the output of that filter;
Step 33: take the logarithm of the outputs of all filters;
Step 34: apply a discrete cosine transform to the result of Step 33 to obtain the MFCCs.
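Steps 31 to 34 can be illustrated with a dependency-free sketch for a single frame. Several details are assumptions because the patent does not state them: the magnitude spectrum is computed with a naive DFT, the Mel scale uses the common 2595 · log10(1 + f / 700) form, the filter bank has 24 filters, and the 12 coefficients are DCT-II coefficients 1 to 12 (coefficient 0 is dropped). A real system would use an FFT and usually a window function.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow but dependency-free)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, expressed as fractional DFT bin positions.
    edges = [mel_to_hz(top * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate=16000, n_filters=24, n_coeffs=12):
    spectrum = magnitude_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Steps 31-32: weighted sum of magnitudes inside each triangular filter.
    outputs = [sum(w * s for w, s in zip(row, spectrum)) for row in bank]
    # Step 33: logarithm (a small floor avoids log(0)).
    logs = [math.log(max(o, 1e-10)) for o in outputs]
    # Step 34: DCT-II of the log outputs, keeping coefficients 1..n_coeffs.
    return [sum(logs[j] * math.cos(math.pi * i * (j + 0.5) / n_filters)
                for j in range(n_filters))
            for i in range(1, n_coeffs + 1)]


# One 20 ms frame (320 samples at 16 kHz) of a 440 Hz tone.
frame = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(320)]
coeffs = mfcc(frame)
print(len(coeffs))
```

Dropping coefficient 0 makes the result insensitive to the overall signal level, since scaling the frame only adds a constant to every log filter output.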
Step 4: perform SVM training based on the extracted speech feature parameters to obtain an SVM model;
As shown in Fig. 2, the SVM training comprises the following steps:
Step 41: take the extracted MFCC speech feature parameters of each age group as feature vectors;
Step 42: add a class label to the speech feature parameters of each age group; in an embodiment of the invention there are 5 age groups (childhood, adolescence, youth, middle age, old age), that is, 5 classes of data, which are given the class labels {1, 2, 3, 4, 5} respectively;
Step 43: normalize the feature vectors by scaling them proportionally into the range [-1, +1];
Step 44: train on the normalized feature vectors of each age group to obtain a set of support vector machines. For example, training can use the svmtrain tool of LIBSVM, developed by Professor Chih-Jen Lin of National Taiwan University and colleagues (see C.-C. Chang and C.-J. Lin, "LIBSVM: a library for support vector machines," ACM Transactions on Intelligent Systems and Technology, 2:27:1-27:27, 2011). Because an embodiment of the invention uses the one-versus-one method for the 5-class problem, the training result contains 10 classifiers, one for each pair of classes (5 × 4 / 2 = 10). The kernel function used in the SVM is the radial basis function kernel:
$K(X, X_i) = \exp\left(-\gamma \lVert X - X_i \rVert^{2}\right)$
where the parameter γ is set to 0.001 and X and X_i are input feature vectors.
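As a small worked example, the radial basis function kernel with γ = 0.001 and the number of pairwise classifiers for 5 classes can be computed directly. The function name `rbf_kernel` is illustrative and is not a LIBSVM API.

```python
import math
from itertools import combinations


def rbf_kernel(x, x_i, gamma=0.001):
    """K(X, X_i) = exp(-gamma * ||X - X_i||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_i))
    return math.exp(-gamma * sq_dist)


# Identical vectors give 1; the value decays as the vectors move apart.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
print(rbf_kernel([0.0, 0.0], [3.0, 4.0]))  # squared distance 25

# One-versus-one training needs one binary classifier per pair of labels.
pairs = list(combinations([1, 2, 3, 4, 5], 2))
print(len(pairs))
```

With LIBSVM's command-line tools, the same kernel settings would presumably correspond to `svm-train -t 2 -g 0.001`, where `-t 2` selects the radial basis function kernel and `-g` sets γ.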
Step 5: as shown in Fig. 3, using the SVM model trained in Step 4, predict the speech feature parameters X of the voice to be identified, for example with LIBSVM's svmpredict. During prediction, the output of each support vector machine is passed through a logical decision and the class with the most votes is selected as the most probable age group, giving the final age-group recognition result.
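The voting in Step 5 can be sketched as follows. Here `pairwise_decide` is a hypothetical callback that stands in for the ten trained binary SVMs applied to one sample: given two labels, it returns the one that the corresponding classifier prefers. Breaking ties in favour of the smallest label is an assumption made for this sketch; the patent does not describe tie handling.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_vote(pairwise_decide, labels=(1, 2, 3, 4, 5)):
    """Collect one vote per label pair and return (winning label, vote counts)."""
    votes = Counter()
    for a, b in combinations(labels, 2):
        votes[pairwise_decide(a, b)] += 1
    best = max(votes.values())
    # Ties go to the smallest label (assumption for this sketch).
    winner = min(label for label, count in votes.items() if count == best)
    return winner, dict(votes)


# Toy decision rule: class 3 wins every comparison it takes part in,
# otherwise the smaller label wins.
def toy_decide(a, b):
    return 3 if 3 in (a, b) else min(a, b)


winner, votes = one_vs_one_vote(toy_decide)
print(winner, votes)
```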
Before the feature parameters X of the voice to be identified are predicted, Step 5 also normalizes them, scaling them into the range [-1, +1] with the same scaling factors that were used for the training parameters.
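The requirement that the voice to be identified is scaled in the same proportion as the training data means the per-dimension minimum and maximum are learned once from the training vectors and then reused. A minimal sketch, assuming simple min-max scaling per feature dimension (the function names and the handling of constant features are assumptions):

```python
def fit_scaler(train_vectors, lower=-1.0, upper=1.0):
    """Record the per-dimension minimum and maximum of the training vectors."""
    dims = len(train_vectors[0])
    mins = [min(v[d] for v in train_vectors) for d in range(dims)]
    maxs = [max(v[d] for v in train_vectors) for d in range(dims)]
    return mins, maxs, lower, upper


def apply_scaler(vector, scaler):
    """Scale one vector with ranges learned from the training data."""
    mins, maxs, lower, upper = scaler
    out = []
    for x, lo, hi in zip(vector, mins, maxs):
        if hi == lo:
            out.append(0.0)  # constant feature: map to 0 (assumption)
        else:
            out.append(lower + (upper - lower) * (x - lo) / (hi - lo))
    return out


train = [[0.0, 10.0], [4.0, 20.0], [2.0, 15.0]]
scaler = fit_scaler(train)
print([apply_scaler(v, scaler) for v in train])
# A test vector is scaled with the training ranges, so it may fall outside [-1, +1].
print(apply_scaler([6.0, 15.0], scaler))
```

LIBSVM ships a `svm-scale` tool for this purpose: `-l -1 -u 1` sets the target range, `-s` saves the scaling parameters from the training file, and `-r` restores them when scaling the test file.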
The specific embodiments described above further explain the objectives, technical solutions and beneficial effects of the present invention. It should be understood that the above are only specific embodiments of the invention and do not limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A speaker age-group recognition method based on a support vector machine (SVM), characterized in that the method comprises the following steps:
Step 1: establish a voice library storing voice signals of speakers of a plurality of different age groups;
Step 2: preprocess the voice signals in the voice library;
Step 3: extract speech feature parameters from the preprocessed voice signals;
Step 4: perform support vector machine training based on the extracted speech feature parameters to obtain a support vector machine model;
Step 5: using the support vector machine model trained in Step 4, predict the speech feature parameters X of the voice to be identified; during prediction, the output of each support vector machine is passed through a logical decision and the class with the most votes is selected as the most probable age group, giving the final age-group recognition result.
2. The method according to claim 1, characterized in that the voice signals are in units of phrases.
3. The method according to claim 1, characterized in that, in Step 2, the preprocessing further comprises the following steps:
Step 21: sample and quantize the voice signal;
Step 22: apply pre-emphasis to the quantized voice signal;
Step 23: use an endpoint detection method based on energy and zero-crossing rate to remove the silent segments from each voice signal.
4. The method according to claim 3, characterized in that the pre-emphasis is expressed as:
$H(z) = 1 - 0.9375\,z^{-1}$
where H(z) is the transfer function of the pre-emphasis filter applied to the voice signal and z is the z-transform variable.
5. The method according to claim 3, characterized in that detecting the silent segments with the endpoint detection method comprises the following steps:
Step 231: divide the voice signal into short-time frames to obtain a plurality of speech frames;
Step 232: compute the short-time energy and short-time zero-crossing rate of each speech frame;
Step 233: set a higher decision threshold E1 according to the average energy of all speech frames and compare the short-time energy of each speech frame with E1 to obtain preliminary speech endpoints;
Step 234: set a somewhat lower decision threshold E2 according to the average energy of the background noise and, starting from the preliminary result of Step 233, determine the speech endpoints;
Step 235: set a threshold Z1 according to the average zero-crossing rate of the background noise and, starting from the endpoints found in Step 234, identify the unvoiced consonants at the front of the speech and the trailing sounds at its end, finally obtaining the endpoints of the speech segments and the silent segments.
6. The method according to claim 5, characterized in that the frame length is 20 ms and the sampling rate of the voice signal is 16 kHz, that is, 320 samples per frame.
7. The method according to claim 1, characterized in that the speech feature parameters are Mel-frequency cepstral coefficients (MFCCs).
8. The method according to claim 7, characterized in that extracting the speech feature parameters comprises the following steps:
Step 31: divide the frequency range of the voice signal into a series of triangular Mel filters;
Step 32: take the weighted sum of all signal amplitudes within the bandwidth of each triangular Mel filter as the output of that filter;
Step 33: take the logarithm of the outputs of all filters;
Step 34: apply a discrete cosine transform to the result of Step 33 to obtain the MFCCs.
9. The method according to claim 1, characterized in that the support vector machine training further comprises:
Step 41: take the extracted speech feature parameters of each age group as feature vectors;
Step 42: add a class label to the speech feature parameters of each age group;
Step 43: normalize the feature vectors by scaling them proportionally into the range [-1, +1];
Step 44: train on the normalized feature vectors of each age group to obtain a set of support vector machines.
10. The method according to claim 1, characterized in that, before the feature parameters X of the voice to be identified are predicted, Step 5 further comprises a step of normalizing the speech feature parameters to be identified and scaling them into the range [-1, +1].
CN2013100494454A 2013-02-07 2013-02-07 Speaker age identification method based on SVM (Support Vector Machine) Pending CN103151039A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013100494454A CN103151039A (en) 2013-02-07 2013-02-07 Speaker age identification method based on SVM (Support Vector Machine)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2013100494454A CN103151039A (en) 2013-02-07 2013-02-07 Speaker age identification method based on SVM (Support Vector Machine)

Publications (1)

Publication Number Publication Date
CN103151039A true CN103151039A (en) 2013-06-12

Family

ID=48549062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013100494454A Pending CN103151039A (en) 2013-02-07 2013-02-07 Speaker age identification method based on SVM (Support Vector Machine)

Country Status (1)

Country Link
CN (1) CN103151039A (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103905650A (en) * 2014-04-28 2014-07-02 深圳市中兴移动通信有限公司 Mobile terminal and method for regulating call volume based on voice recognition
CN104700843A (en) * 2015-02-05 2015-06-10 海信集团有限公司 Method and device for identifying ages
CN104714633A (en) * 2013-12-12 2015-06-17 华为技术有限公司 Method and terminal for terminal configuration
CN105529027A (en) * 2015-12-14 2016-04-27 百度在线网络技术(北京)有限公司 Voice identification method and apparatus
CN105845143A (en) * 2016-03-23 2016-08-10 广州势必可赢网络科技有限公司 Speaker confirmation method and speaker confirmation system based on support vector machine
CN105872792A (en) * 2016-03-25 2016-08-17 乐视控股(北京)有限公司 Voice-based service recommending method and device
CN105957520A (en) * 2016-07-04 2016-09-21 北京邮电大学 Voice state detection method suitable for echo cancellation system
CN106599110A (en) * 2016-11-29 2017-04-26 百度在线网络技术(北京)有限公司 Artificial intelligence-based voice search method and device
CN107170457A (en) * 2017-06-29 2017-09-15 深圳市泰衡诺科技有限公司 Age recognition methods, device and terminal
CN107358949A (en) * 2017-05-27 2017-11-17 芜湖星途机器人科技有限公司 Robot sounding automatic adjustment system
CN107886955A (en) * 2016-09-29 2018-04-06 百度在线网络技术(北京)有限公司 A kind of personal identification method, device and the equipment of voice conversation sample
CN108281138A (en) * 2017-12-18 2018-07-13 百度在线网络技术(北京)有限公司 Age discrimination model training and intelligent sound exchange method, equipment and storage medium
CN108573712A (en) * 2017-03-13 2018-09-25 北京贝塔科技股份有限公司 Voice activity detection model generation method and system and voice activity detection method and system
CN108694954A (en) * 2018-06-13 2018-10-23 广州势必可赢网络科技有限公司 A kind of Sex, Age recognition methods, device, equipment and readable storage medium storing program for executing
CN108877773A (en) * 2018-06-12 2018-11-23 广东小天才科技有限公司 A kind of audio recognition method and electronic equipment
CN109166591A (en) * 2018-08-29 2019-01-08 昆明理工大学 A kind of classification method based on audio frequency characteristics signal
CN109448756A (en) * 2018-11-14 2019-03-08 北京大生在线科技有限公司 A kind of voice age recognition methods and system
CN109859744A (en) * 2017-11-29 2019-06-07 宁波方太厨具有限公司 A kind of sound end detecting method applied in range hood
CN109945900A (en) * 2019-03-11 2019-06-28 南京智慧基础设施技术研究院有限公司 A kind of distributed optical fiber sensing method
CN110211566A (en) * 2019-06-08 2019-09-06 安徽中医药大学 A kind of classification method of compressed sensing based hepatolenticular degeneration disfluency
CN110782915A (en) * 2019-10-31 2020-02-11 广州艾颂智能科技有限公司 Waveform music component separation method based on deep learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080235019A1 (en) * 2007-03-23 2008-09-25 Verizon Business Network Services, Inc. Age determination using speech
CN102426835A (en) * 2011-08-30 2012-04-25 华南理工大学 Method for identifying local discharge signals of switchboard based on support vector machine model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DAVOOD MAHMOODI ET AL.: "Age Estimation Based on Speech Features and Support Vector Machine", 《2011 3RD COMPUTER SCIENCE AND ELECTRONIC ENGINEERING CONFERENCE (CEEC)》 *
易克初等: "《语音信号处理》", 30 June 2000, 国防工业出版社 *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104714633A (en) * 2013-12-12 2015-06-17 华为技术有限公司 Method and terminal for terminal configuration
CN103905650A (en) * 2014-04-28 2014-07-02 深圳市中兴移动通信有限公司 Mobile terminal and method for regulating call volume based on voice recognition
CN104700843A (en) * 2015-02-05 2015-06-10 海信集团有限公司 Method and device for identifying ages
US10650809B2 (en) 2015-12-14 2020-05-12 Baidu Online Network Technology (Beijing) Co., Ltd. Speech recognition method and device
CN105529027A (en) * 2015-12-14 2016-04-27 百度在线网络技术(北京)有限公司 Voice identification method and apparatus
CN105529027B (en) * 2015-12-14 2019-05-31 百度在线网络技术(北京)有限公司 Audio recognition method and device
WO2017101450A1 (en) * 2015-12-14 2017-06-22 百度在线网络技术(北京)有限公司 Voice recognition method and device
CN105845143A (en) * 2016-03-23 2016-08-10 广州势必可赢网络科技有限公司 Speaker confirmation method and speaker confirmation system based on support vector machine
CN105872792A (en) * 2016-03-25 2016-08-17 乐视控股(北京)有限公司 Voice-based service recommending method and device
CN105957520A (en) * 2016-07-04 2016-09-21 北京邮电大学 Voice state detection method suitable for echo cancellation system
CN105957520B (en) * 2016-07-04 2019-10-11 北京邮电大学 A kind of voice status detection method suitable for echo cancelling system
CN107886955A (en) * 2016-09-29 2018-04-06 百度在线网络技术(北京)有限公司 A kind of personal identification method, device and the equipment of voice conversation sample
CN106599110A (en) * 2016-11-29 2017-04-26 百度在线网络技术(北京)有限公司 Artificial intelligence-based voice search method and device
CN108573712A (en) * 2017-03-13 2018-09-25 北京贝塔科技股份有限公司 Voice activity detection model generation method and system and voice activity detection method and system
CN108573712B (en) * 2017-03-13 2020-07-28 北京贝塔科技股份有限公司 Voice activity detection model generation method and system and voice activity detection method and system
CN107358949A (en) * 2017-05-27 2017-11-17 芜湖星途机器人科技有限公司 Robot sounding automatic adjustment system
CN107170457A (en) * 2017-06-29 2017-09-15 深圳市泰衡诺科技有限公司 Age recognition methods, device and terminal
CN109859744B (en) * 2017-11-29 2021-01-19 宁波方太厨具有限公司 Voice endpoint detection method applied to range hood
CN109859744A (en) * 2017-11-29 2019-06-07 宁波方太厨具有限公司 A kind of sound end detecting method applied in range hood
CN108281138A (en) * 2017-12-18 2018-07-13 百度在线网络技术(北京)有限公司 Age discrimination model training and intelligent sound exchange method, equipment and storage medium
CN108281138B (en) * 2017-12-18 2020-03-31 百度在线网络技术(北京)有限公司 Age discrimination model training and intelligent voice interaction method, equipment and storage medium
CN108877773A (en) * 2018-06-12 2018-11-23 广东小天才科技有限公司 A kind of audio recognition method and electronic equipment
CN108877773B (en) * 2018-06-12 2020-07-24 广东小天才科技有限公司 Voice recognition method and electronic equipment
CN108694954A (en) * 2018-06-13 2018-10-23 广州势必可赢网络科技有限公司 A kind of Sex, Age recognition methods, device, equipment and readable storage medium storing program for executing
CN109166591A (en) * 2018-08-29 2019-01-08 昆明理工大学 A kind of classification method based on audio frequency characteristics signal
CN109448756A (en) * 2018-11-14 2019-03-08 北京大生在线科技有限公司 A kind of voice age recognition methods and system
CN109945900A (en) * 2019-03-11 2019-06-28 南京智慧基础设施技术研究院有限公司 A kind of distributed optical fiber sensing method
CN110211566A (en) * 2019-06-08 2019-09-06 安徽中医药大学 A kind of classification method of compressed sensing based hepatolenticular degeneration disfluency
CN110782915A (en) * 2019-10-31 2020-02-11 广州艾颂智能科技有限公司 Waveform music component separation method based on deep learning

Similar Documents

Publication Publication Date Title
CN103151039A (en) Speaker age identification method based on SVM (Support Vector Machine)
CN102723078B (en) Emotion speech recognition method based on natural language comprehension
CN105161093B (en) A kind of method and system judging speaker's number
CN105405439B (en) Speech playing method and device
CN105206271A (en) Intelligent equipment voice wake-up method and system for realizing method
CN105632501A (en) Deep-learning-technology-based automatic accent classification method and apparatus
CN112102850B (en) Emotion recognition processing method and device, medium and electronic equipment
CN111063341A (en) Method and system for segmenting and clustering multi-person voice in complex environment
CN106548775B (en) Voice recognition method and system
WO2012075641A1 (en) Device and method for pass-phrase modeling for speaker verification, and verification system
CN103700370A (en) Broadcast television voice recognition method and system
CN108922541A (en) Multidimensional characteristic parameter method for recognizing sound-groove based on DTW and GMM model
CN111524527A (en) Speaker separation method, device, electronic equipment and storage medium
CN111370030A (en) Voice emotion detection method and device, storage medium and electronic equipment
CN111192659A (en) Pre-training method for depression detection and depression detection method and device
CN113192535B (en) Voice keyword retrieval method, system and electronic device
CN110428853A (en) Voice activity detection method, Voice activity detection device and electronic equipment
CN110827853A (en) Voice feature information extraction method, terminal and readable storage medium
CN105869622B (en) Chinese hot word detection method and device
CN115171731A (en) Emotion category determination method, device and equipment and readable storage medium
CN113823323A (en) Audio processing method and device based on convolutional neural network and related equipment
CN109074809B (en) Information processing apparatus, information processing method, and computer-readable storage medium
CN109065026B (en) Recording control method and device
Ghaemmaghami et al. Complete-linkage clustering for voice activity detection in audio and visual speech
CN114708869A (en) Voice interaction method and device and electric appliance

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130612