CN109859744B - Voice endpoint detection method applied to range hood - Google Patents


Publication number
CN109859744B
Authority
CN
China
Prior art keywords
time
short
range hood
threshold value
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711229316.8A
Other languages
Chinese (zh)
Other versions
CN109859744A (en)
Inventor
杜杉杉
茅忠群
诸永定
方献良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Fotile Kitchen Ware Co Ltd
Original Assignee
Ningbo Fotile Kitchen Ware Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Fotile Kitchen Ware Co Ltd
Priority to CN201711229316.8A
Publication of CN109859744A
Application granted
Publication of CN109859744B

Landscapes

  • Ventilation (AREA)

Abstract

The invention relates to a voice endpoint detection method applied to a range hood, which comprises the following steps: initializing the working gear number of the range hood; initializing a first short-time energy threshold value array, a second short-time energy threshold value array and a short-time zero-crossing rate threshold value array when the range hood works at each working gear; acquiring a current working gear; collecting and acquiring a short-time signal frame of a voice signal; calculating the short-time energy and the short-time zero crossing rate of each short-time signal frame in the voice signal; calculating first start-stop time coordinate data of the voice signal according to a first energy threshold value corresponding to the current working gear of the range hood; calculating second start-stop time coordinate data of the voice signal according to a second energy threshold value corresponding to the current working gear of the range hood; and calculating third start-stop time coordinate data of the voice signal as the start-stop time coordinate of the voice signal according to the short-time zero-crossing rate threshold value corresponding to the current working gear of the range hood.

Description

Voice endpoint detection method applied to range hood
Technical Field
The invention relates to the technical field of range hoods, in particular to a voice endpoint detection method applied to a range hood.
Background
With the continuous development of intelligent technology, speech recognition has become widespread and is finding its way into everyday household products. Examples include the Chinese utility model patent with grant publication number CN205208686U (application number 201521083692.7), 'a voice input control range hood'; the Chinese utility model patent with grant publication number CN206113052U (application number 201620882578.9), 'an intelligent lampblack absorber based on intelligent cloud system control'; and the Chinese utility model patent with grant publication number CN206556088U (application number 201621294280.2), 'a range hood with menu broadcast system'. The range hoods disclosed in all of these use speech recognition and are controlled automatically according to the recognized speech, which makes operating the range hood more convenient and user-friendly.
However, speech recognition requires that noise in the speech signal be dealt with. A range hood is noisy when operating, and how well the fan noise is handled directly affects the accuracy of speech recognition. Moreover, the noise characteristics differ between working gears, so how to improve speech recognition at each working gear of the range hood remains a problem to be solved.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a voice endpoint detection method applied to a range hood that helps reduce the probability of noise being misrecognized as voice at the different working gears of the range hood and that can reduce the amount of data stored during speech recognition.
The technical solution adopted by the invention to solve the above problem is a voice endpoint detection method applied to a range hood, characterized in that the method comprises the following steps:
S1, initializing the number of working gears s of the range hood;
initializing a first short-time energy threshold array and a second short-time energy threshold array for the range hood operating at each working gear; the first short-time energy threshold array is [T_h(1), T_h(2), T_h(3), ..., T_h(i), ..., T_h(s)]; the second short-time energy threshold array is [T_l(1), T_l(2), T_l(3), ..., T_l(i), ..., T_l(s)], where i is a natural number, 1 ≤ i ≤ s, and T_l(i) < T_h(i);
initializing a short-time zero-crossing rate threshold array for the range hood operating at each working gear:
[T_z(1), T_z(2), T_z(3), ..., T_z(i), ..., T_z(s)];
S2, acquiring the current working gear i of the range hood;
S3, collecting a voice signal, and performing pre-emphasis, framing and windowing on the collected voice signal to obtain the short-time signal frames of the voice signal;
S4, calculating the short-time energy and the short-time zero-crossing rate of each short-time signal frame of the voice signal, thereby obtaining the short-time energy of the voice signal as a function of time and the short-time zero-crossing rate of the voice signal as a function of time;
S5, calculating first start-stop time coordinate data (a, b) of the collected voice signal according to the first energy threshold T_h(i) corresponding to the current working gear i of the range hood;
S6, calculating second start-stop time coordinate data (A, B) of the collected voice signal according to the second energy threshold T_l(i) corresponding to the current working gear i of the range hood;
S7, calculating third start-stop time coordinate data (A_0, B_0) of the collected voice signal according to the short-time zero-crossing rate threshold T_z(i) corresponding to the current working gear i of the range hood;
S8, taking (A_0, B_0) as the start-stop time coordinates of the voice signal.
To shorten the processing time, when the second start-stop time coordinate data (A, B) is obtained in S6, the search proceeds leftward from the start time coordinate a of the first start-stop time coordinate data (a, b) to obtain the start time coordinate A of the second start-stop time, and rightward from the end time coordinate b of the first start-stop time coordinate data (a, b) to obtain the end time coordinate B of the second start-stop time.
To shorten the processing time, when the third start-stop time coordinate data (A_0, B_0) is obtained in S7, the search proceeds leftward from the start time coordinate A of the second start-stop time coordinate data (A, B) to obtain the start time coordinate A_0 of the third start-stop time, and rightward from the end time coordinate B of the second start-stop time coordinate data (A, B) to obtain the end time coordinate B_0 of the third start-stop time.
As an improvement, the first short-time energy threshold array, the second short-time energy threshold array and the short-time zero-crossing rate threshold array are obtained as follows:
the noise signal of the range hood is collected at each working gear, and the short-time energy average of the noise signal at each working gear is calculated, forming the short-time energy average array of the noise signal:
Figure GDA0002771253970000021
where
Figure GDA0002771253970000022
represents the short-time energy average of the noise signal when the range hood operates at gear i;
at the same time, the short-time zero-crossing rate average of the noise signal at each working gear is calculated, forming the short-time zero-crossing rate average array of the noise signal:
Figure GDA0002771253970000031
where
Figure GDA0002771253970000032
represents the short-time zero-crossing rate average of the noise signal when the range hood operates at gear i;
voice signals are collected while the range hood operates at each working gear, and the short-time energy average of the voice signal at each working gear is calculated, forming the short-time energy average array of the voice signal:
Figure GDA0002771253970000033
where
Figure GDA0002771253970000034
represents the short-time energy average of the voice signal when the range hood operates at gear i;
at the same time, the short-time zero-crossing rate average of the voice signal at each working gear is calculated, forming the short-time zero-crossing rate average array of the voice signal:
Figure GDA0002771253970000035
where
Figure GDA0002771253970000036
represents the short-time zero-crossing rate average of the voice signal when the range hood operates at gear i;
the first short-time energy threshold of the range hood at each working gear is calculated:
Figure GDA0002771253970000037
where 0 < α < 1; this yields the first short-time energy threshold array of the range hood [T_h(1), T_h(2), T_h(3), ..., T_h(i), ..., T_h(s)];
the second short-time energy threshold of the range hood at each working gear is calculated:
Figure GDA0002771253970000038
where 0 < β < 1 and T_h(i) > T_l(i); this yields the second short-time energy threshold array of the range hood [T_l(1), T_l(2), T_l(3), ..., T_l(i), ..., T_l(s)];
the short-time zero-crossing rate threshold of the range hood at each working gear is calculated:
Figure GDA0002771253970000039
this yields the short-time zero-crossing rate threshold array of the range hood [T_z(1), T_z(2), T_z(3), ..., T_z(i), ..., T_z(s)].
Compared with the prior art, the invention has the following advantages: the voice endpoint detection method applied to the range hood performs voice endpoint detection using different thresholds for different working gears, so that the detection result is more accurate and the influence of the differing noise characteristics of the working gears on the detection result is effectively eliminated. This further reduces the probability that noise is misrecognized as voice in a noisy environment, and at the same time reduces the amount of data stored in the subsequent speech recognition process and increases the speed of speech recognition. In addition, the method places low demands on hardware and is suitable for a range hood, an application environment with weak hardware performance.
Drawings
Fig. 1 is a flowchart of a voice endpoint detection method applied to a range hood in the embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying examples.
During operation of the range hood, fan noise increases as the gear rises, so its effect on speech recognition accuracy differs between gears. Noise in the user's kitchen comes mainly from the range hood, and because the structure of the range hood and the position of the fan are fixed, the noise of the range hood is relatively stable at a given gear. Performing speech recognition in a way that is tailored to the gear information of the range hood can therefore effectively improve the recognition rate at each working gear.
As shown in fig. 1, the voice endpoint detection method applied to the range hood in the embodiment includes the following steps:
S1, initializing the number of working gears s of the range hood, where s is a natural number in this embodiment; the number of working gears s is stored in the control chip when the range hood leaves the factory, so that the chip can identify the working gear of the range hood.
Initializing a first short-time energy threshold array and a second short-time energy threshold array for the range hood operating at each working gear; the first short-time energy threshold array is [T_h(1), T_h(2), T_h(3), ..., T_h(i), ..., T_h(s)]; the second short-time energy threshold array is [T_l(1), T_l(2), T_l(3), ..., T_l(i), ..., T_l(s)], where i is a natural number, 1 ≤ i ≤ s, and T_l(i) < T_h(i). Initializing a short-time zero-crossing rate threshold array for the range hood operating at each working gear:
[T_z(1), T_z(2), T_z(3), ..., T_z(i), ..., T_z(s)].
The first short-time energy threshold array, the second short-time energy threshold array and the short-time zero-crossing rate threshold array can be obtained by testing in a laboratory environment before the range hood leaves the factory.
Specifically, in a laboratory environment, the range hood is set to run at each working gear in turn, and a speech processing chip collects and processes the operating noise of the range hood at each gear: the chip samples and quantizes the noise signal, applies pre-emphasis, then frames and windows it, and finally calculates the short-time energy average of the noise signal at each working gear. The short-time energy average is calculated using the existing calculation formula, forming the short-time energy average array of the noise signal:
Figure GDA0002771253970000041
where
Figure GDA0002771253970000042
represents the short-time energy average of the noise signal when the range hood operates at gear i.
At the same time, the short-time zero-crossing rate average of the noise signal at each working gear is calculated using the existing calculation formula, forming the short-time zero-crossing rate average array of the noise signal:
Figure GDA0002771253970000043
where
Figure GDA0002771253970000044
represents the short-time zero-crossing rate average of the noise signal when the range hood operates at gear i.
In the laboratory environment, the range hood is again set to run at each working gear in turn while a standard test utterance is played to the control chip of the range hood. The same speech processing chip as above collects and processes the voice signal at each working gear: the chip samples and quantizes the voice signal, applies pre-emphasis, then frames and windows it, and finally calculates the short-time energy average of the voice signal at each working gear. The short-time energy average is calculated using the existing calculation formula, forming the short-time energy average array of the voice signal:
Figure GDA0002771253970000051
where
Figure GDA0002771253970000052
represents the short-time energy average of the voice signal when the range hood operates at gear i.
At the same time, the short-time zero-crossing rate average of the voice signal at each working gear is calculated using the existing calculation formula, forming the short-time zero-crossing rate average array of the voice signal:
Figure GDA0002771253970000053
where
Figure GDA0002771253970000054
represents the short-time zero-crossing rate average of the voice signal when the range hood operates at gear i.
The first short-time energy threshold of the range hood at each working gear is calculated:
Figure GDA0002771253970000055
where 0 < α < 1; this yields the first short-time energy threshold array of the range hood [T_h(1), T_h(2), T_h(3), ..., T_h(i), ..., T_h(s)]. The value of α is obtained by actual measurement; to obtain a more accurate first short-time energy threshold array, multiple tests may be performed when calculating the first short-time energy threshold at each working gear so as to arrive at a more accurate α.
The second short-time energy threshold of the range hood at each working gear is calculated:
Figure GDA0002771253970000056
where 0 < β < 1 and T_h(i) > T_l(i); this yields the second short-time energy threshold array of the range hood [T_l(1), T_l(2), T_l(3), ..., T_l(i), ..., T_l(s)]. The value of β is obtained by actual measurement; to obtain a more accurate second short-time energy threshold array, multiple tests may be performed when calculating the second short-time energy threshold at each working gear so as to arrive at a more accurate β.
The short-time zero-crossing rate threshold of the range hood at each working gear is calculated:
Figure GDA0002771253970000057
this yields the short-time zero-crossing rate threshold array of the range hood [T_z(1), T_z(2), T_z(3), ..., T_z(i), ..., T_z(s)].
S2, when a user operates the range hood, the control chip in the range hood automatically detects and acquires the current working gear i of the range hood.
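The per-gear arrays initialized in S1 and the gear lookup in S2 can be pictured as a small table indexed by gear. The following is a minimal sketch under stated assumptions: the names `GearThresholds`, `build_table` and `thresholds_for_gear` are hypothetical, and the numeric values are invented placeholders, not figures from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GearThresholds:
    t_high: float  # first short-time energy threshold T_h(i)
    t_low: float   # second short-time energy threshold T_l(i)
    t_zcr: float   # short-time zero-crossing rate threshold T_z(i)


def build_table(t_high, t_low, t_zcr):
    """Combine the three per-gear arrays into a dict keyed by gear (1..s)."""
    if not (len(t_high) == len(t_low) == len(t_zcr)):
        raise ValueError("each array needs exactly one entry per gear")
    table = {}
    for gear, (th, tl, tz) in enumerate(zip(t_high, t_low, t_zcr), start=1):
        if not tl < th:
            # The method requires T_l(i) < T_h(i) for every gear.
            raise ValueError(f"gear {gear}: T_l(i) must be below T_h(i)")
        table[gear] = GearThresholds(th, tl, tz)
    return table


def thresholds_for_gear(table, gear):
    """Return the thresholds for the current working gear i."""
    if gear not in table:
        raise ValueError(f"unknown gear {gear}")
    return table[gear]


# Placeholder values for a hood with s = 3 gears (assumed, for illustration).
TABLE = build_table([8.0, 12.0, 20.0], [2.0, 4.0, 7.0], [0.10, 0.12, 0.15])
current = thresholds_for_gear(TABLE, 2)
```

Validating T_l(i) < T_h(i) when the table is built keeps a miscalibrated gear from silently breaking the later two-stage energy search.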
S3, collecting the user's control voice signal, and performing pre-emphasis, framing and windowing on the collected voice signal to obtain the short-time signal frames of the voice signal. Because of the particular structure of the human vocal apparatus, speech is affected by glottal excitation and by radiation from the mouth and nose, so the sound produced is attenuated in the high-frequency band; pre-emphasis usually applies a high-pass filter to boost the high-frequency response of the voice signal. When the voice signal is framed and windowed, a Hamming window can be used.
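The pre-processing in S3 can be sketched as below. This is an illustration only: the first-order filter coefficient 0.97, the frame length of 256 samples and the hop of 128 samples are common textbook choices assumed here, not values given in the patent, and the function names are hypothetical.

```python
import numpy as np


def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass filter y[n] = x[n] - coeff * x[n-1] (coeff assumed)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size == 0:
        return signal
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])


def frame_and_window(signal, frame_len=256, hop=128):
    """Split the signal into overlapping frames and apply a Hamming window."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad very short inputs so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row k holds the sample indices of frame k.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)
```

A typical call chain would be `frames = frame_and_window(pre_emphasis(samples))`, giving one row per short-time signal frame.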
S4, calculating the short-time energy and the short-time zero-crossing rate of each short-time signal frame of the voice signal, thereby obtaining the short-time energy of the voice signal as a function of time and the short-time zero-crossing rate of the voice signal as a function of time.
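The two per-frame features in S4 can be computed as in the following sketch. It uses the conventional definitions (sum of squared samples for energy; fraction of adjacent sample pairs whose sign differs for the zero-crossing rate), which is an assumption, since the patent refers only to "the existing calculation formula". The function names and the `frame_times` helper are hypothetical.

```python
import numpy as np


def short_time_energy(frames):
    """Sum of squared samples in each frame (one value per row)."""
    frames = np.asarray(frames, dtype=np.float64)
    return np.sum(frames ** 2, axis=1)


def short_time_zcr(frames):
    """Fraction of adjacent sample pairs in each frame whose sign differs."""
    frames = np.asarray(frames, dtype=np.float64)
    signs = np.where(frames >= 0, 1, -1)  # zero is treated as positive
    crossings = np.sum(signs[:, 1:] != signs[:, :-1], axis=1)
    return crossings / (frames.shape[1] - 1)


def frame_times(n_frames, hop, sample_rate):
    """Start time in seconds of each frame, giving the time axis for both curves."""
    return np.arange(n_frames) * hop / sample_rate
```

Plotting `short_time_energy(frames)` and `short_time_zcr(frames)` against `frame_times(...)` gives the two feature-versus-time relations that the later steps search.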
S5, calculating first start-stop time coordinate data (a, b) of the collected voice signal according to the first energy threshold T_h(i) corresponding to the current working gear i of the range hood; the first start-stop time coordinate data (a, b) identifies the approximate start and end time points of the voice signal.
S6, calculating second start-stop time coordinate data (A, B) of the collected voice signal according to the second energy threshold T_l(i) corresponding to the current working gear i of the range hood; the second start-stop time coordinate data (A, B) detects the start and end time points of the voiced part of the voice signal. When the second start-stop time coordinate data (A, B) is obtained, the search proceeds leftward from the start time coordinate a of the first start-stop time coordinate data (a, b) to obtain the start time coordinate A of the second start-stop time, and rightward from the end time coordinate b of the first start-stop time coordinate data (a, b) to obtain the end time coordinate B of the second start-stop time, which saves processing time.
S7, calculating third start-stop time coordinate data (A_0, B_0) of the collected voice signal according to the short-time zero-crossing rate threshold T_z(i) corresponding to the current working gear i of the range hood. Chinese syllables typically begin with an initial consonant, and most initial consonants are unvoiced and easily confused with ambient noise; however, the short-time zero-crossing rate of ambient noise is markedly lower than that of unvoiced sounds, so the third start-stop time coordinate data (A_0, B_0) can be used directly as the start and end time points of the voice signal.
When the third start-stop time coordinate data (A_0, B_0) is obtained, the search proceeds leftward from the start time coordinate A of the second start-stop time coordinate data (A, B) to obtain the start time coordinate A_0 of the third start-stop time, and rightward from the end time coordinate B of the second start-stop time coordinate data (A, B) to obtain the end time coordinate B_0 of the third start-stop time.
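The three-stage search of S5 to S7 can be sketched on frame indices as follows. Several details are assumptions made for the sketch rather than statements from the patent: (a, b) is taken as the first and last frame whose energy exceeds T_h(i), each outward search stops at the first frame that is not above the threshold, and all function names are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def first_endpoints(energy: Sequence[float], t_high: float) -> Optional[Tuple[int, int]]:
    """S5: (a, b) = first and last frame whose energy exceeds T_h(i), or None."""
    above = [k for k, e in enumerate(energy) if e > t_high]
    if not above:
        return None
    return above[0], above[-1]


def expand_while_above(values: Sequence[float], start: int, end: int,
                       threshold: float) -> Tuple[int, int]:
    """Search left from start and right from end while neighbours stay above threshold."""
    left = start
    while left > 0 and values[left - 1] > threshold:
        left -= 1
    right = end
    while right < len(values) - 1 and values[right + 1] > threshold:
        right += 1
    return left, right


def detect_endpoints(energy, zcr, t_high, t_low, t_zcr):
    """Return (A_0, B_0) as frame indices, or None if no frame exceeds T_h(i)."""
    first = first_endpoints(energy, t_high)
    if first is None:
        return None
    a, b = first
    big_a, big_b = expand_while_above(energy, a, b, t_low)   # S6: (A, B)
    return expand_while_above(zcr, big_a, big_b, t_zcr)      # S7: (A_0, B_0)
```

Because each stage starts from the previous stage's endpoints instead of rescanning the whole signal, only the frames just outside the current segment are examined, which is the time saving the text describes.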
S8, taking (A_0, B_0) as the start-stop time coordinates of the voice signal. Using the start-stop time coordinates (A_0, B_0), the corresponding valid voice signal can be extracted, and feature extraction on that valid signal removes the redundant information in the original speech. Finally, the extracted features are matched against a trained model, so that the speech uttered by the user is effectively recognized.
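Extracting the valid voice segment from (A_0, B_0) amounts to converting frame indices back to a sample range. A minimal sketch, assuming the endpoints are frame indices and that frames were cut with a fixed frame length and hop (both assumptions; the helper names are hypothetical):

```python
def frames_to_samples(start_frame, end_frame, frame_len, hop, n_samples):
    """Map an inclusive frame range to a half-open sample range [start, stop)."""
    start = start_frame * hop
    stop = min(end_frame * hop + frame_len, n_samples)
    return start, stop


def trim_to_speech(signal, start_frame, end_frame, frame_len=256, hop=128):
    """Keep only the samples covered by frames start_frame..end_frame."""
    start, stop = frames_to_samples(start_frame, end_frame, frame_len, hop, len(signal))
    return signal[start:stop]
```

Only the trimmed segment needs to be stored and passed to feature extraction, which is where the reduction in stored data comes from.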
The voice endpoint detection method applied to the range hood performs voice endpoint detection using different thresholds for different working gears, so that the detection result is more accurate and the influence of the differing noise characteristics of the working gears on the detection result is effectively eliminated. This further reduces the probability that noise is misrecognized as voice in a noisy environment, and at the same time reduces the amount of data stored in the subsequent speech recognition process and increases the speed of speech recognition. In addition, the method places low demands on hardware and is suitable for a range hood, an application environment with weak hardware performance.

Claims (3)

1. A voice endpoint detection method applied to a range hood, characterized in that the method comprises the following steps:
S1, initializing the number of working gears s of the range hood;
initializing a first short-time energy threshold array and a second short-time energy threshold array for the range hood operating at each working gear; the first short-time energy threshold array is [T_h(1), T_h(2), T_h(3), ..., T_h(i), ..., T_h(s)]; the second short-time energy threshold array is [T_l(1), T_l(2), T_l(3), ..., T_l(i), ..., T_l(s)], where i is a natural number, 1 ≤ i ≤ s, and T_l(i) < T_h(i);
initializing a short-time zero-crossing rate threshold array for the range hood operating at each working gear:
[T_z(1), T_z(2), T_z(3), ..., T_z(i), ..., T_z(s)];
the first short-time energy threshold array, the second short-time energy threshold array and the short-time zero-crossing rate threshold array are obtained as follows:
the noise signal of the range hood is collected at each working gear, and the short-time energy average of the noise signal at each working gear is calculated, forming the short-time energy average array of the noise signal:
Figure FDA0002771253960000011
where
Figure FDA0002771253960000012
represents the short-time energy average of the noise signal when the range hood operates at gear i;
at the same time, the short-time zero-crossing rate average of the noise signal at each working gear is calculated, forming the short-time zero-crossing rate average array of the noise signal:
Figure FDA0002771253960000013
where
Figure FDA0002771253960000014
represents the short-time zero-crossing rate average of the noise signal when the range hood operates at gear i;
voice signals are collected while the range hood operates at each working gear, and the short-time energy average of the voice signal at each working gear is calculated, forming the short-time energy average array of the voice signal:
Figure FDA0002771253960000015
where
Figure FDA0002771253960000016
represents the short-time energy average of the voice signal when the range hood operates at gear i;
at the same time, the short-time zero-crossing rate average of the voice signal at each working gear is calculated, forming the short-time zero-crossing rate average array of the voice signal:
Figure FDA0002771253960000017
where
Figure FDA0002771253960000018
represents the short-time zero-crossing rate average of the voice signal when the range hood operates at gear i;
the first short-time energy threshold of the range hood at each working gear is calculated:
Figure FDA0002771253960000021
where 0 < α < 1; this yields the first short-time energy threshold array of the range hood [T_h(1), T_h(2), T_h(3), ..., T_h(i), ..., T_h(s)];
the second short-time energy threshold of the range hood at each working gear is calculated:
Figure FDA0002771253960000022
where 0 < β < 1 and T_h(i) > T_l(i); this yields the second short-time energy threshold array of the range hood [T_l(1), T_l(2), T_l(3), ..., T_l(i), ..., T_l(s)];
the short-time zero-crossing rate threshold of the range hood at each working gear is calculated:
Figure FDA0002771253960000023
this yields the short-time zero-crossing rate threshold array of the range hood [T_z(1), T_z(2), T_z(3), ..., T_z(i), ..., T_z(s)];
S2, acquiring the current working gear i of the range hood;
S3, collecting a voice signal, and performing pre-emphasis, framing and windowing on the collected voice signal to obtain the short-time signal frames of the voice signal;
S4, calculating the short-time energy and the short-time zero-crossing rate of each short-time signal frame of the voice signal, thereby obtaining the short-time energy of the voice signal as a function of time and the short-time zero-crossing rate of the voice signal as a function of time;
S5, calculating first start-stop time coordinate data (a, b) of the collected voice signal according to the first energy threshold T_h(i) corresponding to the current working gear i of the range hood;
S6, calculating second start-stop time coordinate data (A, B) of the collected voice signal according to the second energy threshold T_l(i) corresponding to the current working gear i of the range hood;
S7, calculating third start-stop time coordinate data (A_0, B_0) of the collected voice signal according to the short-time zero-crossing rate threshold T_z(i) corresponding to the current working gear i of the range hood;
S8, taking (A_0, B_0) as the start-stop time coordinates of the voice signal.
2. The voice endpoint detection method applied to a range hood according to claim 1, characterized in that: in S6, when the second start-stop time coordinate data (A, B) is obtained, the search proceeds leftward from the start time coordinate a of the first start-stop time coordinate data (a, b) to obtain the start time coordinate A of the second start-stop time, and rightward from the end time coordinate b of the first start-stop time coordinate data (a, b) to obtain the end time coordinate B of the second start-stop time.
3. The voice endpoint detection method applied to a range hood according to claim 1, characterized in that: in S7, when the third start-stop time coordinate data (A_0, B_0) is obtained, the search proceeds leftward from the start time coordinate A of the second start-stop time coordinate data (A, B) to obtain the start time coordinate A_0 of the third start-stop time, and rightward from the end time coordinate B of the second start-stop time coordinate data (A, B) to obtain the end time coordinate B_0 of the third start-stop time.
CN201711229316.8A 2017-11-29 2017-11-29 Voice endpoint detection method applied to range hood Active CN109859744B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711229316.8A CN109859744B (en) 2017-11-29 2017-11-29 Voice endpoint detection method applied to range hood

Publications (2)

Publication Number Publication Date
CN109859744A (en) 2019-06-07
CN109859744B (en) 2021-01-19

Family

ID=66887533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711229316.8A Active CN109859744B (en) 2017-11-29 2017-11-29 Voice endpoint detection method applied to range hood

Country Status (1)

Country Link
CN (1) CN109859744B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1835073A (en) * 2006-04-20 2006-09-20 南京大学 Mute detection method based on speech characteristic to jude
CN102056026A (en) * 2009-11-06 2011-05-11 中国移动通信集团设计院有限公司 Audio/video synchronization detection method and system, and voice detection method and system
US20110264447A1 (en) * 2010-04-22 2011-10-27 Qualcomm Incorporated Systems, methods, and apparatus for speech feature detection
CN103151039A (en) * 2013-02-07 2013-06-12 中国科学院自动化研究所 Speaker age identification method based on SVM (Support Vector Machine)
CN107305774A (en) * 2016-04-22 2017-10-31 腾讯科技(深圳)有限公司 Speech detection method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
《基于卷积神经网络的语音端点检测方法研究》;王海旭;《中国优秀硕士学位论文全文数据库 信息科技辑》;20160115(第01期);第15-18页 *
《基于双门限两级判决的语音端点检测方法》;路青起等;《电子科技》;20120131;第25卷(第1期);第13-16页 *
《强噪声环境下改进的语音端点检测算法》;鲁远耀等;《计算机应用》;20140510;第34卷(第5期);第1386-1390页 *

Also Published As

Publication number Publication date
CN109859744A (en) 2019-06-07

Similar Documents

Publication Publication Date Title
CN106448663B (en) Voice awakening method and voice interaction device
CN103886871B (en) Detection method of speech endpoint and device thereof
CN105118502B (en) End point detection method and system of voice identification system
CN103617799B (en) A kind of English statement pronunciation quality detection method being adapted to mobile device
CN109473123A (en) Voice activity detection method and device
CN108922541B (en) Multi-dimensional characteristic parameter voiceprint recognition method based on DTW and GMM models
WO2018145584A1 (en) Voice activity detection method and voice recognition method
CN105374352B (en) A kind of voice activated method and system
CN110232933B (en) Audio detection method and device, storage medium and electronic equipment
EP4044175A1 (en) Voice recognition method and apparatus, and computer-readale storage medium
CN108305639B (en) Speech emotion recognition method, computer-readable storage medium and terminal
CN105825852A (en) Oral English reading test scoring method
CN101114449A (en) Model training method for unspecified person alone word, recognition system and recognition method
CN106023986B (en) A kind of audio recognition method based on sound effect mode detection
CN112992191B (en) Voice endpoint detection method and device, electronic equipment and readable storage medium
WO2018095167A1 (en) Voiceprint identification method and voiceprint identification system
CN105679312A (en) Phonetic feature processing method of voiceprint identification in noise environment
CN106504756B (en) Built-in speech recognition system and method
CN109841208A (en) A kind of sound enhancement method applied in range hood
CN108986844B (en) Speech endpoint detection method based on speaker speech characteristics
CN110867193A (en) Paragraph English spoken language scoring method and system
CN104732984B (en) A kind of method and system of quick detection single-frequency prompt tone
CN105976811A (en) Syllable segmentation method containing initial consonant and device thereof
CN109065026A (en) A kind of recording control method and device
CN109859744B (en) Voice endpoint detection method applied to range hood

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant