CN103366739A - Self-adaptive endpoint detection method and self-adaptive endpoint detection system for isolate word speech recognition - Google Patents
- Publication number
- CN103366739A (application CN201210085584.8)
- Authority
- CN
- China
- Prior art keywords
- word
- frame
- voice
- short
- end points
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Telephonic Communication Services (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a self-adaptive endpoint detection method and a self-adaptive endpoint detection system for isolated-word speech recognition. The method comprises the following steps: a, a speech input step, in which a speech signal containing the isolated word to be recognized is input; b, a speech preprocessing step, in which the speech signal undergoes amplitude shifting, normalization and framing, and the short-time average energy and short-time average zero-crossing rate of each speech frame are calculated; c, a rough isolated-word endpoint detection step, in which the isolated-word endpoints are roughly estimated using the short-time average energy and short-time average zero-crossing rate of each frame together with a constraint on the minimum length of consecutive speech frames before and after the endpoints; d, a detection threshold self-adaptive adjustment and accurate endpoint detection step, in which the detection thresholds are adjusted dynamically using constraints on the minimum and maximum duration of the isolated word, and the speech endpoints are fine-tuned forward and backward to obtain accurate isolated-word endpoints; e, an endpoint output and isolated-word speech recognition step, in which the accurate isolated-word endpoints are output and isolated-word recognition is performed using speech recognition technology.
Description
Technical field
The present invention relates to a voice activity detection algorithm for isolated-word speech recognition and, more specifically, to an endpoint detection algorithm for speaker-independent isolated-word speech recognition that automatically adjusts its detection thresholds according to the background noise.
Background
Isolated-word speech recognition is the technology by which a machine converts a speech signal containing an isolated word into the corresponding text or command. It has a wide range of applications and markets, such as command-and-control systems and voice-operated toys. In an isolated-word speech recognition system, the input signal contains the isolated-word speech together with background noise; locating the start point and end point of the speech within the input signal is called endpoint detection. The accuracy of endpoint detection directly affects the recognition rate.
A commonly used endpoint detection algorithm is the double-threshold algorithm based on short-time average amplitude and short-time zero-crossing rate. It uses the short-time average amplitude to distinguish voiced sounds from silent segments, and the short-time average zero-crossing rate to distinguish unvoiced sounds from silent segments. This algorithm detects speech with a high signal-to-noise ratio (SNR) well, but it is strongly affected by noise, so its detection performance on noisy speech is poor.
Summary of the invention
In view of this, the object of the present invention is to overcome the noise sensitivity of the existing double-threshold detection algorithm by adaptively changing the detection thresholds according to the strength of the background noise and, combined with constraints on the length of the isolated-word speech, to provide a highly robust endpoint detection algorithm for isolated-word speech recognition.
To achieve the above object, the present invention adopts the following technical solution, which comprises the following steps:
a. Speech input: input a speech signal containing the isolated word to be recognized;
b. Speech preprocessing: apply amplitude shifting, normalization and framing to the speech signal, and calculate the short-time average energy and short-time average zero-crossing rate of each speech frame;
c. Rough isolated-word endpoint detection: roughly estimate the isolated-word endpoints using the short-time average energy and short-time average zero-crossing rate of each speech frame, together with a constraint on the minimum length of consecutive speech frames before and after the endpoints;
d. Self-adaptive adjustment of the detection thresholds and accurate endpoint detection: adjust the detection thresholds dynamically using the limits on the minimum and maximum duration of the isolated word, and fine-tune the speech endpoints forward and backward to obtain accurate isolated-word endpoints;
e. Endpoint output and isolated-word speech recognition: output the accurate isolated-word endpoints and perform isolated-word recognition using speech recognition technology.
In step c, a constraint on the length of consecutive speech frames before and after the endpoints is introduced when making the rough endpoint estimate.
In step e, when performing accurate endpoint detection, the detection thresholds are adjusted adaptively according to the length constraints on the isolated word. Here "backward" means toward later frames and "forward" means toward earlier frames. When the detected isolated-word speech length is greater than the maximum length of an isolated word, the short-time energy high threshold is increased, the start point is moved backward and the end point is moved forward, so that the frame average energy at the start point and at the end point is each greater than the new high threshold. When the detected length is greater than the maximum length, the short-time zero-crossing rate threshold is also reduced, the start point is moved backward and the end point is moved forward, so that the average zero-crossing rates of the frame before the start point and the frame after the end point are greater than the new short-time zero-crossing rate threshold. When the detected isolated-word speech length is less than the minimum length of an isolated word, the short-time energy high threshold is reduced, the start point is moved forward and the end point is moved backward, so that the frame average energy at the start point and at the end point is each greater than the new high threshold. When the detected length is less than the minimum length, the short-time zero-crossing rate threshold is also increased, the start point is moved forward and the end point is moved backward, so that the average zero-crossing rates of the frame before the start point and the frame after the end point are greater than the new short-time zero-crossing rate threshold.
The self-adaptive endpoint detection system for isolated-word speech recognition comprises: an input device for the isolated-word speech signal to be recognized; a device that applies amplitude shifting, normalization and framing to the speech signal and calculates the short-time average energy and short-time average zero-crossing rate of each speech frame; a device that roughly estimates the isolated-word endpoints using the short-time average energy and short-time average zero-crossing rate of each speech frame, together with a constraint on the minimum length of consecutive speech frames before and after the endpoints; a device that adjusts the detection thresholds dynamically using the limits on the minimum and maximum duration of the isolated word and fine-tunes the speech endpoints forward and backward to obtain accurate isolated-word endpoints; and a device that outputs the accurate isolated-word endpoints and performs isolated-word recognition using speech recognition technology.
The beneficial effects of the invention are as follows. Because the traditional isolated-word endpoint detection algorithm based on double-threshold detection is strongly affected by noise, the present invention provides a new endpoint detection algorithm that has a degree of noise immunity and adjusts its detection thresholds adaptively. Compared with the prior art, the present invention introduces a duration limit on consecutive speech frames when detecting endpoints, which increases the robustness of detection; it also introduces a duration limit tied to the isolated word being detected, which is used to adjust the detection thresholds automatically. The algorithm is simple to implement, effective and fast, and has a degree of noise immunity. It is particularly suitable for small devices and embedded devices, and can serve as the front end of an isolated-word speech recognition system.
Description of drawings
Fig. 1 is a flowchart of the overall framework of the method of the present invention.
Fig. 2 is a flowchart of the self-adaptive adjustment of the detection thresholds and the accurate detection of the isolated-word endpoints in the method of the present invention.
Detailed description
The specific implementation of the present invention is further described below with reference to the drawings.
As shown in Fig. 1 and Fig. 2, the present invention comprises the following steps:
1. Speech input
Input a speech signal containing the isolated word to be detected.
2. Speech preprocessing and selection of the detection threshold parameters
Apply amplitude shifting and then normalization to the speech signal, divide it into frames, and calculate the short-time average energy and short-time average zero-crossing rate of each speech frame. Based on experiments and experience, preset the high threshold EFVU and low threshold EFVL for the energy of a speech frame, the zero-crossing threshold ZCRT, and the upper limit LENU and lower limit LENL of the isolated-word speech length.
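The preprocessing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `preprocess`, the frame length and hop size, the reading of "amplitude shifting" as mean removal, peak normalization, and the sign-change definition of the zero-crossing rate are all choices made for this example and are not specified in the text.

```python
def preprocess(samples, frame_len=256, hop=128):
    """Return (energy, zcr): per-frame short-time average energy and zero-crossing rate.

    frame_len and hop are illustrative defaults, not values from the patent.
    """
    n = len(samples)
    if n == 0:
        return [], []
    # Amplitude shift: remove the DC offset (assumed interpretation).
    mean = sum(samples) / n
    shifted = [s - mean for s in samples]
    # Normalization: scale so the largest magnitude is 1 (assumed scheme).
    peak = max(abs(s) for s in shifted) or 1.0
    x = [s / peak for s in shifted]
    energy, zcr = [], []
    # Framing: fixed-length frames advanced by `hop` samples; a partial tail is dropped.
    for start in range(0, n - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        # Short-time average energy: mean of squared samples in the frame.
        energy.append(sum(v * v for v in frame) / frame_len)
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        zcr.append(crossings / max(frame_len - 1, 1))
    return energy, zcr
```

The two returned lists are indexed by frame number, which is the unit used for x1, x2 and the frame counts N1 to N4 in the following steps.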
3. Rough estimate of the isolated-word start point and end point
3-1. Rough estimate of the isolated-word start point
Examine the speech frames from front to back to find the rough estimate x1 of the isolated-word start point. This estimate must satisfy three conditions: first, the short-time average energy of several consecutive frames going backward (toward later frames) from x1 is greater than the high threshold EFVU; second, the short-time average energy of several consecutive frames going forward (toward earlier frames) from x1 is less than the low threshold EFVL; third, the short-time zero-crossing rate of frame x1-1 is greater than the zero-crossing threshold ZCRT.
Specifically, search for N1 consecutive frames whose short-time average energy is greater than the high threshold EFVU, where N1 is preferably 5 to 7, and denote the first of these frames as as1.
Search forward from as1 for the frame nearest to as1 whose short-time average energy is less than EFVL, and denote the frame after it as a1.
Search N2 frames forward from a1, where N2 is preferably 10, and count the frames among them whose short-time average energy is greater than EFVL. If the count exceeds N3, where N3 is 2 to 4, denote the foremost of the frames whose energy is greater than EFVL as the new a1.
Continue searching from the new a1 until fewer than N3+1 frames in the new N2-frame window have energy greater than EFVL, or until the first frame of the speech signal is reached, and denote the result as the new a1.
Then search forward from a1 for a frame whose short-time zero-crossing rate is greater than ZCRT; the frame after it is the rough estimate x1 of the speech start point.
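The start-point search above can be sketched as follows. This is one possible reading of step 3-1, not a definitive implementation: the function name `rough_start`, the default values chosen within the stated ranges, returning `None` when no speech run is found, and falling back to a1 when no high zero-crossing frame precedes it are all assumptions. Frame indices increase with time, so "forward" in the text corresponds to decreasing indices.

```python
def rough_start(energy, zcr, efvu, efvl, zcrt, n1=5, n2=10, n3=3):
    """Rough start-point estimate x1 (frame index), or None if no speech run exists."""
    # Find as1: the first frame of the first run of n1 consecutive frames above EFVU.
    as1, run = None, 0
    for i, e in enumerate(energy):
        run = run + 1 if e > efvu else 0
        if run == n1:
            as1 = i - n1 + 1
            break
    if as1 is None:
        return None  # assumption: the text does not say what to do without speech
    # Find a1: walk toward earlier frames to the nearest frame below EFVL;
    # a1 is the frame right after it.
    a1 = as1
    while a1 > 0 and energy[a1 - 1] >= efvl:
        a1 -= 1
    # Look at the n2 frames before a1. While more than n3 of them exceed EFVL,
    # move a1 to the earliest such frame and repeat.
    while a1 > 0:
        lo = max(0, a1 - n2)
        loud = [j for j in range(lo, a1) if energy[j] > efvl]
        if len(loud) <= n3:
            break
        a1 = loud[0]
    # Search earlier frames for one whose zero-crossing rate exceeds ZCRT;
    # x1 is the frame after it, so frame x1-1 satisfies the third condition.
    for j in range(a1 - 1, -1, -1):
        if zcr[j] > zcrt:
            return j + 1
    return a1  # assumption: keep a1 when no such frame exists
```

For example, with a 6-frame loud segment preceded by two low-energy frames and a single high zero-crossing frame at index 2, the function returns 3.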
3-2. Rough estimate of the isolated-word end point
Search backward (toward later frames) starting from frame x1 to find the rough estimate x2 of the isolated-word end point. This estimate must satisfy three conditions: first, the short-time average energy of several consecutive frames going backward from x2 is less than the low threshold EFVL; second, the short-time average energy of several frames going forward (toward earlier frames) from x2 is greater than the high threshold EFVU; third, the short-time zero-crossing rate of frame x2+1 is greater than the zero-crossing threshold ZCRT.
Specifically, search backward starting from frame x1 for N4 consecutive frames whose short-time average energy is less than EFVL, where N4 is 20 to 30, and record the frame number of the first of these frames as as2.
Search N2 frames forward from as2. If fewer than N3+1 of these frames have energy greater than EFVU, record the last of the frames whose energy is greater than EFVU as the new as2 and continue searching forward from the new as2, until more than N3 frames within N2 consecutive frames have energy greater than EFVU.
Search backward from as2 for the first frame after as2 whose energy is less than EFVL, and record this frame as a2.
Then continue searching backward from a2 for a frame whose short-time zero-crossing rate is greater than ZCRT; the frame before it is the rough estimate x2 of the speech end point.
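A sketch of the end-point search under similar assumptions is shown below. The text leaves several details open, so the following choices are this example's own: the function name `rough_end`, taking the latest loud frame in the window as "the last", stepping back a full window when the window contains no loud frame, starting the zero-crossing search at a2+1, and the fallbacks to the last frame or to a2. The high threshold is written `efvu` throughout.

```python
def rough_end(energy, zcr, x1, efvu, efvl, zcrt, n2=10, n3=3, n4=20):
    """Rough end-point estimate x2 (frame index) for speech that starts at x1."""
    n = len(energy)
    # Find as2: the first frame of the first run of n4 consecutive frames below EFVL.
    as2, run = None, 0
    for i in range(x1, n):
        run = run + 1 if energy[i] < efvl else 0
        if run == n4:
            as2 = i - n4 + 1
            break
    if as2 is None:
        return n - 1  # assumption: no long silence found, use the last frame
    # Check the n2 frames before as2. While at most n3 of them exceed EFVU,
    # move as2 earlier and check again, so sparse bursts are skipped.
    while as2 > x1:
        lo = max(x1, as2 - n2)
        loud = [j for j in range(lo, as2) if energy[j] > efvu]
        if len(loud) > n3:
            break
        as2 = loud[-1] if loud else lo  # assumption: see lead-in
    # Find a2: the first frame after as2 whose energy is below EFVL.
    a2 = as2 + 1
    while a2 < n and energy[a2] >= efvl:
        a2 += 1
    if a2 >= n:
        return n - 1
    # Search later frames for one whose zero-crossing rate exceeds ZCRT;
    # x2 is the frame before it, so frame x2+1 satisfies the third condition.
    for j in range(a2 + 1, n):
        if zcr[j] > zcrt:
            return j - 1
    return a2  # assumption: keep a2 when no such frame exists
```

With this reading, an isolated one-frame click that follows the word after a short pause is excluded from the detected segment, because the window in front of it contains too few loud frames.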
4. Self-adaptive adjustment of the detection thresholds and accurate determination of the isolated-word start point and end point
Adjust the detection thresholds adaptively using the isolated-word length constraints, and search near the rough endpoint estimates for accurate estimates.
4-1. Selection of the adaptive parameters
Set the adaptive parameters α, β, μ and λ, where 1.20 ≤ α ≤ 2.00, 0.50 ≤ β ≤ 0.80, 1.20 ≤ μ ≤ 2.00 and 0.50 ≤ λ ≤ 0.80. Let GEFVU = LEFVU = EFVU and GZCRT = LZCRT = ZCRT.
4-2. Adaptive threshold adjustment when the detected speech length is too long
4-2-1. If x2 - x1 > LENU, the short-time energy high threshold EFVU is too small. Increase the high threshold by a factor of α, then move x1 backward (later) and x2 forward (earlier) so that the frame average energy at the start point and at the end point is each greater than the new high threshold. Specifically, let GEFVU = α × GEFVU. Search backward from x1 and denote the first frame whose energy is greater than GEFVU as the new x1. Likewise, search forward from x2 and denote the first frame whose energy is greater than GEFVU as the new x2.
4-2-2. If x2 - x1 > LENU still holds, the short-time zero-crossing rate threshold is too large. Reduce the short-time zero-crossing rate threshold to β times its original value, then move x1 backward and x2 forward so that the average zero-crossing rates of frames x1-1 and x2+1 are greater than the new short-time zero-crossing rate threshold. Specifically, let LZCRT = β × LZCRT. Continue searching backward from x1 and denote the first frame whose zero-crossing rate is less than LZCRT as the new x1; likewise, continue searching forward from x2 and denote the first frame whose zero-crossing rate is less than LZCRT as the new x2.
4-2-3. Repeat steps 4-2-1 and 4-2-2 until x2 - x1 ≤ LENU; if x2 - x1 > LENU still holds after N4 iterations, end the loop.
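The loop in steps 4-2-1 to 4-2-3 can be sketched as below. The function name `shrink_if_too_long`, the default values of `alpha` and `beta` (picked from inside the stated ranges), the iteration cap `max_rounds`, and the rule of leaving an endpoint unchanged when no qualifying frame is found are assumptions made for this illustration.

```python
def shrink_if_too_long(energy, zcr, x1, x2, efvu, zcrt, lenu,
                       alpha=1.5, beta=0.65, max_rounds=5):
    """Tighten (x1, x2) while the detected length x2 - x1 exceeds LENU.

    Returns (x1, x2, gefvu, lzcrt): the adjusted endpoints and thresholds.
    """
    gefvu, lzcrt = efvu, zcrt
    for _ in range(max_rounds):
        if x2 - x1 <= lenu:
            break
        # 4-2-1: raise the energy high threshold, then move x1 later and x2 earlier
        # to the first frames whose energy exceeds it.
        gefvu *= alpha
        x1 = next((i for i in range(x1, x2 + 1) if energy[i] > gefvu), x1)
        x2 = next((i for i in range(x2, x1 - 1, -1) if energy[i] > gefvu), x2)
        if x2 - x1 <= lenu:
            break
        # 4-2-2: lower the zero-crossing threshold, then move x1 later and x2 earlier
        # to the first frames whose zero-crossing rate is below it.
        lzcrt *= beta
        x1 = next((i for i in range(x1, x2 + 1) if zcr[i] < lzcrt), x1)
        x2 = next((i for i in range(x2, x1 - 1, -1) if zcr[i] < lzcrt), x2)
    return x1, x2, gefvu, lzcrt
```

Because the energy threshold grows geometrically by α each round, a small number of rounds is usually enough to exclude low-energy noise at both ends of an over-long segment.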
4-3. Adaptive threshold adjustment when the detected speech length is too short
4-3-1. If x2 - x1 < LENL, the short-time energy high threshold EFVU is too large. Reduce the high threshold to λ times its original value, then move x1 forward (earlier) and x2 backward (later) so that the frame average energy at the start point and at the end point is each greater than the new high threshold. Specifically, let LEFVU = λ × LEFVU. Search forward from x1 for the first frame whose energy is less than LEFVU and denote this frame as the new x1. Likewise, search backward from x2 for the first frame whose energy is less than LEFVU and denote this frame as the new x2.
4-3-2. If x2 - x1 < LENL still holds, the short-time average zero-crossing rate threshold is too small. Increase the short-time zero-crossing rate threshold to μ times its original value, then move x1 forward and x2 backward so that the average zero-crossing rates of frames x1-1 and x2+1 are greater than the new short-time zero-crossing rate threshold. Specifically, let GZCRT = μ × GZCRT. Continue searching forward from x1 and denote the first frame whose zero-crossing rate is less than GZCRT as the new x1; likewise, continue searching backward from x2 and denote the first frame whose zero-crossing rate is less than GZCRT as the new x2.
4-3-3. Repeat steps 4-3-1 and 4-3-2 until x2 - x1 ≥ LENL; if x2 - x1 < LENL still holds after N5 iterations, end the loop.
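The complementary loop in steps 4-3-1 to 4-3-3 can be sketched in the same way. As before, the function name `expand_if_too_short`, the defaults for `lam` and `mu` (picked from inside the stated ranges), the iteration cap `max_rounds`, and leaving an endpoint unchanged when no qualifying frame is found are assumptions for this illustration. Each search starts at the current endpoint, which is one possible reading of "search from x1".

```python
def expand_if_too_short(energy, zcr, x1, x2, efvu, zcrt, lenl,
                        lam=0.65, mu=1.5, max_rounds=5):
    """Widen (x1, x2) while the detected length x2 - x1 is below LENL.

    Returns (x1, x2, lefvu, gzcrt): the adjusted endpoints and thresholds.
    """
    n = len(energy)
    lefvu, gzcrt = efvu, zcrt
    for _ in range(max_rounds):
        if x2 - x1 >= lenl:
            break
        # 4-3-1: lower the energy high threshold, then move x1 earlier and x2 later
        # to the first frames whose energy is below it.
        lefvu *= lam
        x1 = next((i for i in range(x1, -1, -1) if energy[i] < lefvu), x1)
        x2 = next((i for i in range(x2, n) if energy[i] < lefvu), x2)
        if x2 - x1 >= lenl:
            break
        # 4-3-2: raise the zero-crossing threshold, then move x1 earlier and x2 later
        # to the first frames whose zero-crossing rate is below it.
        gzcrt *= mu
        x1 = next((i for i in range(x1, -1, -1) if zcr[i] < gzcrt), x1)
        x2 = next((i for i in range(x2, n) if zcr[i] < gzcrt), x2)
    return x1, x2, lefvu, gzcrt
```

In the accompanying check, a 4-frame loud core flanked by weaker frames grows from (7, 10) to (3, 14) after two rounds with `lam=0.5`, at which point the length constraint is met.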
5. Output x1 as the start point and x2 as the end point of the isolated-word speech, and perform isolated-word recognition.
Finally, it should be noted that the above embodiment is intended only to illustrate, not to limit, the technical solution of the present invention. Other modifications or equivalent substitutions that those of ordinary skill in the art make to the technical solution of the present invention, provided they do not depart from its spirit and scope, shall all fall within the scope of the claims of the present invention.
Claims (8)
1. A self-adaptive endpoint detection method for isolated-word speech recognition, characterized in that the method comprises the following steps:
a. Speech input
Input a speech signal containing the isolated word to be recognized;
b. Speech preprocessing
Apply amplitude shifting, normalization and framing to the speech signal, and calculate the short-time average energy and short-time average zero-crossing rate of each speech frame;
c. Rough isolated-word endpoint detection
Roughly estimate the isolated-word endpoints using the short-time average energy and short-time average zero-crossing rate of each speech frame, together with a constraint on the minimum length of consecutive speech frames before and after the endpoints;
d. Self-adaptive adjustment of the detection thresholds and accurate endpoint detection
Adjust the detection thresholds dynamically using the limits on the minimum and maximum duration of the isolated word, and fine-tune the speech endpoints forward and backward to obtain accurate isolated-word endpoints;
e. Endpoint output and isolated-word speech recognition
Output the accurate isolated-word endpoints and perform isolated-word recognition using speech recognition technology.
2. The self-adaptive endpoint detection method for isolated-word speech recognition according to claim 1, characterized in that: in step c, a constraint on the length of consecutive speech frames before and after the endpoints is introduced when making the rough endpoint estimate.
3. The self-adaptive endpoint detection method for isolated-word speech recognition according to claim 1, characterized in that: in step e, when performing accurate endpoint detection, the detection thresholds are adjusted adaptively according to the length constraints on the isolated word.
4. The self-adaptive endpoint detection method for isolated-word speech recognition according to claim 3, characterized in that: when the detected isolated-word speech length is greater than the maximum length of an isolated word, the short-time energy high threshold is increased, the start point is moved backward and the end point is moved forward, so that the frame average energy at the start point and at the end point is each greater than the new high threshold.
5. The self-adaptive endpoint detection method for isolated-word speech recognition according to claim 3, characterized in that: when the detected isolated-word speech length is greater than the maximum length of an isolated word, the short-time zero-crossing rate threshold is reduced, the start point is moved backward and the end point is moved forward, so that the average zero-crossing rates of the frame before the start point and the frame after the end point are greater than the new short-time zero-crossing rate threshold.
6. The self-adaptive endpoint detection method for isolated-word speech recognition according to claim 3, characterized in that: when the detected isolated-word speech length is less than the minimum length of an isolated word, the short-time energy high threshold is reduced, the start point is moved forward and the end point is moved backward, so that the frame average energy at the start point and at the end point is each greater than the new high threshold.
7. The self-adaptive endpoint detection method for isolated-word speech recognition according to claim 3, characterized in that: when the detected isolated-word speech length is less than the minimum length of an isolated word, the short-time zero-crossing rate threshold is increased, the start point is moved forward and the end point is moved backward, so that the average zero-crossing rates of the frame before the start point and the frame after the end point are greater than the new short-time zero-crossing rate threshold.
8. A self-adaptive endpoint detection system for isolated-word speech recognition that implements the method of claim 1, characterized in that the system comprises:
an input device for the isolated-word speech signal to be recognized;
a device that applies amplitude shifting, normalization and framing to the speech signal and calculates the short-time average energy and short-time average zero-crossing rate of each speech frame;
a device that roughly estimates the isolated-word endpoints using the short-time average energy and short-time average zero-crossing rate of each speech frame, together with a constraint on the minimum length of consecutive speech frames before and after the endpoints;
a device that adjusts the detection thresholds dynamically using the limits on the minimum and maximum duration of the isolated word and fine-tunes the speech endpoints forward and backward to obtain accurate isolated-word endpoints;
a device that outputs the accurate isolated-word endpoints and performs isolated-word recognition using speech recognition technology.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210085584.8A CN103366739B (en) | 2012-03-28 | 2012-03-28 | Self-adaptive endpoint detection method and system for isolated-word speech recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210085584.8A CN103366739B (en) | 2012-03-28 | 2012-03-28 | Self-adaptive endpoint detection method and system for isolated-word speech recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103366739A (en) | 2013-10-23 |
CN103366739B (en) | 2015-12-09 |
Family
ID=49367940
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210085584.8A Expired - Fee Related CN103366739B (en) | Self-adaptive endpoint detection method and system for isolated-word speech recognition | 2012-03-28 | 2012-03-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103366739B (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104700830A (en) * | 2013-12-06 | 2015-06-10 | 中国移动通信集团公司 | Voice endpoint detection method and voice endpoint detection device |
CN105118502A (en) * | 2015-07-14 | 2015-12-02 | 百度在线网络技术(北京)有限公司 | End point detection method and system of voice identification system |
CN106448659A (en) * | 2016-12-19 | 2017-02-22 | 广东工业大学 | Speech endpoint detection method based on short-time energy and fractal dimensions |
CN106601234A (en) * | 2016-11-16 | 2017-04-26 | 华南理工大学 | Implementation method of placename speech modeling system for goods sorting |
CN106601233A (en) * | 2016-12-22 | 2017-04-26 | 北京元心科技有限公司 | Voice command recognition method and device and electronic equipment |
CN106847270A (en) * | 2016-12-09 | 2017-06-13 | 华南理工大学 | A kind of double threshold place name sound end detecting method |
CN107045870A (en) * | 2017-05-23 | 2017-08-15 | 南京理工大学 | A kind of the Method of Speech Endpoint Detection of feature based value coding |
CN108665889A (en) * | 2018-04-20 | 2018-10-16 | 百度在线网络技术(北京)有限公司 | The Method of Speech Endpoint Detection, device, equipment and storage medium |
CN108847217A (en) * | 2018-05-31 | 2018-11-20 | 平安科技(深圳)有限公司 | A kind of phonetic segmentation method, apparatus, computer equipment and storage medium |
CN108847218A (en) * | 2018-06-27 | 2018-11-20 | 郑州云海信息技术有限公司 | A kind of adaptive threshold adjusting sound end detecting method, equipment and readable storage medium storing program for executing |
CN111276164A (en) * | 2020-02-15 | 2020-06-12 | 中国人民解放军空军特色医学中心 | Self-adaptive voice activation detection device and method for high-noise environment on airplane |
CN111402931A (en) * | 2020-03-05 | 2020-07-10 | 云知声智能科技股份有限公司 | Voice boundary detection method and system assisted by voice portrait |
CN111613250A (en) * | 2020-07-06 | 2020-09-01 | 泰康保险集团股份有限公司 | Long voice endpoint detection method and device, storage medium and electronic equipment |
CN114495907A (en) * | 2022-01-27 | 2022-05-13 | 多益网络有限公司 | Adaptive voice activity detection method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5970447A (en) * | 1998-01-20 | 1999-10-19 | Advanced Micro Devices, Inc. | Detection of tonal signals |
CN101206858A (en) * | 2007-12-12 | 2008-06-25 | 北京中星微电子有限公司 | Method and system for testing alone word voice endpoint |
CN101226741A (en) * | 2007-12-28 | 2008-07-23 | 无敌科技(西安)有限公司 | Method for detecting movable voice endpoint |
CN101625857A (en) * | 2008-07-10 | 2010-01-13 | 新奥特(北京)视频技术有限公司 | Self-adaptive voice endpoint detection method |
CN101625860A (en) * | 2008-07-10 | 2010-01-13 | 新奥特(北京)视频技术有限公司 | Method for self-adaptively adjusting background noise in voice endpoint detection |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104700830A (en) * | 2013-12-06 | 2015-06-10 | 中国移动通信集团公司 | Voice endpoint detection method and voice endpoint detection device |
CN104700830B (en) * | 2013-12-06 | 2018-07-24 | 中国移动通信集团公司 | A kind of sound end detecting method and device |
CN105118502B (en) * | 2015-07-14 | 2017-05-10 | 百度在线网络技术(北京)有限公司 | End point detection method and system of voice identification system |
CN105118502A (en) * | 2015-07-14 | 2015-12-02 | 百度在线网络技术(北京)有限公司 | End point detection method and system of voice identification system |
CN106601234A (en) * | 2016-11-16 | 2017-04-26 | 华南理工大学 | Implementation method of placename speech modeling system for goods sorting |
CN106847270B (en) * | 2016-12-09 | 2020-08-18 | 华南理工大学 | Double-threshold place name voice endpoint detection method |
CN106847270A (en) * | 2016-12-09 | 2017-06-13 | 华南理工大学 | A kind of double threshold place name sound end detecting method |
CN106448659A (en) * | 2016-12-19 | 2017-02-22 | 广东工业大学 | Speech endpoint detection method based on short-time energy and fractal dimensions |
CN106601233A (en) * | 2016-12-22 | 2017-04-26 | 北京元心科技有限公司 | Voice command recognition method and device and electronic equipment |
CN107045870A (en) * | 2017-05-23 | 2017-08-15 | 南京理工大学 | A kind of the Method of Speech Endpoint Detection of feature based value coding |
CN108665889A (en) * | 2018-04-20 | 2018-10-16 | 百度在线网络技术(北京)有限公司 | The Method of Speech Endpoint Detection, device, equipment and storage medium |
CN108665889B (en) * | 2018-04-20 | 2021-09-28 | 百度在线网络技术(北京)有限公司 | Voice signal endpoint detection method, device, equipment and storage medium |
CN108847217A (en) * | 2018-05-31 | 2018-11-20 | 平安科技(深圳)有限公司 | A kind of phonetic segmentation method, apparatus, computer equipment and storage medium |
CN108847218A (en) * | 2018-06-27 | 2018-11-20 | 郑州云海信息技术有限公司 | A kind of adaptive threshold adjusting sound end detecting method, equipment and readable storage medium storing program for executing |
CN108847218B (en) * | 2018-06-27 | 2020-07-21 | 苏州浪潮智能科技有限公司 | Self-adaptive threshold setting voice endpoint detection method, equipment and readable storage medium |
CN111276164A (en) * | 2020-02-15 | 2020-06-12 | 中国人民解放军空军特色医学中心 | Self-adaptive voice activation detection device and method for high-noise environment on airplane |
CN111276164B (en) * | 2020-02-15 | 2021-08-03 | 中国人民解放军空军特色医学中心 | Self-adaptive voice activation detection device and method for high-noise environment on airplane |
CN111402931A (en) * | 2020-03-05 | 2020-07-10 | 云知声智能科技股份有限公司 | Voice boundary detection method and system assisted by voice portrait |
CN111613250A (en) * | 2020-07-06 | 2020-09-01 | 泰康保险集团股份有限公司 | Long voice endpoint detection method and device, storage medium and electronic equipment |
CN111613250B (en) * | 2020-07-06 | 2023-07-18 | 泰康保险集团股份有限公司 | Long voice endpoint detection method and device, storage medium and electronic equipment |
CN114495907A (en) * | 2022-01-27 | 2022-05-13 | 多益网络有限公司 | Adaptive voice activity detection method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN103366739B (en) | 2015-12-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103366739B (en) | Self-adaptive endpoint detection method and system for isolated-word speech recognition | |
CN105529028B (en) | Speech analysis method and apparatus | |
CN103811003B (en) | A kind of audio recognition method and electronic equipment | |
CN102314884B (en) | Voice-activation detecting method and device | |
CN105810213A (en) | Typical abnormal sound detection method and device | |
CN104091603B (en) | Endpoint detection system and its computational methods based on fundamental frequency | |
US9349384B2 (en) | Method and system for object-dependent adjustment of levels of audio objects | |
CN105118502A (en) | End point detection method and system of voice identification system | |
CN104021789A (en) | Self-adaption endpoint detection method using short-time time-frequency value | |
CN106098076B (en) | One kind estimating time-frequency domain adaptive voice detection method based on dynamic noise | |
CN105810201B (en) | Voice activity detection method and its system | |
Zaw et al. | The combination of spectral entropy, zero crossing rate, short time energy and linear prediction error for voice activity detection | |
JP2019053321A (en) | Method for detecting audio signal and apparatus | |
CN106448659A (en) | Speech endpoint detection method based on short-time energy and fractal dimensions | |
CN107102713A (en) | It is a kind of to reduce the method and device of power consumption | |
CN111540342A (en) | Energy threshold adjusting method, device, equipment and medium | |
CN108847218B (en) | Self-adaptive threshold setting voice endpoint detection method, equipment and readable storage medium | |
CN107331393B (en) | Self-adaptive voice activity detection method | |
CN110738986B (en) | Long voice labeling device and method | |
CN106571138B (en) | Signal endpoint detection method, detection device and detection equipment | |
CN106504756A (en) | Built-in speech recognition system and method | |
Guo et al. | A improved dual-threshold speech endpoint detection algorithm | |
Sharma et al. | Automatic identification of silence, unvoiced and voiced chunks in speech | |
US20230223014A1 (en) | Adapting Automated Speech Recognition Parameters Based on Hotword Properties | |
TWI684912B (en) | Voice wake-up apparatus and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CB03 | Change of inventor or designer information | Inventor after: Huo Xiaosi; Yin Mingli; Liu Junjiang; Zhang Juzhou; Huang Xinchao. Inventor before: Huo Xiaosi; Yin Mingli; Liu Junjiang |
COR | Change of bibliographic data | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20151209. Termination date: 20180328 |