CN107799124A - A VAD detection method applied to an intelligent voice mouse - Google Patents
A VAD detection method applied to an intelligent voice mouse
- Publication number
- CN107799124A (application number CN201710948479.5A)
- Authority
- CN
- China
- Prior art keywords
- audio
- intelligent sound
- vad
- threshold value
- mouse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
Abstract
The invention discloses a VAD (voice activity detection) method applied to an intelligent voice mouse. Each audio block in a segment of audio collected by the intelligent voice mouse is divided into frames, and the average frame energy of the block is calculated and compared with a preset threshold. If the average energy is greater than or equal to the preset threshold, the VAD result for the block is set to 1; if it is less than the preset threshold, the VAD result is set to 0. The volume and continuity of the collected sound are then judged from the number of consecutive 1s in the segment, or from how many runs of consecutive 1s it contains. By detecting the volume and continuity of the voice collected by the intelligent voice mouse, the invention can correctly judge whether the user is interacting with the machine in the correct way and give the user reasonable prompts, improving the accuracy of the intelligent voice mouse.
Description
Technical field
The invention belongs to the technical field of speech recognition and relates to a VAD detection method, specifically a VAD detection method applied to an intelligent voice mouse.
Background technology
When an intelligent voice mouse is used for human-computer interaction and the volume of the collected audio is too low, the audio cannot be recognized correctly after it is uploaded to the server. The intelligent voice mouse therefore needs to give an appropriate prompt asking the speaker to speak louder.
VAD detection examines the collected audio and, by processing it, judges whether the audio meets the recognition requirements. Existing VAD detection mainly calculates the energy value and short-time zero-crossing rate of a segment of collected audio and checks whether they exceed thresholds in order to decide whether a point is a speech endpoint. It can only determine the endpoints of a segment of audio; it cannot judge the volume of that audio, and so it cannot prompt the user to interact with the machine by voice correctly.
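The endpoint-style detection described above can be illustrated with a minimal Python sketch. The source does not give formulas, so this assumes short-time energy is the sum of squared samples and that a frame counts as speech when either feature exceeds its threshold; the function names and the decision rule are hypothetical.

```python
def short_time_energy(frame):
    """Sum of squared sample values in one frame (assumed definition)."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def is_speech_frame(frame, energy_threshold, zcr_threshold):
    """Hypothetical rule: speech if either feature exceeds its threshold."""
    return (short_time_energy(frame) > energy_threshold
            or zero_crossing_rate(frame) > zcr_threshold)


def find_endpoints(frames, energy_threshold, zcr_threshold):
    """Return (first, last) indices of speech frames, or None if there are none."""
    flags = [is_speech_frame(f, energy_threshold, zcr_threshold) for f in frames]
    if True not in flags:
        return None
    first = flags.index(True)
    last = len(flags) - 1 - flags[::-1].index(True)
    return first, last
```

A detector of this kind reports where speech starts and stops but says nothing about how loud it was, which is the gap the background section points to.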
Summary of the invention
The invention provides a VAD detection method applied to an intelligent voice mouse. By detecting the volume and continuity of the voice, it judges whether the user is interacting with the machine in the correct way and so promotes correct human-computer interaction.
The purpose of the invention can be achieved through the following technical solution:
A VAD detection method applied to an intelligent voice mouse, comprising the following steps:
Step S1: the intelligent voice mouse collects the audio data of the user's speech through its built-in high-performance microphone and divides it into several audio blocks;
Step S2: any one audio block in the segment of audio data collected by the intelligent voice mouse is divided into frames;
Step S3: the short-time energy value of each frame of the audio block is calculated;
Step S4: the energy sum of all frames of the audio block is calculated;
Step S5: the average energy per frame of the audio block is calculated;
Step S6: the average energy calculated in step S5 is compared with a preset threshold; if the average energy is greater than or equal to the preset threshold, the VAD result is set to 1, and if it is less than the preset threshold, the VAD result is set to 0;
Step S7: VAD detection is performed in turn on the other audio blocks in the segment of audio data, repeating steps S2 to S6;
Step S8: the VAD results of all audio blocks in the segment undergo secondary processing: the results are listed, the volume and continuity of the collected sound are judged from the number of consecutive 1s or the number of runs of consecutive 1s, and corresponding feedback is output to prompt the user to interact with the machine by voice correctly.
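Steps S1 to S7 above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the source does not define short-time energy, so the sum of squared samples is assumed, and the function names, block size, frame size and threshold are all hypothetical parameters.

```python
def split_into_blocks(samples, block_size):
    """Step S1: divide the captured samples into fixed-size audio blocks."""
    return [samples[i:i + block_size] for i in range(0, len(samples), block_size)]


def split_into_frames(block, frame_size):
    """Step S2: divide one audio block into frames."""
    return [block[i:i + frame_size] for i in range(0, len(block), frame_size)]


def block_vad_flag(block, frame_size, threshold):
    """Steps S3 to S6 for one block: returns 1 or 0."""
    frames = split_into_frames(block, frame_size)
    if not frames:
        return 0
    # S3: short-time energy of each frame (assumed: sum of squared samples)
    frame_energies = [sum(s * s for s in frame) for frame in frames]
    # S4: energy sum over all frames of the block
    energy_sum = sum(frame_energies)
    # S5: average energy per frame
    mean_energy = energy_sum / len(frames)
    # S6: compare with the preset threshold (greater than or equal gives 1)
    return 1 if mean_energy >= threshold else 0


def vad_flags(samples, block_size, frame_size, threshold):
    """Step S7: repeat S2 to S6 for every block of the segment."""
    return [block_vad_flag(b, frame_size, threshold)
            for b in split_into_blocks(samples, block_size)]
```

For example, `vad_flags([0] * 8 + [10] * 8 + [1] * 8, 8, 4, 100)` yields one flag per block, with only the loud middle block marked 1.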
Beneficial effects of the invention: the VAD detection method for an intelligent voice mouse provided by the invention detects the volume and continuity of the voice collected by the intelligent voice mouse, so it can correctly judge whether the user is interacting with the machine in the correct way and give the user reasonable prompts, improving the accuracy of the intelligent voice mouse.
Brief description of the drawings
The invention is described in further detail below with reference to the accompanying drawing and a specific embodiment.
Fig. 1 is a flow chart of the method of the invention.
Embodiment
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawing. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the invention, without creative effort, fall within the scope of protection of the invention.
As shown in Fig. 1, the invention provides a VAD detection method applied to an intelligent voice mouse, comprising the following steps:
Step S1: the intelligent voice mouse collects the audio data of the user's speech through its built-in high-performance microphone and divides it into several audio blocks.
Step S2: any one audio block in the audio data collected by the intelligent voice mouse is divided into frames. For a sample rate of 16000 Hz and 16-bit samples, one frame is 512 B.
Step S3: the short-time energy value of each frame of the audio block is calculated.
Step S4: the energy sum of all frames of the audio block is calculated. An audio block of size 4096 B can be divided into 8 frames; the short-time energy values of these 8 frames are added to give their energy sum.
Step S5: the average energy per frame of the audio block is calculated. For an audio block of size 4096 B divided into 8 frames, the energy sum of the 8 frames is divided by 8 to give the average.
Step S6: the average energy calculated in step S5 is compared with a preset threshold. If the average energy is greater than or equal to the preset threshold, the VAD result is set to 1; if it is less than the preset threshold, the VAD result is set to 0.
Step S7: VAD detection is performed in turn on the other audio blocks in the segment of audio data, repeating steps S2 to S6.
Step S8: the VAD results of all audio blocks in the segment undergo secondary processing: the results are listed, the volume and continuity of the collected sound are judged from the number of consecutive 1s or the number of runs of consecutive 1s, and corresponding feedback is output to prompt the user to interact with the machine by voice correctly.
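The byte sizes given in the embodiment can be checked with a short Python sketch. The 512 B frame, 4096 B block, 16-bit samples and 16000 Hz rate come from the text; the assumptions that the audio is mono, signed, little-endian PCM and that energy is the sum of squared samples are not stated in the source, and the function names are hypothetical.

```python
import struct

FRAME_BYTES = 512       # one frame, as in the embodiment
BLOCK_BYTES = 4096      # one audio block, as in the embodiment
BYTES_PER_SAMPLE = 2    # 16-bit samples
SAMPLE_RATE_HZ = 16000

# Derived quantities (the duration assumes a single channel).
samples_per_frame = FRAME_BYTES // BYTES_PER_SAMPLE
frame_duration_ms = 1000 * samples_per_frame / SAMPLE_RATE_HZ
frames_per_block = BLOCK_BYTES // FRAME_BYTES


def frames_from_block(block: bytes):
    """Split a raw PCM block into frames of decoded 16-bit samples.

    Assumes signed little-endian mono PCM; any trailing partial frame is dropped.
    """
    frames = []
    for start in range(0, len(block) - FRAME_BYTES + 1, FRAME_BYTES):
        chunk = block[start:start + FRAME_BYTES]
        frames.append(struct.unpack("<%dh" % samples_per_frame, chunk))
    return frames


def mean_frame_energy(block: bytes) -> float:
    """Steps S3 to S5: energy sum over all frames divided by the frame count."""
    frames = frames_from_block(block)
    total = sum(sum(s * s for s in frame) for frame in frames)
    return total / len(frames) if frames else 0.0
```

Under these assumptions a 512 B frame holds 256 samples, which is 16 ms of audio at 16000 Hz, and a 4096 B block splits into the 8 frames the embodiment mentions.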
The VAD detection method for an intelligent voice mouse provided by the invention detects the volume and continuity of the voice collected by the intelligent voice mouse, so it can correctly judge whether the user is interacting with the machine in the correct way and give the user reasonable prompts, improving the accuracy of the intelligent voice mouse.
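The volume and continuity judgment of step S8 is not spelled out in the source. One possible reading, sketched in Python below, measures the runs of consecutive 1s in the per-block results and picks a prompt from them. The limits `min_run` and `max_runs` and the prompt texts are hypothetical, chosen only to make the idea concrete.

```python
from itertools import groupby


def runs_of_ones(flags):
    """Lengths of each maximal run of consecutive 1s in the flag list."""
    return [sum(1 for _ in group) for value, group in groupby(flags) if value == 1]


def feedback(flags, min_run=3, max_runs=2):
    """Hypothetical step-S8 rule; the limits and messages are illustrative only."""
    runs = runs_of_ones(flags)
    # No sufficiently long run of loud blocks: treat the volume as too low.
    if not runs or max(runs) < min_run:
        return "Volume too low: please speak louder."
    # Many separate runs: treat the speech as discontinuous.
    if len(runs) > max_runs:
        return "Speech is intermittent: please speak continuously."
    return "OK"
```

With these illustrative limits, `feedback([1, 1, 1, 1, 0, 0])` accepts the input, while a list with only isolated 1s triggers the low-volume prompt. In practice the limits would need tuning against real recordings.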
In this specification, reference to the terms "one embodiment", "example", "specific example" and the like means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Schematic uses of these terms in this specification do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
The above is only an example and explanation of the structure of the invention. Those skilled in the art may make various modifications or additions to the specific embodiments described, or substitute them in a similar manner; provided these do not depart from the structure of the invention or go beyond the scope defined in the claims, they all fall within the scope of protection of the invention.
Claims (1)
1. A VAD detection method applied to an intelligent voice mouse, characterized in that it comprises the following steps:
Step S1: the intelligent voice mouse collects the audio data of the user's speech through its built-in high-performance microphone and divides it into several audio blocks;
Step S2: any one audio block in the segment of audio data collected by the intelligent voice mouse is divided into frames;
Step S3: the short-time energy value of each frame of the audio block is calculated;
Step S4: the energy sum of all frames of the audio block is calculated;
Step S5: the average energy per frame of the audio block is calculated;
Step S6: the average energy calculated in step S5 is compared with a preset threshold; if the average energy is greater than or equal to the preset threshold, the VAD result is set to 1, and if it is less than the preset threshold, the VAD result is set to 0;
Step S7: VAD detection is performed in turn on the other audio blocks in the segment of audio data, repeating steps S2 to S6;
Step S8: the VAD results of all audio blocks in the segment undergo secondary processing: the results are listed, the volume and continuity of the collected sound are judged from the number of consecutive 1s or the number of runs of consecutive 1s, and corresponding feedback is output to prompt the user to interact with the machine by voice correctly.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710948479.5A CN107799124A (en) | 2017-10-12 | 2017-10-12 | A kind of VAD detection methods applied to intelligent sound mouse |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710948479.5A CN107799124A (en) | 2017-10-12 | 2017-10-12 | A kind of VAD detection methods applied to intelligent sound mouse |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107799124A true CN107799124A (en) | 2018-03-13 |
Family
ID=61532634
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710948479.5A Pending CN107799124A (en) | 2017-10-12 | 2017-10-12 | A kind of VAD detection methods applied to intelligent sound mouse |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107799124A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109587603A (en) * | 2018-12-10 | 2019-04-05 | 北京达佳互联信息技术有限公司 | Method for controlling volume, device and storage medium |
CN110556110A (en) * | 2019-10-24 | 2019-12-10 | 北京九狐时代智能科技有限公司 | Voice processing method and device, intelligent terminal and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1403953A (en) * | 2002-09-06 | 2003-03-19 | 浙江大学 | Palm acoustic-print verifying system |
CN102565759A (en) * | 2011-12-29 | 2012-07-11 | 东南大学 | Binaural sound source localization method based on sub-band signal to noise ratio estimation |
CN202584048U (en) * | 2012-05-17 | 2012-12-05 | 大连民族学院 | Smart mouse based on DSP image location and voice recognition |
CN105070287A (en) * | 2015-07-03 | 2015-11-18 | 广东小天才科技有限公司 | Method and device of detecting voice end points in a self-adaptive noisy environment |
CN105869658A (en) * | 2016-04-01 | 2016-08-17 | 金陵科技学院 | Voice endpoint detection method employing nonlinear feature |
CN205943456U (en) * | 2016-08-24 | 2017-02-08 | 安徽咪鼠科技有限公司 | Pronunciation are gathered and preprocessing device based on intelligence pronunciation mouse |
CN107045870A (en) * | 2017-05-23 | 2017-08-15 | 南京理工大学 | A kind of the Method of Speech Endpoint Detection of feature based value coding |
US20170257470A1 (en) * | 2008-04-08 | 2017-09-07 | Lg Electronics Inc. | Mobile terminal and menu control method thereof |
-
2017
- 2017-10-12 CN CN201710948479.5A patent/CN107799124A/en active Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1403953A (en) * | 2002-09-06 | 2003-03-19 | 浙江大学 | Palm acoustic-print verifying system |
US20170257470A1 (en) * | 2008-04-08 | 2017-09-07 | Lg Electronics Inc. | Mobile terminal and menu control method thereof |
CN102565759A (en) * | 2011-12-29 | 2012-07-11 | 东南大学 | Binaural sound source localization method based on sub-band signal to noise ratio estimation |
CN202584048U (en) * | 2012-05-17 | 2012-12-05 | 大连民族学院 | Smart mouse based on DSP image location and voice recognition |
CN105070287A (en) * | 2015-07-03 | 2015-11-18 | 广东小天才科技有限公司 | Method and device of detecting voice end points in a self-adaptive noisy environment |
CN105869658A (en) * | 2016-04-01 | 2016-08-17 | 金陵科技学院 | Voice endpoint detection method employing nonlinear feature |
CN205943456U (en) * | 2016-08-24 | 2017-02-08 | 安徽咪鼠科技有限公司 | Pronunciation are gathered and preprocessing device based on intelligence pronunciation mouse |
CN107045870A (en) * | 2017-05-23 | 2017-08-15 | 南京理工大学 | A kind of the Method of Speech Endpoint Detection of feature based value coding |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108564942B (en) | Voice emotion recognition method and system based on adjustable sensitivity | |
CN109767785A (en) | Ambient noise method for identifying and classifying based on convolutional neural networks | |
CN105374352B (en) | A kind of voice activated method and system | |
CN106503805A (en) | A kind of bimodal based on machine learning everybody talk with sentiment analysis system and method | |
CN103810994B (en) | Speech emotional inference method based on emotion context and system | |
CN108428448A (en) | A kind of sound end detecting method and audio recognition method | |
CN107767863A (en) | voice awakening method, system and intelligent terminal | |
CN108074576A (en) | Inquest the speaker role's separation method and system under scene | |
CN108172242B (en) | Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method | |
CN103236258B (en) | Based on the speech emotional characteristic extraction method that Pasteur's distance wavelet packets decomposes | |
CN103700370A (en) | Broadcast television voice recognition method and system | |
CN109065051B (en) | Voice recognition processing method and device | |
CN102937320B (en) | Health protection method used for intelligent air conditioner | |
CN104538034A (en) | Voice recognition method and system | |
CN106328151A (en) | Environment de-noising system and application method | |
CN108021635A (en) | The definite method, apparatus and storage medium of a kind of audio similarity | |
CN106548786A (en) | A kind of detection method and system of voice data | |
CN110473536A (en) | A kind of awakening method, device and smart machine | |
CN105654947A (en) | Method and system for acquiring traffic information in traffic broadcast speech | |
CN107799124A (en) | A kind of VAD detection methods applied to intelligent sound mouse | |
CN110910891A (en) | Speaker segmentation labeling method and device based on long-time memory neural network | |
CN108831450A (en) | A kind of virtual robot man-machine interaction method based on user emotion identification | |
CN109841221A (en) | Parameter adjusting method, device and body-building equipment based on speech recognition | |
CN106653000A (en) | Emotion intensity test method based on voice information | |
CN106531195A (en) | Dialogue conflict detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180313 |