CN107799124A - A VAD detection method applied to an intelligent voice mouse - Google Patents

A VAD detection method applied to an intelligent voice mouse

Info

Publication number
CN107799124A
Authority
CN
China
Prior art keywords
audio
intelligent voice
vad
threshold value
mouse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710948479.5A
Other languages
Chinese (zh)
Inventor
冯海洪
朱国冉
许成亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Mic Technology Co Ltd
Original Assignee
Anhui Mic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Mic Technology Co Ltd
Priority to CN201710948479.5A
Publication of CN107799124A

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78: Detection of presence or absence of voice signals
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78: Detection of presence or absence of voice signals
    • G10L2025/783: Detection of presence or absence of voice signals based on threshold decision

Abstract

The invention discloses a VAD (voice activity detection) method applied to an intelligent voice mouse. Each audio block of an audio segment captured by the intelligent voice mouse is divided into frames, and the average frame energy of the block is calculated and compared with a preset threshold: if it is greater than or equal to the threshold, the VAD result for the block is set to 1; if it is less than the threshold, the VAD result is set to 0. The volume and continuity of the captured sound are then judged from the number of consecutive 1s in the segment, or from how many blocks in a row are 1. By detecting the volume of the speech captured by the intelligent voice mouse and the continuity of the sound, the invention can correctly judge whether the user is interacting with the machine in the correct manner and give the user reasonable prompts, improving the accuracy of the intelligent voice mouse.

Description

A VAD detection method applied to an intelligent voice mouse
Technical field
The invention belongs to the technical field of speech recognition and relates to a VAD (voice activity detection) method, specifically a VAD detection method applied to an intelligent voice mouse.
Background
When an intelligent voice mouse is used for human-machine interaction and the volume of the audio it captures is too low, the audio cannot be recognized correctly after it is uploaded to the server. The intelligent voice mouse therefore needs to give the speaker an appropriate prompt to speak louder.
VAD detection examines the captured audio and, by processing it, judges whether the audio meets the requirements for recognition. Existing VAD detection mainly computes the energy value and the short-time zero-crossing rate of a captured audio segment and checks whether they exceed thresholds in order to decide whether a point is a speech endpoint. It can only determine the endpoints of an audio segment; it cannot judge how loud the segment is, and so it cannot prompt the user to interact with the machine by voice correctly.
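As a rough illustration of the two features that the existing endpoint-detection approach relies on, the sketch below computes a short-time energy and a zero-crossing rate for one frame of samples. The formulas (sum of squared samples, and the fraction of adjacent sample pairs that change sign) are common textbook definitions assumed here for illustration; the source does not give formulas, and the function names are hypothetical.

```python
from typing import Sequence


def short_time_energy(frame: Sequence[int]) -> int:
    """Sum of squared sample amplitudes over one frame (assumed definition)."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame: Sequence[int]) -> float:
    """Fraction of adjacent sample pairs whose signs differ (assumed definition)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)
```

Both values describe a single frame; neither one, compared against a threshold, says how loud a whole utterance was or whether the speech was continuous, which is the gap described above.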
Summary of the invention
The invention provides a VAD detection method applied to an intelligent voice mouse. By detecting the volume of the speech and the continuity of the sound, it judges whether the user is interacting with the machine in the correct manner and thereby promotes correct human-machine interaction.
The purpose of the invention can be achieved through the following technical solution:
A VAD detection method applied to an intelligent voice mouse, comprising the following steps:
Step S1: the intelligent voice mouse captures the audio data of the user's speech through its built-in high-performance microphone and divides it into a number of audio blocks;
Step S2: any one audio block of the audio segment captured by the intelligent voice mouse is divided into frames;
Step S3: the short-time energy of each frame of the framed audio block is calculated;
Step S4: the sum of the energies of all frames of the framed audio block is calculated;
Step S5: the average energy per frame of the framed audio block is calculated;
Step S6: the average energy calculated in step S5 is compared with a preset threshold; if the average energy is greater than or equal to the preset threshold, the VAD result is set to 1, and if the average energy is less than the preset threshold, the VAD result is set to 0;
Step S7: VAD detection is performed in turn on the other audio blocks of the audio segment captured by the intelligent voice mouse by repeating steps S2 to S6;
Step S8: the VAD results of all audio blocks in the captured audio segment undergo secondary processing: the VAD results of all audio blocks in the segment are listed, the volume and continuity of the captured sound are judged from the number of consecutive 1s or from how many blocks in a row are 1, corresponding feedback is output, and the user is prompted to interact with the machine by voice correctly.
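The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: samples are taken as plain integers, frames are non-overlapping, short-time energy is assumed to be the sum of squared samples, and all function names are hypothetical. Because the source does not say exactly how the consecutive 1s are evaluated or which feedback they map to, the sketch only reports several run statistics and stops there.

```python
from itertools import groupby
from typing import Dict, List, Sequence


def block_vad_flag(block: Sequence[int], frame_len: int, threshold: float) -> int:
    """Steps S2 to S6 for one audio block: returns 1 or 0."""
    # S2: split into non-overlapping frames (assumption); a trailing
    # partial frame is dropped.
    frames = [
        block[i:i + frame_len]
        for i in range(0, len(block) - frame_len + 1, frame_len)
    ]
    if not frames:
        return 0
    # S3: short-time energy of each frame (assumed: sum of squares).
    energies = [sum(s * s for s in frame) for frame in frames]
    # S4: energy sum over all frames of the block.
    total = sum(energies)
    # S5: average energy per frame.
    mean_energy = total / len(frames)
    # S6: compare with the preset threshold.
    return 1 if mean_energy >= threshold else 0


def summarize_flags(flags: Sequence[int]) -> Dict[str, int]:
    """Step S8 statistics: lengths and count of runs of consecutive 1s."""
    runs: List[int] = [
        sum(1 for _ in group) for key, group in groupby(flags) if key == 1
    ]
    return {
        "total_ones": sum(runs),
        "run_count": len(runs),
        "longest_run": max(runs, default=0),
    }
```

Step S7 then amounts to calling `block_vad_flag` on every block in order and passing the resulting list of flags to `summarize_flags`. How those statistics translate into a prompt such as "speak louder" depends on thresholds the source does not specify.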
Beneficial effects of the invention: the VAD detection method for an intelligent voice mouse provided by the invention detects the volume of the speech captured by the intelligent voice mouse and the continuity of the sound. It can therefore correctly judge whether the user is interacting with the machine in the correct manner and give the user reasonable prompts, improving the accuracy of the intelligent voice mouse.
Brief description of the drawings
The invention is described in further detail below with reference to the accompanying drawing and a specific embodiment.
Fig. 1 is a flow chart of the method of the invention.
Detailed description of the embodiments
The technical solution in the embodiments of the invention is described clearly and completely below with reference to the accompanying drawing. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
As shown in Fig. 1, the invention provides a VAD detection method applied to an intelligent voice mouse, comprising the following steps:
Step S1: the intelligent voice mouse captures the audio data of the user's speech through its built-in high-performance microphone and divides it into a number of audio blocks.
Step S2: any one audio block of the audio data captured by the intelligent voice mouse is divided into frames. For a sample rate of 16000 Hz and 16-bit samples, one frame is 512 B.
Step S3: the short-time energy of each frame of the framed audio block is calculated.
Step S4: the sum of the energies of all frames of the framed audio block is calculated. An audio block of 4096 B can be divided into 8 frames; the short-time energies of these 8 frames are added to give their energy sum.
Step S5: the average energy per frame of the framed audio block is calculated. For an audio block of 4096 B divided into 8 frames, the energy sum of the 8 frames is divided by 8 to give the average.
Step S6: the average energy calculated in step S5 is compared with a preset threshold. If the average energy is greater than or equal to the preset threshold, the VAD result is set to 1; if the average energy is less than the preset threshold, the VAD result is set to 0.
Step S7: VAD detection is performed in turn on the other audio blocks of the audio segment captured by the intelligent voice mouse by repeating steps S2 to S6.
Step S8: the VAD results of all audio blocks in the captured audio segment undergo secondary processing: the VAD results of all audio blocks in the segment are listed, the volume and continuity of the captured sound are judged from the number of consecutive 1s or from how many blocks in a row are 1, corresponding feedback is output, and the user is prompted to interact with the machine by voice correctly.
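The figures given in the embodiment can be checked with a little arithmetic: at 16-bit resolution a 512 B frame holds 256 samples, which at 16000 Hz is 16 ms, and a 4096 B block holds 8 such frames, or 128 ms. The sketch below works this out and decodes one block into frames. It assumes mono, signed little-endian PCM, which the source does not state (only the sample rate and bit depth are given); the constant and function names are hypothetical, and short-time energy is again assumed to be the sum of squared samples.

```python
import struct
from typing import List, Tuple

SAMPLE_RATE_HZ = 16000   # from the embodiment
BYTES_PER_SAMPLE = 2     # 16-bit samples
FRAME_BYTES = 512        # from the embodiment
BLOCK_BYTES = 4096       # from the embodiment

samples_per_frame = FRAME_BYTES // BYTES_PER_SAMPLE    # 256 samples
frame_ms = 1000 * samples_per_frame / SAMPLE_RATE_HZ   # 16.0 ms
frames_per_block = BLOCK_BYTES // FRAME_BYTES          # 8 frames
block_ms = frames_per_block * frame_ms                 # 128.0 ms


def decode_frames(block: bytes) -> List[Tuple[int, ...]]:
    """Split a raw PCM block into frames of signed 16-bit samples.

    Assumes mono, little-endian PCM; a trailing partial frame is dropped.
    """
    fmt = "<%dh" % samples_per_frame
    return [
        struct.unpack(fmt, block[off:off + FRAME_BYTES])
        for off in range(0, len(block) - FRAME_BYTES + 1, FRAME_BYTES)
    ]


def mean_frame_energy(block: bytes) -> float:
    """Steps S3 to S5: per-frame energy, energy sum, average per frame."""
    frames = decode_frames(block)
    if not frames:
        return 0.0
    energies = [sum(s * s for s in frame) for frame in frames]
    return sum(energies) / len(frames)
```

For a 4096 B block the division in `mean_frame_energy` is by 8, matching step S5; the result would then be compared with the preset threshold as in step S6.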
The VAD detection method for an intelligent voice mouse provided by the invention detects the volume of the speech captured by the intelligent voice mouse and the continuity of the sound. It can therefore correctly judge whether the user is interacting with the machine in the correct manner and give the user reasonable prompts, improving the accuracy of the intelligent voice mouse.
In this specification, references to "one embodiment", "an example", "a specific example", and the like mean that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the invention. Such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above is merely an example and explanation of the concept of the invention. Various modifications, additions, or similar substitutions made by those skilled in the art to the specific embodiments described, provided they do not depart from the concept of the invention or go beyond the scope defined by the claims, shall all fall within the scope of protection of the invention.

Claims (1)

1. A VAD detection method applied to an intelligent voice mouse, characterized in that it comprises the following steps:
Step S1: the intelligent voice mouse captures the audio data of the user's speech through its built-in high-performance microphone and divides it into a number of audio blocks;
Step S2: any one audio block of the audio segment captured by the intelligent voice mouse is divided into frames;
Step S3: the short-time energy of each frame of the framed audio block is calculated;
Step S4: the sum of the energies of all frames of the framed audio block is calculated;
Step S5: the average energy per frame of the framed audio block is calculated;
Step S6: the average energy calculated in step S5 is compared with a preset threshold; if the average energy is greater than or equal to the preset threshold, the VAD result is set to 1, and if the average energy is less than the preset threshold, the VAD result is set to 0;
Step S7: VAD detection is performed in turn on the other audio blocks of the audio segment captured by the intelligent voice mouse by repeating steps S2 to S6;
Step S8: the VAD results of all audio blocks in the captured audio segment undergo secondary processing: the VAD results of all audio blocks in the segment are listed, the volume and continuity of the captured sound are judged from the number of consecutive 1s or from how many blocks in a row are 1, corresponding feedback is output, and the user is prompted to interact with the machine by voice correctly.
CN201710948479.5A 2017-10-12 2017-10-12 A VAD detection method applied to an intelligent voice mouse Pending CN107799124A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710948479.5A CN107799124A (en) 2017-10-12 2017-10-12 A VAD detection method applied to an intelligent voice mouse

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710948479.5A CN107799124A (en) 2017-10-12 2017-10-12 A VAD detection method applied to an intelligent voice mouse

Publications (1)

Publication Number Publication Date
CN107799124A (en) 2018-03-13

Family

ID=61532634

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710948479.5A Pending CN107799124A (en) 2017-10-12 2017-10-12 A VAD detection method applied to an intelligent voice mouse

Country Status (1)

Country Link
CN (1) CN107799124A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109587603A (en) * 2018-12-10 2019-04-05 北京达佳互联信息技术有限公司 Method for controlling volume, device and storage medium
CN110556110A (en) * 2019-10-24 2019-12-10 北京九狐时代智能科技有限公司 Voice processing method and device, intelligent terminal and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1403953A (en) * 2002-09-06 2003-03-19 浙江大学 Palm acoustic-print verifying system
CN102565759A (en) * 2011-12-29 2012-07-11 东南大学 Binaural sound source localization method based on sub-band signal to noise ratio estimation
CN202584048U (en) * 2012-05-17 2012-12-05 大连民族学院 Smart mouse based on DSP image location and voice recognition
CN105070287A (en) * 2015-07-03 2015-11-18 广东小天才科技有限公司 Method and device of detecting voice end points in a self-adaptive noisy environment
CN105869658A (en) * 2016-04-01 2016-08-17 金陵科技学院 Voice endpoint detection method employing nonlinear feature
CN205943456U (en) * 2016-08-24 2017-02-08 安徽咪鼠科技有限公司 Pronunciation are gathered and preprocessing device based on intelligence pronunciation mouse
CN107045870A (en) * 2017-05-23 2017-08-15 南京理工大学 A kind of the Method of Speech Endpoint Detection of feature based value coding
US20170257470A1 (en) * 2008-04-08 2017-09-07 Lg Electronics Inc. Mobile terminal and menu control method thereof

Similar Documents

Publication Publication Date Title
CN108564942B (en) Voice emotion recognition method and system based on adjustable sensitivity
CN109767785A (en) Ambient noise method for identifying and classifying based on convolutional neural networks
CN105374352B (en) A kind of voice activated method and system
CN106503805A (en) A kind of bimodal based on machine learning everybody talk with sentiment analysis system and method
CN103810994B (en) Speech emotional inference method based on emotion context and system
CN108428448A (en) A kind of sound end detecting method and audio recognition method
CN107767863A (en) voice awakening method, system and intelligent terminal
CN108074576A (en) Inquest the speaker role's separation method and system under scene
CN108172242B (en) Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method
CN103236258B (en) Based on the speech emotional characteristic extraction method that Pasteur's distance wavelet packets decomposes
CN103700370A (en) Broadcast television voice recognition method and system
CN109065051B (en) Voice recognition processing method and device
CN102937320B (en) Health protection method used for intelligent air conditioner
CN104538034A (en) Voice recognition method and system
CN106328151A (en) Environment de-noising system and application method
CN108021635A (en) The definite method, apparatus and storage medium of a kind of audio similarity
CN106548786A (en) A kind of detection method and system of voice data
CN110473536A (en) A kind of awakening method, device and smart machine
CN105654947A (en) Method and system for acquiring traffic information in traffic broadcast speech
CN107799124A (en) A VAD detection method applied to an intelligent voice mouse
CN110910891A (en) Speaker segmentation labeling method and device based on long-time memory neural network
CN108831450A (en) A kind of virtual robot man-machine interaction method based on user emotion identification
CN109841221A (en) Parameter adjusting method, device and body-building equipment based on speech recognition
CN106653000A (en) Emotion intensity test method based on voice information
CN106531195A (en) Dialogue conflict detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20180313