CN108172242A - An improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker - Google Patents


Info

Publication number
CN108172242A
Authority
CN
China
Prior art keywords
processing software
software app
data analyzing
smart machine
tooth intelligence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810014999.3A
Other languages
Chinese (zh)
Other versions
CN108172242B (en)
Inventor
鲁霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xinzhongxin Technology Co Ltd
Original Assignee
Shenzhen Xinzhongxin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Xinzhongxin Technology Co Ltd
Priority to CN201810014999.3A
Publication of CN108172242A
Application granted
Publication of CN108172242B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L2025/783 Detection of presence or absence of voice signals based on threshold decision
    • G10L2025/786 Adaptive threshold
    • H04B5/72

Abstract

The invention relates to an improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker, involving a smart cloud speaker, a smart device, a data analysis and processing app, and a Bluetooth module. The smart device is a mobile phone, tablet computer, or the like, and includes the Bluetooth module and the data analysis and processing app; the smart cloud speaker includes a cloud server. The app is installed on the smart device. The Bluetooth module establishes an audio channel connection with the Bluetooth smart cloud speaker. Through the Bluetooth module, the app establishes a control-instruction connection with the speaker, enabling control data exchange between the app and the speaker. The beneficial effects of the invention are that it addresses the poor recognition rate and endpoint misjudgment caused by environmental differences in the related art, and improves the efficiency and user experience of human-machine voice interaction.

Description

An improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker
Technical field
The invention relates to the field of Bluetooth Low Energy applications, and in particular to an improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker.
Background art
In the field of human-computer interaction, voice activity detection (VAD) is a critical task, and the quality of the algorithm largely determines the success or failure of the whole voice interaction system. For a complete voice interaction system, the final result depends not only on the recognition algorithm; many related factors directly affect whether the application succeeds. The purpose of endpoint detection is to distinguish speech from non-speech in the signal stream under complex application environments and to determine the start and end points of the speech signal. A good endpoint detection method can mitigate problems in speech recognition software such as poor detection and a low recognition rate. High-precision endpoint detection ensures that the input is a valid, complete speech signal, making recognition faster and more accurate.
The traditional endpoint detection method is a double-threshold comparison using short-time energy and zero-crossing rate. A first decision is made on the short-time energy of the audio, using a high threshold for a coarse judgment; a second decision is then made using the average zero-crossing rate. Double-threshold endpoint detection requires little computation and achieves a good recognition rate in quiet environments, but it has many shortcomings. For example, the threshold must be set empirically and is a fixed parameter; and in continuous voice interaction, scenarios involving contextual pauses are easily misjudged, resulting in unsatisfactory human-computer interaction.
Therefore, in everyday human-computer interaction, how to accurately detect the endpoint positions of an audio signal is a problem that those skilled in the art urgently need to solve.
Summary of the invention
The technical problem to be solved by the invention is to provide an improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker that overcomes the poor recognition rate and endpoint misjudgment caused by environmental differences in the related art, and improves the efficiency and experience of human-machine voice interaction.
To solve the above technical problem, the invention provides an improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker, involving a smart cloud speaker, a smart device, a data analysis and processing app, and a Bluetooth module. The smart device is a mobile phone, tablet computer, or the like, and includes the Bluetooth module and the data analysis and processing app; the smart cloud speaker includes a cloud server;
The data analysis and processing app is installed on the smart device;
The Bluetooth module establishes an audio channel connection with the Bluetooth smart cloud speaker;
As a further refinement, the app on the smart device establishes a control-instruction connection with the Bluetooth smart cloud speaker through the Bluetooth module, enabling control data exchange between the app and the speaker;
As a further refinement, the app is normally in standby mode. When the smart device side wakes up voice interaction, the app starts the Bluetooth module connection, begins recording to capture the audio signal, and at the same time establishes a data transmission channel with the cloud server of the Bluetooth smart cloud speaker.
As a further refinement, the app sets a silence guard time whose length is agreed between the app and the cloud server. After voice interaction is woken up, audio is captured for 3 seconds even if the user says nothing, which prevents the system from judging the interaction finished before the user has had time to speak. In addition, operating the Bluetooth module's connection-oriented SCO mode too frequently within a very short time can cause system-level exceptions; the silence guard time keeps the SCO mode from being operated too frequently within a very short time.
As a further refinement, the app on the smart device continuously extracts each frame of the audio signal, with the duration of each frame set to 10 ms.
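As a further refinement, the app on the smartphone computes the short-time energy of each audio frame. The short-time energy is calculated as $E_i = \sum_{m=1}^{N} x_i^2(m)$, where $x_i(m)$ is the value of the m-th sampling point in the i-th frame and $N$ is the number of sampling points in the frame.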
As a further refinement, the app on the smart device dynamically judges whether each audio frame is a speech frame. Short-time energy directly reflects the energy and amplitude of the speech signal, so voiced and silent segments are distinguished by short-time energy. The app dynamically tracks the maximum energy value over the current frame and all preceding frames. Whenever a subsequent frame falls below the maximum frame energy multiplied by the threshold (M), that is, when the current short-time energy is small, the threshold is lowered dynamically. When the volume has attenuated by too large an amplitude, the frame is classified as a non-speech frame and the non-speech count starts. When the count of consecutive non-speech frames reaches 200, equivalent to a 2-second pause, speech is deemed to have ended; if speech frame data appears in between, the counter is reset and counting starts again.
The formula of adaptive threshold value is:
As a further refinement, the app on the smart device makes the valid endpoint judgment;
As a further refinement, the app on the smart device notifies the cloud server that capture has ended, and speech recognition starts. Once the app determines that voice capture should end, it stops recording and sends a capture-complete instruction to the cloud server to start speech recognition. In a large number of voice interaction tests on the Bluetooth smart cloud speaker, the endpoints of speech were judged accurately.
As a further refinement, the working steps of the improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker are:
A. The data analysis and processing app on the smart device establishes a connection with the Bluetooth smart cloud speaker;
B. The smart device side wakes up voice interaction;
C. The app on the smart device starts the silence guard time counter;
D. The app on the smart device continuously extracts each frame of the audio signal;
E. The app on the smart device computes the short-time energy of each audio frame;
F. The app on the smart device dynamically judges whether each audio frame is a speech frame;
H. The app on the smart device makes the valid endpoint judgment;
I. The app on the smart device notifies the cloud server that capture has ended, and speech recognition starts.
With the above technical solution, the beneficial effects of the invention are:
Compared with the prior art, the solution provides an improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker that addresses the poor recognition rate and endpoint misjudgment caused by environmental differences in the related art, and improves the efficiency and user experience of human-machine voice interaction.
Description of the drawings
Fig. 1 is a working module diagram of the improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker.
Fig. 2 is a work flow diagram of the improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker.
Detailed description of the embodiments
The invention is described in detail below with reference to Figs. 1 and 2 and specific embodiments, which are not to be taken as limiting the invention.
As shown in Figs. 1 and 2, an improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker involves a smart cloud speaker, a smart device, a data analysis and processing app, and a Bluetooth module. The smart device is a mobile phone, tablet computer, or the like, and includes the Bluetooth module and the data analysis and processing app; the smart cloud speaker includes a cloud server. The app is installed on the smart device. The Bluetooth module establishes an audio channel connection with the Bluetooth smart cloud speaker. Through the Bluetooth module, the app establishes a control-instruction connection with the speaker, enabling control data exchange between the app and the speaker.
The app is normally in standby mode. When the smart device side wakes up voice interaction, the app starts the Bluetooth module connection, begins recording to capture the audio signal, and at the same time establishes a data transmission channel with the cloud server of the Bluetooth smart cloud speaker.
The app sets a silence guard time whose length is agreed between the app and the cloud server. After voice interaction is woken up, audio is captured for 3 seconds even if the user says nothing, which prevents the system from judging the interaction finished before the user has had time to speak. In addition, operating the Bluetooth module's connection-oriented SCO mode too frequently within a very short time can cause system-level exceptions; the silence guard time keeps the SCO mode from being operated too frequently within a very short time.
The app on the smart device continuously extracts each frame of the audio signal, with the duration of each frame set to 10 ms. The app on the smartphone computes the short-time energy of each audio frame, calculated as $E_i = \sum_{m=1}^{N} x_i^2(m)$, where $x_i(m)$ is the value of the m-th sampling point in the i-th frame and $N$ is the number of sampling points in the frame.
The app on the smart device dynamically judges whether each audio frame is a speech frame. Short-time energy directly reflects the energy and amplitude of the speech signal, so voiced and silent segments are distinguished by short-time energy. The app dynamically tracks the maximum energy value over the current frame and all preceding frames. Whenever a subsequent frame falls below the maximum frame energy multiplied by the threshold (M), that is, when the current short-time energy is small, the threshold is lowered dynamically. When the volume has attenuated by too large an amplitude, the frame is classified as a non-speech frame and the non-speech count starts. When the count of consecutive non-speech frames reaches 200, equivalent to a 2-second pause, speech is deemed to have ended; if speech frame data appears in between, the counter is reset and counting starts again.
The formula of adaptive threshold value is:
The data analyzing and processing software APP of smart machine carries out valid endpoint judgement;The data analyzing and processing software of smart machine APP sends acquisition to cloud server to be terminated, and starts speech recognition;Data analyzing and processing software APP is according to end voice collecting Result after, stop recording, and to cloud server send acquisition completion command, start speech recognition, pass through blue-tooth intelligence cloud In speaker in a large amount of interactive voice tests, the endpoint of voice is accurately judged.
The working steps of the improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker are:
A. The data analysis and processing app on the smart device establishes a connection with the Bluetooth smart cloud speaker;
B. The smart device side wakes up voice interaction;
C. The app on the smart device starts the silence guard time counter;
D. The app on the smart device continuously extracts each frame of the audio signal;
E. The app on the smart device computes the short-time energy of each audio frame;
F. The app on the smart device dynamically judges whether each audio frame is a speech frame;
H. The app on the smart device makes the valid endpoint judgment;
I. The app on the smart device notifies the cloud server that capture has ended, and speech recognition starts.
In an embodiment of the invention:
S101: The data analysis and processing app on the smart device establishes a connection with the Bluetooth smart cloud speaker;
First, an audio channel connection is established between the Bluetooth module in the phone system and the Bluetooth smart cloud speaker. Then the app on the smart device establishes a control-instruction connection with the speaker. To ensure good compatibility, the Android version establishes an SPP channel connection with the device, while the iOS version establishes a BLE channel connection; either allows control data exchange between the app and the Bluetooth smart cloud speaker.
S102: The smart device side wakes up voice interaction;
The app is normally in standby mode. Only when the device side wakes up voice interaction does it start the Bluetooth SCO connection, begin recording to capture the audio signal, and at the same time establish a data transmission channel with the cloud server.
S103: The app on the smart device starts the silence guard time counter;
For a better user experience and for system stability, the app sets a silence guard time, with the specific duration agreed with the cloud server. After voice interaction is woken up, audio is captured for 3 seconds even if the user says nothing, so that the system does not judge the interaction finished before the user has had time to speak. In addition, operating Bluetooth SCO too frequently within a very short time can cause system-level exceptions.
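The guard-time behavior described in this step can be sketched as a small timer. This is a minimal illustration rather than the patent's implementation: the class name `SilenceGuard` and its methods are hypothetical, the 3000 ms value in the usage note comes from the 3-second guard time in the text, and timestamps are passed in explicitly so the sketch does not depend on any Android API.

```java
/** Hypothetical sketch of a silence guard timer; all names are illustrative only. */
final class SilenceGuard {
    private final long guardMillis; // guard length, e.g. 3000 ms as agreed with the cloud server
    private long startMillis = -1;  // -1 means the guard has not been started yet

    SilenceGuard(long guardMillis) {
        this.guardMillis = guardMillis;
    }

    /** Call at the moment voice interaction is woken up. */
    void start(long nowMillis) {
        this.startMillis = nowMillis;
    }

    /** True once the full guard time has passed since start(). */
    boolean elapsed(long nowMillis) {
        return startMillis >= 0 && nowMillis - startMillis >= guardMillis;
    }
}
```

With `new SilenceGuard(3000)`, an endpoint would be accepted only after `elapsed(...)` returns true, so a user who stays silent right after wake-up still gets the full guard window before the session can be closed.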
S104: The app on the smart device continuously extracts each frame of the audio signal;
Audio is a non-stationary, time-varying signal. To obtain more accurate results, it is treated as stationary and time-invariant over a "short time" range; the app typically sets this duration, the length of each audio frame, to 10 ms.
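Framing at 10 ms can be illustrated with a short helper that cuts a byte buffer of 16-bit PCM into fixed-length frames. This is a sketch under stated assumptions: the source does not give the sample rate, channel count, or byte order, so mono little-endian 16-bit audio is assumed here, and the class `FrameSplitter` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits 16-bit little-endian mono PCM into fixed-duration frames. */
final class FrameSplitter {
    /** Bytes in one frame of frameMs milliseconds (2 bytes per sample, mono assumed). */
    static int frameBytes(int sampleRateHz, int frameMs) {
        return sampleRateHz * frameMs / 1000 * 2;
    }

    /** Returns complete frames only; a trailing partial frame is dropped. */
    static List<short[]> split(byte[] pcm, int sampleRateHz, int frameMs) {
        int bytesPerFrame = frameBytes(sampleRateHz, frameMs);
        if (bytesPerFrame <= 0) {
            throw new IllegalArgumentException("frame must contain at least one sample");
        }
        List<short[]> frames = new ArrayList<>();
        for (int off = 0; off + bytesPerFrame <= pcm.length; off += bytesPerFrame) {
            short[] frame = new short[bytesPerFrame / 2];
            for (int i = 0; i < frame.length; i++) {
                int lo = pcm[off + 2 * i] & 0xFF; // low byte, treated as unsigned
                int hi = pcm[off + 2 * i + 1];    // high byte, keeps the sign
                frame[i] = (short) ((hi << 8) | lo);
            }
            frames.add(frame);
        }
        return frames;
    }
}
```

At an assumed 16 kHz sample rate, a 10 ms frame holds 160 samples, or 320 bytes.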
S105: The app on the smart device computes the short-time energy of each audio frame;
The short-time energy is calculated as:
$$E_i = \sum_{m=1}^{N} x_i^2(m)$$
where $x_i(m)$ is the value of the m-th sampling point in the i-th frame and $N$ is the number of sampling points in the frame.
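Based on the short-time energy formula, sample app code is as follows:
private long getRms(int end, int span) {
    // Despite its name, this returns the sum of squared samples (the short-time
    // energy) over the last `span` bytes before `end`, not a root mean square.
    int begin = end - span;
    if (begin < 0) {
        begin = 0;
    }
    if (begin % 2 != 0) {
        begin++; // align to a 16-bit sample boundary
    }
    long sum = 0;
    // i + 1 < end keeps the second byte of each sample inside the range.
    for (int i = begin; i + 1 < end; i += 2) {
        short curSample = getShort(this.mRecording[i], this.mRecording[i + 1]);
        sum += (long) curSample * curSample;
    }
    return sum;
}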
S106: The app on the smart device dynamically judges whether each audio frame is a speech frame;
Short-time energy directly reflects the energy and amplitude of the speech signal, so it can be used to distinguish voiced from silent segments. The app dynamically tracks the maximum energy value over the current frame and all preceding frames. Whenever a subsequent frame falls below the maximum frame energy multiplied by the threshold (M), that is, when the current short-time energy is small, the threshold is lowered dynamically. When the volume has attenuated by too large an amplitude, the frame is classified as a non-speech frame and the non-speech count starts. When the count of consecutive non-speech frames reaches 200, equivalent to a 2-second pause, speech is deemed to have ended; if speech frame data appears in between, the counter is reset and counting starts again.
Adaptive threshold value:
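Sample app code is as follows:
private static final int RMS_COUNT_MAX = 200; // 200 frames of 10 ms = 2 s

// M is the relative energy threshold; its definition is not shown in this excerpt.
public boolean isPausing() {
    long rms = getRms(this.mRecordedLength, this.mOneSec);
    if (rms > this.highestRMS) {
        // New maximum energy: record it and restart the non-speech count.
        this.highestRMS = rms;
        this.rmsCount = 0;
        return false;
    } else if (((double) rms) < M * ((double) this.highestRMS)) {
        // Energy fell below M times the maximum: count a non-speech frame.
        if (this.rmsCount < RMS_COUNT_MAX) {
            this.rmsCount++;
            return false;
        } else {
            // Enough consecutive non-speech frames: report a pause.
            this.rmsCount = 0;
            return true;
        }
    } else {
        // Speech frame in between: reset the counter.
        this.rmsCount = 0;
        return false;
    }
}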
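For experimenting with this decision rule outside the app, it can be restated as a self-contained class. The following is a sketch, not the patent's code: the class `EnergyPauseDetector` and its constructor parameters are hypothetical, the value of M is not given in the source and must be supplied by the caller, and this version reports the pause on the frame where the consecutive quiet count reaches the limit.

```java
/** Hypothetical, self-contained restatement of the energy-based pause rule. */
final class EnergyPauseDetector {
    private final double m;           // relative threshold M; value chosen by the caller
    private final int maxQuietFrames; // e.g. 200 frames of 10 ms = 2 s
    private long highestEnergy = 0;   // maximum frame energy seen so far
    private int quietFrames = 0;      // consecutive non-speech frames

    EnergyPauseDetector(double m, int maxQuietFrames) {
        this.m = m;
        this.maxQuietFrames = maxQuietFrames;
    }

    /** Short-time energy: sum of squared sample values in the frame. */
    static long energy(short[] frame) {
        long sum = 0;
        for (short s : frame) {
            sum += (long) s * s;
        }
        return sum;
    }

    /** Feeds one frame; returns true once maxQuietFrames consecutive quiet frames were seen. */
    boolean accept(short[] frame) {
        long e = energy(frame);
        if (e > highestEnergy) {
            highestEnergy = e; // new maximum, so this is treated as speech
            quietFrames = 0;
            return false;
        }
        if (e < m * highestEnergy) {
            quietFrames++;     // below M times the maximum: non-speech frame
            if (quietFrames >= maxQuietFrames) {
                quietFrames = 0;
                return true;
            }
            return false;
        }
        quietFrames = 0;       // speech in between resets the count
        return false;
    }
}
```

Because the threshold is relative to the loudest frame seen so far, the same value of M applies to quiet and loud speakers alike, which is the property the text contrasts with a fixed empirical threshold.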
S107: The app on the smart device makes the valid endpoint judgment;
Endpoint judgment in human-computer interaction is constrained by several factors, such as the 3-second silence guard time, the endpoint detected locally by the improved short-time energy method, and the stop-capture instruction issued by the cloud.
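Sample app code is as follows:
while (recorder != null && recorder.getState() == AudioRecorder.State.RECORDING) {
    boolean pausing = recorder.isPausing();
    // Stop only when a pause is detected and the minimum recording duration
    // (presumably the silence guard time) has been reached.
    if (pausing && mRecordDurationReached) {
        if (mBtDeviceSpeechType == BT_DEVICE_SPEECH_RECOGNITION) {
            mBtDeviceSpeechType = BT_DEVICE_SPEECH_RECOGNITION_NONE;
            stopBluetoothSCO();
        }
        stopListening(true);
        break;
    }
    try {
        Thread.sleep(10); // poll once per 10 ms frame
    } catch (InterruptedException e) {
        // Preserve the interrupt status and stop polling.
        Thread.currentThread().interrupt();
        break;
    }
}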
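The three factors listed for step S107 (silence guard time, locally detected pause, and the cloud's stop-capture instruction) can be combined in one decision function. This is a minimal sketch: the class `EndpointDecision`, the `Reason` enum, and in particular the choice to let the cloud instruction take priority over the local result are assumptions that the source does not state.

```java
/** Hypothetical sketch combining the three endpoint conditions named in the text. */
final class EndpointDecision {
    enum Reason { NONE, LOCAL_PAUSE, CLOUD_STOP }

    /**
     * @param guardElapsed true once the silence guard time has passed
     * @param localPause   true when the local energy detector reports a pause
     * @param cloudStop    true when the cloud has issued a stop-capture instruction
     */
    static Reason decide(boolean guardElapsed, boolean localPause, boolean cloudStop) {
        if (cloudStop) {
            return Reason.CLOUD_STOP;  // assumption: a cloud instruction always ends capture
        }
        if (guardElapsed && localPause) {
            return Reason.LOCAL_PAUSE; // a local pause counts only after the guard time
        }
        return Reason.NONE;            // keep recording
    }
}
```

Returning a reason rather than a boolean lets the caller log why capture ended, which may help when tuning M or the guard time.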
S108: The app on the smart device notifies the cloud that capture has ended, and speech recognition starts;
Once the app determines that voice capture should end, it stops recording and sends a capture-complete instruction to the cloud, after which speech recognition can start. In a large number of voice interaction tests on the Bluetooth smart cloud speaker, the endpoints of speech could for the most part be judged accurately. The transmission and processing of non-speech frames is greatly reduced, which improves efficiency and user experience.
As is known from common technical knowledge, this technical solution can be realized through other embodiments that do not depart from its spirit or essential features. The embodiments disclosed above are therefore, in all respects, merely illustrative and not exclusive. All changes within the scope of the invention or within a scope equivalent to the invention are encompassed by the invention.

Claims (9)

1. An improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker, involving a smart cloud speaker, a smart device, a data analysis and processing app, and a Bluetooth module, characterized in that: the smart device is a mobile phone, tablet computer, or the like; the smart device includes the Bluetooth module and the data analysis and processing app; the smart cloud speaker includes a cloud server; the data analysis and processing app is installed on the smart device; the Bluetooth module establishes an audio channel connection with the Bluetooth smart cloud speaker; and the data analysis and processing app of the smart device establishes a control-instruction connection with the Bluetooth smart cloud speaker through the Bluetooth module, enabling control data exchange between the data analysis and processing app and the Bluetooth smart cloud speaker.
2. The improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker according to claim 1, characterized in that the working steps of the method are:
A. The data analysis and processing app on the smart device establishes a connection with the Bluetooth smart cloud speaker;
B. The smart device side wakes up voice interaction;
C. The data analysis and processing app on the smart device starts the silence guard time counter;
D. The data analysis and processing app on the smart device continuously extracts each frame of the audio signal;
E. The data analysis and processing app on the smart device computes the short-time energy of each audio frame;
F. The data analysis and processing app on the smart device dynamically judges whether each audio frame is a speech frame;
H. The data analysis and processing app on the smart device makes the valid endpoint judgment;
I. The data analysis and processing app on the smart device notifies the cloud server that capture has ended, and speech recognition starts.
3. The improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker according to claim 2, characterized in that: the data analysis and processing app is normally in standby mode; when the smart device side wakes up voice interaction, the data analysis and processing app starts the Bluetooth module connection, begins recording to capture the audio signal, and at the same time establishes a data transmission channel with the cloud server of the Bluetooth smart cloud speaker.
4. The improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker according to claim 2, characterized in that: the data analysis and processing app sets a silence guard time whose length is agreed between the data analysis and processing app and the cloud server; after voice interaction is woken up, audio is captured for 3 seconds even if the user says nothing, which prevents the system from judging the interaction finished before the user has had time to speak; in addition, operating the Bluetooth module's connection-oriented SCO mode too frequently within a very short time can cause system-level exceptions, and the silence guard time keeps the SCO mode from being operated too frequently within a very short time.
5. The improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker according to claim 2, characterized in that: the data analysis and processing app on the smart device continuously extracts each frame of the audio signal, and sets the duration of each frame to 10 ms.
6. The improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker according to claim 2, characterized in that: the data analysis and processing app on the smartphone computes the short-time energy of each audio frame, the short-time energy being calculated as $E_i = \sum_{m=1}^{N} x_i^2(m)$, where $x_i(m)$ is the value of the m-th sampling point in the i-th frame and $N$ is the number of sampling points in the frame.
7. The improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker according to claim 2, characterized in that: the data analysis and processing app on the smart device dynamically judges whether each audio frame is a speech frame; short-time energy directly reflects the energy and amplitude of the speech signal, and voiced and silent segments are distinguished by short-time energy; the data analysis and processing app dynamically tracks the maximum energy value over the current frame and all preceding frames; whenever a subsequent frame falls below the maximum frame energy multiplied by the threshold (M), that is, when the current short-time energy is small, the threshold is lowered dynamically; when the volume has attenuated by too large an amplitude, the frame is classified as a non-speech frame and the non-speech count starts; when the count of consecutive non-speech frames reaches 200, equivalent to a 2-second pause, speech is deemed to have ended; if speech frame data appears in between, the counter is reset and counting starts again; the adaptive threshold formula is:
8. The improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker according to claim 2, characterized in that: the data analysis and processing app on the smart device makes the valid endpoint judgment.
9. The improved voice interaction endpoint detection method for a Bluetooth smart cloud speaker according to claim 2, characterized in that: the data analysis and processing app on the smart device makes the valid endpoint judgment and notifies the cloud server that capture has ended, and speech recognition starts; once the data analysis and processing app determines that voice capture should end, it stops recording and sends a capture-complete instruction to the cloud server to start speech recognition; in a large number of voice interaction tests on the Bluetooth smart cloud speaker, the endpoints of speech were judged accurately.
CN201810014999.3A 2018-01-08 2018-01-08 Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method Active CN108172242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810014999.3A CN108172242B (en) 2018-01-08 2018-01-08 Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method

Publications (2)

Publication Number Publication Date
CN108172242A true CN108172242A (en) 2018-06-15
CN108172242B CN108172242B (en) 2021-06-01

Family

ID=62517740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810014999.3A Active CN108172242B (en) 2018-01-08 2018-01-08 Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method

Country Status (1)

Country Link
CN (1) CN108172242B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110097884A (en) * 2019-06-11 2019-08-06 大众问问(北京)信息科技有限公司 A kind of voice interactive method and device
CN110958348A (en) * 2018-09-25 2020-04-03 阿里巴巴集团控股有限公司 Voice processing method and device, user equipment and intelligent sound box
CN110971744A (en) * 2018-09-28 2020-04-07 深圳市冠旭电子股份有限公司 Method and device for controlling voice playing of Bluetooth sound box
CN111083678A (en) * 2018-10-22 2020-04-28 深圳市冠旭电子股份有限公司 Playing control method and system of Bluetooth sound box and intelligent device
CN111554287A (en) * 2020-04-27 2020-08-18 佛山市顺德区美的洗涤电器制造有限公司 Voice processing method and device, household appliance and readable storage medium
CN111968680A (en) * 2020-08-14 2020-11-20 北京小米松果电子有限公司 Voice processing method, device and storage medium
CN112420079A (en) * 2020-11-18 2021-02-26 青岛海尔科技有限公司 Voice endpoint detection method and device, storage medium and electronic equipment
CN112449050A (en) * 2019-08-29 2021-03-05 阿里巴巴集团控股有限公司 Voice interaction method, voice interaction device, computing device and storage medium
CN112863542A (en) * 2021-01-29 2021-05-28 青岛海尔科技有限公司 Voice detection method and device, storage medium and electronic equipment

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0690436A2 (en) * 1994-06-28 1996-01-03 Alcatel SEL Aktiengesellschaft Detection of the start/end of words for word recognition
CN1264887A (en) * 2000-03-31 2000-08-30 清华大学 Speaker-independent speech recognition and prompting method based on a dedicated speech recognition chip
CN2745116Y (en) * 2004-11-12 2005-12-07 联想(北京)有限公司 Computer I/O peripheral equipment having wireless connecting function
CN1773605A (en) * 2004-11-12 2006-05-17 中国科学院声学研究所 Speech endpoint detection method for a speech recognition system
CN101107824A (en) * 2004-12-31 2008-01-16 英国电讯有限公司 Connection-oriented communications scheme for connection-less communications traffic
US20080125044A1 (en) * 2006-11-28 2008-05-29 Samsung Electronics Co.; Ltd Audio output system and method for mobile phone
US20090282298A1 (en) * 2008-05-08 2009-11-12 Broadcom Corporation Bit error management methods for wireless audio communication channels
CN101984725A (en) * 2010-11-17 2011-03-09 广州杰赛科技股份有限公司 Wireless access device and method
CN101625857B (en) * 2008-07-10 2012-05-09 新奥特(北京)视频技术有限公司 Self-adaptive voice endpoint detection method
CN202679358U (en) * 2012-05-09 2013-01-16 深圳市芯中芯科技有限公司 Stereo Bluetooth audio module
CN102891408A (en) * 2012-10-12 2013-01-23 歌尔声学股份有限公司 Bluetooth controlled power socket and implementation method for Bluetooth controlled power socket
CN103065629A (en) * 2012-11-20 2013-04-24 广东工业大学 Speech recognition system of humanoid robot
CN103369677A (en) * 2012-04-02 2013-10-23 英特尔移动通信有限责任公司 Radio communication device and method for operating a radio communication device
US20140163984A1 (en) * 2012-12-10 2014-06-12 Lenovo (Beijing) Co., Ltd. Method Of Voice Recognition And Electronic Apparatus
CN104184496A (en) * 2013-05-24 2014-12-03 凌通科技股份有限公司 Bluetooth data/control information transmission module, interactive system and method thereof
CN204517806U (en) * 2015-01-09 2015-07-29 深圳市芯中芯科技有限公司 Audio transmitting and receiving system based on the 5.8GHz frequency band
CN105338645A (en) * 2012-05-30 2016-02-17 英特尔移动通信有限责任公司 Radio communication device
CN106653021A (en) * 2016-12-27 2017-05-10 上海智臻智能网络科技股份有限公司 Voice wake-up control method and device and terminal
CN107277272A (en) * 2017-07-25 2017-10-20 深圳市芯中芯科技有限公司 Bluetooth device voice interaction method and system based on a software APP

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0690436A2 (en) * 1994-06-28 1996-01-03 Alcatel SEL Aktiengesellschaft Detection of the start/end of words for word recognition
CN1264887A (en) * 2000-03-31 2000-08-30 清华大学 Speaker-independent speech recognition and prompting method based on a dedicated speech recognition chip
CN2745116Y (en) * 2004-11-12 2005-12-07 联想(北京)有限公司 Computer I/O peripheral equipment having wireless connecting function
CN1773605A (en) * 2004-11-12 2006-05-17 中国科学院声学研究所 Speech endpoint detection method for a speech recognition system
CN101107824A (en) * 2004-12-31 2008-01-16 英国电讯有限公司 Connection-oriented communications scheme for connection-less communications traffic
US20080125044A1 (en) * 2006-11-28 2008-05-29 Samsung Electronics Co.; Ltd Audio output system and method for mobile phone
US20090282298A1 (en) * 2008-05-08 2009-11-12 Broadcom Corporation Bit error management methods for wireless audio communication channels
CN101625857B (en) * 2008-07-10 2012-05-09 新奥特(北京)视频技术有限公司 Self-adaptive voice endpoint detection method
CN101984725A (en) * 2010-11-17 2011-03-09 广州杰赛科技股份有限公司 Wireless access device and method
CN103369677A (en) * 2012-04-02 2013-10-23 英特尔移动通信有限责任公司 Radio communication device and method for operating a radio communication device
CN202679358U (en) * 2012-05-09 2013-01-16 深圳市芯中芯科技有限公司 Stereo Bluetooth audio module
CN105338645A (en) * 2012-05-30 2016-02-17 英特尔移动通信有限责任公司 Radio communication device
CN102891408A (en) * 2012-10-12 2013-01-23 歌尔声学股份有限公司 Bluetooth controlled power socket and implementation method for Bluetooth controlled power socket
CN103065629A (en) * 2012-11-20 2013-04-24 广东工业大学 Speech recognition system of humanoid robot
US20140163984A1 (en) * 2012-12-10 2014-06-12 Lenovo (Beijing) Co., Ltd. Method Of Voice Recognition And Electronic Apparatus
CN104184496A (en) * 2013-05-24 2014-12-03 凌通科技股份有限公司 Bluetooth data/control information transmission module, interactive system and method thereof
CN204517806U (en) * 2015-01-09 2015-07-29 深圳市芯中芯科技有限公司 Audio transmitting and receiving system based on the 5.8GHz frequency band
CN106653021A (en) * 2016-12-27 2017-05-10 上海智臻智能网络科技股份有限公司 Voice wake-up control method and device and terminal
CN107277272A (en) * 2017-07-25 2017-10-20 深圳市芯中芯科技有限公司 Bluetooth device voice interaction method and system based on a software APP

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110958348A (en) * 2018-09-25 2020-04-03 阿里巴巴集团控股有限公司 Voice processing method and device, user equipment and intelligent sound box
CN110971744A (en) * 2018-09-28 2020-04-07 深圳市冠旭电子股份有限公司 Method and device for controlling voice playing of Bluetooth sound box
CN110971744B (en) * 2018-09-28 2022-09-23 深圳市冠旭电子股份有限公司 Method and device for controlling voice playing of Bluetooth sound box
CN111083678A (en) * 2018-10-22 2020-04-28 深圳市冠旭电子股份有限公司 Playing control method and system of Bluetooth sound box and intelligent device
CN110097884A (en) * 2019-06-11 2019-08-06 大众问问(北京)信息科技有限公司 Voice interaction method and device
CN110097884B (en) * 2019-06-11 2022-05-17 大众问问(北京)信息科技有限公司 Voice interaction method and device
CN112449050A (en) * 2019-08-29 2021-03-05 阿里巴巴集团控股有限公司 Voice interaction method, voice interaction device, computing device and storage medium
CN111554287A (en) * 2020-04-27 2020-08-18 佛山市顺德区美的洗涤电器制造有限公司 Voice processing method and device, household appliance and readable storage medium
CN111554287B (en) * 2020-04-27 2023-09-05 佛山市顺德区美的洗涤电器制造有限公司 Voice processing method and device, household appliance and readable storage medium
CN111968680A (en) * 2020-08-14 2020-11-20 北京小米松果电子有限公司 Voice processing method, device and storage medium
CN112420079A (en) * 2020-11-18 2021-02-26 青岛海尔科技有限公司 Voice endpoint detection method and device, storage medium and electronic equipment
CN112863542A (en) * 2021-01-29 2021-05-28 青岛海尔科技有限公司 Voice detection method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN108172242B (en) 2021-06-01

Similar Documents

Publication Publication Date Title
CN108172242A (en) Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method
CN108573701B (en) Query endpointing based on lip detection
US9299344B2 (en) Apparatus and method to classify sound to detect speech
US11830479B2 (en) Voice recognition method and apparatus, and air conditioner
CN103811003B (en) Speech recognition method and electronic device
Lu et al. Speakersense: Energy efficient unobtrusive speaker identification on mobile phones
US20190139547A1 (en) Interactive Method and Device
JP6171617B2 (en) Response target speech determination apparatus, response target speech determination method, and response target speech determination program
CN103745723A (en) Method and device for identifying audio signal
EP2681896B1 (en) Method and apparatus for identifying mobile devices in similar sound environment
CN106686223A (en) A system and method for assisting dialogues between a deaf person and a normal person, and a smart mobile phone
CN110335593A (en) Voice endpoint detection method, device, equipment and storage medium
CN110364178B (en) Voice processing method and device, storage medium and electronic equipment
CN1708782A (en) Method for operating a speech recognition system
CN110097875A (en) Microphone-signal-based voice interaction wake-up electronic device, method and medium
CN111798850A (en) Method and system for operating equipment by voice and server
KR20110059248A (en) Communication interface apparatus and method for multi-user and system
CN114299953B (en) Speaker role distinguishing method and system combining mouth movement analysis
CN109994129A (en) Speech processing system, method and apparatus
CN109285544A (en) Speech monitoring system
WO2021146857A1 (en) Audio processing method and device
CN114155845A (en) Service determination method and device, electronic equipment and storage medium
CN103761064A (en) Automatic voice input system and method
CN103997381B (en) Method for intelligent identification of examination-room cheating signals and evidence recovery
CN115831132A (en) Audio encoding and decoding method, device, medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant