WO2018137704A1 - Microphone array based pickup method and system - Google Patents

Microphone array based pickup method and system

Info

Publication number
WO2018137704A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
microphone array
signal
sound source
speech
Prior art date
Application number
PCT/CN2018/074304
Other languages
English (en)
French (fr)
Inventor
范利春
朱磊
高鹏
Original Assignee
芋头科技(杭州)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 芋头科技(杭州)有限公司
Priority to US16/476,259 (US11302341B2)
Publication of WO2018137704A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0364 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2201/00 Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
    • H04R2201/40 Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
    • H04R2201/401 2D or 3D arrays of transducers

Definitions

  • The present invention relates to the field of signal processing, and in particular to a microphone array based sound pickup method and system.
  • The general steps for sound pickup with a microphone array are: first, determine the position of the speaker; second, enhance the speech signal using beamforming.
  • An object of the present invention is to provide a microphone array based sound pickup method and system.
  • A microphone array based sound pickup method, comprising the following steps:
  • Step 1: perform voice activation detection on one of the multiple voice signals picked up and output by a microphone array, and determine whether a voice activation signal occurs; if yes, perform step 2; if no, repeat step 1;
  • Step 2: perform sound source localization using the multiple voice signals output by the microphone array to obtain a sound source localization direction;
  • Step 3: perform voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
  • Step 4: perform voice wake-up detection on the enhanced voice signal, and determine whether a voice wake-up is detected; if yes, perform step 5; otherwise, repeat step 1;
  • Step 5: the microphone array picks up and outputs multiple voice signals;
  • Step 6: process the multiple voice signals picked up by the microphone array into a single enhanced voice signal as the final pickup output.
  • In the microphone array based sound pickup method of the present invention, step 5 is: a pickup indicator light pointing in the sound source localization direction is lit, and at the same time the microphone array picks up and outputs multiple voice signals.
  • In the microphone array based sound pickup method of the present invention, step 1 is specifically as follows:
  • Step 11: select one voice signal from the multiple voice signals captured by the microphone array;
  • Step 12: detect the speech start point and speech end point of the speaker in the voice signal;
  • Step 13: determine, from the signal between the speech start point and the speech end point, whether a voice activation signal occurs; if yes, perform step 2; otherwise, repeat step 1.
  • The sound source localization in step 2 is specifically: obtaining the position of the sound source, as the sound source localization direction, from the time difference between the signals received by at least two microphones in the microphone array.
  • The voice enhancement in step 3 is specifically: performing noise suppression on the voice signal in the sound source localization direction to obtain an enhanced voice signal.
  • Step 4 is specifically: feeding the enhanced voice signal into a voice wake-up model and detecting whether the enhanced voice signal contains the set wake-up word; if not, return to step 1; if so, perform step 5.
  • In step 6, the voice in the direction indicated by the pickup indicator light is enhanced.
  • In the microphone array based sound pickup method of the present invention, while step 6 is performed after step 5, the method further comprises performing steps 1 to 5 on the multiple voice signals acquired in step 5.
  • The present invention also provides a microphone array based sound pickup system, comprising:
  • a microphone array comprising a plurality of microphone units, the microphone units being used to pick up and output multiple voice signals;
  • a voice activation unit, connected to the microphone array, which performs voice activation detection on at least one of the multiple voice signals and outputs a voice activation result signal or a voice non-activation result signal;
  • a sound source localization unit, connected to the microphone array through a first controlled switch that is turned on under control of the voice activation result signal, which performs sound source localization on the multiple voice signals to determine the sound source localization direction;
  • a first voice enhancement unit, connected to the sound source localization unit, which performs voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
  • a voice wake-up detection unit, connected to the first voice enhancement unit, which performs voice wake-up detection on the enhanced voice signal and outputs a voice wake-up result signal or a voice non-wake-up result signal;
  • a second voice enhancement unit, connected to the microphone array through a second controlled switch that is turned on under control of the voice wake-up result signal, which processes the multiple voice signals of the microphone array into a single enhanced voice signal as the final pickup output.
  • The microphone array is a planar ring structure composed of a plurality of microphone units, a plurality of pickup indicator lights are arranged around the circumference of the planar ring structure, and the pickup indicator lights are used to indicate the sound source localization direction.
  • In view of the increasingly wide use of speech recognition technology in different scenarios and for different needs, the present invention proposes a microphone array based sound pickup method and system that can better pick up voice signals in far-field environments and, in particular, can pick up sound accurately in high-noise environments, providing an excellent solution for long-distance voice control.
  • In addition, by using voice wake-up and voice detection, the present invention reduces the amount of computation required to process the microphone array data, thereby reducing energy consumption and saving cost.
  • FIG. 1 is a flow chart of a method of a specific embodiment of the present invention.
  • FIG. 2 is a flow chart of a method of another embodiment of the present invention.
  • FIG. 3 is a structural diagram of the system of the present invention.
  • FIG. 4 is a schematic structural view of a microphone array of the present invention.
  • A microphone array based sound pickup method, comprising the following steps:
  • Step 1: perform voice activation detection on one of the multiple voice signals picked up and output by a microphone array, and determine whether a voice activation signal occurs; if yes, perform step 2; if no, repeat step 1;
  • Step 2: perform sound source localization using the multiple voice signals output by the microphone array to obtain a sound source localization direction;
  • Step 3: perform voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
  • Step 4: perform voice wake-up detection on the enhanced voice signal, and determine whether a voice wake-up is detected; if yes, perform step 5; otherwise, repeat step 1;
  • Step 5: the microphone array picks up and outputs multiple voice signals;
  • Step 6: process the multiple voice signals picked up by the microphone array into a single enhanced voice signal as the final pickup output.
  • In view of the increasingly wide use of speech recognition technology in different scenarios and for different needs, the invention uses voice wake-up to decide when pickup should start, so that the device enters the pickup state and then enhances the speaker's voice. It can better pick up voice signals in far-field environments and, in particular, can pick up sound accurately in high-noise environments.
  • As a preferred embodiment, step 5 is: a pickup indicator light pointing in the sound source localization direction is lit, and at the same time the microphone array picks up and outputs multiple voice signals.
  • The pickup indicator light is a means of voice interaction and tells the user the current pickup direction. After a voice wake-up, the pickup indicator light points in the direction of the sound source. If that is the user's direction, the user knows that what is said next will be picked up by the system. If the indicator light does not point toward the user, the user understands that their words will not be picked up and can decide whether to wake the system again.
  • As a preferred embodiment, in step 6, the voice in the direction indicated by the pickup indicator light is enhanced.
  • In the microphone array based sound pickup method of the present invention, step 1 is specifically as follows:
  • Step 11: select one voice signal from the multiple voice signals captured by the microphone array;
  • Step 12: detect the speech start point and speech end point of the speaker in the voice signal;
  • Step 13: determine, from the signal between the speech start point and the speech end point, whether a voice activation signal occurs; if yes, perform step 2; otherwise, repeat step 1.
  • The voice activation detection step described above reports a speech start point when someone begins speaking and a speech end point when the speech ends. Throughout the procedure, any single channel of the microphone array is sufficient for voice activation detection. Voice activation detection can be implemented with a prior-art voice activation detection method.
  • The sound source localization in step 2 is specifically: obtaining the position of the sound source, as the sound source localization direction, from the time difference between the signals received by at least two microphones in the microphone array.
  • The sound source localization method can be implemented with beamforming techniques.
  • The voice enhancement in step 3 is specifically: performing noise suppression on the voice signal in the sound source localization direction to obtain an enhanced voice signal.
  • Voice enhancement with a microphone array yields a voice signal with a higher signal-to-noise ratio in the direction of the sound source, which achieves the purpose of voice enhancement and facilitates subsequent processing.
  • Step 4 is specifically: feeding the enhanced voice signal into a voice wake-up model and detecting whether the enhanced voice signal contains the set wake-up word; if not, return to step 1; if so, perform step 5.
  • In step 4, whether to enter the wake-up state is decided by detecting whether the voice signal contains the set wake-up word. If there is no wake-up, the system does not respond and continues activation detection to check whether other speech comes in. If there is a wake-up, the pickup indicator light is lit and the method proceeds to the next step.
  • In a preferred embodiment, while step 6 is performed after step 5, the method further comprises performing steps 1 to 5 on the multiple voice signals acquired in step 5.
  • While picking up sound from the direction of the pickup indicator light, the microphone array continuously records speech from all directions.
  • On the one hand, this data enters the pickup data stream to produce the final pickup; on the other hand, it also keeps circulating through the wake-up data stream. This handles the case where a user in another direction also utters the wake-up word, or where the pickup indicator light is not pointing toward the user.
  • If a wake-up succeeds at this point, the pickup indicator light turns to the new sound source direction and the pickup data stream then picks up in the new direction, while the wake-up data stream continues to evaluate all directions.
  • The specific pickup procedure in the preferred embodiment is: use any one channel of the microphone array for voice activation detection; once a voice activation signal is detected, use the microphone array for sound source localization; according to the localization result, perform voice enhancement on the voice signal in the sound source localization direction; feed the enhanced voice signal into the voice wake-up model for voice wake-up detection; when a voice wake-up is detected, the pickup indicator light turns on and points in the direction of the sound source; perform voice enhancement in the direction of the indicator light and perform voice activation detection, and when speech is detected, perform pickup. Finally, while pickup is in progress, wake-ups from new directions are still monitored; as soon as a new wake-up is found in a new direction, the pickup indicator light points to the new wake-up direction, and this step is repeated continuously.
  • The present invention also provides a microphone array based sound pickup system which, with reference to FIG. 3, comprises:
  • a microphone array comprising a plurality of microphone units for picking up and outputting multiple voice signals;
  • a voice activation unit 11, connected to the microphone array, which performs voice activation detection on at least one of the multiple voice signals and outputs a voice activation result signal or a voice non-activation result signal;
  • a sound source localization unit 12, connected to the microphone array through a first controlled switch SK1 that is turned on under control of the voice activation result signal, which performs sound source localization on the multiple voice signals to determine the sound source localization direction;
  • a first voice enhancement unit 13, connected to the sound source localization unit 12, which performs voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
  • a voice wake-up detection unit 14, connected to the first voice enhancement unit 13, which performs voice wake-up detection on the enhanced voice signal and outputs a voice wake-up result signal or a voice non-wake-up result signal;
  • a second voice enhancement unit 15, connected to the microphone array through a second controlled switch SK2 that is turned on under control of the voice wake-up result signal, which processes the multiple voice signals of the microphone array into a single enhanced voice signal as the final pickup output.
  • Referring to FIG. 3, there are two data streams in the whole pickup procedure: a wake-up data stream and a pickup data stream.
  • The wake-up data stream runs at all times, but its modules are not all working at all times.
  • Only when the voice activation module detects voice activation is the first controlled switch SK1 in FIG. 3 turned on, so that data flows onward into the sound source localization unit 12, the first voice enhancement unit 13 and the voice wake-up detection unit 14.
  • Most of the time these three modules (the sound source localization unit 12, the first voice enhancement unit 13 and the voice wake-up detection unit 14) are idle, which saves resources.
  • When a voice wake-up occurs, the pickup indicator light turns on and points in the pickup direction. The second controlled switch SK2 in FIG. 3 is then turned on, and the pickup data stream starts working. When a voice wake-up is signalled again, the pickup indicator light points in the new direction and pickup proceeds in the new direction. The pickup data stream is closed once it no longer contains speech.
  • The microphone array may be a planar ring structure composed of a plurality of microphone units.
  • Referring to FIG. 4, the microphone array contains 8 microphone units in total, arranged symmetrically, so that signals from every direction are treated and processed equally.
  • A plurality of pickup indicator lights are arranged around the circumference of the planar ring structure and can point in every direction of the plane, to indicate the sound source localization direction.

Abstract

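A microphone array based sound pickup method, comprising: step 1, performing voice activation detection on one of the multiple voice signals picked up and output by a microphone array, and determining whether a voice activation signal occurs; if yes, performing step 2; if no, repeating step 1; step 2, performing sound source localization using the multiple voice signals to obtain a sound source localization direction; step 3, performing voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal; step 4, performing voice wake-up detection on the enhanced voice signal, and determining whether a voice wake-up is detected; if yes, performing step 5; otherwise, repeating step 1; step 5, the microphone array picks up and outputs multiple voice signals; step 6, processing the multiple voice signals into a single signal as the final pickup output. The method can better pick up voice signals in far-field environments and, in particular, can pick up sound accurately in high-noise environments, providing an excellent solution for long-distance voice control.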

Description

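Microphone array based sound pickup method and system
Technical Field
The present invention relates to the field of signal processing, and in particular to a microphone array based sound pickup method and system.
Background
Recording high-quality speech signals is critical for speech analysis methods such as speech recognition. With conventional single-microphone recording, quality degrades sharply at long range and in high-noise environments, which severely limits the scenarios in which speech analysis can be applied. Applications such as voice input methods and voice search on mobile phones therefore require the speaker to be close enough to the phone's microphone; such pickup environments are all classed as near-field pickup.
Recording with a microphone array makes multi-channel speech data available for post-processing, so that noise can be suppressed and the target speech signal enhanced. In far-field pickup, the microphone array has therefore become an indispensable pickup device. The general steps for sound pickup with a microphone array are: first, determine the position of the speaker; second, enhance the speech signal using beamforming.
In practice, however, this approach has the following problems: (1) a speaker is not talking at every moment, and not all speech needs to be picked up; in a near-field environment this is easily solved by pressing a start-recording button, but it is not easy to handle in a far-field environment; (2) when there are several speakers, it is difficult to determine which one should be picked up.
Summary of the Invention
To solve the above problems, an object of the present invention is to provide a microphone array based sound pickup method and system.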
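A microphone array based sound pickup method, comprising the following steps:
Step 1: perform voice activation detection on one of the multiple voice signals picked up and output by a microphone array, and determine whether a voice activation signal occurs; if yes, perform step 2; if no, repeat step 1;
Step 2: perform sound source localization using the multiple voice signals output by the microphone array to obtain a sound source localization direction;
Step 3: perform voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
Step 4: perform voice wake-up detection on the enhanced voice signal, and determine whether a voice wake-up is detected; if yes, perform step 5; otherwise, repeat step 1;
Step 5: the microphone array picks up and outputs multiple voice signals;
Step 6: process the multiple voice signals picked up by the microphone array into a single enhanced voice signal as the final pickup output.
In the microphone array based sound pickup method of the present invention, step 5 is: a pickup indicator light pointing in the sound source localization direction is lit, and at the same time the microphone array picks up and outputs multiple voice signals.
In the microphone array based sound pickup method of the present invention, step 1 is specifically as follows:
Step 11: select one voice signal from the multiple voice signals captured by the microphone array;
Step 12: detect the speech start point and speech end point of the speaker in the voice signal;
Step 13: determine, from the signal between the speech start point and the speech end point, whether a voice activation signal occurs; if yes, perform step 2; otherwise, repeat step 1.
In the microphone array based sound pickup method of the present invention, the sound source localization in step 2 is specifically:
obtaining the position of the sound source, as the sound source localization direction, from the time difference between the signals received by at least two microphones in the microphone array.
In the microphone array based sound pickup method of the present invention, the voice enhancement in step 3 is specifically: performing noise suppression on the voice signal in the sound source localization direction to obtain an enhanced voice signal.
In the microphone array based sound pickup method of the present invention, step 4 is specifically: feeding the enhanced voice signal into a voice wake-up model and detecting whether the enhanced voice signal contains the set wake-up word; if not, return to step 1; if so, perform step 5.
In the microphone array based sound pickup method of the present invention, in step 6, the voice in the direction indicated by the pickup indicator light is enhanced.
In the microphone array based sound pickup method of the present invention, while step 6 is performed after step 5, the method further comprises performing steps 1 to 5 on the multiple voice signals acquired in step 5.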
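The present invention also provides a microphone array based sound pickup system, comprising:
a microphone array comprising a plurality of microphone units, the microphone units being used to pick up and output multiple voice signals;
a voice activation unit, connected to the microphone array, which performs voice activation detection on at least one of the multiple voice signals and outputs a voice activation result signal or a voice non-activation result signal;
a sound source localization unit, connected to the microphone array through a first controlled switch that is turned on under control of the voice activation result signal, which performs sound source localization on the multiple voice signals to determine the sound source localization direction;
a first voice enhancement unit, connected to the sound source localization unit, which performs voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
a voice wake-up detection unit, connected to the first voice enhancement unit, which performs voice wake-up detection on the enhanced voice signal and outputs a voice wake-up result signal or a voice non-wake-up result signal;
a second voice enhancement unit, connected to the microphone array through a second controlled switch that is turned on under control of the voice wake-up result signal, which processes the multiple voice signals of the microphone array into a single enhanced voice signal as the final pickup output.
In the microphone array based sound pickup system of the present invention, the microphone array is a planar ring structure composed of a plurality of microphone units, a plurality of pickup indicator lights are arranged around the circumference of the planar ring structure, and the pickup indicator lights are used to indicate the sound source localization direction.
Beneficial effects: in view of the increasingly wide use of speech recognition technology in different scenarios and for different needs, the present invention proposes a microphone array based sound pickup method and system that can better pick up voice signals in far-field environments and, in particular, can pick up sound accurately in high-noise environments, providing an excellent solution for long-distance voice control. In addition, by using voice wake-up and voice detection, the present invention reduces the amount of computation required to process the microphone array data, thereby reducing energy consumption and saving cost.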
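Brief Description of the Drawings
FIG. 1 is a flow chart of the method according to one specific embodiment of the present invention;
FIG. 2 is a flow chart of the method according to another specific embodiment of the present invention;
FIG. 3 is a structural diagram of the system of the present invention;
FIG. 4 is a schematic structural diagram of the microphone array of the present invention.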
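Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with one another.
The present invention is further described below with reference to the accompanying drawings and specific embodiments, which are not to be taken as limiting the present invention.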
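A microphone array based sound pickup method, comprising the following steps:
Step 1: perform voice activation detection on one of the multiple voice signals picked up and output by a microphone array, and determine whether a voice activation signal occurs; if yes, perform step 2; if no, repeat step 1;
Step 2: perform sound source localization using the multiple voice signals output by the microphone array to obtain a sound source localization direction;
Step 3: perform voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
Step 4: perform voice wake-up detection on the enhanced voice signal, and determine whether a voice wake-up is detected; if yes, perform step 5; otherwise, repeat step 1;
Step 5: the microphone array picks up and outputs multiple voice signals;
Step 6: process the multiple voice signals picked up by the microphone array into a single enhanced voice signal as the final pickup output.
In view of the increasingly wide use of speech recognition technology in different scenarios and for different needs, the present invention uses voice wake-up to decide when pickup should start, so that the device enters the pickup state and then enhances the speaker's voice. It can better pick up voice signals in far-field environments and, in particular, can pick up sound accurately in high-noise environments.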
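The six steps above can be read as a simple control loop. The following Python sketch is only an illustration under stated assumptions: the names `pickup_once`, `read_block`, `has_voice`, `localize`, `enhance` and `is_wake_word`, as well as the block layout (one list of samples per channel), are hypothetical and do not come from the application, which does not prescribe an implementation.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical layout: one sequence of samples per microphone channel.
Block = Sequence[Sequence[float]]


def pickup_once(
    read_block: Callable[[], Block],
    has_voice: Callable[[Sequence[float]], bool],
    localize: Callable[[Block], float],
    enhance: Callable[[Block, float], List[float]],
    is_wake_word: Callable[[List[float]], bool],
    max_blocks: int = 1000,
) -> Optional[List[float]]:
    """Run steps 1 to 6 once; return the enhanced pickup, or None if never woken."""
    for _ in range(max_blocks):
        block = read_block()
        # Step 1: voice activation detection on a single channel only.
        if not has_voice(block[0]):
            continue  # repeat step 1
        # Step 2: sound source localization uses all channels.
        direction = localize(block)
        # Step 3: enhance the signal toward the located direction.
        candidate = enhance(block, direction)
        # Step 4: wake-up detection on the enhanced signal.
        if not is_wake_word(candidate):
            continue  # back to step 1
        # Step 5: pick up fresh multi-channel audio after the wake-up.
        command = read_block()
        # Step 6: reduce it to one enhanced channel as the final output.
        return enhance(command, direction)
    return None
```

Passing the five stages in as callables keeps the sketch independent of any particular detector, localizer or wake-up model.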
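As a preferred embodiment of the present invention, step 5 is: a pickup indicator light pointing in the sound source localization direction is lit, and at the same time the microphone array picks up and outputs multiple voice signals.
The pickup indicator light is a means of voice interaction and tells the user the current pickup direction. After a voice wake-up, the pickup indicator light points in the direction of the sound source. If that is the user's direction, the user knows that what is said next will be picked up by the system. If the indicator light does not point toward the user, the user understands that their words will not be picked up and can decide whether to wake the system again.
As a preferred embodiment of the present invention, in step 6, the voice in the direction indicated by the pickup indicator light is enhanced.
By announcing the direction in which pickup and voice enhancement are about to take place, this form of interaction lets a speaker who sees the pickup indicator light pointing at them know that pickup is possible from that direction; if the light points elsewhere, or is not lit, the speaker needs to use the wake-up word again. This provides simple guidance for correct and efficient use of the device.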
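In the microphone array based sound pickup method of the present invention, step 1 is specifically as follows:
Step 11: select one voice signal from the multiple voice signals captured by the microphone array;
Step 12: detect the speech start point and speech end point of the speaker in the voice signal;
Step 13: determine, from the signal between the speech start point and the speech end point, whether a voice activation signal occurs; if yes, perform step 2; otherwise, repeat step 1.
The voice activation detection step described above reports a speech start point when someone begins speaking and a speech end point when the speech ends. Throughout the procedure, any single channel of the microphone array is sufficient for voice activation detection. Voice activation detection can be implemented with a prior-art voice activation detection method.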
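The application leaves the choice of voice activation detection method open. As one possible illustration of steps 11 to 13, the sketch below uses a plain short-time energy threshold on a single channel; this technique, the function name `detect_speech_segment`, and the default frame length and threshold are assumptions made here for illustration only.

```python
from typing import List, Optional, Tuple


def detect_speech_segment(
    samples: List[float],
    frame_len: int = 160,            # assumed: 10 ms frames at 16 kHz
    energy_threshold: float = 0.01,  # assumed: tuned per device in practice
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the first high-energy run, or None."""
    start = None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len  # mean power of the frame
        if energy >= energy_threshold:
            if start is None:
                start = i * frame_len      # speech start point
        elif start is not None:
            return start, i * frame_len    # speech end point
    if start is not None:
        return start, n_frames * frame_len  # speech ran to the end of the buffer
    return None
```

A returned segment plays the role of the signal between the speech start point and the speech end point; `None` corresponds to repeating step 1.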
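In the microphone array based sound pickup method of the present invention, the sound source localization in step 2 is specifically: obtaining the position of the sound source, as the sound source localization direction, from the time difference between the signals received by at least two microphones in the microphone array. The sound source localization method can be implemented with beamforming techniques.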
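To make the time-difference idea concrete, the following sketch estimates the delay between two microphone signals with a brute-force cross-correlation and converts it to a far-field arrival angle. It is a simplified illustration, not the method of the application: the function names, the two-microphone setup, the far-field assumption and the value used for the speed of sound are all assumptions introduced here.

```python
import math
from typing import Sequence

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature (assumption)


def estimate_delay(ref: Sequence[float], other: Sequence[float], max_lag: int) -> int:
    """Lag in samples by which `other` trails `ref` (negative if it leads)."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of ref[n] with other[n + lag].
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += r * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(lag: int, sample_rate: float, mic_spacing: float) -> float:
    """Angle between the source direction and the axis pointing from `other` to `ref`."""
    cos_theta = lag * SPEED_OF_SOUND / (sample_rate * mic_spacing)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # clamp against estimation error
    return math.degrees(math.acos(cos_theta))
```

A zero lag means the wavefront reaches both microphones at the same time, so the source lies broadside to the pair (90 degrees). With more than two microphones, several pairwise estimates can be combined to resolve the direction in the plane.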
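In the microphone array based sound pickup method of the present invention, the voice enhancement in step 3 is specifically: performing noise suppression on the voice signal in the sound source localization direction to obtain an enhanced voice signal.
Voice enhancement with a microphone array yields a voice signal with a higher signal-to-noise ratio in the direction of the sound source, which achieves the purpose of voice enhancement and facilitates subsequent processing.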
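The application does not name a specific enhancement algorithm. One of the simplest ways to raise the signal-to-noise ratio toward a known direction is a delay-and-sum beamformer, sketched below as an assumption: each channel is advanced by the delay with which the target reaches it, and the aligned channels are averaged. The function name and the use of non-negative integer sample delays are simplifications chosen for this illustration.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Align each channel by its non-negative integer delay (in samples) and average."""
    # Only keep the range where every shifted channel still has samples.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]
```

After alignment the target speech adds coherently across channels, while noise that differs from channel to channel partly cancels in the average.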
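In the microphone array based sound pickup method of the present invention, step 4 is specifically: feeding the enhanced voice signal into a voice wake-up model and detecting whether the enhanced voice signal contains the set wake-up word; if not, return to step 1; if so, perform step 5.
In step 4, whether to enter the wake-up state is decided by detecting whether the voice signal contains the set wake-up word. If there is no wake-up, the system does not respond and continues activation detection to check whether other speech comes in. If there is a wake-up, the pickup indicator light is lit and the method proceeds to the next step.
In a preferred embodiment, while step 6 is performed after step 5, the method further comprises performing steps 1 to 5 on the multiple voice signals acquired in step 5.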
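While picking up sound from the direction of the pickup indicator light, the microphone array continuously records speech from all directions. On the one hand, this data enters the pickup data stream to produce the final pickup; on the other hand, it also keeps circulating through the wake-up data stream. This handles the case where a user in another direction also utters the wake-up word, or where the pickup indicator light is not pointing toward the user. If a wake-up succeeds at this point, the pickup indicator light turns to the new sound source direction and the pickup data stream then picks up in the new direction, while the wake-up data stream continues to evaluate all directions.
The specific pickup procedure in this preferred embodiment is: use any one channel of the microphone array for voice activation detection; once a voice activation signal is detected, use the microphone array for sound source localization; according to the localization result, perform voice enhancement on the voice signal in the sound source localization direction; feed the enhanced voice signal into the voice wake-up model for voice wake-up detection; when a voice wake-up is detected, the pickup indicator light turns on and points in the direction of the sound source; perform voice enhancement in the direction of the indicator light and perform voice activation detection, and when speech is detected, perform pickup. Finally, while pickup is in progress, wake-ups from new directions are still monitored; as soon as a new wake-up is found in a new direction, the pickup indicator light points to the new wake-up direction, and this step is repeated continuously.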
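The interplay of the two data streams described above can be sketched as follows. This is a minimal illustration under assumptions: the class `DualStreamPickup`, its callables and the choice to run the pickup stream on the same block that triggered the wake-up are hypothetical and are not specified by the application.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Block = Sequence[Sequence[float]]  # hypothetical: one sample sequence per channel


@dataclass
class DualStreamPickup:
    # Wake-up stream: returns a direction in degrees if a wake-up word was
    # detected in this block, otherwise None (stands in for steps 1 to 4).
    wake_direction: Callable[[Block], Optional[float]]
    # Pickup stream: enhances a block toward a direction (stands in for step 6).
    enhance: Callable[[Block, float], List[float]]
    direction: Optional[float] = None  # where the indicator light currently points
    output: List[List[float]] = field(default_factory=list)

    def process(self, block: Block) -> None:
        # The wake-up stream sees every block, even while pickup is running.
        new_direction = self.wake_direction(block)
        if new_direction is not None:
            self.direction = new_direction  # indicator light turns to the new source
        # The pickup stream runs on the same data once a direction is known.
        if self.direction is not None:
            self.output.append(self.enhance(block, self.direction))
```

Because every block passes through the wake-up stream, a wake-up word spoken from another direction redirects the pickup without stopping it.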
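The present invention also provides a microphone array based sound pickup system which, with reference to FIG. 3, comprises:
a microphone array comprising a plurality of microphone units, the microphone units being used to pick up and output multiple voice signals;
a voice activation unit 11, connected to the microphone array, which performs voice activation detection on at least one of the multiple voice signals and outputs a voice activation result signal or a voice non-activation result signal;
a sound source localization unit 12, connected to the microphone array through a first controlled switch SK1 that is turned on under control of the voice activation result signal, which performs sound source localization on the multiple voice signals to determine the sound source localization direction;
a first voice enhancement unit 13, connected to the sound source localization unit 12, which performs voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
a voice wake-up detection unit 14, connected to the first voice enhancement unit 13, which performs voice wake-up detection on the enhanced voice signal and outputs a voice wake-up result signal or a voice non-wake-up result signal;
a second voice enhancement unit 15, connected to the microphone array through a second controlled switch SK2 that is turned on under control of the voice wake-up result signal, which processes the multiple voice signals of the microphone array into a single enhanced voice signal as the final pickup output.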
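Referring to FIG. 3, there are two data streams in the whole pickup procedure: a wake-up data stream and a pickup data stream. The wake-up data stream runs at all times, but its modules are not all working at all times. Only when the voice activation module detects voice activation is the first controlled switch SK1 in FIG. 3 turned on, so that data flows onward into the sound source localization unit 12, the first voice enhancement unit 13 and the voice wake-up detection unit 14; most of the time these three modules are idle, which saves resources.
When a voice wake-up occurs, the pickup indicator light turns on and points in the pickup direction. The second controlled switch SK2 in FIG. 3 is then turned on, and the pickup data stream starts working. When a voice wake-up is signalled again, the pickup indicator light points in the new direction and pickup proceeds in the new direction. The pickup data stream is closed once it no longer contains speech.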
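The resource saving comes from the two controlled switches acting as gates. The sketch below models them as booleans and counts how often the gated units would run; it is an illustration only, and the class `GatedPipeline`, its counters and the rule of closing the pickup stream on the first silent frame are simplifying assumptions, not details from the application.

```python
from dataclasses import dataclass


@dataclass
class GatedPipeline:
    sk2_on: bool = False   # second controlled switch: pickup stream enabled
    wake_path_runs: int = 0  # times units 12, 13 and 14 had to run
    pickup_runs: int = 0     # times unit 15 had to run

    def step(self, voice_active: bool, wake_detected: bool) -> None:
        # SK1 is on only while the voice activation unit reports activity.
        sk1_on = voice_active
        if sk1_on:
            self.wake_path_runs += 1   # localization, enhancement, wake-up detection
            if wake_detected:
                self.sk2_on = True     # wake-up result signal turns SK2 on
        if self.sk2_on:
            if voice_active:
                self.pickup_runs += 1  # second voice enhancement unit
            else:
                self.sk2_on = False    # no speech left: pickup stream closes
```

Feeding a frame sequence that is mostly silence shows that the counters grow only during speech, which is the effect the description attributes to SK1 and SK2.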
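In the microphone array based sound pickup system of the present invention, the microphone array may be a planar ring structure composed of a plurality of microphone units. Referring to FIG. 4, the microphone array contains 8 microphone units in total, arranged symmetrically, so that signals from every direction are treated and processed equally. A plurality of pickup indicator lights are arranged around the circumference of the planar ring structure and can point in every direction of the plane, to indicate the sound source localization direction.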
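The ring geometry and the mapping from a localized direction to an indicator light can be sketched as below. Only the count of 8 microphone units comes from the description; the ring radius, the number of indicator lights and the convention that light 0 sits at 0 degrees are assumptions introduced for this illustration.

```python
import math
from typing import List, Tuple


def ring_positions(n_mics: int = 8, radius: float = 0.05) -> List[Tuple[float, float]]:
    """(x, y) coordinates in metres of microphones evenly spaced on a circle."""
    return [
        (radius * math.cos(2 * math.pi * k / n_mics),
         radius * math.sin(2 * math.pi * k / n_mics))
        for k in range(n_mics)
    ]


def nearest_light(direction_deg: float, n_lights: int = 8) -> int:
    """Index of the indicator light closest to the given azimuth in degrees."""
    step = 360.0 / n_lights
    # Wrap the azimuth into [0, 360) and round to the nearest light position.
    return int(round((direction_deg % 360.0) / step)) % n_lights
```

The even spacing reflects the symmetry noted in the description: rotating the array by one microphone position leaves the geometry unchanged, so no direction is favoured.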
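The above are only preferred embodiments of the present invention and do not thereby limit its implementations or scope of protection. Those skilled in the art should appreciate that all solutions obtained through equivalent substitutions and obvious variations made using the description and drawings of the present invention fall within the scope of protection of the present invention.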

Claims (10)

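  1. A microphone array based sound pickup method, characterized by comprising the following steps:
    Step 1: perform voice activation detection on one of the multiple voice signals picked up and output by a microphone array, and determine whether a voice activation signal occurs; if yes, perform step 2; if no, repeat step 1;
    Step 2: perform sound source localization using the multiple voice signals output by the microphone array to obtain a sound source localization direction;
    Step 3: perform voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
    Step 4: perform voice wake-up detection on the enhanced voice signal, and determine whether a voice wake-up is detected; if yes, perform step 5; otherwise, repeat step 1;
    Step 5: the microphone array picks up and outputs multiple voice signals;
    Step 6: process the multiple voice signals picked up by the microphone array into a single enhanced voice signal as the final pickup output.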
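  2. The microphone array based sound pickup method according to claim 1, characterized in that
    step 5 is: a pickup indicator light pointing in the sound source localization direction is lit, and at the same time the microphone array picks up and outputs multiple voice signals.
  3. The microphone array based sound pickup method according to claim 1 or 2, characterized in that step 1 is specifically as follows:
    Step 11: select one voice signal from the multiple voice signals captured by the microphone array;
    Step 12: detect the speech start point and speech end point of the speaker in the voice signal;
    Step 13: determine, from the signal between the speech start point and the speech end point, whether a voice activation signal occurs; if yes, perform step 2; otherwise, repeat step 1.
  4. The microphone array based sound pickup method according to claim 1 or 2, characterized in that the sound source localization in step 2 is specifically:
    obtaining the position of the sound source, as the sound source localization direction, from the time difference between the signals received by at least two microphones in the microphone array.
  5. The microphone array based sound pickup method according to claim 1 or 2, characterized in that the voice enhancement in step 3 is specifically: performing noise suppression on the voice signal in the sound source localization direction to obtain an enhanced voice signal.
  6. The microphone array based sound pickup method according to claim 1 or 2, characterized in that
    step 4 is specifically: feeding the enhanced voice signal into a voice wake-up model and detecting whether the enhanced voice signal contains the set wake-up word; if not, return to step 1; if so, perform step 5.
  7. The microphone array based sound pickup method according to claim 2, characterized in that, in step 6, the voice in the direction indicated by the pickup indicator light is enhanced.
  8. The microphone array based sound pickup method according to claim 1 or 2, characterized in that, while step 6 is performed after step 5, the method further comprises performing steps 1 to 5 on the multiple voice signals acquired in step 5.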
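  9. A microphone array based sound pickup system, characterized by comprising:
    a microphone array comprising a plurality of microphone units, the microphone units being used to pick up and output multiple voice signals;
    a voice activation unit, connected to the microphone array, which performs voice activation detection on at least one of the multiple voice signals and outputs a voice activation result signal or a voice non-activation result signal;
    a sound source localization unit, connected to the microphone array through a first controlled switch that is turned on under control of the voice activation result signal, which performs sound source localization on the multiple voice signals to determine a sound source localization direction;
    a first voice enhancement unit, connected to the sound source localization unit, which performs voice enhancement on the voice signal in the sound source localization direction to obtain an enhanced voice signal;
    a voice wake-up detection unit, connected to the first voice enhancement unit, which performs voice wake-up detection on the enhanced voice signal and outputs a voice wake-up result signal or a voice non-wake-up result signal;
    a second voice enhancement unit, connected to the microphone array through a second controlled switch that is turned on under control of the voice wake-up result signal, which processes the multiple voice signals of the microphone array into a single enhanced voice signal as the final pickup output.
  10. The microphone array based sound pickup system according to claim 9, characterized in that the microphone array is a planar ring structure composed of a plurality of microphone units, a plurality of pickup indicator lights are arranged around the circumference of the planar ring structure, and the pickup indicator lights are used to indicate the sound source localization direction.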
PCT/CN2018/074304 2017-01-26 2018-01-26 一种基于麦克风阵列的拾音方法及系统 WO2018137704A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/476,259 US11302341B2 (en) 2017-01-26 2018-01-26 Microphone array based pickup method and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710061599.3A CN106782585B (zh) 2017-01-26 2017-01-26 一种基于麦克风阵列的拾音方法及系统
CN201710061599.3 2017-01-26

Publications (1)

Publication Number Publication Date
WO2018137704A1 true WO2018137704A1 (zh) 2018-08-02

Family

ID=58955187

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/074304 WO2018137704A1 (zh) 2017-01-26 2018-01-26 一种基于麦克风阵列的拾音方法及系统

Country Status (4)

Country Link
US (1) US11302341B2 (zh)
CN (1) CN106782585B (zh)
TW (1) TWI667926B (zh)
WO (1) WO2018137704A1 (zh)

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106910500B (zh) 2016-12-23 2020-04-17 北京小鸟听听科技有限公司 对带麦克风阵列的设备进行语音控制的方法及设备
CN106782585B (zh) * 2017-01-26 2020-03-20 芋头科技(杭州)有限公司 一种基于麦克风阵列的拾音方法及系统
CN107277672B (zh) * 2017-06-07 2020-01-10 福州瑞芯微电子股份有限公司 一种支持唤醒模式自动切换的方法和装置
US10789949B2 (en) * 2017-06-20 2020-09-29 Bose Corporation Audio device with wakeup word detection
CN107331392A (zh) * 2017-06-30 2017-11-07 北京小米移动软件有限公司 位置提示方法、装置以及计算机可读存储介质
CN109300475A (zh) * 2017-07-25 2019-02-01 中国电信股份有限公司 麦克风阵列拾音方法和装置
CN107591151B (zh) * 2017-08-22 2021-03-16 百度在线网络技术(北京)有限公司 远场语音唤醒方法、装置和终端设备
US10951967B2 (en) * 2017-08-23 2021-03-16 Amazon Technologies, Inc. Voice-controlled multimedia device and universal remote
CN107577449B (zh) * 2017-09-04 2023-06-23 百度在线网络技术(北京)有限公司 唤醒语音的拾取方法、装置、设备及存储介质
CN107464565B (zh) * 2017-09-20 2020-08-04 百度在线网络技术(北京)有限公司 一种远场语音唤醒方法及设备
CN107767868A (zh) * 2017-10-23 2018-03-06 深圳北鱼信息科技有限公司 麦克风阵列及语音控制系统
CN107818793A (zh) * 2017-11-07 2018-03-20 北京云知声信息技术有限公司 一种可减少无用语音识别的语音采集处理方法及装置
CN108182948B (zh) * 2017-11-20 2021-08-20 云知声智能科技股份有限公司 可提高语音识别率的语音采集处理方法及装置
US10524046B2 (en) * 2017-12-06 2019-12-31 Ademco Inc. Systems and methods for automatic speech recognition
CN108122563B (zh) * 2017-12-19 2021-03-30 北京声智科技有限公司 提高语音唤醒率及修正doa的方法
CN108093350B (zh) * 2017-12-21 2020-12-15 广东小天才科技有限公司 麦克风的控制方法和麦克风
CN107948910A (zh) * 2017-12-21 2018-04-20 重庆金鑫科技产业发展有限公司 一种新型麦克风安装面板
CN110575051B (zh) * 2018-06-11 2022-03-18 佛山市顺德区美的电热电器制造有限公司 一种烹饪设备及烹饪设备的控制方法、装置和存储介质
CN108986833A (zh) * 2018-08-21 2018-12-11 广州市保伦电子有限公司 基于麦克风阵列的拾音方法、系统、电子设备及存储介质
CN109246550A (zh) * 2018-10-31 2019-01-18 北京小米移动软件有限公司 远场拾音方法、远场拾音装置及电子设备
CN111354341A (zh) * 2018-12-04 2020-06-30 阿里巴巴集团控股有限公司 语音唤醒方法及装置、处理器、音箱和电视机
CN110033773B (zh) * 2018-12-13 2021-09-14 蔚来(安徽)控股有限公司 用于车辆的语音识别方法、装置、系统、设备以及车辆
CN110351633B (zh) * 2018-12-27 2022-05-24 腾讯科技(深圳)有限公司 声音采集设备
CN111383649A (zh) * 2018-12-28 2020-07-07 深圳市优必选科技有限公司 一种机器人及其音频处理方法
CN109697987B (zh) * 2018-12-29 2021-05-25 思必驰科技股份有限公司 一种外接式的远场语音交互装置及实现方法
CN110010126B (zh) * 2019-03-11 2021-10-08 百度国际科技(深圳)有限公司 语音识别方法、装置、设备和存储介质
CN109920433B (zh) * 2019-03-19 2021-08-20 上海华镇电子科技有限公司 嘈杂环境下电子设备的语音唤醒方法
CN110992974B (zh) 2019-11-25 2021-08-24 百度在线网络技术(北京)有限公司 语音识别方法、装置、设备以及计算机可读存储介质
CN111246339B (zh) * 2019-12-31 2021-12-07 上海景吾智能科技有限公司 一种调节拾音方向的方法、系统、存储介质及智能机器人
CN111048104B (zh) * 2020-01-16 2022-11-29 北京声智科技有限公司 语音增强处理方法、装置及存储介质
CN111667832A (zh) * 2020-06-14 2020-09-15 史长宣 一种多场所智能语音控制系统及方法
US11410652B2 (en) * 2020-07-06 2022-08-09 Tencent America LLC Multi-look enhancement modeling and application for keyword spotting
CN112185406A (zh) * 2020-09-18 2021-01-05 北京大米科技有限公司 声音处理方法、装置、电子设备和可读存储介质
TWI782578B (zh) * 2021-06-16 2022-11-01 信邦電子股份有限公司 自行車電控系統及使用者身分認證方法
US11704949B2 (en) 2021-07-21 2023-07-18 Sinbon Electronics Company Ltd. User verifying bicycle control system and user verification method thereof

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
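CN104488025A (zh) * 2012-03-16 2015-04-01 纽昂斯通讯公司 User-dedicated automatic speech recognition
CN204390479U (zh) * 2015-03-04 2015-06-10 冠捷显示科技(厦门)有限公司 Remote control device for smart household appliances
CN105280183A (zh) * 2015-09-10 2016-01-27 百度在线网络技术(北京)有限公司 Voice interaction method and system
CN106024003A (zh) * 2016-05-10 2016-10-12 北京地平线信息技术有限公司 Image-combined speech localization and enhancement system and method
CN106098075A (zh) * 2016-08-08 2016-11-09 腾讯科技(深圳)有限公司 Microphone array-based audio collection method and apparatus
CN106155621A (zh) * 2015-04-20 2016-11-23 钰太芯微电子科技(上海)有限公司 Keyword voice wake-up system and method capable of identifying sound source position, and mobile terminal
CN106782585A (zh) * 2017-01-26 2017-05-31 芋头科技(杭州)有限公司 Microphone array-based sound pickup method and system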

Family Cites Families (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1184676B1 (en) * 2000-09-02 2004-05-06 Nokia Corporation System and method for processing a signal being emitted from a target signal source into a noisy environment
AT410597B (de) * 2000-12-04 2003-06-25 Vatter Acoustic Technologies V Method, computer system and computer product for measuring acoustic room properties
US7970123B2 (en) * 2005-10-20 2011-06-28 Mitel Networks Corporation Adaptive coupling equalization in beamforming-based communication systems
EP2197219B1 (en) * 2008-12-12 2012-10-24 Nuance Communications, Inc. Method for determining a time delay for time delay compensation
US9226088B2 (en) * 2011-06-11 2015-12-29 Clearone Communications, Inc. Methods and apparatuses for multiple configurations of beamforming microphone arrays
TR201807219T4 (tr) * 2012-01-17 2018-06-21 Koninklijke Philips Nv Audio source position estimation
US8885815B1 (en) * 2012-06-25 2014-11-11 Rawles Llc Null-forming techniques to improve acoustic echo cancellation
CN102831898B (zh) * 2012-08-31 2013-11-13 厦门大学 Microphone array speech enhancement device with sound source direction tracking function and method thereof
US9615172B2 (en) * 2012-10-04 2017-04-04 Siemens Aktiengesellschaft Broadband sensor location selection using convex optimization in very large scale arrays
JP6352259B2 (ja) * 2013-06-27 2018-07-04 Panasonic Intellectual Property Corporation of America Control device and control method
US9640179B1 (en) * 2013-06-27 2017-05-02 Amazon Technologies, Inc. Tailoring beamforming techniques to environments
US9565497B2 (en) * 2013-08-01 2017-02-07 Caavo Inc. Enhancing audio using a mobile device
US9532131B2 (en) * 2014-02-21 2016-12-27 Apple Inc. System and method of improving voice quality in a wireless headset with untethered earbuds of a mobile device
KR102208477B1 (ko) * 2014-06-30 2021-01-27 삼성전자주식회사 Microphone operating method and electronic device supporting same
US20160071526A1 (en) * 2014-09-09 2016-03-10 Analog Devices, Inc. Acoustic source tracking and selection
US10204622B2 (en) * 2015-09-10 2019-02-12 Crestron Electronics, Inc. Acoustic sensory network
JP2016127300A (ja) * 2014-12-26 2016-07-11 アイシン精機株式会社 Speech processing device
KR102351366B1 (ko) * 2015-01-26 2022-01-14 삼성전자주식회사 Speech recognition method and apparatus
US9565493B2 (en) * 2015-04-30 2017-02-07 Shure Acquisition Holdings, Inc. Array microphone system and method of assembling the same
US10334390B2 (en) * 2015-05-06 2019-06-25 Idan BAKISH Method and system for acoustic source enhancement using acoustic sensor array
US10134425B1 (en) * 2015-06-29 2018-11-20 Amazon Technologies, Inc. Direction-based speech endpointing
JP6625383B2 (ja) * 2015-09-18 2019-12-25 株式会社ディーアンドエムホールディングス Computer-readable program, audio controller, and wireless audio system
KR102476600B1 (ko) * 2015-10-21 2022-12-12 삼성전자주식회사 Electronic device, speech recognition method thereof, and non-transitory computer-readable recording medium
KR102444061B1 (ko) * 2015-11-02 2022-09-16 삼성전자주식회사 Electronic device and method capable of speech recognition
US20170134853A1 (en) * 2015-11-09 2017-05-11 Stretch Tech Llc Compact sound location microphone
US9826599B2 (en) * 2015-12-28 2017-11-21 Amazon Technologies, Inc. Voice-controlled light switches
US9947316B2 (en) * 2016-02-22 2018-04-17 Sonos, Inc. Voice control of a media playback system
US10063965B2 (en) * 2016-06-01 2018-08-28 Google Llc Sound source estimation using neural networks
US9978390B2 (en) * 2016-06-09 2018-05-22 Sonos, Inc. Dynamic player selection for audio signal processing
US10482899B2 (en) * 2016-08-01 2019-11-19 Apple Inc. Coordination of beamformers for noise estimation and noise suppression
KR102515996B1 (ko) * 2016-08-26 2023-03-31 삼성전자주식회사 Electronic device for speech recognition and control method thereof
US10127908B1 (en) * 2016-11-11 2018-11-13 Amazon Technologies, Inc. Connected accessory for a voice-controlled device
BR112019013666A2 (pt) * 2017-01-03 2020-01-14 Koninklijke Philips Nv Beamforming audio capture apparatus, operating method for a beamforming audio capture apparatus, and computer program product
US10297267B2 (en) * 2017-05-15 2019-05-21 Cirrus Logic, Inc. Dual microphone voice processing for headsets with variable microphone array orientation
US10580411B2 (en) * 2017-09-25 2020-03-03 Cirrus Logic, Inc. Talker change detection
EP3467819A1 (en) * 2017-10-05 2019-04-10 Harman Becker Automotive Systems GmbH Apparatus and method using multiple voice command devices
US10620913B2 (en) * 2017-12-04 2020-04-14 Amazon Technologies, Inc. Portable voice assistant device with linear lighting elements
TWM579809U (zh) * 2019-01-11 2019-06-21 陳筱涵 Communication aid system for severely hearing impaired

Also Published As

Publication number Publication date
US20190355375A1 (en) 2019-11-21
TW201828719A (zh) 2018-08-01
CN106782585A (zh) 2017-05-31
TWI667926B (zh) 2019-08-01
US11302341B2 (en) 2022-04-12
CN106782585B (zh) 2020-03-20

Similar Documents

Publication Publication Date Title
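WO2018137704A1 (zh) Microphone array-based sound pickup method and system
CN106782591B (zh) Device and method for improving speech recognition rate under background noise
CN107464565B (zh) Far-field voice wake-up method and device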
EP3185521A1 (en) Voice wake-up method and device
US11605372B2 (en) Time-based frequency tuning of analog-to-information feature extraction
US9437188B1 (en) Buffered reprocessing for multi-microphone automatic speech recognition assist
CN108665895B (zh) Method, apparatus and system for processing information
Sudharsan et al. Ai vision: Smart speaker design and implementation with object detection custom skill and advanced voice interaction capability
CN108986833A (zh) Microphone array-based sound pickup method, system, electronic device and storage medium
US20180174574A1 (en) Methods and systems for reducing false alarms in keyword detection
CN206312566U (zh) Vehicle-mounted intelligent audio device
US11551700B2 (en) Systems and methods for power-efficient keyword detection
WO2020048431A1 (zh) Speech processing method, electronic device and display device
US11626104B2 (en) User speech profile management
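CN105049802A (zh) Speech recognition law enforcement recorder and recognition method thereof
CN109327752A (zh) Bluetooth sound pickup, vehicle-mounted device and automobile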
GB2526980A (en) Sensor input recognition
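CN109473111B (zh) Voice enabling device and method
CN108616790B (zh) Sound pickup and playback circuit and system, and sound pickup/playback switching method
CN208538474U (zh) Speech recognition system
WO2020238703A1 (zh) Method and apparatus for acquiring speech signal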
US11735187B2 (en) Hybrid routing for hands-free voice assistant, and related systems and methods
WO2019214299A1 (zh) Automatic translation apparatus, method and computer device
GB2553040A (en) Sensor input recognition
CN114582351A (zh) Online audio source control method, apparatus and device

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application Ref document number: 18745228; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
122 Ep: PCT application non-entry in European phase Ref document number: 18745228; Country of ref document: EP; Kind code of ref document: A1