US9830913B2 - VAD detection apparatus and method of operation the same - Google Patents

Publication number
US9830913B2
Authority
US
United States
Prior art keywords
estimate
power
data representative
acoustic energy
voice activity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US14/861,113
Other versions
US20160064001A1 (en)
Inventor
Henrik Thomsen
Dibyendu Nandy
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Knowles Electronics LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Knowles Electronics LLC
Priority to US14/861,113
Assigned to KNOWLES ELECTRONICS, LLC (assignment of assignors interest; see document for details). Assignors: THOMSEN, HENRIK; NANDY, DIBYENDU
Publication of US20160064001A1
Application granted
Publication of US9830913B2
Assigned to SAMSUNG ELECTRONICS CO., LTD. (assignment of assignors interest; see document for details). Assignors: KNOWLES ELECTRONICS, LLC
Legal status: Active
Adjusted expiration

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification techniques
    • G10L17/02Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R19/00Electrostatic transducers
    • H04R19/04Microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/005Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/16Communication-related supplementary services, e.g. call-transfer or call-hold
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L2015/088Word spotting
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals

Definitions

  • This application relates to microphones and, more specifically, to voice activity detection (VAD) approaches used with these microphones.
  • VAD voice activity detection
  • Microphones are used to obtain a voice signal from a speaker. Once obtained, the signal can be processed in a number of different ways. A wide variety of functions can be provided by today's microphones and they can interface with and utilize a variety of different algorithms.
  • Voice triggering, for example, as used in mobile systems, is an increasingly popular feature that customers wish to use. For example, a user may wish to speak commands into a mobile device and have the device react in response to the commands.
  • a programmable digital signal processor may first use a voice activity detection algorithm to detect if there is voice in an audio signal captured by a microphone, and then, subsequently, analysis is performed on the signal to predict what the spoken word was in the received audio signal.
  • FIG. 1 is a block diagram of a system with microphones that use VAD;
  • FIG. 2 is a state transition diagram showing an interrupt sequence;
  • FIG. 3 is a block diagram of a VAD approach;
  • FIG. 4 is an analyze filter bank used in VAD;
  • FIG. 5 is a block diagram of high pass and low pass filters used in an analyze filter bank;
  • FIG. 6 is a graph of the results of the analyze filter bank;
  • FIG. 7 is a block diagram of the tracker block;
  • FIG. 8 is a graph of the results of the tracker block;
  • FIG. 9 is a block diagram of a decision block.
  • the present approaches provide voice activity detection (VAD) methods and devices that determine whether an event or human voice is present.
  • the approaches described herein are efficient and easy to implement, lower part counts, detect voice with very low latency, and reduce false detections.
  • an application specific integrated circuit (ASIC) or microprocessor can be used to implement the approaches described herein using programmed computer instructions.
  • while the VAD approaches may be disposed in the microphone (as described herein), these functionalities may also be disposed in other system elements.
  • a first signal from a first microphone and a second signal from a second microphone are received.
  • the first signal indicates whether a voice signal has been determined at the first microphone
  • the second signal indicates whether a voice signal has been determined at the second microphone.
  • the processing device is activated to receive data and the data is examined for a trigger word.
  • a signal is sent to an application processor to further process information from one or more of the first microphone and the second microphone.
  • the processing device is reset to deactivate data input and allow the first microphone and the second microphone to enter or maintain an event detection mode of operation.
  • the application processor utilizes a voice recognition (VR) module to determine whether other or further commands can be recognized in the information.
  • the first microphone and the second microphone transmit pulse density modulation (PDM) data.
  • PDM pulse density modulation
  • the first microphone includes a first voice activity detection (VAD) module that determines whether voice activity has been detected
  • the second microphone includes a second voice activity detection (VAD) module that determines whether voice activity has been detected.
  • the first VAD module and the second VAD module perform the steps of: receiving sound energy from a source; filtering the sound energy into a plurality of filter bands; obtaining a power estimate for each of the plurality of filter bands; and based upon each power estimate, determining whether voice activity is detected.
  • the filtering utilizes one or more low pass filters, high pass filters, and frequency dividers.
  • the power estimate comprises an upper power estimate and a lower power estimate.
  • either the first VAD module or the second VAD module performs Trigger Phrase recognition. In other aspects, either the first VAD module or the second VAD module performs Command Recognition.
  • the processing device controls the first microphone and the second microphone by varying a clock frequency of a clock supplied to the first microphone and the second microphone.
  • the system includes a first microphone with a first voice activity detection (VAD) module and a second microphone with a second voice activity detection (VAD) module, and a processing device.
  • the processing device is communicatively coupled to the first microphone and the second microphone, and configured to receive a first signal from the first microphone and a second signal from the second microphone.
  • the first signal indicates whether a voice signal has been determined at the first microphone by the first VAD module
  • the second signal indicates whether a voice signal has been determined at the second microphone by the second VAD module.
  • the processing device is further configured to, when the first signal indicates potential voice activity or the second signal indicates potential voice activity, activate and receive data from the first microphone or the second microphone, and subsequently examine the data for a trigger word.
  • a signal is sent to an application processor to further process information from one or more of the first microphone and the second microphone.
  • the processing device is further configured to, when no trigger word is found, transmit a third signal to the first microphone and the second microphone.
  • the third signal causes the first microphone and second microphone to enter or maintain an event detection mode of operation.
  • either the first VAD module or the second VAD module performs Trigger Phrase recognition. In another aspect, either the first VAD module or the second VAD module performs Command Recognition. In other examples, the processing device controls the first microphone and the second microphone by varying a clock frequency of a clock supplied to the first microphone and the second microphone.
  • voice activity is detected in a micro-electro-mechanical system (MEMS) microphone.
  • Sound energy is received from a source and the sound energy is filtered into a plurality of filter bands.
  • a power estimate is obtained for each of the plurality of filter bands. Based upon each power estimate, a determination is made as to whether voice activity is detected.
  • MEMS micro-electro-mechanical system
  • the filtering utilizes one or more low pass filters, high pass filters and frequency dividers.
  • the power estimate comprises an upper power estimate and a lower power estimate.
  • ratios between the upper power estimate and the lower power estimate within the plurality of filter bands are determined, and selected ones of the ratios are compared to a predetermined threshold.
  • ratios between the upper power estimate and the lower power estimate between the plurality of filter bands are determined, and selected ones of the ratios are compared to a predetermined threshold.
  • the system 100 includes a first microphone element 102, a second microphone element 104, a right event microphone 106, a left event microphone 108, a digital signal processor (DSP)/codec 110, and an application processor 112.
  • DSP digital signal processor
  • the first microphone element 102 and the second microphone element 104 are microelectromechanical system (MEMS) elements that receive sound energy and convert the sound energy into electrical signals that represent the sound energy.
  • the elements 102 and 104 include a MEMS die, a diaphragm, and a back plate. Other components may also be used.
  • the right event microphone 106 and the left event microphone 108 receive signals from the microphone elements 102 and 104, and process these signals.
  • the elements 106 and 108 may include buffers, preamplifiers, analog-to-digital (A-to-D) converters, and other processing elements that convert the analog signal received from elements 102 and 104 into digital signals and perform other processing functions. These elements may, for example, include an ASIC that implements these functions.
  • the right event microphone 106 and the left event microphone 108 also include voice activity detection (VAD) modules 103 and 105 respectively and these may be implemented by an ASIC that executes programmed computer instructions.
  • VAD modules 103 and 105 utilize the approaches described herein to determine whether voice (or some other event) has been detected.
  • This information is transmitted to the digital signal processor (DSP)/codec 110 and the application processor 112 for further processing. Also, the signals (potentially voice information) now in the form of digital information are sent to the digital signal processor (DSP)/codec 110 and the application processor 112 .
  • the digital signal processor (DSP)/codec 110 receives signals from the elements 106 and 108 (including whether the VAD modules have detected voice) and looks for trigger words (e.g., “Hello, My Mobile”) using a voice recognition (VR) trigger engine 120.
  • the codec 110 also performs interrupt processing (see FIG. 2) using interrupt handling module 122. If the trigger word is found, a signal is sent to the application processor 112 to further process received information.
  • the application processor 112 may utilize a VR recognition module 126 (e.g., implemented as hardware and/or software) to determine whether other or further commands can be recognized in the information.
  • the right event microphone 106 and/or the left event microphone 108 will wake up the digital signal processor (DSP)/codec 110 and the application processor 112 by starting to transmit pulse density modulation (PDM) data.
  • General input/output (I/O) pins 113 of the digital signal processor (DSP)/codec 110 and the application processor 112 are assumed to be configurable for interrupts (or simply polling) as described below with respect to FIG. 2.
  • the modules 103 and 105 may perform different recognition functions; one VAD module may perform Trigger Keyword recognition and a second VAD module may perform Command Recognition.
  • the digital signal processor (DSP)/codec 110 and the application processor 112 control the right event microphone 106 and the left event microphone 108 by varying the clock frequency of the clock 124.
  • the microphone 106 or 108 interrupts/wakes up the digital signal processor (DSP)/codec 110 in case of an event being detected.
  • the event may be voice (e.g., it could be the start of the voice trigger word).
  • the digital signal processor (DSP)/codec 110 puts the microphone back in Event Detection mode in case no trigger word is present.
  • the digital signal processor (DSP)/codec 110 determines when to change the microphone back to Event Detection mode.
  • the internal VAD of the DSP/codec 110 and/or the internal voice trigger recognition system of the DSP/codec 110 could be used to make this decision.
  • if the word trigger recognition has not recognized any trigger word after approximately 2 or 3 seconds, the DSP/codec should configure its input/output pin to be an interrupt pin again, set the microphone back into detecting mode (step 204 in FIG. 2), and then go into sleep mode/power down.
  • the microphone may also track the time of contiguous voice activity. If activity does not persist beyond a certain countdown (e.g., 5 seconds) and the microphone stays in the low power VAD mode of operation (i.e., it is not put into a standard or high performance mode within that time frame), the implication is that the voice trigger was not detected within that period of detected voice activity. There is then no further activity, and the microphone may initiate a change from detect and transmit mode to detection mode. A DSP/codec, on detecting no transmission from the microphone, may also go to low power sleep mode.
  • the VAD approaches described herein can include three functional blocks: an analyze filter bank 302, a power tracker block or module 304, and a decision block or module 306.
  • the analyze filter bank 302 filters the input signal into five spectral bands.
  • the power tracker block 304 includes an upper tracker and a lower tracker. For each of these and for each band it obtains a power estimate.
  • the decision block 306 looks at the power estimates and determines if voice or an acoustic event is present.
  • the threshold values can be set by a number of different approaches such as one time parts (OTPs), or various types of wired or wireless interfaces 310 .
  • feedback 308 from the decision block 306 can control the power trackers; this feedback could be the VAD decision.
  • the trackers (described below) could be configured to use another set of attack/release constants if voice is present.
  • the functions described herein can be deployed in any number of functional blocks and it will be understood that the three blocks described are examples only.
  • referring to FIGS. 4, 5, and 6, one example of an analyze filter bank is described. The processing is very similar to a subband coding system, which may be implemented by the wavelet transform, by Quadrature Mirror Filters (QMF), or by other similar approaches.
  • the decimation stage on the high pass filters (D) is omitted compared to the more traditional subband coding/wavelet transform method.
  • the reason for the omission is that, later in the signal processing, an estimate of the root mean square (RMS) of the energy or power value is obtained, and overlap in frequency between the low pass filtering (used to derive the “Mean” of RMS) and the pass band of the analyze filter bank is not desired.
  • RMS root mean square
  • This approach will relax the filter requirement for the “Mean” low pass filter.
  • the decimation stage could be introduced as this would save computational requirements.
  • the filter bank includes high pass filters 402 (D), low pass filters 404 (H), and sample frequency dividers 406 (Fs is the sample frequency of the particular channel).
  • This apparatus operates similarly to a sub-band coding approach and, like the wavelet transform, has a consistent relative bandwidth.
  • the incoming signal is separated into five bands. Other numbers of bands can also be used.
  • channel 5 has a pass band from 4000 Hz to 8000 Hz;
  • channel 4 has a pass band from 2000 Hz to 4000 Hz;
  • channel 3 has a pass band from 1000 Hz to 2000 Hz;
  • channel 2 has a pass band from 500 Hz to 1000 Hz; and
  • channel 1 has a pass band from 0 Hz to 500 Hz.
  • the high pass filter and the low pass filter are constructed from two all pass filters 502 (G1) and 504 (G2). These filters could be first or second order all pass infinite impulse response (IIR) structures.
  • the input signal X(z) passes through delay block 501.
  • a low pass filtered sample 512 and a high pass filtered sample 514 are generated.
  • the decimation structure gives several benefits. For example, the order of the H and D filters is doubled (e.g., two times), and the number of gates and the power are reduced in the system.
  • a first curve 602 shows the low pass filter response while a second curve 604 shows the high pass filter response.
  • the tracker 700 includes an absolute value block 702, a SINC decimation block 704, and an upper and lower tracker block 706.
  • the block 702 obtains the absolute value of the signal (this could also be the square value).
  • the SINC block 704 is a first order SINC with a decimation factor of N; it simply accumulates N absolute signal values and then dumps this data after a predetermined time (N sample periods).
  • any kind of decimation filter could be used.
  • a short time RMS estimate is found by rectifying and averaging/decimating by the SINC block 704 (i.e., accumulation and dump; if squaring was used in block 702, then a square root operator could be introduced here as well).
  • the decimation factors N are chosen so that the sample rate of each short time RMS estimate is 125 Hz or 250 Hz, except for the DC channel (channel 1), where the sample rate is 62.5 Hz or 125 Hz.
  • a lower tracker and an upper tracker (i.e., one tracker pair for each channel) are included in the tracker block 706.
  • the operation of the tracker block 706 can be described as:
  • the sample index number is n
  • Kau_i and Kru_i are attack and release constants for the upper tracker for channel number i.
  • Kal_i and Krl_i are attack and release constants for the lower tracker for channel number i.
  • the output of this block is fed to the decision block described below with respect to FIG. 9.
  • a first curve 802 shows the upper tracker that follows fast changes in power or RMS.
  • a second curve 804 shows the lower tracker following slower changes in the power or RMS.
  • a third curve 806 represents the input signal to the tracker block.
  • Block 902 is redrawn in FIG. 9 in order to make it easier for the reader (blocks 706 and 902 are the same tracker blocks).
  • the decision block uses the output from the trackers and includes a division block 904 to determine the ratio between the upper and lower tracker for each channel, summation block 908, comparison block 910, and sign block 912.
  • the internal operation of the division block 904 is structured and configured so that an actual division need not be made.
  • the lower tracker value lower_i(n) is multiplied by Th_i(n) (a predetermined threshold which could be constant and independent of n or changed according to a rule). This is subtracted from the upper_i(n) tracker value.
  • the sign(x) function is then performed.
  • Upper and lower tracker signals are estimated by upper and lower tracker block 902 (this block is identical to block 706).
  • the ratio between the upper tracker and the lower tracker is then calculated by division block 904.
  • This ratio is compared with a threshold Th_i(n).
  • the flag R_flag_i(n) is set if the ratio is larger than the threshold Th_i(n), i.e., if sign(x) in 904 is positive.
  • Th_i(n) could be constant over time for each channel or follow a rule where it actually changes for each sample instance n.
  • ratios between channels can also be used/calculated.
  • a total number of 25 ratios can be calculated (if 5 filter bands exist). Again, each of these ratios is compared with a threshold Th_i,ch(n).
  • A total of 25 thresholds Th_i,ch(n) exist if 5 channels are available. Again, the threshold can be constant over time n, or change for each sample instance n. In one implementation, not all of the ratios between bands will be used, only a subset.
  • a voice power flag V_flag(n) is also estimated as the sum of three channels from 500 Hz to 4000 Hz by summation block 908. This flag is set if the power level is low enough (i.e., smaller than V_th(n)), and this is determined by comparison block 910 and sign block 912. This flag is only in effect when the microphone is in a quiet environment and/or the persons speaking are far away from the microphone.
  • the R_flag_i(n) and V_flag(n) are used to decide if the current time step “n” is voice, and the result is stored in E_flag(n).
  • the operation that determines if E_flag (n) is voice (1) or not voice (0) can be described by the following:
  • the final VAD_flag(n) is a smoothed version of the E_flag(n). It simply holds a positive VAD decision true for a minimum period of VAD_NUMBER sample periods.
  • This smoothing can be described by the following approach. This approach can be used to determine if a voice event is detected, but that the voice is present in the background and therefore of no interest. In this respect, a false positive reading is avoided.
  • Hang-on-count represents a time of approximately VAD_NUMBER/Sample Rate.
  • Sample Rate is the rate of the fastest channel, i.e., 250, 125 or 62.5 Hz. It will be appreciated that these approaches examine to see if 4 flags have been set. However, it will be appreciated that any number of threshold values (flags) can be examined.
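
As a rough illustration of the decision logic described above, the following Python sketch computes the per-channel ratio flags without a division, a voice power flag, and the hang-on smoothing. The function names, the choice of which band power is fed to the voice power flag, and the example numbers are assumptions made for illustration. The rule that combines R_flag_i(n) and V_flag(n) into E_flag(n) is not reproduced in the text above, so E_flag(n) is treated here as an input.

```python
def ratio_flags(upper, lower, thresholds):
    """R_flag_i(n): 1 when upper_i(n) - Th_i(n) * lower_i(n) is positive.

    For positive lower values this matches upper/lower > Th,
    but no division is required.
    """
    return [1 if u - th * lo > 0 else 0
            for u, lo, th in zip(upper, lower, thresholds)]


def voice_power_flag(band_power, v_th):
    """V_flag(n): 1 when the summed power of channels 2 to 4 is below v_th.

    Assumption: band_power holds one power value per channel, channel 1
    first, so indices 1..3 cover 500 Hz to 4000 Hz.
    """
    return 1 if sum(band_power[1:4]) < v_th else 0


def smooth_vad(e_flags, vad_number):
    """VAD_flag(n): hold each positive E_flag(n) for vad_number sample periods."""
    vad_flags = []
    hang_on_count = 0
    for e in e_flags:
        if e:
            hang_on_count = vad_number  # restart the hold on every positive
        if hang_on_count > 0:
            vad_flags.append(1)
            hang_on_count -= 1
        else:
            vad_flags.append(0)
    return vad_flags
```

With a 125 Hz channel rate, an illustrative VAD_NUMBER of 25 would hold a positive decision for about 25/125 = 0.2 seconds.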

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Otolaryngology (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • User Interface Of Digital Computer (AREA)
  • Telephone Function (AREA)

Abstract

A microphone assembly includes an acoustic sensor and a voice activity detector on an integrated circuit coupled to an external-device interface. The acoustic sensor produces an electrical signal representative of acoustic energy detected by the sensor. A filter bank separates data representative of the acoustic energy into a plurality of frequency bands. A power tracker obtains a power estimate for at least one band, including a first estimate based on relatively fast changes in a power metric of the data and a second estimate based on relatively slow changes in a power metric of the data. The presence of voice activity in the electrical signal is based upon the power estimate.
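
The fast and slow estimates can be pictured as a pair of attack/release smoothers running on the same power metric. The Python sketch below shows one common way to build such trackers; the update rule, the constant values, and the names are assumptions for illustration and are not taken from the patent text.

```python
def track(prev, x, k_attack, k_release):
    """One smoothing step: use k_attack when the input rises above the
    previous estimate, k_release when it falls below it."""
    k = k_attack if x > prev else k_release
    return prev + k * (x - prev)


def run_trackers(power, k_upper=(0.5, 0.01), k_lower=(0.01, 0.5)):
    """Return (upper, lower) tracker outputs for a non-empty power sequence.

    k_upper and k_lower are hypothetical (attack, release) pairs: the upper
    tracker rises quickly and decays slowly, the lower tracker does the
    opposite and so settles near the background level.
    """
    upper = lower = power[0]
    uppers, lowers = [], []
    for x in power:
        upper = track(upper, x, *k_upper)
        lower = track(lower, x, *k_lower)
        uppers.append(upper)
        lowers.append(lower)
    return uppers, lowers
```

When a steady background is followed by a short burst, the upper output jumps toward the burst level while the lower output barely moves, so the ratio between the two grows.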

Description

CROSS REFERENCE TO RELATED APPLICATION
This application is a continuation of U.S. application Ser. No. 14/525,413 (now granted as U.S. Pat. No. 9,147,397), entitled “VAD Detection Apparatus and Method of Operating the Same,” filed Oct. 28, 2014, which claims the benefit under 35 U.S.C. §119 (e) to U.S. Provisional Application No. 61/896,723, entitled “VAD Detection Apparatus and method of operating the same,” filed Oct. 29, 2013, both of which are incorporated herein by reference in their entireties.
TECHNICAL FIELD
This application relates to microphones and, more specifically, to voice activity detection (VAD) approaches used with these microphones.
BACKGROUND
Microphones are used to obtain a voice signal from a speaker. Once obtained, the signal can be processed in a number of different ways. A wide variety of functions can be provided by today's microphones and they can interface with and utilize a variety of different algorithms.
Voice triggering, for example, as used in mobile systems, is an increasingly popular feature that customers wish to use. For example, a user may wish to speak commands into a mobile device and have the device react in response to the commands. In these cases, a programmable digital signal processor (DSP) may first use a voice activity detection algorithm to detect if there is voice in an audio signal captured by a microphone, and then, subsequently, analysis is performed on the signal to predict what the spoken word was in the received audio signal. Various voice activity detection (VAD) approaches have been developed and deployed in various types of devices such as cellular phones and personal computers.
In the use of these approaches, false detections, trigger word detections, part counts and silicon area and current consumption have become concerns, especially since these approaches are deployed in electronic devices such as cellular phones. Previous approaches have proven inadequate to address these concerns. Consequently, some user dissatisfaction has developed with respect to these previous approaches.
BRIEF DESCRIPTION OF THE DRAWINGS
For a more complete understanding of the disclosure, reference should be made to the following detailed description and accompanying drawings wherein:
FIG. 1 is a block diagram of a system with microphones that use VAD;
FIG. 2 is a state transition diagram showing an interrupt sequence;
FIG. 3 is a block diagram of a VAD approach;
FIG. 4 is an analyze filter bank used in VAD;
FIG. 5 is a block diagram of high pass and low pass filters used in an analyze filter bank;
FIG. 6 is a graph of the results of the analyze filter bank;
FIG. 7 is a block diagram of the tracker block;
FIG. 8 is a graph of the results of the tracker block;
FIG. 9 is a block diagram of a decision block.
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity. It will further be appreciated that certain actions and/or steps may be described or depicted in a particular order of occurrence while those skilled in the art will understand that such specificity with respect to sequence is not actually required. It will also be understood that the terms and expressions used herein have the ordinary meaning as is accorded to such terms and expressions with respect to their corresponding respective areas of inquiry and study except where specific meanings have otherwise been set forth herein.
DETAILED DESCRIPTION
The present approaches provide voice activity detection (VAD) methods and devices that determine whether an event or human voice is present. The approaches described herein are efficient and easy to implement, lower part counts, detect voice with very low latency, and reduce false detections.
It will be appreciated that the approaches described herein can be implemented using any combination of hardware or software elements. For example, an application specific integrated circuit (ASIC) or microprocessor can be used to implement the approaches described herein using programmed computer instructions. Additionally, while the VAD approaches may be disposed in the microphone (as described herein), these functionalities may also be disposed in other system elements.
In many of these embodiments and at a processing device, a first signal from a first microphone and a second signal from a second microphone are received. The first signal indicates whether a voice signal has been determined at the first microphone, and the second signal indicates whether a voice signal has been determined at the second microphone. When the first signal indicates potential voice activity or the second signal indicates potential voice activity, the processing device is activated to receive data and the data is examined for a trigger word. When the trigger word is found, a signal is sent to an application processor to further process information from one or more of the first microphone and the second microphone. When no trigger word is found, the processing device is reset to deactivate data input and allow the first microphone and the second microphone to enter or maintain an event detection mode of operation.
In other aspects, the application processor utilizes a voice recognition (VR) module to determine whether other or further commands can be recognized in the information. In other examples, the first microphone and the second microphone transmit pulse density modulation (PDM) data.
In some other aspects, the first microphone includes a first voice activity detection (VAD) module that determines whether voice activity has been detected, and the second microphone includes a second voice activity detection (VAD) module that determines whether voice activity has been detected. In some examples, the first VAD module and the second VAD module perform the steps of: receiving sound energy from a source; filtering the sound energy into a plurality of filter bands; obtaining a power estimate for each of the plurality of filter bands; and based upon each power estimate, determining whether voice activity is detected.
In some examples, the filtering utilizes one or more low pass filters, high pass filters, and frequency dividers. In other examples, the power estimate comprises an upper power estimate and a lower power estimate.
In some aspects, either the first VAD module or the second VAD module performs Trigger Phrase recognition. In other aspects, either the first VAD module or the second VAD module performs Command Recognition.
In some examples, the processing device controls the first microphone and the second microphone by varying a clock frequency of a clock supplied to the first microphone and the second microphone.
In many of these embodiments, the system includes a first microphone with a first voice activity detection (VAD) module and a second microphone with a second voice activity detection (VAD) module, and a processing device. The processing device is communicatively coupled to the first microphone and the second microphone, and configured to receive a first signal from the first microphone and a second signal from the second microphone. The first signal indicates whether a voice signal has been determined at the first microphone by the first VAD module, and the second signal indicates whether a voice signal has been determined at the second microphone by the second VAD module. The processing device is further configured to, when the first signal indicates potential voice activity or the second signal indicates potential voice activity, activate and receive data from the first microphone or the second microphone, and subsequently examine the data for a trigger word. When the trigger word is found, a signal is sent to an application processor to further process information from one or more of the first microphone and the second microphone. The processing device is further configured to, when no trigger word is found, transmit a third signal to the first microphone and the second microphone. The third signal causes the first microphone and second microphone to enter or maintain an event detection mode of operation.
In one aspect, either the first VAD module or the second VAD module performs Trigger Phrase recognition. In another aspect, either the first VAD module or the second VAD module performs Command Recognition. In other examples, the processing device controls the first microphone and the second microphone by varying a clock frequency of a clock supplied to the first microphone and the second microphone.
In many of these embodiments, voice activity is detected in a micro-electro-mechanical system (MEMS) microphone. Sound energy is received from a source and the sound energy is filtered into a plurality of filter bands. A power estimate is obtained for each of the plurality of filter bands. Based upon each power estimate, a determination is made as to whether voice activity is detected.
In some aspects, the filtering utilizes one or more low pass filters, high pass filters and frequency dividers. In other examples, the power estimate comprises an upper power estimate and a lower power estimate. In some examples, ratios between the upper power estimate and the lower power estimate within the plurality of filter bands are determined, and selected ones of the ratios are compared to a predetermined threshold. In other examples, ratios between the upper power estimate and the lower power estimate between the plurality of filter bands are determined, and selected ones of the ratios are compared to a predetermined threshold.
Referring now to FIG. 1, a system 100 that utilizes Voice Activity Detection (VAD) approaches is described. The system 100 includes a first microphone element 102, a second microphone element 104, a right event microphone 106, a left event microphone 108, a digital signal processor (DSP)/codec 110, and an application processor 112. Although two microphones are shown in the system 100, it will be understood that any number of microphones may be used. Not all of them require a VAD module, but at least one does.
The first microphone element 102 and the second microphone element 104 are microelectromechanical system (MEMS) elements that receive sound energy and convert the sound energy into electrical signals that represent the sound energy. In one example, the elements 102 and 104 include a MEMS die, a diaphragm, and a back plate. Other components may also be used.
The right event microphone 106 and the left event microphone 108 receive signals from the microphone elements 102 and 104, and process these signals. For example, the elements 106 and 108 may include buffers, preamplifiers, analog-to-digital (A-to-D) converters, and other processing elements that convert the analog signal received from elements 102 and 104 into digital signals and perform other processing functions. These elements may, for example, include an ASIC that implements these functions. The right event microphone 106 and the left event microphone 108 also include voice activity detection (VAD) modules 103 and 105 respectively and these may be implemented by an ASIC that executes programmed computer instructions. The VAD modules 103 and 105 utilize the approaches described herein to determine whether voice (or some other event) has been detected. This information is transmitted to the digital signal processor (DSP)/codec 110 and the application processor 112 for further processing. Also, the signals (potentially voice information) now in the form of digital information are sent to the digital signal processor (DSP)/codec 110 and the application processor 112.
The digital signal processor (DSP)/codec 110 receives signals from the elements 106 and 108 (including whether the VAD modules have detected voice) and looks for trigger words (e.g., “Hello, My Mobile”) using a voice recognition (VR) trigger engine 120. The codec 110 also performs interrupt processing (see FIG. 2) using interrupt handling module 122. If the trigger word is found, a signal is sent to the application processor 112 to further process received information. For instance, the application processor 112 may utilize a VR recognition module 126 (e.g., implemented as hardware and/or software) to determine whether other or further commands can be recognized in the information.
In one example of the operation of the system of FIG. 1, the right event microphone 106 and/or the left event microphone 108 will wake up the digital signal processor (DSP)/codec 110 and the application processor 112 by starting to transmit pulse density modulation (PDM) data. General input/output (I/O) pins 113 of the digital signal processor (DSP)/codec 110 and the application processor 112 are assumed to be configurable for interrupts (or simply polling) as described below with respect to FIG. 2. The modules 103 and 105 may perform different recognition functions; one VAD module may perform Trigger Keyword recognition and a second VAD module may perform Command Recognition. In one aspect, the digital signal processor (DSP)/codec 110 and the application processor 112 control the right event microphone 106 and the left event microphone 108 by varying the clock frequency of the clock 124.
Referring now to FIG. 2, one example of the bidirectional interrupt system that can be deployed in the approaches described herein is described. At step 202, the microphone 106 or 108 interrupts/wakes up the digital signal processor (DSP)/codec 110 when an event is detected. The event may be voice (e.g., it could be the start of the voice trigger word). At step 204, the digital signal processor (DSP)/codec 110 puts the microphone back in Event Detection mode if no trigger word is present. The digital signal processor (DSP)/codec 110 determines when to change the microphone back to Event Detection mode. The internal VAD of the DSP/codec 110 and/or the internal voice trigger recognition system of the DSP/codec 110 could be used to make this decision. For example, if the trigger word recognition has not recognized any trigger word after approximately 2 or 3 seconds, the DSP/codec 110 should configure its input/output pin to be an interrupt pin again, set the microphone back into detecting mode (step 204 in FIG. 2), and then go into sleep mode/power down.
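The interrupt sequence of FIG. 2 can be pictured as a small state machine. The following is a minimal sketch only, under stated assumptions: the mode names, the function `next_mode`, and the timeout value of 2.5 seconds are hypothetical and chosen for illustration (the description gives only "approximately 2 or 3 seconds").

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical mode names, chosen for illustration only.
    EVENT_DETECTION = "event_detection"  # microphone listens, DSP/codec asleep
    TRIGGER_SEARCH = "trigger_search"    # DSP/codec awake, looking for the trigger word
    APP_PROCESSING = "app_processing"    # trigger word found, application processor signaled


# Assumed value; the description says "approximately 2 or 3 seconds".
TRIGGER_TIMEOUT_S = 2.5


def next_mode(mode, mic_interrupt, trigger_found, seconds_in_mode):
    """Return the next mode of the DSP/codec for one evaluation step."""
    if mode is Mode.EVENT_DETECTION:
        # Step 202: the microphone wakes up the DSP/codec on a detected event.
        return Mode.TRIGGER_SEARCH if mic_interrupt else mode
    if mode is Mode.TRIGGER_SEARCH:
        if trigger_found:
            return Mode.APP_PROCESSING
        if seconds_in_mode >= TRIGGER_TIMEOUT_S:
            # Step 204: no trigger word, put the microphone back in detection mode.
            return Mode.EVENT_DETECTION
        return mode
    return mode
```

In this sketch the return to `EVENT_DETECTION` stands in for reconfiguring the input/output pin as an interrupt pin and powering down; those hardware actions are not modeled.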
In another approach, the microphone may also track the time of contiguous voice activity. If activity does not persist beyond a certain countdown (e.g., 5 seconds) and the microphone also stays in the low power VAD mode of operation (i.e., it is not put into a standard or high performance mode within that time frame), the implication is that the voice trigger was not detected within that period of detected voice activity. In that case there is no further activity, and the microphone may initiate a change from detect and transmit mode to detection mode. A DSP/codec, on detecting no transmission from the microphone, may also go to low power sleep mode.
Referring now to FIG. 3, the VAD approaches described herein can include three functional blocks: an analyze filter bank 302, power tracker block or module 304, and a decision block or module 306. The analyze filter bank 302 filters the input signal into five spectral bands.
The power tracker block 304 includes an upper tracker and a lower tracker. For each of these and for each band it obtains a power estimate. The decision block 306 looks at the power estimates and determines if voice or an acoustic event is present.
Optionally, the threshold values can be set by a number of different approaches such as one time parts (OTPs), or various types of wired or wireless interfaces 310. Optionally, feedback 308 from the decision block 306 can control the power trackers. This feedback could be the VAD decision. For example, the trackers (described below) could be configured to use another set of attack/release constants if voice is present. The functions described herein can be deployed in any number of functional blocks and it will be understood that the three blocks described are examples only.
Referring now to FIGS. 4, 5, and 6, one example of an analyze filter bank is described. The processing is very similar to a subband coding system, which may be implemented by the wavelet transform, by Quadrature Mirror Filters (QMF), or by other similar approaches. In FIG. 4, the decimation stage on the high pass filters (D) is omitted compared to the more traditional subband coding/wavelet transform method. The reason for the omission is that, later in the signal processing, an estimate of the root mean square (RMS) of the energy or power value is obtained, and it is not desired to have an overlap in frequency between the low pass filtering (used to derive the “Mean” of the RMS) and the pass band of the analyze filter bank. This approach relaxes the filter requirement on the “Mean” low pass filter. However, the decimation stage could be introduced, as this would reduce computational requirements.
Referring now to FIG. 4, the filter bank includes high pass filters 402 (D), low pass filters 404 (H), and sample frequency dividers 406 (Fs is the sample frequency of the particular channel). This apparatus operates similarly to a sub-band coding approach and, like the wavelet transform, has a consistent relative bandwidth. The incoming signal is separated into five bands. Other numbers of bands can also be used. In this example, channel 5 has a pass band from 4000 Hz to 8000 Hz; channel 4 has a pass band from 2000 Hz to 4000 Hz; channel 3 has a pass band from 1000 Hz to 2000 Hz; channel 2 has a pass band from 500 Hz to 1000 Hz; and channel 1 has a pass band from 0 Hz to 500 Hz.
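The band edges listed above follow from repeatedly halving the upper band edge. The sketch below reproduces that arithmetic only. It assumes an input sample rate of 16 kHz (implied by the 8000 Hz upper edge but not stated in the description), and the function name `octave_band_edges` is hypothetical.

```python
def octave_band_edges(sample_rate_hz=16000.0, num_bands=5):
    """Return {channel: (low_hz, high_hz)} for an octave-spaced band split.

    Assumption: the top channel spans the upper half of the Nyquist range and
    each lower channel spans half the bandwidth of the one above it.
    """
    edges = {}
    high = sample_rate_hz / 2.0
    # Channels num_bands .. 2 are band pass (high pass at the top) channels.
    for channel in range(num_bands, 1, -1):
        low = high / 2.0
        edges[channel] = (low, high)
        high = low
    # Channel 1 is the remaining low pass (DC) channel.
    edges[1] = (0.0, high)
    return edges
```

Calling `octave_band_edges()` with the assumed defaults gives the five pass bands of this example, from 0 Hz to 500 Hz for channel 1 up to 4000 Hz to 8000 Hz for channel 5.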
Referring now to FIG. 5, the high pass filter and the low pass filter are constructed from two all pass filters 502 (G1) and 504 (G2). These filters could be first or second order all pass infinite impulse response (IIR) structures. The input signal X(z) passes through delay block 501. By changing the signs of adders 508 and 510, a low pass filtered sample 512 and a high pass filtered sample 514 are generated. Combining this structure with the decimation structure gives several benefits. For example, the order of the H and D filters is doubled (e.g., two times), and the number of gates and the power are reduced in the system.
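One way to picture the structure of FIG. 5 is the well-known two-path all pass arrangement, in which the sum of the two branches gives a low pass output and the difference gives a high pass output. The following is a sketch under stated assumptions and is not taken from the description: the first order all pass sections in z^-2, the coefficient values 0.1 and 0.5, and the names `AllPassZ2` and `split_band` are all illustrative.

```python
class AllPassZ2:
    """First order all pass section in z^-2: A(z) = (a + z^-2) / (1 + a*z^-2)."""

    def __init__(self, a):
        self.a = a
        self.x_hist = [0.0, 0.0]  # x[n-1], x[n-2]
        self.y_hist = [0.0, 0.0]  # y[n-1], y[n-2]

    def step(self, x):
        y = self.a * x + self.x_hist[1] - self.a * self.y_hist[1]
        self.x_hist = [x, self.x_hist[0]]
        self.y_hist = [y, self.y_hist[0]]
        return y


def split_band(samples, a_delayed=0.5, a_direct=0.1):
    """Split samples into (low, high) using two all pass branches.

    The coefficients are assumed values for illustration only.
    """
    g1 = AllPassZ2(a_delayed)  # branch fed through the one-sample delay
    g2 = AllPassZ2(a_direct)   # branch fed directly
    prev = 0.0
    low, high = [], []
    for x in samples:
        delayed = g1.step(prev)  # delay block followed by G1
        direct = g2.step(x)      # G2
        prev = x
        low.append(0.5 * (direct + delayed))   # adder with both signs positive
        high.append(0.5 * (direct - delayed))  # adder with one sign flipped
    return low, high
```

With this arrangement a constant (DC) input settles in the low output only, and an input alternating at half the sample rate settles in the high output only, which is the complementary behavior the paragraph describes.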
Referring now to FIG. 6, response curves for the high pass and low pass elements are shown. A first curve 602 shows the low pass filter response while a second curve 604 shows the high pass filter response.
Referring now to FIGS. 7 and 8, one example of the power tracker block or module 700 is described. The tracker 700 includes an absolute value block 702, a SINC decimation block 704, and upper and lower tracker block 706. The block 702 obtains the absolute value of the signal (this could also be the square value). The SINC block 704 is a first order SINC with a decimation factor of N; it simply accumulates N absolute signal values and then dumps this data after a predetermined time (N sample periods). Optionally, any kind of decimation filter could be used. A short time RMS estimate is found by rectifying and then averaging/decimating by the SINC block 704 (i.e., accumulate and dump; if squaring was used in block 702, then a square root operator could be introduced here as well). The above functions are performed for each channel, i=1 to 5. The decimation factors, N, are chosen so that the sample rate of each short time RMS estimate is 125 Hz or 250 Hz, except for the DC channel (channel 1), where the sample rate is 62.5 Hz or 125 Hz. The short time RMS values (Ch_rms,i) for each channel, i=1 to 5, are then fed into two trackers of the tracker block 706. A lower tracker and an upper tracker, i.e., one tracker pair for each channel, are included in the tracker block 706. The operation of the tracker block 706 can be described as:
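The rectify, accumulate, and dump operation of blocks 702 and 704 can be sketched as follows. This is an illustration under stated assumptions: the function name `short_time_level` is hypothetical, and dividing the accumulated sum by N is an added normalization that the description does not require.

```python
def short_time_level(samples, n):
    """Rectify, then accumulate and dump over blocks of n samples.

    Returns one value per complete block of n input samples.
    """
    levels = []
    acc = 0.0
    count = 0
    for x in samples:
        acc += abs(x)  # block 702: absolute value (squaring is the alternative)
        count += 1
        if count == n:
            levels.append(acc / n)  # assumption: normalize the dumped sum by N
            acc = 0.0
            count = 0
    return levels
```

As a worked example of choosing N, a channel sampled at an assumed 16 kHz with N = 64 produces short time estimates at 16000 / 64 = 250 Hz.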
$$
\mathrm{upper}_i(n) =
\begin{cases}
\mathrm{upper}_i(n-1)\,(1 - K_{au,i}) + K_{au,i}\,Ch_{rms,i}(n), & \text{if } Ch_{rms,i}(n) > \mathrm{upper}_i(n-1) \\
\mathrm{upper}_i(n-1)\,(1 - K_{ru,i}) + K_{ru,i}\,Ch_{rms,i}(n), & \text{otherwise}
\end{cases}
$$

$$
\mathrm{lower}_i(n) =
\begin{cases}
\mathrm{lower}_i(n-1)\,(1 - K_{al,i}) + K_{al,i}\,Ch_{rms,i}(n), & \text{if } Ch_{rms,i}(n) < \mathrm{lower}_i(n-1) \\
\mathrm{lower}_i(n-1)\,(1 - K_{rl,i}) + K_{rl,i}\,Ch_{rms,i}(n), & \text{otherwise}
\end{cases}
$$
The sample index number is n. K_au,i and K_ru,i are the attack and release constants for the upper tracker for channel number i. K_al,i and K_rl,i are the attack and release constants for the lower tracker for channel number i. The output of this block is fed to the decision block described below with respect to FIG. 9.
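The tracker update for a single channel can be sketched directly from the expressions above. The function name `track`, the zero initial values, and any constants passed in are assumptions for illustration; the description does not give numeric attack or release constants.

```python
def track(levels, k_au, k_ru, k_al, k_rl, upper0=0.0, lower0=0.0):
    """Run the upper and lower trackers over short time RMS values of one channel.

    k_au, k_ru: attack and release constants of the upper tracker.
    k_al, k_rl: attack and release constants of the lower tracker.
    Returns a list of (upper, lower) pairs, one per input value.
    """
    upper, lower = upper0, lower0
    out = []
    for x in levels:
        # Upper tracker: attack when the input exceeds the previous estimate.
        k_up = k_au if x > upper else k_ru
        upper = upper * (1.0 - k_up) + k_up * x
        # Lower tracker: attack when the input falls below the previous estimate.
        k_lo = k_al if x < lower else k_rl
        lower = lower * (1.0 - k_lo) + k_lo * x
        out.append((upper, lower))
    return out
```

For example, `track([1.0], 0.5, 0.1, 0.5, 0.01)` moves the upper estimate halfway to the input in one step while the lower estimate moves by only one percent, so a sudden rise in level opens a gap between the two estimates.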
Referring now to FIG. 8, operation of the tracker block is described. A first curve 802 shows the upper tracker that follows fast changes in power or RMS. A second curve 804 shows the lower tracker following slower changes in the power or RMS. A third curve 806 represents the input signal to the tracker block.
Referring now to FIG. 9, one example of a decision block 900 is described. Block 902 is redrawn in FIG. 9 for the convenience of the reader (blocks 706 and 902 are the same tracker block). The decision block uses the output from the trackers and includes a division block 904 to determine the ratio between the upper and lower tracker for each channel, summation block 908, comparison block 910, and sign block 912.
The internal operation of the division block 904 is structured and configured so that an actual division need not be made. The lower tracker value lower_i(n) is multiplied by Th_i(n) (a predetermined threshold which could be constant and independent of n or changed according to a rule). This is subtracted from the upper_i(n) tracker value. The sign(x) function is then performed.
Upper and lower tracker signals are estimated by upper and lower tracker block 902 (this block is identical to block 706). The ratio between the upper tracker and the lower tracker is then calculated by division block 904. This ratio is compared with a threshold Th_i(n). The flag R_flag_i(n) is set if the ratio is larger than the threshold Th_i(n), i.e., if sign(x) in 904 is positive. This operation is performed for each channel i=1 to 5. Th_i(n) could be constant over time for each channel or follow a rule where it actually changes for each sample instance n.
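The division-free comparison can be sketched as below. For a positive lower tracker value, testing upper/lower > Th is equivalent to testing the sign of upper - Th * lower. The function names `ratio_flag` and `ratio_flags` are hypothetical.

```python
def ratio_flag(upper, lower, threshold):
    """Return 1 if upper/lower exceeds threshold, without dividing.

    Evaluated as sign(upper - threshold * lower), as in division block 904.
    """
    return 1 if upper - threshold * lower > 0.0 else 0


def ratio_flags(uppers, lowers, thresholds):
    """Per-channel flags for lists of upper values, lower values, and thresholds."""
    return [ratio_flag(u, l, t) for u, l, t in zip(uppers, lowers, thresholds)]
```

Replacing the division with one multiplication and one subtraction is what allows the block to be built without a divider.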
In addition to the ratio calculation for each channel i=1 to 5 (or 6 or 7 if more channels are available from the filter bank), the ratios between channels can also be used/calculated. The ratio between channels is defined for the i'th channel as Ratio_i,ch(n) = upper_i(n) / lower_ch(n), where i and ch each range from 1 to the number of channels, which in this case is 5. This means that Ratio_i,i(n) is identical to the ratio calculated above. A total number of 25 ratios can be calculated (if 5 filter bands exist). Again, each of these ratios is compared with a threshold Th_i,ch(n). A total number of 25 thresholds exist if 5 channels are available. Again, the threshold can be constant over time n, or change for each sample instance n. In one implementation, not all of the ratios between bands will be used, only a subset.
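The between-channel comparison can be sketched as a table of flags, one for each pairing of an upper tracker with a lower tracker. This sketch assumes the ratio for the pair (i, ch) is the upper value of channel i over the lower value of channel ch, and it uses the same division-free test as for the single-channel case. The function name `cross_ratio_flags` is hypothetical.

```python
def cross_ratio_flags(uppers, lowers, thresholds):
    """Return flags[i][ch] = 1 if uppers[i] / lowers[ch] exceeds thresholds[i][ch].

    uppers, lowers: per-channel tracker values (same length).
    thresholds: square table of thresholds indexed as thresholds[i][ch].
    """
    n = len(uppers)
    return [
        [1 if uppers[i] - thresholds[i][ch] * lowers[ch] > 0.0 else 0
         for ch in range(n)]
        for i in range(n)
    ]
```

With five channels the table holds 25 flags, and its diagonal entries correspond to the per-channel ratios. An implementation that uses only a subset of the ratios would evaluate only the selected entries.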
The sample rate for all the flags is identical to the sample rate of the fastest of all the trackers. The outputs of the slower trackers are repeated. A voice power flag V_flag(n) is also estimated from the sum of the three channels from 500 Hz to 4000 Hz by summation block 908. This flag is set if the power level is low enough (i.e., smaller than Vth(n)), and this is determined by comparison block 910 and sign block 912. This flag is only in effect when the microphone is in a quiet environment and/or the persons speaking are far away from the microphone.
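The voice power flag can be sketched as a sum over channels 2 to 4 followed by a threshold comparison. Which per-channel quantity is summed is not specified in the description, so the sketch simply takes one power level per channel as input. The function name `voice_power_flag` is hypothetical.

```python
def voice_power_flag(channel_levels, v_th):
    """Return 1 if the summed level of channels 2 to 4 is below v_th, else 0.

    channel_levels: mapping from channel number (1..5) to a power level.
    Channels 2, 3, and 4 together span 500 Hz to 4000 Hz in this example.
    """
    total = channel_levels[2] + channel_levels[3] + channel_levels[4]
    return 1 if total < v_th else 0
```

A set flag indicates that the level in the voice band is low, which in the decision logic suppresses a voice decision.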
The R_flag_i(n) and V_flag(n) are used to decide if the current time step “n” is voice, and the result is stored in E_flag(n). The operation that determines if E_flag(n) is voice (1) or not voice (0) can be described by the following:
 E_flag(n) = 0;
 If sum_from_1_to_5( R_flag_i(n) ) > V_no   (i.e., E_flag is set if at least V_no channels declared voice)
  E_flag(n) = 1
 If R_flag_1(n) == 0 and R_flag_5(n) == 0
  E_flag(n) = 0
 If V_flag(n) == 1
  E_flag(n) = 0
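The decision rules above can be sketched as a small function. The sketch follows the listing literally, including the strict "greater than V_no" comparison. The function name `event_flag` and the list layout of the flags are assumptions for illustration.

```python
def event_flag(r_flags, v_flag, v_no):
    """Combine the per-channel ratio flags and the voice power flag.

    r_flags: list of five 0/1 values, index 0 for channel 1 up to index 4 for channel 5.
    v_flag: 1 if the voice band power is below its threshold.
    v_no: number of channels that must be exceeded to declare voice.
    """
    e_flag = 0
    if sum(r_flags) > v_no:
        e_flag = 1
    # Both the lowest and the highest channel being inactive clears the decision.
    if r_flags[0] == 0 and r_flags[4] == 0:
        e_flag = 0
    # Low power in the voice band clears the decision.
    if v_flag == 1:
        e_flag = 0
    return e_flag
```

For instance, `event_flag([1, 1, 1, 1, 0], 0, 3)` yields a voice decision, while the same flags with `v_flag` equal to 1 do not.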
The final VAD_flag(n) is a smoothed version of the E_flag(n). It simply makes a VAD positive decision true for a minimum time/period of VAD_NUMBER of sample periods. This smoothing can be described by the following approach. This approach can be used to determine if a voice event is detected, but that the voice is present in the background and therefore of no interest. In this respect, a false positive reading is avoided.
VAD_flag(n)=0
If E_flag(n) == 1
 hang_on_count=VAD_NUMBER;
else
 if hang_on_count ~= 0
  decrement( hang_on_count)
  VAD_flag(n)=1
 end
end
Hang-on-count represents a time of approximately VAD_NUMBER/Sample Rate. Here, Sample Rate is the sample rate of the fastest channel, i.e., 250, 125 or 62.5 Hz. It will be appreciated that these approaches examine whether 4 flags have been set. However, it will be appreciated that any number of threshold values (flags) can be examined.
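The hang-over smoothing can be sketched as follows. As listed above, VAD_flag(n) stays 0 on the very sample where E_flag(n) is 1 and is asserted only while the counter runs down. The sketch exposes this as a parameter: `assert_on_event=True` is an added assumption that also asserts the flag on the triggering sample, and `assert_on_event=False` follows the listing literally. The function name `smooth_vad` is hypothetical.

```python
def smooth_vad(e_flags, vad_number, assert_on_event=True):
    """Hold a positive decision for vad_number sample periods after the last E_flag.

    e_flags: sequence of 0/1 values of E_flag(n).
    assert_on_event: assumption, not in the listing; if True the output is
    also 1 on samples where E_flag(n) is 1.
    """
    hang_on_count = 0
    vad_flags = []
    for e_flag in e_flags:
        vad = 0
        if e_flag == 1:
            hang_on_count = vad_number  # restart the hold period
            if assert_on_event:
                vad = 1
        elif hang_on_count != 0:
            hang_on_count -= 1
            vad = 1
        vad_flags.append(vad)
    return vad_flags
```

As a worked example of the hold time, an assumed VAD_NUMBER of 50 at a flag rate of 250 Hz corresponds to 50 / 250 = 0.2 seconds.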
It will also be appreciated that other rules could be formulated, such as requiring that at least two pairs of adjacent channels (or R_flags) are true, or perhaps three such pairs, or only one pair. These rules are predicated on the fact that human voice tends to be correlated in adjacent frequency channels, due to the acoustic production capabilities/limitations of the human vocal system.
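An adjacent-channel rule of this kind can be sketched by counting neighboring channels whose flags are both set. The function names and the default of two required pairs are illustrative assumptions.

```python
def adjacent_pair_count(r_flags):
    """Count pairs of neighboring channels whose ratio flags are both 1."""
    return sum(1 for a, b in zip(r_flags, r_flags[1:]) if a == 1 and b == 1)


def voice_by_adjacency(r_flags, min_pairs=2):
    """Return 1 if at least min_pairs adjacent channel pairs are both active."""
    return 1 if adjacent_pair_count(r_flags) >= min_pairs else 0
```

With flags `[1, 1, 1, 0, 1]` two adjacent pairs are active (channels 1 and 2, channels 2 and 3), so the two-pair rule is met, whereas `[1, 0, 1, 0, 1]` has three active channels but no adjacent pair.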
Preferred embodiments are described herein, including the best mode. It should be understood that the illustrated embodiments are exemplary only, and should not be taken as limiting the scope of the appended claims.

Claims (20)

What is claimed is:
1. A method in a microphone assembly including an acoustic sensor and a voice activity detector on an integrated circuit coupled to an external-device interface of the microphone assembly, the method comprising:
receiving acoustic energy at the acoustic sensor;
filtering data representative of the acoustic energy into a plurality of bands;
obtaining a power estimate for at least one of the plurality of bands,
the power estimate including a first estimate based on relatively fast changes in a power metric of the data representative of the acoustic energy and a second estimate based on relatively slow changes in a power metric of the data representative of the acoustic energy;
determining whether voice activity is present in the acoustic energy based on the power estimate for the at least one band.
2. The method of claim 1, further comprising,
determining a ratio of the first estimate and the second estimate of a corresponding band; and
determining whether voice activity is present in the acoustic energy based on a comparison of the ratio to a predetermined threshold.
3. The method of claim 1,
obtaining a power estimate for each of the plurality of bands, each power estimate including a first estimate based on relatively fast changes in a power metric of the data representative of the acoustic energy and a second estimate based on relatively slow changes in a power metric of the data representative of the acoustic energy;
determining multiple ratios based on the first estimate and the second estimate of the plurality of bands;
determining whether voice activity is present in the acoustic energy based on a comparison of the multiple ratios to predetermined thresholds.
4. The method of claim 3, further comprising summing results of the comparisons and determining whether voice activity is present in the acoustic energy based on the summation of results.
5. The method of claim 3, determining the multiple ratios includes determining at least one ratio using the first estimate and the second estimate obtained for the same band.
6. The method of claim 3, determining the multiple ratios includes determining at least one ratio using the first estimate obtained for one band and the second estimate obtained for another band.
7. The method of claim 1, providing an interrupt signal at the external-device interface upon determining that voice activity is present in the acoustic energy.
8. A microphone assembly having an external-device interface, the microphone assembly comprising:
an acoustic sensor having an acoustic input and an electrical output;
a filter bank having an input coupled to the electrical output of the transducer, the filter bank configured to filter data representative of energy detected by the acoustic sensor into a plurality of frequency bands;
a power tracker having an input coupled to an output of the filter bank, the power tracker configured to obtain a power estimate for at least one of the plurality of frequency bands, the power estimate including a first estimate based on relatively fast changes in a power metric of the data representative of the acoustic energy and a second estimate based on relatively slow changes in a power metric of the data representative of the acoustic energy;
a comparison entity coupled to the output of the power tracker, the comparison entity configured to determine whether voice activity is present in the data representative of acoustic energy based upon the power estimate; and
a signal generator configured to generate a wake up signal upon determining that voice activity is present in the data representative of acoustic energy.
9. The microphone assembly of claim 8,
the power tracker configured to determine a ratio of the first estimate and the second estimate of a corresponding frequency band, and
the comparison entity configured to determine whether voice activity is present in the acoustic energy based on a comparison of the ratio to a predetermined threshold.
10. The microphone assembly of claim 8,
the power tracker configured to obtain a power estimate for each of the plurality of frequency bands, each power estimate including a first estimate based on relatively fast changes in a power metric of the data representative of the acoustic energy and a second estimate based on relatively slow changes in a power metric of the data representative of the acoustic energy,
the power tracker configured to determine multiple ratios based on the first estimate and the second estimate of the plurality of frequency bands,
the comparison entity configured to determine whether voice activity is present in the acoustic energy based on a comparison of the multiple ratios to predetermined thresholds.
11. The microphone assembly of claim 10, the comparison entity configured to sum results of the comparisons and to determine whether voice activity is present in the acoustic energy based on the summation of results.
12. The microphone assembly of claim 10, at least one of the multiple ratios includes a ratio of the first estimate and the second estimate for the same frequency band.
13. The microphone assembly of claim 10, at least one of the multiple ratios includes a ratio of the first estimate obtained for one frequency band and the second estimate obtained for another frequency band.
14. The microphone assembly of claim 8, a signal generator configured to provide the wake up signal at the external-device interface upon determining that voice activity is present in the acoustic energy.
15. The microphone assembly of claim 8, wherein the filter bank, the power tracker, the comparison entity, and the signal generator are implemented on an integrated circuit of the microphone assembly.
16. A microphone assembly having an external-device interface, the microphone assembly comprising:
an acoustic sensor having an acoustic input and an electrical output;
an analog to digital (A/D) converter coupled to the acoustic sensor, the A/D converter configured to generate a data representative of an electrical signal generated by the acoustic sensor;
a processor coupled to the A/D converter, the processor configured to:
filter the data representative of the electrical signal into a plurality of bands;
obtain a power estimate for at least one of the plurality of bands, the power estimate including a first estimate based on relatively fast changes in a power metric of the data representative of the acoustic energy and a second estimate based on relatively slow changes in a power metric of the data representative of the acoustic energy;
determine whether voice activity is present in the data representative of the electrical signal based upon the power estimate; and
generate a wake up signal upon determining that voice activity is present in the data representative of the electrical signal.
17. The microphone assembly of claim 16, the processor further configured to determine a ratio of the first estimate and the second estimate and to determine whether voice activity is present in the data representative of the electrical signal based on a comparison of the ratio to a predetermined threshold.
18. The microphone assembly of claim 16,
the processor configured to obtain a power estimate for each of the plurality of bands, each power estimate including a first estimate based on relatively fast changes in a power metric of the data representative of the acoustic energy and a second estimate based on relatively slow changes in a power metric of the data representative of the acoustic energy,
the processor configured to determine multiple ratios based on the first estimate and the second estimate of the plurality of bands, and
the processor configured to determine whether voice activity is present in the data representative of the electrical signal based on a comparison of the multiple ratios to predetermined thresholds.
19. The microphone assembly of claim 18, the processor configured to sum results of the comparisons and to determine whether voice activity is present in the data representative of the electrical signal based on the summation of results.
20. The microphone assembly of claim 16, the processor configured to provide the wake up signal at the external-device interface upon determining that voice activity is present in the data representative of the electrical signal.
US14/861,113 2013-10-29 2015-09-22 VAD detection apparatus and method of operation the same Active 2034-11-07 US9830913B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/861,113 US9830913B2 (en) 2013-10-29 2015-09-22 VAD detection apparatus and method of operation the same

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201361896723P 2013-10-29 2013-10-29
US14/525,413 US9147397B2 (en) 2013-10-29 2014-10-28 VAD detection apparatus and method of operating the same
US14/861,113 US9830913B2 (en) 2013-10-29 2015-09-22 VAD detection apparatus and method of operation the same

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/525,413 Continuation US9147397B2 (en) 2013-10-29 2014-10-28 VAD detection apparatus and method of operating the same

Publications (2)

Publication Number Publication Date
US20160064001A1 US20160064001A1 (en) 2016-03-03
US9830913B2 true US9830913B2 (en) 2017-11-28

Family

ID=52996382

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/525,413 Expired - Fee Related US9147397B2 (en) 2013-10-29 2014-10-28 VAD detection apparatus and method of operating the same
US14/861,113 Active 2034-11-07 US9830913B2 (en) 2013-10-29 2015-09-22 VAD detection apparatus and method of operation the same


Country Status (4)

Country Link
US (2) US9147397B2 (en)
CN (1) CN105830463A (en)
DE (1) DE112014004951T5 (en)
WO (1) WO2015066152A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180130485A1 (en) * 2016-11-08 2018-05-10 Samsung Electronics Co., Ltd. Auto voice trigger method and audio analyzer employing the same
US10360926B2 (en) 2014-07-10 2019-07-23 Analog Devices Global Unlimited Company Low-complexity voice activity detection
US20220115007A1 (en) * 2020-10-08 2022-04-14 Qualcomm Incorporated User voice activity detection using dynamic classifier
US11335361B2 (en) * 2020-04-24 2022-05-17 Universal Electronics Inc. Method and apparatus for providing noise suppression to an intelligent personal assistant
US20220222444A1 (en) * 2019-02-13 2022-07-14 Oracle International Corporation Chatbot conducting a virtual social dialogue
US11720749B2 (en) 2018-10-16 2023-08-08 Oracle International Corporation Constructing conclusive answers for autonomous agents

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10020008B2 (en) 2013-05-23 2018-07-10 Knowles Electronics, Llc Microphone and corresponding digital interface
US9711166B2 (en) 2013-05-23 2017-07-18 Knowles Electronics, Llc Decimation synchronization in a microphone
CN105379308B (en) 2013-05-23 2019-06-25 美商楼氏电子有限公司 Microphone, microphone system and the method for operating microphone
US20150356982A1 (en) * 2013-09-25 2015-12-10 Robert Bosch Gmbh Speech detection circuit and method
US9502028B2 (en) 2013-10-18 2016-11-22 Knowles Electronics, Llc Acoustic activity detection apparatus and method
US9147397B2 (en) 2013-10-29 2015-09-29 Knowles Electronics, Llc VAD detection apparatus and method of operating the same
WO2016118480A1 (en) 2015-01-21 2016-07-28 Knowles Electronics, Llc Low power voice trigger for acoustic apparatus and method
US10121472B2 (en) 2015-02-13 2018-11-06 Knowles Electronics, Llc Audio buffer catch-up apparatus and method with two microphones
US9478234B1 (en) 2015-07-13 2016-10-25 Knowles Electronics, Llc Microphone apparatus and method with catch-up buffer
EP3185244B1 (en) 2015-12-22 2019-02-20 Nxp B.V. Voice activation system
CN105609118B (en) * 2015-12-30 2020-02-07 生迪智慧科技有限公司 Voice detection method and device
US10090005B2 (en) * 2016-03-10 2018-10-02 Aspinity, Inc. Analog voice activity detection
US10079027B2 (en) * 2016-06-03 2018-09-18 Nxp B.V. Sound signal detector
US10319375B2 (en) * 2016-12-28 2019-06-11 Amazon Technologies, Inc. Audio message extraction
US10311870B2 (en) * 2017-05-10 2019-06-04 Ecobee Inc. Computerized device with voice command input capability
KR102371313B1 (en) * 2017-05-29 2022-03-08 삼성전자주식회사 Electronic apparatus for recognizing keyword included in your utterance to change to operating state and controlling method thereof
EP3425923B1 (en) * 2017-07-06 2024-05-08 GN Audio A/S Headset with reduction of ambient noise
KR102411766B1 (en) * 2017-08-25 2022-06-22 삼성전자주식회사 Method for activating voice recognition servive and electronic device for the same
US11341987B2 (en) * 2018-04-19 2022-05-24 Semiconductor Components Industries, Llc Computationally efficient speech classifier and related methods
CN109308900B (en) * 2018-10-29 2022-04-05 恒玄科技(上海)股份有限公司 Earphone device, voice processing system and voice processing method
US11637546B2 (en) * 2018-12-14 2023-04-25 Synaptics Incorporated Pulse density modulation systems and methods
WO2020131681A1 (en) * 2018-12-18 2020-06-25 Knowles Electronics, Llc Audio level estimator assisted false wake abatement systems and methods
CN110600060B (en) * 2019-09-27 2021-10-22 云知声智能科技股份有限公司 Hardware audio active detection HVAD system

Citations (183)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4052568A (en) 1976-04-23 1977-10-04 Communications Satellite Corporation Digital voice switch
US5577164A (en) 1994-01-28 1996-11-19 Canon Kabushiki Kaisha Incorrect voice command recognition prevention and recovery processing method and apparatus
US5598447A (en) 1992-05-11 1997-01-28 Yamaha Corporation Integrated circuit device having internal fast clock source
US5675808A (en) 1994-11-02 1997-10-07 Advanced Micro Devices, Inc. Power control of circuit modules within an integrated circuit
US5822598A (en) 1996-07-12 1998-10-13 Ast Research, Inc. Audio activity detection circuit to increase battery life in portable computers
US5983186A (en) 1995-08-21 1999-11-09 Seiko Epson Corporation Voice-activated interactive speech recognition device and method
US6049565A (en) 1994-12-16 2000-04-11 International Business Machines Corporation Method and apparatus for audio communication
US6057791A (en) 1998-02-18 2000-05-02 Oasis Design, Inc. Apparatus and method for clocking digital and analog circuits on a common substrate to enhance digital operation and reduce analog sampling error
US6070140A (en) 1995-06-05 2000-05-30 Tran; Bao Q. Speech recognizer
US6138040A (en) * 1998-07-31 2000-10-24 Motorola, Inc. Method for suppressing speaker activation in a portable communication device operated in a speakerphone mode
US6154721A (en) 1997-03-25 2000-11-28 U.S. Philips Corporation Method and device for detecting voice activity
US6249757B1 (en) 1999-02-16 2001-06-19 3Com Corporation System for detecting voice activity
US6282268B1 (en) 1997-05-06 2001-08-28 International Business Machines Corp. Voice processing system
JP2001236095A (en) 2000-02-23 2001-08-31 Olympus Optical Co Ltd Voice recorder
US6324514B2 (en) 1998-01-02 2001-11-27 Vos Systems, Inc. Voice activated switch with user prompt
US20020054588A1 (en) 2000-09-22 2002-05-09 Manoj Mehta System and method for controlling signal processing in a voice over packet (VoP) environment
US6397186B1 (en) 1999-12-22 2002-05-28 Ambush Interactive, Inc. Hands-free, voice-operated remote control transmitter
US20020116186A1 (en) 2000-09-09 2002-08-22 Adam Strauss Voice activity detector for integrated telecommunications processing
US20020123893A1 (en) 2001-03-01 2002-09-05 International Business Machines Corporation Processing speech recognition errors in an embedded speech recognition system
US6453020B1 (en) 1997-05-06 2002-09-17 International Business Machines Corporation Voice processing system
US20020184015A1 (en) 2001-06-01 2002-12-05 Dunling Li Method for converging a G.729 Annex B compliant voice activity detection circuit
US20030004720A1 (en) 2001-01-30 2003-01-02 Harinath Garudadri System and method for computing and transmitting parameters in a distributed voice recognition system
US20030061036A1 (en) 2001-05-17 2003-03-27 Harinath Garudadri System and method for transmitting speech activity in a distributed voice recognition system
US6564330B1 (en) 1999-12-23 2003-05-13 Intel Corporation Wakeup circuit for computer system that enables codec controller to generate system interrupt in response to detection of a wake event by a codec
US6591234B1 (en) 1999-01-07 2003-07-08 Tellabs Operations, Inc. Method and apparatus for adaptively suppressing noise
US20030144844A1 (en) 2002-01-30 2003-07-31 Koninklijke Philips Electronics N.V. Automatic speech recognition system and method
US6640208B1 (en) 2000-09-12 2003-10-28 Motorola, Inc. Voiced/unvoiced speech classifier
US6665639B2 (en) 1996-12-06 2003-12-16 Sensory, Inc. Speech recognition in consumer electronic products
US20040022379A1 (en) 1997-04-03 2004-02-05 Southwestern Bell Telephone Company Apparatus and method for facilitating service management of communications services in a communications network
US6756700B2 (en) 2002-03-13 2004-06-29 Kye Systems Corp. Sound-activated wake-up device for electronic input devices having a sleep-mode
JP2004219728A (en) 2003-01-15 2004-08-05 Matsushita Electric Ind Co Ltd Speech recognition device
US20040234069A1 (en) * 2003-05-19 2004-11-25 Acoustic Technologies, Inc. Dynamic balance control for telephone
US6832194B1 (en) 2000-10-26 2004-12-14 Sensory, Incorporated Audio recognition peripheral system
US20050207605A1 (en) 2004-03-08 2005-09-22 Infineon Technologies Ag Microphone and method of producing a microphone
US20050240399A1 (en) * 2004-04-21 2005-10-27 Nokia Corporation Signal encoding
US20060074658A1 (en) 2004-10-01 2006-04-06 Siemens Information And Communication Mobile, Llc Systems and methods for hands-free voice-activated devices
US20060233389A1 (en) 2003-08-27 2006-10-19 Sony Computer Entertainment Inc. Methods and apparatus for targeted sound detection and characterization
US20060247923A1 (en) 2000-03-28 2006-11-02 Ravi Chandran Communication system noise cancellation power signal calculation techniques
US7190038B2 (en) 2001-12-11 2007-03-13 Infineon Technologies Ag Micromechanical sensors and methods of manufacturing same
US20070168908A1 (en) 2004-03-26 2007-07-19 Atmel Corporation Dual-processor complex domain floating-point dsp system on chip
US20070278501A1 (en) 2004-12-30 2007-12-06 Macpherson Charles D Electronic device including a guest material within a layer and a process for forming the same
US20080089536A1 (en) 2006-10-11 2008-04-17 Analog Devices, Inc. Microphone Microchip Device with Differential Mode Noise Suppression
US20080175425A1 (en) 2006-11-30 2008-07-24 Analog Devices, Inc. Microphone System with Silicon Microphone Secured to Package Lid
US7415416B2 (en) 2003-09-12 2008-08-19 Canon Kabushiki Kaisha Voice activated device
US20080201138A1 (en) 2004-07-22 2008-08-21 Softmax, Inc. Headset for Separation of Speech Signals in a Noisy Environment
US7418392B1 (en) 2003-09-25 2008-08-26 Sensory, Inc. System and method for controlling the operation of a device by voice commands
US20080267431A1 (en) 2005-02-24 2008-10-30 Epcos Ag Mems Microphone
US20080279407A1 (en) 2005-11-10 2008-11-13 Epcos Ag Mems Microphone, Production Method and Method for Installing
US20080283942A1 (en) 2007-05-15 2008-11-20 Industrial Technology Research Institute Package and packaging assembly of microelectromechanical sysyem microphone
US20090001553A1 (en) 2005-11-10 2009-01-01 Epcos Ag Mems Package and Method for the Production Thereof
US7487089B2 (en) 2001-06-05 2009-02-03 Sensory, Incorporated Biometric client-server security system and method
US20090180655A1 (en) 2008-01-10 2009-07-16 Lingsen Precision Industries, Ltd. Package for mems microphone
WO2009130591A1 (en) 2008-04-25 2009-10-29 Nokia Corporation Method and apparatus for voice activity determination
US7619551B1 (en) 2008-07-29 2009-11-17 Fortemedia, Inc. Audio codec, digital device and voice processing method
US7630504B2 (en) 2003-11-24 2009-12-08 Epcos Ag Microphone comprising integral multi-level quantizer and single-bit conversion means
US20100046780A1 (en) 2006-05-09 2010-02-25 Bse Co., Ltd. Directional silicon condensor microphone having additional back chamber
US20100052082A1 (en) 2008-09-03 2010-03-04 Solid State System Co., Ltd. Micro-electro-mechanical systems (mems) package and method for forming the mems package
US20100057474A1 (en) 2008-06-19 2010-03-04 Hongwei Kong Method and system for digital gain processing in a hardware audio codec for audio transmission
US7720683B1 (en) 2003-06-13 2010-05-18 Sensory, Inc. Method and apparatus of specifying and performing speech recognition operations
US20100128894A1 (en) 2007-05-25 2010-05-27 Nicolas Petit Acoustic Voice Activity Detection (AVAD) for Electronic Systems
US20100128914A1 (en) 2008-11-26 2010-05-27 Analog Devices, Inc. Side-ported MEMS microphone assembly
US20100131783A1 (en) 2008-11-24 2010-05-27 Via Technologies, Inc. System and Method of Dynamically Switching Queue Threshold
US20100183181A1 (en) 2009-01-20 2010-07-22 General Mems Corporation Miniature mems condenser microphone packages and fabrication method thereof
US7774202B2 (en) 2006-06-12 2010-08-10 Lockheed Martin Corporation Speech activated control system and related methods
US7781249B2 (en) 2006-03-20 2010-08-24 Wolfson Microelectronics Plc MEMS process and device
US7795695B2 (en) 2005-01-27 2010-09-14 Analog Devices, Inc. Integrated microphone
US7801729B2 (en) 2007-03-13 2010-09-21 Sensory, Inc. Using multiple attributes to create a voice search playlist
US20100246877A1 (en) 2009-01-20 2010-09-30 Fortemedia, Inc. Miniature MEMS Condenser Microphone Package and Fabrication Method Thereof
US7825484B2 (en) 2005-04-25 2010-11-02 Analog Devices, Inc. Micromachined microphone and multisensor and method for producing same
US7829961B2 (en) 2007-01-10 2010-11-09 Advanced Semiconductor Engineering, Inc. MEMS microphone package and method thereof
US20100290644A1 (en) 2009-05-15 2010-11-18 Aac Acoustic Technologies (Shenzhen) Co., Ltd Silicon based capacitive microphone
US20100292987A1 (en) 2009-05-17 2010-11-18 Hiroshi Kawaguchi Circuit startup method and circuit startup apparatus utilizing utterance estimation for use in speech processing system provided with sound collecting device
US7856283B2 (en) 2005-12-13 2010-12-21 Sigmatel, Inc. Digital microphone interface, audio codec and methods for use therewith
US20100322451A1 (en) 2009-06-19 2010-12-23 Aac Acoustic Technologies (Shenzhen) Co., Ltd MEMS Microphone
US20100322443A1 (en) 2009-06-19 2010-12-23 Aac Acoustic Technologies (Shenzhen) Co., Ltd Mems microphone
US20110007907A1 (en) 2009-07-10 2011-01-13 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for adaptive active noise cancellation
US20110013787A1 (en) 2009-07-16 2011-01-20 Hon Hai Precision Industry Co., Ltd. Mems microphone package and mehtod for making same
US20110029109A1 (en) 2009-06-11 2011-02-03 Audioasics A/S Audio signal controller
US7903831B2 (en) 2005-08-20 2011-03-08 Bse Co., Ltd. Silicon based condenser microphone and packaging method for the same
US20110075875A1 (en) 2009-09-28 2011-03-31 Aac Acoustic Technologies (Shenzhen) Co., Ltd Mems microphone package
US7936293B2 (en) 2008-06-17 2011-05-03 Asahi Kasei Microdevices Corporation Delta-sigma modulator
US20110106533A1 (en) 2008-06-30 2011-05-05 Dolby Laboratories Licensing Corporation Multi-Microphone Voice Activity Detector
US7957972B2 (en) 2006-09-05 2011-06-07 Fortemedia, Inc. Voice recognition system and method thereof
US20110150210A1 (en) * 2003-05-19 2011-06-23 Acoustic Technologies, Inc. Distributed VAD control system for telephone
US7994947B1 (en) 2008-06-06 2011-08-09 Maxim Integrated Products, Inc. Method and apparatus for generating a target frequency having an over-sampled data rate using a system clock having a different frequency
US20110208520A1 (en) 2010-02-24 2011-08-25 Qualcomm Incorporated Voice activity detection based on plural voice activity detectors
US8024195B2 (en) 2005-06-27 2011-09-20 Sensory, Inc. Systems and methods of performing speech recognition using historical information
US8036901B2 (en) 2007-10-05 2011-10-11 Sensory, Incorporated Systems and methods of performing speech recognition using sensory inputs of human position
US20110264447A1 (en) * 2010-04-22 2011-10-27 Qualcomm Incorporated Systems, methods, and apparatus for speech feature detection
WO2011140096A1 (en) 2010-05-03 2011-11-10 Aliphcom, Inc. Vibration sensor and acoustic voice activity detection system (vads) for use with electronic systems
US20110280109A1 (en) 2010-05-13 2011-11-17 Maxim Integrated Products, Inc. Synchronization of a generated clock
US20120010890A1 (en) 2008-12-30 2012-01-12 Raymond Clement Koverzin Power-optimized wireless communications device
US8099289B2 (en) 2008-02-13 2012-01-17 Sensory, Inc. Voice interface and search for electronic devices including bluetooth headsets and remote systems
US8112280B2 (en) 2007-11-19 2012-02-07 Sensory, Inc. Systems and methods of performing speech recognition with barge-in for use in a bluetooth system
US20120052907A1 (en) 2010-08-30 2012-03-01 Sensory, Incorporated Hands-Free, Eyes-Free Mobile Device for In-Car Use
US8171322B2 (en) 2008-06-06 2012-05-01 Apple Inc. Portable electronic devices with power management capabilities
US8208621B1 (en) 2007-10-12 2012-06-26 Mediatek Inc. Systems and methods for acoustic echo cancellation
US20120232896A1 (en) 2010-12-24 2012-09-13 Huawei Technologies Co., Ltd. Method and an apparatus for voice activity detection
US8275148B2 (en) 2009-07-28 2012-09-25 Fortemedia, Inc. Audio processing apparatus and method
US20120250881A1 (en) 2011-03-29 2012-10-04 Mulligan Daniel P Microphone biasing
US8321219B2 (en) 2007-10-05 2012-11-27 Sensory, Inc. Systems and methods of performing speech recognition using gestures
US8331581B2 (en) 2007-03-30 2012-12-11 Wolfson Microelectronics Plc Pattern detection circuitry
US20130044898A1 (en) 2011-08-18 2013-02-21 Jordan T. Schultz Sensitivity Adjustment Apparatus And Method For MEMS Devices
US20130058506A1 (en) 2011-07-12 2013-03-07 Steven E. Boor Microphone Buffer Circuit With Input Filter
WO2013049358A1 (en) 2011-09-30 2013-04-04 Google Inc. Systems and methods for continual speech recognition and detection in mobile computing devices
WO2013085499A1 (en) 2011-12-06 2013-06-13 Intel Corporation Low power voice detection
US20130183944A1 (en) 2012-01-12 2013-07-18 Sensory, Incorporated Information Access and Device Control Using Mobile Phones and Audio in the Home Environment
US20130226324A1 (en) 2010-09-27 2013-08-29 Nokia Corporation Audio scene apparatuses and methods
US20130223635A1 (en) 2012-02-27 2013-08-29 Cambridge Silicon Radio Limited Low power audio detection
US20130246071A1 (en) 2012-03-15 2013-09-19 Samsung Electronics Co., Ltd. Electronic device and method for controlling power using voice recognition
US20130322461A1 (en) 2012-06-01 2013-12-05 Research In Motion Limited Multiformat digital audio interface
US20130343584A1 (en) 2012-06-20 2013-12-26 Broadcom Corporation Hearing assist device with external operational support
US8645132B2 (en) 2011-08-24 2014-02-04 Sensory, Inc. Truly handsfree speech recognition in high noise environments
US8645143B2 (en) 2007-05-01 2014-02-04 Sensory, Inc. Systems and methods of performing speech recognition using global positioning (GPS) information
US8666751B2 (en) 2011-11-17 2014-03-04 Microsoft Corporation Audio pattern matching for device activation
US20140064523A1 (en) 2012-08-30 2014-03-06 Infineon Technologies Ag System and Method for Adjusting the Sensitivity of a Capacitive Signal Source
US8687823B2 (en) 2009-09-16 2014-04-01 Knowles Electronics, Llc. Microphone interface and method of operation
US8700399B2 (en) 2009-07-06 2014-04-15 Sensory, Inc. Systems and methods for hands-free voice control and voice search
US20140122078A1 (en) 2012-11-01 2014-05-01 3iLogic-Designs Private Limited Low Power Mechanism for Keyword Based Hands-Free Wake Up in Always ON-Domain
US8731210B2 (en) 2009-09-21 2014-05-20 Mediatek Inc. Audio processing methods and apparatuses utilizing the same
US20140143545A1 (en) 2012-11-20 2014-05-22 Utility Associates, Inc. System and Method for Securely Distributing Legal Evidence
US20140163978A1 (en) 2012-12-11 2014-06-12 Amazon Technologies, Inc. Speech recognition power management
US20140177113A1 (en) 2012-12-19 2014-06-26 Knowles Electronics, Llc Apparatus and Method For High Voltage I/O Electro-Static Discharge Protection
US8768707B2 (en) 2011-09-27 2014-07-01 Sensory Incorporated Background speech recognition assistant using speaker verification
US20140188470A1 (en) 2012-12-31 2014-07-03 Jenny Chang Flexible architecture for acoustic signal processing engine
US8781825B2 (en) 2011-08-24 2014-07-15 Sensory, Incorporated Reducing false positives in speech recognition systems
US20140197887A1 (en) 2013-01-15 2014-07-17 Knowles Electronics, Llc Telescopic OP-AMP With Slew Rate Control
US8798289B1 (en) 2008-08-05 2014-08-05 Audience, Inc. Adaptive power saving for an audio device
US8804974B1 (en) 2006-03-03 2014-08-12 Cirrus Logic, Inc. Ambient audio event detection in a personal audio device headset
US20140244269A1 (en) 2013-02-28 2014-08-28 Sony Mobile Communications Ab Device and method for activating with voice input
US20140244273A1 (en) 2013-02-27 2014-08-28 Jean Laroche Voice-controlled communication connections
US20140249820A1 (en) 2013-03-01 2014-09-04 Mediatek Inc. Voice control device and method for deciding response of voice control according to recognized speech command and detection output derived from processing sensor data
US20140257821A1 (en) 2013-03-07 2014-09-11 Analog Devices Technology System and method for processor wake-up based on sensor data
US20140257813A1 (en) 2013-03-08 2014-09-11 Analog Devices A/S Microphone circuit assembly and system with speech recognition
US20140278435A1 (en) * 2013-03-12 2014-09-18 Nuance Communications, Inc. Methods and apparatus for detecting a voice command
US20140281628A1 (en) 2013-03-15 2014-09-18 Maxim Integrated Products, Inc. Always-On Low-Power Keyword spotting
US20140274203A1 (en) 2013-03-12 2014-09-18 Nuance Communications, Inc. Methods and apparatus for detecting a voice command
US8849231B1 (en) 2007-08-08 2014-09-30 Audience, Inc. System and method for adaptive power control
US20140324431A1 (en) 2013-04-25 2014-10-30 Sensory, Inc. System, Method, and Apparatus for Location-Based Context Driven Voice Recognition
US20140343949A1 (en) 2013-05-17 2014-11-20 Fortemedia, Inc. Smart microphone device
US20140348345A1 (en) 2013-05-23 2014-11-27 Knowles Electronics, Llc Vad detection microphone and method of operating the same
US20140358552A1 (en) 2013-05-31 2014-12-04 Cirrus Logic, Inc. Low-power voice gate for device wake-up
US20150039303A1 (en) 2013-06-26 2015-02-05 Wolfson Microelectronics Plc Speech recognition
US20150046157A1 (en) 2012-03-16 2015-02-12 Nuance Communications, Inc. User Dedicated Automatic Speech Recognition
US20150046162A1 (en) 2008-11-26 2015-02-12 Nuance Communications, Inc. Device, system, and method of liveness detection utilizing voice biometrics
US20150049884A1 (en) 2013-08-16 2015-02-19 Zilltek Technology Corp. Microphone with voice wake-up function
US20150058001A1 (en) 2013-05-23 2015-02-26 Knowles Electronics, Llc Microphone and Corresponding Digital Interface
US20150055803A1 (en) 2013-05-23 2015-02-26 Knowles Electronics, Llc Decimation Synchronization in a Microphone
US8972252B2 (en) 2012-07-06 2015-03-03 Realtek Semiconductor Corp. Signal processing apparatus having voice activity detection unit and related signal processing methods
US20150063594A1 (en) 2013-09-04 2015-03-05 Knowles Electronics, Llc Slew rate control apparatus for digital microphones
US20150073785A1 (en) 2013-09-06 2015-03-12 Nuance Communications, Inc. Method for voicemail quality detection
US20150073780A1 (en) 2013-09-06 2015-03-12 Nuance Communications, Inc. Method for non-intrusive acoustic parameter estimation
US20150088500A1 (en) 2013-09-24 2015-03-26 Nuance Communications, Inc. Wearable communication enhancement device
US8996381B2 (en) 2011-09-27 2015-03-31 Sensory, Incorporated Background speech recognition assistant
US20150106085A1 (en) 2013-10-11 2015-04-16 Apple Inc. Speech recognition wake-up of a handheld portable electronic device
US20150112690A1 (en) 2013-10-22 2015-04-23 Nvidia Corporation Low power always-on voice trigger architecture
US20150110290A1 (en) 2013-10-21 2015-04-23 Knowles Electronics Llc Apparatus And Method For Frequency Detection
US9020819B2 (en) 2006-01-10 2015-04-28 Nissan Motor Co., Ltd. Recognition dictionary system and recognition dictionary system updating method
US20150134331A1 (en) 2013-11-12 2015-05-14 Apple Inc. Always-On Audio Control for Mobile Device
US9043211B2 (en) 2013-05-09 2015-05-26 Dsp Group Ltd. Low power activation of a voice activated device
US20150154981A1 (en) 2013-12-02 2015-06-04 Nuance Communications, Inc. Voice Activity Detection (VAD) for a Coded Speech Bitstream without Decoding
US20150161989A1 (en) 2013-12-09 2015-06-11 Mediatek Inc. System for speech keyword detection and associated method
US9059630B2 (en) 2011-08-31 2015-06-16 Knowles Electronics, Llc High voltage multiplier for a microphone and method of manufacture
US9073747B2 (en) 2013-05-28 2015-07-07 Shangai Sniper Microelectronics Co., Ltd. MEMS microphone and electronic equipment having the MEMS microphone
US9076447B2 (en) 2013-10-18 2015-07-07 Knowles Electronics, Llc Acoustic activity detection apparatus and method
US20150195656A1 (en) 2014-01-03 2015-07-09 Zilltek Technology (Shanghai) Corp. New-Type Microphone Structure
US20150206527A1 (en) 2012-07-24 2015-07-23 Nuance Communications, Inc. Feature normalization inputs to front end processing for automatic speech recognition
US9111548B2 (en) 2013-05-23 2015-08-18 Knowles Electronics, Llc Synchronization of buffered data in multiple microphones
US9112984B2 (en) 2013-03-12 2015-08-18 Nuance Communications, Inc. Methods and apparatus for detecting a voice command
US20150256660A1 (en) 2014-03-05 2015-09-10 Cirrus Logic, Inc. Frequency-dependent sidetone calibration
US20150256916A1 (en) 2014-03-04 2015-09-10 Knowles Electronics, Llc Programmable Acoustic Device And Method For Programming The Same
US9142215B2 (en) 2012-06-15 2015-09-22 Cypress Semiconductor Corporation Power-efficient voice activation
US9147397B2 (en) 2013-10-29 2015-09-29 Knowles Electronics, Llc VAD detection apparatus and method of operating the same
US20150287401A1 (en) 2012-11-05 2015-10-08 Nuance Communications, Inc. Privacy-sensitive speech model creation via aggregation of multiple user models
US9161112B2 (en) 2012-01-16 2015-10-13 Shanghai Sniper Microelectronics Single-wire programmable MEMS microphone, programming method and system thereof
US20150302865A1 (en) 2014-04-18 2015-10-22 Nuance Communications, Inc. System and method for audio conferencing
US20150304502A1 (en) 2014-04-18 2015-10-22 Nuance Communications, Inc. System and method for audio conferencing
US20160012007A1 (en) 2014-03-06 2016-01-14 Knowles Electronics, Llc Digital Microphone Interface
US20160057549A1 (en) * 2013-04-09 2016-02-25 Sonova Ag Method and system for providing hearing assistance to a user
US20160087596A1 (en) 2014-09-19 2016-03-24 Knowles Electronics, Llc Digital microphone with adjustable gain control
US20160134975A1 (en) 2014-11-12 2016-05-12 Knowles Electronics, Llc Microphone With Trimming
US20160133271A1 (en) 2014-11-11 2016-05-12 Knowles Electronic, Llc Microphone With Electronic Noise Filter
US9439005B2 (en) * 2013-11-25 2016-09-06 Oticon A/S Spatial filter bank for hearing system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102074245B (en) * 2011-01-05 2012-10-10 瑞声声学科技(深圳)有限公司 Dual-microphone-based speech enhancement device and speech enhancement method

Patent Citations (201)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4052568A (en) 1976-04-23 1977-10-04 Communications Satellite Corporation Digital voice switch
US5598447A (en) 1992-05-11 1997-01-28 Yamaha Corporation Integrated circuit device having internal fast clock source
US5577164A (en) 1994-01-28 1996-11-19 Canon Kabushiki Kaisha Incorrect voice command recognition prevention and recovery processing method and apparatus
US5675808A (en) 1994-11-02 1997-10-07 Advanced Micro Devices, Inc. Power control of circuit modules within an integrated circuit
US6049565A (en) 1994-12-16 2000-04-11 International Business Machines Corporation Method and apparatus for audio communication
US6070140A (en) 1995-06-05 2000-05-30 Tran; Bao Q. Speech recognizer
US5983186A (en) 1995-08-21 1999-11-09 Seiko Epson Corporation Voice-activated interactive speech recognition device and method
US5822598A (en) 1996-07-12 1998-10-13 Ast Research, Inc. Audio activity detection circuit to increase battery life in portable computers
US7092887B2 (en) 1996-12-06 2006-08-15 Sensory, Incorporated Method of performing speech recognition across a network
US6665639B2 (en) 1996-12-06 2003-12-16 Sensory, Inc. Speech recognition in consumer electronic products
US6999927B2 (en) 1996-12-06 2006-02-14 Sensory, Inc. Speech recognition programming information retrieved from a remote source to a speech recognition system for performing a speech recognition method
US6154721A (en) 1997-03-25 2000-11-28 U.S. Philips Corporation Method and device for detecting voice activity
US20040022379A1 (en) 1997-04-03 2004-02-05 Southwestern Bell Telephone Company Apparatus and method for facilitating service management of communications services in a communications network
US6282268B1 (en) 1997-05-06 2001-08-28 International Business Machines Corp. Voice processing system
US6453020B1 (en) 1997-05-06 2002-09-17 International Business Machines Corporation Voice processing system
US6324514B2 (en) 1998-01-02 2001-11-27 Vos Systems, Inc. Voice activated switch with user prompt
US6057791A (en) 1998-02-18 2000-05-02 Oasis Design, Inc. Apparatus and method for clocking digital and analog circuits on a common substrate to enhance digital operation and reduce analog sampling error
US6138040A (en) * 1998-07-31 2000-10-24 Motorola, Inc. Method for suppressing speaker activation in a portable communication device operated in a speakerphone mode
US6591234B1 (en) 1999-01-07 2003-07-08 Tellabs Operations, Inc. Method and apparatus for adaptively suppressing noise
US6249757B1 (en) 1999-02-16 2001-06-19 3Com Corporation System for detecting voice activity
US6397186B1 (en) 1999-12-22 2002-05-28 Ambush Interactive, Inc. Hands-free, voice-operated remote control transmitter
US6564330B1 (en) 1999-12-23 2003-05-13 Intel Corporation Wakeup circuit for computer system that enables codec controller to generate system interrupt in response to detection of a wake event by a codec
JP2001236095A (en) 2000-02-23 2001-08-31 Olympus Optical Co Ltd Voice recorder
US20060247923A1 (en) 2000-03-28 2006-11-02 Ravi Chandran Communication system noise cancellation power signal calculation techniques
US20020116186A1 (en) 2000-09-09 2002-08-22 Adam Strauss Voice activity detector for integrated telecommunications processing
US6640208B1 (en) 2000-09-12 2003-10-28 Motorola, Inc. Voiced/unvoiced speech classifier
US20020054588A1 (en) 2000-09-22 2002-05-09 Manoj Mehta System and method for controlling signal processing in a voice over packet (VoP) environment
US6832194B1 (en) 2000-10-26 2004-12-14 Sensory, Incorporated Audio recognition peripheral system
US20030004720A1 (en) 2001-01-30 2003-01-02 Harinath Garudadri System and method for computing and transmitting parameters in a distributed voice recognition system
US20020123893A1 (en) 2001-03-01 2002-09-05 International Business Machines Corporation Processing speech recognition errors in an embedded speech recognition system
US20030061036A1 (en) 2001-05-17 2003-03-27 Harinath Garudadri System and method for transmitting speech activity in a distributed voice recognition system
US7941313B2 (en) 2001-05-17 2011-05-10 Qualcomm Incorporated System and method for transmitting speech activity information ahead of speech features in a distributed voice recognition system
US20020184015A1 (en) 2001-06-01 2002-12-05 Dunling Li Method for converging a G.729 Annex B compliant voice activity detection circuit
US7487089B2 (en) 2001-06-05 2009-02-03 Sensory, Incorporated Biometric client-server security system and method
US7190038B2 (en) 2001-12-11 2007-03-13 Infineon Technologies Ag Micromechanical sensors and methods of manufacturing same
US7473572B2 (en) 2001-12-11 2009-01-06 Infineon Technologies Ag Micromechanical sensors and methods of manufacturing same
US20030144844A1 (en) 2002-01-30 2003-07-31 Koninklijke Philips Electronics N.V. Automatic speech recognition system and method
US6756700B2 (en) 2002-03-13 2004-06-29 Kye Systems Corp. Sound-activated wake-up device for electronic input devices having a sleep-mode
JP2004219728A (en) 2003-01-15 2004-08-05 Matsushita Electric Ind Co Ltd Speech recognition device
US20040234069A1 (en) * 2003-05-19 2004-11-25 Acoustic Technologies, Inc. Dynamic balance control for telephone
US20110150210A1 (en) * 2003-05-19 2011-06-23 Acoustic Technologies, Inc. Distributed VAD control system for telephone
US7720683B1 (en) 2003-06-13 2010-05-18 Sensory, Inc. Method and apparatus of specifying and performing speech recognition operations
US20060233389A1 (en) 2003-08-27 2006-10-19 Sony Computer Entertainment Inc. Methods and apparatus for targeted sound detection and characterization
US7415416B2 (en) 2003-09-12 2008-08-19 Canon Kabushiki Kaisha Voice activated device
US7418392B1 (en) 2003-09-25 2008-08-26 Sensory, Inc. System and method for controlling the operation of a device by voice commands
US7774204B2 (en) 2003-09-25 2010-08-10 Sensory, Inc. System and method for controlling the operation of a device by voice commands
US7630504B2 (en) 2003-11-24 2009-12-08 Epcos Ag Microphone comprising integral multi-level quantizer and single-bit conversion means
US20050207605A1 (en) 2004-03-08 2005-09-22 Infineon Technologies Ag Microphone and method of producing a microphone
US20070168908A1 (en) 2004-03-26 2007-07-19 Atmel Corporation Dual-processor complex domain floating-point dsp system on chip
US20050240399A1 (en) * 2004-04-21 2005-10-27 Nokia Corporation Signal encoding
US20080201138A1 (en) 2004-07-22 2008-08-21 Softmax, Inc. Headset for Separation of Speech Signals in a Noisy Environment
US20060074658A1 (en) 2004-10-01 2006-04-06 Siemens Information And Communication Mobile, Llc Systems and methods for hands-free voice-activated devices
US20070278501A1 (en) 2004-12-30 2007-12-06 Macpherson Charles D Electronic device including a guest material within a layer and a process for forming the same
US7795695B2 (en) 2005-01-27 2010-09-14 Analog Devices, Inc. Integrated microphone
US20080267431A1 (en) 2005-02-24 2008-10-30 Epcos Ag Mems Microphone
US7825484B2 (en) 2005-04-25 2010-11-02 Analog Devices, Inc. Micromachined microphone and multisensor and method for producing same
US8024195B2 (en) 2005-06-27 2011-09-20 Sensory, Inc. Systems and methods of performing speech recognition using historical information
US7903831B2 (en) 2005-08-20 2011-03-08 Bse Co., Ltd. Silicon based condenser microphone and packaging method for the same
US20080279407A1 (en) 2005-11-10 2008-11-13 Epcos Ag Mems Microphone, Production Method and Method for Installing
US20090001553A1 (en) 2005-11-10 2009-01-01 Epcos Ag Mems Package and Method for the Production Thereof
US7856283B2 (en) 2005-12-13 2010-12-21 Sigmatel, Inc. Digital microphone interface, audio codec and methods for use therewith
US9020819B2 (en) 2006-01-10 2015-04-28 Nissan Motor Co., Ltd. Recognition dictionary system and recognition dictionary system updating method
US8804974B1 (en) 2006-03-03 2014-08-12 Cirrus Logic, Inc. Ambient audio event detection in a personal audio device headset
US7856804B2 (en) 2006-03-20 2010-12-28 Wolfson Microelectronics Plc MEMS process and device
US7781249B2 (en) 2006-03-20 2010-08-24 Wolfson Microelectronics Plc MEMS process and device
US20100046780A1 (en) 2006-05-09 2010-02-25 Bse Co., Ltd. Directional silicon condensor microphone having additional back chamber
US9119150B1 (en) 2006-05-25 2015-08-25 Audience, Inc. System and method for adaptive power control
US7774202B2 (en) 2006-06-12 2010-08-10 Lockheed Martin Corporation Speech activated control system and related methods
US7957972B2 (en) 2006-09-05 2011-06-07 Fortemedia, Inc. Voice recognition system and method thereof
US20080089536A1 (en) 2006-10-11 2008-04-17 Analog Devices, Inc. Microphone Microchip Device with Differential Mode Noise Suppression
US20080175425A1 (en) 2006-11-30 2008-07-24 Analog Devices, Inc. Microphone System with Silicon Microphone Secured to Package Lid
US7829961B2 (en) 2007-01-10 2010-11-09 Advanced Semiconductor Engineering, Inc. MEMS microphone package and method thereof
US7801729B2 (en) 2007-03-13 2010-09-21 Sensory, Inc. Using multiple attributes to create a voice search playlist
US8331581B2 (en) 2007-03-30 2012-12-11 Wolfson Microelectronics Plc Pattern detection circuitry
US8645143B2 (en) 2007-05-01 2014-02-04 Sensory, Inc. Systems and methods of performing speech recognition using global positioning (GPS) information
US20080283942A1 (en) 2007-05-15 2008-11-20 Industrial Technology Research Institute Package and packaging assembly of microelectromechanical sysyem microphone
US20100128894A1 (en) 2007-05-25 2010-05-27 Nicolas Petit Acoustic Voice Activity Detection (AVAD) for Electronic Systems
US8849231B1 (en) 2007-08-08 2014-09-30 Audience, Inc. System and method for adaptive power control
US8036901B2 (en) 2007-10-05 2011-10-11 Sensory, Incorporated Systems and methods of performing speech recognition using sensory inputs of human position
US8321219B2 (en) 2007-10-05 2012-11-27 Sensory, Inc. Systems and methods of performing speech recognition using gestures
US8208621B1 (en) 2007-10-12 2012-06-26 Mediatek Inc. Systems and methods for acoustic echo cancellation
US8112280B2 (en) 2007-11-19 2012-02-07 Sensory, Inc. Systems and methods of performing speech recognition with barge-in for use in a bluetooth system
US20090180655A1 (en) 2008-01-10 2009-07-16 Lingsen Precision Industries, Ltd. Package for mems microphone
US8099289B2 (en) 2008-02-13 2012-01-17 Sensory, Inc. Voice interface and search for electronic devices including bluetooth headsets and remote systems
US8195467B2 (en) 2008-02-13 2012-06-05 Sensory, Incorporated Voice interface and search for electronic devices including bluetooth headsets and remote systems
WO2009130591A1 (en) 2008-04-25 2009-10-29 Nokia Corporation Method and apparatus for voice activity determination
US20120310641A1 (en) 2008-04-25 2012-12-06 Nokia Corporation Method And Apparatus For Voice Activity Determination
US8171322B2 (en) 2008-06-06 2012-05-01 Apple Inc. Portable electronic devices with power management capabilities
US7994947B1 (en) 2008-06-06 2011-08-09 Maxim Integrated Products, Inc. Method and apparatus for generating a target frequency having an over-sampled data rate using a system clock having a different frequency
US7936293B2 (en) 2008-06-17 2011-05-03 Asahi Kasei Microdevices Corporation Delta-sigma modulator
US20100057474A1 (en) 2008-06-19 2010-03-04 Hongwei Kong Method and system for digital gain processing in a hardware audio codec for audio transmission
US20110106533A1 (en) 2008-06-30 2011-05-05 Dolby Laboratories Licensing Corporation Multi-Microphone Voice Activity Detector
US7619551B1 (en) 2008-07-29 2009-11-17 Fortemedia, Inc. Audio codec, digital device and voice processing method
US8798289B1 (en) 2008-08-05 2014-08-05 Audience, Inc. Adaptive power saving for an audio device
US20100052082A1 (en) 2008-09-03 2010-03-04 Solid State System Co., Ltd. Micro-electro-mechanical systems (mems) package and method for forming the mems package
US20100131783A1 (en) 2008-11-24 2010-05-27 Via Technologies, Inc. System and Method of Dynamically Switching Queue Threshold
US20100128914A1 (en) 2008-11-26 2010-05-27 Analog Devices, Inc. Side-ported MEMS microphone assembly
US20150046162A1 (en) 2008-11-26 2015-02-12 Nuance Communications, Inc. Device, system, and method of liveness detection utilizing voice biometrics
US20120010890A1 (en) 2008-12-30 2012-01-12 Raymond Clement Koverzin Power-optimized wireless communications device
US20100246877A1 (en) 2009-01-20 2010-09-30 Fortemedia, Inc. Miniature MEMS Condenser Microphone Package and Fabrication Method Thereof
US20100183181A1 (en) 2009-01-20 2010-07-22 General Mems Corporation Miniature mems condenser microphone packages and fabrication method thereof
US20140188467A1 (en) 2009-05-01 2014-07-03 Aliphcom Vibration sensor and acoustic voice activity detection systems (vads) for use with electronic systems
US20100290644A1 (en) 2009-05-15 2010-11-18 Aac Acoustic Technologies (Shenzhen) Co., Ltd Silicon based capacitive microphone
US20100292987A1 (en) 2009-05-17 2010-11-18 Hiroshi Kawaguchi Circuit startup method and circuit startup apparatus utilizing utterance estimation for use in speech processing system provided with sound collecting device
US20110029109A1 (en) 2009-06-11 2011-02-03 Audioasics A/S Audio signal controller
US20100322451A1 (en) 2009-06-19 2010-12-23 Aac Acoustic Technologies (Shenzhen) Co., Ltd MEMS Microphone
US20100322443A1 (en) 2009-06-19 2010-12-23 Aac Acoustic Technologies (Shenzhen) Co., Ltd Mems microphone
US20140180691A1 (en) 2009-07-06 2014-06-26 Sensory, Incorporated Systems and methods for hands-free voice control and voice search
US8700399B2 (en) 2009-07-06 2014-04-15 Sensory, Inc. Systems and methods for hands-free voice control and voice search
US20110007907A1 (en) 2009-07-10 2011-01-13 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for adaptive active noise cancellation
US20110013787A1 (en) 2009-07-16 2011-01-20 Hon Hai Precision Industry Co., Ltd. Mems microphone package and mehtod for making same
US8275148B2 (en) 2009-07-28 2012-09-25 Fortemedia, Inc. Audio processing apparatus and method
US8687823B2 (en) 2009-09-16 2014-04-01 Knowles Electronics, Llc. Microphone interface and method of operation
US8731210B2 (en) 2009-09-21 2014-05-20 Mediatek Inc. Audio processing methods and apparatuses utilizing the same
US20110075875A1 (en) 2009-09-28 2011-03-31 Aac Acoustic Technologies (Shenzhen) Co., Ltd Mems microphone package
WO2011106065A1 (en) 2010-02-24 2011-09-01 Qualcomm Incorporated Voice activity detection based on plural voice activity detectors
US20110208520A1 (en) 2010-02-24 2011-08-25 Qualcomm Incorporated Voice activity detection based on plural voice activity detectors
US20110264447A1 (en) * 2010-04-22 2011-10-27 Qualcomm Incorporated Systems, methods, and apparatus for speech feature detection
US9165567B2 (en) * 2010-04-22 2015-10-20 Qualcomm Incorporated Systems, methods, and apparatus for speech feature detection
WO2011140096A1 (en) 2010-05-03 2011-11-10 Aliphcom, Inc. Vibration sensor and acoustic voice activity detection system (vads) for use with electronic systems
US20110280109A1 (en) 2010-05-13 2011-11-17 Maxim Integrated Products, Inc. Synchronization of a generated clock
US20120052907A1 (en) 2010-08-30 2012-03-01 Sensory, Incorporated Hands-Free, Eyes-Free Mobile Device for In-Car Use
US20130226324A1 (en) 2010-09-27 2013-08-29 Nokia Corporation Audio scene apparatuses and methods
US20120232896A1 (en) 2010-12-24 2012-09-13 Huawei Technologies Co., Ltd. Method and an apparatus for voice activity detection
US20120250881A1 (en) 2011-03-29 2012-10-04 Mulligan Daniel P Microphone biasing
US20130058506A1 (en) 2011-07-12 2013-03-07 Steven E. Boor Microphone Buffer Circuit With Input Filter
US20130044898A1 (en) 2011-08-18 2013-02-21 Jordan T. Schultz Sensitivity Adjustment Apparatus And Method For MEMS Devices
US8645132B2 (en) 2011-08-24 2014-02-04 Sensory, Inc. Truly handsfree speech recognition in high noise environments
US8781825B2 (en) 2011-08-24 2014-07-15 Sensory, Incorporated Reducing false positives in speech recognition systems
US9059630B2 (en) 2011-08-31 2015-06-16 Knowles Electronics, Llc High voltage multiplier for a microphone and method of manufacture
US9142219B2 (en) 2011-09-27 2015-09-22 Sensory, Incorporated Background speech recognition assistant using speaker verification
US8996381B2 (en) 2011-09-27 2015-03-31 Sensory, Incorporated Background speech recognition assistant
US8768707B2 (en) 2011-09-27 2014-07-01 Sensory Incorporated Background speech recognition assistant using speaker verification
WO2013049358A1 (en) 2011-09-30 2013-04-04 Google Inc. Systems and methods for continual speech recognition and detection in mobile computing devices
US8666751B2 (en) 2011-11-17 2014-03-04 Microsoft Corporation Audio pattern matching for device activation
WO2013085499A1 (en) 2011-12-06 2013-06-13 Intel Corporation Low power voice detection
US20130183944A1 (en) 2012-01-12 2013-07-18 Sensory, Incorporated Information Access and Device Control Using Mobile Phones and Audio in the Home Environment
US9161112B2 (en) 2012-01-16 2015-10-13 Shanghai Sniper Microelectronics Single-wire programmable MEMS microphone, programming method and system thereof
US20130223635A1 (en) 2012-02-27 2013-08-29 Cambridge Silicon Radio Limited Low power audio detection
US20130246071A1 (en) 2012-03-15 2013-09-19 Samsung Electronics Co., Ltd. Electronic device and method for controlling power using voice recognition
US20150046157A1 (en) 2012-03-16 2015-02-12 Nuance Communications, Inc. User Dedicated Automatic Speech Recognition
US20130322461A1 (en) 2012-06-01 2013-12-05 Research In Motion Limited Multiformat digital audio interface
US9142215B2 (en) 2012-06-15 2015-09-22 Cypress Semiconductor Corporation Power-efficient voice activation
US20130343584A1 (en) 2012-06-20 2013-12-26 Broadcom Corporation Hearing assist device with external operational support
US8972252B2 (en) 2012-07-06 2015-03-03 Realtek Semiconductor Corp. Signal processing apparatus having voice activity detection unit and related signal processing methods
US20150206527A1 (en) 2012-07-24 2015-07-23 Nuance Communications, Inc. Feature normalization inputs to front end processing for automatic speech recognition
US20140064523A1 (en) 2012-08-30 2014-03-06 Infineon Technologies Ag System and Method for Adjusting the Sensitivity of a Capacitive Signal Source
US20140122078A1 (en) 2012-11-01 2014-05-01 3iLogic-Designs Private Limited Low Power Mechanism for Keyword Based Hands-Free Wake Up in Always ON-Domain
US20150287401A1 (en) 2012-11-05 2015-10-08 Nuance Communications, Inc. Privacy-sensitive speech model creation via aggregation of multiple user models
US20140143545A1 (en) 2012-11-20 2014-05-22 Utility Associates, Inc. System and Method for Securely Distributing Legal Evidence
US20140163978A1 (en) 2012-12-11 2014-06-12 Amazon Technologies, Inc. Speech recognition power management
US20140177113A1 (en) 2012-12-19 2014-06-26 Knowles Electronics, Llc Apparatus and Method For High Voltage I/O Electro-Static Discharge Protection
US20140188470A1 (en) 2012-12-31 2014-07-03 Jenny Chang Flexible architecture for acoustic signal processing engine
US20140197887A1 (en) 2013-01-15 2014-07-17 Knowles Electronics, Llc Telescopic OP-AMP With Slew Rate Control
US20140244273A1 (en) 2013-02-27 2014-08-28 Jean Laroche Voice-controlled communication connections
US20140244269A1 (en) 2013-02-28 2014-08-28 Sony Mobile Communications Ab Device and method for activating with voice input
US20140249820A1 (en) 2013-03-01 2014-09-04 Mediatek Inc. Voice control device and method for deciding response of voice control according to recognized speech command and detection output derived from processing sensor data
US20140257821A1 (en) 2013-03-07 2014-09-11 Analog Devices Technology System and method for processor wake-up based on sensor data
US20140257813A1 (en) 2013-03-08 2014-09-11 Analog Devices A/S Microphone circuit assembly and system with speech recognition
US20140278435A1 (en) * 2013-03-12 2014-09-18 Nuance Communications, Inc. Methods and apparatus for detecting a voice command
US20140274203A1 (en) 2013-03-12 2014-09-18 Nuance Communications, Inc. Methods and apparatus for detecting a voice command
US9112984B2 (en) 2013-03-12 2015-08-18 Nuance Communications, Inc. Methods and apparatus for detecting a voice command
US20140281628A1 (en) 2013-03-15 2014-09-18 Maxim Integrated Products, Inc. Always-On Low-Power Keyword spotting
US20160057549A1 (en) * 2013-04-09 2016-02-25 Sonova Ag Method and system for providing hearing assistance to a user
US20140324431A1 (en) 2013-04-25 2014-10-30 Sensory, Inc. System, Method, and Apparatus for Location-Based Context Driven Voice Recognition
US9043211B2 (en) 2013-05-09 2015-05-26 Dsp Group Ltd. Low power activation of a voice activated device
US20140343949A1 (en) 2013-05-17 2014-11-20 Fortemedia, Inc. Smart microphone device
US9111548B2 (en) 2013-05-23 2015-08-18 Knowles Electronics, Llc Synchronization of buffered data in multiple microphones
US9113263B2 (en) 2013-05-23 2015-08-18 Knowles Electronics, Llc VAD detection microphone and method of operating the same
US20150350774A1 (en) 2013-05-23 2015-12-03 Knowles Electronics, Llc Vad detection microphone and method of operating the same
US20150058001A1 (en) 2013-05-23 2015-02-26 Knowles Electronics, Llc Microphone and Corresponding Digital Interface
US20150055803A1 (en) 2013-05-23 2015-02-26 Knowles Electronics, Llc Decimation Synchronization in a Microphone
US20150043755A1 (en) 2013-05-23 2015-02-12 Knowles Electronics, Llc Vad detection microphone and method of operating the same
US20150350760A1 (en) 2013-05-23 2015-12-03 Knowles Electronics, Llc Synchronization of Buffered Data in Multiple Microphones
US20140348345A1 (en) 2013-05-23 2014-11-27 Knowles Electronics, Llc Vad detection microphone and method of operating the same
US9073747B2 (en) 2013-05-28 2015-07-07 Shangai Sniper Microelectronics Co., Ltd. MEMS microphone and electronic equipment having the MEMS microphone
US20140358552A1 (en) 2013-05-31 2014-12-04 Cirrus Logic, Inc. Low-power voice gate for device wake-up
US20150039303A1 (en) 2013-06-26 2015-02-05 Wolfson Microelectronics Plc Speech recognition
US20150049884A1 (en) 2013-08-16 2015-02-19 Zilltek Technology Corp. Microphone with voice wake-up function
US20150063594A1 (en) 2013-09-04 2015-03-05 Knowles Electronics, Llc Slew rate control apparatus for digital microphones
US20150073780A1 (en) 2013-09-06 2015-03-12 Nuance Communications, Inc. Method for non-intrusive acoustic parameter estimation
US20150073785A1 (en) 2013-09-06 2015-03-12 Nuance Communications, Inc. Method for voicemail quality detection
US20150088500A1 (en) 2013-09-24 2015-03-26 Nuance Communications, Inc. Wearable communication enhancement device
US20150106085A1 (en) 2013-10-11 2015-04-16 Apple Inc. Speech recognition wake-up of a handheld portable electronic device
US9076447B2 (en) 2013-10-18 2015-07-07 Knowles Electronics, Llc Acoustic activity detection apparatus and method
US20150110290A1 (en) 2013-10-21 2015-04-23 Knowles Electronics Llc Apparatus And Method For Frequency Detection
US20150112690A1 (en) 2013-10-22 2015-04-23 Nvidia Corporation Low power always-on voice trigger architecture
US9147397B2 (en) 2013-10-29 2015-09-29 Knowles Electronics, Llc VAD detection apparatus and method of operating the same
US20150134331A1 (en) 2013-11-12 2015-05-14 Apple Inc. Always-On Audio Control for Mobile Device
US9439005B2 (en) * 2013-11-25 2016-09-06 Oticon A/S Spatial filter bank for hearing system
US20150154981A1 (en) 2013-12-02 2015-06-04 Nuance Communications, Inc. Voice Activity Detection (VAD) for a Coded Speech Bitstream without Decoding
US20150161989A1 (en) 2013-12-09 2015-06-11 Mediatek Inc. System for speech keyword detection and associated method
US20150195656A1 (en) 2014-01-03 2015-07-09 Zilltek Technology (Shanghai) Corp. New-Type Microphone Structure
US20150256916A1 (en) 2014-03-04 2015-09-10 Knowles Electronics, Llc Programmable Acoustic Device And Method For Programming The Same
US20150256660A1 (en) 2014-03-05 2015-09-10 Cirrus Logic, Inc. Frequency-dependent sidetone calibration
US20160012007A1 (en) 2014-03-06 2016-01-14 Knowles Electronics, Llc Digital Microphone Interface
US20150304502A1 (en) 2014-04-18 2015-10-22 Nuance Communications, Inc. System and method for audio conferencing
US20150302865A1 (en) 2014-04-18 2015-10-22 Nuance Communications, Inc. System and method for audio conferencing
US20160087596A1 (en) 2014-09-19 2016-03-24 Knowles Electronics, Llc Digital microphone with adjustable gain control
US20160133271A1 (en) 2014-11-11 2016-05-12 Knowles Electronic, Llc Microphone With Electronic Noise Filter
US20160134975A1 (en) 2014-11-12 2016-05-12 Knowles Electronics, Llc Microphone With Trimming

Non-Patent Citations (17)

Title
"MEMS technologies: Microphone" EE Herald Jun. 20, 2013.
Delta-sigma modulation, Wikipedia (Jul. 4, 2013).
International Search Report and Written Opinion for PCT/EP2014/064324, dated Feb. 12, 2015 (13 pages).
International Search Report and Written Opinion for PCT/US2014/038790, dated Sep. 24, 2014 (9 pages).
International Search Report and Written Opinion for PCT/US2014/060567 dated Jan. 16, 2015 (12 pages).
International Search Report and Written Opinion for PCT/US2014/062861 dated Jan. 23, 2015 (12 pages).
International Search Report and Written Opinion for PCT/US2016/013859 dated Apr. 29, 2016 (12 pages).
Kite, Understanding PDM Digital Audio, Audio Precision, Beaverton, OR, 2012.
Pulse-density modulation, Wikipedia (May 3, 2013).
Search Report of Taiwan Patent Application No. 103135811, dated Apr. 18, 2016 (1 page).
U.S. Appl. No. 14/285,585, filed May 22, 2014, Santos.
U.S. Appl. No. 14/495,482, filed Sep. 24, 2014, Murgia.
U.S. Appl. No. 14/522,264, filed Oct. 23, 2014, Murgia.
U.S. Appl. No. 14/698,652, filed Apr. 28, 2015, Yapanel.
U.S. Appl. No. 14/749,425, filed Jun. 24, 2015, Verma.
U.S. Appl. No. 14/853,947, filed Sep. 14, 2015, Yen.
U.S. Appl. No. 62/100,758, filed Jan. 7, 2015, Rossum.

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10360926B2 (en) 2014-07-10 2019-07-23 Analog Devices Global Unlimited Company Low-complexity voice activity detection
US10964339B2 (en) 2014-07-10 2021-03-30 Analog Devices International Unlimited Company Low-complexity voice activity detection
US20180130485A1 (en) * 2016-11-08 2018-05-10 Samsung Electronics Co., Ltd. Auto voice trigger method and audio analyzer employing the same
US10566011B2 (en) * 2016-11-08 2020-02-18 Samsung Electronics Co., Ltd. Auto voice trigger method and audio analyzer employing the same
US11720749B2 (en) 2018-10-16 2023-08-08 Oracle International Corporation Constructing conclusive answers for autonomous agents
US20220222444A1 (en) * 2019-02-13 2022-07-14 Oracle International Corporation Chatbot conducting a virtual social dialogue
US11861319B2 (en) * 2019-02-13 2024-01-02 Oracle International Corporation Chatbot conducting a virtual social dialogue
US11335361B2 (en) * 2020-04-24 2022-05-17 Universal Electronics Inc. Method and apparatus for providing noise suppression to an intelligent personal assistant
US20220223172A1 (en) * 2020-04-24 2022-07-14 Universal Electronics Inc. Method and apparatus for providing noise suppression to an intelligent personal assistant
US11790938B2 (en) * 2020-04-24 2023-10-17 Universal Electronics Inc. Method and apparatus for providing noise suppression to an intelligent personal assistant
US20220115007A1 (en) * 2020-10-08 2022-04-14 Qualcomm Incorporated User voice activity detection using dynamic classifier
US11783809B2 (en) * 2020-10-08 2023-10-10 Qualcomm Incorporated User voice activity detection using dynamic classifier

Also Published As

Publication number Publication date
DE112014004951T5 (en) 2016-07-21
CN105830463A (en) 2016-08-03
US9147397B2 (en) 2015-09-29
US20160064001A1 (en) 2016-03-03
US20150120299A1 (en) 2015-04-30
WO2015066152A1 (en) 2015-05-07

Similar Documents

Publication Publication Date Title
US9830913B2 (en) VAD detection apparatus and method of operation the same
US10425790B2 (en) Sensor device, sensor network system, and data compressing method
US10867611B2 (en) User programmable voice command recognition based on sparse features
US10360926B2 (en) Low-complexity voice activity detection
US20180315416A1 (en) Microphone with programmable phone onset detection engine
US10090005B2 (en) Analog voice activity detection
CN106664486B (en) Method and apparatus for wind noise detection
US10218327B2 (en) Dynamic enhancement of audio (DAE) in headset systems
EP3682651A1 (en) Low latency audio enhancement
US11605372B2 (en) Time-based frequency tuning of analog-to-information feature extraction
US8891786B1 (en) Selective notch filtering for howling suppression
US9454976B2 (en) Efficient discrimination of voiced and unvoiced sounds
EP2881948A1 (en) Spectral comb voice activity detection
CN103426440A (en) Voice endpoint detection device and voice endpoint detection method utilizing energy spectrum entropy spatial information
US20110054889A1 (en) Enhancing Receiver Intelligibility in Voice Communication Devices
CN113766073A (en) Howling detection in a conferencing system
KR101689332B1 (en) Information-based Sound Volume Control Apparatus and Method thereof
JP2014126856A (en) Noise removal device and control method for the same
CN111477246B (en) Voice processing method and device and intelligent terminal
CN114127846A (en) Voice tracking listening device
JP4601970B2 (en) Sound / silence determination device and sound / silence determination method
CN106409312B (en) Audio classifier
WO2020039597A1 (en) Signal processing device, voice communication terminal, signal processing method, and signal processing program
Park et al. Human-robot interface using robust speech recognition and user localization based on noise separation device
WO2019159253A1 (en) Speech processing apparatus, method, and program

Legal Events

Date Code Title Description
AS Assignment
Owner name: KNOWLES ELECTRONICS, LLC, ILLINOIS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:THOMSEN, HENRIK;NANDY, DIBYENDU;SIGNING DATES FROM 20141211 TO 20141217;REEL/FRAME:037401/0478

STCF Information on status: patent grant
Free format text: PATENTED CASE

MAFP Maintenance fee payment
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
Year of fee payment: 4

AS Assignment
Owner name: SAMSUNG ELECTRONICS CO., LTD., KOREA, REPUBLIC OF
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KNOWLES ELECTRONICS, LLC;REEL/FRAME:066216/0590
Effective date: 20231219