WO2008156941A1 - Sound discrimination method and apparatus - Google Patents


Info

Publication number
WO2008156941A1
Authority
WO
WIPO (PCT)
Prior art keywords
transducers
distance
gain
frequency bands
microphone
Prior art date
Application number
PCT/US2008/064056
Other languages
French (fr)
Inventor
William R. Short
Original Assignee
Bose Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bose Corporation
Priority to CN2008800209202A (CN101682809B)
Priority to JP2010513294A (JP4965707B2)
Priority to EP08755825A (EP2158788A1)
Publication of WO2008156941A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00 Monitoring arrangements; Testing arrangements
    • H04R29/004 Monitoring arrangements; Testing arrangements for microphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones

Definitions

  • the invention relates generally to the field of acoustics, and in particular to sound pick-up and reproduction. More specifically, the invention relates to a sound discrimination method and apparatus.
  • a problem with conventional microphones is that they respond not only to the desired instrument or voice, but also to other nearby instruments and/or voices. If, for example, the sound of the drum kit bleeds into the microphone of the lead singer, the reproduced sound is adversely affected. This problem also occurs when musicians are in a studio recording their music.
  • One type of acoustic pick-up device is an omni directional microphone.
  • An omni directional microphone is rarely used for live music because it tends to be more prone to feedback.
  • conventional microphones having a directional acceptance pattern e.g., a cardioid microphone
  • these microphones have insufficient rejection to fully solve the problem.
  • Directional microphones generally have a frequency response that varies with the distance from the source. This is typical of pressure gradient responding microphones. This effect is called the “proximity effect”, and it results in a bass boost when the microphone is close to the source and a loss of bass when the microphone is far from the source. Performers who like proximity effect often vary the distance between the microphone and the instrument (or voice) during a performance to create effects and to change the level of the amplified sound. This process is called “working the mike”.
  • method of distinguishing sound sources includes transforming data, collected by at least two transducers which each react to a characteristic of an acoustic wave, into signals for each transducer location.
  • the transducers are separated by a distance of less than about 70mm or greater than about 90mm.
  • the signals are separated into a plurality of frequency bands for each transducer location. For each band a relationship of the magnitudes of the signals for the transducer locations is compared with a first threshold value. A relative gain change is caused between those frequency bands whose magnitude relationship falls on one side of the threshold value and those frequency bands whose magnitude relationship falls on the other side of the threshold value.
  • sound sources are discriminated from each other based on their distance from the transducers.
  • Further features of the invention include (a) using a fast Fourier transform to convert the signals from a time domain to a frequency domain, (b) comparing a magnitude of a ratio of the signals, (c) causing those frequency bands whose magnitude comparison falls on one side of the threshold value to receive a gain of about 1, (d) causing those frequency bands whose magnitude comparison falls on the other side of the threshold value to receive a gain of about 0, (e) that each transducer is an omni-directional microphone, (f) converting the frequency bands into output signals, (g) using the output signals to drive one or more acoustic drivers to produce sound, (h) providing a user-variable threshold value such that a user can adjust a distance sensitivity from the transducers, or (i) that the characteristic is a local sound pressure, its first-order gradient, higher-order gradients, and/or combinations thereof.
  • Another feature involves providing a second threshold value different from the first threshold value.
  • the causing step causes a relative gain change between those frequency bands whose magnitude comparison falls in a first range between the threshold values and those frequency bands whose magnitude comparison falls outside the threshold values.
  • a still further feature involves providing third and fourth threshold values that define a second range that is different from and does not overlap the first range.
  • the causing step causes a relative gain change between those frequency bands whose magnitude comparison falls in the first or second ranges and those frequency bands whose magnitude comparison falls outside the first and second ranges.
  • Further features call for (a) the transducers to be separated by a distance of no less than about 250 microns, (b) the transducers to be separated by a distance of between about 20mm to about 50mm, (c) the transducers to be separated by a distance of between about 25mm to about 45mm, (d) the transducers to be separated by a distance of about 35mm, and/or (e) the distance between the transducers to be measured from a center of a diaphragm for each transducer.
  • the causing step fades the relative gain change between a low gain and a high gain
  • the fade of the relative gain change is done across the first threshold value
  • the fade of the relative gain change is done across a certain magnitude level for an output signal of one or more of the transducers
  • the causing of a relative gain change is effected by (1) a gain term based on the magnitude relationship and (2) a gain term based on a magnitude of an output signal from one or more of the transducers.
  • Still further features include that (a) a group of gain terms derived for a first group of frequency bands is also applied to a second group of frequency bands, (b) the frequency bands of the first group are lower than the frequency bands of the second group, (c) the group of gain terms derived for the first group of frequency bands is also applied to a third group of frequency bands, and/or (d) the frequency bands of the first group are lower than the frequency bands of the third group.
  • Additional features call for (a) the acoustic wave to be traveling in a compressible fluid, (b) the compressible fluid to be air, (c) the acoustic wave to be traveling in a substantially incompressible fluid (d) the substantially incompressible fluid to be water, (e) the causing step to cause a relative gain change to the signals from only one of the two transducers, (f) a particular frequency band to have a limit in how quickly a gain for that frequency band can change, and/or (g) there to be a first limit for how quickly the gain can increase and a second limit for how quickly the gain can decrease, the first limit and second limit being different.
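Feature (g) above, separate limits on how quickly a band's gain may rise and fall, can be illustrated with a small sketch. The function name `slew_limit`, the per-block step sizes, and the example values are assumptions made for illustration; the source does not give numeric limits or say in what units they are expressed.

```python
def slew_limit(prev_gain, target_gain, max_up, max_down):
    """Move from prev_gain toward target_gain, limiting the step per block.

    max_up   -- largest allowed increase per block (assumed value, > 0)
    max_down -- largest allowed decrease per block (assumed value, > 0)
    """
    delta = target_gain - prev_gain
    if delta > max_up:
        return prev_gain + max_up
    if delta < -max_down:
        return prev_gain - max_down
    return target_gain


def limit_sequence(targets, start_gain, max_up, max_down):
    """Apply the limiter block by block to a sequence of target gains for one band."""
    gains = []
    gain = start_gain
    for target in targets:
        gain = slew_limit(gain, target, max_up, max_down)
        gains.append(gain)
    return gains


# One frequency band whose raw decision flips between accept (1) and reject (0).
print(limit_sequence([1.0, 1.0, 1.0, 0.0, 0.0], 0.0, max_up=0.25, max_down=0.5))
```

With these hypothetical limits the gain rises more slowly than it falls; swapping the two values gives the opposite behavior.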
  • a method of discriminating between sound sources includes transforming data, collected by transducers which react to a characteristic of an acoustic wave, into signals for each transducer location.
  • the signals are separated into a plurality of frequency bands for each location.
  • For each band a relationship of the magnitudes of the signals for the locations is determined.
  • For each band a time delay is determined from the signals between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer.
  • a relative gain change is caused between those frequency bands whose magnitude relationship and time delay fall on one side of respective threshold values for magnitude relationship and time delay, and those frequency bands whose (a) magnitude relationship falls on the other side of its threshold value, (b) time delay falls on the other side of its threshold value, or (c) magnitude relationship and time delay both fall on the other side of their respective threshold values.
  • Further features include (a) providing an adjustable threshold value for the magnitude relationship, (b) providing an adjustable threshold value for the time delay, (c) fading the relative gain change across the magnitude relationship threshold, (d) fading the relative gain change across the time delay threshold, (e) that causing of a relative gain change is effected by (1) a gain term based on the magnitude relationship and (2) a gain term based on the time delay, (f) that the causing of a relative gain change is further effected by a gain term based on a magnitude of an output signal from one or more of the transducers, and/or (g) that for each frequency band there is an assigned threshold value for magnitude relationship and an assigned threshold value for time delay.
  • a still further aspect involves a method of distinguishing sound sources.
  • Data collected by at least three omni-directional microphones which each react to a characteristic of an acoustic wave is captured.
  • the data is processed to determine (1) which data represents one or more sound sources located less than a certain distance from the microphones, and (2) which data represents one or more sound sources located more than the certain distance from the microphones.
  • the results of the processing step are utilized to provide a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above.
  • sound sources are discriminated from each other based on their distance from the microphones.
  • Additional features include that (a) the utilizing step provides a greater emphasis of data representing the sound source(s) in (1) over data representing the sound source(s) in (2), (b) after the utilizing step the data is converted into output signals, (c) a first microphone is a first distance from a second microphone and a second distance from a third microphone, the first distance being less than the second distance, (d) the processing step selects high frequencies from the second microphone and low frequencies from the third microphone which are lower than the high frequencies, (e) the low frequencies and high frequencies are combined in the processing step, and/or (f) the processing step determines (1) a phase relationship from the data from microphones one and two, and (2) determines a magnitude relationship from the data from microphones one and three.
  • a personal communication device includes two transducers which react to a characteristic of an acoustic wave to capture data representative of the characteristic.
  • the transducers are separated by a distance of about 70mm or less.
  • a signal processor for processing the data determines (1) which data represents one or more sound sources located less than a certain distance from the transducers, and (2) which data represents one or more sound sources located more than the certain distance from the transducers.
  • the signal processor provides a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above. As such, sound sources are discriminated from each other based on their distance from the transducers.
  • Further features call for (a) the signal processor to convert the data into output signals, (b) the output signals to be used to drive a second acoustic driver remote from the device to produce sound remote from the device, (c) the transducers to be separated by a distance of no less than about 250 microns, (d) the device to be a cell phone, and/or (e) the device to be a speaker phone.
  • a still further aspect calls for a microphone system having a silicon chip and two transducers secured to the chip which react to a characteristic of an acoustic wave to capture data representative of the characteristic.
  • the transducers are separated by a distance of about 70mm or less.
  • a signal processor is secured to the chip for processing the data to determine (1) which data represents one or more sound sources located less than a certain distance from the transducers, and (2) which data represents one or more sound sources located more than the certain distance from the transducers.
  • the signal processor provides a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above, such that sound sources are discriminated from each other based on their distance from the transducers.
  • Another aspect calls for a method of discriminating between sound sources.
  • Data collected by transducers which react to a characteristic of an acoustic wave is transformed into signals for each transducer location.
  • the signals are separated into a plurality of frequency bands for each location.
  • a relationship of the magnitudes of the signals is determined for each band for the locations.
  • For each band a phase shift is determined from the signals which is indicative of when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer.
  • a relative gain change is caused between those frequency bands whose magnitude relationship and phase shift fall on one side of respective threshold values for magnitude relationship and phase shift, and those frequency bands whose (1) magnitude relationship falls on the other side of its threshold value, (2) phase shift falls on the other side of its threshold value, or (3) magnitude relationship and phase shift both fall on the other side of their respective threshold values.
  • a method of discriminating between sound sources includes transforming data, collected by transducers which react to a characteristic of an acoustic wave, into signals for each transducer location.
  • the signals are separated into a plurality of frequency bands for each location. For each band a relationship of the magnitudes of the signals is determined for the locations.
  • a relative gain change is caused between those frequency bands whose magnitude relationship falls on one side of a threshold value, and those frequency bands whose magnitude relationship falls on the other side of the threshold value.
  • the gain change is faded across the threshold value to avoid abrupt gain changes at or near the threshold.
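One way to picture fading the gain across a threshold is a ramp centered on the threshold instead of a hard switch. The sketch below is only an illustration: the linear ramp shape, the `fade_width_db` parameter, and the default gains of 0 and 1 are assumptions, since the source says only that the change is faded to avoid abrupt gain changes.

```python
def faded_gain(mag_db, threshold_db, fade_width_db=1.0, low_gain=0.0, high_gain=1.0):
    """Gain for one frequency band, ramped linearly across the threshold.

    Below (threshold - width/2) the band gets low_gain, above
    (threshold + width/2) it gets high_gain, and in between the gain is
    interpolated. The ramp shape and width are illustrative assumptions.
    """
    if fade_width_db <= 0:
        raise ValueError("fade_width_db must be positive")
    lower_edge = threshold_db - fade_width_db / 2.0
    upper_edge = threshold_db + fade_width_db / 2.0
    if mag_db <= lower_edge:
        return low_gain
    if mag_db >= upper_edge:
        return high_gain
    fraction = (mag_db - lower_edge) / fade_width_db
    return low_gain + fraction * (high_gain - low_gain)


for mag_db in (1.0, 1.75, 2.0, 2.25, 3.0):
    print(mag_db, faded_gain(mag_db, threshold_db=2.0))
```

A band whose magnitude relationship hovers near the threshold then receives an intermediate gain rather than toggling between the two extremes.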
  • Another feature calls for determining from the signals a time delay for each band between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer.
  • a relative gain change is caused between those frequency bands whose magnitude relationship and time delay fall on one side of respective threshold values for magnitude relationship and time delay, and those frequency bands whose (1) magnitude relationship falls on the other side of its threshold value, (2) time delay falls on the other side of its threshold value, or (3) magnitude relationship and time delay both fall on the other side of their respective threshold values.
  • the gain change is faded across the threshold value to avoid abrupt gain changes at or near the threshold.
  • Data collected by transducers which react to a characteristic of an acoustic wave, is transformed into signals for each transducer location.
  • the signals are separated into a plurality of frequency bands for each location. Characteristics of the signals are determined for each band which are indicative of a distance and angle to the transducers of a sound source providing energy to a particular band.
  • a relative gain change is caused between those frequency bands whose signal characteristics indicate that a sound source providing energy to a particular band meets distance and angle requirements, and those frequency bands whose signal characteristics indicate that a sound source providing energy to a particular band (a) does not meet a distance requirement, (b) does not meet an angle requirement, or (c) does not meet distance and angle requirements.
  • the characteristics include (a) a phase shift which is indicative of when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer, and/or (b) a time delay between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer, whereby an angle to the transducers of a sound source providing energy to a particular band is indicated.
  • An additional feature calls for the output signals to be (a) recorded on a storage medium, (b) communicated by a transmitter, and/or (c) further processed and used to present information on location of sound sources.
  • a further aspect of the invention calls for a method of distinguishing sound sources.
  • Data collected by four transducers which each react to a characteristic of an acoustic wave is transformed into signals for each transducer location.
  • the signals are separated into a plurality of frequency bands for each transducer location.
  • For each band a relationship of the magnitudes of the signals for at least two different pairs of the transducers is compared with a threshold value.
  • a determination is made for each transducer pair whether the magnitude relationship falls on one side or the other side of the threshold value.
  • the results of each determination are utilized to decide whether an overall magnitude relationship falls on one side or the other side of the threshold value.
  • a relative gain change is caused between those frequency bands whose overall magnitude relationship falls on one side of the threshold value and those frequency bands whose overall magnitude relationship falls on the other side of the threshold value, such that sound sources are discriminated from each other based on their distance from the transducers.
  • a sound distinguishing system is switched to a training mode.
  • a sound source is moved to a plurality of locations within a sound source accept region such that the sound distinguishing system can determine a plurality of thresholds for a plurality of frequency bins.
  • the sound distinguishing system is switched to an operating mode. The sound distinguishing system uses the thresholds to provide a relative emphasis to sound sources located in the sound source accept region over sound sources located outside the sound source accept region.
  • Another feature requires that two of the microphones be connected by an imaginary straight line that extends in either direction to infinity. The third microphone is located away from this line. One more feature calls for comparing a relationship of the magnitudes of the signals for six unique pairs of the transducers with a threshold value.
  • FIG. 1 is a schematic diagram of a sound source in a first position relative to an acoustic pick-up device
  • FIG. 2 is a schematic diagram of the sound source in a second position relative to the acoustic pick-up device
  • FIG. 3 is a schematic diagram of the sound source in a third position relative to the acoustic pick-up device
  • FIG. 4 is a schematic diagram of the sound source in a fourth position relative to the acoustic pick-up device
  • Fig. 5 is a cross-section of a silicon chip with a microphone array
  • FIG. 7 is a schematic diagram of a first embodiment of a microphone system
  • Fig. 8 is a plot of the output of a conventional microphone and the microphone system of Fig. 7 versus distance
  • Fig. 9 is a polar plot of the output of a cardioid microphone and the microphone system of Fig. 7 versus angle
  • Figs 10a and 10b are schematic drawings of transducers being exposed to acoustic waves from different directions
  • Fig. 11 is a plot of lines of constant magnitude difference (in dB) for a relatively widely spaced pair of transducers;
  • Fig. 12 is a plot of lines of constant magnitude difference (in dB) for a relatively narrowly spaced pair of transducers;
  • Fig. 13 is a schematic diagram of a second embodiment of a microphone system;
  • Fig. 14 is a schematic diagram of a third embodiment of a microphone system;
  • Figs. 15a and b are plots of gain versus frequency;
  • Fig. 16A is a schematic diagram of a fourth embodiment of a microphone system;
  • Fig. 16B is a schematic diagram of another portion of the fourth embodiment;
  • Figs. 16C-E are graphs of gain terms used in the fourth embodiment;
  • Fig. 17A is a perspective view of an earphone with integrated microphone;
  • Fig. 17B is a front view of a cell phone with integrated microphone;
  • Figs. 18A and B are plots of frequency versus threshold for magnitude and time delay;
  • Fig. 19 is a graph demonstrating slew rate limiting
  • Fig. 20 is a side schematic diagram of a fifth embodiment of a microphone system
  • Fig. 21 is a top schematic diagram of a sixth embodiment of a microphone system.
  • a microphone system with an unusual set of directional properties is desired.
  • a new microphone system having these properties is disclosed that avoids many of the typical problems of directional microphones while offering improved performance.
  • This new microphone system uses the pressures measured by two or more spaced microphone elements (transducers) to cause a relative positive gain for the signals from sound sources that fall within a certain acceptance window of distance and angle relative to the microphone system compared to the gain for the signals from all other sound sources.
  • an acoustic pick-up device 10 includes front and rear transducers 12 and 14.
  • the transducers collect data at their respective locations by reacting to a characteristic of an acoustic wave such as local sound pressure, the first order sound pressure gradient, higher-order sound pressure gradients, or combinations thereof.
  • Each transducer in this embodiment can be a conventional omni-directional sound pressure responding microphone, and the transducers are arranged in a linear array.
  • the transducers each transform the instantaneous sound pressure present at their respective location into electrical signals which represent the sound pressure over time at those locations.
  • Sound source 15 could also be, for example, a singer or the output of a musical instrument.
  • the distance from sound source 15 to front transducer 12 is R, and the angle between the acoustic pick-up device 10 and the source is θ.
  • Transducers 12, 14 are separated by a distance r_t. From the electrical signals discussed above, knowing r_t, and comparing aspects of the signals with thresholds, it can be determined whether or not to accept sounds from sound source 15.
  • the time difference between when a sound pressure wave reaches transducer 12 and when the wave reaches transducer 14 is τ.
  • the symbol c is the speed of sound. Accordingly, a first equation which includes the unknown θ is as follows:
  • An example is provided in Figure 2.
  • sound source 15 emits spherical waves.
  • the sound pressure magnitude drops as a function of 1/R from source 15 to transducer 12 and 1/(R + r_t) from source 15 to transducer 14.
  • the distance r_t is preferably measured from the center of a diaphragm for each of transducers 12 and 14.
  • Distance r_t is preferably smaller than a wavelength for the highest frequency of interest. However, r_t should not be too small as the magnitude ratios as a function of distance will be small and thus more difficult to measure.
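The trade-off in transducer spacing can be checked numerically from the 1/R and 1/(R + r_t) spreading described above. The sketch below covers only the on-axis case (source in line with both transducers, as the 1/(R + r_t) term implies); the function name and the sample distances are assumptions chosen for illustration.

```python
import math


def on_axis_ratio_db(source_distance_m, spacing_m):
    """Front-to-rear magnitude ratio in dB for an on-axis spherical source.

    With M1 proportional to 1/R and M2 proportional to 1/(R + r_t),
    the ratio M1/M2 equals (R + r_t)/R.
    """
    return 20.0 * math.log10((source_distance_m + spacing_m) / source_distance_m)


# Compare a very small spacing with the 35 mm spacing mentioned in the text.
for spacing_m in (0.005, 0.035):
    for distance_m in (0.05, 0.5, 2.0):
        ratio_db = on_axis_ratio_db(distance_m, spacing_m)
        print(f"r_t = {spacing_m * 1000:.0f} mm, R = {distance_m:.2f} m: {ratio_db:.2f} dB")
```

Under these assumptions the ratio shrinks toward 0 dB as the source moves away, and a smaller spacing yields a smaller ratio at every distance, which is why very small spacings make the magnitude relationship harder to measure.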
  • distance r_t in one example is preferably about 70 millimeters (mm) or less. At about 70mm the system is best suited for acoustic environments consisting primarily of human speech and similar signals.
  • distance r_t is between about 20mm to about 50mm. More preferably distance r_t is between about 25mm to about 45mm. Most preferably distance r_t is about 35mm.
  • the description has been inherently done in an environment of a compressible fluid (e.g. air). It should be noted that this invention will also be effective in an environment of an incompressible fluid (e.g. water or salt water).
  • the transducer spacing can be about 90mm or greater. If it is only desired to measure low or extremely low frequencies, the transducer spacing can get quite large. For example, assuming the speed of sound in water is 1500 meters/second and the highest frequency of interest is 100 Hz, then the transducers can be spaced 15 meters apart.
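The 15 meter figure follows from the wavelength at the highest frequency of interest, wavelength = speed of sound / frequency. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def wavelength_m(speed_of_sound_m_s, frequency_hz):
    """Wavelength in meters at the given frequency; used here as the spacing bound."""
    return speed_of_sound_m_s / frequency_hz


# Water example from the text: c = 1500 m/s, highest frequency of interest 100 Hz.
print(wavelength_m(1500.0, 100.0))  # 15.0
```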
  • a cross-section of a silicon chip 35 discloses a Micro-Electro-Mechanical Systems (MEMS) microphone array 37.
  • Array 37 includes a pair of acoustic transducers 34, 41 which are spaced a distance r_t of at least about 250 microns from each other.
  • Optional ports 43, 45 increase an effective distance d_t at which transducers 34, 41 "hear" their environment.
  • Chip 35 also includes the associated signal processing apparatus (not shown in Figure 5) which are connected to transducers 34, 41.
  • An advantage of a MEMS microphone array is that some or all of the desired signal processing (discussed below), for example signal conditioning, A/D conversion, windowing, transformation, and D/A conversion, can be placed on the same chip. This provides a very compact, unitary microphone system.
  • An example of a MEMS microphone array is the AKU2001 Tri-State Digital Output CMOS MEMS Microphone available from Akustica, Inc. 2835 East Carson Street, Suite 301, Pittsburgh, PA 15203
  • In Fig. 6a, a theoretical plot is provided of magnitude difference and time delay difference (phase) of the signals present at the location of transducers 12, 14 due to sound output by sound source 15, as a function of source 15's location (angle and distance) relative to the location of audio device 10 (consisting of transducers 12 and 14).
  • the plot of Figs. 6a-c was calculated assuming the distance r_t between transducers 12, 14 is 35mm.
  • the equations in paragraph 39 above were used to computationally create this plot.
  • R and θ are set to known values and τ and M1/M2 are calculated.
  • the theoretical sound source angle θ and distance R are varied over a wide range to determine a range of τ and M1/M2.
  • a Y axis provides the sound source angle θ in degrees and an X axis provides the sound source distance in meters.
  • Lines 17 of constant magnitude difference in dB are plotted.
  • Lines 19 of constant time difference (microseconds) of the signals at the location of transducers 12, 14 are also plotted. More gradations can be provided if desired.
  • Figure 6B shows an embodiment where two thresholds are used for each of magnitude difference and time difference. Sound sources that cause a magnitude difference of 2 ≤ dB difference ≤ 3 and a time difference of 80 ≤ microseconds ≤ 100 are accepted. The acceptance window is identified by the hatched area 29. Sound sources that cause a magnitude difference and/or a time difference outside of acceptance window 29 are rejected.
  • Figure 6C shows an embodiment where two acceptance windows 31 and 33 are used. Sound sources that cause a magnitude difference of ≥ 3 dB and a time difference of 80 ≤ microseconds ≤ 100 are accepted. Sound sources that cause a magnitude difference of 2 ≤ dB difference ≤ 3 and a time difference of ≥ 100 microseconds are also accepted. Sound sources that cause a magnitude difference and/or a time difference outside of acceptance windows 31 and 33 are rejected. Any number of acceptance windows can be created by using appropriate thresholds for magnitude difference and time difference.
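Acceptance windows of this kind can be modeled as rectangles in the (magnitude difference, time difference) plane. The sketch below is an illustration only: the `Window` class, the inclusive bounds, and the second, open-ended window in the usage example are assumptions; only the 2 to 3 dB and 80 to 100 microsecond limits of window 29 are taken from the description of Figure 6B.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Rectangular acceptance window; bounds are treated as inclusive (an assumption)."""
    mag_lo_db: float
    mag_hi_db: float
    delay_lo_us: float
    delay_hi_us: float

    def contains(self, mag_db, delay_us):
        return (self.mag_lo_db <= mag_db <= self.mag_hi_db
                and self.delay_lo_us <= delay_us <= self.delay_hi_us)


def accepted(mag_db, delay_us, windows):
    """A source is accepted if it falls inside at least one window."""
    return any(window.contains(mag_db, delay_us) for window in windows)


# Window 29 of Figure 6B: 2 to 3 dB and 80 to 100 microseconds.
FIG_6B = [Window(2.0, 3.0, 80.0, 100.0)]

# Hypothetical two-window setup; math.inf models a one-sided threshold.
TWO_WINDOWS = FIG_6B + [Window(3.0, math.inf, 100.0, math.inf)]

print(accepted(2.5, 90.0, FIG_6B))        # inside window 29
print(accepted(1.0, 90.0, FIG_6B))        # magnitude difference too small
print(accepted(4.0, 150.0, TWO_WINDOWS))  # inside the hypothetical second window
```

Adding more `Window` entries to the list gives any number of acceptance regions without changing the test itself.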
  • Transducers 12, 14 are each preferably an omni-directional microphone element which can connect to other parts of the system via a wire or wirelessly.
  • the transducers in this embodiment have the center of their respective diaphragms separated by a distance of about 35mm. Some or all of the remaining elements in Figure 7 can be incorporated into the microphone, or they can be in one or more separate components.
  • the signals for each transducer pass through respective conventional pre-amplifiers 16 and 18 and a conventional analog-to-digital (A/D) converter 20.
  • a separate A/D converter is used to convert the signal output by each transducer.
  • a multiplexer can be used with a single A/D converter.
  • Amplifiers 16 and 18 can also provide DC power (i.e. phantom power) to respective transducers 12 and 14 if needed.
  • blocks of overlapping data are windowed at a block 22 (a separate windowing is done on the signal for each transducer).
  • the windowed data are transformed from the time domain into the frequency domain using a fast Fourier transform (FFT) at a block 24 (a separate FFT is done on the signal for each transducer).
  • Other types of transforms can be used to transform the windowed data from the time domain to the frequency domain.
  • a wavelet transform may be used instead of an FFT to obtain log spaced frequency bins.
  • a sampling frequency of 32000 samples/sec is used with each block containing 512 samples.
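The stated sampling rate and block size fix the frequency resolution of each transform. The arithmetic below is a sketch; the variable names are illustrative, and the count of unique bins assumes a real-valued input signal.

```python
SAMPLE_RATE_HZ = 32000  # samples per second, from the text
BLOCK_SIZE = 512        # samples per block, from the text

bin_spacing_hz = SAMPLE_RATE_HZ / BLOCK_SIZE           # 62.5 Hz between bin centers
block_duration_ms = 1000 * BLOCK_SIZE / SAMPLE_RATE_HZ  # 16.0 ms of audio per block
unique_bins = BLOCK_SIZE // 2 + 1                       # 257 bins from 0 Hz to 16 kHz


def bin_center_hz(bin_index):
    """Center frequency of a linearly spaced FFT bin."""
    return bin_index * bin_spacing_hz


print(bin_spacing_hz, block_duration_ms, unique_bins, bin_center_hz(16))
```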
  • the FFT is an algorithm for implementing the discrete Fourier transform (DFT) that speeds the computation.
  • the Fourier transform of a real signal yields a complex result.
  • the magnitude of a complex number X is defined as: $|X| = \sqrt{\mathrm{Re}(X)^2 + \mathrm{Im}(X)^2}$
  • the magnitude ratio of two complex values, X1 and X2, can be calculated in any of a number of ways.
  • the time delay between two complex values can be calculated in a number of ways.
  • the relationship is the ratio of the signal from front transducer 12 to the signal from rear transducer 14 which is calculated for each frequency bin on a block-by-block basis at a divider block 26.
  • the magnitude of this ratio (relationship) in dB is calculated at a block 28.
  • a time difference (delay) τ (tau) is calculated for each frequency bin on a block-by-block basis by first computing the phase at a block 30 and then dividing the phase by the center frequency of each frequency bin at a divider 32.
  • the time delay represents the lapsed time between when an acoustic wave is detected by transducer 12 and when this wave is detected by a transducer 14.
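  • The per-bin calculation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the phase is divided by the radian center frequency (2πf) so that the result is in seconds, and the function name and test values are hypothetical.

```python
import cmath
import math


def magnitude_db_and_delay(x_front: complex, x_rear: complex, freq_hz: float):
    """Return (magnitude difference in dB, time delay in seconds) for one bin.

    x_front and x_rear are the complex FFT values of the same bin for the
    front and rear transducers. Assumption: delay = phase / (2*pi*f).
    """
    if freq_hz <= 0:
        raise ValueError("the DC bin carries no usable phase delay")
    ratio = x_front / x_rear                   # divider (block 26 in the text)
    mag_db = 20.0 * math.log10(abs(ratio))     # magnitude of the ratio in dB
    phase = cmath.phase(ratio)                 # radians, in [-pi, pi]
    delay_s = phase / (2.0 * math.pi * freq_hz)
    return mag_db, delay_s


# Hypothetical bin at 1 kHz: rear signal at half amplitude, arriving 50 us later.
f = 1000.0
tau = 50e-6
x_front = 1 + 0j
x_rear = 0.5 * cmath.exp(-2j * math.pi * f * tau)
mag_db, delay_s = magnitude_db_and_delay(x_front, x_rear, f)
print(mag_db, delay_s)
```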
  • the calculated magnitude relationship and time differences (delay) for each frequency bin (band) are compared with threshold values at a block 34. For example, as described above in Figure 6A, if the magnitude difference is greater than or equal to 2dB and the time delay is greater than or equal to 100 microseconds, then we accept (emphasize) that frequency bin. If the magnitude difference is less than 2dB and/or the time delay is less than 100 microseconds, then we reject (deemphasize) that frequency bin.
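  • A minimal sketch of this comparison, using the example thresholds quoted above (2 dB and 100 microseconds). The function and constant names are illustrative only.

```python
MAG_THRESHOLD_DB = 2.0       # example threshold from the text
DELAY_THRESHOLD_US = 100.0   # example threshold from the text


def accept_bin(mag_diff_db: float, delay_us: float,
               mag_threshold_db: float = MAG_THRESHOLD_DB,
               delay_threshold_us: float = DELAY_THRESHOLD_US) -> bool:
    """Accept a frequency bin only if BOTH parameters meet their thresholds."""
    return mag_diff_db >= mag_threshold_db and delay_us >= delay_threshold_us


# Hypothetical (magnitude difference, delay) pairs for four bins of one block.
block = [(3.1, 101.0), (1.2, 101.0), (3.1, 40.0), (2.0, 100.0)]
decisions = [accept_bin(m, t) for m, t in block]
print(decisions)
```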
  • a user input 36 may be manipulated to vary the acceptance angle threshold(s) and a user input 38 may be manipulated to vary the distance threshold(s) as required by the user.
  • a small number of user presets are provided for different acceptance patterns which the user can select as needed. For example, the user would select between general categories such as narrow or wide for the angle setting and near or far for the distance setting.
  • a visual or other indication is given to the user to let her know the threshold settings for angle and distance. Accordingly, user-variable threshold values can be provided such that a user can adjust a distance selectivity and/or an angle selectivity from the transducers.
  • the user interface may represent this as changing the distance and/or angle thresholds, but in effect the user is adjusting the magnitude difference and/or the time difference thresholds.
  • a relatively high gain is calculated at a block 40, and when one or both of the parameters is outside the window, a relatively low gain is calculated.
  • the high gain is set at about 1 while the low gain is at about 0.
  • the high gain might be above 1 while the low gain is below the high gain.
  • a relative gain change is caused between those frequency bands whose parameter (magnitude and time delay) comparisons both fall on one side of their respective threshold values and those frequency bands where one or both parameter comparisons fall on the other side of their respective threshold values.
  • the gains are calculated for each frequency bin in each data block.
  • the calculated gain may be further manipulated in other ways known to those skilled in the art to minimize the artifacts generated by such gain change.
  • the minimum gain can be limited to some low value, rather than zero.
  • the gain in any frequency bin can be allowed to rise quickly but fall more slowly using a fast attack slow decay filter.
  • a limit is set on how much the gain is allowed to vary from one frequency bin to the next at any given time.
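  • Two of the artifact-reduction measures listed above, a non-zero minimum gain and a fast attack, slow decay filter, might be sketched as below. The one-pole smoothing form and the coefficient values are assumptions chosen for illustration; the source does not specify them.

```python
def smooth_gain(previous: float, target: float,
                attack: float = 0.8, decay: float = 0.1,
                floor: float = 0.05) -> float:
    """Move the applied gain of one bin toward the calculated gain.

    attack, decay and floor are hypothetical values. The gain moves quickly
    when rising (attack) and slowly when falling (decay), and the target is
    never allowed below the floor.
    """
    target = max(target, floor)                  # limit the minimum gain
    coeff = attack if target > previous else decay
    return previous + coeff * (target - previous)


# One bin over successive blocks: source appears, then disappears.
gain = 0.05
history = []
for calculated in [1.0, 1.0, 0.0, 0.0, 0.0]:
    gain = smooth_gain(gain, calculated)
    history.append(gain)
print(history)
```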
  • the calculated gain is applied to the frequency domain signal from a single transducer, for example transducer 12 (although transducer 14 could also be used), at a multiplier 42.
  • the modified signal is passed through an inverse FFT at a block 44 to transform the signal from the frequency domain back into the time domain.
  • the signal is then windowed, overlapped and summed with the previous blocks at a block 46.
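  • The block processing chain (window, process, window again, overlap and sum) can be illustrated with the sketch below. The source does not name the window or the overlap; a square-root Hann window with 50% overlap is assumed here because applying it twice and summing reconstructs the signal exactly when the per-bin processing leaves the block unchanged. The FFT, gain and inverse FFT steps are represented by a placeholder callable.

```python
import math


def sqrt_hann(n_samples: int) -> list:
    """Square root of a periodic Hann window (assumed window choice)."""
    return [math.sqrt(0.5 - 0.5 * math.cos(2 * math.pi * n / n_samples))
            for n in range(n_samples)]


def process_overlap_add(signal, block_size, process_block=lambda b: b):
    """Window overlapping blocks, process them, window again, overlap and sum."""
    hop = block_size // 2                       # 50% overlap (assumption)
    window = sqrt_hann(block_size)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - block_size + 1, hop):
        block = [signal[start + n] * window[n] for n in range(block_size)]
        block = process_block(block)            # FFT, gain, inverse FFT go here
        for n in range(block_size):
            out[start + n] += block[n] * window[n]
    return out


signal = [math.sin(0.1 * n) for n in range(64)]
out = process_overlap_add(signal, block_size=16)
# Samples covered by two overlapping blocks are reconstructed exactly.
max_error = max(abs(out[n] - signal[n]) for n in range(8, 56))
print(max_error)
```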
  • the signal is converted from a digital signal back to an analog (output) signal.
  • the output of block 48 is then sent to a conventional amplifier (not shown) and acoustic driver (i.e. speaker) (not shown) of a sound reinforcement system to produce sound.
  • an input signal (digital) to block 48 or an output signal (analog) from block 48 can be (a) recorded on a storage medium (e.g. electronic or magnetic), (b) communicated by a transmitter (wired or wirelessly), or (c) further processed and used to present information on location of sound sources.
  • the microphone system shown in Figure 7 has the same fall off with R (line segment 49), but only out to a specified distance, R0.
  • the fall off in microphone output at R0 is represented by a line segment 52.
  • R0 would typically be set to be approximately 30 cm.
  • the new microphone responds to the singer, located closer than R0, but rejects anything further away, such as sound from other instruments or loudspeakers.
  • with reference to FIG. 9, angle selectivity will be discussed.
  • Conventional microphones can have any of a variety of directional patterns.
  • a cardioid response, which is a common directional pattern for microphones, is shown in the polar plot line 54 (the radius of the curve indicates the relative microphone magnitude response to sound arriving at the indicated angle).
  • the cardioid microphone has the strongest magnitude response for sounds arriving at the front, with less and less response as the sound source moves to the rear. Sounds arriving from the rear are significantly attenuated.
  • a directional pattern for the microphone system of Figure 7 is shown by the pie shaped line 56.
  • For sounds arriving within the acceptance angle (in this example, ±30°), the microphone has a high response. Sounds arriving outside this angle are significantly attenuated.
  • the magnitude difference is a function of both distance and angle.
  • the maximum change in magnitude with distance occurs in line with the transducers.
  • the minimum change in magnitude with distance occurs in a line perpendicular to the axis of the transducers.
  • for sources 90° off axis there is no magnitude difference, regardless of the source distance.
  • Angle is a function of the time difference alone.
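  • These relationships can be checked with a simple free-field point-source model, sketched below. The model is an assumption made for illustration: sound level falls as 1/r, the speed of sound is taken as 343 m/s, and distance and angle are measured from the midpoint of the array rather than from transducer 12. The 35 mm spacing comes from the text.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed value
SPACING_M = 0.035            # transducer spacing from the text


def free_field_differences(distance_m: float, angle_deg: float,
                           spacing_m: float = SPACING_M):
    """Return (magnitude difference in dB, time difference in microseconds).

    angle_deg is measured from the array axis (0 = in line, front side).
    """
    theta = math.radians(angle_deg)
    half = spacing_m / 2
    # Law of cosines: path lengths to the front and rear transducers.
    r_front = math.sqrt(distance_m ** 2 + half ** 2
                        - 2 * distance_m * half * math.cos(theta))
    r_rear = math.sqrt(distance_m ** 2 + half ** 2
                       + 2 * distance_m * half * math.cos(theta))
    mag_db = 20 * math.log10(r_rear / r_front)   # 1/r level falloff
    delay_us = 1e6 * (r_rear - r_front) / SPEED_OF_SOUND_M_S
    return mag_db, delay_us


near_db, near_us = free_field_differences(0.10, 0.0)
far_db, far_us = free_field_differences(1.00, 0.0)
side_db, side_us = free_field_differences(0.10, 90.0)
print(near_db, far_db, side_db, near_us, far_us)
```

  In this model the on-axis time difference is the spacing divided by the speed of sound regardless of distance, while the magnitude difference shrinks with distance and vanishes at 90°.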
  • the transducer array should be oriented pointing towards the location of a sound source or sources we wish to select.
  • a microphone having this sort of extreme directionality will be much less susceptible to feedback than a conventional microphone for two reasons.
  • the new microphone largely rejects the sound of main or monitor loudspeakers that may be present, because they are too distant and outside the acceptance window.
  • the reduced sensitivity lowers the loop gain of the system, reducing the likelihood of feedback.
  • feedback is exacerbated by having several "open" microphones and speakers on stage. Whereas any one microphone and speaker might be stable and not create feedback, the combination of multiple cross coupled systems can more easily be unstable, causing feedback.
  • the new microphone system described herein is "open" only for a sound source within the acceptance window, making it less likely to contribute to feedback by coupling to another microphone and sound amplification system on stage, even if those other microphones and systems are completely conventional.
  • the new microphone system also greatly reduces the bleed through of sound from other performers or other instruments in a performing or recording application.
  • the acceptance window (both distance and angle) can be tailored by the performer or sound crew on the fly to meet the needs of the performance.
  • the new microphone system can simulate the sound of many different styles of microphones for performers who want that effect as part of their sound. For example, in one embodiment of the invention this system can simulate the proximity effect of conventional microphones by boosting the gain more at low frequencies than high frequencies for magnitude differences indicating small R values.
  • the output of transducer 12 alone is processed on a frequency bin basis to form an output signal.
  • Transducer 12 is typically an omni-directional pressure responding transducer, and it will not exhibit proximity effect as is present in a typical pressure gradient responding microphone.
  • Gain block 40 imposes a distance dependent gain function on the output of transducer 12, but the function described so far either passes or blocks a frequency bin depending on distance/angle from the microphone system.
  • a more complex function can be applied in gain processing block 40, to simulate proximity effect of a pressure gradient microphone, while maintaining the distance/angle selectivity of the system as described.
  • a variable coefficient can be used, where the coefficient value varies as a function of frequency and distance.
  • This function has a first order high pass filter shape, where the corner frequency decreases as distance decreases.
  • Proximity effect can also be caused by combining transducers 12, 14 into a single uni-directional or bidirectional microphone, thereby creating a fixed directional array.
  • the calculated gain is applied to the combined signal from transducers 12, 14, providing pressure gradient type directional behavior (not adjustable by the user) , in addition to the enhanced selectivity of the processing of Fig. 7.
  • the new microphone system does not boost the gain more at low frequencies than high frequencies for magnitude differences indicating small R values and so does not display proximity effect.
  • the new microphone can create new microphone effects.
  • One example is a microphone having the same output for all sound source distances within the acceptance window. Using the magnitude difference and time delay between the transducers 12 and 14, the gain is adjusted to compensate for the 1/R falloff from transducer 12.
  • a sound source of constant level would cause the same output magnitude for any distance from the transducers within the acceptance window.
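  • One way such compensation could work is sketched below, under strong simplifying assumptions that are not stated in the source: the source is on the array axis in a free field, so the magnitude ratio between the transducers equals (R + d)/R, and R can be recovered from the magnitude difference alone. The function names and the 0.1 m reference distance are hypothetical.

```python
import math

SPACING_M = 0.035   # transducer spacing from the text


def estimate_on_axis_distance(mag_diff_db: float,
                              spacing_m: float = SPACING_M) -> float:
    """Distance R to the front transducer, assuming an on-axis free-field source."""
    ratio = 10 ** (mag_diff_db / 20)      # equals (R + d) / R under the assumption
    if ratio <= 1.0:
        raise ValueError("no magnitude difference: distance cannot be estimated")
    return spacing_m / (ratio - 1.0)


def compensation_gain(mag_diff_db: float, reference_m: float = 0.1) -> float:
    """Gain that cancels the 1/R falloff relative to a reference distance."""
    return estimate_on_axis_distance(mag_diff_db) / reference_m


# A constant-level source at several distances gives the same compensated output.
outputs = []
for r in (0.05, 0.10, 0.20):
    level_at_front = 1.0 / r                               # 1/R falloff
    mag_db = 20 * math.log10((r + SPACING_M) / r)          # what the array measures
    outputs.append(level_at_front * compensation_gain(mag_db))
print(outputs)
```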
  • This feature can be useful in a public address (PA) system. Inexperienced presenters generally are not careful about maintaining a constant distance from the microphone. With a conventional PA system, their reproduced voice can vary between being too loud and too soft.
  • the improved microphone described herein keeps the voice level constant, independent of the distance between the speaker and the microphone. As a result, variations in the reproduced voice level for an inexperienced speaker are reduced.
  • the new microphone can be used to replace microphones for communications purposes, such as a microphone for a cell phone for consumers (in a headset or otherwise), or a boom microphone for pilots.
  • These personal communication devices typically have a microphone which is intended to be located about 1 foot or less from a user's lips. Rather than using a boom to place a conventional noise canceling microphone close to the user's lips, a pair of small microphones mounted on the headset could use the angle and/or distance thresholds to accept only those sounds having the correct distance and/or angle (e.g. the user's lips). Other sounds would be rejected.
  • the acceptance window is centered around the anticipated location of the user's mouth.
  • This microphone can also be used for other voice input systems where the location of the talker is known (e.g. in a car).
  • Some examples include hands free telephony applications, such as hands free operation in a vehicle, and hands free voice command, such as with vehicle systems employing speech recognition capabilities to accept voice input from a user to control vehicle functions.
  • Another example is using the microphone in a speakerphone which can be used, for example, in tele-conferencing.
  • These types of personal communication devices typically have a microphone which is intended to be located more than 1 foot from a user's lips.
  • the new microphone technology of this application can also be used in combination with speech recognition software.
  • the signals from the microphone are passed to the speech recognition algorithm in the frequency domain. Frequency bins that are outside the accept region for sound sources are given a lower weighting than frequency bins that are in the accept region. Such an arrangement can help the speech recognition software to process a desired speaker's voice in a noisy environment.
  • phase measurement produces results in the range between -π and π.
  • there is an uncertainty in the measurement having a value that is an integral multiple of 2π.
  • a measurement of 0 radians of phase difference could just as easily represent a phase difference of 2π or -2π.
  • Figure 11 shows lines of constant magnitude difference (in dB) between transducers 12, 14 for various distances and angles between the acoustic source and transducer 12 when the transducers 12, 14 have a relatively wide spacing between themselves (about 35mm).
  • Figure 12 shows lines of constant magnitude difference (in dB) between the transducers 12, 14 for various distances and angles to the acoustic source with a much narrower transducer spacing (about 7mm). With narrower transducer spacing the magnitude difference is greatly reduced and it is harder to get an accurate distance estimate.
  • This problem can be avoided by using two pairs of transducer elements: a widely spaced pair for low frequency estimates of source distance and angle, and a narrowly spaced pair for high frequency estimates of distance and angle.
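  • The trade-off between spacing and usable frequency range can be made concrete with a rough calculation. The sketch below assumes a speed of sound of 343 m/s and treats the phase as unambiguous only while the largest possible delay (spacing divided by the speed of sound) produces less than π radians of phase; both are assumptions for illustration. The 35 mm and 7 mm spacings come from the text.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed value


def max_unambiguous_freq_hz(spacing_m: float) -> float:
    """Highest frequency at which the inter-transducer phase stays within +/- pi.

    The largest delay is spacing / c, so the condition
    2 * pi * f * spacing / c < pi gives f < c / (2 * spacing).
    """
    return SPEED_OF_SOUND_M_S / (2.0 * spacing_m)


wide_limit_hz = max_unambiguous_freq_hz(0.035)     # widely spaced pair
narrow_limit_hz = max_unambiguous_freq_hz(0.007)   # narrowly spaced pair
print(wide_limit_hz, narrow_limit_hz)
```

  Under these assumptions the 35 mm pair is unambiguous up to roughly 4.9 kHz and the 7 mm pair up to roughly 24.5 kHz.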
  • only three transducer elements are used: widely spaced T1 and T2 for low frequencies and narrowly spaced T1 and T3 for high frequencies.
  • Each of the three signal streams receives standard block processing windowing at block 78 and is converted from the time domain to the frequency domain at FFT block 80.
  • High frequency bins above a pre-defined frequency from the signal of transducer 66 are selected out at block 82.
  • the pre-defined frequency is 4 kHz.
  • Low frequency bins at or below 4 kHz from the signal of transducer 68 are selected out at block 84.
  • the high frequency bins from block 82 are combined with the low frequency bins from block 84 at a block 86 in order to create a full complement of frequency bins. It should be noted that this band splitting can alternatively be done in the analog domain rather than the digital domain.
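  • The band selection and recombination of blocks 82, 84 and 86 amounts to choosing, bin by bin, which transducer's spectrum to use. A minimal sketch follows; the function name and the toy spectra are illustrative, and the 4 kHz crossover is the example value from the text.

```python
def combine_bands(low_source_bins, high_source_bins, bin_hz: float,
                  crossover_hz: float = 4000.0):
    """Take bins at or below the crossover from one spectrum, the rest from the other.

    low_source_bins:  spectrum of the widely spaced transducer (68 in the text)
    high_source_bins: spectrum of the narrowly spaced transducer (66 in the text)
    """
    return [low if k * bin_hz <= crossover_hz else high
            for k, (low, high) in enumerate(zip(low_source_bins, high_source_bins))]


# Toy spectra with 1 kHz bin spacing so the selection is easy to see.
from_68 = ["L0", "L1", "L2", "L3", "L4", "L5", "L6", "L7"]
from_66 = ["H0", "H1", "H2", "H3", "H4", "H5", "H6", "H7"]
combined = combine_bands(from_68, from_66, bin_hz=1000.0)
print(combined)
```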
  • the remainder of the signal processing is substantially the same as for the embodiment in Figure 7 and so will not be described in detail.
  • the ratio of the signal from transducer 64 and the combined low frequency and high frequency signals out of block 86 is calculated.
  • the quotient is processed as described with reference to Figure 7.
  • the calculated gain is applied to the signal from transducer 64, and the resulting signal is applied to standard inverse FFT, windowing, and overlap-and-sum blocks before being converted back to an analog signal by a digital-to-analog converter.
  • the analog signal is then sent to a conventional amplifier 88 and speaker 90 of a sound reinforcement system. This approach avoids the problem of the 2π uncertainty.
  • with reference to Figure 14, another embodiment will be described which avoids the problem of the 2π uncertainty.
  • the front end of this embodiment is substantially the same as in Figure 13 through FFT block 80.
  • the ratio of the signals from transducers (microphones) 64 and 68 (widely spaced) is calculated at divider 92 and the magnitude difference in dB is determined at block 94.
  • the ratio of the signals from transducers 64 and 66 (narrowly spaced) is calculated at divider 96 and the phase difference is determined at block 98.
  • the phase is divided by the center frequency of each frequency bin at a divider 100 to determine the time delay.
  • the remainder of the signal processing is substantially the same as in Figure 13.
  • the magnitude difference in dB is determined the same way as in that Figure.
  • the ratio of the signals from transducers 64 and 66 is calculated at a divider for low frequency bins (e.g. at or below 4 kHz) and the phase difference is determined.
  • the phase is divided by the center frequency of each low frequency bin to determine the time delay.
  • the ratio of the signals from transducers 64 and 68 is calculated at a divider for high frequency bins (e.g. above 4 kHz) and the phase difference is determined.
  • the phase is divided by the center frequency of each high frequency bin to determine the time delay.
  • One method of achieving this goal is to use the instantaneous gains predicted for the frequency bins located in the octave between 2.5 and 5 kHz, for example, and to apply those same gains to the frequency bins one and two octaves higher, that is, for the bins between 5 and 10 kHz, and the bins between 10 and 20 kHz.
  • This approach preserves any harmonic structure that may exist in the audio signal.
  • Other initial octaves, such as 2-4kHz, can be used as long as they are commensurate with transducer spacing.
  • the two calculated gains out of blocks 108 and 110, based on magnitude and time delay, are summed at a summer 116.
  • the reason for summing the gains will be described below.
  • the summed gain for frequencies below 5kHz is passed through at a block 118.
  • the gain for frequency bins between 2.5 and 5 kHz is selected out at a block 120 and remapped (applied) into the frequency bins for 5 to 10 kHz at a block 122 and for 10 to 20 kHz at a block 124 (as discussed above with respect to Figs. 15a and b).
  • the frequency bins for each of these three regions are combined at a block 126 to make a single full bandwidth complement of frequency bins.
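  • The octave remapping can be sketched as an index mapping: each bin above 5 kHz takes the gain of the bin one or two octaves below it. Halving the bin index (rounding down) is an assumption about how bins are paired; the source only states that the gains of the 2.5 to 5 kHz octave are reused. The function name and the toy bin spacing are illustrative.

```python
def remap_gains(gains, bin_hz: float, base_hi_hz: float = 5000.0):
    """Reuse the gains of the octave ending at base_hi_hz for all higher bins."""
    out = list(gains)                       # bins at or below 5 kHz pass through
    for k in range(len(gains)):
        src = k
        # Step down one octave at a time until the source bin is in range.
        while src * bin_hz > base_hi_hz:
            src //= 2
        out[k] = gains[src]
    return out


# Toy example: 1250 Hz bin spacing, bins 0..16 cover 0 to 20 kHz.
# Each gain is set to its own bin index so the mapping is visible.
gains = [float(k) for k in range(17)]
remapped = remap_gains(gains, bin_hz=1250.0)
print(remapped)
```

  Because a bin at frequency 2f takes the gain of the bin at f, harmonics of an accepted fundamental tend to be accepted together.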
  • the output "A" of block 126 is passed on to further signal processing described in Figure 16B. Good high frequency performance is allowed with two relatively widely spaced transducer elements.
  • the output of summer 134 is added at a summer 136 to the gain term "A" (from block 126 of Figure 16A) derived from the sum of the magnitude gain term and the time gain term.
  • the terms are summed at summers 134 and 136, rather than multiplied, to reduce the effects of errors in estimating the location of the source. If all four gain terms are high (i.e. 1) in a particular frequency bin, then that frequency is passed through with unity (1) gain. If any one of the gain terms falls (i.e. is less than 1) , the gain is merely reduced, rather than shutting down the gain of that frequency bin completely. The gain is reduced sufficiently so that the microphone performs its intended function of rejecting sources outside of the acceptance window in order to reduce feedback and bleed-through.
  • the gain reduction is not so large as to create audible artifacts should the estimate of one of the parameters be erroneous.
  • the gain in that frequency bin is turned down partially, rather than fully, making the effects of estimation errors significantly less audible.
  • the gain term output by summer 136, which has been calculated in dB, is converted to a linear gain at a block 138 and applied to the signal from transducer 12, as shown in Figure 7.
  • audible artifacts due to poor estimates of the source location are reduced.
  • non-linear blocks 108, 110, 128 and 130 will now be discussed with reference to Figures 16C-E.
  • This example assumes a spacing between the transducers 12 and 14 of about 35mm. The values provided below will change if the transducer spacing changes to something other than 35mm.
  • Each of blocks 108, 110, 128 and 130, rather than being only full-on or full-off (e.g. gain of 1 or 0), has a short transition region, which fades acoustic sources across a threshold as they pass in and out of the acceptance window.
  • Figure 16E shows that, regarding block 110, for time delays between 28-41 microseconds the output gain rises from 0 to 1.
  • Figure 16D shows that, regarding block 108, for magnitude differences between 2-3dB the output gain rises from 0 to 1. Below 2dB the gain is 0 and above 3dB the gain is 1.
  • Figure 16C shows a gain term that is applied by blocks 128 and 130. In this example, for signal levels below -60 dB a 0 gain is applied. For signal levels from -60 dB to -50 dB the gain increases from 0 to 1. For a transducer signal level above -50 dB the gain is 1.
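  • The three transition curves described for Figures 16C-E can be sketched as ramps between the quoted breakpoints. The source gives only the end points of each transition; the straight-line shape between them is an assumption, and the function names are illustrative.

```python
def ramp(x: float, lo: float, hi: float) -> float:
    """0 at or below lo, 1 at or above hi, linear in between (assumed shape)."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def time_delay_gain(delay_us: float) -> float:
    """Block 110: gain rises from 0 to 1 between 28 and 41 microseconds."""
    return ramp(delay_us, 28.0, 41.0)


def magnitude_gain(mag_diff_db: float) -> float:
    """Block 108: gain rises from 0 to 1 between 2 and 3 dB."""
    return ramp(mag_diff_db, 2.0, 3.0)


def level_gain(level_db: float) -> float:
    """Blocks 128 and 130: gain rises from 0 to 1 between -60 and -50 dB."""
    return ramp(level_db, -60.0, -50.0)


print(time_delay_gain(34.5), magnitude_gain(2.5), level_gain(-55.0))
```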
  • the microphone systems described above can be used in a cell phone or speaker phone.
  • a cell phone or speaker phone would also include an acoustic driver for transmitting sound to the user's ear.
  • the output of the signal processor would be used to drive a second acoustic driver at a remote location to produce sound (e.g. the second acoustic driver could be located in another cell phone or speaker phone 500 miles away).
  • This embodiment relates to a prior art boom microphone that is used to pick up the human voice with a microphone located at the end of a boom worn on the user's head.
  • Typical applications are communications microphones, such as those used by pilots, or sound reinforcement microphones used by some popular singers in concert. These microphones are normally used when one desires a hands-free microphone located close to the mouth in order to reduce the pickup of sounds from other sources.
  • the boom across the face can be unsightly and awkward.
  • Another application of a boom microphone is for a cell phone headset. These headsets have an earpiece worn on or in the user's ear, with a microphone boom suspended from the earpiece. This microphone may be located in front of a user's mouth or dangling from a cord, either of which can be annoying.
  • An earpiece using the new directional technology of this application is described with reference to Figure 17.
  • An earphone 150 includes an earpiece 152 which is inserted into the ear. Alternatively, the earpiece can be placed on or around the ear.
  • the earphone includes an internal speaker (not shown) for creating sound which passes through the ear piece.
  • a wire bundle 153 passes DC power from, for example, a cell phone clipped to a user's belt to the earphone 150. The wire bundle also passes audio information into the earphone 150 to be reproduced by the internal speaker.
  • wire bundle 153 is eliminated, the earpiece 152 includes a battery to supply electrical power, and information is passed to and from the earpiece 152 wirelessly.
  • a microphone 154 that includes two or three transducers (not shown) as described above.
  • the microphone 154 can be located separately from the earpiece anywhere in the vicinity of the head (e.g. on a headband of a headset).
  • the two transducers are aligned along a direction X so as to be aimed in the general direction of the user's mouth.
  • the transducers may be made using MEMS technology to provide a compact, light microphone 154.
  • the wire bundle 153 passes signals from the transducers back to the cell phone where signal processing described above is applied to these signals. This arrangement eliminates the need for a boom.
  • the earphone unit is smaller, lighter weight, and less unsightly.
  • the microphone can be made to respond preferentially to sound coming from the user's mouth, while rejecting sound from other sources (e.g. the speaker in the earphone 150).
  • the user gets the benefits of having a boom microphone without the need for the physical boom.
  • the general assumption was that of a substantially free field acoustic environment. However, near the head, the acoustic field from sources is modified by the head, and free-field conditions no longer hold. As a result, the acceptance thresholds are preferably changed from free field conditions.
  • the thresholds are a function of frequency.
  • a different threshold is used for every frequency bin for which the gain is calculated.
  • a small number of thresholds are applied to groups of frequency bins. These thresholds are determined empirically.
  • the magnitude and time delay differences in each frequency bin are continually recorded while a sound source radiating energy at all frequencies of interest is moved around the microphone. A high score is assigned to the magnitude and time difference pairs when the source is located in the desired acceptance zone and a low score when it is outside the acceptance zone.
  • multiple sound sources at various locations can be turned on and off by the controller doing the scoring and tabulating.
  • the thresholds for each frequency bin are calculated using the dB difference and time (or phase) difference as the independent variables, and the score as the dependent variable. This approach compensates for any difference in frequency response that may exist between the two microphone elements that make up any given unit.
  • the microphone learns what the appropriate thresholds are, given the intended use of the microphone, and the acoustical environment.
  • a user switches the system to a learning mode and moves a small sound source around in a region that the microphone should accept sound sources when operating.
  • the microphone system calculates the magnitude and time delay differences in all frequency bands during the training.
  • the system calculates the best fit of the data, using well known statistical methods and calculates a set of thresholds for each frequency bin or groups of frequency bins. This approach assists in attaining an increased number of correct decisions about sound source location made for sound sources located in a desired acceptance zone.
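  • The source leaves the fitting method open ("well known statistical methods"). As one simple possibility, offered only as an illustration, the sketch below picks for a single frequency bin and a single parameter the threshold that misclassifies the fewest scored training samples. The function name and sample data are hypothetical.

```python
def best_threshold(samples):
    """Pick the threshold t that minimizes errors for the rule 'value >= t accepts'.

    samples: list of (measured_value, in_acceptance_zone) pairs recorded for
    one frequency bin during training. Candidate thresholds are the observed
    values themselves; the lowest-error candidate found first is returned.
    """
    candidates = sorted({value for value, _ in samples})
    best_t, best_errors = None, None
    for t in candidates:
        errors = sum((value >= t) != in_zone for value, in_zone in samples)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


# Hypothetical magnitude differences (dB) for one bin: True means the training
# source was inside the desired acceptance zone when the value was recorded.
training = [(0.5, False), (1.0, False), (1.5, False),
            (2.5, True), (3.0, True), (4.0, True)]
threshold_db = best_threshold(training)
print(threshold_db)
```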
  • a sound source used for training could be a small loudspeaker playing a test signal that contains energy in all frequency bands of interest during the training period, either simultaneously, or sequentially. If the microphone is part of a live music system, the sound source can be one of the speakers used as a part of the live music reinforcement system. The sound source could also be a mechanical device that creates noise.
  • a musician can use their own voice or instrument as the training source. During a training period, the musician sings or plays their instrument, positioning the mouth or instrument in various locations within the acceptance zone. Again, the microphone system calculates magnitude and time delay differences in all frequency bands, but rejects any bands for which there is little energy. The thresholds are calculated using best fit approaches as before, and bands which have poor information are filled in by interpolation from nearby frequency bands.
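  • Filling in bands that carried too little energy during training can be sketched as linear interpolation between the nearest trained bands. Linear interpolation and the handling of the edges (copying the nearest known value) are assumptions; the source says only that such bands are filled in by interpolation from nearby frequency bands.

```python
def fill_by_interpolation(thresholds):
    """Replace None entries with values interpolated from neighboring bands."""
    known = [i for i, t in enumerate(thresholds) if t is not None]
    if not known:
        raise ValueError("no band has a trained threshold")
    filled = list(thresholds)
    for i, t in enumerate(thresholds):
        if t is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:                       # before the first trained band
            filled[i] = thresholds[right]
        elif right is None:                    # after the last trained band
            filled[i] = thresholds[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = thresholds[left] + frac * (thresholds[right] - thresholds[left])
    return filled


# Hypothetical per-band thresholds in dB; None marks bands with little energy.
trained = [None, 2.0, None, None, 5.0, None]
filled = fill_by_interpolation(trained)
print(filled)
```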
  • the user switches the microphone back to a normal operating mode, and it operates using the newly calculated thresholds. Further, once a microphone system is trained to be approximately correct, a check of the microphone training is done periodically throughout the course of a performance (or other use), using the music of the performance as a test signal.
  • Figure 17B discloses a cell phone 174 which incorporates two microphone elements as described herein. These two elements are located toward a bottom end 176 of the cell phone 174 and are aligned in a direction Y that extends perpendicular to the surface of the paper on which Figure 17B lies. Accordingly, the microphone elements are aimed in the general direction of the cell phone user's mouth.
  • QC2 Headset available from Bose Corporation.
  • This headset was placed on the head of a mannequin which simulates the human head, torso, and voice.
  • Test signals were played through the mannequin's mouth, and the magnitude and time differences between the two microphone elements were acquired and given a high score, since these signals represent the desired signal in a communications microphone.
  • test signals were played through another source which was moved to a number of locations around the mannequin's head. Magnitude and time differences were acquired and given a low score, since these represent undesired jammers.
  • a best fit algorithm was applied to the data in each frequency bin. The calculated magnitude and time delay thresholds for each bin are shown in the plots of Figures 18A and B.
  • these thresholds could be applied to each bin, as calculated. In order to save memory, it is possible to smooth these plots, and use a small number of thresholds on groups of frequency bins. Alternatively a function is fit to the smoothed curve and used to calculate the gains. These thresholds are applied in, for example, block 34 of Figure 7.
  • slew rate limiting is used in the signal processing. This embodiment is similar to the embodiment of Figure 7 except that slew rate limiting is used in block 40.
  • Slew rate limiting is a non-linear method for smoothing noisy signals. When applied to the embodiments described above, the method prevents the gain control signal (e.g. coming out of block 40 in Fig. 7) from changing too fast, which could cause audible artifacts. For each frequency bin, the gain control signal is not permitted to change more than a specified value from one block to the next. The value may be different for increasing gain than for decreasing gain. Thus, the gain actually applied to the audio signal (e.g. from transducer 12 in Fig. 7) from the output of the slew rate limiter (in block 40 of Fig. 7) may lag behind the calculated gain.
  • a dotted line 170 shows the calculated gain for a particular frequency bin plotted versus time.
  • a solid line 172 shows the slew rate limited gain that results after slew rate limiting is applied.
  • the gain is not permitted to rise faster than 100 dB/sec, and not permitted to fall faster than 200 dB/sec.
  • Selection of the slew rate is determined by competing factors.
  • the slew rate should be as fast as possible to maximize rejection of undesired acoustic sources. However, to minimize audible artifacts, the slew rate should be as slow as possible.
  • the gain can be slewed down more slowly than up based on psychoacoustic factors without problems.
  • the applied gain (which has been slew rate limited) lags behind the calculated gain because the calculated gain is rising faster than the threshold.
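  • Slew rate limiting of the gain control signal for one frequency bin can be sketched as below, using the example limits quoted above (100 dB/sec rising, 200 dB/sec falling). The time between successive blocks is not given in the source, so it is a parameter here, and the function name is illustrative.

```python
MAX_RISE_DB_PER_S = 100.0   # example limit from the text
MAX_FALL_DB_PER_S = 200.0   # example limit from the text


def slew_limit(calculated_db, block_period_s: float, initial_db=None):
    """Limit how far the gain (in dB) may move from one block to the next."""
    max_up = MAX_RISE_DB_PER_S * block_period_s
    max_down = MAX_FALL_DB_PER_S * block_period_s
    previous = calculated_db[0] if initial_db is None else initial_db
    limited = []
    for target in calculated_db:
        step = target - previous
        step = min(step, max_up)        # cap the rise
        step = max(step, -max_down)     # cap the fall
        previous = previous + step
        limited.append(previous)
    return limited


# Hypothetical 10 ms block period: at most +1 dB or -2 dB per block.
rising = slew_limit([-20.0, 0.0, 0.0, 0.0], block_period_s=0.01)
falling = slew_limit([0.0, -20.0, -20.0], block_period_s=0.01)
print(rising, falling)
```

  In the rising case the applied gain lags the calculated gain, as described for the solid line 172.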
  • Another example of using more than two transducers is to create multiple transducer pairs whose sound source distance and angle estimates can be compared.
  • the magnitude and phase relationships between the sound pressure measured at any two points due to a source can differ substantially from those same two points measured in a free field.
  • the magnitude and phase relationship at one frequency can fall within the acceptance window, even though the physical location of the sound source is outside the acceptance window.
  • the distance and angle estimate is faulty.
  • the distance and angle estimate for that same frequency made just a short distance away is likely to be correct.
  • a microphone system using multiple pairs of microphone elements can make multiple simultaneous estimates of sound source distance and angle for each frequency bin, and reject those estimates that do not agree with the estimates from the majority of other pairs.
  • a microphone system 180 includes four transducers 182, 184, 186 and 188 arranged in a linear array. The distance between each adjacent pair of transducers is substantially the same. This array has three pair of closely spaced transducers 182-184/184-186/186-188, two pair of moderately spaced transducers 182-186/184-188 and one pair of distantly spaced transducers 182-188.
  • the output signals for each of these six pairs of transducers is processed, for example, as described above with reference to Figure 7 (up to box 34) in a signal processor 190. An accept or reject decision is made for each pair for each frequency.
  • each transducer pair determines whether the magnitude relationship (e.g. ratio) falls on one side or the other side of a threshold value
  • the accept or reject decision for each pair can be weighted in a box 194 based on various criteria known to those skilled in the art. For example, the widely spaced transducer pair 182-188 can be given little weight at high frequencies.
  • the weighted accepts are combined and compared to the combined weighted rejects in a box 196 to make a final accept or reject decision for that frequency bin. In other words, it is decided whether an overall magnitude relationship falls on one side or the other side of the threshold value. Based on this decision, gain is determined at a box 198 and this gain is applied to the output signal of one of the transducers as in Figure 7.
  • a microphone system 200 includes four transducers 202, 204, 206 and 208 arranged at the vertices of an imaginary four-sided polygon.
  • the polygon is in the shape of a square, but the polygon can be in a shape other than a square (e.g. a rectangle, parallelogram, etc.).
  • more than four transducers can be used at the vertices of a five or more sided polygon.
  • This system has two forward facing pairs 202-206/204-208 facing a forward direction "A", two sideways facing pairs 202-204/206-208 facing sides B and C, and two diagonally facing pairs 204-206/202-208.
  • the output signals for each pair of transducers are processed in a box 210 and weighted in a box 212 as described in the previous paragraph.
  • a final accept or reject decision is made, as described above in a box 214, and a corresponding gain is selected for the frequency of interest at a box 216.
  • This example allows the microphone system 200 to determine sound source distance even for sound sources 90° off axis located, for example, at locations B and/or C.
  • more than four transducers can be used.
  • five transducers forming ten pairs of transducers can be used. In general, using more transducers results in a more accurate determination of sound source distance and angle.
  • one of the four transducers (e.g. omni-directional microphones) 202, 204, 206 and 208 is eliminated.
  • transducer 202 we will have transducers 204 and 208 which can be connected by an imaginary straight line that extends to infinity in either direction, and transducer 206 which is located away from this line.
  • Such an arrangement results in three pairs of transducers 204-208, 206-208 and 204-206 which can be used to determine sound source distance and angle.
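The weighting and combining of per-pair accept or reject decisions described above for microphone system 180 can be sketched as follows. This is a minimal illustration only: the function names, the weight values, the ordering of the pairs and the tie rule (a tie is treated as a reject) are assumptions and are not taken from the description.

```python
def combine_pair_decisions(decisions, weights):
    """Return True (accept) if the weighted accepts outweigh the weighted rejects.

    decisions: one bool per transducer pair (True = accept) for one frequency bin.
    weights:   one non-negative weight per pair (hypothetical values).
    A tie is treated as a reject; that rule is an assumption.
    """
    if len(decisions) != len(weights):
        raise ValueError("one weight per pair decision is required")
    accept = sum(w for d, w in zip(decisions, weights) if d)
    reject = sum(w for d, w in zip(decisions, weights) if not d)
    return accept > reject


def gain_for_bin(decisions, weights, accept_gain=1.0, reject_gain=0.0):
    """Select the gain for one frequency bin from the combined decision."""
    return accept_gain if combine_pair_decisions(decisions, weights) else reject_gain


# Six pairs of the four-element linear array, in an assumed order:
# 182-184, 184-186, 186-188, 182-186, 184-188, 182-188.
decisions = [True, True, False, True, False, False]
# The widely spaced pair 182-188 is given little weight, as it might be at a high frequency.
weights = [1.0, 1.0, 1.0, 0.5, 0.5, 0.1]
print(gain_for_bin(decisions, weights))
```

In this example the weighted accepts (2.5) exceed the weighted rejects (1.6), so the bin receives the accept gain.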

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)

Abstract

A method of distinguishing sound sources includes the step of transforming data, collected by at least two transducers which each react to a characteristic of an acoustic wave, into signals for each transducer location. The transducers are separated by a distance of less than about 70mm or greater than about 90mm. The signals are separated into a plurality of frequency bands for each transducer location. For each band a comparison is made of the relationship of the magnitudes of the signals for the transducer locations with a threshold value. A relative gain change is caused between those frequency bands whose magnitude relationship falls on one side of the threshold value and those frequency bands whose magnitude relationship falls on the other side of the threshold value. As such, sound sources are discriminated from each other based on their distance from the transducers.

Description

SOUND DISCRIMINATION METHOD AND APPARATUS
FIELD
[0001] The invention relates generally to the field of acoustics, and in particular to sound pick-up and reproduction. More specifically, the invention relates to a sound discrimination method and apparatus.
BACKGROUND
[0002] In a typical live music concert, multiple microphones (acoustic pick-up devices) are positioned close to each of the instruments and vocalists. The electrical signals from the microphones are mixed, amplified, and reproduced by loudspeakers so that the musicians can clearly be heard by the audience in a large performance space.
[0003] A problem with conventional microphones is that they respond not only to the desired instrument or voice, but also to other nearby instruments and/or voices. If, for example, the sound of the drum kit bleeds into the microphone of the lead singer, the reproduced sound is adversely affected. This problem also occurs when musicians are in a studio recording their music.
[0004] Conventional microphones also respond to the monitor loudspeakers used by the musicians onstage, and to the house loudspeakers that distribute the amplified sound to the audience. As a result, gains must be carefully monitored to avoid feedback, in which the music amplifying system breaks out in howling that spoils a performance. This is especially problematic in live amplified performances, since the amount of signal from the loudspeaker picked up by the microphone can vary wildly, depending on how musicians move about on stage, or how they move the microphones as they perform. An amplification system that has been carefully adjusted to be free from feedback during rehearsal may suddenly break out in howling during the performance simply because a musician has moved on stage.
[0005] One type of acoustic pick-up device is an omni directional microphone. An omni directional microphone is rarely used for live music because it tends to be more prone to feedback. More typically, conventional microphones having a directional acceptance pattern (e.g., a cardioid microphone) are used to reject off axis sounds output from other instruments or voices, or from speakers, thus reducing the tendency for the system to howl. However, these microphones have insufficient rejection to fully solve the problem.
[0006] Directional microphones generally have a frequency response that varies with the distance from the source. This is typical of pressure gradient responding microphones. This effect is called the "proximity effect", and it results in a bass boost when the microphone is close to the source and a loss of bass when the microphone is far from the source. Performers who like proximity effect often vary the distance between the microphone and the instrument (or voice) during a performance to create effects and to change the level of the amplified sound. This process is called "working the mike".
[0007] While some performers like proximity effect, other performers prefer that over the range of angles and distances that the microphone accepts sounds, the frequency response of the improved sound reproducing system should remain as uniform as possible. For these performers the timbre of the instrument should not change as the musician moves closer to or further from the microphone.
[0008] Cell phones, regular phones and speaker phones can have performance problems when there is a lot of background noise. In this situation the clarity of the desired speaker's voice is degraded or overwhelmed by this noise. It would be desirable for these phones to be able to discriminate between the desired speaker and the background noise. The phone would then provide a relative emphasis of the speaker's voice over the noise.
SUMMARY
[0009] The present invention is directed to overcoming one or more of the problems set forth above. Briefly summarized, according to one aspect of the present invention, a method of distinguishing sound sources includes transforming data, collected by at least two transducers which each react to a characteristic of an acoustic wave, into signals for each transducer location. The transducers are separated by a distance of less than about 70mm or greater than about 90mm. The signals are separated into a plurality of frequency bands for each transducer location. For each band a relationship of the magnitudes of the signals for the transducer locations is compared with a first threshold value. A relative gain change is caused between those frequency bands whose magnitude relationship falls on one side of the threshold value and those frequency bands whose magnitude relationship falls on the other side of the threshold value. As such, sound sources are discriminated from each other based on their distance from the transducers.
[00010] Further features of the invention include (a) using a fast Fourier transform to convert the signals from a time domain to a frequency domain, (b) comparing a magnitude of a ratio of the signals, (c) causing those frequency bands whose magnitude comparison falls on one side of the threshold value to receive a gain of about 1, (d) causing those frequency bands whose magnitude comparison falls on the other side of the threshold value to receive a gain of about 0, (e) that each transducer is an omni-directional microphone, (f) converting the frequency bands into output signals, (g) using the output signals to drive one or more acoustic drivers to produce sound, (h) providing a user-variable threshold value such that a user can adjust a distance sensitivity from the transducers, or (i) that the characteristic is a local sound pressure, its first-order gradient, higher-order gradients, and/or combinations thereof.
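The per-band comparison summarized above (transform to the frequency domain, compare the magnitude ratio of the two transducer signals with a threshold, then assign a gain of about 1 or about 0) might be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: a naive discrete Fourier transform stands in for the fast Fourier transform so that the example needs only the standard library, the ratio is expressed in dB, bands above the threshold are the ones accepted, and the noise floor value and all names are hypothetical.

```python
import cmath
import math


def dft(block):
    """Naive discrete Fourier transform (a stand-in for an FFT)."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def band_gains(front, rear, threshold_db, floor=1e-9):
    """Return a gain of 1.0 or 0.0 for every frequency bin.

    A bin is accepted when the front/rear level difference exceeds threshold_db.
    Bins with negligible energy are rejected (an assumption of this sketch).
    """
    gains = []
    for f_bin, r_bin in zip(dft(front), dft(rear)):
        m1, m2 = abs(f_bin), abs(r_bin)
        if m1 < floor or m2 < floor:
            gains.append(0.0)
            continue
        diff_db = 20.0 * math.log10(m1 / m2)
        gains.append(1.0 if diff_db > threshold_db else 0.0)
    return gains


n = 16
# Bin 2 carries a nearby source (rear level 6 dB lower than front);
# bin 5 carries a distant source (rear level almost equal to front).
front = [math.cos(2 * math.pi * 2 * i / n) + math.cos(2 * math.pi * 5 * i / n) for i in range(n)]
rear = [0.5 * math.cos(2 * math.pi * 2 * i / n) + 0.95 * math.cos(2 * math.pi * 5 * i / n) for i in range(n)]
gains = band_gains(front, rear, threshold_db=3.0)
print(gains[2], gains[5])
```

Raising or lowering `threshold_db` corresponds to the user-variable threshold of feature (h): a higher threshold accepts only sources closer to the transducers.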
[00011] Another feature involves providing a second threshold value different from the first threshold value. The causing step causes a relative gain change between those frequency bands whose magnitude comparison falls in a first range between the threshold values and those frequency bands whose magnitude comparison falls outside the threshold values.
[00012] A still further feature involves providing third and fourth threshold values that define a second range that is different from and does not overlap the first range. The causing step causes a relative gain change between those frequency bands whose magnitude comparison falls in the first or second ranges and those frequency bands whose magnitude comparison falls outside the first and second ranges.
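The use of two non-overlapping threshold ranges can be illustrated with a short sketch. The range limits below are hypothetical values in dB, and the choice of which side receives the higher gain is an assumption.

```python
def gain_from_ranges(value, ranges, inside_gain=1.0, outside_gain=0.0):
    """Return inside_gain when value lies in any (low, high) range, else outside_gain."""
    for low, high in ranges:
        if low <= value <= high:
            return inside_gain
    return outside_gain


# First range bounded by the first and second thresholds,
# second range bounded by the third and fourth thresholds (hypothetical values).
ranges = [(3.0, 9.0), (12.0, 15.0)]
print([gain_from_ranges(v, ranges) for v in (1.0, 6.0, 10.0, 13.0)])
```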
[00013] Additional features call for (a) the transducers to be separated by a distance of no less than about 250 microns, (b) the transducers to be separated by a distance of between about 20mm to about 50mm, (c) the transducers to be separated by a distance of between about 25mm to about 45mm, (d) the transducers to be separated by a distance of about 35mm, and/or (e) the distance between the transducers to be measured from a center of a diaphragm for each transducer.
[00014] Other features include that (a) the causing step fades the relative gain change between a low gain and a high gain, (b) the fade of the relative gain change is done across the first threshold value, (c) the fade of the relative gain change is done across a certain magnitude level for an output signal of one or more of the transducers, and/or (d) the causing of a relative gain change is effected by (1) a gain term based on the magnitude relationship and (2) a gain term based on a magnitude of an output signal from one or more of the transducers.
[00015] Still further features include that (a) a group of gain terms derived for a first group of frequency bands is also applied to a second group of frequency bands, (b) the frequency bands of the first group are lower than the frequency bands of the second group, (c) the group of gain terms derived for the first group of frequency bands is also applied to a third group of frequency bands, and/or (d) the frequency bands of the first group are lower than the frequency bands of the third group.
[00016] Additional features call for (a) the acoustic wave to be traveling in a compressible fluid, (b) the compressible fluid to be air, (c) the acoustic wave to be traveling in a substantially incompressible fluid, (d) the substantially incompressible fluid to be water, (e) the causing step to cause a relative gain change to the signals from only one of the two transducers, (f) a particular frequency band to have a limit in how quickly a gain for that frequency band can change, and/or (g) there to be a first limit for how quickly the gain can increase and a second limit for how quickly the gain can decrease, the first limit and second limit being different.
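Features (f) and (g) describe a limit on how quickly the gain for a frequency band can change, with different limits for increasing and decreasing gain. A minimal sketch of such a limiter for one frequency band is given below. The default limits of 100 dB/sec rising and 200 dB/sec falling are the example values mentioned elsewhere in this document; the function name, the block duration and the choice to start from the first calculated value are assumptions.

```python
def slew_limit(calculated_db, block_seconds, max_rise_db_per_s=100.0, max_fall_db_per_s=200.0):
    """Limit the block-to-block change of a gain track (in dB) for one frequency band.

    calculated_db: calculated gain per block, in dB.
    block_seconds: duration of one processing block (hypothetical value in the example).
    The applied gain starts at the first calculated value (an assumption).
    """
    if not calculated_db:
        return []
    max_up = max_rise_db_per_s * block_seconds
    max_down = max_fall_db_per_s * block_seconds
    applied = [calculated_db[0]]
    for target in calculated_db[1:]:
        previous = applied[-1]
        step = target - previous
        step = min(step, max_up)     # do not rise faster than the rise limit
        step = max(step, -max_down)  # do not fall faster than the fall limit
        applied.append(previous + step)
    return applied


# With 10 ms blocks the gain may rise by 1 dB and fall by 2 dB per block.
calculated = [-20.0, 0.0, 0.0, 0.0, -20.0]
print(slew_limit(calculated, block_seconds=0.01))
```

The applied gain lags behind the calculated gain whenever the calculated gain changes faster than the limit allows.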
[00017] According to another aspect, a method of discriminating between sound sources includes transforming data, collected by transducers which react to a characteristic of an acoustic wave, into signals for each transducer location. The signals are separated into a plurality of frequency bands for each location. For each band a relationship of the magnitudes of the signals for the locations is determined. For each band a time delay is determined from the signals between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer. A relative gain change is caused between those frequency bands whose magnitude relationship and time delay fall on one side of respective threshold values for magnitude relationship and time delay, and those frequency bands whose (a) magnitude relationship falls on the other side of its threshold value, (b) time delay falls on the other side of its threshold value, or (c) magnitude relationship and time delay both fall on the other side of their respective threshold values.
[00018] Further features include (a) providing an adjustable threshold value for the magnitude relationship, (b) providing an adjustable threshold value for the time delay, (c) fading the relative gain change across the magnitude relationship threshold, (d) fading the relative gain change across the time delay threshold, (e) that causing of a relative gain change is effected by (1) a gain term based on the magnitude relationship and (2) a gain term based on the time delay, (f) that the causing of a relative gain change is further effected by a gain term based on a magnitude of an output signal from one or more of the transducers, and/or (g) that for each frequency band there is an assigned threshold value for magnitude relationship and an assigned threshold value for time delay.
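Combining a gain term based on the magnitude relationship with a gain term based on the time delay, using thresholds assigned per frequency band, might look like the sketch below. It assumes hard (unfaded) thresholds, assumes that a band is accepted when both its level difference and its time delay are at or above their thresholds (that is, the source is close and near the axis), and uses hypothetical names and values throughout.

```python
def combined_gains(level_diffs_db, delays_s, mag_thresholds_db, delay_thresholds_s):
    """Return one gain per frequency band as the product of two gain terms.

    Each band has its own magnitude threshold and its own time delay threshold.
    Which side of each threshold is accepted is an assumption of this sketch.
    """
    gains = []
    for diff, delay, mag_thr, delay_thr in zip(
        level_diffs_db, delays_s, mag_thresholds_db, delay_thresholds_s
    ):
        mag_term = 1.0 if diff >= mag_thr else 0.0
        delay_term = 1.0 if delay >= delay_thr else 0.0
        gains.append(mag_term * delay_term)
    return gains


level_diffs_db = [6.0, 6.0, 1.0]     # measured level difference per band
delays_s = [90e-6, 20e-6, 90e-6]     # measured time delay per band
mag_thresholds_db = [3.0, 3.0, 3.0]  # assigned magnitude thresholds
delay_thresholds_s = [60e-6] * 3     # assigned time delay thresholds
print(combined_gains(level_diffs_db, delays_s, mag_thresholds_db, delay_thresholds_s))
```

Only the first band passes both tests; the second fails on time delay and the third fails on magnitude.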
[00019] A still further aspect involves a method of distinguishing sound sources. Data collected by at least three omni-directional microphones which each react to a characteristic of an acoustic wave is captured. The data is processed to determine (1) which data represents one or more sound sources located less than a certain distance from the microphones, and (2) which data represents one or more sound sources located more than the certain distance from the microphones. The results of the processing step are utilized to provide a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above. As such, sound sources are discriminated from each other based on their distance from the microphones.
[00020] Additional features include that (a) the utilizing step provides a greater emphasis of data representing the sound source(s) in (1) over data representing the sound source(s) in (2), (b) after the utilizing step the data is converted into output signals, (c) a first microphone is a first distance from a second microphone and a second distance from a third microphone, the first distance being less than the second distance, (d) the processing step selects high frequencies from the second microphone and low frequencies from the third microphone which are lower than the high frequencies, (e) the low frequencies and high frequencies are combined in the processing step, and/or (f) the processing step determines (1) a phase relationship from the data from microphones one and two, and (2) a magnitude relationship from the data from microphones one and three.
[00021] According to another aspect, a personal communication device includes two transducers which react to a characteristic of an acoustic wave to capture data representative of the characteristic. The transducers are separated by a distance of about 70mm or less. A signal processor for processing the data determines (1) which data represents one or more sound sources located less than a certain distance from the transducers, and (2) which data represents one or more sound sources located more than the certain distance from the transducers. The signal processor provides a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above. As such, sound sources are discriminated from each other based on their distance from the transducers.
[00022] Further features call for (a) the signal processor to convert the data into output signals, (b) the output signals to be used to drive a second acoustic driver remote from the device to produce sound remote from the device, (c) the transducers to be separated by a distance of no less than about 250 microns, (d) the device to be a cell phone, and/or (e) the device to be a speaker phone.
[00023] A still further aspect calls for a microphone system having a silicon chip and two transducers secured to the chip which react to a characteristic of an acoustic wave to capture data representative of the characteristic. The transducers are separated by a distance of about 70mm or less. A signal processor is secured to the chip for processing the data to determine (1) which data represents one or more sound sources located less than a certain distance from the transducers, and (2) which data represents one or more sound sources located more than the certain distance from the transducers. The signal processor provides a greater emphasis of data representing the sound source(s) in one of (1) or (2) above over data representing the sound source(s) in the other of (1) or (2) above, such that sound sources are discriminated from each other based on their distance from the transducers.
[00024] Another aspect calls for a method of discriminating between sound sources. Data collected by transducers which react to a characteristic of an acoustic wave is transformed into signals for each transducer location. The signals are separated into a plurality of frequency bands for each location. A relationship of the magnitudes of the signals is determined for each band for the locations. For each band a phase shift is determined from the signals which is indicative of when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer. A relative gain change is caused between those frequency bands whose magnitude relationship and phase shift fall on one side of respective threshold values for magnitude relationship and phase shift, and those frequency bands whose (1) magnitude relationship falls on the other side of its threshold value, (2) phase shift falls on the other side of its threshold value, or (3) magnitude relationship and phase shift both fall on the other side of their respective threshold values.
[00025] An additional feature calls for providing an adjustable threshold value for the phase shift.
[00026] According to a further aspect, a method of discriminating between sound sources includes transforming data, collected by transducers which react to a characteristic of an acoustic wave, into signals for each transducer location. The signals are separated into a plurality of frequency bands for each location. For each band a relationship of the magnitudes of the signals is determined for the locations. A relative gain change is caused between those frequency bands whose magnitude relationship falls on one side of a threshold value, and those frequency bands whose magnitude relationship falls on the other side of the threshold value. The gain change is faded across the threshold value to avoid abrupt gain changes at or near the threshold.
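Fading the gain change across the threshold, rather than switching abruptly, can be sketched as a ramp centered on the threshold. The linear shape of the ramp, its width in dB and the function name are assumptions chosen for illustration; the text only requires that abrupt gain changes at or near the threshold be avoided.

```python
def faded_gain(level_diff_db, threshold_db, fade_width_db=2.0, low_gain=0.0, high_gain=1.0):
    """Fade linearly from low_gain to high_gain across a window centered on the threshold."""
    if fade_width_db <= 0:
        raise ValueError("fade_width_db must be positive")
    lower_edge = threshold_db - fade_width_db / 2.0
    position = (level_diff_db - lower_edge) / fade_width_db
    position = min(1.0, max(0.0, position))  # clamp to the fade window
    return low_gain + (high_gain - low_gain) * position


for diff in (1.0, 3.0, 3.5, 5.0):
    print(diff, faded_gain(diff, threshold_db=3.0))
```

A band sitting exactly on the threshold receives a gain halfway between the low and high gains, so small fluctuations of the measured magnitude relationship produce small gain changes.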
[00027] Another feature calls for determining from the signals a time delay for each band between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer. A relative gain change is caused between those frequency bands whose magnitude relationship and time delay fall on one side of respective threshold values for magnitude relationship and time delay, and those frequency bands whose (1) magnitude relationship falls on the other side of its threshold value, (2) time delay falls on the other side of its threshold value, or (3) magnitude relationship and time delay both fall on the other side of their respective threshold values. The gain change is faded across the threshold value to avoid abrupt gain changes at or near the threshold.
[00028] Other features include that (a) a group of gain terms derived for a first octave is also applied to a second octave, (b) the first octave is lower than the second octave, (c) the group of gain terms derived for the first octave is also applied to a third octave, (d) the frequency bands of the first octave are lower than those of the third octave, and/or (e) the frequency bands of the first group are lower than the frequency bands of the second group.
[00029] Another aspect involves a method of discriminating between sound sources. Data, collected by transducers which react to a characteristic of an acoustic wave, is transformed into signals for each transducer location. The signals are separated into a plurality of frequency bands for each location. Characteristics of the signals are determined for each band which are indicative of a distance and angle to the transducers of a sound source providing energy to a particular band. A relative gain change is caused between those frequency bands whose signal characteristics indicate that a sound source providing energy to a particular band meets distance and angle requirements, and those frequency bands whose signal characteristics indicate that a sound source providing energy to a particular band (a) does not meet a distance requirement, (b) does not meet an angle requirement, or (c) does not meet distance and angle requirements.
[00030] Further features include that the characteristics include (a) a phase shift which is indicative of when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer, and/or (b) a time delay between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer, whereby an angle to the transducers of a sound source providing energy to a particular band is indicated.
[00031] An additional feature calls for the output signals to be (a) recorded on a storage medium, (b) communicated by a transmitter, and/or (c) further processed and used to present information on location of sound sources.
[00032] A further aspect of the invention calls for a method of distinguishing sound sources. Data collected by four transducers which each react to a characteristic of an acoustic wave is transformed into signals for each transducer location. The signals are separated into a plurality of frequency bands for each transducer location. For each band a relationship of the magnitudes of the signals for at least two different pairs of the transducers is compared with a threshold value. A determination is made for each transducer pair whether the magnitude relationship falls on one side or the other side of the threshold value. The results of each determination are utilized to decide whether an overall magnitude relationship falls on one side or the other side of the threshold value. A relative gain change is caused between those frequency bands whose overall magnitude relationship falls on one side of the threshold value and those frequency bands whose overall magnitude relationship falls on the other side of the threshold value, such that sound sources are discriminated from each other based on their distance from the transducers.
[00033] Other features call for (a) the four transducers to be arranged in a linear array, (b) a distance between each adjacent pair of transducers to be substantially the same, (c) each of the four transducers to be located at respective vertices of an imaginary polygon, and/or (d) giving a weight to results of the determination for each transducer pair.
[00034] Another aspect calls for a method of distinguishing sound sources. A sound distinguishing system is switched to a training mode. A sound source is moved to a plurality of locations within a sound source accept region such that the sound distinguishing system can determine a plurality of thresholds for a plurality of frequency bins. The sound distinguishing system is switched to an operating mode. The sound distinguishing system uses the thresholds to provide a relative emphasis to sound sources located in the sound source accept region over sound sources located outside the sound source accept region.
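One way the training and operating modes could fit together is sketched below. The summary does not state how the thresholds are derived from the training data, so the rule used here (the smallest level difference observed in each frequency bin during training, less a safety margin) is purely an assumption, as are the function names and the numeric values.

```python
def train_thresholds(training_frames, margin_db=1.0):
    """Derive one threshold per frequency bin from training measurements.

    training_frames: one list of per-bin level differences (dB) for each location
    the sound source was moved to inside the accept region.
    The min-minus-margin rule is an assumption of this sketch.
    """
    if not training_frames:
        raise ValueError("at least one training frame is required")
    n_bins = len(training_frames[0])
    return [min(frame[k] for frame in training_frames) - margin_db for k in range(n_bins)]


def operate(level_diffs_db, thresholds_db):
    """Operating mode: emphasize bins whose level difference meets the trained threshold."""
    return [1.0 if diff >= thr else 0.0 for diff, thr in zip(level_diffs_db, thresholds_db)]


# Training mode: two source locations, three frequency bins (hypothetical data).
thresholds = train_thresholds([[8.0, 6.0, 7.0], [5.0, 9.0, 6.0]])
print(thresholds)
# Operating mode: apply the trained thresholds to a new measurement.
print(operate([4.5, 2.0, 5.0], thresholds))
```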
[00035] Another feature requires that two of the microphones be connected by an imaginary straight line that extends in either direction to infinity. The third microphone is located away from this line.
[00036] One more feature calls for comparing a relationship of the magnitudes of the signals for six unique pairs of the transducers with a threshold value.
[00037] These and other aspects, objects, features and advantages of the present invention will be more clearly understood and appreciated from a review of the following detailed description and appended claims, and by reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[00038] FIG. 1 is a schematic diagram of a sound source in a first position relative to an acoustic pick-up device;
[00039] FIG. 2 is a schematic diagram of the sound source in a second position relative to the acoustic pick-up device;
[00040] FIG. 3 is a schematic diagram of the sound source in a third position relative to the acoustic pick-up device;
[00041] FIG. 4 is a schematic diagram of the sound source in a fourth position relative to the acoustic pick-up device;
[00042] Fig. 5 is a cross-section of a silicon chip with a microphone array;
[00043] FIG. 6A-C show plots of lines of constant dB difference and time difference as a function of angle and distance;
[00044] Fig. 7 is a schematic diagram of a first embodiment of a microphone system;
[00045] Fig. 8 is a plot of the output of a conventional microphone and the microphone system of Fig. 7 versus distance;
[00046] Fig. 9 is a polar plot of the output of a cardioid microphone and the microphone system of Fig. 7 versus angle;
[00047] Figs. 10a and 10b are schematic drawings of transducers being exposed to acoustic waves from different directions;
[00048] Fig. 11 is a plot of lines of constant magnitude difference (in dB) for a relatively widely spaced pair of transducers;
[00049] Fig. 12 is a plot of lines of constant magnitude difference (in dB) for a relatively narrowly spaced pair of transducers;
[00050] Fig. 13 is a schematic diagram of a second embodiment of a microphone system;
[00051] Fig. 14 is a schematic diagram of a third embodiment of a microphone system;
[00052] Figs. 15a and b are plots of gain versus frequency;
[00053] Fig. 16A is a schematic diagram of a fourth embodiment of a microphone system;
[00054] Fig. 16B is a schematic diagram of another portion of the fourth embodiment;
[00055] Figs. 16C-E are graphs of gain terms used in the fourth embodiment;
[00056] Fig. 17A is a perspective view of an earphone with integrated microphone;
[00057] Fig. 17B is a front view of a cell phone with integrated microphone;
[00058] Figs. 18A and B are plots of frequency versus threshold for magnitude and time delay;
[00059] Fig. 19 is a graph demonstrating slew rate limiting;
[00060] Fig. 20 is a side schematic diagram of a fifth embodiment of a microphone system; and
[00061] Fig. 21 is a top schematic diagram of a sixth embodiment of a microphone system.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[00062] For some sound applications (e.g. the amplification of live music, sound recording, cell phones and speaker phones), a microphone system with an unusual set of directional properties is desired. A new microphone system having these properties is disclosed that avoids many of the typical problems of directional microphones while offering improved performance. This new microphone system uses the pressures measured by two or more spaced microphone elements (transducers) to cause a relative positive gain for the signals from sound sources that fall within a certain acceptance window of distance and angle relative to the microphone system compared to the gain for the signals from all other sound sources.
[00063] These goals are achieved with a microphone system having a very different directional pattern than conventional microphones. A new microphone system with this pattern accepts sounds only within an "acceptance window". Sounds originating within a certain distance and angle from the microphone system are accepted. Sounds originating outside this distance and/or angle are rejected.
[00064] In one application of the new microphone system (a live music performance), sources we'd like to reject, such as the drum kit at the singer's microphone, or the loudspeakers at any microphone, are likely to be too far away and/or at the wrong angle to be accepted by the new microphone system. Accordingly, the problems described above are avoided.
[00065] Beginning with FIG. 1, an acoustic pick-up device 10 includes front and rear transducers 12 and 14. The transducers collect data at their respective locations by reacting to a characteristic of an acoustic wave such as local sound pressure, the first order sound pressure gradient, higher-order sound pressure gradients, or combinations thereof. Each transducer in this embodiment can be a conventional omni-directional sound pressure responding microphone, and the transducers are arranged in a linear array. The transducers each transform the instantaneous sound pressure present at their respective location into electrical signals which represent the sound pressure over time at those locations.
[00066] Consider the ideal situation of a point source of sound 15 in free space, shown as a speaker in Figure 1. Sound source 15 could also be, for example, a singer or the output of a musical instrument. The distance from sound source 15 to front transducer 12 is R, and the angle between the acoustic pick-up device 10 and the source is θ. Transducers 12, 14 are separated by a distance rt. From the electrical signals discussed above, knowing rt, and comparing aspects of the signals with thresholds, it can be determined whether or not to accept sounds from sound source 15. The time difference between when a sound pressure wave reaches transducer 12 and when the wave reaches transducer 14 is τ. The symbol c is the speed of sound. Accordingly, a first equation which includes the unknown θ is as follows:
Figure imgf000016_0001
Also, we can measure the sound pressure magnitudes M1 and M2 at the respective locations of transducers 12 and 14, and we know rt. As such, we can set up a second equation including unknown R as:
Figure imgf000016_0002
Thus, we have two equations and two unknowns, R and θ (given rt, τ, c and M1/M2). The two equations are solved simultaneously by numerical methods on a computer.
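As a rough illustration of this two-equation solve, the sketch below assumes an ideal point source with 1/distance magnitude falloff, with θ measured at front transducer 12 from the array axis and rear transducer 14 lying a distance rt behind transducer 12 on that axis. Under those assumptions the path length to transducer 14 is sqrt(R² + rt² + 2·R·rt·cos θ). This geometry and the function names `forward` and `solve` are assumptions made for illustration; they are not taken from the equation figures, which are not reproduced here.

```python
import math

C = 343.0  # speed of sound in air, m/s (value used in the text)

def forward(R, theta_deg, rt, c=C):
    """Return (tau, M1/M2) for a source at distance R (m) from the front
    transducer and angle theta from the array axis (assumed geometry)."""
    d2 = math.sqrt(R * R + rt * rt
                   + 2.0 * R * rt * math.cos(math.radians(theta_deg)))
    tau = (d2 - R) / c      # extra travel time to the rear transducer
    ratio = d2 / R          # 1/distance falloff gives M1/M2 = d2/R
    return tau, ratio

def solve(tau, ratio, rt, c=C):
    """Invert the assumed model: recover (R, theta_deg) from tau and M1/M2."""
    if ratio <= 1.0:
        raise ValueError("no magnitude difference: distance cannot be estimated")
    R = c * tau / (ratio - 1.0)     # from d2 = R + c*tau and d2 = ratio*R
    d2 = ratio * R
    cos_theta = (d2 * d2 - R * R - rt * rt) / (2.0 * R * rt)
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against rounding
    return R, math.degrees(math.acos(cos_theta))

# Round trip for a source 0.13 m away, 25 degrees off axis, rt = 35 mm.
tau, ratio = forward(0.13, 25.0, 0.035)
R_est, theta_est = solve(tau, ratio, 0.035)
```

With these example values the assumed model gives a magnitude ratio of about 1.9 dB and a delay of about 94 microseconds. The inversion becomes ill-conditioned as the magnitude ratio approaches 1, because R is obtained by dividing by (M1/M2 − 1).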
[00067] An example is provided in Figure 2. In this example it is assumed that sound source 15 emits spherical waves. When R is small compared to the distance rt between transducers 12, 14, and θ=0°, there will be a large sound pressure magnitude difference between the two transducer signals. This occurs because there is a large relative difference between distance R from sound source 15 to transducer 12 and the distance R + rt from source 15 to transducer 14. For a point source of sound the sound pressure magnitude drops as a function of 1/R from source 15 to transducer 12 and 1/(R+rt) from source 15 to transducer 14.
[00068] The distance rt is preferably measured from the center of a diaphragm for each of transducers 12 and 14. Distance rt is preferably smaller than a wavelength for the highest frequency of interest. However, rt should not be too small, as the magnitude ratios as a function of distance will then be small and thus more difficult to measure. Where the acoustic waves are traveling in a gas where c is approximately 343 m/s (e.g. air), distance rt in one example is preferably about 70 millimeters (mm) or less. At about 70mm the system is best suited for acoustic environments consisting primarily of human speech and similar signals. Preferably distance rt is between about 20mm and about 50mm. More preferably distance rt is between about 25mm and about 45mm. Most preferably distance rt is about 35mm.
[00069] To this point the description has inherently assumed an environment of a compressible fluid (e.g. air). It should be noted that this invention will also be effective in an environment of an incompressible fluid (e.g. water or salt water). In the case of water the transducer spacing can be about 90mm or greater. If it is only desired to measure low or extremely low frequencies, the transducer spacing can get quite large. For example, assuming the speed of sound in water is 1500 meters/second and the highest frequency of interest is 100 Hz, then the transducers can be spaced 15 meters apart.
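The 15 meter figure follows from treating the spacing limit as one wavelength at the highest frequency of interest. The worked arithmetic below applies the same reading to the air spacings mentioned earlier; the symbols λ_min and f_max are introduced here only for illustration.

$$r_t \le \lambda_{\min} = \frac{c}{f_{\max}}, \qquad \text{water: } \frac{1500\ \text{m/s}}{100\ \text{Hz}} = 15\ \text{m}$$

$$\text{air: } f_{\max} = \frac{c}{r_t} = \frac{343\ \text{m/s}}{0.070\ \text{m}} = 4.9\ \text{kHz}, \qquad \frac{343\ \text{m/s}}{0.035\ \text{m}} = 9.8\ \text{kHz}$$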
[00070] Turning to Figure 3, when R is relatively large and θ=0°, the relative time difference (delay) remains the same, but the difference in magnitude between the signals of transducers 12, 14 decreases significantly. As R becomes very large, the magnitude difference approaches zero.
[00071] Referring to Figure 4, for any R, when θ=90°, the time delay between transducers 12 and 14 vanishes, since the path length from sound source 15 to each transducer 12, 14 is the same. At angles between 0° and 90°, the time delay decreases from rt/c to zero. Generally speaking, the magnitudes of the signals of transducers 12, 14 when θ=90° will be equal. It can be seen that there is variation in the relative magnitude, relative phase (or time delay), or both in the signals output from the transducer pair of Figures 2-4 as a function of the location of the sound source 15 with respect to the location of audio device 10. This is shown more completely in Figures 6a-c, described in more detail below. The sound source angle can be calculated at any angle. However, in this example, the sound source distance R becomes progressively more difficult to estimate as θ approaches ±90°. This is because at ±90° there is no longer any magnitude difference between M1 and M2 regardless of distance.
[00072] With reference to Figure 5, a cross-section of a silicon chip 35 discloses a Micro-Electro-Mechanical Systems (MEMS) microphone array 37. Array 37 includes a pair of acoustic transducers 34, 41 which are spaced a distance rt of at least about 250 microns from each other. Optional ports 43, 45 increase an effective distance dt at which transducers 34, 41 "hear" their environment. Distance dt can be set at any desired length up to about 70mm. Chip 35 also includes the associated signal processing apparatus (not shown in Figure 5), which is connected to transducers 34, 41. An advantage of a MEMS microphone array is that some or all of the desired signal processing (discussed below), for example signal conditioning, A/D conversion, windowing, transformation, and D/A conversion, can be placed on the same chip. This provides a very compact, unitary microphone system. An example of a MEMS microphone array is the AKU2001 Tri-State Digital Output CMOS MEMS Microphone available from Akustica, Inc., 2835 East Carson Street, Suite 301, Pittsburgh, PA 15203 (http://www.akustica.com/documents/AKU2001ProductBrief.pdf).
[00073] Turning to Figure 6a, a theoretical plot is provided of magnitude difference and time delay difference (phase) of the signals present at the location of transducers 12, 14 due to sound output by sound source 15, as a function of source 15's location (angle and distance) relative to the location of audio device 10 (consisting of transducers 12 and 14). The plot of Figs. 6a-c was calculated assuming the distance rt between transducers 12, 14 is 35mm. The equations in paragraph 39 above were used to computationally create this plot. Here, however, R and θ are set to known values and τ and M1/M2 are calculated. The theoretical sound source angle θ and distance R are varied over a wide range to determine a range of τ and M1/M2. A Y axis provides the sound source angle θ in degrees and an X axis provides the sound source distance in meters. Lines 17 of constant magnitude difference in dB are plotted. Lines 19 of constant time difference (microseconds) of the signals at the location of transducers 12, 14 are also plotted. More gradations can be provided if desired.
[00074] If, for example, it is desired to only accept sound sources located less than 0.13 meters from transducer 12 and at an angle θ of less than 25 degrees, we find the intersection of these values at a point 23. At point 23 we see that the magnitude difference must be greater than 2dB and the time delay must be greater than 100 microseconds. A hatched area 27 indicates the acceptance window for this setting. If the sound source causes a magnitude difference of greater than or equal to 2dB and a time delay of greater than or equal to 100 microseconds, then we accept that sound source. If the sound source causes a magnitude difference of less than 2dB and/or a time delay of less than 100 microseconds, then we reject that sound source.
[00075] The above type of processing, and the resulting acceptance or rejection of a sound source based on its distance and angle from the transducers, is done on a frequency band by frequency band basis. Relatively narrow frequency bands are desirable to avoid blocking desired sounds or passing undesired sounds. It is preferable to use narrow frequency bands and short time blocks, although those two characteristics conflict with each other. Narrower frequency bands enhance the rejection of unwanted acoustic sources but require longer time blocks. However, longer time blocks create system latency that can be unacceptable to a microphone user. Once a maximum acceptable system latency is determined, the frequency band width can be chosen. Then the block time is selected. Further details are provided below.
[00076] Because the system works independently over many frequency bands, a desired singer located on-axis 0.13 meters from the microphone and singing a C is accepted, while a guitar located off-axis 0.25 meters from the microphone and playing an E is rejected. Thus, if a desired singer less than 0.13 meters from the microphone and on axis is singing a C, but a guitar is playing an E 0.25 meters from the microphone at any angle, the microphone system passes the vocalist's C and its harmonics, while simultaneously rejecting the instrumentalist's E and its harmonics.
[00077] Figure 6B shows an embodiment where two thresholds are used for each of magnitude difference and time difference. Sound sources that cause a magnitude difference of 2 ≤ dB difference ≤ 3 and a time difference of 80 ≤ microseconds ≤ 100 are accepted. The acceptance window is identified by the hatched area 29. Sound sources that cause a magnitude difference and/or a time difference outside of acceptance window 29 are rejected.
[00078] Figure 6C shows an embodiment where two acceptance windows 31 and 33 are used. Sound sources that cause a magnitude difference of ≥3dB and a time difference of 80 ≤ microseconds ≤ 100 are accepted. Sound sources that cause a magnitude difference of 2 ≤ dB difference ≤ 3 and a time difference of ≥100 microseconds are also accepted. Sound sources that cause a magnitude difference and/or a time difference outside of acceptance windows 31 and 33 are rejected. Any number of acceptance windows can be created by using appropriate thresholds for magnitude difference and time difference.
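The window tests of Figures 6A to 6C can be sketched as a list of rectangular ranges in the (magnitude difference, time delay) plane. The following is a minimal illustration rather than the described implementation: the `Window` class and `accepted` function are hypothetical names, and the 80 microsecond lower bound in the first Figure 6C window is taken to match Figure 6B.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Window:
    """Closed ranges of magnitude difference (dB) and time delay (microseconds).
    A bound left at +/- infinity means that side is unconstrained."""
    db_min: float = -math.inf
    db_max: float = math.inf
    us_min: float = -math.inf
    us_max: float = math.inf

    def contains(self, db, us):
        return (self.db_min <= db <= self.db_max
                and self.us_min <= us <= self.us_max)

def accepted(db, us, windows):
    """A source (or frequency bin) is accepted if it lies in any window."""
    return any(w.contains(db, us) for w in windows)

# Thresholds quoted in the text for each figure.
FIG_6A = [Window(db_min=2.0, us_min=100.0)]
FIG_6B = [Window(db_min=2.0, db_max=3.0, us_min=80.0, us_max=100.0)]
FIG_6C = [Window(db_min=3.0, us_min=80.0, us_max=100.0),
          Window(db_min=2.0, db_max=3.0, us_min=100.0)]
```

For example, `accepted(2.5, 90.0, FIG_6B)` is true, while the same measurement is rejected under the Figure 6A and Figure 6C settings.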
[00079] Turning now to Figure 7, a microphone system 11 will be described. An acoustic wave from sound source 15 causes transducers 12, 14 to produce electrical signals representing characteristics of the acoustic wave as a function of time. Transducers 12, 14 are each preferably an omni-directional microphone element which can connect to other parts of the system via a wire or wirelessly. The transducers in this embodiment have the center of their respective diaphragms separated by a distance of about 35mm. Some or all of the remaining elements in Figure 7 can be incorporated into the microphone, or they can be in one or more separate components. The signals for each transducer pass through respective conventional pre-amplifiers 16 and 18 and a conventional analog-to-digital (A/D) converter 20. In some embodiments, a separate A/D converter is used to convert the signal output by each transducer. Alternatively, a multiplexer can be used with a single A/D converter. Amplifiers 16 and 18 can also provide DC power (i.e. phantom power) to respective transducers 12 and 14 if needed.
[00080] Using block processing techniques which are well known to those skilled in the art, blocks of overlapping data are windowed at a block 22 (a separate windowing is done on the signal for each transducer). The windowed data are transformed from the time domain into the frequency domain using a fast Fourier transform (FFT) at a block 24 (a separate FFT is done on the signal for each transducer). This separates the signals into a plurality of linearly spaced frequency bands (i.e. bins) for each transducer location. Other types of transforms can be used to transform the windowed data from the time domain to the frequency domain. For example, a wavelet transform may be used instead of an FFT to obtain log spaced frequency bins. In this embodiment a sampling frequency of 32000 samples/sec is used with each block containing 512 samples.
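The sampling rate and block size quoted above fix both the block duration and the frequency bin spacing. The short calculation below shows this; the variable names are illustrative only.

```python
SAMPLE_RATE_HZ = 32000   # from the text
BLOCK_SIZE = 512         # samples per block, from the text

# Duration of one block of samples, in milliseconds.
block_duration_ms = 1000.0 * BLOCK_SIZE / SAMPLE_RATE_HZ

# Spacing between adjacent FFT bins, in hertz.
bin_spacing_hz = SAMPLE_RATE_HZ / BLOCK_SIZE

# For a real input signal, bins 0..N/2 cover 0 Hz up to the Nyquist frequency.
bin_centers_hz = [k * bin_spacing_hz for k in range(BLOCK_SIZE // 2 + 1)]
```

Each block therefore spans 16 ms and yields 257 bins spaced 62.5 Hz apart. Halving the bin spacing would require doubling the block length, which is the band width versus latency trade-off noted earlier.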
[00081] The definition of the discrete Fourier transform (DFT) and its inverse is as follows:
The functions X = fft(x) and x = ifft(X) implement the transform and inverse transform pair given for vectors of length N by:
$$X(k) = \sum_{j=1}^{N} x(j)\,\omega_N^{(j-1)(k-1)}$$
$$x(j) = \frac{1}{N}\sum_{k=1}^{N} X(k)\,\omega_N^{-(j-1)(k-1)}$$
where
$$\omega_N = e^{-2\pi i/N}$$
is an Nth root of unity.
[00082] The FFT is an algorithm for implementing the DFT that speeds the computation. The Fourier transform of a real signal (such as audio) yields a complex result. The magnitude of a complex number X is defined as: sqrt(real(X).^2 + imag(X).^2)
[00083] The angle of a complex number X is defined as:
arctan(imag(X) / real(X))
[00084] where the sign of the real and imaginary parts is observed to place the angle in the proper quadrant of the unit circle, allowing a result in the range:
-π < angle(X) ≤ π
[00085] The equivalent time delay is defined as:
angle(X) / (2·π·f)
[00086] The magnitude ratio of two complex values, X1 and X2, can be calculated in any of a number of ways. One can take the ratio of X1 and X2, and then find the magnitude of the result. Or, one can find the magnitudes of X1 and X2 separately, and take their ratio. Alternatively, one can work in log space, and take the log of the magnitude of the ratio, or alternatively, the difference (subtraction) of log|X1| and log|X2|.
[00087] Similarly, the time delay between two complex values can be calculated in a number of ways. One can take the ratio of X1 and X2, find the angle of the result and divide by the angular frequency. Or, one can find the angles of X1 and X2 separately, subtract them, and divide the result by the angular frequency.
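The per-bin magnitude ratio and time delay described in the last few paragraphs can be sketched with Python's standard `cmath` module. The function names and the synthetic test bin are illustrative assumptions, and the sign convention (positive delay when the second signal lags the first) is a choice made for this sketch.

```python
import cmath
import math

def magnitude_difference_db(x1, x2):
    """20*log10(|X1| / |X2|) for one frequency bin (x2 must be nonzero)."""
    return 20.0 * math.log10(abs(x1) / abs(x2))

def time_delay_s(x1, x2, freq_hz):
    """Angle of X1/X2 divided by the angular frequency 2*pi*f."""
    return cmath.phase(x1 / x2) / (2.0 * math.pi * freq_hz)

# Synthetic bin at 1 kHz: the second signal is 2 dB weaker than the first
# and lags it by 100 microseconds.
f = 1000.0
x1 = cmath.rect(1.0, 0.3)
x2 = cmath.rect(10.0 ** (-2.0 / 20.0), 0.3 - 2.0 * math.pi * f * 100e-6)

db = magnitude_difference_db(x1, x2)
delay = time_delay_s(x1, x2, f)
```

Because `cmath.phase` returns the principal value of the angle, the recovered delay is only correct while the true phase difference stays within ±π.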
[00088] As described above, a relationship of the signals is established. In some embodiments the relationship is the ratio of the signal from front transducer 12 to the signal from rear transducer 14, which is calculated for each frequency bin on a block-by-block basis at a divider block 26. The magnitude of this ratio (relationship) in dB is calculated at a block 28. A time difference (delay) τ (tau) is calculated for each frequency bin on a block-by-block basis by first computing the phase at a block 30 and then dividing the phase by the center frequency of each frequency bin, expressed as an angular frequency (2πf), at a divider 32. The time delay represents the elapsed time between when an acoustic wave is detected by transducer 12 and when this wave is detected by transducer 14.
[00089] Other well known digital signal processing (DSP) techniques for estimating magnitude and time delay differences between the two transducer signals may be used. For example, an alternate approach to calculating time delay differences is to use cross correlation in each frequency band between the two signals X1 and X2.
[00090] The calculated magnitude relationship and time differences (delay) for each frequency bin (band) are compared with threshold values at a block 34. For example, as described above in Figure 6A, if the magnitude difference is greater than or equal to 2dB and the time delay is greater than or equal to 100 microseconds, then we accept (emphasize) that frequency bin. If the magnitude difference is less than 2dB and/or the time delay is less than 100 microseconds, then we reject (deemphasize) that frequency bin.
[00091] A user input 36 may be manipulated to vary the acceptance angle threshold(s) and a user input 38 may be manipulated to vary the distance threshold(s) as required by the user. In one embodiment a small number of user presets are provided for different acceptance patterns which the user can select as needed. For example, the user would select between general categories such as narrow or wide for the angle setting and near or far for the distance setting.
[00092] A visual or other indication is given to the user to let her know the threshold settings for angle and distance. Accordingly, user-variable threshold values can be provided such that a user can adjust a distance selectivity and/or an angle selectivity from the transducers. The user interface may represent this as changing the distance and/or angle thresholds, but in effect the user is adjusting the magnitude difference and/or the time difference thresholds.
[00093] When the magnitude difference and time delay both fall within the acceptance window for a particular frequency band, a relatively high gain is calculated at a block 40, and when one or both of the parameters is outside the window, a relatively low gain is calculated. The high gain is set at about 1 while the low gain is at about 0. Alternatively, the high gain might be above 1 while the low gain is below the high gain. In general, a relative gain change is caused between those frequency bands whose parameter (magnitude and time delay) comparisons both fall on one side of their respective threshold values and those frequency bands where one or both parameter comparisons fall on the other side of their respective threshold values.
[00094] The gains are calculated for each frequency bin in each data block. The calculated gain may be further manipulated in other ways known to those skilled in the art to minimize the artifacts generated by such gain change. For example, the minimum gain can be limited to some low value, rather than zero. Additionally, the gain in any frequency bin can be allowed to rise quickly but fall more slowly using a fast attack, slow decay filter. In another approach, a limit is set on how much the gain is allowed to vary from one frequency bin to the next at any given time.
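Two of the artifact-reduction measures just mentioned, a nonzero minimum gain and a fast attack with slow decay, might be combined as in the sketch below. The function name and the `floor` and `decay` values are hypothetical; the text does not specify them.

```python
def smooth_gains(target, previous, floor=0.1, decay=0.8):
    """Per-bin gains for the current block.

    target   -- raw gains for this block (1.0 = accepted, 0.0 = rejected)
    previous -- smoothed gains from the previous block
    floor    -- assumed minimum gain, so rejected bins are not fully muted
    decay    -- assumed per-block decay factor (closer to 1 = slower fall)
    """
    out = []
    for g, prev in zip(target, previous):
        g = max(g, floor)
        if g >= prev:
            out.append(g)                      # fast attack: jump straight up
        else:
            out.append(max(g, prev * decay))   # slow decay: fall gradually
    return out

# Bin 0 is accepted in the first block and rejected in the second.
g1 = smooth_gains([1.0, 0.0], [0.1, 0.1])
g2 = smooth_gains([0.0, 0.0], g1)
```

In this example the gain of bin 0 rises immediately to 1.0 and then falls only to 0.8 in the following block, rather than dropping straight to the floor.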
[00095] On a frequency bin by frequency bin basis, the calculated gain is applied to the frequency domain signal from a single transducer, for example transducer 12 (although transducer 14 could also be used) , at a multiplier 42. Thus, sound sources in the acceptance window are emphasized relative to sources outside the window.
[00096] Using conventional block processing techniques, the modified signal is inverse FFT'd at a block 44 to transform the signal from the frequency domain back into the time domain. The signal is then windowed, overlapped and summed with the previous blocks at a block 46. At a block 48 the signal is converted from a digital signal back to an analog (output) signal. The output of block 48 is then sent to a conventional amplifier (not shown) and acoustic driver (i.e. speaker) (not shown) of a sound reinforcement system to produce sound. Alternatively, an input signal (digital) to block 48 or an output signal (analog) from block 48 can be (a) recorded on a storage medium (e.g. electronic or magnetic), (b) communicated by a transmitter (wired or wirelessly), or (c) further processed and used to present information on location of sound sources.
[00097] Some benefits of this microphone system will be described with respect to Figures 8 and 9. Regarding distance selectivity, the response of a conventional microphone decreases smoothly with distance. For example, for a sound source with constant strength the output level of a typical omni directional microphone falls with distance R as 1/R. This is shown as line segments 49 and 50 in Figure 8 which plots relative microphone output in dB as a function of the log of R, the distance from the microphone to the sound source .
[00098] The microphone system shown in Figure 7 has the same fall off with R (line segment 49) , but only out to a specified distance, RO. The fall off in microphone output at RO is represented by a line segment 52. For a vocalist's microphone that is to be handheld by a singer, RO would typically be set to be approximately 30cm. For a vocalist's microphone fixed on a stand, that distance could be considerably less. The new microphone responds to the singer, located closer than RO, but rejects anything further away, such as sound from other instruments or loudspeakers.
[00099] Turning to Figure 9, angle selectivity will be discussed. Conventional microphones can have any of a variety of directional patterns. A cardioid response, which is a common directional pattern for microphones, is shown in the polar plot line 54 (the radius of the curve indicates the relative microphone magnitude response to sound arriving at the indicated angle). The cardioid microphone has the strongest magnitude response for sounds arriving at the front, with less and less response as the sound source moves to the rear. Sounds arriving from the rear are significantly attenuated.
[000100] A directional pattern for the microphone system of Figure 7 is shown by the pie shaped line 56. For sounds arriving within the acceptance angle (in this example, ±30°), the microphone has high response. Sounds arriving outside this angle are significantly attenuated.
[000101] The magnitude difference is a function of both distance and angle. The maximum change in magnitude with distance occurs in line with the transducers. The minimum change in magnitude with distance occurs in a line perpendicular to the axis of the transducers. For sources 90 degrees off axis, there is no magnitude difference, regardless of the source distance. Angle, however, is a function of the time difference alone. For applications where distance selectivity is important, the transducer array should be oriented pointing towards the location of a sound source or sources we wish to select.
[000102] A microphone having this sort of extreme directionality will be much less susceptible to feedback than a conventional microphone for two reasons. First, in a live performance application, the new microphone largely rejects the sound of main or monitor loudspeakers that may be present, because they are too distant and outside the acceptance window. The reduced sensitivity lowers the loop gain of the system, reducing the likelihood of feedback. Additionally, in a conventional system, feedback is exacerbated by having several "open" microphones and speakers on stage . Whereas any one microphone and speaker might be stable and not create feedback, the combination of multiple cross coupled systems can more easily be unstable, causing feedback. The new microphone system described herein is "open" only for a sound source within the acceptance window, making it less likely to contribute to feedback by coupling to another microphone and sound amplification system on stage, even if those other microphones and systems are completely conventional.
[000103] The new microphone system also greatly reduces the bleed through of sound from other performers or other instruments in a performing or recording application. The acceptance window (both distance and angle) can be tailored by the performer or sound crew on the fly to meet the needs of the performance .
[000104] The new microphone system can simulate the sound of many different styles of microphones for performers who want that effect as part of their sound. For example, in one embodiment of the invention this system can simulate the proximity effect of conventional microphones by boosting the gain more at low frequencies than high frequencies for magnitude differences indicating small R values. In the embodiment of Fig. 7, the output of transducer 12 alone is processed on a frequency bin basis to form an output signal. Transducer 12 is typically an omni-directional pressure responding transducer, and it will not exhibit proximity effect as is present in a typical pressure gradient responding microphone. Gain block 40 imposes a distance dependent gain function on the output of transducer 12, but the function described so far either passes or blocks a frequency bin depending on distance/angle from the microphone system. A more complex function can be applied in gain processing block 40, to simulate proximity effect of a pressure gradient microphone, while maintaining the distance/angle selectivity of the system as described. Rather than using a coefficient of either one or zero, a variable coefficient can be used, where the coefficient value varies as a function of frequency and distance. This function has a first order high pass filter shape, where the corner frequency decreases as distance decreases .
[000105] Proximity effect can also be caused by combining transducers 12, 14 into a single uni-directional or bidirectional microphone, thereby creating a fixed directional array. In this case the calculated gain is applied to the combined signal from transducers 12, 14, providing pressure gradient type directional behavior (not adjustable by the user), in addition to the enhanced selectivity of the processing of Fig. 7. In another embodiment of the invention the new microphone system does not boost the gain more at low frequencies than at high frequencies for magnitude differences indicating small R values, and so does not display proximity effect.
[000106] The new microphone can create new microphone effects. One example is a microphone having the same output for all sound source distances within the acceptance window. Using the magnitude difference and time delay between the transducers 12 and 14, the gain is adjusted to compensate for the 1/R falloff from transducer 12. Such a microphone might be attractive to musicians who do not "work the mike" . A sound source of constant level would cause the same output magnitude for any distance from the transducers within the acceptance window. This feature can be useful in a public address (PA) system. Inexperienced presenters generally are not careful about maintaining a constant distance from the microphone. With a conventional PA system, their reproduced voice can vary between being too loud and too soft . The improved microphone described herein keeps the voice level constant, independent of the distance between the speaker and the microphone. As a result, variations in the reproduced voice level for an inexperienced speaker are reduced .
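One way to picture the constant-output behaviour is as a gain proportional to the estimated source distance, which cancels the 1/R falloff at transducer 12. The sketch below assumes a distance estimate is already available from the magnitude difference and time delay; the function name, the reference distance `r_ref`, and the clamp limits are hypothetical values chosen for illustration.

```python
def level_compensation_gain(r_est, r_ref=0.1, r_min=0.001, r_max=0.3):
    """Gain that undoes 1/R falloff relative to a reference distance.

    r_est -- estimated source distance in metres
    r_ref -- assumed distance at which the gain is 1.0
    r_min, r_max -- assumed clamp so the gain stays bounded
    """
    r = min(max(r_est, r_min), r_max)
    return r / r_ref

# A source of constant strength produces a level proportional to 1/R at the
# transducer; after compensation the level no longer depends on R.
near = (1.0 / 0.05) * level_compensation_gain(0.05)
far = (1.0 / 0.20) * level_compensation_gain(0.20)
```

Here `near` and `far` come out equal, so a talker moving between 5 cm and 20 cm from the microphone would be reproduced at the same level under these assumptions.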
[000107] The new microphone can be used to replace microphones for communications purposes, such as a microphone for a cell phone for consumers (in a headset or otherwise), or a boom microphone for pilots. These personal communication devices typically have a microphone which is intended to be located about 1 foot or less from a user's lips. Rather than using a boom to place a conventional noise canceling microphone close to the user's lips, a pair of small microphones mounted on the headset could use the angle and/or distance thresholds to accept only those sounds having the correct distance and/or angle (e.g. the user's lips). Other sounds would be rejected. The acceptance window is centered around the anticipated location of the user's mouth.
[000108] This microphone can also be used for other voice input systems where the location of the talker is known (e.g. in a car). Some examples include hands free telephony applications, such as hands free operation in a vehicle, and hands free voice command, such as with vehicle systems employing speech recognition capabilities to accept voice input from a user to control vehicle functions. Another example is using the microphone in a speakerphone which can be used, for example, in tele-conferencing. These types of personal communication devices typically have a microphone which is intended to be located more than 1 foot from a user's lips. The new microphone technology of this application can also be used in combination with speech recognition software. The signals from the microphone are passed to the speech recognition algorithm in the frequency domain. Frequency bins that are outside the accept region for sound sources are given a lower weighting than frequency bins that are in the accept region. Such an arrangement can help the speech recognition software to process a desired speaker's voice in a noisy environment.
[000109] Turning now to Figures 10A and B, another embodiment will be described. In the embodiment described in Figure 7, two transducers 12, 14 are used with relatively wide spacing between them compared to a wavelength of sound at the maximum operating frequency of the transducers. The reasons for this will be discussed below. However, as the frequency gets higher, it becomes difficult to reliably estimate the time delay between the two transducers using computationally simple methods. Normally, the phase difference between microphones is calculated for each frequency bin and divided by the center frequency of the bin to estimate time delay. Other techniques can be used, but they are more computationally intensive.
[000110] However, when the wavelength of sound approaches the distance between the microphones, this simple approach breaks down. The phase measurement produces results in the range between -π and π. However, there is an uncertainty in the measurement having a value that is an integral multiple of 2π. A measurement of 0 radians of phase difference could just as easily represent a phase difference of 2π or -2π.
[000111] This uncertainty is illustrated graphically in Figures 10a and 10b. Parallel lines 58 represent the wavelength spacing of the incoming acoustic pressure waves. In both of Figures 10a and 10b, peaks in the acoustic pressure wave reach transducers 12, 14 simultaneously, and so a phase shift of 0 is measured. However, in Figure 10a the wave comes in the direction of an arrow 60 perpendicular to an imaginary straight line joining transducers 12, 14. In this case the time delay actually is zero between the two transducers. On the contrary, in Figure 10b the wave comes in parallel to the imaginary line joining transducers 12, 14 in the direction of an arrow 62. In this example, two wavelengths fit in the space between the two transducers. The time of arrival difference is clearly non-zero, yet the measured phase delay remains zero, rather than the correct value of 4π.
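The Figure 10b case can be reproduced numerically. The sketch below assumes a 35mm spacing and c = 343 m/s, and picks the frequency at which exactly two wavelengths fit between the transducers; the function name `measured_phase` is illustrative.

```python
import cmath
import math

C = 343.0      # speed of sound in air, m/s
RT = 0.035     # transducer spacing, m (assumed for this example)

def measured_phase(freq_hz, delay_s):
    """Phase as a frequency bin would report it, wrapped into [-pi, pi]."""
    return cmath.phase(cmath.exp(1j * 2.0 * math.pi * freq_hz * delay_s))

on_axis_delay = RT / C               # wave travelling along the array axis
f_two_wavelengths = 2.0 * C / RT     # two wavelengths fit between transducers

true_phase = 2.0 * math.pi * f_two_wavelengths * on_axis_delay
wrapped = measured_phase(f_two_wavelengths, on_axis_delay)
```

The true phase shift is 4π, but the wrapped measurement is zero to within rounding error, so dividing it by the angular frequency would wrongly report no delay.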
[000112] This issue can be avoided by reducing the distance between transducers 12, 14 such that their spacing is less than a wavelength even for the highest frequency (shortest wavelength) we wish to sense. This approach eliminates the 2π uncertainty. However, a narrower spacing between the transducers decreases the magnitude difference between transducers 12 , 14 , making it harder to measure the magnitude difference (and thus provide distance selectivity) .
[000113] Figure 11 shows lines of constant magnitude difference (in dB) between transducers 12, 14 for various distances and angles between the acoustic source and transducer 12 when the transducers 12, 14 have a relatively wide spacing between themselves (about 35mm). Figure 12 shows lines of constant magnitude difference (in dB) between the transducers 12, 14 for various distances and angles to the acoustic source with a much narrower transducer spacing (about 7mm). With narrower transducer spacing the magnitude difference is greatly reduced and it is harder to get an accurate distance estimate.
[000114] This problem can be avoided by using two pairs of transducer elements: a widely spaced pair for low frequency estimates of source distance and angle, and a narrowly spaced pair for high frequency estimates of distance and angle . In one embodiment only three transducer elements are used: widely spaced Tl and T2 for low frequencies and narrowly spaced Tl and T3 for high frequencies .
[000115] We will now turn to Figure 13. Many of the blocks in the Figure 13 are similar to blocks shown in Figure 7. Signals from each of transducers 64, 66 and 68 pass through conventional microphone preamps 70, 72 and 74. Each transducer is preferably an omni-directional microphone element . Note that the spacing between transducers 64 and 66 is smaller than the spacing between transducers 64 and 68. The three signal streams are then each converted from analog form to digital form by an analog-to-digital converter 76.
[000116] Each of the three signal streams receives standard block processing windowing at block 78 and is converted from the time domain to the frequency domain at FFT block 80. High frequency bins above a pre-defined frequency from the signal of transducer 66 are selected out at block 82. In this embodiment the pre-defined frequency is 4 kHz. Low frequency bins at or below 4 kHz from the signal of transducer 68 are selected out at block 84. The high frequency bins from block 82 are combined with the low frequency bins from block 84 at a block 86 in order to create a full complement of frequency bins. It should be noted that this band splitting can alternatively be done in the analog domain rather than the digital domain.
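The bin selection and recombination at blocks 82, 84 and 86 amounts to choosing, per bin, which transducer's spectrum to use. A minimal sketch follows; the function and argument names are hypothetical, while the 4 kHz crossover and the "above" versus "at or below" split follow the text.

```python
def combine_bins(narrow_bins, wide_bins, bin_freqs_hz, crossover_hz=4000.0):
    """Build a full set of bins from two spectra.

    narrow_bins -- bins from the narrowly spaced transducer (66)
    wide_bins   -- bins from the widely spaced transducer (68)
    Bins above the crossover come from the narrow spacing; bins at or
    below it come from the wide spacing.
    """
    return [hi if f > crossover_hz else lo
            for hi, lo, f in zip(narrow_bins, wide_bins, bin_freqs_hz)]

# Three neighbouring bins around the crossover (62.5 Hz bin spacing).
freqs = [3937.5, 4000.0, 4062.5]
combined = combine_bins(["n0", "n1", "n2"], ["w0", "w1", "w2"], freqs)
```

In this toy example the first two bins are taken from the widely spaced transducer and the third from the narrowly spaced one.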
[000117] The remainder of the signal processing is substantially the same as for the embodiment in Figure 7 and so will not be described in detail. The ratio of the signal from transducer 64 and the combined low frequency and high frequency signals out of block 86 is calculated. The quotient is processed as described with reference to Figure 7. The calculated gain is applied to the signal from transducer 64, and the resulting signal is applied to standard inverse FFT, windowing, and overlap-and-sum blocks before being converted back to an analog signal by a digital-to-analog converter. In one embodiment, the analog signal is then sent to a conventional amplifier 88 and speaker 90 of a sound reinforcement system. This approach avoids the problem of the 2π uncertainty.
[000118] Turning to Figure 14, another embodiment will be described which avoids the problem of the 2π uncertainty. The front end of this embodiment is substantially the same as in Figure 13 through FFT block 80. At this point the ratio of the signals from transducers (microphones) 64 and 68 (widely spaced) is calculated at divider 92 and the magnitude difference in dB is determined at block 94. The ratio of the signals from transducers 64 and 66 (narrowly spaced) is calculated at divider 96 and the phase difference is determined at block 98. The phase is divided by the center frequency of each frequency bin at a divider 100 to determine the time delay. The remainder of the signal processing is substantially the same as in Figure 13.
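The two per-bin quantities of Figure 14 can be sketched as below. The sketch assumes that dividing the phase by the "center frequency" means dividing by the angular frequency 2πf so that the result is in seconds, and that a positive delay means the second signal lags the first. The function names are hypothetical, and bins with zero magnitude are not handled.

```python
import cmath
import math


def magnitude_difference_db(x_ref, x_far):
    """Level difference in dB between two complex bin values (blocks 92 and 94)."""
    return 20.0 * math.log10(abs(x_ref) / abs(x_far))


def time_delay_seconds(x_ref, x_near, center_freq_hz):
    """Phase of the ratio divided by the angular center frequency (blocks 96 to 100)."""
    phase = cmath.phase(x_ref / x_near)  # radians, in (-pi, pi]
    return phase / (2.0 * math.pi * center_freq_hz)


# Illustrative 1 kHz bin in which the second signal lags by 30 microseconds.
f_hz = 1000.0
x64 = 1.0 + 0.0j
x66 = cmath.exp(-1j * 2.0 * math.pi * f_hz * 30e-6)
delay = time_delay_seconds(x64, x66, f_hz)
```

Because the phase is only known within (-π, π], the delay estimate is unambiguous only while the true phase difference stays inside that range, which is why the narrowly spaced pair is used for the time delay.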
[000119] In a still further embodiment based on Figure 14, the magnitude difference in dB is determined the same way as in that Figure. However, the ratio of the signals from transducers 64 and 66 (narrowly spaced) is calculated at a divider for low frequency bins (e.g. at or below 4kHz) and the phase difference is determined. The phase is divided by the center frequency of each low frequency bin to determine the time delay. Further, the ratio of the signals from transducers 64 and 68 (widely spaced) is calculated at a divider for high frequency bins (e.g. above 4kHz) and the phase difference is determined. The phase is divided by the center frequency of each high frequency bin to determine the time delay.
[000120] With reference to Figures 15a and b, there is another embodiment that avoids the need for a third transducer. For transducer separations of about 30-35mm, we are able to estimate the source location up to about 5kHz. While frequencies above 5kHz are important for high quality reproduction of music and speech and so can't be discarded, few acoustic sources generate energy only above 5kHz. Generally, sound sources also generate energy below 5kHz.
[000121] We can take advantage of this fact by not bothering to estimate source position above 5kHz. Instead, if acoustic energy is sensed below 5kHz that is within the acceptance window of the microphone, then energy above 5kHz is also allowed to pass, making the assumption that it is coming from the same source.
[000122] One method of achieving this goal is to use the instantaneous gains predicted for the frequency bins located in the octave between 2.5 and 5kHz for example, and to apply those same gains to the frequency bins one and two octaves higher, that is, for the bins between 5 and 10kHz, and the bins between 10 and 20kHz. This approach preserves any harmonic structure that may exist in the audio signal. Other initial octaves, such as 2-4kHz, can be used as long as they are commensurate with transducer spacing.
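One way to picture this remapping is sketched below. The sketch assumes uniformly spaced bins starting at 0 Hz and looks up, for each bin above 5kHz, the nearest bin at one half or one quarter of its frequency. That lookup rule and the function name `extend_gains` are assumptions made for illustration and are not taken from the figures.

```python
def extend_gains(gains, bin_freqs_hz, top_hz=5000.0):
    """Reuse gains from the octave below top_hz for the two octaves above it."""
    out = list(gains)
    bin_width = bin_freqs_hz[1] - bin_freqs_hz[0]  # uniform spacing assumed
    for k, freq in enumerate(bin_freqs_hz):
        if freq <= top_hz:
            continue  # gains at or below top_hz are passed through unchanged
        src = freq
        while src > top_hz:
            src /= 2.0  # drop one octave at a time
        out[k] = gains[int(src / bin_width + 0.5)]  # nearest source bin
    return out


# Hypothetical 1 kHz-wide bins from 0 to 20 kHz; only bins 0..5 carry estimates.
freqs = [1000.0 * k for k in range(21)]
gains = [k / 10.0 for k in range(6)] + [0.0] * 15
extended = extend_gains(gains, freqs)
```

With these example bins, the 6kHz and 12kHz bins both reuse the gain of the 3kHz bin, so a harmonic series keeps a consistent gain across octaves.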
[000123] As shown in Figures 15a and b, the signal processing is substantially the same as in Figure 7 except for "compare threshold" block 34 and its inputs. This difference will be described below. In Figure 15a, the gain is calculated up to 5kHz based on the estimated source position. Above 5kHz, it is difficult to get a reliable source location estimate, because of the 2π uncertainty in phase described above. Instead, as shown in Figure 15b, the gain in the octave from 2.5 to 5kHz is repeated for frequency bins spanning the octave 5 to 10kHz, and again for frequency bins spanning the octave 10 to 20kHz.
[000124] Implementation of this embodiment will be described with reference to Figure 16A, which replaces the block 34 marked "compare threshold" in Figure 7. The magnitude and time delay ratios out of block 28 and divider 32 (Figure 7) are passed through respective non-linear blocks 108 and 110 (discussed in further detail below). Blocks 108 and 110 work independently for each frequency bin and for each block of audio data, and create the acceptance window for the microphone system. In this example only one threshold is used for time delay and only one threshold is used for magnitude difference.
[000125] The two calculated gains out of blocks 108 and 110, based on magnitude and time delay, are summed at a summer 116. The reason for summing the gains will be described below. The summed gain for frequencies below 5kHz is passed through at a block 118. The gain for frequency bins between 2.5 and 5kHz is selected out at a block 120 and remapped (applied) into the frequency bins for 5 to 10kHz at a block 122 and for 10 to 20kHz at a block 124 (as discussed above with respect to Figs. 15a and b). The frequency bins for each of these three regions are combined at a block 126 to make a single full bandwidth complement of frequency bins. The output "A" of block 126 is passed on to further signal processing described in Figure 16B. Good high frequency performance is allowed with two relatively widely spaced transducer elements.
[000126] Turning now to Figure 16B, another important feature of this example will be described. The respective magnitudes of the T1 signal 100 and of the T2 signal 102 in dB for each frequency bin on a block by block basis are passed through respective identical non-linear blocks 128 and 130 (discussed below in further detail). These blocks create low gain terms for frequency bins in which the microphones have a low signal level. When the signal level in a frequency bin is low for either microphone, the gain is reduced.
[000127] The two transducer level gain terms are summed with each other at a summer 134. The output of summer 134 is added at a summer 136 to the gain term "A" (from block 126 of Figure 16A) derived from the sum of the magnitude gain term and the time gain term. The terms are summed at summers 134 and 136, rather than multiplied, to reduce the effects of errors in estimating the location of the source. If all four gain terms are high (i.e. 1) in a particular frequency bin, then that frequency is passed through with unity (1) gain. If any one of the gain terms falls (i.e. is less than 1), the gain is merely reduced, rather than shutting down the gain of that frequency bin completely. The gain is reduced sufficiently so that the microphone performs its intended function of rejecting sources outside of the acceptance window in order to reduce feedback and bleed-through. However, the gain reduction is not so large as to create audible artifacts should the estimate of one of the parameters be erroneous. The gain in that frequency bin is turned down partially, rather than fully, making the effects of estimation errors significantly less audible.
[000128] The gain term output by summer 136, which has been calculated in dB, is converted to a linear gain at a block 138, and applied to the signal from transducer 12, as shown in Figure 7. In this embodiment and other embodiments discussed in this application audible artifacts due to poor estimates of the source location are reduced.
[000129] Details of non-linear blocks 108, 110, 128 and 130 will now be discussed with reference to Figures 16C-E. This example assumes a spacing between the transducers 12 and 14 of about 35mm. The values provided below will change if the transducer spacing changes to something other than 35mm. Each of blocks 108, 110, 128 and 130, rather than being only full-on or full-off (e.g. gain of 1 or 0), has a short transition region, which fades acoustic sources across a threshold as they pass in and out of the acceptance window. Figure 16E shows that, regarding block 110, for time delays between 28-41 microseconds the output gain rises from 0 to 1. For time delays less than 28 microseconds the gain is 0 and for time delays greater than 41 microseconds the gain is 1. Figure 16D shows that, regarding block 108, for magnitude differences between 2-3dB the output gain rises from 0 to 1. Below 2dB the gain is 0 and above 3dB the gain is 1. Figure 16C shows a gain term that is applied by blocks 128 and 130. In this example, for signal levels below -60dB a 0 gain is applied. For signal levels from -60dB to -50dB the gain increases from 0 to 1. For a transducer signal level above -50dB the gain is 1.
[000130] The microphone systems described above can be used in a cell phone or speaker phone. Such a cell phone or speaker phone would also include an acoustic driver for transmitting sound to the user's ear. The output of the signal processor would be used to drive a second acoustic driver at a remote location to produce sound (e.g. the second acoustic driver could be located in another cell phone or speaker phone 500 miles away).
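The transition regions of Figures 16C-E can be sketched with a single ramp function. The breakpoints below are the ones quoted in the paragraph above for a spacing of about 35mm. The linear shape of the transition and all function names are assumptions, since the text only states that the gain rises from 0 to 1.

```python
def ramp(value, low, high):
    """Return 0 at or below low, 1 at or above high, and a linear fade between."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def time_delay_gain(delay_us):
    """Block 110 (Figure 16E): fade between 28 and 41 microseconds."""
    return ramp(delay_us, 28.0, 41.0)


def magnitude_gain(diff_db):
    """Block 108 (Figure 16D): fade between 2 and 3 dB."""
    return ramp(diff_db, 2.0, 3.0)


def level_gain(level_db):
    """Blocks 128 and 130 (Figure 16C): fade between -60 and -50 dB."""
    return ramp(level_db, -60.0, -50.0)
```

A source whose parameters sit in the middle of a transition region receives a partial gain term rather than an abrupt on/off decision, which is the fading behavior described above.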
[000131] A still further embodiment of the invention will now be described. This embodiment relates to a prior art boom microphone that is used to pick up the human voice with a microphone located at the end of a boom worn on the user's head. Typical applications are communications microphones, such as those used by pilots, or sound reinforcement microphones used by some popular singers in concert. These microphones are normally used when one desires a hands-free microphone located close to the mouth in order to reduce the pickup of sounds from other sources. However, the boom across the face can be unsightly and awkward. Another application of a boom microphone is for a cell phone headset. These headsets have an earpiece worn on or in the user's ear, with a microphone boom suspended from the earpiece. This microphone may be located in front of a user's mouth or dangling from a cord, either of which can be annoying.
[000132] An earpiece using the new directional technology of this application is described with reference to Figure 17. An earphone 150 includes an earpiece 152 which is inserted into the ear. Alternatively, the earpiece can be placed on or around the ear. The earphone includes an internal speaker (not shown) for creating sound which passes through the earpiece. A wire bundle 153 passes DC power from, for example, a cell phone clipped to a user's belt to the earphone 150. The wire bundle also passes audio information into the earphone 150 to be reproduced by the internal speaker. As an alternative, wire bundle 153 is eliminated, the earpiece 152 includes a battery to supply electrical power, and information is passed to and from the earpiece 152 wirelessly. Further included in the earphone is a microphone 154 that includes two or three transducers (not shown) as described above. Alternatively, the microphone 154 can be located separately from the earpiece anywhere in the vicinity of the head (e.g. on a headband of a headset). The two transducers are aligned along a direction X so as to be aimed in the general direction of the user's mouth. The transducers may be implemented using MEMS technology to provide a compact, light microphone 154. The wire bundle 153 passes signals from the transducers back to the cell phone where signal processing described above is applied to these signals. This arrangement eliminates the need for a boom. Thus, the earphone unit is smaller, lighter weight, and less unsightly. Using the signal processing disclosed above (e.g. in Figure 7), the microphone can be made to respond preferentially to sound coming from the user's mouth, while rejecting sound from other sources (e.g. the speaker in the earphone 150). In this way, the user gets the benefits of having a boom microphone without the need for the physical boom.
[000133] For previous embodiments described above, the general assumption was that of a substantially free field acoustic environment. However, near the head, the acoustic field from sources is modified by the head, and free-field conditions no longer hold. As a result, the acceptance thresholds are preferably changed from free field conditions.
[000134] At low frequencies, where the wavelength of sound is much larger than the head, the sound field is not greatly changed, and an acceptance threshold similar to free field may be used. At high frequencies, where the wavelength of sound is smaller than the head, the sound field is significantly changed by the head, and the acceptance thresholds must be changed accordingly.
[000135] In this kind of application, it is desirable for the thresholds to be a function of frequency. In one embodiment, a different threshold is used for every frequency bin for which the gain is calculated. In another embodiment, a small number of thresholds are applied to groups of frequency bins. These thresholds are determined empirically. During a calibration process, the magnitude and time delay differences in each frequency bin are continually recorded while a sound source radiating energy at all frequencies of interest is moved around the microphone. A high score is assigned to the magnitude and time difference pairs when the source is located in the desired acceptance zone and a low score when it is outside the acceptance zone. Alternatively, multiple sound sources at various locations can be turned on and off by the controller doing the scoring and tabulating.
[000136] Using well known statistical methods for minimizing error, the thresholds for each frequency bin are calculated using the dB difference and time (or phase) difference as the independent variables, and the score as the dependent variable. This approach compensates for any difference in frequency response that may exist between the two microphone elements that make up any given unit.
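The text does not name the statistical method used. As a stand-in for illustration only, the sketch below picks, for one frequency bin and one variable (the dB difference), the cut that misclassifies the fewest calibration samples. It assumes scores of 1 (inside the acceptance zone) and 0 (outside), and that larger dB differences correspond to accepted sources. The function name `fit_threshold` is hypothetical.

```python
def fit_threshold(values, scores):
    """Return the cut that best separates high-score (1) from low-score (0) samples.

    A sample is predicted 'accept' when its value is at or above the cut.
    """
    pairs = sorted(zip(values, scores))
    # Candidate cuts: the smallest value, then midpoints between neighbours.
    candidates = [pairs[0][0]] + [
        (a[0] + b[0]) / 2.0 for a, b in zip(pairs, pairs[1:])
    ]
    best_cut, best_errors = None, None
    for cut in candidates:
        errors = sum((v >= cut) != bool(s) for v, s in pairs)
        if best_errors is None or errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut


# Hypothetical calibration data for one bin: dB differences and their scores.
db_diffs = [0.5, 1.0, 1.5, 3.5, 4.0, 5.0]
scores = [0, 0, 0, 1, 1, 1]
threshold = fit_threshold(db_diffs, scores)
```

The same search could be repeated for the time delay and for every frequency bin or group of bins. A fit that treats both variables jointly, as the paragraph above describes, would need a richer model than this one-dimensional sketch.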
[000137] An issue to consider is that microphone elements and analog electronics have tolerances, so the magnitude and phase response of two microphones making up a pair may not be well matched. In addition, the acoustical environment in which the microphone is placed alters the magnitude and time delay relationships for sound sources in the desired acceptance window.
[000138] In order to address these issues an embodiment is provided in which the microphone learns what the appropriate thresholds are, given the intended use of the microphone, and the acoustical environment. In the intended acoustic environment with a relatively low level of background noise, a user switches the system to a learning mode and moves a small sound source around in a region that the microphone should accept sound sources when operating. The microphone system calculates the magnitude and time delay differences in all frequency bands during the training. When the data gathering is complete, the system calculates the best fit of the data, using well known statistical methods and calculates a set of thresholds for each frequency bin or groups of frequency bins. This approach assists in attaining an increased number of correct decisions about sound source location made for sound sources located in a desired acceptance zone.
[000139] A sound source used for training could be a small loudspeaker playing a test signal that contains energy in all frequency bands of interest during the training period, either simultaneously, or sequentially. If the microphone is part of a live music system, the sound source can be one of the speakers used as a part of the live music reinforcement system. The sound source could also be a mechanical device that creates noise.
[000140] Alternately, a musician can use their own voice or instrument as the training source. During a training period, the musician sings or plays their instrument, positioning the mouth or instrument in various locations within the acceptance zone. Again, the microphone system calculates magnitude and time delay differences in all frequency bands, but rejects any bands for which there is little energy. The thresholds are calculated using best fit approaches as before, and bands which have poor information are filled in by interpolation from nearby frequency bands.
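Filling in bands that had too little energy can be sketched as a linear interpolation between the nearest trained bands. The use of `None` to mark a rejected band, the choice of linear interpolation, and the copying of the nearest value at the edges are assumptions of this sketch, since the text only says that such bands are filled in from nearby frequency bands.

```python
def fill_missing(thresholds):
    """Linearly interpolate None entries from the nearest valid neighbours."""
    known = [i for i, t in enumerate(thresholds) if t is not None]
    if not known:
        raise ValueError("no band has a usable threshold")
    filled = list(thresholds)
    for i, t in enumerate(thresholds):
        if t is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = thresholds[right]  # no band below: copy nearest above
        elif right is None:
            filled[i] = thresholds[left]   # no band above: copy nearest below
        else:
            w = (i - left) / (right - left)
            filled[i] = thresholds[left] + w * (thresholds[right] - thresholds[left])
    return filled


# Hypothetical per-band thresholds; None marks bands rejected for low energy.
trained = [None, 2.0, None, None, 5.0, None]
completed = fill_missing(trained)
```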
[000141] Once the system has been trained, the user switches the microphone back to a normal operating mode, and it operates using the newly calculated thresholds. Further, once a microphone system is trained to be approximately correct, a check of the microphone training is done periodically throughout the course of a performance (or other use), using the music of the performance as a test signal.
[000142] Figure 17B discloses a cell phone 174 which incorporates two microphone elements as described herein. These two elements are located toward a bottom end 176 of the cell phone 174 and are aligned in a direction Y that extends perpendicular to the surface of the paper on which Figure 17B lies. Accordingly, the microphone elements are aimed in the general direction of the cell phone user's mouth.
[000143] Referring to Figures 18A and B, two graphs are shown which plot frequency versus magnitude threshold (Fig. 18A) and time delay threshold (Fig. 18B) for a "boomless" boom mike. In this embodiment a microphone with two transducers is attached to one of the ear cups of a headset such as the QC2® Headset available from Bose® Corporation. This headset was placed on the head of a mannequin which simulates the human head, torso, and voice. Test signals were played through the mannequin's mouth, and the magnitude and time differences between the two microphone elements were acquired and given a high score, since these signals represent the desired signal in a communications microphone. In addition, test signals were played through another source which was moved to a number of locations around the mannequin's head. Magnitude and time differences were acquired and given a low score, since these represent undesired jammers. A best fit algorithm was applied to the data in each frequency bin. The calculated magnitude and time delay thresholds for each bin are shown in the plots of Figures 18A and B. In a practical application, these thresholds could be applied to each bin, as calculated. In order to save memory, it is possible to smooth these plots, and use a small number of thresholds on groups of frequency bins. Alternatively a function is fit to the smoothed curve and used to calculate the gains. These thresholds are applied in, for example, block 34 of Figure 7.
[000144] In another embodiment of the invention, slew rate limiting is used in the signal processing. This embodiment is similar to the embodiment of Figure 7 except that slew rate limiting is used in block 40. Slew rate limiting is a non-linear method for smoothing noisy signals. When applied to the embodiments described above, the method prevents the gain control signal (e.g. coming out of block 40 in Fig. 7) from changing too fast, which could cause audible artifacts. For each frequency bin, the gain control signal is not permitted to change more than a specified value from one block to the next. The value may be different for increasing gain than for decreasing gain. Thus, the gain actually applied to the audio signal (e.g. from transducer 12 in Fig. 7) from the output of the slew rate limiter (in block 40 of Fig. 7) may lag behind the calculated gain.
[000145] Referring to Figure 19, a dotted line 170 shows the calculated gain for a particular frequency bin plotted versus time. A solid line 172 shows the slew rate limited gain that results after slew rate limiting is applied. In this example, the gain is not permitted to rise faster than 100dB/sec, and not permitted to fall faster than 200dB/sec. Selection of the slew rate is determined by competing factors. The slew rate should be as fast as possible to maximize rejection of undesired acoustic sources. However, to minimize audible artifacts, the slew rate should be as slow as possible. The gain can be slewed down more slowly than up based on psychoacoustic factors without problems.
[000146] Thus between t=0.1 and 0.3 seconds, the applied gain (which has been slew rate limited) lags behind the calculated gain because the calculated gain is rising faster than the threshold. Between t=0.5 and 0.6, the calculated and applied gains are the same, since the calculated gain is falling at a rate less than the threshold. Beyond t=0.6, the calculated gain is falling faster than threshold, and the applied gain lags once again until it can catch up.
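The slew rate limiting described for Figure 19 can be sketched for a single frequency bin as follows. The limits of 100dB/sec rising and 200dB/sec falling are the ones quoted above. The 10 ms block duration, the choice to pass the first block through unchanged, and the function name `slew_limit` are assumptions made for the example.

```python
def slew_limit(calculated_db, block_seconds,
               rise_db_per_s=100.0, fall_db_per_s=200.0):
    """Limit the block-to-block change of a gain control signal given in dB."""
    max_up = rise_db_per_s * block_seconds
    max_down = fall_db_per_s * block_seconds
    applied = [calculated_db[0]]  # first block is passed through as-is
    for target in calculated_db[1:]:
        step = target - applied[-1]
        step = max(-max_down, min(max_up, step))  # clamp to the allowed change
        applied.append(applied[-1] + step)
    return applied


# Hypothetical calculated gains (dB) for one bin over five 10 ms blocks.
calculated = [-20.0, -10.0, -10.0, -19.5, -30.0]
applied = slew_limit(calculated, block_seconds=0.01)
```

In this example the applied gain lags while the calculated gain jumps up by 10 dB, tracks it exactly during the slow 1.5 dB drop, and lags again on the final 10.5 dB drop.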
[000147] Another example of using more than two transducers is to create multiple transducer pairs whose sound source distance and angle estimates can be compared. In a reverberant sound field, the magnitude and phase relationships between the sound pressure measured at any two points due to a source can differ substantially from those same two points measured in a free field. As a result, for a source in one particular location in a room, and a pair of transducers in another particular location in the room, the magnitude and phase relationship at one frequency can fall within the acceptance window, even though the physical location of the sound source is outside the acceptance window. In this case, the distance and angle estimate is faulty. However, in a typical room, the distance and angle estimate for that same frequency made just a short distance away is likely to be correct. A microphone system using multiple pairs of microphone elements can make multiple simultaneous estimates of sound source distance and angle for each frequency bin, and reject those estimates that do not agree with the estimates from the majority of other pairs.
[000148] An example of the system described in the previous paragraph will be discussed with reference to Figure 20. A microphone system 180 includes four transducers 182, 184, 186 and 188 arranged in a linear array. The distance between each adjacent pair of transducers is substantially the same. This array has three pairs of closely spaced transducers 182-184/184-186/186-188, two pairs of moderately spaced transducers 182-186/184-188 and one pair of distantly spaced transducers 182-188. The output signals for each of these six pairs of transducers are processed, for example, as described above with reference to Figure 7 (up to box 34) in a signal processor 190. An accept or reject decision is made for each pair for each frequency. In other words it is determined for each transducer pair whether the magnitude relationship (e.g. ratio) falls on one side or the other side of a threshold value. The accept or reject decision for each pair can be weighted in a box 194 based on various criteria known to those skilled in the art. For example, the widely spaced transducer pair 182-188 can be given little weight at high frequencies. The weighted accepts are combined and compared to the combined weighted rejects in a box 196 to make a final accept or reject decision for that frequency bin. In other words, it is decided whether an overall magnitude relationship falls on one side or the other side of the threshold value. Based on this decision, gain is determined at a box 198 and this gain is applied to the output signal of one of the transducers as in Figure 7. This system makes fewer false positive errors in accepting a sound source in a reverberant room.
[000149] In another example described with reference to Figure 21, a microphone system 200 includes four transducers 202, 204, 206 and 208 arranged at the vertices of an imaginary four-sided polygon. In this example the polygon is in the shape of a square, but the polygon can be in a shape other than a square (e.g. a rectangle, parallelogram, etc.). Additionally, more than four transducers can be used at the vertices of a five or more sided polygon. This system has two forward facing pairs 202-206/204-208 facing a forward direction "A", two sideways facing pairs 202-204/206-208 facing sides B and C, and two diagonally facing pairs 204-206/202-208. The output signals for each pair of transducers are processed in a box 210 and weighted in a box 212 as described in the previous paragraph. A final accept or reject decision is made, as described above in a box 214, and a corresponding gain is selected for the frequency of interest at a box 216. This example allows the microphone system 200 to determine sound source distance even for sound sources 90° off axis located, for example, at locations B and/or C. Of course, more than four transducers can be used. For example, five transducers forming ten pairs of transducers can be used. In general, using more transducers results in a more accurate determination of sound source distance and angle.
[000150] In a further embodiment, one of the four transducers (e.g. omni-directional microphones) 202, 204, 206 and 208 is eliminated. For example, if transducer 202 is eliminated, we will have transducers 204 and 208 which can be connected by an imaginary straight line that extends to infinity in either direction, and transducer 206 which is located away from this line. Such an arrangement results in three pairs of transducers 204-208, 206-208 and 204-206 which can be used to determine sound source distance and angle.
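The pair enumeration and the weighted accept/reject comparison can be sketched as below for a single frequency bin. The weight values, the rule that a tie counts as a reject, and the function names are assumptions for illustration, since the text leaves the weighting criteria to those skilled in the art.

```python
from itertools import combinations


def transducer_pairs(transducers):
    """All unordered pairs: four transducers give six, five give ten."""
    return list(combinations(transducers, 2))


def final_decision(votes, weights):
    """Compare the combined weighted accepts with the combined weighted rejects.

    votes maps a pair to True (accept) or False (reject) for one frequency bin;
    weights maps a pair to a non-negative weight.
    """
    accept = sum(weights[p] for p, v in votes.items() if v)
    reject = sum(weights[p] for p, v in votes.items() if not v)
    return accept > reject  # a tie is treated as a reject in this sketch


# Hypothetical high-frequency bin for the four-element array of Figure 20.
pairs = transducer_pairs([182, 184, 186, 188])
weights = {p: 1.0 for p in pairs}
weights[(182, 188)] = 0.1  # widely spaced pair gets little weight here
votes = {p: True for p in pairs}
votes[(182, 188)] = False  # one faulty estimate is outvoted
accepted = final_decision(votes, weights)
```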
[000151] The invention has been described with reference to the embodiments described above. However, it will be appreciated that variations and modifications can be effected by a person of ordinary skill in the art without departing from the scope of the invention.

Claims

1. A method of discriminating between sound sources, comprising the steps of: transforming data, collected by transducers which react to a characteristic of an acoustic wave, into signals for each transducer location; separating the signals into a plurality of frequency bands for each location; for each band determining a relationship of the magnitudes of the signals for the locations; for each band determining from the signals a time delay between when an acoustic wave is detected by a first transducer and when this wave is detected by a second transducer; and causing a relative gain change between those frequency bands whose magnitude relationship and time delay fall on one side of respective threshold values for magnitude relationship and time delay, and those frequency bands whose (a) magnitude relationship falls on the other side of its threshold value, (b) time delay falls on the other side of its threshold value, or (c) magnitude relationship and time delay both fall on the other side of their respective threshold values.
2. The method of claim 1, further comprising the step of: providing an adjustable threshold value for the magnitude relationship.
3. The method of claim 1, further comprising the step of: providing an adjustable threshold value for the time delay.
4. The method of claim 3, further comprising the step of: providing an adjustable threshold value for the magnitude relationship.
5. The method of claim 1, wherein the causing step fades the relative gain change between a low gain and a high gain.
6. The method of claim 5, wherein the fade of the relative gain change is done across the magnitude relationship threshold.
7. The method of claim 5, wherein the fade of the relative gain change is done across the time delay threshold.
8. The method of claim 5, wherein the fade of the relative gain change is done across a certain magnitude level for an output signal of one or more of the transducers.
9. The method of claim 1, wherein the causing of a relative gain change is effected by (a) a gain term based on the magnitude relationship and (b) a gain term based on the time delay.
10. The method of claim 9, wherein the causing of a relative gain change is further effected by a gain term based on a magnitude of an output signal from one or more of the transducers.
11. The method of claim 1, wherein a group of gain terms derived for a first group of frequency bands is also applied to a second group of frequency bands.
12. The method of claim 11, wherein the frequency bands of the first group are lower than the frequency bands of the second group.
13. The method of claim 11, wherein the group of gain terms derived for the first group of frequency bands is also applied to a third group of frequency bands.
14. The method of claim 13, wherein the frequency bands of the first group are lower than the frequency bands of the third group.
15. The method of claim 14, wherein the frequency bands of the first group are lower than the frequency bands of the second group.
16. The method of claim 1, wherein for each frequency band there is an assigned threshold value for magnitude relationship and an assigned threshold value for time delay.
17. A personal communication device, comprising: two transducers which react to a characteristic of an acoustic wave to capture data representative of the characteristic, the transducers being separated by a distance of about 70mm or less; and a signal processor for processing said data to determine (a) which data represents one or more sound sources located less than a certain distance from the transducers, and (b) which data represents one or more sound sources located more than the certain distance from the transducers, the signal processor providing a greater emphasis of data representing the sound source(s) in one of (a) or (b) above over data representing the sound source(s) in the other of (a) or (b) above, such that sound sources are discriminated from each other based on their distance from the transducers.
18. The device of claim 17, wherein the signal processor provides a greater emphasis of data representing the sound source(s) in (a) over data representing the sound source(s) in (b).
19. The device of claim 17, wherein the signal processor converts the data into output signals.
20. The device of claim 19, wherein the output signals are used to drive a second acoustic driver remote from the device to produce sound remote from the device.
21. The device of claim 17, wherein the characteristic is a local sound pressure, its first-order gradient, higher-order gradients, or combinations thereof.
22. The device of claim 17, wherein the transducers are separated by a distance of no less than about 250 microns.
23. The device of claim 17, wherein the transducers are separated by a distance of between about 20mm to about 50mm.
24. The device of claim 17, wherein the transducers are separated by a distance of between about 25mm to about 45mm.
25. The device of claim 17, wherein the transducers are separated by a distance of about 35mm.
26. The device of claim 17, wherein the distance between the transducers is measured from a center of a diaphragm for each transducer.
27. The device of claim 17, wherein the device is a cell phone.
28. The device of claim 17, wherein the device is a speaker phone.
PCT/US2008/064056 2007-06-21 2008-05-19 Sound discrimination method and apparatus WO2008156941A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN2008800209202A CN101682809B (en) 2007-06-21 2008-05-19 Sound discrimination method and apparatus
JP2010513294A JP4965707B2 (en) 2007-06-21 2008-05-19 Sound identification method and apparatus
EP08755825A EP2158788A1 (en) 2007-06-21 2008-05-19 Sound discrimination method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/766,622 2007-06-21
US11/766,622 US8767975B2 (en) 2007-06-21 2007-06-21 Sound discrimination method and apparatus

Publications (1)

Publication Number Publication Date
WO2008156941A1 true WO2008156941A1 (en) 2008-12-24

Family

ID=39643839

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/064056 WO2008156941A1 (en) 2007-06-21 2008-05-19 Sound discrimination method and apparatus

Country Status (5)

Country Link
US (2) US8767975B2 (en)
EP (1) EP2158788A1 (en)
JP (2) JP4965707B2 (en)
CN (1) CN101682809B (en)
WO (1) WO2008156941A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103366756A (en) * 2012-03-28 2013-10-23 联想(北京)有限公司 Sound signal reception method and device
US20150332674A1 (en) * 2011-12-28 2015-11-19 Fuji Xerox Co., Ltd. Voice analyzer and voice analysis system

Families Citing this family (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101154382A (en) * 2006-09-29 2008-04-02 松下电器产业株式会社 Method and system for detecting wind noise
US8767975B2 (en) * 2007-06-21 2014-07-01 Bose Corporation Sound discrimination method and apparatus
WO2009019748A1 (en) * 2007-08-03 2009-02-12 Fujitsu Limited Sound receiving device, directional characteristic deriving method, directional characteristic deriving apparatus and computer program
EP2202531A4 (en) * 2007-10-01 2012-12-26 Panasonic Corp Sound source direction detector
US8611554B2 (en) * 2008-04-22 2013-12-17 Bose Corporation Hearing assistance apparatus
US20090323985A1 (en) * 2008-06-30 2009-12-31 Qualcomm Incorporated System and method of controlling power consumption in response to volume control
US8218397B2 (en) * 2008-10-24 2012-07-10 Qualcomm Incorporated Audio source proximity estimation using sensor array for noise reduction
US9008321B2 (en) * 2009-06-08 2015-04-14 Nokia Corporation Audio processing
EP2271134A1 (en) * 2009-07-02 2011-01-05 Nxp B.V. Proximity sensor comprising an acoustic transducer for receiving sound signals in the human audible range and for emitting and receiving ultrasonic signals.
US9986347B2 (en) 2009-09-29 2018-05-29 Starkey Laboratories, Inc. Radio frequency MEMS devices for improved wireless performance for hearing assistance devices
US20110075870A1 (en) * 2009-09-29 2011-03-31 Starkey Laboratories, Inc. Radio with mems device for hearing assistance devices
TWI396190B (en) * 2009-11-03 2013-05-11 Ind Tech Res Inst Noise reduction system and noise reduction method
TWI415117B (en) * 2009-12-25 2013-11-11 Univ Nat Chiao Tung Dereverberation and noise reduction method for microphone array and apparatus using the same
JP5870476B2 (en) * 2010-08-04 2016-03-01 富士通株式会社 Noise estimation device, noise estimation method, and noise estimation program
US9078077B2 (en) 2010-10-21 2015-07-07 Bose Corporation Estimation of synthetic audio prototypes with frequency-based input signal decomposition
US8675881B2 (en) 2010-10-21 2014-03-18 Bose Corporation Estimation of synthetic audio prototypes
TWI419149B (en) * 2010-11-05 2013-12-11 Ind Tech Res Inst Systems and methods for suppressing noise
US8744091B2 (en) * 2010-11-12 2014-06-03 Apple Inc. Intelligibility control using ambient noise detection
US8983089B1 (en) 2011-11-28 2015-03-17 Rawles Llc Sound source localization using multiple microphone arrays
JP5867066B2 (en) * 2011-12-26 2016-02-24 富士ゼロックス株式会社 Speech analyzer
JP5834948B2 (en) * 2012-01-24 2015-12-24 富士通株式会社 Reverberation suppression apparatus, reverberation suppression method, and computer program for reverberation suppression
US9282405B2 (en) * 2012-04-24 2016-03-08 Polycom, Inc. Automatic microphone muting of undesired noises by microphone arrays
US8666090B1 (en) * 2013-02-26 2014-03-04 Full Code Audio LLC Microphone modeling system and method
US10306389B2 (en) 2013-03-13 2019-05-28 Kopin Corporation Head wearable acoustic system with noise canceling microphone geometry apparatuses and methods
CN105229737B (en) * 2013-03-13 2019-05-17 寇平公司 Noise cancelling microphone device
US9257952B2 (en) 2013-03-13 2016-02-09 Kopin Corporation Apparatuses and methods for multi-channel signal compression during desired voice activity detection
US9197930B2 (en) 2013-03-15 2015-11-24 The Nielsen Company (Us), Llc Methods and apparatus to detect spillover in an audience monitoring system
US10154330B2 (en) * 2013-07-03 2018-12-11 Harman International Industries, Incorporated Gradient micro-electro-mechanical systems (MEMS) microphone
US9473852B2 (en) * 2013-07-12 2016-10-18 Cochlear Limited Pre-processing of a channelized music signal
DE112014003443B4 (en) * 2013-07-26 2016-12-29 Analog Devices, Inc. microphone calibration
US9837066B2 (en) 2013-07-28 2017-12-05 Light Speed Aviation, Inc. System and method for adaptive active noise reduction
US9241223B2 (en) * 2014-01-31 2016-01-19 Malaspina Labs (Barbados) Inc. Directional filtering of audible signals
JP6260504B2 (en) * 2014-02-27 2018-01-17 株式会社Jvcケンウッド Audio signal processing apparatus, audio signal processing method, and audio signal processing program
CA2953619A1 (en) 2014-06-05 2015-12-10 Interdev Technologies Inc. Systems and methods of interpreting speech data
US20160007101A1 (en) * 2014-07-01 2016-01-07 Infineon Technologies Ag Sensor Device
CN104243388B (en) * 2014-09-25 2017-10-27 陈景竑 Acoustic communication system based on OFDM
DE112015005862T5 (en) * 2014-12-30 2017-11-02 Knowles Electronics, Llc Directed audio recording
US9813832B2 (en) * 2015-02-23 2017-11-07 Te Connectivity Corporation Mating assurance system and method
JP6657965B2 (en) * 2015-03-10 2020-03-04 株式会社Jvcケンウッド Audio signal processing device, audio signal processing method, and audio signal processing program
US9865278B2 (en) * 2015-03-10 2018-01-09 JVC Kenwood Corporation Audio signal processing device, audio signal processing method, and audio signal processing program
US9905216B2 (en) 2015-03-13 2018-02-27 Bose Corporation Voice sensing using multiple microphones
CN104868956B (en) * 2015-04-14 2017-12-26 陈景竑 Data communications method based on sound wave channel
US9407989B1 (en) 2015-06-30 2016-08-02 Arthur Woodrow Closed audio circuit
US9788109B2 (en) * 2015-09-09 2017-10-10 Microsoft Technology Licensing, Llc Microphone placement for sound source direction estimation
US11631421B2 (en) 2015-10-18 2023-04-18 Solos Technology Limited Apparatuses and methods for enhanced speech recognition in variable environments
US10215736B2 (en) * 2015-10-23 2019-02-26 International Business Machines Corporation Acoustic monitor for power transmission lines
US10554458B2 (en) * 2017-04-04 2020-02-04 Northeastern University Low-power frequency-shift keying (FSK) wireless transmitters
US10444336B2 (en) * 2017-07-06 2019-10-15 Bose Corporation Determining location/orientation of an audio device
EP3525482B1 (en) 2018-02-09 2023-07-12 Dolby Laboratories Licensing Corporation Microphone array for capturing audio sound field
CN108364642A (en) * 2018-02-22 2018-08-03 成都启英泰伦科技有限公司 A kind of sound source locking means
CN109361828B (en) * 2018-12-17 2021-02-12 北京达佳互联信息技术有限公司 Echo cancellation method and device, electronic equipment and storage medium
US11234073B1 (en) * 2019-07-05 2022-01-25 Facebook Technologies, Llc Selective active noise cancellation
US20240031618A1 (en) * 2020-12-22 2024-01-25 Alien Music Enterprise Inc. Management server
CN114624652B (en) * 2022-03-16 2022-09-30 浙江浙能技术研究院有限公司 Sound source positioning method under strong multipath interference condition

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6549630B1 (en) * 2000-02-04 2003-04-15 Plantronics, Inc. Signal expander with discrimination between close and distant acoustic source
GB2394589A (en) * 2002-10-25 2004-04-28 Motorola Inc Speech recognition device
EP1489596A1 (en) * 2003-06-17 2004-12-22 Sony Ericsson Mobile Communications AB Device and method for voice activity detection
US20050249361A1 (en) * 2004-05-05 2005-11-10 Deka Products Limited Partnership Selective shaping of communication signals
EP1600791A1 (en) * 2004-05-26 2005-11-30 Honda Research Institute Europe GmbH Sound source localization based on binaural signals

Family Cites Families (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB806261A (en) 1955-03-28 1958-12-23 Insecta Lab Ltd Improvements in or relating to film forming pesticidal compositions based on aminoplastic and oil-modified alkyd resins
US4066842A (en) 1977-04-27 1978-01-03 Bell Telephone Laboratories, Incorporated Method and apparatus for cancelling room reverberation and noise pickup
US4731847A (en) * 1982-04-26 1988-03-15 Texas Instruments Incorporated Electronic apparatus for simulating singing of song
US4485484A (en) 1982-10-28 1984-11-27 At&T Bell Laboratories Directable microphone system
AT383428B (en) 1984-03-22 1987-07-10 Goerike Rudolf EYEGLASSES TO IMPROVE NATURAL HEARING
US4653102A (en) 1985-11-05 1987-03-24 Position Orientation Systems Directional microphone system
US5181252A (en) 1987-12-28 1993-01-19 Bose Corporation High compliance headphone driving
JP2687613B2 (en) 1989-08-25 1997-12-08 ソニー株式会社 Microphone device
US5197098A (en) 1992-04-15 1993-03-23 Drapeau Raoul E Secure conferencing system
JP3254789B2 (en) 1993-02-05 2002-02-12 ソニー株式会社 Hearing aid
EP0707763B1 (en) 1993-07-07 2001-08-29 Picturetel Corporation Reduction of background noise for speech enhancement
US5651071A (en) 1993-09-17 1997-07-22 Audiologic, Inc. Noise reduction system for binaural hearing aid
US5479522A (en) 1993-09-17 1995-12-26 Audiologic, Inc. Binaural hearing aid
US5815582A (en) 1994-12-02 1998-09-29 Noise Cancellation Technologies, Inc. Active plus selective headset
JPH09212196A (en) 1996-01-31 1997-08-15 Nippon Telegr & Teleph Corp <Ntt> Noise suppressor
US5778082A (en) 1996-06-14 1998-07-07 Picturetel Corporation Method and apparatus for localization of an acoustic source
US6987856B1 (en) 1996-06-19 2006-01-17 Board Of Trustees Of The University Of Illinois Binaural signal processing techniques
US6978159B2 (en) 1996-06-19 2005-12-20 Board Of Trustees Of The University Of Illinois Binaural signal processing using multiple acoustic sensors and digital filtering
US6222927B1 (en) 1996-06-19 2001-04-24 The University Of Illinois Binaural signal processing system and method
US5901232A (en) 1996-09-03 1999-05-04 Gibbs; John Ho Sound system that determines the position of an external sound source and points a directional microphone/speaker towards it
DE19703228B4 (en) 1997-01-29 2006-08-03 Siemens Audiologische Technik Gmbh Method for amplifying input signals of a hearing aid and circuit for carrying out the method
US6137887A (en) 1997-09-16 2000-10-24 Shure Incorporated Directional microphone system
US6888945B2 (en) 1998-03-11 2005-05-03 Acentech, Inc. Personal sound masking system
JP2000059876A (en) 1998-08-13 2000-02-25 Sony Corp Sound device and headphone
US6594365B1 (en) 1998-11-18 2003-07-15 Tenneco Automotive Operating Company Inc. Acoustic system identification using acoustic masking
EP1017253B1 (en) 1998-12-30 2012-10-31 Siemens Corporation Blind source separation for hearing aids
US6704428B1 (en) 1999-03-05 2004-03-09 Michael Wurtz Automatic turn-on and turn-off control for battery-powered headsets
JP3362338B2 (en) 1999-03-18 2003-01-07 有限会社桜映サービス Directional receiving method
WO2001097558A2 (en) 2000-06-13 2001-12-20 Gn Resound Corporation Fixed polar-pattern-based adaptive directionality systems
AU2001290608A1 (en) * 2000-08-31 2002-03-13 Rytec Corporation Sensor and imaging system
JP3670562B2 (en) * 2000-09-05 2005-07-13 日本電信電話株式会社 Stereo sound signal processing method and apparatus, and recording medium on which stereo sound signal processing program is recorded
US8477958B2 (en) 2001-02-26 2013-07-02 777388 Ontario Limited Networked sound masking system
DE10110258C1 (en) 2001-03-02 2002-08-29 Siemens Audiologische Technik Method for operating a hearing aid or hearing aid system and hearing aid or hearing aid system
US20030002692A1 (en) 2001-05-31 2003-01-02 Mckitrick Mark A. Point sound masking system offering visual privacy
WO2003036614A2 (en) 2001-09-12 2003-05-01 Bitwave Private Limited System and apparatus for speech communication and speech recognition
US7194094B2 (en) 2001-10-24 2007-03-20 Acentech, Inc. Sound masking system
CA2479758A1 (en) 2002-03-27 2003-10-09 Aliphcom Microphone and voice activity detection (vad) configurations for use with communication systems
US6912178B2 (en) 2002-04-15 2005-06-28 Polycom, Inc. System and method for computing a location of an acoustic source
WO2004004297A2 (en) 2002-07-01 2004-01-08 Koninklijke Philips Electronics N.V. Stationary spectral power dependent audio enhancement system
US20040125922A1 (en) 2002-09-12 2004-07-01 Specht Jeffrey L. Communications device with sound masking system
US6823176B2 (en) 2002-09-23 2004-11-23 Sony Ericsson Mobile Communications Ab Audio artifact noise masking
JP4247037B2 (en) 2003-01-29 2009-04-02 株式会社東芝 Audio signal processing method, apparatus and program
CA2422086C (en) 2003-03-13 2010-05-25 777388 Ontario Limited Networked sound masking system with centralized sound masking generation
US7099821B2 (en) 2003-09-12 2006-08-29 Softmax, Inc. Separation of target acoustic signals in a multi-transducer arrangement
CN1998265A (en) 2003-12-23 2007-07-11 奥迪吉康姆有限责任公司 Digital cell phone with hearing aid functionality
JP2005339086A (en) * 2004-05-26 2005-12-08 Nec Corp Auction information notifying system, device, and method used for it
US20060013409A1 (en) 2004-07-16 2006-01-19 Sensimetrics Corporation Microphone-array processing to generate directional cues in an audio signal
WO2006026812A2 (en) 2004-09-07 2006-03-16 Sensear Pty Ltd Apparatus and method for sound enhancement
JP4594681B2 (en) 2004-09-08 2010-12-08 ソニー株式会社 Audio signal processing apparatus and audio signal processing method
US20060109983A1 (en) 2004-11-19 2006-05-25 Young Randall K Signal masking and method thereof
US20080262834A1 (en) * 2005-02-25 2008-10-23 Kensaku Obata Sound Separating Device, Sound Separating Method, Sound Separating Program, and Computer-Readable Recording Medium
JP4247195B2 (en) 2005-03-23 2009-04-02 株式会社東芝 Acoustic signal processing apparatus, acoustic signal processing method, acoustic signal processing program, and recording medium recording the acoustic signal processing program
US7415372B2 (en) * 2005-08-26 2008-08-19 Step Communications Corporation Method and apparatus for improving noise discrimination in multiple sensor pairs
JP4637725B2 (en) 2005-11-11 2011-02-23 ソニー株式会社 Audio signal processing apparatus, audio signal processing method, and program
US20070253569A1 (en) 2006-04-26 2007-11-01 Bose Amar G Communicating with active noise reducing headset
DK2030476T3 (en) 2006-06-01 2012-10-29 Hear Ip Pty Ltd Method and system for improving the intelligibility of sounds
US8483416B2 (en) 2006-07-12 2013-07-09 Phonak Ag Methods for manufacturing audible signals
US8369555B2 (en) * 2006-10-27 2013-02-05 Avago Technologies Wireless Ip (Singapore) Pte. Ltd. Piezoelectric microphones
US20080152167A1 (en) * 2006-12-22 2008-06-26 Step Communications Corporation Near-field vector signal enhancement
US8213623B2 (en) 2007-01-12 2012-07-03 Illusonic Gmbh Method to generate an output audio signal from two or more input audio signals
US8767975B2 (en) * 2007-06-21 2014-07-01 Bose Corporation Sound discrimination method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CANETTO B ET AL: "Speech enhancement systems based on microphone arrays", 27 May 2002 (2002-05-27), pages 1-9, XP007905367 *

Also Published As

Publication number Publication date
CN101682809B (en) 2013-07-17
JP2010530718A (en) 2010-09-09
JP4965707B2 (en) 2012-07-04
JP2012147475A (en) 2012-08-02
EP2158788A1 (en) 2010-03-03
US20140294197A1 (en) 2014-10-02
US20080317260A1 (en) 2008-12-25
JP5654513B2 (en) 2015-01-14
CN101682809A (en) 2010-03-24
US8767975B2 (en) 2014-07-01

Similar Documents

Publication Publication Date Title
US8767975B2 (en) Sound discrimination method and apparatus
KR102602090B1 (en) Personalized, real-time audio processing
CA2560034C (en) System for selectively extracting components of an audio input signal
JP5886304B2 (en) System, method, apparatus, and computer readable medium for directional high sensitivity recording control
KR101164299B1 (en) Sound input device
CN108235181B (en) Method for noise reduction in an audio processing apparatus
KR20090082978A (en) Sound system, sound reproducing apparatus, sound reproducing method, monitor with speakers, mobile phone with speakers
US20220277759A1 (en) Playback enhancement in audio systems
KR20090082977A (en) Sound system, sound reproducing apparatus, sound reproducing method, monitor with speakers, mobile phone with speakers
US8135144B2 (en) Microphone system, sound input apparatus and method for manufacturing the same
JP3154468B2 (en) Sound receiving method and device
Capel Newnes Audio and Hi-fi Engineer's Pocket Book
JP3204278B2 (en) Microphone device
JP7060905B1 (en) Sound collection system, sound collection method and program
JP4495704B2 (en) Sound image localization emphasizing reproduction method, apparatus thereof, program thereof, and storage medium thereof
US20230215449A1 (en) Voice reinforcement in multiple sound zone environments
WO2023065317A1 (en) Conference terminal and echo cancellation method
WO2022102322A1 (en) Sound collection system, sound collection method, and program
CN117939339A (en) Microphone system and method of operating a microphone system
CN116132898A (en) Noise-reducing and pickup hearing aid system
Krebber PA systems for indoor and outdoor
Bartlett Differential Head-Worn Microphones for Music
Sladeczek High-Directional Beamforming with a Miniature Loudspeaker Array Christoph Sladeczek, Daniel Beer, Jakob Bergner, Albert Zhykhar, Maximilian Wolf, Andreas Franck
Bartlett A High-Fidelity Differential Cardioid Microphone
JPH08289390A (en) Howling suppression system

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200880020920.2

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08755825

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2010513294

Country of ref document: JP

REEP Request for entry into the european phase

Ref document number: 2008755825

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2008755825

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE