WO2012093345A1 - An audio system and method of operation therefor - Google Patents

An audio system and method of operation therefor

Info

Publication number
WO2012093345A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
ultrasound
array
signals
audio band
Prior art date
Application number
PCT/IB2012/050007
Other languages
French (fr)
Inventor
Ashish Vijay Pandharipande
Sriram Srinivasan
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Priority to CN201280004763.2A (CN103329565B)
Priority to EP12700525.4A (EP2661905B1)
Priority to US13/996,347 (US9596549B2)
Priority to JP2013547941A (JP6023081B2)
Priority to BR112013017063A (BR112013017063A2)
Priority to RU2013136491/28A (RU2591026C2)
Publication of WO2012093345A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R25/00Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
    • H04R25/40Arrangements for obtaining a desired directivity characteristic
    • H04R25/405Arrangements for obtaining a desired directivity characteristic by combining a plurality of transducers
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/02Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems using reflection of acoustic waves
    • G01S15/06Systems determining the position data of a target
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/02Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems using reflection of acoustic waves
    • G01S15/06Systems determining the position data of a target
    • G01S15/08Systems for measuring distance only
    • G01S15/10Systems for measuring distance only using transmission of interrupted, pulse-modulated waves
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/52Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S15/00
    • G01S7/521Constructional features
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01VGEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
    • G01V1/00Seismology; Seismic or acoustic prospecting or detecting
    • G01V1/001Acoustic presence detection
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/18Methods or devices for transmitting, conducting or directing sound
    • G10K11/26Sound-focusing or directing, e.g. scanning
    • G10K11/34Sound-focusing or directing, e.g. scanning using electrical steering of transducer arrays, e.g. beam steering
    • G10K11/341Circuits therefor
    • G10K11/348Circuits therefor using amplitude variation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/56Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/005Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/12Circuits for transducers, loudspeakers or microphones for distributing signals to two or more loudspeakers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30Control circuits for electronic adaptation of the sound field
    • H04S7/302Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303Tracking of listener position or orientation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/20Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/403Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers loud-speakers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/20Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2400/00Loudspeakers
    • H04R2400/01Transducers used as a loudspeaker to generate sound aswell as a microphone to detect sound
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00Signal processing covered by H04R, not provided for in its groups
    • H04R2430/03Synergistic effects of band splitting and sub-band processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00Signal processing covered by H04R, not provided for in its groups
    • H04R2430/20Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2499/00Aspects covered by H04R or H04S not otherwise provided for in their subgroups
    • H04R2499/10General applications
    • H04R2499/11Transducers incorporated or for use in hand-held devices, e.g. mobile phones, PDA's, camera's
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2499/00Aspects covered by H04R or H04S not otherwise provided for in their subgroups
    • H04R2499/10General applications
    • H04R2499/13Acoustic transducers and sound field adaptation in vehicles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2420/00Techniques used stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/07Synergistic effects of band splitting and sub-band processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30Control circuits for electronic adaptation of the sound field
    • H04S7/305Electronic adaptation of stereophonic audio signals to reverberation of the listening space

Definitions

  • the invention relates to an audio system and a method of operation therefor, and in particular, but not exclusively, to an audio system capable of estimating user positions.
  • Determination of presence and position related information is of interest in many audio applications including for example for hands-free communication and smart entertainment systems.
  • the knowledge of user locations and their movement may be employed to localize audio-visual effects at user locations for a more personalized experience in entertainment systems. Also, such knowledge may be employed to improve the
  • such applications may use directional audio rendering or capture to provide improved effects.
  • directionality can for example be derived from audio arrays comprising a plurality of audio drivers or sensors.
  • acoustic beamforming is relatively common in many applications, such as in e.g. teleconferencing systems.
  • weights are applied to the signals of individual audio elements thereby resulting in the generation of a beam pattern for the array.
  • the array may be adapted to the user positions in accordance with various algorithms. For example, the weights may be continually updated to result in the maximum signal level or signal to noise ratio.
  • such conventional approaches require the audio source to be present, and consequently the weights of an acoustic array can be adapted only after a source becomes active.
  • an improved audio system would be advantageous and in particular a system allowing increased flexibility, reduced resource usage, reduced complexity, improved adaptation, improved reliability, improved accuracy and/or improved performance would be advantageous.
  • the invention seeks to preferably mitigate, alleviate or eliminate one or more of the above mentioned disadvantages singly or in any combination.
  • an audio system comprising: an ultrasound sensor array comprising a plurality of ultrasound sensor elements; an estimator for estimating a presence characteristic of a user in response to ultrasound signals received from the ultrasound sensor array; an audio array circuit for generating a directional response of an audio band array comprising a plurality of audio band elements by applying weights to individual audio band signals for the audio band elements; and a weight circuit for determining the weights in response to the presence characteristic.
  • the invention may provide improved adaptation of the directionality of an audio band array.
  • the approach may for example allow adaptation of filter characteristics for the array processing based on the ultrasound signals.
  • Adaptation of filter characteristics and weights, and thus the directionality of the audio array may be performed in the absence of sound being generated from a target source.
  • the filter characteristics/weights may be set to provide a beam or a notch in a desired direction based on the ultrasound signals.
  • the invention may in many embodiments provide improved accuracy and/or faster adaptation of audio directionality for the audio band array.
  • the initialisation of weights for the audio band array may for example be based on the presence characteristic.
  • the spatial directivity pattern of the audio band array may be adjusted in response to the presence characteristic. For example, if the presence of a user is detected, a directional beam may be generated, and if no user is detected an omnidirectional beam may be generated.
  • the audio band may be considered to correspond to an acoustic band.
  • the audio band may be a band having an upper frequency below 15 kHz and typically below 10 kHz.
  • the ultrasound band may be a band having a lower frequency above 10 kHz and often advantageously above 15 kHz or 20 kHz.
  • the weights may be filter weights of individual filters being applied to the individual audio band signals by the array processor.
  • the weights may be complex values and/or may equivalently be delays, scale factors and/or phase shifts.
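
As an illustrative sketch of this equivalence (not taken from the source), a scale factor and a delay can be folded into a single complex weight for one frequency. The function name `complex_weight` and the example values are assumptions made for illustration only.

```python
import cmath
import math

def complex_weight(gain, delay_s, freq_hz):
    """Complex weight equivalent to a gain plus a pure delay at one frequency.

    A delay of delay_s seconds corresponds to a phase shift of
    -2*pi*freq_hz*delay_s radians at frequency freq_hz.
    """
    return gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)

# Example: a 0.25 ms delay at 1 kHz is a quarter period (a -90 degree phase shift).
w = complex_weight(gain=0.5, delay_s=0.25e-3, freq_hz=1000.0)
print(abs(w), math.degrees(cmath.phase(w)))
```

The magnitude of the weight carries the scale factor and its argument carries the phase shift, so a frequency dependent set of such weights can represent a delay across the band.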
  • the presence characteristic comprises a position estimate and the audio array circuit is arranged to determine the weights in response to the position characteristic.
  • the invention may e.g. allow beamforming to track users or audio sources even when they do not generate any sound. In many embodiments, it may provide a faster adaptation of a beam pattern to a specific user position.
  • the audio band elements are audio sensors and the audio array circuit is arranged to generate a directional output signal by combining audio band signals from the audio sensors, the combining comprising applying the weights to the individual audio band signals.
  • the invention may allow an advantageous control of directivity for an audio capture system based on an audio band sensor array.
  • the approach may allow for an audio band audio capture beam to be adapted even when no sound is generated by the target source. Furthermore, the approach may reduce or mitigate the impact of audio generated by undesired sound sources.
  • the audio system comprises a plurality of wideband sensors each of which is both an ultrasound sensor of the ultrasound sensor array and an audio sensor of the audio band array.
  • the same wideband sensor may thus be used as both an audio band element and an ultrasound sensor.
  • This may provide a highly cost efficient implementation in many scenarios.
  • the approach may facilitate and/or improve interworking between the audio band processing and the ultrasound band processing.
  • the approach may in many scenarios allow reuse of parameters determined in response to the ultrasound signals when processing the audio band signals.
  • the approach may facilitate and/or improve synchronization between ultrasound and audio band operations and processing.
  • the plurality of wideband sensors forms both the ultrasound sensor array and the audio band array.
  • Each of the audio band elements and ultrasound sensors may be implemented by a wideband sensor.
  • the same wideband sensor array may thus be used as the audio band array and the ultrasound sensor array.
  • the ultrasound signals and the audio band signals may be different frequency intervals of the same physical signals, namely the signals generated by the wideband sensor elements.
  • the approach may provide a highly cost efficient implementation in many scenarios.
  • the approach may facilitate and/or improve interworking between the audio band processing and the ultrasound band processing.
  • the audio system further comprises: a user movement model arranged to track a position of a user; an update circuit for updating the user movement model in response to both the ultrasound signals and the audio band signals.
  • This may provide improved performance in many embodiments and may in many scenarios provide a substantially improved user movement tracking.
  • the update circuit is arranged to update the user movement model in response to the ultrasound signals when a characteristic of the audio band signals meets a criterion.
  • the criterion may for example be a criterion which is indicative of the desired sound source not generating any sound.
  • the criterion may be a requirement that a signal level of the audio band signals is below a threshold.
  • the threshold may be a variable threshold which varies in response to other parameters.
  • the update circuit is arranged to not update the user movement model in response to the ultrasound signals when a characteristic of the audio band signals meets a criterion.
  • the criterion may for example be a criterion which is indicative of the desired sound source generating sound.
  • the criterion may be a requirement that a signal level of the audio band signals is above a threshold.
  • the threshold may be a variable threshold which varies in response to other parameters.
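
The gating described above can be sketched as follows. This is a minimal illustration under stated assumptions: the user movement model is reduced to exponential smoothing of a single position value, and the name `update_position` and the parameter `alpha` are hypothetical. The source does not specify the model or how the audio band estimate is formed.

```python
def update_position(model_pos, ultrasound_pos, audio_pos, audio_level,
                    threshold, alpha=0.5):
    """Return the new tracked position (e.g. an angle in degrees).

    Below the threshold the target is assumed to be silent, so the
    ultrasound estimate drives the update. Otherwise the audio band
    estimate is used and the ultrasound estimate is ignored.
    """
    measurement = ultrasound_pos if audio_level < threshold else audio_pos
    # Exponential smoothing stands in for the user movement model (assumption).
    return (1.0 - alpha) * model_pos + alpha * measurement

pos = 0.0
# Silent user: the model moves toward the ultrasound estimate.
pos = update_position(pos, ultrasound_pos=20.0, audio_pos=0.0,
                      audio_level=0.01, threshold=0.1)
print(pos)
# Active talker: the model moves toward the audio band estimate.
pos = update_position(pos, ultrasound_pos=90.0, audio_pos=30.0,
                      audio_level=0.5, threshold=0.1)
print(pos)
```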
  • the weight circuit is arranged to determine ultrasound weight delays for the ultrasound signals to correspond to a direction of an ultrasound source; and to determine audio weight delays for the individual audio band signals to correspond to the ultrasound weight delays.
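
One possible way to relate the two sets of delays is sketched below. It assumes a far-field source, linear arrays sharing the same axis, and a speed of sound of 343 m/s. None of these assumptions, nor the function names, come from the source.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def direction_from_delay(delay_s, spacing_m):
    """Arrival angle (radians from broadside) from the delay between two
    adjacent ultrasound sensors spaced spacing_m apart."""
    s = SPEED_OF_SOUND * delay_s / spacing_m
    return math.asin(max(-1.0, min(1.0, s)))  # clamp against measurement noise

def audio_delays(angle_rad, positions_m):
    """Steering delays (seconds) for audio band elements located at the
    given positions along the array axis."""
    return [x * math.sin(angle_rad) / SPEED_OF_SOUND for x in positions_m]

# Ultrasound sensors 1 cm apart observe the delay of a source at 30 degrees.
tau_us = 0.01 * math.sin(math.radians(30.0)) / SPEED_OF_SOUND
angle = direction_from_delay(tau_us, spacing_m=0.01)
# Audio band elements 5 cm apart on the same axis reuse that direction.
delays = audio_delays(angle, [0.0, 0.05, 0.10, 0.15])
print(math.degrees(angle), delays)
```

When the two arrays share the same elements, the positions are identical and the ultrasound delays can be reused unchanged, since a pure time delay does not depend on frequency.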
  • the ultrasound sensor array and the audio band array are spatially overlapping.
  • the ultrasound sensor array and the audio band array may specifically be substantially collocated.
  • the audio system further comprises an ultrasound transmitter arranged to transmit an ultrasound test signal, and the estimator is arranged to estimate the presence characteristic in response to a comparison between a characteristic of the ultrasound test signal and a characteristic of the ultrasound signals received from the ultrasound sensor array.
  • the ultrasound transmitter may be proximal to the ultrasound sensor array and may be substantially collocated therewith.
  • the ultrasound transmitter may in some scenarios be implemented by the same ultrasound transducer(s) as one (or more) of the ultrasound sensors.
  • the ultrasound test signal is a pulsed ultrasound signal.
  • the estimator is arranged to perform a movement estimation in response to a comparison of signal segments of the ultrasound signals corresponding to different pulses.
  • This may provide a particularly practical and/or improved movement detection that may in many scenarios improve performance of the audio system as a whole.
  • the estimator is arranged to estimate a position of a moving object in response to a difference between the signal segments.
  • This may provide a particularly practical and/or improved movement detection that may in many scenarios improve performance of the audio system as a whole.
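
A minimal sketch of such a pulse-to-pulse comparison follows. The source states only that a difference between signal segments is used. Picking the largest difference as the moving object, the 343 m/s speed of sound, and the name `moving_object_range` are assumptions for illustration.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed

def moving_object_range(segment_a, segment_b, sample_rate_hz):
    """Estimate the range (m) of a moving reflector from two echo segments.

    Both segments start at the transmit time of their pulse. Echoes from
    static objects are identical in both segments and cancel in the difference.
    """
    diff = [abs(a - b) for a, b in zip(segment_a, segment_b)]
    peak = max(range(len(diff)), key=diff.__getitem__)  # first largest change
    round_trip_s = peak / sample_rate_hz
    return SPEED_OF_SOUND * round_trip_s / 2.0  # halve: out and back

fs = 100_000
seg1 = [0.0] * 400
seg2 = [0.0] * 400
seg1[100] = seg2[100] = 1.0  # static wall echo, identical for both pulses
seg1[200] = 0.8              # moving object as seen by the first pulse
seg2[210] = 0.8              # same object slightly further away at the second pulse
estimated_range = moving_object_range(seg1, seg2, fs)
print(estimated_range)
```

The static echo at sample 100 is the strongest component of each segment but does not appear in the difference, so only the moving object contributes to the estimate.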
  • the audio band elements are audio drivers arranged to generate a sound signal in response to a drive signal, and the individual audio band signals are drive signals.
  • the invention may allow improved performance and/or facilitated
  • the approach may for example allow optimization of audio rendering for a specific listening position.
  • a method of operation for an audio system including an ultrasound sensor array comprising a plurality of ultrasound sensor elements, the method comprising: estimating a presence characteristic of a user in response to ultrasound signals received from the ultrasound sensor array; generating a directional response of an audio band array comprising a plurality of audio band elements by applying weights to individual audio band signals for the audio band elements; and determining the weights in response to the presence characteristic.
  • Fig. 1 illustrates an example of an audio system in accordance with some embodiments of the invention.
  • Fig. 2 illustrates an example of a beamformer for an audio sensor array.
  • Fig. 3 illustrates an example of a beamformer for an audio rendering array.
  • Fig. 4 illustrates an example of an audio system in accordance with some embodiments of the invention.
  • Fig. 5 illustrates an example of a transmitted ultrasound signal.
  • Fig. 6 illustrates an example of an audio system in accordance with some embodiments of the invention.
  • Figs. 7-9 illustrate examples of performance for a de-reverberation application.
  • Fig. 1 illustrates an example of an audio system in accordance with some embodiments of the invention.
  • the audio system comprises an audio band array 101 which comprises a plurality of audio band elements/transducers.
  • the audio band array 101 may be used to provide directional operation of the audio system by individually processing the signals for each of the audio band elements.
  • the combined effect of the audio band array 101 may correspond to a single audio band element having a directional audio characteristic.
  • the audio band array 101 is coupled to an array processor 103 which is arranged to generate a directional response from the audio band array by individually processing the signals of the individual audio band elements.
  • the audio band array 101 may be used to render sound and the audio band elements/transducers may be audio band drivers/speakers.
  • an input signal may be applied to the array processor 103 which may generate the individual drive signals for the audio band drivers by individually processing the input signal.
  • filter characteristics/weights may be set individually for each of the audio band drivers such that the resulting radiated audio band signals add or subtract differently in different directions. For example, coherent addition can be produced in a desired direction while non-coherent addition (and thus reduced signal levels) is produced in other directions.
  • the audio band array 101 may be used to capture sound and the audio band elements/transducers may be audio band sensors.
  • an output signal may be generated by the array processor 103 by individually processing the individual sensor signals from the audio band sensors and subsequently combining the processed signals.
  • filter characteristics/weights may be set individually for each of the audio band sensors such that the combination is more or less a coherent combination in the desired direction.
  • Fig. 2 illustrates an example wherein four input sensor signals are received from four audio band sensors. It will be appreciated that the array may in other embodiments comprise fewer or more elements.
  • Each of the signals is amplified in an individual low noise amplifier 201 after which each signal is filtered in an individual filter 203.
  • the resulting filtered signals are then fed to a combiner 205 which may e.g. simply sum the filter output signals.
  • Fig. 3 illustrates an example wherein an input signal is received by a splitter 301 which generates four signals, one signal for each of four audio band drivers. Each of the signals is then filtered in an individual filter 303 after which each filter output signal is amplified in a suitable output amplifier 305. Each of the output amplifiers thus generates a drive signal for an audio band driver.
  • the directionality of the audio band array can thus be controlled by suitably adapting the individual filters 203, 303.
  • the filters 203, 303 can be adapted such that coherent summation is achieved for a desired direction.
  • the directionality of the audio band array can accordingly be modified dynamically simply by changing the characteristics of the filter.
  • the audio band beam/pattern of the audio band array can be controlled by modifying the weights of the filters as will be known to the skilled person.
  • the modification of the filter weights may specifically correspond to a modification of one or more of a gain, a phase and a delay. Each of these parameters may be constant for all frequencies or may be frequency dependent. Further, modifications of the filter weights may be performed in the frequency domain and/or the time domain. For example, time domain adaptation may be performed by adjusting coefficients (taps) of a FIR filter. As another example, the signals may be converted to the frequency domain by a Fast Fourier Transform. The resulting frequency domain signal may then be filtered by applying coefficients/weights to each of the frequency bin values. The resulting filtered frequency domain signal may then be converted back to the time domain by an inverse Fast Fourier Transform.
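
The frequency domain route can be sketched as follows. For self-containment a naive DFT is used in place of a Fast Fourier Transform, and the function names and the 8-sample example are assumptions rather than anything specified in the source.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def filter_in_frequency_domain(x, bin_weights):
    """Transform, multiply each bin by its weight, and transform back."""
    X = dft(x)
    Y = [Xk * wk for Xk, wk in zip(X, bin_weights)]
    return [y.real for y in idft(Y)]

# A linear phase ramp across the bins implements a (circular) delay of 2 samples.
n, delay = 8, 2
weights = [cmath.exp(-2j * math.pi * k * delay / n) for k in range(n)]
x = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
y = filter_in_frequency_domain(x, weights)
print([round(v, 6) for v in y])
```

In this example the impulse at sample 1 reappears at sample 3, which shows that a per-bin phase weight acts as a delay. A practical block-based implementation would additionally need overlap handling to avoid the circular wrap-around.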
  • the filters 203, 303 may simply correspond to a variable delay. It is noted that a simple delay corresponds to a filter having an impulse response corresponding to a Dirac pulse at a time position corresponding to the delay. Thus, introducing a variable delay corresponds to introducing a filter wherein the coefficients are weighted to provide the desired delay (e.g. it is equivalent to a FIR filter where the coefficient corresponding to the delay is set to one and all other coefficients are set to zero). For fractional delays (relative to sample instants), FIR interpolation may be considered.
  • the approach may correspond to a Delay and Sum Beamformer (DSB) for the audio band sensor case.
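
A delay and sum beamformer for the sensor case can be sketched as below. The sketch is restricted to whole-sample delays, and the function names and the four-sensor example are illustrative assumptions.

```python
def delay_signal(x, delay_samples):
    """Delay by a whole number of samples (a FIR filter with a single unit tap)."""
    return [0.0] * delay_samples + x[:len(x) - delay_samples]

def delay_and_sum(sensor_signals, delays_samples):
    """Delay each sensor signal, then average the aligned signals."""
    delayed = [delay_signal(x, d) for x, d in zip(sensor_signals, delays_samples)]
    n = len(sensor_signals)
    return [sum(samples) / n for samples in zip(*delayed)]

# A pulse reaches four sensors at samples 3, 2, 1 and 0 respectively.
signals = [[1.0 if t == arrival else 0.0 for t in range(8)]
           for arrival in (3, 2, 1, 0)]
# Delaying by 0, 1, 2 and 3 samples aligns all four copies at sample 3.
steered = delay_and_sum(signals, [0, 1, 2, 3])
# Without steering delays the copies do not line up and the peak is reduced.
unsteered = delay_and_sum(signals, [0, 0, 0, 0])
print(steered)
print(unsteered)
```

With matching delays the four copies add coherently to full amplitude, whereas the unsteered sum spreads the pulse over four samples at a quarter of the amplitude.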
  • more complex filtering may be performed and specifically a frequency dependent filtering may be applied.
  • the approach may correspond to a Filter and Sum Beamformer (FSB) for the audio band sensor case.
  • gain adjustment or compensation may be introduced for an audio band rendering system.
  • calibration may be performed to compensate for variations in the characteristics of the audio band drivers.
  • the combination of the audio band sensor example may take other signals into account and may for example subtract signals that are derived from the individual signals.
  • side-lobe cancelling may be introduced by subtracting a suitably generated estimate of such signals.
  • Various algorithms are known for controlling weights of an audio band beamformer. Generally, these algorithms determine the weights for the audio band beamformer based on knowledge of a desired directivity, e.g. using predetermined values relating directions to weights (such as a look-up table).
  • the weights are typically adapted in a feedback fashion based on the received audio. For example, the weights are dynamically adapted to provide a maximum signal level or a maximum signal to noise ratio estimate.
  • the system comprises an ultrasound sensor array 105 which comprises a plurality of ultrasound sensors that generate a plurality of ultrasound signals.
  • the ultrasound signals are fed to an estimation processor 107 which is arranged to generate a presence estimate for a user in response to the ultrasound signals.
  • the estimation processor 107 is coupled to a weight processor 109 which is further coupled to the array processor 103.
  • the weight processor 109 is arranged to determine the filter characteristics for the array processor 103 in response to the presence estimate.
  • the system thus uses characteristics estimated from the ultrasound audio environment to control the operation in the audio band.
  • the ultrasound band may be considered to be from 10 kHz upwards, whereas the audio band may be considered to be the frequency range below 15 kHz.
  • the audio band will thus include frequency intervals below 15 kHz.
  • the system further comprises an ultrasound transmitter 111 which is arranged to radiate ultrasound signals. Such signals will be reflected by objects in the room and the reflected signals or echoes can be captured by the ultrasound sensor array 105.
  • the filter characteristics and weights may fully or partially for at least some of the time be dependent on the received ultrasound signals and specifically on echoes from radiated ultrasound signals.
  • the estimation processor 107 receives the ultrasound signals from the sensor array 105 and based on these it estimates a presence characteristic for a user.
  • the presence characteristic may in a simple example simply indicate whether a user is estimated to be present or not. However, in most embodiments, the presence characteristic is an indication of a position of a user. It will be appreciated that a full position estimate need not be determined but that in some
  • the estimation processor 107 may e.g. simply estimate a rough direction to the user. Based on the determined presence characteristic, the weight processor 109 proceeds to determine suitable weights to result in a desired beam pattern for the specific presence characteristic.
  • the audio system may be set up in an environment wherein the ultrasound transmitter 111 does not generate any significant echoes at the ultrasound sensor array 105 (e.g. in a large empty space where all objects are sufficiently far away to not generate significant echoes).
  • the estimation processor 107 may perform a very simple detection by comparing the ultrasound signal level to a threshold and setting the presence indicator to indicate a presence of a user if the threshold is exceeded and otherwise setting it to indicate that no user is detected.
  • the weight processor 109 may then proceed to modify the weights accordingly.
  • the weights may be set to provide a pattern which is as omnidirectional as possible, and if a user is detected the weights may be set to provide a predetermined narrow beam in the direction of a nominal position where the user is assumed to be (e.g. directly in front of the ultrasound sensor array 105).
  • Such an approach may be suitable for many applications and can be used for both audio rendering/playback and for audio capturing. E.g. when no user is present sound is radiated in all directions and/or captured from all directions. This may support peripheral users in different positions.
  • the audio system automatically adapts to focus on this specific user.
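
The simple detection and weight selection described above can be sketched as follows. The use of an RMS level, the choice of a single active element as the near-omnidirectional setting, and the function names are assumptions for illustration. The source only requires a level comparison against a threshold and a switch between an omnidirectional pattern and a predetermined beam.

```python
import math

def user_present(ultrasound_samples, threshold):
    """Simple detector: RMS level of the received ultrasound against a threshold."""
    rms = math.sqrt(sum(s * s for s in ultrasound_samples) / len(ultrasound_samples))
    return rms > threshold

def select_weights(present, n_elements):
    """Return one gain per audio band element."""
    if present:
        # Equal gains and no relative delay: a beam directly in front of the array.
        return [1.0 / n_elements] * n_elements
    # A single active element has no array directivity (one assumed way to
    # approximate an omnidirectional pattern).
    return [1.0] + [0.0] * (n_elements - 1)

echo = [0.3, -0.3, 0.3, -0.3]       # significant echo: user assumed present
quiet = [0.01, -0.01, 0.01, -0.01]  # negligible echo: no user detected
print(select_weights(user_present(echo, 0.1), 4))
print(select_weights(user_present(quiet, 0.1), 4))
```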
  • the system seeks to determine a presence/position characteristic for a user but may not know if the ultrasound signals are caused by a user or other object.
  • the presence characteristic may be considered to be a presence characteristic for an object.
  • the object may then be assumed to be the user.
  • the presence characteristic may comprise or consist of a position (direction) estimate for the user and the weight processor 109 may be arranged to determine weights to provide a suitable pattern for this direction (e.g. by directing a beam in that direction).
  • the audio system may thus use ultrasound measurements to adjust the directivity of an audio band beam.
  • the ultrasound sensor array 105 and the audio band array 101 may be substantially collocated and may e.g. be adjacent to each other. However, in many embodiments the ultrasound sensor array 105 and the audio band array 101 may advantageously overlap each other. Thus, for an audio capture application, the apertures of the ultrasound sensor array 105 and the audio band (sensor) array 101 may overlap each other.
  • ultrasound sensors are placed in-between audio band sensors such that the arrays are interleaved with each other.
  • Such an approach provides for improved and facilitated operation and increased accuracy. Specifically there is no necessity for complex calculations to translate positions relative to the ultrasound sensor array 105 to positions relative to the audio band array 101. Rather, if an estimated direction to a user is determined based on the ultrasound signals, this direction can be used directly when determining suitable filter weights for the audio band signals.
  • the description will focus on an audio capture system which adapts the audio beam pattern towards a desired sound source.
  • the audio system may for example be a teleconferencing system.
  • the ultrasound sensor array 105 and the audio band array 101 are not only collocated or overlapping but actually use the same audio band elements.
  • Fig. 4 illustrates an example of the exemplary audio capture system.
  • the system of Fig. 4 comprises an audio band array of audio band transducers in the form of wideband audio band sensors 401.
  • Each of the wideband audio sensors 401 captures sound in a wideband range which covers at least part of the audio band and the ultrasound band.
  • the active frequency interval for capture by the wideband audio sensors 401 includes frequencies below 2 kHz and above 10 kHz (or below 500 Hz or 1 kHz and/or above 15 kHz or 20 kHz in many scenarios).
  • each of the wideband audio sensors 401 is both an audio band sensor and an ultrasound sensor. Hence, the same sensors are used both to provide the captured audio input as well as the ultrasound input.
  • the wideband audio sensors 401 are coupled to an array processor 403 which proceeds to filter and combine the audio band signals as described for the array processor 103 of Fig. 1. However, in many scenarios the array processor 403 may further low pass filter the signals to limit the signals to the audio band.
  • the wideband audio sensors 401 are coupled to an estimator 405 which is arranged to determine a presence characteristic for a user along the same lines as the presence estimator 107 of Fig. 1.
  • the estimator 405 is coupled to a weight processor 407 which is arranged to determine the weights for the array processor 403 based on the presence characteristic corresponding to the approach of the weight processor 109 of Fig. 1.
  • the respective ultrasound signals may e.g. be generated by a high pass filtering of the transducer signals and the audio band signals may be generated by a low pass filtering of the transducer signal.
  • An audio band signal may have at least 80% of the total signal energy below
  • an ultrasound signal may have at least 80% of the total signal energy above 10 kHz.
  • the system further comprises an ultrasound transmitter 409 which is located centrally in the audio array 401.
  • the system of Fig. 4 may operate similarly to that described for the capture application of Fig. 1. However, typically, the system may specifically be used to estimate user positions based on the ultrasound signals, and this position estimate may be used to fully or partially control the weights of the audio band combining in order to provide a desired directive sensitivity of the audio capture.
  • the weights may not only be determined based on the presence or position estimate generated from the ultrasound signals but may in some scenarios alternatively or additionally be generated based on the audio band signals captured by the audio array 401 (and typically generated by filtering of these or in some cases used directly when the ultrasound signal components are negligible when performing the audio band processing).
  • the audio system may include conventional functionality for adapting the weights of a beamformer for an audio array.
  • the ultrasound signals can be used to determine suitable weights which can be used for the beamforming algorithm.
  • initialisation of an audio beamforming algorithm may be performed using the weights determined from the ultrasound signals.
  • wideband sensors as both audio band and ultrasound sensors provide a number of advantages. Indeed, it may facilitate implementation and manufacturing as fewer sensors are used. This may reduce cost and often reduce the form factor of the sensor segment of the audio system. It may for example allow implementation of a teleconferencing audio system using the described approach in a single relatively small enclosure. However, the approach may further provide improved performance and may in particular provide higher accuracy and/or reduced or facilitated signal processing with reduced complexity. Indeed, the translation between different audio band and ultrasound sensor arrays may often be substantially facilitated. Indeed, in many scenarios the parameters determined to result in a coherent addition for the ultrasound signals may directly be used as parameters for the audio beamforming. E.g. the same delays may be used for each individual path.
  • the system may be used for hands-free communication where one or more users communicate with remotely located users using a fixed system.
  • acoustic beamforming can be performed in order to localize the sources and direct the acoustic beam to those locations.
  • sources to be (acoustically) active.
  • the beamforming weights need to be updated if the sources have moved, leading to drops in quality.
  • an active source at a certain location. The source goes quiet and moves to another location and then again becomes active. Communication would initially suffer since the acoustic beamformer weights need updating.
  • the beamforming weights that are computed could be inaccurate resulting in poor quality or even a communication outage.
  • the presence characteristic is thus a position estimate or indication, such as e.g. a direction of the assumed user.
  • the position estimate can be determined in response to the ultrasound signal transmitted by the ultrasound transmitter 409.
  • the signal components in the ultrasound band can be used to compute user locations based on time-of-flight processing which allows a computation of range and/or direction-of-arrival processing for angular information.
  • T denotes the duration over which the pulse comprising sinusoids is transmitted
  • PRI denotes the duration over which echoes may be received.
  • the estimator 405 may for each pulse correlate the received ultrasound signal from each wideband audio sensor to delayed versions of the transmitted pulse.
  • the delay which results in the largest correlation can be considered to correspond to the time of flight for the ultrasound signal and the relative difference in the delays (and thus the times of flight) between the array elements can be used to determine a direction towards the object reflecting the ultrasound.
  • the ultrasound signals are also used to provide a motion estimate for the user.
  • the ultrasound position estimate may be based on moving objects, i.e. on changes in the echoes received by the wideband sensors.
  • the ultrasound transmitter may emanate a series of pulses, such as those of Fig. 5.
  • the estimator 405 may then proceed to first determine the range of the moving sources only from the wideband sensor array 401 while discarding static objects from consideration.
  • the estimator 405 in the example proceeds to consider the difference of the received signals from two consecutive transmit pulses rather than consider each response individually. Echoes from static objects result in the same contribution in received signals from consecutive transmit pulses, and hence the difference would be (close to) zero. Echoes from moving sources on the other hand result in a non-zero difference signal.
  • Signal power is then computed per range bin based on the difference signal. A moving source is determined to be present at a certain range bin if the computed signal power exceeds a detection threshold.
  • the detection threshold may be adapted to ambient noise conditions. Having determined the radial range, the angular position may be calculated by determining the direction-of-arrival (DoA) of the moving sources. The range along with the angle gives the instantaneous location of each moving source.
  • DoA direction-of-arrival
  • the location estimate (azimuth) provided by the ultrasound array can be translated into the relative delays that occur when an audio signal emanating from that location propagates to the audio sensors of the array 401.
  • a uniform linear audio sensor array is assumed with an inter-element spacing of d m.
  • Let θ denote the estimate of the location of the audio source (the object reflecting the ultrasound signals) relative to the wideband sensor array 401.
  • the relative delays required for forming a beam in the direction of the assumed user can now be computed from the location estimate provided by the ultrasound array.
  • DSB Delay-and-Sum Beamformer
  • the audio system may directly determine ultrasound weight delays for the ultrasound signals that correspond to a direction of an ultrasound source (such as a reflecting object).
  • the audio band weight delays for the individual audio band signals may then directly be used to correspond to the ultrasound weight delays.
  • the presence characteristic may indeed be represented by the determined delays themselves.
  • the approach may provide a number of advantages. For example, resetting the filters to the delays corresponding to the location determined by the ultrasound signals after a period of acoustic inactivity by the user, and then allowing the filters to adapt when the audio band becomes active ensures faster convergence than the case where the filters corresponding to the old location need to be adapted.
  • the audio system of Fig. 1 may be arranged to track movement of a user, where the estimated movement is updated using both the results from the audio band and from the ultrasound band.
  • the audio beamforming may then be based on the current position estimate for a user. For example, past location information can be combined with a movement model to obtain user movement trajectories, where the model may be
  • the user movement model may for example be a simple model which e.g. simply uses the last estimated position as the current position, or may be more complex and for example implement complex movement models that may predict movement and combine position estimates from both the ultrasound and audio bands.
  • the location and movement trajectory information may e.g. then be used as a priori input to the acoustic beamformer, i.e. the array processor 403 may after a pause in the audio from the desired signal source be initialised with weights corresponding to the estimated user position.
  • An audio-only system is unable to track this movement due to the absence of an audible signal, and needs time to converge to the correct weights once the person starts talking from location B.
  • Using the location estimated from the ultrasound array solves this problem as it can continuously track the user during the movement from location A to location B.
  • Fig. 6 illustrates an example of how the audio system of Fig. 4 may be implemented using a movement model which is updated on the basis of position estimates generated both from the ultrasound signals and from the audio band signals.
  • the estimator 405 comprises an ultrasound position estimator 601 which receives the signals from the wideband audio sensors 401 and which generates a position estimate from the ultrasound signal components.
  • the previously described approach may for example be used.
  • the estimator 405 further comprises an audio band position estimator 603 which receives the signals from the wideband audio sensors 401 and which generates a position estimate from the audio band signal components. It will be appreciated that any suitable algorithm may be used, including for example an adaptive algorithm determining relative delays that result in the maximum summed signal level. It will also be appreciated that in some embodiments, the position determination may be integrated with the beamforming process of the array processor 403 e.g. by the audio system including a feedback path from the array processor 403 to the audio band position estimator 603.
  • the ultrasound position estimator 601 and the audio band position estimator 603 are coupled to an update processor 605 which is further coupled to a movement model 607.
  • the movement model 607 is a model that generates a position estimate for the user.
  • the update processor 605 controls the movement model based on the position estimates from the ultrasound position estimator 601 and the audio band position estimator 603.
  • the movement model 607 may simply comprise a memory which stores the latest position estimate provided by the update processor 605.
  • the update processor 605 may continuously evaluate the ultrasound and audio band position estimates and proceed to feed forward the position estimate that is considered to be valid. If both are considered valid, an average position estimate may be forwarded, and if neither is considered valid no position estimate is forwarded.
  • the position estimate may simply be considered valid if the signal level of the combined signal is above a given threshold and otherwise may be considered to be invalid.
  • the ultrasound position estimate may thus be used if the audio band signals meet a criterion. For example, if the audio band signals do not combine to generate a sufficiently high signal level, the user model is not updated on the basis of the audio band position estimate but instead the user model is updated on the basis of the ultrasound position estimate. Thus, if it is likely that the user is not speaking, the ultrasound signals are used for position estimation.
  • the ultrasound position estimate may not be used if the audio band signals meet a criterion. For example, if the audio band signals do combine to generate a sufficiently high signal level, the user model is not updated on the basis of the ultrasound position estimate but instead the user model is updated on the basis of the audio band position estimate. Thus, if it is likely that the user is speaking, the audio band signals are used for position estimation.
  • array processing may be switched between ultrasound and audible-sound e.g. in order to save power resulting from active ultrasound transmission.
  • audible activity in the human hearing range of frequencies
  • the system switches from ultrasound mode to an audio band mode.
  • the audio beamforming weights are initialized with the latest location estimates provided by the ultrasound signals.
  • the audio band signals are used for user localization.
  • audible activity levels fall below a set threshold, the system switches to the ultrasound mode.
  • improved detection performance may be achieved using joint ultrasound and audio band localization as follows.
  • the system may switch to the audio band mode if the audible activity is above a set threshold. This may improve overall user detection.
  • the system may return to the ultrasound mode if movement is detected. Alternatively, the system may stay in the audio band mode as long as audible activity remains above the set threshold.
  • Fig. 7 illustrates an example of the improvement in C50 provided by a conventional beamformer for different filter lengths.
  • the user is in front of the array for the first approx. 10 s, and at an angle of 45 degrees for the next 10 s.
  • the system needs several seconds to converge, especially when long filters are used. This is a significant problem in many hands-free communication systems where the user is free to move during a conversation.
  • Such a problem may be mitigated in the audio system of the described approach as the system may continually track users using ultrasound signals and/or acoustic signals. Specifically, as illustrated in Fig. 8, tracking may be performed using ultrasound signals as the user moves from in front of the sensor (0 degrees) to an angle of 45 degrees.
  • the invention can be implemented in any suitable form including hardware, software, firmware or any combination of these.
  • the invention may optionally be
  • an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the invention may be implemented in a single unit or may be physically and functionally distributed between different units, circuits and processors.

Abstract

An audio system comprises an ultrasound sensor array (105) which has a plurality of ultrasound sensor elements, and an audio band array (101) comprising a plurality of audio band elements. The same array of wideband audio transducers may be used for both the ultrasound sensor array (105) and the audio band array (101). An estimator (107) generates a presence characteristic of a user in response to ultrasound signals received from the ultrasound sensor array. The presence characteristic may specifically comprise a position estimate for the user. An audio array circuit (103) generates a directional response for the audio band array (101) by applying weights to individual audio band signals for the audio band elements. A weight circuit (109) determines the weights in response to the presence characteristic. The system may provide improved adaptation of the directivity of the audio band array (101) and specifically does not require the sound source in the audio band to be active for adaptation.

Description

An audio system and method of operation therefor
FIELD OF THE INVENTION
The invention relates to an audio system and a method of operation therefor, and in particular, but not exclusively, to an audio system capable of estimating user positions.
BACKGROUND OF THE INVENTION
Determination of presence and position related information is of interest in many audio applications including, for example, hands-free communication and smart entertainment systems. The knowledge of user locations and their movement may be employed to localize audio-visual effects at user locations for a more personalized experience in entertainment systems. Also, such knowledge may be employed to improve the performance of hands-free (voice) communications, e.g. by attenuating sound from other directions than the estimated direction of the desired user.
In particular, such applications may use directional audio rendering or capture to provide improved effects. Such directionality can for example be derived from audio arrays comprising a plurality of audio drivers or sensors. Thus, acoustic beamforming is relatively common in many applications, such as in e.g. teleconferencing systems. In such systems, weights are applied to the signals of individual audio elements thereby resulting in the generation of a beam pattern for the array. The array may be adapted to the user positions in accordance with various algorithms. For example, the weights may be continually updated to result in the maximum signal level or signal-to-noise ratio. However, such conventional approaches require the audio source to be present, and consequently the weights of an acoustic array can be adapted only after a source becomes active.
This is disadvantageous in many scenarios. For example, user tracking tends to become inaccurate when there are only short bursts of acoustic activity. Such a scenario is typical for many applications including for example speech applications where the speaker typically only talks in intervals. Furthermore, beamforming can only be employed effectively after a certain duration of acoustic activity as the weight adaptation takes some time to become sufficiently accurate. Also, false detections can occur in the presence of other acoustic sources. For example, if a radio or computer is producing sounds in the room the system may adapt to this sound source rather than the intended sound source, or the adaptation may be compromised by the noise source.
In order to address such issues, it has been proposed to use video cameras to perform position determination and to use the video signal to control the adaptation of the weights. However, such approaches tend to be complex, expensive and resource demanding in terms of computational and power resource usage.
Hence, an improved audio system would be advantageous and in particular a system allowing increased flexibility, reduced resource usage, reduced complexity, improved adaptation, improved reliability, improved accuracy and/or improved performance would be advantageous.
SUMMARY OF THE INVENTION
Accordingly, the invention seeks to preferably mitigate, alleviate or eliminate one or more of the above mentioned disadvantages singly or in any combination.
According to an aspect of the invention there is provided an audio system comprising: an ultrasound sensor array comprising a plurality of ultrasound sensor elements; an estimator for estimating a presence characteristic of a user in response to ultrasound signals received from the ultrasound sensor array; an audio array circuit for generating a directional response of an audio band array comprising a plurality of audio band elements by applying weights to individual audio band signals for the audio band elements; and a weight circuit for determining the weights in response to the presence characteristic.
The invention may provide improved adaptation of the directionality of an audio band array. The approach may for example allow adaptation of filter characteristics for the array processing based on the ultrasound signals. Adaptation of filter characteristics and weights, and thus the directionality of the audio array, may be performed in the absence of sound being generated from a target source. Specifically, the filter characteristics/weights may be set to provide a beam or a notch in a desired direction based on the ultrasound signals.
The invention may in many embodiments provide improved accuracy and/or faster adaptation of audio directionality for the audio band array. The initialisation of weights for the audio band array may for example be based on the presence characteristic.
In some embodiments, the spatial directivity pattern of the audio band array may be adjusted in response to the presence characteristic. For example, if the presence of a user is detected, a directional beam may be generated, and if no user is detected an omnidirectional beam may be generated.
The audio band may be considered to correspond to an acoustic band. The audio band may be a band having an upper frequency below 15 kHz and typically below 10 kHz. The ultrasound band may be a band having a lower frequency above 10 kHz and often advantageously above 15 kHz or 20 kHz.
The weights may be filter weights of individual filters being applied to the individual audio band signals by the array processor. The weights may be complex values and/or may equivalently be delays, scale factors and/or phase shifts.
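As an illustration of the equivalence noted above, a scale factor combined with a delay can be expressed as a single complex weight at a given frequency. The following is a minimal sketch under stated assumptions and is not taken from the application: the function name `complex_weight` and the example values are hypothetical.

```python
import cmath
import math


def complex_weight(gain: float, delay_s: float, freq_hz: float) -> complex:
    """Complex weight equivalent to a scale factor plus a delay at one frequency.

    A delay of delay_s seconds corresponds to a phase shift of
    -2*pi*freq_hz*delay_s radians at frequency freq_hz.
    """
    return gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)


# Assumed example values: a gain of 0.5 and a delay of 0.25 ms at 1 kHz.
# The delay is a quarter of the period, i.e. a phase shift of -pi/2.
w = complex_weight(0.5, 0.25e-3, 1000.0)
magnitude = abs(w)       # scale factor carried by the weight
phase = cmath.phase(w)   # phase shift carried by the weight
```

For a wideband signal a single complex weight is only exact at one frequency, which is one reason a per-element filter or a true delay may be preferred in practice.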
In accordance with an optional feature of the invention, the presence characteristic comprises a position estimate and the audio array circuit is arranged to determine the weights in response to the position characteristic.
This may provide improved performance and/or additional capability for many applications. The invention may e.g. allow beamforming to track users or audio sources even when they do not generate any sound. In many embodiments, it may provide a faster adaptation of a beam pattern to a specific user position.
In accordance with an optional feature of the invention, the audio band elements are audio sensors and the audio array circuit is arranged to generate a directional output signal by combining audio band signals from the audio sensors, the combining comprising applying the weights to the individual audio band signals.
The invention may allow an advantageous control of directivity for an audio capture system based on an audio band sensor array. The approach may allow for an audio band audio capture beam to be adapted even when no sound is generated by the target source. Furthermore, the approach may reduce or mitigate the impact of audio generated by undesired sound sources.
In accordance with an optional feature of the invention, the audio system comprises a plurality of wideband sensors each of which is both an ultrasound sensor of the ultrasound sensor array and an audio sensor of the audio band array.
The same wideband sensor may thus be used as both an audio band element and an ultrasound sensor. This may provide a highly cost efficient implementation in many scenarios. The approach may facilitate and/or improve interworking between the audio band processing and the ultrasound band processing. For example, the approach may in many scenarios allow reuse of parameters determined in response to the ultrasound signals when processing the audio band signals. Specifically, the approach may facilitate and/or improve synchronization between ultrasound and audio band operations and processing.
In accordance with an optional feature of the invention, the plurality of wideband sensors forms both the ultrasound sensor array and the audio band array.
Each of the audio band elements and ultrasound sensors may be implemented by a wideband sensor. The same wideband sensor array may thus be used as the audio band array and the ultrasound sensor array. The ultrasound signals and the audio band signals may be different frequency intervals of the same physical signals, namely the signals from the wideband sensor elements.
The approach may provide a highly cost efficient implementation in many scenarios. The approach may facilitate and/or improve interworking between the audio band processing and the ultrasound band processing.
In accordance with an optional feature of the invention, the audio system further comprises: a user movement model arranged to track a position of a user; an update circuit for updating the user movement model in response to both the ultrasound signals and the audio band signals.
This may provide improved performance in many embodiments and may in many scenarios provide a substantially improved user movement tracking.
In accordance with an optional feature of the invention, the update circuit is arranged to update the user movement model in response to the ultrasound signals when a characteristic of the audio band signals meets a criterion.
This may improve user movement tracking in many scenarios.
The criterion may for example be a criterion which is indicative of the desired sound source not generating any sound. As a simple example, the criterion may be a requirement that a signal level of the audio band signals is below a threshold. The threshold may be a variable threshold which varies in response to other parameters.
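The criterion can be pictured as a simple selection rule for the source of the model update. The sketch below is hypothetical: the function names, the RMS level measure and the default threshold are assumptions chosen for illustration, and the fallback to the audio band estimate when the level is above the threshold is one possible complement to the rule described above.

```python
import math
from typing import Sequence


def rms_level(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of audio band samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def select_position_update(
    audio_samples: Sequence[float],
    audio_position: float,
    ultrasound_position: float,
    threshold: float = 0.01,  # assumed value; could vary with other parameters
) -> float:
    """Pick the position estimate used to update the user movement model.

    If the audio band level is below the threshold (the desired source is
    probably silent), the ultrasound estimate is used; otherwise the audio
    band estimate is used.
    """
    if rms_level(audio_samples) < threshold:
        return ultrasound_position
    return audio_position
```

For instance, `select_position_update([0.0] * 8, 10.0, 30.0)` selects the ultrasound estimate because the audio block is silent.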
In accordance with an optional feature of the invention, the update circuit is arranged to not update the user movement model in response to the ultrasound signals when a characteristic of the audio band signals meets a criterion.
This may improve user movement tracking in many scenarios.
The criterion may for example be a criterion which is indicative of the desired sound source generating sound. As a simple example, the criterion may be a requirement that a signal level of the audio band signals is above a threshold. The threshold may be a variable threshold which varies in response to other parameters.
In accordance with an optional feature of the invention, the weight circuit is arranged to determine ultrasound weight delays for the ultrasound signals to correspond to a direction of an ultrasound source; and to determine audio weight delays for the individual audio band signals to correspond to the ultrasound weight delays.
This may provide facilitated and/or improved performance in many scenarios.
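The reuse of ultrasound-derived delays for the audio band can be sketched as follows. This is an illustrative sketch only: it assumes a uniform linear array with element spacing d, a far-field source at azimuth θ measured from broadside, a speed of sound of 343 m/s and integer sample delays. The function names and the formula τ_i = i·d·sin(θ)/c are assumptions for this sketch and are not quoted from the application.

```python
import math
from typing import List, Sequence

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def steering_delays(num_sensors: int, spacing_m: float, azimuth_rad: float,
                    c: float = SPEED_OF_SOUND_M_S) -> List[float]:
    """Relative arrival delays (seconds) at each element of a uniform
    linear array for a far-field source at azimuth_rad (0 = broadside)."""
    return [i * spacing_m * math.sin(azimuth_rad) / c for i in range(num_sensors)]


def delay_and_sum(signals: Sequence[Sequence[float]],
                  delays_samples: Sequence[int]) -> List[float]:
    """Align each channel by its (non-negative, integer) delay and average."""
    length = min(len(s) - d for s, d in zip(signals, delays_samples))
    return [
        sum(s[n + d] for s, d in zip(signals, delays_samples)) / len(signals)
        for n in range(length)
    ]


# A pulse that reaches each successive sensor one sample later.
channels = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
]
# Delays found from the ultrasound band are applied unchanged to the audio band.
aligned = delay_and_sum(channels, [0, 1, 2])
unaligned = delay_and_sum(channels, [0, 0, 0])
```

With the matching delays the pulse adds coherently to full amplitude, whereas without them it is smeared over three samples at one third of the amplitude.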
In accordance with an optional feature of the invention, the ultrasound sensor array and the audio band array are spatially overlapping.
This may provide facilitated and/or improved performance in many scenarios. The ultrasound sensor array and the audio band array may specifically be substantially collocated.
In accordance with an optional feature of the invention, the audio system further comprises an ultrasound transmitter arranged to transmit an ultrasound test signal, and the estimator is arranged to estimate the presence characteristic in response to a comparison between a characteristic of the ultrasound test signal and a characteristic of the ultrasound signals received from the ultrasound sensor array.
This may provide improved performance. The ultrasound transmitter may be proximal to the ultrasound sensor array and may be substantially collocated therewith. The ultrasound transmitter may in some scenarios be implemented by the same ultrasound transducer(s) as one (or more) of the ultrasound sensors.
In accordance with an optional feature of the invention, the ultrasound test signal is a pulsed ultrasound signal, and the estimator is arranged to perform a movement estimation in response to a comparison of signal segments of the ultrasound signals corresponding to different pulses.
This may provide a particularly practical and/or improved movement detection that may in many scenarios improve performance of the audio system as a whole.
In accordance with an optional feature of the invention, the estimator is arranged to estimate a position of a moving object in response to a difference between the signal segments.
This may provide a particularly practical and/or improved movement detection that may in many scenarios improve performance of the audio system as a whole.
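The comparison of signal segments from different pulses can be sketched as a difference-and-threshold operation per range bin. The code below is a simplified illustration: the function name, the bin size, the threshold and the toy echo values are assumptions, and a practical system would typically operate on correlated or filtered echoes rather than raw samples.

```python
from typing import List, Sequence


def moving_range_bins(prev_echo: Sequence[float], curr_echo: Sequence[float],
                      bin_size: int, threshold: float) -> List[int]:
    """Return the indices of range bins in which the difference between the
    echoes of two consecutive pulses has a mean power above the threshold.

    Echoes from static objects are identical in both segments and cancel
    in the difference; echoes from moving objects do not.
    """
    if len(prev_echo) != len(curr_echo):
        raise ValueError("echo segments must have the same length")
    diff = [c - p for p, c in zip(prev_echo, curr_echo)]
    detected = []
    for start in range(0, len(diff), bin_size):
        segment = diff[start:start + bin_size]
        power = sum(x * x for x in segment) / len(segment)
        if power > threshold:
            detected.append(start // bin_size)
    return detected


# Toy data: a static reflector at samples 1-2 and a reflector that moves
# from sample 4 to sample 5 between the two pulses.
previous = [0.0, 0.8, 0.8, 0.0, 0.5, 0.0, 0.0, 0.0]
current = [0.0, 0.8, 0.8, 0.0, 0.0, 0.5, 0.0, 0.0]
bins = moving_range_bins(previous, current, bin_size=2, threshold=0.05)
```

Only the bin containing the moving reflector is reported; the stronger static echo is discarded because it cancels in the difference signal.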
In accordance with an optional feature of the invention, the audio band elements are audio drivers arranged to generate a sound signal in response to a drive signal, and the individual audio band signals are drive signals.
The invention may allow improved performance and/or facilitated implementation and/or operation of an audio system providing a directional sound reproduction. The approach may for example allow optimization of audio rendering for a specific listening position.
According to an aspect of the invention there is provided a method of operation for an audio system including an ultrasound sensor array comprising a plurality of ultrasound sensor elements, the method comprising: estimating a presence characteristic of a user in response to ultrasound signals received from the ultrasound sensor array; generating a directional response of an audio band array comprising a plurality of audio band elements by applying weights to individual audio band signals for the audio band elements; and determining the weights in response to the presence characteristic.
These and other aspects, features and advantages of the invention will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will be described, by way of example only, with reference to the drawings, in which
Fig. 1 illustrates an example of an audio system in accordance with some embodiments of the invention;
Fig. 2 illustrates an example of a beamformer for an audio sensor array;
Fig. 3 illustrates an example of a beamformer for an audio rendering array;
Fig. 4 illustrates an example of an audio system in accordance with some embodiments of the invention;
Fig. 5 illustrates an example of a transmitted ultrasound signal;
Fig. 6 illustrates an example of an audio system in accordance with some embodiments of the invention; and
Figs. 7-9 illustrate examples of performance for a de-reverberation application.
DETAILED DESCRIPTION OF SOME EMBODIMENTS OF THE INVENTION
Fig. 1 illustrates an example of an audio system in accordance with some embodiments of the invention.
The audio system comprises an audio band array 101 which comprises a plurality of audio band elements/ transducers. The audio band array 101 may be used to provide directional operation of the audio system by individually processing the signals for each of the audio band elements. Thus, the combined effect of the audio band array 101 may correspond to a single audio band element having a directional audio characteristic.
The audio band array 101 is coupled to an array processor 103 which is arranged to generate a directional response from the audio band array by individually processing the signals of the individual audio band elements.
In some embodiments, the audio band array 101 may be used to render sound and the audio band elements/transducers may be audio band drivers/speakers. Thus, an input signal may be applied to the array processor 103 which may generate the individual drive signals for the audio band drivers by individually processing the input signal. Specifically, filter characteristics/weights may be set individually for each of the audio band drivers such that the resulting radiated audio band signals add or subtract differently in different directions. For example, coherent addition can be produced in a desired direction while non-coherent addition (and thus reduced signal levels) is produced in other directions.
In some embodiments, the audio band array 101 may be used to capture sound and the audio band elements/transducers may be audio band sensors. Thus, an output signal may be generated by the array processor 103 by individually processing the individual sensor signals from the audio band sensors and subsequently combining the processed signals. Specifically, filter characteristics/weights may be set individually for each of the audio band sensors such that the combination is more or less a coherent combination in the desired direction.
Fig. 2 illustrates an example wherein four input sensor signals are received from four audio band sensors. It will be appreciated that the array may in other embodiments comprise fewer or more elements. Each of the signals is amplified in an individual low noise amplifier 201 after which each signal is filtered in an individual filter 203. The resulting filtered signals are then fed to a combiner 205 which may e.g. simply sum the filter output signals.
Fig. 3 illustrates an example wherein an input signal is received by a splitter 301 which generates four signals, one signal for each of four audio band drivers. Each of the signals is then filtered in an individual filter 303 after which each filter output signal is amplified in a suitable output amplifier 305. Each of the output amplifiers thus generates a drive signal for an audio band driver.
The directionality of the audio band array can thus be controlled by suitably adapting the individual filters 203, 303. Specifically, the filters 203, 303 can be adapted such that coherent summation is achieved for a desired direction. The directionality of the audio band array can accordingly be modified dynamically simply by changing the characteristics of the filter. Thus, the audio band beam/ pattern of the audio band array can be controlled by modifying the weights of the filters as will be known to the skilled person.
The modification of the filter weights may specifically correspond to a modification of one or more of a gain, a phase and a delay. Each of these parameters may be constant for all frequencies or may be frequency dependent. Further, modifications of the filter weights may be performed in the frequency domain and/or the time domain. For example, time domain adaptation may be performed by adjusting coefficients (taps) of a FIR filter. As another example, the signals may be converted to the frequency domain by a Fast Fourier Transform. The resulting frequency domain signal may then be filtered by applying coefficients/weights to each of the frequency bin values. The resulting filtered frequency domain signal may then be converted back to the time domain by an inverse Fast Fourier Transform.
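As an illustrative, non-limiting sketch, the frequency domain approach described above may be expressed as follows. The function names are hypothetical, and the transform is evaluated directly as a discrete Fourier transform (the quantity a Fast Fourier Transform computes) so that the sketch needs only the Python standard library. It is noted that applying weights per frequency bin in this way corresponds to a circular convolution over the processed block.

```python
import cmath


def dft(x):
    """Discrete Fourier transform, evaluated directly (what an FFT computes)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def filter_in_frequency_domain(x, weights):
    """Apply one complex weight per frequency bin and return the real time signal."""
    spectrum = dft(x)
    weighted = [w * v for w, v in zip(weights, spectrum)]
    return [v.real for v in idft(weighted)]


# Example weights: a linear phase ramp, i.e. a (circular) delay of one sample.
block = [1.0, 2.0, 3.0, 4.0]
weights = [cmath.exp(-2j * cmath.pi * k / len(block)) for k in range(len(block))]
shifted = filter_in_frequency_domain(block, weights)
```

In this example the weights only modify the phase, so the block is rotated by one sample; a gain or a frequency dependent weight would be applied in the same way.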
As a low complexity example, the filters 203, 303 may simply correspond to a variable delay. It is noted that a simple delay corresponds to a filter having an impulse response corresponding to a Dirac pulse at a time position corresponding to the delay. Thus, introducing a variable delay corresponds to introducing a filter wherein the coefficients are weighted to provide the desired delay (e.g. it is equivalent to a FIR filter where the coefficient corresponding to the delay is set to one and all other coefficients are set to zero). For fractional delays (relative to sample instants), FIR interpolation may be considered.
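The equivalence between an integer delay and a FIR filter with a single unit coefficient may be illustrated by the following minimal sketch. The helper names delay_fir and fir_filter are hypothetical, and fractional delays are not covered.

```python
def delay_fir(delay_samples, length):
    """FIR coefficients for a pure integer delay: one unit tap, all others zero."""
    if not 0 <= delay_samples < length:
        raise ValueError("delay must fit inside the filter length")
    taps = [0.0] * length
    taps[delay_samples] = 1.0
    return taps


def fir_filter(taps, signal):
    """Direct-form FIR convolution, truncated to the input length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = fir_filter(delay_fir(2, 4), x)  # x delayed by two samples, zero-filled at the start
```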
Thus, the approach may correspond to a Delay and Sum Beamformer (DSB) for the audio band sensor case.
In some embodiments, more complex filtering may be performed and specifically a frequency dependent filtering may be applied. Thus, the approach may correspond to a Filter and Sum Beamformer (FSB) for the audio band sensor case.
It will be appreciated that in some embodiments further processing of the individual signals may be performed. For example, gain adjustment or compensation may be introduced for an audio band rendering system. E.g. calibration may be performed to compensate for variations in the characteristics of the audio band drivers.
As another example, the combination of the audio band sensor example may take other signals into account and may for example subtract signals that are derived from the individual signals. For example, side-lobe cancelling may be introduced by subtracting a suitably generated estimate of such signals. Various algorithms are known for controlling weights of an audio band beamformer. Generally these algorithms determine the weights for the audio band beamformer based on knowledge of a desired directivity, e.g. based on predetermined values relating directions to weights (e.g. using a look up table). For the audio band sensor case, the weights are typically adapted in a feedback fashion based on the received audio. For example, the weights are dynamically adapted to provide a maximum signal level or a maximum signal to noise ratio estimate.
However, in the system of Fig. 1 the adaptation of the filter characteristics is alternatively or additionally dependent on the ultrasound audio environment. The system comprises an ultrasound sensor array 105 which comprises a plurality of ultrasound sensors that generate a plurality of ultrasound signals. The ultrasound signals are fed to an estimation processor 107 which is arranged to generate a presence estimate for a user in response to the ultrasound signals. The estimation processor 107 is coupled to a weight processor 109 which is further coupled to the array processor 103. The weight processor 109 is arranged to determine the filter characteristics for the array processor 103 in response to the presence estimate.
The system thus uses characteristics estimated from the ultrasound audio environment to control the operation in the audio band. The ultrasound band may be considered to be the frequency range from 10 kHz upwards, whereas the audio band may be considered to be the frequency range below 15 kHz. The audio band will thus include frequency intervals below 15 kHz.
In the specific example of Fig. 1, the system further comprises an ultrasound transmitter 111 which is arranged to radiate ultrasound signals. Such signals will be reflected by objects in the room and the reflected signals or echoes can be captured by the ultrasound sensor array 105.
Thus, in the system of Fig. 1 the filter characteristics and weights may fully or partially for at least some of the time be dependent on the received ultrasound signals and specifically on echoes from radiated ultrasound signals. The estimation processor 107 receives the ultrasound signals from the sensor array 105 and based on these it estimates a presence characteristic for a user. The presence characteristic may in a simple example simply indicate whether a user is estimated to be present or not. However, in most embodiments, the presence characteristic is an indication of a position of a user. It will be appreciated that a full position estimate need not be determined but that in some embodiments, the estimation processor 107 may e.g. simply estimate a rough direction to the user. Based on the determined presence characteristic, the weight processor 109 proceeds to determine suitable weights to result in a desired beam pattern for the specific presence characteristic.
As a simple example, the audio system may be set up in an environment wherein the ultrasound transmitter 111 does not generate any significant echoes at the ultrasound sensor array 105 (e.g. in a large empty space where all objects are sufficiently far away to not generate significant echoes). However, when a user enters an area in front of the ultrasound transmitter 111 and ultrasound sensor array 105, a significant echo may be generated. The estimation processor 107 may perform a very simple detection by comparing the ultrasound signal level to a threshold and setting the presence indicator to indicate a presence of a user if the threshold is exceeded and otherwise setting it to indicate that no user is detected. The weight processor 109 may then proceed to modify the weights accordingly. For example, if no user is present the weights may be set to provide a pattern which is as omnidirectional as possible, and if a user is detected the weights may be set to provide a predetermined narrow beam in the direction of a nominal position where the user is assumed to be (e.g. directly in front of the ultrasound sensor array 105). Such an approach may be suitable for many applications and can be used for both audio rendering/playback and for audio capturing. E.g. when no user is present sound is radiated in all directions and/or captured from all directions. This may support peripheral users in different positions.
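Purely as an illustration, the threshold detection and the subsequent selection between two predetermined weight sets may be sketched as follows. The use of mean signal power as the signal level, the four-element array, and the two preset weight vectors are assumptions made for the example and are not taken from the description above.

```python
# Assumed presets for a four-element array (hypothetical values):
OMNI_WEIGHTS = [1.0, 0.0, 0.0, 0.0]             # single element only, widest pattern
FRONT_BEAM_WEIGHTS = [0.25, 0.25, 0.25, 0.25]   # in-phase sum, beam directly in front


def detect_presence(ultrasound_samples, threshold):
    """Presence indicator: True if the mean power of the echo exceeds the threshold."""
    power = sum(s * s for s in ultrasound_samples) / len(ultrasound_samples)
    return power > threshold


def choose_weights(ultrasound_samples, threshold):
    """Narrow beam towards the nominal position if a user is detected, else wide pattern."""
    if detect_presence(ultrasound_samples, threshold):
        return FRONT_BEAM_WEIGHTS
    return OMNI_WEIGHTS


weights_empty_room = choose_weights([0.01, -0.02, 0.01], threshold=0.01)
weights_user_present = choose_weights([0.5, -0.4, 0.6], threshold=0.01)
```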
However, when a user steps in front of the system, the audio system automatically adapts to focus on this specific user.
It will be appreciated that the system seeks to determine a presence/position characteristic for a user but may not know if the ultrasound signals are caused by a user or other object. Thus, the presence characteristic may be considered to be a presence characteristic for an object. The object may then be assumed to be the user.
In many embodiments, the presence characteristic may comprise or consist of a position (direction) estimate for the user and the weight processor 109 may be arranged to determine weights to provide a suitable pattern for this direction (e.g. by directing a beam in that direction). The audio system may thus use ultrasound measurements to adjust the directivity of an audio band beam.
In many scenarios, the ultrasound sensor array 105 and the audio band array 101 may be substantially collocated and may e.g. be adjacent to each other. However, in many embodiments the ultrasound sensor array 105 and the audio band array 101 may advantageously overlap each other. Thus, for an audio capture application, the apertures of the ultrasound sensor array 105 and the audio band (sensor) array 101 may overlap each other. An example is where ultrasound sensors are placed in-between audio band sensors such that the arrays are interleaved with each other. Such an approach provides for improved and facilitated operation and increased accuracy. Specifically there is no necessity for complex calculations to translate positions relative to the ultrasound sensor array 105 to positions relative to the audio band array 101. Rather, if an estimated direction to a user is determined based on the ultrasound signals, this direction can be used directly when determining suitable filter weights for the audio band signals.
In the following, more specific examples of the system will be described. The description will focus on an audio capture system which adapts the audio beam pattern towards a desired sound source. The audio system may for example be a teleconferencing system.
In the example, the ultrasound sensor array 105 and the audio band array 101 are not only collocated or overlapping but actually use the same audio band elements. Fig. 4 illustrates an example of the exemplary audio capture system.
The system of Fig. 4 comprises an audio band array of audio band transducers in the form of wideband audio band sensors 401. Each of the wideband audio sensors 401 captures sound in a wideband range which covers at least part of the audio band and the ultrasound band. Indeed the active frequency interval for capture by the wideband audio sensors 401 includes frequencies below 2 kHz and above 10 kHz (or below 500Hz or 1 kHz and/or above 15 kHz or 20 kHz in many scenarios).
Thus, each of the wideband audio sensors 401 is both an audio band sensor and an ultrasound sensor. Hence, the same sensors are used both to provide the captured audio input as well as the ultrasound input.
The wideband audio sensors 401 are coupled to an array processor 403 which proceeds to filter and combine the audio band signals as described for the array processor 103 of Fig. 1. However, in many scenarios the array processor 403 may further low pass filter the signals to limit the signals to the audio band.
Similarly, the wideband audio sensors 401 are coupled to an estimator 405 which is arranged to determine a presence characteristic for a user along the same lines as the estimation processor 107 of Fig. 1. The estimator 405 is coupled to a weight processor 407 which is arranged to determine the weights for the array processor 403 based on the presence characteristic corresponding to the approach of the weight processor 109 of Fig. 1. In the system where the same transducer signals are used both for the audio band and ultrasound processing, the respective ultrasound signals may e.g. be generated by a high pass filtering of the transducer signals and the audio band signals may be generated by a low pass filtering of the transducer signals.
An audio band signal may have at least 80% of the total signal energy below 10 kHz whereas an ultrasound signal may have at least 80% of the total signal energy above 10 kHz.
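As a non-limiting sketch, the energy criterion may be checked by measuring the fraction of the signal energy located above 10 kHz. The function name, the 96 kHz sample rate and the test tones are assumptions made for the example; the spectrum is evaluated directly rather than with an FFT, and a signal with non-zero energy is assumed.

```python
import cmath
import math


def energy_fraction_above(x, sample_rate, cutoff_hz):
    """Fraction of the energy of a real signal x located above cutoff_hz."""
    n = len(x)
    total = above = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins of a real signal
        value = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        # DC and Nyquist bins occur once, all other bins have a mirrored twin.
        weight = 1.0 if k == 0 or 2 * k == n else 2.0
        energy = weight * abs(value) ** 2
        total += energy
        if k * sample_rate / n > cutoff_hz:
            above += energy
    return above / total


fs = 96000.0
tone_25k = [math.sin(2 * math.pi * 25 * t / 96) for t in range(96)]  # 25 kHz tone
tone_1k = [math.sin(2 * math.pi * 1 * t / 96) for t in range(96)]    # 1 kHz tone
is_ultrasound = energy_fraction_above(tone_25k, fs, 10000.0) >= 0.8
is_audio_band = energy_fraction_above(tone_1k, fs, 10000.0) <= 0.2
```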
The system further comprises an ultrasound transmitter 409 which is located centrally in the audio array 401.
The system of Fig. 4 may operate similarly to that described for the capture application of Fig. 1. However, typically, the system may specifically be used to estimate user positions based on the ultrasound signals, and this position estimate may be used to fully or partially control the weights of the audio band combining in order to provide a desired directive sensitivity of the audio capture.
It will be appreciated that the weights may not only be determined based on the presence or position estimate generated from the ultrasound signals but may in some scenarios alternatively or additionally be generated based on the audio band signals captured by the audio array 401 (and typically generated by filtering of these or in some cases used directly when the ultrasound signal components are negligible when performing the audio band processing). For example, the audio system may include conventional functionality for adapting the weights of a beamformer for an audio array. However, during intervals of no sound or at initialisation, the ultrasound signals can be used to determine suitable weights which can be used for the beamforming algorithm. Thus, initialisation of an audio beamforming algorithm may be performed using the weights determined from the ultrasound signals.
The use of wideband sensors as both audio band and ultrasound sensors provides a number of advantages. Indeed, it may facilitate implementation and manufacturing as fewer sensors are used. This may reduce cost and often reduce the form factor of the sensor segment of the audio system. It may for example allow implementation of a teleconferencing audio system using the described approach in a single relatively small enclosure. However, the approach may further provide improved performance and may in particular provide higher accuracy and/or reduced or facilitated signal processing with reduced complexity. Indeed, the translation between different audio band and ultrasound sensor arrays may often be substantially facilitated. Indeed, in many scenarios the parameters determined to result in a coherent addition for the ultrasound signals may directly be used as parameters for the audio beamforming. E.g. the same delays may be used for each individual path.
As a specific example, the system may be used for hands-free communication where one or more users communicate with remotely located users using a fixed system. In order to provide a high quality interface, acoustic beamforming can be performed in order to localize the sources and direct the acoustic beam to those locations. However, this conventionally requires sources to be (acoustically) active. In conventional systems, during and immediately after periods of inactivity, the beamforming weights need to be updated if the sources have moved, leading to drops in quality. As an example scenario, consider an active source at a certain location. The source goes quiet and moves to another location and then again becomes active. Communication would initially suffer since the acoustic beamformer weights need updating. Also, if there are non-human acoustic sources like a TV or notebook operating in the environment, the beamforming weights that are computed could be inaccurate, resulting in poor quality or even a communication outage.
However, in the present system, such disadvantages can be mitigated by the ultrasound signals being used to track and update the weights during intervals without acoustic activity. Furthermore, external noise sources are unlikely to affect the ultrasound processing thereby providing more reliable estimates which could be used in case of excessive undesired noise.
In many embodiments, the presence characteristic is thus a position estimate or indication, such as e.g. a direction of the assumed user. The position estimate can be determined in response to the ultrasound signal transmitted by the ultrasound transmitter 409. In particular the signal components in the ultrasound band can be used to compute user locations based on time-of-flight processing which allows a computation of range and/or direction-of-arrival processing for angular information.
In the following, an example will be described based on the ultrasound transmitter transmitting a pulsed signal, e.g. such as the one illustrated in Fig. 5. In the example, T denotes the duration over which the pulse comprising sinusoids is transmitted and PRI denotes the duration over which echoes may be received.
The estimator 405 may for each pulse correlate the received ultrasound signal from each wideband audio sensor to delayed versions of the transmitted pulse. The delay which results in the largest correlation can be considered to correspond to the time of flight for the ultrasound signal and the relative difference in the delays (and thus the times of flight) between the array elements can be used to determine a direction towards the object reflecting the ultrasound.
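By way of illustration only, the correlation against delayed versions of the transmitted pulse may be sketched as follows. The function name, the short alternating pulse and the 100 kHz sample rate are assumptions made for the example, and noise is ignored.

```python
def time_of_flight(received, pulse, sample_rate):
    """Delay (in seconds) at which the received signal best correlates with the pulse."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(received) - len(pulse) + 1):
        score = sum(p * received[lag + k] for k, p in enumerate(pulse))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


pulse = [1.0, -1.0, 1.0, -1.0]
# Hypothetical echoes: the same attenuated pulse arrives 5 and 7 samples after transmission.
sensor_a = [0.0] * 5 + [0.5 * p for p in pulse] + [0.0] * 3
sensor_b = [0.0] * 7 + [0.5 * p for p in pulse] + [0.0] * 1
fs = 100000.0
tof_a = time_of_flight(sensor_a, pulse, fs)
tof_b = time_of_flight(sensor_b, pulse, fs)
delay_difference = tof_b - tof_a  # relative delay between the two array elements
```

The individual times of flight relate to range, whereas the difference between the elements is the quantity from which a direction towards the reflecting object can be derived.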
In some embodiments, the ultrasound signals are also used to provide a motion estimate for the user. Specifically, the ultrasound position estimate may be based on moving objects, i.e. on changes in the echoes received by the wideband sensors.
For example, the ultrasound transmitter may emanate a series of pulses, such as those of Fig. 5. The estimator 405 may then proceed to first determine the range of the moving sources only from the wideband sensor array 401 while discarding static objects from consideration. The estimator 405 in the example proceeds to consider the difference of the received signals from two consecutive transmit pulses rather than consider each response individually. Echoes from static objects result in the same contribution in received signals from consecutive transmit pulses, and hence the difference would be (close to) zero. Echoes from moving sources on the other hand result in a non-zero difference signal. Signal power is then computed per range bin based on the difference signal. A moving source is determined to be present at a certain range bin if the computed signal power exceeds a detection threshold. The detection threshold may be adapted to ambient noise conditions. Having determined the radial range, the angular position may be calculated by determining the direction-of-arrival (DoA) of the moving sources. The range along with the angle gives the instantaneous location of each moving source.
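The difference processing may, as a minimal sketch, be expressed as follows. The function name, the bin size, the fixed threshold and the synthetic echo responses are assumptions made for the example; an adaptive threshold and the direction-of-arrival step are not shown.

```python
def moving_range_bins(echo_prev, echo_curr, bin_size, threshold):
    """Indices of range bins whose pulse-to-pulse difference power exceeds the threshold."""
    diff = [c - p for p, c in zip(echo_prev, echo_curr)]
    hits = []
    for start in range(0, len(diff), bin_size):
        segment = diff[start:start + bin_size]
        power = sum(v * v for v in segment) / len(segment)
        if power > threshold:
            hits.append(start // bin_size)
    return hits


# Responses to two consecutive pulses: a static echo at samples 2-3 (identical in both)
# and a moving echo that shifts from samples 8-9 to samples 9-10.
prev = [0, 0, 1.0, 1.0, 0, 0, 0, 0, 0.5, 0.5, 0, 0, 0, 0, 0, 0]
curr = [0, 0, 1.0, 1.0, 0, 0, 0, 0, 0, 0.5, 0.5, 0, 0, 0, 0, 0]
detected = moving_range_bins(prev, curr, bin_size=4, threshold=0.05)
```

In this example the static echo cancels in the difference signal, so only the range bin containing the moving echo is reported.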
The location estimate (azimuth) provided by the ultrasound array can be translated into the relative delays that occur when an audio signal emanating from that location propagates to the audio sensors of the array 401. For clarity and simplicity, and without loss of generality, a uniform linear audio sensor array is assumed with an inter-element spacing of d m. Let $\hat{\theta}$ denote the estimate of the location of the audio source (the object reflecting the ultrasound signals) relative to the wideband sensor array 401.
Assuming a far-field model and therefore planar wave propagation, the delay in seconds at sensor i of the array, relative to the first sensor, is given by
$\tau_i = \frac{(i-1)\, d \sin\hat{\theta}}{c},$
where c is the speed of sound in air and $\hat{\theta}$ is measured from the array broadside, such that a source directly in front of the array corresponds to 0 degrees. The signal received at sensor i can be written as
$x_i(t) = s(t - \tau_i) + n_i(t),$
where $s(t)$ is the desired sound and $n_i(t)$ is the noise signal at sensor i.
The relative delays required for forming a beam in the direction of the assumed user can now be computed from the location estimate provided by the ultrasound array. The signals from the audio sensors can specifically be compensated such that the signals for the determined direction add coherently in a Delay-and-Sum Beamformer (DSB) structure:
$\hat{s}(t) = \frac{1}{N} \sum_{i=1}^{N} x_i(t + \tau_i),$
where N is the number of audio sensors.
It will be appreciated that the above equation can be implemented by appropriately delaying the input signals to ensure causality.
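As a non-limiting sketch, the computation of the relative delays and the delay-and-sum combination may be expressed as follows for sampled signals. The function names are hypothetical, and the sketch assumes an angle measured from the array broadside, a speed of sound of 343 m/s, and delays rounded to whole samples. Causality is handled by offsetting all shifts so that none is negative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C


def steering_delays(angle_deg, num_sensors, spacing_m, c=SPEED_OF_SOUND):
    """Far-field delay of each sensor relative to the first (angle from broadside)."""
    s = math.sin(math.radians(angle_deg))
    return [i * spacing_m * s / c for i in range(num_sensors)]


def delay_and_sum(signals, delays_s, sample_rate):
    """Compensate each channel for its delay (rounded to samples) and average."""
    shifts = [round(d * sample_rate) for d in delays_s]
    base = min(shifts)
    shifts = [s - base for s in shifts]  # common offset keeps every shift non-negative
    length = min(len(x) - s for x, s in zip(signals, shifts))
    return [sum(x[n + s] for x, s in zip(signals, shifts)) / len(signals)
            for n in range(length)]


# Three channels carrying the same ramp, delayed by 0, 1 and 2 samples respectively.
source = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
channels = [[0.0] * i + source[:len(source) - i] for i in range(3)]
aligned = delay_and_sum(channels, [0.0, 1.0, 2.0], sample_rate=1.0)
```

Rounding to whole samples is a simplification; as noted earlier for fractional delays, interpolation may be used instead.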
A particular advantage of many systems wherein the audio band array and ultrasound array are closely located, and in particular of a system wherein the same sensors provide both the ultrasound and audio band signals, is that the estimate of the relative delays $\tau_i$ obtained from the ultrasound signals can directly be used for the audio band signal.
This avoids the potential loss in accuracy in having to translate the delays to a location estimate relative to the ultrasound array, and then translate this position back to delays for an audio band array which may be located elsewhere.
Thus, in many embodiments the audio system may directly determine ultrasound weight delays for the ultrasound signals that correspond to a direction of an ultrasound source (such as a reflecting object). The audio band weight delays for the individual audio band signals may then be set directly to correspond to the ultrasound weight delays. In such scenarios the presence characteristic may indeed be represented by the determined delays themselves.
It is noted that although the approach has been described with specific reference to a DSB it is also applicable to e.g. more complex beamformers such as a Filter-and-Sum Beamformer (FSB) or a sidelobe cancelling beamformer.
The approach may provide a number of advantages. For example, resetting the filters to the delays corresponding to the location determined by the ultrasound signals after a period of acoustic inactivity by the user, and then allowing the filters to adapt when the audio band becomes active ensures faster convergence than the case where the filters corresponding to the old location need to be adapted.
The audio system of Fig. 1 may be arranged to track movement of a user, where the estimated movement is updated using both the results from the audio band and from the ultrasound band. The audio beamforming may then be based on the current position estimate for a user. For example, past location information can be combined with a movement model to obtain user movement trajectories, where the model may be continuously updated based on the current position estimated from either the audio band signals, the ultrasound signals, or from both. The user movement model may for example be a simple model which e.g. simply uses the last estimated position as the current position, or may be more complex and for example implement complex movement models that may predict movement and combine position estimates from both the ultrasound and audio bands. The location and movement trajectory information may e.g. then be used as a priori input to the acoustic beamformer, i.e. the array processor 403 may after a pause in the audio from the desired signal source be initialised with weights corresponding to the estimated user position.
This may be particularly advantageous e.g. when the audio source is a person who moves from location A to location B without talking. An audio-only system is unable to track this movement due to the absence of an audible signal, and needs time to converge to the correct weights once the person starts talking from location B. Using the location estimated from the ultrasound array solves this problem as it can continuously track the user during the movement from location A to location B.
Fig. 6 illustrates an example of how the audio system of Fig. 4 may be implemented using a movement model which is updated on the basis of position estimates generated both from the ultrasound signals and from the audio band signals.
In the example, the estimator 405 comprises an ultrasound position estimator 601 which receives the signals from the wideband audio sensors 401 and which generates a position estimate from the ultrasound signal components. The previously described approach may for example be used.
The estimator 405 further comprises an audio band position estimator 603 which receives the signals from the wideband audio sensors 401 and which generates a position estimate from the audio band signal components. It will be appreciated that any suitable algorithm may be used, including for example an adaptive algorithm determining relative delays that result in the maximum summed signal level. It will also be appreciated that in some embodiments, the position determination may be integrated with the beamforming process of the array processor 403 e.g. by the audio system including a feedback path from the array processor 403 to the audio band position estimator 603.
The ultrasound position estimator 601 and the audio band position estimator 603 are coupled to an update processor 605 which is further coupled to a movement model 607. The movement model 607 is a model that generates a position estimate for the user. The update processor 605 controls the movement model based on the position estimates from the ultrasound position estimator 601 and the audio band position estimator 603.
As a simple example, the movement model 607 may simply comprise a memory which stores the latest position estimate provided by the update processor 605. The update processor 605 may continuously evaluate the ultrasound and audio band position estimates and proceed to feed forward the position estimate that is considered to be valid. If both are considered valid, an average position estimate may be forwarded, and if neither is considered valid no position estimate is forwarded.
It will be appreciated that any suitable approach for determining whether a position estimate is valid may be used. For example, the position estimate may simply be considered valid if the signal level of the combined signal is above a given threshold and otherwise may be considered to be invalid.
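The simple memory-based movement model and the update rule may be illustrated by the following sketch. The class and function names are hypothetical, positions are taken to be azimuth angles in degrees, and the validity flags are assumed to have been determined beforehand, e.g. by the threshold test mentioned above.

```python
class MovementModel:
    """Simplest model: a memory holding the latest forwarded position estimate."""

    def __init__(self):
        self.position = None

    def update(self, position):
        self.position = position


def update_model(model, ultrasound_pos, ultrasound_valid, audio_pos, audio_valid):
    """Forward the valid estimate, the average if both are valid, nothing if neither is."""
    if ultrasound_valid and audio_valid:
        model.update((ultrasound_pos + audio_pos) / 2.0)
    elif ultrasound_valid:
        model.update(ultrasound_pos)
    elif audio_valid:
        model.update(audio_pos)
    # Neither estimate valid: the stored position is left unchanged.
    return model.position


model = MovementModel()
both_valid = update_model(model, 10.0, True, 20.0, True)
ultrasound_only = update_model(model, 40.0, True, 0.0, False)
neither_valid = update_model(model, 0.0, False, 0.0, False)
```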
In some embodiments, the ultrasound position estimate may thus be used if the audio band signals meet a criterion. For example, if the audio band signals do not combine to generate a sufficiently high signal level, the user model is not updated on the basis of the audio band position estimate but instead the user model is updated on the basis of the ultrasound position estimate. Thus, if it is likely that the user is not speaking, the ultrasound signals are used for position estimation.
In some embodiments, the ultrasound position estimate may not be used if the audio band signals meet a criterion. For example, if the audio band signals do combine to generate a sufficiently high signal level, the user model is not updated on the basis of the ultrasound position estimate but instead the user model is updated on the basis of the audio band position estimate. Thus, if it is likely that the user is speaking, the audio band signals are used for position estimation.
Thus, in some embodiments array processing may be switched between ultrasound and audible-sound e.g. in order to save power resulting from active ultrasound transmission. Hence, when audible activity (in the human hearing range of frequencies) is detected, the system switches from ultrasound mode to an audio band mode. During the switch, the audio beamforming weights are initialized with the latest location estimates provided by the ultrasound signals. As long as audible activity persists, the audio band signals are used for user localization. When audible activity levels fall below a set threshold, the system switches to the ultrasound mode.
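As an illustrative sketch only, the switching between the ultrasound mode and the audio band mode may be expressed as a small state function. The names and the use of a single threshold for both transitions are assumptions made for the example.

```python
ULTRASOUND_MODE = "ultrasound"
AUDIO_MODE = "audio"


def step(mode, audible_level, threshold, last_ultrasound_position):
    """Return (new_mode, position used to initialise the audio beamformer, or None)."""
    if mode == ULTRASOUND_MODE and audible_level > threshold:
        # Audible activity detected: seed the audio beamformer with the latest
        # location estimate provided by the ultrasound signals.
        return AUDIO_MODE, last_ultrasound_position
    if mode == AUDIO_MODE and audible_level < threshold:
        # Audible activity has fallen below the threshold: resume ultrasound tracking.
        return ULTRASOUND_MODE, None
    return mode, None


switched_to_audio = step(ULTRASOUND_MODE, 0.8, 0.5, 30.0)
stays_in_audio = step(AUDIO_MODE, 0.8, 0.5, 30.0)
back_to_ultrasound = step(AUDIO_MODE, 0.1, 0.5, 30.0)
```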
As another example, improved detection performance may be achieved using joint ultrasound and audio band localization as follows. In the ultrasound mode, if no user is detected possibly because of lack of significant movement over a duration of time, the system may switch to the audio band mode if the audible activity is above a set threshold. This may improve overall user detection. The system may return to the ultrasound mode if movement is detected. Alternately, the system may stay in the audio band mode as long as audible activity remains above the set threshold.
An example of the advantages that can be achieved by the system can be demonstrated by consideration of a dereverberation application where beamforming is used to reduce the amount of reverberation captured by the array. Reverberation affects the clarity of speech, which can be quantified through the clarity index or C50, which is the ratio (in dB) of the energy of the sound arriving at the ear within 50 ms after it is generated to the energy of the sound that arrives after 50 ms. The performance of beamformers that perform dereverberation can thus be measured by the improvement in the clarity index that results from processing.
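The clarity index may, as a minimal sketch, be computed from a sampled room impulse response as follows. The function name and the synthetic impulse response are assumptions made for the example, and the response is assumed to start at the arrival of the direct sound and to contain non-zero energy after 50 ms.

```python
import math


def clarity_c50(impulse_response, sample_rate):
    """C50 in dB: energy within the first 50 ms relative to the energy after 50 ms."""
    split = int(round(0.050 * sample_rate))
    early = sum(h * h for h in impulse_response[:split])
    late = sum(h * h for h in impulse_response[split:])
    return 10.0 * math.log10(early / late)


# Synthetic response at 1 kHz sampling: unit direct sound, then ten late taps of 0.1
# starting at 50 ms, i.e. an early-to-late energy ratio of 10.
response = [1.0] + [0.0] * 49 + [0.1] * 10
c50_db = clarity_c50(response, sample_rate=1000.0)
```

For this synthetic response the early-to-late energy ratio is 10, so the index is approximately 10 dB.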
Fig. 7 illustrates an example of the improvement in C50 provided by a conventional beamformer for different filter lengths. The user is in front of the array for the first approx. 10 s, and at an angle of 45 degrees for the next 10 s. When the user changes location, it can be seen that there is a sharp drop in performance, and the system needs several seconds to converge, especially when long filters are used. This is a significant problem in many hands-free communication systems where the user is free to move during a conversation.
Such a problem may be mitigated in the audio system of the described approach as the system may continually track users using ultrasound signals and/or acoustic signals. Specifically, as illustrated in Fig. 8, tracking may be performed using ultrasound signals as the user moves from in front of the sensor (0 degrees) to an angle of 45 degrees.
This change in location is provided as input to the beamformer. The beamformer weights can then be reset to the delays corresponding to the new location. Fig. 9 illustrates the corresponding improvement in C50. Clearly, faster convergence can be observed when accurate location estimates are provided.
It will be appreciated that the above description for clarity has described embodiments of the invention with reference to different functional circuits, units and processors. However, it will be apparent that any suitable distribution of functionality between different functional circuits, units or processors may be used without detracting from the invention. For example, functionality illustrated to be performed by separate processors or controllers may be performed by the same processor or controllers. Hence, references to specific functional units or circuits are only to be seen as references to suitable means for providing the described functionality rather than indicative of a strict logical or physical structure or organization.
The invention can be implemented in any suitable form including hardware, software, firmware or any combination of these. The invention may optionally be implemented at least partly as computer software running on one or more data processors and/or digital signal processors. The elements and components of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the invention may be implemented in a single unit or may be physically and functionally distributed between different units, circuits and processors.
Although the present invention has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the accompanying claims. Additionally, although a feature may appear to be described in connection with particular embodiments, one skilled in the art would recognize that various features of the described embodiments may be combined in accordance with the invention. In the claims, the term comprising does not exclude the presence of other elements or steps.
Furthermore, although individually listed, a plurality of means, elements, circuits or method steps may be implemented by e.g. a single circuit, unit or processor. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. Also the inclusion of a feature in one category of claims does not imply a limitation to this category but rather indicates that the feature is equally applicable to other claim categories as appropriate.
Furthermore, the order of features in the claims does not imply any specific order in which the features must be worked and in particular the order of individual steps in a method claim does not imply that the steps must be performed in this order. Rather, the steps may be performed in any suitable order. In addition, singular references do not exclude a plurality. Thus references to "a", "an", "first", "second" etc. do not preclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.

Claims

1. An audio system comprising:
an ultrasound sensor array (105) comprising a plurality of ultrasound sensor elements;
an estimator (107) for estimating a presence characteristic of a user in response to ultrasound signals received from the ultrasound sensor array;
an audio array circuit (103) for generating a directional response of an audio band array (101) comprising a plurality of audio band elements by applying weights to individual audio band signals for the audio band elements; and
a weight circuit (109) for determining the weights in response to the presence characteristic.
2. The audio system of claim 1 wherein the presence characteristic comprises a position estimate and the audio array circuit (103) is arranged to determine the weights in response to the position characteristic.
3. The audio system of claim 1 wherein the audio band elements are audio sensors and the audio array circuit (103) is arranged to generate a directional output signal by combining audio band signals from the audio sensors, the combining comprising applying the weights to the individual audio band signals.
4. The audio system of claim 3 comprising a plurality of wideband sensors each of which is both an ultrasound sensor of the ultrasound sensor array (105) and an audio sensor of the audio band array (101).
5. The audio system of claim 4 wherein the plurality of wideband sensors forms both the ultrasound sensor array (105) and the audio band array (101).
6. The audio system of claim 3 further comprising:
a user movement model (607) arranged to track a position of a user;
an update circuit (605) for updating the user movement model in response to both the ultrasound signals and the audio band signals.
7. The audio system of claim 6 wherein the update circuit (605) is arranged to update the user movement model (607) in response to the ultrasound signals when a characteristic of the audio band signals meets a criterion.
8. The audio system of claim 6 wherein the update circuit (605) is arranged to not update the user movement model (607) in response to the ultrasound signals when a characteristic of the audio band signals meets a criterion.
9. The audio system of claim 1 wherein the weight circuit (407) is arranged to determine ultrasound weight delays for the ultrasound signals to correspond to a direction of an ultrasound source; and to determine audio weight delays for the individual audio band signals to correspond to the ultrasound weight delays.
10. The audio system of claim 1 wherein the ultrasound sensor array (105) and the audio band array (101) are spatially overlapping.
11. The audio system of claim 1 further comprising an ultrasound transmitter (111) arranged to transmit an ultrasound test signal, and wherein the estimator (107) is arranged to estimate the presence characteristic in response to a comparison between a characteristic of the ultrasound test signal and a characteristic of the ultrasound signals received from the ultrasound sensor array.
12. The audio system of claim 8 wherein the ultrasound test signal is a pulsed ultrasound signal, and the estimator (107) is arranged to perform a movement estimation in response to a comparison of signal segments of the ultrasound signals corresponding to different pulses.
13. The audio system of claim 12 wherein the estimator (107) is arranged to estimate a position of a moving object in response to a difference between the signal segments.
14. The audio system of claim 1 wherein the audio band elements are audio drivers arranged to generate a sound signal in response to a drive signal, and the individual audio band signals are drive signals.
15. A method of operation for an audio system including an ultrasound sensor array (105) comprising a plurality of ultrasound sensor elements, the method comprising:
estimating a presence characteristic of a user in response to ultrasound signals received from the ultrasound sensor array (105);
generating a directional response of an audio band array (101) comprising a plurality of audio band elements by applying weights to individual audio band signals for the audio band elements; and
determining the weights in response to the presence characteristic.
PCT/IB2012/050007 2011-01-05 2012-01-02 An audio system and method of operation therefor WO2012093345A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
CN201280004763.2A CN103329565B (en) 2011-01-05 2012-01-02 Audio system and operational approach thereof
EP12700525.4A EP2661905B1 (en) 2011-01-05 2012-01-02 An audio system and method of operation therefor
US13/996,347 US9596549B2 (en) 2011-01-05 2012-01-02 Audio system and method of operation therefor
JP2013547941A JP6023081B2 (en) 2011-01-05 2012-01-02 Audio system and method of operating audio system
BR112013017063A BR112013017063A2 (en) 2011-01-05 2012-01-02 audio system and method of operating an audio system
RU2013136491/28A RU2591026C2 (en) 2011-01-05 2012-01-02 Audio system and operation method thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP11150153.2 2011-01-05
EP11150153 2011-01-05

Publications (1)

Publication Number Publication Date
WO2012093345A1 (en) 2012-07-12

Family

ID=45498065

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2012/050007 WO2012093345A1 (en) 2011-01-05 2012-01-02 An audio system and method of operation therefor

Country Status (7)

Country Link
US (1) US9596549B2 (en)
EP (1) EP2661905B1 (en)
JP (1) JP6023081B2 (en)
CN (1) CN103329565B (en)
BR (1) BR112013017063A2 (en)
RU (1) RU2591026C2 (en)
WO (1) WO2012093345A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9392389B2 (en) 2014-06-27 2016-07-12 Microsoft Technology Licensing, Llc Directional audio notification
US9326060B2 (en) 2014-08-04 2016-04-26 Apple Inc. Beamforming in varying sound pressure level
CN105469819A (en) 2014-08-20 2016-04-06 中兴通讯股份有限公司 Microphone selection method and apparatus thereof
US20160066067A1 (en) * 2014-09-03 2016-03-03 Oberon, Inc. Patient Satisfaction Sensor Device
US9782672B2 (en) 2014-09-12 2017-10-10 Voyetra Turtle Beach, Inc. Gaming headset with enhanced off-screen awareness
WO2018008396A1 (en) * 2016-07-05 2018-01-11 ソニー株式会社 Acoustic field formation device, method, and program
RU167902U1 (en) * 2016-10-11 2017-01-11 Общество с ограниченной ответственностью "Музыкальное издательство "Рэй Рекордс" High quality audio output device
CN110875058A (en) * 2018-08-31 2020-03-10 中国移动通信有限公司研究院 Voice communication processing method, terminal equipment and server
CN109597312B (en) * 2018-11-26 2022-03-01 北京小米移动软件有限公司 Sound box control method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09238390A (en) * 1996-02-29 1997-09-09 Sony Corp Speaker equipment
DE19943872A1 (en) * 1999-09-14 2001-03-15 Thomson Brandt Gmbh Device for adjusting the directional characteristic of microphones for voice control
US20080095401A1 (en) * 2006-10-19 2008-04-24 Polycom, Inc. Ultrasonic camera tracking system and associated methods

Family Cites Families (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4586195A (en) 1984-06-25 1986-04-29 Siemens Corporate Research & Support, Inc. Microphone range finder
JPS6292687A (en) * 1985-10-18 1987-04-28 Toshiba Corp Video camera
JPH0438390A (en) 1990-06-01 1992-02-07 Matsushita Electric Works Ltd Soundproof window
US7831358B2 (en) * 1992-05-05 2010-11-09 Automotive Technologies International, Inc. Arrangement and method for obtaining information using phase difference of modulated illumination
US7421321B2 (en) * 1995-06-07 2008-09-02 Automotive Technologies International, Inc. System for obtaining vehicular information
US9008854B2 (en) * 1995-06-07 2015-04-14 American Vehicular Sciences Llc Vehicle component control methods and systems
US6731334B1 (en) * 1995-07-31 2004-05-04 Forgent Networks, Inc. Automatic voice tracking camera system and method of operation
JP3000982B2 (en) * 1997-11-25 2000-01-17 日本電気株式会社 Super directional speaker system and method of driving speaker system
JP2000050387A (en) * 1998-07-16 2000-02-18 Massachusetts Inst Of Technol <Mit> Parameteric audio system
IL127569A0 (en) * 1998-09-16 1999-10-28 Comsense Technologies Ltd Interactive toys
US6408679B1 (en) * 2000-02-04 2002-06-25 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Bubble measuring instrument and method
US9219708B2 (en) * 2001-03-22 2015-12-22 DialwareInc. Method and system for remotely authenticating identification devices
JP2003284182A (en) * 2002-03-25 2003-10-03 Osaka Industrial Promotion Organization Ultrasonic wave sensor element, ultrasonic wave array sensor system, and adjustment method of resonance frequency therefor
WO2005098826A1 (en) * 2004-04-05 2005-10-20 Koninklijke Philips Electronics N.V. Method, device, encoder apparatus, decoder apparatus and audio system
KR100707103B1 (en) * 2004-06-09 2007-04-13 학교법인 포항공과대학교 High directional ultrasonic ranging measurement system and method in air using parametric array
US7995768B2 (en) 2005-01-27 2011-08-09 Yamaha Corporation Sound reinforcement system
KR20080046199A (en) 2005-09-21 2008-05-26 코닌클리케 필립스 일렉트로닉스 엔.브이. Ultrasound imaging system with voice activated controls using remotely positioned microphone
US7733224B2 (en) * 2006-06-30 2010-06-08 Bao Tran Mesh network personal emergency response appliance
US7414705B2 (en) * 2005-11-29 2008-08-19 Navisense Method and system for range measurement
CN100440264C (en) 2005-11-30 2008-12-03 中国科学院声学研究所 Supersonic invasion detection method and detection device
US8829799B2 (en) * 2006-03-28 2014-09-09 Wireless Environment, Llc Autonomous grid shifting lighting device
US8519566B2 (en) * 2006-03-28 2013-08-27 Wireless Environment, Llc Remote switch sensing in lighting devices
US9060683B2 (en) * 2006-05-12 2015-06-23 Bao Tran Mobile wireless appliance
US7539533B2 (en) * 2006-05-16 2009-05-26 Bao Tran Mesh network monitoring appliance
US7372770B2 (en) 2006-09-12 2008-05-13 Mitsubishi Electric Research Laboratories, Inc. Ultrasonic Doppler sensor for speech-based user interface
US20070161904A1 (en) * 2006-11-10 2007-07-12 Penrith Corporation Transducer array imaging system
US8220334B2 (en) * 2006-11-10 2012-07-17 Penrith Corporation Transducer array imaging system
US20080112265A1 (en) * 2006-11-10 2008-05-15 Penrith Corporation Transducer array imaging system
US9295444B2 (en) * 2006-11-10 2016-03-29 Siemens Medical Solutions Usa, Inc. Transducer array imaging system
US20080188752A1 (en) * 2007-02-05 2008-08-07 Penrith Corporation Automated movement detection with audio and visual information
NZ581214A (en) 2007-04-19 2012-01-12 Epos Dev Ltd Processing audible and ultrasonic sound inputs using a sensor with a wide frequency response
US8249731B2 (en) * 2007-05-24 2012-08-21 Alexander Bach Tran Smart air ventilation system
US9317110B2 (en) * 2007-05-29 2016-04-19 Cfph, Llc Game with hand motion control
JP4412367B2 (en) 2007-08-21 2010-02-10 株式会社デンソー Ultrasonic sensor
US8175291B2 (en) 2007-12-19 2012-05-08 Qualcomm Incorporated Systems, methods, and apparatus for multi-microphone based speech enhancement
JP2009244234A (en) * 2008-03-31 2009-10-22 New Industry Research Organization Ultrasonic array sensor and signal processing method
TW200948165A (en) 2008-05-15 2009-11-16 Asustek Comp Inc Sound system with acoustic calibration function
US8842851B2 (en) 2008-12-12 2014-09-23 Broadcom Corporation Audio source localization system and method
US8275622B2 (en) 2009-02-06 2012-09-25 Mitsubishi Electric Research Laboratories, Inc. Ultrasonic doppler sensor for speaker recognition
US8964298B2 (en) * 2010-02-28 2015-02-24 Microsoft Corporation Video display modification based on sensor input for a see-through near-to-eye display
CN102893175B (en) * 2010-05-20 2014-10-29 皇家飞利浦电子股份有限公司 Distance estimation using sound signals

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150115918A (en) * 2013-03-05 2015-10-14 애플 인크. Adjusting the beam pattern of a speaker array based on the location of one or more listeners
US10986461B2 (en) 2013-03-05 2021-04-20 Apple Inc. Adjusting the beam pattern of a speaker array based on the location of one or more listeners
CN105190743A (en) * 2013-03-05 2015-12-23 苹果公司 Adjusting the beam pattern of a speaker array based on the location of one or more listeners
WO2014138134A3 (en) * 2013-03-05 2014-10-30 Tiskerling Dynamics Llc Adjusting the beam pattern of a speaker array based on the location of one or more listeners
US11399255B2 (en) 2013-03-05 2022-07-26 Apple Inc. Adjusting the beam pattern of a speaker array based on the location of one or more listeners
EP3483874A1 (en) * 2013-03-05 2019-05-15 Apple Inc. Adjusting the beam pattern of a speaker array based on the location of one or more listeners
EP3879523A1 (en) * 2013-03-05 2021-09-15 Apple Inc. Adjusting the beam pattern of a plurality of speaker arrays based on the locations of two listeners
US10021506B2 (en) 2013-03-05 2018-07-10 Apple Inc. Adjusting the beam pattern of a speaker array based on the location of one or more listeners
KR101892643B1 (en) * 2013-03-05 2018-08-29 애플 인크. Adjusting the beam pattern of a speaker array based on the location of one or more listeners
EP2965312B1 (en) * 2013-03-05 2019-01-02 Apple Inc. Adjusting the beam pattern of a speaker array based on the location of one or more listeners
CN104640001B (en) * 2013-11-07 2020-02-18 大陆汽车系统公司 Co-talker zero-setting method and device based on multiple super-directional beam former
CN104640001A (en) * 2013-11-07 2015-05-20 大陆汽车系统公司 Cotalker nulling based on multi super directional beamformer
US10244300B2 (en) 2014-04-30 2019-03-26 Oticon A/S Instrument with remote object detection unit
US9813825B2 (en) 2014-04-30 2017-11-07 Oticon A/S Instrument with remote object detection unit
EP2941019A1 (en) * 2014-04-30 2015-11-04 Oticon A/s Instrument with remote object detection unit
US9900723B1 (en) 2014-05-28 2018-02-20 Apple Inc. Multi-channel loudspeaker matching using variable directivity
WO2016043880A1 (en) * 2014-09-16 2016-03-24 Symbol Technologies, Llc Ultrasonic locationing interleaved with alternate audio functions
US10816638B2 (en) 2014-09-16 2020-10-27 Symbol Technologies, Llc Ultrasonic locationing interleaved with alternate audio functions
CN107076823B (en) * 2014-09-16 2019-12-31 讯宝科技有限责任公司 System and method for ultrasonic location interleaved with alternate audio functionality
CN107076823A (en) * 2014-09-16 2017-08-18 讯宝科技有限责任公司 It is intertwined with the ultrasonic wave positioning of alternate audio function
US10264383B1 (en) 2015-09-25 2019-04-16 Apple Inc. Multi-listener stereo image array

Also Published As

Publication number Publication date
US20130272096A1 (en) 2013-10-17
JP2014506428A (en) 2014-03-13
EP2661905A1 (en) 2013-11-13
BR112013017063A2 (en) 2018-06-05
EP2661905B1 (en) 2020-08-12
RU2591026C2 (en) 2016-07-10
RU2013136491A (en) 2015-02-10
US9596549B2 (en) 2017-03-14
CN103329565B (en) 2016-09-28
CN103329565A (en) 2013-09-25
JP6023081B2 (en) 2016-11-09

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 12700525; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2012700525; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 13996347; Country of ref document: US)
ENP Entry into the national phase (Ref document number: 2013547941; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2013136491; Country of ref document: RU; Kind code of ref document: A)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112013017063; Country of ref document: BR)
ENP Entry into the national phase (Ref document number: 112013017063; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20130702)