WO2018158558A1 - Device for capturing and outputting audio - Google Patents
- Publication number
- WO2018158558A1 (PCT/GB2018/050418)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- speaker
- microphones
- housing
- microphone array
- output
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M9/00—Arrangements for interconnection not involving centralised switching
- H04M9/08—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
- H04M9/082—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B3/00—Line transmission systems
- H04B3/02—Details
- H04B3/20—Reducing echo effects or singing; Opening or closing transmitting path; Conditioning for transmission in one direction or the other
- H04B3/23—Reducing echo effects or singing; Opening or closing transmitting path; Conditioning for transmission in one direction or the other using a replica of transmitted signal in the time domain, e.g. echo cancellers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/568—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/22—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired frequency characteristic only
- H04R1/26—Spatial arrangements of separate transducers responsive to two or more frequency ranges
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2201/00—Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
- H04R2201/40—Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
- H04R2201/401—2D or 3D arrays of transducers
Definitions
- This specification generally relates to a device for capturing and outputting audio.
- Audio capture and output devices such as conference speakerphones include a microphone and a speaker for capturing a user's voice and outputting the voice of a recipient.
- Such devices may employ signal processing algorithms to reduce the occurrence of echo or feedback.
- the specification describes a device comprising: an acoustically transparent housing; a microphone array comprising a plurality of microphones for capturing audio data, the plurality of microphones being located at a lower end of the housing; at least one speaker configured to output an output sound dependent on an output audio signal, the speaker being located above the microphone array; and a data processing apparatus configured to provide: a plurality of acoustic echo cancellers, each for use in generating a respective echo cancelled audio signal dependent on audio data derived from a respective microphone, using information derived from the said output audio signal; and an adaptive beamformer configured to combine the echo cancelled audio signals.
- the plurality of microphones may be provided in a circular array.
- the microphones may be provided at equally spaced intervals at the lower end of the housing.
- the microphones may be omnidirectional microphones.
- the microphone array may comprise at least two microphones and no more than eight microphones.
- the speaker may be concentric with the microphone array.
- the microphones may be located no more than 20 mm from the bottom surface of the housing.
- the housing may be cylindrical.
- the housing may be a mesh.
- the housing may comprise a perforated metal.
- the speaker may comprise a high frequency speaker driver and a low frequency speaker driver.
- the speaker may comprise a plurality of speaker drivers, each speaker driver being arranged to direct sound in a different direction to the other speaker drivers.
- the plurality of speaker drivers may be arranged to render the output audio signal as spatial audio.
- the adaptive beamformer may be a minimum variance distortionless response beamformer.
- Figure 1 is a schematic illustration of a device for capturing and outputting audio according to an embodiment of the specification
- Figure 2 is a schematic diagram illustrating the relationship between components of the device according to an embodiment of the specification
- Figure 3 is a schematic illustration of the device according to an embodiment of the specification.
- Figure 4 is a schematic illustration of the device according to an embodiment of the specification.
- Figure 5 is a schematic illustration of the device according to an embodiment of the specification.
Detailed Description
- the device described herein is a device for capturing audio and outputting sound.
- the device may be, for example, a speakerphone device used for communication over a telephone network, such as in a teleconferencing system.
- the device may be a speakerphone for use in conference calling, wherein the microphones capture sound including a user's voice for transmission to a recipient at the end of the telephone connection.
- the speaker may output sound corresponding to an audio signal transmitted to the device from the recipient to the user.
- the device may be configured to output audio transmitted to the speaker from a server.
- the output audio may, for example, comprise computer synthesised speech, an audio track, such as music, or a pre-recorded voice file.
- the audio captured by the microphones in such a device may be transmitted to the server.
- the device or the server may be configured to recognise certain sounds or words captured by the microphone as instructions, and the device may provide an output in response to the instruction.
- Figure 1 is a schematic illustration of a device 1 for capturing and outputting audio according to an embodiment of this specification.
- the device 1 comprises an acoustically transparent housing 10.
- the housing 10 may comprise any material which allows sound waves to pass through.
- the housing 10 may comprise a mesh.
- the mesh may be a material including a plurality of holes through which sound waves are able to pass.
- the mesh may be formed of perforated metal.
- any suitable acoustically transparent material may be utilised.
- a microphone array 20 is provided at a lower end of the housing 10.
- the microphone array 20 comprises a plurality of microphones 21.
- the microphones 21 are configured to capture audio data.
- the captured audio data may be a voice of one or more users speaking in the vicinity of the microphone array 20.
- the captured audio data may also include background noise other than the voice of a user. Due to the housing 10 being acoustically transparent, each microphone 21 may capture audio data from audio sources provided in a 360 degree range around the microphone array 20.
- the microphones 21 may be provided as a circular array 20.
- the spacing between the microphones 21 may be uniform.
- the microphone array 20 may be configured to uniformly capture audio data in a 360 degree range.
- the microphone array 20 may additionally include a central microphone element.
- the microphones 21 may be omnidirectional microphones. This allows various beamforming algorithms to be implemented in order to improve reduction of background noise and reverberation in the captured audio data. If arrays of
- the separation between the microphones 21 may be no more than a few centimetres. In this way, the directional sensitivity of the microphone array 20 may be improved while maintaining a substantially uniform 360 degree pickup range of audio sources.
- the microphone array 20 may comprise no more than eight microphones 21, and no fewer than two microphones 21. In this way, production costs may be reduced while providing a uniform 360 degree pickup.
- the microphone array 20 may be provided inside the housing 10 on the bottom surface.
- By including the microphones 21 on the bottom surface, interference from sound reflected off the surface on which the device 1 is seated may be reduced. This may occur because the distance that the reflected sound waves travel from the reflecting surface to the microphones 21 is small, which reduces the phase difference between the sound received at the microphones 21 directly from the audio source and the reflected sound waves, thereby reducing destructive interference.
- the destructive interference which may occur is commonly known as the "comb filtering effect". Indeed, the signal may be increased by constructive
- the microphones 21 may be provided to be no more than 20 mm from the bottom surface of the housing. Additionally, to reduce any comb filtering effect, the speaker enclosure may be provided at least 100 mm above the microphone array to reduce its impact on the sound field in proximity of the microphone array. By placing the speaker enclosure at least 100 mm away from the microphone array, there is flexibility in the options for designing the microphone array. For example, if desired, the microphone array could be configured to use a large number of microphones, and may include a central microphone element.
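As a rough illustration of why a small microphone height helps, the sketch below estimates the first comb-filter notch for a single surface reflection. The geometry (a point source, an ideally reflecting surface, the image-source model), the speed of sound of 343 m/s, the example distances and the function name `first_notch_hz` are all assumptions made for illustration; none of them is taken from the specification.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def first_notch_hz(mic_height_m, source_height_m, horizontal_distance_m):
    """First destructive-interference frequency for one surface reflection.

    Uses the image-source model: the reflected path is the straight line from
    the mirror image of the source (below the surface) to the microphone.
    """
    direct = math.hypot(horizontal_distance_m, source_height_m - mic_height_m)
    reflected = math.hypot(horizontal_distance_m, source_height_m + mic_height_m)
    path_difference = reflected - direct
    # The first notch occurs where the path difference is half a wavelength.
    return SPEED_OF_SOUND / (2.0 * path_difference)

# Hypothetical talker 0.4 m above the table and 1 m away; two microphone heights.
low = first_notch_hz(0.02, 0.4, 1.0)
high = first_notch_hz(0.10, 0.4, 1.0)
print(f"mic at 20 mm: first notch near {low / 1000:.1f} kHz")
print(f"mic at 100 mm: first notch near {high / 1000:.1f} kHz")
```

Under these assumed dimensions, the lower microphone position moves the first notch to a considerably higher frequency than the higher position, which is consistent with the reasoning given for keeping the microphones close to the bottom surface.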
- a speaker 30 is provided at a plane located above the microphone array 20.
- the speaker 30 may be configured to output an output sound dependent on an output audio signal.
- the speaker 30 may be configured to receive the output audio signal, and to output an output sound corresponding to the output audio signal.
- the speaker 30 may be arranged to direct sound in a direction away from the microphones 21.
- the speaker 30 is arranged to direct the output sound in an upward direction.
- the speaker 30 may be arranged to direct sound in directions other than an upward direction.
- the speaker 30 is generally arranged so that the sound generated by the speaker 30 is not directed in the direction of the microphones 21. In this way, the device 1 may reduce the amount of audio captured by the microphones 21 which corresponds to sound output by the speaker 30. Sound output by the speaker 30, if captured by the microphones 21, may result in an echo in the captured audio data.
- the device 1 includes a data processing apparatus 40 which is configured to provide a plurality of acoustic echo cancellers 50 and a beamformer 60, described in more detail with respect to Figure 2.
- the acoustic echo cancellers 50 and the beamformer 60 may be implemented in the data processing apparatus 40 by way of a computer program.
- the computer program when executed by the data processing apparatus 40, causes the data processing apparatus 40 to perform acoustic echo cancellation using audio data derived from each of the microphones 21 individually. Additionally, the computer program, when executed by the data processing apparatus, causes the data processing apparatus to perform a beamforming process on the echo cancelled signals 53 which are generated using the acoustic echo cancellers 50.
- the echo cancelling process comprises removing components corresponding to the output audio signal from the audio data derived from each individual microphone 21.
- An acoustic echo canceller 51 may be provided to correspond respectively to each of the microphones 21.
- Each acoustic echo canceller 51 is configured to receive the output audio signal and to generate an echo replica, which is subtracted from the respective microphone signal at a subtraction node 52. In this way, the echo cancelling process causes an echo cancelled audio signal 53 to be generated for each microphone.
- the acoustic echo cancellers 50 are adaptive echo cancellers, and may each comprise an adaptive filter. More specifically, in producing an echo cancelled audio signal 53, the output of the subtraction node 52 may be used to adjust the coefficients of the adaptive filter.
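The per-microphone arrangement described above (an echo replica generated from the output audio signal, subtracted at a subtraction node, with the result driving the filter adaptation) can be sketched as follows. This is a minimal time-domain sketch using a normalised least mean squares (NLMS) update, which is a stand-in chosen for brevity and is not stated in the specification. The function names, tap count, step size and the synthetic echo paths are hypothetical.

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, num_taps=64, step=0.5, eps=1e-8):
    """Return the echo cancelled signal for one microphone (time-domain NLMS)."""
    weights = np.zeros(num_taps)   # adaptive filter coefficients
    history = np.zeros(num_taps)   # most recent output-signal samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        history = np.roll(history, 1)
        history[0] = far_end[n]
        echo_replica = weights @ history   # estimate of the echo at this microphone
        error = mic[n] - echo_replica      # output of the subtraction node
        out[n] = error
        # The subtraction node output is used to adjust the coefficients.
        weights += step * error * history / (history @ history + eps)
    return out

def cancel_all(far_end, mic_signals, **kwargs):
    """Run one independent canceller per microphone."""
    return [nlms_echo_cancel(far_end, m, **kwargs) for m in mic_signals]

rng = np.random.default_rng(0)
far_end = rng.standard_normal(4000)
# Two hypothetical microphones whose echo paths differ in gain and delay.
paths = [np.array([0.0, 0.6, 0.3]), np.array([0.0, 0.0, 0.4, -0.2])]
mics = [np.convolve(far_end, h)[: len(far_end)] for h in paths]
cleaned = cancel_all(far_end, mics, num_taps=8)
for m, c in zip(mics, cleaned):
    print("echo power:", np.mean(m[-500:] ** 2), "residual power:", np.mean(c[-500:] ** 2))
```

Because each microphone has its own canceller, the two echo paths in this example do not need to match in gain or phase.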
- the beamformer 60 is configured to receive each echo cancelled audio signal 53.
- the beamformer 60 may be an adaptive beamformer 60 configured to perform adaptive spatial signal processing to the echo cancelled signals 53.
- the beamforming process may adapt to increase a signal strength from a given direction. For example, the direction of a user's voice may be determined using speech detection and direction of arrival algorithms, and the signals from this direction may be increased by the beamformer 60. Additionally, spectral statistics of noise may be determined, and the beamformer 60 may be configured to reduce the noise accordingly. Therefore, the echo cancelled signals 53 corresponding to captured audio data from each microphone 21 are combined by the beamformer 60 and filtered spatially to improve the signal quality based on a desired signal source.
- the beamformer 60 outputs the spatially filtered signal.
- Performing acoustic echo cancellation using audio data derived from each microphone 21 individually, before providing the echo cancelled signals 53 to the beamformer 60, allows the speaker 30 to be placed above the microphone array 20 without requiring the coupling factors and phase of any sound received from the speaker 30 to be equal at each microphone 21. This is in contrast to arrangements in which the beamformer is fed by the original microphone signals and the signal from the beamformer is provided to a single input acoustic echo canceller. Therefore, the arrangement of the speaker 30 in the device 1 above the microphone array 20 may be flexible and not subject to constraints for minimising the phase difference for sound received at the microphones 21 in the array 20.
- the device 1 does not require a null directional response to be achieved along the vertical axis (towards the speaker) by mutually cancelling the echo among the various microphones.
- the device 1 may be configured to implement a variety of adaptive beamforming processes.
- the type of adaptive beamforming may be selected to reduce background noise (which may include stationary and diffuse or non stationary and localised background noise).
- Adaptive beamforming algorithms are able to adjust a directional response so that its nulls are directed towards dominant noise sources.
- the beamformer 60 of the embodiments described herein may be implemented, for example, by using a Minimum Variance Distortionless Response (MVDR) algorithm.
- any suitable beamforming algorithm may be selected, such as, for example, any algorithm falling inside the Generalised Sidelobe Canceller framework.
- MVDR may provide for a flat frequency response in a given microphone pickup direction while reducing the array gain for given interference signal, if the spectral properties of the interference can be estimated.
- the spectral properties may include, but are not limited to, the power spectrum and cross spectrum of microphone pairs. It will be understood that any suitable spectral property may be used. If the microphone array includes a central microphone element, a wider range of beamforming algorithms may be considered.
- the adaptive beamforming algorithms may be configured to reduce localised stationary noise sources (such as, for example, noise from a PC fan). These algorithms may also be extended to reduce non-stationary noise sources if the noise sources can be distinguished from the desired audio sources such as the speech of a user. For example, as described above, speech discrimination algorithms may be used to distinguish between a user's speech and background noise. The algorithms may determine that noise coming from a given region should be treated as interference.
- the beamformer 60 may be configured to reduce any residual echo leftover after the captured audio data has passed through the acoustic echo cancellers 50.
- Such residual echo may include distortion introduced by the speaker 30 to the sound output from the speaker 30. Such distortion does not have a corresponding signal component in the output audio signal, and so the distortion will not be removed when the acoustic echo cancellers 50 remove the output audio signal components from the captured audio data.
- the acoustic echo cancellers 50 may provide estimates of the residual echo spectrum and detection of the various talking states (single talk or double talk, for example), and so the residual echo spectral statistics can be fed to an MVDR beamformer.
- the beamformer 60 may be configured to focus on desired speech sources while reducing noise and residual echo by dynamically shaping its directional response according to various states.
- states may include talking states such as single talk, double talk, near speech only, noise only.
- the directional response may also be shaped based on speech, echo, and noise spectrum and levels.
- Such directional response may not be possible using time invariant beamforming. For example, beamforming followed by acoustic echo cancellation used in other applications requires the use of time invariant beamformers, since any time varying process inserted along the echo path would severely degrade the performance of any acoustic echo cancellation algorithm.
- the beamformer may be configured to switch between various beam pattern shapes and arrangements according to user activity and position, and according to the level of residual echo.
- the beamformer can also be used to reduce residual echo and can be designed to have an upper working limit of between 4kHz and 8kHz.
- the upper working limit may depend on the microphone spacing and may for example be 5kHz, 6kHz or 7kHz.
- the upper working limit may decrease with increasing distance between the microphones.
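The inverse relationship between microphone spacing and upper working limit can be illustrated with the half-wavelength rule of thumb for spatial aliasing, in which the limit is the frequency whose half wavelength equals the spacing. The specification does not state how its limit is derived, so this criterion, the speed of sound of 343 m/s and the function names below are assumptions made for illustration only.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def aliasing_limit_hz(mic_spacing_m):
    """Frequency at which the microphone spacing equals half a wavelength."""
    return SPEED_OF_SOUND / (2.0 * mic_spacing_m)

def spacing_for_limit_m(limit_hz):
    """Microphone spacing whose half-wavelength limit equals limit_hz."""
    return SPEED_OF_SOUND / (2.0 * limit_hz)

for limit_khz in (4, 5, 6, 7, 8):
    spacing_mm = 1000.0 * spacing_for_limit_m(limit_khz * 1000.0)
    print(f"{limit_khz} kHz -> spacing of about {spacing_mm:.1f} mm")
```

Under this assumed criterion, limits in the 4 kHz to 8 kHz range correspond to spacings of a few centimetres, and a larger spacing gives a lower limit.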
- a sealed enclosure 35 around the sides of the loudspeaker and between the loudspeaker and the microphone will provide acoustic shadowing which may be enough to keep any echo coupling low above 4kHz to 8 kHz.
- the audio signal processing performed by the data processing apparatus 40 may be implemented in the short-time Fourier transform (STFT) domain, using a Hamming window with zero padding and Fast Fourier Transform (FFT).
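A minimal sketch of an STFT analysis stage with a Hamming window, zero padding and an FFT is given below. The frame length, hop size, FFT length, sampling rate and the function name `stft_frames` are hypothetical values chosen for illustration; the specification does not give them.

```python
import numpy as np

def stft_frames(signal, frame_len=256, hop=128, fft_len=512):
    """Hamming-windowed frames, zero padded to fft_len, transformed with an FFT."""
    window = np.hamming(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    spectra = np.empty((num_frames, fft_len // 2 + 1), dtype=complex)
    for i in range(num_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        # n=fft_len makes rfft append zeros to the frame before transforming.
        spectra[i] = np.fft.rfft(frame, n=fft_len)
    return spectra

fs = 16000  # assumed sampling rate in Hz
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000.0 * t)
spectra = stft_frames(tone)
peak_bin = int(np.argmax(np.abs(spectra[0])))
print(spectra.shape, peak_bin * fs / 512)
```

Each row of `spectra` holds one frame, so per-bin processing such as echo cancellation or beamforming can operate on a column at a time.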
- the echo cancellation may be performed using a partitioned block frequency domain adaptive filter (PBFDAF) or a generalised multi-delay filter (GMDF).
- the beamforming, and in particular the MVDR, may be conveniently implemented.
- In MVDR, an estimate is made of the covariance matrix of the background noise (or, more generally, interference) for all microphone pairs.
- the covariance matrix is combined with the steering vector (the phase shift factor resulting from a sound coming from the desired beam pick up direction) in order to get the beam weights that generate a directional pattern with flat frequency response in the look direction and minimise noise at the output.
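The combination of covariance matrix and steering vector can be sketched with the standard closed-form MVDR weights, w = R^-1 d / (d^H R^-1 d), evaluated for a single frequency bin. The array geometry (four microphones on a 30 mm radius circle), the far-field plane-wave steering model, the synthetic noise covariance and the function names are assumptions made for illustration and do not come from the specification.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """w = R^-1 d / (d^H R^-1 d) for one frequency bin."""
    r_inv_d = np.linalg.solve(noise_cov, steering)
    return r_inv_d / (steering.conj() @ r_inv_d)

def circular_steering(num_mics, radius_m, azimuth_rad, freq_hz, c=343.0):
    """Far-field steering vector for a uniform circular array in the horizontal plane."""
    mic_angles = 2 * np.pi * np.arange(num_mics) / num_mics
    # Relative arrival delay at each microphone for a plane wave from azimuth_rad.
    delays = -radius_m * np.cos(azimuth_rad - mic_angles) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

num_mics, radius, freq = 4, 0.03, 2000.0
look = circular_steering(num_mics, radius, 0.0, freq)
interferer = circular_steering(num_mics, radius, np.pi / 2, freq)
# Assumed noise model: one strong localised interferer plus weak uncorrelated noise.
noise_cov = 10.0 * np.outer(interferer, interferer.conj()) + 0.01 * np.eye(num_mics)
w = mvdr_weights(noise_cov, look)
print("gain in look direction:", abs(w.conj() @ look))
print("gain towards interferer:", abs(w.conj() @ interferer))
```

The look-direction gain stays at unity by construction (the distortionless constraint), while the gain towards the modelled interferer is driven close to zero.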
- Using an MVDR beamformer, it is possible to generate various beams with various look directions (for example, four beams with four look directions using four microphones), and to use a beam steering algorithm based on spectral distance to pick the "loudest" among all the generated beams. It will be understood that it is possible to extend this sort of method to include speech versus non-speech discrimination, so that the beamformer focuses on real speech only and neglects non-speech noises.
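A simplified sketch of choosing among several fixed look directions is shown below. It ranks four beams by mean output energy at one frequency bin, which is a simplification of the spectral-distance criterion mentioned above, and it uses delay-and-sum weights in place of MVDR weights to stay short. The array geometry, the talker position at 170 degrees, the noise level and the function names are hypothetical.

```python
import numpy as np

def steering(num_mics, radius_m, azimuth_rad, freq_hz, c=343.0):
    """Far-field steering vector for a uniform circular array."""
    mic_angles = 2 * np.pi * np.arange(num_mics) / num_mics
    delays = -radius_m * np.cos(azimuth_rad - mic_angles) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def pick_loudest_beam(beam_weights, snapshots):
    """beam_weights: (num_beams, num_mics); snapshots: (num_mics, num_frames)."""
    beam_outputs = beam_weights.conj() @ snapshots   # one row per beam
    energies = np.mean(np.abs(beam_outputs) ** 2, axis=1)
    return int(np.argmax(energies)), energies

num_mics, radius, freq = 4, 0.03, 2000.0
look_angles = np.deg2rad([0.0, 90.0, 180.0, 270.0])
# Delay-and-sum weights stand in for MVDR weights to keep the sketch short.
weights = np.array([steering(num_mics, radius, a, freq) / num_mics for a in look_angles])

rng = np.random.default_rng(1)
source = rng.standard_normal(200) + 1j * rng.standard_normal(200)
talker = steering(num_mics, radius, np.deg2rad(170.0), freq)
noise = 0.05 * (rng.standard_normal((num_mics, 200)) + 1j * rng.standard_normal((num_mics, 200)))
snapshots = np.outer(talker, source) + noise
best, energies = pick_loudest_beam(weights, snapshots)
print("selected look direction (degrees):", np.rad2deg(look_angles[best]))
```

A speech versus non-speech decision could gate which frames contribute to `energies`, so that only speech frames influence the selection.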
- the data processing apparatus 40 may be of any suitable composition and may include one or more processors of any suitable type or suitable combination of types.
- the data processing apparatus 40 may comprise a programmable processor that interprets computer program instructions and processes data.
- the data processing apparatus 40 may comprise, for example, programmable hardware with embedded firmware.
- a processing apparatus may alternatively or additionally include one or more specialised circuits such as field programmable gate arrays (FPGAs), Application Specific Integrated Circuits (ASICs), signal processing devices etc.
- the data processing apparatus may include memory having computer readable instructions stored thereon, which when executed by a processor causes the processor to cause performance of operations and/or methods described herein.
- the speaker 30 may comprise a number of different frequency speaker drivers. Speaker drivers convert a received electrical audio signal to sound waves.
- the speaker 30 comprises a single driver element 31 arranged to direct sound in an upward direction.
- the speaker 30 may also comprise a cone reflector 32 on top of the driver 31. As such, the sound may be radiated through a ring shaped window.
- the driver 31 may be a full range driver covering the whole audio frequency range. However, the driver 31 may be a limited frequency driver, and may be combined with drivers covering different frequency ranges, as described in more detail with reference to Figures 3 and 4.
- the speaker 30 may be provided in an enclosure 35 sealed around the sides and the base. In this way, the coupling between the sound from the speaker 30 and the microphone array 20 may be reduced by reducing the sound directed towards the microphones 21.
- the microphone array 20 is a circular array 20.
- the speaker 30 is provided to be coaxial with the microphone array 20.
- the speaker 30 may be provided at a location other than coaxial with the microphone array 20.
- the speaker 30 is provided at a plane above the microphone array 20.
- the speaker 30 and the microphone array 20 are separated by the acoustically transparent housing 10.
- the acoustically transparent housing 10 allows the microphone array 20 to be exposed to surrounding sound.
- a greater physical separation between the speaker 30 and the microphone array 20 may reduce the amount of sound output by the speaker 30 which is received at the microphone array 20. This may help to reduce any echo in the captured audio data. Additionally, a greater physical separation between the speaker 30 and the microphone array 20 may reduce any acoustic shadowing or scattering caused by the speaker enclosure 35 affecting the sound field around the array 20.
- the device 1 may include a speaker 30 arrangement which differs to that depicted in Figure 1.
- the speaker 30 includes a low frequency driver 33, also commonly known as a "woofer".
- the speaker 30 also includes a high frequency driver 34, also commonly known as a "tweeter".
- the high frequency driver 34 may be provided on top of the low frequency driver 33 such that it is separated from the low frequency driver 33 by a seal.
- the speaker 30 arrangement may be separated from the microphone array 20 by the enclosure 35.
- a cone reflector 32 may be placed on top of the high frequency driver 34. The cone reflector 32 may help to spread the distribution of the sound. Including two different frequency drivers may improve the frequency response of the speaker 30, but may increase the cost of the speaker 30 compared to the example of Figure 1.
- the speaker 30 is provided such that the drivers 33, 34 are concentric with each other and with the microphone array 20.
- the embodiment is not limited to the drivers 33, 34 being provided in a concentric manner.
- Figure 4 illustrates a device 1 including an alternative speaker 30 arrangement to those depicted in Figures 1 and 3.
- a low frequency driver 33 is provided above the microphone array 20 similarly to Figure 1, arranged to face an upward direction.
- the driver 33 may be provided to be concentric with the microphone array 20.
- the speaker 30 may further comprise a plurality of high frequency drivers 34 directed radially outward. Including a plurality of high frequency drivers 34 may further improve the frequency response of the speaker 30. However, a greater number of high frequency drivers 34 may increase the complexity and cost of the device 1.
- Including a plurality of high frequency drivers 34 may provide for spatial rendering of the high frequency components of a multichannel output audio signal.
- the output audio signal is rendered such that given sounds may be perceived to come from a given spatial location.
- spatial rendering of the output audio signal may be performed, as the system does not require equal phase and coupling factors of the sound received at each microphone 21.
- as higher frequencies may be more relevant for spatial rendering, only one low frequency driver may be required in combination with the plurality of high frequency drivers 34.
- Figure 5 depicts an example in which the device 1 may include a speaker 30
- the speaker 30 is arranged to project sound in a direction substantially perpendicular to the upward facing direction of the speaker 30 described with reference to Figure 1.
- when placed on a surface in a room, the device 1 may be arranged such that the speaker 30 is directed towards a user, as desired. This may improve the experience of the user receiving the sound from the speaker 30.
- the speaker 30 may be provided in this way for example, for use with a video
- the speaker 30 may be configured to be pointing towards the same area.
- the speaker may be provided to face a desired direction without being concentric with the microphone array.
- the echo cancellation and beamforming performed on the captured audio signals allows the speaker to be provided according to any of the embodiments described above. Indeed, it will be understood that the speaker may be positioned in any location above the microphone array and facing any direction. In addition, the embodiments described allow for a large volume enclosure to be provided for the speaker, and the speaker may include multiple drivers in order to improve the frequency response, or to provide spatial audio.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computer Networks & Wireless Communication (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The present invention relates to a device comprising: an acoustically transparent housing; a microphone array comprising a plurality of microphones for capturing audio data, the plurality of microphones being located at a lower end of the housing; at least one speaker configured to output an output sound dependent on an output audio signal, the speaker being located above the microphone array; and a data processing apparatus configured to provide: a plurality of acoustic echo cancellers, each for use in generating a respective echo cancelled audio signal dependent on audio data derived from a respective microphone, using information derived from the said output audio signal; and an adaptive beamformer configured to combine the echo cancelled audio signals.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1703478.6 | 2017-03-03 | ||
GB1703478.6A GB2545359B (en) | 2017-03-03 | 2017-03-03 | Device for capturing and outputting audio |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018158558A1 (fr) | 2018-09-07 |
Family
ID=58543782
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB2018/050418 WO2018158558A1 (fr) | 2017-03-03 | 2018-02-16 | Device for capturing and outputting audio |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2545359B (fr) |
WO (1) | WO2018158558A1 (fr) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112151051A (zh) * | 2020-09-14 | 2020-12-29 | 海尔优家智能科技(北京)有限公司 | Audio data processing method and apparatus, and storage medium |
US20220353593A1 (en) * | 2021-04-29 | 2022-11-03 | Halonix Technologies Private Limited | Apparatus and methods for cancelling the noise of a speaker for speech recognition |
WO2023133589A3 (fr) * | 2022-01-10 | 2023-08-10 | Shure Acquisition Holdings, Inc. | Beamforming microphone with loudspeaker |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10667071B2 (en) | 2018-05-31 | 2020-05-26 | Harman International Industries, Incorporated | Low complexity multi-channel smart loudspeaker with voice control |
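CN110166898B (zh) * | 2019-05-20 | 2021-03-30 | 南京南方电讯有限公司 | Array microphone for high-fidelity long-distance transmission |
CN116325794A (zh) | 2020-10-13 | 2023-06-23 | Ask工业股份公司 | Microphone unit, microphone meta-array, and network with microphone meta-array |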
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040125942A1 (en) * | 2002-11-29 | 2004-07-01 | Franck Beaucoup | Method of acoustic echo cancellation in full-duplex hands free audio conferencing with spatial directivity |
US20060239443A1 (en) * | 2004-10-15 | 2006-10-26 | Oxford William V | Videoconferencing echo cancellers |
GB2495130A (en) * | 2011-09-30 | 2013-04-03 | Skype | Microphone array with beamforming means and echo-cancelling means |
US20160007114A1 (en) * | 2009-11-12 | 2016-01-07 | Robert Henry Frater | Speakerphone and/or Microphone Arrays and Methods and Systems of Using the Same |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7970151B2 (en) * | 2004-10-15 | 2011-06-28 | Lifesize Communications, Inc. | Hybrid beamforming |
US9083782B2 (en) * | 2013-05-08 | 2015-07-14 | Blackberry Limited | Dual beamform audio echo reduction |
- 2017-03-03: GB application GB1703478.6A, patent GB2545359B (en), status Active
- 2018-02-16: WO application PCT/GB2018/050418, publication WO2018158558A1 (fr), status Application Filing
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112151051A (zh) * | 2020-09-14 | 2020-12-29 | 海尔优家智能科技(北京)有限公司 | Audio data processing method and apparatus, and storage medium |
CN112151051B (zh) * | 2020-09-14 | 2023-12-19 | 海尔优家智能科技(北京)有限公司 | Audio data processing method and apparatus, and storage medium |
US20220353593A1 (en) * | 2021-04-29 | 2022-11-03 | Halonix Technologies Private Limited | Apparatus and methods for cancelling the noise of a speaker for speech recognition |
US11627395B2 (en) * | 2021-04-29 | 2023-04-11 | Halonix Technologies Private Limited | Apparatus and methods for cancelling the noise of a speaker for speech recognition |
WO2023133589A3 (fr) * | 2022-01-10 | 2023-08-10 | Shure Acquisition Holdings, Inc. | Beamforming microphone with loudspeaker |
Also Published As
Publication number | Publication date |
---|---|
GB2545359B (en) | 2018-02-14 |
GB2545359A (en) | 2017-06-14 |
GB201703478D0 (en) | 2017-04-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI713844B (zh) | Method and integrated circuit for voice processing | |
US8046219B2 (en) | Robust two microphone noise suppression system | |
WO2018158558A1 (fr) | Device for capturing and outputting audio | |
US9966059B1 (en) | Reconfigurale fixed beam former using given microphone array | |
US10269369B2 (en) | System and method of noise reduction for a mobile device | |
KR101566649B1 (ko) | Near-field null and beamforming | |
JP6196320B2 (ja) | Filter and method for informed spatial filtering using multiple instantaneous direction-of-arrival estimates | |
US9020163B2 (en) | Near-field null and beamforming | |
US9521486B1 (en) | Frequency based beamforming | |
US10250975B1 (en) | Adaptive directional audio enhancement and selection | |
US20030026437A1 (en) | Sound reinforcement system having an multi microphone echo suppressor as post processor | |
US20110129095A1 (en) | Audio Zoom | |
KR20060128944A (ko) | Method for generating noise references for generalized sidelobe cancellation | |
US20130322655A1 (en) | Method and device for microphone selection | |
CN111078185A (zh) | Method and device for recording sound | |
US20200204915A1 (en) | Method of compensating a processed audio signal | |
EP3545691B1 (fr) | Far-field sound capture | |
Sugiyama et al. | A new generalized sidelobe canceller with a compact array of microphones suitable for mobile terminals | |
As’ad et al. | Beamforming designs robust to propagation model estimation errors for binaural hearing aids | |
JP2003250192A (ja) | Sound reinforcement apparatus and in-room sound reinforcement system | |
US20200267490A1 (en) | Sound wave field generation | |
Reindl et al. | An acoustic front-end for interactive TV incorporating multichannel acoustic echo cancellation and blind signal extraction | |
US20240249742A1 (en) | Partially adaptive audio beamforming systems and methods | |
US20230224635A1 (en) | Audio beamforming with nulling control system and methods | |
Adebisi et al. | Acoustic signal gain enhancement and speech recognition improvement in smartphones using the REF beamforming algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18708172; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07.11.2019) |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 18708172; Country of ref document: EP; Kind code of ref document: A1 |