EP2757811B1 - Modal beamforming - Google Patents
- Publication number
- EP2757811B1 (application EP13152209.6A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- function
- regularization
- white noise
- eigenbeam
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2201/00—Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
- H04R2201/40—Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
- H04R2201/401—2D or 3D arrays of transducers
Definitions
- the embodiments disclosed herein refer to sound capture systems and methods, particularly to sound capture methods that employ modal beamforming.
- Beamforming sound capture systems comprise at least (a) an array of two or more microphones and (b) a beamformer that combines audio signals generated by the microphones to form an auditory scene representative of at least a portion of an acoustic sound field. Due to the underlying geometry, it is natural to represent the sound field captured on the surface of a sphere with respect to spherical harmonics. In this context, spherical harmonics are also known as acoustic modes (or eigenbeams) and the appending signal-processing techniques as modal beamforming.
- the sphere may exist physically, or may merely be conceptual.
- the microphones are arranged around a rigid sphere made of, for example, wood or hard plastic.
- the microphones are arranged in free-field around an "open" sphere, referred to as an open-sphere configuration.
- the rigid-sphere configuration provides a more robust numerical formulation, the open-sphere configuration might be more desirable in practice at low frequencies where large spheres are realized.
- Beamforming techniques allow for the controlling of the characteristics of the microphone array in order to achieve a desired directivity.
- One of the most general formulations is the filter-and-sum beamformer, which has readily been generalized by the concept of modal subspace decomposition. This approach finds optimum finite impulse response (FIR) filter coefficients for each microphone by solving an eigenvalue problem and projecting the desired beam pattern onto the set of eigenbeam patterns found.
- Beamforming sound capture systems enable picking up acoustic signals dependent on their direction of propagation.
- the directional pattern of the microphone array can be varied over a wide range due to the degrees of freedom offered by the plurality of microphones and the processing of the associated beamformer. This enables, for example, steering the look direction, adapting the pattern according to the actual acoustic situation, and/or zooming in to or out from an acoustic source. All this can be done by controlling the beamformer, which is typically implemented via software, such that no mechanical alteration of the microphone array is needed.
- WO 03/061336 A1 discloses a microphone array-based audio system that supports representations of auditory scenes using second-order (or higher) harmonic expansions based on the audio signals generated by the microphone array.
- a plurality of audio sensors are mounted on the surface of an acoustically rigid sphere. The number and location of the audio sensors on the sphere are designed to enable the audio signals generated by those sensors to be decomposed into a set of eigenbeams having at least one eigenbeam of order two (or higher). Beamforming (e.g., steering, weighting, and summing) can then be applied to the resulting eigenbeam outputs to generate one or more channels of audio signals that can be utilized to accurately render an auditory scene.
- Alternative embodiments include using shapes other than spheres, using acoustically soft spheres and/or positioning audio sensors in two or more concentric patterns.
- common beamformers fail to be directive at very low frequencies. Therefore, modal beamformers having less frequency-dependent directivity are desired.
- a method for generating an auditory scene comprises: receiving eigenbeam outputs generated by decomposing a plurality of audio signals, each audio signal having been generated by a different microphone of a microphone array, wherein each eigenbeam output corresponds to a different eigenbeam for the microphone array and the microphones are arranged on a rigid sphere or an open sphere; generating the auditory scene based on the eigenbeam outputs and their corresponding eigenbeams, wherein generating the auditory scene comprises applying a weighting value to each eigenbeam output to form steered eigenbeam outputs and combining the steered eigenbeam outputs to generate the auditory scene.
- Generating the auditory scene further comprises applying a regularized equalizing filter to each steered eigenbeam output, the regularized equalizing filter(s) being configured to compensate for acoustic deficiencies of the microphone array and having a regularized equalization function.
- the regularized equalization function is a radial equalization function that comprises the quotient of a regularization function limiting the radial equalization function and a radial function describing an acoustic wave field in the vicinity of the surface of the rigid sphere or the center of the open sphere.
- the regularization function is the quotient of the absolute value of the square of the radial function and the sum of the absolute value of the square of the radial function and a regularization parameter, the regularization parameter being set to a value greater than 0 and smaller than a maximum value that is smaller than infinity.
- a modal beamformer system for generating an auditory scene comprises: a steering unit that is configured to receive eigenbeam outputs and to apply a weighting value to each eigenbeam output to provide steered eigenbeam outputs, the eigenbeam outputs having been generated by decomposing a plurality of audio signals, each audio signal having been generated by a different microphone of a microphone array, wherein each eigenbeam output corresponds to a different eigenbeam for the microphone array and the microphones are arranged on a rigid sphere or an open sphere.
- the system further comprises a weighting unit that is configured to receive the steered eigenbeam outputs and to generate weighted steered eigenbeam outputs, and a summing element configured to combine the weighted steered eigenbeam outputs to generate the auditory scene.
- the weighting unit is further configured to apply a regularized equalizing filter to each steered eigenbeam output, the regularized equalizing filter(s) being configured to compensate for acoustic deficiencies of the microphone array and having a regularized equalization function.
- the regularized equalization function is a radial equalization function that comprises the quotient of a regularization function limiting the radial equalization function and a radial function describing an acoustic wave field in the vicinity of the rigid sphere or open sphere.
- the regularization function is the quotient of the absolute value of the square of the radial function and the sum of the absolute value of the square of the radial function and a regularization parameter, the regularization parameter being set to a value greater than 0 and smaller than a maximum value that is smaller than infinity.
- FIG. 1 is a block diagram illustrating the basic structure of a beamforming sound capture system as described in more detail, for instance, in WO 03/061336.
- the sound capture system comprises a plurality Q of microphones Mic1, Mic2, ... MicQ configured to form a microphone array, a matrixing unit MU (also known as modal decomposer or eigenbeam former), and a modal beamformer BF.
- modal beamformer BF comprises a steering unit SU, a weighting unit WU, and a summing element SE, each of which will be discussed in further detail later in this specification.
- Each microphone Mic1, Mic2, ... MicQ generates a time-varying analog or digital audio signal $S_1(\theta_1,\phi_1,ka), S_2(\theta_2,\phi_2,ka), \ldots, S_Q(\theta_Q,\phi_Q,ka)$ corresponding to the sound incident at the location of that microphone.
- Each spherical harmonic $Y^{+1}_{0,0}(\theta,\phi), Y^{+1}_{1,0}(\theta,\phi), \ldots, Y^{+\sigma}_{m,n}(\theta,\phi)$ corresponds to a different mode for the microphone array.
- the term auditory scene is used generically to refer to any desired output from a sound capture system, such as the system of FIG. 1.
- the definition of the particular auditory scene will vary from application to application.
- the output generated by beamformer BF may correspond to one or more output signals, e.g., one for each speaker used to generate the resultant auditory scene.
- beamformer BF may simultaneously generate beampatterns for two or more different auditory scenes, each of which can be independently steered to any direction in space.
- microphones Mic1, Mic2, ... MicQ may be mounted on the surface of an acoustically rigid sphere or may be arranged on a virtual (open) sphere to form the microphone array.
- weighting unit WU may be arranged upstream of steering unit SU so that the non-steered eigenbeams are weighted (not shown and not claimed).
- FIG. 2 shows a schematic diagram of a possible microphone array MA for the sound capture system of FIG. 1.
- microphone array MA comprises the Q microphones Mic1, Mic2, ... MicQ of FIG. 1 mounted on the surface of an acoustically rigid sphere RS in a "truncated icosahedron" pattern.
- Each microphone Mic1, Mic2, ... MicQ in microphone array MA generates one of the audio signals $S_1(\theta_1,\phi_1,ka), S_2(\theta_2,\phi_2,ka), \ldots, S_Q(\theta_Q,\phi_Q,ka)$ that is transmitted to matrixing unit MU of FIG. 1 via some suitable (e.g., wired or wireless) connection (not shown in FIG. 2).
- the continuous spherical sensor may be replaced by a discrete spherical array, in particular when the subsequent processing is digital-signal processing.
- beamformer BF exploits the geometry of the spherical array of FIG. 2 and relies on the spherical harmonic decomposition of the incoming sound field by matrixing unit MU to construct a desired spatial response.
- steering unit SU generates (according to $Y^{+\sigma}_{m,n}(\theta_{Des},\phi_{Des})$) steered spherical harmonics $Y^{+1}_{0,0}(\theta_{Des},\phi_{Des}), Y^{+1}_{1,0}(\theta_{Des},\phi_{Des}), \ldots, Y^{+\sigma}_{m,n}(\theta_{Des},\phi_{Des})$ from the spherical harmonics $Y^{+1}_{0,0}(\theta,\phi), Y^{+1}_{1,0}(\theta,\phi), \ldots$
- Beamformer BF can provide continuous steering of the beampattern in 3-D space by changing a few scalar multipliers, while the filters determining the beampattern itself remain constant. The shape of the beampattern is invariant with respect to the steering direction. Beamformer BF needs only one filter per spherical harmonic (in the weighting unit WU), rather than per microphone as in known beamforming concepts, which significantly reduces the computational cost.
- the sound capture system of FIG. 1 with the spherical array geometry of FIG. 2 enables accurate control over the beampattern in 3-D space.
- the sound capture system can also provide multi-direction beampatterns or toroidal beampatterns giving uniform directivity in one plane. These properties can be useful for applications such as general multichannel speech pick-up, video conferencing, and direction of arrival (DOA) estimation. It can also be used as an analysis tool for room acoustics to measure, e.g., directional properties of the sound field.
- the eigenbeams are also suitable for wave field synthesis (WFS) methods that enable spatially accurate sound reproduction in a fairly large volume, allowing for reproduction of the sound field that is present around the recording sphere. This allows for all kinds of general real-time spatial audio.
- A circuit that provides the beamforming functionality is shown in detail in FIG. 3.
- the modal beamformer circuit of FIG. 3 receives the Q audio signals $S_1(\theta_1,\phi_1,ka), S_2(\theta_2,\phi_2,ka), \ldots, S_Q(\theta_Q,\phi_Q,ka)$ provided by microphones Mic1, Mic2, ... MicQ, transforms these audio signals into the spherical harmonics $Y^{+1}_{0,0}(\theta,\phi), Y^{+1}_{1,0}(\theta,\phi), \ldots, Y^{+\sigma}_{m,n}(\theta,\phi)$, and steers the spherical harmonics.
- the circuit of FIG. 3 may be realized by hardware (and software) components that (together) build matrixing unit MU and the modal beamformer, which includes steering unit SU, modal weighting unit WU, and summing element SE.
- Matrixing unit MU and steering unit SU include coefficient elements CE that multiply the respective input signals with given coefficients and adders AD that sum up the input signals multiplied with coefficients so that the audio signals $S_1(\theta_1,\phi_1,ka), S_2(\theta_2,\phi_2,ka), \ldots, S_Q(\theta_Q,\phi_Q,ka)$ are decomposed into the eigenbeams, i.e., the spherical harmonics $Y^{+1}_{0,0}(\theta,\phi), Y^{+1}_{1,0}(\theta,\phi), \ldots, Y^{+\sigma}_{m,n}(\theta,\phi)$, which are then processed to provide the steered spherical harmonics $Y^{+1}_{0,0}(\theta_{Des},\phi_{Des}), Y^{+1}_{1,0}(\theta_{Des},\phi_{Des}), \ldots, Y^{+\sigma}_{m,n}(\theta_{Des},\phi_{Des})$.
- Modal weighting unit WU includes delay elements DE, coefficient elements CE, and adders AD, which are connected to form FIR filters for weighting. The output signals of these FIR filters are summed up by summing element SE.
- Matrixing unit MU in the modal beamformer of FIG. 3 is responsible for decomposing the sound field, which is picked up by microphones Mic1, Mic2, ... MicQ, into the different eigenbeam outputs, i.e., the spherical harmonics $Y^{+1}_{0,0}(\theta,\phi), Y^{+1}_{1,0}(\theta,\phi), \ldots, Y^{+\sigma}_{m,n}(\theta,\phi)$, corresponding to the zero-order, first-order, and second-order spherical harmonics.
- This can also be seen as a transformation, where the sound field is transformed from the time or frequency domain into the "modal domain".
- One can also work with the real and imaginary parts of the spherical harmonics.
- weighting unit WU may be implemented accordingly.
- Steering unit SU allows for steering the look direction by the angles $\theta_{Des}$ and $\phi_{Des}$.
- Weighting unit WU compensates for a frequency-dependent sensitivity over the modes (eigenbeams), i.e., modal weighting over frequency, to the effect that the modal composition is adjusted, e.g., equalized.
- Equalizing is used to compensate for deficiencies of the microphone array, e.g., self-noise of the microphones, location errors of the microphones at the surface of the sphere, and other electrical and mechanical drawbacks.
- the order of a modal beamformer has to be reduced toward low frequencies, leading to a gradually decreasing directivity pattern with decreasing frequency.
- the ambisonic components up to $M$th order can be calculated from the $Q$ microphone signals:
- $B = W^{-1}\,(Y^{T}Y)^{-1}\,Y^{T}\,p_a$
- $B = \operatorname{diag}[W_m]^{-1}\,Y^{+}\,p_a$
- $\operatorname{diag}[EQ_m(ka)]$: diagonal matrix having the radial equalizing functions $EQ_m(ka)$, in which $0 \le m \le M$.
- An arrangement for extracting the N ambisonic components $B$ from the wave field $p_a$ is illustrated in FIG. 4.
- the related sound field is defined solely by the pressure distribution $p_a(\theta_q,\phi_q)$ on the sphere's surface, which can be easily measured by sound pressure sensors (microphones).
- p a ⁇ q , ⁇ q , ⁇ q , 0
- inner sources i.e., sources inside the measurement sphere
- outer sources i.e., sources outside the measurement sphere
- the outer sources serve to model the scattered field occurring at the surface of a scattered sphere.
- a parameter called susceptibility $K(\omega)$ or its reciprocal, the white noise gain $WNG(\omega)$, may be used.
- white noise gain $WNG(\omega)$ addresses most effects and problems caused by microphone noise, changes in the transfer function, and variations of the microphone positions, so that it is representative of the sensitivity of the beamformer.
- a white noise gain $WNG(\omega) > 0$ [dB] characterizes a sufficient suppression of uncorrelated errors and is thus indicative of a robust system behavior, while a white noise gain $WNG(\omega) < 0$ [dB] is indicative of an amplification of the noise and therefore of an increasingly unstable system behavior.
- the array gain $G(\omega)$ is the ratio of the energy of sound coming from the look direction of the beamformer to the energy of omnidirectionally incoming sound.
- the array gain $G(\omega)$ is a measure for the improvement in the acoustic signal-to-noise ratio SNR, based on the directivity of the modal beamformer for sound coming from the look direction of the beamformer.
- parameters required for calculation are set to a starting value or a constant value, as the case may be.
- the following parameters may be set to, for instance:
- Regularization provides the ability to achieve a robust system by way of adjusting the frequency-dependent regularization parameter. This is a trade-off between a higher robustness, i.e., a higher white noise gain $WNG_{dB}(\omega)$, and less directivity in look direction $\Psi(\theta_0,\phi_0,\omega)$, i.e., a decreasing array gain $G_{dB}(\omega)$.
- the adaptation process begins with the maximum directivity $G_{dBMax}(\omega)$; the directivity is then decreased by increasing the regularization parameter until the white noise gain no longer falls below the desired threshold $WNG_{dBMin}$.
- Steps 4, 5, and 6 serve to calculate the white noise gain $WNG_{dB}(\omega)$.
- In step 4, the regularization filter $T_m(\omega)$ or $T_m(ka)$ is calculated as outlined above using the regularization parameter.
- In step 5, the transfer function $EQ_m(\omega)$ is calculated as outlined above using the current version of the transfer function $T_m(\omega)$ of the regularization filter or the current version of the regularization parameter.
- In step 6, the white noise gain $WNG_{dB}(\omega)$ is calculated as outlined above using the transfer function $EQ_m(\omega)$ and the current version of the transfer function $T_m(\omega)$ of the regularization filter (regularization function). Steps 4 and 5 may be taken simultaneously or in opposite order.
- In step 10, the directivity $\Psi(\theta_0,\phi_0,\omega)$ of the modal beamformer is calculated for sound coming from the look direction using the transfer function $EQ_m(\omega)$ provided in step 5.
- In step 12, the current white noise gain $WNG_{dB}(\omega)$ is compared with the predetermined white noise gain threshold $WNG_{dBMin}(\omega)$, and it is checked whether the regularization parameter has reached its maximum according to (
- Once the current equalizing function $EQ_m(\omega)$ has been limited to the given threshold, or the current regularization parameter has reached its maximum, the adaptation process for the current angular frequency $\omega$ has been completed.
- In step 14, the current angular frequency $\omega$ is checked to see if it has reached its maximum value $\omega_{Max}$. If $\omega < \omega_{Max}$, the process jumps back to step 2 using the current angular frequency $\omega$. Otherwise, i.e., if the equalizing filter has been adapted for the complete set of frequencies, the filter coefficients are outputted in step 15.
- the directivity characteristic of the beamformer is a 4th-order cardioid and the minimum white noise gain $WNG_{dB}(\omega)$ used in the adaptation process is -10 [dB].
- FIG. 9 illustrates a regularization parameter over frequency for a common 4th-order modal beamformer.
- regularization, i.e., limiting the maximum directivity index for frequencies up to, for instance, 750 [Hz]
- values above a minimum lower threshold $WNG_{dBMin}$ of -10 [dB] may be maintained.
- the exemplary beamformer exhibits the desired directivity of a 4th-order cardioid.
- FIG. 10 illustrates the corresponding white noise gain WNG for the above-mentioned 4th-order beamformer, which supports the findings in connection with the diagram of FIG. 9.
- the corresponding directivity index DI and the array gain $G_{dB}(\omega)$ as shown in FIG. 11 illustrate that the maximum array gain $G_{dB}(\omega)$ is more or less below 10 [dB] depending on the frequency.
- FIG. 16 depicts the resulting directivity of the beamformer outlined above in look direction $\Psi(\theta_0,\phi_0,\omega)$ as amplitudes over frequency.
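The adaptation steps listed above (raise the regularization parameter for each frequency until the white noise gain threshold is met or the parameter reaches its maximum) can be sketched in Python. This is only an illustration under stated assumptions: the function names, the multiplicative `growth` step, the default start and maximum values, and above all `toy_wng_db` are hypothetical. The text does not give the white noise gain formula here, so a simple stand-in is used that treats the largest equalizer gain as the noise amplification.

```python
import math
from typing import Callable, List, Sequence, Tuple


def adapt_regularization(
    radial: Sequence[complex],
    wng_db: Callable[[Sequence[complex]], float],
    wng_db_min: float = -10.0,
    reg_start: float = 1e-8,
    reg_max: float = 1.0,
    growth: float = 2.0,
) -> Tuple[float, List[complex]]:
    """Adapt the regularization parameter for ONE frequency.

    radial holds the radial-function values W_m for that frequency (one per
    order m). The start value, maximum, and growth factor are assumptions.
    """
    if reg_start <= 0.0 or growth <= 1.0:
        raise ValueError("reg_start must be > 0 and growth must be > 1")
    reg = reg_start
    while True:
        # Regularized equalizer per mode: T_m / W_m = conj(W_m) / (|W_m|^2 + reg).
        eq = [w.conjugate() / (abs(w) ** 2 + reg) for w in radial]
        # Stop once the threshold is met or the parameter has reached its maximum.
        if wng_db(eq) >= wng_db_min or reg >= reg_max:
            return reg, eq
        reg = min(reg * growth, reg_max)


def toy_wng_db(eq: Sequence[complex]) -> float:
    """Hypothetical stand-in for the white noise gain in dB (NOT the patent's formula)."""
    peak = max(abs(e) for e in eq)
    return math.inf if peak == 0.0 else -20.0 * math.log10(peak)


# Outer loop over frequencies: one list of radial-function values per frequency.
radial_per_frequency = [
    [1.0 + 0j, 0.1 + 0j, 0.001 + 0j],  # low frequency: higher orders are weak
    [1.0 + 0j, 0.5 + 0j, 0.2 + 0j],
]
results = [adapt_regularization(r, toy_wng_db) for r in radial_per_frequency]
```

Starting from a very small parameter corresponds to starting at maximum directivity; each increase trades directivity for robustness, as described above.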
Landscapes
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Circuit For Audible Band Transducer (AREA)
Description
- The embodiments disclosed herein refer to sound capture systems and methods, particularly to sound capture methods that employ modal beamforming.
- Beamforming sound capture systems comprise at least (a) an array of two or more microphones and (b) a beamformer that combines audio signals generated by the microphones to form an auditory scene representative of at least a portion of an acoustic sound field. Due to the underlying geometry, it is natural to represent the sound field captured on the surface of a sphere with respect to spherical harmonics. In this context, spherical harmonics are also known as acoustic modes (or eigenbeams) and the appending signal-processing techniques as modal beamforming.
- Two spherical microphone array configurations are commonly employed: the sphere may exist physically, or may merely be conceptual. In the first configuration, the microphones are arranged around a rigid sphere made of, for example, wood or hard plastic. In the second configuration, the microphones are arranged in free-field around an "open" sphere, referred to as an open-sphere configuration. Although the rigid-sphere configuration provides a more robust numerical formulation, the open-sphere configuration might be more desirable in practice at low frequencies where large spheres are realized.
- Beamforming techniques allow for the controlling of the characteristics of the microphone array in order to achieve a desired directivity. One of the most general formulations is the filter-and-sum beamformer, which has readily been generalized by the concept of modal subspace decomposition. This approach finds optimum finite impulse response (FIR) filter coefficients for each microphone by solving an eigenvalue problem and projecting the desired beam pattern onto the set of eigenbeam patterns found.
- Beamforming sound capture systems enable picking up acoustic signals dependent on their direction of propagation. The directional pattern of the microphone array can be varied over a wide range due to the degrees of freedom offered by the plurality of microphones and the processing of the associated beamformer. This enables, for example, steering the look direction, adapting the pattern according to the actual acoustic situation, and/or zooming in to or out from an acoustic source. All this can be done by controlling the beamformer, which is typically implemented via software, such that no mechanical alteration of the microphone array is needed.
- International patent application publication WO 03/061336 A1 discloses a microphone array-based audio system that supports representations of auditory scenes using second-order (or higher) harmonic expansions based on the audio signals generated by the microphone array. A plurality of audio sensors are mounted on the surface of an acoustically rigid sphere. The number and location of the audio sensors on the sphere are designed to enable the audio signals generated by those sensors to be decomposed into a set of eigenbeams having at least one eigenbeam of order two (or higher). Beamforming (e.g., steering, weighting, and summing) can then be applied to the resulting eigenbeam outputs to generate one or more channels of audio signals that can be utilized to accurately render an auditory scene. Alternative embodiments include using shapes other than spheres, using acoustically soft spheres and/or positioning audio sensors in two or more concentric patterns. Common beamformers fail to be directive at very low frequencies. Therefore, modal beamformers having less frequency-dependent directivity are desired.
- A method for generating an auditory scene comprises: receiving eigenbeam outputs generated by decomposing a plurality of audio signals, each audio signal having been generated by a different microphone of a microphone array, wherein each eigenbeam output corresponds to a different eigenbeam for the microphone array and the microphones are arranged on a rigid sphere or an open sphere; generating the auditory scene based on the eigenbeam outputs and their corresponding eigenbeams, wherein generating the auditory scene comprises applying a weighting value to each eigenbeam output to form steered eigenbeam outputs and combining the steered eigenbeam outputs to generate the auditory scene. Generating the auditory scene further comprises applying a regularized equalizing filter to each steered eigenbeam output, the regularized equalizing filter(s) being configured to compensate for acoustic deficiencies of the microphone array and having a regularized equalization function. The regularized equalization function is a radial equalization function that comprises the quotient of a regularization function limiting the radial equalization function and a radial function describing an acoustic wave field in the vicinity of the surface of the rigid sphere or the center of the open sphere. The regularization function is the quotient of the absolute value of the square of the radial function and the sum of the absolute value of the square of the radial function and a regularization parameter, the regularization parameter being set to a value greater than 0 and smaller than a maximum value that is smaller than infinity.
- A modal beamformer system for generating an auditory scene comprises: a steering unit that is configured to receive eigenbeam outputs and to apply a weighting value to each eigenbeam output to provide steered eigenbeam outputs, the eigenbeam outputs having been generated by decomposing a plurality of audio signals, each audio signal having been generated by a different microphone of a microphone array, wherein each eigenbeam output corresponds to a different eigenbeam for the microphone array and the microphones are arranged on a rigid sphere or an open sphere. The system further comprises a weighting unit that is configured to receive the steered eigenbeam outputs and to generate weighted steered eigenbeam outputs, and a summing element configured to combine the weighted steered eigenbeam outputs to generate the auditory scene. The weighting unit is further configured to apply a regularized equalizing filter to each steered eigenbeam output, the regularized equalizing filter(s) being configured to compensate for acoustic deficiencies of the microphone array and having a regularized equalization function. The regularized equalization function is a radial equalization function that comprises the quotient of a regularization function limiting the radial equalization function and a radial function describing an acoustic wave field in the vicinity of the rigid sphere or open sphere. The regularization function is the quotient of the absolute value of the square of the radial function and the sum of the absolute value of the square of the radial function and a regularization parameter, the regularization parameter being set to a value greater than 0 and smaller than a maximum value that is smaller than infinity.
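The two quotients described above can be written out numerically. The following Python sketch is an illustration only, not the patented implementation: the names `regularization`, `regularized_eq`, `w`, and `reg` are hypothetical, and the radial-function value `w` (one mode, one frequency) is assumed to be supplied by the caller as a complex number.

```python
def regularization(w: complex, reg: float) -> float:
    """Regularization function: |w|^2 / (|w|^2 + reg), a value in [0, 1)."""
    if reg <= 0.0:
        raise ValueError("the regularization parameter must be greater than 0")
    power = abs(w) ** 2
    return power / (power + reg)


def regularized_eq(w: complex, reg: float) -> complex:
    """Regularized radial equalizer: regularization(w, reg) / w.

    Written as conj(w) / (|w|^2 + reg), which is the same quotient but stays
    finite when w is zero.
    """
    if reg <= 0.0:
        raise ValueError("the regularization parameter must be greater than 0")
    return w.conjugate() / (abs(w) ** 2 + reg)


# Compare the plain inverse 1/w with the regularized equalizer as w shrinks.
reg = 1e-4
for w in (1.0 + 0j, 1e-2 + 0j, 1e-6 + 0j):
    print(abs(1.0 / w), abs(regularized_eq(w, reg)))
```

For a fixed parameter `reg`, the magnitude of this equalizer never exceeds `1 / (2 * sqrt(reg))`, reached where `|w|` equals `sqrt(reg)`, whereas the plain inverse `1 / w` grows without bound as the radial function approaches zero. This is the limiting effect that the regularization function is said to have on the radial equalization function.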
- The figures identified below are illustrative of some embodiments of the invention. The figures are not intended to be limiting of the invention recited in the appended claims. The embodiments, both as to their organization and manner of operation, together with further objects and advantages thereof, may best be understood with reference to the following description, taken in connection with the accompanying drawings, in which:
- FIG. 1 is a schematic representation of a generalized structure of a sound capture system that employs modal beamforming;
- FIG. 2 is a schematic representation of a possible microphone array for the sound capture system of FIG. 1;
- FIG. 3 is a schematic representation of a more detailed structure of a sound capture system that employs modal beamforming;
- FIG. 4 is a schematic representation of an arrangement for extracting ambisonic components with which an arbitrary sound field can be coded and/or decoded;
- FIG. 5 is a schematic representation of an arrangement for measuring a sound pressure field;
- FIG. 6 is a schematic diagram illustrating the radial function of a spherical microphone array;
- FIG. 7 is a schematic diagram illustrating the magnitude frequency response of the equalizer filter corresponding to the radial function illustrated in FIG. 6;
- FIG. 8 is a flow chart illustrating the process of calculating the equalizer filter referred to above in connection with FIG. 7;
- FIG. 9 is a schematic diagram illustrating the regularization parameter over frequency for an improved 4th-order modal beamformer with a given minimal white noise gain of -10 [dB];
- FIG. 10 is a schematic diagram corresponding to the flow chart of FIG. 8 and the diagram of FIG. 9, and illustrating the white noise gain for a 4th-order modal beamformer utilizing a regularized equalizing filter;
- FIG. 11 is a schematic diagram corresponding to the flow chart of FIG. 8 and the diagram of FIG. 9, and illustrating the directivity index for a 4th-order modal beamformer utilizing a regularized equalizing filter;
- FIG. 12 is a schematic diagram illustrating the magnitude frequency response of the improved regularized equalizing filter;
- FIG. 13 is a schematic diagram illustrating the corresponding phase response of the improved filter of FIG. 12;
- FIG. 14 is a schematic diagram illustrating the magnitude frequency response of an improved, regularized equalizing filter;
- FIG. 15 is a schematic diagram illustrating the corresponding phase frequency response of the improved filter of FIG. 14; and
- FIG. 16 is a schematic diagram illustrating the cylindrical view of the directional pattern of the improved 4th-order modal beamformer over frequency.
FIG. 1 is a block diagram illustrating the basic structure of a beamforming sound capture system as described in more detail, for instance, in WO 03/061336. In FIG. 1, modal beamformer BF comprises a steering unit SU, a weighting unit WU, and a summing element SE, each of which will be discussed in further detail later in this specification. Each microphone Mic1, Mic2, ... MicQ generates a time-varying analog or digital audio signal S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka) corresponding to the sound incident at the location of that microphone. - Matrixing unit MU decomposes (according to $Y^{+} = (Y^{T}Y)^{-1}Y^{T}$) the audio signals S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka) generated by the different microphones Mic1, Mic2, ... MicQ to generate a set of spherical harmonics Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ), also known as eigenbeams or modal outputs, where each spherical harmonic Y+1 0,0(θ,ϕ),
Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ) corresponds to a different mode for the microphone array. The spherical harmonics Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ) are then processed by beamformer BF to generate an auditory scene that is represented in the present example by output signal OUT (=Ψ(θDes,ϕDes)). In this specification, the term auditory scene is used generically to refer to any desired output from a sound capture system, such as the system of FIG. 1. The definition of the particular auditory scene will vary from application to application. For example, the output generated by beamformer BF may correspond to one or more output signals, e.g., one for each speaker used to generate the resultant auditory scene. Moreover, depending on the application, beamformer BF may simultaneously generate beampatterns for two or more different auditory scenes, each of which can be independently steered to any direction in space. In certain implementations of the sound capture system, microphones Mic1, Mic2, ... MicQ may be mounted on the surface of an acoustically rigid sphere or may be arranged on a virtual (open) sphere to form the microphone array. Alternatively, weighting unit WU may be arranged upstream of steering unit SU so that the non-steered eigenbeams are weighted (not shown and not claimed). -
FIG. 2 shows a schematic diagram of a possible microphone array MA for the sound capture system of FIG. 1. In particular, microphone array MA comprises the Q microphones Mic1, Mic2, ... MicQ of FIG. 1 mounted on the surface of an acoustically rigid sphere RS in a "truncated icosahedron" pattern. Each microphone Mic1, Mic2, ... MicQ in microphone array MA generates one of the audio signals S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka) that is transmitted to matrixing unit MU of FIG. 1 via some suitable (e.g., wired or wireless) connection (not shown in FIG. 2). The continuous spherical sensor may be replaced by a discrete spherical array, in particular when the subsequent processing is digital-signal processing. - Referring again to
FIG. 1, beamformer BF exploits the geometry of the spherical array of FIG. 2 and relies on the spherical harmonic decomposition of the incoming sound field by matrixing unit MU to construct a desired spatial response. In beamformer BF, steering unit SU generates (according to Y+σ m,n(θDes,ϕDes)) steered spherical harmonics Y+1 0,0(θDes,ϕDes), Y+1 1,0(θDes,ϕDes), ... Y+σ m,n(θDes,ϕDes) from the spherical harmonics Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ), which are further processed by weighting unit WU and summing element SE. Beamformer BF can provide continuous steering of the beampattern in 3-D space by changing a few scalar multipliers, while the filters determining the beampattern itself remain constant. The shape of the beampattern is invariant with respect to the steering direction. Beamformer BF needs only one filter per spherical harmonic (in the weighting unit WU), rather than per microphone as in known beamforming concepts, which significantly reduces the computational cost. - The sound capture system of
FIG. 1 with the spherical array geometry of FIG. 2 enables accurate control over the beampattern in 3-D space. In addition to pencil-like beams, the sound capture system can also provide multi-direction beampatterns or toroidal beampatterns giving uniform directivity in one plane. These properties can be useful for applications such as general multichannel speech pick-up, video conferencing, and direction of arrival (DOA) estimation. It can also be used as an analysis tool for room acoustics to measure, e.g., directional properties of the sound field. The sound capture system of FIG. 1 offers another advantage: it supports decomposition of the sound field into mutually orthogonal components, the eigenbeams (i.e., spherical harmonics) that can also be used to reproduce the sound field. The eigenbeams are also suitable for wave field synthesis (WFS) methods that enable spatially accurate sound reproduction in a fairly large volume, allowing for reproduction of the sound field that is present around the recording sphere. This allows for all kinds of general real-time spatial audio. - A circuit that provides the beamforming functionality is shown in detail in
FIG. 3. The modal beamformer circuit of FIG. 3 receives the Q audio signals S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka) provided by microphones Mic1, Mic2, ... MicQ, transforms the audio signals S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka) into the spherical harmonics Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ), and steers the spherical harmonics. The circuit of FIG. 3 may be realized by hardware (and software) components that (together) build matrixing unit MU and the modal beamformer, which includes steering unit SU, modal weighting unit WU, and summing element SE. Matrixing unit MU and steering unit SU include coefficient elements CE that multiply the respective input signals with given coefficients and adders AD that sum up the input signals multiplied with coefficients so that the audio signals S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka) are decomposed into the eigenbeams, i.e., the spherical harmonics Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ), which are then processed to provide the steered spherical harmonics Y+1 0,0(θDes,ϕDes), Y+1 1,0(θDes,ϕDes), ... Y+σ m,n(θDes,ϕDes). Modal weighting unit WU includes delay elements DE, coefficient elements CE, and adders AD, which are connected to form FIR filters for weighting. The output signals of these FIR filters are summed up by summing element SE. - Matrixing unit MU in the modal beamformer of
FIG. 3 is responsible for decomposing the sound field, which is picked up by microphones Mic1, Mic2, ... MicQ and decomposed into the different eigenbeam outputs, i.e., the spherical harmonics Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ), corresponding to the zero-order, first-order, and second-order spherical harmonics. This can also be seen as a transformation, where the sound field is transformed from the time or frequency domain into the "modal domain". To simplify a time-domain implementation, one can also work with the real and imaginary parts of the spherical harmonics. This will result in real-value coefficients, which are more suitable for a time-domain implementation. If the sensitivity equals the imaginary part of a spherical harmonic, then the beampattern of the corresponding array factor will also be the imaginary part of this spherical harmonic. To compensate for this frequency dependence, weighting unit WU may be implemented accordingly. Steering unit SU allows for steering the look direction by the angles θDes and ϕDes. Weighting unit WU compensates for a frequency-dependent sensitivity over the modes (eigenbeams), i.e., modal weighting over frequency, to the effect that the modal composition is adjusted, e.g., equalized. Equalizing is used to compensate for deficiencies of the microphone array, e.g., self-noise of the microphones, location errors of the microphones at the surface of the sphere, and other electrical and mechanical drawbacks. Summation node SE performs the actual beamforming for the sound capture system by summing up the weighted harmonics to yield the beamformer output OUT = ψ(θDes, ϕDes), i.e., the auditory scene. - Due to self-noise amplification, the order of a modal beamformer has to be reduced toward low frequencies, leading to a gradually decreasing directivity pattern with decreasing frequency. 
Regularization of the radial filter is configured such that, for example, the white noise gain will not fall below a given limit (e.g., WNGdBMin = - 10[dB] (±3 [dB])) to keep the robustness, i.e., the self-noise amplification, within a tolerable range, and a constant directivity in look direction over frequency, such as 0 [dB], will be reached. By doing this, an optimum balance between robustness and directivity will result, leading to a modal beamformer with enhanced properties in which the directivity of the modal beamformer is enhanced by keeping the transfer function in look direction at a frequency-independent constant value and a minimum threshold of robustness. Regularization may be achieved by adapting the weighting coefficients of the FIR filters in weighting unit WU to an optimum.
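The signal chain described above (matrixing unit MU, steering unit SU, weighting unit WU, summing element SE) can be sketched for a single frequency bin as follows. This is a minimal illustration under stated assumptions, not the circuit of FIG. 3: it uses real spherical harmonics of orders 0 and 1 with a √3 scaling, six hypothetical microphones at the vertices of an octahedron, an ideal radial function (Wm = 1, so no equalization is needed), and scalar modal weights in place of the FIR filters. The names `real_sh_order1`, `beamform`, and `CARDIOID_WEIGHTS` are invented for the example.

```python
import numpy as np

def real_sh_order1(theta, phi):
    """Real spherical harmonics of orders 0 and 1 (sqrt(3) scaling is an assumption).

    theta is the polar angle from the +z axis, phi the azimuth (radians).
    The last axis of the result holds the 4 harmonics.
    """
    theta = np.asarray(theta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    x = np.sin(theta) * np.cos(phi)
    y = np.sin(theta) * np.sin(phi)
    z = np.cos(theta)
    s3 = np.sqrt(3.0)
    return np.stack([np.ones_like(x), s3 * x, s3 * y, s3 * z], axis=-1)

# Hypothetical array: Q = 6 microphones at the vertices of an octahedron.
MIC_THETA = np.array([0.0, np.pi, np.pi / 2, np.pi / 2, np.pi / 2, np.pi / 2])
MIC_PHI = np.array([0.0, 0.0, 0.0, np.pi, np.pi / 2, 3 * np.pi / 2])

Y = real_sh_order1(MIC_THETA, MIC_PHI)   # sampling matrix, shape (Q, N) = (6, 4)
Y_PINV = np.linalg.inv(Y.T @ Y) @ Y.T    # matrixing unit MU: Y+ = (Y^T Y)^-1 Y^T

# Scalar modal weights giving a first-order cardioid with this scaling (assumption).
CARDIOID_WEIGHTS = np.array([0.5, 0.5 / 3, 0.5 / 3, 0.5 / 3])

def beamform(mic_signals, theta_des, phi_des, mode_weights=CARDIOID_WEIGHTS):
    """One frequency bin through MU, SU, WU, and SE."""
    eigenbeams = Y_PINV @ mic_signals                           # MU: decomposition
    steered = real_sh_order1(theta_des, phi_des) * eigenbeams   # SU: scalar multipliers
    weighted = mode_weights * steered                           # WU: one weight per eigenbeam
    return float(np.sum(weighted))                              # SE: summation

# Idealized microphone signals for a unit source from the +x direction (Wm = 1 assumed).
source_signals = Y @ real_sh_order1(np.pi / 2, 0.0)
on_axis = beamform(source_signals, np.pi / 2, 0.0)      # look direction at the source
off_axis = beamform(source_signals, np.pi / 2, np.pi)   # look direction opposite the source
```

Steering changes only the four scalar multipliers in the steering stage; the modal weights stay fixed, which mirrors the statement that the beampattern shape is invariant with respect to the steering direction.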
- But before going into detail on the regularization process, some general issues are discussed, in particular issues with regard to the measurement of the acoustic wave field via a rigid spherical microphone array. In general, sound pressure values pa(θq, ϕq) can be described by way of the Fourier-Bessel series truncated to the Mth order at positions θq, ϕq of the Q microphones located at radius a, in which 1 ≤ q ≤ Q, as follows:
-
-
- An arrangement for extracting the N ambisonic components B from the wave field pa is illustrated in
FIG. 4 . The room and, thus, the spherical harmonics Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ) are sampled by way of matrix Y+ at the position(s) θq, ϕq with the Q microphones, in which: - Combining the Q microphone signals (1 ≤ q ≤ Q), i.e., S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka), by way of matrix Y+ into N output signals, which correspond to signals that would have been obtained when a wave field is sampled with N microphones having a certain directivity, can be seen as a transformation from the time domain into the spatial domain. By way of a radial equalizing function EQm(ka) the thereby generated spherical harmonic signals are then weighted to provide frequency-independent normalized-to-1 ambisonic components Bσ m,n or the ambisonic signals B.
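The series and the sampling relation referred to above are not reproduced in this text. As a hedged reconstruction, using the symbols of this description and the form commonly found in the higher-order ambisonics literature (the exact normalization and index ranges of the original equations are assumptions), the truncated series and its matrix form may be written as:

```latex
p_a(\theta_q,\phi_q) \;=\; \sum_{m=0}^{M} W_m(ka) \sum_{n=0}^{m} \sum_{\sigma=\pm 1}
  B^{\sigma}_{m,n}\, Y^{\sigma}_{m,n}(\theta_q,\phi_q), \qquad 1 \le q \le Q

\mathbf{p}_a \;=\; \mathbf{Y}\,\operatorname{diag}\!\big(W_m(ka)\big)\,\mathbf{B}
\quad\Longrightarrow\quad
\mathbf{B} \;=\; \operatorname{diag}\!\big(EQ_m(ka)\big)\,\mathbf{Y}^{+}\,\mathbf{p}_a,
\qquad
\mathbf{Y}^{+} = (\mathbf{Y}^{T}\mathbf{Y})^{-1}\mathbf{Y}^{T},
\quad
EQ_m(ka) = \frac{1}{W_m(ka)}
```

Here the diagonal matrices repeat the value for order m once for every eigenbeam of that order, so that each spherical harmonic signal is weighted by the equalizing function of its own order.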
- Referring now to
FIG. 5 , the derivation of the radial function Wm(ka) of a rigid closed sphere with microphones arranged on the sphere's surface can be described as follows: at the surface of a rigid closed sphere, velocity va is zero, i.e., va(θq,ϕq,ka) = 0. - Therefore, the related sound field is defined solely by the pressure distribution pa(θq, ϕq) on the sphere's surface, which can be easily measured by sound pressure sensors (microphones). Mathematically, the underlying, physically logical condition that va(θq,ϕq,ka) = 0 holds at the surface of a rigid body can be met when inner sources (i.e., sources inside the measurement sphere) and outer sources (i.e., sources outside the measurement sphere) are superposed, as illustrated in
FIG. 5 . For instance, the outer sources serve to model the scattered field occurring at the surface of a scattered sphere. Based on the general form of the Bessel series, -
-
-
- The Euler equation links the sound velocity v(θq,Φq,ka) to the sound pressure p(θq,Φq,ka) and the fact that sound velocity v(θq,Φq,ka) and sound pressure p(θq,Φq,ka) can be derived by weighting spherical harmonics according to the Fourier-Bessel series:
-
- From the two previous equations, a simplified relationship can be provided for the sound pressure pscat(θq,Φq,ka) that results from the sound field of the spherical waves distributing inner sound sources and that can be measured on the sphere's surface (r = a) at the positions (θq,Φq) where the Q pressure sensors (microphones) are arranged, thereby neglecting the constants jρck and 4π:
-
- An accordingly calculated magnitude frequency response for the radial functions wm(ka)=1/EQm(ka) for a sphere radius of a=0.9m in a spectral range of 50Hz to 6700Hz for orders m up to M=10 is shown in
FIG. 6. The corresponding radial equalizing function EQm(ka) for orders m up to M=4 is depicted in FIG. 7. - The equations outlined above provide a least-squares solution that offers the smallest mean-squared error, but it cannot be used per se in connection with small or very small wm(ka) values. Such values occur, however, at higher orders m and/or lower frequencies f, so that instabilities may occur due to amplified noise of the sensors or measurement system, positioning errors of the microphones, or irregularities in the frequency characteristic, which may deteriorate the results.
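The growth of the equalizer gain at low frequencies and high orders can be illustrated numerically. The sketch below assumes the standard closed form for a rigid sphere, Wm(ka) = j^(m-1) / ((ka)² h'm(ka)) with the spherical Hankel function of the second kind; this form and its sign convention are taken from the general literature, not from the equations of this text, which are not reproduced here. The radius of 0.09 m is the one given for the exemplary arrangement of FIGS. 9 through 16, and the speed of sound of 343 m/s is an assumption.

```python
import numpy as np

def spherical_hankel2(max_order, x):
    """h_m^(2)(x) = j_m(x) - i*y_m(x) and its derivative for m = 0..max_order.

    Uses the upward recurrence h_{m+1} = (2m+1)/x * h_m - h_{m-1}, which is
    accurate for the Hankel function because its y_m part dominates.
    """
    x = np.asarray(x, dtype=float)
    h = np.empty((max_order + 2,) + x.shape, dtype=complex)
    h[0] = 1j * np.exp(-1j * x) / x
    h[1] = -np.exp(-1j * x) * (x - 1j) / x**2
    for m in range(1, max_order + 1):
        h[m + 1] = (2 * m + 1) / x * h[m] - h[m - 1]
    dh = np.empty((max_order + 1,) + x.shape, dtype=complex)
    dh[0] = -h[1]
    for m in range(1, max_order + 1):
        dh[m] = h[m - 1] - (m + 1) / x * h[m]
    return h[: max_order + 1], dh

def rigid_sphere_radial_function(max_order, ka):
    """W_m(ka) for m = 0..max_order on the surface of a rigid sphere (assumed form)."""
    ka = np.asarray(ka, dtype=float)
    _, dh = spherical_hankel2(max_order, ka)
    orders = np.arange(max_order + 1).reshape((-1,) + (1,) * ka.ndim)
    return 1j ** (orders - 1) / (ka**2 * dh)

SPEED_OF_SOUND = 343.0   # m/s, assumption
RADIUS = 0.09            # m, from the exemplary arrangement
freqs_hz = np.array([50.0, 500.0, 5000.0])
ka = 2 * np.pi * freqs_hz / SPEED_OF_SOUND * RADIUS

W = rigid_sphere_radial_function(4, ka)        # shape (orders, frequencies)
eq_gain_db = -20 * np.log10(np.abs(W))         # gain of the unregularized EQ_m = 1/W_m
```

Each row of `eq_gain_db` belongs to one order m and each column to one frequency; at 50 Hz the required gain rises steeply from order to order, which is the instability the regularization is meant to limit.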
-
- If ε = 0, the system works as a least-square beamformer (ideal case as shown above, i.e., without any regularization, which leads to the solution with the highest directivity but also with the least robustness). If ε = ∞, the system works as a delay-and-sum beamformer, which delivers the maximum possible robustness but the least directivity. The radial equalizing functions EQm(ka) can be further simplified to read as:
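The equations referred to here are not reproduced in this text. Following the wording of claim 1 (the regularization function is the quotient of the absolute value of the square of the radial function and the sum of that value and the regularization parameter, and the equalization function is the quotient of the regularization function and the radial function), a consistent reconstruction, offered as a sketch rather than as the original typesetting, is:

```latex
T_m(ka) \;=\; \frac{\lvert W_m(ka)\rvert^{2}}{\lvert W_m(ka)\rvert^{2} + \varepsilon(ka)},
\qquad
EQ_m(ka) \;=\; \frac{T_m(ka)}{W_m(ka)}
\;=\; \frac{W_m^{*}(ka)}{\lvert W_m(ka)\rvert^{2} + \varepsilon(ka)}
```

For ε = 0 this reduces to Tm = 1 and EQm = 1/Wm, i.e., the unregularized least-squares case.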
- Thus, with regularization parameters ε(ka) or ε(ω) one can control the modal beamformer to exhibit a certain robustness with respect to the inherent noise that is amplified with wm(ka), in particular at lower frequencies.
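How the regularization parameter limits the equalizer gain can be shown with a few numbers. The sketch below assumes the regularization function described in claim 1, Tm = |Wm|² / (|Wm|² + ε), and EQm = Tm / Wm; the radial-function values and the value of ε are hypothetical and chosen only for illustration.

```python
import numpy as np

def regularized_equalizer(w, eps):
    """Return (T_m, EQ_m) for radial-function values w and regularization eps."""
    w = np.asarray(w, dtype=complex)
    power = np.abs(w) ** 2
    t = power / (power + eps)          # regularization function T_m
    eq = np.conj(w) / (power + eps)    # EQ_m = T_m / W_m, written without a division by W_m
    return t, eq

# Hypothetical radial-function magnitudes, shrinking as they do toward low ka / high order.
w = np.array([1.0, 1e-1, 1e-2, 1e-3, 1e-4])
eps = 1e-4

t, eq = regularized_equalizer(w, eps)
plain_gain_db = -20 * np.log10(np.abs(w))                 # unregularized 1/W_m
regularized_gain_db = 20 * np.log10(np.abs(eq))           # regularized EQ_m
gain_limit_db = 20 * np.log10(1.0 / (2.0 * np.sqrt(eps)))  # bound reached at |W_m| = sqrt(eps)
```

The unregularized gain grows without bound as |Wm| shrinks, whereas |EQm| = |Wm| / (|Wm|² + ε) never exceeds 1 / (2√ε), so a larger ε trades equalization accuracy for a lower maximum noise amplification.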
- In order to calculate appropriate values for the regularization parameters ε(ka) or ε(ω), a parameter called susceptibility K(ω) or its reciprocal white noise gain WNG(ω) may be used. For instance, white noise gain WNG(ω) addresses most effects and problems caused by microphone noise, changes in the transfer function, and variations of the microphone positions, so that it is representative of the sensitivity of the beamformer. A white noise gain WNG(ω) > 0 [dB] characterizes a sufficient suppression of uncorrelated errors and is thus indicative of a robust system behavior, while a white noise gain WNG(ω) < 0 [dB] is indicative of an amplification of the noise and is therefore indicative of an increasingly unstable system behavior.
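The equations defining susceptibility and white noise gain are not reproduced in this text. As an illustration only, the sketch below uses a definition that is common in the beamforming literature and is assumed here: for sensor weights w and array response d in the look direction, WNG = |wᴴd|² / (wᴴw), with the susceptibility as its reciprocal. The responses and both weight sets are hypothetical.

```python
import numpy as np

def white_noise_gain_db(weights, steering_vector):
    """WNG in dB for sensor weights and the look-direction array response (assumed definition)."""
    weights = np.asarray(weights, dtype=complex)
    steering_vector = np.asarray(steering_vector, dtype=complex)
    signal_power = np.abs(np.vdot(weights, steering_vector)) ** 2   # |w^H d|^2
    noise_power = np.real(np.vdot(weights, weights))                # w^H w
    return 10 * np.log10(signal_power / noise_power)

Q = 32
# Hypothetical unit-magnitude sensor responses in the look direction.
d = np.exp(1j * np.linspace(0.0, 2 * np.pi, Q, endpoint=False))

# Delay-and-sum weights: equal magnitude, phases matched to d.
w_das = d / Q
wng_das_db = white_noise_gain_db(w_das, d)

# Weights with large alternating gains that nearly cancel, as in a superdirective design.
gains = np.where(np.arange(Q) % 2 == 0, 1.0 + 1.0 / Q, -(1.0 - 1.0 / Q))
w_super = d * gains
wng_super_db = white_noise_gain_db(w_super, d)
```

With these numbers the delay-and-sum weights give 10·log10(Q) dB, the largest value attainable for unit-magnitude responses, while the second set keeps the same unit response in the look direction but yields a negative WNG, i.e., uncorrelated sensor noise is amplified rather than suppressed.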
-
-
-
-
-
-
- In words, the array gain G(ω) is the ratio of the energy of sound coming from the look direction of the beamformer to the energy of omnidirectionally incoming sound.
-
-
-
- For instance, when M = 4, then the achievable maximum array gain GdBmax(ω) is approximately 14dB.
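The equations for the array gain are not reproduced in this text. The stated figure is consistent with the well-known upper bound (M+1)² on the directivity factor of an order-M spherical-harmonic beamformer, which is assumed here as the underlying relation:

```latex
G_{\mathrm{dB,max}} \;=\; 10\,\log_{10}\!\big((M+1)^{2}\big) \;=\; 20\,\log_{10}(M+1),
\qquad
M = 4:\quad 20\,\log_{10}(5) \;\approx\; 13.98\ \mathrm{dB}
```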
- Referring now to
FIG. 8, an exemplary iterative process of adapting the parameters of a modal beamformer is described in detail. In an initializing step 1, parameters required for calculation are set to a starting value or a constant value, as the case may be. The following parameters may be set, for instance:
- WNG parameter
- Minimum white noise gain threshold WNGdBMin(ω), which is not undercut by the regularized modal beamformer; for instance, WNGdBMin = -10 [dB].
- Offset ΔWNGdB in [dB], by which the minimum white noise gain threshold WNGdBMin(ω) is overcut or undercut during the adaptation process; for instance, ΔWNGdB = 0.5 [dB].
- Regularization parameter ε(ω)
- Maximum regularization parameter εMax , which is the upper limit for the regularization parameter ε(ω); for instance, εMax = 1.
- Step size by which the regularization parameter ε(ω) is increased or decreased.
- Frequency ω
- Start value of the (angular) frequency for the adaptation process; for instance, ω = 2π · 1 [Hz].
- Step size by which the (angular) frequency is increased or decreased when the adaptation is completed at a certain frequency; for instance, Δω = 2π · 1 [Hz].
- Maximum (angular) frequency at which an adaptation is performed; for instance, ωMax = πfs [Hz].
- Then the adaptation process is started in
step 2. In step 3, the regularization parameter is set to, e.g., ε(ω) = 0 for the current frequency ω under investigation. Regularization provides the ability to achieve a robust system by way of adjusting the regularization parameter ε(ω). This is a trade-off between a higher robustness, i.e., a higher white noise gain WNGdB(ω), and less directivity in look direction ψ(θ0,ϕ0,ω), i.e., a decreasing array gain GdB(ω). If the regularization parameter is set to ε(ω) = 0, the adaptation process begins with the maximum directivity GdBMax(ω), which is then decreased by increasing the regularization parameter ε(ω) until the desired white noise gain threshold WNGdBMin is no longer undercut. -
In step 4, the regularization filter Tm(ω) or Tm(ka) is calculated as outlined above using regularization parameter ε(ω). In step 5, the transfer function EQm(ω) is calculated as outlined above using the current version of the transfer function Tm(ω) of the regularization filter or the current version of the regularization parameter ε(ω). In step 6, the white noise gain WNGdB(ω) is calculated as outlined above using the transfer function EQm(ω) and the current version of the transfer function Tm(ω) of the regularization filter (regularization function). -
- In
step 10, the directivity ψ(θ0,ϕ0,ω) of the modal beamformer is calculated for sound coming from the look direction using the transfer function EQm(ω) provided in step 5. -
- In step 12, the current white noise gain WNGdB(ω) is compared with the predetermined white noise gain threshold WNGdBMin(ω), and it is checked whether the regularization parameter ε(ω) has reached its maximum, according to (|WNGdBMin - WNGdB(ω)| > ΔWNGdB) and (ε(ω) ≤ εMax). If both requirements are met, the adaptation process is not yet finished, and the process jumps back to step 3 and starts again with an updated regularization parameter ε(ω). - Otherwise, i.e., if the adaptation process for the current angular frequency ω has been completed so that the current equalizing function EQm(ω) has been limited to the given threshold or if the current regularization parameter has reached its maximum, the angular frequency ω is incremented according to ω = ω + Δω in step 13, which is followed by step 14.
- In step 14, the current angular frequency ω is checked to see if it has reached its maximum value ωMax. If ω < ωMax, the process jumps back to
step 2 using the current angular frequency ω. Otherwise, i.e., if the equalizing filter has been adapted for the complete set of frequencies, the filter coefficients are output in step 15. - Referring to
FIGS. 9 through 16, measurements made with an exemplary arrangement in combination with an exemplary adaptation method are described in detail. The arrangement includes a sphere having a radius of a = 0.09 [m] and the shape of a truncated icosahedron, which is a blend of two platonic solids, i.e., an icosahedron and a dodecahedron. The number of microphones arranged on the sphere is Q = 32. The directivity characteristic of the beamformer is a 4th-order cardioid and the minimum white noise gain WNGdBMin used in the adaptation process is -10 [dB]. -
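The iterative adaptation of FIG. 8 can be sketched with the parameters of this exemplary arrangement (a = 0.09 m, Q = 32, order 4, minimum white noise gain of -10 dB, εMax = 1). The sketch is a simplified illustration, not the claimed process, and rests on several assumptions that are not taken from this text: the standard rigid-sphere radial function, ideal orthonormal sampling by the Q microphones, flat modal weights cm = 1 instead of the 4th-order cardioid, a white noise gain expression derived under those assumptions, a multiplicative growth of ε, and a loop that only raises ε until the threshold is met. All function names are hypothetical.

```python
import numpy as np

def radial_function(max_order, ka):
    """W_m(ka), m = 0..max_order, for a rigid sphere (assumed standard form, scalar ka)."""
    h = [1j * np.exp(-1j * ka) / ka, -np.exp(-1j * ka) * (ka - 1j) / ka**2]
    for m in range(1, max_order + 1):
        h.append((2 * m + 1) / ka * h[m] - h[m - 1])          # Hankel recurrence
    dh = [-h[1]] + [h[m - 1] - (m + 1) / ka * h[m] for m in range(1, max_order + 1)]
    return np.array([1j ** (m - 1) / (ka**2 * dh[m]) for m in range(max_order + 1)])

def adapt_epsilon(ka, order=4, num_mics=32, wng_min_db=-10.0,
                  eps_max=1.0, eps_start=1e-12, growth=1.1):
    """Raise eps for one frequency until the WNG threshold is met or eps_max is reached."""
    w = radial_function(order, ka)
    modes_per_order = 2 * np.arange(order + 1) + 1     # number of eigenbeams per order
    c = np.ones(order + 1)                             # modal weights (assumed flat)
    eps = 0.0                                          # start without regularization
    while True:
        power = np.abs(w) ** 2
        t = power / (power + eps)                      # regularization function T_m
        eq = t / w                                     # equalizing function EQ_m
        look = np.sum(modes_per_order * c * t)         # response in look direction
        eq = eq / look                                 # scale to unit response (0 dB)
        noise = np.sum(modes_per_order * np.abs(c * eq) ** 2) / num_mics
        wng_db = -10 * np.log10(noise)                 # WNG under the stated assumptions
        if wng_db >= wng_min_db or eps >= eps_max:     # simplified stop criterion
            return eps, eq, wng_db
        eps = min(max(eps * growth, eps_start), eps_max)

def adapt_over_frequency(freqs_hz, radius=0.09, speed_of_sound=343.0, **kwargs):
    """Run the adaptation for each frequency and collect (eps, EQ_m, WNG in dB)."""
    return [adapt_epsilon(2 * np.pi * f / speed_of_sound * radius, **kwargs)
            for f in freqs_hz]

freqs_hz = np.array([100.0, 400.0, 1600.0, 6400.0])
for f, (eps, eq, wng_db) in zip(freqs_hz, adapt_over_frequency(freqs_hz)):
    print(f"{f:7.0f} Hz  eps = {eps:.3e}  WNG = {wng_db:6.2f} dB")
```

Because of the simplifying assumptions, the frequency at which ε returns to zero in this sketch should not be expected to match the values read from FIG. 9; the point is only the structure of the loop: calculate Tm and EQm, scale to a constant look-direction response, evaluate the white noise gain, and repeat with a larger ε while the threshold is undercut.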
FIG. 9 illustrates a regularization parameter over frequency ε(ω) for a common 4th-order modal beamformer. As can be seen from FIG. 9, with regularization, i.e., limiting the maximum directivity index for frequencies up to, for instance, 750 [Hz], values above a minimum lower threshold WNGdBMin of -10 [dB] may be maintained. Above 750 [Hz], the exemplary beamformer exhibits the desired directivity of a 4th-order cardioid. FIG. 10 illustrates the corresponding white noise gain WNG for the above-mentioned 4th-order beamformer, which supports the findings in connection with the diagram of FIG. 9. The corresponding directivity index DI and the array gain GdB(ω) as shown in FIG. 11 illustrate that the maximum array gain GdB(ω) is more or less below 10 [dB] depending on the frequency. - However, applying the adapted regularization filter (Tm(ω)) described herein causes a monotonic decrease of the array gain GdB(ω) down to 7.5 [dB] at 20 [Hz] as shown in
FIG. 11. The magnitude frequency responses of the M regularization filters Tm(ω) thereby applied are shown in FIG. 12, and their corresponding frequency-independent phase characteristic is illustrated in FIG. 13. - Further applying the optimized radial equalizing filter (EQm(ω)) results in an improved regularized equalizing filter whose magnitude frequency response is depicted in
FIG. 14 and whose phase frequency response is depicted in FIG. 15. The directivity of the corresponding improved beamformer at frequencies above 650 [Hz] is a 4th-order cardioid, between 300 [Hz] and 650 [Hz] a 3rd-order cardioid, between 70 [Hz] and 300 [Hz] a 2nd-order cardioid, and below 70 [Hz] a 1st-order cardioid. FIG. 16 depicts the resulting directivity of the beamformer outlined above in look direction ψ(θ0,ϕ0,ω) as amplitudes over frequency. - While exemplary embodiments are described above, it is not intended that these embodiments describe all possible forms of the invention. The words used in the specification are words of description rather than limitation, and it is understood that various changes may be made. Additionally, the features of various implementing embodiments may be combined to form further embodiments of the invention.
Claims (11)
- A method for generating an auditory scene (OUT), comprising: receiving eigenbeam outputs, the eigenbeam outputs having been generated by decomposing a plurality of audio signals (S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka)), each audio signal (S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka)) having been generated by a different microphone (Mic1, Mic2, ... MicQ) of a microphone array, wherein each eigenbeam output corresponds to a different eigenbeam for the microphone array, and the microphones (Mic1, Mic2, ... MicQ) are arranged on a rigid sphere (RS) or an open sphere; and generating the auditory scene (OUT) based on the eigenbeam outputs and their corresponding eigenbeams (Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ)), wherein: generating the auditory scene (OUT) comprises applying a weighting value to each eigenbeam output to form steered eigenbeam outputs and combining the steered eigenbeam outputs to generate the auditory scene (OUT); generating the auditory scene (OUT) further comprises applying a regularized equalizing filter (EQ) to each steered eigenbeam output, the regularized equalizing filter(s) (EQ) being configured to compensate for acoustic deficiencies of the microphone array and having a regularized equalization function (EQm(ka)); the regularized equalization function (EQm(ka)) is a radial equalization function that comprises the quotient of a regularization function (Tm(ka)) limiting the radial equalization function and a radial function (Wm(ka)) describing an acoustic wave field in the vicinity of the surface of the rigid sphere (RS) or the center of the open sphere; characterized in that the regularization function (Tm(ka)) is the quotient of the absolute value of the square of the radial function and the sum of the absolute value of the square of the radial function and a regularization parameter, the regularization parameter being set to a value greater than 0 and smaller than a maximum value that is smaller than infinity.
- The method of claim 1 wherein the maximum value of the regularization parameter is 1.
- The method of claim 1 or 2 wherein the regularization parameter depends on a susceptibility parameter that is the reciprocal of a white noise gain parameter, the white noise gain parameter being greater than a minimum white noise gain parameter that is not undercut by the equalizing filter (EQ).
- The method of claim 3 wherein the minimum white noise gain parameter is -10 [dB].
- The method of any one of claims 1 through 4 wherein the regularization parameter is adapted in an iterative process.
- The method of claim 5 wherein, for a given frequency, the iterative process comprises: setting at least the minimum white noise gain parameter and the regularization parameters to a starting value or a constant value; and calculating the white noise gain, the regularization function, and the radial equalization function; and comparing the calculated white noise gain parameter with the set minimum white noise gain parameter; and calculating the directivity for sound coming from the look direction using the radial equalization function; and scaling the radial equalization function; and comparing the calculated white noise gain with the set minimum white noise gain and checking if the regularization parameter has reached its maximum; if both requirements are met, the adaptation process is not yet finished, resulting in jumping back and starting again with an updated regularization parameter; otherwise the process for the current frequency has been completed and the frequency is incremented; and checking if the current frequency has reached its maximum value; if the frequency has not reached its maximum, the process jumps back and starts again with another frequency; otherwise the filter coefficients are outputted.
- The method of claim 6 wherein the iterative process comprises an offset white noise gain parameter by which the minimum white noise gain parameter is overcut or undercut at maximum during adaptation.
- A modal beamformer system (BF) for generating an auditory scene (OUT), comprising: a steering unit (SU) that is configured to receive eigenbeam outputs and to apply a weighting value to each eigenbeam output to provide steered eigenbeam outputs, the eigenbeam outputs having been generated by decomposing a plurality of audio signals (S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka)), each audio signal (S1(θ1,ϕ1,ka), S2(θ1,ϕ2,ka) ... SQ(θQ,ϕQ,ka)) having been generated by a different microphone (Mic1, Mic2, ... MicQ) of a microphone array, wherein each eigenbeam output corresponds to a different eigenbeam (Y+1 0,0(θ,ϕ), Y+1 1,0(θ,ϕ), ... Y+σ m,n(θ,ϕ)) for the microphone array, and the microphones (Mic1, Mic2, ... MicQ) are arranged on a rigid sphere (RS) or an open sphere; a weighting unit (WU) that is configured to receive the steered eigenbeam outputs and to generate weighted steered eigenbeam outputs; and a summing element (SE) configured to combine the weighted steered eigenbeam outputs to generate the auditory scene (OUT), wherein: the weighting unit (WU) is further configured to apply a regularized equalizing filter (EQ) to each steered eigenbeam output, the regularized equalizing filter(s) (EQ) being configured to compensate for acoustic deficiencies of the microphone array and having a regularized equalization function (EQm(ka)); and the regularized equalization function (EQm(ka)) is a radial equalization function that comprises the quotient of a regularization function (Tm(ka)) limiting the radial equalization function and a radial function (Wm(ka)) describing an acoustic wave field in the vicinity of the rigid sphere (RS) or open sphere; characterized in that the regularization function (Tm(ka)) is the quotient of the absolute value of the square of the radial function and the sum of the absolute value of the square of the radial function and a regularization parameter, the regularization parameter being set to a value greater than 0 and smaller than a maximum value that is smaller than infinity.
- The system of claim 8 wherein the maximum value of the regularization parameter is 1.
- The system of claim 8 or 9 wherein the regularization parameter depends on a susceptibility parameter that is the reciprocal of a white noise gain parameter, the white noise gain parameter being greater than a minimum white noise gain parameter that is not undercut by the equalizing filter (EQ).
- The system of claim 10 wherein the minimum white noise gain parameter is -10 [dB].
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP13152209.6A EP2757811B1 (en) | 2013-01-22 | 2013-01-22 | Modal beamforming |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2757811A1 EP2757811A1 (en) | 2014-07-23 |
EP2757811B1 true EP2757811B1 (en) | 2017-11-01 |
Family
ID=47605386
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP13152209.6A Active EP2757811B1 (en) | 2013-01-22 | 2013-01-22 | Modal beamforming |
Country Status (1)
Country | Link |
---|---|
EP (1) | EP2757811B1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3787316A1 (en) * | 2018-02-09 | 2021-03-03 | Oticon A/s | A hearing device comprising a beamformer filtering unit for reducing feedback |
CN111929665B (en) * | 2020-09-01 | 2024-02-09 | 中国科学院声学研究所 | Target depth identification method and system based on wave number spectrum main lobe position |
CN113791385A (en) * | 2021-09-15 | 2021-12-14 | 张维翔 | Three-dimensional positioning method and system |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030147539A1 (en) | 2002-01-11 | 2003-08-07 | Mh Acoustics, Llc, A Delaware Corporation | Audio system based on at least second-order eigenbeams |
GB0906269D0 (en) * | 2009-04-09 | 2009-05-20 | Ntnu Technology Transfer As | Optimal modal beamformer for sensor arrays |
-
2013
- 2013-01-22 EP EP13152209.6A patent/EP2757811B1/en active Active
Non-Patent Citations (1)
Title |
---|
BERTET ST PRG A(C)PHANIE ET AL: "3D Sound Field Recording with Higher Order Ambisonics - Objective Measurements and Validation of Spherical Microphone", AES CONVENTION 120; MAY 2006, AES, 60 EAST 42ND STREET, ROOM 2520 NEW YORK 10165-2520, USA, 1 May 2006 (2006-05-01), XP040507751 * |
Also Published As
Publication number | Publication date |
---|---|
EP2757811A1 (en) | 2014-07-23 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20130122 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the european patent | Extension state: BA ME |
R17P | Request for examination filed (corrected) | Effective date: 20150108 |
RBV | Designated contracting states (corrected) | Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
17Q | First examination report despatched | Effective date: 20151116 |
GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
INTG | Intention to grant announced | Effective date: 20170526 |
GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
REG | Reference to a national code | Ref country code: GB Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: CH Ref legal event code: EP; Ref country code: AT Ref legal event code: REF Ref document number: 943133 Country of ref document: AT Kind code of ref document: T Effective date: 20171115 |
REG | Reference to a national code | Ref country code: IE Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R096 Ref document number: 602013028624 Country of ref document: DE |
REG | Reference to a national code | Ref country code: NL Ref legal event code: MP Effective date: 20171101 |
REG | Reference to a national code | Ref country code: LT Ref legal event code: MG4D |
REG | Reference to a national code | Ref country code: AT Ref legal event code: MK05 Ref document number: 943133 Country of ref document: AT Kind code of ref document: T Effective date: 20171101 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20180201; Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20180201; Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20180301; Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20180202; Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101 |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R097 Ref document number: 602013028624 Country of ref document: DE |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101 |
REG | Reference to a national code | Ref country code: CH Ref legal event code: PL |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | Effective date: 20180802 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180131; Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180122 |
REG | Reference to a national code | Ref country code: IE Ref legal event code: MM4A |
REG | Reference to a national code | Ref country code: FR Ref legal event code: ST Effective date: 20180928 |
REG | Reference to a national code | Ref country code: BE Ref legal event code: MM Effective date: 20180131 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180131; Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180131; Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180131; Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180122 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: MT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180122 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101; Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20130122 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: MK Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20171101 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: AL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20171101 |
P01 | Opt-out of the competence of the unified patent court (upc) registered | Effective date: 20230526 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB Payment date: 20231219 Year of fee payment: 12 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE Payment date: 20231219 Year of fee payment: 12 |