EP2437517A1 - Sound scene manipulation - Google Patents

Sound scene manipulation

Info

Publication number
EP2437517A1
EP2437517A1 EP10275102A EP10275102A EP2437517A1 EP 2437517 A1 EP2437517 A1 EP 2437517A1 EP 10275102 A EP10275102 A EP 10275102A EP 10275102 A EP10275102 A EP 10275102A EP 2437517 A1 EP2437517 A1 EP 2437517A1
Authority
EP
European Patent Office
Prior art keywords
signal
audio
auxiliary signal
audio signals
sound source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP10275102A
Other languages
German (de)
English (en)
Other versions
EP2437517B1 (fr)
Inventor
Toon Van Waterschoot
Wouter Tirry
Marc Moonen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NXP BV
Original Assignee
NXP BV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NXP BV
Priority to EP10275102.1A (EP2437517B1)
Priority to CN2011103036497A (CN102447993A)
Priority to US13/248,805 (US20120082322A1)
Publication of EP2437517A1
Application granted
Publication of EP2437517B1
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00 Signal processing covered by H04R, not provided for in its groups
    • H04R2430/20 Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic

Definitions

  • This invention relates to manipulation of a sound scene comprising multiple sound sources. It is particularly relevant in the case of simultaneous recording of audio by multiple microphones.
  • an audio-processing device comprising:
  • a device addresses the problem of sound scene manipulation from a fundamentally different perspective, in that it allows any specified level change to be performed for each of the individual sound source components in the observed mixture(s), without relying on an explicit sound source separation.
  • the disadvantages overcome by the device as compared to the state of the art can be explained by considering each of the three approaches highlighted already above:
  • One application of a method or device is the enhancement of acoustic signals like speech or music.
  • the sound scene consists of desired as well as undesired sound sources, and the aim of the sound scene manipulation comprises reducing the level of the undesired sound sources relative to the level of the desired sound sources.
  • a handheld personal electronic device comprising a plurality of microphones; and the audio processing device referred to above.
  • the invention is particularly suited to mobile, handheld applications, since it has relatively light computational demands. It may therefore be usable with a mobile device having limited processing resources or may enable power consumption to be reduced.
  • the mobile or handheld device preferably incorporates a video recording apparatus with a visual zoom capability, and the audio processing device is preferably adapted to modify the desired gain factors in accordance with a configuration of the visual zoom. This enables the device to implement an acoustic zoom function.
  • the microphones are preferably omni-directional microphones.
  • the present device may be particularly beneficial in these circumstances, because the source separation problem is inherently more difficult when using omni-directional microphones. If the microphones are unidirectional, there will often be significant selectivity (in terms of signal power) between the sources among the diverse audio signals. This can make the manipulation task easier.
  • the present device is able to work also with omnidirectional microphones, where there will be less selectivity in the raw audio signals.
  • the present device is therefore more flexible. For example, it can exploit spatial selectivity by means of beamforming techniques, but it is not limited to spatial selectivity through the use of unidirectional microphones.
  • a method of processing audio signals comprising:
  • the parameters of the different mixture may be reweighting factors, which relate the levels of the components in the at least one auxiliary signal to their respective levels in the reference audio signal.
  • the method is particularly relevant to configurations with more than one microphone. Sound from all of the sound sources is detected at each microphone. Therefore, each sound source gives rise to a corresponding component in each audio signal.
  • the number of sources may be less than, equal to, or greater than the number of audio signals (which is equal to the number of microphones).
  • the sum of the number of audio signals and the number of auxiliary signals should be at least equal to the number of sources which it is desired to independently control.
  • Each auxiliary signal contains a different mixture of the components. That is, the components occur with different amplitude in each of the auxiliary signals (according to the reweighting factors).
  • the auxiliary signals and the audio signals should be linearly independent; and the sets of reweighting factors which relate the signal components to each auxiliary signal should also be linearly independent of one another.
  • the levels of the source signal components in the auxiliary signals are varied by a power ratio in the range -40dB to +6dB, more preferably -30dB to 0dB, still more preferably -25dB to 0dB, compared to their levels in the reference audio signal(s).
  • a scaling coefficient is preferably applied to the reference original audio signal and the result is combined with the scaled auxiliary signals.
  • the scaled auxiliary signals and/or scaled audio signals may be combined by summing them.
  • the scaling coefficients and the desired gain factors will have different values (and may be different in number). They would only be identical if the auxiliary signals were to achieve perfect separation of the sources, which is usually impossible in practice.
  • Each desired gain factor corresponds to the desired volume (amplitude) of a respective one of the sound-sources.
  • the scaling coefficients correspond to the auxiliary signals and/or input audio signals.
  • the number of reweighting factors is equal to the product of the number of signal components and the number of auxiliary signals, since in general each auxiliary signal will comprise a mixture of all of the signal components.
  • the desired gain factors, the reweighting factors and the scaling coefficients are related by a linear system of equations; and the step of calculating the set of scaling coefficients comprises solving the system of equations.
  • the step of calculating the set of scaling coefficients may comprise: calculating the inverse of a matrix of the reweighting factors; and multiplying the desired gain factors by the result of this inversion calculation.
  • the reweighting factors may be formed into a matrix and the inverse of this matrix may be calculated explicitly. Alternatively, the inverse may be calculated implicitly by equivalent linear algebraic calculations. The result of the inversion may be expressed as a matrix, though this is not essential.
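The calculation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation. It assumes one reference audio signal in which every component has relative level 1, plus M auxiliary signals, and a hypothetical matrix `reweighting` whose entry [p, m] is the amplitude level of component p in auxiliary signal m relative to its level in the reference signal. The function name and all numeric values are illustrative assumptions. A least-squares solve stands in for the explicit matrix inverse, so the same code also covers non-square systems.

```python
import numpy as np

def scaling_coefficients(reweighting, desired_gains):
    """Solve the linear system relating desired gains to scaling coefficients.

    reweighting:   (P, M) array, relative level of component p in auxiliary
                   signal m (assumed known, e.g. set offline).
    desired_gains: (P,) array of desired gain factors, one per sound source.
    Returns an (M + 1,) array: a[0] scales the reference signal and a[1:]
    scale the auxiliary signals.
    """
    reweighting = np.asarray(reweighting, dtype=float)
    desired_gains = np.asarray(desired_gains, dtype=float)
    num_sources = reweighting.shape[0]
    # The reference signal contributes every component at relative level 1.
    system = np.hstack([np.ones((num_sources, 1)), reweighting])
    # Least squares is equivalent to the inverse for a square, invertible system.
    coeffs, *_ = np.linalg.lstsq(system, desired_gains, rcond=None)
    return coeffs

# Illustrative values only: three sources, two auxiliary signals.
reweighting = np.array([[1.0, 0.3],
                        [0.2, 1.0],
                        [0.5, 0.4]])
desired_gains = np.array([2.0, 1.0, 0.5])
a = scaling_coefficients(reweighting, desired_gains)
# Gains actually achieved when the coefficients are applied to the mixture model.
achieved = np.hstack([np.ones((3, 1)), reweighting]) @ a
```

Note that the coefficients `a` differ from the desired gains, which matches the remark that they would only coincide if the auxiliary signals separated the sources perfectly.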
  • the at least one auxiliary signal may be a linear combination of any of: one or more of the audio signals; one or more shifted versions of the audio signals; and one or more filtered versions of the audio signals.
  • the at least one auxiliary signal may be generated by at least one of: fixed beamforming; adaptive beamforming; and adaptive spectral modification.
  • fixed beamforming means a spatially selective signal processing operation with a time-invariant spatial response.
  • Adaptive beamforming means a spatially selective signal processing operation with a time-varying spatial response.
  • Adaptive spectral modification means a frequency-selective signal processing operation with a time-varying frequency response, such as the class of methods known in the art as adaptive spectral attenuation or adaptive spectral subtraction.
  • An adaptive spectral modification process typically does not exploit spatial diversity, but only frequency diversity among the signal components.
  • Fixed beamforming may be beneficial when there is some prior expectation that one or more of the sound sources is localised and located in a predetermined direction relative to a set of microphones. The fixed beamformer will then modify the power of the corresponding signal component, relative to other components.
  • Adaptive beamforming may be beneficial when a localised sound source is expected, but its orientation relative to the microphone(s) is unknown.
  • Adaptive spectral modification may be useful when sound sources can be discriminated to some extent by their spectral characteristics. This may be the case for a diffuse noise source, for example.
  • the methods of generating the auxiliary signal or signals are preferably chosen according to the expected sound environment in a given application. For example, if several sources in known directions are expected, it may be appropriate to use multiple fixed beamformers. If multiple moving sources are expected, multiple adaptive beamformers may be beneficial. In this way - as will be apparent to those skilled in the art - one or more instances of different means of generating the auxiliary signal may be combined, in embodiments.
  • a first auxiliary signal is generated by a first method; a second auxiliary signal is generated by a second, different method; and the second auxiliary signal is generated based on an output of the first method.
  • the fixed beamforming may be adapted to emphasize sounds originating directly in front of the microphone or microphone array. For example, this may be useful when the microphone is used in conjunction with a camera, because the camera (and therefore the microphone) is likely to be aimed at a subject who is one of the sound sources.
  • An output of the fixed beamformer may be input to the adaptive beamformer.
  • This may be a noise reference output of the fixed beamformer, wherein the power ratio of a component originating from the fixed direction is reduced relative to other components. It is advantageous to use this signal in the adaptive beamformer, in order to find a (remaining) localised source in an unknown direction, because the burden on the adaptive beamformer to suppress the fixed signals may be reduced.
  • An output of the adaptive beamformer may be input to the adaptive spectral modification.
  • the method of the invention may be seen as a flexible framework for combining weak separators, to allow an arbitrary desired weighting on sound sources.
  • the individual operations of beamforming or spectral modification preferably cause a change in the signal power of individual sound source components in the range -25dB to 0dB. This refers to the input/output power ratio of each operation, ignoring cascade effects due to the output of one unit being connected to the input of another.
  • the method may optionally comprise: synthesizing a first output audio signal by applying scaling coefficients to a first reference audio signal and at least one first auxiliary signal and combining the results; and synthesizing a second output audio signal by applying scaling coefficients to a second, different reference audio signal and at least one second auxiliary signal and combining the results.
  • the at least one first auxiliary signal and at least one second auxiliary signal may be the same or different signals.
  • the two different reference audio signals should be selected from appropriately arranged microphones, for a desired stereo effect.
  • the method can be extended to synthesize an arbitrarily greater number of outputs, as desired for any particular application.
  • the sound sources may comprise one or more localised sound sources and a diffuse noise field.
  • the desired gain factors may be time-varying.
  • the method is particularly well suited to real-time implementation, which means that the desired gain can be adjusted dynamically. This may be useful for example for dynamically balancing changing sound sources, or for acoustic zooming.
  • the sound scene can be balanced using time-invariant gain factors, while in a dynamic scenario (that is, with moving or temporally modulated sound sources) the use of time-varying gain factors is more relevant.
  • the desired gain factors can be chosen in dependence upon the state of a visual zoom function.
  • a key example is the process of manipulating the sound scene such that it properly matches the video zooming operations. For example, when zooming in on a particular subject, the sound level of this subject should increase accordingly while keeping the level of the other sound sources constant. In this case, the desired gain factor corresponding to the sound source in front of the camera will be increased over time, while the other gain factors are time-invariant.
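A time-varying desired gain for the acoustic zoom case might be sketched as below. The linear-in-dB ramp, the 12 dB end value and the function name are assumptions made purely for illustration; the document does not specify the trajectory in this passage.

```python
import numpy as np

def front_gain_trajectory(times_s, zoom_start_s, zoom_end_s,
                          start_db=0.0, end_db=12.0):
    """Desired amplitude gain of the front source over time.

    The gain is constant before the zoom starts, ramps linearly in dB while
    zooming in, and stays constant afterwards (assumed shape).
    """
    times_s = np.asarray(times_s, dtype=float)
    progress = np.clip((times_s - zoom_start_s) / (zoom_end_s - zoom_start_s),
                       0.0, 1.0)
    gain_db = start_db + progress * (end_db - start_db)
    return 10.0 ** (gain_db / 20.0)  # convert dB to an amplitude factor

# Example: zoom-in between t = 5 s and t = 15 s.
times = np.array([0.0, 5.0, 10.0, 15.0, 26.0])
front_gain = front_gain_trajectory(times, 5.0, 15.0)
# The gain factors of the other sources stay time-invariant.
other_gain = np.ones_like(times)
```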
  • Also provided is a computer program comprising computer program code means adapted to perform all the steps of a method as described above, when said program is run on a computer; and such a computer program embodied on a computer readable medium.
  • each of the N audio signals $U_n(\omega)$ detected at the microphones as a function of the localized sound sources and the diffuse sound field in the frequency domain as follows:
  • $U_n^{(0)}(\omega)$ denotes the diffuse noise component.
  • the aim of the envisaged sound scene manipulation is to produce N manipulated signals, or audio output signals, in which each of the levels of the individual sound source components is changed in a user-specified way as compared to the respective levels in the nth microphone signal.
  • these will be referred to as the "desired gain factors".
  • the reweighting factors can be calculated or estimated depending on the embodiment of the invention, as described in greater detail below.
  • equation (7) above is a model for the auxiliary signals which will usually be satisfied only approximately, in practice.
  • the auxiliary signals will be derived from the various microphone signals. Therefore, they will be composed of filtered versions of the sound source components, instead of the unfiltered ("dry") sound source components themselves suggested by equation (7).
  • the number of localized interfering sound sources is taken to be one, for the purposes of this explanation. Furthermore, in this example, it is assumed that the capture device is equipped with two or more microphones. Those skilled in the art will appreciate that none of these assumptions should be taken to limit the scope of the invention.
  • Both embodiments have the general structure shown in the block diagram of Fig. 1 .
  • An array of microphones 4 produces a corresponding plurality of audio signals 6. These are fed as input to an auxiliary signal generator 10.
  • the auxiliary signal generator generates auxiliary signals, each comprising a mixture of the same sound source components detected by the microphones 4, but with the components present in the mixture with different relative strengths (as compared with their levels in the original audio signals 6). In the embodiments described below, these auxiliary signals are derived by processing combinations of the audio signals 6 in various ways.
  • the auxiliary signals and the input audio signals 6 are fed as inputs to an audio synthesis unit 20. This unit 20 applies scaling coefficients to the signals and sums them, to produce output signals 40. In the output signals 40, the sound source components are present with desired strengths.
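The scale-and-sum operation of the audio synthesis unit can be sketched as follows, assuming a single reference signal, time-invariant coefficients, and signals held as NumPy arrays. The function name and sample values are hypothetical.

```python
import numpy as np

def synthesize(reference, auxiliaries, coeffs):
    """Weighted sum of a reference signal and M auxiliary signals.

    reference:   (T,) reference audio signal.
    auxiliaries: (M, T) auxiliary signals, one per row.
    coeffs:      (M + 1,) scaling coefficients; coeffs[0] scales the reference.
    """
    reference = np.asarray(reference, dtype=float)
    auxiliaries = np.atleast_2d(np.asarray(auxiliaries, dtype=float))
    coeffs = np.asarray(coeffs, dtype=float)
    if coeffs.shape[0] != auxiliaries.shape[0] + 1:
        raise ValueError("expected one coefficient per input signal")
    return coeffs[0] * reference + coeffs[1:] @ auxiliaries

# Illustrative samples only.
reference = np.array([1.0, 2.0, 3.0])
auxiliaries = np.array([[1.0, 0.0, 1.0],
                        [0.0, 1.0, 0.0]])
output = synthesize(reference, auxiliaries, [0.5, 2.0, -1.0])
```

For time-varying coefficients, the same sum would be evaluated per block or per sample with the coefficients current at that time.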
  • the scaling coefficient calculator 30 converts the desired gains g(t) into a set of scaling coefficients a(t).
  • Each of the desired gains is associated with a sound source detectable at the microphones 4; whereas each of the scaling coefficients is associated with one of the auxiliary signals.
  • the scaling coefficient calculator 30 exploits knowledge about the parameters of the auxiliary signals to transform from desired gains g(t) to suitable scaling coefficients a(t).
  • Fig. 2 shows a block structure for the calculation of the auxiliary signals $x_n(t)$, $y_n(t)$, and $z_n(t)$ required in the algorithm.
  • the auxiliary signal generator 10 consists of three functional blocks 210, 212, 214:
  • the audio synthesis unit 20 is indicated by the dashed box 220. This produces the output signal as a weighted summation of the auxiliary signals $x_0$, $y_0$, and $z_0$, as well as the reference audio signal $u_0$.
  • the weights are the scaling coefficients, a, derived by the scaling coefficient calculator 30 (not shown in Fig. 2 ).
  • auxiliary signals are not explicitly used to calculate the output signal.
  • these signals are used internally in the adaptive beamformer and adaptive spectral attenuation algorithms.
  • the signals $x_n(t)$, $n > 0$, at the output of the fixed beamformer will be constructed to be "noise reference signals"; that is, signals in which the desired (front and optionally back) sound sources have been suppressed and which are used subsequently in the adaptive beamformer to estimate the localized interfering sound source component in the primary output signal $x_0(t)$ of the fixed beamformer.
  • the signal $y_1(t)$ is then constructed to be a "diffuse noise reference" that is used by the adaptive spectral attenuation algorithm to estimate the diffuse noise component in the primary output signal $y_0(t)$ of the adaptive beamformer.
  • a stereo output signal should preferably not be created by calculating both output signals (n = 0 and n = 1) using these auxiliary signals.
  • the block structure shown in Fig. 3 is used for the stereo case.
  • $N > 2$ (that is, when the array consists of more than two microphones)
  • $u_0(t)$ and $u_1(t)$ to be those two microphone signals that are best suited to deliver a stereo image.
  • this will typically depend on the placement of the microphones.
  • the scaling coefficient calculator 30 uses knowledge of the reweighting factors ⁇ n ( p,m ) to derive the scaling coefficients, a(t), from the desired gains, g(t).
  • the reweighting factors are found by using knowledge of the characteristics of the various blocks 210, 212, 214 in the auxiliary signal generator.
  • the reweighting factors are determined offline.
  • the input-output relation of the three functional blocks in the block structure can be described in the frequency domain as follows.
  • the secondary adaptive beamformer output signal should ideally be an estimate of the diffuse noise component in the primary adaptive beamformer output signal.
  • a more efficient approach involves setting the values of the reweighting factors off-line (in advance), making use of the fixed beamformer response (known a priori) and of heuristics about the behaviour of the adaptive beamformer and spectral attenuation response.
  • the values chosen can be approximations of the theoretical values predicted by the equations above. For example, the values may be set heuristically in 5dB steps. In many applications, the method will be largely insensitive to 5dB or 10dB deviations from the precise theoretical values.
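Setting reweighting factors on a coarse dB grid can be sketched as below. The sketch assumes the theoretical values are available as power ratios in dB and that the factors enter the linear system as amplitude ratios; both function names and the example values are illustrative assumptions.

```python
import numpy as np

def quantize_db(levels_db, step_db=5.0):
    """Round dB values to the nearest multiple of step_db."""
    return step_db * np.round(np.asarray(levels_db, dtype=float) / step_db)

def db_to_amplitude(levels_db):
    """Convert power ratios in dB to amplitude factors."""
    return 10.0 ** (np.asarray(levels_db, dtype=float) / 20.0)

# Hypothetical theoretical values, snapped to 5dB steps.
theoretical_db = np.array([-12.4, -2.0, -23.0])
heuristic_db = quantize_db(theoretical_db)
heuristic_amplitude = db_to_amplitude(heuristic_db)
```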
  • the fixed beamformer creates a primary output signal $X_0(\omega)$ that spatially enhances the front sound source signal, as well as a number of other output signals $X_n(\omega)$, $n > 0$, that serve as "noise references" for the adaptive beamformer.
  • BM blocking matrix
  • a superdirective (SD) design method which is recommendable when the aim is to maximize the directivity factor of the microphone array - that is, to maximize the array gain in the presence of a diffuse noise field.
  • SD superdirective
  • $I_N$ represents the $N \times N$ identity matrix
  • is a regularization parameter
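For orientation, a regularized superdirective design at a single frequency can be sketched with the common textbook formulation: weights proportional to $(\Gamma + \mu I_N)^{-1} d$, normalized for unit response in the look direction, where $\Gamma$ is the coherence matrix of a spherically isotropic noise field and $d$ the far-field steering vector. This is an assumed formulation, not necessarily the design equation of this document, and the array geometry, the value of `mu` and all function names are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed

def steering_vector(positions, freq_hz, look_direction):
    """Far-field steering vector for a plane wave arriving from look_direction."""
    look = np.asarray(look_direction, dtype=float)
    look = look / np.linalg.norm(look)
    # A microphone further along the look direction receives the wave earlier.
    delays = -(np.asarray(positions, dtype=float) @ look) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)

def diffuse_coherence(positions, freq_hz):
    """Coherence matrix of a spherically isotropic (diffuse) noise field."""
    positions = np.asarray(positions, dtype=float)
    dists = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so this equals sin(k d) / (k d).
    return np.sinc(2.0 * freq_hz * dists / SPEED_OF_SOUND)

def superdirective_weights(positions, freq_hz, look_direction, mu=1e-2):
    """Regularized superdirective weights with unit response in the look direction."""
    d = steering_vector(positions, freq_hz, look_direction)
    regularized = diffuse_coherence(positions, freq_hz) + mu * np.eye(len(d))
    numerator = np.linalg.solve(regularized, d)
    return numerator / (d.conj() @ numerator)

def directivity_factor_db(weights, positions, freq_hz, look_direction):
    """Array gain in a diffuse field, in dB, at one frequency."""
    d = steering_vector(positions, freq_hz, look_direction)
    gamma = diffuse_coherence(positions, freq_hz)
    signal_power = np.abs(np.vdot(weights, d)) ** 2
    noise_power = np.real(np.vdot(weights, gamma @ weights))
    return 10.0 * np.log10(signal_power / noise_power)

# Hypothetical 3-microphone endfire line array with 2 cm spacing.
positions = np.array([[0.00, 0.0, 0.0], [0.02, 0.0, 0.0], [0.04, 0.0, 0.0]])
front = [1.0, 0.0, 0.0]
w_sd = superdirective_weights(positions, 1000.0, front)
w_das = steering_vector(positions, 1000.0, front) / len(positions)  # delay-and-sum
df_sd = directivity_factor_db(w_sd, positions, 1000.0, front)
df_das = directivity_factor_db(w_das, positions, 1000.0, front)
```

Larger values of `mu` move the design towards delay-and-sum, trading directivity for robustness against microphone mismatch and self-noise.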
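  • $$\mathrm{DF}_{\mathrm{dB}} = 10 \log_{10} \left[ \frac{1}{2\pi} \int_0^{2\pi} \frac{\left| W_{1,:}^{(1)H}(\omega)\, G(\omega, \theta_F) \right|^2}{W_{1,:}^{(1)H}(\omega)\, \Phi_{U_N}(\omega)\, W_{1,:}^{(1)}(\omega)}\, d\omega \right]$$
  • $$\mathrm{FBRR}_{\mathrm{dB}} = 10 \log_{10} \frac{\int_0^{2\pi} \left| W_{1,:}^{(1)H}(\omega)\, G(\omega, \theta_F) \right|^2 d\omega}{\int_0^{2\pi} \left| W_{1,:}^{(1)H}(\omega)\, G(\omega, \theta_B) \right|^2 d\omega}.$$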
  • the FBRR increases for higher filter lengths and approximately saturates for a length greater than or equal to 128.
  • the frequency-domain SD design is executed at $L_{FSB}/2$ frequencies that are uniformly distributed in the Nyquist interval, after which the frequency-domain FSB coefficients are transformed to length-$L_{FSB}$ time-domain filters.
  • Experiments have also shown a significant performance gap between the 2-mic configuration and other configurations, with greater than 2 microphones, both in terms of directivity and FBRR.
  • the BM in the fixed beamformer consists of a number of filter-and-sum beamformers that each operate on one particular subset of microphone signals. In this way, a number of noise reference signals is created, in which the power of the desired signal components is maximally reduced relative to the power of these components in the microphone signals.
  • N-1 noise references are created by designing N-1 different filter-and-sum beamformers.
  • it might be preferable to create fewer than N-1 noise references, which then leads to a reduction of the number of input signals $x_n(t)$ for the adaptive beamformer.
  • In the context of the BM design, we consider the back sound source (if any) to be an undesired signal (which should be cancelled by the adaptive beamformer); hence the BM design reduces to a front-cancelling beamformer (FCB) design.
  • FCB front-cancelling beamformer
  • one of several different fixed beamformer design methods can be employed.
  • In the FCB design, we should specify a zero response in the front direction and a non-zero response in any other direction.
  • the latter direction should be the back direction, so that the design does not actually correspond to a front-back-cancelling beamformer design.
  • M is the number of equations in the linear system of equations above
  • the back response is indeed close to a unity response for most microphone configurations and filter length values.
  • the front source response varies heavily according to the microphone configuration and filter length used.
  • At least one microphone pair in an endfire configuration should preferably be included in the array to obtain a satisfactory power reduction of the front sound source component.
  • Concerning the choice of the BM filter length, experiments show that there is no clear threshold effect - that is, the response in the front direction decreases with a nearly constant slope (provided an endfire microphone pair is included).
  • the BM filter length should preferably be chosen according to the desired front sound source power reduction.
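The simplest instance of a front-cancelling noise reference is a delay-and-subtract operation on one endfire microphone pair. The sketch below is only that simplest case, not the filter-and-sum design discussed above. It assumes identical microphones, free-field propagation, and a travel time between the two microphones equal to a whole number of samples. All names are hypothetical.

```python
import numpy as np

def delay(signal, samples):
    """Delay a signal by a whole number of samples, zero-padding the start."""
    signal = np.asarray(signal, dtype=float)
    if samples == 0:
        return signal.copy()
    return np.concatenate([np.zeros(samples), signal[:-samples]])

def front_cancelling_reference(front_mic, back_mic, delay_samples):
    """Noise reference with a null towards the front (endfire) direction.

    Sound from the front reaches front_mic first and back_mic delay_samples
    later, so delaying front_mic and subtracting it cancels that component.
    """
    return np.asarray(back_mic, dtype=float) - delay(front_mic, delay_samples)

rng = np.random.default_rng(0)
source = rng.standard_normal(1000)
inter_mic_delay = 2  # samples, assumed

# Front source: arrives at the front microphone first.
ref_front = front_cancelling_reference(source, delay(source, inter_mic_delay),
                                       inter_mic_delay)
# Back source: arrives at the back microphone first and is not cancelled.
ref_back = front_cancelling_reference(delay(source, inter_mic_delay), source,
                                      inter_mic_delay)
```

The front source is removed from the reference while the back source passes through, which is the behaviour asked of the blocking matrix.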
  • the adaptive beamformer in the block scheme may be implemented using a generalized sidelobe canceller (GSC) algorithm; a multi-channel Wiener filtering (MWF) algorithm; or any other adaptive algorithm.
  • GSC generalized sidelobe canceller
  • MWF multi-channel Wiener filtering
  • SDW-MWF speech-distortion-weighted multi-channel Wiener filtering
  • the parameter $\mu$ can be tuned to trade off energy reduction of the undesired components versus distortion of the desired component.
  • GSVD generalized singular value decomposition
  • QR decomposition
  • time-domain stochastic gradient method
  • frequency-domain stochastic gradient method
  • a common feature of these implementations is that the correlation matrices $\Phi_x^{(F)}(\omega)$ and $\Phi_x^{(B,I,N)}(\omega)$ are explicitly estimated before the SDW-MWF filter estimate is computed.
  • the mean SNR at the microphones is equal to 10 dB.
  • the adaptation of the SDW-MWF algorithm is based on a stochastic gradient frequency-domain implementation, and is controlled by a perfect (manual) voice activity detection (VAD). Two features of the SDW-MWF have been evaluated, namely:
  • the algorithm without a feedforward filter corresponds to the GSC algorithm, while the algorithm with a feedforward filter is not relevant due to an intolerable speech distortion.
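The adaptive stage of a GSC-style structure can be illustrated with a single-reference adaptive noise canceller. The sketch uses a time-domain NLMS update, which is a swapped-in, simpler technique than the stochastic gradient frequency-domain SDW-MWF implementation mentioned above. It has no feedforward filter and no VAD control, and all signal parameters are made up for the example.

```python
import numpy as np

def nlms_cancel(primary, noise_ref, filter_len=16, step=0.5, eps=1e-8):
    """Remove from `primary` the part that is linearly predictable from `noise_ref`."""
    primary = np.asarray(primary, dtype=float)
    noise_ref = np.asarray(noise_ref, dtype=float)
    weights = np.zeros(filter_len)
    output = np.zeros_like(primary)
    # Zero-pad so that a full window of past reference samples always exists.
    padded = np.concatenate([np.zeros(filter_len - 1), noise_ref])
    for n in range(len(primary)):
        window = padded[n:n + filter_len][::-1]  # most recent sample first
        estimate = weights @ window              # estimated interference
        output[n] = primary[n] - estimate        # enhanced sample (error signal)
        # Normalized LMS update of the adaptive filter.
        weights += step * output[n] * window / (window @ window + eps)
    return output

rng = np.random.default_rng(0)
num_samples = 4000
desired = 0.1 * rng.standard_normal(num_samples)      # stands in for the front source
interferer = rng.standard_normal(num_samples)         # localized interfering source
leak = np.convolve(interferer, [0.6, -0.3, 0.1])[:num_samples]
primary = desired + leak                              # primary beamformer output
enhanced = nlms_cancel(primary, interferer)           # interferer used as noise reference
```

In a practical system, adaptation would normally be frozen while the desired source is active, which is the role of the voice activity detection mentioned above.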
  • the adaptive spectral attenuation block is included in the structure with the aim of reducing the diffuse noise energy in the primary adaptive beamformer output signal.
  • are estimated by means of a Discrete Fourier transform (DFT), with k and l denoting the DFT frequency bin and time frame indices.
  • DFT Discrete Fourier transform
  • is subsequently transformed back to the time domain by applying an inverse DFT (IDFT), and by using the phase spectrum of the primary adaptive beamformer output signal $Y_0(\omega_k, l)$.
  • IDFT inverse DFT
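A single-frame sketch of spectral attenuation along these lines is given below. It assumes a power spectral subtraction gain with a floor, which is one common rule and not necessarily the one used in this document, and it omits windowing, overlap-add and any smoothing of the spectral estimates. The function name and the `floor` parameter are hypothetical.

```python
import numpy as np

def spectral_attenuation_frame(primary_frame, noise_ref_frame, floor=0.1):
    """Attenuate one frame of the primary signal using a noise reference frame.

    The magnitude spectrum is attenuated per DFT bin and recombined with the
    phase spectrum of the primary signal before the inverse DFT.
    """
    primary_spec = np.fft.rfft(primary_frame)
    noise_spec = np.fft.rfft(noise_ref_frame)
    primary_power = np.abs(primary_spec) ** 2
    noise_power = np.abs(noise_spec) ** 2
    # Power spectral subtraction gain, limited from below by `floor`.
    ratio = noise_power / np.maximum(primary_power, 1e-12)
    gain = np.sqrt(np.maximum(1.0 - ratio, floor ** 2))
    magnitude = gain * np.abs(primary_spec)
    phase = np.angle(primary_spec)  # phase of the primary signal is reused
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=len(primary_frame))

rng = np.random.default_rng(1)
frame = rng.standard_normal(256)
unchanged = spectral_attenuation_frame(frame, np.zeros(256))  # no noise estimated
floored = spectral_attenuation_frame(frame, frame)            # all bins judged noise
```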
  • a first possibility is to regard the back sound source as an undesired sound source, in which case its level should remain constant. However, since the back sound source is typically very close to the camera, its level should often be reduced to obtain an acceptable balance between the back sound source and the other sound sources.
  • a second possibility is to have the back sound source gain factor follow the inverse trajectory of the front sound source gain factor, possibly combined with a fixed back sound source level reduction. While such an inverse level trajectory would obviously make sense from a physical point of view, it may be perceived as somewhat too artificial, since the front sound source level change is then supported by visual cues, while the back sound source level change is not.
  • the front sound source is a male speech signal corresponding to a camera recording that consists of a far shot phase (5 s), a zoom-in phase (10 s), and a close-up phase (11 s).
  • $\theta_I = 90$ deg.
  • a 3-microphone array was used, employing microphones 1, 3, and 4 as indicated in Fig. 1 .
  • the fixed beamformer consists of a superdirective FSB and a single-noise-reference front-cancelling BM, both with a filter length of 64.
  • the adaptive beamformer is calculated using a GSC algorithm and has a filter length of 128.
  • the desired AZ effect consists in keeping the level of the undesired sound sources (including the back sound source in the second simulation) unaltered, while increasing the level of the front sound source during the zoom-in phase, according to the perceptually optimal trajectory defined above.
  • the values of the re-weighting factors were determined empirically in advance, rather than at run-time (as described previously above).
  • the performance of the method depends in part upon the accuracy to which the reweighting factors can be estimated. The greater the accuracy, the better the performance of the manipulation will be.
  • Fig. 4 is a flowchart summarising a method according to an embodiment.
  • audio signals 6 are received from the microphones 4.
  • the desired gain factors 8 are input.
  • the auxiliary signal generator generates the auxiliary signals.
  • the scaling coefficient calculator 30 calculates the scaling coefficients, a(t).
  • the audio synthesis unit 20 applies the scaling coefficients to the generated auxiliary signals and reference audio signals, to synthesise output audio signals 40.
  • auxiliary signal calculation should be such that it exploits the diversity of the individual sound sources in the sound scene.
  • If multiple microphones are used, then exploiting spatial diversity is often the most straightforward option - and this is exploited by the beamformers in the embodiments described above.
  • auxiliary signal generator will vary according to the application and the characteristics of the audio environment.
  • a computer program may be stored/distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems. Any reference signs in the claims should not be construed as limiting the scope.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)
EP10275102.1A 2010-09-30 2010-09-30 Sound scene manipulation Active EP2437517B1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP10275102.1A EP2437517B1 (fr) 2010-09-30 2010-09-30 Sound scene manipulation
CN2011103036497A CN102447993A (zh) 2010-09-30 2011-09-29 Sound scene manipulation
US13/248,805 US20120082322A1 (en) 2010-09-30 2011-09-29 Sound scene manipulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP10275102.1A EP2437517B1 (fr) 2010-09-30 2010-09-30 Sound scene manipulation

Publications (2)

Publication Number Publication Date
EP2437517A1 (fr) 2012-04-04
EP2437517B1 (fr) 2014-04-02

Family

ID=43533377

Family Applications (1)

Application Number Title Priority Date Filing Date
EP10275102.1A Active EP2437517B1 (fr) 2010-09-30 2010-09-30 Sound scene manipulation

Country Status (1)

Country Link
EP (1) EP2437517B1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2353193A (en) * 1999-06-22 2001-02-14 Yamaha Corp Sound processing
EP2131610A1 (fr) * 2008-06-02 2009-12-09 Starkey Laboratories, Inc. Compression and mixing for hearing assistance devices

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JANG INSEON ET AL: "An Object-based 3D Audio Broadcasting System for Interactive Services", AES CONVENTION 118; MAY 2005, AES, 60 EAST 42ND STREET, ROOM 2520 NEW YORK 10165-2520, USA, 1 May 2005 (2005-05-01), XP040507192 *
S. DOCLO; A. SPRIET; J. WOUTERS; M. MOONEN: "Frequency-domain criterion for the speech distortion weighted multichannel wiener filter for robust noise reduction", SPEECH COMMUN., vol. 49, no. 7-8, July 2007 (2007-07-01), pages 636 - 656
S. DOCLO; M. MOONEN: "Superdirective beamforming robust against microphone mismatch", IEEE TRANS. AUDIO SPEECH LANG. PROCESS., vol. 15, no. 2, February 2007 (2007-02-01), pages 617 - 631

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013184299A1 (fr) * 2012-06-08 2013-12-12 Apple Inc. Adjusting audio beamforming settings based on system state
TWI502584B (zh) * 2012-06-08 2015-10-01 Apple Inc Computer-implemented beamforming method, beamforming system and related non-transitory computer-readable medium
US9838821B2 (en) 2013-12-27 2017-12-05 Nokia Technologies Oy Method, apparatus, computer program code and storage medium for processing audio signals
KR101721424B1 (ko) * 2015-12-31 2017-03-31 서강대학교산학협력단 Reverberation-robust multiple sound source detection method based on independent component analysis
WO2018219582A1 (fr) * 2017-05-29 2018-12-06 Harman Becker Automotive Systems Gmbh Sound capturing
US10869126B2 (en) 2017-05-29 2020-12-15 Harman Becker Automotive Systems Gmbh Sound capturing
CN112384975A (zh) * 2018-07-12 2021-02-19 杜比实验室特许公司 Transmission control for audio device using auxiliary signals
CN114203163A (zh) * 2022-02-16 2022-03-18 荣耀终端有限公司 Audio signal processing method and apparatus
CN114863944A (zh) * 2022-02-24 2022-08-05 中国科学院声学研究所 Low-latency overdetermined blind source separation method and separation apparatus for audio signals
CN114863944B (zh) * 2022-02-24 2023-07-14 中国科学院声学研究所 Low-latency overdetermined blind source separation method and separation apparatus for audio signals

Also Published As

Publication number Publication date
EP2437517B1 (fr) 2014-04-02

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR

AX Request for extension of the European patent

Extension state: BA ME RS

17P Request for examination filed

Effective date: 20121004

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

INTG Intention to grant announced

Effective date: 20130618

RIN1 Information on inventor provided before grant (corrected)

Inventor name: MOONEN, MARC

Inventor name: TIRRY, WOUTER

Inventor name: VAN WATERSCHOOT, TOON

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

INTG Intention to grant announced

Effective date: 20131028

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 660734

Country of ref document: AT

Kind code of ref document: T

Effective date: 20140415

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602010014766

Country of ref document: DE

Effective date: 20140515

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 660734

Country of ref document: AT

Kind code of ref document: T

Effective date: 20140402

REG Reference to a national code

Ref country code: NL

Ref legal event code: VDEP

Effective date: 20140402

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140802

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140702

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140703

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140702

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140804

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602010014766

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20150106

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602010014766

Country of ref document: DE

Effective date: 20150106

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

Ref country code: LU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140930

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

GBPC GB: European patent ceased through non-payment of renewal fee

Effective date: 20140930

REG Reference to a national code

Ref country code: IE

Ref legal event code: MM4A

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140930

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140930

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140930

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140930

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO

Effective date: 20100930

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 7

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 8

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 9

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20140402

REG Reference to a national code

Ref country code: DE

Ref legal event code: R081

Ref document number: 602010014766

Country of ref document: DE

Owner name: GOODIX TECHNOLOGY (HK) COMPANY LIMITED, CN

Free format text: FORMER OWNER: NXP B.V., EINDHOVEN, NL

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: FR

Payment date: 20230928

Year of fee payment: 14

Ref country code: DE

Payment date: 20230920

Year of fee payment: 14