EP1055317A1

EP1055317A1 - Method for improving acoustic noise attenuation in hand-free devices

Info

Publication number: EP1055317A1
Application number: EP99904718A
Authority: EP
Inventors: Gerhard Schmidt
Original assignee: Siemens AG
Current assignee: Infineon Technologies AG
Priority date: 1998-02-13
Filing date: 1999-01-18
Publication date: 2000-11-29
Also published as: JP2002503924A; DE19805942C1; US6618481B1; WO1999041898A1

Abstract

The present invention relates to a method for improving acoustic noise attenuation in hands-free devices used essentially in automobiles, wherein said method uses a level balance (22) as well as a plurality of adaptive echo-compensation filters (34), each processing a given subband. In at least one subband, a further adaptive filter of lower order (shadow filter 36) is connected in parallel to an adaptive echo-compensation filter (34). Room changes can thus be detected on the basis of a combined estimation which comprises a correlation analysis as well as a residual-error comparison of the two competing filters (34, 36).

Description


Methods for improving acoustic attenuation in hands-free systems

The present invention relates to a method for improving the acoustic attenuation in hands-free systems with a level balance and a plurality of adaptive echo compensation filters, each of which processes a subband.

In hands-free systems, it is absolutely necessary to suppress the signals of the distant subscriber that are emitted by the loudspeaker and picked up again by the microphone, since otherwise unpleasant echoes disrupt the connection. Up to now, a level balance has usually been provided to suppress these echoes, that is, for acoustic attenuation; it strongly attenuates the transmission or reception path depending on the conversation situation. However, this makes two-way communication (full duplex operation) practically impossible.

In the prior art, attempts have already been made to provide adequate attenuation while retaining acceptable two-way communication characteristics. For this purpose, a frequency-selective, controllable echo suppression was provided in addition to the level balance. In this regard, reference is made to the as yet unpublished patent application DE 197 14 966 by the applicant. Other methods are described, for example, in the NEC brochure "Reflexion ™ Acoustic Echo Canceller on the μPD7701x Family", 1996, or in the description of the Motorola DSP5600x digital processor (M. Knox, P. Abbott, C. Cox: A Highly Integrated H.320 audio subsystem using the Motorola DSP5600x Digital Processor).

Such echo cancellation methods work satisfactorily in normal rooms. When hands-free devices are used in motor vehicles, the detection of two-way communication is, depending on the interior acoustics, considerably more difficult than in office spaces. In motor vehicles in particular, it is extremely difficult to distinguish abrupt changes in the interior acoustics, caused for example by movements of the vehicle occupants, from two-way communication.

It is therefore an object of the invention to provide a method for improving acoustic attenuation in hands-free devices in which two-way communication can be clearly distinguished from abrupt changes in the interior acoustics even in motor vehicles, and in which this distinction is taken into account in the control of the hands-free device.

This object is achieved with a method having the features of patent claim 1. Advantageous refinements of this method are specified in the subclaims.

According to the invention, a further adaptive filter (shadow filter) of a lower order is connected in parallel to the adaptive echo cancellation filter in at least one subband. Room changes can then be detected by combining a performance evaluation of the two residual echo powers and a correlation analysis of the estimated and the measured microphone signal.

Several different sampling rates can preferably be used. This can reduce the computational effort.

It is also preferred that the further adaptive filter has a significantly lower order.

The echo cancellation is preferably implemented in frequency subbands by means of a filter bank.

Both performance evaluations of the competing adaptive filters and correlation-based analyses are preferably used for the adaptation or step-size control.

It is also preferred to estimate power transmission factors in subbands for determining the step size.

It is also preferred that the echo compensation filters provide estimates for the echo attenuation introduced by them, since these estimates can preferably be used to control the attenuation of the level balance. As a result, the attenuation to be introduced by the level balance can be further reduced and the conversation quality in the case of two-way communication can be further improved.

In addition, it is preferred to detect the simultaneous activity of both conversation participants (intercom). It is then possible, for example, to reduce the total attenuation of the level balance in the case of two-way communication in order to further improve the two-way communication capability (full duplex operation) of the hands-free device.

The present invention is described below with reference to the embodiment shown in the accompanying drawings, in which:

FIG. 1 shows a simplified model of a hands-free device connected to a digital connection;

FIG. 2 shows a simplified block diagram of a hands-free device;

FIG. 3 shows curves for the attenuation requirements of the user as a function of the echo delay;

FIG. 4 shows an overview of the method according to the invention with shadow filter and correlation analysis;

FIG. 5 shows an overview of the control of the power transmission factors;

FIG. 6 shows an overview of the shadow filter approach.

FIG. 1 shows a simplified model of a hands-free device 10 connected to a digital connection 12. The A-law coding and decoding used in the European ISDN network is shown in the two left-hand blocks 14, 16. The loudspeaker-room-microphone system 18 (LRM system) with the local call participant 20, the user of the hands-free device, is sketched on the right-hand side.

The acoustic coupling between loudspeaker and microphone leads to crosstalk via the LRM system. This crosstalk is perceived by the distant subscriber as a disturbing echo. Acoustic waves emerge from the loudspeaker and spread out in the room. Reflection on the walls and other objects in the room creates several propagation paths, which result in different propagation delays of the loudspeaker signal. The echo signal at the microphone thus consists of the superposition of a large number of echo components and, possibly, the useful signal n(t) of the local speaker.

The connection between the participants can also generate echoes at transitions between different transmission systems. However, the network operators try to take special measures against such echo sources directly at the critical points, so that these echoes can be disregarded here. Hybrid echoes, which arise in telephones with an analog interface due to a mismatch between the line balancing network and the line impedance, can also be disregarded when using digital connections.

An overview of a hands-free device is shown in FIG. 2. The central element is a level balance 22, which is shown in the left part of FIG. 2. Optionally, two gain controls 24, 26 (Automatic Gain Control = AGC) can be switched into the transmit and receive paths. The level balance 22 guarantees the minimum attenuation prescribed by the ITU or ETSI recommendations by adding attenuation to the transmission and/or reception path depending on the conversation situation. When the remote subscriber is active, the reception path is enabled and the signal from the remote subscriber is output unattenuated on the loudspeaker. The echoes that occur when the compensators are switched off or poorly adjusted are greatly reduced by the attenuation inserted into the transmission path. When the local speaker is active, the situation is reversed: while the reception path is strongly attenuated, the level balance 22 does not insert any attenuation into the transmission path and the signal of the local speaker is transmitted unattenuated. Controlling the level balance 22 becomes more difficult in the case of two-way communication. Here, both paths (and thus also the subscriber signals) each receive half of the attenuation to be inserted or, if the control is not optimal, at least one of the two signal paths is attenuated. Two-way communication is therefore not possible or only possible to a limited extent.

This is remedied by the use of adaptive echo cancellers 28 - shown in the right part of FIG. 2. These try to digitally emulate the LRM system in order to then remove the echo component of the distant subscriber from the microphone signal. Depending on how well the compensators manage this, the total attenuation to be inserted by the level balance 22 can be reduced.
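The interplay between level balance and echo cancellers described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name, the dB inputs and the three state labels are hypothetical, and the split of the residual attenuation simply follows the behaviour described in the text (all of it in the transmission path when the remote subscriber is active, all of it in the reception path when the local speaker is active, half each for two-way communication).

```python
def level_balance_attenuation(required_db, canceller_db, state):
    """Return (transmit_db, receive_db) inserted by the level balance.

    required_db  -- total echo attenuation demanded (hypothetical input)
    canceller_db -- attenuation already achieved by the echo cancellers
    state        -- "far" (remote subscriber active), "near" (local speaker
                    active) or "both" (two-way communication)
    """
    # Whatever the echo cancellers already provide need not be inserted again.
    residual = max(0.0, required_db - canceller_db)
    if state == "far":
        return residual, 0.0   # attenuate the transmission path only
    if state == "near":
        return 0.0, residual   # attenuate the reception path only
    if state == "both":
        return residual / 2.0, residual / 2.0
    raise ValueError(f"unknown state: {state!r}")
```

The better the cancellers are adjusted, the smaller the residual becomes, which is why a reliable estimate of their attenuation matters for the conversation quality.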

The echo cancellation was implemented in frequency subbands, the width of the individual bands being between 250 Hz and 500 Hz at 8 kHz sampling rate or between 500 Hz and 1000 Hz at 16 kHz sampling rate. The use of frequency-selective echo cancellation has several advantages. Firstly, by using undersampling and oversampling, the system can be operated as a multirate system, which reduces the signal processing effort. Secondly, the subband decomposition allows the "compensation power" to be distributed differently over the individual frequency ranges, so that it can be matched effectively to speech signals. Subband processing also has a decorrelating effect compared with full-band processing. For speech signals, this means an increase in the convergence speed of the adaptive filters. In addition to these advantages, the disadvantage of subband processing must not be ignored: breaking down a signal into individual frequency ranges always introduces a delay. However, since the method is used for video conferences or in GSM mobile phones, such delays are permissible.

In video conferencing systems, the delay is mainly determined by the image processing component. Since attempts are generally made to output the image and sound of the distant subscriber lip-synchronized to the local subscriber, the delay of the acoustic echoes can increase to several hundred milliseconds. FIG. 3 shows the results of a study that investigated which echo attenuation is necessary, depending on the delay of the echo, for 90, 70 and 50 percent of the respondents to be satisfied with the quality of the call.

Based on this study, a pure audio delay of 30 - 40 ms (at 8 kHz sampling rate) requires only 35 dB echo attenuation. With lip-synchronous output of image and sound and an associated delay of 300 ms, for example, the requirement increases to 53 dB. The delay can also be more than 100 ms in GSM connections. The requirements for echo cancellation in videoconferencing and GSM systems are therefore higher than the requirements for conventional hands-free telephones.

Since the echo cancellers are limited in their performance and cannot achieve such high echo attenuation with the available hardware, a so-called post filter 30 was introduced. This evaluates the step sizes of the individual subbands together with the other detector results and filters the synthesis filter output signal again frequency-selectively. Since the setting algorithm of the filter 30 was designed according to a Wiener approach, this post-filtering is also referred to below as Wiener filtering.

The echo cancellers are controlled in several stages. All power-based control units 32 work autonomously for each compensator, that is to say independently of the remaining frequency ranges. A separate adaptation and control unit 32 is therefore sketched in FIG. 2 for each compensator. The control stage that is based on correlation analyses of the estimated and the measured microphone signal is used for two-way communication detection and is therefore evaluated identically in all frequency ranges. A further stage takes into account the accuracy limited by the fixed-point arithmetic and controls the adaptation depending on the signal level.

The final intercom detection is also carried out separately with its own unit, which is based on both the level balance detectors and the echo cancellers. This unit causes the level balance to reduce the total attenuation to be inserted again (in accordance with ITU recommendation G.167).

When the hands-free device is used in motor vehicles, the detection of two-way communication is, depending on the interior acoustics, significantly more difficult than in offices. In particular, when the signal power in the transmission path (signal e(k)) increases, previous methods can distinguish only to a limited extent between two-way communication and abrupt room changes. In the latter case, movements of the driver (steering movements, gesturing) lead to changes in the transmission path between the loudspeaker and the microphone, as a result of which the echo cancellers are no longer matched to the room. Depending on the interior acoustics, the signal power of the residual echo increases up to the order of magnitude that is reached with two-way communication. In order to prevent the two-way communication detector from activating the attenuation reduction in such situations, a so-called shadow filter 36 was used.

A second filter 36 with a significantly reduced order, hereinafter referred to as shadow filter 36, was connected in parallel to one of the echo cancellers 34. This second filter 36 is dimensioned so that it can only compensate for the direct sound. Due to its shortened length and its adapted control, it can adapt much faster than the actual echo cancellation filter 34. The control of the shadow filter 36 is based only on the excitation by the distant call participant. After room changes, the residual error power of the shadow filter 36 (signal $e_{SF}^{(r)}(k_r)$, FIG. 4) decreases significantly faster than that of the long echo compensation filter 34. A detector evaluates the error powers of the two competing filters and, when room changes are detected, triggers a fast re-estimation of the power transmission factor between the signals x(k) and e(k). In the case of strong room changes, this means that two-way communication is no longer detected incorrectly and the level balance 22 suppresses the residual echo that is present. At the same time, the step size of all echo cancellers 28 is adapted, which leads to rapid readaptation. A detailed description follows.

Because of the strong background noise in motor vehicles (e.g. engine and wind noise), distinguishing between single-speech and two-way communication phases is more difficult and is possible only to a limited extent with previous detectors. To take this boundary condition into account, an extended correlation analysis is presented. In contrast to the prior art, this analysis uses the estimated and the measured microphone signal of a subband. This selection allows significantly higher background noise levels without delivering measurably poorer results. Incorrect detections with poorly adjusted compensators are intercepted by the shadow filter evaluation.

The combination of these two detection methods, the shadow filter and the correlation analysis, allows fast and stable adaptation of the echo cancellers even under the difficult conditions in motor vehicles. The residual attenuation to be inserted by the level balance can be controlled reliably with the described method. Control of the attenuation reduction during two-way communication is included.

The state of the art in relation to shadow filters results, for example, from S.D. Peters: A Self-Tuning NLMS Adaptive Filter Using Parallel Adaption, IEEE Transactions on Circuits and Systems - II, Analog and Digital Signal Processing, Vol. 44, No. 1, Jan. 1997. In addition to the actual adaptive total band filter, two shadow filters of the same length are adapted in parallel. The step size for the actual filter is then determined from the two error signals.

Only by using a single subband shadow filter, which is significantly shorter than the actual filter, can room changes be detected with very little effort using the method proposed in this invention.

The state of the art with regard to correlation analyses can be found, for example, in P. Heitkämper: A correlation measure for detecting speaker activity, 8th Aachen Colloquium on Signal Theory, RWTH Aachen, March 1994. There, the correlation between the microphone signal and the loudspeaker signal is evaluated. The disadvantage of this method is that the number of false detections also increases with increasing background noise level, so that it cannot be used in vehicles, or only to a limited extent.

The frequency band analysis and synthesis required for subband processing is implemented as a polyphase filter bank.

In order to be able to use a hands-free method with a level balance and several adaptive echo compensation filters, each of which processes a sub-band, in motor vehicles, adjustments to the changed boundary conditions (compared to the use in "normal" offices) must be made.

In hands-free operation in motor vehicles, for example, significant background noise (e.g. engine and wind noise) that can interfere with the adaptation must be expected. Furthermore, the power of this noise can fluctuate greatly; driving fast on the highway and standing in a quiet parking lot are two examples. The reverberation times of vehicle interiors (approx. 50 - 80 ms) are significantly shorter than those of office rooms. Movements by the driver (steering, gesturing, etc.) thus have a significantly greater impact on the impulse response of the loudspeaker-room-microphone system (LRM system).

In order to ensure stable adaptation of the echo cancellers and corresponding control of the attenuation requirements for the level balance under the described boundary conditions, the combined use of a correlation analysis and a shadow filter is presented. The procedure presented below estimates the quantities listed in Table 1.

The notation of the formula symbols introduced in Table 1 is retained throughout the description. The superscript $^{(r)}$ or subscript $_r$ indicates the sampling rate reduced by the factor r. Smoothed quantities are indicated by overlines. Individual subbands are selected by a suitable choice of the parameter μ.

In order to achieve a stable and fast adaptation of the echo cancellers, the subband echo cancellers 28 are controlled by their step sizes $\alpha_\mu^{(r)}(k_r)$. The equation for these quantities is:

$$\alpha_\mu^{(r)}(k_r) = p_{EK,\mu}^{(r)}(k_r)\,\frac{|\bar{x}_\mu^{(r)}(k_r)|}{|\bar{e}_\mu^{(r)}(k_r)|} \tag{2.1}$$

The quantities $|\bar{x}_\mu^{(r)}(k_r)|$ and $|\bar{e}_\mu^{(r)}(k_r)|$ represent smoothed estimates for the signal power of the remote subscriber and for the error power. Both estimates are determined by first-order non-linear recursive magnitude smoothing.

Name | Meaning

$\alpha_\mu^{(r)}(k_r)$ | Step size of the echo canceller in subband μ. The value range of this variable is between zero and one. With a step size $\alpha_\mu^{(r)}(k_r) = 0$, the old room estimate is retained; with a step size $\alpha_\mu^{(r)}(k_r) = 1$, adaptation takes place at maximum speed.

$p_{EK,\mu}^{(r)}(k_r)$ | Power transmission factor in subband μ.

$p_{EK}(k)$ | Power transmission factor in the entire band, or attenuation reduction of the level balance.

$|\bar{x}_\mu^{(r)}(k_r)|$ | Estimate of the signal power of the distant call participant (excitation power) in subband μ.

$|\bar{x}(k)|$ | Estimate of the signal power of the distant call participant (excitation power) in the entire band.

$|\bar{e}_\mu^{(r)}(k_r)|$ | Estimate of the error power in subband μ.

$|\bar{e}(k)|$ | Estimate of the error power in the entire band.

Table 1: Estimates and their meaning

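$$|\bar{x}_\mu^{(r)}(k_r)| = \beta_x(k_r)\left(|\mathrm{Re}\{x_\mu^{(r)}(k_r)\}| + |\mathrm{Im}\{x_\mu^{(r)}(k_r)\}|\right) + (1-\beta_x(k_r))\,|\bar{x}_\mu^{(r)}(k_r-1)| \tag{2.2}$$

$$|\bar{e}_\mu^{(r)}(k_r)| = \beta_e(k_r)\left(|\mathrm{Re}\{e_\mu^{(r)}(k_r)\}| + |\mathrm{Im}\{e_\mu^{(r)}(k_r)\}|\right) + (1-\beta_e(k_r))\,|\bar{e}_\mu^{(r)}(k_r-1)| \tag{2.3}$$

with

$$\beta_x(k_r) = \begin{cases}\beta_R & \text{if } |\mathrm{Re}\{x_\mu^{(r)}(k_r)\}| + |\mathrm{Im}\{x_\mu^{(r)}(k_r)\}| > |\bar{x}_\mu^{(r)}(k_r-1)| \\ \beta_F & \text{otherwise}\end{cases}$$

$$\beta_e(k_r) = \begin{cases}\beta_R & \text{if } |\mathrm{Re}\{e_\mu^{(r)}(k_r)\}| + |\mathrm{Im}\{e_\mu^{(r)}(k_r)\}| > |\bar{e}_\mu^{(r)}(k_r-1)| \\ \beta_F & \text{otherwise}\end{cases}$$

The time constants $\beta_R$ and $\beta_F$ are chosen so that an increase in signal power can be followed faster than a decrease in power. The actual calculation of the step sizes uses a DSP-specific logarithmization and linearization.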
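The magnitude smoothing with separate rise and fall constants, and the way its outputs could feed the step size, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the `floor` guard against division by zero are hypothetical, the step size is assumed to be the power transmission factor multiplied by the ratio of smoothed excitation to smoothed error magnitude, and the clipping to the range between zero and one reflects the value range given in Table 1 rather than a step stated explicitly in the text.

```python
def smooth_magnitude(sample, previous, beta_rise, beta_fall):
    """One step of first-order non-linear recursive magnitude smoothing.

    sample   -- complex subband sample
    previous -- smoothed magnitude from the previous subband instant
    """
    # |Re| + |Im| serves as an inexpensive magnitude approximation.
    magnitude = abs(sample.real) + abs(sample.imag)
    # Choose the constant depending on whether the level rises or falls.
    beta = beta_rise if magnitude > previous else beta_fall
    return beta * magnitude + (1.0 - beta) * previous


def step_size(p_ek, x_smooth, e_smooth, floor=1e-12):
    """Step size from the power transmission factor and smoothed magnitudes."""
    alpha = p_ek * x_smooth / max(e_smooth, floor)
    # Keep the result inside the value range [0, 1] listed in Table 1.
    return min(1.0, max(0.0, alpha))
```

A real fixed-point implementation would replace the division by the logarithmization mentioned in the text; the floating-point form above is only meant to show the data flow.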

The power transmission factors $p_{EK,\mu}^{(r)}(k_r)$ in the individual subbands are estimated whenever the state of the hands-free device permits. The quality of these estimates also determines the quality of the entire hands-free system over the long term. Accordingly, the determination of these variables also involves a significantly higher computational effort.

FIG. 5 shows an overview of the estimation of the power transmission factors. In principle, these factors should only be estimated while the distant participant alone is speaking. If single speech was detected, the variance of the estimate can be influenced by different time constants. Very slow estimation methods lead to very good results in stationary environments. In these cases, the hands-free device reaches states in which it is fully duplex-capable or at least almost fully duplex-capable, i.e. conversations are possible without noticeable attenuation while complying with the ITU recommendations.

If the status of the hands-free system changes, e.g. due to changes in space, sluggish estimation methods lead to incorrect detection and undesirable reductions in echo attenuation, that is to say a reduction in the quality of the call.

The distinction between two-way communication and room changes is particularly critical. Both lead to an increase in error power. In the case of two-way communication, the estimation of the power transmission factors should be stopped and the total attenuation of the level balance can be reduced according to the ITU-T or ETSI recommendations. In the event of room changes, the power transmission factors should be re-estimated as quickly as possible.

Before the explicit calculation formulas for the individual transmission factors are given, the two detectors, which are supposed to detect room changes or intercom, are presented in the following two sections. The combined evaluation, which is required to determine the power transmission factors, is also described in a separate section.

In order to recognize changes in space, a second filter is connected in parallel with the actual adaptive filter in the first subband (frequency range 250 Hz - 750 Hz) (FIG. 6). This so-called shadow filter is significantly shorter than the conventional one and is designed in such a way that it can mainly compensate for direct sound and the first reflections. Due to the reduced order, the shadow filter can adjust much faster, if not as far as the longer echo compensation filter.

The shadow filter $\underline{c}_{SF}^{(r)}(k_r)$ is, like the subband echo compensators $\underline{c}_\mu^{(r)}(k_r)$, adapted with an NLMS algorithm:

$$\underline{c}_{SF}^{(r)}(k_r+1) = \underline{c}_{SF}^{(r)}(k_r) + \alpha_{SF}(k_r)\,\frac{e_{SF}^{(r)*}(k_r)\,\underline{x}_{SF}^{(r)}(k_r)}{\underline{x}_{SF}^{(r)H}(k_r)\,\underline{x}_{SF}^{(r)}(k_r)} \tag{2.4}$$

Vectors are identified by an underscore. The notation $^H$ stands for Hermitian transposition; the superscript asterisk $^*$ denotes complex conjugation. The vector $\underline{x}_{SF}^{(r)}(k_r)$ is obtained from the excitation vector of the first subband $\underline{x}_1^{(r)}(k_r)$ by shortening it to the corresponding length. In contrast to the echo canceller, the step size of the shadow filter is controlled only by the norm of the excitation:

$$\alpha_{SF}(k_r) = \begin{cases}\alpha_{SF} & \text{if } \underline{x}_{SF}^{(r)H}(k_r)\,\underline{x}_{SF}^{(r)}(k_r) > N_{SF} \\ 0 & \text{otherwise}\end{cases} \tag{2.5}$$

The parameter $\alpha_{SF}$ is adjustable and should be about 1.

The quantity $N_{SF}$ is also adjustable and should be matched to the length of the shadow filter.
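A single NLMS update of a short filter with a norm-gated step size might look like the following Python sketch. It is an illustration, not the patented implementation: the function name and argument layout are hypothetical, and it assumes the convention that the filter output is the Hermitian inner product of the coefficient vector and the excitation vector, so that the update uses the complex conjugate of the error.

```python
def nlms_step(coeffs, x_vec, y, alpha_sf=1.0, norm_threshold=1e-6):
    """One NLMS update of a short (shadow) filter on complex subband data.

    coeffs -- current filter coefficients (list of complex)
    x_vec  -- excitation vector, same length as coeffs
    y      -- measured microphone subband sample
    Returns (new_coeffs, error).
    """
    # Filter output under the assumed convention y_hat = c^H x.
    y_hat = sum(c.conjugate() * x for c, x in zip(coeffs, x_vec))
    error = y - y_hat
    norm = sum(abs(x) ** 2 for x in x_vec)
    # Norm-controlled step size: adapt only if the excitation is strong enough.
    if norm <= norm_threshold:
        return list(coeffs), error
    gain = alpha_sf * error.conjugate() / norm
    return [c + gain * x for c, x in zip(coeffs, x_vec)], error
```

With a step size of 1 and no noise, each update removes the coefficient error along the direction of the current excitation vector, which is why a short filter of this kind can follow a changed direct sound path within a few samples.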

In order to detect room changes, the error powers of the echo compensation filter and the shadow filter are compared. For this purpose, as introduced above, non-linear recursive first-order magnitude smoothing is calculated:

$$|\bar{e}_{V,1}^{(r)}(k_r)| = \beta_V\left(|\mathrm{Re}\{e_1^{(r)}(k_r)\}| + |\mathrm{Im}\{e_1^{(r)}(k_r)\}|\right) + (1-\beta_V)\,|\bar{e}_{V,1}^{(r)}(k_r-1)| \tag{2.6}$$

$$|\bar{e}_{V,SF}^{(r)}(k_r)| = \beta_V\left(|\mathrm{Re}\{e_{SF}^{(r)}(k_r)\}| + |\mathrm{Im}\{e_{SF}^{(r)}(k_r)\}|\right) + (1-\beta_V)\,|\bar{e}_{V,SF}^{(r)}(k_r-1)| \tag{2.7}$$

The quotient of these two estimates

$$R^{(r)}(k_r) = \frac{|\bar{e}_{V,SF}^{(r)}(k_r)|}{|\bar{e}_{V,1}^{(r)}(k_r)|} \tag{2.8}$$

determines the detector output, which is generated as follows:

Condition | Detection result

$R^{(r)}(k_r) > R_0$ | no room changes detected

$R_0 > R^{(r)}(k_r) > R_1$ | weak room changes detected

$R_1 \ge R^{(r)}(k_r)$ | strong room changes detected

$R_0 > R_1$ applies here. The quotient is again calculated using logarithmization and linearization. The further use of the detection results is described below.

To detect two-way communication, the calculation of a normalized correlation measure between the measured microphone signal $y_1^{(r)}(k_r)$ and the estimated microphone signal $\hat{y}_1^{(r)}(k_r)$ is proposed. To simplify the calculation, however, not the entire signal is used, but only the respective real part. The correlation coefficient $\rho_0^{(r)}(k_r)$ is calculated over a window of $N_K$ samples as follows:

$$\rho_0^{(r)}(k_r) = \frac{\left|\sum_{i=0}^{N_K-1} \mathrm{Re}\{\hat{y}_1^{(r)}(k_r-i)\}\,\mathrm{Re}\{y_1^{(r)}(k_r-i)\}\right|}{\sum_{i=0}^{N_K-1} \left|\mathrm{Re}\{\hat{y}_1^{(r)}(k_r-i)\}\,\mathrm{Re}\{y_1^{(r)}(k_r-i)\}\right|} \tag{2.9}$$

Because of the magnitude formation, the correlation coefficient is restricted to the range $\rho_0^{(r)}(k_r) \in [0, 1]$. Small values mean only a slight correlation between the signals, i.e. two-way communication; values close to 1, on the other hand, indicate a high correlation, i.e. single speech.

The correlation analysis assumes well-adjusted compensators; the signals $y_1^{(r)}(k_r)$ and $\hat{y}_1^{(r)}(k_r)$ then have no delay difference. This does not hold for poorly adjusted compensators. In order to enable an analysis in this case as well, the evaluation is additionally carried out for time offsets in both directions. The correlation coefficients $\rho_n^{(r)}(k_r)$ are calculated for different values of n:

$$\rho_n^{(r)}(k_r) = \frac{\left|\sum_{i=n}^{N_K-1} \mathrm{Re}\{\hat{y}_1^{(r)}(k_r-i+n)\}\,\mathrm{Re}\{y_1^{(r)}(k_r-i)\}\right|}{\sum_{i=n}^{N_K-1} \left|\mathrm{Re}\{\hat{y}_1^{(r)}(k_r-i+n)\}\,\mathrm{Re}\{y_1^{(r)}(k_r-i)\}\right|} \quad \text{for } n > 0 \tag{2.10}$$

$$\rho_n^{(r)}(k_r) = \frac{\left|\sum_{i=0}^{N_K-1+n} \mathrm{Re}\{\hat{y}_1^{(r)}(k_r-i+n)\}\,\mathrm{Re}\{y_1^{(r)}(k_r-i)\}\right|}{\sum_{i=0}^{N_K-1+n} \left|\mathrm{Re}\{\hat{y}_1^{(r)}(k_r-i+n)\}\,\mathrm{Re}\{y_1^{(r)}(k_r-i)\}\right|} \quad \text{for } n < 0 \tag{2.11}$$

The values for n are preferably taken from an integer interval that contains the value 0. $\rho_n^{(r)}(k_r)$ is preferably calculated for five values of n. To reduce the effort, the sums in the numerator and denominator can be calculated recursively. The decisive factor for the detector output is the maximum over the calculated correlation coefficients:

$$\rho_{max}^{(r)}(k_r) = \max_n \{\rho_n^{(r)}(k_r)\} \tag{2.12}$$

The detection criterion can thus be specified as follows:

Condition | Detection result

$\rho_{max}^{(r)}(k_r) > \rho_g$ | single speech (distant subscriber) detected

$\rho_{max}^{(r)}(k_r) < \rho_g$ | two-way communication detected
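The correlation-based detector can be sketched in Python as follows. This is a minimal illustration under assumptions: the function names, the `window` parameter (the number of samples entering the sums) and the concrete offset set are hypothetical, the sums are computed directly rather than recursively, and the index ranges for positive and negative offsets are one plausible reading of the equations in the text.

```python
def correlation_measure(y, y_hat, n, window):
    """Normalised correlation of the real parts for time offset n.

    y, y_hat -- lists of complex subband samples, newest sample last,
                each holding at least `window` samples
    window   -- number of samples considered (hypothetical parameter)
    """
    k = len(y) - 1  # index of the current sample
    indices = range(n, window) if n >= 0 else range(0, window + n)
    products = [y_hat[k - i + n].real * y[k - i].real for i in indices]
    denom = sum(abs(p) for p in products)
    # |sum| / sum|.| lies in [0, 1]; return 0 if there is no signal at all.
    return abs(sum(products)) / denom if denom > 0.0 else 0.0


def detect_single_talk(y, y_hat, threshold, window, offsets=(-2, -1, 0, 1, 2)):
    """Return (single_talk_detected, rho_max) over five time offsets."""
    rho_max = max(correlation_measure(y, y_hat, n, window) for n in offsets)
    return rho_max > threshold, rho_max
```

If the estimated microphone signal is a scaled copy of the measured one, all products have the same sign and the measure reaches 1; uncorrelated local speech makes the signed sum partially cancel and pushes the measure towards 0.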

With the detectors described above, the initially only "rough" description (FIG. 5) of the estimation of the power transmission factors can be made concrete. First, the excitation by the distant speaker is checked with the condition

$$|\bar{x}_\mu^{(r)}(k_r)| > x_{0,\mu} \tag{2.13}$$

If the smoothed magnitude exceeds the limit value, further criteria are checked. Otherwise, insufficient excitation is detected, which leads to the adaptation being stopped,

$$\alpha_\mu^{(r)}(k_r) = 0 \tag{2.14}$$

and to the previous transmission factor estimate being retained:

$$p_{EK,\mu}^{(r)}(k_r) = p_{EK,\mu}^{(r)}(k_r-1) \tag{2.15}$$

leads. The threshold values should be adapted to the statistical properties of the input signal, in particular to the power density spectrum. If sufficient excitation has been detected, the spatial change detection of the shadow filter is evaluated in a second detection stage. Should the shadow filter change to "strong" room changes

R,> R ^(r k _r ) (2.16) 18 detect, a non-linear, recursive first-order smoothing of the power transmission factors is carried out. This smoothing uses the shortest time constants compared to the estimates below. The estimated values are therefore tracked very quickly to the instantaneous values. The determination equation of the transfer factors is in the case of detected strong changes in space:

\ 4 ^k Z ß K) (2.17) μj, ^{r) (} *, ⁾ i + (\ -ß _L {K)) p "{k _r - \). The time constant is set as follows

KH _r )

PRO * for ^all ß _L (K) = ⁽ *,) | > PZΛK- .2.11 ß _R0F otherwise with <ß _ROιF <ß _RO <\.
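The recursive re-estimation of a power transmission factor can be sketched in Python as follows. The sketch assumes that the quantity being smoothed is the ratio of smoothed error magnitude to smoothed excitation magnitude and that the rise or fall constant is selected by comparing this ratio with the previous estimate; the function name and the `floor` guard are hypothetical. The same routine would serve every detection case, with only the pair of constants changing.

```python
def update_transfer_factor(p_prev, e_smooth, x_smooth,
                           beta_rise, beta_fall, floor=1e-12):
    """One recursive update of a power transmission factor estimate.

    p_prev    -- estimate from the previous subband instant
    e_smooth  -- smoothed error magnitude
    x_smooth  -- smoothed excitation magnitude
    beta_rise -- constant used when the instantaneous ratio exceeds p_prev
    beta_fall -- constant used otherwise
    """
    ratio = e_smooth / max(x_smooth, floor)
    beta = beta_rise if ratio > p_prev else beta_fall
    return beta * ratio + (1.0 - beta) * p_prev
```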

In the case of detection of "weak" room changes,

$$R_1 < R^{(r)}(k_r) < R_0 \tag{2.19}$$

a recursive smoothing is also carried out according to equation 2.17, but with the time constants

$$\beta_L(k_r) = \begin{cases}\beta_{R1,R} & \text{if } \dfrac{|\bar{e}_\mu^{(r)}(k_r)|}{|\bar{x}_\mu^{(r)}(k_r)|} > p_{EK,\mu}^{(r)}(k_r-1) \\ \beta_{R1,F} & \text{otherwise}\end{cases} \tag{2.20}$$

with $0 < \beta_{R1,F} < \beta_{R1,R} < 1$.

Compared with the detection of strong room changes, the re-estimation of the power transmission factors takes place more slowly, i.e. the following applies:

$$\beta_{R1,F} > \beta_{R0,F} \tag{2.21}$$

$$\beta_{R1,R} > \beta_{R0,R} \tag{2.22}$$

If no room changes were detected by the shadow filter,

$$R_0 < R^{(r)}(k_r) \tag{2.23}$$

further criteria for distinguishing between single speech and two-way communication are evaluated. The first stage here is the correlation analysis already mentioned. If the condition

$$\rho_{max}^{(r)}(k_r) < \rho_g \tag{2.24}$$

is met, two-way communication is detected and the transmission factor estimation is stopped, i.e.

$$p_{EK,\mu}^{(r)}(k_r) = p_{EK,\mu}^{(r)}(k_r-1) \tag{2.25}$$

In the case of single speech detection by the correlation analysis,

$$\rho_{max}^{(r)}(k_r) \ge \rho_g \tag{2.26}$$

a further comparison is made in order to exclude two-way communication situations as far as possible. If the measured full-band error power is below the estimated one, single speech is finally recognized. The comparison uses the condition:

$$|\bar{x}(k)|\,p_{EK}(k)\,K_{GS} < |\bar{e}(k)| \tag{2.27}$$

The second stage of two-way communication detection is evaluated with full-band signals. The quantities $|\bar{x}(k)|$ and $|\bar{e}(k)|$ are determined according to

$$|\bar{x}(k)| = \beta_{GB,x}(k)\,|x(k-N_{AS})| + (1-\beta_{GB,x}(k))\,|\bar{x}(k-1)|$$

$$|\bar{e}(k)| = \beta_{GB,e}(k)\,|e(k)| + (1-\beta_{GB,e}(k))\,|\bar{e}(k-1)|$$

with

$$\beta_{GB,x}(k) = \begin{cases}\beta_{GB,R} & \text{if } |x(k-N_{AS})| > |\bar{x}(k-1)| \\ \beta_{GB,F} & \text{otherwise}\end{cases} \qquad \beta_{GB,e}(k) = \begin{cases}\beta_{GB,R} & \text{if } |e(k)| > |\bar{e}(k-1)| \\ \beta_{GB,F} & \text{otherwise}\end{cases} \tag{2.28}$$

With these recursive estimators, the time constants $\beta_{GB,R}$ and $\beta_{GB,F}$ are chosen so that an increase in the signal power is followed quickly, but a power drop more slowly, i.e. $\beta_{GB,F} > \beta_{GB,R}$. Since the filter bank inserts a delay between the microphone signal and the error signal, the excitation signal of the remote subscriber is delayed accordingly; the quantity $N_{AS}$ therefore describes the length of the analysis or synthesis filter.

The full-band power transmission factor $p_{EK}(k)$ is calculated analogously to the subband transmission factors with several detectors. First, the excitation power of the distant participant is checked; if a threshold is not exceeded here, the old estimate is retained. If sufficient excitation has been detected, the error power of the shadow filter is evaluated and, when room changes are detected, the $p_{EK}(k)$ estimation is carried out with correspondingly short time constants. If the shadow filter detector does not detect any room changes, the correlation analysis of the first subband is evaluated as the last control stage. If single speech is detected here (condition 2.26), a recursively smoothed estimation is carried out; otherwise the old transmission factor is retained.

The constant $K_{GS}$ can be used to react to the variance of the quantities entering condition 2.27; it should be chosen so that two-way communication is not recognized even with slight fluctuations in the signal powers. The detection should only recognize two-way communication when the measured error power exceeds the estimated power by a certain value. In such cases, the estimation of the power transmission factors (equation 2.17) is carried out very slowly, i.e.

$$\beta_L(k_r) = \begin{cases}\beta_{R3,R} & \text{if } \dfrac{|\bar{e}_\mu^{(r)}(k_r)|}{|\bar{x}_\mu^{(r)}(k_r)|} > p_{EK,\mu}^{(r)}(k_r-1) \\ \beta_{R3,F} & \text{otherwise}\end{cases} \tag{2.29}$$

with $0 < \beta_{R3,F} < \beta_{R3,R} < 1$.

In the other detection case, the detection of single speech, the time constants are set according to

$$\beta_L(k_r) = \begin{cases}\beta_{R2,R} & \text{if } \dfrac{|\bar{e}_\mu^{(r)}(k_r)|}{|\bar{x}_\mu^{(r)}(k_r)|} > p_{EK,\mu}^{(r)}(k_r-1) \\ \beta_{R2,F} & \text{otherwise}\end{cases} \tag{2.30}$$

with $0 < \beta_{R2,F} < \beta_{R2,R} < 1$.

set. All possible paths of FIG. 5 are thus provided with explicit information about the detection conditions. The following applies to the individual time constants:

0 <ß _R0 .R <ß _R , R <ß _R 2.R <ß _R R <1>

; 2.3i: ⁰ <ß _R 0.F <ß _R X, F <ß _R 2.F <ß _R3 .F < ^l

(2.32) The quality of the estimation of the subband and the total band transmission factor strongly determines the quality of the entire hands-free system. The subband estimates are of great importance for a stable and, above all, fast adaptation. Only when the echo cancellers achieve high echo attenuation can the hands-free device be "led out" of half-duplex operation and work with almost no noticeable attenuation by means of a level balance. In the case of large changes in space, which occurs more frequently when operating in motor vehicles, a high quality of the damping estimate is in the 22

Total band (p _EK (k)) necessary. With the method described here, the requirements set can be satisfactorily met with little computing effort.
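The detection cascade of FIG. 5 as described in the text can be summarised in a short Python sketch. It is a hedged illustration only: the function name, argument names and the returned labels are hypothetical, the labels "R0" to "R3" merely mirror the indices of the time-constant pairs, and the handling of exact equality with a threshold is an assumption because the text does not specify it.

```python
def select_regime(x_smooth, x_threshold, r, r0, r1,
                  rho_max, rho_g, x_full, e_full, p_full, k_gs):
    """Return the update regime: 'hold', 'R0', 'R1', 'R2' or 'R3'.

    'hold' means the previous transmission factor estimate is kept.
    """
    if x_smooth <= x_threshold:
        return "hold"   # insufficient excitation by the distant speaker
    if r <= r1:
        return "R0"     # strong room change: fastest re-estimation
    if r < r0:
        return "R1"     # weak room change: somewhat slower re-estimation
    if rho_max < rho_g:
        return "hold"   # two-way communication detected by correlation
    if x_full * p_full * k_gs < e_full:
        return "R3"     # measured error clearly above estimate: very slow
    return "R2"         # single speech confirmed
```

The returned label would then select the pair of rise and fall constants used in the recursive smoothing of the transmission factor.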

Claims

1. A method for improving the acoustic attenuation in hands-free devices with a level balance (22) and several adaptive echo compensation filters (34), each of which processes a subband, characterized in that in at least one subband a further adaptive filter (shadow filter (36)) of a different order is connected in parallel to the adaptive echo compensation filter (34), and room changes are recognized on the basis of a correlation analysis and a performance evaluation of the shadow filter output.

2. The method according to claim 1, characterized in that several different sampling rates are used.

3. The method according to claim 1 or claim 2, characterized in that the further filter (36) has a substantially lower order than the actual echo cancellation filter.

4. The method according to any one of claims 1 to 3, characterized in that the echo compensation filters (34) are implemented in frequency subbands by means of a filter bank (28).

5. The method according to any one of claims 1 to 4, characterized in that both performance evaluations of competing adaptive filters (34, 36) of different orders and correlation-based analyses are used to control the adaptation and the step size.

6. The method according to any one of claims 1 to 5, characterized in that power transmission factors in subbands are estimated for determining the step size.

7. The method according to any one of claims 1 to 6, characterized in that the echo compensation filters (34) provide estimates for the echo attenuation introduced by them.

8. The method according to claim 7, characterized in that the estimated values for the attenuation are used to control the attenuation of the level balance (22).

9. The method according to any one of claims 1 to 8, characterized in that the simultaneous activity of both conversation participants (two-way communication) is detected.

10. The method according to claim 9, characterized in that the total attenuation of the level balance (22) is reduced in the case of two-way communication.