EP2710787A1 - Non-linear post-processing for acoustic echo cancellation - Google Patents

Non-linear post-processing for acoustic echo cancellation

Info

Publication number
EP2710787A1
Authority
EP
European Patent Office
Prior art keywords
end signal
signal
coherence
suppression factors
echo
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP11721215.9A
Other languages
German (de)
English (en)
Inventor
Andrew John Macdonald
Jan Skoglund
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Publication of EP2710787A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M9/00: Arrangements for interconnection not involving centralised switching
    • H04M9/08: Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082: Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers

Definitions

  • the present invention relates generally to a method and system for cancellation of echoes in telecommunication systems. It particularly relates to a method and system for removing residual echo from an error signal by non-linear post processing of the error signal.
  • Speech quality is an important factor for telephony system suppliers. Customer demand makes it vital to strive for continuous improvements.
  • An echo, which is a delayed version of what was originally transmitted, is regarded as a severe distraction to the speaker if the delay is long. For short round-trip delays of less than approximately 20 ms, the speaker will not be able to distinguish the echo from the side tone in the handset.
  • a remotely generated echo signal often has a substantial delay.
  • the speech and channel coding compulsory in digital radio communications systems and for telephony over the Internet protocol (IP telephony, for short) also result in significant delays which make the echoes generated a relatively short distance away clearly audible to the speaker. Hence, canceling the echo is a significant factor in maintaining speech quality.
  • An echo canceller typically includes a linear filtering part which essentially is an adaptive filter that tries to adapt to the echo path. In this way, a replica of the echo can be produced from the far-end signal and subtracted from the near-end signal, thereby canceling the echo.
  • the filter generating the echo replica may have a finite or infinite impulse response. Most commonly it is an adaptive, linear finite impulse response (FIR) filter with a number of delay lines and a corresponding number of coefficients, or filter delay taps. The coefficients are values, which when multiplied with delayed versions of the filter input signal, generate an estimate of the echo.
  • the filter is adapted, i.e. updated, so that the coefficients converge to optimum values.
  • a traditional way to cancel out the echo is to update a finite impulse response (FIR) filter using the normalized least mean square (NLMS) algorithm.
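The NLMS update of an FIR echo path estimate described above can be sketched as follows. This is a minimal time-domain illustration, not the frequency-domain filter of the embodiment; the function name `nlms_cancel`, the step size `mu`, the regularizer `eps`, and the 3-tap test echo path are all assumptions made for the example.

```python
import random

def nlms_cancel(far, near, taps=8, mu=0.5, eps=1e-8):
    """Return (error_signal, coefficients) for an NLMS-adapted FIR echo canceller."""
    w = [0.0] * taps      # FIR coefficients (echo path estimate)
    buf = [0.0] * taps    # most recent far-end samples, newest first
    err = []
    for x, d in zip(far, near):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo replica
        e = d - y                                     # residual after subtraction
        power = sum(b * b for b in buf) + eps         # input power used for normalization
        w = [wi + mu * e * bi / power for wi, bi in zip(w, buf)]
        err.append(e)
    return err, w

# Hypothetical 3-tap echo path and white far-end signal, for illustration only.
rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
path = [0.5, -0.3, 0.1]
near = [sum(path[j] * far[n - j] for j in range(len(path)) if n - j >= 0)
        for n in range(len(far))]
err, w = nlms_cancel(far, near)
```

With no near-end speech or noise in this toy setup, the leading coefficients of `w` approach the assumed echo path and the error decays toward zero; in practice residual echo remains, which is what motivates the post-processing stage.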
  • The acoustic echo canceller (AEC) employs the linear filter as a first stage to model the system impulse response. An estimated echo signal is obtained by filtering the far-end signal. This estimated echo signal is then subtracted from the near-end signal to cancel the echo. A problem, however, is that some audible echo will generally remain in the residual error signal after this first stage. A second stage post-process
  • a method for non-linear post processing of an audio signal for acoustic echo cancellation includes receiving as input, by a non-linear processor, at least two of the following signals: a far-end signal to be rendered and a plurality of capture-side signals, transforming the received signals to the frequency domain, and computing, for each frequency band, one or more coherence measures between the received signals.
  • the method also includes deriving suppression factors corresponding to each band based on the one or more coherence measures and applying the suppression factors to one of the capture-side signals to substantially remove echo from the capture-side signal.
  • the plurality of capture-side signals include a near-end captured signal and an error signal containing a residual echo output from a linear adaptive filter.
  • The method includes tracking the coherence measures over a predetermined amount of time to determine whether the near-end signal is in a "no-echo-state" or in an "echo-state".
  • the computing step further includes computing, for each frequency band, a first coherence measure between the far-end signal and the near-end signal; a second coherence measure between the near-end signal and the error signal; and applying the first and second coherence measures to compute the suppression factors.
  • the suppression factors are directly proportional to a combination of the coherence measures.
  • the suppression factors are directly proportional to one of the first coherence measure and the second coherence measure when the near-end signal is in the "no echo state".
  • the suppression factors are directly proportional to a minimum of the first and second coherence measures when the near-end signal is in the "echo state".
  • the first coherence measure is a frequency-domain analog to time-domain correlation between the far-end signal and the near-end signal.
  • the second coherence measure is a frequency-domain analog to time-domain correlation between the near-end signal and the error signal.
  • the method further includes applying suppression factors to the error signal to substantially remove the residual echo from the error signal.
  • the method further includes detecting filter divergence by comparing the energy of the error signal and the near-end signal and applying the suppression factors to the near-end signal based on detected filter divergence.
  • the method also includes accentuating valleys in the suppression factors by raising to a power.
  • the method includes weighting the suppression factors with a curve configured to influence less accurate bands.
  • the method includes tracking a minimum suppression factor and scaling the suppression factors such that the minimum approaches a target value.
  • the method includes transforming the far-end signal, the near-end signal, and the error signal to the frequency domain.
  • the frequency bands correspond to individual Discrete Fourier Transform (DFT) coefficients.
  • a system for non-linear postprocessing of an audio signal for acoustic echo cancellation includes a non-linear processor and a transform unit.
  • The non-linear processor receives, as input, at least two of the following signals: a far-end signal to be rendered and a plurality of capture-side signals.
  • the transform unit transforms the received signals to the frequency domain.
  • the non-linear processor is configured to: compute, for each frequency band, one or more coherence measures between the received signals, derive suppression factors corresponding to each band based on the one or more coherence measures, and apply the suppression factors to one of the capture-side signals to substantially remove echo from the capture-side signal.
  • the non-linear processor is configured to track the coherence measures over a predetermined amount of time to determine whether the near-end signal is in a "no-echo-state" or in an "echo-state".
  • The non-linear processor is configured to compute, for each frequency band, a first coherence measure between the far-end signal and the near-end signal; a second coherence measure between the near-end signal and the error signal; and apply the first and second coherence measures to compute the suppression factors.
  • the non-linear processor is configured to apply suppression factors to the error signal to substantially remove the residual echo from the error signal.
  • the non-linear processor is configured to detect a filter divergence by comparing energy of the error signal and the near-end signal and apply the suppression factors to the near-end signal based on the detected filter divergence.
  • the non-linear processor is configured to accentuate valleys in the suppression factors by raising to a power.
  • the non-linear processor is configured to weight the suppression factors with a curve configured to influence less accurate bands.
  • the non-linear processor is configured to track a minimum suppression factor and scale the suppression factors such that the minimum approaches a target value.
  • the transform unit is configured to transform the far-end signal, the near-end signal, and the error signal to the frequency domain.
  • the frequency bands correspond to individual Discrete Fourier Transform (DFT) coefficients.
  • a computer-readable storage medium having stored thereon computer executable program for non-linear post-processing of an audio signal for acoustic echo cancellation.
  • the computer program when executed causes a processor to execute the steps of: receiving as input, by a non-linear processor, at least two of the following signals: a far-end signal to be rendered and a plurality of capture-side signals; transforming the received signals to the frequency domain; computing, for each frequency band, one or more coherence measures between the received signals; deriving suppression factors corresponding to each band based on the one or more coherence measures; and applying the suppression factors to one of the capture-side signals to substantially remove echo from the capture-side signal.
  • the computer program when executed causes the processor to further execute the step of tracking the coherence measures over a predetermined amount of time to determine whether the near-end signal is in a "no-echo-state" or in an "echo-state".
  • the computer program when executed causes the processor to further execute the steps of computing, for each frequency band, a first coherence measure between the far-end signal and the near-end signal, a second coherence measure between the near-end signal and the error signal, and applying the first and second coherence measures to compute the suppression factors.
  • the computer program when executed causes the processor to further execute the step of applying suppression factors to the error signal to substantially remove the residual echo from the error signal.
  • the computer program when executed causes the processor to further execute the steps of detecting filter divergence by comparing the energy of the error signal and the near-end signal and applying the suppression factors to the near-end signal based on detected filter divergence.
  • the computer program when executed causes the processor to further execute the step of accentuating valleys in the suppression factors by raising to a power.
  • the computer program when executed causes the processor to execute the step of weighting the suppression factors with a curve configured to influence less accurate bands.
  • the computer program when executed causes the processor to further execute the step of tracking a minimum suppression factor and scaling the suppression factors such that the minimum approaches a target value.
  • the computer program when executed causes the processor to further execute the step of transforming the far-end signal, the near-end signal, and the error signal to the frequency domain.
  • Fig. 1 is a block diagram of an acoustic echo canceller in accordance with an embodiment of the present invention.
  • Fig. 2 illustrates a more detailed block diagram describing the functions that may be performed in the adaptive filter of Fig. 1 in accordance with an embodiment of the present invention.
  • FIG. 3 illustrates computational stages of the adaptive filter of Fig. 2 in accordance with an embodiment of the present invention.
  • FIG. 4 illustrates a more detailed block diagram describing block G m in Fig. 3 in accordance with an embodiment of the present invention.
  • Fig. 5 illustrates a flow diagram describing computational stages of the nonlinear processor of Fig. 1 in accordance with an embodiment of the present invention.
  • Fig. 6 is a flow diagram illustrating operations performed by the acoustic echo canceller according to an embodiment of the present invention illustrated in Fig. 5.
  • Fig. 7 is a flow diagram illustrating operations performed by the acoustic echo canceller according to an embodiment of the present invention illustrated in Fig. 6.
  • FIG. 8 is a block diagram illustrating an exemplary computing device that is arranged for acoustic echo cancellation in accordance with an embodiment of the present invention.
  • Fig. 1 illustrates an acoustic echo canceller (AEC) 100 in accordance with an exemplary embodiment of the present invention.
  • the AEC 100 is designed as a high quality echo canceller for voice and audio communication over packet switched networks. More specifically, the AEC 100 is designed to cancel acoustic echo 130 that emerges due to the reflection of sound waves of a render device 10 from boundary surfaces and other objects back to a near-end capture device 20. The echo 130 may also exist due to the direct path from render device 10 to the capture device 20.
  • Render device 10 may be any of a variety of audio output devices, including a loudspeaker or group of loudspeakers configured to output sound from one or more channels.
  • Capture device 20 may be any of a variety of audio input devices, such as one or more microphones configured to capture sound and generate input signals.
  • render device 10 and capture device 20 may be hardware devices internal to a computer system, or external peripheral devices connected to a computer system via wired and/or wireless connections.
  • render device 10 and capture device 20 may be components of a single device, such as a microphone, telephone handset, etc.
  • one or both of render device 10 and capture device 20 may include analog-to-digital and/or digital-to-analog transformation functionalities.
  • The echo canceller 100 includes a linear filter 102, a nonlinear processor (NLP) 104, a far-end buffer 106, and a blocking buffer 108.
  • A far-end signal 110 generated at the far-end and transmitted to the near-end is input to the filter 102 via the far-end buffer (FEBuf) 106 and the blocking buffer 108.
  • the far-end signal 110 is also input to a play-out buffer 112 located near the render device 10.
  • The output signal 116 of the far-end buffer 106 is input to the blocking buffer 108 and the output signal 118 of the blocking buffer is input to the linear filter 102.
  • the far-end buffer 106 is configured to compensate for and synchronize to buffering at sound devices (not shown).
  • the blocking buffer 108 is configured to block the signal samples for a frequency-domain transformation to be performed by the linear filter 102 and the NLP 104.
  • the linear filter 102 is an adaptive filter.
  • Linear filter 102 operates in the frequency domain through, e.g., the Discrete Fourier Transform (DFT).
  • the DFT may be implemented as a Fast Fourier Transform (FFT).
  • The other input to the filter 102 is the near-end signal (Sin) 122 from the capture device 20 via a recording buffer 114.
  • the near-end signal 122 includes near-end speech 120 and the echo 130.
  • the NLP 104 receives three signals as input. It receives (1) the far-end signal via the far-end buffer 106 and blocking buffer 108, (2) the near-end signal via the recording buffer 114, and (3) the output signal 124 of the filter 102.
  • the output signal 124 is also referred to as an error signal. In a case when the NLP 104 attenuates the output signal 124, a comfort noise signal is generated which will be explained later.
  • each frame is divided into 64 sample blocks. Since this choice of block size does not produce an integer number of blocks per frame the signal needs to be buffered before the processing. This buffering is handled by the blocking buffer 108 as discussed above. Both the filter 102 and the NLP 104 operate in the frequency domain and utilize DFTs of 128 samples.
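The re-blocking performed by the blocking buffer can be sketched as below. The 64-sample block size comes from the text; the class name `BlockingBuffer` and the 160-sample frame length (for example 10 ms at 16 kHz) are assumptions chosen only to show a frame size that is not a multiple of the block size.

```python
class BlockingBuffer:
    """Accumulates incoming frames and releases fixed-size blocks."""

    def __init__(self, block_size=64):
        self.block_size = block_size
        self._pending = []   # samples received but not yet released as a block

    def push(self, frame):
        """Append a frame; return the list of complete blocks now available."""
        self._pending.extend(frame)
        blocks = []
        while len(self._pending) >= self.block_size:
            blocks.append(self._pending[:self.block_size])
            self._pending = self._pending[self.block_size:]
        return blocks

# Assumed 160-sample frames: 160 / 64 = 2.5, so the block count alternates.
buf = BlockingBuffer()
counts = [len(buf.push(list(range(160)))) for _ in range(4)]
```

Under these assumptions each frame yields alternately two and three blocks, which is why the samples have to be buffered before the frequency-domain stages run.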
  • The performance of the AEC 100 is influenced by the operation of the play-out buffer 112 and the recording buffer 114 at the sound device.
  • The AEC 100 may not start unless the combined size of the play-out buffer 112 and the recording buffer 114 is reasonably stable within a predetermined limit. For example, if the combined size is stable within +/- 8 ms of the first started size for four consecutive frames, the AEC 100 is started by filling up the internal far-end buffer 106.
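One possible reading of this start-up rule is sketched below. The function name `start_frame_index` is hypothetical, and the choice to restart the reference size whenever a frame falls outside the tolerance is an assumption; the text only gives the +/- 8 ms limit and the four-frame requirement.

```python
def start_frame_index(combined_sizes_ms, tolerance_ms=8, required_frames=4):
    """Return the index of the frame at which the AEC may start, or None.

    combined_sizes_ms: per-frame combined play-out + recording buffer size in ms.
    """
    reference = None   # size against which stability is measured
    stable = 0         # consecutive frames within tolerance of the reference
    for i, size in enumerate(combined_sizes_ms):
        if reference is not None and abs(size - reference) <= tolerance_ms:
            stable += 1
        else:
            reference = size   # assumption: an unstable frame becomes the new reference
            stable = 1
        if stable >= required_frames:
            return i
    return None
```

For example, the sequence 40, 42, 38, 45 ms stays within 8 ms of the first value for four frames, so start-up would be allowed at the fourth frame.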
  • FIG. 2 illustrates a more detailed block diagram describing the functions performed in the filter 102 of Fig. 1.
  • Fig. 3 illustrates computational stages of the filter 102 in accordance with an embodiment of the present invention.
  • the adaptive filter 102 includes a first transform section 200, an inverse transform section 202, a second transform section 204, and an impulse response section (H) 206.
  • the far-end signal x(n) 210 to be rendered at the render device 10 is input to the first transform section 200.
  • the output signal X(n, k) of the first transform section 200 is input to the impulse response section 206.
  • The output signal Y(n, k) of the impulse response section 206 is input to the inverse transform section 202, which outputs the signal y(n).
  • This signal y(n) is then subtracted from the near-end signal d(n) 220 captured by the capture device 20 to output an error signal e(n) 230 as the output of the linear stage of the filter 102.
  • the error signal 230 is also input to the second transform section 204 the output signal of which, E(n, k), is also input to the impulse response section 206.
  • the above-mentioned adaptive filtering approach relates to an implementation of a standard blocked time-domain Least Mean Square (LMS) algorithm.
  • the complexity reduction is due to the filtering and the correlations being performed in the frequency domain, where time-domain convolution is replaced by multiplication.
  • the error is formed in the time domain and is transformed to the frequency domain for updating the filter 102 as illustrated in Fig. 2.
  • Fig. 4 illustrates a more detailed block diagram describing block G m in the FLMS method of Fig. 3 in accordance with an embodiment of the present invention.
  • $I_N$ is an N x N identity matrix
  • $0_N$ is an N x N zero matrix. This means that the time domain vector is appended with N zeros before the Fourier transform.
  • The far-end samples x(n) 310 are blocked into vectors of 2N samples, i.e. two blocks, at step S312:
  • $\mathbf{x}(k-m) = [\,x((k-m-2)N) \;\ldots\; x((k-m)N-1)\,]^T$
  • the estimated echo signal is then obtained as the N last coefficients of the inverse transformed sum of the filter products performed at step S320 from which first block is discarded at step S322.
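The overlap-save step described above (transform two consecutive blocks, multiply by the zero-padded filter spectrum, inverse transform, keep the last N samples) can be sketched as follows. For brevity this assumes a single filter partition of N taps and uses a naive O(n^2) DFT; the function names are hypothetical and the embodiment sums several partitions.

```python
import cmath

def dft(x, inverse=False):
    """Naive DFT (or inverse DFT) of a sequence; adequate for a small illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n) for t in range(n))
           for f in range(n)]
    return [v / n for v in out] if inverse else out

def overlap_save_block(prev_block, cur_block, w):
    """Filter one block of N samples with an N-tap filter w via the frequency domain."""
    n = len(cur_block)
    X = dft(prev_block + cur_block)      # 2N-sample vector: two consecutive blocks
    W = dft(list(w) + [0.0] * n)         # filter appended with N zeros
    y = dft([a * b for a, b in zip(X, W)], inverse=True)
    return [v.real for v in y[n:]]       # discard the first block, keep the last N samples

# Compare against direct time-domain convolution on a short made-up signal.
N = 4
w = [0.5, -0.3, 0.1, 0.0]
x = [1.0, 2.0, -1.0, 0.5, 0.0, 3.0, -2.0, 1.0]
prev, y = [0.0] * N, []
for start in range(0, len(x), N):
    cur = x[start:start + N]
    y += overlap_save_block(prev, cur, w)
    prev = cur
direct = [sum(w[j] * x[n - j] for j in range(N) if n - j >= 0) for n in range(len(x))]
```

The first N output samples of each inverse transform are contaminated by circular wrap-around, which is why they are discarded; the last N match linear convolution.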
  • the estimated echo signal is represented as
  • N zeros are inserted at step S316 to the error vector, and the augmented vector is transformed at step S318 as
  • Fig. 4 illustrates a more detailed block diagram describing block G m in Fig. 3 in accordance with an embodiment of the present invention where the filter coefficient update can be expressed as
  • B(k) as shown in Fig. 4 is a modified error vector.
  • the modification includes a power normalization followed by a magnitude limiter 410.
  • the normalized error vector as also shown in Fig. 4, is
  • B (k) ⁇ ⁇ ⁇ - 1) + (1 - ⁇ ⁇ )
  • the diagonal matrix X(k-m) is conjugated by the conjugate unit 420 which is then multiplied with vector B(k) prior to performing an inverse DFT transform by the Inverse Discrete Fourier Transform (IDFT) unit 430. Then the discard last block unit 440 discards the last block. After discarding the last block, a zero block is appended by the append zero block unit 450 prior to performing a DFT by the DFT unit 460. Then, a block delay is introduced by the delay unit 480 which outputs Wm(k).
  • Fig. 5 illustrates a flow diagram describing computational processes of the NLP 104 of Fig. 1 in accordance with an embodiment of the present invention.
  • the NLP 104 of the AEC 100 accepts three signals as input: i) the far-end signal x(n) 110 to be rendered by the render device 10, ii) the near-end signal d(n) 122 captured by the capture device 20, and iii) the output error signal e(n) 124 of the linear stage performed at the filter 102.
  • the error signal e(n) 124 typically contains residual echo that should be removed for good performance.
  • the objective of the NLP 104 is to remove this residual echo.
  • the first step is to transform all three input signals to the frequency domain.
  • The far-end signal 110 is transformed to the frequency domain.
  • The near-end signal 122 is transformed to the frequency domain and, at step S501'', the error signal 124 is transformed to the frequency domain.
  • The NLP 104 is block-based and shares the block length N of the linear stage, but uses an overlap-add method rather than overlap-save: consecutive blocks are concatenated, windowed and transformed. By defining $\circ$ as the element-wise product operator, the k-th transformed block is expressed as
  • n = N, N + 1, ..., 2N to provide perfect reconstruction.
  • the length 2N DFT vectors are retained.
  • the redundant N - 1 complex coefficients are discarded.
  • $X_k$, $D_k$ and $E_k$ refer to the frequency-domain representations of the k-th far-end, near-end and error blocks, respectively.
  • echo suppression is achieved by multiplying each frequency band of the error signal e(n) 124 with a suppression factor between 0 and 1.
  • each band corresponds to an individual DFT coefficient. In general, however, each band may correspond to an arbitrary range of frequencies. Comfort noise is added and after undergoing an inverse FFT, the suppressed signal is windowed, and overlapped and added with the previous block to obtain the output.
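The windowed overlap-add processing with a per-band suppression factor can be sketched as below. The square-root Hann window is an assumption (the window definition is not recoverable from the text), comfort noise is omitted, and the full 2N-point spectrum is kept rather than discarding the redundant coefficients. Function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive DFT (or inverse DFT); adequate for a small illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n) for t in range(n))
           for f in range(n)]
    return [v / n for v in out] if inverse else out

def nlp_overlap_add(signal, n, suppress):
    """Apply suppress(band) in [0, 1] to each band; len(signal) must be a multiple of n."""
    size = 2 * n
    # Assumed sqrt-Hann window: squared windows at 50% overlap sum to one.
    win = [math.sqrt(0.5 * (1.0 - math.cos(2.0 * math.pi * i / size))) for i in range(size)]
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - size + 1, n):
        seg = [padded[start + i] * win[i] for i in range(size)]   # concatenate and window
        spec = [suppress(k) * v for k, v in enumerate(dft(seg))]  # per-band suppression
        rec = dft(spec, inverse=True)
        for i in range(size):
            out[start + i] += rec[i].real * win[i]                # window again, overlap-add
    return out[n:n + len(signal)]

signal = [math.sin(0.7 * i) for i in range(16)]
unchanged = nlp_overlap_add(signal, 4, lambda k: 1.0)
halved = nlp_overlap_add(signal, 4, lambda k: 0.5)
```

With all factors equal to one the signal is reconstructed unchanged, which is the perfect-reconstruction property the window is chosen to provide.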
  • the power spectral density (PSD) of each signal is obtained.
  • the PSD of the far-end signal x(n) 110 is computed.
  • The PSD of the near-end signal d(n) 122 is computed and, at step S503'', the PSD of the error signal e(n) 124 is computed.
  • The PSDs of the far-end signal 110, near-end signal 122, and the error signal 124 are represented by $S_x$, $S_d$, and $S_e$, respectively.
  • the complex-valued cross-PSDs between i) the far-end signal x(n) 110 and near-end signal d(n) 122, and ii) the near-end signal d(n) 122 and error signal e(n) 124 are also obtained.
  • the complex-valued cross-PSD between the far-end signal 110 and the near-end signal 122 is computed and at step S504', the complex-valued cross-PSD between the near-end signal 122 and the error signal 124 is computed.
  • The complex-valued cross-PSD of the far-end signal 110 and near-end signal 122 is represented as $S_{xd}$.
  • The complex-valued cross-PSD of the near-end signal 122 and error signal 124 is represented as $S_{de}$.
  • the PSDs are exponentially smoothed to avoid sudden erroneous shifts in echo suppression.
  • the PSDs are given by
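Exponential smoothing of the PSDs and cross-PSDs is commonly written as a first-order recursion, sketched below. The recursion form, the smoothing constant `lam`, the conjugation convention for the cross terms, and the dictionary layout are assumptions for illustration, since the formulas themselves are missing from the text.

```python
def smooth(previous, current, lam):
    """One step of exponential smoothing with forgetting factor lam in [0, 1)."""
    return lam * previous + (1.0 - lam) * current

def update_psds(psd, X, D, E, lam=0.9):
    """Update smoothed PSDs in place from the spectra of one block."""
    for k in range(len(X)):
        psd["Sx"][k] = smooth(psd["Sx"][k], abs(X[k]) ** 2, lam)
        psd["Sd"][k] = smooth(psd["Sd"][k], abs(D[k]) ** 2, lam)
        psd["Se"][k] = smooth(psd["Se"][k], abs(E[k]) ** 2, lam)
        # Cross-PSDs stay complex-valued.
        psd["Sxd"][k] = smooth(psd["Sxd"][k], X[k] * D[k].conjugate(), lam)
        psd["Sde"][k] = smooth(psd["Sde"][k], D[k] * E[k].conjugate(), lam)
    return psd

psd = {"Sx": [0.0], "Sd": [0.0], "Se": [0.0], "Sxd": [0j], "Sde": [0j]}
update_psds(psd, X=[1 + 1j], D=[2 + 0j], E=[1j])
```

A value of `lam` close to one gives heavier smoothing and slower reaction, which is the trade-off behind avoiding sudden erroneous shifts in suppression.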
  • an old block is selected to best synchronize it with the corresponding echo in the near-end at step S505.
  • In some cases the linear filter 102 diverges from a good echo path estimate. This tends to result in a highly distorted error signal which, although still useful for analysis, should not be used for output. According to an embodiment of the invention, divergence may be detected fairly easily, as it usually adds rather than removes energy from the near-end signal d(n) 122.
  • The divergence state determined at step S511 is used to select (S512) either $E_k$ or $D_k$ as follows: If
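A simple form of this selection is sketched below. The exact condition is truncated in the text, so the test used here (total smoothed error energy exceeding total near-end energy) is an assumption based on the statement that divergence adds energy; the function name is hypothetical.

```python
def select_output_spectrum(E, D, Se, Sd):
    """Return (spectrum_to_suppress, diverged).

    E, D: error and near-end spectra of the current block.
    Se, Sd: their smoothed PSDs, one value per band.
    """
    diverged = sum(Se) > sum(Sd)   # assumed test: the filter added energy
    return (D if diverged else E), diverged
```

When divergence is flagged, the suppression factors are applied to the near-end spectrum instead of the distorted error spectrum.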
  • The PSDs are used to compute the coherence measures for each frequency band between i) the far-end signal 110 and near-end signal 122 at step S513 as follows:
  • Coherence is a frequency-domain analog to time-domain correlation. It is a measure of similarity with 0 ≤ c(n) ≤ 1, where a higher coherence corresponds to more similarity.
  • $c'_{xd} = 1 - c_{xd}$.
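The per-band coherence can be illustrated with the standard magnitude-squared coherence, computed from the smoothed PSDs. The formula $|S_{ab}|^2 / (S_a S_b)$ is the textbook definition and is assumed here because the equation is missing from the text; the function name and the regularizer `eps` are also assumptions.

```python
def coherence(S_ab, S_a, S_b, eps=1e-10):
    """Magnitude-squared coherence per band, clipped to [0, 1]."""
    return [min(1.0, abs(c) ** 2 / (a * b + eps))
            for c, a, b in zip(S_ab, S_a, S_b)]

# Band 0: the two signals are fully related; band 1: unrelated.
c_xd = coherence([2 + 0j, 0j], [4.0, 1.0], [1.0, 1.0])
c_xd_complement = [1.0 - c for c in c_xd]
```

A band dominated by echo shows a far-end/near-end coherence near one, so its complement is near zero and the band is attenuated; a band with only near-end speech behaves the opposite way.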
  • the echo 130 is suppressed while allowing simultaneous near-end speech 120 to pass through.
  • the NLP 104 is configured to achieve this because the coherence is calculated independently for each frequency band. Thus, bands containing echo are fully or partially suppressed, while bands free of echo are not affected.
  • $f_s$ is the sampling frequency.
  • the preferred bands were chosen from frequency regions most likely to be accurate across a range of scenarios.
  • At step S519, the system selects either $c_{de}$ or $c_{xd}$.
  • $c_{xd}$ is tracked over time to determine the broad state of the system at step S521. The purpose of this is to avoid suppression when the echo path is close to zero (e.g. during a call with a headset).
  • A thresholded minimum of $c_{xd}$ is computed at step S519 as follows:
  • the system may contain echo and otherwise does not contain echo.
  • the echo state is provided through an interface for potential use by other audio processing components.
  • suppression is limited by selecting suppression factors as follows at step S520, S524 and S518:
  • The overdrive $\gamma$ is set at step S531 such that applying it to the minimum will result in the target suppression level:
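If the overdrive is applied as an exponent (called gamma in this sketch), the requirement that the minimum suppression factor map onto the target level gives $s_{min}^{\gamma} = s_{target}$, i.e. $\gamma = \ln(s_{target}) / \ln(s_{min})$. The function name and the input checks below are assumptions; the formula itself is missing from the text.

```python
import math

def overdrive_for_target(s_min, target):
    """Exponent gamma such that s_min ** gamma == target, for values strictly in (0, 1)."""
    if not (0.0 < s_min < 1.0 and 0.0 < target < 1.0):
        raise ValueError("s_min and target must lie strictly between 0 and 1")
    return math.log(target) / math.log(s_min)

gamma = overdrive_for_target(0.5, 0.125)
```

Here a tracked minimum of 0.5 and a target of 0.125 give an exponent of about 3, since 0.5 cubed is 0.125.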
  • The $s_h$ level is computed at step S533.
  • The final suppression factors $s_\gamma$ are produced according to the following algorithm.
  • s is first weighted towards $s_h$ according to a weighting vector $v_{SN}$ with components $0 \le v_{SN}(n) \le 1$: $s(n) \leftarrow v_{SN}(n)\, s_h + (1 - v_{SN}(n))\, s(n)$ if $s(n) > s_h$
  • the weighting is selected to influence typically less accurate bands more heavily.
  • $v_{TN}$ is another weighting vector fulfilling a similar purpose as $v_{SN}$. Overdriving through raising to a power serves to accentuate valleys in $s_\gamma$.
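The two steps above (weighting toward the $s_h$ level, then overdriving by raising to a power) can be sketched as follows. The convex-combination form of the weighting and the use of the second weighting vector as a per-band scale on the exponent are assumptions, as are the function and parameter names.

```python
def final_suppression(s, s_h, v_sn, v_tn, gamma):
    """Return overdriven suppression factors, one per band.

    s: raw suppression factors in [0, 1]; s_h: level the factors are pulled toward;
    v_sn, v_tn: per-band weights in [0, 1]; gamma: overdrive exponent.
    """
    out = []
    for s_n, a, b in zip(s, v_sn, v_tn):
        if s_n > s_h:
            s_n = a * s_h + (1.0 - a) * s_n   # bands with larger weight move closer to s_h
        out.append(s_n ** (gamma * b))         # assumed: exponent scaled per band by v_tn
    return out

factors = final_suppression([0.9, 0.2], 0.5, [0.5, 0.5], [1.0, 1.0], 2.0)
```

Because the factors lie between 0 and 1, an exponent above one pushes small values further toward zero while leaving values near one almost unchanged, which is what accentuates the valleys.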
  • $Y_k = s_\gamma \circ E_k + N_k$, where $N_k$ is artificial noise, and at step S537 an inverse transform is performed to obtain the output signal y(n).
  • The suppression removes near-end noise as well as echo, resulting in an audible change in the noise level. This issue is mitigated by adding generated "comfort noise" to replace the lost noise.
  • The generation of $N_k$ will be discussed in a later section below.
  • White noise may be produced by generating a random complex vector, u ⁇ , on the unit circle. This is shaped to match No k and weighted by the suppression levels to give the following comfort noise:
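Comfort-noise generation along these lines can be sketched as below. The unit-circle random vector and the shaping by a noise estimate follow the text; the specific weight $\sqrt{1 - s^2}$, which adds more noise where suppression is stronger, is an assumption because the formula is missing, and the names are hypothetical.

```python
import cmath
import math
import random

def comfort_noise(noise_mag, s, rng):
    """Return one complex comfort-noise value per band.

    noise_mag: estimated background-noise magnitude per band.
    s: suppression factors in [0, 1] applied to the same bands.
    """
    out = []
    for mag, s_k in zip(noise_mag, s):
        u = cmath.exp(2j * math.pi * rng.random())         # random point on the unit circle
        weight = math.sqrt(max(0.0, 1.0 - s_k * s_k))       # assumed weighting by suppression level
        out.append(mag * weight * u)
    return out

noise = comfort_noise([2.0, 2.0, 2.0], [1.0, 0.0, 0.6], random.Random(1))
```

An unsuppressed band (factor 1) receives no added noise, while a fully suppressed band (factor 0) receives noise at the estimated background level.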
  • Fig. 6 shows a flow diagram illustrating operations performed by the acoustic echo canceller 100 according to the exemplary aspect of the present invention. More specifically, according to an embodiment of the invention, Fig. 6 further describes the algorithms on how echo state and suppression factors are determined in the NLP 104 of the AEC 100 as described above with respect to Fig. 5.
  • Both the coherence $c_{xd}$ between the far-end signal 110 and near-end signal 122 and the coherence $c_{de}$ between the near-end signal 122 and error signal 124 are tracked over time to determine the state of the AEC 100. Based on the determination of a high or a low coherence, the NLP 104 decides whether to enter or leave the coherent state.
  • Coherence is a frequency-domain analog to time-domain correlation. More specifically, as mentioned above with reference to Fig. 5, coherence is a measure of similarity with 0 ≤ c(n) ≤ 1, where a higher coherence corresponds to more similarity.
  • At step S613, if the NLP 104 determines that the AEC 100 is not in the coherent state, the following suppression factor s is output by the NLP 104 at step S621:
  • the suppression factors may then be applied by the NLP 104 to the error signal 124 to substantially remove residual echo from the error signal 124.
  • Fig. 7 is a flow diagram illustrating operations performed by the AEC 100 according to an embodiment of the present invention illustrated in Fig. 1. More specifically, according to an embodiment of the invention, Fig. 7 further describes the algorithms on how to remove residual echo from the error signal 124 by utilizing the echo state information and suppression factors determined in the NLP 104 of the AEC 100 as described above with respect to Figs. 5 and 6.
  • The NLP 104 receives as input the far-end signal 110 to be rendered, the near-end captured signal 122, and the error signal 124 containing a residual echo output from the linear adaptive filter 102.
  • The far-end signal 110, the near-end signal 122, and the error signal 124 are transformed into the frequency domain by the corresponding transform sections as described above with reference to Figs. 2-5.
  • A first coherence measure is computed between the far-end signal 110 and the near-end signal 122 according to the algorithm as described above with reference to Fig. 5.
  • a second coherence measure is computed between the near-end signal 122 and the error signal 124 according to the algorithm as described above with reference to Fig. 5.
  • suppression factors are derived corresponding to each band of frequencies.
  • the suppression factors are applied to the error signal 124 or to the near-end signal 122 to substantially remove echo from the error signal 124 or the near-end signal 122.
  • Fig. 8 is a block diagram illustrating an example computing device 800 that may be utilized to implement the AEC 100 including, but not limited to, the NLP 104, the filter 102, the far-end buffer 106, and the blocking buffer 108 as well as the processes illustrated in Figs. 3 and 5-7 in accordance with the present disclosure.
  • computing device 800 typically includes one or more processors 810 and system memory 820.
  • a memory bus 830 can be used for communicating between the processor 810 and the system memory 820.
  • Processor 810 can be of any type including but not limited to a microprocessor (µP), a microcontroller (µC), a digital signal processor (DSP), or any combination thereof.
  • Processor 810 can include one or more levels of caching, such as a level one cache 811 and a level two cache 812, a processor core 813, and registers 814.
  • the processor core 813 can include an arithmetic logic unit (ALU), a floating point unit (FPU), a digital signal processing core (DSP Core), or any combination thereof.
  • a memory controller 815 can also be used with the processor 810, or in some implementations the memory controller 815 can be an internal part of the processor 810.
  • system memory 820 can be of any type including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.) or any combination thereof.
  • System memory 820 typically includes an operating system 821, one or more applications 822, and program data 824.
  • Application 822 includes an echo cancellation processing algorithm 823 that is arranged to remove residual echo from an error signal.
  • Program Data 824 includes echo cancellation routing data 825 that is useful for removing residual echo from an error signal, as will be further described below.
  • application 822 can be arranged to operate with program data 824 on an operating system 821 such that residual echo from an error signal is removed. This described basic configuration is illustrated in Fig. 8 by those components within dashed line 801.
  • Computing device 800 can have additional features or functionality, and additional interfaces to facilitate communications between the basic configuration 801 and any required devices and interfaces.
  • a bus/interface controller 840 can be used to facilitate communications between the basic configuration 801 and one or more data storage devices 850 via a storage interface bus 841.
  • the data storage devices 850 can be removable storage devices 851, non-removable storage devices 852, or a combination thereof.
  • Examples of removable storage and non-removable storage devices include magnetic disk devices such as flexible disk drives and hard-disk drives (HDD), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSD), and tape drives to name a few.
  • Example computer storage media can include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • System memory 820, removable storage 851 and non-removable storage 852 are all examples of computer storage media.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 800. Any such computer storage media can be part of device 800.
  • Computing device 800 can also include an interface bus 842 for facilitating communication from various interface devices (e.g., output interfaces, peripheral interfaces, and communication interfaces) to the basic configuration 801 via the bus/interface controller 840.
  • Example output devices 860 include a graphics processing unit 861 and an audio processing unit 862, which can be configured to communicate to various external devices such as a display or speakers via one or more A/V ports 863.
  • Example peripheral interfaces 870 include a serial interface controller 871 or a parallel interface controller 872, which can be configured to communicate with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device, etc.) or other peripheral devices (e.g., printer, scanner, etc.) via one or more I/O ports 873.
  • An example communication device 880 includes a network controller 881, which can be arranged to facilitate communications with one or more other computing devices 890 over a network communication via one or more communication ports 882.
  • the communication connection is one example of communication media.
  • Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media.
  • a "modulated data signal" can be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media can include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared (IR) and other wireless media.
  • the term computer readable media as used herein can include both storage media and communication media.
  • Computing device 800 can be implemented as a portion of a small-form factor portable (or mobile) electronic device such as a cell phone, a personal data assistant (PDA), a personal media player device, a wireless web-watch device, a personal headset device, an application specific device, or a hybrid device that includes any of the above functions.
  • Computing device 800 can also be implemented as a personal computer including both laptop computer and non-laptop computer configurations.
  • Examples of a signal bearing medium include, but are not limited to, the following: a recordable type medium such as a floppy disk, a hard disk drive, a Compact Disc (CD), a Digital Video Disk (DVD), a digital tape, a computer memory, etc.; and a transmission type medium such as a digital and/or an analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communications link, a wireless communication link, etc.).
  • a typical data processing system generally includes one or more of a system unit housing, a video display device, a memory such as volatile and non-volatile memory, processors such as microprocessors and digital signal processors, computational entities such as operating systems, drivers, graphical user interfaces, and applications programs, one or more interaction devices, such as a touch pad or screen, and/or control systems including feedback loops and control motors (e.g., feedback for sensing position and/or velocity; control motors for moving and/or adjusting components and/or quantities).
  • a typical data processing system may be implemented utilizing any suitable commercially available components, such as those typically found in data computing/communication and/or network computing/communication systems.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Filters That Use Time-Delay Elements (AREA)

Abstract

A method and system for non-linear post-processing of an audio signal for acoustic echo cancellation are disclosed. The system includes a non-linear processor (NLP) (104) that receives, as input, at least two of the following signals: a far-end signal to be rendered and a plurality of capture-side signals. The NLP (104) first computes, for each frequency band, one or more coherence measures between the received signals, and determines suppression factors corresponding to each band based on those coherence measures. The NLP (104) also applies the suppression factors to one of the capture-side signals to substantially remove echo from it.
EP11721215.9A 2011-05-17 2011-05-17 Non-linear post-processing for acoustic echo cancellation Withdrawn EP2710787A1 (fr)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2011/036856 WO2012158163A1 (fr) 2011-05-17 2011-05-17 Non-linear post-processing for acoustic echo cancellation

Publications (1)

Publication Number Publication Date
EP2710787A1 (fr) 2014-03-26

Family

ID=44209915

Family Applications (1)

Application Number Title Priority Date Filing Date
EP11721215.9A Withdrawn EP2710787A1 (fr) 2011-05-17 2011-05-17 Non-linear post-processing for acoustic echo cancellation

Country Status (3)

Country Link
EP (1) EP2710787A1 (fr)
CN (1) CN103718538B (fr)
WO (1) WO2012158163A1 (fr)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5982069B2 (ja) * 2013-03-19 2016-08-31 Koninklijke Philips N.V. Method and apparatus for audio processing
CN105794190B (zh) * 2013-12-12 2019-09-20 皇家飞利浦有限公司 Audio echo suppressor and audio echo suppression method
GB2515593B (en) 2013-12-23 2015-12-23 Imagination Tech Ltd Acoustic echo suppression
CN104994249B (zh) * 2015-05-19 2017-03-15 百度在线网络技术(北京)有限公司 Acoustic echo cancellation method and device
CN105304077A (zh) * 2015-09-22 2016-02-03 广东欧珀移动通信有限公司 Sound wave processing method and device
US9870763B1 (en) * 2016-11-23 2018-01-16 Harman International Industries, Incorporated Coherence based dynamic stability control system
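CN108172233B (zh) * 2017-12-12 2019-08-13 天格科技(杭州)有限公司 Echo cancellation method based on regression factors of the far-end estimated signal and the error signal
CN108390663B (zh) * 2018-03-09 2021-07-02 电信科学技术研究院有限公司 Method and device for updating a finite impulse response filter coefficient vector
CN108831497B (zh) * 2018-05-22 2020-06-09 出门问问信息科技有限公司 Echo compression method and device, storage medium, and electronic device
CN112292844B (zh) * 2019-05-22 2022-04-15 深圳市汇顶科技股份有限公司 Double-talk detection method, double-talk detection device, and echo cancellation system
CN110335618B (zh) * 2019-06-06 2021-07-30 福建星网智慧软件有限公司 Method and computer device for improving non-linear echo suppression
CN112929506B (zh) * 2019-12-06 2023-10-17 阿里巴巴集团控股有限公司 Audio signal processing method and device, computer storage medium, and electronic device
CN111048118B (zh) * 2019-12-24 2022-07-26 大众问问(北京)信息科技有限公司 Speech signal processing method, device, and terminal
CN110992975B (zh) * 2019-12-24 2022-07-12 大众问问(北京)信息科技有限公司 Speech signal processing method, device, and terminal
CN111048096B (zh) * 2019-12-24 2022-07-26 大众问问(北京)信息科技有限公司 Speech signal processing method, device, and terminal
CN111341336B (zh) * 2020-03-16 2023-08-08 北京字节跳动网络技术有限公司 Echo cancellation method, device, terminal equipment, and medium
KR20210125846A (ko) 2020-04-09 2021-10-19 삼성전자주식회사 Speech processing apparatus and method using a plurality of microphones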

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5587998A (en) * 1995-03-03 1996-12-24 At&T Method and apparatus for reducing residual far-end echo in voice communication networks
FI106489B (fi) * 1996-06-19 2001-02-15 Nokia Networks Oy Echo suppressor and non-linear processor of an echo canceller
SG71035A1 (en) * 1997-08-01 2000-03-21 Bitwave Pte Ltd Acoustic echo canceller
US6658107B1 (en) * 1998-10-23 2003-12-02 Telefonaktiebolaget Lm Ericsson (Publ) Methods and apparatus for providing echo suppression using frequency domain nonlinear processing
US7006458B1 (en) * 2000-08-16 2006-02-28 3Com Corporation Echo canceller disabler for modulated data signals
US7433463B2 (en) * 2004-08-10 2008-10-07 Clarity Technologies, Inc. Echo cancellation and noise reduction method
US8036879B2 (en) * 2007-05-07 2011-10-11 Qnx Software Systems Co. Fast acoustic cancellation
JP5347794B2 (ja) * 2009-07-21 2013-11-20 ヤマハ株式会社 Echo suppression method and apparatus
CN101719969B (zh) * 2009-11-26 2013-10-02 美商威睿电通公司 Method and system for determining double-talk, and method and system for echo cancellation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
None *
See also references of WO2012158163A1 *

Also Published As

Publication number Publication date
WO2012158163A1 (fr) 2012-11-22
CN103718538A (zh) 2014-04-09
CN103718538B (zh) 2015-12-16

Similar Documents

Publication Publication Date Title
EP2710787A1 (fr) Non-linear post-processing for acoustic echo cancellation
WO2012158164A1 (fr) Use of echo cancellation information to limit gain control adaptation
JP4975073B2 (ja) Digital adaptive filter and acoustic echo canceller using the same
JP5450567B2 (ja) Method and system for clear signal acquisition
US8488776B2 (en) Echo suppressing method and apparatus
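KR101250124B1 (ko) Apparatus and method for computing control information for an echo suppression filter, and apparatus and method for computing a delay value
JP4702371B2 (ja) Echo suppression method and apparatus
JP5284475B2 (ja) Method for determining updated filter coefficients of an adaptive filter adapted by an LMS algorithm with pre-whitening
JP2014502074A (ja) Echo suppression including modeling of late reverberation components
WO2012158168A1 (fr) Method and system for clock drift compensation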
EP2716023B1 (fr) Commande de taille de pas d'adaptation et de gain de suppression dans la régulation d'écho acoustique
US7003095B2 (en) Acoustic echo canceler and handsfree telephone set
CN111355855B (zh) Echo processing method, apparatus, device, and storage medium
US20050008143A1 (en) Echo canceller having spectral echo tail estimator
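JP5057109B2 (ja) Echo canceller device
EP2710789A1 (fr) Non-linear post-processing for super-wideband acoustic echo cancellation
JP6143702B2 (ja) Echo cancellation device, method, and program
KR20220157475A (ko) Residual echo suppression
KR100431965B1 (ko) Acoustic echo cancellation apparatus using a time-varying adaptive algorithm, and method thereof
KR102649227B1 (ko) Dual-microphone array echo cancellation method, device, and electronic equipment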
US7917562B2 (en) Method and system for estimating and applying a step size value for LMS echo cancellers
Nguyen Ngoc et al. Implementation of the LMS and NLMS algorithms for Acoustic Echo Cancellation in teleconference system using MATLAB

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20131116

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAX Request for extension of the european patent (deleted)
RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: GOOGLE LLC

17Q First examination report despatched

Effective date: 20180316

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20180927

P01 Opt-out of the competence of the unified patent court (upc) registered

Effective date: 20230519