CN102065190A - Method and device for eliminating echo - Google Patents


Publication number
CN102065190A
Authority
CN
China
Prior art keywords
threshold
sub
band
cross
adaptive filter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010106181360A
Other languages
Chinese (zh)
Other versions
CN102065190B (en
Inventor
封伶刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd
Original Assignee
Hangzhou H3C Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou H3C Technologies Co Ltd filed Critical Hangzhou H3C Technologies Co Ltd
Priority to CN 201010618136 priority Critical patent/CN102065190B/en
Publication of CN102065190A publication Critical patent/CN102065190A/en
Application granted granted Critical
Publication of CN102065190B publication Critical patent/CN102065190B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Abstract

The invention discloses a method and a device for eliminating echo. The method comprises the steps of: determining the state of each sub-band adaptive filter according to the cross-correlation coefficient between the near-end input signal and the echo estimation signal of each sub-band; when a sub-band adaptive filter is in a single-ended speaking state, replacing the residual signal of that sub-band adaptive filter with the comfort noise of the sub-band and outputting the replaced signal; and when a sub-band adaptive filter is in a double-ended speaking state, outputting the residual signal of the sub-band adaptive filter. The invention makes full use of the characteristics of sub-band signal processing, so that residual echo is suppressed more effectively. In addition, the suppression of residual echo during double-ended speech and the protection of the local speaker's voice are both strengthened, which improves the overall effect and fluency of the system.

Description

Echo cancellation method and device
Technical Field
The present invention relates to the field of communications technologies, and in particular, to an echo cancellation method and an echo cancellation device.
Background
In a voice communication system, after a far-end input signal reaches a local signal receiving device (e.g., a telephone), it passes through the device's loudspeaker, the room, and so on before reaching the receiver, and echo is often generated along the way by sound reflections in the loudspeaker enclosure and the room. To remove it, echo cancellation techniques detect the echo signal and cancel it from the speech signal. Echo cancellation has long been a very challenging task, mainly for the following reasons:
(1) Acoustic echo enters the microphone both directly and after one or more reflections, in superimposed form. The echo therefore has a long tail, and the impulse response of the corresponding echo channel is long, typically several hundred milliseconds.
(2) The spectrum of a speech signal is non-flat, and conventional adaptive algorithms depend on the statistical properties of the input signal. The eigenvalue spread of the autocorrelation matrix of the speech signal slows the convergence of adaptive algorithms such as NLMS (Normalized Least Mean Square).
(3) The acoustic echo channel is non-stationary: its impulse response changes considerably when the speaker, other people, or objects in the room move. These rapid changes require AEC (Acoustic Echo Cancellation) to converge as quickly as possible and to track changes well.
(4) In a real system, audio capture and playback equipment is nonlinear, and the nonlinear echo it produces cannot be eliminated by an adaptive filter. In addition, because of environmental noise and similar influences, the converged coefficients of the adaptive filter may not match the actual room impulse response exactly, which leaves a linear residual echo. Both the nonlinear echo and the linear residual echo require a nonlinear post-processing module after the adaptive filter, so that the residual echo is suppressed and the overall effect of the echo cancellation system is improved.
FIG. 1 is a schematic diagram of an echo cancellation post-processing algorithm, in which the symbols are as follows:
x: the far-end input signal;
y: the actual echo signal formed when x passes through the room;
v: the local speaker's voice and background noise;
d: the near-end input signal of the echo canceller;
$\hat{y}$: the estimated echo obtained through the adaptive filter operation;
e: the residual signal output by the filter;
e_post: the output signal after the post-processing algorithm;
h: the actual room impulse response;
$\hat{h}$: the adaptive filter coefficients, i.e. the estimate of h.
As shown in FIG. 1, the post-processing algorithm builds on the adaptive filter and adds a VAD (Voice Activity Detector) module and a CNG (Comfort Noise Generation) module. The VAD module detects whether the signal received by the near-end microphone contains speech; in this system the microphone signal comprises the echo of the far-end signal after it passes through the room, the voice of the near-end speaker, near-end background noise, and so on. When VAD detection determines that nobody is speaking at either the far end or the near end, that is, when the microphone signal contains only background noise, the spectral characteristics of the background noise (LPC (Linear Prediction Coefficient) coefficients and energy gain) are estimated. Since background noise generally changes slowly, the spectral characteristics estimated by the VAD are passed to the CNG module periodically. Based on these characteristics, the CNG module generates a segment of white-noise excitation and passes it through a prediction error filter formed from the LPC coefficients and the energy gain to produce comfort noise. When the echo cancellation system detects a single-ended speaking state through the double-talk detection module, the NLP (Non-Linear Post-processing) module replaces the residual echo signal output by the adaptive filter with the comfort noise generated by the CNG module, so that the far end does not hear residual echo that was not completely cancelled. During double-ended speech (i.e., when the near end is also talking), the NLP module passes the adaptive filter output, which contains the near-end speaker's voice, directly to the far end.
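The comfort-noise path described above can be illustrated with a short sketch. It assumes an all-pole synthesis convention, s[n] = g·w[n] − Σ a_k·s[n−k], which is one common way of shaping white noise with LPC coefficients; the function name `comfort_noise`, the sign convention, and the seeded generator are illustrative assumptions and are not taken from the patent.

```python
import random


def comfort_noise(lpc, gain, num_samples, seed=0):
    """Shape white noise with an all-pole filter built from LPC coefficients.

    Assumed convention: s[n] = gain * w[n] - sum_k lpc[k-1] * s[n-k].
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out = []
    for _ in range(num_samples):
        excitation = rng.gauss(0.0, 1.0)  # white-noise excitation sample
        # Feedback from previously generated samples (skipped until enough history exists).
        feedback = sum(a * out[-k] for k, a in enumerate(lpc, start=1) if k <= len(out))
        out.append(gain * excitation - feedback)
    return out
```

In a real system the coefficients and gain would come from the VAD's background-noise estimate; here they are simply passed in by the caller.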
At present, adaptive filter algorithms are either full-band based or sub-band based. Compared with a sub-band based post-processing algorithm, a full-band based post-processing algorithm has the following specific shortcomings:
(1) Sub-band based adaptive filtering is widely adopted because it converges quickly and has low computational complexity, but a full-band based post-processing algorithm is difficult to extend to sub-bands;
(2) During double-ended speech, the signal output by the adaptive filter contains both the near-end speaker's voice and residual echo; a full-band based post-processing algorithm cannot process this residual echo, so the far end hears it;
(3) In some practical environments, severe device nonlinearity or strong environmental noise prevents the adaptive filter from converging well, and the residual echo is pronounced. When near-end speech with an amplitude comparable to that of the echo occurs, the nonlinear processing algorithm has difficulty distinguishing single-ended from double-ended speech, and a misjudgment either leaves the residual echo poorly suppressed during single-ended speech or causes severe speech clipping during double-ended speech.
Sub-band based adaptive filtering is widely used in echo cancellation because of its fast convergence and low computational complexity. However, because nonlinear echo and linear residual echo remain, a nonlinear post-processing algorithm is required to further process the signal output by the sub-band adaptive filter and suppress the residual echo. In addition, when only the far end is speaking, comfort noise matching the spectrum of the near-end background noise is inserted after the residual echo is suppressed, which alleviates the background-noise interruptions caused by over-suppression of the residual echo during single-ended speech.
In the process of implementing the invention, the inventor finds that the prior art has at least the following problems:
In a traditional system, the output signals of the sub-band adaptive filters are first synthesized into a full-band signal and then processed by a full-band based post-processing algorithm. Such a post-processing algorithm does not fully exploit the advantages of sub-bands to further improve the post-processing effect.
Disclosure of Invention
The invention aims to provide an echo cancellation method and a device thereof for implementing sub-band based echo cancellation. To this end, the invention adopts the following technical scheme:
an echo cancellation method, comprising the steps of:
determining the state of each sub-band adaptive filter according to the cross-correlation coefficient of the near-end input signal and the echo estimation signal of each sub-band;
when the sub-band adaptive filter is in a single-end speaking state, replacing a residual signal of the sub-band adaptive filter by comfort noise of the sub-band and then outputting a replaced signal;
when the sub-band adaptive filter is in a double-talk state, a residual signal of the sub-band adaptive filter is output.
In the above method, determining the state of the subband adaptive filter according to the cross-correlation coefficient between the near-end input signal of the subband and the echo estimation signal includes:
when the cross correlation coefficient of the sub-band is larger than or equal to a first threshold value, determining that the sub-band adaptive filter is in a single-ended speaking state;
when the cross correlation coefficient of the sub-band is smaller than or equal to a second threshold value, determining that the sub-band adaptive filter is in a double-talk state;
for the sub-band with the cross correlation coefficient between the second threshold value and the first threshold value, determining the state of the sub-band adaptive filter according to the number or the occupied proportion of the sub-band adaptive filter in the specified state or the average value of the cross correlation coefficient of each sub-band adaptive filter;
wherein 0 < the second threshold < the first threshold < 1.
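The two-threshold rule above leaves a middle region undecided, which a small sketch makes concrete. The function name `classify_subband`, the state labels, and the default thresholds 0.7 and 0.3 are hypothetical values chosen for illustration; the text only requires 0 < second threshold < first threshold < 1.

```python
def classify_subband(eta, first_threshold=0.7, second_threshold=0.3):
    """Map a sub-band cross-correlation coefficient to a filter state.

    Returns "single" (single-ended speech), "double" (double-ended speech),
    or "undetermined" for coefficients strictly between the two thresholds.
    """
    if not 0.0 < second_threshold < first_threshold < 1.0:
        raise ValueError("require 0 < second_threshold < first_threshold < 1")
    if eta >= first_threshold:
        return "single"
    if eta <= second_threshold:
        return "double"
    return "undetermined"
```

Sub-bands returned as "undetermined" are the ones that the count-based, proportion-based, or average-based rules then resolve.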
In the method, for the subband with the cross-correlation coefficient between the second threshold and the first threshold, determining the state of the subband adaptive filter according to the number of the subband adaptive filters in the specified state, specifically:
if the number of the sub-band adaptive filters in the single-ended speaking state exceeds a set threshold, the adaptive filters of the sub-bands with the cross-correlation coefficients between the second threshold and the first threshold are in the single-ended speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or,
if the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, the adaptive filters of all sub-bands with the cross correlation coefficients between a second threshold and a first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in the single-end speaking state.
In the method, for the sub-band with the cross-correlation coefficient between the second threshold and the first threshold, the state of the sub-band adaptive filter is determined according to the proportion of the sub-band adaptive filter in the specified state, specifically:
if the proportion of the number of the sub-band adaptive filters in the single-ended speaking state exceeds a set threshold, the adaptive filters of the sub-bands with the cross-correlation coefficients between a second threshold and a first threshold are in the single-ended speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or,
if the proportion of the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, the adaptive filters of the sub-bands with the cross correlation coefficients between a second threshold and a first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in the single-end speaking state.
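The count-based and proportion-based rules can be sketched together, here in the variant that counts sub-bands in the single-ended state. The helper `resolve_by_vote`, its parameter names, and the choice to compute the proportion over all sub-bands are assumptions made for this example.

```python
def resolve_by_vote(states, min_count=None, min_ratio=None):
    """Assign one common state to every "undetermined" sub-band.

    states: list of "single", "double" or "undetermined", one per sub-band.
    Exactly one of min_count (a number of sub-bands) or min_ratio
    (a fraction of all sub-bands) should be given as the set threshold.
    """
    if not states:
        return []
    singles = states.count("single")
    if min_count is not None:
        single_wins = singles > min_count
    elif min_ratio is not None:
        single_wins = singles / len(states) > min_ratio
    else:
        raise ValueError("give either min_count or min_ratio")
    fallback = "single" if single_wins else "double"
    # Sub-bands already decided by the two thresholds keep their state.
    return [fallback if s == "undetermined" else s for s in states]
```

The mirrored variant, which counts sub-bands in the double-ended state, follows by swapping the two labels.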
In the method, for the sub-band with the cross-correlation coefficient between the second threshold and the first threshold, the state of the sub-band adaptive filter is determined according to the average value of the cross-correlation coefficient of each sub-band adaptive filter, which specifically includes:
if the average value of the cross correlation coefficients of all the sub-bands is larger than a third threshold value, the self-adaptive filter of all the sub-bands with the cross correlation coefficients between the second threshold value and the first threshold value is in a single-ended speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is in a double-end speaking state;
wherein 0 < the second threshold < the third threshold < the first threshold < 1.
In the method, for the sub-band with the cross-correlation coefficient between the second threshold and the first threshold, the state of the sub-band adaptive filter is determined according to the average value of the cross-correlation coefficient of each sub-band adaptive filter, which specifically includes:
if the weighted average value of the cross-correlation coefficient of each sub-band is greater than a third threshold value, the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold value and the first threshold value is in a single-ended speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is in a double-end speaking state;
wherein 0 < the second threshold < the third threshold < the first threshold < 1.
In the above method, the weight value corresponding to a sub-band whose cross-correlation coefficient is greater than the first threshold and the weight value corresponding to a sub-band whose cross-correlation coefficient is less than the second threshold are both greater than the weight value of a sub-band whose cross-correlation coefficient is between the second threshold and the first threshold.
In the above method, the weight values of the sub-bands are:
$$\lambda_i = \frac{\sigma_{d_i}^{2}}{\sum_{j=1}^{N} \sigma_{d_j}^{2}}$$
wherein N is the number of sub-bands, and $\sigma_{d_i}^{2}$ is the energy of the sub-band near-end input signal of the echo canceller.
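As a rough illustration of the weighted-average rule with the energy-based weights above, the following sketch computes the weights, forms the weighted mean of the coefficients, and compares it with the third threshold. The function names and the default thresholds (0.7, 0.3, 0.5) are hypothetical; only the ordering 0 < second < third < first < 1 comes from the text.

```python
def energy_weights(energies):
    """Weight of each sub-band: its near-end energy divided by the total energy."""
    total = sum(energies)
    if total <= 0.0:
        raise ValueError("total sub-band energy must be positive")
    return [e / total for e in energies]


def resolve_by_weighted_mean(etas, energies, first=0.7, second=0.3, third=0.5):
    """Decide undetermined sub-bands from the weighted mean of all coefficients."""
    weights = energy_weights(energies)
    mean_eta = sum(w * eta for w, eta in zip(weights, etas))
    fallback = "single" if mean_eta > third else "double"
    states = []
    for eta in etas:
        if eta >= first:
            states.append("single")
        elif eta <= second:
            states.append("double")
        else:
            states.append(fallback)  # critical sub-band follows the weighted mean
    return states
```

Passing equal energies reduces this to the plain average rule.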
An echo cancellation device, comprising:
the state determining module is used for determining the state of each sub-band self-adaptive filter according to the cross-correlation coefficient of the near-end input signal and the echo estimation signal of each sub-band;
the output module is used for replacing the residual signal of the subband self-adaptive filter with the comfortable noise of the subband when the subband self-adaptive filter is in a single-ended speaking state and then outputting the replaced signal; when the sub-band adaptive filter is in a double-talk state, a residual signal of the sub-band adaptive filter is output.
In the above apparatus, the state determining module is specifically configured to determine that the subband adaptive filter is in a single-ended speaking state when the cross-correlation coefficient of the subband is greater than or equal to a first threshold; when the cross correlation coefficient of the sub-band is smaller than or equal to a second threshold value, determining that the sub-band adaptive filter is in a double-talk state; for the sub-band with the cross correlation coefficient between the second threshold value and the first threshold value, determining the state of the sub-band adaptive filter according to the number or the occupied proportion of the sub-band adaptive filter in the specified state or the average value of the cross correlation coefficient of each sub-band adaptive filter; wherein 0 < the second threshold < the first threshold < 1.
In the above apparatus, the state determining module is specifically configured to, when determining the state of the subband adaptive filter according to the number of subband adaptive filters in a specified state for subbands with cross-correlation coefficients between a second threshold and a first threshold, determine that the adaptive filter of each subband with cross-correlation coefficients between the second threshold and the first threshold is in a single-ended speaking state if the number of subband adaptive filters in the single-ended speaking state exceeds a set threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or if the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, judging that the adaptive filters of all sub-bands with the cross correlation coefficients between a second threshold and a first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is judged to be in a single-end speaking state.
In the above apparatus, the state determining module is specifically configured to, when determining the state of the subband adaptive filter according to the proportion of the subband adaptive filter in the specified state for the subband having the cross-correlation coefficient between the second threshold and the first threshold, determine that the adaptive filter of each subband having the cross-correlation coefficient between the second threshold and the first threshold is in the single-ended speaking state if the proportion of the number of the subband adaptive filters in the single-ended speaking state exceeds a set threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or if the proportion of the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, judging that the adaptive filters of the sub-bands with the cross correlation coefficients between the second threshold and the first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is judged to be in a single-end speaking state.
In the above apparatus, the state determining module is specifically configured to, when determining the state of the subband adaptive filter according to the average value of the cross-correlation coefficient of each subband adaptive filter for subbands having cross-correlation coefficients between the second threshold and the first threshold, determine that the adaptive filter of each subband having cross-correlation coefficients between the second threshold and the first threshold is in a single-ended speaking state if the average value of the cross-correlation coefficient of each subband is greater than a third threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; wherein 0 < the second threshold < the third threshold < the first threshold < 1.
In the above apparatus, the state determining module is specifically configured to, when determining the state of the subband adaptive filter according to an average value of the cross-correlation coefficients of the subband adaptive filters, for a subband having a cross-correlation coefficient between a second threshold and a first threshold, determine that the adaptive filter of each subband having a cross-correlation coefficient between the second threshold and the first threshold is in a single-ended speaking state if a weighted average value of the cross-correlation coefficients of each subband is greater than a third threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; wherein 0 < the second threshold < the third threshold < the first threshold < 1.
In the above apparatus, the weight value used by the state determining module for a sub-band whose cross-correlation coefficient is greater than the first threshold and the weight value for a sub-band whose cross-correlation coefficient is less than the second threshold are both greater than the weight value of a sub-band whose cross-correlation coefficient is between the second threshold and the first threshold.
The beneficial technical effects of the invention comprise:
The state of the adaptive filter of each sub-band is determined according to the cross-correlation coefficient between the near-end input signal and the echo estimation signal of that sub-band, and different processing is applied in different states; that is, echo cancellation processing is carried out only in the single-ended speaking state. The characteristics of sub-band signal processing are thereby fully utilized, residual echo is suppressed more effectively, the suppression of residual echo during double-ended speech and the protection of the local speaker's voice are strengthened, and the overall effect and fluency of the system are improved.
Drawings
FIG. 1 is a diagram illustrating a prior art echo cancellation post-processing algorithm;
FIG. 2 is a schematic diagram of an echo cancellation process according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another echo cancellation process according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another echo cancellation process according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an echo cancellation device according to an embodiment of the present invention.
Detailed Description
In an echo cancellation system, the cross-correlation coefficient $\eta$ between the near-end input signal $d$ and the estimated echo signal $\hat{y}$ is commonly used to characterize the degree of convergence of the adaptive filter. When $\eta$ is close to 1, the adaptive filter is considered to have converged well, i.e. the estimated echo $\hat{y}$ closely approximates the input signal $d$; when $\eta$ is close to 0, the filter is considered either to have converged poorly or to be in a double-talk state.
$\eta$ may be expressed using the following formula:
$$\eta = \frac{E[d(n)\,\hat{y}(n)]}{\sqrt{\sigma_{d}^{2}\,\sigma_{\hat{y}}^{2}}} \tag{1}$$
wherein $d(n)$ represents the near-end input signal of the echo canceller, $\hat{y}(n)$ represents the estimated echo obtained by the adaptive filter operation, $\sigma_{d}^{2}$ represents the energy of $d(n)$, and $\sigma_{\hat{y}}^{2}$ represents the energy of $\hat{y}(n)$. $\sigma_{d}^{2}$ and $\sigma_{\hat{y}}^{2}$ may be calculated as follows:
$$\sigma_{d}^{2} = (1-\beta)\,\sigma_{d}^{2} + \beta\,|d(n)|^{2} \tag{2}$$
$$\sigma_{\hat{y}}^{2} = (1-\beta)\,\sigma_{\hat{y}}^{2} + \beta\,|\hat{y}(n)|^{2} \tag{3}$$
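The recursive energy updates and the coefficient $\eta$ can be tried out with a small real-valued sketch. The text does not say how the expectation in the numerator is estimated, so the code assumes the same first-order smoothing with factor β; the class name `CorrelationTracker`, the default β = 0.05, and the small constant `eps` that guards against division by zero are all assumptions.

```python
import math


class CorrelationTracker:
    """Running estimate of the cross-correlation coefficient for real signals."""

    def __init__(self, beta=0.05, eps=1e-12):
        self.beta = beta  # smoothing factor (assumed value)
        self.eps = eps    # avoids division by zero while the energies are still zero
        self.energy_d = 0.0
        self.energy_y = 0.0
        self.cross = 0.0

    def update(self, d, y_hat):
        """Consume one sample pair and return the current coefficient."""
        b = self.beta
        # Recursive energy estimates of d(n) and y_hat(n).
        self.energy_d = (1.0 - b) * self.energy_d + b * d * d
        self.energy_y = (1.0 - b) * self.energy_y + b * y_hat * y_hat
        # Assumption: E[d(n) y_hat(n)] is smoothed in the same recursive way.
        self.cross = (1.0 - b) * self.cross + b * d * y_hat
        return self.cross / math.sqrt(self.energy_d * self.energy_y + self.eps)
```

When the estimated echo is a scaled copy of the near-end signal the coefficient approaches 1, and when the two signals are unrelated it stays near 0, which matches the interpretation given earlier.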
Based on the cross-correlation coefficient $\eta$ between the near-end input signal $d$ and the estimated echo signal $\hat{y}$, the embodiment of the invention judges the state of the filter during non-linear post-processing and extends this idea to sub-bands, i.e. $\eta_i$ (the subscript $i$ denotes the i-th sub-band) is used to determine the state of the i-th sub-band adaptive filter:
$$\eta_i = \frac{E[d_i^{*}(n)\,\hat{y}_i(n)]}{\sqrt{\sigma_{d_i}^{2}\,\sigma_{\hat{y}_i}^{2}}}, \quad i = 1, 2, \ldots, N \tag{4}$$
wherein N is the number of sub-bands; $d_i(n)$ represents the sub-band near-end input signal of the echo canceller; $\hat{y}_i(n)$ represents the estimated echo obtained by the sub-band adaptive filter operation; the sub-band signals $d_i(n)$ and $\hat{y}_i(n)$ may be complex; the superscript $*$ denotes complex conjugation; $\sigma_{d_i}^{2}$ represents the energy of $d_i(n)$, and $\sigma_{\hat{y}_i}^{2}$ represents the energy of $\hat{y}_i(n)$. The sub-band signal energies $\sigma_{d_i}^{2}$ and $\sigma_{\hat{y}_i}^{2}$ are calculated as follows:
$$\sigma_{d_i}^{2} = (1-\beta)\,\sigma_{d_i}^{2} + \beta\,|d_i(n)|^{2} \tag{5}$$
$$\sigma_{\hat{y}_i}^{2} = (1-\beta)\,\sigma_{\hat{y}_i}^{2} + \beta\,|\hat{y}_i(n)|^{2} \tag{6}$$
In the embodiment of the invention, when the state of the sub-band adaptive filter is judged to be single-ended speech (i.e. when the correlation coefficient $\eta_i$ is close to 1, written $\eta_i \to 1$), the residual signal output by the i-th sub-band adaptive filter is mainly residual echo and needs to be suppressed further, for example by replacing it with comfort noise generated for that sub-band; when the state of the sub-band adaptive filter is judged to be double-ended speech (i.e. when the correlation coefficient $\eta_i$ is close to 0, written $\eta_i \to 0$), the residual signal output by the sub-band adaptive filter may be output directly without processing.
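The per-sub-band coefficient $\eta_i$, whose limiting values are interpreted above, could be computed along the following lines for complex sub-band samples. Two points are assumptions of this sketch and are not stated in the text: the expectation is estimated with the same recursive smoothing as the energies, and the magnitude of the resulting complex value is taken so that it can be compared with real thresholds. The function name and the data layout (`frames[n][i]` is the sample of sub-band i at time n) are also illustrative.

```python
import math


def subband_correlations(d_frames, y_frames, beta=0.05, eps=1e-12):
    """Return the magnitude of eta_i for each sub-band after the last frame."""
    if not d_frames:
        return []
    num_bands = len(d_frames[0])
    energy_d = [0.0] * num_bands
    energy_y = [0.0] * num_bands
    cross = [0j] * num_bands
    for d_frame, y_frame in zip(d_frames, y_frames):
        for i in range(num_bands):
            d, y = d_frame[i], y_frame[i]
            # Recursive sub-band energies of d_i(n) and y_hat_i(n).
            energy_d[i] = (1.0 - beta) * energy_d[i] + beta * abs(d) ** 2
            energy_y[i] = (1.0 - beta) * energy_y[i] + beta * abs(y) ** 2
            # Assumed estimator for E[conj(d_i(n)) * y_hat_i(n)].
            cross[i] = (1.0 - beta) * cross[i] + beta * d.conjugate() * y
    return [
        abs(cross[i]) / math.sqrt(energy_d[i] * energy_y[i] + eps)
        for i in range(num_bands)
    ]
```

A sub-band whose estimated echo is a scaled and phase-rotated copy of its near-end signal yields a value near 1, while a sub-band with no estimated echo yields 0.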
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Referring to fig. 2, a schematic diagram of a subband-based echo cancellation process according to an embodiment of the present invention is shown, where the process may include:
Step 201, calculating the cross-correlation coefficient $\eta_i$ between the near-end input signal and the echo estimation signal of each sub-band.
Specifically, the cross-correlation coefficient between the near-end input signal and the echo estimation signal of each sub-band can be calculated according to formula (4), formula (5) and formula (6).
Step 202, determining the state of each sub-band adaptive filter according to the cross-correlation coefficient $\eta_i$ of each sub-band. If the sub-band adaptive filter is in the single-ended speaking state, go to step 203; if the sub-band adaptive filter is in the double-ended speaking state, go to step 204.
If the cross-correlation coefficient of a sub-band satisfies $\eta_i \ge T$ (where T is a set threshold, 0 < T < 1), the sub-band adaptive filter is considered to be in a single-ended speaking state; if $\eta_i < T$, the sub-band adaptive filter is considered to be in a double-ended speaking state. The value of T may be chosen according to factors such as the degree of system nonlinearity, the strength of the ambient background noise, and the performance of the sub-band adaptive filter; for example, T = 0.5.
Step 203, for the sub-band adaptive filter in the single-ended speaking state, the residual signal is processed and then output to suppress or eliminate the echo.
In this step, the echo can be eliminated in various ways. The embodiment of the present invention preferably replaces the residual signal of the sub-band adaptive filter with the comfort noise of the sub-band and then outputs the replaced signal.
Step 204, for the sub-band adaptive filter in the double-ended speaking state, directly outputting the residual signal of the sub-band adaptive filter.
As can be seen from the above process, the embodiment of the present invention judges the sub-bands at the same instant to be in different states and processes each accordingly: a sub-band with η_i ≥ T is judged to be in single-ended speaking and its residual echo is replaced with the comfort noise of that sub-band, while a sub-band with η_i < T is judged to be in double-ended speaking and the residual signal output by its sub-band adaptive filter is output directly. Residual echo is thus suppressed in a targeted manner, and speech clipping during double-ended speaking is avoided to a certain extent.
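The single-threshold flow of steps 202 to 204 can be sketched roughly as follows. This is a minimal illustration rather than the patented implementation: the names `classify_single_threshold`, `comfort_noise` and `process_subbands` are hypothetical, and the uniform-noise comfort-noise generator is an assumption, since the passage does not specify how comfort noise is produced.

```python
import random

SINGLE_TALK = "single"
DOUBLE_TALK = "double"

def classify_single_threshold(etas, threshold=0.5):
    """Step 202: eta_i >= T is single-ended, eta_i < T is double-ended."""
    return [SINGLE_TALK if eta >= threshold else DOUBLE_TALK for eta in etas]

def comfort_noise(length, level, rng):
    """Hypothetical comfort-noise generator: uniform noise in [-level, level]."""
    return [rng.uniform(-level, level) for _ in range(length)]

def process_subbands(residuals, etas, threshold=0.5, noise_level=1e-3, seed=0):
    """Steps 203 and 204: replace or pass through each sub-band residual."""
    rng = random.Random(seed)
    states = classify_single_threshold(etas, threshold)
    outputs = []
    for residual, state in zip(residuals, states):
        if state == SINGLE_TALK:
            # Step 203: the residual is mostly echo, so substitute comfort noise.
            outputs.append(comfort_noise(len(residual), noise_level, rng))
        else:
            # Step 204: near-end speech is present, so pass the residual through.
            outputs.append(list(residual))
    return states, outputs

# Two sub-bands, three residual samples each (made-up values).
residuals = [[0.2, -0.1, 0.05], [0.4, 0.3, -0.2]]
states, outputs = process_subbands(residuals, etas=[0.9, 0.1])
```

Because the decision is taken per sub-band, one frame can contain both replaced and untouched sub-bands at the same time.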
When the system nonlinearity is severe or the ambient background noise is relatively strong, the cross-correlation coefficient η_i of a sub-band tends toward an intermediate value, η_i → (η ± Δ), for example η_i → 0.5 ± 0.2. In this case, relying only on the relation between the cross-correlation coefficient η_i of each sub-band and the threshold T makes it difficult to determine the state accurately, which may degrade the echo cancellation.
To solve this problem, the embodiment of the present invention further improves the scheme shown in fig. 2 by combining the information of all sub-bands. When a set number of sub-bands can be definitely determined as single-ended speaking (i.e. η_i → 1), or when the proportion of sub-bands definitely determined as single-ended speaking reaches a set proportion, the other sub-bands in the critical state (i.e. η_i → (η ± Δ), for example η_i → 0.5 ± 0.2) are also relatively likely to be in single-ended speaking; similarly, when a set number of sub-bands can be definitely determined as double-ended speaking, the other sub-bands in the critical state are relatively likely to be in double-ended speaking. The improved process is described in detail below with reference to fig. 3.
Referring to fig. 3, a schematic diagram of another subband-based echo cancellation process provided in the embodiment of the present invention is shown, where the process may include:
Step 301, calculating the cross-correlation coefficient η_i between the near-end input signal and the echo estimation signal of each sub-band.
Steps 302 to 304, determining the state of each sub-band adaptive filter according to the cross-correlation coefficient η_i of each sub-band. If the sub-band adaptive filter is in the single-ended speaking state, go to step 305; if it is in the double-ended speaking state, go to step 306.
If the cross-correlation coefficient of a sub-band satisfies η_i ≥ T_1 (where T_1 is a set first threshold), the sub-band adaptive filter is considered to be in the single-ended speaking state; if η_i ≤ T_2 (where T_2 is a set second threshold), it is considered to be in the double-ended speaking state; if T_2 < η_i < T_1, it must be further determined whether the sub-band adaptive filter is in the single-ended or the double-ended speaking state. For a sub-band adaptive filter whose cross-correlation coefficient η_i lies in the range (T_2, T_1), the state can be determined according to the number or the proportion of sub-band adaptive filters already judged to be in the single-ended or double-ended speaking state. Here 0 < T_2 < T_1 < 1; for example, when the system nonlinearity is severe or the ambient background noise is strong, so that the cross-correlation coefficients of the sub-band adaptive filters tend toward η_i → 0.5 ± 0.2, one can set T_1 = 0.7 and T_2 = 0.3.
Specifically, for a sub-band adaptive filter whose cross-correlation coefficient η_i lies in the range (T_2, T_1), the state can be determined in one of the following modes (fig. 3 shows a specific implementation of only one of them):
Mode 1: if the number of sub-bands with η_i ≥ T_1 exceeds a set threshold (n > th_n in fig. 3, where n is the number of sub-bands with η_i ≥ T_1 and th_n is the set threshold), the sub-band adaptive filters whose η_i lies in (T_2, T_1) are considered to be in the single-ended speaking state; otherwise, they are considered to be in the double-ended speaking state.
Mode 2: if the proportion of sub-bands with η_i ≥ T_1 among all sub-bands exceeds a set threshold (for example, exceeds 50%), the sub-band adaptive filters whose η_i lies in (T_2, T_1) are considered to be in the single-ended speaking state; otherwise, they are considered to be in the double-ended speaking state.
Mode 3: if the number of sub-bands with η_i ≤ T_2 exceeds a set threshold, the sub-band adaptive filters whose η_i lies in (T_2, T_1) are considered to be in the double-ended speaking state; otherwise, they are considered to be in the single-ended speaking state.
Mode 4: if the proportion of sub-bands with η_i ≤ T_2 among all sub-bands exceeds a set threshold (for example, exceeds 50%), the sub-band adaptive filters whose η_i lies in (T_2, T_1) are considered to be in the double-ended speaking state; otherwise, they are considered to be in the single-ended speaking state.
Step 305, for the subband adaptive filter in the single-ended speaking state, the residual signal is processed and output to suppress or eliminate the echo. The specific implementation manner can be the same as step 203 in fig. 2.
Step 306, for the sub-band adaptive filter in the double-ended speaking state, directly outputting the residual signal of the sub-band adaptive filter.
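The two-threshold decision of steps 302 to 304, including the four ways of resolving critical sub-bands, might be sketched as below. The function name `classify_two_thresholds`, the parameter names `count_limit` and `ratio_limit`, and the default values are assumptions chosen for illustration; the defaults T_1 = 0.7 and T_2 = 0.3 are simply values that satisfy 0 < T_2 < T_1 < 1.

```python
SINGLE_TALK = "single"
DOUBLE_TALK = "double"

def classify_two_thresholds(etas, t1=0.7, t2=0.3, mode=1,
                            count_limit=2, ratio_limit=0.5):
    """Classify sub-bands; critical ones (t2 < eta < t1) follow the clear ones.

    mode 1: count of single-ended sub-bands > count_limit -> critical = single
    mode 2: share of single-ended sub-bands > ratio_limit -> critical = single
    mode 3: count of double-ended sub-bands > count_limit -> critical = double
    mode 4: share of double-ended sub-bands > ratio_limit -> critical = double
    In each mode the critical sub-bands take the opposite state otherwise.
    """
    if not 0.0 < t2 < t1 < 1.0:
        raise ValueError("thresholds must satisfy 0 < t2 < t1 < 1")
    total = len(etas)
    if total == 0:
        return []
    n_single = sum(1 for eta in etas if eta >= t1)  # clearly single-ended
    n_double = sum(1 for eta in etas if eta <= t2)  # clearly double-ended

    if mode == 1:
        critical = SINGLE_TALK if n_single > count_limit else DOUBLE_TALK
    elif mode == 2:
        critical = SINGLE_TALK if n_single / total > ratio_limit else DOUBLE_TALK
    elif mode == 3:
        critical = DOUBLE_TALK if n_double > count_limit else SINGLE_TALK
    elif mode == 4:
        critical = DOUBLE_TALK if n_double / total > ratio_limit else SINGLE_TALK
    else:
        raise ValueError("mode must be 1, 2, 3 or 4")

    states = []
    for eta in etas:
        if eta >= t1:
            states.append(SINGLE_TALK)
        elif eta <= t2:
            states.append(DOUBLE_TALK)
        else:
            states.append(critical)
    return states

# Three clearly single-ended sub-bands, one critical, one clearly double-ended.
etas = [0.9, 0.8, 0.75, 0.5, 0.1]
states_mode1 = classify_two_thresholds(etas, mode=1, count_limit=2)
states_mode4 = classify_two_thresholds(etas, mode=4, ratio_limit=0.5)
```

Only the critical sub-band (η_i = 0.5 in the example) is affected by the choice of mode; sub-bands outside (T_2, T_1) keep the state given by their own coefficient.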
An alternative to the flow shown in fig. 3 is shown in fig. 4. It differs from the flow of fig. 3 in that, for a sub-band adaptive filter whose cross-correlation coefficient η_i lies in the range (T_2, T_1), the state is determined according to the mean of the cross-correlation coefficients η_i of all sub-bands, such as the arithmetic mean or a mean obtained by another algorithm. As shown in fig. 4, the process may include:
Step 401, calculating the cross-correlation coefficient η_i between the near-end input signal and the echo estimation signal of each sub-band.
Steps 402 to 404, determining the state of each sub-band adaptive filter according to the cross-correlation coefficient η_i of each sub-band. If the sub-band adaptive filter is in the single-ended speaking state, go to step 405; if it is in the double-ended speaking state, go to step 406.
If the cross-correlation coefficient of a sub-band satisfies η_i ≥ T_1, the sub-band adaptive filter is considered to be in the single-ended speaking state; if η_i ≤ T_2, it is considered to be in the double-ended speaking state; if T_2 < η_i < T_1, it must be further determined whether the sub-band adaptive filter is in the single-ended or the double-ended speaking state. For a sub-band adaptive filter whose cross-correlation coefficient η_i lies in the range (T_2, T_1), the state can be determined according to the mean of the cross-correlation coefficients η_i of all sub-bands.
Specifically, if the arithmetic mean γ of the cross-correlation coefficients η_i of all sub-bands satisfies γ ≥ T_3 (where T_3 is a set third threshold), the sub-band adaptive filters whose η_i lies in (T_2, T_1) are in the single-ended speaking state; if γ < T_3, they are in the double-ended speaking state. Here 0 < T_2 < T_3 < T_1 < 1; for example, one can set T_1 = 0.8, T_2 = 0.2 and T_3 = 0.5. The arithmetic mean γ of the cross-correlation coefficients η_i of all sub-bands can be expressed as:
$$\gamma = \frac{1}{N}\sum_{i=1}^{N}\eta_i \tag{7}$$
where N is the number of subbands.
Step 405, for the sub-band adaptive filter in the single-ended speaking state, the residual signal is processed and then output to suppress or eliminate the echo. The specific implementation can be the same as step 203 in fig. 2.
Step 406, for the sub-band adaptive filter in the double-ended speaking state, directly outputting the residual signal of the sub-band adaptive filter.
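A minimal sketch of the mean-based decision in the flow of fig. 4 is given below, using the arithmetic mean of formula (7) and the example thresholds T_1 = 0.8, T_2 = 0.2 and T_3 = 0.5. The function name `classify_by_mean` and the sample coefficients are hypothetical and serve only to illustrate the rule.

```python
SINGLE_TALK = "single"
DOUBLE_TALK = "double"

def classify_by_mean(etas, t1=0.8, t2=0.2, t3=0.5):
    """Return (gamma, states); critical sub-bands follow the arithmetic mean."""
    if not 0.0 < t2 < t3 < t1 < 1.0:
        raise ValueError("thresholds must satisfy 0 < t2 < t3 < t1 < 1")
    if not etas:
        return 0.0, []
    gamma = sum(etas) / len(etas)  # formula (7)
    # gamma >= T_3: critical sub-bands are single-ended, otherwise double-ended.
    critical = SINGLE_TALK if gamma >= t3 else DOUBLE_TALK
    states = []
    for eta in etas:
        if eta >= t1:
            states.append(SINGLE_TALK)
        elif eta <= t2:
            states.append(DOUBLE_TALK)
        else:
            states.append(critical)
    return gamma, states

# Made-up coefficients: two clear single-ended, two critical, one clear double-ended.
gamma, states = classify_by_mean([0.9, 0.85, 0.6, 0.4, 0.1])
```

Here the mean is about 0.57, so both critical sub-bands (0.6 and 0.4) are treated as single-ended speaking even though 0.4 on its own lies below T_3.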
In the flow shown in fig. 4, determining the state of the sub-band adaptive filters whose cross-correlation coefficient η_i lies in (T_2, T_1) directly from the average of the correlation coefficients of all sub-bands may be relatively coarse. For example, when a sub-band contains only background noise, or only an echo or near-end speech component with very little energy, its correlation coefficient η_i still takes part in the averaging; when the proportion of such weak-energy sub-bands is large, the final mean γ can be adversely affected.
To solve this problem, another embodiment of the present invention further improves the flow shown in fig. 4 by introducing an energy factor when determining the state of the sub-band adaptive filters whose cross-correlation coefficient η_i lies in (T_2, T_1). Specifically, the mean γ is obtained as an energy-weighted average of the correlation coefficients of the sub-bands, so that sub-bands with relatively large energy and a more clearly defined state (e.g. η_i ≥ T_1 or η_i ≤ T_2) are given larger weights and the critical-state sub-bands are finally assigned to a more reasonable state.
The modified flow is substantially the same as the flow shown in fig. 4, except that formula (7) is replaced by formula (8) below to calculate the mean value:
$$\gamma = \sum_{i=1}^{N}\lambda_i\,\eta_i \tag{8}$$
where N is the number of sub-bands and the energy weight λ_i can be expressed as:
$$\lambda_i = \frac{\sigma_{d_i}^{2}}{\sum_{j=1}^{N}\sigma_{d_j}^{2}} \tag{9}$$
In formula (9), $\sigma_{d_i}^{2}$, the energy of the near-end input signal of the i-th sub-band, can be calculated by formula (2).
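The effect of the energy weighting in formulas (8) and (9) can be illustrated with a small numerical sketch. The coefficient and energy values below are invented, the function names `energy_weights` and `weighted_mean` are hypothetical, and the sketch applies only the energy weighting of formula (9), without any extra weighting for clearly classified sub-bands.

```python
def energy_weights(energies):
    """Formula (9): each sub-band's share of the total near-end input energy."""
    total = sum(energies)
    if total <= 0.0:
        raise ValueError("total sub-band energy must be positive")
    return [e / total for e in energies]

def weighted_mean(etas, energies):
    """Formula (8): energy-weighted mean of the cross-correlation coefficients."""
    weights = energy_weights(energies)
    return sum(w * eta for w, eta in zip(weights, etas))

# Two strong single-ended sub-bands, one critical sub-band, and four
# noise-only sub-bands with very little energy (all values are made up).
etas = [0.9, 0.9, 0.5, 0.1, 0.1, 0.1, 0.1]
energies = [4.0, 4.0, 1.0, 0.01, 0.01, 0.01, 0.01]

gamma_plain = sum(etas) / len(etas)             # formula (7), about 0.39
gamma_weighted = weighted_mean(etas, energies)  # formula (8), about 0.85
```

With T_3 = 0.5, the plain mean would push the critical sub-band toward double-ended speaking because of the four weak sub-bands, whereas the energy-weighted mean follows the energetic sub-bands and indicates single-ended speaking.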
As can be seen from the above flows, the embodiment of the present invention judges the sub-bands at the same instant to be in different states and processes each accordingly: a sub-band with η_i ≥ T_1 is judged to be in single-ended speaking and its residual echo is replaced with the comfort noise of that sub-band; a sub-band with η_i ≤ T_2 is judged to be in double-ended speaking and the residual signal output by its sub-band adaptive filter is output directly; and sub-bands in the critical state (i.e. sub-bands whose cross-correlation coefficient η_i lies in (T_2, T_1)) are assigned to a more reasonable state by means such as the weighted mean, so as to suppress residual echo better and avoid speech clipping during double-ended speaking.
Based on the same technical concept, the embodiment of the present invention further provides an echo cancellation device that can be applied to the above-mentioned process provided by the embodiment of the present invention.
As shown in fig. 5, the echo canceling device may include:
a state determining module 501, configured to determine a state of each subband adaptive filter according to a cross-correlation coefficient of a near-end input signal and an echo estimation signal of each subband;
an output module 502, configured to output a replaced signal after replacing a residual signal of the subband adaptive filter with comfort noise of the subband when the subband adaptive filter is in a single-ended speaking state; when the sub-band adaptive filter is in a double-talk state, a residual signal of the sub-band adaptive filter is output.
When the state determining module 501 determines that the cross-correlation coefficient of the subband is greater than or equal to the first threshold, it determines that the subband adaptive filter is in the single-ended speaking state; when the cross correlation coefficient of the sub-band is judged to be smaller than or equal to a second threshold value, the sub-band adaptive filter is determined to be in a double-end speaking state; for the sub-band with the cross correlation coefficient between the second threshold value and the first threshold value, determining the state of the sub-band adaptive filter according to the number or the occupied proportion of the sub-band adaptive filter in the specified state or the average value of the cross correlation coefficient of each sub-band adaptive filter; wherein 0 < the second threshold < the first threshold < 1.
Specifically, when determining the state of the subband adaptive filter according to the number of the subband adaptive filters in the specified state for the subband having the cross-correlation coefficient between the second threshold and the first threshold, the state determining module 501 determines that the adaptive filter of each subband having the cross-correlation coefficient between the second threshold and the first threshold is in the single-ended speaking state if the number of the subband adaptive filters in the single-ended speaking state exceeds the set threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or if the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, judging that the adaptive filters of all sub-bands with the cross correlation coefficients between a second threshold and a first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is judged to be in a single-end speaking state.
Specifically, when determining the state of the subband adaptive filter according to the proportion of the subband adaptive filter in the specified state for the subband with the cross-correlation coefficient between the second threshold and the first threshold, the state determining module 501 determines that the adaptive filter of each subband with the cross-correlation coefficient between the second threshold and the first threshold is in the single-ended speaking state if the proportion of the number of the subband adaptive filters in the single-ended speaking state exceeds the set threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or if the proportion of the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, judging that the adaptive filters of the sub-bands with the cross correlation coefficients between the second threshold and the first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is judged to be in a single-end speaking state.
Specifically, when determining the state of the sub-band adaptive filter according to the average value of the cross-correlation coefficient of each sub-band adaptive filter for the sub-band having the cross-correlation coefficient between the second threshold and the first threshold, the state determining module 501 determines that the adaptive filter of each sub-band having the cross-correlation coefficient between the second threshold and the first threshold is in the single-ended speaking state if the average value of the cross-correlation coefficient of each sub-band is greater than the third threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; wherein 0 < the second threshold < the third threshold < the first threshold < 1.
Specifically, when determining the state of the sub-band adaptive filter according to the average value of the cross-correlation coefficient of each sub-band adaptive filter for the sub-band having the cross-correlation coefficient between the second threshold and the first threshold, the state determining module 501 determines that the adaptive filter of each sub-band having the cross-correlation coefficient between the second threshold and the first threshold is in the single-ended speaking state if the weighted average value of the cross-correlation coefficient of each sub-band is greater than the third threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; wherein 0 < the second threshold < the third threshold < the first threshold < 1.
Specifically, the weight values used by the state determining module 501 for sub-bands whose cross-correlation coefficient is greater than the first threshold and for sub-bands whose cross-correlation coefficient is less than the second threshold are greater than the weight value used for sub-bands whose cross-correlation coefficient is between the second threshold and the first threshold.
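The division of work between the state determining module 501 and the output module 502 could be sketched as two small classes. This is an illustrative arrangement only: the class names, method names, the optional `weights` argument, and the uniform-noise comfort-noise generator are all assumptions, and the critical sub-bands are resolved here with the mean-based rule (plain mean when no weights are supplied).

```python
import random

class StateDeterminingModule:
    """Sketch of module 501: maps cross-correlation coefficients to states."""

    def __init__(self, t1=0.8, t2=0.2, t3=0.5):
        if not 0.0 < t2 < t3 < t1 < 1.0:
            raise ValueError("thresholds must satisfy 0 < t2 < t3 < t1 < 1")
        self.t1, self.t2, self.t3 = t1, t2, t3

    def determine(self, etas, weights=None):
        if not etas:
            return []
        if weights is None:
            gamma = sum(etas) / len(etas)  # plain mean
        else:
            gamma = sum(w * e for w, e in zip(weights, etas))  # weighted mean
        critical = "single" if gamma > self.t3 else "double"
        states = []
        for eta in etas:
            if eta >= self.t1:
                states.append("single")
            elif eta <= self.t2:
                states.append("double")
            else:
                states.append(critical)
        return states

class OutputModule:
    """Sketch of module 502: comfort noise for single-ended, residual otherwise."""

    def __init__(self, noise_level=1e-3, seed=0):
        self.noise_level = noise_level
        self.rng = random.Random(seed)  # hypothetical comfort-noise source

    def output(self, residuals, states):
        out = []
        for residual, state in zip(residuals, states):
            if state == "single":
                lvl = self.noise_level
                out.append([self.rng.uniform(-lvl, lvl) for _ in residual])
            else:
                out.append(list(residual))
        return out

state_module = StateDeterminingModule()
output_module = OutputModule()
residuals = [[0.3, -0.2], [0.1, 0.1], [0.5, -0.4]]
states = state_module.determine([0.9, 0.6, 0.1])
outputs = output_module.output(residuals, states)
```

Keeping the decision and the output stage separate makes it straightforward to swap the rule for critical sub-bands (count, proportion, mean, or weighted mean) without touching the signal path.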
In summary, the embodiment of the present invention is simple and easy to implement in a real-time system, and makes full use of the characteristic of sub-band signal processing, so as to effectively suppress the residual echo, and enhance the suppression of the residual echo during dual-end speech and the protection of the voice of the speaker at the local end, so as to improve the overall effect and fluency of the system.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better embodiment. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for enabling a terminal device (which may be a mobile phone, a personal computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the present invention, and such modifications and improvements should also be considered within the scope of the present invention.

Claims (15)

1. An echo cancellation method, comprising the steps of:
determining the state of each sub-band adaptive filter according to the cross-correlation coefficient of the near-end input signal and the echo estimation signal of each sub-band;
when the sub-band adaptive filter is in a single-end speaking state, replacing a residual signal of the sub-band adaptive filter by comfort noise of the sub-band and then outputting a replaced signal;
when the sub-band adaptive filter is in a double-talk state, a residual signal of the sub-band adaptive filter is output.
2. The method of claim 1, wherein determining the state of the subband adaptive filter based on cross-correlation coefficients of the near-end input signal and the echo estimate signal for the subband comprises:
when the cross correlation coefficient of the sub-band is larger than or equal to a first threshold value, determining that the sub-band adaptive filter is in a single-ended speaking state;
when the cross correlation coefficient of the sub-band is smaller than or equal to a second threshold value, determining that the sub-band adaptive filter is in a double-talk state;
for the sub-band with the cross correlation coefficient between the second threshold value and the first threshold value, determining the state of the sub-band adaptive filter according to the number or the occupied proportion of the sub-band adaptive filter in the specified state or the average value of the cross correlation coefficient of each sub-band adaptive filter;
wherein 0 < the second threshold < the first threshold < 1.
3. The method as claimed in claim 2, wherein for the sub-band having the cross-correlation coefficient between the second threshold and the first threshold, the state of the sub-band adaptive filter is determined according to the number of sub-band adaptive filters in the specified state, specifically:
if the number of the sub-band adaptive filters in the single-ended speaking state exceeds a set threshold, the adaptive filters of the sub-bands with the cross-correlation coefficients between the second threshold and the first threshold are in the single-ended speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or,
if the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, the adaptive filters of all sub-bands with the cross correlation coefficients between a second threshold and a first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in the single-end speaking state.
4. The method as claimed in claim 2, wherein for the sub-band having the cross-correlation coefficient between the second threshold and the first threshold, the state of the sub-band adaptive filter is determined according to the proportion of the sub-band adaptive filter in the specified state, specifically:
if the proportion of the number of the sub-band adaptive filters in the single-ended speaking state exceeds a set threshold, the adaptive filters of the sub-bands with the cross-correlation coefficients between a second threshold and a first threshold are in the single-ended speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or,
if the proportion of the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, the adaptive filters of the sub-bands with the cross correlation coefficients between a second threshold and a first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in the single-end speaking state.
5. The method as claimed in claim 2, wherein for the sub-band having the cross-correlation coefficient between the second threshold and the first threshold, the state of the sub-band adaptive filter is determined according to the average value of the cross-correlation coefficient of each sub-band adaptive filter, specifically:
if the average value of the cross correlation coefficients of all the sub-bands is larger than a third threshold value, the self-adaptive filter of all the sub-bands with the cross correlation coefficients between the second threshold value and the first threshold value is in a single-ended speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is in a double-end speaking state;
wherein 0 < the second threshold < the third threshold < the first threshold < 1.
6. The method as claimed in claim 2, wherein for the sub-band having the cross-correlation coefficient between the second threshold and the first threshold, the state of the sub-band adaptive filter is determined according to the average value of the cross-correlation coefficient of each sub-band adaptive filter, specifically:
if the weighted average value of the cross-correlation coefficient of each sub-band is greater than a third threshold value, the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold value and the first threshold value is in a single-ended speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is in a double-end speaking state;
wherein 0 < the second threshold < the third threshold < the first threshold < 1.
7. The method of claim 6, wherein the weight values corresponding to subbands having cross-correlation coefficients greater than a first threshold and subbands having cross-correlation coefficients less than a second threshold are greater than the weight values corresponding to subbands having cross-correlation coefficients between the second threshold and the first threshold.
8. The method of claim 6, wherein the weight values for the subbands are:
$$\lambda_i = \frac{\sigma_{d_i}^{2}}{\sum_{j=1}^{N}\sigma_{d_j}^{2}}$$
wherein N is the number of sub-bands and $\sigma_{d_i}^{2}$ is the energy of the near-end input signal of the i-th sub-band of the echo canceller.
9. An echo cancellation device, comprising:
a state determining module, configured to determine the state of each sub-band adaptive filter according to the cross-correlation coefficient of the near-end input signal and the echo estimation signal of each sub-band;
an output module, configured to replace the residual signal of the sub-band adaptive filter with comfort noise of the sub-band and then output the replaced signal when the sub-band adaptive filter is in a single-ended speaking state, and to output the residual signal of the sub-band adaptive filter when the sub-band adaptive filter is in a double-talk state.
10. The apparatus of claim 9, wherein the state determination module is specifically configured to determine that the subband adaptive filter is in a single-ended speaking state when the cross-correlation coefficient of the subband is greater than or equal to a first threshold; when the cross correlation coefficient of the sub-band is smaller than or equal to a second threshold value, determining that the sub-band adaptive filter is in a double-talk state; for the sub-band with the cross correlation coefficient between the second threshold value and the first threshold value, determining the state of the sub-band adaptive filter according to the number or the occupied proportion of the sub-band adaptive filter in the specified state or the average value of the cross correlation coefficient of each sub-band adaptive filter; wherein 0 < the second threshold < the first threshold < 1.
11. The apparatus according to claim 10, wherein the state determining module is specifically configured to, when determining the state of the subband adaptive filter according to the number of subband adaptive filters in a specified state for subbands having cross-correlation coefficients between a second threshold and a first threshold, determine that the adaptive filter of each subband having cross-correlation coefficients between the second threshold and the first threshold is in a single-ended speaking state if the number of subband adaptive filters in the single-ended speaking state exceeds a set threshold; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or if the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, judging that the adaptive filters of all sub-bands with the cross correlation coefficients between a second threshold and a first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is judged to be in a single-end speaking state.
12. The apparatus according to claim 10, wherein the state determining module is specifically configured to, for a subband having a cross-correlation coefficient between a second threshold and a first threshold, determine a state of the subband adaptive filter according to a ratio of the subband adaptive filters in a specified state, and if the ratio of the number of the subband adaptive filters in a single-ended speaking state exceeds a set threshold, determine that the adaptive filter of each subband having the cross-correlation coefficient between the second threshold and the first threshold is in the single-ended speaking state; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; or if the proportion of the number of the sub-band adaptive filters in the double-end speaking state exceeds a set threshold, judging that the adaptive filters of the sub-bands with the cross correlation coefficients between the second threshold and the first threshold are in the double-end speaking state; otherwise, the adaptive filter of each sub-band with the cross correlation coefficient between the second threshold and the first threshold is judged to be in a single-end speaking state.
13. The apparatus according to claim 10, wherein the state determining module is specifically configured to, for a sub-band having a cross-correlation coefficient between a second threshold and a first threshold, determine a state of the sub-band adaptive filter according to an average value of the cross-correlation coefficient of each sub-band adaptive filter, and if the average value of the cross-correlation coefficient of each sub-band is greater than a third threshold, determine that the adaptive filter of each sub-band having the cross-correlation coefficient between the second threshold and the first threshold is in a single-ended speaking state; otherwise, judging that the self-adaptive filter of each sub-band with the cross-correlation coefficient between the second threshold and the first threshold is in a double-end speaking state; wherein 0 < the second threshold < the third threshold < the first threshold < 1.
14. The apparatus according to claim 10, wherein the state determining module is specifically configured to, for a sub-band having a cross-correlation coefficient between a second threshold and a first threshold, determine a state of the sub-band adaptive filter according to a weighted average value of the cross-correlation coefficients of the sub-bands: if the weighted average value of the cross-correlation coefficients of the sub-bands is greater than a third threshold, determine that the adaptive filter of each sub-band having a cross-correlation coefficient between the second threshold and the first threshold is in the single-ended speaking state; otherwise, determine that the adaptive filter of each sub-band having a cross-correlation coefficient between the second threshold and the first threshold is in the double-ended speaking state; wherein 0 < the second threshold < the third threshold < the first threshold < 1.
15. The apparatus of claim 14, wherein the weight values used by the state determining module for sub-bands having cross-correlation coefficients greater than the first threshold and for sub-bands having cross-correlation coefficients less than the second threshold are greater than the weight value used for sub-bands having cross-correlation coefficients between the second threshold and the first threshold.
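
The decision rules recited in claims 12 to 15 can be illustrated with a short Python sketch. It is not the patented implementation, only one possible reading under stated assumptions. All names (`State`, `classify_clear`, `resolve_by_ratio`, `resolve_by_mean`, `claim15_weights`) and the numeric weight values are hypothetical. The sketch assumes that a coefficient above the first threshold `t1` means single-ended speaking and a coefficient below the second threshold `t2` means double-ended speaking, which matches the direction of the comparison in claim 13 but is not stated in the claims shown here. It also assumes that the proportion in claim 12 is taken over all sub-bands.

```python
from enum import Enum
from typing import List, Optional, Sequence


class State(Enum):
    SINGLE_TALK = "single-ended"
    DOUBLE_TALK = "double-ended"


def classify_clear(coeffs: Sequence[float], t1: float, t2: float) -> List[Optional[State]]:
    """Decide sub-bands outside [t2, t1] directly; leave the rest as None.

    Assumption: coefficient > t1 means single-ended, < t2 means double-ended.
    """
    states: List[Optional[State]] = []
    for c in coeffs:
        if c > t1:
            states.append(State.SINGLE_TALK)
        elif c < t2:
            states.append(State.DOUBLE_TALK)
        else:
            states.append(None)  # ambiguous band, resolved by a rule below
    return states


def _fill(states: List[Optional[State]], fallback: State) -> List[State]:
    """Assign the same fallback state to every ambiguous sub-band."""
    return [fallback if s is None else s for s in states]


def resolve_by_ratio(coeffs: Sequence[float], t1: float, t2: float,
                     ratio_threshold: float) -> List[State]:
    """Claim 12 (first alternative): vote by the share of single-ended bands.

    Assumption: the share is computed over all (non-empty) sub-bands.
    """
    states = classify_clear(coeffs, t1, t2)
    single_share = sum(s is State.SINGLE_TALK for s in states) / len(states)
    fallback = State.SINGLE_TALK if single_share > ratio_threshold else State.DOUBLE_TALK
    return _fill(states, fallback)


def claim15_weights(coeffs: Sequence[float], t1: float, t2: float,
                    clear_weight: float = 2.0,
                    ambiguous_weight: float = 1.0) -> List[float]:
    """Claim 15: clearly decided bands weigh more than ambiguous ones.

    The values 2.0 and 1.0 are illustrative only.
    """
    return [ambiguous_weight if t2 <= c <= t1 else clear_weight for c in coeffs]


def resolve_by_mean(coeffs: Sequence[float], t1: float, t2: float, t3: float,
                    weights: Optional[Sequence[float]] = None) -> List[State]:
    """Claims 13 and 14: compare the (weighted) mean coefficient with t3."""
    states = classify_clear(coeffs, t1, t2)
    if weights is None:
        weights = [1.0] * len(coeffs)  # plain average, as in claim 13
    mean = sum(w * c for w, c in zip(weights, coeffs)) / sum(weights)
    fallback = State.SINGLE_TALK if mean > t3 else State.DOUBLE_TALK
    return _fill(states, fallback)
```

For example, with coefficients `[0.9, 0.85, 0.5, 0.1]`, `t1 = 0.8`, `t2 = 0.2` and `t3 = 0.59`, the plain average (0.5875) falls below `t3`, so the ambiguous third sub-band is treated as double-ended, while the weighted average with the illustrative weights (0.6) exceeds `t3` and the same sub-band is treated as single-ended.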
CN 201010618136 2010-12-31 2010-12-31 Method and device for eliminating echo Active CN102065190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010618136 CN102065190B (en) 2010-12-31 2010-12-31 Method and device for eliminating echo

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010618136 CN102065190B (en) 2010-12-31 2010-12-31 Method and device for eliminating echo

Publications (2)

Publication Number Publication Date
CN102065190A (en) 2011-05-18
CN102065190B (en) 2013-08-28

Family

ID=44000284

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010618136 Active CN102065190B (en) 2010-12-31 2010-12-31 Method and device for eliminating echo

Country Status (1)

Country Link
CN (1) CN102065190B (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104036784A (en) * 2014-06-06 2014-09-10 华为技术有限公司 Echo cancellation method and device
CN104427143A (en) * 2013-09-06 2015-03-18 联芯科技有限公司 Residual echo detection method and system
CN104579828A (en) * 2014-12-24 2015-04-29 深圳市国电科技通信有限公司 Power line communication simulation method and system based on energy detection
CN104754157A (en) * 2013-12-26 2015-07-01 联芯科技有限公司 Residual echo suppression method and system
CN106297816A (en) * 2015-05-20 2017-01-04 广州质音通讯技术有限公司 The non-linear processing methods of a kind of echo cancellor and device and electronic equipment
CN106331402A (en) * 2016-08-25 2017-01-11 西南交通大学 Coefficient difference based proportional subband convex combination adaptive echo cancellation method
CN107045874A (en) * 2016-02-05 2017-08-15 深圳市潮流网络技术有限公司 A kind of Non-linear Speech Enhancement Method based on correlation
CN107316652A (en) * 2017-06-30 2017-11-03 北京睿语信息技术有限公司 Sidetone removing method and device
CN108630219A (en) * 2018-05-08 2018-10-09 北京小鱼在家科技有限公司 A kind of audio frequency processing system, method, apparatus, equipment and storage medium
CN108877825A (en) * 2018-06-26 2018-11-23 珠海宏桥高科技有限公司 A kind of Network echo cancellation element and method based on voice-activated and logic control
CN108986836A (en) * 2018-08-29 2018-12-11 质音通讯科技(深圳)有限公司 A kind of control method of echo suppressor, device, equipment and storage medium
CN109327633A (en) * 2017-07-31 2019-02-12 上海谦问万答吧云计算科技有限公司 Sound mixing method, device, equipment and storage medium
CN109727604A (en) * 2018-12-14 2019-05-07 上海蔚来汽车有限公司 Frequency domain echo cancel method and computer storage media for speech recognition front-ends
CN109785853A (en) * 2019-03-11 2019-05-21 出门问问信息科技有限公司 A kind of echo cancel method, device, system and storage medium
CN109961798A (en) * 2017-12-26 2019-07-02 华平信息技术股份有限公司 Echo cancelling system, method, readable computer storage medium and terminal
CN110246516A (en) * 2019-07-25 2019-09-17 福建师范大学福清分校 The processing method of small space echo signal in a kind of voice communication
CN110971769A (en) * 2019-11-19 2020-04-07 百度在线网络技术(北京)有限公司 Call signal processing method and device, electronic equipment and storage medium
CN110995951A (en) * 2019-12-13 2020-04-10 展讯通信(上海)有限公司 Echo cancellation method, device and system based on double-end sounding detection
CN110992975A (en) * 2019-12-24 2020-04-10 大众问问(北京)信息科技有限公司 Voice signal processing method and device and terminal
CN111294473A (en) * 2019-01-28 2020-06-16 展讯通信(上海)有限公司 Signal processing method and device
CN111355855A (en) * 2020-03-12 2020-06-30 紫光展锐(重庆)科技有限公司 Echo processing method, device, equipment and storage medium
CN111933164A (en) * 2020-06-29 2020-11-13 北京百度网讯科技有限公司 Training method and device of voice processing model, electronic equipment and storage medium
CN113763975A (en) * 2020-06-05 2021-12-07 大众问问(北京)信息科技有限公司 Voice signal processing method and device and terminal

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004082249A1 (en) * 2003-03-10 2004-09-23 Tandberg Telecom As Echo canceller with reduced requirement for processing power
CN1668058A (en) * 2005-02-21 2005-09-14 南望信息产业集团有限公司 Recursive least square difference based subband echo canceller
US20090046847A1 (en) * 2007-08-15 2009-02-19 Motorola, Inc. Acoustic echo canceller using multi-band nonlinear processing
CN101562669A (en) * 2009-03-11 2009-10-21 屈国良 Method of adaptive full duplex full frequency band echo cancellation
CN101719969A (en) * 2009-11-26 2010-06-02 美商威睿电通公司 Method and system for judging double-end conversation and method and system for eliminating echo
CN101785290A (en) * 2007-08-31 2010-07-21 摩托罗拉公司 Acoustic echo based on noise circumstance is eliminated

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104427143A (en) * 2013-09-06 2015-03-18 联芯科技有限公司 Residual echo detection method and system
CN104427143B (en) * 2013-09-06 2017-02-22 联芯科技有限公司 residual echo detection method and system
CN104754157B (en) * 2013-12-26 2017-06-16 联芯科技有限公司 Method and system for suppressing residual echo
CN104754157A (en) * 2013-12-26 2015-07-01 联芯科技有限公司 Residual echo suppression method and system
CN104036784A (en) * 2014-06-06 2014-09-10 华为技术有限公司 Echo cancellation method and device
CN104579828A (en) * 2014-12-24 2015-04-29 深圳市国电科技通信有限公司 Power line communication simulation method and system based on energy detection
CN104579828B (en) * 2014-12-24 2018-08-03 深圳市国电科技通信有限公司 A kind of Power Line Communication Simulation method and system based on energy measuring
CN106297816B (en) * 2015-05-20 2019-12-13 广州质音通讯技术有限公司 Echo cancellation nonlinear processing method and device and electronic equipment
CN106297816A (en) * 2015-05-20 2017-01-04 广州质音通讯技术有限公司 The non-linear processing methods of a kind of echo cancellor and device and electronic equipment
CN107045874A (en) * 2016-02-05 2017-08-15 深圳市潮流网络技术有限公司 A kind of Non-linear Speech Enhancement Method based on correlation
CN107045874B (en) * 2016-02-05 2021-03-02 深圳市潮流网络技术有限公司 Non-linear voice enhancement method based on correlation
CN106331402A (en) * 2016-08-25 2017-01-11 西南交通大学 Coefficient difference based proportional subband convex combination adaptive echo cancellation method
CN106331402B (en) * 2016-08-25 2019-03-22 西南交通大学 One kind being based on the proportional subband convex combination adaptive echo null method of coefficient difference
CN107316652A (en) * 2017-06-30 2017-11-03 北京睿语信息技术有限公司 Sidetone removing method and device
CN109327633B (en) * 2017-07-31 2020-09-22 苏州谦问万答吧教育科技有限公司 Sound mixing method, device, equipment and storage medium
CN109327633A (en) * 2017-07-31 2019-02-12 上海谦问万答吧云计算科技有限公司 Sound mixing method, device, equipment and storage medium
CN109961798A (en) * 2017-12-26 2019-07-02 华平信息技术股份有限公司 Echo cancelling system, method, readable computer storage medium and terminal
CN109961798B (en) * 2017-12-26 2021-06-11 华平信息技术股份有限公司 Echo cancellation system, echo cancellation method, readable computer storage medium, and terminal
CN108630219A (en) * 2018-05-08 2018-10-09 北京小鱼在家科技有限公司 A kind of audio frequency processing system, method, apparatus, equipment and storage medium
CN108630219B (en) * 2018-05-08 2021-05-11 北京小鱼在家科技有限公司 Processing system, method and device for echo suppression audio signal feature tracking
CN108877825A (en) * 2018-06-26 2018-11-23 珠海宏桥高科技有限公司 A kind of Network echo cancellation element and method based on voice-activated and logic control
CN108986836A (en) * 2018-08-29 2018-12-11 质音通讯科技(深圳)有限公司 A kind of control method of echo suppressor, device, equipment and storage medium
CN109727604B (en) * 2018-12-14 2023-11-10 上海蔚来汽车有限公司 Frequency domain echo cancellation method for speech recognition front end and computer storage medium
CN109727604A (en) * 2018-12-14 2019-05-07 上海蔚来汽车有限公司 Frequency domain echo cancel method and computer storage media for speech recognition front-ends
CN111294473B (en) * 2019-01-28 2022-01-04 展讯通信(上海)有限公司 Signal processing method and device
CN111294473A (en) * 2019-01-28 2020-06-16 展讯通信(上海)有限公司 Signal processing method and device
CN109785853B (en) * 2019-03-11 2020-06-16 出门问问信息科技有限公司 Echo cancellation method, device, system and storage medium
CN109785853A (en) * 2019-03-11 2019-05-21 出门问问信息科技有限公司 A kind of echo cancel method, device, system and storage medium
CN110246516A (en) * 2019-07-25 2019-09-17 福建师范大学福清分校 The processing method of small space echo signal in a kind of voice communication
CN110246516B (en) * 2019-07-25 2022-06-17 福建师范大学福清分校 Method for processing small space echo signal in voice communication
CN110971769B (en) * 2019-11-19 2022-05-03 百度在线网络技术(北京)有限公司 Call signal processing method and device, electronic equipment and storage medium
CN110971769A (en) * 2019-11-19 2020-04-07 百度在线网络技术(北京)有限公司 Call signal processing method and device, electronic equipment and storage medium
CN110995951A (en) * 2019-12-13 2020-04-10 展讯通信(上海)有限公司 Echo cancellation method, device and system based on double-end sounding detection
CN110995951B (en) * 2019-12-13 2021-09-03 展讯通信(上海)有限公司 Echo cancellation method, device and system based on double-end sounding detection
CN110992975A (en) * 2019-12-24 2020-04-10 大众问问(北京)信息科技有限公司 Voice signal processing method and device and terminal
CN111355855B (en) * 2020-03-12 2021-06-15 紫光展锐(重庆)科技有限公司 Echo processing method, device, equipment and storage medium
CN111355855A (en) * 2020-03-12 2020-06-30 紫光展锐(重庆)科技有限公司 Echo processing method, device, equipment and storage medium
CN113763975A (en) * 2020-06-05 2021-12-07 大众问问(北京)信息科技有限公司 Voice signal processing method and device and terminal
CN113763975B (en) * 2020-06-05 2023-08-29 大众问问(北京)信息科技有限公司 Voice signal processing method, device and terminal
CN111933164B (en) * 2020-06-29 2022-10-25 北京百度网讯科技有限公司 Training method and device of voice processing model, electronic equipment and storage medium
CN111933164A (en) * 2020-06-29 2020-11-13 北京百度网讯科技有限公司 Training method and device of voice processing model, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN102065190B (en) 2013-08-28

Similar Documents

Publication Publication Date Title
CN102065190B (en) Method and device for eliminating echo
US9343056B1 (en) Wind noise detection and suppression
CN101719969B (en) Method and system for judging double-end conversation and method and system for eliminating echo
US8010355B2 (en) Low complexity noise reduction method
EP1298815B1 (en) Echo processor generating pseudo background noise with high naturalness
EP2905778B1 (en) Echo cancellation method and device
US6792107B2 (en) Double-talk detector suitable for a telephone-enabled PC
CN101778183B (en) Method and device for suppressing residual echo
US7856097B2 (en) Echo canceling apparatus, telephone set using the same, and echo canceling method
US7203308B2 (en) Echo canceller ensuring further reduction in residual echo
EP1978649A2 (en) Spectral Domain, Non-Linear Echo Cancellation Method in a Hands-Free Device
US9685172B2 (en) Method and device for suppressing residual echoes based on inverse transmitter receiver distance and delay for speech signals directly incident on a transmitter array
US20080031467A1 (en) Echo reduction system
US8116448B2 (en) Acoustic echo canceler
CN101958122B (en) Method and device for eliminating echo
US9343073B1 (en) Robust noise suppression system in adverse echo conditions
CN101917527A (en) Method and device of echo elimination
CN109273019B (en) Method for double-talk detection for echo suppression and echo suppression
CN110956975B (en) Echo cancellation method and device
US8964967B2 (en) Subband domain echo masking for improved duplexity of spectral domain echo suppressors
CN106033673A (en) Near-end speech signal detecting method and near-end speech signal detecting device
US8369511B2 (en) Robust method of echo suppressor
Yang Multilayer adaptation based complex echo cancellation and voice enhancement
CN111917926B (en) Echo cancellation method and device in communication terminal and terminal equipment
US7711107B1 (en) Perceptual masking of residual echo

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 310052 Binjiang District Changhe Road, Zhejiang, China, No. 466, No.

Patentee after: New H3C Technologies Co., Ltd.

Address before: 310053 Hangzhou hi tech Industrial Development Zone, Zhejiang province science and Technology Industrial Park, No. 310 and No. six road, HUAWEI, Hangzhou production base

Patentee before: Hangzhou H3C Technologies Co., Ltd.
