CN111246037B - Echo cancellation method, device, terminal equipment and medium - Google Patents

Info

Publication number
CN111246037B
CN111246037B CN202010183666.0A CN202010183666A CN111246037B CN 111246037 B CN111246037 B CN 111246037B CN 202010183666 A CN202010183666 A CN 202010183666A CN 111246037 B CN111246037 B CN 111246037B
Authority
CN
China
Prior art keywords
signal
filter
far
processing
echo
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010183666.0A
Other languages
Chinese (zh)
Other versions
CN111246037A (en)
Inventor
吴威麒
肖波
许一峰
陈满砚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN202010183666.0A
Publication of CN111246037A
Application granted
Publication of CN111246037B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M9/00 Arrangements for interconnection not involving centralised switching
    • H04M9/08 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/085 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using digital techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26 Pre-filtering or post-filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech

Abstract

The disclosure provides an echo cancellation method, an echo cancellation device, a terminal device and a medium. The method comprises the following steps: acquiring a far-end signal; processing the far-end signal with a step-size variable adaptive filter to obtain an echo signal, wherein the step-size variable adaptive filter is an adaptive filter whose learning factor (step size) is variable when processing each frame of the far-end signal; determining a residual spectrum signal according to a microphone signal and the echo signal; and performing nonlinear processing on the residual spectrum signal to obtain an output signal, thereby completing echo cancellation. With this method, the step-size variable adaptive filter effectively avoids echo leakage, and the nonlinear processing further eliminates the echo.

Description

Echo cancellation method, device, terminal equipment and medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to an echo cancellation method, apparatus, terminal device, and medium.
Background
An adaptive filter is a filter that uses an adaptive algorithm to change its parameters and structure as the environment changes. In general, the structure of the adaptive filter stays fixed, while its coefficients are time-varying and are updated by the adaptive algorithm; that is, the coefficients are continuously and automatically adapted to the given signal to obtain the desired response. The most important feature of the adaptive filter is that it can operate effectively in an unknown environment and can track the time-varying characteristics of the input signal.
Generally, the learning factor of a conventional linear adaptive filter cannot be adjusted quickly when the echo path changes or when double talk occurs, and its convergence is relatively slow, so echo leakage occurs easily.
Disclosure of Invention
The present disclosure provides an echo cancellation method, apparatus, terminal device and medium, which effectively avoid the problem of echo leakage.
In a first aspect, an embodiment of the present disclosure provides an echo cancellation method, including:
acquiring a far-end signal;
processing the far-end signal by a step-length variable adaptive filter to obtain an echo signal, wherein the step-length variable adaptive filter is an adaptive filter with a variable learning factor step length when processing each frame of the far-end signal;
determining a residual spectrum signal according to a microphone signal and the echo signal;
and carrying out nonlinear processing on the residual spectrum signal to obtain an output signal so as to complete echo cancellation.
In a second aspect, an embodiment of the present disclosure further provides an echo cancellation device, including:
the acquisition module is used for acquiring a far-end signal;
the first processing module is used for processing the far-end signal by a step-length variable adaptive filter to obtain an echo signal, wherein the step-length variable adaptive filter is an adaptive filter with a variable learning factor step length when processing each frame of the far-end signal;
a determining module for determining a residual spectrum signal according to a microphone signal and the echo signal;
and the second processing module is used for carrying out nonlinear processing on the residual spectrum signal to obtain an output signal so as to complete echo cancellation.
In a third aspect, an embodiment of the present disclosure further provides a terminal device, including:
one or more processing devices;
storage means for storing one or more programs;
the one or more programs are executed by the one or more processing devices, so that the one or more processing devices implement the echo cancellation method provided by the embodiment of the disclosure.
In a fourth aspect, the disclosed embodiments also provide a computer-readable medium, on which a computer program is stored, where the computer program, when executed by a processing device, implements the echo cancellation method provided by the disclosed embodiments.
The embodiments of the disclosure provide an echo cancellation method, an echo cancellation device, a terminal device and a medium. A far-end signal is first acquired; the far-end signal is then processed by a step-size variable adaptive filter to obtain an echo signal, the step-size variable adaptive filter being an adaptive filter whose learning factor (step size) is variable when processing each frame of the far-end signal; a residual spectrum signal is then determined according to the microphone signal and the echo signal; and finally nonlinear processing is performed on the residual spectrum signal to obtain an output signal, thereby completing echo cancellation. With this method, the step-size variable adaptive filter effectively avoids echo leakage, and the nonlinear processing further eliminates the echo.
Drawings
Fig. 1 is a schematic flowchart of an echo cancellation method according to a first embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of an echo cancellation method according to a second embodiment of the present disclosure;
Fig. 2a is a schematic structural diagram of the echo cancellation method according to the second embodiment of the present disclosure;
Fig. 2b is a schematic diagram of a far-end signal according to the second embodiment of the present disclosure;
Fig. 2c is a schematic diagram of a near-end signal according to the second embodiment of the present disclosure;
Fig. 2d is a schematic diagram of an output signal according to the second embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of an echo cancellation device according to a third embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of a terminal device according to a fourth embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment".
It should be noted that the modifiers "a", "an", and "the" in this disclosure are illustrative rather than limiting, and those skilled in the art will understand that they mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
In the following embodiments, optional features and examples are provided in each embodiment, and various features described in the embodiments may be combined to form a plurality of alternatives, and each numbered embodiment should not be regarded as only one technical solution. Furthermore, the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
Embodiment one
Fig. 1 is a schematic flowchart of an echo cancellation method according to a first embodiment of the present disclosure. The method is applicable to solving the problem of echo leakage. The method may be performed by an echo cancellation device, which may be implemented in software and/or hardware and is generally integrated in a terminal device. In this embodiment, terminal devices include but are not limited to mobile phones, computers, personal digital assistants, and the like.
The echo cancellation method of the present disclosure may be an echo processing method implemented at the software algorithm level, and may be packaged as an application program in a terminal device. It may be used to solve the problem of echo leakage in that terminal device, and may also solve the problem of echo leakage during communication with other terminal devices.
As shown in fig. 1, a method for echo cancellation according to a first embodiment of the present disclosure includes the following steps:
S110: acquiring a far-end signal.
The far-end signal may be a signal collected by a far-end microphone. The manner in which the far-end signal is acquired is not limited herein. Once acquired, the far-end signal can be used to estimate an echo signal, so that echo cancellation can be performed on the microphone signal collected by the near-end microphone.
When speaker A talks locally, the voice is audio-preprocessed, encoded, packetized and sent to speaker B. When B's device plays it through a loudspeaker, the voice of speaker A is recorded again, encoded, packetized and sent back to speaker A, so that speaker A hears his or her own echo, which seriously interferes with the conversation. The echo cancellation method may be integrated on the terminal device used by speaker A. The microphone on that terminal device may be the near-end microphone. The microphone on the terminal device used by speaker B may be the far-end microphone.
S120: processing the far-end signal with a step-size variable adaptive filter to obtain an echo signal.
The step-size variable adaptive filter is an adaptive filter whose learning factor (step size) is variable when processing each frame of the far-end signal. To solve the echo leakage problem of the linear adaptive filter, the echo signal is determined by the step-size variable adaptive filter.
The step-size variable adaptive filter may be a linear filter with a variable learning-factor step size, a Kalman filter with a variable learning-factor step size, or the like. The echo signal may be regarded as an echo estimate derived from the far-end signal.
After the far-end signal is obtained, this step processes it through the step-size variable adaptive filter to obtain the echo signal. Specifically, the filter may be divided into at least two sub-filter blocks, each of which is given its own segment of the far-end signal, and the processing is carried out in the frequency domain.
Specifically, the processing of the far-end signal by the step-size variable adaptive filter to obtain the echo signal includes:
determining a voice signal of each sub-filter block according to the far-end signal;
carrying out Fourier transform on each voice signal to obtain a corresponding frequency domain signal;
and multiplying the frequency domain signal of each sub-filter block by its filter coefficient, accumulating the products, and performing an inverse Fourier transform to obtain the echo signal.
The step-size variable adaptive filter may first determine the speech signal corresponding to each sub-filter when processing the far-end signal.
Illustratively, the speech of the n-th frame for the p-th sub-filter block is represented as:
$x_p(n) = [x(nR - pL - M + 1), \ldots, x(nR - pL)]^T$, where $p = 0, 1, 2, \ldots, P-1$, $L$ is the length of each sub-filter block, $R$ is the speech frame shift, $M$ is the number of sample points by which adjacent frames overlap, $n$, $L$, $R$ and $M$ are positive numbers, $P$ is an integer, and $x(n)$ represents the far-end signal.
The frequency domain signal corresponding to the speech signal can be represented as $X_p(n,k) = \mathrm{FFT}(x_p(n))$, where $X_p(n,k)$ may be the frequency domain signal at the k-th frequency point of the n-th frame.
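The block extraction and transform described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the simplified case L = M = R, takes each block to be the R + M samples ending at x(nR - pL) (the start index is an assumption made so that the frame length matches R + M), treats samples outside the signal as zero, and uses NumPy. The function name `block_spectrum` and the toy signal are hypothetical.

```python
import numpy as np

def block_spectrum(x, n, p, L):
    """Return X_p(n, k) for sub-filter block p at frame n, assuming L = M = R."""
    R = M = L
    end = n * R - p * L            # newest sample of the block, x(nR - pL)
    start = end - (R + M) + 1      # assumed frame length R + M
    idx = np.arange(start, end + 1)
    valid = (idx >= 0) & (idx < len(x))
    frame = np.zeros(R + M)
    frame[valid] = x[idx[valid]]   # samples outside the signal stay zero
    return np.fft.fft(frame)

x = np.arange(32, dtype=float)     # toy far-end signal (hypothetical data)
X0 = block_spectrum(x, n=3, p=0, L=4)   # newest block
X1 = block_spectrum(x, n=3, p=1, L=4)   # block delayed by L samples
```

Each increase of p delays the block by L samples, which is how the P sub-filter blocks together cover a long echo path.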
After the frequency domain signal of each sub-filter is determined, each sub-filter may be multiplied by the corresponding filter coefficient and then accumulated to perform inverse fourier transform to obtain an echo signal. For example, the last L elements of the signal obtained after the inverse fourier transform are used as echo signals.
The echo signal may be the last L elements of
$\mathrm{IFFT}\left(\sum_{p=0}^{P-1} W_p(n,k)\,X_p(n,k)\right)$,
where $W_p(n,k)$ is the filter coefficient of the p-th sub-filter block and $X_p(n,k)$ is the frequency domain signal of the speech signal corresponding to the p-th sub-filter block. The echo signal is denoted $\hat{y}(n)$.
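A minimal sketch of this multiply, accumulate and inverse-transform step is given below, assuming NumPy, real-valued signals, and arrays of shape (P, 2L) holding the coefficients and block spectra. The function name `estimate_echo`, the toy impulse response `h` and the toy frame are hypothetical.

```python
import numpy as np

def estimate_echo(W, X, L):
    """W, X: arrays of shape (P, 2L). Returns the time-domain echo estimate."""
    Y_hat = np.sum(W * X, axis=0)        # accumulate W_p(n,k) * X_p(n,k) over p
    y_full = np.fft.ifft(Y_hat).real     # back to the time domain
    return y_full[-L:]                   # keep only the last L elements

# Usage with a single block (P = 1) and a known two-tap echo path.
L = 4
h = np.array([1.0, 0.5, 0.0, 0.0])                       # toy echo path
W = np.fft.fft(np.concatenate([h, np.zeros(L)]))[None, :]
X = np.fft.fft(np.arange(1.0, 9.0))[None, :]             # frame holding 1..8
echo = estimate_echo(W, X, L)
```

Keeping only the last L elements discards the part of the circular convolution that wraps around, so the retained samples equal the linear convolution of the newest L input samples with the filter.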
S130: determining a residual spectrum signal according to the microphone signal and the echo signal.
The microphone signal may be regarded as the signal picked up by the near-end microphone. It may be the sum of the near-end speech signal, the local noise signal and the actual echo signal. The residual signal may be regarded as the microphone signal after the echo signal has been removed, and the residual spectrum signal as the Fourier transform of the residual signal.
After the echo signal is determined, this step takes the frequency domain representation of the microphone signal with the echo signal removed as the residual spectrum signal.
S140: performing nonlinear processing on the residual spectrum signal to obtain an output signal, thereby completing echo cancellation.
After the residual spectrum signal is determined, the residual spectrum signal may be subjected to nonlinear processing in this step to obtain an output signal from which the echo is removed. The output signal may be transmitted to a remote end.
The non-linear processing may be post-non-linear filtering to suppress non-linear noise in the residual spectrum signal. For example, the residual spectrum signal may be subjected to a residual echo noise reduction process and/or a learning factor-based nonlinear process to obtain an output signal.
The non-linear processing based on the learning factor may be to determine a non-linear factor based on the learning factor, so as to perform non-linear processing on the residual spectrum signal based on the non-linear factor.
In the echo cancellation method provided in the first embodiment of the present disclosure, a far-end signal is first acquired; the far-end signal is then processed by a step-size variable adaptive filter to obtain an echo signal, the step-size variable adaptive filter being an adaptive filter whose learning factor (step size) is variable when processing each frame of the far-end signal; a residual spectrum signal is then determined according to the microphone signal and the echo signal; and finally nonlinear processing is performed on the residual spectrum signal to obtain an output signal, thereby completing echo cancellation. With this method, the step-size variable adaptive filter effectively avoids echo leakage, and the nonlinear processing further eliminates the echo.
On the basis of the above-described embodiment, a modified embodiment of the above-described embodiment is proposed, and it is to be noted herein that, in order to make the description brief, only the differences from the above-described embodiment are described in the modified embodiment.
In one embodiment, the learning factor of the step-size variable adaptive filter is proportional to the filter coefficients. For example, the learning factor changes with the filter coefficients so that the filter weights are adjusted adaptively and dynamically, making the filter both fast and stable. The filter coefficients may be proportional to the scale factor and thus to the learning factor.
In one embodiment, the step-size variable adaptive filter is a linear filter with a step-size variable learning factor, and the step-size variable adaptive filter comprises at least two sub-filter blocks.
For example, assume that the linear filter has a total length of order N and is divided into P sub-filter blocks, each of length L, so that L = N/P. Correspondingly, the speech frame shift is R, adjacent frames overlap by M sample points, and the frame length is R + M; this is simplified here to L = M = R. P is a positive integer greater than or equal to 2. N, L, R and M are positive numbers, which are not limited herein and can be set by those skilled in the art according to the actual situation. The step size of the learning factor of the linear filter is variable.
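The arithmetic of this partitioning can be checked with a short sketch. The function name `partition` and the example values N = 1024 and P = 8 are hypothetical and are not taken from the disclosure; the check that N is divisible by P is an added assumption so that L is an integer.

```python
def partition(N, P):
    """Derive block length, frame shift, overlap and frame length from N and P."""
    if P < 2:
        raise ValueError("P must be a positive integer greater than or equal to 2")
    if N % P != 0:
        raise ValueError("N is assumed to be divisible by P")
    L = N // P          # length of each sub-filter block, L = N / P
    R = M = L           # simplified case L = M = R
    return {"L": L, "R": R, "M": M, "frame_len": R + M}

cfg = partition(N=1024, P=8)   # hypothetical filter length and block count
```

With these example values each sub-filter block has 128 taps and each frame holds 256 samples.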
In one embodiment, a learning factor of the step-size variable adaptive filter is determined based on the echo signal, the residual spectrum signal, and filter coefficients.
The filter coefficients may be the filter coefficients of each sub-filter block.
Illustratively, the learning factor of the step-size variable adaptive filter for each frame of the far-end signal can be determined by the following formula:
Figure BDA0002413419990000081
where $\mu(n,k)$ is the learning factor of the step-size variable adaptive filter when processing the far-end signal at the k-th frequency point of the n-th frame,
Figure BDA0002413419990000082
is the leakage factor, $W_p(n,k)$ is the filter coefficient of the p-th sub-filter block,
Figure BDA0002413419990000083
is the frequency domain signal of the echo signal, that is,
Figure BDA0002413419990000084
yields the echo signal after an inverse Fourier transform, and
Figure BDA0002413419990000085
is the residual spectrum signal.
The leakage factor may be determined based on a frequency domain signal of the echo signal and a residual spectral signal. When determining the leakage factor corresponding to the far-end signal of the current frame, the determination may be based on the leakage factor corresponding to the far-end signal of the previous frame.
In one embodiment, the filter coefficients of each sub-filter block in the step-size variable adaptive filter when processing the far-end signal of the current frame are determined according to the corresponding speech signal when processing the far-end signal of the previous frame and the filter coefficients, the learning factor and the residual spectrum signal when processing the far-end signal of the previous frame.
The filter coefficients may be different for each sub-filter when processing the far-end signal for different frames.
In one embodiment, the filter coefficients of each sub-filter may be determined by the following equation:
$W_p(n+1,k) = W_p(n,k) + \mu(n,k)\,\mathrm{conj}(X_p(n,k))\,E(n,k)$;
where $W_p(n+1,k)$ is the filter coefficient used by the p-th sub-filter block for the current frame of the far-end signal, i.e. for the k-th frequency point of frame n+1; $W_p(n,k)$ is the filter coefficient used by the p-th sub-filter block for the previous frame, i.e. for the k-th frequency point of frame n; $\mu(n,k)$ is the learning factor for the previous frame; $X_p(n,k)$ is the speech signal of the p-th sub-filter block for the previous frame, i.e. for the k-th frequency point of frame n; and $E(n,k)$ is the residual spectrum signal for the previous frame, i.e. for the k-th frequency point of frame n. $\mathrm{conj}(\cdot)$ denotes the conjugate operation.
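The coefficient update can be sketched per frequency point as follows. This is a minimal illustration assuming NumPy, with `W` and `X` stored as arrays of shape (P, K) and `E` and `mu` as arrays of length K; the function name `update_coefficients` and the toy values are hypothetical, and the learning factor is simply passed in rather than computed.

```python
import numpy as np

def update_coefficients(W, X, E, mu):
    """Apply W_p(n+1,k) = W_p(n,k) + mu(n,k) * conj(X_p(n,k)) * E(n,k) to all blocks."""
    # mu and E have shape (K,) and broadcast over the P rows of W and X.
    return W + mu * np.conj(X) * E

W = np.zeros((2, 3), dtype=complex)                 # P = 2 blocks, K = 3 bins
X = np.array([[1 + 1j, 2, 0], [1j, 1, 1]])          # toy block spectra
E = np.array([1, 1j, 2])                            # toy residual spectrum
mu = np.array([0.5, 0.5, 1.0])                      # toy learning factors
W_next = update_coefficients(W, X, E, mu)
```

Because the learning factor is indexed by both frame and frequency point, each bin can adapt at its own rate while all P blocks share the same residual spectrum.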
In one embodiment of the present invention,
Figure BDA0002413419990000091
where 0 is a 0 vector of M rows and 1 column.
Embodiment two
Fig. 2 is a schematic flowchart of an echo cancellation method according to a second embodiment of the present disclosure, which elaborates on the above embodiments. In this embodiment, determining a residual spectrum signal according to the microphone signal and the echo signal specifically includes:
subtracting the echo signal from the microphone signal to obtain a residual signal;
and performing a Fourier transform on the residual signal to obtain the residual spectrum signal.
For details not described in this embodiment, please refer to the above embodiments.
As shown in fig. 2, the echo cancellation method provided in the second embodiment of the present disclosure includes the following steps:
S210: acquiring a far-end signal.
S220: processing the far-end signal with a step-size variable adaptive filter to obtain an echo signal.
S230: subtracting the echo signal from the microphone signal to obtain a residual signal.
In determining the residual spectrum signal, the present embodiment may first determine the residual signal. This step may determine a difference of the microphone signal and the echo signal as a residual signal. For example, the residual signal is represented as:
$e(n) = d(n) - \hat{y}(n)$,
where $e(n)$ is the residual signal, $d(n)$ is the microphone signal, and $\hat{y}(n)$ is the echo signal.
S240: performing a Fourier transform on the residual signal to obtain the residual spectrum signal.
The residual spectrum signal can be expressed as:
$E(n,k) = \mathrm{FFT}\left([\mathbf{0}^T, e(n)^T]^T\right)$,
where $E(n,k)$ is the residual spectrum signal and $\mathbf{0}$ is a zero vector of M rows and 1 column. That is, this step may perform a Fourier transform on the vector formed by the zero vector and the residual signal to obtain the residual spectrum signal.
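Steps S230 and S240 can be sketched together as below, assuming NumPy. Placing the M zeros in front of the residual is an assumption (the usual overlap-save convention); the text only states that the transformed vector is formed from the residual signal and a zero vector of M rows. The function name `residual_spectrum` and the toy values are hypothetical.

```python
import numpy as np

def residual_spectrum(d, y_hat, M):
    """Return the residual e(n) and its zero-padded spectrum E(n, k)."""
    e = d - y_hat                                   # S230: remove the echo estimate
    padded = np.concatenate([np.zeros(M), e])       # zero vector first (assumed order)
    return e, np.fft.fft(padded)                    # S240: residual spectrum

d = np.array([1.0, 2.0, 3.0, 4.0])       # toy microphone block
y_hat = np.array([0.5, 1.0, 1.0, 2.0])   # toy echo estimate
e, E = residual_spectrum(d, y_hat, M=4)
```

The padding makes the residual spectrum the same length as the block spectra, so it can be multiplied bin by bin in the coefficient update.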
S250: performing nonlinear processing on the residual spectrum signal to obtain an output signal.
The following describes an exemplary echo cancellation method provided by the present disclosure:
To solve the technical problem of echo leakage, the echo cancellation method of the present disclosure introduces a sub-band-based equal scale factor that adjusts the filter weights adaptively and dynamically, so that the filter is both fast and stable. In addition, a nonlinear suppression factor based on the learning factor is introduced; it accurately suppresses echo components while protecting near-end voice components. Finally, a round of residual echo noise reduction eliminates the echo completely.
Fig. 2a is a schematic structural diagram of the echo cancellation method provided in the second embodiment of the present disclosure. Referring to fig. 2a, the method includes two parts: a step-size variable linear filter, which reduces damage to near-end voice; and nonlinear processing based on the learning rate together with residual echo noise reduction, which eliminate the residual echo completely.
The far-end signal $x(n)$ passes through the step-size variable adaptive filter $h(n)$ to give the echo estimate $\hat{y}(n)$.
The microphone signal is $d(n) = y(n) + s(n) + v(n)$, where $s(n)$ is the near-end speech, $v(n)$ is the local noise, and $y(n)$ is the actual echo. The residual signal is $e(n) = d(n) - \hat{y}(n)$.
The residual signal passes through a nonlinear processing module to obtain the output signal out(n), wherein the nonlinear processing module comprises a learning-factor-based nonlinear processing module and a residual echo noise reduction processing module.
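The signal model can be illustrated with a few lines of plain Python. The sample values are hypothetical; the sketch only shows that when the echo estimate equals the actual echo, the residual reduces to near-end speech plus local noise, and that any estimation error is what leaks through to the nonlinear stage.

```python
def microphone_signal(y, s, v):
    """d(n) = y(n) + s(n) + v(n): actual echo + near-end speech + local noise."""
    return [yi + si + vi for yi, si, vi in zip(y, s, v)]

def residual(d, y_hat):
    """e(n) = d(n) - y_hat(n): microphone signal minus the echo estimate."""
    return [di - yh for di, yh in zip(d, y_hat)]

y = [0.5, -0.25, 1.0]      # actual echo (toy values)
s = [1.0, 2.0, -1.0]       # near-end speech (toy values)
v = [0.125, 0.0, -0.125]   # local noise (toy values)

d = microphone_signal(y, s, v)
e_perfect = residual(d, y)                    # ideal estimate: only s + v remains
e_leaky = residual(d, [0.0, 0.0, 0.0])        # no estimate: the whole echo leaks
```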
In this disclosure, the adaptive filter coefficients are represented as $W_p(n,k)$, where $p = 0, 1, 2, \ldots, P-1$.
The frequency domain and time domain echo signals estimated by the adaptive filter are represented as:
$\hat{Y}(n,k) = \sum_{p=0}^{P-1} W_p(n,k)\,X_p(n,k)$;
$\hat{y}(n)$ = the last L elements of $\mathrm{IFFT}(\hat{Y}(n,k))$;
the residual signal is $e(n) = d(n) - \hat{y}(n)$.
An FFT of the residual signal gives the residual spectrum, namely the residual spectrum signal:
$E(n,k) = \mathrm{FFT}\left([\mathbf{0}^T, e(n)^T]^T\right)$,
where $\mathbf{0}$ is a zero vector of M rows and 1 column.
Within the same sub-band, the P sub-filter blocks converge stably and consistently, whereas different sub-bands converge differently. On this basis, a sub-band-based equal scale factor is proposed: across sub-bands, filter coefficients with larger weights are strengthened and those with smaller weights are reduced, which accelerates convergence. At the same time, because the proportion is determined from the combined behavior of the P sub-filter blocks in each sub-band, unstable jitter between different blocks of the same sub-band can be eliminated and filter divergence is reduced.
The specific calculation method is as follows:
calculation of the equal scale factor:
Figure BDA0002413419990000117
the variable step learning rate factor (i.e., learning factor) is as follows:
Figure BDA0002413419990000121
the equal scale factor is proportional to the filter coefficient, and the learning factor is proportional to the filter coefficient.
The leakage factor is calculated as follows:
Figure BDA0002413419990000122
Figure BDA0002413419990000123
$S_{EY}(n,k) = \alpha(n)\,S_{EY}(n-1,k) + (1-\alpha(n))\,S_E(n,k)\,\mathrm{conj}(S_Y(n,k))$;
$S_{YY}(n,k) = \alpha(n)\,S_{YY}(n-1,k) + (1-\alpha(n))\,S_Y(n,k)\,\mathrm{conj}(S_Y(n,k))$.
the estimated power spectral density of the echo signal is approximately expressed as:
Figure BDA0002413419990000124
where real (.) represents taking the real part. conj (.) represents the conjugate operation of the matrix.
The power spectral density of the linear residual is approximately expressed as:
$S_E(n,k) = \alpha(n)\,S_E(n-1,k) + (1-\alpha(n))\,\mathrm{real}(E(n,k)\,\mathrm{conj}(E(n,k)))$.
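The recursive smoothing used for the power spectral density of the linear residual can be sketched in plain Python. The smoothing coefficient is passed in per call because the text writes it as a function of the frame index; the value 0.5 and the toy spectrum below are hypothetical, and the function name `smooth_psd` is an illustration only.

```python
def smooth_psd(S_prev, E, alpha):
    """One smoothing step: S(n,k) = alpha * S(n-1,k) + (1 - alpha) * real(E * conj(E))."""
    return [alpha * s + (1.0 - alpha) * (e * e.conjugate()).real
            for s, e in zip(S_prev, E)]

S_E = [1.0, 0.0]          # previous estimate per frequency point (toy values)
E = [3 + 4j, 1j]          # current residual spectrum (toy values)
S_E = smooth_psd(S_E, E, alpha=0.5)
```

A value of alpha close to 1 gives a slowly varying, stable estimate, while a smaller alpha tracks changes in the residual more quickly.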
The filter coefficient weights are updated as follows:
$W_p(n+1,k) = W_p(n,k) + \mu(n,k)\,\mathrm{conj}(X_p(n,k))\,E(n,k)$;
Figure BDA0002413419990000125
where 0 denotes a zero vector, for example a zero vector of M rows and 1 column.
Fig. 2b is a schematic diagram of a far-end signal according to the second embodiment of the disclosure; fig. 2c is a schematic diagram of a near-end signal according to the second embodiment of the disclosure; fig. 2d is a schematic diagram of an output signal according to the second embodiment of the disclosure. Referring to figs. 2b-2d, after echo cancellation is performed on the basis of the far-end signal and the near-end signal, the resulting output signal is effectively noise-free.
The echo cancellation method provided in the second embodiment of the present disclosure embodies the operation of determining the residual spectrum signal. By using the method, the technical problem of echo leakage can be effectively solved.
On the basis of the above-described embodiment, a modified embodiment of the above-described embodiment is proposed, and it is to be noted herein that, in order to make the description brief, only the differences from the above-described embodiment are described in the modified embodiment.
In one embodiment, performing nonlinear processing on the residual spectrum signal to obtain an output signal specifically includes:
performing nonlinear processing on the residual spectrum signal based on the learning factor to obtain the output signal.
The larger the learning factor, the larger the estimated echo intensity, that is, the higher the probability that echo is present at that frequency point. A nonlinear processing factor based on the learning factor can therefore effectively distinguish echo frequency points from near-end speech frequency points, so that echo frequency points can be suppressed in a targeted manner and near-end speech is effectively protected, especially in a double-talk state.
In one embodiment, the subjecting the residual spectrum signal to a non-linear processing based on the learning factor to obtain an output signal includes:
determining a nonlinear factor according to the learning factor, the number of sub-filter blocks included in the step-size variable adaptive filter and a corresponding voice signal when the step-size variable adaptive filter processes the far-end signal;
determining a product of the non-linearity factor and the residual spectral signal as an output signal.
When the residual spectrum signal is subjected to the non-linear processing based on the learning factor, the output signal may be obtained by multiplying the residual spectrum signal by the non-linear factor.
The voice signal corresponding to the processing of the far-end signal by the step-size variable adaptive filter can be regarded as the voice signal corresponding to the processing of the far-end signal by each sub-filter included in the step-size variable adaptive filter.
The corresponding speech signal when the step size variable adaptive filter processes the far-end signal can be used to determine the energy of the far-end signal in all sub-filter blocks.
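Two ingredients named above can be sketched in Python: the far-end energy summed over all sub-filter blocks, and the per-frequency-point product of the nonlinear factor with the residual spectrum. How the learning factor, the block count, and this energy combine into the nonlinear factor is given only as an equation image in the source, so the factor is taken as an input here; the function names are hypothetical.

```python
def far_end_energy_per_bin(far_end_spectra):
    """Sum |X_p(n,k)|^2 over all sub-filter blocks p, for each frequency bin k.

    far_end_spectra -- far_end_spectra[p][k] = X_p(n,k), complex
    """
    num_bins = len(far_end_spectra[0])
    return [
        sum((x_p[k] * x_p[k].conjugate()).real for x_p in far_end_spectra)
        for k in range(num_bins)
    ]


def apply_nonlinear_factor(nonlinear_factor, residual_spectrum):
    """Return H(n,k) * E(n,k) for every frequency bin k.

    nonlinear_factor  -- H(n,k), assumed already computed, one value per bin
    residual_spectrum -- E(n,k), one complex value per bin
    """
    return [h * e for h, e in zip(nonlinear_factor, residual_spectrum)]
```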
In one embodiment, the non-linearity factor is determined by the following equation:
Figure BDA0002413419990000141
Xp(n,k)=FFT(xp(n));
wherein H(n, k) is the nonlinear factor, P is the number of sub-filter blocks included in the step-size variable adaptive filter, xp(n) is the nth frame speech signal of the p-th sub-filter block, and T represents the transpose operation of the matrix.
In one embodiment, the method further includes:
performing residual echo noise reduction processing on the output signal to obtain a noise-reduced output signal.
Estimating residual echo:
Figure BDA0002413419990000142
Noise is estimated using minimum tracking as follows.
The smoothed spectrum of the microphone signal is expressed as:
D(n,k)=FFT(d(n));
Dsmooth(n,k)=0.85Dsmooth(n-1,k)+0.15|D(n,k)|^2;
N(n,k)=min(Dsmooth(n,k),Dsmooth(n-1,k),...,Dsmooth(n-win_size,k));
Here win_size may equal the number of time windows, which can be set according to the actual situation and is not limited here. For example, with win_size = 80 and a single time window length of 10 ms, the tracking span corresponds to 800 ms.
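The minimum-tracking estimate can be sketched in Python with a bounded history per frame. The class name `MinimumTrackingNoiseEstimator`, the zero initial value of the smoothed spectrum, and the choice to take the minimum only over frames seen so far are assumptions made for this illustration; the smoothing constants 0.85 and 0.15 and the example win_size of 80 come from the text.

```python
from collections import deque


class MinimumTrackingNoiseEstimator:
    """Track N(n,k) as the minimum of the smoothed microphone power spectrum
    over the current frame and the previous win_size frames."""

    def __init__(self, num_bins, win_size=80, smoothing=0.85):
        self.smoothing = smoothing
        # Dsmooth(n-1,k); starting from zero is an assumption of this sketch.
        self.smoothed = [0.0] * num_bins
        # Holds Dsmooth(n,k) ... Dsmooth(n-win_size,k): win_size + 1 frames.
        self.history = deque(maxlen=win_size + 1)

    def update(self, mic_spectrum):
        """mic_spectrum is D(n,k), one complex value per bin; returns N(n,k)."""
        a = self.smoothing
        self.smoothed = [
            a * s + (1.0 - a) * abs(d) ** 2
            for s, d in zip(self.smoothed, mic_spectrum)
        ]
        self.history.append(self.smoothed)
        return [
            min(frame[k] for frame in self.history)
            for k in range(len(self.smoothed))
        ]
```

With the default `win_size=80` and 10 ms frames, the history spans roughly the 800 ms mentioned in the text.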
The residual echo is processed as noise, so the total noise estimate is:
Total(n,k)=min(Res(n,k)+N(n,k),|E(n,k)|^2).
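The total-noise formula caps the sum of residual echo and tracked noise at the power of the residual spectrum itself, so the estimate never exceeds what is actually present at that frequency point. A minimal Python sketch, with the hypothetical name `total_noise_estimate` and with Res(n,k) and N(n,k) taken as already computed inputs:

```python
def total_noise_estimate(residual_echo, noise, residual_spectrum):
    """Total(n,k) = min(Res(n,k) + N(n,k), |E(n,k)|^2) per frequency bin.

    residual_echo     -- Res(n,k), one float per bin (assumed given)
    noise             -- N(n,k), one float per bin
    residual_spectrum -- E(n,k), one complex value per bin
    """
    return [
        min(res + n, abs(e) ** 2)
        for res, n, e in zip(residual_echo, noise, residual_spectrum)
    ]
```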
The Wiener filter estimate is computed as follows:
the posterior signal-to-noise ratio:
Figure BDA0002413419990000151
the decision-directed (DD) algorithm estimates the a priori signal-to-noise ratio:
Figure BDA0002413419990000152
calculating the final Wiener filter factor:
Figure BDA0002413419990000153
E2(n,k)=E1(n,k)Wiener(n,k);
where E1(n, k) is an output signal after nonlinear processing.
Obtaining the sample output of the nth frame: out(n) = IFFT(E2(n, k)).
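The three SNR and gain expressions above survive only as equation images, so the Python sketch below substitutes the common textbook forms rather than the patent's own formulas: posterior SNR as |E1|^2 divided by the total noise, a decision-directed a priori SNR with weight `beta`, and the gain ξ/(1+ξ). These forms, the function name `wiener_postfilter_frame`, the default `beta=0.98`, the `eps` guard, and the use of E1 in the posterior SNR are all assumptions; only the last step, E2(n,k)=E1(n,k)Wiener(n,k), is taken directly from the text.

```python
def wiener_postfilter_frame(e1, total_noise, prev_gain, prev_post_snr,
                            beta=0.98, eps=1e-12):
    """One frame of a textbook decision-directed Wiener post-filter (assumed form).

    e1            -- E1(n,k), spectrum after nonlinear processing, complex per bin
    total_noise   -- Total(n,k), float per bin
    prev_gain     -- Wiener(n-1,k), float per bin
    prev_post_snr -- posterior SNR of frame n-1, float per bin
    Returns (E2(n,k), Wiener(n,k), posterior SNR of frame n).
    """
    # Posterior SNR: observed power over estimated noise power (assumed form).
    post_snr = [abs(e) ** 2 / max(t, eps) for e, t in zip(e1, total_noise)]
    # Decision-directed a priori SNR: blend of the previous frame's clean-power
    # estimate and the current instantaneous estimate (assumed form).
    prior_snr = [
        beta * g_prev * g_prev * snr_prev + (1.0 - beta) * max(snr - 1.0, 0.0)
        for g_prev, snr_prev, snr in zip(prev_gain, prev_post_snr, post_snr)
    ]
    gain = [xi / (1.0 + xi) for xi in prior_snr]
    # E2(n,k) = E1(n,k) * Wiener(n,k), as stated in the text.
    e2 = [e * g for e, g in zip(e1, gain)]
    return e2, gain, post_snr
```

The returned gain and posterior SNR are meant to be fed back as `prev_gain` and `prev_post_snr` for the next frame.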
Example three
Fig. 3 is a schematic structural diagram of an echo cancellation device according to a third embodiment of the present disclosure. The device is applicable to solving the problem of echo leakage, may be implemented by software and/or hardware, and is generally integrated on a terminal device.
As shown in fig. 3, the apparatus includes: an acquisition module 31, a first processing module 32, a determination module 33 and a second processing module 34;
the acquiring module 31 is configured to acquire a far-end signal;
a first processing module 32, configured to process the far-end signal through a step-size variable adaptive filter to obtain an echo signal, where the step-size variable adaptive filter is an adaptive filter with a variable learning factor step size when processing each frame of the far-end signal;
a determining module 33, configured to determine a residual spectrum signal according to a microphone signal and the echo signal;
and a second processing module 34, configured to perform nonlinear processing on the residual spectrum signal to obtain an output signal, so as to complete echo cancellation.
In this embodiment, the apparatus first acquires the far-end signal through the acquisition module 31; secondly, the first processing module 32 processes the far-end signal through a step-size variable adaptive filter to obtain an echo signal, where the step-size variable adaptive filter is an adaptive filter with a variable learning factor step size when processing each frame of the far-end signal; then, the determining module 33 determines a residual spectrum signal according to the microphone signal and the echo signal; and finally, the second processing module 34 performs nonlinear processing on the residual spectrum signal to obtain an output signal, completing echo cancellation.
The embodiment provides an echo cancellation device, which effectively avoids the generation of a leakage echo phenomenon through a step-size variable adaptive filter. Furthermore, echo is effectively eliminated based on nonlinear processing.
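The four-module structure described above can be sketched as a small Python container that wires the modules together in order. This is a structural illustration only: the class name `EchoCancellationDevice`, the field names, and the callable signatures are hypothetical, and each module's actual processing is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EchoCancellationDevice:
    # Acquisition module 31: returns the far-end signal frame.
    acquire: Callable[[], Sequence[float]]
    # First processing module 32: far-end frame -> estimated echo frame.
    estimate_echo: Callable[[Sequence[float]], Sequence[float]]
    # Determining module 33: (microphone frame, echo frame) -> residual spectrum.
    residual_spectrum: Callable[[Sequence[float], Sequence[float]], Sequence[complex]]
    # Second processing module 34: residual spectrum -> output spectrum.
    nonlinear_process: Callable[[Sequence[complex]], Sequence[complex]]

    def process_frame(self, mic_frame):
        """Run one microphone frame through modules 31 to 34 in order."""
        far_end = self.acquire()
        echo = self.estimate_echo(far_end)
        residual = self.residual_spectrum(mic_frame, echo)
        return self.nonlinear_process(residual)
```

Passing the modules in as callables mirrors the statement that the device may be implemented by software and/or hardware: each stage can be swapped without changing the processing order.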
Further, a learning factor of the step-size variable adaptive filter is determined based on the echo signal, the residual spectrum signal, and filter coefficients.
Further, the step size variable adaptive filter has a learning factor proportional to the filter coefficient.
Further, the step-size variable adaptive filter is a linear filter with a variable learning factor step size, and the step-size variable adaptive filter comprises at least two sub-filter blocks.
Further, the filter coefficient when each sub-filter block in the step-size variable adaptive filter processes the far-end signal of the current frame is determined according to the corresponding speech signal when processing the far-end signal of the previous frame and the filter coefficient, the learning factor and the residual spectrum signal when processing the far-end signal of the previous frame.
Further, the filter coefficients of each sub-filter block are determined by the following formula:
Wp(n+1,k)=Wp(n,k)+μ(n,k)conj(Xp(n,k))E(n,k);
wherein Wp(n+1, k) is the filter coefficient when the p-th sub-filter block processes the far-end signal at the k-th frequency point of the (n+1)-th frame, Wp(n, k) is the filter coefficient when the p-th sub-filter block processes the far-end signal at the k-th frequency point of the n-th frame, μ(n, k) is the learning factor, E(n, k) is the residual spectrum signal corresponding to the far-end signal at the k-th frequency point of the n-th frame, Xp(n, k) is the speech signal corresponding to the far-end signal at the k-th frequency point of the n-th frame of the p-th sub-filter block, and conj(.) represents the conjugate operation of the matrix.
Further, the determining module 33 is specifically configured to:
extracting an echo signal from the microphone signal to obtain a residual signal;
and carrying out Fourier transform on the residual signal to obtain a residual spectrum signal.
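The two steps of the determining module can be sketched in Python as follows. The sketch reads "extracting" the echo signal as a sample-wise subtraction of the estimated echo from the microphone frame, which is an interpretation rather than something the text states outright. The names `dft` and `residual_spectrum` are hypothetical, and a naive transform stands in for an FFT routine; any framing or zero-padding the filter may require is left out.

```python
import cmath


def dft(frame):
    """Naive O(N^2) discrete Fourier transform; stands in for an FFT routine."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def residual_spectrum(mic_frame, echo_frame):
    """Subtract the estimated echo from the microphone frame, then transform.

    mic_frame  -- d(n), time-domain microphone samples of one frame
    echo_frame -- estimated echo samples of the same frame
    Returns E(n,k), the residual spectrum, one complex value per bin.
    """
    residual = [d - y for d, y in zip(mic_frame, echo_frame)]
    return dft(residual)
```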
Further, the second processing module 34 is specifically configured to:
and carrying out nonlinear processing on the residual spectrum signal based on the learning factor to obtain an output signal.
Further, the second processing module 34 performs a non-linear processing on the residual spectrum signal based on the learning factor to obtain an output signal, including:
determining a nonlinear factor according to the learning factor, the number of sub-filter blocks included in the step-size variable adaptive filter and a corresponding voice signal when the step-size variable adaptive filter processes the far-end signal;
determining a product of the non-linearity factor and the residual spectral signal as an output signal.
Further, the second processing module 34 determines the non-linearity factor by the following formula:
Figure BDA0002413419990000171
Xp(n,k)=FFT(xp(n));
wherein H (n, k) is a non-linear factor, P is the number of sub-filter blocks included in the step-size-variable adaptive filter, xp(n) is the nth frame speech signal of the p-th sub-filter block.
Further, the apparatus further comprises:
and the noise reduction module is used for carrying out residual echo noise reduction processing on the output signal to obtain a noise-reduced output signal.
The echo cancellation device can execute the echo cancellation method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 4 is a schematic structural diagram of a terminal device 400 according to a fourth embodiment of the present disclosure, suitable for implementing an embodiment of the present disclosure. The terminal device 400 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (PDA), a tablet computer (PAD), a portable multimedia player (PMP), and a vehicle-mounted terminal (e.g., a car navigation terminal), and fixed terminals such as a desktop computer. The terminal device 400 shown in Fig. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in fig. 4, the terminal device 400 may include one or more processing means (e.g., a central processing unit, a graphics processor, etc.) 401 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)402 or a program loaded from a storage means 408 into a Random Access Memory (RAM) 403. One or more processing devices 401 implement the methods as provided by the present disclosure. In the RAM403, various programs and data necessary for the operation of the terminal apparatus 400 are also stored. The processing device 401, the ROM402, and the RAM403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408, including, for example, magnetic tape, hard disk, etc., storage 408 for storing one or more programs; and a communication device 409. The communication means 409 may allow the terminal device 400 to communicate with other devices wirelessly or by wire to exchange data. While fig. 4 illustrates a terminal apparatus 400 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 409, or from the storage device 408, or from the ROM 402. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 401.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer-readable medium may be contained in the terminal device 400; or may exist separately without being assembled into the terminal device 400.
The computer-readable medium carries one or more programs which, when executed by the terminal device, cause the terminal device 400 to:
acquiring a far-end signal;
processing the far-end signal by a step-length variable adaptive filter to obtain an echo signal, wherein the step-length variable adaptive filter is an adaptive filter with a variable learning factor step length when processing each frame of the far-end signal;
determining a residual spectrum signal according to a microphone signal and the echo signal;
and carrying out nonlinear processing on the residual spectrum signal to obtain an output signal so as to complete echo cancellation.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. Wherein the name of a module in some cases does not constitute a limitation on the module itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Example 1 provides, in accordance with one or more embodiments of the present disclosure, an echo cancellation method, including:
acquiring a far-end signal;
processing the far-end signal by a step-length variable adaptive filter to obtain an echo signal, wherein the step-length variable adaptive filter is an adaptive filter with a variable learning factor step length when processing each frame of the far-end signal;
determining a residual spectrum signal according to a microphone signal and the echo signal;
and carrying out nonlinear processing on the residual spectrum signal to obtain an output signal so as to complete echo cancellation.
Example 2 in accordance with one or more embodiments of the present disclosure, the method of example 1,
a learning factor of the step-size variable adaptive filter is determined based on the echo signal, the residual spectrum signal, and filter coefficients.
Example 3 in accordance with one or more embodiments of the present disclosure, the method of example 1,
the learning factor of the step-size variable adaptive filter is proportional to the filter coefficient.
Example 4 in accordance with one or more embodiments of the present disclosure, the method of example 1,
the step-size-variable adaptive filter is a linear filter with a variable learning factor step size, and comprises at least two sub-filter blocks.
Example 5 in accordance with one or more embodiments of the present disclosure, the method of example 4,
and the filter coefficient when each sub-filter block in the step length variable self-adaptive filter processes the far-end signal of the current frame is determined according to the corresponding voice signal when the far-end signal of the previous frame is processed and the filter coefficient, the learning factor and the residual spectrum signal when the far-end signal of the previous frame is processed.
Example 6 in accordance with one or more embodiments of the present disclosure, the method of example 5,
the filter coefficients for each sub-filter block are determined by the following formula:
Wp(n+1,k)=Wp(n,k)+μ(n,k)conj(Xp(n,k))E(n,k);
wherein Wp(n+1, k) is the filter coefficient when the p-th sub-filter block processes the far-end signal at the k-th frequency point of the (n+1)-th frame, Wp(n, k) is the filter coefficient when the p-th sub-filter block processes the far-end signal at the k-th frequency point of the n-th frame, μ(n, k) is the learning factor, E(n, k) is the residual spectrum signal corresponding to the far-end signal at the k-th frequency point of the n-th frame, Xp(n, k) is the speech signal corresponding to the far-end signal at the k-th frequency point of the n-th frame of the p-th sub-filter block, and conj(.) represents the conjugate operation of the matrix.
Example 7 the method of example 1, the determining a residual spectral signal from the microphone signal and the echo signal, according to one or more embodiments of the present disclosure, comprising:
extracting an echo signal from the microphone signal to obtain a residual signal;
and carrying out Fourier transform on the residual signal to obtain a residual spectrum signal.
Example 8 the method of example 1, the non-linearly processing the residual spectrum signal to obtain an output signal, according to one or more embodiments of the present disclosure, includes:
and carrying out nonlinear processing on the residual spectrum signal based on the learning factor to obtain an output signal.
Example 9 the method of example 8, the subjecting the residual spectrum signal to a non-linear processing based on the learning factor to obtain an output signal, according to one or more embodiments of the present disclosure, includes:
determining a nonlinear factor according to the learning factor, the number of sub-filter blocks included in the step-size variable adaptive filter and a corresponding voice signal when the step-size variable adaptive filter processes the far-end signal;
determining a product of the non-linearity factor and the residual spectral signal as an output signal.
Example 10 in accordance with the method of example 9, in accordance with one or more embodiments of the present disclosure, the non-linearity factor is determined by the following equation:
Figure BDA0002413419990000241
Xp(n,k)=FFT(xp(n));
wherein H (n, k) is a non-linear factor, P is the number of sub-filter blocks included in the step-size-variable adaptive filter, xp(n) is the nth frame speech signal of the p-th sub-filter block.
Example 11 the method of example 8, in accordance with one or more embodiments of the present disclosure, further comprising:
and carrying out residual echo noise reduction processing on the output signal to obtain a noise-reduced output signal.
Example 12 provides, in accordance with one or more embodiments of the present disclosure, an echo cancellation device, including:
the acquisition module is used for acquiring a far-end signal;
the first processing module is used for processing the far-end signal by a step-length variable adaptive filter to obtain an echo signal, wherein the step-length variable adaptive filter is an adaptive filter with a variable learning factor step length when processing each frame of the far-end signal;
a determining module for determining a residual spectrum signal according to a microphone signal and the echo signal;
and the second processing module is used for carrying out nonlinear processing on the residual spectrum signal to obtain an output signal so as to complete echo cancellation.
Example 13 provides, in accordance with one or more embodiments of the present disclosure, a terminal device, comprising:
one or more processing devices;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processing devices, cause the one or more processing devices to implement the method of any of examples 1-11.
Example 14 provides a computer-readable medium having stored thereon a computer program that, when executed by a processing apparatus, implements the method of any of examples 1-11, in accordance with one or more embodiments of the present disclosure.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments in which any combination of the features described above or their equivalents does not depart from the spirit of the disclosure. For example, it encompasses technical solutions formed by interchanging the above features with (but not limited to) features disclosed in this disclosure that have similar functions.

Claims (11)

1. An echo cancellation method, comprising:
acquiring a far-end signal;
processing the far-end signal by a step-length variable adaptive filter to obtain an echo signal, wherein the step-length variable adaptive filter is an adaptive filter with a variable learning factor step length when processing each frame of the far-end signal;
determining a residual spectrum signal according to a microphone signal and the echo signal;
carrying out nonlinear processing on the residual spectrum signal to obtain an output signal so as to complete echo cancellation;
the performing nonlinear processing on the residual spectrum signal to obtain an output signal includes:
carrying out nonlinear processing on the residual spectrum signal based on the learning factor to obtain an output signal;
the step-size-variable adaptive filter is a linear filter with a variable learning factor step size, and comprises at least two sub-filter blocks;
the filter coefficients for each sub-filter block are determined by the following formula:
Wp(n+1,k)=Wp(n,k)+μ(n,k)conj(Xp(n,k))E(n,k);
wherein Wp(n+1, k) is the filter coefficient when the p-th sub-filter block processes the far-end signal at the k-th frequency point of the (n+1)-th frame, Wp(n, k) is the filter coefficient when the p-th sub-filter block processes the far-end signal at the k-th frequency point of the n-th frame, μ(n, k) is the learning factor, E(n, k) is the residual spectrum signal corresponding to the far-end signal at the k-th frequency point of the n-th frame, Xp(n, k) is the speech signal corresponding to the far-end signal at the k-th frequency point of the n-th frame of the p-th sub-filter block, and conj(.) represents the conjugate operation of the matrix.
2. The method of claim 1, wherein a learning factor of the step-size variable adaptive filter is determined based on the echo signal, the residual spectrum signal, and filter coefficients.
3. The method of claim 1, wherein the step size variable adaptive filter has a learning factor proportional to filter coefficients.
4. The method of claim 1, wherein the filter coefficients of each sub-filter block in the step-size variable adaptive filter when processing the far-end signal of the current frame are determined according to the corresponding speech signal when processing the far-end signal of the previous frame and the filter coefficients, the learning factor and the residual spectrum signal when processing the far-end signal of the previous frame.
5. The method of claim 1, wherein determining a residual spectral signal from the microphone signal and the echo signal comprises:
extracting an echo signal from the microphone signal to obtain a residual signal;
and carrying out Fourier transform on the residual signal to obtain a residual spectrum signal.
6. The method of claim 1, wherein the subjecting the residual spectrum signal to a non-linear processing based on the learning factor to obtain an output signal comprises:
determining a nonlinear factor according to the learning factor, the number of sub-filter blocks included in the step-size variable adaptive filter and a corresponding voice signal when the step-size variable adaptive filter processes the far-end signal;
determining a product of the non-linearity factor and the residual spectral signal as an output signal.
7. The method of claim 6, wherein the non-linearity factor is determined by the formula:
Figure FDA0003217101890000021
Xp(n,k)=FFT(xp(n));
wherein H (n, k) is a non-linear factor, P is the number of sub-filter blocks included in the step-size-variable adaptive filter, xp(n) the nth frame speech signal of the p-th sub-filter block;
μ(n, k) is the learning factor, Xp(n, k) is the speech signal corresponding to the far-end signal at the k-th frequency point of the n-th frame of the p-th sub-filter block, and real(.) represents taking the real part.
8. The method of claim 1, further comprising:
and carrying out residual echo noise reduction processing on the output signal to obtain a noise-reduced output signal.
9. An echo cancellation device, comprising:
the acquisition module is used for acquiring a far-end signal;
the first processing module is used for processing the far-end signal by a step-length variable adaptive filter to obtain an echo signal, wherein the step-length variable adaptive filter is an adaptive filter with a variable learning factor step length when processing each frame of the far-end signal;
a determining module for determining a residual spectrum signal according to a microphone signal and the echo signal;
the second processing module is used for carrying out nonlinear processing on the residual spectrum signal to obtain an output signal so as to complete echo cancellation;
the second processing module is further used for carrying out nonlinear processing on the residual spectrum signal based on the learning factor to obtain an output signal;
the step-size-variable adaptive filter is a linear filter with a variable learning factor step size, and comprises at least two sub-filter blocks;
the filter coefficients for each sub-filter block are determined by the following formula:
Wp(n+1,k)=Wp(n,k)+μ(n,k)conj(Xp(n,k))E(n,k);
wherein Wp(n+1, k) is the filter coefficient when the p-th sub-filter block processes the far-end signal at the k-th frequency point of the (n+1)-th frame, Wp(n, k) is the filter coefficient when the p-th sub-filter block processes the far-end signal at the k-th frequency point of the n-th frame, μ(n, k) is the learning factor, E(n, k) is the residual spectrum signal corresponding to the far-end signal at the k-th frequency point of the n-th frame, Xp(n, k) is the speech signal corresponding to the far-end signal at the k-th frequency point of the n-th frame of the p-th sub-filter block, and conj(.) represents the conjugate operation of the matrix.
10. A terminal device, comprising:
one or more processing devices;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processing devices, cause the one or more processing devices to implement the echo cancellation method of any one of claims 1-8.
11. A computer-readable medium, on which a computer program is stored, which, when being executed by processing means, carries out the echo cancellation method according to any one of claims 1-8.
CN202010183666.0A 2020-03-16 2020-03-16 Echo cancellation method, device, terminal equipment and medium Active CN111246037B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010183666.0A CN111246037B (en) 2020-03-16 2020-03-16 Echo cancellation method, device, terminal equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010183666.0A CN111246037B (en) 2020-03-16 2020-03-16 Echo cancellation method, device, terminal equipment and medium

Publications (2)

Publication Number Publication Date
CN111246037A (en) 2020-06-05
CN111246037B (en) 2021-11-16

Family

ID=70876990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010183666.0A Active CN111246037B (en) 2020-03-16 2020-03-16 Echo cancellation method, device, terminal equipment and medium

Country Status (1)

Country Link
CN (1) CN111246037B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113938548A (en) * 2020-06-29 2022-01-14 阿里巴巴集团控股有限公司 Echo suppression method and device for terminal communication
CN111798827A (en) * 2020-07-07 2020-10-20 上海立可芯半导体科技有限公司 Echo cancellation method, apparatus, system and computer readable medium
CN112492112B (en) * 2020-11-19 2022-03-18 睿云联(厦门)网络通讯技术有限公司 Echo eliminating method and device based on intercom system
CN113489854B (en) * 2021-06-30 2024-03-01 北京小米移动软件有限公司 Sound processing method, device, electronic equipment and storage medium
CN113488067A (en) * 2021-06-30 2021-10-08 北京小米移动软件有限公司 Echo cancellation method, echo cancellation device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103067629A (en) * 2013-01-18 2013-04-24 苏州科达科技股份有限公司 Echo cancellation device
CN106782593A (en) * 2017-02-27 2017-05-31 重庆邮电大学 A multi-band structure adaptive filter switching method for acoustic echo cancellation
CN109087665A (en) * 2018-07-06 2018-12-25 南京时保联信息科技有限公司 A kind of nonlinear echo suppressing method
CN109509482A (en) * 2018-12-12 2019-03-22 北京达佳互联信息技术有限公司 Echo cancel method, echo cancelling device, electronic equipment and readable medium
CN109754813A (en) * 2019-03-26 2019-05-14 南京时保联信息科技有限公司 Variable step echo cancel method based on fast convergence characteristic
CN109935238A (en) * 2019-04-01 2019-06-25 北京百度网讯科技有限公司 A kind of echo cancel method, device and terminal device
CN110838300A (en) * 2019-11-18 2020-02-25 紫光展锐(重庆)科技有限公司 Echo cancellation processing method and processing system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2302401A (en) * 1999-12-09 2001-06-18 Frederick Johannes Bruwer Speech distribution system
US10192567B1 (en) * 2017-10-18 2019-01-29 Motorola Mobility Llc Echo cancellation and suppression in electronic device

Also Published As

Publication number Publication date
CN111246037A (en) 2020-06-05

Similar Documents

Publication Publication Date Title
CN111246037B (en) Echo cancellation method, device, terminal equipment and medium
CN111341336B (en) Echo cancellation method, device, terminal equipment and medium
US10477031B2 (en) System and method for suppression of non-linear acoustic echoes
EP3703052A1 (en) Echo cancellation method and apparatus based on time delay estimation
JP4210521B2 (en) Noise reduction method and apparatus
JP4377952B1 (en) Adaptive filter and echo canceller having the same
US9866792B2 (en) Display apparatus and echo cancellation method thereof
JP2002517021A (en) Signal Noise Reduction by Spectral Subtraction Using Linear Convolution and Causal Filtering
KR102190833B1 (en) Echo suppression
CN110556125B (en) Feature extraction method and device based on voice signal and computer storage medium
US9330677B2 (en) Method and apparatus for generating a noise reduced audio signal using a microphone array
US20080152157A1 (en) Method and system for eliminating noises in voice signals
CN113539285A (en) Audio signal noise reduction method, electronic device, and storage medium
CN113223545A (en) Voice noise reduction method and device, terminal and storage medium
CN112602150A (en) Noise estimation method, noise estimation device, voice processing chip and electronic equipment
JP2016503262A (en) Echo suppression
CN113674752B (en) Noise reduction method and device for audio signal, readable medium and electronic equipment
CN113744748A (en) Network model training method, echo cancellation method and device
US10650839B2 (en) Infinite impulse response acoustic echo cancellation in the frequency domain
CN111917926B (en) Echo cancellation method and device in communication terminal and terminal equipment
CN113763976B (en) Noise reduction method and device for audio signal, readable medium and electronic equipment
CN113113038A (en) Echo cancellation method and device and electronic equipment
JP4964267B2 (en) Adaptive filter and echo canceller having the same
CN113763975A (en) Voice signal processing method and device and terminal
CN112309418A (en) Method and device for inhibiting wind noise

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant