WO2020124325A1 - Adaptive filtering method, device, equipment and storage medium in echo cancellation - Google Patents

Adaptive filtering method, device, equipment and storage medium in echo cancellation

Info

Publication number
WO2020124325A1
Authority
WO
WIPO (PCT)
Prior art keywords
subband
error
value
sub
band
Prior art date
Application number
PCT/CN2018/121560
Other languages
English (en)
French (fr)
Inventor
郭红敬
李国梁
王鑫山
胥林渊
朱虎
Original Assignee
深圳市汇顶科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市汇顶科技股份有限公司
Priority to CN201880003450.2A priority Critical patent/CN111868826B/zh
Priority to PCT/CN2018/121560 priority patent/WO2020124325A1/zh
Publication of WO2020124325A1 publication Critical patent/WO2020124325A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M9/00 Arrangements for interconnection not involving centralised switching
    • H04M9/08 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation

Definitions

  • the present application relates to the field of communications, and in particular to an adaptive filtering method, device, equipment, and storage medium in echo cancellation.
  • Filters are usually used to extract information that is of interest and close to a specified quality from noisy data.
  • An important application of adaptive filters is echo cancellation.
  • the main idea of echo cancellation is to use an adaptive filter to identify the unknown echo channel. Based on the correlation between the speaker signal and the echo signal it generates, a model of the far-end signal is established, the echo path is simulated, and the adaptive algorithm is adjusted so that the estimated echo signal approximates the real echo signal, thereby realizing echo cancellation.
  • the gradient descent method is used to update the weights of the adaptive filter. When the weights are updated, filter performance is generally improved by adaptively changing the iteration step size, but this approach introduces a large amount of computation, and the filter still suffers from a large steady-state error and diverges easily.
  • the present application provides an adaptive filtering method, device, equipment, and storage medium in echo cancellation.
  • a first aspect of an embodiment of the present application provides an adaptive filtering method in echo cancellation, including:
  • the frequency domain error signal is divided into multiple sub-bands.
  • the frequency domain error signal is obtained by transforming the time domain error signal.
  • the time domain error signal is obtained by subtracting the time domain estimated echo signal from the microphone acquisition signal;
  • the weight of the adaptive filter is updated according to the limiting errors of multiple subbands.
  • dividing the frequency domain error signal into a plurality of subbands includes: dividing the frequency domain error signal within the frequency range of the speech signal into the first subband;
  • the frequency domain error signal within the frequency range of the non-speech signal is divided into second subbands.
  • dividing the frequency domain error signal into multiple subbands further includes: comparing the speaker frequency response value corresponding to each frequency point of the first subband with the first preset value;
  • the first subband corresponding to the frequency point where the frequency response value of the speaker is greater than the first preset value is divided into a fourth subband.
  • processing multiple subbands separately to obtain the limiting errors of the multiple subbands includes:
  • taking the modulus of the normalized error of the third subband and of the normalized error of the fourth subband to obtain the modulus value of the third subband and the modulus value of the fourth subband, respectively;
  • the modulus value of the third subband and the modulus value of the fourth subband are processed separately to obtain the limiting error of the third subband and the limiting error of the fourth subband.
  • processing the modulus value of the third subband and the modulus value of the fourth subband separately to obtain the limiting error of the third subband and the limiting error of the fourth subband includes:
  • the limiting error of the third subband is determined according to the modulus value of the third subband and the third threshold value
  • the limiting error of the fourth subband is determined according to the modulus value of the fourth subband and the fourth threshold value.
  • determining the limit error of the third subband according to the modulus value of the third subband and the third threshold value, and determining the limit error of the fourth subband according to the modulus value of the fourth subband and the fourth threshold value, includes:
  • if the modulus value of the third subband is less than the third threshold value, the limit error of the third subband is set to the normalized error of the third subband;
  • if the modulus value of the third subband is greater than the third threshold value, the limit error of the third subband is set to the third threshold value multiplied by the normalized error of the third subband and then divided by the modulus value of the third subband;
  • if the modulus value of the fourth subband is less than the fourth threshold value, the limit error of the fourth subband is set to the normalized error of the fourth subband;
  • if the modulus value of the fourth subband is greater than the fourth threshold value, the limit error of the fourth subband is set to the fourth threshold value multiplied by the normalized error of the fourth subband and then divided by the modulus value of the fourth subband.
  • the third threshold value is set to be less than the fourth threshold value.
  • processing multiple subbands separately to obtain the limit errors of the multiple subbands further includes: setting the limit error of the second subband to a second fixed value.
  • the second fixed value is set to zero.
  • processing multiple subbands separately to obtain the limiting error of the multiple subbands further includes:
  • the modulus value of the second subband is processed to obtain the limiting error of the second subband.
  • processing the modulus value of the second subband to obtain the limiting error of the second subband includes: setting a second threshold value for the second subband; and determining the limit error of the second subband according to the modulus value of the second subband and the second threshold value.
  • determining the limit error of the second subband according to the modulus value of the second subband and the second threshold value includes: if the modulus value of the second subband is less than the second threshold value, the limit error of the second subband is set to the normalized error of the second subband;
  • if the modulus value of the second subband is greater than the second threshold value, the limit error of the second subband is set to the second threshold value multiplied by the normalized error of the second subband and then divided by the modulus value of the second subband.
  • the second threshold value is set to be less than the third threshold value.
  • a second aspect of an embodiment of the present application provides an adaptive filtering device in echo cancellation, including:
  • the sub-band dividing module is used to divide the frequency domain error signal into multiple sub-bands.
  • the frequency domain error signal is obtained by transforming the time domain error signal.
  • the time domain error signal is obtained by subtracting the time domain estimated echo signal from the microphone acquisition signal;
  • a sub-band processing module for processing multiple sub-bands separately to obtain the limiting error of multiple sub-bands
  • the weight update module is used to update the weight of the adaptive filter according to the limit errors of multiple subbands.
  • dividing the subband module includes:
  • a first dividing module for dividing the frequency domain error signal in the frequency range of the speech signal into first subbands
  • the second dividing module is used to divide the frequency domain error signal in the frequency range of the non-speech signal into second subbands.
  • dividing the subband module further includes:
  • a comparison module for comparing the speaker frequency response value corresponding to the frequency of the first sub-band with the first preset value
  • a third dividing module configured to divide the first subband corresponding to the frequency point where the frequency response value of the speaker is less than the first preset value into a third subband
  • the fourth dividing module is configured to divide the first subband corresponding to the frequency point where the frequency response value of the speaker is greater than the first preset value into a fourth subband.
  • the subband processing module includes:
  • a normalization module, used to normalize the third subband and the fourth subband respectively to obtain the normalized error of the third subband and the normalized error of the fourth subband;
  • a modulus module, used to take the modulus of the normalized error of the third subband and of the normalized error of the fourth subband to obtain the modulus value of the third subband and the modulus value of the fourth subband, respectively;
  • the limit error generation module is configured to process the modulus value of the third subband and the modulus value of the fourth subband respectively to obtain the limit error of the third subband and the limit error of the fourth subband.
  • the limit error generation module includes:
  • a threshold setting module for setting a third threshold and a fourth threshold for the third subband and the fourth subband, respectively;
  • the limit error generation sub-module is used to determine the limit error of the third subband according to the modulus value of the third subband and the third threshold, and also to determine the limit error of the fourth subband according to the modulus value of the fourth subband and the fourth threshold.
  • the limit error generation submodule includes:
  • a third setting module configured to set the limit error of the third subband to the normalized error of the third subband when the modulus value of the third subband is less than the third threshold
  • a fourth setting module configured to set the limit error of the fourth subband to the normalized error of the fourth subband when the modulus value of the fourth subband is less than the fourth threshold;
  • the threshold setting module is further configured to set the third threshold to be less than the fourth threshold.
  • the subband processing module further includes: a fixed value setting module, configured to set the limit error of the second subband to the second fixed value.
  • the fixed value setting module is further configured to set the second fixed value to zero.
  • the normalization module is further used to normalize the second subband to obtain the normalized error of the second subband;
  • the modulus module is also used to take the modulus of the normalized error of the second subband to obtain the modulus value of the second subband;
  • the limit error generation module is also used to process the modulus value of the second subband to obtain the limit error of the second subband.
  • the threshold setting module is further configured to set a second threshold for the second subband
  • the limit error generation sub-module is also used to determine the limit error of the second sub-band according to the modulus value of the second sub-band and the second threshold value.
  • the limit error generation submodule further includes:
  • a second setting module configured to set the limit error of the second subband to the normalized error of the second subband when the modulus value of the second subband is less than the second threshold
  • the threshold setting module is further configured to set the second threshold to be less than the third threshold.
  • a third aspect of the embodiments of the present application provides a device for adaptive filtering in echo cancellation, including: a memory and a processor;
  • the memory is coupled with the processor
  • the memory is used to store program instructions
  • the processor is configured to call the program instructions stored in the memory and cause the device to perform the adaptive filtering method in the echo cancellation of the first aspect.
  • a fourth aspect of the embodiments of the present application provides a computer-readable storage medium, including: a computer program stored thereon, characterized in that the computer program is executed by a processor to implement the adaptive filtering method.
  • the beneficial effects of the embodiments of the present application are as follows: the embodiments provide an adaptive filtering method, device, equipment, and storage medium in echo cancellation in which the frequency domain error signal is divided into subbands, the subbands are processed separately to obtain limit errors, and the limit errors are used to update the adaptive filter weights. Performing adaptive filtering in this way achieves echo cancellation, effectively reduces the amount of calculation and the steady-state error of the filter, and avoids filter divergence; a good echo cancellation effect can be achieved even in the presence of noise interference and near-end speech.
  • FIG. 1 is a flowchart of an adaptive filtering method in echo cancellation according to an embodiment of the present application
  • FIG. 3 is a schematic structural diagram of an adaptive filtering module according to an embodiment of the present application.
  • FIG. 4 is a flowchart of a weight update method of an adaptive filtering method according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of a system structure of an application scenario of an adaptive filtering method in echo cancellation according to an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of an adaptive filtering device in echo cancellation according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a device according to an embodiment of the present application.
  • FIG. 1 is a flowchart of an adaptive filtering method in echo cancellation according to an embodiment of the present application.
  • in this method, the frequency domain error signal is divided into subbands that are processed separately to obtain limit errors, and the limit errors are used to update the weights of the adaptive filter, so that adaptive filtering achieves echo cancellation. The method includes the following steps:
  • the frequency domain error signal is obtained by transforming the time domain error signal.
  • the time domain error signal is obtained by subtracting the time domain estimated echo signal from the microphone acquisition signal;
  • a subband can be the frequency domain error signal corresponding to consecutive frequency points in the frequency domain; for example, one subband may be the frequency domain error signal corresponding to the consecutive frequency points No. 6 to No. 59. A subband can also be the frequency domain error signal corresponding to non-consecutive frequency points in the frequency domain;
  • for example, another subband may be the frequency domain error signal corresponding to frequency points No. 1 to No. 5 and No. 60 to No. 65; that subband does not contain the frequency points lying between No. 5 and No. 60.
  • the embodiments of the present application do not limit the number of subbands.
  • the method for dividing the subbands provided in the embodiments of the present application is only an exemplary description, and those skilled in the art can obtain other division methods from the above example without creative effort.
  • multiple subbands are processed separately, and the processing methods of the multiple subbands may be different from each other, or two or more subbands may be processed in the same manner, which may be determined according to the actual situation.
  • the processing of the subbands is not restricted in this embodiment.
  • the weights of the adaptive filter are updated with the limiting errors of all the above subbands.
  • the adaptive filter may be simply referred to as a filter.
  • the beneficial effect of this embodiment is that it provides an adaptive filtering method in echo cancellation in which the frequency domain error signal is divided into subbands, the subbands are processed separately to obtain limit errors, and the limit errors are used to update the weights of the adaptive filter. Performing adaptive filtering in this way achieves echo cancellation, effectively reduces the amount of calculation and the steady-state error of the filter, and avoids filter divergence; a good echo cancellation effect can be achieved even in the presence of noise interference and near-end speech.
  • dividing the frequency domain error signal into multiple subbands may include the following two steps:
  • the frequency domain error signal within the frequency range of the non-speech signal is divided into second subbands.
  • a sampling rate of 8000 Hz and 128 sampling points are taken as an example. The frequency resolution is then 62.5 Hz, and frequency point 1 corresponds to 0 Hz. Since the energy of a voice signal is mainly concentrated in the range of 300 Hz to 3400 Hz, the voice signal occupies approximately frequency points 6 to 55. According to the subband division method of this embodiment, the frequency domain error signals corresponding to frequency points 6 to 55 can be divided into the first subband, and the frequency domain error signals corresponding to frequency points 1 to 5 and frequency points 56 to 65 can be divided into the second subband.
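  • The arithmetic of this example can be checked with a minimal Python sketch. The function names `point_to_hz` and `split_speech_bands` are illustrative and do not come from the application; the 1-based numbering of frequency points follows the text above.

```python
def point_to_hz(point, sample_rate=8000, n_samples=128):
    """Frequency in Hz of a 1-based frequency point (point 1 is 0 Hz)."""
    return (point - 1) * sample_rate / n_samples


def split_speech_bands(sample_rate=8000, n_samples=128, lo_hz=300.0, hi_hz=3400.0):
    """Return (first, second): frequency points inside / outside the speech range."""
    n_points = n_samples // 2 + 1  # 65 points from 0 Hz up to the Nyquist frequency
    first, second = [], []
    for p in range(1, n_points + 1):
        if lo_hz <= point_to_hz(p, sample_rate, n_samples) <= hi_hz:
            first.append(p)   # speech range: first subband
        else:
            second.append(p)  # non-speech range: second subband
    return first, second


first, second = split_speech_bands()
print(first[0], first[-1])  # lowest and highest frequency point of the first subband
print(second)
```

  • With these parameters the first subband comes out as frequency points 6 to 55 and the second subband as points 1 to 5 and 56 to 65, matching the example.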
  • the frequency domain error signals corresponding to frequency points near the frequency range of the voice signal can also be divided into the first subband. For example, the frequency domain error signals corresponding to frequency points 56 to 59 can be divided into the first subband, so that the frequency domain error signals corresponding to frequency points 6 to 59 form the first subband, and the frequency domain error signals corresponding to frequency points 1 to 5 and frequency points 60 to 65 form the second subband.
  • the frequency-domain error signal in the frequency range of the non-speech signal usually contains some low-frequency or high-frequency noise.
  • the frequency-domain error signal is divided into a first subband and a second subband so that the two subbands can be processed separately to obtain their limiting errors, which can further reduce the steady-state error of the filter and avoid filter divergence.
  • the first subband and the second subband may be further divided. For example, in the second subband of the above embodiment, the frequency domain error signals corresponding to frequency points No. 1 to No. 5 can be divided into one subband, and the frequency domain error signals corresponding to frequency points No. 60 to No. 65 can be divided into another subband. The application does not limit the number of divided subbands.
  • dividing the frequency domain error signal into multiple subbands further includes the following steps:
  • the first subband corresponding to the frequency point where the frequency response value of the speaker is greater than the first preset value is divided into a fourth subband.
  • the first subband may be further divided according to the speaker frequency response corresponding to the frequency point of the first subband. Since the frequency response of the speaker corresponding to each frequency point of the first subband is different, the frequency domain error signal corresponding to the frequency point of the first subband is also related to the frequency response of the speaker.
  • taking a first preset value of 5.0 dB as an example, the part of the first subband at frequency points where the speaker frequency response value is less than 5.0 dB can be divided into the third subband, and the part at frequency points where the speaker frequency response value is greater than 5.0 dB can be divided into the fourth subband. As the speaker frequency response figure shows, the frequency at which the speaker frequency response value equals 5.0 dB is approximately 1500 Hz, so the frequency domain error signals corresponding to frequency points 6 to 24 (frequency point 25 corresponds to 1500 Hz) are divided into the third subband, and the frequency domain error signals corresponding to frequency points 26 to 55 are divided into the fourth subband.
  • the speaker frequency response at frequency points 6 to 24 is strongly attenuated and the echo energy is weak, so the energy of the third subband is small and the energy of the fourth subband is large.
  • further dividing the first subband into a third subband and a fourth subband makes it possible to process the two separately afterwards, which can further reduce the steady-state error of the filter and avoid filter divergence.
  • the setting of the first preset value of 5.0 dB in this embodiment of the present application is merely an exemplary description, and those skilled in the art can obtain other first preset values according to the foregoing embodiments without paying creative efforts.
  • the frequency domain error signal corresponding to frequency point 25 (which corresponds to 1500 Hz) is divided into the third subband, and the frequency domain error signals corresponding to frequency points 26 to 55 are divided into the fourth subband.
  • This embodiment does not limit whether the frequency domain error signal corresponding to the frequency point where the speaker frequency response value is equal to the first preset value is divided into the third subband or the fourth subband; the frequency domain error signal corresponding to this frequency point may also be processed in some other way.
  • the frequency domain error signals corresponding to frequency points near the frequency range of the voice signal may also be divided into the first subband. For example, the frequency domain error signals corresponding to frequency points 56 to 59 may also be divided into the first subband, so that the first subband covers frequency points No. 6 to No. 59. In that case, the frequency domain error signals corresponding to frequency points 6 to 25 are divided into the third subband, and the frequency domain error signals corresponding to frequency points 26 to 59 are divided into the fourth subband.
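  • The split of the first subband by speaker frequency response can be sketched as follows. This is only an illustration: the response values in `response_db` are made up so that the response reaches 5.0 dB at frequency point 25, and the `equal_to_third` flag is a hypothetical way of handling the equal case, which the embodiment leaves open.

```python
def split_by_speaker_response(points, response_db, preset_db=5.0, equal_to_third=True):
    """Split frequency points into (third, fourth) by speaker response in dB."""
    third, fourth = [], []
    for p in points:
        r = response_db[p]
        if r < preset_db or (r == preset_db and equal_to_third):
            third.append(p)   # weak speaker response: third subband
        else:
            fourth.append(p)  # strong speaker response: fourth subband
    return third, fourth


# Made-up response curve (not measured data): rises by 0.5 dB per frequency
# point and equals 5.0 dB at point 25, i.e. at 1500 Hz in the running example.
response_db = {p: 5.0 + 0.5 * (p - 25) for p in range(6, 56)}

third, fourth = split_by_speaker_response(range(6, 56), response_db)
print(third[0], third[-1], fourth[0], fourth[-1])
```

  • With `equal_to_third=True` point 25 lands in the third subband, as in the variant described above; with `equal_to_third=False` it would land in the fourth subband instead.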
  • processing multiple subbands separately to obtain the limiting error of multiple subbands includes:
  • taking the modulus of the normalized error of the third subband and of the normalized error of the fourth subband to obtain the modulus value of the third subband and the modulus value of the fourth subband, respectively;
  • the modulus value of the third subband and the modulus value of the fourth subband are processed separately to obtain the limiting error of the third subband and the limiting error of the fourth subband.
  • the method of separately processing a plurality of subbands to obtain a limiting error can further reduce the steady-state error of the filter and prevent the filter from diverging.
  • the calculation method of the normalized error includes but is not limited to:
  • can prevent the denominator in the above formula from being zero
  • X(k) is the far-end signal in the frequency domain, which is obtained by transforming the far-end signal in the time domain x(n).
  • the power of X(k) is used to calculate the normalized error
  • the smoothing calculation process is used to obtain a smooth power spectrum P(k).
  • the "X(k)" in the normalized calculation method can be replaced by a smooth power spectrum P(k).
  • the modulus value is expressed as: $|E_{norm}(k)|$.
  • the limit error is expressed as: E′(k).
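  • The normalization formula itself is not reproduced in this text, so the following Python sketch rests on stated assumptions rather than on the application's exact expression: the error E(k) is divided by a smoothed far-end power P(k) plus a small constant, and P(k) is obtained by first-order recursive smoothing of |X(k)|². The names `alpha` and `delta` and their default values are hypothetical.

```python
def smooth_power(prev_power, X, alpha=0.9):
    """Assumed recursive smoothing: P(k) = alpha * P_prev(k) + (1 - alpha) * |X(k)|^2."""
    return [alpha * p + (1.0 - alpha) * abs(x) ** 2 for p, x in zip(prev_power, X)]


def normalized_error(E, power, delta=1e-10):
    """Assumed normalization: E_norm(k) = E(k) / (P(k) + delta).

    delta is a small constant that keeps the denominator away from zero.
    """
    return [e / (p + delta) for e, p in zip(E, power)]


def modulus(E_norm):
    """Modulus value |E_norm(k)| of each frequency point."""
    return [abs(e) for e in E_norm]


# Tiny example with one frequency point.
power = smooth_power([0.0], [3 + 4j], alpha=0.5)  # |3+4j|^2 = 25, half of it kept
e_norm = normalized_error([25 + 0j], power, delta=0.0)
print(power, modulus(e_norm))
```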
  • processing the modulus value of the third subband and the modulus value of the fourth subband separately to obtain the limit error of the third subband and the limit error of the fourth subband includes the following steps:
  • the limiting error of the third subband is determined according to the modulus value of the third subband and the third threshold value
  • the limiting error of the fourth subband is determined according to the modulus value of the fourth subband and the fourth threshold value.
  • the specific values of the third threshold value and the fourth threshold value are not limited, and the third threshold value and the fourth threshold value may be the same or different.
  • determining the limit error of the third subband according to the modulus value of the third subband and the third threshold, and determining the limit error of the fourth subband according to the modulus value of the fourth subband and the fourth threshold, further avoids filter divergence to a certain extent and reduces the steady-state error of the filter.
  • determining the limit error of the third subband according to the modulus value of the third subband and the third threshold value, and determining the limit error of the fourth subband according to the modulus value of the fourth subband and the fourth threshold value, includes the following steps:
  • if the modulus value of the third subband is less than the third threshold value, the limit error of the third subband is set to the normalized error of the third subband;
  • if the modulus value of the third subband is greater than the third threshold value, the limit error of the third subband is set to the third threshold value multiplied by the normalized error of the third subband and then divided by the modulus value of the third subband;
  • if the modulus value of the fourth subband is less than the fourth threshold value, the limit error of the fourth subband is set to the normalized error of the fourth subband;
  • if the modulus value of the fourth subband is greater than the fourth threshold value, the limit error of the fourth subband is set to the fourth threshold value multiplied by the normalized error of the fourth subband and then divided by the modulus value of the fourth subband.
  • expressed as formulas, when the modulus value $|E_{norm}(k)|$ at a frequency point of the third subband or the fourth subband is less than the threshold of that subband, the limiting error is $E'(k) = E_{norm}(k)$; when it is greater than the threshold, the limiting error is $E'(k) = TH \cdot E_{norm}(k) / |E_{norm}(k)|$,
  • where the threshold value of the corresponding subband is represented by TH. It should be noted that if the modulus value $|E_{norm}(k)|$ of the frequency domain error signal at a frequency point of the third subband or the fourth subband is equal to the threshold of the corresponding subband, the limit error of the frequency domain error signal at that frequency point is not restricted: it can be $E'(k) = E_{norm}(k)$; it can also be $E'(k) = TH \cdot E_{norm}(k) / |E_{norm}(k)|$;
  • the limiting error of the frequency domain error signal corresponding to the frequency of the sub-band may also be other limiting error setting methods, which is not limited in this embodiment.
  • the limit error of the sub-band is determined according to the modulus value of the corresponding sub-band and the corresponding threshold value, which further avoids the divergence of the filter and reduces the steady-state error of the filter.
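  • The rule described above amounts to clipping the magnitude of the normalized error while keeping its phase. A minimal Python sketch follows; the function name `limit_error` is illustrative, and passing the value through unchanged when the modulus equals the threshold is just one of the options the embodiment allows.

```python
def limit_error(e_norm, threshold):
    """Limiting error E'(k) for one frequency point.

    Below the threshold the normalized error is kept as is; above it, the
    error is rescaled to have magnitude `threshold` with unchanged phase.
    The equal case is passed through here (an arbitrary choice).
    """
    mag = abs(e_norm)
    if mag <= threshold:
        return e_norm
    return threshold * e_norm / mag


small = limit_error(3 + 4j, 10.0)  # modulus 5 is below the threshold: unchanged
large = limit_error(3 + 4j, 1.0)   # modulus 5 is above the threshold: clipped
print(small, large, abs(large))
```

  • Because only the magnitude is bounded, a single frequency point with a very large error (for example during near-end speech) cannot push the weight update arbitrarily far.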
  • the third threshold value is set to be less than the fourth threshold value. Specifically, the third threshold value TH3 may be set to 0.0000001, and the fourth threshold value TH4 may be set to 0.00005.
  • the embodiments of the present application do not limit the specific values of the third threshold value and the fourth threshold value.
  • the method for setting the threshold values provided in the embodiments of the present application is only an illustrative description, and those skilled in the art can obtain other threshold setting methods from the above embodiments without creative effort. Setting the third threshold value to be smaller than the fourth threshold value further avoids filter divergence to a certain extent and reduces the steady-state error of the filter.
  • processing the multiple subbands separately to obtain the limit errors of the multiple subbands also includes setting the limit error of the second subband to a second fixed value. The second subband usually contains some low-frequency or high-frequency noise, and if the weights of the adaptive filter are updated directly using the second subband, there is a high probability that the filter will diverge. Setting the limit error of the second subband to the second fixed value therefore avoids divergence of the adaptive filter, reduces the steady-state error of the filter, and reduces the amount of calculation. It should be noted that this embodiment does not limit the size of the second fixed value.
  • processing multiple subbands separately to obtain the limit errors of the multiple subbands also includes another processing method for the second subband, which specifically includes the following steps:
  • processing the modulus value of the second subband to obtain the limiting error of the second subband may include the following steps:
  • the limit error of the second subband is determined according to the modulus value of the second subband and the second threshold value.
  • determining the limiting error of the second subband according to the modulus value of the second subband and the second threshold value further avoids filter divergence to a certain extent and reduces the steady-state error of the filter.
  • determining the limit error of the second subband according to the modulus value of the second subband and the second threshold value may include the following steps:
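  • if the modulus value of the second subband is less than the second threshold value, the limit error of the second subband is set to the normalized error of the second subband;
  • if the modulus value of the second subband is greater than the second threshold value, the limit error of the second subband is set to the second threshold value multiplied by the normalized error of the second subband and then divided by the modulus value of the second subband.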
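  • it should be noted that if the modulus value of the frequency domain error signal at a frequency point of the second subband is equal to the second threshold TH2, the limit error at that frequency point is not restricted and may follow either of the two rules above;
  • the limit error of the frequency domain error signal corresponding to the frequency point of the second subband may also be set by other methods, which is not limited in this embodiment.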
  • the second threshold value TH2 is set to be smaller than the third threshold value TH3, which further avoids the divergence of the filter and reduces the steady-state error of the filter.
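  • Putting the per-subband rules together, the following Python sketch handles both options for the second subband: a fixed value (zero in the claims) or its own threshold TH2. TH3 and TH4 use the example values given above; the TH2 value of 1e-8 is hypothetical, chosen only to satisfy TH2 < TH3, and the names `limit_error` and `process_subbands` are illustrative.

```python
TH2, TH3, TH4 = 1e-8, 1e-7, 5e-5  # TH2 is a made-up value; TH2 < TH3 < TH4


def limit_error(e_norm, threshold):
    """Clip the magnitude of e_norm to `threshold`, keeping its phase."""
    mag = abs(e_norm)
    return e_norm if mag <= threshold else threshold * e_norm / mag


def process_subbands(E_norm, band_of, second_fixed=None):
    """Return the limiting error per frequency point.

    E_norm: dict mapping frequency point -> normalized error (complex)
    band_of: dict mapping frequency point -> subband label 2, 3 or 4
    second_fixed: if not None, the second subband is set to this fixed value;
                  otherwise it is clipped with TH2 like the other subbands.
    """
    thresholds = {2: TH2, 3: TH3, 4: TH4}
    out = {}
    for point, e in E_norm.items():
        band = band_of[point]
        if band == 2 and second_fixed is not None:
            out[point] = second_fixed
        else:
            out[point] = limit_error(e, thresholds[band])
    return out


E_norm = {3: 1e-3 + 0j, 10: 1e-3 + 0j, 30: 1e-3 + 0j, 31: 1e-5 + 0j}
band_of = {3: 2, 10: 3, 30: 4, 31: 4}
zeroed = process_subbands(E_norm, band_of, second_fixed=0j)
clipped = process_subbands(E_norm, band_of)
print(zeroed)
print(clipped)
```

  • The fixed-value option skips the clipping arithmetic for the second subband entirely, which is consistent with the reduced calculation noted above.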
  • this embodiment provides a structural schematic diagram of an adaptive filtering module that can perform the adaptive filtering method in the echo cancellation provided by any of the foregoing embodiments.
  • there are many ways to implement adaptive filtering, which can be divided into time-domain filtering and frequency-domain filtering.
  • time-domain filtering mainly applies the adaptive algorithm to the time-domain sampling points one by one, while frequency-domain filtering mainly uses the FFT to adjust the weights of the adaptive filter adaptively in the frequency domain.
  • compared with time-domain filtering, frequency-domain filtering requires a relatively small amount of calculation.
  • This embodiment adopts an adaptive filtering method of frequency domain filtering.
  • FIG. 3 is a schematic structural diagram of an adaptive filtering module according to an embodiment of the present application.
  • the adaptive filtering module 105 includes: a data buffer module 201 and an FFT module 202, a power calculation module 203, a frequency domain filtering module 204, an IFFT module 205, a data analysis module 206, a data zero-fill FFT module 207, a frequency domain error control module 208, and a frequency domain weight update module 209. It should be noted that the structure of the adaptive filtering module provided in this embodiment of the present application may further include other modules, which is not limited in this embodiment.
  • the input signal of the adaptive filtering module 105 is the far-end signal x(n), and the data buffer module 201 buffers the incoming far-end signal x(n), and performs data blocking and overlapping operations on the far-end signal x(n) .
  • the length of each data block is N, and every two adjacent data blocks overlap by M samples. When each data block is filtered as a whole, the newly processed data therefore amount to N - M samples, and the remaining M samples were already processed in the previous block.
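The blocking and overlapping operation can be illustrated with a small generator. This is a sketch under stated assumptions: the name `overlapped_blocks` is hypothetical, and the history before the first sample is assumed to be zeros, which the text does not specify.

```python
from typing import Iterator, List


def overlapped_blocks(x: List[float], n: int, m: int) -> Iterator[List[float]]:
    """Yield blocks of length n in which the first m samples repeat the
    last m samples of the previous block and the last n - m are new."""
    if not 0 <= m < n:
        raise ValueError("overlap m must satisfy 0 <= m < n")
    hop = n - m                      # number of new samples per block
    history = [0.0] * m              # assumption: silence before the signal starts
    for start in range(0, len(x) - hop + 1, hop):
        block = history + x[start:start + hop]
        yield block
        history = block[hop:]        # keep the last m samples for the next block


if __name__ == "__main__":
    for block in overlapped_blocks([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], n=4, m=2):
        print(block)
```

With N = 128 and M = 64, as in the later application example, each call would deliver 64 old samples followed by 64 new ones.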
  • the power calculation module 203 is used to calculate the power of N frequency points, and the method of calculating the power is not limited.
  • the frequency domain filtering module 204 performs frequency domain filtering on the frequency domain far-end signal X(k) using the weights W n (k) of the adaptive filter, so the filtered frequency-domain estimated echo signal is the product of X(k) and W n (k).
  • the IFFT module 205 transforms the frequency-domain estimated echo signal into the time-domain estimated echo signal by IFFT. According to the principle of the overlap-save algorithm, only the last N - M samples are valid.
  • the last (N - M) samples of the time-domain estimated echo signal are subtracted from the corresponding non-overlapping (N - M) samples $[y(M+1), y(M+2), \ldots, y(N)]^T$ of the microphone's collected signal to obtain the last (N - M) samples $[e(M+1), e(M+2), \ldots, e(N)]^T$ of the time-domain error signal.
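The filtering and error steps can be sketched in a dependency-free way as below. The helper names `dft`, `block_error`, and `error_spectrum` are hypothetical, a naive O(N^2) DFT stands in for the FFT and IFFT modules, and the front zero-fill follows the description of the data zero-fill FFT module elsewhere in this document.

```python
import cmath
from typing import List, Sequence


def dft(values: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive DFT (or inverse DFT); a stand-in for an FFT routine."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def block_error(x_block: Sequence[float], w_freq: Sequence[complex],
                y_new: Sequence[float], m: int) -> List[float]:
    """Filter one length-N block in the frequency domain and return the
    time-domain error for its last N - m (valid) samples."""
    x_freq = dft(x_block)
    y_hat_freq = [xk * wk for xk, wk in zip(x_freq, w_freq)]
    y_hat = dft(y_hat_freq, inverse=True)
    valid = [v.real for v in y_hat[m:]]          # overlap-save: keep the tail only
    return [y - y_hat_n for y, y_hat_n in zip(y_new, valid)]


def error_spectrum(error: Sequence[float], m: int) -> List[complex]:
    """Zero-fill m samples in front of the error and transform it."""
    return dft([0.0] * m + list(error))


if __name__ == "__main__":
    w_freq = dft([0.5, 0.25, 0.0, 0.0])          # a 2-tap echo path, zero-padded to N = 4
    e = block_error([1.0, 2.0, 3.0, 4.0], w_freq, [2.0, 2.75], m=2)
    print(e)                                      # close to zero: the path is matched
    print(error_spectrum(e, m=2))
```

The first m output samples of the circular convolution are contaminated by wrap-around, which is why only the tail is compared with the microphone samples.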
  • in the absence of noise and near-end speech, the frequency domain error signal E(k) represents the linear echo signal remaining at the k-th frequency point. This is correct error information, and the adaptive filter weights updated with it are also accurate. If noise or near-end speech interference is present, however, E(k) represents the sum of the noise signal, the speech interference signal, and the residual linear echo signal at the k-th frequency point.
  • directly using such a frequency domain error signal E(k) to update the weights of the adaptive filter produces a large error and causes the filter to diverge, so E(k) needs to be limited. The method provided in the above embodiments may be used: divide E(k) into subbands, process them separately to obtain the limit errors, and use the limit errors to update the adaptive filter weights.
  • the frequency domain error control module 208 first divides the frequency domain error signal E(k) obtained by the data zero-fill FFT module 207 into subbands according to the method provided in the above embodiment, and then processes the multiple subbands separately to obtain the limit errors E′(k) of the multiple subbands.
  • the frequency domain weight update module 209 updates the weights of the adaptive filter according to the principle of the normalized least mean square (NLMS) algorithm and the limit error E′(k); the weight update rule of the adaptive filter is:
  • X * (k) is the conjugate of the frequency domain far-end signal X(k).
  • the weight value of the updated adaptive filter is W n+1 (k), and then it is sent by the frequency domain weight update module 209 to the frequency domain filter module 204 for adaptive filtering.
  • this embodiment provides a weight updating method for the adaptive filtering method. It can be applied to the adaptive filtering method in echo cancellation provided by any of the above embodiments and to the frequency domain weight update module of the adaptive filtering module in any of the above embodiments, and it can further reduce the steady-state error of the filter and avoid divergence of the filter.
  • $X^*(k) = [X^*(1), X^*(2), \ldots, X^*(N)]^T$,
  • $E'(k) = [E'(1), E'(2), \ldots, E'(N)]^T$;
  • $\nabla(n) = [\nabla(1), \nabla(2), \ldots, \nabla(N)]^T$;
  • $\nabla'(k) = [\nabla(1), \nabla(2), \ldots, \nabla(N)]^T$;
  • the weights of the adaptive filter are updated according to the iteration step size: in S305, based on the updated frequency domain gradient $\nabla'(k)$, an iteration step size $\mu$ with $\mu \in [0, 1]$ is selected and the adaptive filter weights are updated, where the weight update formula is:
  • $W_{n+1}(k) = W_n(k) + \mu \nabla'(k)$.
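A possible reading of the weight update is sketched below. Only the final formula, weights plus step size times the updated frequency domain gradient, is taken from the text. The intermediate steps are assumptions: the gradient is formed as the conjugate far-end spectrum times the limit error, and it is "updated" with the usual overlap-save gradient constraint (inverse transform, zero everything after the first m time-domain taps, transform back). The names `dft` and `update_weights` are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(values: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive DFT (or inverse DFT); a stand-in for an FFT routine."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def update_weights(w_freq: Sequence[complex], x_freq: Sequence[complex],
                   e_limited: Sequence[complex], mu: float, m: int) -> List[complex]:
    """Return W_{n+1}(k) = W_n(k) + mu * grad'(k)."""
    n = len(w_freq)
    # Assumed frequency domain gradient: conj(X(k)) * E'(k).
    grad_freq = [xk.conjugate() * ek for xk, ek in zip(x_freq, e_limited)]
    # Assumed gradient constraint: keep the first m time-domain taps only.
    grad_time = dft(grad_freq, inverse=True)
    constrained = grad_time[:m] + [0j] * (n - m)
    grad_updated = dft(constrained)
    return [wk + mu * gk for wk, gk in zip(w_freq, grad_updated)]


if __name__ == "__main__":
    x_freq = dft([1.0, 2.0, 3.0, 4.0])
    e_limited = dft([0.0, 0.0, 0.5, -0.25])
    print(update_weights([0j] * 4, x_freq, e_limited, mu=0.5, m=2))
```

The constraint keeps the time-domain filter confined to its first m taps, so the circular convolution used for filtering still yields valid linear-convolution output in the tail of each block.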
  • this embodiment provides an application scenario of an adaptive filtering method in echo cancellation.
  • the application scenario may adopt the adaptive filtering method in echo cancellation provided by any of the above embodiments, which can further reduce the steady-state error of the filter and avoid divergence of the filter. Even in the presence of noise interference and near-end speech, a good echo cancellation effect can be achieved.
  • FIG. 5 is a schematic diagram of a system structure of an application scenario of an adaptive filtering method in echo cancellation according to an embodiment of the present application.
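  • the system 100 in this embodiment may include a speaker 101, a microphone 102, a voice activity detection (VAD) module 103, a DTD module 104, an adaptive filtering module 105, and an NLP module 106.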
  • the adaptive filtering module 105 may be implemented by using the adaptive filtering method in the echo cancellation provided by any of the above embodiments.
  • the echo signal d(n) is generated through spatial propagation, and the echo signal d(n) is collected by the microphone 102.
  • the VAD module 103 is used to detect the voice activity of the time-domain far-end signal x(n) and the microphone's collected signal y(n) to determine whether there is a voice signal.
  • Voice activity detection can use energy, time domain information, frequency domain information, entropy information, etc. to determine the presence or absence of voice signals, or can select the corresponding characteristics to detect according to actual needs.
  • the DTD module 104 determines whether it is a two-way conversation based on the difference or correlation between the time domain far-end signal x(n) and the microphone acquisition signal y(n) in the energy domain.
  • the adaptive filtering module 105 approximates the true echo path by updating the weights of the adaptive filter according to the correlation between the time-domain far-end signal x(n) and the echo signal d(n), thereby obtaining the time-domain estimated echo signal. Subtracting the time-domain estimated echo signal from the microphone's collected signal y(n) completes the linear echo cancellation.
  • the adaptive filtering module 105 may be implemented in the time domain or the frequency domain.
  • Frequency domain adaptive filtering includes frequency domain block adaptive filtering and subband adaptive filtering. In this embodiment, frequency domain block adaptive filtering is used as an example for description.
  • the NLP module 106 is used to cancel the nonlinear residual echo in the echo cancellation system 100 to improve the effect of echo cancellation.
  • the NLP module 106 determines the magnitude of the nonlinear residual echo based on the correlation between the time-domain far-end signal x(n), the microphone's collected signal y(n), and the error signal e(n), and suppresses the nonlinear residual echo in combination with the outputs of the VAD module 103 and the DTD module 104.
  • the processing results of the VAD module 103 and the DTD module 104 can control the processing of the adaptive filtering module 105 and the NLP module 106. For example, they can control the iteration step size μ used when the adaptive filtering module 105 performs filtering and the degree of nonlinear suppression of the NLP module 106.
  • the VAD module 103 performs voice activity detection on the energy characteristics of the time-domain far-end signal x(n) and the microphone's collected signal y(n).
  • the VAD module 103 detects once every frame, and the time-domain far-end signal x(n) is smoothed
  • the power of the current far-end signal x(n) in the time domain is:
  • n is the starting frequency and m is the ending frequency.
  • the frequency of the voice signal is mainly concentrated in the range of 300-3400Hz, so the frequency range calculated according to the value of n and m and the frequency resolution falls within the range of 300Hz-3400Hz.
  • the VAD module 103 can set the threshold value TH1. If P cur > TH1, there is a voice signal, otherwise, there is no voice signal.
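The energy-based decision can be illustrated as follows. The exact power formula is not preserved in the text above, so this is only one plausible reading: the power is taken as the mean squared magnitude over the frequency points from a start bin to an end bin, smoothed recursively across frames, and compared with TH1. All function names and the smoothing factor are hypothetical.

```python
from typing import Sequence


def band_power(spectrum: Sequence[complex], start_bin: int, end_bin: int) -> float:
    """Mean squared magnitude over frequency points start_bin..end_bin
    (1-indexed and inclusive, with frequency point 1 at 0 Hz)."""
    band = spectrum[start_bin - 1:end_bin]
    return sum(abs(v) ** 2 for v in band) / len(band)


def smoothed_power(previous: float, current: float, smoothing: float) -> float:
    """Recursive smoothing across frames (assumed form, one update per frame)."""
    return smoothing * previous + (1.0 - smoothing) * current


def voice_present(p_cur: float, th1: float) -> bool:
    """A voice signal is declared present when the power exceeds TH1."""
    return p_cur > th1


if __name__ == "__main__":
    frame = [0j] * 5 + [2 + 0j] * 50 + [0j] * 10    # energy only in bins 6..55
    p = smoothed_power(0.0, band_power(frame, 6, 55), smoothing=0.5)
    print(p, voice_present(p, th1=1.0))
```

Restricting the sum to the bins that cover roughly 300 Hz to 3400 Hz keeps low-frequency and high-frequency noise from triggering the detector.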
  • the DTD module 104 is a double-end detection in a broad sense.
  • the main purpose is to detect the situation where only the far end has a voice signal, because the weight update of the adaptive filter is effective only when there is a voice signal at the far end.
  • the degree of non-linear suppression of the NLP module 106 can be further controlled.
  • if the DTD module 104 detects a one-way call, the nonlinear suppression of the NLP module 106 is stronger; if the DTD module 104 detects a two-way call, the degree of nonlinear suppression of the NLP module 106 is weaker, so as not to damage the speech.
  • the adaptive filtering module 105 adopts frequency-domain block adaptive filtering. Refer to the schematic structural diagram of the adaptive filtering module shown in FIG. 3. The specific process is:
  • the buffer module 201 buffers the incoming time-domain far-end signal x(n) and divides it into blocks. Each data block has a length of 128 and the 50% overlap-save method is used, so the first 64 samples of a block were processed in the previous block and the last 64 samples are the newly processed data.
  • the power calculation module 203 is used to calculate the power of the far-end signal X(k) in the frequency domain.
  • the method for calculating the power is not limited.
  • the forgetting factor is 0.8.
  • the frequency domain filtering module 204 is used to perform frequency domain filtering on the frequency domain far-end signal X(k), and the frequency domain estimated echo signal is obtained according to the weight W n (k) of the adaptive filter of the frequency domain filtering module 204.
  • the IFFT module 205 transforms the frequency-domain estimated echo signal into the time-domain estimated echo signal by IFFT. According to the principle of the overlap-save algorithm, only the last 64 samples are valid.
  • the data zero-filling FFT module 207 is used to fill 64 zeros in front of the last 64 data [e(65), e(66),...,e(128)] T of the time-domain error signal e(n) above.
  • the frequency domain error control module 208 first calculates the normalized error for the frequency domain error signal E(k) obtained by the data zero-fill FFT module 207:
  • can prevent the denominator in the above formula from being zero.
  • ‖X(k)‖ can be replaced with a smoothed power spectrum P(k).
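Since the normalization formula itself is not preserved above, the following is only an assumed form: the error at each frequency point is divided by the smoothed far-end power P(k) plus a small constant that keeps the denominator away from zero. The recursive smoothing with a forgetting factor of 0.8 is likewise an assumption about how P(k) is obtained, and the names `smooth_power`, `normalized_error`, and `delta` are hypothetical.

```python
from typing import List, Sequence


def smooth_power(p_prev: Sequence[float], x_freq: Sequence[complex],
                 forgetting: float = 0.8) -> List[float]:
    """Assumed recursive estimate: P(k) <- f * P(k) + (1 - f) * |X(k)|^2."""
    return [forgetting * p + (1.0 - forgetting) * abs(x) ** 2
            for p, x in zip(p_prev, x_freq)]


def normalized_error(e_freq: Sequence[complex], power: Sequence[float],
                     delta: float = 1e-6) -> List[complex]:
    """Assumed normalization: E_norm(k) = E(k) / (P(k) + delta)."""
    return [e / (p + delta) for e, p in zip(e_freq, power)]


if __name__ == "__main__":
    power = smooth_power([1.0, 1.0], [2 + 0j, 0j])
    print(power)
    print(normalized_error([3 + 4j, 1j], power))
```

Normalizing by the far-end power makes the later thresholds largely independent of the playback level, so a single threshold per subband can be used.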
  • the frequency resolution is 62.5Hz
  • the frequency of the voice signal is mainly concentrated in the range of 300Hz-3400Hz.
  • the frequency range of the voice signal therefore lies approximately between the 6th frequency point and the 55th frequency point.
  • the frequency domain error signals corresponding to frequency points 6 to 55 can be divided into the first subband, and the frequency domain error signals corresponding to frequency points 1 to 5 and frequency points 56 to 65 can be divided into the second subband.
  • the embodiment of the present application does not limit the number of divided subbands.
  • the frequency domain error signal corresponding to the frequency point near the frequency range of the voice signal can also be divided into the first subband, and this processing method can prevent high frequency leakage.
  • for example, the frequency domain error signals corresponding to frequency points 56 to 59 may also be divided into the first subband, that is, the frequency domain error signals corresponding to frequency points 6 to 59 may be divided into the first subband.
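The frequency point numbers quoted above follow from the frequency resolution. The sketch below reproduces them for the example of an 8000 Hz sampling rate and 128 points, with frequency point 1 at 0 Hz. The rounding rule (first point at or above the lower edge, last point at or below the upper edge) is an assumption that happens to match the numbers in the text, and the function names are hypothetical.

```python
import math
from typing import Tuple


def bin_frequency(number: int, sample_rate: float, fft_size: int) -> float:
    """Frequency in Hz of a 1-indexed frequency point (point 1 is 0 Hz)."""
    return (number - 1) * sample_rate / fft_size


def voice_bins(sample_rate: float, fft_size: int,
               f_low: float, f_high: float) -> Tuple[int, int]:
    """First and last 1-indexed frequency points inside [f_low, f_high]."""
    resolution = sample_rate / fft_size
    first = math.ceil(f_low / resolution) + 1
    last = math.floor(f_high / resolution) + 1
    return first, last


if __name__ == "__main__":
    print(8000 / 128)                                # frequency resolution in Hz
    print(voice_bins(8000, 128, 300.0, 3400.0))      # voice range as frequency points
    print(bin_frequency(25, 8000, 128))              # frequency of point 25
    print(128 // 2 + 1)                              # number of distinct frequency points
```

The 128-point transform of a real signal has 65 distinct frequency points, which is why the second subband in the example runs up to frequency point 65.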
  • the first subband can be further divided according to the frequency response curve of the speaker shown in FIG. 2.
  • the frequency domain error signal E(k) can be divided into two sub-bands, and the first preset value is 5.0dB as an example for illustration.
  • taking the 25th frequency point (which corresponds to a frequency of 1500 Hz) as the dividing line, the frequency domain error signals corresponding to frequency points 6 to 25 are divided into the third subband, and the frequency domain error signals corresponding to frequency points 26 to 59 are divided into the fourth subband.
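  • if the modulus value of the normalized error $E_{norm}(k)$ is less than the corresponding threshold value, the limit error is $E'(k) = E_{norm}(k)$; otherwise the limit error is the corresponding threshold value multiplied by $E_{norm}(k)$ and divided by the modulus value of $E_{norm}(k)$.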
  • the second fixed value may also be other values, and the size of the second fixed value is not limited in this embodiment.
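Putting the example layout together, the limit error for all 65 frequency points might be assembled as sketched below. The subband boundaries (6 to 25, 26 to 59, remainder) come from the example above; the function name `limit_spectrum`, the threshold values used in the demonstration, and the choice of zero as the second fixed value are illustrative assumptions.

```python
from typing import List, Sequence


def limit_spectrum(e_norm: Sequence[complex], th3: float, th4: float,
                   second_fixed: complex = 0j) -> List[complex]:
    """Apply per-subband limiting to a normalized error spectrum.

    Frequency points are 1-indexed: 6..25 form the third subband (threshold th3),
    26..59 the fourth subband (threshold th4), and all remaining points the
    second subband, whose limit error is set to a fixed value.
    """
    limited = []
    for number, value in enumerate(e_norm, start=1):
        if 6 <= number <= 25:
            threshold = th3
        elif 26 <= number <= 59:
            threshold = th4
        else:
            limited.append(second_fixed)
            continue
        modulus = abs(value)
        if modulus <= threshold:
            limited.append(value)
        else:
            limited.append(threshold * value / modulus)
    return limited


if __name__ == "__main__":
    e_norm = [2 + 0j] * 65
    out = limit_spectrum(e_norm, th3=0.5, th4=1.0)   # hypothetical thresholds, th3 < th4
    print(out[0], out[5], out[25], out[59])
```

Choosing the third threshold smaller than the fourth reflects the weaker echo energy in the third subband: a large error there is more likely to be noise or near-end speech than residual echo.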
  • the frequency domain weight update module 209 updates the weights of the adaptive filter according to the principle of the NLMS algorithm and the limit error E′(k); the weight update rule of the adaptive filter is:
  • X * (k) is the conjugate of the far-end signal X(k) in the frequency domain.
  • ∇′(k) is the frequency domain gradient.
  • μ is the iteration step size and μ ∈ [0, 1].
  • the steps refer to the flowchart of the weight update method in the foregoing embodiment.
  • FIG. 6 is a schematic structural diagram of the adaptive filtering device in echo cancellation provided in this embodiment.
  • the adaptive filtering device 60 in echo cancellation includes:
  • the sub-band dividing module 61 is used to divide the frequency-domain error signal into multiple sub-bands.
  • the frequency-domain error signal is obtained by transforming the time-domain error signal.
  • the time-domain error signal is obtained by subtracting the time-domain estimated echo signal from the microphone's collected signal;
  • the sub-band processing module 62 is configured to separately process multiple sub-bands to obtain the limiting errors of the multiple sub-bands.
  • the weight update module 63 is used to update the weight of the adaptive filter according to the limit errors of multiple subbands.
  • the sub-band dividing module includes:
  • a first dividing module for dividing the frequency domain error signal in the frequency range of the speech signal into first subbands
  • the second dividing module is used to divide the frequency domain error signal in the frequency range of the non-speech signal into second subbands.
  • the sub-band dividing module further includes:
  • a comparison module for comparing the speaker frequency response value corresponding to the frequency of the first sub-band with the first preset value
  • a third dividing module configured to divide the first subband corresponding to the frequency point where the frequency response value of the speaker is less than the first preset value into a third subband
  • the fourth dividing module is configured to divide the first subband corresponding to the frequency point where the frequency response value of the speaker is greater than the first preset value into a fourth subband.
  • the sub-band processing module includes:
  • Normalization module used to normalize the third subband and the fourth subband respectively to obtain the normalization error of the third subband and the normalization error of the fourth subband;
  • a modulo module for modulo the normalization error of the third subband and the normalization error of the fourth subband to obtain the modulus value of the third subband and the modulus value of the fourth subband, respectively;
  • the limit error generation module is configured to process the modulus value of the third subband and the modulus value of the fourth subband respectively to obtain the limit error of the third subband and the limit error of the fourth subband.
  • the limit error generation module includes:
  • a threshold setting module for setting a third threshold and a fourth threshold for the third subband and the fourth subband, respectively;
  • the limit error generation sub-module is used to determine the limit error of the third subband according to the modulus value of the third subband and the third threshold, and also to determine the limit error of the fourth subband according to the modulus value of the fourth subband and the fourth threshold.
  • the limit error generation submodule includes:
  • a third setting module configured to set the limit error of the third subband to the normalized error of the third subband when the modulus value of the third subband is less than the third threshold
  • a fourth setting module configured to set the limit error of the fourth subband to the normalized error of the fourth subband when the modulus value of the fourth subband is less than the fourth threshold;
  • the threshold setting module is also used to set the third threshold to be less than the fourth threshold.
  • the sub-band processing module also includes:
  • the fixed value setting module is used to set the limit error of the second subband to the second fixed value.
  • the fixed value setting module is also used to set the second fixed value to zero.
  • the normalization module is also used to normalize the second subband to obtain the normalization error of the second subband;
  • the modulo module is also used to modulo the normalized error of the second subband to obtain the modulus value of the second subband;
  • the limited error generation module is also used to process the modulus value of the second subband to obtain the limited error of the second subband.
  • the threshold setting module is also used to set a second threshold for the second subband
  • the limit error generation sub-module is also used to determine the limit error of the second sub-band according to the modulus value of the second sub-band and the second threshold value.
  • the limit error generation submodule also includes:
  • a second setting module configured to set the limit error of the second subband to the normalized error of the second subband when the modulus value of the second subband is less than the second threshold
  • the threshold setting module is also used to set the second threshold to be less than the third threshold.
  • An embodiment of the present application provides an adaptive filtering device in echo cancellation. The device performs adaptive filtering for echo cancellation by dividing the frequency domain error signal into subbands, processing them separately to obtain limit errors, and using the limit errors to update the adaptive filter weights. This can effectively reduce the amount of calculation and the steady-state error of the filter and avoid filter divergence, so a good echo cancellation effect can be achieved even in the presence of noise interference and near-end speech.
  • An embodiment of the present application may further provide a device for performing the adaptive filtering method in echo cancellation provided in the above embodiment.
  • the device 70 includes: a memory 71 and a processor 72;
  • the memory 71 is coupled to the processor 72;
  • the memory 71 is used to store program instructions
  • the processor 72 is used to call the program instructions stored in the memory, so that the device performs any adaptive filtering method in echo cancellation.
  • a device provided by an embodiment of the present application can execute the adaptive filtering method in echo cancellation provided by any of the foregoing embodiments.
  • An embodiment of the present application may further provide a computer-readable storage medium on which a computer program is stored, which when executed by the processor 72 implements an adaptive filtering method in any echo cancellation performed by the device.
  • the computer-readable storage medium provided by the embodiment of the present application can execute the adaptive filtering method in the echo cancellation provided by any of the above-mentioned embodiments.
  • the specific implementation process and beneficial effects refer to the above, which will not be repeated here.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the foregoing method embodiment may be completed by an integrated logic circuit of hardware in a processor or instructions in the form of software.
  • the aforementioned processor may be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application may be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied and executed by a hardware decoding processor, or may be executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a mature storage medium in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, and registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • the volatile memory may be random access memory (RAM), which is used as an external cache.
  • many forms of RAM may be used, for example static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced SDRAM, synchronous link dynamic random access memory (SLDRAM), and direct Rambus RAM.
  • B corresponding to A means that B is associated with A, and B can be determined according to A.
  • determining B based on A does not mean determining B based on A alone, and B may also be determined based on A and/or other information.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.


Abstract

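An adaptive filtering method, apparatus, device, and storage medium for echo cancellation. The adaptive filtering method in echo cancellation includes: dividing a frequency domain error signal into multiple subbands (S101); processing the multiple subbands separately to obtain limit errors of the multiple subbands (S102); and updating the weights of an adaptive filter according to the limit errors of the multiple subbands (S103). Adaptive filtering for echo cancellation is performed by dividing the frequency domain error signal into subbands, processing them separately to obtain limit errors, and using the limit errors to update the adaptive filter weights. This can effectively reduce the amount of calculation and the steady-state error of the filter and avoid filter divergence, so that a good echo cancellation effect can be achieved even in the presence of noise interference and near-end speech.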

Description

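An adaptive filtering method, apparatus, device, and storage medium for echo cancellation
Technical Field
The present application relates to the field of communications, and in particular to an adaptive filtering method, apparatus, device, and storage medium for echo cancellation.
Background
A filter is usually used to extract information of interest, close to a specified quality, from noisy data. An adaptive filter has a feedback channel between its output and the filter system, so the weights of the adaptive filter can be adjusted according to the error between the filter output at a given moment and the desired signal. The weights are thus adjusted dynamically to achieve optimal filtering, and adaptive filters are therefore widely used in various fields.
An important application of adaptive filters is echo cancellation, whose main idea is to use an adaptive filter to perform system identification of an unknown echo channel. Based on the correlation between the speaker signal and the echo signal it produces, a far-end signal model is established to simulate the echo path, and the adaptive algorithm adjusts it so that the estimated echo signal approximates the real echo signal, thereby achieving echo cancellation. The weights of the adaptive filter are generally updated using a gradient descent method, and during the weight update the iteration step size is generally changed adaptively to improve filter performance. However, this approach introduces a large amount of calculation and suffers from a large filter steady-state error and a filter that diverges easily.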
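Summary
In view of the problems in the prior art of a large amount of calculation, a large filter steady-state error, and easy divergence of the filter, the present application provides an adaptive filtering method, apparatus, device, and storage medium for echo cancellation.
A first aspect of the embodiments of the present application provides an adaptive filtering method in echo cancellation, including:
dividing a frequency domain error signal into multiple subbands, where the frequency domain error signal is obtained by transforming a time domain error signal, and the time domain error signal is obtained by subtracting the time-domain estimated echo signal from the microphone's collected signal;
processing the multiple subbands separately to obtain limit errors of the multiple subbands;
updating the weights of the adaptive filter according to the limit errors of the multiple subbands.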
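In addition, with reference to the first aspect, in an implementation of the first aspect, dividing the frequency domain error signal into multiple subbands includes: dividing the frequency domain error signal within the frequency range of the speech signal into a first subband;
dividing the frequency domain error signal within the frequency range of the non-speech signal into a second subband.
In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, dividing the frequency domain error signal into multiple subbands further includes: comparing the speaker frequency response value corresponding to each frequency point of the first subband with a first preset value;
dividing the part of the first subband corresponding to frequency points whose speaker frequency response value is less than the first preset value into a third subband;
dividing the part of the first subband corresponding to frequency points whose speaker frequency response value is greater than the first preset value into a fourth subband.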
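In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, processing the multiple subbands separately to obtain the limit errors of the multiple subbands includes:
normalizing the third subband and the fourth subband separately to obtain the normalized error of the third subband and the normalized error of the fourth subband;
taking the modulus of the normalized error of the third subband and of the normalized error of the fourth subband to obtain the modulus value of the third subband and the modulus value of the fourth subband;
processing the modulus value of the third subband and the modulus value of the fourth subband separately to obtain the limit error of the third subband and the limit error of the fourth subband.
In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, processing the modulus value of the third subband and the modulus value of the fourth subband separately to obtain the limit error of the third subband and the limit error of the fourth subband includes:
setting a third threshold and a fourth threshold for the third subband and the fourth subband, respectively;
determining the limit error of the third subband according to the modulus value of the third subband and the third threshold, and determining the limit error of the fourth subband according to the modulus value of the fourth subband and the fourth threshold.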
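In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, determining the limit error of the third subband according to the modulus value of the third subband and the third threshold, and determining the limit error of the fourth subband according to the modulus value of the fourth subband and the fourth threshold, includes:
if the modulus value of the third subband is less than the third threshold, setting the limit error of the third subband to the normalized error of the third subband;
if the modulus value of the third subband is greater than the third threshold, setting the limit error of the third subband to the third threshold multiplied by the normalized error of the third subband and then divided by the modulus value of the third subband;
if the modulus value of the fourth subband is less than the fourth threshold, setting the limit error of the fourth subband to the normalized error of the fourth subband;
if the modulus value of the fourth subband is greater than the fourth threshold, setting the limit error of the fourth subband to the fourth threshold multiplied by the normalized error of the fourth subband and then divided by the modulus value of the fourth subband.
In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, the third threshold is set to be less than the fourth threshold.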
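In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, processing the multiple subbands separately to obtain the limit errors of the multiple subbands further includes: setting the limit error of the second subband to a second fixed value.
In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, the second fixed value is set to zero.
In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, processing the multiple subbands separately to obtain the limit errors of the multiple subbands further includes:
normalizing the second subband to obtain the normalized error of the second subband;
taking the modulus of the normalized error of the second subband to obtain the modulus value of the second subband;
processing the modulus value of the second subband to obtain the limit error of the second subband.
In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, processing the modulus value of the second subband to obtain the limit error of the second subband includes: setting a second threshold for the second subband; and determining the limit error of the second subband according to the modulus value of the second subband and the second threshold.
In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, determining the limit error of the second subband according to the modulus value of the second subband and the second threshold includes: if the modulus value of the second subband is less than the second threshold, setting the limit error of the second subband to the normalized error of the second subband;
if the modulus value of the second subband is greater than the second threshold, setting the limit error of the second subband to the second threshold multiplied by the normalized error of the second subband and then divided by the modulus value of the second subband.
In addition, with reference to the first aspect and the foregoing implementations thereof, in another implementation of the first aspect, the second threshold is set to be less than the third threshold.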
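A second aspect of the embodiments of the present application provides an adaptive filtering apparatus in echo cancellation, including:
a subband dividing module, configured to divide a frequency domain error signal into multiple subbands, where the frequency domain error signal is obtained by transforming a time domain error signal, and the time domain error signal is obtained by subtracting the time-domain estimated echo signal from the microphone's collected signal;
a subband processing module, configured to process the multiple subbands separately to obtain limit errors of the multiple subbands; and
a weight update module, configured to update the weights of the adaptive filter according to the limit errors of the multiple subbands.
In addition, with reference to the second aspect, in an implementation of the second aspect, the subband dividing module includes:
a first dividing module, configured to divide the frequency domain error signal within the frequency range of the speech signal into a first subband; and
a second dividing module, configured to divide the frequency domain error signal within the frequency range of the non-speech signal into a second subband.
In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the subband dividing module further includes:
a comparison module, configured to compare the speaker frequency response value corresponding to each frequency point of the first subband with a first preset value;
a third dividing module, configured to divide the part of the first subband corresponding to frequency points whose speaker frequency response value is less than the first preset value into a third subband; and
a fourth dividing module, configured to divide the part of the first subband corresponding to frequency points whose speaker frequency response value is greater than the first preset value into a fourth subband.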
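In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the subband processing module includes:
a normalization module, configured to normalize the third subband and the fourth subband separately to obtain the normalized error of the third subband and the normalized error of the fourth subband;
a modulus module, configured to take the modulus of the normalized error of the third subband and of the normalized error of the fourth subband to obtain the modulus value of the third subband and the modulus value of the fourth subband; and
a limit error generation module, configured to process the modulus value of the third subband and the modulus value of the fourth subband separately to obtain the limit error of the third subband and the limit error of the fourth subband.
In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the limit error generation module includes:
a threshold setting module, configured to set a third threshold and a fourth threshold for the third subband and the fourth subband, respectively; and
a limit error generation submodule, configured to determine the limit error of the third subband according to the modulus value of the third subband and the third threshold, and further configured to determine the limit error of the fourth subband according to the modulus value of the fourth subband and the fourth threshold.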
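In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the limit error generation submodule includes:
a third setting module, configured to set the limit error of the third subband to the normalized error of the third subband when the modulus value of the third subband is less than the third threshold; or
further configured to set the limit error of the third subband to the third threshold multiplied by the normalized error of the third subband and then divided by the modulus value of the third subband when the modulus value of the third subband is greater than the third threshold; and
a fourth setting module, configured to set the limit error of the fourth subband to the normalized error of the fourth subband when the modulus value of the fourth subband is less than the fourth threshold; or
further configured to set the limit error of the fourth subband to the fourth threshold multiplied by the normalized error of the fourth subband and then divided by the modulus value of the fourth subband when the modulus value of the fourth subband is greater than the fourth threshold.
In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the threshold setting module is further configured to set the third threshold to be less than the fourth threshold.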
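In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the subband processing module further includes: a fixed value setting module, configured to set the limit error of the second subband to a second fixed value.
In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the fixed value setting module is further configured to set the second fixed value to zero.
In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the normalization module is further configured to normalize the second subband to obtain the normalized error of the second subband;
the modulus module is further configured to take the modulus of the normalized error of the second subband to obtain the modulus value of the second subband;
the limit error generation module is further configured to process the modulus value of the second subband to obtain the limit error of the second subband.
In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the threshold setting module is further configured to set a second threshold for the second subband;
the limit error generation submodule is further configured to determine the limit error of the second subband according to the modulus value of the second subband and the second threshold.
In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the limit error generation submodule further includes:
a second setting module, configured to set the limit error of the second subband to the normalized error of the second subband when the modulus value of the second subband is less than the second threshold; or
further configured to set the limit error of the second subband to the second threshold multiplied by the normalized error of the second subband and then divided by the modulus value of the second subband when the modulus value of the second subband is greater than the second threshold.
In addition, with reference to the second aspect and the foregoing implementations thereof, in another implementation of the second aspect, the threshold setting module is further configured to set the second threshold to be less than the third threshold.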
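A third aspect of the embodiments of the present application provides a device for adaptive filtering in echo cancellation, including: a memory and a processor;
the memory is coupled to the processor;
the memory is configured to store program instructions;
the processor is configured to call the program instructions stored in the memory, so that the device performs the adaptive filtering method in echo cancellation of the first aspect.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the adaptive filtering method in echo cancellation of the first aspect.
Compared with the prior art, the beneficial effects of the embodiments of the present application are as follows: the embodiments provide an adaptive filtering method, apparatus, device, and storage medium for echo cancellation, in which adaptive filtering for echo cancellation is performed by dividing the frequency domain error signal into subbands, processing them separately to obtain limit errors, and using the limit errors to update the adaptive filter weights. This can effectively reduce the amount of calculation and the steady-state error of the filter and avoid filter divergence, so that a good echo cancellation effect can be achieved even in the presence of noise interference and near-end speech.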
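Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of an adaptive filtering method in echo cancellation according to an embodiment of the present application;
FIG. 2 is a frequency response curve of a speaker according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an adaptive filtering module according to an embodiment of the present application;
FIG. 4 is a flowchart of a weight updating method of the adaptive filtering method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the system structure of an application scenario of an adaptive filtering method in echo cancellation according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an adaptive filtering apparatus in echo cancellation according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a device according to an embodiment of the present application.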
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请的部分实施例采用举例的方式进行详细的阐述。然而,本领域的普通技术人员可以理解,在各例子中,为了使读者更好地理解本申请而提出了许多技术细节。但是,即使没有这些技术细节和基于以下各实施例的种种变化和修改,也可以实现本申请所要求保护的技术方案。
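Refer to FIG. 1, a flowchart of an adaptive filtering method for echo cancellation according to an embodiment of this application. This embodiment divides the frequency-domain error signal into subbands, processes them separately to obtain limited errors, and uses the limited errors to update the adaptive filter weights, thereby performing adaptive filtering to cancel echo. The method includes the following steps:
S101: divide the frequency-domain error signal into a plurality of subbands;
S102: process the subbands separately to obtain a limited error for each subband;
S103: update the weights of the adaptive filter according to the limited errors of the subbands.
In S101, the frequency-domain error signal is obtained by transforming the time-domain error signal, and the time-domain error signal is obtained by subtracting the time-domain estimated echo signal from the signal captured by the microphone. Note that a subband may consist of the frequency-domain error signal at contiguous frequency bins; for example, one subband may be the frequency-domain error signal at bins 6 to 59. A subband may also consist of the frequency-domain error signal at non-contiguous bins; for example, another subband may be the frequency-domain error signal at bins 1 to 5 and bins 60 to 65. Bins 6 to 59 lie between those two ranges and do not belong to that subband, yet the frequency-domain error signal at these non-contiguous bins can still be grouped into a single subband. In addition, the embodiments of this application do not limit the number of subbands. The partitioning described here is only illustrative, and a person skilled in the art can obtain other partitions from the above embodiment without creative effort.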
在S102中,分别处理多个子带,多个子带的处理方式可以互不相同,也可以有两个或者多个子带的处理方式相同,可以依据实际情况而定,本实施例对子带的处理方式不做限制。
在S103中,根据S102中得到的多个子带的限制误差,用以上全部子带的限制误差来更新自适应滤波器的权值。本实施例中,自适应滤波器可以简称为滤波器。
本申请实施例的有益效果在于:本申请实施例提供了一种回声消除中的自适应滤波方法,采用对频域误差信号划分子带并分别处理得到限制误差,用限制误差来更新自适应滤波器权值的方法进行自适应滤波来实现回声消除,能有效地减小计算量和滤波器稳态误差,并且避免滤波器发散,即使是在噪声干扰和近端语音存在的情况下,也能达到较好的回声消除效果。
基于上述实施例公开的内容,可选的,本实施例中,将频域误差信号划分成多个子带可以包括以下两个步骤:
将语音信号频率范围内的频域误差信号划分为第一子带;
将非语音信号频率范围内的频域误差信号划分为第二子带。
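This embodiment is illustrated with a sampling rate of 8000 Hz and 128 sample points. The frequency resolution is 8000 Hz / 128 = 62.5 Hz, and speech is concentrated mainly in the 300 Hz to 3400 Hz range, so the speech frequency range corresponds approximately to bins 6 through 55, where bin 1 corresponds to 0 Hz (bin k therefore corresponds to (k − 1) × 62.5 Hz: 312.5 Hz for bin 6 and 3375 Hz for bin 55). Following the subband division of this embodiment, the frequency-domain error signal at bins 6 to 55 can be assigned to the first subband, and the frequency-domain error signal at bins 1 to 5 and bins 56 to 65 to the second subband. In addition, to prevent high-frequency leakage, the frequency-domain error signal at bins adjacent to the speech frequency range can also be assigned to the first subband. For example, bins 56 to 59 can be added to the first subband, so that bins 6 to 59 form the first subband and bins 1 to 5 together with bins 60 to 65 form the second subband. The frequency-domain error signal outside the speech frequency range usually contains low-frequency or high-frequency noise. Dividing the frequency-domain error signal into a first and a second subband in this way allows the two subbands to be processed separately to obtain limited errors, which further reduces the filter's steady-state error and avoids filter divergence.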
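The bin arithmetic in the example above can be checked with a short sketch. It assumes 1-based bin numbering with bin 1 at 0 Hz, as the text states; the function name `split_speech_bins` and its parameters are illustrative and do not come from the application.

```python
def split_speech_bins(fs=8000, n_fft=128, f_lo=300.0, f_hi=3400.0):
    """Return (first_subband, second_subband) as lists of 1-based bin numbers."""
    resolution = fs / n_fft          # 62.5 Hz for the example values
    n_bins = n_fft // 2 + 1          # bins 1..65 cover 0 Hz up to fs/2
    first, second = [], []
    for k in range(1, n_bins + 1):
        freq = (k - 1) * resolution  # bin 1 corresponds to 0 Hz
        if f_lo <= freq <= f_hi:
            first.append(k)          # speech range -> first subband
        else:
            second.append(k)         # everything else -> second subband
    return first, second


first, second = split_speech_bins()
print(first[0], first[-1])   # first and last bin of the speech subband
```

Widening the first subband to bins 6 to 59, as the text suggests for leakage protection, corresponds to raising `f_hi` to 3625 Hz in this sketch.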
另外,在其他实施例中,也可以对第一子带和第二子带进行进一步划分,例如将上述实施例中的第二子带中的第1号到第5号频点对应的频域误差信号划分为一个子带,第60号频点到第65号频点对应的频域误差信号划分为另一个子带,本申请对划分子带的个数不作限制。
基于上述实施例公开的内容,可选的,本实施例提供的回声消除中的自适应滤波方法,将频域误差信号划分成多个子带还包括以下步骤:
比较第一子带的频点对应的扬声器频率响应值与第一预设值;
将扬声器频率响应值小于第一预设值的频点对应的第一子带划分为第三子带;
将扬声器频率响应值大于第一预设值的频点对应的第一子带划分为第四子带。
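In this embodiment, the first subband can be further divided according to the loudspeaker frequency response at its bins. Because the loudspeaker response differs from bin to bin within the first subband, the frequency-domain error signal at those bins also depends on the loudspeaker response. Taking a first preset value of 5.0 dB as an example and referring to the loudspeaker frequency response curve in FIG. 2, the part of the first subband at bins where the response is below 5.0 dB can be assigned to the third subband, and the part at bins where the response is above 5.0 dB to the fourth subband. FIG. 2 shows that the response equals 5.0 dB at roughly 1500 Hz, which is bin 25, so the frequency-domain error signal at bins 6 to 24 can be assigned to the third subband and that at bins 26 to 55 to the fourth subband. As FIG. 2 shows, the loudspeaker response is strongly attenuated at bins 6 to 24 and the echo energy there is weak, so the third subband carries little energy while the fourth subband carries more. Dividing the first subband this way lets the third and fourth subbands be processed separately afterwards, which further reduces the filter's steady-state error and avoids filter divergence. The value 5.0 dB is only an example; a person skilled in the art can obtain other first preset values from the above embodiment without creative effort. This embodiment also does not restrict whether a bin whose response equals the first preset value goes to the third or the fourth subband, and the frequency-domain error signal at that bin may be handled in some other way. For example, bins 6 to 25 can form the third subband and bins 26 to 55 the fourth. Optionally, to prevent high-frequency leakage, the frequency-domain error signal at bins adjacent to the speech frequency range can also be placed in the first subband, for example bins 56 to 59, so that the first subband covers bins 6 to 59; in that case bins 6 to 25 can form the third subband and bins 26 to 59 the fourth.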
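A minimal sketch of this second split follows. The response curve below is made up purely so that it crosses 5.0 dB at bin 25; it is not the curve of FIG. 2. The `equal_to_third` flag is a hypothetical way to express the choice the text leaves open for a bin whose response equals the preset value.

```python
def split_by_response(first_subband, response_db, preset_db=5.0, equal_to_third=True):
    """Split the first subband into (third, fourth) by loudspeaker response per bin."""
    third, fourth = [], []
    for k in first_subband:
        r = response_db[k]
        if r < preset_db or (r == preset_db and equal_to_third):
            third.append(k)    # weak loudspeaker response -> low echo energy
        else:
            fourth.append(k)   # stronger response -> higher echo energy
    return third, fourth


# Hypothetical response: rises linearly and reaches 5.0 dB at bin 25 (1500 Hz).
response_db = {k: 5.0 + 0.5 * (k - 25) for k in range(6, 56)}
third, fourth = split_by_response(list(range(6, 56)), response_db)
print(third[-1], fourth[0])
```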
基于上述实施例公开的内容,可选的,本实施例中,分别处理多个子带,得到多个子带的限制误差包括:
对第三子带和第四子带分别进行归一化得到第三子带的归一化误差和第四子带的归一化误差;
对第三子带的归一化误差和第四子带的归一化误差分别求模得到第三子带的模值和第四子带的模值;
分别处理第三子带的模值和第四子带的模值得到第三子带的限制误差和第四子带的限制误差。
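Processing the subbands separately to obtain limited errors, as in this embodiment, further reduces the filter's steady-state error and avoids filter divergence. In this embodiment, the normalized error may be computed by, but is not limited to, the following:
$$E_{norm}(k) = \frac{E(k)}{\|X(k)\| + \delta}$$
In the formula above, δ prevents the denominator from being zero, and X(k) is the frequency-domain far-end signal, obtained by transforming the time-domain far-end signal x(n). The normalized error can also be computed from the power of X(k): smoothing yields the smoothed power spectrum P(k), given by $P(k) = \alpha\,P(k) + (1-\alpha)\,|X(k)|^2$, where k = 1, 2, …, N and α is the forgetting factor. The term ‖X(k)‖ in the normalization can then be replaced by the smoothed power spectrum P(k). In this embodiment, the modulus is written $|E_{norm}(k)|$ and the limited error is written $E'(k)$.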
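As a rough illustration of the normalization step, the sketch below implements the variant in which the smoothed power spectrum P(k) takes the place of ‖X(k)‖. The value α = 0.8 appears later in the description; the default `delta` and both function names are assumptions made for this example only.

```python
def smooth_power(p_prev, x_bins, alpha=0.8):
    """Recursive smoothing: P(k) = alpha * P(k) + (1 - alpha) * |X(k)|^2."""
    return [alpha * p + (1.0 - alpha) * abs(x) ** 2 for p, x in zip(p_prev, x_bins)]


def normalized_error(e_bins, p_bins, delta=1e-10):
    """Per-bin normalized error; delta keeps the denominator away from zero."""
    return [e / (p + delta) for e, p in zip(e_bins, p_bins)]


p = smooth_power([1.0, 1.0], [2 + 0j, 0j], alpha=0.5)
e_norm = normalized_error([2 + 2j, 1j], p)
print(p, e_norm)
```

Because the normalization is per bin, a bin with little far-end power gets a large normalized error, which is one reason the subsequent thresholding matters.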
基于上述实施例公开的内容,可选的,本实施例中,分别处理第三子带的模值和第四子带的模值得到第三子带的限制误差和第四子带的限制误差包括以下步骤:
对第三子带和第四子带分别设置第三门限值和第四门限值;
根据第三子带的模值与第三门限值确定第三子带的限制误差,根据第四子带的模值与第四门限值确定第四子带的限制误差。
本实施例中,对第三门限值和第四门限值的具体数值不做限制,第三门限值和第四门限值可以相同,也可以不同。根据第三子带的模值与第三门限值确定第三子带的限制误差,根据第四子带的模值与第四门限值确定第四子带的限制误差,在一定程度上,进一步地避免了滤波器发散,减小了滤波器的稳态误差。
基于上述实施例公开的内容,可选的,本实施例中,根据第三子带的模值与第三门限值确定第三子带的限制误差,根据第四子带的模 值与第四门限值确定第四子带的限制误差包括以下步骤:
若第三子带的模值小于第三门限值,则设置第三子带的限制误差为第三子带的归一化误差;
若第三子带的模值大于第三门限值,则设置第三子带的限制误差为第三门限值乘以第三子带的归一化误差得到的结果再除以第三子带的模值;
若第四子带的模值小于第四门限值,则设置第四子带的限制误差为第四子带的归一化误差;
若第四子带的模值大于第四门限值,则设置第四子带的限制误差为第四门限值乘以第四子带的归一化误差得到的结果再除以第四子带的模值。
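In this embodiment, if the modulus $|E_{norm}(k)|$ of the third subband is less than the third threshold TH3, the limited error of the third subband is set to $E'(k) = E_{norm}(k)$; if $|E_{norm}(k)|$ is greater than TH3, the limited error of the third subband is set to:
$$E'(k) = \frac{TH3 \cdot E_{norm}(k)}{|E_{norm}(k)|}$$
If the modulus $|E_{norm}(k)|$ of the fourth subband is less than the fourth threshold TH4, the limited error of the fourth subband is set to $E'(k) = E_{norm}(k)$; if $|E_{norm}(k)|$ is greater than TH4, the limited error of the fourth subband is set to:
$$E'(k) = \frac{TH4 \cdot E_{norm}(k)}{|E_{norm}(k)|}$$
In this embodiment, the threshold of a given subband is written TH. Note that if the modulus $|E_{norm}(k)|$ of the frequency-domain error signal at a bin of the third or fourth subband equals that subband's threshold, the limited error at that bin is not restricted. It may be $E'(k) = E_{norm}(k)$, or it may be:
$$E'(k) = \frac{TH \cdot E_{norm}(k)}{|E_{norm}(k)|}$$
The limited error at that bin may also be set in some other way; this embodiment places no restriction on it. By determining each subband's limited error from its modulus and its threshold, this embodiment further avoids filter divergence and reduces the filter's steady-state error.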
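The thresholding rule amounts to clipping the magnitude of the normalized error while keeping its phase. A minimal sketch, with a hypothetical function name; for the equality case, which the text leaves open, it returns the value unchanged:

```python
def limit_error(e_norm, th):
    """Clip the magnitude of a complex normalized error to th, preserving phase."""
    mag = abs(e_norm)
    if mag <= th:
        # Below the threshold (or equal to it, an open choice): pass through.
        return e_norm
    # Above the threshold: E'(k) = TH * E_norm(k) / |E_norm(k)|, magnitude TH.
    return th * e_norm / mag


print(limit_error(3 + 4j, 10.0))   # below the threshold, unchanged
print(limit_error(3 + 4j, 1.0))    # rescaled to magnitude 1.0, same direction
```

Since the output magnitude never exceeds the threshold, a burst of near-end speech or noise in one bin can move the weights by only a bounded amount per update.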
基于上述实施例公开的内容,可选的,本实施例中,设置第三门限值小于第四门限值,具体地,可以设置第三门限值TH3为0.0000001,第四门限值TH4为0.00005。另外,本申请实施例对第三门限值和第四门限值的具体数值不做限制,本申请实施例提供的门限值的设置方法仅为示例性说明,本领域的技术人员在不付出创造性劳动的前提下可以根据上述实施例得到其他门限值设置方式。设置第三门限值小于第四门限值,在一定程度上,进一步地避免了滤波器发散,减小了滤波器的稳态误差。
基于上述实施例公开的内容,可选的,本实施例中,分别处理所述多个子带,得到多个子带的限制误差还包括将第二子带的限制误差设置为第二定值,由于第二子带通常含有一些低频或高频噪声,如果直接使用第二子带对自适应滤波器的权值进行更新,有很大概率会使得滤波器发散,因此,将第二子带的限制误差设置为第二定值,可以避免自适应滤波器发散,降低滤波器的稳态误差,并且,可以减小计算量。需要说明的是,本实施例对第二定值的大小不做限制。
基于上述实施例公开的内容,可选的,本实施例中,可以设置第二定值为零,即设置第二子带的限制误差E′(k)=0,将第二子带的限制误差设置为零,可以进一步地避免自适应滤波器发散,降低滤波器的稳态误差,并且,可以减小计算量。
基于上述实施例公开的内容,在另一种可选的实施例当中,分别处理多个子带,得到多个子带的限制误差还包括对第二子带的另一种处理方式,具体包括以下步骤:
对第二子带进行归一化得到第二子带的归一化误差;
对第二子带的归一化误差求模得到第二子带的模值;
处理第二子带的模值得到第二子带的限制误差。
基于上述实施例公开的内容,可选的,本实施例中,处理第二子带的模值得到第二子带的限制误差可以包括以下步骤:
对第二子带设置第二门限值;
根据第二子带的模值与第二门限值确定第二子带的限制误差。
本实施例中,根据第二子带的模值与第二门限值确定第二子带的限制误差,在一定程度上,进一步地避免了滤波器发散,减小了滤波器的稳态误差。
基于上述实施例公开的内容,可选的,本实施例中,根据第二子带的模值与第二门限值确定第二子带的限制误差可以包括以下步骤:
若第二子带的模值小于第二门限值,则设置第二子带的限制误差为第二子带的归一化误差;
若第二子带的模值大于第二门限值,则设置第二子带的限制误差为第二门限值乘以第二子带的归一化误差得到的结果再除以第二子带的模值。
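In this embodiment, if the modulus $|E_{norm}(k)|$ of the second subband is less than the second threshold TH2, the limited error of the second subband is set to $E'(k) = E_{norm}(k)$; if $|E_{norm}(k)|$ is greater than TH2, the limited error of the second subband is set to:
$$E'(k) = \frac{TH2 \cdot E_{norm}(k)}{|E_{norm}(k)|}$$
Note that if the modulus $|E_{norm}(k)|$ of the frequency-domain error signal at a bin of the second subband equals the second threshold TH2, the limited error at that bin is not restricted. It may be $E'(k) = E_{norm}(k)$, or it may be:
$$E'(k) = \frac{TH2 \cdot E_{norm}(k)}{|E_{norm}(k)|}$$
The limited error at that bin may also be set in some other way; this embodiment places no restriction on it.
Based on the content disclosed in the above embodiments, optionally, in this embodiment the second threshold TH2 is set smaller than the third threshold TH3, which further avoids filter divergence and reduces the filter's steady-state error.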
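The two alternatives for the second subband (a fixed value of zero, or its own threshold) can sit side by side with the third and fourth subbands in one dispatch loop. In this sketch, `None` as a threshold is a convention chosen for the example to mean "fixed value zero", and the data layout is hypothetical; the threshold values are the example values given in the description.

```python
def limit_all_bands(e_norm, bands):
    """e_norm: dict bin -> complex; bands: list of (bins, threshold or None)."""
    limited = {}
    for bins, th in bands:
        for k in bins:
            e = e_norm[k]
            if th is None:
                limited[k] = 0j                 # second fixed value = 0
            elif abs(e) > th:
                limited[k] = th * e / abs(e)    # clip magnitude to the threshold
            else:
                limited[k] = e                  # keep the normalized error as is
    return limited


TH2, TH3, TH4 = 1e-8, 1e-7, 5e-5               # example values, TH2 < TH3 < TH4
e_norm = {1: 1e-3 + 0j, 10: 1e-3 + 0j, 40: 1e-6 + 0j}
zeroed = limit_all_bands(e_norm, [([1], None), ([10], TH3), ([40], TH4)])
clipped = limit_all_bands(e_norm, [([1], TH2), ([10], TH3), ([40], TH4)])
print(zeroed[1], abs(clipped[1]))
```

Setting the second subband to zero also removes its bins from the gradient computation, which is where the reduction in computation mentioned in the text would come from.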
基于上述实施例公开的内容,可选的,本实施例提供一种自适应滤波模块的结构示意图,该自适应滤波模块可以执行上述任一实施例提供的回声消除中的自适应滤波方法。自适应滤波的实现方法有很多, 具体的可以分为时域滤波和频域滤波。时域滤波主要是对时域采样点使用自适应算法逐一计算,而频域滤波主要通过FFT变换,在频域进行自适应滤波器的权值的自适应调整,频域滤波相对于时域滤波计算量相对较小。本实施例采用频域滤波的自适应滤波方法,图3是根据本申请实施例的自适应滤波模块的结构示意图,如图3所示,自适应滤波模块105包括:数据缓存模块201、FFT模块202、功率计算模块203、频域滤波模块204、IFFT模块205、数据分析模块206、数据补零FFT模块207、频域误差控制模块208以及频域权值更新模块209。需要说明的是,本申请实施例提供的自适应滤波模块的结构还可以包括其他模块,本实施例对此不做限制。
自适应滤波模块105的输入信号为远端信号x(n),数据缓存模块201对进入的远端信号x(n)进行缓存,并且对远端信号x(n)进行数据分块和重叠操作。假设每个数据块长度为N,并且,每两个相邻数据块之间重叠M个数据,则以每个数据块为整体进行滤波时,实际处理的数据为N-M个,剩下M个数据为上一次处理过的数据。
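FFT module 202 transforms the far-end signal x(n) into the frequency-domain far-end signal X(k) by FFT. With N data points per block as above, $x(n) = [x(1), x(2), \ldots, x(N)]^T$ is transformed into $X(k) = [X(1), X(2), \ldots, X(N)]^T$; the N frequency bins correspond to the N data points.
Power calculation module 203 computes the power at the N bins. The method is not limited; optionally, this embodiment uses smoothing to obtain the smoothed power spectrum P(k): $P(k) = \alpha\,P(k) + (1-\alpha)\,|X(k)|^2$, k = 1, 2, …, N, where α is the forgetting factor.
Frequency-domain filtering module 204 filters the frequency-domain far-end signal X(k) in the frequency domain. With adaptive filter weights $W_n(k)$, the frequency-domain estimated echo signal after filtering is
$$\hat{D}(k) = X(k)\,W_n(k)$$
IFFT module 205 transforms the frequency-domain estimated echo signal $\hat{D}(k)$ into the time-domain estimated echo signal $\hat{d}(n)$ by IFFT, that is, $\hat{D}(k) = [\hat{D}(1), \hat{D}(2), \ldots, \hat{D}(N)]^T$ is transformed into $\hat{d}(n) = [\hat{d}(1), \hat{d}(2), \ldots, \hat{d}(N)]^T$. By the overlap-save principle, only the last N−M samples are valid.
Data analysis module 206 applies the overlap-save algorithm: it takes the (N−M) samples of the microphone signal y(n) at the current time and combines them with the M samples from the previous time to obtain $y(n) = [y(1), y(2), \ldots, y(N)]^T$. The N−M new microphone samples $[y(M+1), y(M+2), \ldots, y(N)]^T$ correspond to the (N−M) non-overlapping samples of the time-domain far-end signal. Subtracting the last (N−M) samples of the time-domain estimated echo signal, $[\hat{d}(M+1), \hat{d}(M+2), \ldots, \hat{d}(N)]^T$, from these microphone samples gives the last (N−M) samples of the time-domain error signal, $[e(M+1), e(M+2), \ldots, e(N)]^T$.
Zero-padding FFT module 207 prepends M zeros to the last N−M samples of the time-domain error signal to obtain $e(n) = [e(1), e(2), \ldots, e(N)]^T$, and then applies an FFT to the zero-padded time-domain error signal e(n) to obtain the frequency-domain error signal E(k), $E(k) = [E(1), E(2), \ldots, E(N)]^T$.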
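The time-domain part of modules 206 and 207 can be sketched as follows: keep only the last N−M samples of the difference and put M zeros in front. The function name and the tiny test vectors are illustrative; the FFT that follows is omitted here.

```python
def padded_error_block(y_block, d_hat_block, m):
    """Overlap-save error: M leading zeros, then y - d_hat for the last N - M samples."""
    n = len(y_block)
    # Only the last N - M outputs of the circular convolution are valid.
    tail = [y_block[i] - d_hat_block[i] for i in range(m, n)]
    return [0.0] * m + tail


e_block = padded_error_block([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 1.0, 1.0], m=2)
print(e_block)
```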
假设在没有噪声也没有近端语音干扰的理想情况下(无非线性回声)频域误差信号E(k)表示的是在第k号频点处残余的线性回声信号,这部分信息是正确的误差信息,用来更新自适应滤波器的权值也是准确的,但是如果有噪声或者有近端语音干扰存在时,频域误差信号E(k)表示在第k号频点处的噪声信号、语音干扰信号和残余的线性回声信号的总和,此时用频域误差信号E(k)来更新自适应滤波器的权值就会产生很大的误差,并且会使得滤波器发散,因此需要对频域误差信号E(k)进行限制,可以采用上述实施例提供的划分子带并分别处理得到限制误差,用限制误差来更新自适应滤波器权值的方法。
频域误差控制模块208首先对数据补零FFT模块207得到的频域误差信号E(k)按照上述实施例提供的方法划分子带,然后分别处理多个子带,得到多个子带的限制误差E′(k)。
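Frequency-domain weight update module 209 follows the principle of the normalized least-mean-square (NLMS) algorithm and uses the limited error $E'(k)$; the weight update rule of the adaptive filter is:
$$W_{n+1}(k) = W_n(k) + \mu\,X^*(k)\,E'(k)$$
where $X^*(k)$ is the complex conjugate of the frequency-domain far-end signal X(k). The updated adaptive filter weights $W_{n+1}(k)$ are then sent by frequency-domain weight update module 209 to frequency-domain filtering module 204 for adaptive filtering.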
基于上述实施例公开的内容,可选的,本实施例提供一种自适应滤波方法的权值更新方法,该方法可以应用于上述任一实施例提供的回声消除中的自适应滤波方法,也可以应用于上述任一实施例的自适应滤波模块的频域权值更新模块中,能够进一步地减小滤波器稳态误差,避免滤波器发散。权值更新方法的流程图参照图4,该方法包括以下步骤:
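S301, frequency-domain gradient calculation: from the conjugate $X^*(k)$ of the frequency-domain far-end signal and the limited error $E'(k)$, compute the frequency-domain gradient $\Delta(k) = X^*(k)\,E'(k)$, where:
$X^*(k) = [X^*(1), X^*(2), \ldots, X^*(N)]^T$
$E'(k) = [E'(1), E'(2), \ldots, E'(N)]^T$
S302, IFFT: apply an IFFT to the frequency-domain gradient $\Delta(k)$ to obtain the time-domain gradient $\Delta(n)$:
$\Delta(n) = [\Delta(1), \Delta(2), \ldots, \Delta(N)]^T$
S303, time-domain gradient update: set the last M samples of the time-domain gradient $\Delta(n)$ to zero to obtain the updated time-domain gradient $\Delta'(n)$;
S304, FFT: apply an FFT to the updated time-domain gradient $\Delta'(n)$ to obtain the updated frequency-domain gradient $\Delta'(k)$, where:
$\Delta'(k) = [\Delta'(1), \Delta'(2), \ldots, \Delta'(N)]^T$
S305, update the adaptive filter weights according to the iteration step size: using the updated frequency-domain gradient $\Delta'(k)$, choose an iteration step size μ with μ ∈ [0, 1] and update the adaptive filter weights by:
$W_{n+1}(k) = W_n(k) + \mu\,\Delta'(k)$.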
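Steps S301 to S305 can be traced in a small sketch. To stay dependency-free it uses a naive O(N²) DFT where a real implementation would use an FFT. The helper names are hypothetical, and zeroing the last M time-domain samples follows the wording of S303.

```python
import cmath


def dft(x, inverse=False):
    """Naive DFT / inverse DFT (stand-in for FFT / IFFT in this sketch)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def update_weights(w, x_bins, e_limited, mu, m):
    grad = [x.conjugate() * e for x, e in zip(x_bins, e_limited)]  # S301
    grad_t = dft(grad, inverse=True)                               # S302
    n = len(grad_t)
    grad_t = grad_t[:n - m] + [0j] * m                             # S303: zero last M
    grad_c = dft(grad_t)                                           # S304
    return [wk + mu * g for wk, g in zip(w, grad_c)]               # S305


x_bins = [complex(k + 1, -k) for k in range(8)]
e_lim = [complex(1, k) for k in range(8)]
w_new = update_weights([0j] * 8, x_bins, e_lim, mu=0.6, m=4)
print(len(w_new))
```

The round trip through the time domain is what distinguishes this constrained update from simply adding μ·X*(k)·E′(k) per bin: it keeps the filter's impulse response confined to the samples that are not zeroed.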
基于上述实施例公开的内容,可选的,本实施例提供一种回声消除中的自适应滤波方法的应用场景,该应用场景可以采用上述任一实 施例提供的回声消除中的自适应滤波方法,能够进一步地减小滤波器稳态误差,避免滤波器发散,即使是在噪声干扰和近端语音存在的情况下,也能达到较好的回声消除效果。请参考图5,图5是本申请实施例的一种回声消除中的自适应滤波方法的应用场景的系统结构示意图,本实施例中的该系统100可以包括扬声器101、麦克风102、语音活动性检测(Voice Activity Detection,VAD)模块103、双向通话检测(double talk detector,DTD)模块104、自适应滤波模块105和非线性处理(non-linear processing,NLP)模块106。其中,自适应滤波模块105可以采用上述任一实施例提供的回声消除中的自适应滤波方法来执行。
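In FIG. 5, the time-domain far-end signal x(n) is played by loudspeaker 101 and propagates through the room, producing the echo signal d(n), which is picked up by microphone 102. The generation of d(n) can be modeled as x(n) passing through the response function h(n) of an echo path, that is, d(n) = h(n) * x(n), where * denotes convolution. The signal captured by microphone 102 also includes the near-end signal ν(n) and the noise signal n(n), so the microphone signal is y(n) = ν(n) + n(n) + d(n). VAD module 103 performs voice activity detection on the time-domain far-end signal x(n) and the microphone signal y(n) to decide whether speech is present. Voice activity detection can use energy, time-domain information, frequency-domain information, entropy and similar features, or features chosen to suit the application. DTD module 104 decides whether double talk is occurring from the difference in energy, or the correlation, between x(n) and y(n). Adaptive filtering module 105 exploits the correlation between x(n) and the echo signal d(n) and updates the adaptive filter weights to approximate the true echo path, yielding the estimated echo signal $\hat{d}(n)$. Subtracting the time-domain estimated echo signal $\hat{d}(n)$ from the microphone signal y(n) completes linear echo cancellation. Adaptive filtering module 105 can be implemented in the time domain or in the frequency domain. Frequency-domain adaptive filtering includes frequency-domain block adaptive filtering and subband adaptive filtering; this embodiment uses frequency-domain block adaptive filtering as the example. NLP module 106 removes the nonlinear residual echo in echo cancellation system 100 to improve the cancellation result. NLP module 106 estimates the amount of nonlinear residual echo from the correlation among x(n), y(n) and the error signal e(n), and suppresses it using the outputs of VAD module 103 and DTD module 104, where $e(n) = y(n) - \hat{d}(n)$.
The results of VAD module 103 and DTD module 104 can control the processing in adaptive filtering module 105 and NLP module 106; for example, they can control the iteration step size μ used by adaptive filtering module 105 and the degree of nonlinear suppression applied by NLP module 106.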
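The signal model can be written out directly. This sketch builds the microphone signal from a far-end signal, an echo path, near-end speech and noise, all of which are made-up toy values; `convolve` and `microphone_signal` are hypothetical helper names.

```python
def convolve(h, x):
    """Causal convolution d(n) = sum_i h(i) * x(n - i), truncated to len(x)."""
    out = [0.0] * len(x)
    for n in range(len(x)):
        for i, hi in enumerate(h):
            if n - i >= 0:
                out[n] += hi * x[n - i]
    return out


def microphone_signal(x, h, v, noise):
    """y(n) = v(n) + n(n) + d(n), with d(n) = h(n) * x(n)."""
    d = convolve(h, x)
    return [vi + ni + di for vi, ni, di in zip(v, noise, d)]


x = [1.0, 0.0, 0.0, 2.0]            # toy far-end signal
h = [0.5, 0.25]                     # toy echo path
y = microphone_signal(x, h, [0.0] * 4, [0.0] * 4)
e = [yi - di for yi, di in zip(y, convolve(h, x))]   # perfect estimate of h
print(y, e)
```

With no near-end speech or noise and a perfect path estimate the error is zero; any ν(n) or n(n) ends up in e(n), which is why the description limits the error before using it to adapt.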
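VAD module 103 performs voice activity detection on the energy of the time-domain far-end signal x(n) and the microphone signal y(n), once per frame. The smoothed power spectrum of x(n) is $P(k) = [P(1), P(2), \ldots, P(N)]^T$, N = 128. The current power of the time-domain far-end signal x(n) is:
$$P_{cur} = \sum_{k=n}^{m} P(k)$$
where n is the start bin and m is the end bin. Speech is concentrated mainly in the 300 Hz to 3400 Hz range, so n and m are chosen such that the frequency range they give at the frequency resolution falls within 300 Hz to 3400 Hz. With the data of this embodiment, this gives n = 6 and m = 55, where bin 1 of the far-end signal corresponds to 0 Hz. VAD module 103 can set a threshold TH1: if $P_{cur} > TH1$, speech is present; otherwise it is not.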
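A minimal sketch of this energy-based decision, assuming that P_cur is the sum of P(k) over bins n through m (an average would only rescale TH1). The function name and the toy spectra are hypothetical.

```python
def far_end_active(p, th1, n=6, m=55):
    """Energy VAD: compare the in-band power of the far-end signal with TH1.

    p holds the smoothed power spectrum, with p[k - 1] being 1-based bin k.
    """
    p_cur = sum(p[k - 1] for k in range(n, m + 1))   # assumed: plain sum
    return p_cur > th1


speech_like = [0.0] * 128
speech_like[9] = 2.0          # energy at bin 10, inside 300-3400 Hz
hum_like = [0.0] * 128
hum_like[1] = 5.0             # energy at bin 2 (62.5 Hz), outside the band
print(far_end_active(speech_like, 1.0), far_end_active(hum_like, 1.0))
```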
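DTD module 104 performs double-talk detection in a broad sense. Its main purpose is to detect the case in which only the far end has speech, because the weight update of the adaptive filter is valid only in that case. A larger iteration step size μ is then used, for example μ = 0.6, so that the filter converges quickly. In the other cases the weight update is inaccurate, so a smaller iteration step size can be used, for example μ = 0.01. The result of DTD module 104 can also control the degree of nonlinear suppression in NLP module 106: when DTD module 104 detects single talk, the suppression is stronger; when it detects double talk, the suppression is weaker, to avoid damaging speech.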
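The step-size control described here reduces to a small decision function. The two boolean inputs stand for the VAD and DTD outcomes; how those flags are derived is outside this sketch, and the function name is hypothetical. The values 0.6 and 0.01 are the examples from the text.

```python
def select_step_size(far_end_speech, near_end_speech, mu_fast=0.6, mu_slow=0.01):
    """Pick the iteration step size from the detector outcomes."""
    if far_end_speech and not near_end_speech:
        return mu_fast     # far-end only: the error is mostly echo, adapt quickly
    return mu_slow         # double talk, near-end only, or silence: adapt cautiously


print(select_step_size(True, False), select_step_size(True, True))
```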
自适应滤波模块105采用频域分块自适应滤波,可以参照图3所示的自适应滤波模块的结构示意图。具体流程为:
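Data buffer module 201 buffers the incoming time-domain far-end signal x(n) and splits it into blocks. Each block has length 128 and the overlap-save method with 50% overlap is used, so the first 64 samples of a block are data already processed in the previous pass and the last 64 samples are the data actually being processed.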
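The blocking with 50% overlap can be sketched as a generator. The description does not say what the first block overlaps with, so starting from a zero-filled history is an assumption, and the function name is hypothetical.

```python
def overlapped_blocks(samples, block_len=128, overlap=64):
    """Yield blocks of block_len samples; the first `overlap` samples repeat old data."""
    hop = block_len - overlap
    prev = [0.0] * overlap                      # assumed: zero history before the start
    for start in range(0, len(samples) - hop + 1, hop):
        new = list(samples[start:start + hop])  # the hop samples actually processed
        block = prev + new
        yield block
        prev = block[len(block) - overlap:]     # carry the tail into the next block


blocks = list(overlapped_blocks(list(range(8)), block_len=4, overlap=2))
print(blocks)
```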
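FFT module 202 transforms the 128 samples to the frequency domain by FFT, that is, $x(n) = [x(1), x(2), \ldots, x(128)]^T$ is transformed into $X(k) = [X(1), X(2), \ldots, X(128)]^T$, where X(k) is the frequency-domain far-end signal.
Power calculation module 203 computes the power of the frequency-domain far-end signal X(k). The method is not limited; optionally, this embodiment uses smoothing to obtain the smoothed power spectrum P(k): $P(k) = \alpha\,P(k) + (1-\alpha)\,|X(k)|^2$, k = 1, 2, …, 128, with forgetting factor α = 0.8.
Frequency-domain filtering module 204 filters X(k) in the frequency domain; from its adaptive filter weights $W_n(k)$ it obtains the frequency-domain estimated echo signal
$$\hat{D}(k) = X(k)\,W_n(k)$$
IFFT module 205 transforms the frequency-domain estimated echo signal $\hat{D}(k)$ into the time-domain estimated echo signal $\hat{d}(n)$ by IFFT, where $\hat{D}(k) = [\hat{D}(1), \hat{D}(2), \ldots, \hat{D}(128)]^T$ and $\hat{d}(n) = [\hat{d}(1), \hat{d}(2), \ldots, \hat{d}(128)]^T$. By the overlap-save principle, only the last 64 samples are valid.
Data analysis module 206 applies the overlap-save algorithm: it takes the 64 samples of the microphone signal y(n) at the current time, $[y(65), y(66), \ldots, y(128)]^T$, and subtracts from them the last 64 samples of the time-domain estimated echo signal $\hat{d}(n)$, giving the last 64 samples of the time-domain error signal e(n), $[e(65), e(66), \ldots, e(128)]^T$.
Zero-padding FFT module 207 prepends 64 zeros to these last 64 samples $[e(65), e(66), \ldots, e(128)]^T$ of the time-domain error signal, and then applies an FFT to the zero-padded e(n) to obtain the frequency-domain error signal E(k), $E(k) = [E(1), E(2), \ldots, E(128)]^T$.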
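Frequency-domain error control module 208 first computes the normalized error from the frequency-domain error signal E(k) produced by zero-padding FFT module 207:
$$E_{norm}(k) = \frac{E(k)}{\|X(k)\| + \delta}$$
where δ prevents the denominator from being zero; optionally, ‖X(k)‖ in this formula can be replaced by the smoothed power spectrum P(k). Taking a sampling rate of 8000 Hz and 128 sample points as the example, the frequency resolution is 62.5 Hz. Speech is concentrated mainly in the 300 Hz to 3400 Hz range, which at this resolution corresponds to bins 6 through 55. As an optional embodiment, the frequency-domain error signal at bins 6 to 55 can be assigned to the first subband and the frequency-domain error signal at bins 1 to 5 and bins 56 to 65 to the second subband; the embodiments of this application do not limit the number of subbands.
In addition, the frequency-domain error signal at bins adjacent to the speech frequency range can also be assigned to the first subband, which prevents high-frequency leakage. For example, bins 56 to 59 can be added to the first subband, so that the first subband covers bins 6 to 59. The first subband can also be divided further according to the loudspeaker frequency response curve in FIG. 2. As the figure shows, between 200 Hz and 1500 Hz the signal is strongly attenuated and the echo energy is weak, so optionally the frequency-domain error signal E(k) of the first subband can be split into two subbands. Taking a first preset value of 5.0 dB as the example and referring to the curve in FIG. 2, bin 25 (which corresponds to 1500 Hz) can serve as the boundary: the frequency-domain error signal at bins 6 to 25 forms the third subband, and that at bins 26 to 59 forms the fourth subband. The loudspeaker frequency response in FIG. 2 shows that the third subband carries little energy, so the third threshold can be set to TH3 = 0.0000001. The fourth threshold TH4 can be set somewhat larger than the third; specifically, TH4 = 0.00005. The modulus $|E_{norm}(k)|$ of the third and fourth subbands is then compared with the corresponding threshold TH. If the modulus is less than the subband's threshold, the limited error is set to $E'(k) = E_{norm}(k)$; otherwise the limited error is set to:
$$E'(k) = \frac{TH \cdot E_{norm}(k)}{|E_{norm}(k)|}$$
Following the method given in the above embodiments for obtaining the limited error of the second subband, the limited error $E'(k)$ of the second subband can be set to the second fixed value, for example $E'(k) = 0$. The second fixed value may also be some other number; this embodiment does not restrict its magnitude. Following the alternative method given in the above embodiments, a second threshold can be set instead; optionally, the second threshold is smaller than the third threshold, for example TH2 = 0.00000001. The modulus $|E_{norm}(k)|$ of the second subband is then compared with TH2. If the modulus is less than TH2, the limited error of the second subband is set to $E'(k) = E_{norm}(k)$; otherwise it is set to:
$$E'(k) = \frac{TH2 \cdot E_{norm}(k)}{|E_{norm}(k)|}$$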
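Frequency-domain weight update module 209 follows the principle of the NLMS algorithm and uses the limited error $E'(k)$; the weight update rule of the adaptive filter is:
$$W_{n+1}(k) = W_n(k) + \mu\,X^*(k)\,E'(k) = W_n(k) + \mu\,\Delta'(k)$$
Here $X^*(k)$ is the conjugate of the frequency-domain far-end signal X(k), and $\Delta'(k)$ is the frequency-domain gradient. μ is the iteration step size, with μ ∈ [0, 1]; this embodiment uses μ = 0.6 but does not restrict its value. For the weight update steps of the adaptive filter, refer to the flowchart of the weight update method in the above embodiments.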
本申请实施例还可提供一种回声消除中的自适应滤波装置,用于执行前述实施例中提出的一种回声消除中的自适应滤波方法,图6为本实施例提供的回声消除中的自适应滤波装置的结构示意图,该装置可以执行上述图1至图5所示的方法,如图6所示,该回声消除中的自适应滤波装置60包括:
划分子带模块61,用于将频域误差信号划分成多个子带,频域误差信号由时域误差信号变换得到,时域误差信号为麦克风的采集信号与时域估计回声信号相减得到;
子带处理模块62,用于分别处理多个子带,得到多个子带的限制误差;以及
权值更新模块63,用于根据多个子带的限制误差更新自适应滤波器的权值。
可选的,划分子带模块包括:
第一划分模块,用于将语音信号频率范围内的频域误差信号划分为第一子带;以及
第二划分模块,用于将非语音信号频率范围内的频域误差信号划分为第二子带。
可选的,划分子带模块还包括:
比较模块,用于比较第一子带的频点对应的扬声器频率响应值与第一预设值;
第三划分模块,用于将扬声器频率响应值小于第一预设值的频点对应的第一子带划分为第三子带;以及
第四划分模块,用于将扬声器频率响应值大于第一预设值的频点对应的第一子带划分为第四子带。
可选的,子带处理模块包括:
归一化模块:用于对第三子带和第四子带分别进行归一化得到第三子带的归一化误差和第四子带的归一化误差;
求模模块,用于对第三子带的归一化误差和第四子带的归一化误差分别求模得到第三子带的模值和第四子带的模值;以及
限制误差生成模块,用于分别处理第三子带的模值和第四子带的模值得到第三子带的限制误差和第四子带的限制误差。
可选的,限制误差生成模块包括:
门限值设置模块,用于对第三子带和第四子带分别设置第三门限值和第四门限值;以及
限制误差生成子模块,用于根据第三子带的模值与第三门限值确定第三子带的限制误差,还用于根据第四子带的模值与第四门限值确定第四子带的限制误差。
可选的,限制误差生成子模块包括:
第三设置模块,用于当第三子带的模值小于第三门限值时,设置第三子带的限制误差为第三子带的归一化误差;或者
还用于当第三子带的模值大于第三门限值时,设置第三子带的限 制误差为第三门限值乘以第三子带的归一化误差得到的结果再除以第三子带的模值;以及
第四设置模块,用于当第四子带的模值小于第四门限值时,设置第四子带的限制误差为第四子带的归一化误差;或者
还用于当第四子带的模值大于第四门限值时,设置第四子带的限制误差为第四门限值乘以第四子带的归一化误差得到的结果再除以第四子带的模值。
可选的,门限值设置模块还用于设置第三门限值小于第四门限值。
可选的,子带处理模块还包括:
定值设置模块,用于将第二子带的限制误差设置为第二定值。
可选的,定值设置模块还用于设置第二定值为零。
可选的,归一化模块还用于对第二子带进行归一化得到第二子带的归一化误差;
求模模块还用于对第二子带的归一化误差求模得到第二子带的模值;
限制误差生成模块还用于处理第二子带的模值得到第二子带的限制误差。
可选的,门限值设置模块还用于对第二子带设置第二门限值;
限制误差生成子模块还用于根据第二子带的模值与第二门限值确定第二子带的限制误差。
可选的,限制误差生成子模块还包括:
第二设置模块,用于当第二子带的模值小于第二门限值时,设置第二子带的限制误差为第二子带的归一化误差;或者
还用于当第二子带的模值大于第二门限值时,设置第二子带的限制误差为第二门限值乘以第二子带的归一化误差得到的结果再除以第二子带的模值。
可选的,门限值设置模块还用于设置第二门限值小于第三门限值。
本申请实施例提供了一种回声消除中的自适应滤波装置,采用对频域误差信号划分子带并分别处理得到限制误差,用限制误差来更新 自适应滤波器权值的方法进行自适应滤波来实现回声消除,能有效地减小计算量和滤波器稳态误差,并且避免滤波器发散,即使是在噪声干扰和近端语音存在的情况下,也能达到较好的回声消除效果。
本申请实施例还可提供一种设备,用于执行上述实施例提出的一种回声消除中的自适应滤波方法,如图7所示,该设备70包括:存储器71和处理器72;
存储器71与处理器72耦合;
存储器71,用于存储程序指令;
处理器72,用于调用存储器存储的程序指令,使得该设备执行任一回声消除中的自适应滤波方法。
本申请实施例提供的一种设备,可执行上述任一所述实施例提供的回声消除中的自适应滤波方法,其具体的实现过程及有益效果参见上述,在此不再赘述。
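An embodiment of this application may further provide a computer-readable storage medium storing a computer program that, when executed by processor 72, implements any of the adaptive filtering methods for echo cancellation performed by the device.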
本申请实施例提供的计算机可读存储介质,可执行上述任一所述实施例提供的回声消除中的自适应滤波方法,其具体的实现过程及有益效果参见上述,在此不再赘述。
虽然本专利文件包含许多细节,但是这些不应被解释为对任何发明或要求保护的范围的限制,而是被解释为可以是对特定发明的特定实施例所特有的特征的描述。本专利文件中描述的某些特征在单独实施例的上下文中还可以在单个实施例中组合实现。相反,在单个实施例的上下文中描述的各种特征还可以在多个实施例中单独实现或以任何合适的子组合形式实现。而且,虽然特征可以在上面描述为在某些组合中起作用,并且甚至最初如此要求保护,但是来自要求保护的组合的一个或多个特征在一些情况下可以从组合中删除,并且要求保护的组合可以涉及子组合或子组合的变形。
类似地,虽然在附图中以特定顺序描述了操作,但是这不应理解 为要求这些操作以所示的特定顺序或按照顺序依次执行,或者要求执行所有所示的操作,以实现期望的结果。而且,在本专利文件中描述的实施例中的各种单独的系统部件不应理解为在所有实施例中需要这种分离。
本专利文件仅描述了几个实现和示例,并且可以基于本专利文件中描述和示出的内容来做出其他实现、增强和变化。
应注意,本申请上述方法实施例可以应用于处理器中,或者由处理器实现。处理器可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器可以是通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
可以理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable rom,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例 性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
应理解,在本申请实施例中,“与A相应的B”表示B与A相关联,根据A可以确定B。但还应理解,根据A确定B并不意味着仅仅根据A确定B,还可以根据A和/或其它信息确定B。
另外,本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以 结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (28)

  1. 一种回声消除中的自适应滤波方法,其特征在于,包括:
    将频域误差信号划分成多个子带,所述频域误差信号由时域误差信号变换得到,所述时域误差信号为麦克风的采集信号与时域估计回声信号相减得到;
    分别处理所述多个子带,得到所述多个子带的限制误差;
    根据所述多个子带的限制误差更新自适应滤波器的权值。
  2. 根据权利要求1所述的回声消除中的自适应滤波方法,其特征在于,所述将频域误差信号划分成多个子带包括:
    将语音信号频率范围内的所述频域误差信号划分为第一子带;
    将非所述语音信号频率范围内的所述频域误差信号划分为第二子带。
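  3. The adaptive filtering method for echo cancellation according to claim 2, wherein dividing the frequency-domain error signal into a plurality of subbands further comprises:
    comparing the loudspeaker frequency response value at each frequency bin of the first subband with a first preset value;
    assigning the part of the first subband at frequency bins whose loudspeaker frequency response value is less than the first preset value to a third subband; and
    assigning the part of the first subband at frequency bins whose loudspeaker frequency response value is greater than the first preset value to a fourth subband.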
  4. 根据权利要求3所述的回声消除中的自适应滤波方法,其特征在于,所述分别处理所述多个子带,得到所述多个子带的限制误差包括:
    对所述第三子带和所述第四子带分别进行归一化得到所述第三子带的归一化误差和所述第四子带的归一化误差;
    对所述第三子带的归一化误差和所述第四子带的归一化误差分别求模得到所述第三子带的模值和所述第四子带的模值;
    分别处理所述第三子带的模值和所述第四子带的模值得到所述 第三子带的所述限制误差和所述第四子带的所述限制误差。
  5. 根据权利要求4所述的回声消除中的自适应滤波方法,其特征在于,所述分别处理所述第三子带的模值和所述第四子带的模值得到所述第三子带的所述限制误差和所述第四子带的所述限制误差包括:
    对所述第三子带和所述第四子带分别设置第三门限值和第四门限值;
    根据所述第三子带的模值与所述第三门限值确定所述第三子带的所述限制误差以及根据所述第四子带的模值与所述第四门限值确定所述第四子带的所述限制误差。
  6. 根据权利要求5所述的回声消除中的自适应滤波方法,其特征在于,所述根据所述第三子带的模值与所述第三门限值确定所述第三子带的所述限制误差以及根据所述第四子带的模值与所述第四门限值确定所述第四子带的所述限制误差包括:
    若所述第三子带的模值小于所述第三门限值,则设置所述第三子带的所述限制误差为所述第三子带的归一化误差;
    若所述第三子带的模值大于所述第三门限值,则设置所述第三子带的所述限制误差为所述第三门限值乘以所述第三子带的归一化误差得到的结果再除以所述第三子带的模值;
    若所述第四子带的模值小于所述第四门限值,则设置所述第四子带的所述限制误差为所述第四子带的归一化误差;
    若所述第四子带的模值大于所述第四门限值,则设置所述第四子带的所述限制误差为所述第四门限值乘以所述第四子带的归一化误差得到的结果再除以所述第四子带的模值。
  7. 根据权利要求5或6所述的回声消除中的自适应滤波方法,其特征在于,设置所述第三门限值小于所述第四门限值。
  8. 根据权利要求4至7中任一项所述的回声消除中的自适应滤波方法,其特征在于,所述分别处理所述多个子带,得到所述多个子带的限制误差还包括:将所述第二子带的所述限制误差设置为第二定 值。
  9. 根据权利要求8所述的回声消除中的自适应滤波方法,其特征在于,设置所述第二定值为零。
  10. 根据权利要求5至7中任一项所述的回声消除中的自适应滤波方法,其特征在于,所述分别处理所述多个子带,得到所述多个子带的限制误差还包括:
    对所述第二子带进行归一化得到所述第二子带的归一化误差;
    对所述第二子带的归一化误差求模得到所述第二子带的模值;
    处理所述第二子带的模值得到所述第二子带的所述限制误差。
  11. 根据权利要求10所述的回声消除中的自适应滤波方法,其特征在于,所述处理所述第二子带的模值得到所述第二子带的所述限制误差包括:
    对所述第二子带设置第二门限值;
    根据所述第二子带的模值与所述第二门限值确定所述第二子带的所述限制误差。
  12. 根据权利要求11所述的回声消除中的自适应滤波方法,其特征在于,所述根据所述第二子带的模值与所述第二门限值确定所述第二子带的所述限制误差包括:
    若所述第二子带的模值小于所述第二门限值,则设置所述第二子带的所述限制误差为所述第二子带的归一化误差;
    若所述第二子带的模值大于所述第二门限值,则设置所述第二子带的所述限制误差为所述第二门限值乘以所述第二子带的归一化误差得到的结果再除以所述第二子带的模值。
  13. 根据权利要求11或12所述的回声消除中的自适应滤波方法,其特征在于,设置所述第二门限值小于所述第三门限值。
  14. 一种回声消除中的自适应滤波装置,其特征在于,包括:
    划分子带模块,用于将频域误差信号划分成多个子带,所述频域误差信号由时域误差信号变换得到,所述时域误差信号为麦克风的采集信号与时域估计回声信号相减得到;
    子带处理模块,用于分别处理所述多个子带,得到所述多个子带的限制误差;以及
    权值更新模块,用于根据所述多个子带的限制误差更新自适应滤波器的权值。
  15. 根据权利要求14所述的回声消除中的自适应滤波装置,其特征在于,所述划分子带模块包括:
    第一划分模块,用于将语音信号频率范围内的所述频域误差信号划分为第一子带;以及
    第二划分模块,用于将非所述语音信号频率范围内的所述频域误差信号划分为第二子带。
  16. 根据权利要求15所述的回声消除中的自适应滤波装置,其特征在于,所述划分子带模块还包括:
    比较模块,用于比较所述第一子带的频点对应的扬声器频率响应值与第一预设值;
    第三划分模块,用于将所述扬声器频率响应值小于所述第一预设值的所述频点对应的所述第一子带划分为第三子带;以及
    第四划分模块,用于将所述扬声器频率响应值大于所述第一预设值的所述频点对应的所述第一子带划分为第四子带。
  17. 根据权利要求16所述的回声消除中的自适应滤波装置,其特征在于,所述子带处理模块包括:
    归一化模块:用于对所述第三子带和所述第四子带分别进行归一化得到所述第三子带的归一化误差和所述第四子带的归一化误差;
    求模模块,用于对所述第三子带的归一化误差和所述第四子带的归一化误差分别求模得到所述第三子带的模值和所述第四子带的模值;以及
    限制误差生成模块,用于分别处理所述第三子带的模值和所述第四子带的模值得到所述第三子带的所述限制误差和所述第四子带的所述限制误差。
  18. 根据权利要求17所述的回声消除中的自适应滤波装置,其 特征在于,所述限制误差生成模块包括:
    门限值设置模块,用于对所述第三子带和所述第四子带分别设置第三门限值和第四门限值;以及
    限制误差生成子模块,用于根据所述第三子带的模值与所述第三门限值确定所述第三子带的所述限制误差,还用于根据所述第四子带的模值与所述第四门限值确定所述第四子带的所述限制误差。
  19. 根据权利要求18所述的回声消除中的自适应滤波装置,其特征在于,所述限制误差生成子模块包括:
    第三设置模块,用于当所述第三子带的模值小于所述第三门限值时,设置所述第三子带的所述限制误差为所述第三子带的归一化误差;或者
    还用于当所述第三子带的模值大于所述第三门限值时,设置所述第三子带的所述限制误差为所述第三门限值乘以所述第三子带的归一化误差得到的结果再除以所述第三子带的模值;以及
    第四设置模块,用于当所述第四子带的模值小于所述第四门限值时,设置所述第四子带的所述限制误差为所述第四子带的归一化误差;或者
    还用于当所述第四子带的模值大于所述第四门限值时,设置所述第四子带的所述限制误差为所述第四门限值乘以所述第四子带的归一化误差得到的结果再除以所述第四子带的模值。
  20. 根据权利要求18或19所述的回声消除中的自适应滤波装置,其特征在于,所述门限值设置模块还用于设置所述第三门限值小于所述第四门限值。
  21. 根据权利要求17至20中任一项所述的回声消除中的自适应滤波装置,其特征在于,所述子带处理模块还包括:
    定值设置模块,用于将所述第二子带的所述限制误差设置为第二定值。
  22. 根据权利要求21所述的回声消除中的自适应滤波装置,其特征在于,所述定值设置模块还用于设置所述第二定值为零。
  23. 根据权利要求18至20中任一项所述的回声消除中的自适应滤波装置,其特征在于,所述归一化模块还用于对所述第二子带进行归一化得到所述第二子带的归一化误差;
    所述求模模块还用于对所述第二子带的归一化误差求模得到所述第二子带的模值;
    所述限制误差生成模块还用于处理所述第二子带的模值得到所述第二子带的所述限制误差。
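  24. The adaptive filtering apparatus for echo cancellation according to claim 23, wherein the threshold setting module is further configured to set a second threshold for the second subband; and
    the limited-error generation submodule is further configured to determine the limited error of the second subband from the modulus of the second subband and the second threshold.
  25. The adaptive filtering apparatus for echo cancellation according to claim 24, wherein the limited-error generation submodule further comprises:
    a second setting module, configured to set the limited error of the second subband to the normalized error of the second subband when the modulus of the second subband is less than the second threshold; or
    further configured to set the limited error of the second subband to the second threshold multiplied by the normalized error of the second subband and then divided by the modulus of the second subband when the modulus of the second subband is greater than the second threshold.
  26. The adaptive filtering apparatus for echo cancellation according to claim 24 or 25, wherein the threshold setting module is further configured to set the second threshold smaller than the third threshold.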
  27. 一种设备,用于回声消除中的自适应滤波,其特征在于,包括:存储器和处理器;
    所述存储器与所述处理器耦合;
    所述存储器,用于存储程序指令;
    所述处理器,用于调用所述存储器存储的程序指令,使得所述设备执行上述权利要求1至13中任一项所述的回声消除中的自适应滤 波方法。
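  28. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the adaptive filtering method for echo cancellation according to any one of claims 1 to 13.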
PCT/CN2018/121560 2018-12-17 2018-12-17 一种回声消除中的自适应滤波方法、装置、设备及存储介质 WO2020124325A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880003450.2A CN111868826B (zh) 2018-12-17 2018-12-17 一种回声消除中的自适应滤波方法、装置、设备及存储介质
PCT/CN2018/121560 WO2020124325A1 (zh) 2018-12-17 2018-12-17 一种回声消除中的自适应滤波方法、装置、设备及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/121560 WO2020124325A1 (zh) 2018-12-17 2018-12-17 一种回声消除中的自适应滤波方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2020124325A1 true WO2020124325A1 (zh) 2020-06-25

Family

ID=71100168

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/121560 WO2020124325A1 (zh) 2018-12-17 2018-12-17 一种回声消除中的自适应滤波方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN111868826B (zh)
WO (1) WO2020124325A1 (zh)

Also Published As

Publication number Publication date
CN111868826A (zh) 2020-10-30
CN111868826B (zh) 2022-09-13
