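WO2019080552A1 - Echo cancellation method and apparatus based on time delay estimation - Google Patents

Echo cancellation method and apparatus based on time delay estimation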

Info

Publication number
WO2019080552A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
delay
reference signal
correlation
current
Prior art date
Application number
PCT/CN2018/095759
Other languages
English (en)
French (fr)
Inventor
李明子
马峰
王海坤
王智国
胡国平
Original Assignee
科大讯飞股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 科大讯飞股份有限公司
Priority to ES18869573T (patent ES2965954T3, es)
Priority to KR1020207014264A (patent KR102340999B1, ko)
Priority to EP18869573.8A (patent EP3703052B1, en)
Priority to US16/756,967 (patent US11323807B2, en)
Priority to JP2020517351A (patent JP7018130B2, ja)
Publication of WO2019080552A1 (WO2019080552A1, zh)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/04 Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M9/00 Arrangements for interconnection not involving centralised switching
    • H04M9/08 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00 Monitoring arrangements; Testing arrangements
    • H04R29/004 Monitoring arrangements; Testing arrangements for microphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02163 Only one microphone

Definitions

  • The present application relates to the field of signal processing, and in particular to an echo cancellation method and apparatus based on time delay estimation.
  • Echo cancellation is an indispensable part of smart-device interaction and has long been an active research topic in related fields.
  • Echo cancellation prevents the far-end sound from being returned by removing the far-end audio signal picked up by the local microphone.
  • A typical echo cancellation scheme is based on delay estimation: the correlation between the reference signal and the microphone signal is calculated, the delay corresponding to the maximum cross-correlation is selected as the device delay, the reference signal is moved by that delay, and the adaptive filter is updated according to the moved reference signal and the microphone signal. The filter generates a signal close to the true echo, which is subtracted from the microphone signal to achieve echo cancellation.
  • However, a distributed intelligent hardware device with operation authority over only one end cannot synchronously resample the reference signal and the microphone signal.
  • For example, a television box is used to control a television, and the television box and the television are mostly provided by different manufacturers. A television box manufacturer that supports voice control must cancel the echo of the sound played by the television, yet it has authority over the television box only; that is, only the source signal of the television box and the signal collected by the television box microphone are available.
  • The source signal transmitted from the television box to the television is used as the reference signal, and the signal collected by the television box microphone is used as the microphone signal; the loudspeaker signal and the microphone signal cannot truly be resampled synchronously. In this case, the delay between the reference signal and the microphone signal must be estimated, and echo cancellation is then performed based on that delay.
  • The accuracy of the delay estimate directly affects the echo cancellation result. Because real application environments are complex and varied, existing echo cancellation techniques based on delay estimation suffer from large delay estimation errors, and their echo cancellation performance still needs to be improved.
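  • The embodiments of the present application provide an echo cancellation method and apparatus based on time delay estimation, so as to reduce the delay estimation error and improve the echo cancellation effect.
  • An echo cancellation method based on time delay estimation, comprising:
  • receiving a microphone signal and a reference signal and preprocessing each of them;
  • determining the frequency point signals in which a nonlinear condition exists in the preprocessed microphone signal and reference signal in the current echo cancellation scenario;
  • calculating the current delay estimate according to the frequency point signals of the microphone signal and the reference signal in which no nonlinear condition exists;
  • moving the reference signal based on the current delay estimate;
  • updating the adaptive filter according to the preprocessed microphone signal and the moved reference signal to implement echo cancellation.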
  • Determining the frequency point signals in which a nonlinear condition exists in the preprocessed microphone signal and reference signal in the current echo cancellation scenario comprises any one or more of the following detections:
  • Far-end (remote) signal detection: determining a frequency point signal in which a nonlinear condition exists according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal;
  • Double-talk (double-ended) signal detection: determining a frequency point signal in which a nonlinear condition exists according to the energy ratio of the preprocessed microphone signal to the reference signal;
  • Detection of nonlinearity caused by device hardware: first calculating the correlation mean of the reference signal and the microphone signal in a lower frequency range; then calculating, at a certain frequency interval, the correlation mean of the reference signal and the microphone signal in other frequency ranges; and comparing the correlation mean in the other frequency ranges with the correlation mean in the low frequency range to determine the frequency point signals in which a nonlinear condition exists.
  • Calculating the current delay estimate according to the frequency point signals of the microphone signal and the reference signal in which no nonlinear condition exists comprises:
  • for each frame of the reference signal and each frame of the microphone signal within the delay analysis range, selecting the frequency point signals in which no nonlinear condition exists, and calculating the cross-correlation between the reference signal and each frame of the microphone signal;
  • determining the delay estimate according to the calculated cross-correlation between the reference signal and each frame of the microphone signal.
  • Determining the delay estimate according to the calculated cross-correlation between the reference signal and each frame of the microphone signal comprises:
  • selecting, as the current delay position, the position of the frame with the largest cross-correlation among the calculated cross-correlations, and determining the current delay estimate according to the current delay position and the position of the reference signal.
  • Alternatively, determining the current delay estimate according to the calculated cross-correlation between the reference signal and each frame of the microphone signal comprises:
  • if the candidate delay position changes, increasing the statistic of the current candidate delay position by a first set value t1, decreasing that of the previous candidate delay position by a second set value t2, and decreasing those of the remaining positions by a third set value t3;
  • if the candidate delay position does not change, increasing the statistic of the candidate delay position by the first set value t1 and decreasing those of the remaining positions by the third set value t3, where the second set value t2 is less than or equal to the third set value t3;
  • determining the current delay estimate according to the current candidate delay position and the position of the reference signal.
  • Moving the reference signal based on the current delay estimate comprises:
  • Determining the delay estimate according to the calculated cross-correlation between the reference signal and each frame of the microphone signal further requires that one or more of the following conditions be satisfied:
  • The method further includes:
  • buffering the data of the historical reference signal and, when the reference signal is moved, moving the buffered historical reference signal data together with it.
  • The method further includes:
  • when the filter coefficients are updated, if the current delay estimate is smaller than the filter length, moving the filter coefficients according to the delay estimate and updating them starting from the moved coefficients; coefficients at positions left without a value after the move are reset and updated starting from the reset value.
  • An echo cancellation device based on time delay estimation, comprising:
  • a signal processing module configured to receive a microphone signal and a reference signal, preprocess them, and output the preprocessed microphone signal and reference signal;
  • a frequency point detection module configured to determine the frequency point signals in which a nonlinear condition exists in the preprocessed microphone signal and reference signal output by the signal processing module in the current echo cancellation scenario;
  • a delay estimation module configured to calculate and output the current delay estimate according to the frequency point signals, determined by the frequency point detection module, of the microphone signal and the reference signal in which no nonlinear condition exists;
  • a signal moving module configured to move the reference signal based on the current delay estimate output by the delay estimation module, and to output the moved reference signal;
  • an adaptive filter configured to be updated according to the preprocessed microphone signal output by the signal processing module and the moved reference signal output by the signal moving module, so as to implement echo cancellation.
  • The frequency point detection module comprises any one or more of the following detection units:
  • a far-end (remote) signal detection unit configured to determine a frequency point signal in which a nonlinear condition exists according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal;
  • a double-talk (double-ended) signal detection unit configured to determine a frequency point signal in which a nonlinear condition exists according to the energy ratio of the preprocessed microphone signal to the reference signal;
  • a device hardware detection unit configured to first calculate the correlation mean of the reference signal and the microphone signal in a lower frequency range, then calculate, at a certain frequency interval, the correlation mean of the reference signal and the microphone signal in other frequency ranges, and compare the correlation mean in the other frequency ranges with the correlation mean in the low frequency range to determine the frequency point signals in which a nonlinear condition exists.
  • The delay estimation module comprises:
  • a cross-correlation calculation unit configured to select, for each frame of the reference signal and each frame of the microphone signal within the delay analysis range, the frequency point signals in which no nonlinear condition exists, and to calculate the cross-correlation between the reference signal and each frame of the microphone signal;
  • a delay estimate determining unit configured to determine the delay estimate according to the calculated cross-correlation between the reference signal and each frame of the microphone signal.
  • The delay estimate determining unit is configured to select, as the current delay position, the position of the frame with the largest cross-correlation among the cross-correlations calculated by the cross-correlation calculation unit, and to determine the current delay estimate according to the current delay position and the position of the reference signal.
  • When determining the delay estimate, the delay estimate determining unit further checks that one or more of the following conditions are met:
  • The device further comprises:
  • a cache module configured to buffer the data of the historical reference signal;
  • the signal moving module is further configured to move the historical reference signal data buffered in the cache module together with the reference signal when the reference signal is moved.
  • When the filter coefficients are updated, if the current delay estimate is smaller than the filter length, the filter coefficients are moved according to the delay estimate and updated starting from the moved coefficients; coefficients at positions left without a value after the move are reset and updated starting from the reset value.
  • An echo cancellation device based on time delay estimation, comprising: a processor, a memory, and a system bus;
  • the processor and the memory are connected by the system bus;
  • the memory stores one or more programs, the one or more programs including instructions that, when executed by the processor, cause the processor to perform any one of the above echo cancellation methods based on time delay estimation.
  • A computer-readable storage medium storing instructions that, when run on a terminal device, cause the terminal device to perform any one of the above echo cancellation methods based on time delay estimation.
  • A computer program product that, when run on a terminal device, causes the terminal device to perform any one of the above echo cancellation methods based on time delay estimation.
  • The echo cancellation method and apparatus detect the frequency points at which a nonlinear condition exists in the microphone signal and the reference signal, and calculate the current delay estimate from the frequency point signals of the microphone signal and the reference signal in which no nonlinear condition exists. In other words, the delay between the reference signal and the microphone signal is estimated with the nonlinearity removed, which makes the delay estimate more accurate. The reference signal is then moved based on the current delay estimate, and the adaptive filter is updated based on the microphone signal and the moved reference signal to achieve echo cancellation, effectively improving the echo cancellation effect.
  • Further, the delay estimate is corrected based on various robustness conditions, so that the estimated delay is more robust.
  • Further, buffering and moving the data of the historical reference signal together with the current one, and resetting the filter coefficients that no longer have reference value, reduces the filter reconvergence time caused by a change of delay, which in turn reduces the impact of reconvergence on echo cancellation performance.
  • FIG. 1 is a flowchart of an echo cancellation method based on time delay estimation in an embodiment of the present application;
  • FIG. 2 is a schematic diagram comparing movement of the reference signal together with historical reference signal data in the embodiment of the present application with movement of only the current reference signal in the prior art;
  • FIG. 3 is a schematic diagram comparing the resetting of erroneous filter coefficients when the filter is updated in the embodiment of the present application with the prior art;
  • FIG. 4 is a schematic block diagram of an echo cancellation apparatus based on time delay estimation according to an embodiment of the present application;
  • FIG. 5 is another block diagram of an echo cancellation apparatus based on time delay estimation according to an embodiment of the present application.
  • The embodiments of the present application provide an echo cancellation method and apparatus based on delay estimation that exclude the frequency points at which a nonlinear condition exists when determining the delay. The delay estimate is determined from the frequency point signals of the microphone signal and the reference signal in which no nonlinear condition exists, so the estimate is more accurate; echo cancellation is then performed based on this estimate, which effectively improves the echo cancellation effect.
  • FIG. 1 is a flowchart of an echo cancellation method based on time delay estimation in the embodiment of the present application, which includes the following steps:
  • Step 101: Receive the microphone signal and the reference signal and preprocess each of them.
  • The microphone signal is the digital signal obtained by A/D conversion of the signal collected by the microphone used to pick up voice.
  • The reference signal is the source signal whose echo must be removed, and is also a digital signal.
  • In the television box scenario, for example, the reference signal is the source signal transmitted by the television box to the television; it may of course also be a conventional television loudspeaker signal. This is not limited here.
  • The preprocessing mainly includes framing, windowing, and fast Fourier transform, which transform the time-domain reference signal and microphone signal into the corresponding frequency-domain signals.
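  • As an illustration of the preprocessing in Step 101, the following minimal sketch frames a signal, applies a window, and transforms each frame to the frequency domain. The function names, the Hann window, the frame length, and the hop size are assumptions made for the example; the application does not specify them. A naive DFT stands in for a fast Fourier transform to keep the example self-contained.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames (an incomplete tail is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def hann_window(frame_len):
    """Periodic Hann window (assumed window type)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / frame_len)
            for k in range(frame_len)]


def dft(frame):
    """Naive O(N^2) DFT returning bins 0..N/2; a real system would use an FFT."""
    n = len(frame)
    return [sum(frame[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n))
            for b in range(n // 2 + 1)]


def preprocess(samples, frame_len=8, hop=4):
    """Return one list of complex frequency point values per frame."""
    window = hann_window(frame_len)
    return [dft([s * w for s, w in zip(frame, window)])
            for frame in frame_signal(samples, frame_len, hop)]
```

  • The same routine would be applied to both the reference signal and the microphone signal, so that later steps can compare the two signals frequency point by frequency point.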
  • Step 102: Determine the frequency point signals in which a nonlinear condition exists in the preprocessed microphone signal and reference signal in the current echo cancellation scenario.
  • Far-end (remote) signal detection: a frequency point signal in which a nonlinear condition exists is determined according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal. For example, if the energy P_x of the reference signal x at a certain frequency point is greater than the set energy threshold, a nonlinear condition is determined to exist at that frequency point.
  • Double-talk (double-ended) signal detection: a frequency point signal in which a nonlinear condition exists is determined according to the energy ratio of the preprocessed microphone signal to the reference signal.
  • Here x(n) and y(n) denote the reference signal and the microphone signal at frequency point n, respectively, and the smoothing coefficient takes a value that can be determined from extensive experimental results and/or experience.
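  • The double-talk detection can be sketched as follows. This is only an illustration under stated assumptions: the recursive smoothing form, the coefficient `alpha`, the threshold `ratio_threshold`, and the rule that a frequency point is flagged when the smoothed microphone-to-reference energy ratio exceeds the threshold are all hypothetical choices, not values or formulas taken from the application.

```python
def smooth_energy(prev_energy, value, alpha=0.9):
    """Recursive energy smoothing (assumed form); alpha weights the previous estimate."""
    return alpha * prev_energy + (1.0 - alpha) * abs(value) ** 2


def double_talk_flags(ref_frames, mic_frames, alpha=0.9,
                      ratio_threshold=4.0, floor=1e-12):
    """Return, for every frame, a list of booleans (True = frequency point flagged).

    ref_frames and mic_frames are lists of frames, each frame being a list of
    frequency point values x(n) and y(n).
    """
    num_bins = len(ref_frames[0])
    p_ref = [0.0] * num_bins  # smoothed reference energy per frequency point
    p_mic = [0.0] * num_bins  # smoothed microphone energy per frequency point
    flags = []
    for ref, mic in zip(ref_frames, mic_frames):
        frame_flags = []
        for n in range(num_bins):
            p_ref[n] = smooth_energy(p_ref[n], ref[n], alpha)
            p_mic[n] = smooth_energy(p_mic[n], mic[n], alpha)
            # Hypothetical rule: a large mic/reference energy ratio suggests
            # that near-end sound dominates this frequency point.
            frame_flags.append(p_mic[n] / (p_ref[n] + floor) > ratio_threshold)
        flags.append(frame_flags)
    return flags
```

  • Frequency points flagged in this way would be left out of the delay estimation in the next step.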
  • Detection of nonlinearity caused by device hardware: first, the cross-correlation mean of the reference signal and the microphone signal within a lower frequency range N (e.g., 300 Hz to 800 Hz, determined empirically and/or from extensive experimental results) is calculated.
  • The smoothing coefficient used here can likewise be determined from extensive experimentation and/or experience.
  • The cross-correlation mean of the reference signal and the microphone signal is:
  • Then, at a certain frequency interval, the cross-correlation mean of the reference signal and the microphone signal in other frequency ranges is calculated in the same way as in the low frequency range above.
  • Finally, the frequency point signals with a nonlinear condition are determined by comparing the correlation mean in the other frequency ranges with that in the low frequency range. For example, it is determined whether the correlation mean in another frequency range is significantly smaller than the correlation mean in the low frequency range (for example, the ratio of the two is less than 0.1); if so, the signals in that frequency range are considered nonlinear.
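  • The band comparison for hardware-induced nonlinearity can be sketched as below. The correlation measure (a normalized cross-correlation magnitude accumulated over frames) and all function names are assumptions for illustration, since the application's own formula is not reproduced here; the 0.1 ratio follows the example in the text.

```python
import math


def bin_correlation(ref_frames, mic_frames, n):
    """Normalized cross-correlation magnitude at frequency point n over all frames."""
    cross = sum(r[n] * complex(m[n]).conjugate()
                for r, m in zip(ref_frames, mic_frames))
    e_ref = sum(abs(r[n]) ** 2 for r in ref_frames)
    e_mic = sum(abs(m[n]) ** 2 for m in mic_frames)
    denom = math.sqrt(e_ref * e_mic)
    return abs(cross) / denom if denom > 0.0 else 0.0


def band_mean(ref_frames, mic_frames, bins):
    """Mean correlation over the frequency points of one band."""
    return sum(bin_correlation(ref_frames, mic_frames, n) for n in bins) / len(bins)


def nonlinear_bands(ref_frames, mic_frames, low_bins, other_bands,
                    ratio_threshold=0.1):
    """Return the bands whose mean correlation is far below the low-band mean."""
    low_mean = band_mean(ref_frames, mic_frames, low_bins)
    flagged = []
    for band in other_bands:
        mean = band_mean(ref_frames, mic_frames, band)
        if low_mean > 0.0 and mean / low_mean < ratio_threshold:
            flagged.append(band)
    return flagged
```

  • Here `low_bins` would hold the frequency points of the lower range (for example those covering 300 Hz to 800 Hz), and `other_bands` the groups of frequency points taken at the chosen frequency interval.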
  • Any one of the above detection methods may be used alone, or several may be combined.
  • When several are combined, the values obtained by the individual detection methods may be weighted and analyzed together, or each method may first judge the nonlinearity of the corresponding frequency points and the judgments may then be combined to decide whether a frequency point signal is nonlinear. The embodiments of the present application are not limited in this respect.
  • Step 103: Calculate the current delay estimate according to the frequency point signals of the microphone signal and the reference signal in which no nonlinear condition exists.
  • First, the delay analysis range is determined. For example, with a sampling rate of 16 kHz and a maximum delay of 1 s, the delay analysis range is 30 frames; that is, each delay analysis must calculate the cross-correlation between the reference signal and each of the 30 microphone frames.
  • Then, the frequency point signals in which no nonlinear condition exists are selected, and the cross-correlation between the reference signal and each frame of the microphone signal is calculated.
  • Specifically, the mean of the cross-correlations at the individual frequency points of the current frame is calculated and used as the cross-correlation between the current reference frame and the current microphone frame.
  • When calculating the cross-correlation between the reference signal and the microphone signal, the frequency points may be restricted to the common frequency range of audio (with 16 kHz sampling, for example, 1500 Hz to 4625 Hz).
  • Alternatively, M frequency points may be selected to calculate the cross-correlation between the reference signal and the microphone signal.
  • Finally, the delay estimate is determined based on the calculated cross-correlation between the reference signal and each frame of the microphone signal.
  • The delay estimate can be determined in several ways, as explained below.
  • Example 1: Select, as the current delay position, the position of the frame with the largest cross-correlation among the calculated cross-correlations between the reference signal and each frame of the microphone signal, and determine the current delay estimate from the current delay position and the position of the reference signal.
  • For example, suppose the reference signal is the 50th frame and the delay analysis range contains 30 frames of the microphone signal, namely the 20th to 50th microphone frames; the cross-correlation between each of these microphone frames and the 50th reference frame is calculated.
  • If the frame number of the reference signal (for example, 18) is less than 30, it is compared with all microphone frames up to the current frame; that is, the 18th reference frame is compared with microphone frames 1 to 18.
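  • Example 1 can be sketched as follows. The per-frame correlation used here (a normalized inner product over the usable frequency points) is an assumption made for the illustration, and the names `frame_correlation`, `estimate_delay`, and `usable_bins` are hypothetical; the application's own cross-correlation formula may differ.

```python
import math


def frame_correlation(ref_frame, mic_frame, usable_bins):
    """Normalized inner product over frequency points without a nonlinear condition."""
    cross = sum(ref_frame[n] * complex(mic_frame[n]).conjugate()
                for n in usable_bins)
    e_ref = sum(abs(ref_frame[n]) ** 2 for n in usable_bins)
    e_mic = sum(abs(mic_frame[n]) ** 2 for n in usable_bins)
    denom = math.sqrt(e_ref * e_mic)
    return abs(cross) / denom if denom > 0.0 else 0.0


def estimate_delay(ref_frame, ref_index, mic_frames, first_mic_index, usable_bins):
    """Return (delay_in_frames, best_correlation) for one reference frame.

    mic_frames holds the microphone frames inside the delay analysis range and
    first_mic_index is the frame number of mic_frames[0].
    """
    scores = [frame_correlation(ref_frame, mic, usable_bins) for mic in mic_frames]
    best = max(range(len(scores)), key=scores.__getitem__)
    delay_position = first_mic_index + best  # frame with the largest correlation
    return abs(ref_index - delay_position), scores[best]
```

  • The delay is expressed in frames as the distance between the delay position and the position of the reference frame; multiplying it by the frame hop would give the delay in samples.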
  • Example 2: To make the estimated delay more robust, one or more robustness conditions can be imposed. After the frame with the largest cross-correlation is found, it is checked whether one or more of the conditions are satisfied; if so, the current delay estimate is determined from the current delay position and the position of the reference signal; otherwise, processing proceeds to the next delay estimation.
  • Example 3: When the reference signal is moved based on the delay and the filter is updated, filter convergence places stricter demands on the accuracy of the delay estimate. To obtain a more accurate delay, in this embodiment the position of the frame with the largest cross-correlation is first taken as the candidate delay position; then, based on historical data, positions are penalized and/or rewarded according to how the candidate delay position changes, so that the final delay estimate is more accurate.
  • Specifically, the position of the frame with the largest cross-correlation is taken as the candidate delay position, and a statistic is kept for each position.
  • Let L be the total number of frames of the microphone signal in the delay analysis range, and let count be the number of consecutive occurrences of the same candidate delay position; once the sequence is interrupted, count is reset to 0.
  • If the candidate delay position changes, the statistic of the current candidate delay position is increased by the first set value t1, that of the previous candidate delay position is decreased by the second set value t2, and those of the remaining positions are decreased by the third set value t3.
  • If the candidate delay position does not change, its reliability is increasing and the reliability of the previously estimated candidate positions and of the other positions is correspondingly lower, so the statistic of the candidate delay position is increased by the first set value t1 and those of the remaining positions are decreased by the third set value t3.
  • The previous candidate position is more reliable than the other positions, so its statistic is reduced by less (t2 is less than or equal to t3).
  • The resulting candidate delay position is then used as the more accurate delay D_1(t).
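  • The reward and penalty scheme of Example 3 can be sketched as follows. The class name, the default values of t1, t2, and t3, and the rule of reporting the position with the highest statistic are assumptions for illustration; the application only requires that t2 be less than or equal to t3.

```python
class DelayTracker:
    """Keeps one statistic per delay position and rewards or penalizes positions."""

    def __init__(self, num_positions, t1=2.0, t2=0.5, t3=1.0):
        assert t2 <= t3, "the source requires t2 <= t3"
        self.stats = [0.0] * num_positions
        self.t1, self.t2, self.t3 = t1, t2, t3
        self.last = None   # previous candidate delay position
        self.count = 0     # consecutive occurrences of the same candidate

    def update(self, candidate):
        """Feed one candidate delay position; return the currently best position."""
        changed = self.last is not None and candidate != self.last
        if self.last is not None and not changed:
            self.count += 1
        else:
            self.count = 0  # reset when the sequence is interrupted
        for pos in range(len(self.stats)):
            if pos == candidate:
                self.stats[pos] += self.t1      # reward the current candidate
            elif changed and pos == self.last:
                self.stats[pos] -= self.t2      # mild penalty for the previous one
            else:
                self.stats[pos] -= self.t3      # stronger penalty for the rest
        self.last = candidate
        # Assumed decision rule: report the position with the highest statistic.
        return max(range(len(self.stats)), key=self.stats.__getitem__)
```

  • With this scheme a single outlying candidate does not immediately displace a delay position that has been observed repeatedly, which is the intent of combining historical data with the per-frame maximum.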
  • Example 4: To make the estimated delay more robust, when the delay estimate is determined in the manner of Example 3, it may also be checked whether one or more of the above robustness conditions are satisfied; if so, the current delay estimate is determined from the current delay position and the position of the reference signal; otherwise, processing proceeds to the next delay estimation.
  • Step 104: Move the reference signal based on the current delay estimate.
  • Specifically, the data of the reference signal may be moved by the corresponding delay estimate.
  • In the prior art, part of the existing reference signal content is lost when the reference signal is moved based on the delay.
  • To avoid this, in another embodiment the historical reference signal data can be moved together with it. Specifically, the data of the historical reference signal is buffered; when the reference signal is moved, the buffered historical reference signal data is moved along with it.
  • FIG. 2 compares moving the buffered historical reference signal data in the embodiment of the present application with moving only the current reference signal in the prior art.
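  • One way to picture the buffering of historical reference data is a bounded queue of past reference frames, from which the frame matching the estimated delay is read. This is a minimal sketch assuming the delay is expressed in whole frames and that the microphone signal lags the reference; the class and method names are hypothetical.

```python
from collections import deque


class ReferenceBuffer:
    """Buffers recent reference frames so a delayed frame can be fetched intact."""

    def __init__(self, max_delay_frames):
        # Keep the newest frame plus max_delay_frames older ones.
        self.frames = deque(maxlen=max_delay_frames + 1)

    def push(self, frame):
        """Store the newest reference frame (the oldest one is dropped when full)."""
        self.frames.append(frame)

    def delayed(self, delay_frames):
        """Return the frame delay_frames before the newest one, or None if unavailable."""
        if delay_frames < 0 or delay_frames >= len(self.frames):
            return None
        return self.frames[-1 - delay_frames]
```

  • Because the older frames are still available, moving the reference by the estimated delay does not leave a stretch of missing reference content, which is the loss described for the prior art.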
  • Step 105: Update the adaptive filter according to the preprocessed microphone signal and the moved reference signal to implement echo cancellation.
  • The output signal after echo cancellation is:
  • The filter coefficients h(t,n) are updated as follows:
  • $h(t,n) = h(t-1,n) + \mu \, e(t,n) \, x'(t,n) / (x'(t,n)^2 + \delta)$ (1.6)
  • where μ is the filter update step size, determined by extensive experimentation and/or experience, and δ is a normalization factor, generally also determined by extensive experimentation and/or experience.
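  • The coefficient update of equation (1.6) can be sketched per frequency point as follows. Several points are assumptions for illustration: the names `mu` (step size) and `delta` (normalization factor) and their default values, real-valued signals (complex spectra would use the conjugate of the reference and its squared magnitude), and the output form e = y - h·x', since the output formula is not reproduced in this text.

```python
def nlms_step(h, x_shifted, y, mu=0.5, delta=1e-6):
    """One normalized update of the coefficients, one coefficient per frequency point.

    h, x_shifted, and y are lists indexed by frequency point n.
    Returns (new_h, e), where e is the echo-cancelled output.
    """
    new_h = []
    out = []
    for h_n, x_n, y_n in zip(h, x_shifted, y):
        e_n = y_n - h_n * x_n  # assumed form of the echo-cancelled output
        # Update in the shape of equation (1.6): step scaled by reference energy.
        new_h.append(h_n + mu * e_n * x_n / (x_n * x_n + delta))
        out.append(e_n)
    return new_h, out
```

  • Calling `nlms_step` once per frame with the moved reference frame and the microphone frame drives the residual toward zero as long as the echo path stays the same.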
  • Because the filter coefficients must be re-adapted when the delay changes, another embodiment of the method speeds up the coefficient update and reduces the echo cancellation degradation that the update causes.
  • When the coefficients are updated, if the current delay estimate is smaller than the filter length, the filter coefficients are moved according to the delay estimate and updated starting from the moved coefficients; coefficients at positions left without a value after the move are reset and updated starting from the reset value.
  • FIG. 3 compares resetting the erroneous filter coefficients to 0 in the embodiment of the present application with the prior art.
  • The left side is a schematic diagram of the filter coefficient update in the prior art: when the filter coefficients are updated, all coefficients are updated starting from their current values.
  • The right side is a schematic diagram of the filter coefficient update in the embodiment of the present application: the nth and (n-1)th filter coefficients are updated, while the other filter coefficients are reset to 0 and then updated.
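  • The move-and-reset of the filter coefficients can be sketched as below. The shift direction, the interpretation of `shift` as the delay change in filter taps, and the full reset when the shift reaches the filter length are assumptions for illustration; the function name is hypothetical.

```python
def shift_coefficients(coeffs, shift):
    """Return the coefficients moved by `shift` taps; vacated taps are reset to 0.0.

    A shift whose magnitude reaches the filter length leaves nothing to reuse,
    so every coefficient is reset.
    """
    length = len(coeffs)
    if abs(shift) >= length:
        return [0.0] * length
    shifted = [0.0] * length
    for k in range(length):
        src = k + shift  # tap that moves into position k
        if 0 <= src < length:
            shifted[k] = coeffs[src]
    return shifted
```

  • Adaptation then continues from the returned list, so the taps that still describe the echo path keep their learned values and only the reset taps have to converge again.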
  • The echo cancellation method provided by the embodiments detects the frequency points at which a nonlinear condition exists in the microphone signal and the reference signal, and calculates the current delay estimate from the frequency point signals of the microphone signal and the reference signal in which no nonlinear condition exists. In other words, the delay between the reference signal and the microphone signal is estimated with the nonlinearity removed, so the delay estimate is more accurate. The reference signal is then moved based on the current delay estimate, and the adaptive filter is updated based on the microphone signal and the moved reference signal to achieve echo cancellation, which effectively improves the echo cancellation effect.
  • Further, the delay estimate is corrected based on various robustness conditions, so that the estimated delay is more robust.
  • Further, the filter reconvergence time caused by a change of delay is reduced, which reduces the impact of reconvergence on echo cancellation performance.
  • Correspondingly, the embodiment of the present application further provides an echo cancellation device based on delay estimation; FIG. 4 is a schematic block diagram of the device.
  • FIG. 4 takes echo cancellation on a TV box as an example. The reference signal is the source signal that the TV box sends to the television (it may of course also be the more conventional television loudspeaker signal), the signal captured by the TV box microphone is the microphone signal, and user A controls the TV box by voice.
  • The input signal in FIG. 4 contains the echo signal of the loudspeaker. The input signal is A/D converted, and the resulting digital signal passes through the delay-estimation-based echo cancellation device 400 of the embodiment of the present application, which cancels the echo signal in the input signal to obtain the output signal, that is, the voice signal of user A. The TV box parses the output signal to obtain user A's control command.
  • In this embodiment, the echo cancellation device 400 based on delay estimation includes the following modules:
  • The signal processing modules 401 and 401' respectively receive the microphone signal and the reference signal, preprocess them, and output the preprocessed microphone signal and reference signal. The preprocessing mainly consists of framing, windowing, fast Fourier transform, and similar operations, which convert the time-domain reference signal and microphone signal into the corresponding frequency-domain signals.
  • The frequency point detection modules 402 and 402' determine the frequency point signals that exhibit nonlinearity, under the current echo cancellation scenario, in the preprocessed microphone signal and reference signal output by the signal processing modules.
  • The delay estimation module 403 calculates and outputs the current delay estimate from the frequency point signals of the microphone signal and the reference signal that the frequency point detection modules found to be free of nonlinearity.
  • The signal movement module 404 moves the reference signal based on the current delay estimate output by the delay estimation module, and outputs the moved reference signal.
  • The adaptive filter 405 is updated according to the preprocessed microphone signal output by the signal processing module and the moved reference signal output by the signal movement module 404, to cancel the echo.
  • Note that in FIG. 4 the signal processing modules 401 and 401', and the frequency point detection modules 402 and 402', are drawn separately only to make the working principle of the device easier to follow. In practice, the signal processing modules 401 and 401' may be the same physical entity, and likewise the frequency point detection modules 402 and 402' may be the same physical entity. The signal processing module and the frequency point detection module may also be the same physical entity. The embodiment of the present application places no limitation on this.
  • FIG. 4 is only one application example of the device of the present application. The device can be applied in many scenarios. For example, where a TV box controls a television, integrating the device in the TV box effectively removes the sound played by the television from the voice commands addressed to the TV box. Other applications are not illustrated here.
  • Given the variety of application environments and device hardware, several different detection methods may be used to determine whether a frequency point signal exhibits nonlinearity. Accordingly, any one or more of the following detection units may be provided in the frequency point detection module to detect nonlinear frequency points in different situations:
  • A far-end signal detection unit, which determines the frequency point signals that exhibit nonlinearity according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal. For example, if the energy P_x of the reference signal x at a frequency point is greater than a set energy threshold, a nonlinear frequency point signal is determined to exist at that frequency point.
  • A double-ended signal detection unit, which determines the frequency point signals that exhibit nonlinearity according to the energy ratio of the preprocessed microphone signal to the reference signal. For details, refer to the description in the method embodiment above, which is not repeated here.
  • A device hardware detection unit, which first calculates the mean correlation between the reference signal and the microphone signal in a lower frequency range, then, using a certain frequency interval, calculates the mean correlation between the two signals in other frequency ranges, and finally determines the frequency point signals that exhibit nonlinearity from the mean correlation in the other frequency ranges and the mean correlation in the low frequency range. For example, if the mean correlation in another frequency range is significantly smaller than the mean correlation in the low frequency range, the signal in that frequency range can be determined to be nonlinear.
  • The delay estimation module 403 includes a cross-correlation calculation unit and a delay estimate determination unit. The cross-correlation calculation unit takes the reference signal and each microphone signal frame in the delay analysis range in turn, selects the frequency point signals that are free of nonlinearity, and calculates the cross-correlation between the reference signal and each microphone signal frame. The delay estimate determination unit determines the delay estimate from the cross-correlations calculated by the cross-correlation calculation unit.
  • For example, the delay estimate determination unit may select, among the cross-correlations between the reference signal and the microphone signal frames, the position of the frame with the largest cross-correlation as the current delay position, and determine the current delay estimate from the current delay position and the position of the reference signal.
  • Further, to make the current delay estimate more robust, the delay estimate determination unit may also check, when determining the current delay estimate, whether one or more of the robustness conditions are met. If so, the current delay estimate is determined from the current delay position and the position of the reference signal; otherwise, the next delay estimation proceeds.
  • In addition, to obtain a more accurate delay, the delay estimate determination unit may first take the position of the frame with the largest cross-correlation as a candidate delay position and then, based on historical data, penalize and/or reward the candidate delay position according to how it changes, so that the final delay estimate is more accurate.
  • Specifically, at each delay estimation, the position of the frame with the largest cross-correlation among the cross-correlations between the reference signal and the microphone signal frames in the delay analysis range is taken as the candidate delay position, the candidate delay position is recorded in an L-dimensional array Sa, where L is the total number of microphone signal frames in the delay analysis range, and the number of consecutive occurrences of the candidate delay position is counted.
  • When determining the delay estimate in this way, the delay estimate determination unit may likewise check whether one or more of the foregoing robustness conditions are met, so that the current delay estimate is more robust.
  • In practice, whichever way the delay estimate determination unit determined the current delay estimate, the signal movement module 404 may move the data of the reference signal by the corresponding delay estimate.
  • Further, to avoid losing historical reference signal content when the reference signal is moved, another embodiment of the device, shown in FIG. 5, may additionally include a buffer module 501 that buffers the data of the historical reference signal.
  • Correspondingly, in that embodiment, when the signal movement module 404 moves the reference signal, it also moves the historical reference signal data buffered in the buffer module along with it.
  • In another embodiment of the device, when the adaptive filter 405 updates its coefficients and the current delay estimate is smaller than the filter length, the filter coefficients are moved according to the delay estimate and updated from the moved coefficients, while the coefficients at positions left without a value after the move are reset (for example, these filter coefficients that no longer carry meaning are reset to 0) and then updated from the reset values.
  • The echo cancellation device detects the frequency points at which the microphone signal and the reference signal exhibit nonlinearity, and calculates the current delay estimate from the frequency point signals that are free of nonlinearity. In other words, the delay between the reference signal and the microphone signal is estimated with the nonlinear components removed, which makes the delay estimate more accurate. The reference signal is then moved by the current delay estimate, and the adaptive filter is updated from the microphone signal and the moved reference signal to cancel the echo, which effectively improves echo cancellation.
  • Further, during delay estimation the delay estimate is corrected against several robustness conditions, which makes the estimated delay more robust.
  • Further, during the adaptive filter update, buffering the historical reference signal data and moving it along with the reference signal, and resetting filter coefficients that no longer carry meaning, both shorten the filter re-convergence time caused by a change in delay and so reduce the impact of re-convergence on echo cancellation performance.
  • Further, the embodiment of the present application provides an echo cancellation device based on delay estimation that includes a processor, a memory, and a system bus.
  • The processor and the memory are connected by the system bus.
  • The memory stores one or more programs that include instructions which, when executed by the processor, cause the processor to perform any implementation of the delay-estimation-based echo cancellation method described above.
  • Further, the embodiment of the present application provides a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to perform any implementation of the delay-estimation-based echo cancellation method described above.
  • Further, the embodiment of the present application provides a computer program product which, when run on a terminal device, causes the terminal device to perform any implementation of the delay-estimation-based echo cancellation method described above.
  • The embodiments in this specification are described progressively: identical or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on how it differs from the others.
  • The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected as needed to achieve the purpose of the embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Telephone Function (AREA)
  • Filters That Use Time-Delay Elements (AREA)

Abstract

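An echo cancellation method and device based on delay estimation. The method includes: acquiring a microphone signal and a reference signal and preprocessing them (101); determining the frequency point signals that exhibit nonlinearity in the preprocessed microphone signal and reference signal under the current echo cancellation scenario (102); calculating a current delay estimate from the frequency point signals of the microphone signal and the reference signal that are free of nonlinearity (103); moving the reference signal based on the current delay estimate (104); and updating an adaptive filter according to the preprocessed microphone signal and the moved reference signal to cancel the echo (105). This improves the accuracy of the delay estimate and the echo cancellation performance.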

Description

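Echo cancellation method and device based on delay estimation
This application claims priority to Chinese patent application No. 201710994195.X, entitled "Echo cancellation method and device based on delay estimation", filed with the Chinese Patent Office on October 23, 2017, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of signal processing, and in particular to an echo cancellation method and device based on delay estimation.
Background
With the continuing development of information technology, distributed smart hardware of all kinds is used ever more widely across fields. Echo cancellation, an indispensable part of interaction with smart devices, has long been an active research topic for practitioners in the field.
Echo cancellation is a processing method that prevents far-end sound from being sent back by cancelling or removing the far-end audio signal picked up by the local microphone. A typical existing scheme is based on delay estimation: it computes the linear correlation between the reference signal and the microphone signal, takes the delay corresponding to the maximum cross-correlation as the device delay, moves the reference signal by that device delay, and then updates an adaptive filter from the moved reference signal and the microphone signal to produce a signal close to the real echo, which is subtracted from the microphone signal to cancel the echo. Consider distributed smart hardware for which only one end can be controlled, so that the reference signal and the microphone signal cannot be resampled synchronously. In an ordinary home, for example, a TV box controls a television, and the two usually come from different vendors. A TV box vendor needs to cancel the echo of the sound played by the television during voice control of the TV box, but has access only to the TV box, that is, only to the source signal the TV box sends to the television and the signal captured by the TV box microphone. The source signal sent to the television serves as the reference signal and the signal captured by the TV box microphone serves as the microphone signal, so truly synchronous resampling of the loudspeaker signal and the microphone signal is impossible. The delay between the reference signal and the microphone signal must therefore be estimated, and echo cancellation is then carried out according to that delay.
The accuracy of the delay estimate thus directly affects echo cancellation performance. Because real application environments are complex and variable, the delay estimates produced by existing delay-estimation-based echo cancellation techniques have large errors, and echo cancellation performance still needs improvement.
Summary
Embodiments of this application provide an echo cancellation method and device based on delay estimation, to reduce the delay estimation error and improve echo cancellation performance.
To this end, this application provides the following technical solutions: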
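An echo cancellation method based on delay estimation, the method comprising:
receiving a microphone signal and a reference signal separately, and preprocessing them;
determining the frequency point signals that exhibit nonlinearity in the preprocessed microphone signal and reference signal under the current echo cancellation scenario;
calculating a current delay estimate from the frequency point signals of the microphone signal and the reference signal that are free of nonlinearity;
moving the reference signal based on the current delay estimate;
updating an adaptive filter according to the preprocessed microphone signal and the moved reference signal, to cancel the echo.
Preferably, determining the frequency point signals that exhibit nonlinearity in the preprocessed microphone signal and reference signal under the current echo cancellation scenario includes any one or more of the following detections:
far-end signal detection: determining the frequency point signals that exhibit nonlinearity according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal;
double-ended signal detection: determining the frequency point signals that exhibit nonlinearity according to the energy ratio of the preprocessed microphone signal to the reference signal;
detection of nonlinearity caused by device hardware: first calculating the mean correlation between the reference signal and the microphone signal in a lower frequency range; then, using a certain frequency interval, calculating the mean correlation between the reference signal and the microphone signal in other frequency ranges; and finally determining the frequency point signals that exhibit nonlinearity according to the mean correlation in the other frequency ranges and the mean correlation in the low frequency range.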
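Preferably, calculating the current delay estimate from the frequency point signals of the microphone signal and the reference signal that are free of nonlinearity includes:
for the reference signal and each microphone signal frame in the delay analysis range in turn, selecting the frequency point signals that are free of nonlinearity and calculating the cross-correlation between the reference signal and each microphone signal frame;
determining the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames.
Preferably, determining the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames includes:
selecting, among the calculated cross-correlations between the reference signal and the microphone signal frames, the position of the frame with the largest cross-correlation as the current delay position, and determining the current delay estimate from the current delay position and the position of the reference signal.
Preferably, determining the current delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames includes:
at each delay estimation, taking the position of the frame with the largest cross-correlation among the cross-correlations between the reference signal and the microphone signal frames in the delay analysis range as the candidate delay position, recording the candidate delay position in an L-dimensional array Sa, where L is the total number of microphone signal frames in the delay analysis range, and counting the number of consecutive occurrences of the candidate delay position;
if the candidate delay position has changed, increasing the current candidate delay position by a first set value t1, decreasing the previous candidate delay position by a second set value t2, and decreasing the remaining positions by a third set value t3;
if the candidate delay position has not changed, increasing the current candidate delay position by the first set value t1 and decreasing the remaining positions by the third set value t3, the second set value t2 being less than or equal to the third set value t3;
if the current candidate delay position is greater than a first threshold and the number of consecutive occurrences of that candidate delay position is greater than a second threshold, determining the current delay estimate from the current candidate delay position and the position of the reference signal.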
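Preferably, moving the reference signal based on the current delay estimate includes:
when the current delay estimate D1(t) <= a third threshold T3, leaving the data of the reference signal unmoved;
when the third threshold T3 < the current delay estimate D1(t) < a fourth threshold T4, moving the data of the reference signal by D1(t)/2;
when the fourth threshold T4 <= the current delay estimate D1(t), moving the data of the reference signal by D1(t).
Preferably, determining the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames must satisfy any one or more of the following conditions:
(1) the cross-correlation C(t) at the current delay position is greater than the cross-correlation C(t-1) at the previous delay position;
(2) within the current delay analysis range, the difference between the positions corresponding to the maximum cross-correlation C_max(t) and the minimum cross-correlation C_min(t) over the frames is greater than a first set difference;
(3) the difference between the mean cross-correlation C_mean(t) of the reference signal with the microphone signal frames in the delay analysis range and the cross-correlation C(t) at the current delay position is greater than a second set difference;
(4) the current delay position p(t) is smaller than the previous delay position p(t-1).
Preferably, the method further includes:
buffering the data of the historical reference signal;
when moving the reference signal, moving the buffered historical reference signal data along with it.
Preferably, the method further includes:
if the current delay estimate is smaller than the filter length, then when updating the adaptive filter coefficients, moving the filter coefficients according to the delay estimate and updating from the moved coefficients, while resetting the coefficients at positions left without a coefficient after the move and updating from the reset coefficients.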
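An echo cancellation device based on delay estimation, the device comprising:
a signal processing module, configured to receive a microphone signal and a reference signal, preprocess them, and output the preprocessed microphone signal and reference signal;
a frequency point detection module, configured to determine the frequency point signals that exhibit nonlinearity, under the current echo cancellation scenario, in the preprocessed microphone signal and reference signal output by the signal processing module;
a delay estimation module, configured to calculate and output a current delay estimate from the frequency point signals of the microphone signal and the reference signal that the frequency point detection module found to be free of nonlinearity;
a signal movement module, configured to move the reference signal based on the current delay estimate output by the delay estimation module, and to output the moved reference signal;
an adaptive filter, updated according to the preprocessed microphone signal output by the signal processing module and the moved reference signal output by the signal movement module, to cancel the echo.
Preferably, the frequency point detection module includes any one or more of the following detection units:
a far-end signal detection unit, configured to determine the frequency point signals that exhibit nonlinearity according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal;
a double-ended signal detection unit, configured to determine the frequency point signals that exhibit nonlinearity according to the energy ratio of the preprocessed microphone signal to the reference signal;
a device hardware detection unit, configured to first calculate the mean correlation between the reference signal and the microphone signal in a lower frequency range; then, using a certain frequency interval, calculate the mean correlation between the reference signal and the microphone signal in other frequency ranges; and finally determine the frequency point signals that exhibit nonlinearity according to the mean correlation in the other frequency ranges and the mean correlation in the low frequency range.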
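Preferably, the delay estimation module includes:
a cross-correlation calculation unit, configured to take the reference signal and each microphone signal frame in the delay analysis range in turn, select the frequency point signals that are free of nonlinearity, and calculate the cross-correlation between the reference signal and each microphone signal frame;
a delay estimate determination unit, configured to determine the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames.
Preferably, the delay estimate determination unit is specifically configured to select, among the cross-correlations between the reference signal and the microphone signal frames calculated by the cross-correlation calculation unit, the position of the frame with the largest cross-correlation as the current delay position, and to determine the current delay estimate from the current delay position and the position of the reference signal.
Preferably, the delay estimate determination unit is specifically configured to: at each delay estimation, take the position of the frame with the largest cross-correlation among the cross-correlations between the reference signal and the microphone signal frames in the delay analysis range as the candidate delay position, record the candidate delay position in an L-dimensional array Sa, where L is the total number of microphone signal frames in the delay analysis range, and count the number of consecutive occurrences of the candidate delay position; if the candidate delay position has changed, increase the current candidate delay position by a first set value t1, decrease the previous candidate delay position by a second set value t2, and decrease the remaining positions by a third set value t3; if the candidate delay position has not changed, increase the current candidate delay position by the first set value t1 and decrease the remaining positions by the third set value t3, the second set value t2 being less than or equal to the third set value t3; and if the current candidate delay position is greater than a first threshold and the number of consecutive occurrences of that candidate delay position is greater than a second threshold, determine the current delay estimate from the current candidate delay position and the position of the reference signal.
Preferably, the signal movement module is specifically configured to: leave the data of the reference signal unmoved when the current delay estimate D1(t) <= a third threshold T3; move the data of the reference signal by D1(t)/2 when the third threshold T3 < the current delay estimate D1(t) < a fourth threshold T4; and move the data of the reference signal by D1(t) when the fourth threshold T4 <= the current delay estimate D1(t).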
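Preferably, when determining the delay estimate, the delay estimate determination unit must also confirm that any one or more of the following conditions is satisfied:
(1) the cross-correlation C(t) at the current delay position is greater than the cross-correlation C(t-1) at the previous delay position;
(2) within the current delay analysis range, the difference between the positions corresponding to the maximum cross-correlation C_max(t) and the minimum cross-correlation C_min(t) over the frames is greater than a first set difference;
(3) the difference between the mean cross-correlation C_mean(t) of the reference signal with the microphone signal frames in the delay analysis range and the cross-correlation C(t) at the current delay position is greater than a second set difference;
(4) the current delay position p(t) is smaller than the previous delay position p(t-1).
Preferably, the device further includes:
a buffer module, configured to buffer the data of the historical reference signal;
the signal movement module being further configured, when moving the reference signal, to move the historical reference signal data buffered in the buffer module along with it.
Preferably, when the adaptive filter updates its coefficients and the current delay estimate is smaller than the filter length, the filter coefficients are moved according to the delay estimate and updated from the moved coefficients, while the coefficients at positions left without a coefficient after the move are reset and updated from the reset coefficients.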
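An echo cancellation device based on delay estimation, comprising a processor, a memory, and a system bus;
the processor and the memory being connected by the system bus;
the memory being configured to store one or more programs, the one or more programs including instructions which, when executed by the processor, cause the processor to perform any one of the delay-estimation-based echo cancellation methods described above.
A computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to perform any one of the delay-estimation-based echo cancellation methods described above.
A computer program product which, when run on a terminal device, causes the terminal device to perform any one of the delay-estimation-based echo cancellation methods described above.
The echo cancellation method and device provided by the embodiments of this application detect the frequency points at which the microphone signal and the reference signal exhibit nonlinearity and calculate the current delay estimate from the frequency point signals that are free of nonlinearity. In other words, the delay between the reference signal and the microphone signal is estimated with the nonlinear components removed, which makes the delay estimate more accurate. The reference signal is then moved by the current delay estimate, and the adaptive filter is updated from the microphone signal and the moved reference signal to cancel the echo, which effectively improves echo cancellation.
Further, during delay estimation the delay estimate is corrected against several robustness conditions, which makes the estimated delay more robust.
Further, during the adaptive filter update, buffering the historical reference signal data and moving it along with the reference signal, and resetting filter coefficients that no longer carry meaning, both shorten the filter re-convergence time caused by a change in delay and so reduce the impact of re-convergence on echo cancellation performance.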
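Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed for the embodiments are briefly introduced below. The drawings described below are only some of the embodiments recorded in this application, and a person of ordinary skill in the art can derive other drawings from them.
FIG. 1 is a flowchart of the delay-estimation-based echo cancellation method of an embodiment of this application;
FIG. 2 compares moving the historical reference signal data along with the reference signal, as in an embodiment of this application, with the prior art, which moves only the current reference signal;
FIG. 3 compares resetting erroneous filter coefficients during the filter update, as in an embodiment of this application, with the prior art;
FIG. 4 is a schematic block diagram of the delay-estimation-based echo cancellation device of an embodiment of this application;
FIG. 5 is another block diagram of the delay-estimation-based echo cancellation device of an embodiment of this application.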
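Detailed Description
To help those skilled in the art better understand the solutions of the embodiments of this application, the embodiments are described in further detail below with reference to the drawings and implementations.
In practice, nonlinear conditions often arise during echo cancellation, for example high volume, low battery, absence of a far-end signal (that is, of the reference signal), and double talk (both a human voice and loudspeaker sound are present). These conditions make the audio signal nonlinear, cause the cross-correlation to be computed incorrectly during delay estimation, and ultimately degrade echo cancellation. The embodiments of this application therefore provide an echo cancellation method and device based on delay estimation in which, when the delay is determined, the frequency points that exhibit nonlinearity are removed and the delay estimate is determined from the frequency point signals of the microphone signal and the reference signal that are free of nonlinearity. This makes the delay estimate more accurate, and echo cancellation based on that estimate is effectively improved.
FIG. 1 is a flowchart of the delay-estimation-based echo cancellation method of an embodiment of this application, which includes the following steps: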
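Step 101: receive the microphone signal and the reference signal separately, and preprocess them.
The microphone signal is the digital signal captured by the microphone used to collect speech, after A/D conversion. The reference signal is the source signal whose echo is to be cancelled, and is likewise a digital signal. Taking echo cancellation on a TV box as an example, from the TV box vendor's point of view the reference signal is the source signal the TV box sends to the television; it may of course also be the more conventional television loudspeaker signal. The embodiments of this application place no limitation on this.
The preprocessing mainly consists of framing, windowing, fast Fourier transform, and similar operations, which convert the time-domain reference signal and microphone signal into the corresponding frequency-domain signals.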
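The preprocessing chain of step 101 (framing, windowing, transform) can be sketched as follows. This is only an illustration under stated assumptions: the Hann window, the frame length of 16, the hop of 8, and every function name are choices made here, not values from the application, and a naive DFT stands in for the FFT so that the sketch has no dependencies.

```python
import cmath
import math

def hann(n):
    """Hann window of length n (assumed window; the source only says 'windowing')."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def split_frames(samples, frame_len, hop):
    """Cut a sample list into overlapping frames; a trailing partial frame is dropped."""
    return [samples[s:s + frame_len]
            for s in range(0, len(samples) - frame_len + 1, hop)]

def dft(frame):
    """Naive O(N^2) DFT of a real frame, bins 0..N/2; a real system would use an FFT."""
    n = len(frame)
    return [sum(frame[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n))
            for b in range(n // 2 + 1)]

def preprocess(samples, frame_len=16, hop=8):
    """Framing -> windowing -> transform; returns one spectrum (list of bins) per frame."""
    window = hann(frame_len)
    return [dft([s * w for s, w in zip(frame, window)])
            for frame in split_frames(samples, frame_len, hop)]
```

The same function would be applied to both the microphone signal and the reference signal, so that later steps can compare the two frame by frame and frequency point by frequency point.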
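Step 102: determine the frequency point signals that exhibit nonlinearity in the preprocessed microphone signal and reference signal under the current echo cancellation scenario.
Given the variety of real application environments and device hardware, several different detection methods may be used to determine whether a frequency point signal exhibits nonlinearity, for example:
(1) Far-end signal detection: determine the frequency point signals that exhibit nonlinearity according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal. For example, if the energy P_x of the reference signal x at a frequency point is greater than a set energy threshold, a nonlinear frequency point signal is determined to exist at that frequency point.
(2) Double-ended signal detection: determine the frequency point signals that exhibit nonlinearity according to the energy ratio of the preprocessed microphone signal to the reference signal.
Specifically, calculate the smoothed energy of the reference signal x at a frequency point
Figure PCTCN2018095759-appb-000001
and the smoothed energy of the microphone signal at that frequency point
Figure PCTCN2018095759-appb-000002
and decide according to the energy ratio
Figure PCTCN2018095759-appb-000003
If the energy ratio is greater than a set energy-ratio threshold, a nonlinear frequency point signal is determined to exist at that frequency point. The smoothed energies
Figure PCTCN2018095759-appb-000004
Figure PCTCN2018095759-appb-000005
are calculated as follows:
Figure PCTCN2018095759-appb-000006
Figure PCTCN2018095759-appb-000007
where x(n) and y(n) denote the reference signal and the microphone signal at frequency n, and α is a smoothing coefficient whose value can be determined from extensive experimental results and/or experience.
Alternatively, the energy of the reference signal x at a frequency point and the energy of the microphone signal at that frequency point can be calculated directly and the decision made from their ratio: if the energy ratio is greater than the set energy-ratio threshold, a nonlinear frequency point signal is determined to exist at that frequency point.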
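The far-end and double-ended checks above can be sketched together as follows. The equation images for the smoothed energies are not reproduced in this text, so the first-order recursive smoothing used here is an assumed, common form rather than the application's exact formula; the thresholds, the value of alpha, and all names are likewise illustrative.

```python
def smooth_energy(prev, value, alpha):
    """One step of first-order recursive smoothing of |value|^2 (assumed form)."""
    return alpha * prev + (1.0 - alpha) * abs(value) ** 2

def detect_nonlinear_points(ref_frames, mic_frames, alpha=0.9,
                            energy_thr=100.0, ratio_thr=4.0, eps=1e-12):
    """Return, for each frame, the set of frequency point indices flagged as nonlinear.

    ref_frames / mic_frames: lists of spectra (lists of complex values).
    """
    n_points = len(ref_frames[0])
    px = [0.0] * n_points  # smoothed reference energy per frequency point
    py = [0.0] * n_points  # smoothed microphone energy per frequency point
    flagged = []
    for x, y in zip(ref_frames, mic_frames):
        bad = set()
        for n in range(n_points):
            px[n] = smooth_energy(px[n], x[n], alpha)
            py[n] = smooth_energy(py[n], y[n], alpha)
            # Far-end check: reference energy above a set threshold.
            if abs(x[n]) ** 2 > energy_thr:
                bad.add(n)
            # Double-ended check: microphone-to-reference energy ratio above a set threshold.
            if py[n] / (px[n] + eps) > ratio_thr:
                bad.add(n)
        flagged.append(bad)
    return flagged
```

Frequency points in the returned sets would simply be skipped when the cross-correlation for delay estimation is computed.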
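(3) Detection of nonlinearity caused by device hardware
In practice, cheaper hardware is often used to keep device cost as low as possible, and certain operating states can then become nonlinear, for example when the loudspeaker volume is too high or the battery is low. These nonlinearities can be detected as follows:
First, calculate the mean cross-correlation between the reference signal and the microphone signal within a lower frequency range N (for example 300 Hz to 800 Hz; the range can be set from experience and/or extensive experimental results).
At frequency n, the cross-correlation between the reference signal and the microphone signal is given by Equation 1.3:
Figure PCTCN2018095759-appb-000008
where
Figure PCTCN2018095759-appb-000009
and β is a smoothing coefficient that can be determined from extensive experiments and/or experience.
The mean cross-correlation between the reference signal and the microphone signal within the low frequency range N is then:
Figure PCTCN2018095759-appb-000010
Next, using a certain frequency interval d (1 ≤ d < 20), calculate the mean cross-correlation between the reference signal and the microphone signal in the other frequency ranges, in the same way as for the low frequency range above.
Finally, determine the frequency point signals that exhibit nonlinearity from the mean cross-correlation of the other frequency intervals and the mean cross-correlation of the low frequency range. For example, check whether the mean correlation in another frequency range is significantly smaller than the mean correlation in the low frequency range (for example, a ratio of the two below 0.1); if so, the signal in that frequency range is nonlinear.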
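A minimal sketch of the band comparison described above, assuming the per-frequency-point correlation values have already been computed. Treating the frequency interval d as a band width in frequency points is an interpretation made for this sketch, and the function names and the baseline band indices are hypothetical; only the 0.1 ratio comes from the example in the text.

```python
def band_mean(corr, lo, hi):
    """Mean of per-point correlation values over indices lo..hi-1."""
    return sum(corr[lo:hi]) / (hi - lo)

def hardware_nonlinear_bands(corr, low_band, step, ratio_thr=0.1):
    """Flag bands above the low band whose mean correlation is far below the low-band mean.

    corr: per-frequency-point correlation values (non-negative floats).
    low_band: (lo, hi) index range used as the trusted baseline.
    step: band width, in frequency points, for the rest of the spectrum (assumed meaning of d).
    """
    lo, hi = low_band
    base = band_mean(corr, lo, hi)
    flagged = []
    for start in range(hi, len(corr), step):
        stop = min(start + step, len(corr))
        if base > 0 and band_mean(corr, start, stop) / base < ratio_thr:
            flagged.append((start, stop))
    return flagged
```

Every frequency point inside a flagged band would then be excluded from the delay estimation in step 103.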
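Note that in practice any one of the above detection methods may be used alone, or any two or all three may be combined. When several methods are combined, the values produced by the individual methods may be weighted, or a frequency point may be declared nonlinear only when all of the methods used detect nonlinearity at that frequency point, among other options. The embodiments of this application place no limitation on this.
Step 103: calculate the current delay estimate from the frequency point signals of the microphone signal and the reference signal that are free of nonlinearity.
First, determine the microphone signal frames contained in the delay analysis range. For example, with a 16 kHz sampling rate and a maximum delay of 1 s, the delay analysis range is 30 frames, so each delay analysis calculates the cross-correlation between the reference signal and each of these 30 microphone signal frames.
Then, for the reference signal and each microphone signal frame in turn, select the frequency point signals that are free of nonlinearity and calculate the cross-correlation between the reference signal and each microphone signal frame.
Specifically, for the current frame, select the frequency point signals that are free of nonlinearity and, taking 512 frequency points (an FFT length of 1024) as an example, calculate the cross-correlation of the two signals at each frequency point as in Equation 1.3 above.
Once the cross-correlation at each frequency point is available, calculate the mean of the cross-correlations over the frequency points of the current frame and take that mean as the cross-correlation between the current reference signal frame and the current microphone signal frame.
Note that, to make the cross-correlation-based delay estimate more accurate, the cross-correlation may be calculated over frequency points within the frequency range typical of audio (with 16 kHz sampling, sound typically lies in 1500 Hz to 4625 Hz). Further, to reduce computation and improve efficiency, M frequency points (for example M = 100) within that range may be selected for calculating the cross-correlation between the reference signal and the microphone signal.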
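The per-frame cross-correlation described above can be sketched as follows. Equation 1.3 is not reproduced in this text, so the recursively smoothed, normalized form used here (a magnitude-squared coherence with smoothing coefficient beta) is an assumption, not necessarily the application's formula; the state layout and all names are also hypothetical.

```python
def new_state(n_points):
    """Smoothed cross- and auto-spectra, one entry per frequency point."""
    return {"pxy": [0j] * n_points, "px": [0.0] * n_points, "py": [0.0] * n_points}

def update_correlation(state, x, y, beta=0.5, eps=1e-12):
    """Fold one (reference, microphone) frame pair into the state; return per-point correlation."""
    for n in range(len(x)):
        state["pxy"][n] = beta * state["pxy"][n] + (1 - beta) * x[n] * y[n].conjugate()
        state["px"][n] = beta * state["px"][n] + (1 - beta) * abs(x[n]) ** 2
        state["py"][n] = beta * state["py"][n] + (1 - beta) * abs(y[n]) ** 2
    return [abs(state["pxy"][n]) ** 2 / (state["px"][n] * state["py"][n] + eps)
            for n in range(len(x))]

def frame_correlation(per_point, nonlinear_points):
    """Mean correlation over the frequency points that were not flagged as nonlinear."""
    kept = [c for n, c in enumerate(per_point) if n not in nonlinear_points]
    return sum(kept) / len(kept) if kept else 0.0
```

One state would be kept per candidate microphone frame in the delay analysis range, and `frame_correlation` gives the single value per frame that the delay search compares.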
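Finally, determine the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames.
In practice the delay estimate can be determined in several ways, which are described in turn below.
Example 1: among the calculated cross-correlations between the reference signal and the microphone signal frames, select the position of the frame with the largest cross-correlation as the current delay position, and determine the current delay estimate from the current delay position and the position of the reference signal.
For example, take the current reference signal frame as the baseline and assume it is frame 50. The delay analysis range contains 30 microphone signal frames, frames 20 to 50, and each of these 30 frames is compared with reference frame 50. If reference frame 50 has its largest cross-correlation with microphone frame 25, the current delay estimate is 50 - 25 = 25. If the frame number of the reference signal (say 18) is less than 30, the microphone signal frames to compare are all frames up to the current one, that is, reference frame 18 is compared with microphone frames 1 to 18.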
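Example 1 amounts to an argmax over the frames of the delay analysis range. A minimal sketch, with hypothetical names, in which the delay is expressed in frames:

```python
def estimate_delay(ref_index, mic_indices, correlations):
    """Delay in frames: reference frame index minus the index of the best-matching mic frame."""
    best = max(range(len(correlations)), key=correlations.__getitem__)
    return ref_index - mic_indices[best]
```

With reference frame 50 and the largest cross-correlation at microphone frame 25, the function returns 50 - 25 = 25, matching the worked example in the text.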
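Example 2: to make the current delay estimate more robust, one or more of the following robustness conditions may also be imposed. That is, once the frame with the largest cross-correlation has been found, it is further checked whether any one or more of the following conditions is satisfied. If so, the current delay estimate is determined from the current delay position and the position of the reference signal; otherwise, the next delay estimation proceeds.
The robustness conditions are:
(1) the cross-correlation C(t) at the current delay position is greater than the cross-correlation C(t-1) at the previous delay position;
(2) within the current delay analysis range, the difference between the positions corresponding to the maximum cross-correlation C_max(t) and the minimum cross-correlation C_min(t) over the frames is greater than a first set difference, for example a first set difference of 3;
(3) the difference between the mean cross-correlation C_mean(t) of the reference signal with the microphone signal frames in the delay analysis range and the cross-correlation C(t) at the current delay position is greater than a second set difference;
(4) the current delay position p(t) is smaller than the previous delay position p(t-1).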
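The four robustness conditions can be evaluated independently and then combined according to whichever subset an implementation chooses to enforce. The sketch below makes that choice explicit; the second set difference of 0.2, the reading of condition (3) as C(t) minus C_mean(t), and all names are assumptions for illustration.

```python
def robust_checks(corrs, prev_corr, prev_pos, pos_gap_thr=3, mean_gap_thr=0.2):
    """Evaluate conditions (1) to (4) for one delay analysis range; returns named booleans."""
    pos = max(range(len(corrs)), key=corrs.__getitem__)      # current delay position p(t)
    min_pos = min(range(len(corrs)), key=corrs.__getitem__)  # position of C_min(t)
    c = corrs[pos]                                           # C(t), also C_max(t)
    mean = sum(corrs) / len(corrs)                           # C_mean(t)
    return {
        "corr_increased": c > prev_corr,                     # condition (1)
        "max_min_apart": abs(pos - min_pos) > pos_gap_thr,   # condition (2)
        "peak_above_mean": c - mean > mean_gap_thr,          # condition (3)
        "position_decreased": pos < prev_pos,                # condition (4)
    }

def accept(checks, required):
    """Accept the estimate only if every condition the caller chose to enforce holds."""
    return all(checks[name] for name in required)
```

If `accept` returns False, the current estimate is discarded and the next delay estimation proceeds, as described in Example 2.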
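Example 3: when the reference signal is moved and the filter updated based on the delay, the accuracy required of the delay estimate becomes stricter as the filter converges. To obtain a more accurate delay, in this example the position of the frame with the largest cross-correlation is first taken as a candidate delay position, and the candidate delay position is then penalized and/or rewarded, based on historical data, according to how it changes, so that the final delay estimate is more accurate.
The delay estimate is determined as follows:
First, at each delay estimation, take the position of the frame with the largest cross-correlation among the cross-correlations between the reference signal and the microphone signal frames in the delay analysis range as the candidate delay position, record the candidate delay position in an L-dimensional array Sa, where L is the total number of microphone signal frames in the delay analysis range, and count the number of consecutive occurrences, count, of the candidate delay position; as soon as the run is broken, count is set to 0;
If the candidate delay position has changed, increase the current candidate delay position by the first set value t1, decrease the previous candidate delay position by the second set value t2, and decrease the remaining positions by the third set value t3;
If the candidate delay position has not changed, confidence in the current candidate delay position is increasing, and confidence in the previously estimated candidate delay positions and the other positions is correspondingly lower, so the current candidate delay position is increased by the first set value t1 and the remaining positions are decreased by the third set value t3. The previous candidate position is more credible than the other positions, so its statistic is decreased less. The values of t1, t2, and t3 can be set from experience or extensive experiments, generally with t3 >= t2; for example, t1, t2, and t3 may be 2, 1, and 2 respectively;
If the current candidate delay position is greater than the first threshold T1 (for example T1 = 10) and the number of consecutive occurrences of that candidate delay position is greater than the second threshold T2 (for example T2 = 4), the estimate of the current candidate delay position is considered fairly accurate and is taken as the more accurate delay D1(t).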
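The reward and penalty bookkeeping of Example 3 can be sketched as a small tracker. Two points are interpretations made here, not statements from the source: the first threshold T1 is compared against the statistic stored in Sa for the candidate position, and count restarts at 0 on the first frame of a new run. The class and attribute names are hypothetical; the default values are the examples given in the text.

```python
class DelayTracker:
    """Keeps the statistic array Sa and the run length of the current candidate position."""

    def __init__(self, n_positions, t1=2, t2=1, t3=2, T1=10, T2=4):
        self.sa = [0] * n_positions
        self.prev = None
        self.count = 0
        self.t1, self.t2, self.t3, self.T1, self.T2 = t1, t2, t3, T1, T2

    def update(self, pos):
        """Feed one candidate position; return it once it is trusted, else None."""
        changed = pos != self.prev
        for i in range(len(self.sa)):
            if i == pos:
                self.sa[i] += self.t1      # reward the current candidate
            elif changed and i == self.prev:
                self.sa[i] -= self.t2      # milder penalty for the previous candidate
            else:
                self.sa[i] -= self.t3      # penalty for every other position
        self.count = 0 if changed else self.count + 1
        self.prev = pos
        if self.sa[pos] > self.T1 and self.count > self.T2:
            return pos
        return None
```

With the default values, a candidate has to repeat six times in a row before `update` returns it, which is the kind of inertia the penalty and reward scheme is meant to provide.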
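Example 4: to make the current delay estimate more robust, when the delay estimate is determined as in Example 3 it may likewise be checked whether one or more of the above robustness conditions is satisfied. If so, the current delay estimate is determined from the current delay position and the position of the reference signal; otherwise, the next delay estimation proceeds.
Step 104: move the reference signal based on the current delay estimate.
Whichever of Examples 1 to 4 is used to determine the current delay estimate, in practice the data of the reference signal can be moved by the corresponding delay estimate.
In addition, when the current delay estimate D1(t) is determined as in Example 3 or Example 4, the accuracy of the estimate differs depending on the interval in which the candidate delay position falls, so the reference signal may also be moved as follows:
when the current delay estimate D1(t) <= the third threshold T3 (for example 10), the data of the reference signal is not moved;
when the third threshold T3 < the current delay estimate D1(t) < the fourth threshold T4 (for example 20), the data of the reference signal is moved by D1(t)/2;
when the fourth threshold T4 <= the current delay estimate D1(t), the data of the reference signal is moved by D1(t).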
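The three-way rule above maps a delay estimate to the amount by which the reference signal is actually moved. A minimal sketch using the example thresholds from the text; rounding D1(t)/2 down to a whole number of frames is an assumption, as is the function name.

```python
def move_amount(d1, T3=10, T4=20):
    """Number of frames to move the reference signal for a delay estimate d1."""
    if d1 <= T3:
        return 0          # small estimates are not trusted enough to move at all
    if d1 < T4:
        return d1 // 2    # intermediate estimates: move by half (integer frames assumed)
    return d1             # large estimates: move by the full amount
```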
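In addition, to address the loss of historical reference signal content that occurs when existing techniques move the reference signal based on the delay, in another embodiment of the method the historical reference signal data is moved along with the reference signal when the latter is moved by the current delay estimate. Specifically, the data of the historical reference signal is buffered, and when the reference signal is moved, the buffered historical reference signal data is moved with it.
FIG. 2 compares moving the buffered historical reference signal data along with the reference signal, as in the embodiment of this application, with the prior art.
As FIG. 2 shows, in the prior art, assuming a delay estimate of 3, moving the reference signal simply replaces the reference signal data at time n with the data at time n-3, while the historical reference signal data, such as the data at times n-1 to n-4 in the figure, stays unchanged, which makes the signal discontinuous. With the embodiment of this application, moving the reference signal not only replaces the data at time n with the data at time n-3 but also moves the historical reference signal data along with it, as shown in FIG. 2, which avoids the effect on echo cancellation of signal discontinuity, that is, of lost historical reference signal content.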
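The difference between the two behaviours in FIG. 2 can be made concrete with a small history buffer. This is a sketch with hypothetical names; frames are represented by plain integers so that the ordering is easy to read, and the buffer must hold at least `delay + n_taps` frames.

```python
from collections import deque

class ReferenceHistory:
    """Buffer of past reference frames; index 0 is the newest frame (time n)."""

    def __init__(self, maxlen):
        self.buf = deque(maxlen=maxlen)

    def push(self, frame):
        self.buf.appendleft(frame)

    def taps_prior_art(self, delay, n_taps):
        """Only the slot for time n is replaced by the frame from n-delay; history is untouched."""
        taps = [self.buf[k] for k in range(n_taps)]
        taps[0] = self.buf[delay]
        return taps

    def taps_moved_together(self, delay, n_taps):
        """Every slot is moved by the delay, so the filter input stays contiguous in time."""
        return [self.buf[delay + k] for k in range(n_taps)]
```

After pushing frames 0 to 9 and using a delay of 3 with four taps, the prior-art view is frames 6, 8, 7, 6 (a discontinuity after the first slot), while the jointly moved view is frames 6, 5, 4, 3.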
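Step 105: update the adaptive filter according to the preprocessed microphone signal and the moved reference signal, to cancel the echo.
Specifically, given the microphone signal y(t,n) and the moved reference signal x'(t,n), the output signal after echo cancellation is:
e(t,n) = y(t,n) - h(t,n) * x'(t,n)                (1.5)
where h(t,n) is the filter coefficient.
The filter coefficient h(t,n) is updated as follows:
h(t,n) = h(t-1,n) + γ * e(t,n) * x'(t,n) / (x'(t,n)^2 + θ)       (1.6)
where γ is the filter update step size, determined from extensive experiments and/or experience, and θ is a regularization factor, generally also determined from extensive experiments and/or experience.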
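Equations 1.5 and 1.6 describe a normalized LMS style update applied per frequency point. The sketch below follows them with two stated assumptions: the error is computed from the coefficients of the previous step, and for complex spectra the conjugate of x' and |x'|^2 are used in place of x' and x'^2, which reduces to Equation 1.6 for real values. The step size, regularization value, and names are illustrative.

```python
def nlms_step(h, x_moved, y, gamma=0.5, theta=1e-6):
    """One update over all frequency points; returns (error signal, new coefficients)."""
    # Equation 1.5: residual after subtracting the estimated echo.
    e = [yn - hn * xn for yn, hn, xn in zip(y, h, x_moved)]
    # Equation 1.6: normalized step toward the coefficients that cancel the echo.
    h_new = [hn + gamma * en * xn.conjugate() / (abs(xn) ** 2 + theta)
             for hn, en, xn in zip(h, e, x_moved)]
    return e, h_new
```

Calling `nlms_step` repeatedly on an echo that is a fixed multiple of the moved reference drives the coefficient toward that multiple and the error toward zero.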
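Because the filter coefficients have to be re-adapted when the delay changes, another embodiment of the method speeds up this re-adaptation and limits the drop in echo cancellation performance it causes: if the current delay estimate is smaller than the filter length, then when the adaptive filter coefficients are updated, the filter coefficients are moved according to the delay estimate and updated from the moved coefficients, while the coefficients at positions left without a coefficient after the move are reset, for example to 0, and updated from the reset values, which shortens the filter coefficient update time.
FIG. 3 compares resetting the erroneous filter coefficients to 0 during the filter update, as in the embodiment of this application, with the prior art.
Assume a delay estimate of 3. The left side shows how the filter coefficients are updated in the prior art: every coefficient is updated from its current value. The right side shows how the filter coefficients are updated in the embodiment of this application: the nth and (n-1)th filter coefficients are updated, and the other filter coefficients are first reset to 0 and then updated.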
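The coefficient handling described above can be sketched as a move followed by zero filling. The direction of the move and the behaviour when the delay is not smaller than the filter length are assumptions made here, since the text specifies neither; the function name is hypothetical.

```python
def move_coefficients(h, delay):
    """Move filter taps by `delay`; positions left without a tap are reset to 0.

    The source describes the move only for delay < filter length, so other
    cases return the coefficients unchanged in this sketch.
    """
    if delay <= 0 or delay >= len(h):
        return list(h)
    return list(h[delay:]) + [0.0] * delay
```

With a five-tap filter and a delay estimate of 3, two coefficients survive the move and three are reset, which matches the situation shown on the right side of FIG. 3.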
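The echo cancellation method provided by the embodiments of this application detects the frequency points at which the microphone signal and the reference signal exhibit nonlinearity and calculates the current delay estimate from the frequency point signals that are free of nonlinearity. In other words, the delay between the reference signal and the microphone signal is estimated with the nonlinear components removed, which makes the delay estimate more accurate. The reference signal is then moved by the current delay estimate, and the adaptive filter is updated from the microphone signal and the moved reference signal to cancel the echo, which effectively improves echo cancellation.
Further, during delay estimation the delay estimate is corrected against several robustness conditions, which makes the estimated delay more robust.
Further, during the adaptive filter update, buffering the historical reference signal data and moving it along with the reference signal, and resetting filter coefficients that no longer carry meaning, shorten the filter re-convergence time caused by a change in delay and reduce the impact of re-convergence on echo cancellation performance.
Correspondingly, the embodiments of this application further provide an echo cancellation device based on delay estimation; FIG. 4 is a schematic block diagram of the device.
FIG. 4 takes echo cancellation on a TV box as an example. The reference signal is the source signal the TV box sends to the television (it may of course also be the more conventional television loudspeaker signal), the signal captured by the TV box microphone is the microphone signal, and user A controls the TV box by voice.
The input signal in FIG. 4 contains the echo signal of the loudspeaker. The input signal is A/D converted, and the resulting digital signal passes through the delay-estimation-based echo cancellation device 400 of the embodiment of this application, which cancels the echo signal in the input signal to obtain the output signal, that is, the voice signal of user A. The TV box parses the output signal to obtain user A's control command.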
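In this embodiment, the delay-estimation-based echo cancellation device 400 includes the following modules:
signal processing modules 401 and 401', configured to respectively receive the microphone signal and the reference signal, preprocess them, and output the preprocessed microphone signal and reference signal; the preprocessing mainly consists of framing, windowing, fast Fourier transform, and similar operations, which convert the time-domain reference signal and microphone signal into the corresponding frequency-domain signals;
frequency point detection modules 402 and 402', configured to determine the frequency point signals that exhibit nonlinearity, under the current echo cancellation scenario, in the preprocessed microphone signal and reference signal output by the signal processing modules;
a delay estimation module 403, configured to calculate and output the current delay estimate from the frequency point signals of the microphone signal and the reference signal that the frequency point detection modules found to be free of nonlinearity;
a signal movement module 404, configured to move the reference signal based on the current delay estimate output by the delay estimation module, and to output the moved reference signal;
an adaptive filter 405, updated according to the preprocessed microphone signal output by the signal processing module and the moved reference signal output by the signal movement module 404, to cancel the echo.
Note that in FIG. 4 the signal processing modules 401 and 401', and the frequency point detection modules 402 and 402', are drawn separately only to make the working principle of the device easier to follow. In practice, the signal processing modules 401 and 401' may be the same physical entity, and likewise the frequency point detection modules 402 and 402' may be the same physical entity. The signal processing module and the frequency point detection module may also be the same physical entity. The embodiments of this application place no limitation on this.
FIG. 4 is only one application example of the device of this application. The device can be applied in many scenarios. For example, where a TV box controls a television, integrating the device in the TV box effectively removes the sound played by the television from the voice commands addressed to the TV box. Other applications are not illustrated here.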
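Given the variety of real application environments and device hardware, several different detection methods may be used to determine whether a frequency point signal exhibits nonlinearity. Accordingly, any one or more of the following detection units may be provided in the frequency point detection module to detect nonlinear frequency points in different situations:
a far-end signal detection unit, configured to determine the frequency point signals that exhibit nonlinearity according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal; for example, if the energy P_x of the reference signal x at a frequency point is greater than a set energy threshold, a nonlinear frequency point signal is determined to exist at that frequency point;
a double-ended signal detection unit, configured to determine the frequency point signals that exhibit nonlinearity according to the energy ratio of the preprocessed microphone signal to the reference signal; for details, refer to the description in the method embodiment above, which is not repeated here;
a device hardware detection unit, configured to first calculate the mean correlation between the reference signal and the microphone signal in a lower frequency range; then, using a certain frequency interval, calculate the mean correlation between the reference signal and the microphone signal in other frequency ranges; and finally determine the frequency point signals that exhibit nonlinearity according to the mean correlation in the other frequency ranges and the mean correlation in the low frequency range. For example, if the mean correlation in another frequency range is significantly smaller than the mean correlation in the low frequency range, the signal in that frequency range can be determined to be nonlinear.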
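The delay estimation module 403 includes a cross-correlation calculation unit and a delay estimate determination unit. The cross-correlation calculation unit takes the reference signal and each microphone signal frame in the delay analysis range in turn, selects the frequency point signals that are free of nonlinearity, and calculates the cross-correlation between the reference signal and each microphone signal frame. The delay estimate determination unit determines the delay estimate from the cross-correlations between the reference signal and the microphone signal frames calculated by the cross-correlation calculation unit.
For example, the delay estimate determination unit may select, among the cross-correlations between the reference signal and the microphone signal frames calculated by the cross-correlation calculation unit, the position of the frame with the largest cross-correlation as the current delay position, and determine the current delay estimate from the current delay position and the position of the reference signal.
Further, to make the current delay estimate more robust, the delay estimate determination unit may also check, when determining the current delay estimate, whether one or more of the following robustness conditions is satisfied. If so, the current delay estimate is determined from the current delay position and the position of the reference signal; otherwise, the next delay estimation proceeds.
The robustness conditions are:
(1) the cross-correlation C(t) at the current delay position is greater than the cross-correlation C(t-1) at the previous delay position;
(2) within the current delay analysis range, the difference between the positions corresponding to the maximum cross-correlation C_max(t) and the minimum cross-correlation C_min(t) over the frames is greater than a first set difference;
(3) the difference between the mean cross-correlation C_mean(t) of the reference signal with the microphone signal frames in the delay analysis range and the cross-correlation C(t) at the current delay position is greater than a second set difference;
(4) the current delay position p(t) is smaller than the previous delay position p(t-1).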
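In addition, when the reference signal is moved and the filter updated based on the delay, the accuracy required of the delay estimate becomes stricter as the filter converges. To obtain a more accurate delay, the delay estimate determination unit may therefore first take the position of the frame with the largest cross-correlation as a candidate delay position and then, based on historical data, penalize and/or reward the candidate delay position according to how it changes, so that the final delay estimate is more accurate. Specifically, at each delay estimation, the position of the frame with the largest cross-correlation among the cross-correlations between the reference signal and the microphone signal frames in the delay analysis range is taken as the candidate delay position, the candidate delay position is recorded in an L-dimensional array Sa, where L is the total number of microphone signal frames in the delay analysis range, and the number of consecutive occurrences of the candidate delay position is counted. If the candidate delay position has changed, the current candidate delay position is increased by the first set value t1, the previous candidate delay position is decreased by the second set value t2, and the remaining positions are decreased by the third set value t3. If the candidate delay position has not changed, the current candidate delay position is increased by the first set value t1 and the remaining positions are decreased by the third set value t3, the second set value t2 being less than or equal to the third set value t3. If the current candidate delay position is greater than the first threshold and the number of consecutive occurrences of that candidate delay position is greater than the second threshold, the current delay estimate is determined from the current candidate delay position and the position of the reference signal. The values of t1, t2, and t3 can be set from experience or extensive experiments, generally with t3 >= t2; for example, t1, t2, and t3 may be 2, 1, and 2 respectively.
Note that when determining the delay estimate in this way, the delay estimate determination unit may likewise check whether one or more of the above robustness conditions is satisfied, so that the current delay estimate is more robust.
In practice, whichever way the delay estimate determination unit determined the current delay estimate, the signal movement module 404 may move the data of the reference signal by the corresponding delay estimate.
In addition, when the delay estimate is obtained by penalizing and/or rewarding the candidate delay position based on historical data according to how it changes, the accuracy of the estimate differs depending on the interval in which the candidate delay position falls. The signal movement module 404 may therefore also move the reference signal as follows: when the current delay estimate D1(t) <= the third threshold T3, the data of the reference signal is not moved; when the third threshold T3 < the current delay estimate D1(t) < the fourth threshold T4, the data of the reference signal is moved by D1(t)/2; when the fourth threshold T4 <= the current delay estimate D1(t), the data of the reference signal is moved by D1(t).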
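Further, to avoid losing historical reference signal content when the reference signal is moved, another embodiment of the device, shown in FIG. 5, may additionally include a buffer module 501, configured to buffer the data of the historical reference signal.
Correspondingly, in that embodiment, when the signal movement module 404 moves the reference signal, it also moves the historical reference signal data buffered in the buffer module along with it.
In another embodiment of the device, when the adaptive filter 405 updates its coefficients and the current delay estimate is smaller than the filter length, the filter coefficients are moved according to the delay estimate and updated from the moved coefficients, while the coefficients at positions left without a coefficient after the move are reset (for example, these filter coefficients that no longer carry meaning are reset to 0) and then updated from the reset values.
The echo cancellation device provided by the embodiments of this application detects the frequency points at which the microphone signal and the reference signal exhibit nonlinearity and calculates the current delay estimate from the frequency point signals that are free of nonlinearity. In other words, the delay between the reference signal and the microphone signal is estimated with the nonlinear components removed, which makes the delay estimate more accurate. The reference signal is then moved by the current delay estimate, and the adaptive filter is updated from the microphone signal and the moved reference signal to cancel the echo, which effectively improves echo cancellation.
Further, during delay estimation the delay estimate is corrected against several robustness conditions, which makes the estimated delay more robust.
Further, during the adaptive filter update, buffering the historical reference signal data and moving it along with the reference signal, and resetting filter coefficients that no longer carry meaning, both shorten the filter re-convergence time caused by a change in delay and so reduce the impact of re-convergence on echo cancellation performance.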
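Further, the embodiments of this application provide an echo cancellation device based on delay estimation, comprising a processor, a memory, and a system bus;
the processor and the memory being connected by the system bus;
the memory being configured to store one or more programs, the one or more programs including instructions which, when executed by the processor, cause the processor to perform any implementation of the delay-estimation-based echo cancellation method described above.
Further, the embodiments of this application provide a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to perform any implementation of the delay-estimation-based echo cancellation method described above.
Further, the embodiments of this application provide a computer program product which, when run on a terminal device, causes the terminal device to perform any implementation of the delay-estimation-based echo cancellation method described above.
The embodiments in this specification are described progressively: identical or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on how it differs from the others. The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected as needed to achieve the purpose of the embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
The embodiments of this application have been described in detail above, and specific implementations have been used to explain the application. The description of the above embodiments is intended only to help in understanding the method and device of this application. A person of ordinary skill in the art may, following the ideas of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting this application.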

Claims (21)

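  1. An echo cancellation method based on delay estimation, characterized in that the method comprises:
    receiving a microphone signal and a reference signal separately, and preprocessing them;
    determining the frequency point signals that exhibit nonlinearity in the preprocessed microphone signal and reference signal under the current echo cancellation scenario;
    calculating a current delay estimate from the frequency point signals of the microphone signal and the reference signal that are free of nonlinearity;
    moving the reference signal based on the current delay estimate;
    updating an adaptive filter according to the preprocessed microphone signal and the moved reference signal, to cancel the echo.
  2. The method according to claim 1, characterized in that determining the frequency point signals that exhibit nonlinearity in the preprocessed microphone signal and reference signal under the current echo cancellation scenario includes any one or more of the following detections:
    far-end signal detection: determining the frequency point signals that exhibit nonlinearity according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal;
    double-ended signal detection: determining the frequency point signals that exhibit nonlinearity according to the energy ratio of the preprocessed microphone signal to the reference signal;
    detection of nonlinearity caused by device hardware: first calculating the mean correlation between the reference signal and the microphone signal in a lower frequency range; then, using a certain frequency interval, calculating the mean correlation between the reference signal and the microphone signal in other frequency ranges; and finally determining the frequency point signals that exhibit nonlinearity according to the mean correlation in the other frequency ranges and the mean correlation in the low frequency range.
  3. The method according to claim 1, characterized in that calculating the current delay estimate from the frequency point signals of the microphone signal and the reference signal that are free of nonlinearity includes:
    for the reference signal and each microphone signal frame in the delay analysis range in turn, selecting the frequency point signals that are free of nonlinearity and calculating the cross-correlation between the reference signal and each microphone signal frame;
    determining the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames.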
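  4. The method according to claim 3, characterized in that determining the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames includes:
    selecting, among the calculated cross-correlations between the reference signal and the microphone signal frames, the position of the frame with the largest cross-correlation as the current delay position, and determining the current delay estimate from the current delay position and the position of the reference signal.
  5. The method according to claim 3, characterized in that determining the current delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames includes:
    at each delay estimation, taking the position of the frame with the largest cross-correlation among the cross-correlations between the reference signal and the microphone signal frames in the delay analysis range as the candidate delay position, recording the candidate delay position in an L-dimensional array Sa, where L is the total number of microphone signal frames in the delay analysis range, and counting the number of consecutive occurrences of the candidate delay position;
    if the candidate delay position has changed, increasing the current candidate delay position by a first set value t1, decreasing the previous candidate delay position by a second set value t2, and decreasing the remaining positions by a third set value t3;
    if the candidate delay position has not changed, increasing the current candidate delay position by the first set value t1 and decreasing the remaining positions by the third set value t3, the second set value t2 being less than or equal to the third set value t3;
    if the current candidate delay position is greater than a first threshold and the number of consecutive occurrences of that candidate delay position is greater than a second threshold, determining the current delay estimate from the current candidate delay position and the position of the reference signal.
  6. The method according to claim 5, characterized in that moving the reference signal based on the current delay estimate includes:
    when the current delay estimate D1(t) <= a third threshold T3, leaving the data of the reference signal unmoved;
    when the third threshold T3 < the current delay estimate D1(t) < a fourth threshold T4, moving the data of the reference signal by D1(t)/2;
    when the fourth threshold T4 <= the current delay estimate D1(t), moving the data of the reference signal by D1(t).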
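  7. The method according to claim 4, 5, or 6, characterized in that determining the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames must satisfy any one or more of the following conditions:
    (1) the cross-correlation C(t) at the current delay position is greater than the cross-correlation C(t-1) at the previous delay position;
    (2) within the current delay analysis range, the difference between the positions corresponding to the maximum cross-correlation C_max(t) and the minimum cross-correlation C_min(t) over the frames is greater than a first set difference;
    (3) the difference between the mean cross-correlation C_mean(t) of the reference signal with the microphone signal frames in the delay analysis range and the cross-correlation C(t) at the current delay position is greater than a second set difference;
    (4) the current delay position p(t) is smaller than the previous delay position p(t-1).
  8. The method according to any one of claims 1 to 6, characterized in that the method further comprises:
    buffering the data of the historical reference signal;
    when moving the reference signal, moving the buffered historical reference signal data along with it.
  9. The method according to any one of claims 1 to 6, characterized in that the method further comprises:
    if the current delay estimate is smaller than the filter length, then when updating the adaptive filter coefficients, moving the filter coefficients according to the delay estimate and updating from the moved coefficients, while resetting the coefficients at positions left without a coefficient after the move and updating from the reset coefficients.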
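  10. An echo cancellation device based on delay estimation, characterized in that the device comprises:
    a signal processing module, configured to receive a microphone signal and a reference signal, preprocess them, and output the preprocessed microphone signal and reference signal;
    a frequency point detection module, configured to determine the frequency point signals that exhibit nonlinearity, under the current echo cancellation scenario, in the preprocessed microphone signal and reference signal output by the signal processing module;
    a delay estimation module, configured to calculate and output a current delay estimate from the frequency point signals of the microphone signal and the reference signal that the frequency point detection module found to be free of nonlinearity;
    a signal movement module, configured to move the reference signal based on the current delay estimate output by the delay estimation module, and to output the moved reference signal;
    an adaptive filter, updated according to the preprocessed microphone signal output by the signal processing module and the moved reference signal output by the signal movement module, to cancel the echo.
  11. The device according to claim 10, characterized in that the frequency point detection module includes any one or more of the following detection units:
    a far-end signal detection unit, configured to determine the frequency point signals that exhibit nonlinearity according to any one or more of the energy, zero-crossing rate, and short-time amplitude of the preprocessed reference signal;
    a double-ended signal detection unit, configured to determine the frequency point signals that exhibit nonlinearity according to the energy ratio of the preprocessed microphone signal to the reference signal;
    a device hardware detection unit, configured to first calculate the mean correlation between the reference signal and the microphone signal in a lower frequency range; then, using a certain frequency interval, calculate the mean correlation between the reference signal and the microphone signal in other frequency ranges; and finally determine the frequency point signals that exhibit nonlinearity according to the mean correlation in the other frequency ranges and the mean correlation in the low frequency range.
  12. The device according to claim 10, characterized in that the delay estimation module includes:
    a cross-correlation calculation unit, configured to take the reference signal and each microphone signal frame in the delay analysis range in turn, select the frequency point signals that are free of nonlinearity, and calculate the cross-correlation between the reference signal and each microphone signal frame;
    a delay estimate determination unit, configured to determine the delay estimate from the calculated cross-correlations between the reference signal and the microphone signal frames.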
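  13. The device according to claim 12, characterized in that
    the delay estimate determination unit is specifically configured to select, among the cross-correlations between the reference signal and the microphone signal frames calculated by the cross-correlation calculation unit, the position of the frame with the largest cross-correlation as the current delay position, and to determine the current delay estimate from the current delay position and the position of the reference signal.
  14. The device according to claim 12, characterized in that the delay estimate determination unit is specifically configured to: at each delay estimation, take the position of the frame with the largest cross-correlation among the cross-correlations between the reference signal and the microphone signal frames in the delay analysis range as the candidate delay position, record the candidate delay position in an L-dimensional array Sa, where L is the total number of microphone signal frames in the delay analysis range, and count the number of consecutive occurrences of the candidate delay position; if the candidate delay position has changed, increase the current candidate delay position by a first set value t1, decrease the previous candidate delay position by a second set value t2, and decrease the remaining positions by a third set value t3; if the candidate delay position has not changed, increase the current candidate delay position by the first set value t1 and decrease the remaining positions by the third set value t3, the second set value t2 being less than or equal to the third set value t3; and if the current candidate delay position is greater than a first threshold and the number of consecutive occurrences of that candidate delay position is greater than a second threshold, determine the current delay estimate from the current candidate delay position and the position of the reference signal.
  15. The device according to claim 14, characterized in that
    the signal movement module is specifically configured to: leave the data of the reference signal unmoved when the current delay estimate D1(t) <= a third threshold T3; move the data of the reference signal by D1(t)/2 when the third threshold T3 < the current delay estimate D1(t) < a fourth threshold T4; and move the data of the reference signal by D1(t) when the fourth threshold T4 <= the current delay estimate D1(t).
  16. The device according to claim 13, 14, or 15, characterized in that, when determining the delay estimate, the delay estimate determination unit must also confirm that any one or more of the following conditions is satisfied:
    (1) the cross-correlation C(t) at the current delay position is greater than the cross-correlation C(t-1) at the previous delay position;
    (2) within the current delay analysis range, the difference between the positions corresponding to the maximum cross-correlation C_max(t) and the minimum cross-correlation C_min(t) over the frames is greater than a first set difference;
    (3) the difference between the mean cross-correlation C_mean(t) of the reference signal with the microphone signal frames in the delay analysis range and the cross-correlation C(t) at the current delay position is greater than a second set difference;
    (4) the current delay position p(t) is smaller than the previous delay position p(t-1).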
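  17. The apparatus according to any one of claims 10 to 15, characterized in that the apparatus further comprises:
    a cache module, configured to cache data of the historical reference signal;
    the signal shifting module is further configured to, when shifting the reference signal, correspondingly shift the data of the historical reference signal cached in the cache module.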
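  18. The apparatus according to any one of claims 10 to 15, characterized in that
    when the adaptive filter updates its coefficients, if the current delay estimate is less than the filter length, the filter coefficients are shifted according to the delay estimate and the update is performed based on the shifted coefficients; the coefficients at positions that have no coefficient after the shift are reset, and the update is performed based on the reset coefficients.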
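The coefficient handling in claim 18 can be illustrated with the sketch below. The claim fixes neither the shift direction nor the reset value, so two assumptions are made: coefficients move toward index 0 (the shifted reference has already absorbed that part of the echo path delay), and vacated positions are reset to zero. The function name is hypothetical.

```python
from typing import List, Sequence


def shift_coefficients(weights: Sequence[float], delay: int) -> List[float]:
    """Shift adaptive filter taps by `delay` and reset the vacated positions."""
    n = len(weights)
    if delay <= 0 or delay >= n:
        # Claim 18 only covers a delay estimate smaller than the filter
        # length; other cases are left untouched in this sketch.
        return list(weights)
    # Taps move toward index 0 (assumed direction); the tail is reset to zero.
    return list(weights[delay:]) + [0.0] * delay
```

Keeping the already converged taps, instead of restarting from an all-zero filter, means that the subsequent updates only have to adapt the reset positions from scratch.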
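  19. An echo cancellation apparatus based on time delay estimation, characterized by comprising: a processor, a memory, and a system bus;
    the processor and the memory are connected via the system bus;
    the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform the method according to any one of claims 1 to 9.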
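  20. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions which, when run on a terminal device, cause the terminal device to perform the method according to any one of claims 1 to 9.
  21. A computer program product, characterized in that the computer program product, when run on a terminal device, causes the terminal device to perform the method according to any one of claims 1 to 9.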
PCT/CN2018/095759 2017-10-23 2018-07-16 Echo cancellation method and apparatus based on time delay estimation WO2019080552A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
ES18869573T ES2965954T3 (es) 2017-10-23 2018-07-16 Método y aparato de cancelación de eco basados en estimación de retardo de tiempo
KR1020207014264A KR102340999B1 (ko) 2017-10-23 2018-07-16 시간 지연 추정을 기반으로 하는 에코 제거 방법 및 장치
EP18869573.8A EP3703052B1 (en) 2017-10-23 2018-07-16 Echo cancellation method and apparatus based on time delay estimation
US16/756,967 US11323807B2 (en) 2017-10-23 2018-07-16 Echo cancellation method and apparatus based on time delay estimation
JP2020517351A JP7018130B2 (ja) 2017-10-23 2018-07-16 遅延時間推定に基づくエコー除去方法及び装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710994195.X 2017-10-23
CN201710994195.XA CN107610713B (zh) 2017-10-23 2017-10-23 Echo cancellation method and apparatus based on time delay estimation

Publications (1)

Publication Number Publication Date
WO2019080552A1 true WO2019080552A1 (zh) 2019-05-02

Family

ID=61079274

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/095759 WO2019080552A1 (zh) 2017-10-23 2018-07-16 Echo cancellation method and apparatus based on time delay estimation

Country Status (8)

Country Link
US (1) US11323807B2 (zh)
EP (1) EP3703052B1 (zh)
JP (1) JP7018130B2 (zh)
KR (1) KR102340999B1 (zh)
CN (1) CN107610713B (zh)
ES (1) ES2965954T3 (zh)
HU (1) HUE065351T2 (zh)
WO (1) WO2019080552A1 (zh)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9947337B1 (en) * 2017-03-21 2018-04-17 Omnivision Technologies, Inc. Echo cancellation system and method with reduced residual echo
CN107610713B (zh) 2017-10-23 2022-02-01 科大讯飞股份有限公司 基于时延估计的回声消除方法及装置
CN109102821B (zh) * 2018-09-10 2021-05-25 思必驰科技股份有限公司 时延估计方法、系统、存储介质及电子设备
CN110166882B (zh) * 2018-09-29 2021-05-25 腾讯科技(深圳)有限公司 远场拾音设备、及远场拾音设备中采集人声信号的方法
CN109087662B (zh) * 2018-10-25 2021-10-08 科大讯飞股份有限公司 一种回声消除方法及装置
CN111223492A (zh) * 2018-11-23 2020-06-02 中移(杭州)信息技术有限公司 一种回声路径延迟估计方法及装置
CN109361828B (zh) * 2018-12-17 2021-02-12 北京达佳互联信息技术有限公司 一种回声消除方法、装置、电子设备及存储介质
AU2020310315A1 (en) * 2019-07-10 2022-01-27 SwipBox Development ApS Method of reusing a reusable transport packaging and a service point and system therefor
CN111031448B (zh) * 2019-11-12 2021-09-17 西安讯飞超脑信息科技有限公司 回声消除方法、装置、电子设备和存储介质
CN110992973A (zh) * 2019-11-29 2020-04-10 维沃移动通信有限公司 一种信号时延的确定方法和电子设备
TWI756595B (zh) * 2019-12-06 2022-03-01 瑞昱半導體股份有限公司 通訊裝置及回音消除方法
CN111246036A (zh) * 2020-02-17 2020-06-05 上海推乐信息技术服务有限公司 一种回声估计方法和装置
CN111556410A (zh) * 2020-05-20 2020-08-18 南京中芯声学技术研究院 基于多工作模式麦克风的室内扩声系统工作模式切换方法
TWI743950B (zh) * 2020-08-18 2021-10-21 瑞昱半導體股份有限公司 訊號處理裝置、延遲估計方法與回音消除方法
CN112562709B (zh) * 2020-11-18 2024-04-19 珠海全志科技股份有限公司 一种回声消除信号处理方法及介质
CN112489670B (zh) * 2020-12-01 2023-08-18 广州华多网络科技有限公司 时延估计方法、装置、终端设备和计算机可读存储介质
KR20220102451A (ko) * 2021-01-13 2022-07-20 삼성전자주식회사 외부 장치에 의해 유입되는 에코를 제거하는 방법 및 전자 장치
TWI778502B (zh) 2021-01-22 2022-09-21 威聯通科技股份有限公司 回聲延時估計方法及回聲延時估計系統
CN113724722B (zh) * 2021-08-18 2023-12-26 杭州网易智企科技有限公司 回声延迟估计方法、装置、存储介质和计算设备
CN114613383B (zh) * 2022-03-14 2023-07-18 中国电子科技集团公司第十研究所 一种机载环境下多输入语音信号波束形成信息互补方法
CN114822575A (zh) * 2022-04-28 2022-07-29 深圳市中科蓝讯科技股份有限公司 一种双麦克风阵列回声消除方法、装置及电子设备
CN115297404A (zh) * 2022-08-04 2022-11-04 中国第一汽车股份有限公司 一种音频处理系统、方法和车辆

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1691716A (zh) * 2004-04-23 2005-11-02 北京三星通信技术研究有限公司 回声消除装置
JP2010072460A (ja) * 2008-09-19 2010-04-02 Oki Electric Ind Co Ltd 音声通信装置及び音声通信プログラム
CN103700374A (zh) * 2013-12-25 2014-04-02 宁波菊风系统软件有限公司 确定声学回声消除中系统延时的方法及声学回声消除方法
CN105825864A (zh) * 2016-05-19 2016-08-03 南京奇音石信息技术有限公司 基于过零率指标的双端说话检测与回声消除方法
CN105872156A (zh) * 2016-05-25 2016-08-17 腾讯科技(深圳)有限公司 一种回声时延跟踪方法及装置
CN106847299A (zh) * 2017-02-24 2017-06-13 喜大(上海)网络科技有限公司 延时的估计方法及装置
CN107610713A (zh) * 2017-10-23 2018-01-19 科大讯飞股份有限公司 基于时延估计的回声消除方法及装置

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2782180B1 (fr) * 1998-08-06 2001-09-07 France Telecom Dispositif de traitement numerique a filtrage frequentiel et a complexite de calcul reduite
US6937723B2 (en) 2002-10-25 2005-08-30 Avaya Technology Corp. Echo detection and monitoring
US7792281B1 (en) * 2005-12-13 2010-09-07 Mindspeed Technologies, Inc. Delay estimation and audio signal identification using perceptually matched spectral evolution
US9544698B2 (en) * 2009-05-18 2017-01-10 Oticon A/S Signal enhancement using wireless streaming
JP5235226B2 (ja) 2011-06-28 2013-07-10 日本電信電話株式会社 エコー消去装置及びそのプログラム
US9173025B2 (en) * 2012-02-08 2015-10-27 Dolby Laboratories Licensing Corporation Combined suppression of noise, echo, and out-of-location signals
GB201309781D0 (en) * 2013-05-31 2013-07-17 Microsoft Corp Echo cancellation
GB201321052D0 (en) * 2013-11-29 2014-01-15 Microsoft Corp Detecting nonlinear amplitude processing
GB201406574D0 (en) * 2014-04-11 2014-05-28 Microsoft Corp Audio Signal Processing
US20150371655A1 (en) * 2014-06-19 2015-12-24 Yang Gao Acoustic Echo Preprocessing for Speech Enhancement
JP6369192B2 (ja) 2014-07-18 2018-08-08 沖電気工業株式会社 エコー抑圧装置、エコー抑圧プログラム、エコー抑圧方法及び通信端末
GB2527865B (en) * 2014-10-30 2016-12-14 Imagination Tech Ltd Controlling operational characteristics of an acoustic echo canceller
GB201501791D0 (en) * 2015-02-03 2015-03-18 Microsoft Technology Licensing Llc Non-linear echo path detection
CN106033673B (zh) * 2015-03-09 2019-09-17 电信科学技术研究院 一种近端语音信号检测方法及装置
CN105472191B (zh) * 2015-11-18 2019-09-20 百度在线网络技术(北京)有限公司 一种跟踪回声时延的方法和装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110349592A (zh) * 2019-07-17 2019-10-18 百度在线网络技术(北京)有限公司 用于输出信息的方法和装置
CN111402868A (zh) * 2020-03-17 2020-07-10 北京百度网讯科技有限公司 语音识别方法、装置、电子设备及计算机可读存储介质
EP3882914A1 (en) * 2020-03-17 2021-09-22 Beijing Baidu Netcom Science And Technology Co. Ltd. Voice recognition method, voice recognition apparatus, electronic device and computer readable storage medium
CN111402868B (zh) * 2020-03-17 2023-10-24 阿波罗智联(北京)科技有限公司 语音识别方法、装置、电子设备及计算机可读存储介质

Also Published As

Publication number Publication date
HUE065351T2 (hu) 2024-05-28
EP3703052A4 (en) 2021-04-28
EP3703052C0 (en) 2023-11-01
US11323807B2 (en) 2022-05-03
JP7018130B2 (ja) 2022-02-09
CN107610713B (zh) 2022-02-01
ES2965954T3 (es) 2024-04-17
EP3703052B1 (en) 2023-11-01
CN107610713A (zh) 2018-01-19
KR102340999B1 (ko) 2021-12-20
KR20200070346A (ko) 2020-06-17
EP3703052A1 (en) 2020-09-02
US20210051404A1 (en) 2021-02-18
JP2021500778A (ja) 2021-01-07

Similar Documents

Publication Publication Date Title
WO2019080552A1 (zh) 基于时延估计的回声消除方法及装置
JP6557786B2 (ja) エコー遅延トラッキング方法、装置及びコンピュータ記憶媒体
US20210327448A1 (en) Speech noise reduction method and apparatus, computing device, and computer-readable storage medium
CN109273021B (zh) 一种基于rnn的实时会议降噪方法及装置
CN104050971A (zh) 声学回声减轻装置和方法、音频处理装置和语音通信终端
WO2020029882A1 (zh) 一种方位角估计的方法、设备及存储介质
CN104685903A (zh) 用于音频干扰估计的方法和设备
US11245788B2 (en) Acoustic echo cancellation based sub band domain active speaker detection for audio and video conferencing applications
CN112004177B (zh) 一种啸叫检测方法、麦克风音量调节方法及存储介质
CN109688284B (zh) 一种回音延时检测方法
CN108010536B (zh) 回声消除方法、装置、系统及存储介质
CN105355199B (zh) 一种基于gmm噪声估计的模型组合语音识别方法
JP5838861B2 (ja) 音声信号処理装置、方法及びプログラム
CN111048061B (zh) 回声消除滤波器的步长获取方法、装置及设备
KR102152197B1 (ko) 음성 검출기를 구비한 보청기 및 그 방법
CN112489670B (zh) 时延估计方法、装置、终端设备和计算机可读存储介质
WO2020097828A1 (zh) 回声消除方法、延时估计方法、装置、存储介质及设备
WO2016141773A1 (zh) 一种近端语音信号检测方法及装置
CN103700375A (zh) 语音降噪方法及其装置
WO2022218254A1 (zh) 语音信号增强方法、装置及电子设备
WO2020107455A1 (zh) 语音处理方法、装置、存储介质及电子设备
CN103890843A (zh) 信号噪声衰减
CN110148421A (zh) 一种残余回声检测方法、终端和装置
WO2020191512A1 (zh) 回声消除装置、回声消除方法、信号处理芯片及电子设备
CN111246036A (zh) 一种回声估计方法和装置

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18869573

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020517351

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 20207014264

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2018869573

Country of ref document: EP

Effective date: 20200525