CN111225317A - Echo cancellation method - Google Patents
- Publication number
- CN111225317A (application CN202010053652.7A)
- Authority
- CN
- China
- Prior art keywords
- end signal
- far
- signal
- echo
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/02—Circuits for transducers, loudspeakers or microphones for preventing acoustic reaction, i.e. acoustic oscillatory feedback
Landscapes
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Circuit For Audible Band Transducer (AREA)
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
Abstract
The invention relates to the technical field of voice processing and discloses an echo cancellation method that addresses the failure of conventional echo cancellation to achieve a satisfactory result when the far-end signal is strong. The method comprises the following steps: a. collecting a far-end signal and a near-end signal; b. judging, from corresponding parameters of the far-end signal and the near-end signal, whether the far-end signal is greater than the speaker's voice in the near-end signal; c. when the far-end signal is greater than the speaker's voice in the near-end signal, applying automatic gain processing to the near-end signal; d. applying decorrelation processing to the near-end signal; e. applying pre-emphasis and DC removal to the near-end signal and the far-end signal; f. obtaining an echo estimate of the far-end signal with a double filter, removing the echo from the near-end signal, and adjusting the update of the double-filter coefficients according to the results of the previous and current filtering; g. removing residual echo by using the correlation among the far-end signal, the near-end signal, and the filter output signal to obtain the final output signal.
Description
Technical Field
The invention relates to the technical field of voice processing, in particular to an echo cancellation method.
Background
With the advent of the artificial intelligence era, voice technology has become an important interface for human-computer interaction. In particular, as Internet of Things technology continues to develop, people want to control smart devices by voice from greater distances and in more complex environments. Traditional near-field voice interaction can no longer meet these needs, and microphone array technology has become the core of far-field interaction.
To cope with today's complex application scenarios, a series of key technologies based on microphone arrays has been developed to improve the speech recognition rate, chiefly: speech enhancement, sound source localization, dereverberation, echo cancellation, and noise suppression.
Echo cancellation mainly uses adaptive signal processing to cancel the interference of background sound. The basic principle is shown in FIG. 1: the signal collected by the microphone contains the near-end speaker's voice and the echo of the far-end signal; the adaptive filter outputs an estimate of the echo signal; echo cancellation is performed by subtracting this estimate from the signal collected by the microphone; and the output is fed back to the adaptive filter for coefficient updating, which improves the accuracy of the echo estimate. Existing echo cancellation algorithms work well when the loudspeaker signal is not strong. When the loudspeaker is loud, however, the loudspeaker signal masks the desired speech in the signal collected by the microphone, the speech is hard to hear, and the echo cancellation result is unsatisfactory.
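The adaptive-filter loop of FIG. 1 can be illustrated with a minimal sketch. It assumes a normalized LMS (NLMS) update, which the text does not name; the function name `nlms_echo_canceller`, the tap count, and the step size `mu` are illustrative choices rather than values from the patent.

```python
def nlms_echo_canceller(far, mic, taps=16, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the far-end echo from the microphone signal.

    far: loudspeaker (reference) samples; mic: microphone samples.
    Returns the echo-cancelled output e(n) = mic(n) - echo_estimate(n).
    NLMS is an assumed choice of adaptive algorithm, used for illustration.
    """
    weights = [0.0] * taps   # adaptive filter coefficients
    history = [0.0] * taps   # most recent far-end samples, newest first
    output = []
    for x, d in zip(far, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate
        # Feed the output back: normalized gradient step on the coefficients.
        norm = sum(h * h for h in history) + eps
        step = mu * error / norm
        weights = [w + step * h for w, h in zip(weights, history)]
        output.append(error)
    return output
```

With no near-end speech and an echo path shorter than `taps`, the output should decay toward zero as the coefficients converge. When near-end speech is present it appears in `error` and disturbs the update, which is the situation the detection steps of the invention are meant to handle.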
Disclosure of Invention
The technical problem to be solved by the invention is to provide an echo cancellation method that solves the problem that conventional echo cancellation cannot achieve a satisfactory result in scenarios with a strong far-end signal.
The technical solution adopted by the invention to solve this problem is as follows:
An echo cancellation method, comprising the following steps:
a. collecting a far-end signal and a near-end signal;
b. judging, from corresponding parameters of the far-end signal and the near-end signal, whether the far-end signal is greater than the speaker's voice in the near-end signal;
c. when the far-end signal is greater than the speaker's voice in the near-end signal, applying automatic gain processing to the near-end signal;
d. applying decorrelation processing to the near-end signal;
e. applying pre-emphasis and DC removal to the near-end signal and the far-end signal;
f. obtaining an echo estimate of the far-end signal with a double filter, removing the echo from the near-end signal, and adjusting the update of the double-filter coefficients according to the results of the previous and current filtering;
g. removing residual echo by using the correlation among the far-end signal, the near-end signal, and the filter output signal to obtain the final output signal.
As a further optimization, in step b, judging whether the far-end signal is greater than the speaker's voice in the near-end signal according to the corresponding parameters of the two signals specifically includes:
calculating the energy and power-spectrum parameters of the far-end signal and the near-end signal respectively, and calculating the cross-correlation between the far-end signal and the near-end signal from the power-spectrum parameters;
when the ratio of the far-end signal energy to the near-end signal energy is greater than a preset energy-ratio threshold and the cross-correlation between the far-end signal and the near-end signal is greater than a set threshold, judging that the far-end signal is greater than the speaker's voice in the near-end signal.
As a further optimization, the preset energy-ratio threshold is 0.5, and the set cross-correlation threshold is 0.9.
The beneficial effects of the invention are as follows:
Near-end and far-end signal energy detection and cross-correlation detection are added before conventional echo cancellation. When the far-end signal is judged to be stronger than the speaker's voice in the near-end signal, automatic gain processing and decorrelation are applied to the near-end signal. This boosts the useful speaker voice signal and removes part of the far-end interference, which helps echo cancellation and yields a cleaner speaker voice signal. In addition, a correlation-based residual echo removal stage is added after echo cancellation, so that a relatively clean voice signal is finally obtained and a better echo cancellation result is achieved.
Drawings
FIG. 1 is a schematic diagram of echo cancellation;
FIG. 2 is a flow chart of the echo cancellation method according to the present invention.
Detailed Description
The invention aims to provide an echo cancellation method that solves the problem that conventional echo cancellation cannot achieve a satisfactory result when the far-end signal is larger than the near-end signal. The core idea is as follows. First, a far-end signal (the signal played by the loudspeaker) and a near-end signal (the desired signal plus the echo signal) are acquired through a microphone. Energy detection and power-spectrum correlation detection are performed on the far-end and near-end signals, and the results serve as the discrimination condition. When the far-end signal is strong, the correlation between the near-end signal and the far-end signal is high, so the near-end signal must first be decorrelated. An adaptive filter then produces an echo estimate and the echo is removed, while the update of the filter coefficients is adjusted according to the results of the previous and current filtering. Finally, residual echo is removed by using the correlation among the far-end signal, the near-end signal, and the filter output signal, giving the final output signal.
In a specific implementation, as shown in FIG. 2, the echo cancellation method of the present invention includes the following steps:
(1) Acquiring a far-end signal and a near-end signal:
In this step, the far-end signal collected by the microphone array is the sound signal played by the loudspeaker, and the collected near-end signal comprises the desired voice signal and the echo signal.
(2) Judging, from corresponding parameters of the far-end signal and the near-end signal, whether the far-end signal is greater than the speaker's voice signal in the near-end signal:
In this step, parameters such as the energy and the power spectrum of the far-end signal and the near-end signal are calculated as the basis for discrimination.
The energy is obtained by squaring the time-domain far-end and near-end signals, and the spectra are obtained by windowing the far-end and near-end signals and applying a fast Fourier transform.
$$W_x = |x(n)|^2 \tag{1}$$
$$W_d = |d(n)|^2 \tag{2}$$
$$x_f = \mathrm{fft}(x(n) \cdot \mathrm{win}) \tag{3}$$
$$d_f = \mathrm{fft}(d(n) \cdot \mathrm{win}) \tag{4}$$
In the formulas, $x(n)$ and $d(n)$ denote the near-end and far-end signals, $W_x$ and $W_d$ denote the time-domain energies of the near-end and far-end signals, $x_f$ and $d_f$ denote the near-end and far-end spectra, and $\mathrm{win}$ denotes the Hanning window function.
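The energy and spectrum computations of formulas (1) to (4) can be sketched as follows. This is an illustration under stated assumptions: the energy is summed over one frame (the text writes only the squared sample), a naive DFT stands in for `fft`, and the helper names `hann`, `frame_energy`, `dft`, and `windowed_spectrum` are hypothetical.

```python
import cmath
import math


def hann(n_points):
    """Symmetric Hann (Hanning) window of length n_points."""
    if n_points == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def frame_energy(frame):
    """Time-domain energy of one frame: sum of squared samples (assumed per-frame sum)."""
    return sum(s * s for s in frame)


def dft(frame):
    """Naive O(N^2) discrete Fourier transform; a stand-in for an FFT routine."""
    n_points = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                for n in range(n_points))
            for k in range(n_points)]


def windowed_spectrum(frame):
    """Spectrum of a frame after applying the Hann window, as in formulas (3) and (4)."""
    window = hann(len(frame))
    return dft([s * w for s, w in zip(frame, window)])
```

The same two functions would be applied to both the near-end frame and the far-end frame. In practice an FFT library would replace `dft`; the naive form is used here only to keep the sketch free of dependencies.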
A preferred way of judging the relative strength of the far-end signal and the speaker's voice in the near-end signal is as follows:
Energy detection: the energies of the far-end signal and the near-end signal are calculated, and their ratio is used as a reference value. When the far-end signal is strong and the speaker's voice is weak, the ratio of the far-end signal energy to the near-end signal energy approaches 1; when the far-end signal is weak and the speaker's voice is strong, the ratio approaches 0.
$k_W$ denotes the ratio of the time-domain energies.
Correlation detection: energy-based judgment is coarse and prone to interference, so the correlation between the far-end signal and the near-end signal must also be checked. The power spectra of the far-end and near-end signals are smoothed, and their cross-power spectrum is then computed, from which the correlation of the two signals is obtained. The larger the correlation, the larger the far-end component contained in the near-end signal and the weaker the speaker's voice, and the more the near-end signal needs enhancement and decorrelation.
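$$S_d = \mathrm{gama} \cdot S_d + (1 - \mathrm{gama})\, d_f \cdot d_f' \tag{6}$$
$$S_x = \mathrm{gama} \cdot S_x + (1 - \mathrm{gama})\, x_f \cdot x_f' \tag{7}$$
$$S_{xd} = \mathrm{gama} \cdot S_{xd} + (1 - \mathrm{gama})\, x_f \cdot d_f' \tag{8}$$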
$S_x$ and $S_d$ denote the smoothed near-end and far-end power spectra, $S_{xd}$ denotes the cross-power spectrum of the near-end and far-end signals, $C_{xd}$ denotes the cross-correlation, and $CM_{xd}$ denotes the mean of the cross-correlation; rang denotes the frequency bins within the range 300 Hz to 1.8 kHz, and $N$ denotes the total number of frequency bins.
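The recursive smoothing of formulas (6) to (8) and the band-averaged correlation can be sketched as below. Several points are assumptions: the prime is read as the complex conjugate, the smoothing factor `gama=0.9` is illustrative, and the cross-correlation is computed as the magnitude-squared coherence $|S_{xd}|^2 / (S_x S_d)$, a common definition that the text does not spell out. The names `CoherenceTracker` and `band_mean` are hypothetical.

```python
class CoherenceTracker:
    """Recursively smoothed power and cross-power spectra, formulas (6)-(8)."""

    def __init__(self, n_bins, gama=0.9):
        self.gama = gama             # smoothing factor (illustrative value)
        self.s_x = [0.0] * n_bins    # smoothed near-end power spectrum
        self.s_d = [0.0] * n_bins    # smoothed far-end power spectrum
        self.s_xd = [0j] * n_bins    # smoothed cross-power spectrum

    def update(self, x_f, d_f):
        """Fold one frame of near-end (x_f) and far-end (d_f) spectra into the averages."""
        g = self.gama
        for k, (x, d) in enumerate(zip(x_f, d_f)):
            x, d = complex(x), complex(d)
            self.s_d[k] = g * self.s_d[k] + (1 - g) * abs(d) ** 2
            self.s_x[k] = g * self.s_x[k] + (1 - g) * abs(x) ** 2
            self.s_xd[k] = g * self.s_xd[k] + (1 - g) * x * d.conjugate()

    def coherence(self, eps=1e-12):
        """Per-bin correlation, assumed to be |Sxd|^2 / (Sx * Sd), in [0, 1]."""
        return [abs(sxd) ** 2 / (sx * sd + eps)
                for sxd, sx, sd in zip(self.s_xd, self.s_x, self.s_d)]


def band_mean(values, sample_rate, n_fft, lo_hz=300.0, hi_hz=1800.0):
    """Mean of per-bin values over bins whose centre frequency lies in [lo_hz, hi_hz]."""
    bins = [k for k in range(n_fft // 2 + 1)
            if lo_hz <= k * sample_rate / n_fft <= hi_hz]
    return sum(values[k] for k in bins) / len(bins)
```

A caller would run `update` once per frame and then take `band_mean(tracker.coherence(), sample_rate, n_fft)` as the averaged correlation over 300 Hz to 1.8 kHz.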
When the energy ratio $k_W$ of the far-end signal to the near-end signal is greater than 0.5 and the cross-correlation $CM_{xd}$ between the far-end signal and the near-end signal is greater than 0.9, the far-end signal is judged to be greater than the speaker's voice signal.
It should be noted that, if the far-end signal is not greater than the speaker's voice signal, the existing echo processing scheme is adopted directly.
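The decision rule can be written as a small predicate. The thresholds 0.5 and 0.9 come from the text; the function name `far_end_dominates`, its argument names, and the small `eps` guard against division by zero are illustrative additions.

```python
def far_end_dominates(far_energy, near_energy, mean_correlation,
                      energy_threshold=0.5, correlation_threshold=0.9,
                      eps=1e-12):
    """Return True when both detection conditions of step (2) are met.

    far_energy / near_energy plays the role of the energy ratio k_W, and
    mean_correlation plays the role of the band-averaged correlation CM_xd.
    """
    k_w = far_energy / (near_energy + eps)
    return k_w > energy_threshold and mean_correlation > correlation_threshold
```

When the predicate is true, the near-end signal would go through the automatic gain and decorrelation steps before echo cancellation; otherwise the existing echo processing path is used unchanged.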
(3) When the far-end signal is greater than the near-end speaker's voice signal, automatic gain is applied to the near-end signal. This boosts the speaker's voice signal, so that when the echo estimate is subtracted from the near-end signal during echo cancellation, the speaker's voice contained in the near-end signal can be extracted more effectively.
(4) At the same time, when the far-end signal is greater than the near-end signal, the near-end signal is decorrelated. This removes part of the far-end interference and makes the echo easier to eliminate during echo cancellation.
(5) After the near-end signal has been enhanced and decorrelated, pre-emphasis and DC removal are applied to the near-end signal and the far-end signal.
(6) An echo estimate of the far-end signal is obtained with a double filter, and the echo is then removed from the near-end signal. The update of the filter coefficients is adjusted according to the results of the previous and current filtering.
(7) The cross-correlation between the far-end signal and the filter output signal is calculated, and residual echo is removed by using both the cross-correlation between the far-end signal and the near-end signal and the cross-correlation between the far-end signal and the filter output signal, giving the final output signal.
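Step (5) names pre-emphasis and DC removal without giving filter details. A minimal sketch, assuming the common first-order forms of both filters; the coefficients `alpha=0.9` and `r=0.995` and the function names are illustrative and do not come from the patent.

```python
def remove_dc(samples, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1]."""
    output = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = x - prev_in + r * prev_out
        output.append(y)
        prev_in, prev_out = x, y
    return output


def pre_emphasis(samples, alpha=0.9):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1]."""
    output = []
    prev_in = 0.0
    for x in samples:
        output.append(x - alpha * prev_in)
        prev_in = x
    return output
```

Both filters would be applied identically to the near-end and far-end signals so that the adaptive filter sees the same conditioning on each path. A constant offset fed to `remove_dc` decays geometrically toward zero, while `pre_emphasis` boosts high frequencies relative to low ones.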
Compared with the conventional Speex echo cancellation technique, the invention adds near-end and far-end signal energy detection and cross-correlation detection before echo cancellation. When the far-end signal is judged to be stronger than the speaker's voice signal in the near-end signal, automatic gain processing and decorrelation of the near-end signal boost the useful speaker voice signal and remove part of the far-end interference, which helps echo cancellation and yields a cleaner speaker voice signal. In addition, a correlation-based residual echo removal stage is added after echo cancellation, so that a relatively clean voice signal is finally obtained and a better echo cancellation result is achieved.
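The double filter of step (6) is not specified further in the text. One plausible reading, offered here only as an assumption, is a two-path structure: a background filter adapts continuously, and a foreground filter that produces the output is replaced by the background filter when the background has performed clearly better over the last block. The NLMS update, the block length, and the 0.5 and 2.0 comparison factors are all illustrative.

```python
def two_path_canceller(far, mic, taps=16, mu=0.5, block=64, eps=1e-8):
    """Two-path echo canceller sketch (assumed reading of the 'double filter').

    Returns the foreground error signal, i.e. the echo-cancelled output.
    """
    foreground = [0.0] * taps   # filter that produces the output
    background = [0.0] * taps   # filter that adapts on every sample
    history = [0.0] * taps      # most recent far-end samples, newest first
    fg_energy = 0.0             # foreground error energy in the current block
    bg_energy = 0.0             # background error energy in the current block
    output = []
    for n, (x, d) in enumerate(zip(far, mic), start=1):
        history = [x] + history[:-1]
        e_fg = d - sum(w * h for w, h in zip(foreground, history))
        e_bg = d - sum(w * h for w, h in zip(background, history))
        output.append(e_fg)
        fg_energy += e_fg * e_fg
        bg_energy += e_bg * e_bg
        # Only the background filter is adapted (NLMS step).
        norm = sum(h * h for h in history) + eps
        step = mu * e_bg / norm
        background = [w + step * h for w, h in zip(background, history)]
        if n % block == 0:
            if bg_energy < 0.5 * fg_energy:
                foreground = list(background)   # background clearly better: promote it
            elif bg_energy > 2.0 * fg_energy:
                background = list(foreground)   # background diverged: restore it
            fg_energy = 0.0
            bg_energy = 0.0
    return output
```

Comparing the two error energies block by block is one way to let the results of the previous and current filtering steer the coefficient update: a burst of near-end speech can disturb the background filter without immediately degrading the output.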
Claims (3)
1. An echo cancellation method, comprising the steps of:
a. collecting a far-end signal and a near-end signal;
b. judging whether the far-end signal is larger than the speaker voice in the near-end signal or not according to the corresponding parameters of the far-end signal and the near-end signal;
c. when the far-end signal is larger than the voice of the speaker in the near-end signal, performing automatic gain processing on the near-end signal;
d. performing decorrelation processing on the near-end signal;
e. performing pre-emphasis and DC removal on the near-end signal and the far-end signal;
f. obtaining an echo estimation value of the far-end signal through a double filter, then removing the echo from the near-end signal, and adjusting the updating of the coefficients of the double filter through the results of the previous filtering and the current filtering;
g. removing residual echo by utilizing the correlation among the far-end signal, the near-end signal and the output signal of the filter to obtain a final output signal.
2. The echo cancellation method of claim 1,
in step b, the determining whether the far-end signal is greater than the speaker voice in the near-end signal according to the corresponding parameters of the far-end signal and the near-end signal specifically includes:
respectively calculating the energy and power spectrum parameters of the far-end signal and the near-end signal, and calculating the cross correlation between the far-end signal and the near-end signal through the power spectrum parameters;
and when the energy ratio of the far-end signal to the near-end signal is greater than a preset energy ratio threshold value and the cross correlation between the far-end signal and the near-end signal is greater than a set threshold value, judging that the far-end signal is greater than the voice of the speaker in the near-end signal.
3. The echo cancellation method of claim 2,
the preset energy ratio threshold is 0.5, and the set threshold of the cross correlation is 0.9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010053652.7A CN111225317B (en) | 2020-01-17 | 2020-01-17 | Echo cancellation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111225317A (en) | 2020-06-02 |
CN111225317B (en) | 2021-04-13 |
Family
ID=70829585
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010053652.7A Active CN111225317B (en) | 2020-01-17 | 2020-01-17 | Echo cancellation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111225317B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111711881A (en) * | 2020-06-29 | 2020-09-25 | 深圳市科奈信科技有限公司 | Self-adaptive volume adjustment method according to environmental sound and wireless earphone |
CN112614502A (en) * | 2020-12-10 | 2021-04-06 | 四川长虹电器股份有限公司 | Echo cancellation method based on double LSTM neural network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101043560A (en) * | 2006-03-22 | 2007-09-26 | 北京大学深圳研究生院 | Echo eliminator and echo cancellation method |
CN101106405A (en) * | 2006-07-12 | 2008-01-16 | 北京大学深圳研究生院 | Method for eliminating echo in echo eliminator and its dual end communication detection system |
EP2438766A1 (en) * | 2009-06-02 | 2012-04-11 | Koninklijke Philips Electronics N.V. | Acoustic multi-channel cancellation |
CN102655558A (en) * | 2012-05-21 | 2012-09-05 | 宁波工程学院 | Double-end pronouncing robust structure and acoustic echo cancellation method |
CN102739886A (en) * | 2011-04-01 | 2012-10-17 | 中国科学院声学研究所 | Stereo echo offset method based on echo spectrum estimation and speech existence probability |
CN108076239A (en) * | 2016-11-14 | 2018-05-25 | 深圳联友科技有限公司 | A kind of method for improving IP phone echo |
Non-Patent Citations (1)
Title |
---|
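Liu Rongliang (刘荣亮): "Research on adaptive filtering and double-talk detection algorithms for echo cancellation", 2008 Academic Symposium of Communication Departments of Chinese Universities (2008年中国高校通信类院系学术研讨会) * |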
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111711881A (en) * | 2020-06-29 | 2020-09-25 | 深圳市科奈信科技有限公司 | Self-adaptive volume adjustment method according to environmental sound and wireless earphone |
CN111711881B (en) * | 2020-06-29 | 2022-02-18 | 深圳市科奈信科技有限公司 | Self-adaptive volume adjustment method according to environmental sound and wireless earphone |
CN112614502A (en) * | 2020-12-10 | 2021-04-06 | 四川长虹电器股份有限公司 | Echo cancellation method based on double LSTM neural network |
CN112614502B (en) * | 2020-12-10 | 2022-01-28 | 四川长虹电器股份有限公司 | Echo cancellation method based on double LSTM neural network |
Also Published As
Publication number | Publication date |
---|---|
CN111225317B (en) | 2021-04-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5675848B2 (en) | Adaptive noise suppression by level cue | |
EP3866165B1 (en) | Method for enhancing telephone speech signals based on convolutional neural networks | |
US8880396B1 (en) | Spectrum reconstruction for automatic speech recognition | |
JP5102365B2 (en) | Multi-microphone voice activity detector | |
US9467775B2 (en) | Method and a system for noise suppressing an audio signal | |
KR20130108063A (en) | Multi-microphone robust noise suppression | |
US20090268920A1 (en) | Cardioid beam with a desired null based acoustic devices, systems and methods | |
CN108447496B (en) | Speech enhancement method and device based on microphone array | |
US20080031467A1 (en) | Echo reduction system | |
CN109068012B (en) | Double-end call detection method for audio conference system | |
US9378754B1 (en) | Adaptive spatial classifier for multi-microphone systems | |
CN104157295A (en) | Method used for detecting and suppressing transient noise | |
CN104021798B (en) | For by with variable spectral gain and can dynamic modulation hardness algorithm to the method for audio signal sound insulation | |
CN108986832B (en) | Binaural voice dereverberation method and device based on voice occurrence probability and consistency | |
CN111225317B (en) | Echo cancellation method | |
CN112530451A (en) | Speech enhancement method based on denoising autoencoder | |
CN112614502B (en) | Echo cancellation method based on double LSTM neural network | |
Miyazaki et al. | Theoretical analysis of parametric blind spatial subtraction array and its application to speech recognition performance prediction | |
Unoki et al. | Unified denoising and dereverberation method used in restoration of MTF-based power envelope | |
Lu et al. | Reduction of musical residual noise using hybrid median filter | |
Liu et al. | MTF-based kalman filtering with linear prediction for power envelope restoration in noisy reverberant environments | |
CN114242095B (en) | Neural network noise reduction system and method based on OMLSA framework adopting harmonic structure | |
Wang et al. | Sub-band noise reduction in multi-channel digital hearing aid | |
Guo et al. | A Wind Noise Detection and Suppression Method in Digital Hearing Aid | |
Jiang et al. | Using energy difference for speech separation of dual-microphone close-talk system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||