CN112614502B - Echo cancellation method based on double LSTM neural network - Google Patents
Echo cancellation method based on double LSTM neural network

- Publication number: CN112614502B
- Application number: CN202011455735.5A
- Authority: CN (China)
- Prior art keywords: signal, sound source, sample, neural network, echo
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications

- G10L21/0208—Noise filtering (speech enhancement, e.g. noise reduction or echo cancellation)
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
- G10L25/30—Speech or voice analysis techniques using neural networks
- G10L2021/02082—Noise filtering, the noise being echo or reverberation of the speech
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02163—Only one microphone
- G06N3/045—Combinations of networks
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08—Learning methods
Abstract
The invention relates to the field of audio signal processing and aims to solve the problem of poor echo cancellation performance in the prior art. It provides an echo cancellation method based on a double-LSTM neural network, comprising the following steps: acquiring a first sound source signal to be input to a loudspeaker and a second sound source signal input by a microphone, and extracting a first spectral feature of the first sound source signal and a second spectral feature of the second sound source signal; obtaining an echo estimation signal and a noise estimation signal from the first and second spectral features based on a first LSTM neural network model; extracting a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal; obtaining a clean speech signal from the second, third and fourth spectral features based on a second LSTM neural network model; and inputting the clean speech signal to a loudspeaker. The invention can effectively eliminate the echo component of a speech signal and is suitable for smart televisions.
Description
Technical Field
The invention relates to the field of audio signal processing, in particular to an echo cancellation method.
Background
With the advent of the artificial intelligence era, voice technology has become an important interface for human-computer interaction. As Internet of Things technology develops, people want to control smart devices by voice over longer distances and in more complex environments, so traditional near-field voice interaction can no longer meet their needs, and microphone array technology has become the core of far-field interaction.
For today's complex application scenarios, a series of key technologies that effectively improve the speech recognition rate has been developed on the basis of microphone arrays, mainly comprising speech enhancement, sound source localization, dereverberation, echo cancellation and noise suppression. A device with both a speaker and a microphone (such as a smart speaker or a smart television) must cancel the sound it plays in order to capture the talker's voice, and conventional echo cancellation algorithms rely mainly on adaptive signal processing to remove this interference. Everyday scenarios, however, contain many kinds of noise, so noise is a non-negligible factor in echo cancellation: without noise the cancellation works well, but in the presence of environmental noise the performance of the algorithm degrades, and with non-stationary noise in particular the result is far from ideal.
Disclosure of Invention
The invention aims to solve the problem of poor echo cancellation effect in the prior art, and provides an echo cancellation method based on a double-LSTM neural network.
The technical solution adopted by the invention to solve this problem is an echo cancellation method based on a double LSTM neural network, comprising the following steps:
step 1, acquiring a first sound source signal to be input to a loudspeaker and a second sound source signal input by a microphone, and extracting a first spectral feature of the first sound source signal and a second spectral feature of the second sound source signal;
step 2, obtaining an echo estimation signal and a noise estimation signal from the first and second spectral features based on a first LSTM neural network model, wherein the first LSTM neural network model is trained on a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal;
step 3, extracting a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal;
step 4, obtaining a clean speech signal from the second, third and fourth spectral features based on a second LSTM neural network model, wherein the second LSTM neural network model is trained on a second sample sound source signal, a sample echo signal, a sample noise signal and a clean sample speech signal;
and step 5, inputting the clean speech signal to a loudspeaker.
Further, the first LSTM neural network model includes an echo estimation model and a noise estimation model, the echo estimation model is obtained by training according to a first sample sound source signal, a second sample sound source signal, and a sample echo signal, and the noise estimation model is obtained by training according to the first sample sound source signal, the second sample sound source signal, and a sample noise signal.
The beneficial effects of the invention are as follows: the echo cancellation method based on the double LSTM neural network removes noisy echo signals using LSTM neural network models, eliminates the influence of noise on echo cancellation, and can effectively remove the echo component of a speech signal.
Drawings
Fig. 1 is a schematic flow chart of an echo cancellation method based on a dual LSTM neural network according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a conventional echo cancellation structure;
fig. 3 is another schematic flow chart of the echo cancellation method based on the dual LSTM neural network according to the embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
The invention aims to solve the problem of poor echo cancellation performance in the prior art and provides an echo cancellation method based on a double-LSTM neural network. Its main technical concept is as follows: acquire a first sound source signal to be input to a loudspeaker and a second sound source signal input by a microphone, and extract a first spectral feature of the first sound source signal and a second spectral feature of the second sound source signal; obtain an echo estimation signal and a noise estimation signal from the first and second spectral features based on a first LSTM neural network model, trained on a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal; extract a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal; obtain a clean speech signal from the second, third and fourth spectral features based on a second LSTM neural network model, trained on a second sample sound source signal, a sample echo signal, a sample noise signal and a clean sample speech signal; finally, input the clean speech signal to a loudspeaker.
Before use, the first and second LSTM neural network models are obtained by pre-training: the first model can be trained on a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal, and the second model can be trained on the second sample sound source signal, the sample echo signal, the sample noise signal and a clean sample speech signal. At run time, the first sound source signal to be input to the loudspeaker and the second sound source signal input by the microphone are acquired; the first sound source signal is the far-end signal that enters the echo path, and the second sound source signal is the signal collected by the microphone. The first spectral feature of the first sound source signal and the second spectral feature of the second sound source signal are first input into the first LSTM neural network model to obtain the echo estimation signal and the noise estimation signal for the current environment. The second spectral feature, the third spectral feature of the echo estimation signal and the fourth spectral feature of the noise estimation signal are then input into the second LSTM neural network model to obtain the clean speech signal. Finally, the clean speech signal is input to the loudspeaker, achieving echo cancellation of the sound source signal.
Examples
The echo cancellation method based on the double-LSTM neural network according to the embodiment of the present invention, as shown in FIG. 1, includes the following steps:
step S1, acquiring a first sound source signal to be input to a loudspeaker and a second sound source signal input by a microphone, and extracting a first frequency spectrum characteristic of the first sound source signal and a second frequency spectrum characteristic of the second sound source signal;
a conventional echo cancellation structure is shown in fig. 2: it cancels the echo of the far-end signal to be input to the speaker by means of an adaptive filter. In this embodiment, the far-end signal, that is, the first sound source signal to be input to the speaker, is likewise obtained, together with the second sound source signal input by the microphone, that is, the sound source signal collected by the microphone.
After the first sound source signal and the second sound source signal are obtained, a first spectrum feature corresponding to the first sound source signal and a second spectrum feature corresponding to the second sound source signal are extracted through a feature extraction method.
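The patent does not specify which feature extraction method is used. A common choice for this kind of pipeline is the log-magnitude short-time Fourier transform; the sketch below is an illustrative assumption rather than the patented implementation, and the frame length, hop size and Hann window are arbitrary but typical choices for 16 kHz speech:

```python
import numpy as np

def spectral_features(signal, frame_len=512, hop=256, eps=1e-8):
    """Log-magnitude STFT features: one row per frame, frame_len//2 + 1 bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)   # per-frame FFT of the windowed frame
    return np.log(np.abs(spectrum) + eps)    # log-magnitude spectral feature

# Example: features of the far-end (first) and microphone (second) signals.
fs = 16000
t = np.arange(fs) / fs
first = np.sin(2 * np.pi * 440 * t)                    # stand-in far-end signal
second = 0.3 * first + 0.01 * np.random.randn(fs)      # stand-in microphone signal
F1 = spectral_features(first)                          # first spectral feature
F2 = spectral_features(second)                         # second spectral feature
```

With these parameters a one-second 16 kHz signal yields 61 frames of 257 frequency bins each; any feature with this frame-by-bin layout would serve as LSTM input.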
Step S2, obtaining an echo estimation signal and a noise estimation signal from the first and second spectral features based on a first LSTM neural network model, the first LSTM neural network model being trained on a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal;
a Long Short-Term Memory (LSTM) neural network is a variant of the recurrent neural network (RNN) that overcomes the vanishing and exploding gradient problems of traditional RNNs. By introducing a gating mechanism into its memory cell, it can selectively retain contextual memory, reduce the effective network depth and alleviate gradient vanishing.
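To make the gating mechanism concrete, here is a minimal single-step LSTM forward pass in NumPy. The weights are random stand-ins, not trained values; the point is the additive cell update `c_new = f * c + i * g`, which is what lets the network selectively retain context and mitigates vanishing gradients:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step; gates [input, forget, cell, output] stacked in W, U, b."""
    n = h.shape[0]
    z = W @ x + U @ h + b                    # all four gate pre-activations at once
    i = 1.0 / (1.0 + np.exp(-z[:n]))         # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2 * n]))    # forget gate: keeps contextual memory
    g = np.tanh(z[2 * n:3 * n])              # candidate cell state
    o = 1.0 / (1.0 + np.exp(-z[3 * n:]))     # output gate
    c_new = f * c + i * g                    # additive update eases gradient flow
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
n_in, n_hid = 8, 16
W = 0.1 * rng.standard_normal((4 * n_hid, n_in))
U = 0.1 * rng.standard_normal((4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for _ in range(5):                           # run a short input sequence
    h, c = lstm_step(rng.standard_normal(n_in), h, c, W, U, b)
```

In the patented method, stacks of such cells (with trained weights) map sequences of spectral-feature frames to the echo and noise estimates.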
Specifically, the first LSTM neural network model is trained in advance on a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal. In practice, noise signals in different environments can be collected as sample noise signals; echo signals at different speaker volumes and at different speaker-microphone distances can be collected as sample echo signals; and the first and second sample sound source signals corresponding to these conditions are collected. The initial LSTM neural network model is then trained on these four kinds of samples to obtain the first LSTM neural network model.
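The patent describes the training data only at this level (noise from different environments, echoes at different volumes and speaker-microphone distances). A toy sketch of assembling such sample pairs might look as follows; the gains, delays and white-noise stand-ins are all illustrative assumptions, with gain modeling speaker volume and delay modeling distance:

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 16000

def make_training_pair(clean, far_end, noise, echo_gain, delay):
    """Simulate a microphone capture: near-end speech + delayed, attenuated echo + noise."""
    echo = np.zeros_like(clean)
    echo[delay:] = echo_gain * far_end[:len(clean) - delay]
    mic = clean + echo + noise               # second (microphone) sample signal
    return mic, echo                         # input and echo target for model training

clean = 0.1 * rng.standard_normal(fs)        # stand-in near-end speech
far_end = 0.5 * rng.standard_normal(fs)      # first sample sound source signal
pairs = []
for gain in (0.1, 0.3, 0.6):                 # different speaker volumes
    for delay in (80, 160, 320):             # different distances, in samples
        noise = 0.02 * rng.standard_normal(fs)   # a different noise condition
        pairs.append(make_training_pair(clean, far_end, noise, gain, delay))
```

Real training sets would of course use recorded speech, measured room echoes and real environmental noise rather than synthetic signals.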
In use, the first spectral feature of the currently acquired first sound source signal and the second spectral feature of the second sound source signal are input into the first LSTM neural network model to obtain the echo estimation signal and noise estimation signal corresponding to the current environment.
In this embodiment, the first LSTM neural network model may include an echo estimation model and a noise estimation model. The echo estimation model computes the echo estimation signal and may be trained on a first sample sound source signal, a second sample sound source signal and a sample echo signal; the noise estimation model computes the noise estimation signal and may be trained on a first sample sound source signal, a second sample sound source signal and a sample noise signal.
Step S3, extracting a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal;
specifically, corresponding to step S1, the existing feature extraction method may be used to perform feature extraction on the echo estimation signal and the noise estimation signal output by the first LSTM neural network model, so as to obtain a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal.
Step S4, obtaining a clean speech signal from the second, third and fourth spectral features based on a second LSTM neural network model, the second LSTM neural network model being trained on a second sample sound source signal, a sample echo signal, a sample noise signal and a clean sample speech signal;
specifically, the second LSTM neural network model is also trained in advance, on a second sample sound source signal, a sample echo signal, a sample noise signal and a clean sample speech signal. In practice, noise signals in different environments can be collected as sample noise signals; echo signals at different speaker volumes and speaker-microphone distances can be collected as sample echo signals; and the corresponding second sample sound source signals, together with clean speech signals from different users, are collected. The initial LSTM neural network model is then trained on these samples to obtain the second LSTM neural network model.
In use, the second spectral feature of the second sound source signal, the third spectral feature of the echo estimation signal and the fourth spectral feature of the noise estimation signal are input into the second LSTM neural network model to obtain the clean speech signal.
And step S5, inputting the clean speech signal to a loudspeaker.
Finally, the clean speech signal output by the second LSTM neural network model is input to a loudspeaker, achieving echo cancellation of the sound source signal.
In summary, as shown in fig. 3, this embodiment inputs the first and second sound source signals into the first LSTM neural network model to obtain the echo estimation signal and the noise estimation signal, extracts the spectral features of those estimates, and then inputs these features together with the spectral feature of the second sound source signal into the second LSTM neural network model to obtain the target signal. The method retains contextual memory, reduces network depth and alleviates gradient vanishing, and it markedly suppresses noisy echo signals.
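The overall dataflow of the two-model method can be sketched as a short driver function. The feature extractor and the two models below are oracle stand-ins closing over known toy signals (the real components are the trained LSTM models the patent describes), so only the wiring of steps S1-S4 is meaningful here:

```python
import numpy as np

def run_pipeline(first_src, second_src, features, model1, model2):
    """Wiring of the method: S1 features, S2 model 1, S3 features of estimates, S4 model 2."""
    f1, f2 = features(first_src), features(second_src)   # step S1
    echo_est, noise_est = model1(f1, f2)                 # step S2: estimation signals
    f3, f4 = features(echo_est), features(noise_est)     # step S3
    return model2(f2, f3, f4)                            # step S4: clean-speech estimate

rng = np.random.default_rng(7)
n = 1024
first = rng.standard_normal(n)             # far-end signal to the loudspeaker
echo = 0.3 * first                         # toy echo path: pure attenuation
near = 0.05 * rng.standard_normal(n)       # near-end component to recover
second = near + echo                       # microphone signal

features = lambda x: np.log(np.abs(np.fft.rfft(x)) + 1e-8)   # toy feature extractor
# Oracle stand-ins for the two trained LSTM models (they ignore their inputs):
model1 = lambda f1, f2: (echo, np.zeros(n))
model2 = lambda f2, f3, f4: np.fft.irfft(np.fft.rfft(second) - np.fft.rfft(echo), n)

clean_est = run_pipeline(first, second, features, model1, model2)
```

Because the stand-in models are oracles, `clean_est` recovers the near-end component exactly; real trained models would only approximate it.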
Claims (2)
1. An echo cancellation method based on a double LSTM neural network, characterized by comprising the following steps:
step 1, acquiring a first sound source signal to be input to a loudspeaker and a second sound source signal input by a microphone, and extracting a first spectral feature of the first sound source signal and a second spectral feature of the second sound source signal;
step 2, obtaining an echo estimation signal and a noise estimation signal from the first and second spectral features based on a first LSTM neural network model, wherein the first LSTM neural network model is trained on a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal;
step 3, extracting a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal;
step 4, obtaining a clean speech signal from the second, third and fourth spectral features based on a second LSTM neural network model, wherein the second LSTM neural network model is trained on a second sample sound source signal, a sample echo signal, a sample noise signal and a clean sample speech signal;
and step 5, inputting the clean speech signal to a loudspeaker.
2. The method of claim 1, wherein the first LSTM neural network model comprises an echo estimation model trained from a first sample sound source signal, a second sample sound source signal, and a sample echo signal, and a noise estimation model trained from a first sample sound source signal, a second sample sound source signal, and a sample noise signal.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011455735.5A | 2020-12-10 | 2020-12-10 | Echo cancellation method based on double LSTM neural network |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN112614502A | 2021-04-06 |
| CN112614502B | 2022-01-28 |
Family

ID=75233242

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202011455735.5A (granted as CN112614502B, Active) | Echo cancellation method based on double LSTM neural network | 2020-12-10 | 2020-12-10 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN112614502B (en) |

Families Citing this family (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11984110B2 | 2022-03-07 | 2024-05-14 | MediaTek Singapore Pte. Ltd. | Heterogeneous computing for hybrid acoustic echo cancellation |
Citations (4)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111161752A * | 2019-12-31 | 2020-05-15 | GoerTek Inc. | Echo cancellation method and device |
| CN111225317A * | 2020-01-17 | 2020-06-02 | Sichuan Changhong Electric Co., Ltd. | Echo cancellation method |
| US10854186B1 * | 2019-07-22 | 2020-12-01 | Amazon Technologies, Inc. | Processing audio data received from local devices |
| CN112055284A * | 2019-06-05 | 2020-12-08 | Beijing Horizon Robotics Technology R&D Co., Ltd. | Echo cancellation method, neural network training method, apparatus, medium, and device |
Family Cites Families (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3474280B1 * | 2017-10-19 | 2021-07-07 | Goodix Technology (HK) Company Limited | Signal processor for speech signal enhancement |
| WO2020077232A1 * | 2018-10-12 | 2020-04-16 | Cambridge Cancer Genomics Limited | Methods and systems for nucleic acid variant detection and analysis |

2020-12-10: application CN202011455735.5A filed; granted as patent CN112614502B (Active).
Non-Patent Citations (4)

- Jung-Hee Kim et al., "Attention Wave-U-Net for Acoustic Echo Cancellation", INTERSPEECH 2020, published 2020-10-29.
- Hao Zhang et al., "Deep Learning for Acoustic Echo Cancellation in Noisy and Double-Talk Scenarios", Interspeech 2018, published 2018-09-06.
- Chen Lin, "Research and Implementation of a Real-Time Echo Cancellation Algorithm for Conference Calls" (in Chinese), China Masters' Theses Full-Text Database, Information Science and Technology Series, published 2020-06-15.
- Wang Dongxia et al., "An Echo and Noise Suppression Algorithm Based on a BLSTM Neural Network" (in Chinese), Journal of Signal Processing, published 2020-06-12.
Legal Events

- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant