CN112614502B - Echo cancellation method based on double LSTM neural network - Google Patents


Info

Publication number
CN112614502B
Authority
CN
China
Prior art keywords
signal
sound source
sample
neural network
echo
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011455735.5A
Other languages
Chinese (zh)
Other versions
CN112614502A (en)
Inventor
王前慧
邓小红
胡涛
李俊潇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd filed Critical Sichuan Changhong Electric Co Ltd
Priority to CN202011455735.5A priority Critical patent/CN112614502B/en
Publication of CN112614502A publication Critical patent/CN112614502A/en
Application granted granted Critical
Publication of CN112614502B publication Critical patent/CN112614502B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10L Speech analysis techniques or speech synthesis; speech recognition; speech or voice processing techniques; speech or audio coding or decoding
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G10L25/30 Speech or voice analysis techniques characterised by the use of neural networks
    • G10L2021/02082 Noise filtering, the noise being echo or reverberation of the speech
    • G10L2021/02163 Only one microphone (number of inputs available containing the signal or the noise to be suppressed)
    • G06N Computing arrangements based on specific computational models
    • G06N3/045 Combinations of networks
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Abstract

The invention relates to the field of audio signal processing and aims to solve the problem of poor echo cancellation performance in the prior art. It provides an echo cancellation method based on a double-LSTM neural network, comprising the following steps: acquiring a first sound source signal to be fed to a loudspeaker and a second sound source signal captured by a microphone, and extracting a first spectral feature of the first sound source signal and a second spectral feature of the second sound source signal; obtaining an echo estimation signal and a noise estimation signal from the first and second spectral features using a first LSTM neural network model; extracting a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal; obtaining a clean speech signal from the second, third, and fourth spectral features using a second LSTM neural network model; and inputting the clean speech signal to the loudspeaker. The invention can effectively remove the echo component of a speech signal and is suitable for smart televisions.

Description

Echo cancellation method based on double LSTM neural network
Technical Field
The invention relates to the field of audio signal processing, in particular to an echo cancellation method.
Background
With the advent of the artificial intelligence era, voice has become an important interface for human-computer interaction. As Internet-of-Things technology develops, people expect to control smart devices by voice at greater distances and in more complex environments. Traditional near-field voice interaction can no longer meet these requirements, and microphone array technology has become the core of far-field interaction.
To address today's complex application scenarios, a series of key technologies for improving speech recognition accuracy has been developed on top of microphone arrays, mainly comprising speech enhancement, sound source localization, dereverberation, echo cancellation, and noise suppression. Devices that contain both a loudspeaker and a microphone (such as smart speakers and smart televisions) must cancel the sound they play back in order to recover the talker's voice, and conventional echo cancellation algorithms rely mainly on adaptive signal processing to remove this interference. Everyday scenes, however, contain many kinds of noise, so noise is a non-negligible factor in echo cancellation. Such algorithms perform well in quiet conditions, but their performance degrades in the presence of environmental noise, and the echo cancellation result is especially unsatisfactory under non-stationary noise.
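As a point of reference for the adaptive-signal-processing approach mentioned above, the following is a minimal normalized-LMS (NLMS) sketch of a conventional echo canceller. It is illustrative only: the function name, tap count, and step size are assumptions, and this is not the method of the invention.

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, taps=32, mu=0.5, eps=1e-8):
    """Adaptive echo cancellation via normalized LMS: estimate the echo
    path from the far-end signal and subtract the predicted echo."""
    w = np.zeros(taps)
    out = np.zeros(len(mic))
    for n in range(taps - 1, len(mic)):
        x = far_end[n - taps + 1:n + 1][::-1]  # far_end[n], ..., far_end[n-taps+1]
        e = mic[n] - w @ x                     # residual after echo removal
        w += mu * e * x / (x @ x + eps)        # normalized weight update
        out[n] = e
    return out

rng = np.random.default_rng(0)
far = rng.standard_normal(4000)
echo = np.convolve(far, [0.6, 0.3, 0.1])[:4000]  # toy 3-tap echo path
residual = nlms_echo_cancel(far, echo)
# once the filter has converged, the residual energy is far below the echo energy
print(np.mean(residual[-500:] ** 2) < 0.01 * np.mean(echo ** 2))  # prints True
```

As the Background notes, this works well without noise but degrades when environmental noise enters the microphone, which motivates the learned approach below.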
Disclosure of Invention
The invention aims to solve the problem of poor echo cancellation effect in the prior art, and provides an echo cancellation method based on a double-LSTM neural network.
The technical scheme adopted by the invention for solving the technical problems is as follows: the echo cancellation method based on the double LSTM neural network is characterized by comprising the following steps of:
step 1, acquiring a first sound source signal to be input to a loudspeaker and a second sound source signal input by a microphone, and extracting a first frequency spectrum characteristic of the first sound source signal and a second frequency spectrum characteristic of the second sound source signal;
step 2, obtaining an echo estimation signal and a noise estimation signal according to the first frequency spectrum characteristic and the second frequency spectrum characteristic and based on a first LSTM neural network model, wherein the first LSTM neural network model is obtained by training according to a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal;
step 3, extracting a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal;
step 4, obtaining a pure voice signal according to the second frequency spectrum characteristic, the third frequency spectrum characteristic and the fourth frequency spectrum characteristic and based on a second LSTM neural network model, wherein the second LSTM neural network model is obtained by training according to a second sample sound source signal, a sample echo signal, a sample noise signal and the pure sample voice signal;
and 5, inputting the pure voice signal to a loudspeaker.
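The five steps above can be sketched end to end as follows. The patent does not fix the models' architectures here, so the two LSTM models are replaced by trivial stand-ins (a scaled copy of the far-end spectrum and spectral subtraction); every function name and parameter in this sketch is illustrative.

```python
import numpy as np

def spectral_feature(signal, frame=256):
    """Per-frame magnitude spectrum (illustrative feature extractor)."""
    n = len(signal) // frame
    frames = signal[:n * frame].reshape(n, frame)
    return np.abs(np.fft.rfft(frames, axis=1))

def echo_cancel(far_end, mic, model1, model2, frame=256):
    """Dual-model pipeline: model1 estimates echo and noise spectra (step 2),
    model2 combines them with the microphone spectrum (step 4)."""
    f1 = spectral_feature(far_end, frame)  # first spectral feature (step 1)
    f2 = spectral_feature(mic, frame)      # second spectral feature (step 1)
    echo_est, noise_est = model1(f1, f2)   # step 2 (also yields step-3 features)
    return model2(f2, echo_est, noise_est) # step 4: clean speech spectrum

# stand-in "models": spectral subtraction as a placeholder for the two LSTMs
model1 = lambda f1, f2: (0.5 * f1, np.full_like(f2, 0.01))
model2 = lambda f2, e, n: np.maximum(f2 - e - n, 0.0)

rng = np.random.default_rng(0)
far = rng.standard_normal(1024)
mic = 0.5 * far + 0.05 * rng.standard_normal(1024)
clean = echo_cancel(far, mic, model1, model2)
print(clean.shape)  # (4, 129)
```

In the method itself, both stand-ins would be trained LSTM networks; the data flow between the five steps is what this sketch is meant to show.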
Further, the first LSTM neural network model includes an echo estimation model and a noise estimation model, the echo estimation model is obtained by training according to a first sample sound source signal, a second sample sound source signal, and a sample echo signal, and the noise estimation model is obtained by training according to the first sample sound source signal, the second sample sound source signal, and a sample noise signal.
The beneficial effects of the invention are as follows: the echo cancellation method based on the double LSTM neural network removes noisy echo signals using LSTM neural network models, eliminates the influence of noise on echo cancellation, and can effectively remove the echo component of a speech signal.
Drawings
Fig. 1 is a schematic flow chart of an echo cancellation method based on a dual LSTM neural network according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a conventional echo cancellation structure;
fig. 3 is another schematic flow chart of the echo cancellation method based on the dual LSTM neural network according to the embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
To solve the problem of poor echo cancellation performance in the prior art, the invention provides an echo cancellation method based on a double LSTM neural network. Its main technical concept is as follows: acquire a first sound source signal to be fed to a loudspeaker and a second sound source signal captured by a microphone, and extract a first spectral feature of the first sound source signal and a second spectral feature of the second sound source signal; obtain an echo estimation signal and a noise estimation signal from the first and second spectral features using a first LSTM neural network model, the first LSTM neural network model being trained from a first sample sound source signal, a second sample sound source signal, a sample echo signal, and a sample noise signal; extract a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal; obtain a clean speech signal from the second, third, and fourth spectral features using a second LSTM neural network model, the second LSTM neural network model being trained from the second sample sound source signal, the sample echo signal, the sample noise signal, and a clean sample speech signal; and input the clean speech signal to the loudspeaker.
Before deployment, the first and second LSTM neural network models are trained in advance: the first LSTM neural network model can be trained from a first sample sound source signal, a second sample sound source signal, a sample echo signal, and a sample noise signal, while the second LSTM neural network model can be trained from the second sample sound source signal, the sample echo signal, the sample noise signal, and a clean sample speech signal. In use, the first sound source signal to be fed to the loudspeaker and the second sound source signal captured by the microphone are acquired, where the first sound source signal is the far-end signal entering the echo path and the second sound source signal is the signal picked up by the microphone. The processing then proceeds as follows: the first spectral feature of the first sound source signal and the second spectral feature of the second sound source signal are input to the first LSTM neural network model to obtain the echo estimation signal and noise estimation signal for the current environment; the second spectral feature of the second sound source signal, the third spectral feature of the echo estimation signal, and the fourth spectral feature of the noise estimation signal are then input to the second LSTM neural network model to obtain a clean speech signal; finally, the clean speech signal is output to the loudspeaker, completing echo cancellation of the sound source signal.
Examples
The echo cancellation method based on the double-LSTM neural network according to the embodiment of the present invention, as shown in FIG. 1, includes the following steps:
step S1, acquiring a first sound source signal to be input to a loudspeaker and a second sound source signal input by a microphone, and extracting a first frequency spectrum characteristic of the first sound source signal and a second frequency spectrum characteristic of the second sound source signal;
A conventional echo cancellation structure is shown in fig. 2: it cancels echo by passing the far-end signal to be played by the loudspeaker through an adaptive filter. In this embodiment, on that basis, the far-end signal (the first sound source signal to be input to the loudspeaker) is acquired together with the second sound source signal input by the microphone, i.e. the sound source signal collected by the microphone.
After the first sound source signal and the second sound source signal are obtained, a first spectrum feature corresponding to the first sound source signal and a second spectrum feature corresponding to the second sound source signal are extracted through a feature extraction method.
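The patent does not pin the feature extraction to a particular transform; a windowed log-magnitude spectrogram is one plausible choice. The sketch below is an assumption, not the patent's specified feature; the function name `log_spectrum` and all parameters are illustrative.

```python
import numpy as np

def log_spectrum(x, n_fft=512, hop=256):
    """Windowed log-magnitude spectrogram: one frame of features per hop."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    feats = np.empty((n_frames, n_fft // 2 + 1))
    for t in range(n_frames):
        frame = x[t * hop : t * hop + n_fft] * win   # windowed frame
        feats[t] = np.log1p(np.abs(np.fft.rfft(frame)))
    return feats

sig = np.sin(2 * np.pi * 440 * np.arange(4096) / 16000)  # 440 Hz tone at 16 kHz
F = log_spectrum(sig)
print(F.shape)  # (15, 257)
```

The same extractor would be applied to the first and second sound source signals here, and again in step S3 to the echo and noise estimates.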
Step S2, obtaining an echo estimation signal and a noise estimation signal according to the first frequency spectrum characteristic and the second frequency spectrum characteristic and based on a first LSTM neural network model, wherein the first LSTM neural network model is obtained by training according to a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal;
A Long Short-Term Memory (LSTM) neural network is a variant of the recurrent neural network (RNN). It overcomes the vanishing- and exploding-gradient problems of the traditional RNN: by introducing a gating mechanism into the memory cell, it can selectively retain contextual memory, reduce the effective network depth, and alleviate the vanishing-gradient phenomenon.
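The gating mechanism described above can be made concrete with a single LSTM cell step in NumPy. This is the standard textbook LSTM formulation, not code from the patent; the weight layout (four gates stacked in one matrix) is a common convention.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h, c, W, U, b, hidden):
    """One LSTM step: input, forget, and output gates plus a candidate state."""
    z = W @ x + U @ h + b              # pre-activations for all four gates
    i = sigmoid(z[:hidden])            # input gate: how much new info to write
    f = sigmoid(z[hidden:2*hidden])    # forget gate: how much memory to keep
    o = sigmoid(z[2*hidden:3*hidden])  # output gate: how much state to expose
    g = np.tanh(z[3*hidden:])          # candidate cell update
    c_new = f * c + i * g              # gated memory update
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(1)
n_in, hidden = 8, 4
W = 0.1 * rng.standard_normal((4 * hidden, n_in))
U = 0.1 * rng.standard_normal((4 * hidden, hidden))
b = np.zeros(4 * hidden)
h = c = np.zeros(hidden)
for _ in range(5):  # run a short input sequence through the cell
    h, c = lstm_cell(rng.standard_normal(n_in), h, c, W, U, b, hidden)
print(h.shape)  # (4,)
```

The forget gate `f` is what lets the cell carry context across many frames, which is the property the method relies on for tracking echo and noise over time.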
Specifically, the first LSTM neural network model is trained in advance from a first sample sound source signal, a second sample sound source signal, a sample echo signal, and a sample noise signal. In practice, noise signals recorded in different environments can serve as sample noise signals; echo signals recorded at different loudspeaker volumes and at different loudspeaker-to-microphone distances can serve as sample echo signals; and the first and second sample sound source signals are collected under the same conditions. The preliminary LSTM neural network model is then trained on these four signal sets, yielding the first LSTM neural network model.
When the method is used specifically, a first frequency spectrum characteristic of a first sound source signal and a second frequency spectrum characteristic of a second sound source signal which are obtained currently are input into a first LSTM neural network model, and an echo estimation signal and a noise estimation signal corresponding to the current environment can be obtained.
In this embodiment, the first LSTM neural network model may include an echo estimation model and a noise estimation model, where the echo estimation model is used for calculating an echo estimation signal, and may be obtained by training a first sample sound source signal, a second sample sound source signal, and a sample echo signal, and the noise estimation model is used for calculating a noise estimation signal, and may be obtained by training a first sample sound source signal, a second sample sound source signal, and a sample noise signal.
Step S3, extracting a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal;
specifically, corresponding to step S1, the existing feature extraction method may be used to perform feature extraction on the echo estimation signal and the noise estimation signal output by the first LSTM neural network model, so as to obtain a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal.
Step S4, obtaining a pure voice signal according to the second spectral feature, the third spectral feature and the fourth spectral feature and based on a second LSTM neural network model, wherein the second LSTM neural network model is obtained by training according to a second sample sound source signal, a sample echo signal, a sample noise signal and the pure sample voice signal;
specifically, the second LSTM neural network model is also trained in advance before being used specifically, and may be trained according to a second sample sound source signal, a sample echo signal, a sample noise signal, and a pure sample voice signal, specifically, noise signals in different environments may be collected as sample noise signals, echo signals at different volumes of speakers and at different distances between the speakers and the microphone are collected as sample echo signals, and second sample sound source signals corresponding to the above conditions and pure voice signals of different users are collected, and the preliminary LSTM neural network model established is trained through the second sample sound source signal, the sample echo signal, the sample noise signal, and the pure sample voice signal, thereby obtaining the second LSTM neural network model.
When the method is used specifically, the second spectral feature of the second sound source signal, the third spectral feature of the echo estimation signal and the fourth spectral feature of the noise estimation signal are input into the second LSTM neural network model, and then a pure voice signal can be obtained.
And step S5, inputting the pure voice signal to a loudspeaker.
Finally, the pure voice signal output by the second LSTM neural network model is input to a loudspeaker, and echo cancellation of the sound source signal can be achieved.
In summary, as shown in fig. 3, in this embodiment the first and second sound source signals are input to the first LSTM neural network model to obtain the echo estimation signal and the noise estimation signal; the spectral features of these two estimates are extracted; and these features, together with the spectral feature of the second sound source signal, are input to the second LSTM neural network model to obtain the target signal. The method retains contextual memory, reduces network depth, and alleviates the vanishing-gradient phenomenon, and it markedly suppresses noisy echo signals.

Claims (2)

1. The echo cancellation method based on the double LSTM neural network is characterized by comprising the following steps of:
step 1, acquiring a first sound source signal to be input to a loudspeaker and a second sound source signal input by a microphone, and extracting a first frequency spectrum characteristic of the first sound source signal and a second frequency spectrum characteristic of the second sound source signal;
step 2, obtaining an echo estimation signal and a noise estimation signal according to the first frequency spectrum characteristic and the second frequency spectrum characteristic and based on a first LSTM neural network model, wherein the first LSTM neural network model is obtained by training according to a first sample sound source signal, a second sample sound source signal, a sample echo signal and a sample noise signal;
step 3, extracting a third spectral feature of the echo estimation signal and a fourth spectral feature of the noise estimation signal;
step 4, obtaining a pure voice signal according to the second frequency spectrum characteristic, the third frequency spectrum characteristic and the fourth frequency spectrum characteristic and based on a second LSTM neural network model, wherein the second LSTM neural network model is obtained by training according to a second sample sound source signal, a sample echo signal, a sample noise signal and the pure sample voice signal;
and 5, inputting the pure voice signal to a loudspeaker.
2. The method of claim 1, wherein the first LSTM neural network model comprises an echo estimation model trained from a first sample sound source signal, a second sample sound source signal, and a sample echo signal, and a noise estimation model trained from a first sample sound source signal, a second sample sound source signal, and a sample noise signal.
CN202011455735.5A 2020-12-10 2020-12-10 Echo cancellation method based on double LSTM neural network Active CN112614502B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011455735.5A CN112614502B (en) 2020-12-10 2020-12-10 Echo cancellation method based on double LSTM neural network


Publications (2)

Publication Number Publication Date
CN112614502A CN112614502A (en) 2021-04-06
CN112614502B true CN112614502B (en) 2022-01-28

Family

ID=75233242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011455735.5A Active CN112614502B (en) 2020-12-10 2020-12-10 Echo cancellation method based on double LSTM neural network

Country Status (1)

Country Link
CN (1) CN112614502B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11984110B2 (en) 2022-03-07 2024-05-14 Mediatek Singapore Pte. Ltd. Heterogeneous computing for hybrid acoustic echo cancellation

Citations (4)

Publication number Priority date Publication date Assignee Title
CN111161752A (en) * 2019-12-31 2020-05-15 歌尔股份有限公司 Echo cancellation method and device
CN111225317A (en) * 2020-01-17 2020-06-02 四川长虹电器股份有限公司 Echo cancellation method
US10854186B1 (en) * 2019-07-22 2020-12-01 Amazon Technologies, Inc. Processing audio data received from local devices
CN112055284A (en) * 2019-06-05 2020-12-08 北京地平线机器人技术研发有限公司 Echo cancellation method, neural network training method, apparatus, medium, and device

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
EP3474280B1 (en) * 2017-10-19 2021-07-07 Goodix Technology (HK) Company Limited Signal processor for speech signal enhancement
WO2020077232A1 (en) * 2018-10-12 2020-04-16 Cambridge Cancer Genomics Limited Methods and systems for nucleic acid variant detection and analysis

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
CN112055284A (en) * 2019-06-05 2020-12-08 北京地平线机器人技术研发有限公司 Echo cancellation method, neural network training method, apparatus, medium, and device
US10854186B1 (en) * 2019-07-22 2020-12-01 Amazon Technologies, Inc. Processing audio data received from local devices
CN111161752A (en) * 2019-12-31 2020-05-15 歌尔股份有限公司 Echo cancellation method and device
CN111225317A (en) * 2020-01-17 2020-06-02 四川长虹电器股份有限公司 Echo cancellation method

Non-Patent Citations (4)

Title
Attention Wave-U-Net for Acoustic Echo Cancellation; Jung-Hee Kim et al.; INTERSPEECH 2020; 2020-10-29; full text *
Deep Learning for Acoustic Echo Cancellation in Noisy and Double-Talk Scenarios; Hao Zhang et al.; Interspeech 2018; 2018-09-06; full text *
Research and Implementation of a Real-Time Echo Cancellation Algorithm for Conference Calls; Chen Lin; China Master's Theses Full-text Database, Information Science and Technology; 2020-06-15; full text *
Echo and Noise Suppression Algorithm Based on a BLSTM Neural Network; Wang Dongxia et al.; Journal of Signal Processing; 2020-06-12; full text *

Also Published As

Publication number Publication date
CN112614502A (en) 2021-04-06

Similar Documents

Publication Publication Date Title
US10546593B2 (en) Deep learning driven multi-channel filtering for speech enhancement
CN109065067B (en) Conference terminal voice noise reduction method based on neural network model
EP1443498B1 (en) Noise reduction and audio-visual speech activity detection
WO2021042870A1 (en) Speech processing method and apparatus, electronic device, and computer-readable storage medium
CN108447496B (en) Speech enhancement method and device based on microphone array
US9640194B1 (en) Noise suppression for speech processing based on machine-learning mask estimation
US9269368B2 (en) Speaker-identification-assisted uplink speech processing systems and methods
WO2021022094A1 (en) Per-epoch data augmentation for training acoustic models
KR20130108063A (en) Multi-microphone robust noise suppression
CN108109617A (en) A kind of remote pickup method
CN111755020B (en) Stereo echo cancellation method
CN110660407B (en) Audio processing method and device
CN110660406A (en) Real-time voice noise reduction method of double-microphone mobile phone in close-range conversation scene
CN110012331A An infrared-triggered far-field dual-microphone far-field speech recognition method
CN115132215A (en) Single-channel speech enhancement method
CN106161820B (en) A kind of interchannel decorrelation method for stereo acoustic echo canceler
CN112614502B (en) Echo cancellation method based on double LSTM neural network
CN115457928A (en) Echo cancellation method and system based on neural network double-talk detection
CN111225317B (en) Echo cancellation method
CN112820311A (en) Echo cancellation method and device based on spatial prediction
CN114023352B (en) Voice enhancement method and device based on energy spectrum depth modulation
Zhou et al. Speech Enhancement via Residual Dense Generative Adversarial Network.
CN113763978B (en) Voice signal processing method, device, electronic equipment and storage medium
CN114827363A (en) Method, device and readable storage medium for eliminating echo in call process
CN114566179A (en) Time delay controllable voice noise reduction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant