CN110808057A - Speech enhancement method based on a constrained naive generative adversarial network - Google Patents

Speech enhancement method based on a constrained naive generative adversarial network

Info

Publication number
CN110808057A
Authority
CN
China
Prior art keywords
voice
training
naive
constraint
noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911051607.1A
Other languages
Chinese (zh)
Inventor
Yuan Conglin (袁丛琳)
Sun Chengli (孙成立)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Hangkong University
Original Assignee
Nanchang Hangkong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Hangkong University filed Critical Nanchang Hangkong University
Priority to CN201911051607.1A priority Critical patent/CN110808057A/en
Publication of CN110808057A publication Critical patent/CN110808057A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 — Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 — Speech enhancement, e.g. noise reduction or echo cancellation
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 — Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 — Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 — Noise filtering
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 — Speech or voice analysis techniques characterised by the analysis technique
    • G10L25/30 — Speech or voice analysis techniques characterised by the analysis technique using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a speech enhancement method based on a constrained naive generative adversarial network (CN-GAN), comprising the following steps: 1) noisy data collection and labeling; 2) speech framing and windowing; 3) amplitude compression; 4) constrained naive generative adversarial network training; 5) amplitude decompression; 6) inverse short-time Fourier transform to generate the enhanced speech. The advantages of the invention are: adversarial learning between the generative model and the discriminative model continuously strengthens the generator's ability to produce realistic samples, so that it ultimately captures the distribution of clean speech; no assumption is made about the statistical distribution of speech or noise; and a complex-spectrum mapping approach is adopted, so phase information is included in the training samples. The invention elegantly sidesteps the problem that the distributions of speech and noise signals are difficult to estimate, helps improve speech intelligibility, and avoids phase distortion.

Description

Speech enhancement method based on a constrained naive generative adversarial network
Technical Field
The invention relates to the technical field of speech processing, and in particular to a speech enhancement method based on a constrained naive generative adversarial network.
Background
As the primary medium of human communication, speech plays an important role in mobile communication, multimedia technology, and related fields. Against the backdrop of the rise of artificial intelligence, the wide deployment of technologies such as speech recognition and voiceprint recognition places ever higher demands on the quality of speech signals. In practical voice capture and conversational scenarios, however, the speech signal is often corrupted by various kinds of noise, chiefly background noise, channel noise, and interfering noise. Speech enhancement is an effective technique for combating this noise pollution.
Traditional speech enhancement methods fall into four main categories. (1) Spectral subtraction exploits the short-time stationarity of speech: an estimate of the noise power spectrum is subtracted from the power spectrum of the noisy speech to obtain an estimate of the clean speech power spectrum. This method is prone to the "musical noise" problem. (2) Wiener filtering estimates the spectral coefficients of the speech from the given noisy speech under the assumption that both speech and additive noise obey Gaussian distributions. When the filter parameters reach their tuning limits, or in non-stationary noise environments, Wiener filtering performs poorly. (3) Minimum mean-square error (MMSE) estimation of the spectral amplitude assumes the speech amplitude spectrum follows a particular distribution, such as a Gaussian or Gamma distribution, and estimates the probability distribution of the spectral coefficients by statistical learning. The assumed distribution, however, often does not match the true one. (4) The subspace method places clean speech in a low-rank signal subspace and the noise in a noise subspace, the two subspaces being mutually orthogonal; a clean speech estimate is obtained by zeroing the noise subspace and then filtering the signal subspace. Because this method does not exploit prior knowledge of speech and noise, it is difficult to remove the noise subspace completely.
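To make the first of these classical baselines concrete, the following is a minimal spectral-subtraction sketch. It is not part of the invention; the spectral-floor value and the assumption that the noise power spectrum has already been estimated (e.g. from leading noise-only frames) are illustrative.

```python
import numpy as np

def spectral_subtraction(noisy_frames_spec, noise_psd, floor=0.002):
    """Subtract a noise power-spectrum estimate from each noisy STFT frame.

    noisy_frames_spec: complex STFT frames, shape (num_frames, num_bins)
    noise_psd: noise power-spectrum estimate, shape (num_bins,)
    floor: spectral floor that limits the "musical noise" artifacts
           caused by negative power estimates
    """
    power = np.abs(noisy_frames_spec) ** 2
    clean_power = np.maximum(power - noise_psd, floor * noise_psd)
    # Classical spectral subtraction is magnitude-only: reuse the noisy phase.
    phase = np.angle(noisy_frames_spec)
    return np.sqrt(clean_power) * np.exp(1j * phase)
```

The reuse of the noisy phase is exactly the limitation the patent later addresses with complex-spectrum mapping.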
Disclosure of Invention
The problem the invention aims to solve is as follows: to provide a speech enhancement method based on a constrained naive generative adversarial network that elegantly sidesteps the difficulty of estimating the distributions of speech and noise signals, helps improve speech intelligibility, and avoids phase distortion.
The technical solution the invention provides for this problem is a speech enhancement method based on a constrained naive generative adversarial network, the method comprising the following steps:
(1) noisy data collection and labeling;
(2) speech framing and windowing;
(3) amplitude compression;
(4) constrained naive generative adversarial network training;
(5) amplitude decompression;
(6) inverse short-time Fourier transform to generate the enhanced speech.
Preferably, the noise data collection and labeling in step (1) specifically includes the following steps:
(1.1) data collection: speech from the NOIZEUS corpus is used as the clean speech, noise from the NOISEX-92 noise library is used as the noise signal, and the sampling frequency is 8 kHz;
(1.2) data labeling: each noise is superimposed on the clean speech at signal-to-noise ratios of -5 dB, 0 dB, 5 dB, 10 dB and 15 dB to form the noisy speech data set.
Preferably, the speech framing and windowing in step (2) means framing the noisy speech with a Hamming window of length 512 and a frame shift of 50%, with a 1024-point short-time Fourier transform.
Preferably, the amplitude compression in step (3) means compressing the complex spectrum concatenation vector with the hyperbolic tangent function, limiting its range to [-1, 1], the hyperbolic tangent function being defined as
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
Preferably, the constrained naive generative adversarial network training in step (4) can be divided into network model initialization, discriminator training, generator training, and output of the trained model, specifically:
(4.1) network model initialization: initialize the generator and the discriminator; the generator G is implemented with convolution and deconvolution layers using PReLU activations; the discriminator D is implemented with convolution layers using LeakyReLU activations; "same" zero padding is adopted, and batch normalization is applied to every layer; the optimizer is RMSprop with a learning rate of 0.0002;
(4.2) discriminator training: with the compressed clean speech complex spectra obtained in step (3), train so that D(X_c) approaches 1; with the compressed noisy speech complex spectra obtained in step (3), generate the enhanced speech complex spectrum X̂_c = G(Z_c) and train so that D(X̂_c) approaches 0;
(4.3) generator training: with the compressed clean and noisy speech complex spectra obtained in step (3), freeze the discriminator and train the generator so that D(X̂_c) approaches 1;
(4.4) output of the trained model: repeat steps (4.2) to (4.3) until the model converges, then output the generator G and the discriminator D.
Preferably, the amplitude decompression in step (5) means decompressing the enhanced complex spectrum concatenation vector with the inverse hyperbolic tangent function, the inverse hyperbolic tangent function being defined as:
artanh(x) = (1/2) ln((1 + x) / (1 - x))
Compared with the prior art, the advantages of the invention are: adversarial learning between the generative model and the discriminative model of the generative adversarial network continuously strengthens the generator's ability to produce realistic samples, so that it ultimately captures the distribution of clean speech samples; no assumption is made about the statistical distribution of speech or noise; and a complex-spectrum mapping approach is adopted, so phase information is included in the training samples. The invention elegantly sidesteps the problem that the distributions of speech and noise signals are difficult to estimate, helps improve speech intelligibility, and avoids phase distortion.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention.
Fig. 1 is a schematic diagram of the operation of the present invention.
Fig. 2 is a schematic block diagram of the constrained naive generative adversarial network of the present invention.
Detailed Description
The following detailed description of the embodiments of the present invention will be provided with reference to the accompanying drawings and examples, so that how to implement the embodiments of the present invention by using technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented.
The invention adopts the speech enhancement flow based on a constrained naive generative adversarial network (CN-GAN) shown in Fig. 1 to achieve speech denoising in low signal-to-noise-ratio environments. The specific implementation steps are as follows:
1) noisy data collection and labeling
(1.1) data collection: the embodiment of the invention uses utterances sp01 to sp30 of the NOIZEUS corpus as the clean speech, uses babble, white, hfchannel and buccaneer1 noise from the NOISEX-92 noise library as the noise signals, and samples at 8 kHz;
(1.2) data labeling: the four noises of step (1.1) are superimposed on the clean speech at signal-to-noise ratios of -5 dB, 0 dB, 5 dB, 10 dB and 15 dB to form the noisy speech data set. The data set is divided into a training set and a test set at a 3:1 ratio.
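The labeling step above mixes each noise into the clean speech at a target signal-to-noise ratio. A small sketch of that mixing rule (an illustrative helper, not code from the patent) is:

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + noise attains the requested SNR in dB."""
    noise = noise[: len(clean)]                 # align lengths
    p_clean = np.mean(clean ** 2)               # clean signal power
    p_noise = np.mean(noise ** 2)               # raw noise power
    # Choose a gain so that p_clean / (gain^2 * p_noise) = 10^(snr_db / 10).
    gain = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + gain * noise
```

Calling this once per noise type and per SNR in {-5, 0, 5, 10, 15} dB produces the noisy data set described above.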
2) Speech framing windowing
The noisy speech is framed with a Hamming window of length 512 and a frame shift of 50%; a 1024-point short-time Fourier transform (STFT) is taken, and the real and imaginary parts of the complex spectrum are concatenated into a vector, yielding the noisy speech complex spectrum used for network training.
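The framing, windowing, and real/imaginary concatenation described above can be sketched as follows (a simplified illustration assuming the 512-sample Hamming window, 50% shift and 1024-point FFT stated in the text):

```python
import numpy as np

def complex_spectrum_features(x, frame_len=512, hop=256, n_fft=1024):
    """Frame with a Hamming window (50% shift), take a 1024-point STFT,
    and concatenate the real and imaginary parts of each frame."""
    win = np.hamming(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    feats = []
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len] * win
        spec = np.fft.rfft(frame, n=n_fft)          # 513 complex bins
        feats.append(np.concatenate([spec.real, spec.imag]))
    return np.array(feats)                          # shape (n_frames, 2 * 513)
```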
3) Amplitude compression
The complex spectrum concatenation vector obtained in step 2) is amplitude-compressed with the hyperbolic tangent function. As shown in Fig. 1, the real part Z_r and imaginary part Z_i of the noisy speech complex spectrum are limited to the range [-1, 1]; the compressed Z_r and Z_i then serve as the input of the CN-GAN, which computes the estimates X̂_r and X̂_i of the clean spectrum components X_r and X_i. The hyperbolic tangent function is defined in formula (1):
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))    (1)
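A minimal sketch of the tanh amplitude compression of formula (1), using NumPy's built-in hyperbolic tangent:

```python
import numpy as np

def compress(spec_vec):
    """Hyperbolic-tangent amplitude compression of a real-valued
    complex-spectrum concatenation vector, per formula (1).
    Output values are limited to the open interval (-1, 1)."""
    return np.tanh(spec_vec)
```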
4) input constraint naive generation confrontation network training
(4.1) network model initialization: the generator and the discriminator are initialized. The generator G is implemented with convolution and deconvolution layers using PReLU activations. The discriminator D is implemented with convolution layers using LeakyReLU activations. "same" zero padding is adopted, and batch normalization is applied to every layer. The optimizer is RMSprop with a learning rate of 0.0002. The objective function of the constrained naive generative adversarial network for complex spectrum mapping is shown in equation (2):
min_G max_D V(D, G) = E[log D(X_c)] + E[log(1 - D(G(Z_c)))] + λ·E[‖X_c - G(Z_c)‖_1]    (2)
where X_c = [X_r, X_i], Z_c = [Z_r, Z_i], λ denotes the tuning weight, and E[·] denotes the mathematical expectation.
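Under the assumption that equation (2) combines the standard adversarial terms with a λ-weighted constraint pulling G(Z_c) toward the clean spectrum X_c (the exact form of the constraint term is not fully legible in this copy; an L1 penalty is assumed here), the two loss terms could be sketched as follows. The value lam=100.0 is purely illustrative, not from the patent:

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator loss: the negated adversarial terms of equation (2),
    i.e. maximize E[log D(Xc)] + E[log(1 - D(G(Zc)))]."""
    return -(np.mean(np.log(d_real)) + np.mean(np.log(1 - d_fake)))

def g_loss(d_fake, g_out, x_clean, lam=100.0):
    """Generator loss: adversarial term plus the lambda-weighted L1
    constraint between G(Zc) and the clean spectrum Xc (assumed form)."""
    adv = -np.mean(np.log(d_fake))
    constraint = np.mean(np.abs(x_clean - g_out))
    return adv + lam * constraint
```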
(4.2) training the discriminator: with the compressed clean speech complex spectra obtained in step 3), train so that D(X_c) approaches 1; with the compressed noisy speech complex spectra obtained in step 3), generate the enhanced speech complex spectrum X̂_c = G(Z_c) and train so that D(X̂_c) approaches 0.
(4.3) training the generator: with the compressed clean and noisy speech complex spectra obtained in step 3), freeze the discriminator and train the generator so that D(X̂_c) approaches 1.
(4.4) outputting the trained model: repeat steps (4.2) to (4.3) until the model converges, then output the generator G and the discriminator D.
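The alternating schedule of steps (4.2) to (4.4) can be sketched framework-agnostically. Here G and D stand for the models, and d_step / g_step are assumed, framework-specific update functions (hypothetical names, not from the patent) that perform one discriminator and one frozen-discriminator generator update respectively:

```python
def train_cn_gan(batches, G, D, d_step, g_step, epochs=10):
    """Alternating CN-GAN training loop over (noisy, clean) batches.

    d_step(D, x_clean, x_fake): one discriminator update  -> step (4.2)
    g_step(G, D, z_noisy, x_clean): one generator update
        with the discriminator frozen                     -> step (4.3)
    """
    for _ in range(epochs):                       # repeat until convergence (4.4)
        for z_noisy, x_clean in batches:
            x_fake = G(z_noisy)                   # enhanced spectrum G(Zc)
            d_step(D, x_clean, x_fake)            # push D(Xc)->1, D(G(Zc))->0
            g_step(G, D, z_noisy, x_clean)        # push D(G(Zc))->1
    return G, D
```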
5) Amplitude decompression
The real part X̂_r and imaginary part X̂_i of the enhanced complex spectrum concatenation vector obtained in step 4) are amplitude-decompressed with the inverse hyperbolic tangent function to recover the enhanced complex spectrum. The inverse hyperbolic tangent function is defined in formula (3):
artanh(x) = (1/2) ln((1 + x) / (1 - x))    (3)
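A minimal sketch of the artanh decompression of formula (3). The clipping guard is an implementation detail added here to keep the logarithm in its domain, not something stated in the patent:

```python
import numpy as np

def decompress(compressed_vec, eps=1e-7):
    """Inverse hyperbolic tangent decompression, formula (3):
    artanh(x) = 0.5 * ln((1 + x) / (1 - x))."""
    v = np.clip(compressed_vec, -1 + eps, 1 - eps)  # guard the log domain
    return 0.5 * np.log((1 + v) / (1 - v))
```

Applied to tanh-compressed values, this recovers the original spectrum components.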
6) inverse short-time Fourier transform to generate enhanced speech
An inverse short-time Fourier transform (ISTFT) is applied to the enhanced speech complex spectrum obtained in step 5) to obtain the time-domain waveform of the denoised speech, completing the speech enhancement process.
Step 6) is repeated for all noisy speech in the test set to obtain the enhanced speech data set.
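The overlap-add inverse STFT of step 6) can be sketched as follows. This is a simplified illustration consistent with the 512-sample Hamming window, 50% shift and 1024-point transform described above; the window-power normalization is an assumed implementation detail:

```python
import numpy as np

def istft_overlap_add(frames_spec, frame_len=512, hop=256, n_fft=1024):
    """Inverse STFT with overlap-add to rebuild the time-domain waveform."""
    n_frames = frames_spec.shape[0]
    out = np.zeros(hop * (n_frames - 1) + frame_len)
    wsum = np.zeros_like(out)
    win = np.hamming(frame_len)
    for i, spec in enumerate(frames_spec):
        frame = np.fft.irfft(spec, n=n_fft)[:frame_len]
        out[i * hop : i * hop + frame_len] += frame * win
        wsum[i * hop : i * hop + frame_len] += win ** 2
    # Normalize by the accumulated window power so overlapping
    # analysis/synthesis windows cancel out.
    return out / np.maximum(wsum, 1e-8)
```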
The foregoing is merely illustrative of the preferred embodiments of the present invention and is not to be construed as limiting the claims. The present invention is not limited to the above embodiments, and the specific structure thereof is allowed to vary. All changes which come within the scope of the invention as defined by the independent claims are intended to be embraced therein.

Claims (6)

1. A speech enhancement method based on a constrained naive generative adversarial network, characterized in that the method comprises the following steps:
(1) noisy data collection and labeling;
(2) speech framing and windowing;
(3) amplitude compression;
(4) constrained naive generative adversarial network training;
(5) amplitude decompression;
(6) inverse short-time Fourier transform to generate the enhanced speech.
2. The speech enhancement method based on a constrained naive generative adversarial network according to claim 1, characterized in that the noisy data collection and labeling in step (1) specifically comprises the following steps:
(1.1) data collection: speech from the NOIZEUS corpus is used as the clean speech, noise from the NOISEX-92 noise library is used as the noise signal, and the sampling frequency is 8 kHz;
(1.2) data labeling: each noise is superimposed on the clean speech at signal-to-noise ratios of -5 dB, 0 dB, 5 dB, 10 dB and 15 dB to form the noisy speech data set.
3. The speech enhancement method based on a constrained naive generative adversarial network according to claim 1, characterized in that the speech framing and windowing in step (2) means framing the noisy speech with a Hamming window of length 512 and a frame shift of 50%, with a 1024-point short-time Fourier transform.
4. The speech enhancement method based on a constrained naive generative adversarial network according to claim 1, characterized in that the amplitude compression in step (3) means compressing the complex spectrum concatenation vector with the hyperbolic tangent function, limiting its range to [-1, 1], the hyperbolic tangent function being defined as
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
5. The speech enhancement method based on a constrained naive generative adversarial network according to claim 1, characterized in that the constrained naive generative adversarial network training in step (4) can be divided into network model initialization, discriminator training, generator training, and output of the trained model, specifically:
(4.1) network model initialization: initialize the generator and the discriminator; the generator G is implemented with convolution and deconvolution layers using PReLU activations; the discriminator D is implemented with convolution layers using LeakyReLU activations; "same" zero padding is adopted, and batch normalization is applied to every layer; the optimizer is RMSprop with a learning rate of 0.0002;
(4.2) discriminator training: with the compressed clean speech complex spectra obtained in step (3), train so that D(X_c) approaches 1; with the compressed noisy speech complex spectra obtained in step (3), generate the enhanced speech complex spectrum X̂_c = G(Z_c) and train so that D(X̂_c) approaches 0;
(4.3) generator training: with the compressed clean and noisy speech complex spectra obtained in step (3), freeze the discriminator and train the generator so that D(X̂_c) approaches 1;
(4.4) output of the trained model: repeat steps (4.2) to (4.3) until the model converges, then output the generator G and the discriminator D.
6. The speech enhancement method based on a constrained naive generative adversarial network according to claim 1, characterized in that the amplitude decompression in step (5) means decompressing the enhanced complex spectrum concatenation vector with the inverse hyperbolic tangent function, the inverse hyperbolic tangent function being defined as:
artanh(x) = (1/2) ln((1 + x) / (1 - x))
CN201911051607.1A 2019-10-31 2019-10-31 Speech enhancement method based on a constrained naive generative adversarial network Pending CN110808057A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911051607.1A CN110808057A (en) 2019-10-31 2019-10-31 Speech enhancement method based on a constrained naive generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911051607.1A CN110808057A (en) 2019-10-31 2019-10-31 Speech enhancement method based on a constrained naive generative adversarial network

Publications (1)

Publication Number Publication Date
CN110808057A true CN110808057A (en) 2020-02-18

Family

ID=69489817

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911051607.1A Pending CN110808057A (en) 2019-10-31 2019-10-31 Speech enhancement method based on a constrained naive generative adversarial network

Country Status (1)

Country Link
CN (1) CN110808057A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160341814A1 (en) * 2012-03-09 2016-11-24 U.S. Army Research Laboratory Attn: Rdrl-Loc-I Method and system for jointly separating noise from signals
CN108200522A (en) * 2017-11-24 2018-06-22 华侨大学 A kind of change regularization ratio normalization sub-band adaptive filtering method
CN110085215A (en) * 2018-01-23 2019-08-02 中国科学院声学研究所 A kind of language model data Enhancement Method based on generation confrontation network
US10152970B1 (en) * 2018-02-08 2018-12-11 Capital One Services, Llc Adversarial learning and generation of dialogue responses
CN109215674A (en) * 2018-08-10 2019-01-15 上海大学 Real-time voice Enhancement Method
CN109065021A (en) * 2018-10-18 2018-12-21 江苏师范大学 The end-to-end dialect identification method of confrontation network is generated based on condition depth convolution
CN110060701A (en) * 2019-04-04 2019-07-26 南京邮电大学 Multi-to-multi phonetics transfer method based on VAWGAN-AC

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
DANIEL MICHELSANTI ET AL.: "《Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification》", 《CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2017》 *
JONAS SAUTTER ET AL.: "《Artificial Bandwidth Extension Using a Conditional Generative Adversarial Network with Discriminative Training》", 《ICASSP 2019 - 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)》 *
PEIYAO SHENG ET AL.: "《Data Augmentation using Conditional Generative Adversarial Networks for Robust Speech Recognition》", 《2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP)》 *
SUN CHENGLI; WANG HAIWU: "Research on Generative Adversarial Networks for Speech Enhancement", Computer Technology and Development (《计算机技术与发展》) *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112581929A (en) * 2020-12-11 2021-03-30 山东省计算中心(国家超级计算济南中心) Voice privacy density masking signal generation method and system based on generation countermeasure network
CN112581929B (en) * 2020-12-11 2022-06-03 山东省计算中心(国家超级计算济南中心) Voice privacy density masking signal generation method and system based on generation countermeasure network
CN112560795A (en) * 2020-12-30 2021-03-26 南昌航空大学 SAR image target recognition algorithm based on CN-GAN and CNN
CN113035217A (en) * 2021-03-01 2021-06-25 武汉大学 Voice enhancement method based on voiceprint embedding under low signal-to-noise ratio condition
CN113035217B (en) * 2021-03-01 2023-11-10 武汉大学 Voice enhancement method based on voiceprint embedding under low signal-to-noise ratio condition
CN113052267A (en) * 2021-04-28 2021-06-29 电子科技大学 Unsupervised transmitter phase noise parameter extraction method based on generation countermeasure network
CN113052267B (en) * 2021-04-28 2022-06-14 电子科技大学 Unsupervised transmitter phase noise parameter extraction method based on generation countermeasure network

Similar Documents

Publication Publication Date Title
CN110619885B (en) Method for generating confrontation network voice enhancement based on deep complete convolution neural network
CN110085249B (en) Single-channel speech enhancement method of recurrent neural network based on attention gating
CN107452389B (en) Universal single-track real-time noise reduction method
CN107274908B (en) Wavelet voice denoising method based on new threshold function
CN112735456B (en) Speech enhancement method based on DNN-CLSTM network
CN110808057A (en) Speech enhancement method based on a constrained naive generative adversarial network
CN110148420A (en) A kind of audio recognition method suitable under noise circumstance
CN107845389A (en) A kind of sound enhancement method based on multiresolution sense of hearing cepstrum coefficient and depth convolutional neural networks
CN110428849B (en) Voice enhancement method based on generation countermeasure network
CN108831499A (en) Utilize the sound enhancement method of voice existing probability
CN111653288A (en) Target person voice enhancement method based on conditional variation self-encoder
CN110867181A (en) Multi-target speech enhancement method based on SCNN and TCNN joint estimation
CN109643554A (en) Adaptive voice Enhancement Method and electronic equipment
KR101305373B1 (en) Interested audio source cancellation method and voice recognition method thereof
CN111091833A (en) Endpoint detection method for reducing noise influence
WO2019014890A1 (en) Universal single channel real-time noise-reduction method
CN111899750B (en) Speech enhancement algorithm combining cochlear speech features and hopping deep neural network
CN117711419B (en) Intelligent data cleaning method for data center
CN115424627A (en) Voice enhancement hybrid processing method based on convolution cycle network and WPE algorithm
CN116013344A (en) Speech enhancement method under multiple noise environments
CN107045874A (en) A kind of Non-linear Speech Enhancement Method based on correlation
CN112634927A (en) Short wave channel voice enhancement method
CN113066483B (en) Sparse continuous constraint-based method for generating countermeasure network voice enhancement
Hamid et al. Speech enhancement using EMD based adaptive soft-thresholding (EMD-ADT)
CN117037825A (en) Adaptive filtering and multi-window spectrum estimation spectrum subtraction combined noise reduction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200218