CN111261197B - Real-time speech paragraph tracking method under complex noise scene - Google Patents

Real-time speech paragraph tracking method under complex noise scene

Info

Publication number
CN111261197B
CN111261197B CN202010029721.0A
Authority
CN
China
Prior art keywords
noise
calculating
frame
signal
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010029721.0A
Other languages
Chinese (zh)
Other versions
CN111261197A (en)
Inventor
马翼平
张玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Avic East China Photoelectric Shanghai Co ltd
Original Assignee
Avic East China Photoelectric Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Avic East China Photoelectric Shanghai Co ltd filed Critical Avic East China Photoelectric Shanghai Co ltd
Priority to CN202010029721.0A priority Critical patent/CN111261197B/en
Publication of CN111261197A publication Critical patent/CN111261197A/en
Application granted granted Critical
Publication of CN111261197B publication Critical patent/CN111261197B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/84 Detection of presence or absence of voice signals for discriminating voice from noise
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L25/03 Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques in which the extracted parameters are power information
    • G10L25/27 Speech or voice analysis techniques characterised by the analysis technique
    • G10L25/45 Speech or voice analysis techniques characterised by the type of analysis window
    • G10L25/93 Discriminating between voiced and unvoiced parts of speech signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a real-time speech paragraph tracking method for complex noise scenes, comprising the following steps: A. preprocessing; B. calculating the discrete Fourier transform coefficients of the input audio frame; C. assuming the first L frames to be noise frames, calculating the initial noise power as the arithmetic mean of the Fourier transform magnitude spectrum, and treating the data after the L-th frame as a noisy signal whose power is then calculated; D. calculating the posterior signal-to-noise ratio; E. calculating the prior signal-to-noise ratio; F. voice activity detection; G. updating the noise spectrum; H. calculating a gain coefficient. The spectral properties of the stationary noise in the scene are estimated from the noise between speech paragraphs, and a gain function is designed to enhance the speech and suppress the stationary noise. Voiced-sound detection is then performed on this basis to track the speech paragraphs and mask the various noises between them. The method thereby improves the accuracy of voice detection, suppresses the noise superimposed on speech segments, and completely masks the inter-segment noise that degrades the listening experience.

Description

Real-time voice paragraph tracking method under complex noise scene
Technical Field
The invention relates to the technical field of voice processing, in particular to a real-time voice paragraph tracking method in a complex noise scene.
Background
Engineering practice in speech signal processing must contend with complex noise scenes that include stationary noise, transient noise, time-varying noise, strong noise, and so on, each with different statistical properties. When a close-talking pickup device is used for voice acquisition, voice communication, or speech recognition, background noise is easily picked up by the microphone; it directly degrades voice communication from a listening standpoint and further impairs back-end processing modules such as speech recognition. In a complex noise scene, suppressing the steady-state noise mixed into the speech, masking the other noise types that fall between speech paragraphs, and tracking out clean speech paragraphs can effectively improve the listening quality of voice communication and the performance of back-end modules such as speech recognition. Speech tracking in a single noise scene with known statistical properties is comparatively easy to handle; speech paragraph tracking in a complex noise scene remains a difficult problem.
Disclosure of Invention
The present invention aims to provide a real-time speech paragraph tracking method for complex noise scenes, so as to solve the problems mentioned in the background art.
To achieve this aim, the invention provides the following technical scheme:
a real-time speech paragraph tracking method under a complex noise scene is characterized by comprising the following steps:
A. preprocessing: framing and windowing the input audio signal; taking 16 ms of data as one frame x_i(n), where i is the frame index;
B. calculating the discrete Fourier transform coefficients Y_i(ω_k) of the windowed frame x_i(n), where k is the index of the spectral component;
C. assuming the first L frames to be noise frames, calculating the initial noise power as the arithmetic mean of the Fourier transform magnitude spectrum, λ_d(k) = (1/L)·Σ_{i=1..L} |Y_i(ω_k)|²; assuming the data after the L-th frame to be a noisy signal, calculating the noisy-signal power |Y_i(ω_k)|²;
D. calculating the posterior signal-to-noise ratio γ_k = |Y_i(ω_k)|²/λ_d(k);
E. calculating the prior signal-to-noise ratio with the decision-directed estimate ξ_k = α·|Â(i-1,k)|²/λ_d(k) + (1-α)·max(γ_k - 1, 0), where α is a smoothing constant and Â(i-1,k) is the enhanced amplitude spectrum of the previous frame;
F. voice activity detection;
G. updating a noise spectrum;
H. calculating a gain coefficient;
I. signal reconstruction: calculating the amplitude spectrum and the power spectrum of the enhanced voice of the current frame, and performing inverse Fourier transform on the spectrum of the enhanced voice to obtain a reconstructed signal;
J. calculating the autocorrelation function of the reconstructed signal x̂_i(n): r_t(τ) = Σ_{n=1..N-τ} x̂_i(n)·x̂_i(n+τ), where r_t(τ) is the autocorrelation at delay τ, N is the window length, and 1 ≤ n ≤ N;
K. calculating the difference function d_t(τ) = r_t(0) + r_{t+τ}(0) - 2·r_t(τ), and its cumulative-mean normalization: d'(τ) = 1 for τ = 0, and d'(τ) = d(τ)/[(1/τ)·Σ_{j=1..τ} d(j)] otherwise;
L. judging voiced sound as follows: calculating p = 1 - d'(τ), where p characterizes the probability that the frame clearly contains a fundamental-frequency component; since d'(τ) lies in [0,1], p also lies in [0,1]; with p_th as a threshold, speech frames with p > p_th are retained as voiced;
m, unvoiced sound compensation and noise masking.
As a further scheme of the invention: in step A, the input audio signal is framed and windowed, the window function being a Hamming window: w(n) = 0.54 - 0.46·cos(2πn/(N-1)), 0 ≤ n ≤ N-1.
as a further scheme of the invention: and the step F is specifically to carry out voice activation detection on the input frame and select out the noise frame. According to the posterior signal-to-noise ratio gamma k And a priori signal-to-noise ratio
Figure BDA0002363854900000028
And solving a judgment parameter v for activating voice detection, judging as voice if v is greater than a judgment threshold eta, and judging as noise if v is less than eta, so as to update a noise spectrum. The calculation method of the decision parameter v is as follows.
As a further scheme of the invention: step G specifically comprises, after the noise frames are selected, updating the noise spectrum by recursive smoothing: λ_d(k) = μ·λ_d(k) + (1-μ)·|Y_i(ω_k)|², where μ is a smoothing constant.
as a further scheme of the invention: the step H is specifically as follows: calculating the weighting coefficient of the amplitude spectrum of the current frame according to the posterior signal-to-noise ratio and the prior signal-to-noise ratio:
Figure BDA0002363854900000032
as a further scheme of the invention: the function established in the step I is as follows:
Figure BDA0002363854900000033
as a further scheme of the invention: in the step M, if a certain frame is determined to be voiced and a signal frame within 400 milliseconds after the certain frame is determined to be not voiced, compensation is performed, that is, the signal frame is directly output without being processed; and (4) masking the non-voiced sound frame which does not meet the compensation condition, namely performing amplitude limiting processing and outputting.
Compared with the prior art, the invention has the following beneficial effects: it tracks the speech paragraphs completely, masks the noise outside the speech paragraphs, suppresses the noise superimposed on the speech, and enhances the listening quality of the speech.
Drawings
FIG. 1 is the time-domain waveform of an audio signal in which stationary noise and transient noise are superimposed on the speech, with noise peaks exceeding 60 dB;
FIG. 2 is the time-domain waveform of the signal of FIG. 1 after processing by the present invention;
FIG. 3 is the time-domain waveform of an audio signal in which stationary noise and transient noise are superimposed on the speech, with noise peaks exceeding 110 dB;
FIG. 4 is the time-domain waveform of the signal of FIG. 3 after processing by the present invention;
FIG. 5 is a flowchart of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-5, example 1: in an embodiment of the present invention, a real-time speech paragraph tracking method in a complex noise scene includes the following steps:
A. Preprocessing. The input audio signal is framed and windowed, taking 16 ms (256 samples) of data as one frame x_i(n), where i is the frame index. The data are windowed with a Hamming window: w(n) = 0.54 - 0.46·cos(2πn/(N-1)), 0 ≤ n ≤ N-1.
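The framing and windowing of step A can be sketched as follows. The 16 kHz sampling rate (so that 16 ms gives 256 samples) matches the text; the 50% frame shift is an assumption, since the method does not state one:

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split x into overlapping frames and apply the Hamming window
    w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)) named in step A."""
    n = np.arange(frame_len)
    window = 0.54 - 0.46 * np.cos(2 * np.pi * n / (frame_len - 1))
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return frames * window  # each row is one windowed frame x_i(n)

x = np.random.default_rng(0).standard_normal(16000)  # 1 s of audio at 16 kHz
frames = frame_signal(x)
```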
B. The discrete Fourier transform coefficients Y_i(ω_k) of the windowed frame x_i(n) are computed, where k is the index of the spectral component; in polar form, Y_i(ω_k) = Y_k·exp(jθ_y(k)).
C. The first L frames are assumed to be noise frames, and the initial noise power is computed as the arithmetic mean of the Fourier transform magnitude spectrum: λ_d(k) = (1/L)·Σ_{i=1..L} |Y_i(ω_k)|². The data after the L-th frame are assumed to be a noisy signal, whose power |Y_i(ω_k)|² is computed.
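Steps B and C can be sketched together: transform each frame and average the power of the first L frames to seed the noise estimate. The value L = 10 is an assumption, since the text leaves L unspecified:

```python
import numpy as np

def initial_noise_power(frames, L=10):
    """DFT each frame (step B), then average the power of the first L
    frames, assumed pure noise, to obtain lambda_d(k) (step C)."""
    Y = np.fft.rfft(frames, axis=1)                # Y_i(omega_k)
    lambda_d = np.mean(np.abs(Y[:L]) ** 2, axis=0)
    return Y, lambda_d

frames = np.random.default_rng(1).standard_normal((50, 256))
Y, noise_power = initial_noise_power(frames)
```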
D. The posterior signal-to-noise ratio γ_k = |Y_i(ω_k)|²/λ_d(k) is calculated.
E. The prior signal-to-noise ratio is calculated with the decision-directed estimate ξ_k = α·|Â(i-1,k)|²/λ_d(k) + (1-α)·max(γ_k - 1, 0), where α is a smoothing constant and Â(i-1,k) is the enhanced amplitude spectrum of the previous frame.
F. Voice activity detection. Since the noise may be only short-time stationary, the noise spectrum must be updated in real time to maintain the noise-suppression effect. Voice activity detection is performed on the input frame to select the noise frames. From the posterior signal-to-noise ratio γ_k and the prior signal-to-noise ratio ξ_k, a decision parameter v is derived; if v exceeds the decision threshold η the frame is judged as speech, otherwise it is judged as noise and used to update the noise spectrum. The decision parameter is calculated as v_k = γ_k·ξ_k/(1+ξ_k).
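Steps D through F can be sketched as below. The decision-directed form of ξ and the parameter v_k = γ_k·ξ_k/(1+ξ_k) follow the Ephraim-Malah framework that the exp/expint gain of step H suggests; the constants alpha and eta are assumed values, not taken from the text:

```python
import numpy as np

def snr_and_vad(noisy_power, lambda_d, prev_enh_power, alpha=0.98, eta=0.15):
    """Posterior SNR (step D), decision-directed prior SNR (step E),
    and a frame-level speech/noise decision (step F)."""
    gamma = noisy_power / lambda_d                 # posterior SNR
    xi = alpha * prev_enh_power / lambda_d + (1 - alpha) * np.maximum(gamma - 1, 0)
    v = gamma * xi / (1 + xi)                      # decision parameter
    is_speech = np.mean(v) > eta
    return gamma, xi, v, is_speech

lambda_d = np.ones(129)
gamma, xi, v, is_speech = snr_and_vad(np.full(129, 10.0), lambda_d, np.zeros(129))
```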
G. Noise spectrum update. After a noise frame is selected, the noise spectrum is updated by recursive smoothing: λ_d(k) = μ·λ_d(k) + (1-μ)·|Y_i(ω_k)|², where μ is a smoothing constant.
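A minimal sketch of the step G update for frames the detector judged as noise. The original formula survives only as an image, so this recursive-smoothing form and the constant mu are assumptions:

```python
import numpy as np

def update_noise(lambda_d, noisy_power, mu=0.95):
    # Blend the previous noise estimate with the current noise-frame power.
    return mu * lambda_d + (1 - mu) * noisy_power

updated = update_noise(np.ones(4), np.full(4, 3.0))
```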
H. Gain calculation. The weighting coefficient for the amplitude spectrum of the current frame is computed from the posterior and prior signal-to-noise ratios: G(k) = (ξ_k/(1+ξ_k))·exp((1/2)·expint(v_k)), where exp(·) is the exponential function with base e and expint(·) is the exponential integral.
I. Signal reconstruction. The amplitude spectrum and power spectrum of the enhanced speech of the current frame are computed, the enhanced spectrum being Â_i(ω_k) = G(k)·Y_i(ω_k); the inverse Fourier transform of the enhanced spectrum yields the reconstructed signal x̂_i(n).
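Steps H and I can be sketched as follows. The text names exp(·) and expint(·), which matches the MMSE log-spectral-amplitude gain; treating the gain as exactly that form is an assumption, since the formula itself is shown only as an image:

```python
import numpy as np
from scipy.special import exp1  # exponential integral E1

def lsa_gain(gamma, xi):
    """Step H sketch: log-spectral-amplitude gain from the posterior
    and prior SNRs."""
    v = gamma * xi / (1 + xi)
    return xi / (1 + xi) * np.exp(0.5 * exp1(v))

def enhance_frame(Y, gain):
    """Step I: weight the noisy spectrum and invert to the time domain."""
    return np.fft.irfft(gain * Y)

gain = lsa_gain(np.full(129, 10.0), np.full(129, 9.0))
x_hat = enhance_frame(np.ones(129, dtype=complex), gain)
```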
J. The autocorrelation function of the reconstructed signal x̂_i(n) is computed: r_t(τ) = Σ_{n=1..N-τ} x̂_i(n)·x̂_i(n+τ), where r_t(τ) is the autocorrelation at delay τ, N is the window length, and 1 ≤ n ≤ N.
K. The difference function is computed as d_t(τ) = r_t(0) + r_{t+τ}(0) - 2·r_t(τ), followed by its cumulative-mean normalization: d'(τ) = 1 for τ = 0, and d'(τ) = d(τ)/[(1/τ)·Σ_{j=1..τ} d(j)] otherwise.
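Steps J and K can be sketched with the difference function and its cumulative-mean normalization, as in the YIN pitch detector these formulas appear to follow; the search range tau_max is an assumed value:

```python
import numpy as np

def cmnd(x, tau_max=200):
    """Difference function d(tau) and cumulative-mean-normalized d'(tau).
    d'(tau) dips toward 0 at the fundamental period."""
    N = len(x) - tau_max
    d = np.array([np.sum((x[:N] - x[tau:tau + N]) ** 2) for tau in range(tau_max)])
    dprime = np.ones(tau_max)                       # d'(0) = 1 by definition
    dprime[1:] = d[1:] * np.arange(1, tau_max) / np.cumsum(d[1:])
    return dprime

fs = 16000
x = np.sin(2 * np.pi * 200 * np.arange(1024) / fs)  # 200 Hz tone: 80-sample period
dprime = cmnd(x)
tau_hat = int(np.argmin(dprime[20:]) + 20)          # skip very small lags
```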
L. Voiced sound is judged as follows: p = 1 - d'(τ) is calculated, where p characterizes the probability that the frame clearly contains a fundamental-frequency component. Since d'(τ) lies in [0,1], p also lies in [0,1]. With p_th as a threshold, speech frames with p > p_th are retained as voiced.
M. Unvoiced compensation and noise masking. If a frame is judged voiced, non-voiced signal frames within the following 400 milliseconds are compensated, i.e. output directly without further processing; non-voiced frames that do not meet the compensation condition are masked, i.e. amplitude-limited before output.
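The paragraph-tracking logic of step M can be sketched as a hold window: a voiced frame opens a 400 ms window in which non-voiced frames pass through (compensation), while frames outside any window are masked. Converting 400 ms to 25 frames assumes the 16 ms frame length with no overlap, and masking is shown here as limiting to zero; both are assumptions about details the text leaves open:

```python
import numpy as np

def track_paragraphs(frames, voiced, hold_frames=25):
    """Keep voiced frames and unvoiced frames inside the hold window;
    amplitude-limit everything else (inter-paragraph noise)."""
    out = []
    since_voiced = hold_frames + 1
    for frame, is_voiced in zip(frames, voiced):
        since_voiced = 0 if is_voiced else since_voiced + 1
        if since_voiced <= hold_frames:
            out.append(frame)                  # voiced, or compensated unvoiced
        else:
            out.append(np.zeros_like(frame))   # masked inter-paragraph frame
    return np.array(out)

out = track_paragraphs(np.ones((5, 4)), [True, False, False, False, False],
                       hold_frames=2)
```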
FIG. 2 and FIG. 4 show the audio time-domain waveforms after processing by the method of the invention. Comparison with the original waveforms shows that, against a complex noise background, the method tracks the speech paragraphs completely, masks the noise outside the speech paragraphs, and also suppresses the noise superimposed on the speech, thereby enhancing the listening quality of the speech itself.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment may contain only a single embodiment, and such description is for clarity only, and those skilled in the art should integrate the description, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (5)

1. A real-time speech paragraph tracking method under a complex noise scene is characterized by comprising the following steps:
A. preprocessing: framing and windowing the input audio signal; taking 16 ms of data as one frame x_i(n), where i is the frame index;
B. calculating the discrete Fourier transform coefficients Y_i(ω_k) of the windowed frame x_i(n), where k is the index of the spectral component;
C. assuming the first L frames to be noise frames, calculating the initial noise power as the arithmetic mean of the Fourier transform magnitude spectrum, λ_d(k) = (1/L)·Σ_{i=1..L} |Y_i(ω_k)|²; assuming the data after the L-th frame to be a noisy signal, calculating the noisy-signal power |Y_i(ω_k)|²;
D. calculating the posterior signal-to-noise ratio γ_k = |Y_i(ω_k)|²/λ_d(k);
E. calculating the prior signal-to-noise ratio with the decision-directed estimate ξ_k = α·|Â(i-1,k)|²/λ_d(k) + (1-α)·max(γ_k - 1, 0);
F. voice activity detection; step F specifically comprises: performing voice activity detection on the input frame and selecting the noise frames; from the posterior signal-to-noise ratio γ_k and the prior signal-to-noise ratio ξ_k, deriving a decision parameter v; judging the frame as speech if v is greater than a decision threshold η, and as noise if v is less than η, in which case it is used to update the noise spectrum; the decision parameter is calculated as v_k = γ_k·ξ_k/(1+ξ_k);
G. updating the noise spectrum; step G specifically comprises: after the noise frames are selected, updating the noise spectrum by recursive smoothing: λ_d(k) = μ·λ_d(k) + (1-μ)·|Y_i(ω_k)|²;
H. calculating a gain coefficient;
I. signal reconstruction: calculating the amplitude spectrum and the power spectrum of the enhanced voice of the current frame, and performing inverse Fourier transform on the spectrum of the enhanced voice to obtain a reconstructed signal;
J. calculating the autocorrelation function of the reconstructed signal x̂_i(n): r_t(τ) = Σ_{n=1..N-τ} x̂_i(n)·x̂_i(n+τ), where r_t(τ) is the autocorrelation at delay τ, N is the window length, and 1 ≤ n ≤ N;
K. calculating the difference function d_t(τ) = r_t(0) + r_{t+τ}(0) - 2·r_t(τ), and its cumulative-mean normalization: d'(τ) = 1 for τ = 0, and d'(τ) = d(τ)/[(1/τ)·Σ_{j=1..τ} d(j)] otherwise;
L. judging voiced sound as follows: calculating p = 1 - d'(τ), where p characterizes the probability that the frame clearly contains a fundamental-frequency component; since d'(τ) lies in [0,1], p also lies in [0,1]; with p_th as a threshold, speech frames with p > p_th are retained as voiced;
m, unvoiced sound compensation and noise masking.
2. The real-time speech paragraph tracking method under a complex noise scene according to claim 1, wherein in step A the input audio signal is framed and windowed, the window function being a Hamming window: w(n) = 0.54 - 0.46·cos(2πn/(N-1)), 0 ≤ n ≤ N-1.
3. The real-time speech paragraph tracking method under a complex noise scene according to claim 1, wherein step H specifically comprises calculating the weighting coefficient of the current frame's amplitude spectrum from the posterior and prior signal-to-noise ratios: G(k) = (ξ_k/(1+ξ_k))·exp((1/2)·expint(v_k)).
4. The real-time speech paragraph tracking method under a complex noise scene according to claim 1, wherein the function established in step I is Â_i(ω_k) = G(k)·Y_i(ω_k), the reconstructed signal being obtained by inverse Fourier transform of the enhanced spectrum.
5. The real-time speech paragraph tracking method under a complex noise scene according to claim 1, wherein in step M, if a frame is judged voiced, non-voiced signal frames within the following 400 milliseconds are compensated, i.e. output directly without further processing; non-voiced frames that do not meet the compensation condition are masked, i.e. amplitude-limited before output.
CN202010029721.0A 2020-01-13 2020-01-13 Real-time speech paragraph tracking method under complex noise scene Active CN111261197B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010029721.0A CN111261197B (en) 2020-01-13 2020-01-13 Real-time speech paragraph tracking method under complex noise scene


Publications (2)

Publication Number Publication Date
CN111261197A CN111261197A (en) 2020-06-09
CN111261197B true CN111261197B (en) 2022-11-25

Family

ID=70950451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010029721.0A Active CN111261197B (en) 2020-01-13 2020-01-13 Real-time speech paragraph tracking method under complex noise scene

Country Status (1)

Country Link
CN (1) CN111261197B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1130952A (en) * 1993-09-14 1996-09-11 英国电讯公司 Voice activity detector
CN105845150A (en) * 2016-03-21 2016-08-10 福州瑞芯微电子股份有限公司 Voice enhancement method and system adopting cepstrum to correct
CN107452363A (en) * 2017-07-03 2017-12-08 福建天泉教育科技有限公司 Musical instrument tuner method and system
CN108831504A (en) * 2018-06-13 2018-11-16 西安蜂语信息科技有限公司 Determination method, apparatus, computer equipment and the storage medium of pitch period
CN108831499A (en) * 2018-05-25 2018-11-16 西南电子技术研究所(中国电子科技集团公司第十研究所) Utilize the sound enhancement method of voice existing probability
CN110322898A (en) * 2019-05-28 2019-10-11 平安科技(深圳)有限公司 Vagitus detection method, device and computer readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101136199B (en) * 2006-08-30 2011-09-07 纽昂斯通讯公司 Voice data processing method and equipment
FR3014237B1 (en) * 2013-12-02 2016-01-08 Adeunis R F METHOD OF DETECTING THE VOICE


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A New Pitch Period Detection Method Based on the Hilbert-Huang Transform; Yang Zhihua et al.; Chinese Journal of Computers; 2006-01-12 (No. 01); full text *
Noise-Robust Speaker Recognition Based on Weighted Sub-band Reconstruction of the Voiced-Speech Harmonic Spectrum; Zeng Yumin et al.; Journal of Southeast University (Natural Science Edition); 2008-11-20 (No. 06); full text *

Also Published As

Publication number Publication date
CN111261197A (en) 2020-06-09

Similar Documents

Publication Publication Date Title
EP2360685B1 (en) Noise suppression
Nakatani et al. Robust and accurate fundamental frequency estimation based on dominant harmonic components
CN108831499A (en) Utilize the sound enhancement method of voice existing probability
EP1065656B1 (en) Method for reducing noise in an input speech signal
US20070255535A1 (en) Method of Processing a Noisy Sound Signal and Device for Implementing Said Method
Verteletskaya et al. Noise reduction based on modified spectral subtraction method
CN113539285B (en) Audio signal noise reduction method, electronic device and storage medium
Wolfe et al. Towards a perceptually optimal spectral amplitude estimator for audio signal enhancement
CN114694670A (en) Multi-task network-based microphone array speech enhancement system and method
CN103295580A (en) Method and device for suppressing noise of voice signals
CN112185405B (en) Bone conduction voice enhancement method based on differential operation and combined dictionary learning
Ambikairajah et al. Wavelet transform-based speech enhancement
CN111261197B (en) Real-time speech paragraph tracking method under complex noise scene
Cao et al. Research on noise reduction algorithm based on combination of LMS filter and spectral subtraction
Bahadur et al. Performance measurement of a hybrid speech enhancement technique
Hamid et al. Speech enhancement using EMD based adaptive soft-thresholding (EMD-ADT)
Rao et al. Speech enhancement using sub-band cross-correlation compensated Wiener filter combined with harmonic regeneration
Graupe et al. Blind adaptive filtering of speech from noise of unknown spectrum using a virtual feedback configuration
Srinivas et al. A classification-based non-local means adaptive filtering for speech enhancement and its FPGA prototype
Islam et al. Speech enhancement in adverse environments based on non-stationary noise-driven spectral subtraction and snr-dependent phase compensation
CN112750451A (en) Noise reduction method for improving voice listening feeling
Upadhyay et al. Recursive noise estimation-based Wiener filtering for monaural speech enhancement
Zengyuan et al. A speech denoising algorithm based on harmonic regeneration
CN117995215B (en) Voice signal processing method and device, computer equipment and storage medium
Gbadamosi et al. Development of non-parametric noise reduction algorithm for GSM voice signal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant