CN111508514A - Single-channel speech enhancement algorithm based on compensation phase spectrum - Google Patents


Publication number
CN111508514A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010278564.7A
Other languages
Chinese (zh)
Inventor
张晓如
许清臣
张再跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University of Science and Technology
Original Assignee
Jiangsu University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-04-10
Filing date
2020-04-10
Publication date
2020-08-07
Application filed by Jiangsu University of Science and Technology
Priority to CN202010278564.7A
Publication of CN111508514A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0324 Details of processing therefor

Abstract

The invention provides a single-channel speech enhancement algorithm based on a compensated phase spectrum, comprising the following steps: preprocessing the noisy speech signal, then framing and windowing it; performing a Fourier transform; dividing the spectrum into critical bands using the ERB scale; calculating the segmental signal-to-noise ratio; calculating a new compensation factor while obtaining a preliminarily enhanced complex speech spectrum by power spectral subtraction; adding the phase spectrum compensation function to the preliminarily enhanced complex speech spectrum to obtain a compensated complex spectrum; taking the phase angle of the compensated complex spectrum to obtain a compensated phase spectrum; and combining the speech magnitude spectrum produced by basic spectral subtraction with the compensated phase spectrum, then applying the inverse Fourier transform with overlap-add to obtain the enhanced speech signal. The improved compensation factor is shown to be effective for stationary noise and even more beneficial for non-stationary noise, so the speech enhancement algorithm applies effectively to a wider range of noise environments.

Description

Single-channel speech enhancement algorithm based on compensation phase spectrum
Technical Field
The invention relates to the field of speech enhancement algorithms, in particular to a single-channel speech enhancement algorithm based on a compensation phase spectrum.
Background
In the conventional phase spectrum compensation speech enhancement algorithm, the clean speech s(t) is assumed to be contaminated by stationary additive Gaussian noise d(t), with the two mutually independent. The time-domain representation of the noisy speech x(t) is then:
$$x(t) = s(t) + d(t) \qquad (1)$$
Applying the short-time Fourier transform to formula (1) gives the frequency-domain expression:
$$X(n,k) = \sum_{m=-\infty}^{\infty} x(m)\, w(n-m)\, e^{-j 2\pi k m / N} \qquad (2)$$
where n is the frame index, N is the discrete Fourier transform length, k is the frequency bin index, w(n) is the window function, and j is the imaginary unit.
For convenience, formula (2) is written in polar form:
$$X(n,k) = |X(n,k)|\, e^{j\angle X(n,k)} \qquad (3)$$
where |X(n,k)| is the magnitude spectrum of the noisy speech signal and ∠X(n,k) is its phase spectrum.
In the conventional phase spectrum compensation algorithm, the phase spectrum compensation function is defined as:
$$\Lambda(n,k) = \lambda\, \Psi(k)\, |\hat{D}(n,k)| \qquad (4)$$
In formula (4), λ is the compensation factor, for which the prior art uses the empirical constant 3.74 obtained from a large number of experiments; $|\hat{D}(n,k)|$ is the noise estimate obtained from the first few frames of the noisy speech; and $\Psi(k)$ is a decision function, expressed as:
$$\Psi(k) = \begin{cases} 1, & 0 < k/N < 0.5 \\ -1, & 0.5 < k/N < 1 \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$
The phase spectrum compensation function is then added to the spectrum of the noisy speech signal to obtain the compensated spectrum:
$$X_\Lambda(n,k) = X(n,k) + \Lambda(n,k) \qquad (6)$$
Taking the phase angle of the compensated spectrum gives the compensated phase spectrum:
$$\angle X_\Lambda(n,k) = \arctan\left(\frac{\mathrm{Im}\{X_\Lambda(n,k)\}}{\mathrm{Re}\{X_\Lambda(n,k)\}}\right) \qquad (7)$$
where Im{·} and Re{·} take the imaginary and real parts of X_Λ(n,k), respectively.
Finally, the magnitude spectrum of the noisy speech is combined with the compensated phase spectrum to obtain the enhanced speech spectrum:
$$\hat{S}(n,k) = |X(n,k)|\, e^{j\angle X_\Lambda(n,k)} \qquad (8)$$
Because the speech signal is real, its short-time Fourier transform is conjugate symmetric: the magnitude spectrum is symmetric and the phase spectrum is antisymmetric. The inverse short-time Fourier transform used in speech synthesis correspondingly sums the conjugate terms to form a real signal.
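The conventional compensation of formulas (4) to (8) can be sketched for a single frame as follows. This is a minimal illustration, not the patent's implementation: the function name `psc_frame` and its argument layout are assumptions, and the four-quadrant `np.arctan2` is used in place of a plain arctangent so that the phase is resolved over the full circle.

```python
import numpy as np

def psc_frame(X, noise_mag, lam=3.74):
    """Conventional phase spectrum compensation for one full-length DFT frame.

    X         : complex DFT of one windowed noisy frame (all N bins)
    noise_mag : estimated noise magnitude spectrum, length N
    lam       : compensation factor (3.74 is the fixed value cited in the text)
    """
    N = len(X)
    k = np.arange(N)
    # Decision function of formula (5): +1 below N/2, -1 above, 0 at DC and N/2.
    psi = np.zeros(N)
    psi[(k > 0) & (k < N / 2)] = 1.0
    psi[(k > N / 2) & (k < N)] = -1.0
    comp = lam * psi * noise_mag                      # formula (4)
    X_comp = X + comp                                 # formula (6)
    phase = np.arctan2(X_comp.imag, X_comp.real)      # formula (7), four-quadrant
    return np.abs(X) * np.exp(1j * phase)             # formula (8)
```

Only the phase changes: the output keeps the noisy magnitude |X(n,k)|, and with a zero noise estimate the frame is returned unchanged.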
In conventional speech enhancement algorithms, the noisy phase spectrum is usually retained and combined with the processed speech magnitude spectrum. The compensation factor in the conventional phase spectrum compensation algorithm is fixed at the empirical value 3.74, so it cannot compensate the phase spectrum of the noisy speech flexibly. The background noise in noisy speech changes constantly, and a fixed compensation factor cannot produce a phase spectrum that tracks those changes. As a result, the detail in the synthesized speech phase spectrum is inaccurate and the quality of the enhanced speech is low.
Because the compensation factor in the prior-art phase spectrum compensation function is fixed, the noisy speech cannot be compensated to different degrees as the noise changes. When the noise is strong and the signal-to-noise ratio is low, noise removal is unsatisfactory and a large amount of residual noise remains, so the prior-art phase spectrum compensation algorithm cannot achieve good enhanced speech quality. To remedy this drawback, the present invention improves the compensation factor in the phase spectrum compensation function.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a single-channel speech enhancement algorithm based on a compensated phase spectrum whose improved compensation factor is shown to be effective for stationary noise and even more beneficial for non-stationary noise, remedying the weakness of the prior art on non-stationary noise so that the speech enhancement algorithm applies effectively to a wider range of noise environments.
To solve the above technical problem, an embodiment of the present invention provides a single-channel speech enhancement algorithm based on a compensated phase spectrum, comprising the following steps:
Step one: preprocess the noisy speech signal x(n), then frame and window it;
Step two: perform a Fourier transform to obtain the complex spectrum X(n,k) = |X(n,k)| e^{j∠X(n,k)},
where |X(n,k)| is the magnitude spectrum of the noisy speech signal and ∠X(n,k) is the noisy speech phase spectrum;
Step three: divide the spectrum into critical bands using the ERB scale;
Step four: calculate the segmental signal-to-noise ratio;
Step five: calculate the new compensation factor, then calculate the improved phase spectrum compensation function Λ'(n,k), while obtaining the preliminarily enhanced complex speech spectrum by power spectral subtraction;
Step six: add the phase spectrum compensation function to the preliminarily enhanced complex speech spectrum to obtain the compensated complex spectrum;
Step seven: take the phase angle of the compensated complex spectrum to obtain the compensated phase spectrum;
Step eight: combine the speech magnitude spectrum produced by basic spectral subtraction with the compensated phase spectrum obtained in step seven, then apply the inverse Fourier transform with overlap-add to obtain the enhanced speech signal.
The segmental signal-to-noise ratio in step four is defined as:
$$\mathrm{SNRseg} = \frac{10}{M} \sum_{m=0}^{M-1} \log_{10} \frac{\sum_{n=Nm}^{Nm+N-1} x^2(n)}{\sum_{n=Nm}^{Nm+N-1} \left(x(n) - \hat{x}(n)\right)^2} \qquad (I)$$
where x(n) is the original (clean) speech signal, $\hat{x}(n)$ is the enhanced speech signal, N is the frame length, and M is the number of frames in the signal. Because it averages the per-frame signal-to-noise ratios in the logarithmic domain, the segmental signal-to-noise ratio in step four is a geometric mean over all speech frames.
In step five, the compensation factor in the phase spectrum compensation function is given by formula (II):
[Formula (II), image not reproduced: the compensation factor expressed in terms of c and SNRseg'_i]
where c = 2.5 is a fixed empirical value obtained from a large amount of experimental data, and SNRseg'_i is the segmental signal-to-noise ratio of the i-th band, given by formula (III):
[Formula (III), image not reproduced: the segmental signal-to-noise ratio of the i-th band]
The numerator and denominator of formula (III) are the noisy speech power spectrum and the estimated noise power spectrum in the i-th band, respectively, and K is the total number of bands in the band division.
Substituting formula (II) into the phase spectrum compensation function gives:
[Formula, image not reproduced: the improved phase spectrum compensation function Λ'(n,k)]
In step six, the new phase spectrum compensation function Λ'(n,k) is added to the complex spectrum obtained by ERB-scale multi-band spectral subtraction to obtain the new compensated complex spectrum:
[Formula, image not reproduced: the new compensated complex spectrum]
In step seven, the phase angle of the compensated complex spectrum is taken to obtain the new compensated phase spectrum:
[Formula, image not reproduced: the new compensated phase spectrum]
In step eight, the speech magnitude spectrum produced by basic spectral subtraction is combined with the compensated phase spectrum obtained in step seven, and the inverse Fourier transform with overlap-add is applied to obtain the enhanced speech signal:
[Formula, image not reproduced: the enhanced speech signal]
The technical solution of the invention has the following beneficial effects:
1. The signal-to-noise ratio improvement obtained by the algorithm is better than that of the prior art in white Gaussian noise, and in car noise, airport noise, and babble noise its results become the best as the input signal-to-noise ratio increases. This shows that the improved compensation factor contributes to the estimation of non-stationary noise and adapts the compensation to changes in the noise.
2. The improved compensation factor is shown to be effective for stationary noise and even more beneficial for non-stationary noise, which remedies the weakness of the prior art on non-stationary noise and makes the speech enhancement algorithm applicable to a wider range of noise environments.
3. The residual noise and speech distortion of the speech enhanced by the algorithm are acceptable, which is consistent with the PESQ results.
4. The algorithm not only applies the phase spectrum compensation function on the ERB scale but also improves the compensation factor, so more noise components are removed from noisy speech in non-stationary noise environments and the speech quality is better.
5. The improved compensation factor removes background noise while reducing speech distortion. Overall, the speech quality after enhancement by the algorithm is better than that of the prior art.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 shows the signal-to-noise ratio improvement for different noise types in the present invention;
FIG. 3 shows the PESQ values for different noise types in the present invention;
FIG. 4 shows the speech spectrograms in the present invention;
FIG. 5 shows the MOS scoring test results for residual noise in the present invention;
FIG. 6 shows the MOS scoring test results for speech distortion in the present invention.
Detailed Description
To make the technical problem, technical solution, and advantages of the present invention clearer, a detailed description follows with reference to the accompanying drawings and specific embodiments.
The invention first divides the Fourier-transformed complex spectrum of the noisy speech into bands on the ERB scale and performs power spectral subtraction independently in each band to obtain a preliminarily enhanced speech signal. It then applies the improved phase spectrum compensation to the preliminarily enhanced speech signal, as described in detail below.
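The ERB-scale band division can be illustrated with a short sketch. The text does not state the number of bands or the exact mapping, so the following are assumptions: the commonly used Glasberg and Moore ERB-rate formula, 21.4·log10(1 + 0.00437·f), a hypothetical `num_bands` parameter, and band edges rounded to the nearest DFT bin.

```python
import numpy as np

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg and Moore form)."""
    return 21.4 * np.log10(1.0 + 0.00437 * f_hz)

def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437

def erb_band_edges(num_bands, fs=8000, n_fft=256):
    """Return num_bands + 1 DFT-bin edges spaced uniformly on the ERB-rate scale.

    The bands cover 0 Hz to the Nyquist frequency fs / 2. num_bands is a
    hypothetical parameter; the source text does not give its value.
    """
    erb_edges = np.linspace(0.0, hz_to_erb_rate(fs / 2.0), num_bands + 1)
    hz_edges = erb_rate_to_hz(erb_edges)
    return np.round(hz_edges / fs * n_fft).astype(int)
```

With `edges = erb_band_edges(8)`, band i covers bins `edges[i]` up to but not including `edges[i + 1]`; low bands are narrow and high bands are wide. If many bands are requested, adjacent low-frequency edges can round to the same bin, so a caller would need to merge or drop empty bands.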
The invention provides a single-channel speech enhancement algorithm based on a compensated phase spectrum. Its flow is shown in FIG. 1 and comprises the following steps:
Step one: preprocess the noisy speech signal x(n), then frame and window it;
Step two: perform a Fourier transform to obtain the complex spectrum X(n,k) = |X(n,k)| e^{j∠X(n,k)},
where |X(n,k)| is the magnitude spectrum of the noisy speech signal and ∠X(n,k) is the phase spectrum of the noisy speech;
Step three: divide the spectrum into critical bands using the ERB scale;
Step four: calculate the segmental signal-to-noise ratio.
The invention improves the compensation factor so that it adjusts flexibly to the noise intensity. The segmental signal-to-noise ratio measures the noise level of the speech signal and can be calculated in either the time domain or the frequency domain. It is defined as:
$$\mathrm{SNRseg} = \frac{10}{M} \sum_{m=0}^{M-1} \log_{10} \frac{\sum_{n=Nm}^{Nm+N-1} x^2(n)}{\sum_{n=Nm}^{Nm+N-1} \left(x(n) - \hat{x}(n)\right)^2} \qquad (I)$$
where x(n) is the original (clean) speech signal, $\hat{x}(n)$ is the enhanced speech signal, N is the frame length, and M is the number of frames in the signal. Because it averages the per-frame signal-to-noise ratios in the logarithmic domain, the segmental signal-to-noise ratio in step four is a geometric mean over all speech frames.
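The time-domain segmental signal-to-noise ratio described above can be sketched as follows. This assumes the standard definition (mean over frames of the per-frame SNR in dB, computed on non-overlapping frames); the function name and the small `eps` guard against division by zero are illustrative additions.

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=256):
    """Mean over frames of the per-frame SNR in dB (non-overlapping frames)."""
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    num_frames = min(len(clean), len(enhanced)) // frame_len
    eps = 1e-12  # illustrative guard against log(0) and division by zero
    frame_snrs = []
    for m in range(num_frames):
        seg = slice(m * frame_len, (m + 1) * frame_len)
        signal_energy = np.sum(clean[seg] ** 2)
        error_energy = np.sum((clean[seg] - enhanced[seg]) ** 2)
        frame_snrs.append(10.0 * np.log10((signal_energy + eps) / (error_energy + eps)))
    return float(np.mean(frame_snrs))
```

Averaging the dB values is what makes this a geometric rather than arithmetic mean of the per-frame energy ratios, so a few very loud frames do not dominate the score.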
Step five: calculate the new compensation factor, then calculate the improved phase spectrum compensation function Λ'(n,k), while obtaining the preliminarily enhanced complex speech spectrum by power spectral subtraction.
In this step, the compensation factor in the phase spectrum compensation function is given by formula (II):
[Formula (II), image not reproduced: the compensation factor expressed in terms of c and SNRseg'_i]
where c = 2.5 is a fixed empirical value obtained from a large amount of experimental data, and SNRseg'_i is the segmental signal-to-noise ratio of the i-th band, given by formula (III):
[Formula (III), image not reproduced: the segmental signal-to-noise ratio of the i-th band]
The numerator and denominator of formula (III) are the noisy speech power spectrum and the estimated noise power spectrum in the i-th band, respectively, and K is the total number of bands in the band division.
Substituting formula (II) into the phase spectrum compensation function gives:
[Formula, image not reproduced: the improved phase spectrum compensation function Λ'(n,k)]
The new phase spectrum compensation function behaves as follows. If the current frame is a speech frame, the segmental signal-to-noise ratio is large, so by the properties of the exponential function the compensation factor decreases and the phase spectrum compensation applied to the noisy speech is reduced. If the current frame is a noise frame, the segmental signal-to-noise ratio is small, so the compensation factor increases and the compensation applied to the phase spectrum of the noisy speech is strengthened, which removes the background noise.
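The qualitative behaviour described above can be illustrated with a sketch. The exact forms of formulas (II) and (III) are not legible in this text, so everything here is an assumption: the per-band SNR is taken as the dB ratio of summed noisy power to summed noise power, and the compensation factor is modelled as a decaying exponential `c * exp(-decay * SNR)` with a hypothetical `decay` parameter. It is not the patent's formula.

```python
import numpy as np

def band_snr_db(noisy_power, noise_power, edges):
    """Per-band SNR in dB from noisy and estimated-noise power spectra.

    edges holds bin indices; band i covers edges[i] up to edges[i + 1].
    ASSUMED form: 10 * log10(sum of noisy power / sum of noise power).
    """
    eps = 1e-12  # illustrative guard against division by zero
    snr = np.empty(len(edges) - 1)
    for i in range(len(edges) - 1):
        num = np.sum(noisy_power[edges[i]:edges[i + 1]])
        den = np.sum(noise_power[edges[i]:edges[i + 1]])
        snr[i] = 10.0 * np.log10((num + eps) / (den + eps))
    return snr

def adaptive_factor(snr_db, c=2.5, decay=0.1):
    """ASSUMED placeholder for formula (II): larger SNR gives a smaller factor.

    c = 2.5 is the empirical constant from the text; decay is hypothetical.
    """
    return c * np.exp(-decay * np.asarray(snr_db, dtype=float))
```

Under these assumptions a band dominated by speech (high SNR) receives little phase compensation, while a band dominated by noise (low SNR) receives a factor close to c.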
Step six: add the new phase spectrum compensation function Λ'(n,k) to the complex spectrum obtained by ERB-scale multi-band spectral subtraction to obtain the new compensated complex spectrum:
[Formula, image not reproduced: the new compensated complex spectrum]
Step seven: take the phase angle of the compensated complex spectrum to obtain the new compensated phase spectrum:
[Formula, image not reproduced: the new compensated phase spectrum]
Step eight: combine the speech magnitude spectrum produced by basic spectral subtraction with the compensated phase spectrum obtained in step seven, then apply the inverse Fourier transform with overlap-add to obtain the enhanced speech signal:
[Formula, image not reproduced: the enhanced speech signal]
Basic spectral subtraction in this step means that spectral subtraction, which is a complete algorithm in its own right, is first applied to the noisy speech signal, and the resulting magnitude spectrum is then used. The spectral subtraction is carried out after the Fourier transform.
On the notation used in the algorithm: the time-domain signal before preprocessing is written with a lowercase x; after the Fourier transform it becomes a frequency-domain signal written with an uppercase X; and after the inverse Fourier transform it is converted back to a time-domain signal written with a lowercase s.
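Steps five to eight can be sketched for one frame as follows. This is an illustrative outline under stated assumptions, not the patent's implementation: the power spectral subtraction is floored at zero, the compensation factor `lam` is passed in (scalar or per-bin array) rather than computed, and the function names are hypothetical.

```python
import numpy as np

def enhance_frame(X, noise_mag, lam):
    """Spectral subtraction magnitude combined with a compensated phase.

    X         : complex DFT of one windowed noisy frame (all N bins)
    noise_mag : estimated noise magnitude spectrum, length N
    lam       : compensation factor, scalar or length-N array
    """
    N = len(X)
    # Power spectral subtraction; negative results are floored at zero (assumption).
    mag = np.sqrt(np.maximum(np.abs(X) ** 2 - noise_mag ** 2, 0.0))
    prelim = mag * np.exp(1j * np.angle(X))      # preliminarily enhanced spectrum
    k = np.arange(N)
    psi = np.where((k > 0) & (k < N / 2), 1.0, np.where(k > N / 2, -1.0, 0.0))
    comp_phase = np.angle(prelim + lam * psi * noise_mag)   # steps six and seven
    S = mag * np.exp(1j * comp_phase)            # step eight: magnitude plus phase
    # The modified spectrum is no longer conjugate symmetric; keep the real part.
    return np.real(np.fft.ifft(S))

def overlap_add(frames, hop):
    """Sum time-domain frames (rows) shifted by hop samples each."""
    frame_len = frames.shape[1]
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out
```

A caller would apply `enhance_frame` to every row of the short-time spectrum, stack the results, and pass them to `overlap_add` with the analysis hop size.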
The improved algorithm provided by the invention is compared with the prior art in simulation experiments, which show the advantages and effectiveness of the proposed speech enhancement algorithm more directly.
The speech data used in the experiments come from the NOIZEUS corpus. White Gaussian noise, car noise, airport noise, and babble noise are added at different signal-to-noise ratios (0 dB, 2 dB, 4 dB, 6 dB, 8 dB, and 10 dB) to generate the noisy speech used for performance evaluation. The speech is sampled at 8 kHz with 16-bit precision.
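Generating noisy speech at a chosen input signal-to-noise ratio can be sketched as below. The text does not describe how the mixing was done, so this is an assumed, commonly used approach: the noise is scaled so that the ratio of average clean power to average noise power matches the target. The function name is hypothetical, and the noise is assumed to be at least as long as the clean signal.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale noise so that clean + noise has the requested global SNR in dB."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)[:len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain g satisfies clean_power / (g^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise
```

Calling it once per value in (0, 2, 4, 6, 8, 10) reproduces the set of input conditions listed above for a given clean utterance and noise recording.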
The performance of the speech enhancement algorithm is evaluated in terms of objective signal-to-noise ratio improvement, perceptual evaluation of speech quality (PESQ), spectrogram comparison, and MOS scoring. To allow direct comparison between algorithms, the basic short-time Fourier transform parameters are kept the same: (1) the speech frame length is 256 samples, i.e. 32 ms; (2) adjacent frames overlap by 50%; (3) the window function is a Hamming window.
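The analysis settings listed above can be sketched as a framing routine. The function name is an assumption, and the signal is assumed to contain at least one full frame; trailing samples that do not fill a frame are dropped in this sketch.

```python
import numpy as np

def stft_frames(x, frame_len=256, overlap=0.5):
    """Short-time spectrum with a Hamming window: one row per frame."""
    x = np.asarray(x, dtype=float)
    hop = int(frame_len * (1.0 - overlap))       # 128 samples at 50% overlap
    window = np.hamming(frame_len)
    num_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(num_frames)])
    return np.fft.fft(frames, axis=1)            # shape (num_frames, frame_len)
```

At the 8 kHz sampling rate, 256 samples correspond to 32 ms and the 128-sample hop to 16 ms, so one second of speech yields 61 frames.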
1. Signal-to-noise ratio improvement
FIG. 2 shows the signal-to-noise ratio improvement at different input signal-to-noise ratios for different background noise types: FIG. 2(a) white Gaussian noise; FIG. 2(b) car noise; FIG. 2(c) airport noise; FIG. 2(d) babble noise.
FIG. 2 shows that the signal-to-noise ratio improvement obtained by the algorithm of the invention is better than that of the prior art in white Gaussian noise, and that in the other three background noise environments its results become the best as the input signal-to-noise ratio increases. This shows that the improved compensation factor contributes to the estimation of non-stationary noise and adapts the compensation to changes in the noise.
2. PESQ value
FIG. 3 shows the PESQ values at different input signal-to-noise ratios for different background noise types: FIG. 3(a) white Gaussian noise; FIG. 3(b) car noise; FIG. 3(c) airport noise; FIG. 3(d) babble noise.
As shown in FIG. 3, the PESQ values obtained by the algorithm of the invention in all four background noise environments are clearly better than those of the prior art. In white Gaussian noise, the algorithm gives the best result at both low and high signal-to-noise ratios. In the other three, non-stationary, noise environments, the PESQ value improves steadily as the signal-to-noise ratio increases and remains clearly better than the prior art. The improved compensation factor is therefore effective: it works for stationary noise and is even more beneficial for non-stationary noise, remedying the weakness of the prior art on non-stationary noise, so the speech enhancement algorithm applies effectively to a wider range of noise environments.
3. Spectrogram
FIG. 4 shows spectrograms for one example of noisy speech, babble noise at an input signal-to-noise ratio of 0 dB, comparing the performance of the invention with the prior art: FIG. 4(a) clean speech; FIG. 4(b) noisy speech with babble noise at a signal-to-noise ratio of 0 dB; FIG. 4(c) the prior art; FIG. 4(d) the algorithm of the invention.
Comparing the spectrograms in FIG. 4 shows that the algorithm of the invention gives the best denoising and the least speech distortion. The spectrogram in FIG. 4(c) has very obvious residual noise and some speech distortion. For example, between about 1 s and 1.5 s the speech structure enhanced by the invention is closer to the clean speech, whereas the structure produced by the prior-art algorithm is heavily distorted. This shows that the residual noise and speech distortion of the speech enhanced by the algorithm of the invention are acceptable, which is consistent with the PESQ results.
4. MOS scoring
To evaluate the enhancement performance more comprehensively, an informal subjective listening test was carried out in addition to the three evaluations above. The test audio was speech corrupted by white Gaussian noise and by babble noise, both at an input signal-to-noise ratio of 0 dB. MOS scores are reported for residual noise and for speech distortion; the test results are shown in FIG. 5 and FIG. 6.
In FIG. 5, the residual noise result of the algorithm of the invention is better than that of the prior art in both the white Gaussian noise and the babble noise environments. The compensation factor of the prior art is fixed and cannot adapt to changes in the noise, which is reflected in the babble noise test results. The algorithm of the invention not only applies the phase spectrum compensation function on the ERB scale but also improves the compensation factor, so more noise components are removed from noisy speech in non-stationary noise environments and the speech quality is better.
FIG. 6 shows that the algorithm of the invention achieves the best speech distortion scores compared with the prior art in both noise environments. Although the speech distortion scores are not as high as the speech quality scores, the speech enhanced by the algorithm of the invention is the most acceptable and comfortable to listen to, which shows that the improved compensation factor removes background noise while reducing speech distortion. Overall, the speech quality after enhancement by the algorithm is better than that of the prior art.
The foregoing is a preferred embodiment of the present invention. Those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements shall also fall within the protection scope of the present invention.

Claims (4)

1. A single-channel speech enhancement algorithm based on a compensated phase spectrum, characterized by comprising the following steps:
step one: preprocessing the noisy speech signal x(n), then framing and windowing it;
step two: performing a Fourier transform to obtain the complex spectrum X(n,k) = |X(n,k)| e^{j∠X(n,k)},
wherein |X(n,k)| is the magnitude spectrum of the noisy speech signal and ∠X(n,k) is the noisy speech phase spectrum;
step three: dividing the spectrum into critical bands using the ERB scale;
step four: calculating the segmental signal-to-noise ratio;
step five: calculating a new compensation factor, then calculating an improved phase spectrum compensation function Λ'(n,k), while obtaining a preliminarily enhanced complex speech spectrum by power spectral subtraction;
step six: adding the phase spectrum compensation function to the preliminarily enhanced complex speech spectrum to obtain a compensated complex spectrum;
step seven: taking the phase angle of the compensated complex spectrum to obtain a compensated phase spectrum;
step eight: combining the speech magnitude spectrum produced by basic spectral subtraction with the compensated phase spectrum obtained in step seven, then applying the inverse Fourier transform with overlap-add to obtain the enhanced speech signal.
2. The single-channel speech enhancement algorithm based on a compensated phase spectrum of claim 1, wherein the segmental signal-to-noise ratio in step four is defined as:
$$\mathrm{SNRseg} = \frac{10}{M} \sum_{m=0}^{M-1} \log_{10} \frac{\sum_{n=Nm}^{Nm+N-1} x^2(n)}{\sum_{n=Nm}^{Nm+N-1} \left(x(n) - \hat{x}(n)\right)^2} \qquad (I)$$
wherein x(n) is the original speech signal, $\hat{x}(n)$ is the enhanced speech signal, N is the frame length, and M is the number of frames in the signal.
3. The single-channel speech enhancement algorithm based on a compensated phase spectrum of claim 1, wherein in step five the compensation factor in the phase spectrum compensation function is given by formula (II):
[Formula (II), image not reproduced: the compensation factor expressed in terms of c and SNRseg'_i]
wherein c = 2.5 is a fixed empirical value, and SNRseg'_i is the segmental signal-to-noise ratio of the i-th band, given by formula (III):
[Formula (III), image not reproduced: the segmental signal-to-noise ratio of the i-th band]
the numerator and denominator of formula (III) being the noisy speech power spectrum and the estimated noise power spectrum in the i-th band, respectively, and K being the total number of bands in the band division;
substituting formula (II) into the phase spectrum compensation function gives:
[Formula, image not reproduced: the improved phase spectrum compensation function Λ'(n,k)]
4. The single-channel speech enhancement algorithm based on a compensated phase spectrum of claim 1, wherein in step six the new phase spectrum compensation function Λ'(n,k) is added to the complex spectrum obtained by ERB-scale multi-band spectral subtraction to obtain a new compensated complex spectrum:
[Formula, image not reproduced: the new compensated complex spectrum]
in step seven, the phase angle of the compensated complex spectrum is taken to obtain a new compensated phase spectrum:
[Formula, image not reproduced: the new compensated phase spectrum]
and in step eight, the speech magnitude spectrum produced by basic spectral subtraction is combined with the compensated phase spectrum obtained in step seven, and the inverse Fourier transform with overlap-add is applied to obtain the enhanced speech signal:
[Formula, image not reproduced: the enhanced speech signal]

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010278564.7A CN111508514A (en) 2020-04-10 2020-04-10 Single-channel speech enhancement algorithm based on compensation phase spectrum

Publications (1)

Publication Number Publication Date
CN111508514A true CN111508514A (en) 2020-08-07

Family

ID=71869231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010278564.7A Pending CN111508514A (en) 2020-04-10 2020-04-10 Single-channel speech enhancement algorithm based on compensation phase spectrum

Country Status (1)

Country Link
CN (1) CN111508514A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112533120A (en) * 2020-11-23 2021-03-19 北京声加科技有限公司 Beam forming method and device based on dynamic compression of noisy speech signal magnitude spectrum
CN114333882A (en) * 2022-03-09 2022-04-12 深圳市友杰智新科技有限公司 Voice noise reduction method, device and equipment based on amplitude spectrum and storage medium
WO2023045779A1 (en) * 2021-09-24 2023-03-30 北京字跳网络技术有限公司 Audio denoising method and apparatus, device and storage medium
CN116052706A (en) * 2023-03-30 2023-05-02 苏州清听声学科技有限公司 Low-complexity voice enhancement method based on neural network
WO2023197955A1 (en) * 2022-04-11 2023-10-19 维沃移动通信有限公司 Signal processing method and apparatus, electronic device, and medium
CN117037788A (en) * 2023-09-11 2023-11-10 南京申瑞电力电子有限公司 Control cabinet information display device based on voice control

Citations (5)

Publication number Priority date Publication date Assignee Title
SE9500321D0 (en) * 1995-01-30 1995-01-30 Ericsson Telefon Ab L M Method of radio communication system
FI955947A0 (en) * 1995-12-12 1995-12-12 Nokia Mobile Phones Ltd Noise suppressor and method for suppressing background noise in noisy speech, and a mobile telephone
CN103021420A (en) * 2012-12-04 2013-04-03 中国科学院自动化研究所 Speech enhancement method of multi-sub-band spectral subtraction based on phase adjustment and amplitude compensation
CN108735229A (en) * 2018-06-12 2018-11-02 华南理工大学 A kind of amplitude based on noise Ratio Weighted and phase combining compensation anti-noise sound enhancement method and realization device
CN108735213A (en) * 2018-05-29 2018-11-02 太原理工大学 A kind of sound enhancement method and system based on phase compensation

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
SE9500321D0 (en) * 1995-01-30 1995-01-30 Ericsson Telefon Ab L M Method of radio communication system
FI955947A0 (en) * 1995-12-12 1995-12-12 Nokia Mobile Phones Ltd Noise suppressor and method for suppressing background noise in noisy speech, and a mobile telephone
CN103021420A (en) * 2012-12-04 2013-04-03 中国科学院自动化研究所 Speech enhancement method of multi-sub-band spectral subtraction based on phase adjustment and amplitude compensation
CN108735213A (en) * 2018-05-29 2018-11-02 太原理工大学 A kind of sound enhancement method and system based on phase compensation
CN108735229A (en) * 2018-06-12 2018-11-02 华南理工大学 A kind of amplitude based on noise Ratio Weighted and phase combining compensation anti-noise sound enhancement method and realization device

Non-Patent Citations (3)

Title
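Wang Hu et al.: "Speech enhancement algorithm based on a sparse low-rank model and phase spectrum compensation", Computer Engineering and Applications *
Ruan Zhenyi et al.: "Improved log-MMSE speech enhancement algorithm with phase compensation", Audio Engineering *
Chen Ziqiang et al.: "Modulation-domain spectral subtraction combined with phase spectrum compensation", Journal of Signal Processing *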

Cited By (7)

Publication number Priority date Publication date Assignee Title
CN112533120A (en) * 2020-11-23 2021-03-19 北京声加科技有限公司 Beam forming method and device based on dynamic compression of noisy speech signal magnitude spectrum
WO2023045779A1 (en) * 2021-09-24 2023-03-30 北京字跳网络技术有限公司 Audio denoising method and apparatus, device and storage medium
CN114333882A (en) * 2022-03-09 2022-04-12 深圳市友杰智新科技有限公司 Voice noise reduction method, device and equipment based on amplitude spectrum and storage medium
CN114333882B (en) * 2022-03-09 2022-08-19 深圳市友杰智新科技有限公司 Voice noise reduction method, device and equipment based on amplitude spectrum and storage medium
WO2023197955A1 (en) * 2022-04-11 2023-10-19 维沃移动通信有限公司 Signal processing method and apparatus, electronic device, and medium
CN116052706A (en) * 2023-03-30 2023-05-02 苏州清听声学科技有限公司 Low-complexity voice enhancement method based on neural network
CN117037788A (en) * 2023-09-11 2023-11-10 南京申瑞电力电子有限公司 Control cabinet information display device based on voice control

Similar Documents

Publication Publication Date Title
CN111508514A (en) Single-channel speech enhancement algorithm based on compensation phase spectrum
CN108735213B (en) Voice enhancement method and system based on phase compensation
US8477963B2 (en) Method, apparatus, and computer program for suppressing noise
US10811026B2 (en) Noise suppression method, device, and program
EP1349148B1 (en) Method and apparatus for noise estimation within an audio signal
US8160732B2 (en) Noise suppressing method and noise suppressing apparatus
KR100304666B1 (en) Speech enhancement method
US8737641B2 (en) Noise suppressor
JP2013148724A (en) Noise suppressing device, noise suppressing method, and program
CN110739005A (en) real-time voice enhancement method for transient noise suppression
JP2012058358A (en) Noise suppression apparatus, noise suppression method and program
JP3588030B2 (en) Voice section determination device and voice section determination method
Wolfe et al. Towards a perceptually optimal spectral amplitude estimator for audio signal enhancement
JP4757775B2 (en) Noise suppressor
JP2008216721A (en) Noise suppression method, device, and program
WO1999050825A1 (en) Noise reduction device and a noise reduction method
JP2003140700A (en) Method and device for noise removal
Li et al. Adaptive β-order generalized spectral subtraction for speech enhancement
JP2003131689A (en) Noise removing method and device
Islam et al. Speech enhancement in adverse environments based on non-stationary noise-driven spectral subtraction and snr-dependent phase compensation
Islam et al. A Divide and Conquer Strategy for Musical Noise-free Speech Enhancement in Adverse Environments
Fahim et al. Single-Channel Speech Dereverberation in Noisy Environment for Non-Orthogonal Signals
Islam et al. Speech Enhancement Based on Non-stationary Noise-driven Geometric Spectral Subtraction and Phase Spectrum Compensation
JP2022011893A (en) Noise suppression circuit
Lan et al. DCU-Net transient noise suppression based on joint spectrum estimation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20231017