CN106997766B - Homomorphic filtering speech enhancement method based on broadband noise - Google Patents
Homomorphic filtering speech enhancement method based on broadband noise
- Publication number
- CN106997766B (application number CN201710176799.3A)
- Authority
- CN
- China
- Prior art keywords
- noise
- voice
- autocorrelation
- homomorphic filtering
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/06—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being correlation coefficients
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
Abstract
The invention discloses a homomorphic filtering speech enhancement method based on broadband noise, and belongs to the technical field of speech processing methods. It addresses two problems of the prior art: errors are still easily produced when speech signal features are extracted, and the noise-reduced speech signal still contains considerable residual noise. The method is as follows: a short-time stationary speech signal is obtained; autocorrelation analysis is performed on the speech signal to obtain autocorrelation coefficients; homomorphic filtering analysis is performed on the autocorrelation coefficients, during which the cepstrum of the noisy speech signal is computed and its noise component is removed to obtain the cepstrum of the enhanced speech; the noise-reduced characteristic parameters are obtained through spectrum analysis; and the noise-reduced speech signal is synthesized. The improved homomorphic filtering anti-noise method provided by the invention enhances the robustness of speech cepstrum features to environmental noise, obtains the feature information of speech more accurately, and better achieves the purpose of speech enhancement.
Description
Technical Field
The invention particularly relates to a homomorphic filtering voice enhancement method based on broadband noise, and belongs to the technical field of voice processing methods.
Background
Speech signal processing covers many areas. Speech enhancement is one of the effective ways to deal with noise-contaminated speech; its goal is to extract, as far as possible, the clean speech signal from the noisy speech signal at the system output so as to improve transmission quality. Traditional speech enhancement techniques include spectral subtraction, center filtering, homomorphic filtering anti-noise, nonlinear processing, adaptive analysis, wavelet analysis, and others. In noise processing, the enhancement method is chosen according to the characteristics of the speech, the perceptual characteristics of the human ear, and the properties of the noise.
In the traditional homomorphic filtering anti-noise method, the pitch peak in the cepstrum often becomes unclear or even disappears during homomorphic processing. When the speech signal is separated from the noise signal, the phase easily becomes multivalued, that is, phase wrapping occurs, so the error of the extracted characteristic parameters is too large, the original speech signal is difficult to recover, and the real-time requirement of the system cannot be met. The traditional autocorrelation analysis method easily produces a second harmonic during autocorrelation processing, is not well suited to extracting features directly from the autocorrelation coefficients of a noisy speech signal for speech enhancement, requires the signal to be strongly periodic, and makes denoising analysis difficult because the autocorrelation function resembles the high-frequency waveform of the noise. The traditional center filtering method can degrade speech quality while enhancing speech; its threshold selection is critical, related information of the speech signal is easily lost, and the speech signal can only be analyzed in the frequency domain. In traditional spectral subtraction, because speech energy is concentrated in certain frequency bands, a large amount of residual noise remains after enhancement, and if the high-power noise components cannot be eliminated during noise removal, pure-tone noise is still easily produced in the speech signal.
To address these problems of the traditional speech enhancement methods, a number of new methods for speech enhancement under different background-noise environments have recently been studied: "Research on a speech enhancement algorithm based on anisotropic filtering under white noise" [J], the university of jia hous (natural science edition), 2015, 06: 902-; "Research on a speech enhancement algorithm with an improved wavelet threshold function" [J], Signal Processing, 2016, 02: 203-213, which uses an improved wavelet threshold function to effectively improve the intelligibility and overall quality of the speech signal; "Research on a speech enhancement algorithm based on cepstrum preprocessing technology" [J], Scientific Technology and Engineering, 2013, 21: 6111-; and "Research on a speech enhancement algorithm under low signal-to-noise ratio" [J], Chinese New Communication, 2015, 15: 73-74, which proposes a speech enhancement algorithm based on cepstrum distance and spectral subtraction to address the musical noise and low clarity of spectral subtraction at low signal-to-noise ratio.
Short-time autocorrelation analysis is a common method in the time-domain analysis of speech signals. The short-time autocorrelation function $Z_n(k)$ of a speech signal $s_n(m)$ occupying the interval $[0, N-1]$ is calculated as follows:
$$Z_n(k) = \sum_{m=0}^{N-1-k} s_n(m)\, s_n(m+k), \quad 0 \le k \le L$$
where L is the maximum number of delay points.
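The short-time autocorrelation described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the standard textbook sum over one frame; the function name `short_time_autocorr`, the 8 kHz sampling rate and the 200 Hz test tone are hypothetical choices made for the example and do not come from the patent.

```python
import numpy as np

def short_time_autocorr(frame, max_lag):
    """Short-time autocorrelation of one frame for lags k = 0..max_lag."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(frame)")
    # For each lag k, sum s(m) * s(m + k) over m = 0 .. N-1-k.
    return np.array([np.dot(frame[: n - k], frame[k:]) for k in range(max_lag + 1)])

# Hypothetical test frame: a 200 Hz sinusoid sampled at 8 kHz (period of 40 samples).
fs = 8000
t = np.arange(400) / fs
frame = np.sin(2 * np.pi * 200 * t)

z = short_time_autocorr(frame, max_lag=100)
# Search for the strongest lag in a window around the expected period.
peak_lag = 20 + int(np.argmax(z[20:61]))
print(peak_lag)
```

For a periodic frame the autocorrelation peaks again at the period, which is why the lag of that peak can serve as a pitch-related feature.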
With the autocorrelation anti-noise method, the harmonic components are reduced after filtering and the curve becomes smoother, but the second harmonic still exists and the peaks are still not sharp, so errors are still easily produced when speech signal features are extracted.
With the traditional homomorphic filtering anti-noise method, the first cepstrum peak of the speech signal is still not pronounced, and there are many high-frequency components between the origin and the second peak, which affects the accuracy with which characteristic values can be read. The noise-reduced speech signal also still contains considerable residual noise.
Disclosure of Invention
Therefore, the invention provides a homomorphic filtering speech enhancement method based on broadband noise, aiming at the problems that in the prior art, errors are still easy to generate when speech signal features are extracted, and the noise-reduced speech signals still have more residual noise.
The method specifically comprises the following steps:
obtaining a short-time stable voice signal, then carrying out autocorrelation analysis on the voice signal to obtain an autocorrelation coefficient, then carrying out homomorphic filtering analysis processing on the autocorrelation coefficient, solving the cepstrum of the voice signal with noise during homomorphic filtering analysis processing, removing the noise component of the cepstrum of the voice signal with noise to obtain the cepstrum of the enhanced voice, obtaining the characteristic parameters after noise reduction through spectrum analysis, and synthesizing the voice signal after noise reduction.
Further, in the method, performing autocorrelation analysis on the speech signal to obtain the autocorrelation coefficients specifically comprises:
letting the interval occupied by the speech signal $s_n(m)$ be $[0, N-1]$, the short-time autocorrelation of the speech signal is:
$$R_n(k) = \sum_{m=0}^{N-1-k} s_n(m)\, s_n(m+k), \quad 0 \le k \le L$$
where L is the maximum number of delay points.
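Further, the homomorphic filtering analysis of the autocorrelation coefficients in the method specifically comprises:
applying the FFT to the autocorrelation coefficients to obtain $R_n(e^{j\omega})$;
taking the logarithm of the real part of $R_n(e^{j\omega})$, which gives $\ln \operatorname{Re}[R_n(e^{j\omega})]$;
applying the inverse FFT to the result to obtain the improved $c_n(m)$:
$$c_n(m) = F^{-1}\left\{\ln \operatorname{Re}[R_n(e^{j\omega})]\right\}$$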
The invention has the following beneficial effects: the homomorphic filtering speech enhancement method based on broadband noise improves on the traditional method, has a smaller average relative error and higher accuracy, and is more favorable for speech recognition and speech synthesis. The improved homomorphic filtering anti-noise method enhances the robustness of speech cepstrum features to environmental noise, obtains the feature information of speech accurately, and better achieves the purpose of speech enhancement.
Drawings
FIG. 1 is a flowchart illustrating a process of auto-correlation anti-noise method according to an embodiment;
FIG. 2 is a waveform diagram of the original speech signal "hello" in the autocorrelation anti-noise method in the embodiment;
FIG. 3 is a schematic diagram of the autocorrelation analysis of one frame of the speech signal of FIG. 2;
FIG. 4 is a diagram of the simulated speech enhancement effect of the conventional autocorrelation anti-noise method;
FIG. 5 is a flow chart of a conventional homomorphic filtering anti-noise method;
FIG. 6 is a diagram illustrating the effect of a conventional homomorphic filtering anti-noise method;
FIG. 7 is a diagram illustrating conventional homomorphic filtering feature extraction;
FIG. 8 is a flow chart of an improved homomorphic filtering anti-noise method;
FIG. 9 is a diagram of the simulated speech enhancement effect of the improved homomorphic filtering anti-noise method;
FIG. 10 is a simulation diagram of feature parameters of an improved homomorphic filtered output extracted speech signal.
Detailed Description
The following description of the embodiments of the present invention is provided with reference to the accompanying drawings:
The invention improves on the traditional autocorrelation anti-noise method and the traditional homomorphic filtering anti-noise method, and designs an improved homomorphic filtering anti-noise method.
Traditional autocorrelation noise-resisting method
The autocorrelation anti-noise method performs autocorrelation analysis on the noisy speech signal. Because the speech signal is uncorrelated with the noise, the autocorrelation sequence of the noisy speech signal approximates that of the clean speech signal, so using the autocorrelation coefficients as the characteristic values of the speech processing system achieves the anti-noise purpose.
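The reasoning behind this approximation can be written out as a short derivation. It is a sketch under the usual textbook assumptions, which the source does not state explicitly: the noise $d(m)$ is additive, zero-mean, white, and uncorrelated with the speech $s(m)$. The symbols $y$, $d$, $R_y$, $R_s$, $R_d$, $R_{sd}$ and $R_{ds}$ are introduced here for illustration only.

```latex
\begin{aligned}
y(m) &= s(m) + d(m) \\
R_y(k) &= \sum_{m} y(m)\, y(m+k)
        = R_s(k) + R_d(k) + R_{sd}(k) + R_{ds}(k) \\
       &\approx R_s(k) + R_d(k)
        && \text{(cross terms vanish when $s$ and $d$ are uncorrelated)} \\
R_d(k) &\approx 0 \quad (k \neq 0)
        && \text{(white noise is uncorrelated with its own shifts)} \\
\Rightarrow\; R_y(k) &\approx R_s(k) \quad (k \neq 0)
\end{aligned}
```

Under these assumptions the noise contributes mainly to the zero-lag term, so the nonzero lags of the noisy autocorrelation carry mostly speech information.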
The processing flow of the autocorrelation anti-noise method is shown in fig. 1.
In an ordinary indoor environment, Cooledit was used to record a noisy speech signal of a girl saying "hello"; the sampling frequency is 22 kHz and the recording is monaural, as shown in fig. 2.
A frame of speech signal is truncated for autocorrelation analysis as shown in fig. 3.
As can be seen from fig. 3, the original noisy frame has a certain periodicity, but it contains many harmonic components and its peaks are not sharp, which introduces error when the characteristic parameters are extracted.
Applying the conventional autocorrelation anti-noise method to the original noisy speech signal gives the simulated speech enhancement effect shown in fig. 4.
It can be seen from fig. 4 that after filtering the harmonic components are reduced and the curve becomes smoother, but the second harmonic still exists and the peaks are still not sharp, so errors are still easily produced when speech signal features are extracted.
Traditional homomorphic filtering anti-noise method
The speech signal is not an additive signal but a convolutional signal. In order to process it with a linear system, a homomorphic system for convolution can be used.
Assume that the interval occupied by the speech signal s(n) to be processed is [0, N-1]; the interval length N used here can be chosen larger than the actual length. The size of N determines whether aliasing is present in the cepstrum c(n), that is, whether the discrete spectrum has adequate resolution. When N is greater than the actual length of s(n), zeros can be appended to s(n) to make up the required length, which is called "zero padding". If the original speech signal can be recovered from s(n) through homomorphic filtering, there is no aliasing distortion, and the cepstrum c(n) of the speech signal s(n) can be solved directly.
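Zero padding as described above is a one-line operation in NumPy. The sketch below is illustrative only; the helper name `zero_pad` and the three-sample frame are hypothetical.

```python
import numpy as np

def zero_pad(frame, n):
    """Append zeros to frame so that its length becomes n (n >= len(frame))."""
    frame = np.asarray(frame, dtype=float)
    if n < len(frame):
        raise ValueError("n must be at least the frame length")
    return np.concatenate([frame, np.zeros(n - len(frame))])

frame = np.array([1.0, 2.0, 3.0])   # hypothetical 3-sample frame
padded = zero_pad(frame, 8)         # length 8, trailing zeros

# np.fft.fft pads with zeros itself when n exceeds the input length,
# so both routes give the same 8-point spectrum.
same = np.allclose(np.fft.fft(padded), np.fft.fft(frame, n=8))
```

A longer transform length samples the spectrum more densely, which reduces the wrap-around (aliasing) of the cepstrum computed from it.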
Setting:
$$S(e^{j\omega}) = |S(e^{j\omega})|\, e^{j\arg[S(e^{j\omega})]}$$
then taking the logarithm gives:
$$\ln S(e^{j\omega}) = \ln|S(e^{j\omega})| + j\arg[S(e^{j\omega})]$$
The logarithm of a complex number is still a complex number, with a real part and an imaginary part. The imaginary part $\arg[S(e^{j\omega})]$ is the phase of $S(e^{j\omega})$ and is therefore not unique. If only the real part of $\ln S(e^{j\omega})$ is considered, then
$$c(n) = F^{-1}\left[\ln|S(e^{j\omega})|\right] \quad \text{(formula C)}$$
c(n) is the inverse Fourier transform of the log-magnitude spectrum of the speech signal s(n) and is called the "cepstrum".
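The cepstrum defined here, the inverse Fourier transform of the log-magnitude spectrum, can be sketched with NumPy as follows. The function name `real_cepstrum`, the small constant `eps` that guards against `log(0)`, and the echo test signal are assumptions made for this example and are not taken from the patent.

```python
import numpy as np

def real_cepstrum(frame, n_fft=None, eps=1e-12):
    """Inverse FFT of the log-magnitude spectrum of one frame."""
    frame = np.asarray(frame, dtype=float)
    if n_fft is None:
        n_fft = len(frame)
    spectrum = np.fft.fft(frame, n=n_fft)       # zero-pads when n_fft > len(frame)
    log_mag = np.log(np.abs(spectrum) + eps)    # eps (assumption) avoids log(0)
    return np.real(np.fft.ifft(log_mag))

# Hypothetical test signal: an impulse plus a half-amplitude echo 10 samples later.
x = np.zeros(256)
x[0] = 1.0
x[10] = 0.5

c = real_cepstrum(x)
# The convolution with the echo shows up as an additive peak at quefrency 10.
echo_quefrency = 1 + int(np.argmax(c[1:128]))
print(echo_quefrency)
```

The example illustrates the point of the homomorphic approach: a convolutional component in the time domain becomes an additive, well-localized component in the cepstrum.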
Then, a flow chart of the conventional homomorphic filtering anti-noise method is shown in fig. 5.
Wherein, A is a short-time voice signal; b is a short-time frequency spectrum; c is a logarithmic spectrum; d is a cepstrum coefficient; e is a logarithmic spectrum envelope; f is the fundamental period.
Applying the conventional homomorphic filtering anti-noise method to the original noisy speech signal gives the simulated speech enhancement effect shown in fig. 6.
It can be seen from fig. 6 that, with the conventional homomorphic filtering anti-noise method, the first cepstrum peak of the speech signal is still not very pronounced, and there are many high-frequency components between the origin and the second peak, which affects the accuracy with which characteristic values can be read.
The characteristic parameters of the speech signal extracted from the conventional homomorphic filtering output are shown in fig. 7.
As can be seen from fig. 7, in the conventional homomorphic filtering output the first peak near the origin is at coordinates (12, 0.12); this peak is relatively flat, which causes a large error during feature extraction, and the noise-reduced speech signal still contains a large amount of residual noise.
Method of the invention
In the present invention, the improved homomorphic filtering anti-noise method applies the same preprocessing to the speech signal as the traditional homomorphic filtering anti-noise method: the speech signal must be digitized and preprocessed to obtain a short-time stationary speech signal. Autocorrelation analysis is then performed on the signal to obtain the autocorrelation coefficients, and homomorphic filtering analysis is performed on the autocorrelation coefficients. During homomorphic processing, the cepstrum of the noisy speech signal is computed and its noise component is removed to obtain the cepstrum of the enhanced speech. The noise-reduced characteristic parameters are obtained through spectrum analysis, and the noise-reduced speech signal is synthesized, achieving the purpose of speech enhancement. The specific algorithm flow is shown in fig. 8.
Wherein, A is a short-time voice signal; b is an autocorrelation coefficient; c is an autocorrelation short-time spectrum; d is an autocorrelation logarithmic spectrum; e is the improved cepstral coefficient; f is a logarithmic spectrum envelope; g is the fundamental period.
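The flow lists a logarithmic spectrum envelope derived from the cepstral coefficients, but the text does not say how it is obtained. One common technique, shown here only as a sketch and not as the patent's procedure, is low-quefrency liftering: keep the first few cepstral coefficients, zero the rest, and transform back to the frequency domain. The function name `log_spectrum_envelope` and the `cutoff` parameter are hypothetical.

```python
import numpy as np

def log_spectrum_envelope(cepstrum, cutoff):
    """Smooth log spectrum from the cepstral coefficients below `cutoff`."""
    c = np.asarray(cepstrum, dtype=float)
    n = len(c)
    if not 1 <= cutoff <= n // 2:
        raise ValueError("cutoff must satisfy 1 <= cutoff <= len(cepstrum) // 2")
    lifter = np.zeros(n)
    lifter[:cutoff] = 1.0               # keep quefrencies 0 .. cutoff-1
    if cutoff > 1:
        lifter[n - cutoff + 1:] = 1.0   # keep the mirrored (negative) quefrencies
    return np.real(np.fft.fft(c * lifter))

# Hypothetical symmetric cepstrum: a slow component at quefrency 1
# and a fast component at quefrency 20.
c = np.zeros(64)
c[1] = c[63] = 0.5
c[20] = c[44] = 0.3

env = log_spectrum_envelope(c, cutoff=10)
expected = np.cos(2 * np.pi * np.arange(64) / 64)  # only the slow component survives
```

The discarded high-quefrency part is where a pitch peak would be sought, so the same split serves both the envelope and the fundamental-period outputs named in the flow.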
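1) Autocorrelation of the short-time speech signal:
Let the interval occupied by the speech signal $s_n(m)$ be $[0, N-1]$. The short-time autocorrelation of the speech signal is then:
$$R_n(k) = \sum_{m=0}^{N-1-k} s_n(m)\, s_n(m+k), \quad 0 \le k \le L$$
where L is the maximum number of delay points.
2) Homomorphic processing of the autocorrelation coefficients:
a) apply the FFT to the autocorrelation coefficients to obtain $R_n(e^{j\omega})$;
b) take the logarithm of the real part of $R_n(e^{j\omega})$, which gives $\ln \operatorname{Re}[R_n(e^{j\omega})]$;
c) apply the inverse FFT to the result to obtain the improved $c_n(m)$:
$$c_n(m) = F^{-1}\left\{\ln \operatorname{Re}[R_n(e^{j\omega})]\right\}$$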
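Steps 1) and 2) can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the source only says to take the logarithm of the real part of the transformed autocorrelation, and because that real part can be zero or negative for a one-sided autocorrelation sequence, the sketch takes its absolute value and adds a small `eps`. The function name `improved_cepstrum`, the parameter names and the random test frame are hypothetical.

```python
import numpy as np

def improved_cepstrum(frame, max_lag, n_fft=None, eps=1e-12):
    """Autocorrelation -> FFT -> log of real part -> inverse FFT."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(frame)")
    # 1) short-time autocorrelation for lags 0..max_lag
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(max_lag + 1)])
    if n_fft is None:
        n_fft = len(r)
    # 2a) FFT of the autocorrelation coefficients
    spectrum = np.fft.fft(r, n=n_fft)
    # 2b) logarithm of the real part; abs() and eps are assumptions of this
    #     sketch that keep the logarithm defined
    log_real = np.log(np.abs(spectrum.real) + eps)
    # 2c) inverse FFT gives the improved cepstral coefficients
    return np.real(np.fft.ifft(log_real))

# Hypothetical noisy frame, used only to exercise the function.
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * np.arange(256) / 32) + 0.1 * rng.standard_normal(256)
c = improved_cepstrum(frame, max_lag=63, n_fft=128)
```

Because the real part of the FFT of a real sequence is an even function of frequency, the inverse transform is real and symmetric, so only the first half of `c` needs to be inspected.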
Applying the improved homomorphic filtering anti-noise method to the original noisy speech signal gives the simulated speech enhancement effect shown in fig. 9.
It can be seen from fig. 9 that, with the improved homomorphic filtering anti-noise method, the harmonic components are clearly reduced and the periodicity becomes clearer, which improves the accuracy of extracting the characteristic parameters of the noisy speech signal.
The simulation of the feature parameters of the extracted speech signal for the improved homomorphic filtered output is shown in fig. 10.
As can be seen from fig. 10, in the improved homomorphic filtering output, the coordinates of the first peak close to the origin are (11, 0.34), the first peak is relatively sharp and closer to the origin, which improves the accuracy of feature extraction, and can better synthesize the noise-reduced speech signal, thereby achieving the purpose of speech enhancement.
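Reading off "the first peak close to the origin" can be automated with a simple local-maximum search. The sketch below is illustrative: the function name `first_peak` is hypothetical, and the test curve is synthetic, shaped only to mimic the reported peak position and height; it is not data from the patent.

```python
import numpy as np

def first_peak(c, min_index=1):
    """Return (index, value) of the first local maximum at or after min_index."""
    c = np.asarray(c, dtype=float)
    for i in range(max(min_index, 1), len(c) - 1):
        if c[i] > c[i - 1] and c[i] >= c[i + 1]:
            return i, float(c[i])
    return None  # no interior local maximum found

# Synthetic curve (not patent data): a single bump centred at index 11, height 0.34.
idx = np.arange(64)
curve = 0.34 * np.exp(-0.5 * ((idx - 11) / 2.0) ** 2)

peak = first_peak(curve)
print(peak)
```

A sharper peak makes this search less sensitive to small ripples, which is the practical benefit the comparison between the two peak shapes points to.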
Comparison analysis of the method of the present invention with the conventional homomorphic filtering anti-noise method
To further compare the performance of the conventional homomorphic filtering anti-noise method with that of the homomorphic filtering anti-noise method of the present invention, 20 experimental simulations were performed on noisy voice recordings. The two algorithms were compared at different signal-to-noise ratios: 10 runs on speech signals with a large signal-to-noise ratio and 10 runs on speech signals with a small signal-to-noise ratio, and the average relative error (percentage) was calculated.
The average relative error for the speech signals with a large signal-to-noise ratio is shown in table 1, and that for the speech signals with a small signal-to-noise ratio is shown in table 2.
TABLE 1
Detection algorithm | Traditional homomorphic filtering | Correlation-homomorphic filtering |
---|---|---|
Average relative error (%) | 0.55 | 0.46 |
TABLE 2
Detection algorithm | Traditional homomorphic filtering | Correlation-homomorphic filtering |
---|---|---|
Average relative error (%) | 20.3 | 15.9 |
The comparison across signal-to-noise ratios shows that, for noisy speech signals, the correlation-homomorphic filtering method is superior to the traditional homomorphic filtering method: its average relative error is smaller and its accuracy is higher, which is more favorable for speech recognition and speech synthesis.
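The size of the improvement can be worked out directly from the two tables. The sketch below does that arithmetic and also shows one plausible definition of average relative error; the patent does not define the metric or say which quantity is being estimated, so `average_relative_error` is an assumption, and both function names are hypothetical.

```python
import numpy as np

def average_relative_error(estimates, references):
    """Mean of |estimate - reference| / |reference|, in percent (assumed definition)."""
    est = np.asarray(estimates, dtype=float)
    ref = np.asarray(references, dtype=float)
    return float(np.mean(np.abs(est - ref) / np.abs(ref)) * 100.0)

def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0

# Values taken from Table 1 (large SNR) and Table 2 (small SNR).
high_snr = relative_reduction(0.55, 0.46)
low_snr = relative_reduction(20.3, 15.9)
print(f"large SNR: {high_snr:.1f}%  small SNR: {low_snr:.1f}%")
```

On these figures the error falls by roughly a sixth at large signal-to-noise ratio and by roughly a fifth at small signal-to-noise ratio.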
Noise may be additive or non-additive. The broadband noise caused by breathing during speaking is non-additive; it needs to be converted into additive noise by homomorphic filtering so that the clean speech signal and the noise signal can be separated and the clean speech signal extracted. The above analysis shows that the improved homomorphic filtering anti-noise method enhances the robustness of speech cepstrum features to environmental noise, obtains the feature information of speech accurately, and better achieves the purpose of speech enhancement.
The foregoing is a preferred embodiment of the present invention, and it should be noted that it would be apparent to those skilled in the art that various modifications and enhancements can be made without departing from the principles of the invention, and such modifications and enhancements are also considered to be within the scope of the invention.
Claims (1)
1. A homomorphic filtering speech enhancement method based on broadband noise is characterized in that the method specifically comprises the following steps:
obtaining a short-time stable voice signal, then carrying out autocorrelation analysis on the voice signal to obtain an autocorrelation coefficient, carrying out homomorphic filtering analysis processing on the autocorrelation coefficient, solving a cepstrum of the voice signal with noise during homomorphic filtering analysis processing, removing a noise component of the cepstrum of the voice signal with noise to obtain a cepstrum of enhanced voice, obtaining a characteristic parameter after noise reduction through spectrum analysis, and synthesizing the voice signal after noise reduction;
the autocorrelation analysis of the speech signal to obtain the autocorrelation coefficients specifically comprises the following steps:
assuming that the interval occupied by the speech signal $s_n(m)$ is $[0, N-1]$, the short-time autocorrelation of the speech signal is:
$$R_n(k) = \sum_{m=0}^{N-1-k} s_n(m)\, s_n(m+k), \quad 0 \le k \le L$$
where L is the maximum number of delay points;
the homomorphic filtering analysis processing of the autocorrelation coefficients specifically comprises the following steps:
applying the FFT to the autocorrelation coefficients to obtain $R_n(e^{j\omega})$;
taking the logarithm of the real part of $R_n(e^{j\omega})$, which gives $\ln \operatorname{Re}[R_n(e^{j\omega})]$;
applying the inverse FFT to the result to obtain the improved $c_n(m)$:
$$c_n(m) = F^{-1}\left\{\ln \operatorname{Re}[R_n(e^{j\omega})]\right\}$$
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710176799.3A CN106997766B (en) | 2017-03-16 | 2017-03-16 | Homomorphic filtering speech enhancement method based on broadband noise |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710176799.3A CN106997766B (en) | 2017-03-16 | 2017-03-16 | Homomorphic filtering speech enhancement method based on broadband noise |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106997766A CN106997766A (en) | 2017-08-01 |
CN106997766B CN106997766B (en) | 2020-05-15 |
Family
ID=59431506
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710176799.3A Expired - Fee Related CN106997766B (en) | 2017-03-16 | 2017-03-16 | Homomorphic filtering speech enhancement method based on broadband noise |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106997766B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114242097A (en) * | 2021-12-01 | 2022-03-25 | 腾讯科技(深圳)有限公司 | Audio data processing method and apparatus, medium, and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2001241475A1 (en) * | 2000-02-11 | 2001-08-20 | Comsat Corporation | Background noise reduction in sinusoidal based speech coding systems |
Non-Patent Citations (5)
Title |
---|
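"An improved short-time average magnitude difference function algorithm"; Ma Ying et al.; Applied Acoustics; 2010-09-30; Vol. 29, No. 5; pp. 387-390 * |
"A pitch detection algorithm based on homomorphic filtering"; Hu Libo et al.; Microelectronics & Computer; 2009-04-30; Vol. 26, No. 4; pp. 95-97 * |
"Multipath signal processing technology based on homomorphic filtering"; Xu Wanyan; China Master's Theses Full-text Database, Engineering Science and Technology II; 2014-06-15 (No. 06); pp. 6-8 * |
"Research on speech enhancement methods under strong background noise"; Xie Guihai et al.; First Joint Conference on Harmonious Human Machine Environment (HHME2005); 2005-10-31; pp. 1-8 * |
"Research on noise suppression techniques in digital recording"; Zhang Jingda; China Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology; 2006-08-15 (No. 08); pp. 17-33 * |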
Also Published As
Publication number | Publication date |
---|---|
CN106997766A (en) | 2017-08-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108831499B (en) | Speech enhancement method using speech existence probability | |
CN107274908B (en) | Wavelet voice denoising method based on new threshold function | |
US8359195B2 (en) | Method and apparatus for processing audio and speech signals | |
CN107785028B (en) | Voice noise reduction method and device based on signal autocorrelation | |
CN105390142B (en) | A kind of digital deaf-aid voice noise removing method | |
CN108198545B (en) | Speech recognition method based on wavelet transformation | |
CN111091833A (en) | Endpoint detection method for reducing noise influence | |
CN101853665A (en) | Method for eliminating noise in voice | |
CN110808059A (en) | Speech noise reduction method based on spectral subtraction and wavelet transform | |
Islam et al. | Speech enhancement based on a modified spectral subtraction method | |
CN110808057A (en) | Voice enhancement method for generating confrontation network based on constraint naive | |
CN105679321B (en) | Voice recognition method, device and terminal | |
CN108231084A (en) | A kind of improvement wavelet threshold function denoising method based on Teager energy operators | |
WO2020024787A1 (en) | Method and device for suppressing musical noise | |
CN106997766B (en) | Homomorphic filtering speech enhancement method based on broadband noise | |
Mergu et al. | Multi-resolution speech spectrogram | |
CN112233657A (en) | Speech enhancement method based on low-frequency syllable recognition | |
Hamid et al. | Speech enhancement using EMD based adaptive soft-thresholding (EMD-ADT) | |
CN110909827A (en) | Noise reduction method suitable for fan blade sound signals | |
Rao et al. | Speech enhancement using sub-band cross-correlation compensated Wiener filter combined with harmonic regeneration | |
Li et al. | Noisy speech enhancement based on discrete sine transform | |
Özen et al. | Speech noise reduction with wavelet transform domain adaptive filters | |
CN112750451A (en) | Noise reduction method for improving voice listening feeling | |
Sulong et al. | Speech enhancement based on wiener filter and compressive sensing | |
CN111968627A (en) | Bone conduction speech enhancement method based on joint dictionary learning and sparse representation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20200515 Termination date: 20210316 |