CN108172231A - A kind of dereverberation method and system based on Kalman filtering - Google Patents
- Publication number
- CN108172231A CN201711285885.4A
- Authority
- CN
- China
- Prior art keywords
- signal
- kalman
- matrix
- microphone
- variance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The invention discloses a dereverberation method and system based on Kalman filtering. The method includes: preprocessing the original signal collected by each microphone to obtain the corresponding frequency-domain signal, and forming the input signal after a delay; estimating the reverberation signal using a Kalman filtering algorithm and a time-varying multi-channel autoregressive model, taking the original signal collected by each microphone at the current time as the reference signal, and subtracting the reverberation signal to obtain an error signal; updating the Kalman filter coefficients using the Kalman gain matrix and the error signal; obtaining the target signal from the original signal collected by each microphone at the current time, the input signal, and the updated Kalman filter coefficients; and finally transforming the frequency-domain target signal to the time domain using an inverse Fourier transform. By diagonalizing the state-vector error covariance matrix of the Kalman filter, the method reduces the complexity of the adaptive multi-channel linear prediction dereverberation algorithm.
Description
Technical Field
The invention relates to the field of speech dereverberation, and in particular to a dereverberation method and system based on Kalman filtering.
Background
As shown in Fig. 1, because sound waves are reflected by the room boundaries and by objects in the room, a microphone receives reflected sound from many directions in addition to the direct sound emitted by the source. Sound arriving within 30-50 ms after the direct sound is generally called early reflections, and sound arriving later is called late reflections, i.e., the reverberation tail. Psychoacoustic research has found that early reflections reinforce the direct sound and improve speech intelligibility, whereas late reverberation masks the direct sound that arrives after it and blurs the speech. Reverberation also degrades the quality of the speech received by the microphone and the recognition accuracy of speech recognition systems. In applications such as teleconferencing and smart speakers used in enclosed rooms, the microphone is often in the far field of the sound source, and the damage that reverberation does to the received signal grows with the distance between source and microphone. Moreover, in a voice communication system where ambient noise is low, the received signal is affected mainly by room reverberation, which reduces the clarity and intelligibility of the speech and seriously degrades communication quality. Dereverberating the microphone signal is therefore necessary.
Speech dereverberation is a popular research topic. The current solutions mainly include:
(1) Linear prediction residual enhancement. This class of algorithms uses the source-filter model of speech, which treats speech as an excitation sequence passed through a time-varying all-pole filter. Linear predictive analysis of the reverberant speech yields an estimate of the all-pole filter coefficients, i.e., the linear prediction coefficients. Inverse filtering the microphone signal with these coefficients then gives the corresponding excitation signal, i.e., the residual. Dereverberation is achieved by enhancing the residual, and the speech signal is reconstructed from the enhanced residual using the estimated linear prediction coefficients.
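The inverse-filtering step of this approach can be illustrated with a minimal sketch. It assumes the autocorrelation method for the linear prediction analysis and a synthetic second-order all-pole signal; the function names `lpc_coefficients` and `lpc_residual` and all parameter values are illustrative assumptions rather than part of any cited algorithm, and the residual enhancement stage itself is not shown.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Autocorrelation-method linear prediction: solve R a = r for a."""
    x = np.asarray(x, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz autocorrelation matrix of size order x order.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

def lpc_residual(x, a):
    """Inverse filtering: e(t) = x(t) - sum_k a[k] * x(t - k - 1)."""
    inverse_filter = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(np.asarray(x, dtype=float), inverse_filter)[: len(x)]

# Toy signal (assumption): white excitation through a stable all-pole filter.
rng = np.random.default_rng(0)
excitation = rng.standard_normal(20000)
true_a = np.array([1.3, -0.6])
signal = np.zeros(len(excitation))
for t in range(len(signal)):
    past1 = signal[t - 1] if t >= 1 else 0.0
    past2 = signal[t - 2] if t >= 2 else 0.0
    signal[t] = excitation[t] + true_a[0] * past1 + true_a[1] * past2

est_a = lpc_coefficients(signal, order=2)   # estimated prediction coefficients
residual = lpc_residual(signal, est_a)      # approximates the excitation
```

In this toy case the residual closely matches the excitation because the signal really is all-pole; with reverberant speech the residual also carries the reflections, which is what the enhancement stage targets.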
(2) Spectral enhancement. Spectral enhancement methods are a class of classical dereverberation algorithms that enhance the speech signal by modifying the noisy or reverberant signal in the short-time Fourier transform domain. Document [1] (K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, "Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 534-545, May 2009.) estimates the late reverberation by delayed linear prediction and then removes it by spectral subtraction. Document [2] (F. Xiong, N. Moritz, R. Rehr, J. Anemüller, B. Meyer, T. Gerkmann, S. Doclo, and S. Goetze, "Robust ASR in reverberant environments using temporal cepstrum smoothing for speech enhancement and an amplitude modulation filterbank for feature extraction," in Proc. REVERB Challenge Workshop, Florence, Italy, 2014.) uses a minimum mean square error estimator as a preprocessing stage for automatic speech recognition, estimating the clean speech magnitude spectrum from the power spectral densities of the late reverberation and the stationary background noise. Spectral enhancement methods typically require an estimate of the reverberation time to determine the rate of spectral decay. However, blind estimation of the reverberation time remains very difficult, especially in noisy environments, and is still an open research problem.
(3) Inverse filtering. A dereverberation algorithm is called blind when no prior knowledge of the room impulse response between the sound source and the microphone is available. Multi-channel linear prediction based on a microphone array is a classic blind dereverberation algorithm. According to the multiple-input/output inverse theorem (MINT), a multi-channel method can perfectly equalize a time-invariant room impulse response provided that the transfer functions of the channels share no common zeros. However, MINT is very sensitive to system identification errors, and the impulse responses of real rooms often contain nearly common zeros, so MINT is difficult to apply in practice.
Time-domain linear prediction algorithms tend to require long filters and suffer from whitening of the target signal. Researchers have therefore proposed applying multi-channel linear prediction in the short-time Fourier transform (STFT) domain, processing each subband independently. In the STFT domain the reverberant speech signal is described by an autoregressive model in each frequency band, which shortens the filter needed per subband. Because the room impulse response is in practice time-varying, the prediction coefficients must be modeled as time-varying. Recently, a multi-channel autoregressive (MAR) signal model in the STFT domain has been proposed whose coefficients are estimated with a Kalman filter; this algorithm can be regarded as a generalized recursive least squares (RLS) algorithm.
The computational complexity of STFT-domain multi-channel linear prediction grows quadratically with the filter order in each subband, which limits its use on many resource-constrained platforms. Document [3] (Dietzen T, Doclo S, Spriet A, et al. Low-complexity Kalman filter for multi-channel linear-prediction-based blind speech dereverberation [C]. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. IEEE, 2017.) proposes a simplified Kalman filter solution for the STFT-domain adaptive multi-channel linear prediction dereverberation algorithm that reduces the computational complexity to linear in the filter order. However, this simplification may cause some loss of speech quality. In addition, that algorithm estimates the signal of only one channel, whereas in practice several channels need to be computed.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing dereverberation methods by providing a low-complexity dereverberation method based on Kalman filtering that further reduces the complexity of the STFT-domain adaptive multi-channel linear prediction dereverberation algorithm without loss of speech quality.
In order to achieve the above object, the present invention provides a dereverberation method based on kalman filtering, which includes:
preprocessing the original signals collected by each microphone to obtain corresponding frequency domain signals, and delaying to form input signals;
estimating a reverberation signal by using a Kalman filtering algorithm and a time-varying multi-channel autoregressive model, taking an original signal acquired by each microphone at the current moment as a reference signal, and subtracting the reverberation signal to obtain an error signal;
updating the coefficients of the Kalman filter by using the Kalman gain matrix and the error signal;
obtaining a target signal by using original signals, input signals and updated Kalman filter coefficients acquired by each microphone at the current moment;
and finally, converting the frequency domain target signal into a time domain by utilizing inverse Fourier transform.
As an improvement of the above method, the method specifically comprises:
Step 1) The signals $y_m(n)$, $1 \le m \le M$, collected by the M microphones are framed, windowed, and Fourier transformed to obtain the corresponding frequency-domain signals $Y_m(n)$.
The frequency-domain signal $Y_m(n)$ is given by equation (1),
where k is the frequency index, N is the number of Fourier transform points, n is the time frame index, $w_{\mathrm{STFT}}(l)$ is the short-time Fourier transform analysis window function, and R is the frame shift.
Step 2) The frequency-domain signals of the M microphones from frame n-D to frame n-L form the input signal matrix $Y(n-D)$, and the reverberation signal vector $r(n)$ is estimated using the Kalman weight vector, where D is the delay and L is the linear prediction length:
$y(n) = [Y_1(n), \ldots, Y_M(n)]^{T}$ (2)
In equation (3), $I_M$ is the $M \times M$ identity matrix, $\otimes$ denotes the Kronecker product, and $Y(n-D)$ is a sparse matrix of dimension $M \times L_c$ formed from the microphone observation signals, with $L_c = M^2(L-D+1)$.
The reverberation signal vector $r(n)$ is calculated according to equation (4).
In equation (4), the $M \times M$ matrices $C_p(n-1)$, $p \in \{D, D+1, \ldots, L\}$, are the time-varying Kalman weight coefficients, and $\mathrm{Vec}\{\cdot\}$ is the matrix column-stacking operator.
Step 3) The reverberation signal vector $r(n)$ obtained in step 2) is subtracted from the signal $y(n)$ collected by the microphones at the current time to obtain the error signal vector $e(n)$:
$e(n) = y(n) - r(n)$ (5)
Step 4) The Kalman gain matrix $K(n)$ is calculated.
Step 5) The Kalman filter coefficients are updated using the Kalman gain matrix $K(n)$ and the error signal vector $e(n)$.
Step 6) The target signal vector $x(n)$ is calculated from the signal $y(n)$ collected by the microphones at the current time, the input signal matrix $Y(n-D)$, and the updated Kalman filter coefficients.
Step 7) The frequency-domain target signal vector $x(n)$ is inverse Fourier transformed to obtain the time-domain target signal vector $x_t(l)$.
As an improvement of the above method, the step 4) specifically includes:
Step 401) The target signal variance is calculated by first-order smoothing according to equation (6),
where the two variance terms are the target signal variances at times n-1 and n-2, $x(n-1)$ is the target signal vector at time n-1, and $\alpha$ is a smoothing factor with a value of 0.2.
Step 402) The variance of the disturbance noise $w(n)$ is first calculated according to equation (7), and the a priori misalignment variance is then calculated according to equation (8).
In equation (7), $L_c = M^2(L-D+1)$ and $\eta$ is usually $10^{-5}$; the remaining term is the a posteriori misalignment variance at time n-1.
Step 403) The regularization factor $\delta(n)$ is calculated from the target signal variance and the a priori misalignment variance according to equation (9).
Step 404) The covariance matrix $S_Y(n-D)$ is calculated from the signals collected by the microphones according to equation (10):
$S_Y(n-D) = Y(n-D)\,Y^{H}(n-D)$ (10)
Step 405) The Kalman gain matrix $K(n)$ is calculated according to equation (11):
$K(n) = Y^{H}(n-D)\,[S_Y(n-D) + \delta(n) I_M]^{-1}$ (11)
As an improvement of the above method, step 7) is followed by:
updating the a posteriori misalignment variance.
A Kalman-filter-based dereverberation system comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the program.
The invention has the advantages that:
1. The method reduces the complexity of the adaptive multi-channel linear prediction dereverberation algorithm by diagonalizing the state-vector error covariance matrix of the Kalman filter.
2. The simplified Kalman filtering algorithm of the invention can be regarded as a normalized least mean square (NLMS) algorithm with a variable regularization factor. In addition, the error signal vector $e(n)$ and the target signal vector $x(n)$ of the proposed simplified Kalman filtering algorithm are both $M \times 1$ vectors, which makes it convenient to cascade other multi-channel algorithms afterwards. Furthermore, the calculated target signal variance provides additional usable information.
Drawings
FIG. 1 is a schematic diagram of room reverberation generation;
FIG. 2 is a block diagram of Kalman filtering dereverberation of the present invention;
FIG. 3 is a block diagram of Kalman weight vector update of the present invention;
FIG. 4 is a block diagram of the Kalman gain matrix calculation module of the present invention;
FIG. 5 is a block diagram of the present invention for estimating the a priori misalignment variance.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
A low-complexity dereverberation method based on Kalman filtering comprises the following steps:
Step 1) The signals $y_m(n)$, $1 \le m \le M$, collected by the M microphones are framed, windowed, and Fourier transformed to obtain the corresponding frequency-domain signals $Y_m(k,n)$; the frequency index k is omitted hereinafter for brevity.
The frequency-domain signal $Y_m(k,n)$ is calculated according to equation (1),
where k is the frequency index, N is the number of Fourier transform points, n is the time frame index, $w_{\mathrm{STFT}}(l)$ is the short-time Fourier transform analysis window function, and R is the frame shift.
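Equation (1) did not survive in this text. Assuming it is the standard STFT analysis sum $Y_m(k,n) = \sum_{l=0}^{N-1} w_{\mathrm{STFT}}(l)\, y_m(l+nR)\, e^{-j2\pi kl/N}$, step 1) can be sketched for one microphone as below. The Hann window, N = 512, R = 128, and the function name `stft` are illustrative assumptions.

```python
import numpy as np

def stft(y, n_fft=512, hop=128):
    """Return Y[k, n] for one microphone signal (assumed form of equation (1))."""
    y = np.asarray(y, dtype=float)
    window = np.hanning(n_fft)                   # stand-in for w_STFT(l)
    n_frames = 1 + (len(y) - n_fft) // hop       # only frames that fit entirely
    frames = np.stack([y[n * hop : n * hop + n_fft] for n in range(n_frames)])
    # FFT over the sample index l of each windowed frame; transpose so that
    # rows are frequency indices k and columns are frame indices n.
    return np.fft.fft(frames * window, axis=1).T

# Toy input (assumption): a cosine centred exactly on frequency bin k = 32.
t = np.arange(2048)
Y = stft(np.cos(2 * np.pi * 32 * t / 512))
peak_bin = int(np.argmax(np.abs(Y[:256, 0])))    # strongest bin in frame 0
```

For an M-microphone array the same function would be applied to each channel, and the M values at a given (k, n) stacked into the vector y(n) of equation (2).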
Step 2) The frequency-domain signals of the M microphones from frame n-D to frame n-L form the input signal matrix $Y(n-D)$, and the reverberation signal vector $r(n)$ is estimated using the Kalman weight vector, where D is the delay and L is the linear prediction length.
$Y(n-D)$ is a sparse matrix of dimension $M \times L_c$ formed from the microphone observation signals, with $L_c = M^2(L-D+1)$; $r(n)$ represents the late reverberation.
The input signal matrix $Y(n-D)$ is obtained according to equations (2) and (3):
$y(k,n) = [Y_1(k,n), \ldots, Y_M(k,n)]^{T}$ (2)
In equation (3), $\otimes$ denotes the Kronecker product.
The reverberation signal vector $r(n)$ is calculated according to equation (4).
In equation (4), a hat over a symbol denotes an estimate of that signal, the $M \times M$ matrices $C_p(n-1)$, $p \in \{D, D+1, \ldots, L\}$, are the time-varying Kalman weight coefficients, and L is the linear prediction length. The delay $D > 1$ is chosen according to the frame overlap of the short-time Fourier transform (STFT), so that the correlation between $x(n)$ and $r(n)$ is negligible. $\mathrm{Vec}\{\cdot\}$ is the matrix column-stacking operator.
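Equations (3) and (4) are missing from this text. The sketch below assumes the forms $Y(n-D) = I_M \otimes [y^{T}(n-D), \ldots, y^{T}(n-L)]$ and $r(n) = Y(n-D)\,\hat{c}(n-1)$, which are consistent with the stated dimension $M \times L_c$ and $L_c = M^2(L-D+1)$. The ordering of the stacked frames, the layout of the `history` array, and the function names are assumptions made for illustration.

```python
import numpy as np

def build_input_matrix(history, n, D, L):
    """Assumed equation (3): Y(n-D) = I_M kron [y^T(n-D), ..., y^T(n-L)].

    history[:, t] is the M-channel STFT vector y(t) of one frequency bin.
    """
    M = history.shape[0]
    # Stack the delayed frames n-D, ..., n-L into one row of length M*(L-D+1).
    stacked = np.concatenate([history[:, n - p] for p in range(D, L + 1)])
    # The Kronecker product places that row once per output channel.
    return np.kron(np.eye(M), stacked[np.newaxis, :])

def estimate_reverberation(Y_delayed, c_prev):
    """Assumed equation (4): r(n) = Y(n-D) c(n-1)."""
    return Y_delayed @ c_prev

# Toy example for a single frequency bin (all values are assumptions).
rng = np.random.default_rng(0)
M, D, L, n = 2, 2, 5, 9
history = rng.standard_normal((M, 10)) + 1j * rng.standard_normal((M, 10))
Y_delayed = build_input_matrix(history, n, D, L)        # shape (M, Lc)
c_prev = np.zeros(Y_delayed.shape[1], dtype=complex)    # Kalman weight vector
r = estimate_reverberation(Y_delayed, c_prev)           # late reverberation
```

Each row of the matrix holds the same stacked observations in a different column block, which is why the matrix is sparse: only $M(L-D+1)$ of the $L_c$ entries in a row are nonzero.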
Step 3) The reverberation signal vector $r(n)$ obtained in step 2) is subtracted from the signal $y(n)$ collected by the microphones at the current time to obtain the error signal vector $e(n)$:
$e(n) = y(n) - r(n)$ (5)
Step 4) The Kalman gain matrix $K(n)$ is calculated from the input signal matrix $Y(n-D)$, the target signal variance, and the a priori misalignment variance. Specifically:
Step 401) The target signal variance at time n is calculated by first-order smoothing according to equation (6),
where the two variance terms are the target signal variances at times n-1 and n-2, $x(n-1)$ is the target signal vector at time n-1, and $\alpha$ is a smoothing factor with a value of 0.2.
Step 402) The variance of the disturbance noise $w(n)$ is first calculated according to equation (7), and the a priori misalignment variance is then calculated according to equation (8).
In equation (7), $L_c = M^2(L-D+1)$, and $\eta$ is a small positive constant; a value of $10^{-5}$ is generally recommended.
Step 403) The regularization factor $\delta(n)$ is calculated from the target signal variance and the a priori misalignment variance according to equation (9).
Step 404) The covariance matrix $S_Y(n-D)$ is calculated from the signals collected by the microphones according to equation (10):
$S_Y(n-D) = Y(n-D)\,Y^{H}(n-D)$ (10)
Step 405) The Kalman gain matrix $K(n)$ is calculated according to equation (11):
$K(n) = Y^{H}(n-D)\,[S_Y(n-D) + \delta(n) I_M]^{-1}$ (11)
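Steps 404) and 405) can be sketched directly from equations (10) and (11). The regularization factor `delta` is passed in as a plain number here because equations (6) to (9), which would produce it, are not reproduced in this text. The Kronecker-structured test input and the function name `kalman_gain` are assumptions for illustration.

```python
import numpy as np

def kalman_gain(Y_delayed, delta):
    """K(n) = Y^H(n-D) [S_Y(n-D) + delta(n) I_M]^{-1}, equations (10)-(11)."""
    M = Y_delayed.shape[0]
    S_Y = Y_delayed @ Y_delayed.conj().T                  # equation (10), M x M
    return Y_delayed.conj().T @ np.linalg.inv(S_Y + delta * np.eye(M))

# Toy input (assumption): M = 2 channels, 4 past frames, one frequency bin.
rng = np.random.default_rng(1)
M, taps = 2, 4
y_past = rng.standard_normal(M * taps) + 1j * rng.standard_normal(M * taps)
Y_delayed = np.kron(np.eye(M), y_past[np.newaxis, :])    # shape (M, M*M*taps)
delta = 0.5
K = kalman_gain(Y_delayed, delta)                        # shape (M*M*taps, M)
```

Because $S_Y(n-D)$ is only $M \times M$, the cost of the inversion does not grow with the linear prediction length.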
Step 5) The Kalman filter coefficients are updated using the Kalman gain matrix $K(n)$ and the error signal vector $e(n)$.
Step 6) The target signal vector $x(n)$ is calculated from the signal $y(n)$ collected by the microphones at the current time, the input signal matrix $Y(n-D)$, and the updated Kalman filter coefficients.
Step 7) The frequency-domain target signal vector $x(n)$ is inverse Fourier transformed to obtain the time-domain target signal vector $x_t(l)$.
Step 8) The a posteriori misalignment variance is updated.
In equation (15), $I_M$ is the $M \times M$ identity matrix, $L_c = M^2(L-D+1)$, L is the linear prediction length, and $\mathrm{tr}[\cdot]$ denotes the trace of a matrix.
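Steps 3), 5) and 6) can be sketched as follows. Equations (12) and (13) are missing from this text, so the update $\hat{c}(n) = \hat{c}(n-1) + K(n)e(n)$ and the output $x(n) = y(n) - Y(n-D)\hat{c}(n)$ used here are assumed standard forms, not confirmed by the source. The regularization factor is held fixed rather than computed by equations (6) to (9), and the random regressors, the toy weights, and the name `update_and_extract` are illustrative.

```python
import numpy as np

def update_and_extract(y, Y_delayed, c_prev, K):
    """One frame: error (5), assumed update (12), assumed output (13)."""
    e = y - Y_delayed @ c_prev      # equation (5) with r(n) = Y(n-D) c(n-1)
    c_new = c_prev + K @ e          # ASSUMED form of equation (12)
    x = y - Y_delayed @ c_new       # ASSUMED form of equation (13)
    return x, c_new, e

# Toy system identification: recover fixed weights from noisy observations.
rng = np.random.default_rng(0)
M, taps, delta = 2, 4, 1.0          # delta is fixed here for illustration only
Lc = M * M * taps
c_true = 0.3 * (rng.standard_normal(Lc) + 1j * rng.standard_normal(Lc))
c_hat = np.zeros(Lc, dtype=complex)
for _ in range(400):
    y_past = rng.standard_normal(M * taps) + 1j * rng.standard_normal(M * taps)
    Y_delayed = np.kron(np.eye(M), y_past[np.newaxis, :])
    target = 0.1 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    y = Y_delayed @ c_true + target
    S_Y = Y_delayed @ Y_delayed.conj().T                              # (10)
    K = Y_delayed.conj().T @ np.linalg.inv(S_Y + delta * np.eye(M))   # (11)
    x, c_hat, e = update_and_extract(y, Y_delayed, c_hat, K)

initial_error = np.sum(np.abs(c_true) ** 2)         # misalignment at start
final_error = np.sum(np.abs(c_true - c_hat) ** 2)   # misalignment after adapting
```

With a fixed `delta` this behaves like a regularized NLMS filter. The method described here instead varies the regularization frame by frame through the variance recursions of steps 401) to 403) and step 8), which this sketch does not attempt to reproduce.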
Fig. 2 is a block diagram of the low-complexity Kalman-filter-based dereverberation system of the invention. $Y(n-D)$ is the input signal matrix formed by the frequency-domain signals of the M microphones from frame n-D to frame n-L, $r(n)$ is the reverberation signal vector estimated by the Kalman filtering algorithm, $y(n)$ is the reference signal vector formed by the signals collected by the microphones at the current time, and $x(n)$ is the final output target signal vector. The Fourier transform module 201 Fourier transforms the signals collected by the microphones; the transformed signal of the m-th microphone is denoted $Y_m(n)$. The delay module 202 delays the signals collected by the microphones. The delay $D > 1$ is chosen according to the frame overlap of the STFT, so that the correlation between $x(n)$ and $r(n)$ is negligible. The Kalman filter module 203 filters the input signal with the Kalman filter to estimate the reverberation signal. The summation module 204 calculates the target signal vector $x(n)$. The inverse Fourier transform module 205 transforms the frequency-domain signal to the time domain.
Fig. 3 is a schematic diagram of the Kalman weight coefficient update, which includes a Kalman gain calculation module 303. The update to the weight vector is obtained from the error signal vector and the Kalman gain matrix, and the final output target signal vector $x(n)$ is calculated from the updated weight vector.
Fig. 4 is a schematic block diagram of the Kalman gain matrix calculation, which includes an a priori misalignment variance estimation module 403. The product module 401 multiplies its two inputs, and the inversion module 402 inverts its input. The Kalman gain matrix is calculated from the target signal variance, the input signal matrix $Y(n-D)$, and the a priori misalignment variance, the latter being computed by the a priori misalignment variance estimation module 403. The Kalman gain is critical both to updating the filter weight coefficients and to estimating the a priori misalignment variance. $R_e(n)$ is calculated first, and the Kalman gain matrix $K(n)$ is then obtained from it.
The a priori misalignment variance estimation module shown in Fig. 5 also shows how the a posteriori misalignment variance is updated. The transpose module 501 transposes a matrix, and module 503 computes the trace of a matrix.
From the above analysis and Figs. 2, 3 and 4, the following conclusions can be drawn:
first, the technique of the invention greatly reduces the computational complexity of the STFT-domain adaptive multi-channel linear prediction dereverberation algorithm;
second, the technique reduces the computational complexity while preserving the quality of the output speech;
finally, the technique achieves a good compromise between the tracking performance and the convergence performance of the Kalman filter.
This shows that the invention provides an effective dereverberation technique that removes the reverberation caused by sound reflections in a room and improves speech intelligibility and the recognition accuracy of automatic speech recognition systems.
It should be noted that the simplified Kalman filtering algorithm of the invention can be regarded as an NLMS algorithm with a variable regularization factor, where $\delta(n)$ is that factor. The variance of the disturbance noise $w(n)$ calculated in equation (7) plays an important role in the estimation of the filter coefficients $c(n)$: a smaller value gives good misalignment performance but poor tracking performance, while a larger value gives good tracking performance but poor misalignment performance. In other words, its magnitude determines the tracking performance and the convergence performance of the Kalman filter. When the algorithm has not yet converged, the two terms whose difference enters equation (7) differ greatly, so the variance takes a larger value and the filter converges and tracks quickly. When the algorithm approaches steady state, the difference between the two terms shrinks, the variance becomes smaller, and the misalignment is lower.
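The NLMS interpretation can be made explicit with a short derivation. It assumes that equation (3) has the Kronecker form $Y(n-D) = I_M \otimes \tilde{y}^{T}(n-D)$, where $\tilde{y}(n-D)$ stacks the past observations $y(n-D), \ldots, y(n-L)$, and that the coefficient update has the usual form $\hat{c}(n) = \hat{c}(n-1) + K(n)e(n)$. The symbol $\tilde{y}$ is introduced here for illustration only, and neither assumed form is confirmed by this text.

```latex
\begin{aligned}
S_Y(n-D) &= \bigl(I_M \otimes \tilde{y}^{T}\bigr)\bigl(I_M \otimes \tilde{y}^{T}\bigr)^{H}
          = I_M \otimes \bigl(\tilde{y}^{T}\tilde{y}^{*}\bigr)
          = \|\tilde{y}\|^{2}\, I_M, \\
K(n) &= Y^{H}(n-D)\,\bigl[\|\tilde{y}\|^{2} I_M + \delta(n) I_M\bigr]^{-1}
      = \frac{Y^{H}(n-D)}{\|\tilde{y}\|^{2} + \delta(n)}, \\
\hat{c}(n) &= \hat{c}(n-1) + \frac{Y^{H}(n-D)\, e(n)}{\|\tilde{y}\|^{2} + \delta(n)} .
\end{aligned}
```

Under these assumptions the matrix inverse in equation (11) collapses to a scalar division, and the last line is an NLMS update whose regularization term $\delta(n)$ changes from frame to frame.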
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the invention, not to limit it. Although the invention has been described in detail with reference to the embodiments, those skilled in the art will understand that various changes may be made and equivalents substituted without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (5)
1. A kalman filter based dereverberation method, the method comprising:
preprocessing the original signals collected by each microphone to obtain corresponding frequency domain signals, and delaying to form input signals;
estimating a reverberation signal by using a Kalman filtering algorithm and a time-varying multi-channel autoregressive model, taking an original signal acquired by each microphone at the current moment as a reference signal, and subtracting the reverberation signal to obtain an error signal;
updating the coefficients of the Kalman filter by using the Kalman gain matrix and the error signal;
obtaining a target signal by using original signals, input signals and updated Kalman filter coefficients acquired by each microphone at the current moment;
and finally, converting the frequency domain target signal into a time domain by utilizing inverse Fourier transform.
2. The Kalman-filter-based dereverberation method according to claim 1, characterized in that the method specifically comprises:
step 1) framing, windowing, and Fourier transforming the signals $y_m(n)$, $1 \le m \le M$, collected by the M microphones to obtain the corresponding frequency-domain signals $Y_m(n)$,
the frequency-domain signal $Y_m(n)$ being given by equation (1),
wherein k is the frequency index, N is the number of Fourier transform points, n is the time frame index, $w_{\mathrm{STFT}}(l)$ is the short-time Fourier transform analysis window function, and R is the frame shift;
step 2) forming the input signal matrix $Y(n-D)$ from the frequency-domain signals of the M microphones from frame n-D to frame n-L, and estimating the reverberation signal vector $r(n)$ using the Kalman weight vector, wherein D is the delay and L is the linear prediction length;
$y(n) = [Y_1(n), \ldots, Y_M(n)]^{T}$ (2)
in equation (3), $I_M$ is the $M \times M$ identity matrix, $\otimes$ denotes the Kronecker product, and $Y(n-D)$ is a sparse matrix of dimension $M \times L_c$ formed from the microphone observation signals, with $L_c = M^2(L-D+1)$;
calculating the reverberation signal vector $r(n)$ according to equation (4);
in equation (4), the $M \times M$ matrices $C_p(n-1)$, $p \in \{D, D+1, \ldots, L\}$, are the time-varying Kalman weight coefficients, and $\mathrm{Vec}\{\cdot\}$ is the matrix column-stacking operator;
step 3) subtracting the reverberation signal vector r (n) obtained in the step 2) from the signal y (n) collected by each microphone at the current moment to obtain an error signal vector e (n);
e(n)=y(n)-r(n) (5)
step 4) calculating a Kalman gain matrix K(n);
step 5) updating the Kalman filter coefficients by using the Kalman gain matrix K(n) and the error signal vector e(n);
step 6) calculating a target signal vector x(n) by using the signal y(n) collected by the microphones at the current moment, the input signal matrix Y(n-D), and the updated Kalman filter coefficients;
step 7) performing an inverse Fourier transform on the frequency-domain target signal vector x(n) to obtain a time-domain target signal vector $x_t(l)$.
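Steps 2), 3), 5) and 6) can be sketched for a single frequency bin as below. This is a hedged NumPy illustration under stated assumptions, not the patent's own code. The Kronecker structure of Y(n-D) is inferred from the stated M × L_c dimension. The coefficient update "previous coefficients plus K(n) e(n)" and the target estimate "y(n) minus Y(n-D) times the updated coefficients" are the conventional Kalman forms and are assumed here, since the corresponding formulas are not legible in the text. The names `build_input_matrix`, `dereverb_frame`, `past_frames` and `c_prev` are hypothetical.

```python
import numpy as np

def build_input_matrix(past_frames):
    """Build Y(n-D) as I_M kron [y^T(n-D), ..., y^T(n-L)] (assumed structure).

    past_frames: complex array of shape (L - D + 1, M); row 0 is y(n-D),
    the last row is y(n-L).
    Returns an array of shape (M, M * M * (L - D + 1)).
    """
    M = past_frames.shape[1]
    row = past_frames.reshape(1, -1)  # delayed frames laid side by side
    return np.kron(np.eye(M), row)

def dereverb_frame(y, Y, c_prev, K):
    """One frame of steps 2), 3), 5) and 6) for a single frequency bin.

    y: current microphone vector, shape (M,)
    Y: input signal matrix, shape (M, Lc)
    c_prev: previous Kalman weight vector, shape (Lc,)
    K: Kalman gain matrix, shape (Lc, M)
    """
    r = Y @ c_prev       # step 2: estimated reverberation vector
    e = y - r            # step 3: error vector, equation (5)
    c = c_prev + K @ e   # step 5: coefficient update (assumed form)
    x = y - Y @ c        # step 6: target signal estimate (assumed form)
    return x, c, e
```

In use, `dereverb_frame` would be called once per frame and per frequency bin, carrying `c` forward as the next `c_prev`, with `K` supplied by the gain computation of step 4).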
3. The Kalman filter-based dereverberation method according to claim 2, wherein the step 4) specifically comprises:
step 401) calculating the target signal variance by first-order smoothing according to formula (6),
wherein the target signal variance at time n-1 is obtained from the target signal variance at time n-2 and the target signal vector x(n-1) at time n-1, and α is a smoothing factor with a value of 0.2;
step 402) first calculating the variance of the disturbance noise w(n) according to equation (7), and then calculating the a priori imbalance variance according to equation (8);
in the formula (7), $L_c = M^2(L-D+1)$ and η is usually $10^{-5}$; the other quantity used is the posterior imbalance variance at time n-1;
step 403) calculating a regularization factor δ(n) from the target signal variance and the a priori imbalance variance according to equation (9);
step 404) calculating a covariance matrix $S_Y(n-D)$ from the signals collected by the microphones according to equation (10);
$S_Y(n-D) = Y(n-D)\,Y^H(n-D)$ (10)
step 405) calculating the Kalman gain matrix K(n) according to equation (11);
$K(n) = Y^H(n-D)\,[S_Y(n-D) + \delta(n) I_M]^{-1}$ (11).
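Equations (10) and (11) translate almost directly into code. The sketch below is a minimal NumPy illustration; the function name `kalman_gain` is hypothetical, and the regularization factor `delta` is taken as a given real, positive number because formula (9) that defines it is not legible in the text.

```python
import numpy as np

def kalman_gain(Y, delta):
    """Kalman gain K(n) from equations (10) and (11).

    Y: input signal matrix Y(n-D), shape (M, Lc), complex
    delta: regularization factor delta(n), assumed real and positive
    Returns K(n) with shape (Lc, M).
    """
    M = Y.shape[0]
    S = Y @ Y.conj().T                   # equation (10): S_Y = Y Y^H
    A = S + delta * np.eye(M)            # regularized M x M matrix
    # Equation (11): K = Y^H A^{-1}. A is Hermitian for real delta, so
    # Y^H A^{-1} equals (A^{-1} Y)^H, which avoids forming the inverse.
    return np.linalg.solve(A, Y).conj().T
```

Only an M × M system is solved per frame and frequency bin, so the cost of this step stays small even when L_c is large.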
4. The Kalman filter-based dereverberation method according to claim 3, further comprising, after the step 7):
updating the posterior imbalance variance.
5. A Kalman filter based dereverberation system comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of one of claims 1 to 4 when executing the program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711285885.4A CN108172231B (en) | 2017-12-07 | 2017-12-07 | Dereverberation method and system based on Kalman filtering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108172231A (en) | 2018-06-15 |
CN108172231B (en) | 2021-07-30 |
Family
ID=62524587
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711285885.4A Active CN108172231B (en) | 2017-12-07 | 2017-12-07 | Dereverberation method and system based on Kalman filtering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108172231B (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108600894A (en) * | 2018-07-11 | 2018-09-28 | 重庆传乐音响科技有限公司 | A kind of earphone adaptive active noise control system and method |
CN109297718A (en) * | 2018-09-29 | 2019-02-01 | 重庆长安汽车股份有限公司 | A kind of evaluation method of order whistler |
CN110289011A (en) * | 2019-07-18 | 2019-09-27 | 大连理工大学 | A kind of speech-enhancement system for distributed wireless acoustic sensor network |
WO2020078210A1 (en) * | 2018-10-18 | 2020-04-23 | 电信科学技术研究院有限公司 | Adaptive estimation method and device for post-reverberation power spectrum in reverberation speech signal |
CN111474481A (en) * | 2020-04-13 | 2020-07-31 | 深圳埃瑞斯瓦特新能源有限公司 | Battery SOC estimation method and device based on extended Kalman filtering algorithm |
CN111540372A (en) * | 2020-04-28 | 2020-08-14 | 北京声智科技有限公司 | Method and device for multi-microphone array noise reduction processing |
CN111599374A (en) * | 2020-04-16 | 2020-08-28 | 云知声智能科技股份有限公司 | Single-channel voice dereverberation method and device |
CN111599372A (en) * | 2020-04-02 | 2020-08-28 | 云知声智能科技股份有限公司 | Stable on-line multi-channel voice dereverberation method and system |
CN111933170A (en) * | 2020-07-20 | 2020-11-13 | 歌尔科技有限公司 | Voice signal processing method, device, equipment and storage medium |
CN112017680A (en) * | 2020-08-26 | 2020-12-01 | 西北工业大学 | Dereverberation method and device |
CN114205731A (en) * | 2021-12-08 | 2022-03-18 | 随锐科技集团股份有限公司 | Speaker area detection method, device, electronic equipment and storage medium |
CN115065422A (en) * | 2021-07-26 | 2022-09-16 | 中国计量科学研究院 | System and method for evaluating communication quality in reverberant room |
CN117318671A (en) * | 2023-11-29 | 2023-12-29 | 有研(广东)新材料技术研究院 | Self-adaptive filtering method based on fast Fourier transform |
CN117316175A (en) * | 2023-11-28 | 2023-12-29 | 山东放牛班动漫有限公司 | Intelligent encoding storage method and system for cartoon data |
WO2024198931A1 (en) * | 2023-03-27 | 2024-10-03 | 厦门亿联网络技术股份有限公司 | Kalman filter-based adaptive noise reduction method and device for microphone array |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101460999A (en) * | 2006-06-05 | 2009-06-17 | 埃克奥迪公司 | Blind signal extraction |
CN103187068A (en) * | 2011-12-30 | 2013-07-03 | 联芯科技有限公司 | Priori signal-to-noise ratio estimation method, device and noise inhibition method based on Kalman |
US20130332156A1 (en) * | 2012-06-11 | 2013-12-12 | Apple Inc. | Sensor Fusion to Improve Speech/Audio Processing in a Mobile Device |
CN107393550A (en) * | 2017-07-14 | 2017-11-24 | 深圳永顺智信息科技有限公司 | Method of speech processing and device |
Non-Patent Citations (2)
Title |
---|
CHRISTINE EVERS ET AL.: "Multichannel Online Blind Speech Dereverberation with Marginalization of Static Observation Parameters in a Rao-Blackwellized Particle Filter", 《JOURNAL OF SIGNAL PROCESSING SYSTEMS》 * |
SEBASTIAN BRAUN ET AL.: "Online Dereverberation for Dynamic Scenarios Using a Kalman Filter With an Autoregressive Model", 《IEEE SIGNAL PROCESSING LETTERS》 * |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108600894A (en) * | 2018-07-11 | 2018-09-28 | 重庆传乐音响科技有限公司 | A kind of earphone adaptive active noise control system and method |
CN109297718A (en) * | 2018-09-29 | 2019-02-01 | 重庆长安汽车股份有限公司 | A kind of evaluation method of order whistler |
WO2020078210A1 (en) * | 2018-10-18 | 2020-04-23 | 电信科学技术研究院有限公司 | Adaptive estimation method and device for post-reverberation power spectrum in reverberation speech signal |
CN110289011B (en) * | 2019-07-18 | 2021-06-25 | 大连理工大学 | Voice enhancement system for distributed wireless acoustic sensor network |
CN110289011A (en) * | 2019-07-18 | 2019-09-27 | 大连理工大学 | A kind of speech-enhancement system for distributed wireless acoustic sensor network |
CN111599372A (en) * | 2020-04-02 | 2020-08-28 | 云知声智能科技股份有限公司 | Stable on-line multi-channel voice dereverberation method and system |
CN111599372B (en) * | 2020-04-02 | 2023-03-21 | 云知声智能科技股份有限公司 | Stable on-line multi-channel voice dereverberation method and system |
CN111474481A (en) * | 2020-04-13 | 2020-07-31 | 深圳埃瑞斯瓦特新能源有限公司 | Battery SOC estimation method and device based on extended Kalman filtering algorithm |
CN111599374A (en) * | 2020-04-16 | 2020-08-28 | 云知声智能科技股份有限公司 | Single-channel voice dereverberation method and device |
CN111540372A (en) * | 2020-04-28 | 2020-08-14 | 北京声智科技有限公司 | Method and device for multi-microphone array noise reduction processing |
CN111540372B (en) * | 2020-04-28 | 2023-09-12 | 北京声智科技有限公司 | Method and device for noise reduction processing of multi-microphone array |
CN111933170A (en) * | 2020-07-20 | 2020-11-13 | 歌尔科技有限公司 | Voice signal processing method, device, equipment and storage medium |
CN111933170B (en) * | 2020-07-20 | 2024-03-29 | 歌尔科技有限公司 | Voice signal processing method, device, equipment and storage medium |
CN112017680A (en) * | 2020-08-26 | 2020-12-01 | 西北工业大学 | Dereverberation method and device |
CN115065422A (en) * | 2021-07-26 | 2022-09-16 | 中国计量科学研究院 | System and method for evaluating communication quality in reverberant room |
CN114205731A (en) * | 2021-12-08 | 2022-03-18 | 随锐科技集团股份有限公司 | Speaker area detection method, device, electronic equipment and storage medium |
CN114205731B (en) * | 2021-12-08 | 2023-12-26 | 随锐科技集团股份有限公司 | Speaker area detection method, speaker area detection device, electronic equipment and storage medium |
WO2024198931A1 (en) * | 2023-03-27 | 2024-10-03 | 厦门亿联网络技术股份有限公司 | Kalman filter-based adaptive noise reduction method and device for microphone array |
CN117316175A (en) * | 2023-11-28 | 2023-12-29 | 山东放牛班动漫有限公司 | Intelligent encoding storage method and system for cartoon data |
CN117316175B (en) * | 2023-11-28 | 2024-01-30 | 山东放牛班动漫有限公司 | Intelligent encoding storage method and system for cartoon data |
CN117318671A (en) * | 2023-11-29 | 2023-12-29 | 有研(广东)新材料技术研究院 | Self-adaptive filtering method based on fast Fourier transform |
CN117318671B (en) * | 2023-11-29 | 2024-04-23 | 有研(广东)新材料技术研究院 | Self-adaptive filtering method based on fast Fourier transform |
Also Published As
Publication number | Publication date |
---|---|
CN108172231B (en) | 2021-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108172231B (en) | Dereverberation method and system based on Kalman filtering | |
CN110085249B (en) | Single-channel speech enhancement method of recurrent neural network based on attention gating | |
US10446171B2 (en) | Online dereverberation algorithm based on weighted prediction error for noisy time-varying environments | |
Kinoshita et al. | Neural Network-Based Spectrum Estimation for Online WPE Dereverberation. | |
JP5124014B2 (en) | Signal enhancement apparatus, method, program and recording medium | |
US8467538B2 (en) | Dereverberation apparatus, dereverberation method, dereverberation program, and recording medium | |
US7895038B2 (en) | Signal enhancement via noise reduction for speech recognition | |
US5924065A (en) | Environmently compensated speech processing | |
EP3685378B1 (en) | Signal processor and method for providing a processed audio signal reducing noise and reverberation | |
US11373667B2 (en) | Real-time single-channel speech enhancement in noisy and time-varying environments | |
Heymann et al. | Frame-online DNN-WPE dereverberation | |
CN111312275A (en) | Online sound source separation enhancement system based on sub-band decomposition | |
Nesta et al. | A flexible spatial blind source extraction framework for robust speech recognition in noisy environments | |
CN110111802B (en) | Kalman filtering-based adaptive dereverberation method | |
CN117854536B (en) | RNN noise reduction method and system based on multidimensional voice feature combination | |
CN117219102A (en) | Low-complexity voice enhancement method based on auditory perception | |
WO2020078210A1 (en) | Adaptive estimation method and device for post-reverberation power spectrum in reverberation speech signal | |
Kinoshita et al. | Multi-step linear prediction based speech dereverberation in noisy reverberant environment. | |
Yoshioka et al. | Dereverberation by using time-variant nature of speech production system | |
CN116052702A (en) | Kalman filtering-based low-complexity multichannel dereverberation noise reduction method | |
Tan et al. | Kronecker Product Based Linear Prediction Kalman Filter for Dereverberation and Noise Reduction | |
CN112687285B (en) | Echo cancellation method and device | |
CN115588438B (en) | WLS multi-channel speech dereverberation method based on bilinear decomposition | |
KR102358151B1 (en) | Noise reduction method using convolutional recurrent network | |
CN113870884B (en) | Single-microphone noise suppression method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||