CN106340304A - Online speech enhancement method for non-stationary noise environment - Google Patents
- Publication number
- CN106340304A (application number CN201610843483.0A)
- Authority
- CN
- China
- Prior art keywords
- noise
- theta
- estimation
- voice signal
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Abstract
The invention provides an online speech enhancement method for a non-stationary noise environment. The method comprises the steps of (1) establishing a system model for the non-stationary noise environment, (2) framing and windowing, (3) initializing the system, (4) estimating the AR parameters, and (5) estimating the speech signal state sequence. To address the problem that the AR parameters of the speech model cannot be updated in real time as the noise changes, the invention proposes a dual Kalman filtering framework: two Kalman filters run in parallel, the speech signal state estimate and the AR parameter estimate update each other, and the state estimation and parameter estimation processes alternate, so that parameter estimation adapts to changes in the noise. This improves the accuracy of the system model and hence the speech enhancement performance. To address the problem that the traditional Kalman filtering algorithm cannot handle non-stationary noise, the invention combines Kalman filtering with convex optimization in an improved Kalman filtering framework in which both Gaussian noise and non-stationary noise can be estimated accurately, improving the accuracy of speech enhancement.
Description
Technical field
The present invention relates to the field of speech enhancement, and in particular to an online speech enhancement method for non-stationary noise environments.
Background art
In front-end processing for speech recognition, the speech signal is always disturbed, and sometimes swamped, by various kinds of noise. Because the interference is random, signal processing can only improve the speech quality as far as possible. The main purpose of speech enhancement is to extract the clean original speech from the noisy speech.
Common speech enhancement algorithms fall mainly into the following categories:
1. Noise cancellation: the noise component is subtracted directly from the noisy speech, in the time domain or in the frequency domain. The defining feature of this method is that it needs the background noise as a reference signal, and the accuracy of that reference directly determines its performance.
2. Harmonic enhancement: voiced speech is clearly periodic. In the frequency domain this periodicity appears as a series of peaks at the fundamental frequency (pitch) and its harmonics, and these components carry most of the energy of the speech. Speech can be enhanced by exploiting this periodicity: a comb filter extracts the pitch and its harmonic components and suppresses other periodic noise and aperiodic broadband noise.
3. Enhancement based on a speech production model: speech production can be modeled as a linear time-varying filter, with different excitation sources for different types of speech. The most widely used speech production model is the all-pole model. A family of enhancement algorithms follows from the speech production model, such as time-varying-parameter Wiener filtering and Kalman filtering.
4. Enhancement based on the short-time spectrum: there are many such algorithms, for example spectral subtraction, Wiener filtering and minimum mean square error (MMSE) estimation. These methods work over a wide range of signal-to-noise ratios, are simple, and are easy to run in real time.
5. Enhancement based on wavelet decomposition: this approach developed along with wavelet decomposition as a tool of mathematical analysis, and it also incorporates some basic principles of spectral subtraction.
6. Enhancement based on auditory masking: this approach exploits the auditory properties of the human ear.
Speech enhancement based on Kalman filtering belongs to the third category above. When used for speech enhancement, traditional Kalman filtering rests on two important assumptions: that the process noise and the measurement noise both follow Gaussian distributions. In practical speech enhancement, traditional Kalman filtering shows two limitations:
1. The AR parameters must be estimated accurately. In a real recording environment, however, the noise changes continuously, so the AR parameters of the speech model must be estimated in real time, and the estimation must account for the influence of the various noises; otherwise speech enhancement performance degrades.
2. The traditional Kalman filtering algorithm considers only Gaussian noise, which does not match practical conditions. During acquisition, the speech signal can be contaminated by non-stationary noise that is sparse and follows a Laplacian distribution. Such noise is uncommon, but it does occur and it strongly affects speech quality. If non-stationary noise is treated as Gaussian noise during enhancement, the enhancement quality drops severely, which hampers subsequent semantic recognition of the speech.
In view of these problems, an online speech enhancement technique that can handle Gaussian noise and non-stationary noise simultaneously and in real time is needed.
Summary of the invention
The technical problem to be solved by the present invention is that existing Kalman filtering methods cannot update the AR parameters of the speech model in real time and cannot handle non-stationary noise in the measurement process. Combining Kalman filtering with convex optimization, the invention provides an online speech enhancement method for non-stationary noise environments that estimates both the AR parameters and the non-stationary noise online.
To achieve the above object, the technical solution provided by the present invention is an online speech enhancement method for non-stationary noise environments, comprising the following steps:
1) Establish the system model for the non-stationary noise environment
1.1) Establish the autoregressive (AR) model for the case where Gaussian noise and sparse noise coexist
Speech production is an autoregressive process in which white noise excites an all-pole linear system: the current output equals the excitation at the current time plus a weighted sum of the past p outputs. This autoregressive (AR) model is expressed as:
$$s(k) = \sum_{i=1}^{p} a_i\, s(k-i) + u(k) \tag{1}$$
where u(k) is the Gaussian white noise excitation at time k; s(k-i) is the speech signal at time (k-i); s(k) is the speech signal at time k; $a_i$ is the i-th linear prediction coefficient, also called an AR model parameter; and p is the order of the AR model;
A speech signal model that matches the actual measurement process is then established. The measurement process is described as:
$$y(k) = s(k) + n(k) + v(k) \tag{2}$$
where y(k) is the measured speech signal at time k; s(k) is the speech signal at time k; n(k) is the Gaussian white noise at time k; and v(k) is the non-stationary noise at time k, which follows a Laplacian distribution and is sparse;
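The measurement model of formulas (1) and (2) can be illustrated with a small simulation. The following is a minimal sketch, not part of the patent: the function name, the noise levels, and the way sparsity is produced (a Laplacian sample kept only at a small random fraction of time steps) are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_noisy_ar(a, n_samples, sigma_u=1.0, sigma_n=0.1,
                      spike_prob=0.02, spike_scale=2.0, seed=0):
    """Simulate y(k) = s(k) + n(k) + v(k) for an AR(p) signal s(k).

    a[i-1] holds the coefficient a_i of the AR model. All noise levels
    are illustrative assumptions, not values from the patent.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    p = a.size
    s = np.zeros(n_samples + p)              # p leading zeros as initial conditions
    u = rng.normal(0.0, sigma_u, n_samples)  # white Gaussian excitation u(k)
    for k in range(n_samples):
        past = s[k:k + p][::-1]              # s(k-1), ..., s(k-p)
        s[k + p] = a @ past + u[k]           # AR recursion
    s = s[p:]
    n = rng.normal(0.0, sigma_n, n_samples)  # Gaussian measurement noise n(k)
    mask = rng.random(n_samples) < spike_prob
    v = mask * rng.laplace(0.0, spike_scale, n_samples)  # sparse Laplacian noise v(k)
    return s, n, v, s + n + v
```

Calling `simulate_noisy_ar([0.5, -0.2], 200)` returns the clean signal, the two noise sequences and the measured signal, which is convenient for checking an enhancement routine against a known `s`.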
1.2) Establish the speech signal state-space model
Formulas (1) and (2) are converted into a state-space model:
$$x(k) = F\,x(k-1) + p(k) \tag{3}$$
$$y(k) = C\,x(k) + n(k) + v(k) \tag{4}$$
where
$$F = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a_p(k) & a_{p-1}(k) & a_{p-2}(k) & \cdots & a_1(k) \end{bmatrix} \tag{5}$$
$$C = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \end{bmatrix} \tag{6}$$
$$x(k) = \begin{bmatrix} s(k-p+1) & \cdots & s(k) \end{bmatrix}^{T} \tag{7}$$
In the speech signal state equation (3) and measurement equation (4), x(k) is the speech signal state estimation sequence at time k, i.e. the optimal state estimate of the speech signal; x(k-1) is the speech signal state estimation sequence at time (k-1); y(k) is the measured speech signal at time k; F is the state-transition matrix formed from the linear prediction coefficients, whose last row $[a_p(k)\ \cdots\ a_1(k)]$ is called the AR parameters; C = [0 0 … 0 1] is the measurement matrix; p(k) is the state noise at time k, which follows a Gaussian distribution; n(k) is the measurement noise at time k, which follows a Gaussian distribution; and v(k) is the non-stationary noise at time k, which follows a Laplacian distribution;
The statistical properties of the state noise p(k) and the measurement noise n(k) of the speech signal are:
$$E(p(k)) = q,\qquad E(n(k)) = r$$
$$E\big(p(k)\,p(j)^{T}\big) = Q\,\delta_{kj},\qquad E\big(n(k)\,n(j)^{T}\big) = R\,\delta_{kj} \tag{8}$$
where q and r are the means of the noises p(k) and n(k) respectively; Q and R are the covariances of p(k) and n(k) respectively; and $\delta_{kj}$ is the Kronecker delta. The speech enhancement problem is to estimate the optimal speech signal x(k) given the measured speech signal y(k);
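As an illustration of how the state-transition matrix and the measurement matrix of the state-space model are laid out, the sketch below builds both from a list of AR coefficients. The function name is hypothetical, and the coefficient ordering `a = [a_1, ..., a_p]` is an assumption of this sketch.

```python
import numpy as np

def companion_matrices(a):
    """Return (F, C) for AR coefficients a = [a_1, ..., a_p].

    The state is x(k) = [s(k-p+1), ..., s(k)]^T, so the upper rows of F
    shift the state by one sample and the last row is [a_p ... a_1].
    """
    a = np.asarray(a, dtype=float)
    p = a.size
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)   # shift rows: x_i(k) = x_{i+1}(k-1)
    F[-1, :] = a[::-1]           # last row holds the AR parameters
    C = np.zeros((1, p))
    C[0, -1] = 1.0               # the measurement picks out s(k)
    return F, C
```

With `a = [0.5, -0.2, 0.1]` and a previous state `[s(k-3), s(k-2), s(k-1)] = [1, 2, 3]`, the product `F @ x_prev` shifts the state and appends the one-step AR prediction of s(k) as the last entry.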
2) Framing and windowing
The speech signal is short-term stationary: it is regarded as unchanged within 10 to 30 ms. The signal can therefore be divided into short segments for processing, which is called framing. Framing is implemented by weighting the signal with a movable window of finite length. The frame rate is usually 33 to 100 frames per second. Frames are taken by overlapping segmentation; the overlapping part of one frame and the next is called the frame shift, and the ratio of frame shift to frame length is 0 to 0.5;
3) System initialization
3.1) Initialize the parameters of the improved Kalman filter
Initialize the speech signal state estimation sequence x(0|0) and the covariance matrix P(0|0), ensuring that the covariance matrix is positive definite;
3.2) Initialize the AR parameters
Initialize the AR parameter state estimation sequence θ(0|0);
4) Estimate the AR parameters
The AR parameters are the last row $[a_p(k)\ \cdots\ a_1(k)]$ of the state-transition matrix F in formula (3). They describe the speech production process, and their accuracy directly affects the speech enhancement result. The estimation of the AR parameters takes into account the speech signal state estimation sequence x(k-1), the state noise q(k), the measurement noise n(k) and the non-stationary noise v(k). A new state-space model for AR parameter estimation is established to achieve online robust estimation of the AR parameters. The real-time estimation process is as follows:
4.1) Establish the parameter estimation model for the AR parameters
The AR parameter model in an environment where Gaussian noise and non-stationary noise are mixed is described as:
$$\theta(k) = \theta(k-1) + q(k)$$
$$y(k) = A\,\theta(k) + r(k) + w(k) \tag{9}$$
where $\theta(k) = [a_p(k)\ \cdots\ a_1(k)]^{T}$ is the AR parameter state sequence at time k; q(k) is the state noise at time k, which follows a Gaussian distribution with covariance matrix D; r(k) is the measurement noise at time k, which follows a Gaussian distribution with covariance matrix L; w(k) is the non-stationary measurement noise at time k, which is sparse and follows a Laplacian distribution; $A = x(k-1)^{T} = [s(k-p)\ \cdots\ s(k-1)]$ is the measurement matrix; and y(k) is the measured speech signal at time k. The statistical properties of the state noise q(k) and the measurement noise r(k) are:
$$E(q(k)) = d,\qquad E(r(k)) = l$$
$$E\big(q(k)\,q(j)^{T}\big) = D\,\delta_{kj},\qquad E\big(r(k)\,r(j)^{T}\big) = L\,\delta_{kj} \tag{10}$$
where d and l are the means of the noises q(k) and r(k) respectively; D and L are the covariances of q(k) and r(k) respectively; and $\delta_{kj}$ is the Kronecker delta;
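4.2) Reformulate the traditional Kalman filtering problem from the viewpoint of convex optimization
To make the sparse noise easy to estimate, the Kalman filtering problem is reformulated as a convex optimization problem. The state-space model of traditional Kalman filtering, without the non-stationary noise w(k), is:
$$\theta(k) = \theta(k-1) + q(k)$$
$$y(k) = A\,\theta(k) + r(k) \tag{11}$$
According to Bayes' rule, the AR parameter estimation problem is to estimate the optimal AR parameter sequence θ(k) given the measurement data y(k), i.e.:
$$\hat{\theta}(k) = \arg\max_{\theta(k)}\, p\big(\theta(k)\mid y(k)\big) = \arg\max_{\theta(k)}\, p\big(y(k)\mid\theta(k)\big)\,p\big(\theta(k)\big) \tag{12}$$
According to maximum likelihood estimation theory, the likelihood functions of p(y(k)|θ(k)) and p(θ(k)) are set up:
$$l_1\big(y(k),\theta(k)\big) \propto \exp\!\Big(-\tfrac{1}{2}\big(y(k)-A\theta(k)\big)^{T} L^{-1}\big(y(k)-A\theta(k)\big)\Big) \tag{13}$$
$$l_2\big(\theta(k)\big) \propto \exp\!\Big(-\tfrac{1}{2}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)^{T}\,\Psi(k)^{-1}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)\Big) \tag{14}$$
where Ψ is the covariance matrix of the conditional probability p(θ(k)|y(k)) when $\hat{\theta}(k-1\mid k-1)$ is known, $\Psi(k) = P_\theta(k\mid k) + D(k)$, and $P_\theta(k\mid k)$ is the updated covariance. When the likelihood functions $l_1(y(k),\theta(k))$ and $l_2(\theta(k))$ reach their maximum, the conditional probability p(θ(k)|y(k)) attains its optimal estimate. Inspection of formulas (13) and (14) shows that maximizing $l_1(y(k),\theta(k))$ and $l_2(\theta(k))$ is equivalent to minimizing the exponents $\big(y(k)-A\theta(k)\big)^{T}L^{-1}\big(y(k)-A\theta(k)\big)$ and $\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)^{T}\Psi(k)^{-1}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)$ of the exponentials in the likelihood functions, which gives the following optimization problem:
$$\min_{\theta(k),\,r(k)}\ r(k)^{T}L^{-1}r(k) + \big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)^{T}\,\Psi(k)^{-1}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)$$
$$\text{subject to}\quad y(k) = A\,\theta(k) + r(k) \tag{15}$$
where θ(k) and r(k) are the variables and $\Psi(k) = P_\theta(k\mid k) + D(k)$ is the covariance matrix of the Gaussian noise. The estimate of θ(k) is $\hat{\theta}(k)$, and r(k) is the estimate of the Gaussian noise. $P_\theta(k\mid k)$ is the covariance update matrix:
$$P_\theta(k\mid k) = \big(I - K_\theta(k)\,A(k)\big)\,P_\theta(k\mid k-1) \tag{16}$$
$P_\theta(k\mid k-1)$ is the covariance prediction matrix:
$$P_\theta(k\mid k-1) = P_\theta(k-1\mid k-1) + D(k-1) \tag{17}$$
$K_\theta(k)$ is the gain:
$$K_\theta(k) = P_\theta(k\mid k-1)\,A^{T}\big(A\,P_\theta(k\mid k-1)\,A^{T} + L(k-1)\big)^{-1} \tag{18}$$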
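The covariance prediction, gain and covariance update of formulas (16) to (18) can be written as one recursion step. The sketch below is a Gaussian-only illustration: the function name is hypothetical, and the state update `theta_prev + K (y - A theta_prev)` is the standard Kalman correction, which the text above does not spell out and is therefore an assumption here.

```python
import numpy as np

def ar_kalman_step(theta_prev, P_prev, A, y, D, L):
    """One Gaussian-only Kalman step for the AR parameter model.

    theta_prev: (p,) previous estimate; P_prev: (p, p) previous covariance;
    A: (p,) or (1, p) measurement row x(k-1)^T; y: scalar measurement;
    D: (p, p) state-noise covariance; L: scalar measurement-noise variance.
    """
    A = np.atleast_2d(A)                        # 1 x p measurement matrix
    P_pred = P_prev + D                         # covariance prediction, formula (17)
    S = A @ P_pred @ A.T + L                    # innovation variance (1 x 1)
    K = P_pred @ A.T @ np.linalg.inv(S)         # gain, formula (18)
    # Standard Kalman correction of the state (assumed, not given in the text).
    theta = theta_prev + (K @ (y - A @ theta_prev)).ravel()
    P = (np.eye(theta_prev.size) - K @ A) @ P_pred   # covariance update, formula (16)
    return theta, P
```

Because the model is a random walk, the predicted parameter vector equals the previous estimate, so only the covariance changes in the prediction step.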
4.3) Construct the optimization problem for non-stationary noise estimation from the viewpoint of convex optimization
Non-stationary noise follows a Laplacian distribution and is sparse. The core idea of estimating it is to exploit this sparsity. Once step 4.2) has turned the traditional Kalman filtering problem into a convex optimization problem, a sparsity constraint on the non-stationary noise w(k) can be added to the optimization to estimate the sparse noise. The new optimization problem is:
$$\min_{\theta(k),\,r(k),\,w(k)}\ r(k)^{T}L^{-1}r(k) + \big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)^{T}\,\Psi(k)^{-1}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big) + \lambda\,\lVert w(k)\rVert_{1}$$
$$\text{subject to}\quad y(k) = A\,\theta(k) + r(k) + w(k) \tag{19}$$
where w(k) is the sparse noise and λ weights the sparsity term. Solving this optimization problem yields the optimal estimate θ(k) of the AR parameters, i.e. $\hat{\theta}(k\mid k)$. The optimization problem of formula (19) is convex and can be solved with the interior-point methods used in engineering;
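For a scalar measurement y(k), as in this model, a problem of this kind does not necessarily need a general interior-point solver. The sketch below assumes the objective is the quadratic Kalman cost plus an l1 penalty `lam * |w(k)|`; the weight `lam`, the function names, and the use of a soft-threshold closed form are assumptions of this illustration and are not stated in the patent. Under that assumption, eliminating θ(k) and r(k) leaves a one-dimensional problem in w(k) whose minimizer is a soft threshold of the innovation.

```python
import numpy as np

def soft_threshold(e, tau):
    """Shrink e toward zero by tau (the proximal operator of tau * |.|)."""
    return np.sign(e) * max(abs(e) - tau, 0.0)

def robust_ar_update(theta_pred, Psi, A, y, L, lam):
    """Minimize r^2/L + (theta - theta_pred)^T Psi^{-1} (theta - theta_pred) + lam*|w|
    subject to y = A theta + r + w, for a scalar measurement y.

    lam is a hypothetical sparsity weight; it is not specified in the patent.
    """
    A = np.asarray(A, dtype=float).ravel()
    S = float(A @ Psi @ A + L)            # innovation variance
    e = float(y - A @ theta_pred)         # innovation
    w = soft_threshold(e, lam * S / 2.0)  # sparse-noise estimate
    K = Psi @ A / S                       # Kalman-style gain
    theta = theta_pred + K * (e - w)      # update with the outlier removed
    r = y - A @ theta - w                 # Gaussian-noise estimate
    return theta, r, w
```

Small innovations leave `w` at zero and reduce to the ordinary Kalman correction; a large innovation is mostly absorbed by `w`, so an isolated spike barely moves the parameter estimate.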
5) Estimate the speech signal state sequence
5.1) Reformulate the traditional Kalman filtering problem from the viewpoint of convex optimization
To make the sparse noise easy to estimate, the Kalman filtering problem is reformulated as a convex optimization problem. The state-space model of traditional Kalman filtering is:
$$x(k) = F\,x(k-1) + p(k) \tag{20}$$
$$y(k) = C\,x(k) + n(k) \tag{21}$$
According to Bayes' rule, the Kalman filtering problem is to estimate the optimal speech state sequence x(k) given the measurement data y(k), i.e.:
$$\hat{x}(k) = \arg\max_{x(k)}\, p\big(x(k)\mid y(k)\big) = \arg\max_{x(k)}\, p\big(y(k)\mid x(k)\big)\,p\big(x(k)\big) \tag{22}$$
According to maximum likelihood estimation theory, the likelihood functions of p(y(k)|x(k)) and p(x(k)) are set up:
$$l_1\big(y(k),x(k)\big) \propto \exp\!\Big(-\tfrac{1}{2}\big(y(k)-Cx(k)\big)^{T} R^{-1}\big(y(k)-Cx(k)\big)\Big) \tag{23}$$
$$l_2\big(x(k)\big) \propto \exp\!\Big(-\tfrac{1}{2}\big(x(k)-F\hat{x}(k-1\mid k-1)\big)^{T}\,\Theta^{-1}\big(x(k)-F\hat{x}(k-1\mid k-1)\big)\Big) \tag{24}$$
where Θ is the covariance matrix of the conditional probability p(x(k)|y(k-1)) when $\hat{x}(k-1\mid k-1)$ is known, $\Theta = F\,P(k-1\mid k-1)\,F^{T} + Q(k-1)$, and P(k-1|k-1) is the updated covariance. When the likelihood functions $l_1(y(k),x(k))$ and $l_2(x(k))$ reach their maximum, the conditional probability p(x(k)|y(k)) attains its optimal estimate. Inspection of formulas (23) and (24) shows that maximizing $l_1(y(k),x(k))$ and $l_2(x(k))$ is equivalent to minimizing the exponents $\big(y(k)-Cx(k)\big)^{T}R^{-1}\big(y(k)-Cx(k)\big)$ and $\big(x(k)-F\hat{x}(k-1\mid k-1)\big)^{T}\Theta^{-1}\big(x(k)-F\hat{x}(k-1\mid k-1)\big)$ of the exponentials in the likelihood functions, which gives the following optimization problem:
$$\min_{x(k),\,n(k)}\ n(k)^{T}R^{-1}n(k) + \big(x(k)-F\hat{x}(k-1\mid k-1)\big)^{T}\,\Theta^{-1}\big(x(k)-F\hat{x}(k-1\mid k-1)\big)$$
$$\text{subject to}\quad y(k) = C\,x(k) + n(k) \tag{25}$$
where x(k) and n(k) are the variables and Θ is the covariance matrix of the Gaussian noise. The estimate of x(k) is $\hat{x}(k)$, and n(k) is the estimate of the Gaussian noise;
P(k|k) is the covariance update matrix:
$$P(k\mid k) = \big(I - K(k)\,C(k)\big)\,P(k\mid k-1) \tag{26}$$
P(k|k-1) is the covariance prediction matrix:
$$P(k\mid k-1) = F(k-1)\,P(k-1\mid k-1)\,F(k-1)^{T} + Q(k-1) \tag{27}$$
K(k) is the gain:
$$K(k) = P(k\mid k-1)\,C^{T}\big(C\,P(k\mid k-1)\,C^{T} + R(k-1)\big)^{-1} \tag{28}$$
5.2) Construct the estimation problem for the sparse noise from the viewpoint of convex optimization
The core idea of estimating the sparse noise is to exploit its sparsity. Once step 5.1) has turned the traditional Kalman filtering problem into a convex optimization problem, a sparsity constraint on the sparse noise v(k) can be added to the optimization to estimate the sparse noise. The new optimization problem is:
$$\min_{x(k),\,n(k),\,v(k)}\ n(k)^{T}R^{-1}n(k) + \big(x(k)-F\hat{x}(k-1\mid k-1)\big)^{T}\,\Theta^{-1}\big(x(k)-F\hat{x}(k-1\mid k-1)\big) + \lambda\,\lVert v(k)\rVert_{1}$$
$$\text{subject to}\quad y(k) = C\,x(k) + n(k) + v(k) \tag{29}$$
where v(k) is the sparse noise and λ weights the sparsity term. Solving this optimization problem yields the optimal estimate x(k) of the speech signal state; x(k) corresponds to the optimal state estimate $\hat{x}(k\mid k)$ of traditional Kalman filtering. The optimization problem of formula (29) is convex and can be solved with the interior-point methods used in engineering;
5.3) After the speech signal at time k has been enhanced, the enhancement result $\hat{x}(k)$ is fed back to step 4) and used to update the AR parameters θ(k+1) at time k+1. Speech enhancement at time k+1 then continues, estimating x(k+1), until all of the speech signal has been processed.
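The alternation described in step 5.3) can be sketched as a single loop in which the parameter filter of step 4) and the state filter of step 5) feed each other. The sketch below is a simplified, Gaussian-only illustration: it omits the sparse-noise terms, and the function name, the noise variances and the assumption that the state noise enters only the last state component are all hypothetical choices, not values from the patent.

```python
import numpy as np

def dual_kalman_enhance(y, p=13, q_var=1e-2, r_var=1e-2, d_var=1e-6, l_var=1e-2):
    """Alternate AR-parameter and speech-state Kalman updates, sample by sample.

    Gaussian-only sketch (no sparse-noise estimation); all variances are
    illustrative assumptions.
    """
    y = np.asarray(y, dtype=float)
    x = np.zeros(p)            # x(0|0)
    P = np.eye(p)              # P(0|0), positive definite
    theta = np.zeros(p)        # theta(0|0), ordered [a_p ... a_1]
    P_theta = np.eye(p)
    c = np.zeros(p)
    c[-1] = 1.0                # measurement row C = [0 ... 0 1]
    Qm = np.zeros((p, p))
    Qm[-1, -1] = q_var         # excitation drives only the newest sample
    s_hat = np.empty_like(y)
    for k, yk in enumerate(y):
        # Step 4: parameter filter, measurement row A = x(k-1)^T.
        A = x.copy()
        P_theta = P_theta + d_var * np.eye(p)
        K = P_theta @ A / (A @ P_theta @ A + l_var)
        theta = theta + K * (yk - A @ theta)
        P_theta = P_theta - np.outer(K, A) @ P_theta
        # Step 5: state filter, with F rebuilt from the updated parameters.
        F = np.zeros((p, p))
        F[:-1, 1:] = np.eye(p - 1)
        F[-1, :] = theta
        x_pred = F @ x
        P_pred = F @ P @ F.T + Qm
        K = P_pred @ c / (c @ P_pred @ c + r_var)
        x = x_pred + K * (yk - c @ x_pred)
        P = P_pred - np.outer(K, c) @ P_pred
        s_hat[k] = x[-1]       # enhanced sample, the estimate of s(k)
    return s_hat
```

The enhanced state from one sample becomes the measurement row for the next parameter update, which is the mutual-update structure of the dual framework; a robust variant would replace each Gaussian correction with a sparse-noise-aware update.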
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. To address the problem that the AR parameters of the speech model (in particular the autoregressive AR model) cannot be updated in real time as the noise changes, the invention proposes a dual Kalman filtering framework. Two Kalman filters run in parallel, the speech signal state estimate and the AR parameter estimate update each other, and the state estimation and parameter estimation processes alternate, so that parameter estimation adapts to changes in the noise. This improves the accuracy of the system model and hence the speech enhancement performance.
2. To address the problem that the traditional Kalman filtering algorithm cannot handle non-stationary noise, the invention combines Kalman filtering with convex optimization in an improved Kalman filtering framework. The new algorithm introduces both a Gaussian noise term and a non-stationary noise term into the measurement process of the speech enhancement model and builds a suitable optimization model using convex optimization, so that Gaussian noise and non-stationary noise can both be estimated accurately, improving the accuracy of speech enhancement.
Brief description of the drawings
Fig. 1 is a flow chart of the speech enhancement method under non-stationary noise.
Fig. 2a is a schematic diagram of the original speech signal.
Fig. 2b is a schematic diagram of the speech signal with Gaussian white noise.
Fig. 2c is a schematic diagram of the speech signal with Gaussian white noise and non-stationary noise.
Fig. 3 is a flow chart of the speech enhancement algorithm based on dual improved Kalman filtering.
Fig. 4a is the original speech signal.
Fig. 4b is a schematic diagram of the speech enhancement result.
Detailed description of the embodiments
The invention is further described below with reference to a specific embodiment.
As shown in Fig. 1, the online speech enhancement method for non-stationary noise environments described in this embodiment comprises the following steps:
1) Establish the system model for the non-stationary noise environment
1.1) Establish the autoregressive (AR) model for the case where Gaussian noise and sparse noise coexist
Speech production can be described as an autoregressive process in which white noise excites an all-pole linear system: the current output equals the excitation at the current time plus a weighted sum of the past p outputs. This autoregressive (AR) model is expressed as:
$$s(k) = \sum_{i=1}^{p} a_i\, s(k-i) + u(k) \tag{1}$$
where u(k) is the Gaussian white noise excitation at time k; s(k-i) is the speech signal at time (k-i); s(k) is the speech signal at time k; $a_i$ is the i-th linear prediction coefficient, also called an AR model parameter; and p is the order of the AR model.
As shown in Figs. 2a, 2b and 2c, a speech signal observed in a real environment is contaminated by various noises, non-stationary noise in particular. The present invention therefore considers Gaussian noise and non-stationary noise simultaneously in the measurement process and establishes a speech signal model that better matches the actual measurement process. The measurement process in the present invention can be described as:
$$y(k) = s(k) + n(k) + v(k) \tag{2}$$
where y(k) is the measured speech signal at time k; s(k) is the speech signal at time k; n(k) is the Gaussian white noise at time k; and v(k) is the non-stationary noise at time k, which follows a Laplacian distribution and is sparse.
1.2) Establish the speech signal state-space model
Formulas (1) and (2) are converted into a state-space model, which can be described as:
$$x(k) = F\,x(k-1) + p(k) \tag{3}$$
$$y(k) = C\,x(k) + n(k) + v(k) \tag{4}$$
where
$$F = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a_p(k) & a_{p-1}(k) & a_{p-2}(k) & \cdots & a_1(k) \end{bmatrix} \tag{5}$$
$$C = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \end{bmatrix} \tag{6}$$
$$x(k) = \begin{bmatrix} s(k-p+1) & \cdots & s(k) \end{bmatrix}^{T} \tag{7}$$
In the speech signal state equation (3) and measurement equation (4), x(k) is the speech signal state estimation sequence at time k, i.e. the optimal state estimate of the speech signal; x(k-1) is the speech signal state estimation sequence at time (k-1); y(k) is the measured speech signal at time k; F is the state-transition matrix formed from the linear prediction coefficients, whose last row $[a_p(k)\ \cdots\ a_1(k)]$ is called the AR parameters; C = [0 0 … 0 1] is the measurement matrix; p(k) is the state noise at time k, which follows a Gaussian distribution; n(k) is the measurement noise at time k, which follows a Gaussian distribution; and v(k) is the non-stationary noise at time k, which follows a Laplacian distribution.
The statistical properties of the state noise p(k) and the measurement noise n(k) of the speech signal are:
$$E(p(k)) = q,\qquad E(n(k)) = r$$
$$E\big(p(k)\,p(j)^{T}\big) = Q\,\delta_{kj},\qquad E\big(n(k)\,n(j)^{T}\big) = R\,\delta_{kj} \tag{8}$$
where q and r are the means of the noises p(k) and n(k) respectively; Q and R are the covariances of p(k) and n(k) respectively; and $\delta_{kj}$ is the Kronecker delta. The speech enhancement problem is to estimate the optimal speech signal x(k) given the measured speech signal y(k).
2) Framing and windowing
The speech signal is short-term stationary (it is generally regarded as approximately unchanged within 10 to 30 ms), so it can be divided into short segments for processing, which is called framing. Framing is implemented by weighting the signal with a movable window of finite length. The frame rate is typically about 33 to 100 frames per second. Frames are generally taken by overlapping segmentation; the overlapping part of one frame and the next is called the frame shift, and the ratio of frame shift to frame length is generally 0 to 0.5. In the present invention, the frame length is 25 ms and the frame shift is 10 ms.
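Framing and windowing with a 25 ms frame and a 10 ms shift can be sketched as follows. The function name and the choice of a Hamming window are assumptions (the text does not name a window), and the sketch treats the 10 ms frame shift as the hop between the starts of consecutive frames.

```python
import numpy as np

def frame_signal(y, fs, frame_ms=25.0, shift_ms=10.0):
    """Split y into overlapping frames and apply a Hamming window to each.

    fs is the sampling rate in Hz. Returns an array of shape
    (n_frames, frame_len); trailing samples that do not fill a frame are dropped.
    """
    y = np.asarray(y, dtype=float)
    frame_len = int(round(fs * frame_ms / 1000.0))
    shift = int(round(fs * shift_ms / 1000.0))
    if y.size < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (y.size - frame_len) // shift
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + shift * np.arange(n_frames)[:, None]
    return y[idx] * np.hamming(frame_len)   # window choice is an assumption
```

At a 16 kHz sampling rate this gives 400-sample frames with a 160-sample hop, i.e. 100 frames per second, which is at the upper end of the frame rate range quoted above.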
3) System initialization
3.1) Initialize the parameters of the improved Kalman filter
Initialize the speech signal state estimation sequence x(0|0) and the covariance matrix P(0|0), ensuring that the covariance matrix is positive definite.
3.2) Initialize the AR parameters
Initialize the AR parameter state estimation sequence θ(0|0). In the present invention, the order of the AR parameters is 13 (set empirically).
4) Estimate the AR parameters
The AR parameters are the last row $[a_p(k)\ \cdots\ a_1(k)]$ of the state-transition matrix F in formula (3). They describe the speech production process, and their accuracy directly affects the speech enhancement result. In practice, AR parameter estimation is strongly affected by the speech signal itself and by the various noises. The present invention therefore takes into account the speech signal state estimation sequence x(k-1), the state noise q(k), the measurement noise n(k), the non-stationary noise v(k) and so on when estimating the AR parameters, and establishes a new state-space model for AR parameter estimation to achieve online robust estimation of the AR parameters. This is one core point of the invention. As shown in Fig. 3, the real-time estimation process for the AR parameters is as follows:
4.1) Establish the parameter estimation model for the AR parameters
The AR parameter model in an environment where Gaussian noise and non-stationary noise are mixed is described as:
$$\theta(k) = \theta(k-1) + q(k)$$
$$y(k) = A\,\theta(k) + r(k) + w(k) \tag{9}$$
where $\theta(k) = [a_p(k)\ \cdots\ a_1(k)]^{T}$ is the AR parameter state sequence at time k; q(k) is the state noise at time k, which follows a Gaussian distribution with covariance matrix D; r(k) is the measurement noise at time k, which follows a Gaussian distribution with covariance matrix L; w(k) is the non-stationary measurement noise at time k, which is sparse and follows a Laplacian distribution; $A = x(k-1)^{T} = [s(k-p)\ \cdots\ s(k-1)]$ is the measurement matrix; and y(k) is the measured speech signal at time k. The statistical properties of the state noise q(k) and the measurement noise r(k) are:
$$E(q(k)) = d,\qquad E(r(k)) = l$$
$$E\big(q(k)\,q(j)^{T}\big) = D\,\delta_{kj},\qquad E\big(r(k)\,r(j)^{T}\big) = L\,\delta_{kj} \tag{10}$$
where d and l are the means of the noises q(k) and r(k) respectively; D and L are the covariances of q(k) and r(k) respectively; and $\delta_{kj}$ is the Kronecker delta.
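4.2) Reformulate the traditional Kalman filtering problem from the viewpoint of convex optimization
To make the sparse noise easy to estimate, the Kalman filtering problem is reformulated as a convex optimization problem. The state-space model of traditional Kalman filtering (without the non-stationary noise w(k)) is:
$$\theta(k) = \theta(k-1) + q(k)$$
$$y(k) = A\,\theta(k) + r(k) \tag{11}$$
According to Bayes' rule, the AR parameter estimation problem can be expressed as estimating the optimal AR parameter sequence θ(k) given the measurement data y(k), i.e.:
$$\hat{\theta}(k) = \arg\max_{\theta(k)}\, p\big(\theta(k)\mid y(k)\big) = \arg\max_{\theta(k)}\, p\big(y(k)\mid\theta(k)\big)\,p\big(\theta(k)\big) \tag{12}$$
According to maximum likelihood estimation theory, the likelihood functions of p(y(k)|θ(k)) and p(θ(k)) are set up:
$$l_1\big(y(k),\theta(k)\big) \propto \exp\!\Big(-\tfrac{1}{2}\big(y(k)-A\theta(k)\big)^{T} L^{-1}\big(y(k)-A\theta(k)\big)\Big) \tag{13}$$
$$l_2\big(\theta(k)\big) \propto \exp\!\Big(-\tfrac{1}{2}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)^{T}\,\Psi(k)^{-1}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)\Big) \tag{14}$$
where Ψ is the covariance matrix of the conditional probability p(θ(k)|y(k)) when $\hat{\theta}(k-1\mid k-1)$ is known, $\Psi(k) = P_\theta(k\mid k) + D(k)$, and $P_\theta(k\mid k)$ is the updated covariance. When the likelihood functions $l_1(y(k),\theta(k))$ and $l_2(\theta(k))$ reach their maximum, the conditional probability p(θ(k)|y(k)) attains its optimal estimate. Inspection of formulas (13) and (14) shows that maximizing $l_1(y(k),\theta(k))$ and $l_2(\theta(k))$ is equivalent to minimizing the exponents $\big(y(k)-A\theta(k)\big)^{T}L^{-1}\big(y(k)-A\theta(k)\big)$ and $\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)^{T}\Psi(k)^{-1}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)$ of the exponentials in the likelihood functions, so the following optimization problem is obtained:
$$\min_{\theta(k),\,r(k)}\ r(k)^{T}L^{-1}r(k) + \big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)^{T}\,\Psi(k)^{-1}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)$$
$$\text{subject to}\quad y(k) = A\,\theta(k) + r(k) \tag{15}$$
where θ(k) and r(k) are the variables and $\Psi(k) = P_\theta(k\mid k) + D(k)$ is the covariance matrix of the Gaussian noise. The estimate of θ(k) is $\hat{\theta}(k)$, and r(k) is the estimate of the Gaussian noise. $P_\theta(k\mid k)$ is the covariance update matrix:
$$P_\theta(k\mid k) = \big(I - K_\theta(k)\,A(k)\big)\,P_\theta(k\mid k-1) \tag{16}$$
$P_\theta(k\mid k-1)$ is the covariance prediction matrix:
$$P_\theta(k\mid k-1) = P_\theta(k-1\mid k-1) + D(k-1) \tag{17}$$
$K_\theta(k)$ is the gain:
$$K_\theta(k) = P_\theta(k\mid k-1)\,A^{T}\big(A\,P_\theta(k\mid k-1)\,A^{T} + L(k-1)\big)^{-1} \tag{18}$$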
4.3) Construct the optimization problem for non-stationary noise estimation from the viewpoint of convex optimization
Non-stationary noise follows a Laplacian distribution and is sparse. The core idea of estimating it is to exploit this sparsity. Once step 4.2) has turned the traditional Kalman filtering problem into a convex optimization problem, a sparsity constraint on the non-stationary noise w(k) can be added to the optimization to estimate the sparse noise. The new optimization problem is:
$$\min_{\theta(k),\,r(k),\,w(k)}\ r(k)^{T}L^{-1}r(k) + \big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big)^{T}\,\Psi(k)^{-1}\big(\theta(k)-\hat{\theta}(k-1\mid k-1)\big) + \lambda\,\lVert w(k)\rVert_{1}$$
$$\text{subject to}\quad y(k) = A\,\theta(k) + r(k) + w(k) \tag{19}$$
where w(k) is the sparse noise and λ weights the sparsity term. Solving this optimization problem yields the optimal estimate θ(k) of the AR parameters (note: $\hat{\theta}(k\mid k)$). The optimization problem of formula (19) is convex and can be solved with interior-point methods, which are relatively mature in engineering.
5) Estimate the speech signal state sequence.
During speech signal acquisition, non-stationary noise strongly degrades speech quality. To improve speech quality, the speech enhancement algorithm must handle Gaussian noise and non-stationary noise at the same time. Non-stationary noise generally follows a Laplacian distribution and is sparse, and its estimation relies mainly on this sparsity. To make it easy to introduce a noise sparsity constraint into the optimization problem, the conventional Kalman filtering problem is first reformulated as a convex optimization problem by means of convex optimization techniques; a sparsity constraint on the sparse noise is then introduced into the newly constructed optimization, which completes the speech enhancement task. This is another core point of the present invention.
5.1) Reformulate the conventional Kalman filtering problem from the convex optimization perspective
To make the sparse noise easy to estimate, the Kalman filtering problem must be reformulated from the convex optimization perspective. The state-space model of conventional Kalman filtering is as follows:
$x(k) = F x(k-1) + p(k)$ (20)
$y(k) = C x(k) + n(k)$ (21)
According to the Bayesian principle, the Kalman filtering problem can be expressed as estimating the optimal speech state sequence $x(k)$ given the measurement data $y(k)$, namely:
According to maximum likelihood estimation theory, the likelihood functions of $p(y(k) \mid x(k))$ and $p(x(k))$ are established:
Here, $\Theta = F P(k-1|k-1) F^T + Q(k-1)$ is the covariance matrix of the conditional probability $p(x(k) \mid y(k-1))$ given the known estimate, where $P(k-1|k-1)$ is the updated covariance. When the likelihood functions $L_1(y(k), x(k))$ and $L_2(x(k))$ attain their maxima, the conditional probability $p(x(k) \mid y(k))$ attains its optimal estimate. Inspection of formulas (23) and (24) shows that maximizing $L_1(y(k), x(k))$ and $L_2(x(k))$ is equivalent to minimizing the exponents of the exponential terms in the two likelihood functions. The following optimization form is therefore obtained:
subject to $y(k) = C x(k) + n(k)$ (25)
Here, $x(k)$ and $n(k)$ are the optimization variables, and $\Theta$ is the covariance matrix of the Gaussian noise. The estimated value of $x(k)$ is denoted $\hat{x}(k)$, and $n(k)$ is the estimate of the Gaussian noise.
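For orientation, under Gaussian assumptions an optimization of this kind usually takes the form below. This is a reconstruction from standard maximum a posteriori reasoning and not a transcription of the patent's formula. The symbols $\hat{x}(k|k-1) = F\hat{x}(k-1|k-1)$ for the one-step prediction and $R$ for the covariance of $n(k)$ are notational assumptions, and $\Theta$ denotes the prior covariance $F P(k-1|k-1) F^T + Q(k-1)$ defined above.

```latex
\begin{aligned}
\min_{x(k),\,n(k)} \quad
  & n(k)^{T} R^{-1} n(k)
    + \bigl(x(k)-\hat{x}(k|k-1)\bigr)^{T}\,\Theta^{-1}\,\bigl(x(k)-\hat{x}(k|k-1)\bigr) \\
\text{subject to} \quad
  & y(k) = C\,x(k) + n(k)
\end{aligned}
```

Setting the gradient to zero and applying the matrix inversion lemma gives the minimizer $\hat{x}(k|k-1) + \Theta C^{T}(C\Theta C^{T} + R)^{-1}\bigl(y(k) - C\hat{x}(k|k-1)\bigr)$, which has the same structure as a conventional Kalman measurement update.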
$P(k|k)$ is the covariance update matrix:
$P(k|k) = (I - K(k) C(k)) P(k|k-1)$ (26)
$P(k|k-1)$ is the covariance prediction matrix:
$P(k|k-1) = F(k-1) P(k-1|k-1) F(k-1)^T + Q(k-1)$ (27)
$K(k)$ is the gain matrix:
$K(k) = P(k|k-1) C^T (C P(k|k-1) C^T + R(k-1))^{-1}$ (28)
5.2) Construct the sparse noise estimation problem from the convex optimization perspective
The core idea of sparse noise estimation is to exploit the sparsity of the noise. Once step 5.1) has converted the conventional Kalman filtering problem into a convex optimization problem, a sparsity constraint on the sparse noise $v(k)$ can be added to the optimization in order to estimate the sparse noise. The new optimization form is:
subject to $y(k) = C x(k) + n(k) + v(k)$ (29)
Here, $v(k)$ is the sparse noise. Solving this optimization problem yields the optimal estimate $x(k)$ of the speech signal state (note: $x(k)$ corresponds to the optimal state estimate in conventional Kalman filtering). The problem represented by formula (29) is a convex optimization problem and can be solved with the interior point method, which is well established in engineering practice.
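Why an added sparse term makes the filter robust can be seen from a scalar worked example. Assume, purely for illustration, that the sparse noise enters the objective through an l1 penalty with a hypothetical weight $\lambda > 0$, and that the measurement is scalar with Gaussian noise variance $R$. Writing $e = y(k) - C x(k)$ for the residual and minimizing over $v$ alone gives:

```latex
\min_{v}\ \frac{(e-v)^{2}}{R} + \lambda\,|v|
\;=\;
\begin{cases}
\dfrac{e^{2}}{R}, & |e| \le \dfrac{\lambda R}{2},\\[2ex]
\lambda\,|e| - \dfrac{\lambda^{2}R}{4}, & |e| > \dfrac{\lambda R}{2},
\end{cases}
\qquad
v^{\star} = \operatorname{sign}(e)\,\max\!\Bigl(|e| - \tfrac{\lambda R}{2},\,0\Bigr).
```

Small residuals are penalized quadratically, as in the conventional filter, while large residuals are penalized only linearly (a Huber-type loss). An impulsive disturbance is therefore largely absorbed by $v(k)$ and pulls the state estimate far less than it would under a purely quadratic cost.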
5.3) After the speech signal at time k has been enhanced, the enhancement result $\hat{x}(k)$ is returned to step 4), where it is used to update the AR parameters $\theta(k+1)$ for time k+1. Speech enhancement at time k+1 then continues with the estimation of $x(k+1)$, and this repeats until all speech signals have been processed.
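The alternation between step 4) and step 5) can be sketched as a single loop. This is a simplified illustration under stated assumptions, not the patented method: both stages use conventional Kalman measurement updates, so the sparse-noise terms of formulas (19) and (29) are omitted, and the function names and the default values of `p`, `D`, `Q`, `L` and `R` are hypothetical.

```python
import numpy as np

def companion(theta):
    """State-transition matrix F whose last row is theta = [a_p, ..., a_1]."""
    p = len(theta)
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)
    F[-1, :] = theta
    return F

def kalman_update(z_pred, P_pred, y, H, R):
    """Conventional Kalman measurement update for a scalar measurement y."""
    K = P_pred @ H.T / (H @ P_pred @ H.T + R)        # (p, 1) gain
    z = z_pred + (K * (y - H @ z_pred)).ravel()
    P = (np.eye(len(z_pred)) - K @ H) @ P_pred
    return z, P

def enhance(y, p=2, D=1e-4, Q=1e-2, L=1e-1, R=1e-1):
    """Alternate AR-parameter estimation (step 4) and state estimation (step 5)."""
    C = np.zeros((1, p))
    C[0, -1] = 1.0
    x, P = np.zeros(p), np.eye(p)                    # x(0|0), P(0|0)
    theta, P_theta = np.zeros(p), np.eye(p)          # theta(0|0) and its covariance
    Qm = np.zeros((p, p))
    Qm[-1, -1] = Q                                   # excitation enters the last state only
    s_hat = np.zeros(len(y))
    for k, yk in enumerate(y):
        # Step 4: update theta(k), using A = x(k-1)^T and a random-walk prediction
        A = x.reshape(1, -1)
        theta, P_theta = kalman_update(theta, P_theta + D * np.eye(p), yk, A, L)
        # Step 5: update x(k) with F rebuilt from the new AR parameters
        F = companion(theta)
        x, P = kalman_update(F @ x, F @ P @ F.T + Qm, yk, C, R)
        s_hat[k] = x[-1]                             # enhanced sample s(k)
    return s_hat
```

A call such as `enhance(noisy_samples)` returns one enhanced sample per input sample. Replacing each `kalman_update` call with a solver for the corresponding sparse-noise problem would bring the loop closer to the procedure described above.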
As shown in Figs. 4a and 4b, the method proposed by the present invention filters out Gaussian noise and non-stationary noise fairly accurately and enhances the original speech signal.
With the present invention, white noise and non-stationary noise can be accurately estimated and filtered out, so that speech enhancement is achieved under mixed white and non-stationary noise. The invention also provides a cleaner estimate of the speech signal, which gives front-end support for improving speech recognition accuracy.
The present invention establishes two robust Kalman filter models and mathematically models the generation process of the speech signal, giving targeted consideration to both the temporal and the time-varying characteristics of speech. The AR parameter estimation is updated dynamically and iteratively in real time, which satisfies the time-varying nature of the parameters, and the state estimation processes the speech signal frame by frame so as to exploit the short-term stationarity of speech. As a result, the filtering performance is better than that of conventional Kalman filtering, and the method is worth popularizing.
The embodiment described above is only a preferred embodiment of the present invention and does not limit the scope of implementation of the present invention; any change made according to the shape and principle of the present invention shall therefore fall within the scope of protection of the present invention.
Claims (1)
1. An online speech enhancement method suitable for non-stationary noise environments, characterized by comprising the following steps:
1) Establish the system model under a non-stationary noise environment
1.1) Establish the autoregressive (AR) model for the case in which Gaussian noise and sparse noise coexist
The generation of a speech signal is a recursive process in which white noise excites an all-pole linear system to produce the output; that is, the current output equals the excitation signal at the current time plus a weighted sum of the outputs at the past p times. This is an autoregressive (AR) model, expressed as follows:
$s(k) = \sum_{i=1}^{p} a_i s(k-i) + u(k)$ (1)
Wherein, $u(k)$ is the white Gaussian noise excitation at time k; $s(k-i)$ is the speech signal at time (k-i); $s(k)$ is the speech signal at time k; $a_i$ is the i-th linear prediction coefficient, also called an AR model parameter; p is the order of the AR model;
A speech signal model that matches the actual measurement process is established; the measurement process of the speech signal is described as follows:
$y(k) = s(k) + n(k) + v(k)$ (2)
Wherein, $y(k)$ is the speech signal measurement sequence at time k; $s(k)$ is the speech signal at time k; $n(k)$ is the white Gaussian noise at time k; $v(k)$ is the non-stationary noise at time k, which follows a Laplacian distribution and is sparse;
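The signal model of formulas (1) and (2) can be simulated with a short NumPy sketch, which is convenient for testing an enhancement routine on data with a known clean signal. The function name, the noise levels and the coefficient values are hypothetical, and gating the Laplacian samples with a Bernoulli mask so that v(k) is exactly zero most of the time is an assumption made here to obtain a sparse sequence.

```python
import numpy as np

def simulate_measurement(a, n_samples, sigma_u=1.0, sigma_n=0.1,
                         spike_prob=0.02, spike_scale=2.0, seed=0):
    """Generate s(k) from formula (1) and y(k) = s(k) + n(k) + v(k) from formula (2).

    a = [a_1, ..., a_p] are the linear prediction coefficients.
    """
    rng = np.random.default_rng(seed)
    p = len(a)
    s = np.zeros(n_samples + p)                  # p leading zeros as initial conditions
    u = sigma_u * rng.standard_normal(n_samples + p)
    for k in range(p, n_samples + p):
        past = s[k - p:k][::-1]                  # s(k-1), ..., s(k-p)
        s[k] = np.dot(a, past) + u[k]            # formula (1)
    s = s[p:]
    n = sigma_n * rng.standard_normal(n_samples)            # white Gaussian noise n(k)
    active = rng.random(n_samples) < spike_prob             # sparse support of v(k)
    v = active * rng.laplace(scale=spike_scale, size=n_samples)  # Laplacian spikes
    return s, s + n + v, v

# Hypothetical stable second-order model.
s, y, v = simulate_measurement(a=[1.2, -0.5], n_samples=500)
```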
1.2) Establish the speech signal state-space model
Formulas (1) and (2) are converted into a state-space model, described as follows:
$x(k) = F x(k-1) + p(k)$ (3)
$y(k) = C x(k) + n(k) + v(k)$ (4)
Wherein,
$F = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ a_p(k) & a_{p-1}(k) & \cdots & a_1(k) \end{bmatrix}$ (5)
$C = [0 \; 0 \; \cdots \; 0 \; 1]$ (6)
$x(k) = [s(k-p+1) \; \cdots \; s(k)]^T$ (7)
In the speech signal state equation (3) and the speech signal measurement equation (4), $x(k)$ is the speech signal state estimation sequence at time k, i.e. the optimal state estimate of the speech signal; $x(k-1)$ is the speech signal state estimation sequence at time (k-1); $y(k)$ is the speech signal measurement sequence at time k; $F$ is the state-transition matrix formed from the linear prediction coefficients, whose last row $[a_p(k) \; \cdots \; a_1(k)]$ is referred to as the AR parameters; $C = [0 \; 0 \; \cdots \; 0 \; 1]$ is the measurement matrix; $p(k)$ is the state noise at time k, which follows a Gaussian distribution; $n(k)$ is the measurement noise at time k, which follows a Gaussian distribution; $v(k)$ is the non-stationary noise at time k, which follows a Laplacian distribution;
The statistical properties of the state noise $p(k)$ and the measurement noise $n(k)$ of the speech signal are:
$E[p(k)] = q$, $E[n(k)] = r$
$E[p(k) p(j)^T] = Q \delta_{kj}$, $E[n(k) n(j)^T] = R \delta_{kj}$ (8)
Wherein, $q$ and $r$ are the means of the noises $p(k)$ and $n(k)$, respectively; $Q$ and $R$ are the covariances of the noises $p(k)$ and $n(k)$, respectively; $\delta_{kj}$ is the Kronecker delta; the speech enhancement problem is to estimate the optimal speech signal $x(k)$ given the measured speech signal $y(k)$;
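The matrices of formulas (5) and (6) can be built and checked with a few lines of NumPy. The helper name `build_state_space` and the coefficient values are hypothetical; the check confirms that the noise-free part of the state equation (3) shifts the stored samples and produces the weighted sum of the past p samples in the last component.

```python
import numpy as np

def build_state_space(a):
    """Return F and C of formulas (5) and (6) for coefficients a = [a_1, ..., a_p]."""
    p = len(a)
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)      # each stored sample moves up one position
    F[-1, :] = a[::-1]              # last row [a_p, ..., a_1]
    C = np.zeros((1, p))
    C[0, -1] = 1.0                  # selects s(k), the last state component
    return F, C

a = np.array([1.2, -0.5, 0.1])          # hypothetical AR(3) coefficients a_1, a_2, a_3
F, C = build_state_space(a)
x_prev = np.array([0.3, -0.1, 0.4])     # [s(k-3), s(k-2), s(k-1)]
x_next = F @ x_prev                     # noise-free [s(k-2), s(k-1), s(k)]
```

Here the last component of `x_next` equals a_1 s(k-1) + a_2 s(k-2) + a_3 s(k-3), which is the AR recursion without the excitation term.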
2) Framing and windowing
A speech signal is short-term stationary and can be regarded as unchanged within 10 to 30 ms, so it can be divided into short segments for processing; this is framing. Framing is implemented by weighting the speech signal with a movable window of finite length; the number of frames per second is generally 33 to 100, and framing uses overlapping segmentation, in which the overlapping part of one frame and the next is called the frame shift and the ratio of frame shift to frame length is 0 to 0.5;
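Framing and windowing as described in step 2) can be sketched as follows. The Hamming window, the 20 ms frame length, the 8 kHz sampling rate and the function name are assumptions chosen for illustration; the text only requires a finite-length movable window and an overlap ratio between 0 and 0.5.

```python
import numpy as np

def frame_signal(signal, fs, frame_ms=20.0, overlap_ratio=0.5):
    """Split a signal into overlapping, Hamming-windowed frames.

    frame_ms      : frame length in milliseconds (10 to 30 ms per the text)
    overlap_ratio : overlapped part of adjacent frames divided by the frame length
    """
    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = frame_len - int(round(frame_len * overlap_ratio))   # samples between frame starts
    n_frames = 1 + (len(signal) - frame_len) // hop            # incomplete tail is dropped
    window = np.hamming(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return frames

fs = 8000                                   # hypothetical sampling rate in Hz
frames = frame_signal(np.ones(fs), fs)      # one second of signal
```

With these settings one second of signal yields 99 frames of 160 samples, which falls within the 33 to 100 frames per second mentioned above. The sketch assumes the signal is at least one frame long.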
3) System initialization
3.1) Initialize the parameters of the improved Kalman filter
Initialize the speech signal state estimation sequence $x(0|0)$ and the covariance matrix $P(0|0)$, ensuring that the covariance matrix is positive definite;
3.2) Initialize the AR parameters
Initialize the AR parameter state estimation sequence $\theta(0|0)$;
4) Estimate the AR parameters
The AR parameters are the last row $[a_p(k) \; \cdots \; a_1(k)]$ of the state-transition matrix $F$ in formula (3). They mainly describe the speech production process, and their accuracy directly affects the speech enhancement result; the AR parameter estimation jointly takes into account the speech signal state estimation sequence $x(k-1)$, the state noise $q(k)$, the measurement noise $n(k)$ and the non-stationary noise $v(k)$, and establishes a new state-space model for AR parameter estimation that achieves online robust estimation of the AR parameters; the real-time estimation of the AR parameters proceeds as follows:
4.1) Establish the parameter estimation model of the AR parameters
The AR parameter model in an environment of mixed Gaussian and non-stationary noise is described as follows:
$\theta(k) = \theta(k-1) + q(k)$
$y(k) = A\theta(k) + r(k) + w(k)$ (9)
Wherein, $\theta(k) = [a_p(k) \; \cdots \; a_1(k)]^T$ is the AR parameter state sequence at time k; $q(k)$ is the state noise at time k, which follows a Gaussian distribution with covariance matrix $Q(k)$; $r(k)$ is the measurement noise at time k, which follows a Gaussian distribution with covariance matrix $R(k)$; $w(k)$ is the non-stationary (sparse) noise at time k; $A = x(k-1)^T = [s(k-p) \; \cdots \; s(k-1)]$ is the measurement matrix; $y(k)$ is the speech signal measurement sequence at time k; the statistical properties of the state noise $q(k)$ and the measurement noise $r(k)$ are:
$E[q(k)] = d$, $E[r(k)] = l$
$E[q(k) q(j)^T] = D \delta_{kj}$, $E[r(k) r(j)^T] = L \delta_{kj}$ (10)
Wherein, $d$ and $l$ are the means of the noises $q(k)$ and $r(k)$, respectively; $D$ and $L$ are the covariances of the noises $q(k)$ and $r(k)$, respectively; $\delta_{kj}$ is the Kronecker delta;
4.2) Reformulate the conventional Kalman filtering problem from the convex optimization perspective
To make the sparse noise easy to estimate, the Kalman filtering problem must be reformulated from the convex optimization perspective; the state-space model of conventional Kalman filtering, which contains no non-stationary noise $w(k)$, is as follows:
$\theta(k) = \theta(k-1) + q(k)$
$y(k) = A\theta(k) + r(k)$ (11)
According to the Bayesian principle, the AR parameter estimation problem is expressed as estimating the optimal AR parameter sequence $\theta(k)$ given the measurement data $y(k)$, namely:
According to maximum likelihood estimation theory, the likelihood functions of $p(y(k) \mid \theta(k))$ and $p(\theta(k))$ are established:
Wherein, $\Psi(k) = P_\theta(k|k) + D(k)$ is the covariance matrix of the conditional probability $p(\theta(k) \mid y(k))$ given the known estimate, and $P_\theta(k|k)$ is the updated covariance; when the likelihood functions $L_1(y(k), \theta(k))$ and $L_2(\theta(k))$ attain their maxima, the conditional probability $p(\theta(k) \mid y(k))$ attains its optimal estimate; inspection of formulas (12) and (13) shows that maximizing $L_1(y(k), \theta(k))$ and $L_2(\theta(k))$ is equivalent to minimizing the exponents of the exponential terms in the likelihood functions, so the following optimization form is obtained:
Wherein, $\theta(k)$ and $r(k)$ are the optimization variables, and $\Psi(k) = P_\theta(k|k) + D(k)$ is the covariance matrix of the Gaussian noise; the estimated value of $\theta(k)$ is denoted $\hat{\theta}(k)$, and $r(k)$ is the estimate of the Gaussian noise; $P_\theta(k|k)$ is the covariance update matrix:
$P_\theta(k|k) = (I - K_\theta(k) A(k)) P_\theta(k|k-1)$ (16)
$P_\theta(k|k-1)$ is the covariance prediction matrix:
$P_\theta(k|k-1) = P_\theta(k-1|k-1) + D(k-1)$ (17)
$K_\theta(k)$ is the gain matrix:
$K_\theta(k) = P_\theta(k|k-1) A^T (A P_\theta(k|k-1) A^T + L(k-1))^{-1}$ (18)
4.3) Construct the optimization problem for non-stationary noise estimation from the convex optimization perspective
Non-stationary noise follows a Laplacian distribution and is therefore sparse, and the core idea of non-stationary noise estimation is to exploit this sparsity; once step 4.2) has converted the conventional Kalman filtering problem into a convex optimization problem, a sparsity constraint on the non-stationary noise $w(k)$ can be added to the optimization in order to estimate the sparse noise, and the new optimization form is:
Wherein, $w(k)$ is the sparse noise; solving this optimization problem yields the optimal estimate $\theta(k)$ of the AR parameters; the problem represented by formula (19) is a convex optimization problem and can be solved with the interior point method used in engineering practice;
5) Estimate the speech signal state sequence
5.1) Reformulate the conventional Kalman filtering problem from the convex optimization perspective
To make the sparse noise easy to estimate, the Kalman filtering problem must be reformulated from the convex optimization perspective; the state-space model of conventional Kalman filtering is as follows:
$x(k) = F x(k-1) + p(k)$ (20)
$y(k) = C x(k) + n(k)$ (21)
According to the Bayesian principle, the Kalman filtering problem is expressed as estimating the optimal speech state sequence $x(k)$ given the measurement data $y(k)$, namely:
According to maximum likelihood estimation theory, the likelihood functions of $p(y(k) \mid x(k))$ and $p(x(k))$ are established:
Wherein, $\Theta = F P(k-1|k-1) F^T + Q(k-1)$ is the covariance matrix of the conditional probability $p(x(k) \mid y(k-1))$ given the known estimate, and $P(k-1|k-1)$ is the updated covariance; when the likelihood functions $L_1(y(k), x(k))$ and $L_2(x(k))$ attain their maxima, the conditional probability $p(x(k) \mid y(k))$ attains its optimal estimate; inspection of formulas (23) and (24) shows that maximizing $L_1(y(k), x(k))$ and $L_2(x(k))$ is equivalent to minimizing the exponents of the exponential terms in the likelihood functions, so the following optimization form is obtained:
Wherein, $x(k)$ and $n(k)$ are the optimization variables, and $\Theta$ is the covariance matrix of the Gaussian noise; the estimated value of $x(k)$ is denoted $\hat{x}(k)$, and $n(k)$ is the estimate of the Gaussian noise;
$P(k|k)$ is the covariance update matrix:
$P(k|k) = (I - K(k) C(k)) P(k|k-1)$ (26)
$P(k|k-1)$ is the covariance prediction matrix:
$P(k|k-1) = F(k-1) P(k-1|k-1) F(k-1)^T + Q(k-1)$ (27)
$K(k)$ is the gain matrix:
$K(k) = P(k|k-1) C^T (C P(k|k-1) C^T + R(k-1))^{-1}$ (28)
5.2) Construct the sparse noise estimation problem from the convex optimization perspective
The core idea of sparse noise estimation is to exploit the sparsity of the noise; once step 5.1) has converted the conventional Kalman filtering problem into a convex optimization problem, a sparsity constraint on the sparse noise $v(k)$ can be added to the optimization in order to estimate the sparse noise, and the new optimization form is:
Wherein, $v(k)$ is the sparse noise; solving this optimization problem yields the optimal estimate $x(k)$ of the speech signal state, where $x(k)$ corresponds to the optimal state estimate in conventional Kalman filtering; the problem represented by formula (29) is a convex optimization problem and can be solved with the interior point method used in engineering practice;
5.3) After the speech signal at time k has been enhanced, the enhancement result $\hat{x}(k)$ is returned to step 4), where it is used to update the AR parameters $\theta(k+1)$ for time k+1; speech enhancement at time k+1 then continues with the estimation of $x(k+1)$, and this repeats until all speech signals have been processed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610843483.0A CN106340304B (en) | 2016-09-23 | 2016-09-23 | An online speech enhancement method suitable for non-stationary noise environments |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106340304A (en) | 2017-01-18 |
CN106340304B (en) | 2019-09-06 |
Family
ID=57840174
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201610843483.0A | Expired - Fee Related | CN106340304B (en) | 2016-09-23 | 2016-09-23 | An online speech enhancement method suitable for non-stationary noise environments |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106340304B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110248212A (en) * | 2019-05-27 | 2019-09-17 | 上海交通大学 | Multi-user 360-degree video stream server-side code rate self-adaptive transmission method and system |
CN110648680A (en) * | 2019-09-23 | 2020-01-03 | 腾讯科技(深圳)有限公司 | Voice data processing method and device, electronic equipment and readable storage medium |
CN112557925A (en) * | 2020-11-11 | 2021-03-26 | 国联汽车动力电池研究院有限责任公司 | Lithium ion battery SOC estimation method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110305345A1 (en) * | 2009-02-03 | 2011-12-15 | University Of Ottawa | Method and system for a multi-microphone noise reduction |
CN102890935A (en) * | 2012-10-22 | 2013-01-23 | 北京工业大学 | Robust speech enhancement method based on fast Kalman filtering |
CN103323815A (en) * | 2013-03-05 | 2013-09-25 | 上海交通大学 | Underwater acoustic locating method based on equivalent sound velocity |
CN103903630A (en) * | 2014-03-18 | 2014-07-02 | 北京捷通华声语音技术有限公司 | Method and device used for eliminating sparse noise |
Non-Patent Citations (3)
Title |
---|
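Feng Bao (冯宝): "Improved Kalman filtering algorithm based on convex optimization techniques" (基于凸优化技术的改进型卡尔曼滤波算法), Automation & Information Engineering (《自动化与信息工程》) * |
Wu Fei (吴飞): "A Kalman filter with online parameter adjustment and its application" (一种具有在线参数调整功能的Kalman滤波及其应用), Computer Engineering & Science (《计算机工程与科学》) * |
Wu Fei (吴飞): "Research on robust Kalman algorithms and their applications" (鲁棒卡尔曼算法及其应用研究), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) * |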
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110248212A (en) * | 2019-05-27 | 2019-09-17 | 上海交通大学 | 360 degree of video stream server end code rate adaptive transmission methods of multi-user and system |
CN110248212B (en) * | 2019-05-27 | 2020-06-02 | 上海交通大学 | Multi-user 360-degree video stream server-side code rate self-adaptive transmission method and system |
CN110648680A (en) * | 2019-09-23 | 2020-01-03 | 腾讯科技(深圳)有限公司 | Voice data processing method and device, electronic equipment and readable storage medium |
CN110648680B (en) * | 2019-09-23 | 2024-05-14 | 腾讯科技(深圳)有限公司 | Voice data processing method and device, electronic equipment and readable storage medium |
CN112557925A (en) * | 2020-11-11 | 2021-03-26 | 国联汽车动力电池研究院有限责任公司 | Lithium ion battery SOC estimation method and device |
CN112557925B (en) * | 2020-11-11 | 2023-05-05 | 国联汽车动力电池研究院有限责任公司 | Lithium ion battery SOC estimation method and device |
Also Published As
Publication number | Publication date |
---|---|
CN106340304B (en) | 2019-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111261183B (en) | Method and device for denoising voice | |
Deng et al. | Enhancement of log mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise | |
WO2020107269A1 (en) | Self-adaptive speech enhancement method, and electronic device | |
CN102750956B (en) | Method and device for removing reverberation of single channel voice | |
Mahmmod et al. | Speech enhancement algorithm based on super-Gaussian modeling and orthogonal polynomials | |
CN106971740A (en) | Probability and the sound enhancement method of phase estimation are had based on voice | |
CN103325381B (en) | A kind of speech separating method based on fuzzy membership functions | |
CN109192200B (en) | Speech recognition method | |
CN103065629A (en) | Speech recognition system of humanoid robot | |
CN111968658B (en) | Speech signal enhancement method, device, electronic equipment and storage medium | |
CN111785288B (en) | Voice enhancement method, device, equipment and storage medium | |
CN106157964A (en) | A kind of determine the method for system delay in echo cancellor | |
CN106340304A (en) | Online speech enhancement method for non-stationary noise environment | |
CN107785028A (en) | Voice de-noising method and device based on signal autocorrelation | |
Do et al. | Speech source separation using variational autoencoder and bandpass filter | |
González et al. | MMSE-based missing-feature reconstruction with temporal modeling for robust speech recognition | |
EP4325487A1 (en) | Voice signal enhancement method and apparatus, and electronic device | |
Shi et al. | Fusion feature extraction based on auditory and energy for noise-robust speech recognition | |
CN116013344A (en) | Speech enhancement method under multiple noise environments | |
CN115171712A (en) | Speech enhancement method suitable for transient noise suppression | |
CN114495969A (en) | Voice recognition method integrating voice enhancement | |
CN103903630A (en) | Method and device used for eliminating sparse noise | |
Ernawan et al. | Efficient discrete tchebichef on spectrum analysis of speech recognition | |
CN115223583A (en) | Voice enhancement method, device, equipment and medium | |
CN103903631A (en) | Speech signal blind separating method based on variable step size natural gradient algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190906 |