CN1870136A - Variation Bayesian voice strengthening method based on voice generating model - Google Patents
Variation Bayesian voice strengthening method based on voice generating model
- Publication number: CN1870136A
- Authority: CN (China)
- Prior art keywords: model, distribution, speech, production model, exponent number
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
A variational Bayesian speech enhancement method based on a speech production model. The method sets up a noisy speech model and the state-space equations of the speech production model, and expresses the noise-corruption process and the speech production process as probability distributions. Following the variational Bayesian method, approximate posterior distributions are used to approximate the distributions of the speech production model parameters and of the clean speech. Update equations for the parameters of these approximate posterior distributions are derived and iterated cyclically until the algorithm converges.
Description
Technical field
The present invention relates to a variational Bayesian speech enhancement method based on a speech production model. It can be widely applied in areas such as speech communication and speech recognition, and belongs to the field of speech signal processing.
Background technology
Real speech capture devices and recording environments cannot deliver clean speech: the speech is contaminated by various kinds of background noise. In applications such as speech communication and speech recognition it is therefore important to enhance the speech as a preprocessing step, since enhanced speech better guarantees the accuracy of subsequent speech processing.
To improve speech quality, existing speech enhancement methods mainly fall into the following categories:
The first is the threshold method. Its basic premise is that the parts of the signal with small absolute amplitude are mainly noise, so it further compresses these parts with a linear or nonlinear compression function in order to enhance the speech. The main drawback of this algorithm is that, while compressing the noise, it also compresses a great deal of useful speech information.
The second is spectral subtraction. It assumes that the noise is stationary or slowly varying additive noise and that the speech signal and the noise are mutually independent. The noise power spectrum is then subtracted from the power spectrum of the noisy speech to obtain a cleaner speech spectrum. A well-known drawback of this method is that the enhanced speech contains unnatural tones known as "musical" noise, which listeners find subjectively unpleasant.
The third is enhancement based on a speech production model. Because the parameters of the "clean" speech model cannot be estimated exactly, the model parameters must be estimated directly from the noisy signal; if the model estimate is inaccurate, the intelligibility of the enhanced speech degrades. Accurately estimating the model parameters and the model order from noisy speech is therefore the key to this approach. Gannot et al. (S. Gannot, D. Burshtein and E. Weinstein, Iterative and Sequential Kalman Filter-Based Speech Enhancement Algorithms, IEEE Trans. Speech and Audio Processing, vol. 6, no. 4, July 1998, pp. 373-385) proposed an enhancement algorithm based on Kalman filtering that estimates the speech production model parameters by maximum likelihood. However, this method cannot estimate the model order, which must be determined by other methods or from prior knowledge, and the result is strongly affected by the initial parameter estimates. Vermaak et al. (J. Vermaak, C. Andrieu, A. Doucet and S. J. Godsill, Particle Methods for Bayesian Modeling and Enhancement of Speech Signals, IEEE Trans. Speech and Audio Processing, vol. 10, no. 3, 2002, pp. 173-185) proposed estimating the speech production model parameters with Markov chain Monte Carlo methods and estimating the clean speech signal with a Kalman filter. This method likewise cannot estimate the model order, and its computational cost is very high, which makes it unsuitable for many applications.
Summary of the invention
The object of the present invention is to address the deficiencies of the prior art by proposing a variational Bayesian speech enhancement method based on a speech production model. The method selects the order of the speech production model automatically and avoids overfitting during parameter estimation, so that the model is estimated more accurately and the speech is enhanced more effectively.
To achieve this object, the technical solution of the invention builds on the following consideration. The variational Bayesian method is a Bayesian approximation technique developed in recent years. Its principle is to use approximate posterior distributions of the unknown variables and parameters to approximate their true distributions, which makes Bayesian inference analytically tractable, and it can learn both model structure and model parameters. The invention therefore exploits the method's resistance to overfitting during parameter learning and its capacity for model selection to estimate the parameters and the order of the speech production model accurately, and thereby to enhance speech more effectively. The invention first sets up the state-space equations of the noisy speech model and the speech production model, and then expresses the probability distributions of the noise-corruption process and the speech production process. Following the variational Bayesian method, approximate posterior distributions are used to approximate the distributions of the speech production model parameters and of the clean speech signal. Finally, update equations for the parameters of these approximate posterior distributions are derived and iterated cyclically until the algorithm converges. Automatic model selection treats the order of the speech production model as the independent variable of the variational Bayesian cost function: the order corresponding to the minimum cost function value is the optimal model order, and the speech signal computed with this optimal order is the optimal result.
The variational Bayesian speech enhancement method based on a speech production model of the present invention mainly comprises the following steps:
1. The noisy speech signal is expressed as the sum of the clean speech signal and the noise, which sets up the noisy speech model; the speech production model is represented by an autoregressive process; and the state-space equations corresponding to the noisy speech model and the speech production model are set up.
2. The noise of the noisy speech model is chosen to be Gaussian, and the driving noise of the speech production model is also chosen to be Gaussian. From these two Gaussian distributions and the state-space equations of the noisy speech model and the speech production model, the probability distributions of the state vector and the observation vector are derived. The prior distributions of the weight coefficients of the speech production model and of the inverse variances of all Gaussian distributions are determined from prior knowledge.
3. From the cost function of the variational Bayesian method, the probability distributions of the state vector and the observation vector, and the prior distributions of the weight coefficients of the speech production model and of the inverse variances of all Gaussian distributions, the variational expectation-maximization algorithm yields the approximate posterior distributions of the state vector, of the weight coefficients of the speech production model, and of the inverse variances of all Gaussian distributions.
4. The update equations for the parameters of the approximate posterior distribution of the state vector are obtained with the variational Kalman smoothing algorithm. The update equations for the parameters of the approximate posterior distributions of the weight coefficients of the speech production model and of the inverse variances of all Gaussian distributions are derived from the variational maximization step of the variational expectation-maximization algorithm.
5. An initial order is selected within a predetermined range of speech production model orders. The noisy speech signal and the initial order are substituted into the parameter update equations derived in step 4, and the cost function is computed iteratively until the absolute value of its change from one iteration to the next does not exceed a predetermined threshold. The cost function value at that point and the corresponding parameters of the approximate posterior distribution of the state vector are saved.
6. The model order is varied in turn over the predetermined range. Each new order replaces the initial order of step 5 and step 5 is repeated, which yields a set of cost function values and state-vector approximate posterior parameters, one for each model order.
7. Among all the cost function values obtained, the order corresponding to the minimum is the optimal model order, and the speech signal computed from the approximate posterior parameters of the state vector for this optimal order is the optimal result.
The present invention makes full use of the ability of variational Bayesian learning to learn both model parameters and model structure, estimates the parameters and the order of the speech production model more accurately, and improves the speech enhancement result.
The variational Bayesian speech enhancement method based on a speech production model proposed by the present invention can be widely applied in areas such as speech communication and speech recognition, and has considerable practical value.
Embodiment
For a better understanding of the technical solution of the present invention, a more detailed description follows.
1. The noisy speech signal $x_t$ is expressed as the sum of the clean speech signal $s_t$ and the noise $n_t$, which gives the noisy speech model:

$$x_t = s_t + n_t \qquad (1)$$

The subscript $t$ denotes time. The speech production model is represented by an autoregressive process (2), and the corresponding state-space equations are (3) and (4).
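The equations of the autoregressive model and its state-space form are not reproduced in this text. As an assumption, a standard order-$p$ formulation that is consistent with the matrices $A$ and $C$ used in the Kalman recursion later in this description would read as follows. The coefficient symbols $a_i$, the bold state vector $\mathbf{s}_t$ and the exact layout of $A$ are illustrative choices, not taken from the patent.

$$s_t = \sum_{i=1}^{p} a_i\, s_{t-i} + e_t$$

$$\mathbf{s}_t = \begin{bmatrix} s_t & s_{t-1} & \cdots & s_{t-p+1} \end{bmatrix}^{T}, \qquad \mathbf{s}_t = A\,\mathbf{s}_{t-1} + \begin{bmatrix} e_t & 0 & \cdots & 0 \end{bmatrix}^{T}, \qquad x_t = C\,\mathbf{s}_t + n_t$$

$$A = \begin{bmatrix} a_1 & a_2 & \cdots & a_{p-1} & a_p \\ 1 & 0 & \cdots & 0 & 0 \\ \vdots & \ddots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}$$

In this form the first component of the state vector is the clean speech sample, so an estimate of the state directly gives an estimate of $s_t$.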
2. The noise $n_t$ is chosen to be Gaussian, $p(n_t) = G(n_t \mid 0, \gamma)$. The driving noise $e_t$ of the autoregressive model is also chosen to be Gaussian, $p(e_t) = G(e_t \mid 0, \beta)$. Here $G(y \mid a, b)$ denotes a Gaussian distribution of the random variable $y$ with mean $a$ and inverse variance $b$. The probability distribution (5) of the state vector follows from (3), and the probability distribution (6) of the observation vector follows from (4). The weight coefficients of the autoregressive model follow a zero-mean Gaussian prior distribution (7). The inverse variances of all Gaussian distributions follow Gamma prior distributions:
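$$p(\alpha \mid H) = \mathrm{Gamma}(\alpha \mid b_{(\alpha)}, c_{(\alpha)}) \qquad (8)$$

$$p(\beta \mid H) = \mathrm{Gamma}(\beta \mid b_{(\beta)}, c_{(\beta)}) \qquad (9)$$

$$p(\gamma \mid H) = \mathrm{Gamma}(\gamma \mid b_{(\gamma)}, c_{(\gamma)}) \qquad (10)$$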
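To make the generative assumptions of steps 1 and 2 concrete, the following Python sketch draws a clean autoregressive signal and adds Gaussian observation noise, with both noise sources parameterized by inverse variance as in $G(y \mid a, b)$. The recursion $s_t = \sum_i a_i s_{t-i} + e_t$, the function names and the example coefficients are illustrative assumptions and are not taken from the patent.

```python
import math
import random


def simulate_noisy_ar(coeffs, beta, gamma, n_samples, seed=0):
    """Draw clean AR samples s_t and noisy observations x_t = s_t + n_t.

    coeffs : assumed AR weight coefficients; coeffs[i] multiplies s_{t-1-i}
    beta   : inverse variance of the driving noise e_t
    gamma  : inverse variance of the observation noise n_t
    """
    rng = random.Random(seed)
    drive_std = 1.0 / math.sqrt(beta)
    noise_std = 1.0 / math.sqrt(gamma)
    history = [0.0] * len(coeffs)  # s_{t-1}, ..., s_{t-p}, started at zero
    clean, noisy = [], []
    for _ in range(n_samples):
        e_t = rng.gauss(0.0, drive_std)
        s_t = sum(a * s for a, s in zip(coeffs, history)) + e_t
        n_t = rng.gauss(0.0, noise_std)
        clean.append(s_t)
        noisy.append(s_t + n_t)  # equation (1)
        history = [s_t] + history[:-1]
    return clean, noisy


def gaussian_log_density(y, mean, precision):
    """Return log G(y | mean, precision), where precision = 1 / variance."""
    return (0.5 * (math.log(precision) - math.log(2.0 * math.pi))
            - 0.5 * precision * (y - mean) ** 2)


clean, noisy = simulate_noisy_ar([0.5, -0.2], beta=4.0, gamma=100.0, n_samples=50)
```

A large `gamma` corresponds to weak observation noise, so `noisy` stays close to `clean`; a small `gamma` buries the speech in noise.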
3. Let $X$ denote the set of observations $\{x_1, x_2, \ldots, x_T\}$, let $S$ denote the set of state vectors, and let $\theta$ denote the set comprising the weight coefficients of the speech production model and the inverse variances of all Gaussian distributions. The principle of the variational Bayesian method is to use an approximate posterior distribution $Q(S, \theta)$ to approximate $p(S, \theta \mid X)$. In practice this is done through the cost function (11), in which $\langle \cdot \rangle_Q$ denotes the expectation under the probability distribution $Q(\cdot)$. From the cost function (11) of the variational Bayesian method, the probability distributions (5)-(6) of the state vector and the observation vector, and the prior distributions (7)-(10) of the weight coefficients of the speech production model and of the inverse variances of all Gaussian distributions, the variational expectation-maximization algorithm yields the approximate posterior distributions of the state vector (12), of the weight coefficients of the speech production model (13), and of the inverse variances of all Gaussian distributions:
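$$Q(\alpha) = \mathrm{Gamma}(\alpha \mid \hat{b}_{(\alpha)}, \hat{c}_{(\alpha)}) \qquad (14)$$

$$Q(\beta) = \mathrm{Gamma}(\beta \mid \hat{b}_{(\beta)}, \hat{c}_{(\beta)}) \qquad (15)$$

$$Q(\gamma) = \mathrm{Gamma}(\gamma \mid \hat{b}_{(\gamma)}, \hat{c}_{(\gamma)}) \qquad (16)$$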
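The cost function (11) itself is not reproduced in this text. As an assumption, the form commonly used in variational Bayesian learning, which matches the description of an expectation under $Q$ that is minimized, is:

$$C(Q) = \left\langle \ln \frac{Q(S, \theta)}{p(X, S, \theta \mid H)} \right\rangle_Q = \mathrm{KL}\big(Q(S, \theta) \,\|\, p(S, \theta \mid X, H)\big) - \ln p(X \mid H) \;\ge\; -\ln p(X \mid H)$$

Under this assumed form the cost is an upper bound on the negative log evidence of the model $H$, and it is tight when $Q$ equals the true posterior. This is why comparing the converged cost across model orders acts as a model selection criterion: a lower cost indicates an order whose evidence bound is better.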
4. The parameters of the approximate posterior distribution (12) of the state vector are obtained with the variational Kalman smoothing algorithm. A sequence $\{x_{t_0}, x_{t_0+1}, \ldots, x_{t_1}\}$ is written $\{x\}_{t_0}^{t_1}$. First the conditional expectation and the conditional covariance matrix of the state vector are defined, together with their initial values, where $V_{0|0} = V_0$. For $t = 1, \ldots, T$, the forward recursion of the Kalman filter includes:

$$V_{t|t-1} = A\, V_{t-1|t-1}\, A^{T} + P \qquad (18)$$

$$V_{t|t} = V_{t|t-1} - K_t\, C\, V_{t|t-1} \qquad (21)$$

Here the inverse of the expected driving-noise inverse variance, $(\langle \beta \rangle_Q)^{-1}$, is used, and the result of the recursion is the Kalman filtering distribution of the state vector. Kalman smoothing then proceeds: the smoothed mean and $V_{T|T}$ are initialized with the corresponding Kalman filtering values, and the backward recursion is carried out for $t = T-1, \ldots, 0$. This yields the update equations for the parameters of the approximate posterior distribution of the state vector.
The update equations for the parameters of the approximate posterior distributions of the weight coefficients of the speech production model and of the inverse variances of all Gaussian distributions are derived from the variational maximization step of the variational expectation-maximization algorithm.
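The forward and backward recursions of step 4 can be illustrated with an ordinary scalar Kalman filter and backward smoother for an order-1 autoregressive model. This is a minimal sketch under stated assumptions, not the patent's variational algorithm: the gain and mean updates use the standard textbook forms, the variance updates mirror equations (18) and (21) with $A = a$ and $C = 1$, and the expectations under $Q$ that the variational version relies on are replaced by fixed parameter values. The function name and arguments are hypothetical.

```python
def kalman_smooth_ar1(x, a, beta, gamma, mean0=0.0, var0=1.0):
    """Scalar Kalman filter plus backward smoother for an assumed AR(1) model.

    Model: s_t = a * s_{t-1} + e_t and x_t = s_t + n_t,
    with Var(e_t) = 1 / beta and Var(n_t) = 1 / gamma.
    Returns the smoothed means and variances of s_1, ..., s_T.
    """
    q = 1.0 / beta   # driving-noise variance, playing the role of P in (18)
    r = 1.0 / gamma  # observation-noise variance
    pred_mean, pred_var, filt_mean, filt_var = [], [], [], []
    m, v = mean0, var0
    for x_t in x:
        # Forward pass: predict, then correct with the new observation.
        m_pred = a * m
        v_pred = a * v * a + q            # equation (18) with A = a
        k = v_pred / (v_pred + r)         # Kalman gain with C = 1 (standard form)
        m = m_pred + k * (x_t - m_pred)   # filtered mean (standard form)
        v = v_pred - k * v_pred           # equation (21) with C = 1
        pred_mean.append(m_pred)
        pred_var.append(v_pred)
        filt_mean.append(m)
        filt_var.append(v)
    # Backward pass: start from the last filtered values and move toward t = 1.
    smooth_mean, smooth_var = list(filt_mean), list(filt_var)
    for t in range(len(x) - 2, -1, -1):
        j = filt_var[t] * a / pred_var[t + 1]  # smoother gain
        smooth_mean[t] = filt_mean[t] + j * (smooth_mean[t + 1] - pred_mean[t + 1])
        smooth_var[t] = filt_var[t] + j * j * (smooth_var[t + 1] - pred_var[t + 1])
    return smooth_mean, smooth_var


means, variances = kalman_smooth_ar1([0.3, -0.1, 0.4], a=0.9, beta=1.0, gamma=4.0)
```

With a very large `gamma` the observations are trusted almost completely and the smoothed means follow `x`; with a very small `gamma` the estimate falls back on the model prediction.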
5. Within the predetermined range of speech production model orders, select an initial order $p_1$. Substitute the actual noisy signal $x_t$ and the initial order $p_1$ into the parameter update equations (17)-(32) derived in step 4 and compute the cost function (11) iteratively, stopping when the absolute value of its change from one iteration to the next does not exceed a predetermined threshold. Save the cost function value at that point and the corresponding parameters of the approximate posterior distribution of the state vector.
6. Within the predetermined range of speech production model orders, vary the model order in turn: replace the initial order $p_1$ of step 5 with each new order $p$ and repeat step 5, which yields a set of cost function values and state-vector approximate posterior parameters, one for each model order.
7. Among all the cost function values obtained, the value of $p$ corresponding to the minimum is the optimal model order, and the speech signal computed from the approximate posterior parameters of the state vector for this optimal order is the optimal result.
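The control flow of steps 5 to 7 can be sketched in Python as follows. The update equations (17)-(32) are not available in this text, so the sketch takes them as a caller-supplied function; `update_step`, `run_until_converged`, `select_order` and the toy stand-in `toy_update_step` are hypothetical names introduced only for illustration.

```python
def run_until_converged(update_step, order, threshold=1e-6, max_iters=1000):
    """Step 5: iterate until the cost changes by no more than `threshold`.

    `update_step(order, state)` must return (new_state, cost); it stands in
    for one sweep of the update equations followed by a cost evaluation.
    """
    state, prev_cost = None, None
    for _ in range(max_iters):
        state, cost = update_step(order, state)
        if prev_cost is not None and abs(cost - prev_cost) <= threshold:
            return cost, state
        prev_cost = cost
    return prev_cost, state  # iteration budget exhausted


def select_order(update_step, orders, threshold=1e-6):
    """Steps 6 and 7: run every candidate order and keep the minimum cost."""
    results = {p: run_until_converged(update_step, p, threshold) for p in orders}
    best_order = min(results, key=lambda p: results[p][0])
    return best_order, results


def toy_update_step(order, state):
    """Toy stand-in: the cost decays toward (order - 3) ** 2 as sweeps proceed."""
    n = 0 if state is None else state + 1
    return n, (order - 3) ** 2 + 0.5 ** n


best_order, results = select_order(toy_update_step, range(1, 7))
```

In a real implementation the saved state for `best_order` would hold the approximate posterior parameters of the state vector, from which the enhanced speech is read off.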
Claims (1)
1. A variational Bayesian speech enhancement method based on a speech production model, characterized by comprising the following steps:
1) expressing the noisy speech signal as the sum of the clean speech signal and the noise to set up the noisy speech model, representing the speech production model by an autoregressive process, and setting up the state-space equations corresponding to the noisy speech model and the speech production model;
2) choosing the noise of the noisy speech model to be Gaussian and the driving noise of the speech production model also to be Gaussian; deriving the probability distributions of the state vector and the observation vector from these two Gaussian distributions and the state-space equations of the noisy speech model and the speech production model; and determining the prior distributions of the weight coefficients of the speech production model and of the inverse variances of all Gaussian distributions from prior knowledge;
3) obtaining, with the variational expectation-maximization algorithm, the approximate posterior distributions of the state vector, of the weight coefficients of the speech production model, and of the inverse variances of all Gaussian distributions, from the cost function of the variational Bayesian method, the probability distributions of the state vector and the observation vector, and the prior distributions of the weight coefficients of the speech production model and of the inverse variances of all Gaussian distributions;
4) obtaining the update equations for the parameters of the approximate posterior distribution of the state vector with the variational Kalman smoothing algorithm, and deriving the update equations for the parameters of the approximate posterior distributions of the weight coefficients of the speech production model and of the inverse variances of all Gaussian distributions from the variational maximization step of the variational expectation-maximization algorithm;
5) selecting an initial order within a predetermined range of speech production model orders, substituting the noisy speech signal and the initial order into the parameter update equations derived in step 4), computing the cost function iteratively until the absolute value of its change from one iteration to the next does not exceed a predetermined threshold, and saving the cost function value at that point together with the corresponding parameters of the approximate posterior distribution of the state vector;
6) varying the model order in turn over the predetermined range of speech production model orders, replacing the initial order of step 5) with each new order and repeating step 5), to obtain a set of cost function values and state-vector approximate posterior parameters, one for each model order;
7) taking, among all the cost function values obtained, the order corresponding to the minimum as the optimal model order, the speech signal computed from the approximate posterior parameters of the state vector for this optimal order being the optimal result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006100283311A CN100498935C (en) | 2006-06-29 | 2006-06-29 | Variation Bayesian voice strengthening method based on voice generating model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006100283311A CN100498935C (en) | 2006-06-29 | 2006-06-29 | Variation Bayesian voice strengthening method based on voice generating model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1870136A (en) | 2006-11-29 |
CN100498935C (en) | 2009-06-10 |
Family
ID=37443781
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2006100283311A Expired - Fee Related CN100498935C (en) | 2006-06-29 | 2006-06-29 | Variation Bayesian voice strengthening method based on voice generating model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100498935C (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102254552A (en) * | 2011-07-14 | 2011-11-23 | 杭州电子科技大学 | Semantic enhanced transport vehicle acoustic information fusion method |
CN102254552B (en) * | 2011-07-14 | 2012-10-03 | 杭州电子科技大学 | Semantic enhanced transport vehicle acoustic information fusion method |
CN102637438A (en) * | 2012-03-23 | 2012-08-15 | 同济大学 | Voice filtering method |
CN102637438B (en) * | 2012-03-23 | 2013-07-17 | 同济大学 | Voice filtering method |
CN104737229A (en) * | 2012-10-22 | 2015-06-24 | 三菱电机株式会社 | Method for transforming input signal |
CN108206024A (en) * | 2017-12-29 | 2018-06-26 | 河海大学常州校区 | A kind of voice data processing method based on variation Gauss regression process |
CN113421545A (en) * | 2021-06-30 | 2021-09-21 | 平安科技(深圳)有限公司 | Multi-modal speech synthesis method, device, equipment and storage medium |
CN113421545B (en) * | 2021-06-30 | 2023-09-29 | 平安科技(深圳)有限公司 | Multi-mode voice synthesis method, device, equipment and storage medium |
CN117540173A (en) * | 2024-01-09 | 2024-02-09 | 长江水利委员会水文局 | Flood simulation uncertainty analysis method based on Bayesian joint probability model |
CN117540173B (en) * | 2024-01-09 | 2024-04-19 | 长江水利委员会水文局 | Flood simulation uncertainty analysis method based on Bayesian joint probability model |
Also Published As
Publication number | Publication date |
---|---|
CN100498935C (en) | 2009-06-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109859767B (en) | Environment self-adaptive neural network noise reduction method, system and storage medium for digital hearing aid | |
CN109841226B (en) | Single-channel real-time noise reduction method based on convolution recurrent neural network | |
CN110619885B (en) | Method for generating confrontation network voice enhancement based on deep complete convolution neural network | |
CN1870136A (en) | Variation Bayesian voice strengthening method based on voice generating model | |
CN101976566B (en) | Voice enhancement method and device using same | |
CN107272066A (en) | A kind of noisy seismic signal first-arrival traveltime pick-up method and device | |
CN110045419A (en) | A kind of perceptron residual error autoencoder network seismic data denoising method | |
CN111985523A (en) | Knowledge distillation training-based 2-exponential power deep neural network quantification method | |
JP2013534651A5 (en) | ||
CN110490816B (en) | Underwater heterogeneous information data noise reduction method | |
CN104067340B (en) | For the method for voice strengthened in mixed signal | |
CN109192200A (en) | A kind of audio recognition method | |
CN115618204A (en) | Electric energy data denoising method based on optimal wavelet basis and improved wavelet threshold function | |
CN112861740A (en) | Wavelet threshold denoising parameter selection method based on composite evaluation index and wavelet entropy | |
CN102930863B (en) | Voice conversion and reconstruction method based on simplified self-adaptive interpolation weighting spectrum model | |
CN1909064A (en) | Time-domain blind separating method for in-line natural voice convolution mixing signal | |
CN1805011A (en) | Adaptive filter method and apparatus for improving speech quality of mobile communication apparatus | |
CN101923716B (en) | Method for improving particle filter tracking effect | |
CN102184530A (en) | Image denoising method based on gray relation threshold value | |
CN115440240A (en) | Training method for voice noise reduction, voice noise reduction system and voice noise reduction method | |
CN105185385A (en) | Voice fundamental tone frequency estimation method based on gender anticipation and multi-frequency-band parameter mapping | |
CN1924850A (en) | Audio fast search method | |
CN114141266A (en) | Speech enhancement method for estimating prior signal-to-noise ratio based on PESQ driven reinforcement learning | |
CN1845640A (en) | Wireless channel blind estimation method based on wavelet shrinkage and HMM | |
CN108573698B (en) | Voice noise reduction method based on gender fusion information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20090610 Termination date: 20120629 |