CN103824561B

CN103824561B - Missing value nonlinear estimating method of speech linear predictive coding model

Info

Publication number: CN103824561B
Application number: CN201410054042.3A
Authority: CN
Inventors: 马占宇; 齐峰; 司中威; 郭军; 张洪刚
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2014-02-18
Filing date: 2014-02-18
Publication date: 2015-03-11
Anticipated expiration: 2034-02-18
Also published as: CN103824561A

Abstract

The embodiment of the invention discloses a nonlinear missing-value estimation method for the speech linear predictive coding model. The method comprises the following steps: line spectral frequency (LSF) parameter transformation, in which the LSF parameters of the speech linear predictive coding model are converted into LSF parameter differences through a linear transform; model training; calculation of the probability distributions of the lost part and the received part during transmission; and minimum mean square error (MMSE) optimal estimation. With this method, the linear prediction model can be estimated reliably and optimally when packet loss occurs during packet transmission, which reduces transmission loss and improves voice quality. The method therefore has great practical value.

Description

A nonlinear missing-value estimation method for the speech linear predictive coding model

Technical field

The present invention relates to the handling of packet loss during voice transmission over packet networks, and in particular describes a nonlinear optimal estimation method based on transformed line spectral frequency parameters and a Dirichlet mixture model.

Background art

With the in-depth development of Internet technology, voice communication technology has made significant progress, and transmitted voice signals have evolved from narrowband to wideband. With the continuous development and spread of multimedia applications, people demand ever higher transmission quality and real-time performance from voice communication technology. Research on efficient and reliable voice communication algorithms therefore meets an urgent social need.

The first problem to be solved in voice communication is speech coding. After decades of development, speech coding techniques can be roughly divided into three categories: waveform coding, parametric-model-based coding, and hybrid coding. Waveform coding quantizes and transmits the speech waveform directly and is not based on an acoustic model. Parametric-model-based coding analyzes the speech with a linear prediction model and then transmits the linear prediction model, the side information, and the speech energy information separately. Hybrid coding combines the two.

In speech coding, parametric-model-based coding is widely used, and its core is how to quantize and encode the linear prediction model efficiently and reliably. In research on the speech linear predictive coding model, the linear predictive coding (LPC) parameters are generally converted into line spectral frequency (LSF) parameters. This representation is more stable and efficient than other parametric representations because its spectral sensitivity is distributed relatively evenly.

When voice is transmitted over a packet network, the quality of the recovered speech depends to a large extent on network conditions. In packet transmission, if delayed or lost packets can be estimated from the known information, the voice signal can be recovered effectively and extra delay avoided, which improves voice quality and the user experience. Traditionally, the joint distribution of the missing and received line spectral frequency elements is modeled mainly with a Gaussian mixture model (GMM): the GMM models the joint distribution of the received part and the lost part, from which the content of the lost packet is optimally estimated. Recent research shows that the linear prediction model can be encoded by quantizing the LSF parameter differences, and that this is more efficient than traditional GMM-based quantization of the LSF parameters. When LSF differences are transmitted, however, the traditional GMM cannot model the distribution of the data well and therefore cannot achieve optimal prediction. It is therefore particularly important to design a statistical model suited to LSF differences and to use it to optimally estimate the packets lost in packet transmission.

Summary of the invention

To address packet loss in existing voice transmission, the object of the present invention is to provide a nonlinear optimization algorithm that estimates the lost content and recovers the quality of the transmitted voice as far as possible.

To achieve the above object, the nonlinear optimal missing-value estimation method proposed by the present invention comprises the following steps:

Line spectral frequency parameter transformation step: the line spectral frequency parameters of the speech linear predictive coding model are converted into line spectral frequency parameter differences by a linear transformation;

Model training step: at the transmitting end, a Dirichlet mixture model (DMM) is used to model the distribution of the line spectral frequency parameter differences, and the parameters of the DMM are trained with the expectation-maximization algorithm;

Step of calculating the distributions of the lost part and the received part during transmission: under the assumption that the line spectral frequency parameter differences follow a Dirichlet distribution, the differences are divided into a lost part and a received part, and the corresponding Dirichlet distribution of each part is obtained after normalization;

Minimum mean square error optimal estimation step: the optimal estimate of the missing values is obtained according to the minimum mean square error criterion.

In the line spectral frequency parameter transformation step, the (1) non-negativity, (2) ordering, and (3) boundedness of the line spectral frequency parameters are used to transform them into line spectral frequency parameter differences ΔLSF, which (1) lie in the open interval (0, 1) and (2) sum to 1. The detailed process of this step is as follows:

1) The K-dimensional line spectral frequency parameters are expressed as s = [s_{1}, s_{2}, ..., s_{K}]^{T}, satisfying 0 < s_{1} < s_{2} < ... < s_{K} < π;

2) The (K+1)-dimensional line spectral frequency parameter difference ΔLSF after the transformation is \tilde{x} = [x_{1}, x_{2}, ..., x_{K + 1}]^{T}, where

x_{i} = \begin{cases} s_{1} / π, & i = 1 \\ (s_{i} - s_{i - 1}) / π, & 1 < i \leq K \\ (π - s_{K}) / π, & i = K + 1 \end{cases}.
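The transformation above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `lsf_to_delta_lsf` and the 4th-order example values are assumptions made here.

```python
import math


def lsf_to_delta_lsf(s):
    """Map K ordered LSFs in (0, pi) to K+1 differences that sum to 1."""
    if not s:
        raise ValueError("need at least one LSF value")
    # Pad with the interval ends so every difference is edge[i+1] - edge[i].
    edges = [0.0] + list(s) + [math.pi]
    # The text requires 0 < s_1 < s_2 < ... < s_K < pi; reject anything else.
    if any(b <= a for a, b in zip(edges, edges[1:])):
        raise ValueError("LSFs must be strictly increasing inside (0, pi)")
    return [(b - a) / math.pi for a, b in zip(edges, edges[1:])]


# Hypothetical 4th-order LSF vector, chosen for illustration only.
x = lsf_to_delta_lsf([0.3, 0.9, 1.7, 2.6])
```

Because the differences telescope to (π - 0) / π, the output always sums to 1, which is what lets the later steps treat it as a point on the probability simplex.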

In the model training step, before transmission, the ΔLSF vector of the voice signal to be sent is assumed to follow a Dirichlet distribution. The model is trained at the transmitting end to obtain the parameters of the i-th mixture component of the mixture model:

α_{i} = [\begin{matrix} α_{i}^{M} \\ α_{i}^{R} \end{matrix}],

where

α_{i}^{M} = [\begin{matrix} α_{1i}^{M} \\ \vdots \\ α_{mi}^{M} \\ \vdots \\ α_{Mi}^{M} \end{matrix}], α_{i}^{R} = [\begin{matrix} α_{1i}^{R} \\ \vdots \\ α_{ri}^{R} \\ \vdots \\ α_{Ri}^{R} \end{matrix}].

These parameters are known at the receiving end before transmission.
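The text names expectation-maximization as the training algorithm but does not spell out the updates. The sketch below shows one possible shape of such a trainer. It is an assumption-laden illustration: the E-step is the standard posterior responsibility, but the M-step uses weighted moment matching on the first coordinate instead of a maximum-likelihood update, so it is an approximation and not the patent's procedure. All function names, the initialisation, and the synthetic data are hypothetical.

```python
import math
import random


def dirichlet_logpdf(x, alpha):
    """Log density of a Dirichlet distribution at a point x on the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x))


def e_step(data, weights, alphas):
    """Responsibilities resp[n][i] = P(component i | x_n)."""
    resp = []
    for x in data:
        logs = [math.log(w) + dirichlet_logpdf(x, a) for w, a in zip(weights, alphas)]
        peak = max(logs)  # subtract the maximum before exponentiating
        unnorm = [math.exp(v - peak) for v in logs]
        total = sum(unnorm)
        resp.append([u / total for u in unnorm])
    return resp


def m_step(data, resp, n_comp):
    """Weights from responsibilities; alphas by moment matching (approximation)."""
    dim = len(data[0])
    weights, alphas = [], []
    for i in range(n_comp):
        r_sum = sum(r[i] for r in resp)
        mean = [sum(r[i] * x[k] for r, x in zip(resp, data)) / r_sum for k in range(dim)]
        var0 = sum(r[i] * (x[0] - mean[0]) ** 2 for r, x in zip(resp, data)) / r_sum
        # For a Dirichlet, var(x_1) = m_1 (1 - m_1) / (precision + 1).
        precision = max(mean[0] * (1.0 - mean[0]) / max(var0, 1e-12) - 1.0, 1e-3)
        weights.append(r_sum / len(data))
        alphas.append([precision * m for m in mean])
    return weights, alphas


def fit_dmm(data, n_comp, n_iter=30, seed=0):
    """Alternate M- and E-steps, starting from random soft assignments."""
    rng = random.Random(seed)
    resp = []
    for _ in data:
        row = [rng.random() + 1e-3 for _ in range(n_comp)]
        total = sum(row)
        resp.append([v / total for v in row])
    for _ in range(n_iter):
        weights, alphas = m_step(data, resp, n_comp)
        resp = e_step(data, weights, alphas)
    return weights, alphas


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [v / total for v in g]


# Synthetic two-cluster data standing in for real Delta-LSF training vectors.
rng = random.Random(1)
data = [sample_dirichlet([8.0, 2.0, 2.0], rng) for _ in range(150)]
data += [sample_dirichlet([2.0, 2.0, 8.0], rng) for _ in range(150)]
weights, alphas = fit_dmm(data, n_comp=2)
```

A production trainer would replace the moment-matching update with an iterative maximum-likelihood update of each α_{i}; the moment-matching form is used here only because it needs nothing beyond the standard library.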

In the step of calculating the distributions of the lost part and the received part during transmission, \tilde{x} is assumed to follow a Dirichlet distribution. After transmission it can be divided into two parts: the lost part \tilde{x}^{M} and the received part \tilde{x}^{R}. Because a Dirichlet vector is a neutral vector, the lost part can be estimated from the correlation between the two parts. After \tilde{x}^{M} and \tilde{x}^{R} are normalized, their marginal probability distributions can be calculated separately, as follows:

1) Input: the ΔLSF parameter \tilde{x} obtained in the first step is divided into a lost part and a received part, namely

\tilde{x} = [\begin{matrix} {\tilde{x}}^{M} \\ {\tilde{x}}^{R} \end{matrix}],

where the two parts contain M and R elements respectively;

2) Normalize \tilde{x}^{M} and \tilde{x}^{R} separately:

A) Sum:

S^{M} = Σ_{m = 1}^{M} x_{m}^{M} = 1 - S^{R}, S^{R} = Σ_{r = 1}^{R} x_{r}^{R},

where M and R are the lengths of the vectors \tilde{x}^{M} and \tilde{x}^{R} respectively;

B) Normalization gives \bar{x}^{M} = \tilde{x}^{M} / S^{M}, and likewise \bar{x}^{R} = \tilde{x}^{R} / S^{R};

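3) The normalized lost part \bar{x}^{M} = \tilde{x}^{M} / S^{M} sums to 1 and, by the Dirichlet assumption, follows a Dirichlet distribution; under the i-th mixture component its probability density function is:

Dir(\bar{x}^{M}; α_{i}^{M}) = \frac{Γ(Σ_{m = 1}^{M} α_{mi}^{M})}{Π_{m = 1}^{M} Γ(α_{mi}^{M})} Π_{m = 1}^{M} (\bar{x}_{m}^{M})^{α_{mi}^{M} - 1}.

Likewise, the distribution of the normalized received part \bar{x}^{R} = \tilde{x}^{R} / S^{R} is:

Dir(\bar{x}^{R}; α_{i}^{R}) = \frac{Γ(Σ_{r = 1}^{R} α_{ri}^{R})}{Π_{r = 1}^{R} Γ(α_{ri}^{R})} Π_{r = 1}^{R} (\bar{x}_{r}^{R})^{α_{ri}^{R} - 1}.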
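The partition and normalization can be illustrated with a small Python sketch. The helper names `partition` and `normalize`, the index-based description of which entries were lost, and the example vector are assumptions made for illustration. At a real receiver only the received entries are available, and the lost-part sum follows from S^{M} = 1 - S^{R}.

```python
def partition(x, lost_idx):
    """Split a Delta-LSF vector into (lost part, received part) by index."""
    lost = set(lost_idx)
    x_m = [v for k, v in enumerate(x) if k in lost]
    x_r = [v for k, v in enumerate(x) if k not in lost]
    return x_m, x_r


def normalize(part):
    """Return (sum of the part, part rescaled so that it sums to 1)."""
    total = sum(part)
    if total <= 0.0:
        raise ValueError("part must have a positive sum")
    return total, [v / total for v in part]


# Hypothetical Delta-LSF vector with entries 1 and 2 (0-based) treated as lost.
x_full = [0.1, 0.2, 0.3, 0.4]
x_m, x_r = partition(x_full, lost_idx=[1, 2])
s_r, x_bar_r = normalize(x_r)
s_m = 1.0 - s_r  # known at the receiver without seeing the lost part
_, x_bar_m = normalize(x_m)  # only computable when the lost part is known, e.g. in training
```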

Minimum mean square error optimal estimation step: according to the minimum mean square error criterion, the optimal estimate \hat{x}^{M} of the lost part is the mean of the normalized lost part multiplied by (1 - S^{R}), that is, the conditional mean of the lost part given the received part. The result is given by:

\hat{x}^{M} = (1 - S^{R}) Σ_{i = 1}^{I} \tilde{π}_{i} \frac{α_{i}^{M}}{Σ_{m = 1}^{M} α_{mi}^{M}},

where I is the number of mixture components and the weights \tilde{π}_{i} are determined by the parameters of the probability density function of the received part.
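The factor (1 - S^{R}) can be motivated from the neutrality property. The following is a short derivation sketch for a single Dirichlet component with parameter sub-vector α^{M}; the symbol \bar{x}^{M} for the normalized lost part is notation assumed here, since the original formula images are not legible in this text.

```latex
\begin{aligned}
\tilde{x}^{M} &= S^{M}\,\bar{x}^{M} = (1 - S^{R})\,\bar{x}^{M}, \\
E\left[\tilde{x}^{M} \mid \tilde{x}^{R}\right]
  &= (1 - S^{R})\,E\left[\bar{x}^{M} \mid \tilde{x}^{R}\right]
   = (1 - S^{R})\,E\left[\bar{x}^{M}\right]
   && \text{(neutrality: } \bar{x}^{M} \text{ is independent of } \tilde{x}^{R}\text{)}, \\
E\left[\bar{x}^{M}\right]
  &= \frac{\alpha^{M}}{\sum_{m=1}^{M} \alpha_{m}^{M}}
   && \text{(mean of } \mathrm{Dir}(\alpha^{M})\text{)}.
\end{aligned}
```

For a mixture, the law of total expectation turns this into a weighted sum of the per-component means, with each weight equal to the probability of that component given the received part.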

The beneficial effect of the present invention is that, compared with the prior art, it transmits transformed line spectral frequency parameters, models the distribution of the transmitted signal with the Dirichlet distribution, and provides a complete estimation system for applications. Test results demonstrate the efficiency of the invention, which is highly practical.

Brief description of the drawings

Fig. 1 is a flow chart of the steps of the nonlinear optimal packet loss estimation method for the speech linear predictive model according to the present invention;

Fig. 2 is a flow chart of the steps of the line spectral frequency parameter transformation;

Fig. 3 is a flow chart of the steps for training the mixture component parameters at the transmitting end;

Fig. 4 is a flow chart of the steps for calculating the distributions of the lost part and the received part during transmission;

Fig. 5 is a flow chart of the steps of the minimum mean square error optimal estimation.

Detailed description of the embodiments

Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Fig. 1 is the flow chart of the present invention, which comprises the following steps:

Step S1: converting the line spectral frequency parameters into line spectral frequency parameter differences;

Step S2: training the mixture component parameters at the transmitting end;

Step S3: calculating the normalized probability distributions of the lost part and the received part during transmission;

Step S4: minimum mean square error optimal estimation.

Each step is described in detail below:

Step S1 performs the line spectral frequency parameter transformation: the line spectral frequency parameters of the speech linear predictive coding model are converted into line spectral frequency parameter differences by a linear transformation. Fig. 2 shows the specific flow of the method, as follows:

1) Input:

A) the line spectral frequency parameters s = [s_{1}, s_{2}, ..., s_{K}]^{T};

2) In step 11, i loops from 1 to K+1, and the difference obtained in each iteration is:

x_{i} = \begin{cases} s_{1} / π, & i = 1 \\ (s_{i} - s_{i - 1}) / π, & 1 < i \leq K \\ (π - s_{K}) / π, & i = K + 1 \end{cases};

3) Output:

A) the line spectral frequency parameter differences

\tilde{x} = {[x_{1}, x_{2}, ..., x_{K + 1}]}^{T}.
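The text does not describe the inverse mapping, but it follows directly from the definition of the differences: s_{i} = π (x_{1} + ... + x_{i}) for i = 1, ..., K. A decoder that has reassembled a full ΔLSF vector could recover the LSFs as in the following sketch; the function name `delta_lsf_to_lsf` and the example values are assumptions.

```python
import math
from itertools import accumulate


def delta_lsf_to_lsf(x):
    """Recover K LSFs from K+1 differences: s_i = pi * (x_1 + ... + x_i)."""
    if len(x) < 2 or any(v <= 0.0 for v in x):
        raise ValueError("need at least two strictly positive differences")
    if abs(sum(x) - 1.0) > 1e-9:
        raise ValueError("differences must sum to 1")
    cumulative = list(accumulate(x))
    # The last cumulative sum is 1 (it maps to pi itself), so it is dropped.
    return [math.pi * c for c in cumulative[:-1]]


# Hypothetical 3-element difference vector, giving a 2nd-order LSF vector.
s = delta_lsf_to_lsf([0.25, 0.25, 0.5])
```

Strictly positive differences guarantee that the recovered LSFs are strictly increasing and lie inside (0, π), so the ordering constraint on the LSFs is preserved automatically.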

Step S2 trains the model before transmission. The \tilde{x} obtained in step S1 is assumed to follow a Dirichlet distribution,

Dir(\tilde{x}; α) = \frac{Γ(Σ_{k = 1}^{K + 1} α_{k})}{Π_{k = 1}^{K + 1} Γ(α_{k})} Π_{k = 1}^{K + 1} x_{k}^{α_{k} - 1},

where α = [α_{1}, α_{2}, ..., α_{K + 1}]^{T} is the parameter vector. As shown in Fig. 3, step 31 extracts the N-dimensional target vectors; with a Dirichlet mixture model containing I components, the probability of a target vector is obtained as:

f(\tilde{x}) = Σ_{i = 1}^{I} π_{i} Dir(\tilde{x}; α_{i}),

where α_{i} = [α_{1i}, α_{2i}, ..., α_{K + 1, i}]^{T} is the parameter vector of the i-th mixture component, which is also known at the receiving end, and π_{i} is the non-negative weight of the i-th component, with Σ_{i = 1}^{I} π_{i} = 1. In step 33, following the idea in step S3 of dividing the whole line spectral frequency parameter difference vector into a received part and a lost part, the mixture component parameters can be expressed in the conditional probability distribution as:

α_{i} = [\begin{matrix} α_{i}^{M} \\ α_{i}^{R} \end{matrix}].

Both parts of these parameters are known at the transmitting end and the receiving end.
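Evaluating the mixture density is the basic operation behind both training and estimation. The sketch below computes the log density of a Dirichlet mixture in the log domain for numerical stability. The function names and the example parameters are assumptions; only the standard Dirichlet density and a weighted sum over components are used.

```python
import math


def dirichlet_logpdf(x, alpha):
    """Log of Gamma(sum a) / prod Gamma(a_k) * prod x_k^(a_k - 1)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x))


def dmm_logpdf(x, weights, alphas):
    """Log density of a Dirichlet mixture, using the log-sum-exp trick."""
    terms = [math.log(w) + dirichlet_logpdf(x, a) for w, a in zip(weights, alphas)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Hypothetical check: Dir(1, 1, 1) is uniform on the simplex with density Gamma(3) = 2.
point = [0.2, 0.3, 0.5]
single = dirichlet_logpdf(point, [1.0, 1.0, 1.0])
mixed = dmm_logpdf(point, [0.5, 0.5], [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]])
```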

Step S3 calculates the distributions of the lost part and the received part during transmission. As shown in Fig. 4, \tilde{x} is divided after transmission into two parts, the lost part \tilde{x}^{M} and the received part \tilde{x}^{R}. After \tilde{x}^{M} and \tilde{x}^{R} are normalized, their marginal probability distributions can be calculated separately, as follows:

1) Input: step 41 divides the ΔLSF parameter \tilde{x} obtained in S1 into a lost part and a received part, namely

\tilde{x} = [\begin{matrix} {\tilde{x}}^{M} \\ {\tilde{x}}^{R} \end{matrix}].

2) Step 42 normalizes \tilde{x}^{M} and \tilde{x}^{R} separately:

A) Sum:

S^{M} = Σ_{m = 1}^{M} x_{m}^{M} = 1 - S^{R}, S^{R} = Σ_{r = 1}^{R} x_{r}^{R},

where M and R are the lengths of the vectors \tilde{x}^{M} and \tilde{x}^{R} respectively;

B) Normalization result: \bar{x}^{M} = \tilde{x}^{M} / S^{M}, and likewise \bar{x}^{R} = \tilde{x}^{R} / S^{R};

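3) Step 43 writes out the distributions of the two parts. The normalized lost part \bar{x}^{M} = \tilde{x}^{M} / S^{M} sums to 1 and follows a Dirichlet distribution; under the i-th mixture component its density function is:

Dir(\bar{x}^{M}; α_{i}^{M}) = \frac{Γ(Σ_{m = 1}^{M} α_{mi}^{M})}{Π_{m = 1}^{M} Γ(α_{mi}^{M})} Π_{m = 1}^{M} (\bar{x}_{m}^{M})^{α_{mi}^{M} - 1}.

Likewise, the distribution of the normalized received part \bar{x}^{R} = \tilde{x}^{R} / S^{R} is:

Dir(\bar{x}^{R}; α_{i}^{R}) = \frac{Γ(Σ_{r = 1}^{R} α_{ri}^{R})}{Π_{r = 1}^{R} Γ(α_{ri}^{R})} Π_{r = 1}^{R} (\bar{x}_{r}^{R})^{α_{ri}^{R} - 1}.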
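Once the received part is observed, each mixture component can be re-weighted by how well it explains that part. The exact weight formula is not legible in this text, so the sketch below is an assumption: it applies Bayes' rule together with the aggregation property of the Dirichlet distribution, under which the received entries plus the lumped remainder 1 - S^{R} are again Dirichlet with parameters [α_{i}^{R}; Σ_{m} α_{mi}^{M}]. The function names and example parameters are hypothetical.

```python
import math


def dirichlet_logpdf(x, alpha):
    """Log density of a Dirichlet distribution at a point x on the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x))


def posterior_weights(x_received, weights, alphas_r, alphas_m):
    """P(component i | received part), via the Dirichlet aggregation property."""
    s_m = 1.0 - sum(x_received)
    if s_m <= 0.0:
        raise ValueError("received part must sum to less than 1")
    # Lump all lost entries into one remainder entry of size 1 - S^R.
    augmented_x = list(x_received) + [s_m]
    logs = []
    for w, a_r, a_m in zip(weights, alphas_r, alphas_m):
        augmented_alpha = list(a_r) + [sum(a_m)]
        logs.append(math.log(w) + dirichlet_logpdf(augmented_x, augmented_alpha))
    peak = max(logs)  # normalize in the log domain to avoid underflow
    unnorm = [math.exp(v - peak) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical two-component model; the received part matches component 0.
post = posterior_weights(
    x_received=[0.6, 0.1],
    weights=[0.5, 0.5],
    alphas_r=[[6.0, 1.0], [1.0, 6.0]],
    alphas_m=[[2.0, 1.0], [2.0, 1.0]],
)
```

In this example the two components differ only in their received-part parameters, so the posterior odds reduce to (0.6 / 0.1)^5 = 7776 in favor of the first component.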

Step S4 obtains the optimal estimate \hat{x}^{M} according to the minimum mean square error criterion; that is, the best expected value of the lost part is its conditional mean given the received part, as shown in Fig. 5. Step 51 computes the expectation of the normalized lost part, which is obtained as the weighted sum of the expected values of the components of the mixture model. Step 52 multiplies the expectation of the normalized lost part by the sum (length) of the lost part to obtain the optimal estimate of the lost part; this sum is expressed through the received part as S^{M} = 1 - S^{R}.

The result is given by:

\hat{x}^{M} = (1 - S^{R}) Σ_{i = 1}^{I} \tilde{π}_{i} \frac{α_{i}^{M}}{Σ_{m = 1}^{M} α_{mi}^{M}},

where the weights \tilde{π}_{i} are determined by the parameters of the distribution of the received part.
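Steps 51 and 52 can be sketched as follows. The function name `mmse_lost_estimate`, the argument layout, and the example numbers are assumptions. The component weights `post` are taken as given here, with hypothetical values standing in for weights derived from the received part.

```python
def mmse_lost_estimate(s_r, post, alphas_m):
    """Estimate the lost part as (1 - S^R) times a weighted mean of Dirichlet means."""
    dim = len(alphas_m[0])
    mean = [0.0] * dim
    # Step 51: expectation of the normalized lost part, summed over components.
    for p, a_m in zip(post, alphas_m):
        total = sum(a_m)
        for m in range(dim):
            mean[m] += p * a_m[m] / total  # mean of Dir(alpha_i^M) is alpha / sum(alpha)
    # Step 52: rescale by the lost-part sum S^M = 1 - S^R.
    return [(1.0 - s_r) * v for v in mean]


# Hypothetical case: received entries sum to 0.7, two equally likely components.
estimate = mmse_lost_estimate(
    s_r=0.7,
    post=[0.5, 0.5],
    alphas_m=[[2.0, 1.0], [1.0, 2.0]],
)
```

Because every component mean sums to 1 and the weights sum to 1, the estimate always sums to 1 - S^{R}, so the reassembled ΔLSF vector stays on the simplex.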

The proposed nonlinear optimal packet loss estimation method for the speech linear predictive model and the embodiments of each module have been described above with reference to the accompanying drawings. From the description of the above embodiments, those of ordinary skill in the art can clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware platform, or of course by hardware, although the former is the preferred embodiment. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a computer software product. The software product is stored in a storage medium and includes a number of instructions that cause one or more computer devices to perform the methods described in the embodiments of the present invention.

In accordance with the idea of the present invention, changes may be made to the specific embodiments and the scope of application. In summary, the content of this description should not be construed as limiting the present invention.

The embodiments of the present invention described above do not limit the scope of protection of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A nonlinear optimal packet loss estimation method for a speech linear predictive model, characterized in that it comprises the following steps:

Model training step: at the transmitting end, a Dirichlet mixture model is used to model the distribution of the line spectral frequency parameter differences, and the parameters of the Dirichlet mixture model are trained with the expectation-maximization algorithm;

2. The method of claim 1, characterized in that, in the line spectral frequency parameter transformation step, the (1) non-negativity, (2) ordering, and (3) boundedness of the line spectral frequency parameters are used to transform them into line spectral frequency parameter differences ΔLSF, which (1) lie in the open interval (0, 1) and (2) sum to 1; the detailed process of this step is as follows:

2) The (K+1)-dimensional line spectral frequency parameter difference ΔLSF after the transformation is x = [x_{1}, x_{2}, ..., x_{K + 1}]^{T}, where

x_{i} = \begin{cases} s_{1} / π, & i = 1 \\ (s_{i} - s_{i - 1}) / π, & 1 < i \leq K \\ (π - s_{K}) / π, & i = K + 1 \end{cases}.

3. The method of claim 2, characterized in that, in the model training step, before transmission, the x calculated in claim 2 is assumed to follow a Dirichlet distribution; the model is trained at the transmitting end, and the mixture component parameters obtained can be expressed in the conditional probability distribution as:

α_{i} = [\begin{matrix} α_{i}^{M} \\ α_{i}^{R} \end{matrix}],

where

α_{i}^{M} = [\begin{matrix} α_{1i}^{M} \\ \vdots \\ α_{mi}^{M} \\ \vdots \\ α_{Mi}^{M} \end{matrix}], α_{i}^{R} = [\begin{matrix} α_{1i}^{R} \\ \vdots \\ α_{ri}^{R} \\ \vdots \\ α_{Ri}^{R} \end{matrix}];

these parameters are also known at the receiving end.

4. The method of claim 3, characterized in that, in the step of calculating the distributions of the lost part and the received part during transmission, x is assumed to follow a Dirichlet distribution and can be divided after transmission into two parts, the lost part x^{M} and the received part x^{R}, so that the lost part can be estimated from the correlation between the two; because the Dirichlet vector x is a neutral vector, the marginal probability distributions of x^{M} and x^{R} can be calculated separately after normalization, as follows:

1) Input: the ΔLSF parameter x obtained in the previous step is divided into a lost part and a received part, namely

x = [\begin{matrix} x^{M} \\ x^{R} \end{matrix}];

2) Normalize x^{M} and x^{R} separately:

A) Sum: S^{M} = Σ_{m = 1}^{M} x_{m}^{M} = 1 - S^{R}, S^{R} = Σ_{r = 1}^{R} x_{r}^{R}, where M and R are the lengths of the vectors x^{M} and x^{R} respectively;

B) Normalization result: \bar{x}^{M} = x^{M} / S^{M}, and likewise \bar{x}^{R} = x^{R} / S^{R};

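3) The normalized lost part \bar{x}^{M} = x^{M} / S^{M} sums to 1 and follows a Dirichlet distribution; under the i-th mixture component its density function is:

Dir(\bar{x}^{M}; α_{i}^{M}) = \frac{Γ(Σ_{m = 1}^{M} α_{mi}^{M})}{Π_{m = 1}^{M} Γ(α_{mi}^{M})} Π_{m = 1}^{M} (\bar{x}_{m}^{M})^{α_{mi}^{M} - 1};

likewise, the distribution of the normalized received part \bar{x}^{R} = x^{R} / S^{R} is:

Dir(\bar{x}^{R}; α_{i}^{R}) = \frac{Γ(Σ_{r = 1}^{R} α_{ri}^{R})}{Π_{r = 1}^{R} Γ(α_{ri}^{R})} Π_{r = 1}^{R} (\bar{x}_{r}^{R})^{α_{ri}^{R} - 1}.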

5. The method of claim 4, characterized in that, in the minimum mean square error optimal estimation step, according to the minimum mean square error criterion, the optimal estimator \hat{x}^{M} of the lost part is the result of weighting the mean of the normalized lost part by the factor (1 - S^{R}) determined by the received part, that is, the conditional mean of the lost part given the received part; the result is given by:

\hat{x}^{M} = (1 - S^{R}) Σ_{i = 1}^{I} \tilde{π}_{i} \frac{α_{i}^{M}}{Σ_{m = 1}^{M} α_{mi}^{M}},

where I is the number of mixture components and the weights \tilde{π}_{i} are determined by the parameters of the probability density function of the received part calculated in claim 4.