CN103824561B - Missing value nonlinear estimating method of speech linear predictive coding model - Google Patents

Missing value nonlinear estimating method of speech linear predictive coding model

Info

Publication number
CN103824561B
CN103824561B CN201410054042.3A
Authority
CN
China
Prior art keywords
alpha
line spectral
spectral frequency
frequency parameters
normalization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410054042.3A
Other languages
Chinese (zh)
Other versions
CN103824561A (en)
Inventor
马占宇
齐峰
司中威
郭军
张洪刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN201410054042.3A priority Critical patent/CN103824561B/en
Publication of CN103824561A publication Critical patent/CN103824561A/en
Application granted granted Critical
Publication of CN103824561B publication Critical patent/CN103824561B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiment of the invention discloses a missing-value nonlinear estimation method for a speech linear predictive coding model. The method comprises the following steps: line spectral frequency parameter transformation, in which the line spectral frequency parameters of the speech linear predictive coding model are converted into line spectral frequency parameter differences by a linear transform; model training; calculation of the probability distributions of the part lost and the part received during transmission; and minimum mean square error optimal estimation. With the method of the embodiment of the invention, the linear predictive model can be estimated optimally and reliably when packet loss occurs in packet transmission, so that transmission loss is reduced and voice quality is improved. The method therefore has great practical value.

Description

A nonlinear estimation method for missing values of a speech linear predictive coding model
Technical field
The present invention relates to handling packet loss during voice transmission over packet networks, and in particular describes a nonlinear optimal estimation method based on transformed line spectral frequency parameters and a Dirichlet mixture model.
Background art
With the rapid development of Internet technology, voice communication has made significant progress, and transmitted speech signals have evolved from narrowband to wideband. As multimedia applications continue to develop and spread, users demand ever higher voice transmission quality and real-time performance, so there is a pressing need for efficient and reliable voice communication algorithms.
The foremost problem to be solved in voice communication is speech coding. After decades of development, speech coding techniques can be roughly divided into three categories: waveform coding, coding based on parametric models, and hybrid coding. Waveform coding quantizes and transmits the speech waveform directly, without relying on an acoustic model. Coding based on parametric models analyzes the speech with a linear prediction model and then transmits the linear prediction model, the side information and the speech energy information separately. Hybrid coding combines the two.
In speech coding, coding based on parametric models is widely used, and its core problem is how to quantize and encode the linear prediction model efficiently and reliably. In research on speech linear predictive coding models, the LPC parameters are generally converted into line spectral frequency parameters; this representation is more stable and efficient than other parametric representations because its spectrally sensitive regions are distributed more evenly.
When voice is transmitted over a packet network, the quality of the recovered speech depends to a great extent on the network conditions. Under packet transmission, if a delayed or lost packet can be estimated from the available information, the speech signal can be recovered effectively without introducing extra delay, thereby improving voice quality and user experience. Traditionally, the joint distribution between the missing and the received line spectral frequency elements has been modeled mainly with Gaussian mixture models: the joint distribution of the received part and the lost part is modeled by a Gaussian mixture model, from which the information in the lost packets is optimally estimated. Recent research shows that the linear prediction model can be coded by quantizing line spectral frequency parameter differences, and that this quantization is more effective than the traditional quantization of line spectral frequency parameters based on Gaussian mixture models. When line spectral frequency differences are transmitted, however, the traditional Gaussian mixture model cannot model the distribution of the data well and therefore cannot achieve optimal prediction. It is thus particularly important to design a suitable statistical model for line spectral frequency differences and to use it to optimally estimate the packets lost during packet transmission.
Summary of the invention
To address the packet loss problem in existing voice transmission, the object of the present invention is to provide a nonlinear optimal estimation algorithm that estimates the lost content and recovers the quality of the transmitted voice to the greatest possible extent.
To achieve the above object, the nonlinear optimal missing value estimation method proposed by the present invention comprises the following steps:
Line spectral frequency parameter transformation step: the line spectral frequency parameters of the speech linear predictive coding model are converted into line spectral frequency parameter differences by a linear transform;
Model training step: at the transmitting end, the distribution of the line spectral frequency parameter differences is modeled with a Dirichlet mixture model (DMM), and the parameters of the DMM are trained with the expectation-maximization algorithm;
Step of calculating the distributions of the part lost and the part received during transmission: under the assumption that the line spectral frequency parameter differences follow a Dirichlet distribution, the line spectral frequency parameter differences are divided into a lost part and a received part, and the corresponding Dirichlet distributions are obtained after each part is normalized;
Minimum mean square error optimal estimation step: according to the minimum mean square error criterion, the optimal estimate of the missing values is obtained.
In the line spectral frequency parameter transformation step, the (1) non-negativity, (2) ordering and (3) boundedness of the line spectral frequency parameters are used to transform them into line spectral frequency parameter differences ΔLSF, which have the following properties: (1) each difference lies in the open interval (0, 1); (2) the differences sum to 1. The detailed procedure of this step is as follows (an illustrative sketch in code is given after these sub-steps):
1) The K-dimensional line spectral frequency parameter vector is expressed as $s = [s_1, s_2, \ldots, s_K]^T$, satisfying $0 < s_1 < s_2 < \cdots < s_K < \pi$;
2) The (K+1)-dimensional line spectral frequency parameter difference ΔLSF after the transform is $\tilde{x} = [x_1, x_2, \ldots, x_{K+1}]^T$, where

$$x_i = \begin{cases} s_1/\pi, & i = 1 \\ (s_i - s_{i-1})/\pi, & 1 < i \le K \\ (\pi - s_K)/\pi, & i = K+1. \end{cases}$$
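By way of illustration only, and not as part of the claimed method, a minimal NumPy sketch of this transform and its inverse could look as follows; the function names and the toy K = 10 vector are purely illustrative assumptions.

```python
import numpy as np

def lsf_to_dlsf(s):
    """Map K ordered LSFs in (0, pi) to the (K+1)-dim difference vector,
    whose entries lie in (0, 1) and sum to 1, as defined above."""
    s = np.asarray(s, dtype=float)
    edges = np.concatenate(([0.0], s, [np.pi]))   # 0 < s_1 < ... < s_K < pi
    return np.diff(edges) / np.pi                 # x_i for i = 1..K+1

def dlsf_to_lsf(x):
    """Inverse transform: cumulative sum of the first K differences, times pi."""
    x = np.asarray(x, dtype=float)
    return np.cumsum(x[:-1]) * np.pi

if __name__ == "__main__":
    s = np.sort(np.random.uniform(0.05, np.pi - 0.05, size=10))  # toy K = 10 LSF vector
    x = lsf_to_dlsf(s)
    assert np.isclose(x.sum(), 1.0) and np.all(x > 0)
    assert np.allclose(dlsf_to_lsf(x), s)
```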
In the model training step, the transmitted speech signal is assumed before transmission to follow a Dirichlet distribution, and the model is trained at the transmitting end, giving the parameter of the i-th mixture component of the mixture model: $\alpha_i = [\alpha_i^M; \alpha_i^R]$, where $\alpha_i^M = [\alpha_{1i}^M, \ldots, \alpha_{mi}^M, \ldots, \alpha_{Mi}^M]^T$ and $\alpha_i^R = [\alpha_{1i}^R, \ldots, \alpha_{ri}^R, \ldots, \alpha_{Ri}^R]^T$. Before transmission, this parameter is also known at the receiving end.
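For concreteness, the following is a hedged sketch of how the Dirichlet mixture density and the EM responsibilities (E-step) could be evaluated in Python; the M-step update of the α_i, which in practice uses a Newton or fixed-point iteration on the Dirichlet log-likelihood, is only indicated by a comment, and all names are illustrative assumptions rather than part of the patented method.

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_logpdf(x, alpha):
    """Log-density of Dirichlet(alpha) at a point x on the simplex."""
    alpha = np.asarray(alpha, dtype=float)
    return (gammaln(alpha.sum()) - gammaln(alpha).sum()
            + ((alpha - 1.0) * np.log(x)).sum())

def dmm_responsibilities(x, weights, alphas):
    """E-step: posterior probability of each mixture component for one vector x.

    weights: (I,) non-negative mixture weights summing to 1 (the pi_i).
    alphas:  (I, K+1) Dirichlet parameters, one row per component (the alpha_i).
    """
    log_joint = np.array([np.log(w) + dirichlet_logpdf(x, a)
                          for w, a in zip(weights, alphas)])
    log_joint -= log_joint.max()          # numerical stability
    r = np.exp(log_joint)
    return r / r.sum()

# In a full EM loop the M-step would re-estimate pi_i from the average
# responsibilities and update each alpha_i by a Newton / fixed-point
# iteration on the responsibility-weighted Dirichlet log-likelihood.
```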
In the step of calculating the distributions of the part lost and the part received during transmission, $\tilde{x}$ is assumed to follow a Dirichlet distribution; after transmission it can be divided into two parts, the lost part $\tilde{x}^M$ and the received part $\tilde{x}^R$. Because a Dirichlet vector is a neutral vector, the lost part can be estimated from the correlation between the two parts. After $\tilde{x}^M$ and $\tilde{x}^R$ are normalized, their marginal probability distributions can be calculated separately, as follows (an illustrative sketch in code follows these sub-steps):
1) Input: the ΔLSF parameter $\tilde{x}$ obtained in the first step is divided into a lost part and a received part, i.e. $\tilde{x} = [\tilde{x}^M; \tilde{x}^R]$, where the two parts contain M and R elements respectively;
2) Normalize $\tilde{x}^M$ and $\tilde{x}^R$ separately:
A) Compute the sums $S_M = \sum_{m=1}^{M} x_m^M = 1 - S_R$ and $S_R = \sum_{r=1}^{R} x_r^R$, where M and R are the lengths of the vectors $\tilde{x}^M$ and $\tilde{x}^R$ respectively;
B) Normalize to obtain $\tilde{x}^M / S_M$ and, similarly, $\tilde{x}^R / S_R$;
3) Since the normalized lost part sums to 1, it follows a Dirichlet distribution; for a component with parameters $\alpha^M$ its probability density function is

$$f\!\left(\frac{\tilde{x}^M}{S_M}\right) = \frac{\Gamma\!\left(\sum_{m=1}^{M}\alpha_m^M\right)}{\prod_{m=1}^{M}\Gamma(\alpha_m^M)}\prod_{m=1}^{M}\left(\frac{x_m^M}{S_M}\right)^{\alpha_m^M - 1}.$$

Similarly, the distribution of the normalized received part $\tilde{x}^R / S_R$ is a Dirichlet distribution with parameters $\alpha^R$.
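As an illustrative (assumed, not claimed) sketch, the splitting and normalization above amount to a few lines of NumPy; the receiver only needs the received values and the knowledge of which ΔLSF slots were lost, since the sum-to-one property gives the total mass of the lost part directly.

```python
import numpy as np

def normalize_received(x_recv):
    """Given the received DeltaLSF entries x_recv (the lost entries are unknown),
    return the normalized received part, its sum S_R, and S_M = 1 - S_R,
    which is the total mass of the lost part."""
    x_recv = np.asarray(x_recv, dtype=float)
    s_r = x_recv.sum()
    s_m = 1.0 - s_r               # follows from the full K+1 entries summing to 1
    return x_recv / s_r, s_r, s_m

# x_recv / s_r sums to 1, so its density can be evaluated with the
# received-part Dirichlet parameters alpha^R (e.g. with dirichlet_logpdf
# from the previous sketch); the lost part is handled analogously when it
# is available, e.g. at training time.
```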
Minimum mean square error optimal estimation step: according to the minimum mean square error criterion, the optimal estimate of the lost part is the mean of the normalized lost part multiplied by $(1 - S_R)$, i.e. the conditional mean of the lost part given the known received part. The result is

$$\hat{\tilde{x}}^M = (1 - S_R)\sum_{i=1}^{I} w_i\,\frac{\alpha_i^M}{\sum_{m=1}^{M}\alpha_{mi}^M},$$

where the weights $w_i$ are determined by the parameters of the probability density function of the received part.
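A minimal sketch of the resulting estimator, under the assumptions stated above (component weights re-weighted by the Dirichlet density of the normalized received part, and per-component Dirichlet means for the normalized lost part), might be written as follows; `mmse_estimate_lost` and its arguments are illustrative names only.

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_logpdf(x, alpha):
    alpha = np.asarray(alpha, dtype=float)
    return (gammaln(alpha.sum()) - gammaln(alpha).sum()
            + ((alpha - 1.0) * np.log(x)).sum())

def mmse_estimate_lost(x_r, weights, alphas_m, alphas_r):
    """MMSE estimate of the lost DeltaLSF entries given the received ones.

    x_r:      (R,) received DeltaLSF values.
    weights:  (I,) prior mixture weights pi_i.
    alphas_m: (I, M) Dirichlet parameters of the lost part, per component.
    alphas_r: (I, R) Dirichlet parameters of the received part, per component.
    """
    s_r = x_r.sum()
    s_m = 1.0 - s_r
    # Posterior weight of each component given the normalized received part.
    log_w = np.array([np.log(p) + dirichlet_logpdf(x_r / s_r, a_r)
                      for p, a_r in zip(weights, alphas_r)])
    log_w -= log_w.max()
    w = np.exp(log_w)
    w /= w.sum()
    # Weighted sum of per-component Dirichlet means, rescaled by (1 - S_R).
    means = alphas_m / alphas_m.sum(axis=1, keepdims=True)   # E[x^M / S_M] per component
    return s_m * (w @ means)
```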
The beneficial effect of the present invention is that, compared with the prior art, the invention transmits transformed line spectral frequency parameters, models the distribution of the transmitted signal with a Dirichlet mixture, and provides a complete estimation framework for applications. Test results demonstrate the efficiency of the invention, which therefore has strong practical value.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the nonlinear optimal packet loss estimation method for a speech linear predictive model according to the present invention;
Fig. 2 is a flow chart of the line spectral frequency parameter transformation step;
Fig. 3 is a flow chart of the step of training the mixture component parameters at the transmitting end;
Fig. 4 is a flow chart of the step of calculating the distributions of the lost part and the received part during transmission;
Fig. 5 is a flow chart of the minimum mean square error optimal estimation step.
Detailed description of the embodiments
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is the flow chart of the present invention, which comprises the following steps:
Step S1: convert the line spectral frequency parameters into line spectral frequency parameter differences;
Step S2: train the mixture component parameters at the transmitting end;
Step S3: calculate the normalized probability distributions of the lost part and the received part during transmission;
Step S4: perform minimum mean square error optimal estimation.
Each step is described in detail below.
Step S1 performs the line spectral frequency parameter transformation: the line spectral frequency parameters of the speech linear predictive coding model are converted into line spectral frequency parameter differences by a linear transform. The specific flow given in Fig. 2 is as follows:
1) Input:
A) line spectral frequency parameters $s = [s_1, s_2, \ldots, s_K]^T$;
2) In step 11, loop i from 1 to K+1; the difference obtained at each iteration is

$$x_i = \begin{cases} s_1/\pi, & i = 1 \\ (s_i - s_{i-1})/\pi, & 1 < i \le K \\ (\pi - s_K)/\pi, & i = K+1; \end{cases}$$

3) Output:
A) line spectral frequency parameter differences $\tilde{x} = [x_1, x_2, \ldots, x_{K+1}]^T$.
Step S2 trains the model before transmission. The vector $\tilde{x}$ obtained in step S1 is assumed to follow a Dirichlet distribution, where $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_{K+1}]^T$ is its parameter vector. As shown in Fig. 3, in step 31 the N-dimensional target vectors are extracted and, with a Dirichlet mixture model containing I components, the probability of a target vector is obtained as

$$p(\tilde{x}) = \sum_{i=1}^{I} \pi_i\,\mathrm{Dir}(\tilde{x};\alpha_i),$$

where $\alpha_i = [\alpha_{1i}, \alpha_{2i}, \ldots, \alpha_{K+1,i}]^T$ is the parameter vector of the i-th mixture component, which is also known at the receiving end, and $\pi_i$ is the non-negative weight of the i-th component, with the weights summing to 1. As in step 33, following the idea in step S3 of dividing the overall line spectral frequency parameter vector into a received part and a lost part, the mixture component parameters obtained in the conditional probability distribution can be expressed as $\alpha_i = [\alpha_i^M; \alpha_i^R]$. Both parts of these parameters are known at the transmitting end and the receiving end.
Step S3 calculates the distributions of the lost part and the received part during transmission. As shown in Fig. 4, $\tilde{x}$ is divided after transmission into the lost part $\tilde{x}^M$ and the received part $\tilde{x}^R$; after $\tilde{x}^M$ and $\tilde{x}^R$ are normalized, their marginal probability distributions can be calculated separately, as follows:
1) Input: in step 41, the ΔLSF parameter $\tilde{x}$ obtained in S1 is divided into a lost part and a received part, i.e. $\tilde{x} = [\tilde{x}^M; \tilde{x}^R]$.
2) In step 42, $\tilde{x}^M$ and $\tilde{x}^R$ are normalized separately:
A) Compute the sums $S_M = \sum_{m=1}^{M} x_m^M = 1 - S_R$ and $S_R = \sum_{r=1}^{R} x_r^R$, where M and R are the lengths of the vectors $\tilde{x}^M$ and $\tilde{x}^R$ respectively;
B) Normalization results: $\tilde{x}^M / S_M$ and, similarly, $\tilde{x}^R / S_R$;
3) Step 43 writes out the distributions of the two parts: since the normalized lost part sums to 1, it follows a Dirichlet distribution whose density function is as given above; similarly, the distribution of the normalized received part is a Dirichlet distribution with parameters $\alpha^R$.
Step S4 obtains the optimal estimate of $\tilde{x}^M$ according to the minimum mean square error criterion, i.e. the conditional mean of the lost part given the received part, as shown in Fig. 5. Step 51 computes the expectation of the normalized lost part; this expectation is the weighted sum of the expectations of the individual components of the mixture model. Step 52 multiplies the expectation of the normalized lost part by the length of the lost part, which is expressed in terms of the received part as $(1 - S_R)$, to obtain the optimal estimate of the lost part. The result is

$$\hat{\tilde{x}}^M = (1 - S_R)\sum_{i=1}^{I} w_i\,\frac{\alpha_i^M}{\sum_{m=1}^{M}\alpha_{mi}^M},$$

where the weights $w_i$ are determined by the distribution parameters of the received part.
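To tie steps S1 to S4 together, the following toy end-to-end sketch reuses the hypothetical helpers from the earlier sketches; the single Dirichlet component with made-up parameters stands in for a trained Dirichlet mixture model and is an assumption for illustration only.

```python
import numpy as np

# Assumes lsf_to_dlsf, dlsf_to_lsf and mmse_estimate_lost are defined as in
# the earlier sketches; alpha below is a stand-in for a trained DMM component.
alpha = np.full(11, 8.0)                      # toy parameters for K = 10 (K+1 = 11 slots)
s_true = np.sort(np.random.uniform(0.1, np.pi - 0.1, size=10))
x = lsf_to_dlsf(s_true)                       # step S1

lost_idx = np.array([3, 4, 5])                # pretend these slots were in a lost packet
recv_idx = np.setdiff1d(np.arange(11), lost_idx)

x_hat = x.copy()
x_hat[lost_idx] = mmse_estimate_lost(         # steps S3 and S4
    x[recv_idx],
    weights=np.array([1.0]),                  # one-component "mixture" (step S2 stand-in)
    alphas_m=alpha[lost_idx][None, :],
    alphas_r=alpha[recv_idx][None, :],
)
s_hat = dlsf_to_lsf(x_hat)                    # reconstructed LSF vector
print(np.abs(s_hat - s_true).max())           # reconstruction error of the toy example
```

The estimated slots sum to exactly $1 - S_R$, so the reconstructed vector stays on the simplex and the inverse transform returns a valid, ordered LSF vector.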
The nonlinear optimal packet loss estimation method for a speech linear predictive model proposed above, and the embodiments of its modules, have been set forth with reference to the accompanying drawings. From the description of the above embodiments, those of ordinary skill in the art can clearly understand that the present invention can be implemented in software on a general-purpose hardware platform, and of course also in hardware, the former being the preferred embodiment. Based on this understanding, the part of the technical solution of the present invention that contributes beyond the prior art can, in essence, be embodied in the form of a computer software product stored in a storage medium and comprising instructions that cause one or more computer devices to execute the methods described in the embodiments of the present invention.
According to the idea of the present invention, changes may be made in the specific embodiments and the scope of application. In summary, the contents of this description should not be construed as limiting the present invention.
The embodiments of the present invention described above do not limit the scope of protection of the invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (5)

1. A nonlinear optimal packet loss estimation method for a speech linear predictive model, characterized by comprising the following steps:
a line spectral frequency parameter transformation step: converting the line spectral frequency parameters of the speech linear predictive coding model into line spectral frequency parameter differences by a linear transform;
a model training step: at the transmitting end, modeling the distribution of the line spectral frequency parameter differences with a Dirichlet mixture model, and training the parameters of the Dirichlet mixture model with the expectation-maximization algorithm;
a step of calculating the distributions of the lost part and the received part during transmission: under the assumption that the line spectral frequency parameter differences follow a Dirichlet distribution, dividing the line spectral frequency parameter differences into a lost part and a received part, and obtaining the corresponding Dirichlet distributions after normalizing each part; and
a minimum mean square error optimal estimation step: obtaining the optimal estimate of the missing values according to the minimum mean square error criterion.
2. The method of claim 1, characterized in that, in the line spectral frequency parameter transformation step, the (1) non-negativity, (2) ordering and (3) boundedness of the line spectral frequency parameters are used to transform them into line spectral frequency parameter differences ΔLSF, which (1) lie in the open interval (0, 1) and (2) sum to 1; the detailed procedure of this step is as follows:
1) the K-dimensional line spectral frequency parameter vector is expressed as $s = [s_1, s_2, \ldots, s_K]^T$, satisfying $0 < s_1 < s_2 < \cdots < s_K < \pi$;
2) the (K+1)-dimensional line spectral frequency parameter difference ΔLSF after the transform is $x = [x_1, x_2, \ldots, x_{K+1}]^T$, where

$$x_i = \begin{cases} s_1/\pi, & i = 1 \\ (s_i - s_{i-1})/\pi, & 1 < i \le K \\ (\pi - s_K)/\pi, & i = K+1. \end{cases}$$
3. The method of claim 2, characterized in that, in the model training step, before transmission the vector x calculated in claim 2 is assumed to follow a Dirichlet distribution and the model is trained at the transmitting end; the mixture component parameters obtained in the conditional probability distribution can be expressed as $\alpha_i = [\alpha_i^M; \alpha_i^R]$, where $\alpha_i^M = [\alpha_{1i}^M, \ldots, \alpha_{mi}^M, \ldots, \alpha_{Mi}^M]^T$ and $\alpha_i^R = [\alpha_{1i}^R, \ldots, \alpha_{ri}^R, \ldots, \alpha_{Ri}^R]^T$; these parameters are also known at the receiving end.
4. The method of claim 3, characterized in that, in the step of calculating the distributions of the lost part and the received part during transmission, x is assumed to follow a Dirichlet distribution, and after transmission it can be divided into two parts: a lost part $x^M$ and a received part $x^R$; the lost part can be estimated from the correlation between the two parts; because the Dirichlet vector x is a neutral vector, the marginal probability distributions of $x^M$ and $x^R$ can be calculated separately after normalization, as follows:
1) input: dividing the ΔLSF parameter x obtained in the previous step into a lost part and a received part, i.e. $x = [x^M; x^R]$;
2) normalizing $x^M$ and $x^R$ separately:
A) computing the sums $S_M = \sum_{m=1}^{M} x_m^M = 1 - S_R$ and $S_R = \sum_{r=1}^{R} x_r^R$, where M and R are the lengths of the vectors $x^M$ and $x^R$ respectively;
B) normalization results: $x^M / S_M$ and, similarly, $x^R / S_R$;
3) since the normalized lost part sums to 1, it follows a Dirichlet distribution with parameters $\alpha^M$; similarly, the distribution of the normalized received part is a Dirichlet distribution with parameters $\alpha^R$.
5. The method of claim 4, characterized in that, in the minimum mean square error optimal estimation step, according to the minimum mean square error criterion, the optimal estimator of the lost part is the mean of the normalized lost part, obtained as a weighted sum over the mixture components, multiplied by the length $(1 - S_R)$ of the lost part, i.e. the conditional mean of the lost part given the received part; the result is

$$\hat{x}^M = (1 - S_R)\sum_{i=1}^{I} w_i\,\frac{\alpha_i^M}{\sum_{m=1}^{M}\alpha_{mi}^M},$$

where the weights $w_i$ are determined by the probability density functions of the received part calculated in claim 4.
CN201410054042.3A 2014-02-18 2014-02-18 Missing value nonlinear estimating method of speech linear predictive coding model Active CN103824561B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410054042.3A CN103824561B (en) 2014-02-18 2014-02-18 Missing value nonlinear estimating method of speech linear predictive coding model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410054042.3A CN103824561B (en) 2014-02-18 2014-02-18 Missing value nonlinear estimating method of speech linear predictive coding model

Publications (2)

Publication Number Publication Date
CN103824561A CN103824561A (en) 2014-05-28
CN103824561B true CN103824561B (en) 2015-03-11

Family

ID=50759583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410054042.3A Active CN103824561B (en) 2014-02-18 2014-02-18 Missing value nonlinear estimating method of speech linear predictive coding model

Country Status (1)

Country Link
CN (1) CN103824561B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10325609B2 (en) * 2015-04-13 2019-06-18 Nippon Telegraph And Telephone Corporation Coding and decoding a sound signal by adapting coefficients transformable to linear predictive coefficients and/or adapting a code book
CN110660402B (en) 2018-06-29 2022-03-29 华为技术有限公司 Method and device for determining weighting coefficients in a stereo signal encoding process

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101145344A (en) * 2006-09-15 2008-03-19 华为技术有限公司 Spectral line frequency vector quantization method and system
JP2010145836A (en) * 2008-12-19 2010-07-01 Nippon Telegr & Teleph Corp <Ntt> Direction information distribution estimating device, sound source number estimating device, sound source direction measuring device, sound source separating device, methods thereof, and programs thereof
CN102956237A (en) * 2011-08-19 2013-03-06 杜比实验室特许公司 Method and device for measuring content consistency and method and device for measuring similarity

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8010341B2 (en) * 2007-09-13 2011-08-30 Microsoft Corporation Adding prototype information into probabilistic models

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101145344A (en) * 2006-09-15 2008-03-19 华为技术有限公司 Spectral line frequency vector quantization method and system
JP2010145836A (en) * 2008-12-19 2010-07-01 Nippon Telegr & Teleph Corp <Ntt> Direction information distribution estimating device, sound source number estimating device, sound source direction measuring device, sound source separating device, methods thereof, and programs thereof
CN102956237A (en) * 2011-08-19 2013-03-06 杜比实验室特许公司 Method and device for measuring content consistency and method and device for measuring similarity

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yu Wei, "Research on the Statistical Characteristics of Chinese/English AMR Speech Coding Parameters" (《关于汉语/英语AMR语音编码参数统计特性的研究》), Telecommunication Engineering (《电讯技术》), No. 2, 28 Feb. 2002, pp. 80-83 *

Also Published As

Publication number Publication date
CN103824561A (en) 2014-05-28

Similar Documents

Publication Publication Date Title
CN102436820B (en) High frequency band signal coding and decoding methods and devices
CN103229234B (en) Audio encoding device, method and program, and audio decoding deviceand method
CN105976830B (en) Audio-frequency signal coding and coding/decoding method, audio-frequency signal coding and decoding apparatus
CN103531205A (en) Asymmetrical voice conversion method based on deep neural network feature mapping
RU2011104813A (en) DEVICE AND METHOD OF QUANTIZATION AND REVERSE QUANTIZATION OF LPC FILTER WITH VARIABLE BIT TRANSFER SPEED
CN101751926A (en) Signal coding and decoding method and device, and coding and decoding system
CN103295582B (en) Noise suppressing method and system thereof
CN101521010B (en) Coding and decoding method for voice frequency signals and coding and decoding device
CN101308655B (en) Audio coding and decoding method and layout design method of static discharge protective device and MOS component device
CN106104685B (en) Audio coding method and device
CN104995673B (en) Hiding frames error
CN106847297A (en) The Forecasting Methodology of high-frequency band signals, coding/decoding apparatus
CN103824561B (en) Missing value nonlinear estimating method of speech linear predictive coding model
CN102332268B (en) Voice signal sparse representation method based on self-adaptive redundant dictionary
CN112086100A (en) Quantization error entropy based urban noise identification method of multilayer random neural network
CN109743269A (en) A kind of underwater sound OFDM channel reconstruction method based on data fitting
CN102982807B (en) Method and system for multi-stage vector quantization of speech signal LPC coefficients
CN101198041A (en) Vector quantization method and device
CN107452391B (en) Audio coding method and related device
CN103632673B (en) A kind of non-linear quantization of speech linear predictive model
US10276186B2 (en) Parameter determination device, method, program and recording medium for determining a parameter indicating a characteristic of sound signal
CN101604524B (en) Stereo coding method, stereo coding device, stereo decoding method and stereo decoding device
CN104036781A (en) Voice signal bandwidth expansion device and method
US20230395086A1 (en) Method and apparatus for processing of audio using a neural network
CN101944235A (en) Image compression method based on fractional fourier transform

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant