Parameter-domain packet loss concealment method and device
Technical field
The present invention relates to the field of Internet technology, and in particular to a parameter-domain packet loss concealment method and device.
Background
With the rapid development of the Internet and the continuous growth of telecommunication demand, VoIP (Voice over Internet Protocol) technology, which is based on voice packet switching, is increasingly favored by users for its low cost, easy expansion, and good speech quality. During voice communication, after the receiving end receives the voice packets transmitted over the network, a speech decoder decodes the speech frames in the voice packets into the corresponding speech signal, thereby realizing the Internet call. Among existing speech decoders, inter-frame correlated speech decoders are widely adopted because they provide higher-quality speech at the same bit rate; Skype's SILK decoder is one example. Because voice packets may be lost in transit, which degrades communication quality, a packet loss concealment method is needed to reduce the negative effect of packet loss and safeguard speech communication quality.
The related art provides a packet loss concealment method in which the receiving end receives and decodes voice packets and, if a voice packet is lost in transit, processes the decoded speech signal to generate the speech signal of the speech frames in the lost packet. For example, the speech signals of the frames before and after the lost frame are processed by pitch-synchronous repetition, time-scale modification, and the like to generate the speech signal corresponding to the lost frame, thereby concealing the packet loss.
In the course of realizing the present invention, the inventors found that the prior art has at least the following problem:
The speech frames in voice packets are correlated with one another; that is, the decoding result of a previously decoded speech frame affects the decoding of the current speech frame. If a voice packet is lost in transit, so that the speech frames in it are lost, and the speech signal of the lost frame is generated by processing the signals of the frames before and after it, the frames that follow the lost frame still cannot be decoded correctly. The speech signal generated for the lost frame in this way is therefore poor, which lowers speech communication quality.
Summary of the invention
In order to solve the problem of the prior art, embodiments of the present invention provide a parameter-domain packet loss concealment method and device. The technical solution is as follows:
In one aspect, a parameter-domain packet loss concealment method is provided, the method comprising:
determining whether a current speech frame to be decoded is lost;
if the current speech frame is lost, obtaining the parameters of the previous valid frame of the current speech frame;
determining the parameters of the current speech frame according to the parameters of the previous valid frame;
decoding the current speech frame according to the parameters of the current speech frame.
In another aspect, a parameter-domain packet loss concealment device is provided, the device comprising:
a determining module, configured to determine whether a current speech frame to be decoded is lost;
a previous-frame obtaining module, configured to obtain the parameters of the previous valid frame of the current speech frame when the current speech frame is lost;
a current-frame determining module, configured to determine the parameters of the current speech frame according to the parameters of the previous valid frame;
a decoding module, configured to decode the current speech frame according to the parameters of the current speech frame.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
When it is determined that the current speech frame to be decoded is lost, the parameters of the previous valid frame of the current speech frame are obtained, the parameters of the current speech frame are determined according to the specific situation, and the lost speech frame is then decoded normally according to those parameters. Because this simulates the normal operation of the decoder under packet loss, decoding continuity is maintained. When voice packets are lost during transmission, decoding can proceed from the parameters determined for the lost frames, which improves the decoded speech quality.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a parameter-domain packet loss concealment method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of another parameter-domain packet loss concealment method provided by Embodiment 1 of the present invention;
Fig. 3 is a flowchart of yet another parameter-domain packet loss concealment method provided by Embodiment 1 of the present invention;
Fig. 4 is a flowchart of a parameter-domain packet loss concealment method provided by Embodiment 2 of the present invention;
Fig. 5 is a schematic structural diagram of a decoder provided by Embodiment 2 of the present invention;
Fig. 6 is a schematic structural diagram of a parameter-domain packet loss concealment device provided by Embodiment 3 of the present invention;
Fig. 7 is a schematic structural diagram of another parameter-domain packet loss concealment device provided by Embodiment 3 of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1
In an inter-frame correlated decoder, the speech frames in voice packets are correlated with one another, so the decoding result of the previous speech frame affects the decoding of the current speech frame. When a voice packet is lost during network transmission, the speech frames in it are lost as well. The frames that follow the lost speech frame then have no decoding result of the previous speech frame to refer to, so their decoding is severely affected and the decoded speech signal is of poor quality.
To reduce as far as possible the negative effect of packet loss on an inter-frame correlated decoder, the present invention provides a parameter-domain packet loss concealment method. The method is applicable to devices on which an inter-frame correlated decoder is installed, including but not limited to terminals and servers, which this embodiment does not specifically limit. To decode the speech frames in voice packets, this embodiment uses the parameters of the previous valid frame, or of the previous valid frame and the next valid frame, as the basis for determining the parameters of the lost frame. The method is illustrated with the receiving end as the executing entity. Referring to Fig. 1, the method flow provided by this embodiment includes:
101: determine whether the current speech frame to be decoded is lost;
102: if the current speech frame is lost, obtain the parameters of the previous valid frame of the current speech frame;
103: determine the parameters of the current speech frame according to the parameters of the previous valid frame;
104: decode the current speech frame according to the parameters of the current speech frame.
On the basis of the method shown in Fig. 1, the method provided by this embodiment can be subdivided into the following two cases, depending on whether the next valid frame of the current speech frame is in the buffer:
Referring to Fig. 2, for the case where the next valid frame of the current speech frame is in the buffer, the method flow provided by this embodiment is as follows:
201: determine whether the current speech frame to be decoded is lost;
202: if the current speech frame is lost, obtain the parameters of the previous valid frame of the current speech frame;
203: judge whether the next valid frame of the current speech frame is buffered;
204: if the next valid frame is buffered, obtain the parameters of the next valid frame;
205: determine the parameters of the current speech frame according to the parameters of the previous valid frame and the parameters of the next valid frame;
206: decode the current speech frame according to the parameters of the current speech frame.
Referring to Fig. 3, for the case where the next valid frame of the current speech frame is not in the buffer, the method flow provided by this embodiment includes:
301: determine whether the current speech frame to be decoded is lost;
302: if the current speech frame is lost, obtain the parameters of the previous valid frame of the current speech frame;
303: judge whether the next valid frame of the current speech frame is buffered;
304: if the next valid frame is not buffered, determine the parameters of the current speech frame by extrapolating from the parameters of the previous valid frame;
305: decode the current speech frame according to the parameters of the current speech frame.
In the method provided by this embodiment, when it is determined that the current speech frame to be decoded is lost, the parameters of the previous valid frame, or of the previous valid frame and the next valid frame, are obtained, the parameters of the current speech frame are determined according to the specific situation, and the lost speech frame is then decoded normally according to those parameters. Because this simulates the normal operation of the decoder under packet loss, decoding continuity is maintained. When voice packets are lost during transmission, decoding can proceed from the parameters determined for the lost frames, which improves the decoded speech quality.
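The choice between the two flows of Fig. 2 and Fig. 3 can be sketched as follows. This is a minimal illustration only: the `Params` container, the function name `conceal_lost_frame`, and the two toy strategies `midpoint` and `hold` are hypothetical and are not taken from the embodiment, which leaves the concrete interpolation and extrapolation rules to later steps.

```python
from typing import Callable, Dict, Optional

Params = Dict[str, float]  # hypothetical container for one frame's decoder parameters


def conceal_lost_frame(
    prev_params: Params,
    next_params: Optional[Params],
    interpolate: Callable[[Params, Params], Params],
    extrapolate: Callable[[Params], Params],
) -> Params:
    """Choose how to derive the lost frame's parameters (steps 203-205 / 303-304)."""
    if next_params is not None:
        # The next valid frame is buffered: use both neighbours.
        return interpolate(prev_params, next_params)
    # Nothing is buffered after the gap: rely on the previous valid frame only.
    return extrapolate(prev_params)


# Toy strategies, for illustration only (not the embodiment's formulas).
def midpoint(prev: Params, nxt: Params) -> Params:
    return {key: (prev[key] + nxt[key]) / 2 for key in prev}


def hold(prev: Params) -> Params:
    return dict(prev)
```

The decoder would then run its normal decoding routine on the returned parameters, as in steps 206 and 305.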
Embodiment 2
This embodiment of the present invention provides a parameter-domain packet loss concealment method. Building on Embodiment 1, it describes the packet loss concealment technique provided by the present invention in detail for the case where the current speech frame is lost, separately for a jitter buffer with and without subsequent voice packets. Referring to Fig. 4, the method flow provided by this embodiment includes:
401: determine whether the current speech frame to be decoded is lost;
This embodiment does not specifically limit how it is determined whether the current speech frame to be decoded is lost. One way, among others, is as follows: before sending voice packets, the sending end numbers each speech frame in the voice packets and sends the numbered speech frames to the receiving end. Referring to the decoder shown in Fig. 5, a jitter buffer is provided, and received speech frames are first stored in the jitter buffer. From the number of the previous valid frame of the current speech frame and the number of the subsequent valid frame stored in the jitter buffer, the decoder can determine whether the current speech frame is lost.
For example, the first speech frame is numbered 1. After decoding the first speech frame, the decoder searches the jitter buffer for the subsequent valid frame. If the subsequent valid frame retrieved is numbered 4, it can be determined that the second and third speech frames are lost. If the second speech frame is the one that currently needs to be decoded, the current speech frame is determined to be lost.
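The numbering-based check in the example above can be sketched as follows. The function names and the representation of the jitter buffer as a plain list of frame numbers are assumptions made for illustration; the embodiment does not prescribe a data structure.

```python
from typing import List


def lost_frame_numbers(last_decoded: int, buffered: List[int]) -> List[int]:
    """Frame numbers missing between the last decoded frame and the next buffered one."""
    later = [n for n in buffered if n > last_decoded]
    if not later:
        return []  # nothing buffered after the last decoded frame, so no gap can be inferred
    next_valid = min(later)
    return list(range(last_decoded + 1, next_valid))


def is_lost(current: int, last_decoded: int, buffered: List[int]) -> bool:
    """True if the frame numbered `current` falls inside the detected gap."""
    return current in lost_frame_numbers(last_decoded, buffered)
```

With frame 1 decoded and frame 4 as the earliest buffered frame, `lost_frame_numbers(1, [4])` yields frames 2 and 3, matching the example.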
Other ways of determining whether the current speech frame is lost may of course be used, which this embodiment does not specifically limit. This embodiment illustrates the speech decoding method only for the case where the current speech frame is lost. When the current speech frame is determined not to be lost, it can be decoded directly according to the preset decoding process, which is not described again here.
402: if the current speech frame is lost, judge whether the next valid frame of the current speech frame is buffered; if so, perform step 403; otherwise, perform step 407;
In this step, whether the next valid frame of the current speech frame is buffered can be judged in the same way as whether the current speech frame is lost. As described in step 401, the sending end numbers each speech frame before sending and sends the numbered speech frames to the receiving end. The receiving end provides a jitter buffer in advance and first stores received speech frames in it. Whether the next valid frame of the current speech frame is buffered is judged from the number of the previous valid frame of the current speech frame and the number of the subsequent valid frame stored in the jitter buffer.
For example, the current speech frame is numbered 3. If a subsequent speech frame numbered 4 is retrieved from the jitter buffer, it can be determined that the next valid frame of the current speech frame is buffered. As another example, the current speech frame is numbered 3. If the subsequent speech frame retrieved from the jitter buffer is numbered 5, it can be determined that the next valid frame of the current speech frame is not buffered.
Other ways of judging whether the next valid frame of the current speech frame is buffered may of course be used, which this embodiment does not specifically limit.
403: obtain the binary decision parameters of the previous valid frame and the next valid frame, and determine the signal type of the current speech frame according to the binary decision parameters of the previous valid frame and the next valid frame, to obtain the binary decision parameter of the current speech frame;
Specifically, the binary decision parameter is used to judge the signal type. Because speech is either unvoiced or voiced, common speech models clearly distinguish the modeling and coding of periodic signals from those of aperiodic signals. Broadly speaking, periodic signals correspond to voiced frames and aperiodic signals correspond to unvoiced frames. The signal type therefore takes one of two values: unvoiced or voiced. After the binary decision parameters of the previous valid frame and the next valid frame are obtained, they indicate whether each of those frames is a periodic signal, and hence the signal type of each frame. The signal type of the current speech frame is determined from these, which yields the binary decision parameter of the current speech frame. In the manner given by this embodiment, determining the signal type of the current speech frame includes but is not limited to the following three situations:
Situation 1: the previous valid frame and the next valid frame are both periodic signals. According to their binary decision parameters, both are then determined to be voiced frames, and the signal type of the current speech frame is determined to be a voiced frame.
Situation 2: the previous valid frame is a periodic signal and the next valid frame is an aperiodic signal. According to their binary decision parameters, the previous valid frame is then determined to be a voiced frame and the next valid frame an unvoiced frame. Alternatively, the previous valid frame is an aperiodic signal and the next valid frame is a periodic signal, in which case the previous valid frame is determined to be an unvoiced frame and the next valid frame a voiced frame.
In both variants of this situation, exactly one of the previous valid frame and the next valid frame is a periodic signal. It follows that a transition between periodic and aperiodic signal took place within the lost frames, so it is reasonable to assume that some periodic signal is present in the lost frames. The current speech frame is therefore determined to be a voiced frame.
Situation 3: the previous valid frame and the next valid frame are both aperiodic signals. According to their binary decision parameters, both are then determined to be unvoiced frames, and the signal type of the current speech frame is determined to be an unvoiced frame.
Whichever of the above situations is used to determine the signal type of the current speech frame, the determined signal type can be converted into the corresponding binary decision parameter. In a specific implementation, for example, the binary decision parameter of an unvoiced frame can be set to 0 and that of a voiced frame to 1. Once the signal type of the current speech frame is determined, its binary decision parameter is 0 if it is an unvoiced frame and 1 if it is a voiced frame. The values of the binary decision parameter can of course be set in other ways, which this embodiment does not specifically limit.
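The three situations above reduce to a simple rule: the lost frame is treated as voiced (periodic) whenever at least one neighbouring valid frame is voiced, and as unvoiced (aperiodic) only when both neighbours are unvoiced. A minimal sketch, using the example encoding of 0 for unvoiced and 1 for voiced; the function and constant names are hypothetical:

```python
UNVOICED = 0  # aperiodic signal
VOICED = 1    # periodic signal


def lost_frame_signal_type(prev_flag: int, next_flag: int) -> int:
    """Binary decision parameter of the lost frame from its two valid neighbours."""
    if prev_flag == VOICED or next_flag == VOICED:
        # Situations 1 and 2: some periodic signal is assumed to be present.
        return VOICED
    # Situation 3: both neighbours are aperiodic.
    return UNVOICED
```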
404: obtain the temporal evolution parameters of the previous valid frame and the next valid frame, and determine the temporal evolution parameters of the current speech frame according to the binary decision parameters and the temporal evolution parameters of the previous valid frame and the next valid frame;
Specifically, the temporal evolution parameters can include but are not limited to the pitch period, the gain parameter, and the LSP (Line Spectrum Pair) coefficients, which this embodiment does not specifically limit; nor does it limit how the temporal evolution parameters of the previous valid frame and the next valid frame are obtained. Taking the pitch period first as an example: after the signal types are determined from the binary decision parameters of the previous valid frame and the next valid frame, the pitch period parameter of the current speech frame can be determined from those signal types according to the following four situations.
Situation 1: the previous valid frame and the next valid frame are both voiced frames;
After the pitch periods of the previous valid frame and the next valid frame are obtained, note that in real scenarios a speaker may suddenly raise or lower the tone, so an abrupt change of pitch period can occur even in a stable voiced segment. Whether the pitch period changes abruptly between the previous valid frame and the next valid frame can be judged as follows: take the absolute value of the difference between the pitch period of the previous valid frame and that of the next valid frame, compare it with a preset pitch period offset threshold, and determine from the comparison whether an abrupt change occurred.
For example, let next_pitch be the pitch period of the next valid frame and last_pitch the pitch period of the previous valid frame. The absolute value of their difference is compared with the preset pitch period offset threshold δ to determine whether the pitch period changes abruptly.
If |next_pitch - last_pitch| < δ, that is, if the absolute value of the difference is less than the pitch period offset threshold δ, it can be determined that no abrupt change of pitch period occurred between the previous valid frame and the next valid frame. Otherwise, it can be determined that an abrupt change occurred. The pitch period offset threshold δ can be set from experience, which this embodiment does not specifically limit. Other methods of determining whether the pitch period changes abruptly between the previous valid frame and the next valid frame may also be used in practice, which this embodiment likewise does not specifically limit.
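The threshold test can be written directly from the inequality above. The function name is hypothetical, and any concrete value passed as `delta` is an assumption, since the embodiment only says that δ is set from experience.

```python
def pitch_changed_abruptly(last_pitch: float, next_pitch: float, delta: float) -> bool:
    """True if |next_pitch - last_pitch| is not below the offset threshold delta."""
    return abs(next_pitch - last_pitch) >= delta
```

For example, with an assumed threshold of 10 samples, pitch periods of 80 and 84 are treated as a smooth evolution, while 80 and 120 are treated as an abrupt change.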
When the pitch period parameter of the current speech frame is determined, two cases are distinguished according to whether the pitch period changes abruptly between the previous valid frame and the next valid frame:
Case 1: no abrupt change of pitch period occurs between the previous valid frame and the next valid frame;
When no abrupt change occurs, the pitch period contour is smooth and evolves over time. Linear interpolation can therefore be used to determine the pitch periods of the subframes of the current speech frame, and the pitch period of the current speech frame is then determined from the pitch periods of its subframes. Other interpolation algorithms may of course be chosen to determine the pitch period of the current speech frame, which this embodiment does not specifically limit.
In a specific implementation, the parameters needed for the linear interpolation can be chosen according to the actual situation and given different values; this embodiment does not specifically limit the linear interpolation algorithm. Taking the following linear interpolation algorithm merely as an example, it can be expressed by the following formulas:
pitch[k] = last_pitch + pIncr * (k + lossCnt * subFrameCount + 1)    (2)
In formula (1), pIncr is the evolution increment, lostFrameCount is the total number of lost frames passed in by the jitter buffer, and subFrameCount is the number of subframes in each frame.
In formula (1), the value of lostFrameCount can be determined by counting the consecutive lost frames before the next valid frame in the jitter buffer. For example, suppose the jitter buffer can hold 5 frames in total, each with a number. At some moment the second frame is lost and the next valid frame in the jitter buffer is the fifth frame, numbered 5. It can then be determined that the second, third, and fourth frames are lost, so the value of lostFrameCount is 3. The number of subframes per frame, subFrameCount, can be set according to actual needs, which this embodiment also does not specifically limit.
After the values of these parameters are determined in the above manner, the value of the evolution increment pIncr can be determined and substituted into formula (2) for further computation.
In formula (2), lossCnt is the total number of frames lost from the start of the loss up to the current position, k denotes the k-th subframe of the current speech frame, and pitch[k] is the pitch period of the k-th subframe of the current speech frame.
Once the values of lossCnt and k are determined in the above manner, substituting them into formula (2) gives the value of pitch[k], that is, the pitch period of the k-th subframe of the current speech frame.
After the pitch periods of all subframes of the current speech frame are obtained, the pitch period of the current speech frame can be determined by some method, for example a weighted combination of the subframe pitch periods, which this embodiment does not specifically limit.
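The subframe interpolation of formula (2) can be sketched as follows. Formula (1) is not reproduced in the text, so the increment used here, the pitch difference divided by lostFrameCount * subFrameCount + 1, is an assumption chosen so that the interpolated values land evenly between last_pitch and next_pitch. The sketch also assumes that lossCnt counts the lost frames preceding the current one (0 for the first lost frame).

```python
from typing import List


def interpolate_pitch(
    last_pitch: float,
    next_pitch: float,
    lost_frame_count: int,
    loss_cnt: int,
    sub_frame_count: int,
) -> List[float]:
    """Per-subframe pitch periods of one lost frame by linear interpolation."""
    # ASSUMPTION: stand-in for formula (1), which the text does not show.
    p_incr = (next_pitch - last_pitch) / (lost_frame_count * sub_frame_count + 1)
    # Formula (2): pitch[k] = last_pitch + pIncr * (k + lossCnt * subFrameCount + 1)
    return [
        last_pitch + p_incr * (k + loss_cnt * sub_frame_count + 1)
        for k in range(sub_frame_count)
    ]
```

Under these assumptions, a single lost frame of four subframes between pitch periods 80 and 100 receives the subframe pitch periods 84, 88, 92 and 96.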
Case 2: an abrupt change of pitch period occurs between the previous valid frame and the next valid frame.
To simplify subsequent calculation, this embodiment takes the pitch period change to occur at the middle time point of the packet loss. For example, if the frame numbers show that the second, third, and fourth frames are lost and that the first and fifth frames arrived normally, the pitch change is taken to occur at the middle time point of the loss, that is, at the third frame. Other methods of determining the time point of the change may of course be used according to the actual situation, which this embodiment does not specifically limit. For example, if the pitch period of the previous frame is last_pitch, the pitch period of the next frame is next_pitch, and the pitch period of the current speech frame is pitch, the pitch period of the current speech frame can be determined by the following formulas:
pitch = last_pitch, if lossCnt < (lostFrameCount >> 1)    (3)
pitch = next_pitch, if lossCnt >= (lostFrameCount >> 1)    (4)
In formulas (3) and (4), after the value of lostFrameCount is determined, the total number of frames lost from the start of the loss up to the current speech frame, lossCnt, is determined. If lostFrameCount divided by 2 (rounded down, as the right shift indicates) is greater than lossCnt, the pitch period last_pitch of the previous valid frame is taken as the pitch period of the current speech frame. Otherwise, the pitch period of the next valid frame is taken as the pitch period of the current speech frame.
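Formulas (3) and (4) amount to a switch at the midpoint of the gap. In the sketch below, the function name is hypothetical and lossCnt is again assumed to count the lost frames preceding the current one.

```python
def pitch_across_abrupt_change(
    last_pitch: int, next_pitch: int, lost_frame_count: int, loss_cnt: int
) -> int:
    """Pitch period of a lost frame when the pitch jumps inside the gap."""
    if loss_cnt < (lost_frame_count >> 1):
        return last_pitch  # formula (3): before the midpoint, hold the previous pitch
    return next_pitch      # formula (4): from the midpoint on, use the next pitch
```

With three lost frames, `lost_frame_count >> 1` is 1, so only the first lost frame keeps last_pitch and the switch happens at the second of the three lost frames, the third frame overall in the example above.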
Situation 2: the previous valid frame is a voiced frame and the next valid frame is an unvoiced frame;
In this situation the periodic signal can be assumed to decay continuously during the packet loss. In terms of the physical model of speech production, this decay shows up as a gradual lengthening of the pitch period, or equivalently a slow decline of the fundamental frequency. On this basis, the pitch period of the current speech frame can be obtained by extrapolating, with increments, from the pitch period of the previous valid frame. The pitch period of the k-th subframe of the current speech frame is estimated as follows:
pitch[k] = last_pitch + lossCnt * subFrameCount + k    (5)
The parameters in formula (5) have the meanings given in the steps above. Once the pitch periods of all subframes of the current speech frame are obtained, the pitch period of the current speech frame can be determined as described in the steps above, which is not repeated here.
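Formula (5) lengthens the pitch period by one unit per subframe, starting from last_pitch. A minimal sketch, with a hypothetical function name and lossCnt assumed to count the lost frames preceding the current one:

```python
from typing import List


def extrapolate_decaying_pitch(
    last_pitch: int, loss_cnt: int, sub_frame_count: int
) -> List[int]:
    """Formula (5): pitch[k] = last_pitch + lossCnt * subFrameCount + k."""
    return [last_pitch + loss_cnt * sub_frame_count + k for k in range(sub_frame_count)]
```

For a previous pitch period of 80 and four subframes per frame, the first lost frame receives 80 to 83 and the second lost frame 84 to 87, so the lengthening continues across consecutive lost frames.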
Situation 3: the previous valid frame is an unvoiced frame and the next valid frame is a voiced frame;
In this situation the periodic signal can be assumed to form gradually during the packet loss. Based on the principle described above, the formation of the periodic signal can be simulated by extrapolating step by step from the pitch period of the next valid frame, which gives the pitch period of the current speech frame. The subframe pitch periods of the current speech frame are estimated as follows:
pitch[subFrameCount - k - 1] = next_pitch - lossCnt * subFrameCount - k    (6)
The parameters in formula (6) have the meanings given in the steps above. Once the pitch periods of all subframes of the current speech frame are obtained, the pitch period of the current speech frame can be determined as described in the steps above, which is not repeated here.
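Formula (6) fills the subframes from the last one backward, starting at next_pitch. The sketch below implements the formula literally; the function name is hypothetical, and how lossCnt is counted in this situation is not spelled out in the text, so it is simply taken as an input.

```python
from typing import List


def extrapolate_forming_pitch(
    next_pitch: int, loss_cnt: int, sub_frame_count: int
) -> List[int]:
    """Formula (6): pitch[subFrameCount - k - 1] = next_pitch - lossCnt * subFrameCount - k."""
    pitch = [0] * sub_frame_count
    for k in range(sub_frame_count):
        pitch[sub_frame_count - k - 1] = next_pitch - loss_cnt * sub_frame_count - k
    return pitch
```

With next_pitch equal to 100, lossCnt equal to 0, and four subframes, the last subframe receives 100 and the earlier subframes 99, 98 and 97.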
Situation 4: the previous valid frame and the next valid frame are both unvoiced frames.
In this situation the current speech frame can be determined to be an unvoiced frame. Because an unvoiced frame is an aperiodic signal, the current speech frame has no pitch period.
This embodiment next takes the gain parameter of the current speech frame as an example to explain how the temporal evolution parameters of the current speech frame are determined from those of the previous valid frame or the next valid frame.
In a specific implementation, this embodiment determines the gain parameter by linear interpolation. In practice the interpolation method can of course be chosen according to the complexity, delay, and effect of the different algorithms, which this embodiment does not specifically limit. For example, when consecutive packet loss is not very severe, polynomial interpolation can be used to obtain the gain of the lost frames. To achieve a good result, however, that method needs to decode more of the subsequent frames in the jitter buffer, which increases decoding delay. For the specific application scenario, this embodiment provides a linear interpolation algorithm with the following formulas:
gain[k] = last_gain + gIncr * (k + lossCnt * subFrameCount + 1)    (8)
In formula (7), last_gain is the gain parameter of the previous valid frame, next_gain is the gain parameter of the next valid frame, and the evolution increment gIncr is analogous to the evolution increment of the pitch period in the steps above. Substituting the specific parameter values into formula (7) gives the value of the evolution increment gIncr.
After the evolution increment gIncr is determined, substituting the parameter values into formula (8) gives the gain parameter of the k-th subframe of the current speech frame.
Similarly, this embodiment also determines the LSP coefficients by linear interpolation, with the following formulas:
lsp[i] = (1 - α) * last_lsp[i] + α * next_lsp[i], i = 1, 2, ..., P    (10)
In formula (9), α is the weight coefficient. This embodiment determines the weight coefficient α used for the linear interpolation between the previous and next frames from the number of lost frames passed in by the jitter buffer, lostFrameCount, and the total number of frames lost up to the current speech frame, lossCnt. Substituting the specific parameter values into formula (9) gives the value of α.
After the weight coefficient α is determined, substituting the parameter values into formula (10) gives the i-th order LSP coefficient of the current speech frame. Assuming the current speech frame has P orders of LSP coefficients in total, computing in this manner finally determines the LSP coefficients of all orders of the current speech frame.
405: Obtain the non-sequential evolution class parameters of the previous valid frame and the next valid frame, and determine the non-sequential evolution class parameters of the current speech frame according to the binary decision class parameters and the non-sequential evolution class parameters of the previous and next valid frames;
Specifically, the non-sequential evolution class parameters may include, but are not limited to, the LTP (Long Term Prediction) coefficients and the excitation signal; this embodiment does not specifically limit this. Nor does this embodiment specifically limit how the non-sequential evolution class parameters of the current speech frame are determined, which includes but is not limited to: after the signal types are determined from the binary decision class parameters of the previous and next valid frames, determining the non-sequential evolution class parameters of the current speech frame from those of the previous valid frame or the next valid frame according to the following four situations.
Situation one: the previous valid frame and the next valid frame are both voiced frames;
Taking the LTP coefficients as an example first: if continuous packet loss is in progress, or the pitch period changes abruptly between the previous and next valid frames of the current speech frame, it can be inferred that the LTP coefficients of the previous and next valid frames may differ significantly. When judging continuous packet loss, if the number of consecutively lost packets exceeds a packet loss threshold, the LTP coefficients of the previous and next valid frames may have changed significantly. The packet loss threshold can be set according to the actual situation; this embodiment does not specifically limit it. For how to judge whether the pitch period has changed abruptly between the previous and next valid frames, refer to the step above; details are not repeated here. Based on this principle, the following two cases are explained:
First case: continuous packet loss is not in progress, and the pitch period does not change abruptly between the previous and next valid frames of the current speech frame;
In this case the energy values of the previous and next valid frames of the current speech frame can be determined, for example Last_Energy for the previous valid frame and Next_Energy for the next valid frame. Dividing the energy value of the previous valid frame by that of the next valid frame gives the scaling factor β. If the current speech frame is a lost frame in the first half of the gap, the LTP coefficients of the previous valid frame multiplied by β give the LTP coefficients of the current speech frame. If the current speech frame is a lost frame in the second half, the LTP coefficients of the next valid frame multiplied by β give the LTP coefficients of the current speech frame. Other ways of determining the LTP coefficients of the current speech frame may also be used in practice; this embodiment does not specifically limit this.
Second case: continuous packet loss is in progress, or the pitch period changes abruptly between the previous and next valid frames of the current speech frame.
Because the LTP coefficients of the previous and next valid frames may then differ significantly, if the current speech frame is a lost frame in the first half of the gap, the LTP coefficients of the previous valid frame are used as the LTP coefficients of the current speech frame. If the current speech frame is a lost frame in the second half, the LTP coefficients of the next valid frame are used as the LTP coefficients of the current speech frame. Other ways of determining the LTP coefficients of the current speech frame may also be used in practice; this embodiment does not specifically limit this.
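The two cases of situation one can be sketched in Python as below. This is only an illustration: the function name, the boolean flags and the handling of a zero Next_Energy are assumptions that the text does not specify.

```python
def conceal_ltp_both_voiced(last_ltp, next_ltp, last_energy, next_energy,
                            in_first_half, continuous_loss, pitch_jump):
    """LTP coefficients of a lost frame whose two neighbours are both voiced."""
    # First half of the gap borrows from the previous frame, second half from the next.
    source = last_ltp if in_first_half else next_ltp
    if continuous_loss or pitch_jump:
        # Second case: the neighbours may differ a lot, so copy without scaling.
        return list(source)
    if next_energy <= 0:
        # ASSUMPTION: the text does not cover a zero-energy frame; copy unscaled.
        return list(source)
    # First case: beta = Last_Energy / Next_Energy scales the borrowed coefficients.
    beta = last_energy / next_energy
    return [beta * c for c in source]
```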
Situation two: the previous valid frame is a voiced frame and the next valid frame is an unvoiced frame;
Because the previous valid frame is voiced and the next valid frame is unvoiced, it can be assumed that the periodic signal decays continuously during the packet loss. Therefore, the LTP coefficients of the current lost frame are obtained by uniformly multiplying the LTP coefficients of the previous valid frame by a decay factor. The decay factor can be obtained from the energy ratio of the previous and next valid frames; other ways of computing the decay factor may also be used, and this embodiment does not specifically limit this.
Situation three: the previous valid frame is an unvoiced frame and the next valid frame is a voiced frame;
Because the previous valid frame is unvoiced and the next valid frame is voiced, it can be assumed that the periodic signal strengthens continuously during the packet loss. Therefore, the LTP coefficients of the current lost frame are obtained by uniformly multiplying the LTP coefficients of the next valid frame by a decay factor; this embodiment does not specifically limit how the decay factor is determined.
Situation four: the previous valid frame and the next valid frame are both unvoiced frames.
Because the previous and next valid frames are both unvoiced, the current speech frame is also unvoiced. It can then be judged that there is no periodic signal during the packet loss, and the LTP coefficients of the current speech frame need not be determined.
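Situations two to four can be sketched in Python as follows. The decay factor is passed in by the caller because the text leaves its computation open; the function name and the use of None for "no LTP coefficients needed" are assumptions made for illustration.

```python
def conceal_ltp_by_voicing(last_voiced, next_voiced, last_ltp, next_ltp, decay):
    """LTP coefficients of a lost frame when at most one neighbour is voiced."""
    if last_voiced and not next_voiced:
        # Situation two: periodicity fades out, scale the previous frame's coefficients.
        return [decay * c for c in last_ltp]
    if not last_voiced and next_voiced:
        # Situation three: periodicity builds up, scale the next frame's coefficients.
        return [decay * c for c in next_ltp]
    if not last_voiced and not next_voiced:
        # Situation four: no periodic signal, so no LTP coefficients are needed.
        return None
    raise ValueError("both neighbours voiced: situation one applies instead")
```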
This embodiment continues with the excitation signal as an example to explain the handling of non-sequential evolution class parameters. Note that the excitation signal is generally the highly random residual that remains after the speech signal has undergone long-term and short-term prediction and post-processing (such as noise shaping). It nevertheless still contains some non-white information that the speech model cannot decompose. Therefore, replacing it with white noise alone cannot yield good synthesis quality. Moreover, this kind of parameter basically has no sequential evolution property and is therefore unsuitable for interpolation. Based on this principle, this embodiment provides a method of determining the excitation signal of the current speech frame, explained as follows:
If the current speech frame is a lost frame in the first half of the gap, the excitation signal of the previous valid frame is used as the excitation signal of the current speech frame. If the current speech frame is a lost frame in the second half, the excitation signal of the next valid frame is used as the excitation signal of the current speech frame.
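The selection rule above can be sketched in Python as below. How "first half" is decided is not spelled out in this passage, so the test 2 * loss_cnt < lost_frame_count (with loss_cnt counting the frames already lost before the current one) is an assumption, as are the names.

```python
def in_first_half(loss_cnt, lost_frame_count):
    """ASSUMED rule: the zero-based position falls in the first half of the gap."""
    return 2 * loss_cnt < lost_frame_count


def conceal_excitation(last_excitation, next_excitation, loss_cnt, lost_frame_count):
    """Copy the excitation of the nearer valid frame; no interpolation is done."""
    if in_first_half(loss_cnt, lost_frame_count):
        return list(last_excitation)
    return list(next_excitation)
```

With an odd number of lost frames this rule assigns the middle frame to the previous valid frame.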
406: Decode the current speech frame according to the binary decision class parameters, sequential evolution class parameters and non-sequential evolution class parameters of the current speech frame.
Specifically, after the binary decision class parameters, sequential evolution class parameters and non-sequential evolution class parameters of the current speech frame have been obtained through the steps above, they are fed to the decoder, whose structure is shown in Fig. 5, and the decoder performs the decoding. The decoding algorithm is determined by the encoding algorithm; this embodiment does not specifically limit it. After the decoder has finished decoding, the speech signal of the current speech frame is obtained.
407: Obtain the parameters of the previous valid frame of the current speech frame, extrapolate from the parameters of the previous valid frame to obtain the parameters of the current speech frame, and decode the current speech frame according to its parameters.
Specifically, because it has been determined that the current speech frame has no next valid frame, after the parameters of the previous valid frame are obtained, obtaining the binary decision class parameter of the current speech frame may involve, but is not limited to, two cases. In the first case, the binary decision class parameter of the previous valid frame indicates that the previous valid frame is an unvoiced frame, and the signal type of the current speech frame can then be extrapolated to be unvoiced as well. In the second case, the previous valid frame is determined to be a voiced frame; its pitch period and gain parameter can then be obtained, and the speech signal of the previous valid frame is allowed to decay slowly at a certain rate, so that the pitch period gradually lengthens and the gain parameter gradually decreases. On reaching the current speech frame, if the pitch period of the current speech frame exceeds the preset pitch period threshold, or the gain parameter falls below the preset gain parameter threshold, the signal type of the current speech frame is determined to be unvoiced. Otherwise, the signal type of the current speech frame is determined to be voiced.
The decay rate of the speech signal of the previous valid frame can be set according to historical experience, or determined by another method; this embodiment does not specifically limit this. The preset thresholds of the pitch period and the gain parameter can likewise be set empirically; this embodiment does not specifically limit how they are set.
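The extrapolated voicing decision can be sketched in Python as follows. The geometric growth and decay model, the rate values and the threshold names are all assumptions chosen for illustration; the text only says that the rates and thresholds are set empirically.

```python
def extrapolate_voicing(prev_voiced, prev_pitch, prev_gain, frames_since_valid,
                        pitch_growth, gain_decay, pitch_max, gain_min):
    """Return (voiced, pitch, gain) for a lost frame that has no next valid frame."""
    if not prev_voiced:
        # First case: an unvoiced predecessor is extrapolated as unvoiced.
        return False, None, None
    # Second case. ASSUMED decay model: per lost frame the pitch period is
    # multiplied by pitch_growth (> 1) and the gain by gain_decay (< 1).
    pitch = prev_pitch * pitch_growth ** frames_since_valid
    gain = prev_gain * gain_decay ** frames_since_valid
    # The frame switches to unvoiced once either preset threshold is crossed.
    voiced = not (pitch > pitch_max or gain < gain_min)
    return voiced, pitch, gain
```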
Further, this embodiment does not specifically limit how the sequential evolution class parameters of the current speech frame are determined, which includes but is not limited to determining them according to the signal type of the previous valid frame. This falls into the following two situations:
Situation one: the previous valid frame is an unvoiced frame;
In this situation the current speech frame can be determined to be unvoiced. Because an unvoiced frame is a non-periodic signal, the current speech frame has no pitch period. Taking the gain parameter of the current speech frame as an example, this embodiment explains how it is determined from the gain parameter of the previous valid frame.
The speech signal of the unvoiced frame is decayed at a given rate, and the gain parameter decreases slowly accordingly. On reaching the current speech frame, the gain parameter at that point is used as the gain parameter of the current speech frame.
The decay rate of the speech signal of the previous valid frame can be set according to historical experience, or determined by another method; this embodiment does not specifically limit this.
Situation two: the previous valid frame is a voiced frame.
In this situation the speech signal of the previous valid frame can be decayed at a given rate: the pitch period slowly increases, the gain parameter slowly decreases accordingly, and the formants corresponding to the LSP coefficients gradually weaken through bandwidth expansion. On reaching the current speech frame, if the current speech frame is still a voiced frame according to the decision procedure of step 403 above, the pitch period, gain parameter, LSP coefficients and other parameters at that point are used as the sequential evolution class parameters of the current speech frame.
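A minimal Python sketch of situation two is given below. The text does not say how the decay or the bandwidth expansion is carried out, so everything here is an assumption: the geometric rates, the default values, and in particular the way formants are weakened, which is done by pulling the LSP vector (in radians) toward evenly spaced values that correspond to a flat spectral envelope.

```python
import math


def extrapolate_voiced_params(pitch, gain, lsp, n_lost,
                              pitch_growth=1.05, gain_decay=0.9, lsp_pull=0.1):
    """Extrapolate pitch period, gain and LSPs n_lost frames past a voiced frame."""
    order = len(lsp)
    # Evenly spaced LSPs on (0, pi) describe a flat envelope with no formants.
    flat = [math.pi * (i + 1) / (order + 1) for i in range(order)]
    # ASSUMED schedule: each lost frame moves the LSPs a fraction lsp_pull
    # of the remaining distance toward the flat vector.
    w = 1.0 - (1.0 - lsp_pull) ** n_lost
    new_lsp = [(1 - w) * x + w * f for x, f in zip(lsp, flat)]
    # Pitch period lengthens and gain shrinks geometrically (assumed model).
    return pitch * pitch_growth ** n_lost, gain * gain_decay ** n_lost, new_lsp
```

The result would only be used if the frame is still judged voiced after the threshold check described for step 407.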
Further, the non-sequential evolution class parameters of the current speech frame are determined according to the binary decision class parameter and the non-sequential evolution class parameters of the previous valid frame.
Specifically, if the binary decision class parameter of the current speech frame indicates that the current speech frame is voiced, and taking the LTP coefficients as the example of a non-sequential evolution class parameter: the LTP coefficients of the previous valid frame are obtained and multiplied by a scaling factor to give the LTP coefficients of the current speech frame. The scaling factor can be obtained empirically and decreases frame by frame; this embodiment does not specifically limit this. If the current speech frame is unvoiced, and taking the excitation signal as the example of a non-sequential evolution class parameter, the excitation signal of the current speech frame can be taken from the lower-energy part of the previous valid frame; this embodiment does not specifically limit this.
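Both rules can be sketched in Python as below, purely as an illustration. The per-frame scale, the segment length and the tiling of the quietest segment to fill a frame are assumptions; the text only states that the scaling factor decreases frame by frame and that a lower-energy part of the previous excitation may be chosen.

```python
def scaled_ltp(prev_ltp, scale, n_lost):
    """Voiced case: shrink the previous LTP coefficients a little more each lost frame."""
    factor = scale ** n_lost  # ASSUMED geometric schedule for the empirical factor
    return [factor * c for c in prev_ltp]


def low_energy_excitation(prev_excitation, segment_len, frame_len):
    """Unvoiced case: reuse the quietest segment of the previous excitation."""
    segments = [prev_excitation[i:i + segment_len]
                for i in range(0, len(prev_excitation) - segment_len + 1, segment_len)]
    if not segments:
        raise ValueError("previous excitation is shorter than one segment")
    # Energy of a segment is the sum of its squared samples.
    quietest = min(segments, key=lambda s: sum(x * x for x in s))
    # ASSUMPTION: repeat the segment until the frame is filled, then trim.
    repeats = -(-frame_len // segment_len)
    return (list(quietest) * repeats)[:frame_len]
```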
The current speech frame is decoded according to its binary decision class parameters, sequential evolution class parameters and non-sequential evolution class parameters; for the process of obtaining the speech signal of the current speech frame, refer to step 406 above, which is not repeated here.
In the method provided by this embodiment, when it is determined that the current speech frame to be decoded is lost, the parameters of the previous valid frame of the current speech frame, or of both the previous and next valid frames, are obtained, and the parameters of the current speech frame are determined according to the specific situation. The lost speech frame is then decoded normally according to those parameters. Because this simulates the normal operation of the decoder under packet loss, the continuity of decoding is maintained. Thus, when packet loss occurs during the transmission of speech packets, decoding can proceed from the determined parameters of the lost frames, which improves the quality of the decoded speech.
Embodiment three
An embodiment of the present invention provides a parameter-domain packet loss concealment device, which is used to perform the parameter-domain packet loss concealment method provided by Embodiment one or Embodiment two. Referring to Fig. 6, the device includes:
A determining module 601, configured to determine whether the current speech frame to be decoded is lost;
A previous-frame acquisition module 602, configured to obtain the parameters of the previous valid frame of the current speech frame when the current speech frame is lost;
A current-frame determination module 603, configured to determine the parameters of the current speech frame according to the parameters of the previous valid frame;
A decoding module 604, configured to decode the current speech frame according to the parameters of the current speech frame.
As a preferred embodiment, referring to Fig. 7, the device further includes:
A judging module 605, configured to judge whether a next valid frame of the current speech frame has been buffered;
A next-frame acquisition module 606, configured to obtain the parameters of the next valid frame when the next valid frame has been buffered;
The current-frame determination module 603 is configured to determine the parameters of the current speech frame according to the parameters of the previous valid frame and the parameters of the next valid frame.
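The cooperation of modules 601 to 606 can be sketched in Python as below. This is a control-flow illustration only: the function name, the use of None for a lost or unbuffered frame, and the three callables (interpolate, extrapolate, decode) are hypothetical and stand in for the concrete parameter computations described in the method embodiments.

```python
from typing import Any, Callable, Optional


def decode_or_conceal(current: Optional[Any], previous: Any, buffered_next: Optional[Any],
                      interpolate: Callable[[Any, Any], Any],
                      extrapolate: Callable[[Any], Any],
                      decode: Callable[[Any], Any]) -> Any:
    """Decode one frame, concealing it in the parameter domain if it was lost."""
    # Determining module 601: a frame that arrived is decoded as usual.
    if current is not None:
        return decode(current)
    # Judging module 605 and next-frame acquisition module 606.
    if buffered_next is not None:
        # Current-frame determination module 603, using both neighbours.
        params = interpolate(previous, buffered_next)
    else:
        # Current-frame determination module 603, using the previous frame only.
        params = extrapolate(previous)
    # Decoding module 604 runs on the reconstructed parameters.
    return decode(params)
```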
As a preferred embodiment, the parameters of the previous valid frame and the parameters of the next valid frame include binary decision class parameters; a binary decision class parameter is used to determine the signal type, and the signal type is one of two types, unvoiced and voiced;
The current-frame determination module 603 is configured to determine that the signal type of the current speech frame is voiced when either the binary decision class parameter of the previous valid frame or that of the next valid frame indicates a voiced frame;
As a preferred embodiment, the current-frame determination module 603 is configured to determine that the signal type of the current speech frame is unvoiced when the binary decision class parameters of the previous valid frame and the next valid frame both indicate an unvoiced frame.
As a preferred embodiment, the parameters of the previous valid frame and the parameters of the next valid frame further include sequential evolution class parameters, which include at least the pitch period;
The current-frame determination module 603 is further configured to determine the sequential evolution class parameters of the current speech frame according to the binary decision class parameters and the sequential evolution class parameters of the previous and next valid frames.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the binary decision class parameters of the previous and next valid frames indicate that both frames are voiced and their sequential evolution class parameters indicate that the pitch period does not change abruptly between them, perform linear interpolation on the pitch periods of the previous and next valid frames to obtain the pitch period of the current speech frame.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the binary decision class parameters of the previous and next valid frames indicate that both frames are voiced and their sequential evolution class parameters indicate that the pitch period changes abruptly between them, determine that the pitch period of the current speech frame equals the pitch period of the previous valid frame if the current speech frame lies in the first half of all lost speech frames, and equals the pitch period of the next valid frame if the current speech frame lies in the second half of all lost speech frames.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the binary decision class parameters of the previous and next valid frames indicate that the previous valid frame is voiced and the next valid frame is unvoiced, extrapolate from the pitch period of the previous valid frame to obtain the pitch period of the current speech frame.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the binary decision class parameters of the previous and next valid frames indicate that the previous valid frame is unvoiced and the next valid frame is voiced, extrapolate from the pitch period of the next valid frame to obtain the pitch period of the current speech frame.
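The four pitch period embodiments above can be combined into one Python sketch. It is an illustration under stated assumptions: the interpolation weight (lossCnt + 1) / (LostFrameCount + 1), the first-half test, and the choice of simply holding the neighbouring pitch period as the extrapolation are not taken from the text, and the names are hypothetical.

```python
def conceal_pitch(last_voiced, next_voiced, last_pitch, next_pitch,
                  loss_cnt, lost_frame_count, pitch_jump):
    """Pitch period of a lost frame, chosen by the voicing of its two neighbours."""
    if last_voiced and next_voiced:
        if not pitch_jump:
            # Both voiced, smooth pitch: linear interpolation (assumed weight).
            alpha = (loss_cnt + 1) / (lost_frame_count + 1)
            return (1 - alpha) * last_pitch + alpha * next_pitch
        # Both voiced, abrupt change: copy the nearer neighbour (assumed half test).
        return last_pitch if 2 * loss_cnt < lost_frame_count else next_pitch
    if last_voiced:
        # Voiced then unvoiced: extrapolate from the previous frame (held here).
        return last_pitch
    if next_voiced:
        # Unvoiced then voiced: extrapolate from the next frame (held here).
        return next_pitch
    # Both unvoiced: the lost frame has no pitch period.
    return None
```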
As a preferred embodiment, the parameters of the previous valid frame and the parameters of the next valid frame further include non-sequential evolution class parameters, which include at least the long-term prediction (LTP) coefficients;
The current-frame determination module 603 is further configured to determine the non-sequential evolution class parameters of the current speech frame according to the binary decision class parameters and the non-sequential evolution class parameters of the previous and next valid frames.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the binary decision class parameters of the previous and next valid frames indicate that both frames are voiced, their sequential evolution class parameters indicate that the pitch period does not change abruptly between them, and the number of lost packets is less than the packet loss threshold: multiply the LTP coefficients of the previous valid frame by a scaling factor to obtain the LTP coefficients of the current speech frame if the current speech frame lies in the first half of all lost speech frames, and multiply the LTP coefficients of the next valid frame by the scaling factor to obtain the LTP coefficients of the current speech frame if the current speech frame lies in the second half of all lost speech frames.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the binary decision class parameters of the previous and next valid frames indicate that both frames are voiced, and either their sequential evolution class parameters indicate that the pitch period changes abruptly between them or the number of lost packets is greater than the packet loss threshold: determine that the LTP coefficients of the current speech frame equal those of the previous valid frame if the current speech frame lies in the first half of all lost speech frames, and equal those of the next valid frame if the current speech frame lies in the second half of all lost speech frames.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the binary decision class parameters of the previous and next valid frames indicate that the previous valid frame is voiced and the next valid frame is unvoiced, multiply the LTP coefficients of the previous valid frame by a decay factor to obtain the LTP coefficients of the current speech frame.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the binary decision class parameters of the previous and next valid frames indicate that the previous valid frame is unvoiced and the next valid frame is voiced, multiply the LTP coefficients of the next valid frame by a decay factor to obtain the LTP coefficients of the current speech frame.
In the device provided by this embodiment, when it is determined that the current speech frame to be decoded is lost, the parameters of the previous valid frame of the current speech frame, or of both the previous and next valid frames, are obtained, and the parameters of the current speech frame are determined according to the specific situation. The lost speech frame is then decoded normally according to those parameters. Because this simulates the normal operation of the decoder under packet loss, the continuity of decoding is maintained. Thus, when packet loss occurs during the transmission of speech packets, decoding can proceed from the determined parameters of the lost frames, which improves the quality of the decoded speech.
It should be noted that when the parameter-domain packet loss concealment device provided by the above embodiment performs packet loss concealment, the division into the functional modules above is used only as an example. In practical applications, the functions may be assigned to different functional modules as required; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the parameter-domain packet loss concealment device and the parameter-domain packet loss concealment method embodiments provided above belong to the same concept; for the specific implementation process, refer to the method embodiments, which is not repeated here.
The sequence numbers of the above embodiments of the present invention are for description only and do not represent the merits of the embodiments.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.