CN103714820A - Packet loss concealment method and device in the parameter domain - Google Patents

Packet loss concealment method and device in the parameter domain

Info

Publication number
CN103714820A
CN103714820A (application number CN201310741180.4A)
Authority
CN
China
Prior art keywords
frame
valid frame
parameter
current speech
class parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310741180.4A
Other languages
Chinese (zh)
Other versions
CN103714820B (en)
Inventor
陈若非
高泽华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Cubesili Information Technology Co Ltd
Original Assignee
Guangzhou Huaduo Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huaduo Network Technology Co Ltd
Priority to CN201310741180.4A
Publication of CN103714820A
Application granted
Publication of CN103714820B
Legal status: Active
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a packet loss concealment method and device in the parameter domain, and belongs to the field of Internet technology. The method includes the following steps: determining whether a current speech frame to be decoded is lost, and if so, acquiring the parameters of the previous valid frame; determining the parameters of the current speech frame according to the parameters of the previous valid frame; and decoding the current speech frame according to the parameters of the current speech frame. When the current speech frame to be decoded is determined to be lost, the parameters of the previous valid frame, or of both the previous valid frame and the next valid frame, are acquired; the parameters of the current speech frame are determined from the acquired parameters, and the current speech frame is decoded according to them. Because the normal operation of the decoder is simulated under packet loss conditions, decoding remains continuous; when packet loss occurs during transmission of the voice packets, decoding can proceed according to the parameters determined for the lost frames, thereby improving the quality of the decoded speech.

Description

Packet loss concealment method and device in the parameter domain
Technical field
The present invention relates to the field of Internet technology, and in particular to a packet loss concealment method and device in the parameter domain.
Background art
With the rapid development of the Internet and the continuous growth of telecommunications demand, VoIP (Voice over Internet Protocol) technology based on voice packet switching has become increasingly popular with users for its low cost, easy scalability, and good speech quality. In a voice call, after the receiving end receives a voice packet transmitted over the network, a speech decoder decodes the speech frames in the packet into the corresponding speech signal, thereby realizing the Internet call. Among existing speech decoders, inter-frame correlated decoders, such as Skype's SILK decoder, are widely adopted because they can provide higher-quality speech at the same bit rate. Because packet loss may occur while voice packets are in transit, reducing speech communication quality, a packet loss concealment method must be adopted to reduce the negative effects of lost voice packets and guarantee communication quality.
One packet loss concealment method is provided in the related art. In this method, after the receiving end receives and decodes the voice packets, if a voice packet was lost in transit, the decoded speech signal is processed to generate the speech signal of the lost speech frames, for example by processing the signals of the frames before and after the lost frames with techniques such as pitch-synchronous repetition or time-scale modification, thereby realizing packet loss concealment.
In the course of realizing the present invention, the inventors found that the prior art has at least the following problem:
Because the speech frames in a voice packet are mutually correlated, the decoding result of previously decoded speech frames affects the decoding of the current speech frame. If a voice packet is lost in transit, so that the speech frames in it are lost, then the frames following the lost frames cannot be decoded correctly; consequently, generating the speech signal of the lost frames by processing the signals of the surrounding frames yields poor results, leading to low speech communication quality.
Summary of the invention
In order to solve the problems of the prior art, embodiments of the present invention provide a packet loss concealment method and device in the parameter domain. The technical scheme is as follows:
In one aspect, a packet loss concealment method in the parameter domain is provided, the method comprising:
determining whether a current speech frame to be decoded is lost;
if the current speech frame is lost, acquiring the parameters of the previous valid frame of the current speech frame;
determining the parameters of the current speech frame according to the parameters of the previous valid frame; and
decoding the current speech frame according to the parameters of the current speech frame.
In another aspect, a packet loss concealment device in the parameter domain is provided, the device comprising:
a determination module, configured to determine whether a current speech frame to be decoded is lost;
a previous-frame acquisition module, configured to acquire the parameters of the previous valid frame of the current speech frame when the current speech frame is lost;
a current-frame determination module, configured to determine the parameters of the current speech frame according to the parameters of the previous valid frame; and
a decoding module, configured to decode the current speech frame according to the parameters of the current speech frame.
The technical scheme provided by the embodiments of the present invention brings the following beneficial effects:
When the current speech frame to be decoded is determined to be lost, the parameters of the previous valid frame of the current speech frame are acquired, the parameters of the current speech frame are determined according to the specific circumstances, and the lost speech frame is then decoded normally according to those parameters. Because the normal operation of the decoder is simulated under packet loss conditions, the continuity of decoding is maintained; when packet loss occurs during transmission, decoding can proceed according to the parameters determined for the lost frames, thereby improving the quality of the decoded speech.
Brief description of the drawings
To illustrate the technical scheme of the embodiments of the present invention more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.
Fig. 1 is a flowchart of a packet loss concealment method in the parameter domain provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of another packet loss concealment method in the parameter domain provided by Embodiment 1 of the present invention;
Fig. 3 is a flowchart of yet another packet loss concealment method in the parameter domain provided by Embodiment 1 of the present invention;
Fig. 4 is a flowchart of a packet loss concealment method in the parameter domain provided by Embodiment 2 of the present invention;
Fig. 5 is a schematic structural diagram of a decoder provided by Embodiment 2 of the present invention;
Fig. 6 is a schematic structural diagram of a packet loss concealment device in the parameter domain provided by Embodiment 3 of the present invention;
Fig. 7 is a schematic structural diagram of another packet loss concealment device in the parameter domain provided by Embodiment 3 of the present invention.
Detailed description of the embodiments
To make the objects, technical schemes and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1
Because the speech frames in the voice packets of an inter-frame correlated decoder are mutually related, the decoding result of the preceding speech frames affects the decoding of the current speech frame. When a voice packet is lost during network transmission, the speech frames it carries are lost as well. In that case, because there is no decoding result of a preceding speech frame to serve as a reference, the decoding of the speech frames following the lost frame suffers a strong negative effect, so the quality of the decoded speech signal is poor.
To reduce as much as possible the negative effect of packet loss on an inter-frame correlated decoder, the present invention provides a packet loss concealment method in the parameter domain. The method is applied to equipment fitted with an inter-frame correlated decoder, including but not limited to terminals, servers, and the like; this embodiment places no specific restriction on this. To decode the speech frames in a voice packet, the embodiments of the present invention use the parameters of the previous valid frame, or of the previous and next valid frames, as the basis for determining the parameters of the lost frames. Taking the receiving end as the executing entity, the method provided by this embodiment is explained below. Referring to Fig. 1, the method flow provided by this embodiment comprises:
101: determine whether a current speech frame to be decoded is lost;
102: if the current speech frame is lost, acquire the parameters of the previous valid frame of the current speech frame;
103: determine the parameters of the current speech frame according to the parameters of the previous valid frame;
104: decode the current speech frame according to the parameters of the current speech frame.
On the basis of the method shown in Fig. 1, the method provided by this embodiment can be subdivided into the following two cases, according to whether the next valid frame after the current speech frame is present in the buffer:
Referring to Fig. 2, for the case in which the next valid frame after the current speech frame is buffered, the method flow provided by this embodiment is as follows:
201: determine whether a current speech frame to be decoded is lost;
202: if the current speech frame is lost, acquire the parameters of the previous valid frame of the current speech frame;
203: judge whether the next valid frame after the current speech frame is buffered;
204: if the next valid frame is buffered, acquire the parameters of the next valid frame;
205: determine the parameters of the current speech frame according to the parameters of the previous valid frame and the next valid frame;
206: decode the current speech frame according to the parameters of the current speech frame.
Referring to Fig. 3, for the case in which the next valid frame after the current speech frame is not buffered, the method flow provided by this embodiment comprises:
301: determine whether a current speech frame to be decoded is lost;
302: if the current speech frame is lost, acquire the parameters of the previous valid frame of the current speech frame;
303: judge whether the next valid frame after the current speech frame is buffered;
304: if the next valid frame is not buffered, determine the parameters of the current speech frame by extrapolating from the parameters of the previous valid frame;
305: decode the current speech frame according to the parameters of the current speech frame.
In the method provided by this embodiment, when the current speech frame to be decoded is determined to be lost, the parameters of the previous valid frame, or of both the previous valid frame and the next valid frame, are acquired; the parameters of the current speech frame are determined according to the specific circumstances, and the lost speech frame is then decoded normally according to those parameters. Because the normal operation of the decoder is simulated under packet loss conditions, the continuity of decoding is maintained; when packet loss occurs during transmission, decoding can proceed according to the parameters determined for the lost frames, thereby improving the quality of the decoded speech.
Embodiment 2
This embodiment of the present invention provides a packet loss concealment method in the parameter domain. Building on the content of Embodiment 1 above, the packet loss concealment provided by the present invention is explained in detail for the cases in which the current speech frame is lost and the jitter buffer does or does not contain a subsequent voice packet. Referring to Fig. 4, the method flow provided by this embodiment comprises:
401: determine whether a current speech frame to be decoded is lost;
This embodiment places no specific restriction on the method of determining whether the current speech frame to be decoded is lost. One option, among others, is the following: before sending a voice packet, the sending end numbers each speech frame in the packet and then sends the numbered speech frames to the receiving end. As shown in Fig. 5, the decoder is provided with a jitter buffer, and received speech frames are pre-stored in it. According to the number of the previous valid frame of the current speech frame and the numbers of the subsequent valid frames stored in the jitter buffer, the decoder can determine whether the current speech frame is lost.
For example, suppose the first speech frame is numbered 1. After the decoder has decoded the first speech frame, it searches the jitter buffer for subsequent valid frames; if the subsequent valid frame it retrieves is numbered 4, it can determine that the second and third speech frames are lost. If the second speech frame is the one currently to be decoded, the current speech frame is determined to be lost.
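As an illustration only (not part of the original disclosure; the function and variable names are hypothetical), the numbering-based loss detection described above can be sketched as:

```python
def find_lost_frames(last_decoded_seq, jitter_buffer_seqs):
    """Detect lost frames from gaps in the sequence numbers.

    last_decoded_seq: number of the last successfully decoded frame.
    jitter_buffer_seqs: numbers of the valid frames currently buffered.
    Returns the numbers of the frames determined to be lost.
    """
    later = [s for s in jitter_buffer_seqs if s > last_decoded_seq]
    if not later:
        return []  # nothing buffered after the last decoded frame
    next_valid = min(later)
    # Every number between the last decoded frame and the next buffered
    # valid frame corresponds to a lost speech frame.
    return list(range(last_decoded_seq + 1, next_valid))

# Example from the description: frame 1 decoded, buffer holds frame 4
# -> frames 2 and 3 are determined to be lost.
print(find_lost_frames(1, [4]))
```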
Of course, other ways of determining whether the current speech frame is lost can also be adopted; this embodiment places no specific restriction on this. This embodiment only illustrates the speech decoding method for the case in which the current speech frame is lost; when the current speech frame is determined not to be lost, it can be decoded directly according to the predefined decoding flow, which is not repeated here.
402: if the current speech frame is lost, judge whether the next valid frame after the current speech frame is buffered; if so, execute step 403; otherwise, execute step 407;
In this step, judging whether the next valid frame after the current speech frame is buffered can use the same approach as determining whether the current speech frame is lost. As described in step 401 above, before sending the speech frames, the sending end numbers each one and sends the numbered frames to the receiving end. The receiving end sets up a jitter buffer in advance and pre-stores the received speech frames in it. According to the number of the previous valid frame of the current speech frame and the numbers of the subsequent valid frames stored in the jitter buffer, it judges whether the next valid frame after the current speech frame is buffered.
For example, suppose the current speech frame is numbered 3. If a subsequent speech frame numbered 4 is found in the jitter buffer, it can be determined that the next valid frame after the current speech frame is buffered. As another example, if the current speech frame is numbered 3 and the subsequent speech frame found in the jitter buffer is numbered 5, it can be determined that the next valid frame after the current speech frame is not buffered.
Of course, other ways of judging whether the next valid frame after the current speech frame is buffered can also be adopted; this embodiment places no specific restriction on this.
403: acquire the binary-decision-class parameters of the previous valid frame and the next valid frame, and determine the signal type of the current speech frame according to them, obtaining the binary-decision-class parameter of the current speech frame;
Specifically, the binary-decision-class parameter is used to judge the signal type. Because speech divides into unvoiced and voiced sounds, common speech models treat the modeling and coding of periodic and aperiodic signals quite differently. Broadly speaking, a periodic signal corresponds to a voiced frame and an aperiodic signal to an unvoiced frame; the signal type is therefore either voiced or unvoiced. After acquiring the binary-decision-class parameters of the previous valid frame and the next valid frame, it can be determined from them whether each of those frames is a periodic signal, and hence their signal types, from which the binary-decision-class parameter of the current speech frame is obtained. In the manner provided by this embodiment, the process of determining the signal type of the current speech frame includes, but is not limited to, the following three cases:
Case one: the previous valid frame and the next valid frame are both periodic signals. It can be determined from their binary-decision-class parameters that both are voiced frames, and the signal type of the current speech frame is then determined to be a voiced frame.
Case two: the previous valid frame is a periodic signal and the next valid frame is an aperiodic signal; from their binary-decision-class parameters it can be determined that the previous valid frame is a voiced frame and the next valid frame an unvoiced frame. Or the previous valid frame is an aperiodic signal and the next valid frame a periodic signal, in which case the previous valid frame is an unvoiced frame and the next valid frame a voiced frame.
In both of these situations, because one of the previous and next valid frames is a periodic signal, it can be inferred that a transition between periodic and aperiodic signal occurred within the lost frames, so it is reasonable to assume that some periodic signal is present in the lost frames as well; the current speech frame is therefore determined to be a voiced frame.
Case three: the previous valid frame and the next valid frame are both aperiodic signals. It can be determined from their binary-decision-class parameters that both are unvoiced frames, and the signal type of the current speech frame is then determined to be an unvoiced frame.
Whichever of the above cases is used to determine the signal type of the current speech frame, the determined signal type can be converted into the corresponding binary-decision-class parameter. For example, in a concrete implementation the binary-decision-class parameter of an unvoiced frame may be set to 0 and that of a voiced frame to 1; after the signal type of the current speech frame is determined, its binary-decision-class parameter is 0 if it is an unvoiced frame and 1 if it is a voiced frame. Of course, other value assignments can also be adopted; this embodiment places no specific restriction on this.
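The three cases above reduce to a single rule: the lost frame is classified as voiced if either neighboring valid frame is voiced. As an illustrative sketch (not from the patent; names and the 0/1 encoding follow the example values given above):

```python
UNVOICED, VOICED = 0, 1  # example binary-decision-class parameter values

def classify_lost_frame(last_type, next_type):
    """Binary-decision-class parameter of a lost frame.

    Cases one and two above yield a voiced frame whenever at least one
    neighbor is voiced (periodic); case three yields unvoiced otherwise.
    """
    if last_type == VOICED or next_type == VOICED:
        return VOICED
    return UNVOICED
```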
404: acquire the temporal-evolution-class parameters of the previous valid frame and the next valid frame, and determine the temporal-evolution-class parameters of the current speech frame according to the binary-decision-class parameters and temporal-evolution-class parameters of the previous and next valid frames;
Specifically, the temporal-evolution-class parameters may include, but are not limited to, the pitch period, the gain parameter, and the LSP (Line Spectrum Pair) coefficients; this embodiment places no specific restriction on this, nor on the manner of acquiring the temporal-evolution-class parameters of the previous and next valid frames. In a concrete implementation, taking the pitch period as an example first, after the signal types have been determined from the binary-decision-class parameters of the previous and next valid frames, the pitch period parameter of the current speech frame can be determined from the signal types of the previous and next valid frames according to the following four cases.
Case one: the previous valid frame and the next valid frame are both voiced frames;
After the pitch periods of the previous and next valid frames are obtained, it must be considered that in real scenes a speaker may suddenly raise or lower the tone, so a sudden change of pitch period may occur even during a stable voiced segment. To judge whether the pitch periods of the previous and next valid frames have undergone a sudden change, the following method can be used: take the absolute value of the difference between the pitch period of the previous valid frame and that of the next valid frame, compare it with a preset pitch period offset threshold, and determine from the comparison whether a sudden change has occurred.
For example, let next_pitch be the pitch period of the next valid frame and last_pitch that of the previous valid frame; whether a pitch period mutation exists is determined by comparing the absolute value of their difference with a preset pitch period offset threshold δ.
Specifically, if |next_pitch - last_pitch| < δ, i.e. the absolute value of the difference is smaller than the pitch period offset threshold δ, it can be determined that the pitch periods of the previous and next valid frames have not undergone a sudden change; otherwise, it can be determined that they have. The threshold δ can be set from historical experience; this embodiment places no specific restriction on it. Other methods of judging whether the pitch periods of the previous and next valid frames have mutated can also be adopted in practice; this embodiment likewise places no specific restriction on this.
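As a minimal sketch of this threshold test (illustrative only; δ is an empirically chosen value, and the names are hypothetical):

```python
def pitch_mutated(last_pitch, next_pitch, delta):
    """True when |next_pitch - last_pitch| >= delta, i.e. the pitch
    period jumps across the loss; False when the contour is smooth."""
    return abs(next_pitch - last_pitch) >= delta
```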
When determining the pitch period parameter of the current speech frame, the following two situations are distinguished according to whether the pitch periods of the previous and next valid frames have undergone a sudden change:
First situation: the pitch periods of the previous and next valid frames have not undergone a sudden change;
When no sudden change has occurred, the pitch period contour is smooth and evolves over time; the pitch periods of the subframes of the current speech frame can therefore be determined by linear interpolation, and the pitch period of the current speech frame determined from those subframe pitch periods afterwards. Of course, other interpolation algorithms can also be chosen to determine the pitch period of the current speech frame; this embodiment places no specific restriction on this.
In a concrete implementation, the parameters used in the linear interpolation can be set according to the actual situation, and different values can be used; this embodiment places no specific restriction on the linear interpolation algorithm. Taking the following linear interpolation algorithm as an example, it can be expressed by the following formulas:
pIncr = (next_pitch - last_pitch) / (lostFrameCount * subFrameCount + 1)    (1)
pitch[k] = last_pitch + pIncr * (k + lossCnt * subFrameCount + 1)    (2)
In formula (1), pIncr is the evolution increment, lostFrameCount is the total number of lost frames reported by the jitter buffer, and subFrameCount is the number of subframes contained in each frame.
The value of lostFrameCount can be determined by counting the consecutive lost frames before the next valid frame in the jitter buffer. For example, suppose the jitter buffer can hold 5 frames in total, each carrying a number. At some moment the second frame is lost, and the next valid frame in the jitter buffer is the fifth frame, numbered 5; it can then be determined that frames two, three and four are lost, so the value of lostFrameCount is 3. In addition, subFrameCount, the number of subframes per frame, can be set according to actual needs; this embodiment places no specific restriction on it either.
After the values of the different parameters are determined in the manner above, the value of the evolution increment pIncr can be computed and substituted into formula (2) for further calculation.
In formula (2), lossCnt is the number of frames lost from the start of the loss up to the current position, k denotes the k-th subframe of the current speech frame, and pitch[k] represents the pitch period of the k-th subframe of the current speech frame.
After the values of the different parameters are determined in the manner above, the values of lossCnt and k are substituted into formula (2) to compute pitch[k], i.e. the pitch period of the k-th subframe of the current speech frame.
After the pitch periods of all subframes of the current speech frame are obtained, a suitable method can be used to determine the pitch period of the current speech frame, for example a weighted combination of the subframe pitch periods; this embodiment places no specific restriction on this.
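Formulas (1) and (2) can be sketched directly in code (an illustrative rendering, not the patent's implementation; parameter names mirror the formulas):

```python
def interpolate_pitch(last_pitch, next_pitch,
                      lost_frame_count, loss_cnt, sub_frame_count):
    """Per-subframe pitch of a lost frame by linear interpolation.

    Formula (1): pIncr is the per-subframe evolution increment over the
    whole loss run; formula (2): each subframe advances by pIncr,
    offset by the subframes of the frames already lost (lossCnt).
    """
    p_incr = (next_pitch - last_pitch) / (lost_frame_count * sub_frame_count + 1)
    return [last_pitch + p_incr * (k + loss_cnt * sub_frame_count + 1)
            for k in range(sub_frame_count)]

# One lost frame of 4 subframes between pitches 100 and 110:
# pIncr = 10 / 5 = 2, so the subframes step 102, 104, 106, 108.
print(interpolate_pitch(100.0, 110.0, 1, 0, 4))
```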
Second situation: the pitch periods of the previous and next valid frames have undergone a sudden change.
For convenience of subsequent calculation, this embodiment takes as an example the case in which the pitch mutation occurs at the midpoint of the packet loss. For example, if the frame numbers show that the lost frames are the second, third and fourth frames while the first and fifth frames arrived normally, the pitch mutation is placed at the midpoint of the loss, i.e. in the third frame. Of course, other methods of locating the midpoint of the packet loss can be adopted according to the actual situation; this embodiment places no specific restriction on this. For example, let last_pitch be the pitch period of the previous frame, next_pitch that of the next frame, and pitch that of the current speech frame; the pitch period of the current speech frame can then be determined by the following formulas:
pitch = last_pitch,  if lossCnt < (lostFrameCount >> 1)    (3)
pitch = next_pitch,  if lossCnt ≥ (lostFrameCount >> 1)    (4)
In formulas (3) and (4), after the value of lostFrameCount has been determined, the count lossCnt of frames lost from the start of the loss up to the position of the current speech frame is determined. If lostFrameCount divided by 2 is greater than lossCnt, the pitch period last_pitch of the previous valid frame is taken as the pitch period of the current speech frame; otherwise, the pitch period of the next valid frame is taken.
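An illustrative rendering of formulas (3) and (4) (names are hypothetical; `>> 1` is the halving used in the formulas):

```python
def mutated_pitch(last_pitch, next_pitch, loss_cnt, lost_frame_count):
    """Pitch of a lost frame when a pitch mutation was detected.

    Before the midpoint of the loss run keep the previous frame's
    pitch (3); from the midpoint onwards switch to the next frame's (4).
    """
    if loss_cnt < (lost_frame_count >> 1):
        return last_pitch
    return next_pitch

# Three lost frames, midpoint index 1: the first lost frame keeps
# last_pitch, the second and third take next_pitch.
print([mutated_pitch(100, 120, c, 3) for c in range(3)])
```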
Case two: the previous valid frame is a voiced frame and the next valid frame is an unvoiced frame;
In this case it can be predicted that the periodic signal decays continuously during the packet loss. From the physical model of voice production, such decay manifests itself as a slow lengthening of the pitch period, i.e. a slow decrease of the fundamental frequency. Based on this principle, the pitch period of the current speech frame can be obtained by incrementally extrapolating the pitch period of the previous valid frame. The pitch period of the k-th subframe of the current speech frame is evaluated as follows:
pitch[k] = last_pitch + lossCnt * subFrameCount + k    (5)
The meaning of the parameters in formula (5) can be found in the annotations of the preceding steps. After the pitch periods of all subframes of the current speech frame are obtained, the pitch period of the current speech frame can be determined; the detailed process is the same as in the preceding steps and is not repeated here.
Case three: the previous valid frame is an unvoiced frame and the next valid frame is a voiced frame;
In this case it can be predicted that the periodic signal forms gradually during the packet loss. Based on the principle described above, the formation of the periodic signal can be simulated by incrementally extrapolating from the pitch period of the next valid frame, to obtain the pitch period of the current speech frame. The pitch period of the k-th subframe of the current speech frame is evaluated as follows:

pitch[subFrameCount - k - 1] = next_pitch - lossCnt * subFrameCount - k    (6)
The meaning of the parameters in formula (6) can be found in the annotations of the preceding steps. After the pitch periods of all subframes of the current speech frame are obtained, the pitch period of the current speech frame can be determined; the detailed process is the same as in the preceding steps and is not repeated here.
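Formulas (5) and (6) are mirror images: the decay case extrapolates forwards from the previous frame, the onset case backwards from the next frame. An illustrative sketch (not from the patent; names are hypothetical):

```python
def extrapolate_pitch_decay(last_pitch, loss_cnt, sub_frame_count):
    """Formula (5): voiced -> unvoiced, the pitch period slowly
    lengthens by one unit per subframe from the previous valid frame."""
    return [last_pitch + loss_cnt * sub_frame_count + k
            for k in range(sub_frame_count)]

def extrapolate_pitch_onset(next_pitch, loss_cnt, sub_frame_count):
    """Formula (6): unvoiced -> voiced, extrapolated backwards so the
    last subframe of the loss run meets the next valid frame's pitch."""
    pitch = [0] * sub_frame_count
    for k in range(sub_frame_count):
        pitch[sub_frame_count - k - 1] = next_pitch - loss_cnt * sub_frame_count - k
    return pitch
```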
Case four: the previous valid frame and the next valid frame are both unvoiced frames.
In this case it can be determined that the current speech frame is an unvoiced frame; because an unvoiced frame is not a periodic signal, the current speech frame has no pitch period.
This embodiment continues by taking the determination of the gain parameter of the current speech frame as an example, to explain how the temporal-evolution-class parameters of the current speech frame are determined from those of the previous valid frame or the next valid frame.
In a concrete implementation, this embodiment determines the gain parameter by linear interpolation; of course, the interpolation method can be chosen in practice according to the complexity, delay and effect of the different algorithms, and this embodiment places no specific restriction on this. For example, when consecutive packet loss is not very severe, polynomial interpolation can be selected to obtain the gain of the lost frames; but for this class of methods to achieve better results, more of the subsequent frames in the jitter buffer must be decoded in advance, which increases the decoding delay. According to the concrete application scenario, this embodiment provides a linear interpolation algorithm with the following formulas:
gIncr = (next_gain - last_gain) / (lostFrameCount * subFrameCount + 1)    (7)
gain[k] = last_gain + gIncr * (k + lossCnt * subFrameCount + 1)    (8)
In formula (7), next_gain is the gain parameter of the next valid frame and last_gain is the gain parameter of the previous valid frame; the evolution increment gIncr is analogous to the evolution increment of the pitch period in the above steps. Substituting the concrete parameter values into formula (7) gives the value of the evolution increment gIncr.
After the evolution increment gIncr is determined, substituting the parameter values into formula (8) gives the gain parameter of subframe k of the current speech frame.
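The two formulas above can be sketched together as follows (function and argument names are assumptions; the arithmetic follows formulas (7) and (8) directly):

```python
def interpolate_gain(last_gain, next_gain, loss_cnt,
                     lost_frame_count, sub_frame_count):
    """Sketch of formulas (7)-(8): linearly interpolate the subframe gains
    of the current lost frame between the previous valid frame's gain
    (last_gain) and the next valid frame's gain (next_gain)."""
    # (7) evolution increment per subframe across the whole loss gap
    g_incr = (next_gain - last_gain) / (lost_frame_count * sub_frame_count + 1)
    # (8) gain of each subframe k of the current lost frame
    return [last_gain + g_incr * (k + loss_cnt * sub_frame_count + 1)
            for k in range(sub_frame_count)]
```

For example, with last_gain = 0.0, next_gain = 9.0, two lost frames of four subframes each, and loss_cnt = 0, the first lost frame's gains come out as [1.0, 2.0, 3.0, 4.0], stepping evenly toward next_gain.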
Similarly, this embodiment also determines the LSP coefficients by linear interpolation, with the following formulas:
α = (lossCnt + 1) / (lostFrameCount + 1)    (9)
lsp[i] = (1-α)*last_lsp[i] + α*next_lsp[i],  i ∈ {1, 2, ..., P}    (10)
In formula (9), α is the weight coefficient for linearly interpolating between the preceding and following frames; it is determined from the total number of lost frames, lostFrameCount, passed in by the jitter buffer, and the number of frames lost so far up to the current speech frame, lossCnt. Substituting the concrete parameter values into formula (9) gives the value of the weight coefficient α.
After the weight coefficient α is determined, substituting the parameter values into formula (10) gives the i-th order LSP coefficient of the current speech frame. Assuming the current speech frame has P orders of LSP coefficients in total, calculating in the above manner finally determines the LSP coefficients of all orders of the current speech frame.
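A minimal sketch of formulas (9) and (10) (names are assumptions; each LSP order is interpolated with the same weight α):

```python
def interpolate_lsp(last_lsp, next_lsp, loss_cnt, lost_frame_count):
    """Sketch of formulas (9)-(10): weighted linear interpolation of the
    P LSP coefficients of the current lost frame between the previous
    valid frame (last_lsp) and the next valid frame (next_lsp)."""
    # (9) interpolation weight: moves from the previous frame toward the next
    alpha = (loss_cnt + 1) / (lost_frame_count + 1)
    # (10) per-order interpolation
    return [(1 - alpha) * l + alpha * n for l, n in zip(last_lsp, next_lsp)]
```

As the loss progresses (loss_cnt grows), α approaches 1 and the interpolated coefficients move from the previous valid frame's LSPs toward the next valid frame's LSPs.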
405: obtain the non-sequential-evolution class parameters of the previous valid frame and the next valid frame, and determine the non-sequential-evolution class parameters of the current speech frame according to the binary-decision class parameters and non-sequential-evolution class parameters of the previous valid frame and the next valid frame.
Specifically, the non-sequential-evolution class parameters may include, but are not limited to, LTP (Long-Term Prediction) coefficients and the excitation signal, which this embodiment does not specifically limit. This embodiment also does not specifically limit the way the non-sequential-evolution class parameters of the current speech frame are determined; this includes, but is not limited to: after the signal type is determined from the binary-decision class parameters of the previous valid frame and the next valid frame, determining the non-sequential-evolution class parameters of the current speech frame from those of the previous valid frame or the next valid frame according to the following four situations.
Situation one: both the previous valid frame and the next valid frame are voiced frames.
Taking the LTP coefficients as the first example: if the loss occurs within a burst of continuous packet loss, or the pitch periods of the previous valid frame and the next valid frame have changed abruptly, it can be inferred that the LTP coefficients of the previous and next valid frames may differ significantly. Here, continuous packet loss is judged by comparing the number of consecutively lost packets against a packet-loss threshold: if that number exceeds the threshold, the LTP coefficients of the previous and next valid frames may have changed significantly. The packet-loss threshold can be set according to actual conditions, and this embodiment does not specifically limit it. The way to judge whether the pitch periods of the previous and next valid frames have changed abruptly is the same as in the above steps and is not repeated here. Based on the above principle, two cases are distinguished:
The first case: the loss is not within a burst of continuous packet loss, and the pitch periods of the previous valid frame and the next valid frame have not changed abruptly.
In this case, the energy values of the previous and next valid frames can be determined; let the energy of the previous valid frame be Last_Energy and the energy of the next valid frame be Next_Energy. Dividing the energy of the previous valid frame by the energy of the next valid frame gives the scaling factor β. If the current speech frame is among the first half of the lost frames, multiplying the LTP coefficients of the previous valid frame by the scaling factor β gives the LTP coefficients of the current speech frame; if it is among the latter half, multiplying the LTP coefficients of the next valid frame by β gives the LTP coefficients of the current speech frame. Of course, other ways of determining the LTP coefficients of the current speech frame may also be adopted in practice, which this embodiment does not specifically limit.
The second case: the loss is within a burst of continuous packet loss, or the pitch periods of the previous valid frame and the next valid frame have changed abruptly.
Since the LTP coefficients of the previous and next valid frames have changed significantly in this case, if the current speech frame is among the first half of the lost frames, the LTP coefficients of the previous valid frame are used as the LTP coefficients of the current speech frame; if it is among the latter half, the LTP coefficients of the next valid frame are used as the LTP coefficients of the current speech frame. Of course, other ways of determining the LTP coefficients of the current speech frame may also be adopted in practice, which this embodiment does not specifically limit.
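The two cases of situation one can be sketched as follows (helper and argument names are assumptions; the half-split rule, the energy-ratio scaling factor β, and the abrupt-change fallback follow the text):

```python
def conceal_ltp_voiced(last_ltp, next_ltp, last_energy, next_energy,
                       in_first_half, abrupt_or_burst):
    """Sketch of situation one (both neighbouring valid frames voiced):
    derive the lost frame's LTP coefficients from the nearer valid frame."""
    if abrupt_or_burst:
        # second case: continuous loss burst or abrupt pitch change --
        # copy the LTP coefficients of the nearer valid frame unchanged
        return list(last_ltp if in_first_half else next_ltp)
    # first case: scale the nearer frame's LTP coefficients by
    # beta = Last_Energy / Next_Energy
    beta = last_energy / next_energy
    src = last_ltp if in_first_half else next_ltp
    return [c * beta for c in src]
```

The design choice here mirrors the text: interpolation is avoided because LTP coefficients do not evolve smoothly; the nearer frame's coefficients are reused, at most rescaled by the energy ratio.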
Situation two: the previous valid frame is a voiced frame and the next valid frame is an unvoiced frame.
Because the previous valid frame is voiced and the next valid frame is unvoiced, it can be predicted that the periodic signal decays continuously during the packet loss. The LTP coefficients of the current lost frame are therefore obtained by uniformly multiplying the LTP coefficients of the previous valid frame by a decay factor, where the decay factor can be obtained from the energy ratio of the previous and next valid frames; of course, other ways of computing the decay factor may also be adopted, which this embodiment does not specifically limit.
Situation three: the previous valid frame is an unvoiced frame and the next valid frame is a voiced frame.
Because the previous valid frame is unvoiced and the next valid frame is voiced, it can be predicted that the periodic signal strengthens continuously during the packet loss. The LTP coefficients of the current lost frame are therefore obtained by uniformly multiplying the LTP coefficients of the next valid frame by a decay factor; this embodiment does not specifically limit how the decay factor is determined.
Situation four: both the previous valid frame and the next valid frame are unvoiced frames.
Because both the previous and next valid frames are unvoiced, the current speech frame is also an unvoiced frame. It can be judged that there is no periodic signal during the packet loss, so the LTP coefficients of the current speech frame need not be determined.
This embodiment next takes the excitation signal as an example to explain the handling of non-sequential-evolution class parameters. It should be noted that the excitation signal is normally the highly random residual that remains after the speech signal has passed through long- and short-term prediction and post-processing (such as noise shaping). However, it may still contain some non-white information that the speech model cannot decompose, so simply substituting white noise does not yield good synthesis quality. At the same time, this class of parameter does not evolve smoothly over time, so it is not suitable for interpolation. Based on the above principle, this embodiment provides a method for determining the excitation signal of the current speech frame, explained as follows:
If the current speech frame is among the first half of the lost frames, the excitation signal of the previous valid frame is used as the excitation signal of the current speech frame; if it is among the latter half, the excitation signal of the next valid frame is used as the excitation signal of the current speech frame.
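The excitation rule above reduces to a simple nearest-neighbour copy; a minimal sketch (names are assumptions):

```python
def conceal_excitation(last_excitation, next_excitation, in_first_half):
    """Sketch of the excitation handling above: since the excitation is a
    high-randomness residual that neither white-noise substitution nor
    interpolation reproduces well, the lost frame simply reuses the
    excitation of the nearer valid frame."""
    return list(last_excitation if in_first_half else next_excitation)
```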
406: decode the current speech frame according to its binary-decision class parameters, sequential-evolution class parameters, and non-sequential-evolution class parameters.
Specifically, after the binary-decision, sequential-evolution, and non-sequential-evolution class parameters of the current speech frame have been obtained through the above steps, they are delivered to the decoder, whose structure is shown in Fig. 5, and decoded by the decoder. The decoding algorithm can be determined from the encoding algorithm, which this embodiment does not specifically limit; after the decoder finishes decoding, the speech signal of the current speech frame is obtained.
407: obtain the parameters of the previous valid frame of the current speech frame, extrapolate from them to obtain the parameters of the current speech frame, and decode the current speech frame according to its parameters.
Specifically, since it has been judged that the current speech frame has no next valid frame, after the parameters of the previous valid frame are obtained, the binary-decision class parameters of the current speech frame can be obtained in, but not limited to, two situations. In the first situation, the previous valid frame is determined to be an unvoiced frame from its binary-decision class parameters; the signal type of the current speech frame can then be extrapolated to be unvoiced. In the second situation, the previous valid frame is determined to be a voiced frame; the pitch period and gain parameter of the previous valid frame are then obtained, and the speech signal of the previous valid frame is allowed to decay slowly at a certain rate, so that the pitch period slowly lengthens and the gain parameter slowly decreases. On reaching the current speech frame, if the pitch period of the current speech frame exceeds a preset pitch-period threshold, or the gain parameter falls below a preset gain-parameter threshold, the signal type of the current speech frame is determined to be unvoiced; otherwise, it is determined to be voiced.
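The type decision of step 407 can be sketched as follows (the threshold and argument names are assumptions; the fade itself, i.e. how pitch and gain evolve per frame, is left to the implementation as the text does):

```python
def extrapolate_signal_type(prev_is_voiced, pitch, gain,
                            pitch_threshold, gain_threshold):
    """Sketch of step 407's signal-type decision when no next valid frame
    is buffered. An unvoiced previous frame stays unvoiced; a voiced one
    is faded until its pitch period grows past, or its gain falls below,
    a preset threshold, at which point the frame is declared unvoiced."""
    if not prev_is_voiced:
        return "unvoiced"
    if pitch > pitch_threshold or gain < gain_threshold:
        return "unvoiced"
    return "voiced"
```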
The decay rate of the speech signal of the previous valid frame can be set from historical experience; of course, other determination methods may also be adopted, which this embodiment does not specifically limit. The preset thresholds of the pitch period and the gain parameter can likewise be set from experience, and this embodiment does not specifically limit how they are set either.
Further, this embodiment also does not specifically limit the way the sequential-evolution class parameters of the current speech frame are determined; this includes, but is not limited to, determining them according to the signal type of the previous valid frame, which can be divided into the following two situations:
Situation one: the previous valid frame is an unvoiced frame.
In this case the current speech frame can be determined to be an unvoiced frame; since an unvoiced frame is not a periodic signal, the current speech frame has no pitch period. Taking the determination of the gain parameter of the current speech frame as an example, this embodiment explains how it is determined from the gain parameter of the previous valid frame.
The speech signal of the unvoiced frame decays at a certain rate, and the gain parameter correspondingly decreases slowly; on reaching the current speech frame, the gain parameter at that point can be used as the gain parameter of the current speech frame.
The decay rate of the speech signal of the previous valid frame can be set from historical experience; of course, other determination methods may also be adopted, which this embodiment does not specifically limit.
Situation two: the previous valid frame is a voiced frame.
In this case the speech signal of the previous valid frame can be decayed at a certain rate: the pitch period slowly increases, the gain parameter correspondingly decreases slowly, and the formants of the corresponding LSP coefficients weaken gradually through bandwidth expansion. On reaching the current speech frame, according to the decision procedure in the above step 403, if the current speech frame is still a voiced frame, the pitch period, gain parameter, LSP coefficients, and other parameters at that point can be used as the sequential-evolution class parameters of the current speech frame.
Further, the non-sequential-evolution class parameters of the current speech frame are determined according to the binary-decision class parameters and the non-sequential-evolution class parameters of the previous valid frame.
Specifically, if the current speech frame is determined to be a voiced frame from its binary-decision class parameters, taking the LTP coefficients as an example of a non-sequential-evolution class parameter: the LTP coefficients of the previous valid frame are obtained and multiplied by a scaling factor to serve as the LTP coefficients of the current speech frame. The scaling factor can be obtained from experience and weakens frame by frame, which this embodiment does not specifically limit. If the current speech frame is an unvoiced frame, taking the excitation signal as an example, the excitation signal of the current speech frame can be taken from a lower-energy portion of the previous valid frame, which this embodiment does not specifically limit.
The current speech frame is then decoded according to its binary-decision class parameters, sequential-evolution class parameters, and non-sequential-evolution class parameters; the process of obtaining the speech signal of the current speech frame is the same as in the above step 406 and is not repeated here.
With the method provided by this embodiment, when the current speech frame to be decoded is determined to be lost, the parameters of the previous valid frame, or of both the previous and next valid frames, are obtained; the parameters of the current speech frame are determined from them according to the specific situation, and the lost speech frame is then decoded normally according to those parameters. Because the normal operation of the decoder under packet loss is simulated, decoding remains continuous; thus, when voice packets are lost during transmission, decoding can proceed according to the parameters determined for the lost frames, thereby improving the decoded voice quality.
Embodiment three
An embodiment of the present invention provides a packet-loss concealment device of the parameter domain, which is used to carry out the packet-loss concealment method of the parameter domain provided in embodiment one or embodiment two. Referring to Fig. 6, the device comprises:
a determination module 601, configured to determine whether a current speech frame to be decoded is lost;
a previous-frame acquisition module 602, configured to obtain the parameters of the previous valid frame of the current speech frame when the current speech frame is lost;
a current-frame determination module 603, configured to determine the parameters of the current speech frame according to the parameters of the previous valid frame;
a decoding module 604, configured to decode the current speech frame according to the parameters of the current speech frame.
As a preferred embodiment, referring to Fig. 7, the device further comprises:
a judgment module 605, configured to judge whether a next valid frame of the current speech frame is buffered;
a next-frame acquisition module 606, configured to obtain the parameters of the next valid frame when the next valid frame is buffered;
the current-frame determination module 603 being configured to determine the parameters of the current speech frame according to the parameters of the previous valid frame and the parameters of the next valid frame.
As a preferred embodiment, the parameters of the previous valid frame and the parameters of the next valid frame comprise binary-decision class parameters; the binary-decision class parameters are used to judge the signal type, and the signal type comprises two types, unvoiced and voiced.
The current-frame determination module 603 is configured to determine that the signal type of the current speech frame is voiced when at least one of the binary-decision class parameters of the previous valid frame and the next valid frame judges the signal type to be a voiced frame;
or to determine that the signal type of the current speech frame is unvoiced when the binary-decision class parameters of the previous valid frame and the next valid frame both judge the signal type to be an unvoiced frame.
As a preferred embodiment, the parameters of the previous valid frame and the parameters of the next valid frame further comprise sequential-evolution class parameters, the sequential-evolution class parameters at least comprising the pitch period;
the current-frame determination module 603 is further configured to determine the sequential-evolution class parameters of the current speech frame according to the binary-decision class parameters and sequential-evolution class parameters of the previous valid frame and the next valid frame.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the signal types of the previous valid frame and the next valid frame are both determined to be voiced from their binary-decision class parameters, and the pitch periods of the previous and next valid frames are determined from their sequential-evolution class parameters not to have changed abruptly, perform linear interpolation on the pitch periods of the previous and next valid frames to obtain the pitch period of the current speech frame.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the signal types of the previous valid frame and the next valid frame are both determined to be voiced from their binary-decision class parameters, and the pitch periods of the previous and next valid frames are determined from their sequential-evolution class parameters to have changed abruptly, determine that the pitch period of the current speech frame equals the pitch period of the previous valid frame if the current speech frame is in the first half of all lost speech frames, and that it equals the pitch period of the next valid frame if the current speech frame is in the latter half of all lost speech frames.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the signal type of the previous valid frame is determined to be voiced and the signal type of the next valid frame to be unvoiced from their binary-decision class parameters, obtain the pitch period of the current speech frame by extrapolating from the pitch period of the previous valid frame.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the signal type of the previous valid frame is determined to be unvoiced and the signal type of the next valid frame to be voiced from their binary-decision class parameters, obtain the pitch period of the current speech frame by extrapolating from the pitch period of the next valid frame.
As a preferred embodiment, the parameters of the previous valid frame and the parameters of the next valid frame further comprise non-sequential-evolution class parameters, the non-sequential-evolution class parameters at least comprising long-term prediction (LTP) coefficients;
the current-frame determination module 603 is further configured to determine the non-sequential-evolution class parameters of the current speech frame according to the binary-decision class parameters and non-sequential-evolution class parameters of the previous valid frame and the next valid frame.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the signal types of the previous valid frame and the next valid frame are both determined to be voiced from their binary-decision class parameters, the pitch periods of the previous and next valid frames are determined from their sequential-evolution class parameters not to have changed abruptly, and the number of lost packets is less than the packet-loss threshold, obtain the LTP coefficients of the current speech frame by multiplying the LTP coefficients of the previous valid frame by the scaling factor if the current speech frame is in the first half of all lost speech frames, and by multiplying the LTP coefficients of the next valid frame by the scaling factor if the current speech frame is in the latter half of all lost speech frames.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the signal types of the previous valid frame and the next valid frame are both determined to be voiced from their binary-decision class parameters, and the pitch periods of the previous and next valid frames are determined from their sequential-evolution class parameters to have changed abruptly or the number of lost packets is greater than the packet-loss threshold, determine that the LTP coefficients of the current speech frame equal those of the previous valid frame if the current speech frame is in the first half of all lost speech frames, and that they equal those of the next valid frame if the current speech frame is in the latter half of all lost speech frames.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the signal type of the previous valid frame is determined to be voiced and the signal type of the next valid frame to be unvoiced from their binary-decision class parameters, obtain the LTP coefficients of the current speech frame by multiplying the LTP coefficients of the previous valid frame by the decay factor.
As a preferred embodiment, the current-frame determination module 603 is configured to, when the signal type of the previous valid frame is determined to be unvoiced and the signal type of the next valid frame to be voiced from their binary-decision class parameters, obtain the LTP coefficients of the current speech frame by multiplying the LTP coefficients of the next valid frame by the decay factor.
With the device provided by this embodiment, when the current speech frame to be decoded is determined to be lost, the parameters of the previous valid frame, or of both the previous and next valid frames, are obtained; the parameters of the current speech frame are determined from them according to the specific situation, and the lost speech frame is then decoded normally according to those parameters. Because the normal operation of the decoder under packet loss is simulated, decoding remains continuous; thus, when voice packets are lost during transmission, decoding can proceed according to the parameters determined for the lost frames, thereby improving the decoded voice quality.
It should be noted that when the packet-loss concealment device of the parameter domain provided by the above embodiment performs packet-loss concealment, the division into the above functional modules is only illustrative; in practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the packet-loss concealment device of the parameter domain provided by the above embodiment and the embodiments of the packet-loss concealment method of the parameter domain belong to the same conception; for the specific implementation process, refer to the method embodiments, which are not repeated here.
The sequence numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
One of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments can be completed by hardware, or by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium, and the storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (28)

1. A packet-loss concealment method of the parameter domain, characterized in that the method comprises:
determining whether a current speech frame to be decoded is lost;
if the current speech frame is lost, obtaining the parameters of the previous valid frame of the current speech frame;
obtaining the parameters of the current speech frame according to the parameters of the previous valid frame;
decoding the current speech frame according to the parameters of the current speech frame.
2. The method according to claim 1, characterized in that, before the obtaining the parameters of the current speech frame according to the parameters of the previous valid frame, the method further comprises:
judging whether a next valid frame of the current speech frame is buffered;
if the next valid frame is buffered, obtaining the parameters of the next valid frame;
the obtaining the parameters of the current speech frame according to the parameters of the previous valid frame comprising:
determining the parameters of the current speech frame according to the parameters of the previous valid frame and the parameters of the next valid frame.
3. The method according to claim 2, characterized in that the parameters of the previous valid frame and the parameters of the next valid frame comprise binary-decision class parameters; the binary-decision class parameters are used to judge the signal type, and the signal type comprises two types, unvoiced and voiced;
the determining the parameters of the current speech frame according to the parameters of the previous valid frame and the parameters of the next valid frame comprising:
determining the signal type of the current speech frame according to the binary-decision class parameters of the previous valid frame and the next valid frame, to obtain the binary-decision class parameters of the current speech frame.
4. The method according to claim 3, characterized in that the determining the signal type of the current speech frame according to the binary-decision class parameters of the previous valid frame and the next valid frame comprises:
if at least one of the binary-decision class parameters of the previous valid frame and the next valid frame judges the signal type to be a voiced frame, determining that the signal type of the current speech frame is voiced;
if the binary-decision class parameters of the previous valid frame and the next valid frame both judge the signal type to be an unvoiced frame, determining that the signal type of the current speech frame is unvoiced.
5. The method according to claim 3, characterized in that the parameters of the previous valid frame and the parameters of the next valid frame further comprise sequential-evolution class parameters, the sequential-evolution class parameters at least comprising the pitch period;
the determining the parameters of the current speech frame according to the parameters of the previous valid frame and the parameters of the next valid frame further comprising:
determining the sequential-evolution class parameters of the current speech frame according to the binary-decision class parameters and sequential-evolution class parameters of the previous valid frame and the next valid frame.
6. The method according to claim 5, characterized in that the determining the sequential-evolution class parameters of the current speech frame according to the binary-decision class parameters and sequential-evolution class parameters of the previous valid frame and the next valid frame comprises:
if the signal types of the previous valid frame and the next valid frame are both determined to be voiced from their binary-decision class parameters, and the pitch periods of the previous valid frame and the next valid frame are determined from their sequential-evolution class parameters not to have changed abruptly, performing linear interpolation on the pitch periods of the previous valid frame and the next valid frame to obtain the pitch period of the current speech frame.
7. The method according to claim 5, characterized in that the determining the sequential-evolution class parameters of the current speech frame according to the binary-decision class parameters and sequential-evolution class parameters of the previous valid frame and the next valid frame comprises:
if the signal types of the previous valid frame and the next valid frame are both determined to be voiced from their binary-decision class parameters, and the pitch periods of the previous valid frame and the next valid frame are determined from their sequential-evolution class parameters to have changed abruptly, determining that the pitch period of the current speech frame equals the pitch period of the previous valid frame if the current speech frame is in the first half of all lost speech frames, and that it equals the pitch period of the next valid frame if the current speech frame is in the latter half of all lost speech frames.
8. The method according to claim 5, wherein determining the sequential evolution class parameter of the current speech frame according to the binary decision class parameters and the sequential evolution class parameters of the previous valid frame and the subsequent valid frame comprises:
if the binary decision class parameters indicate that the signal type of the previous valid frame is a voiced frame and the signal type of the subsequent valid frame is an unvoiced frame, extrapolating from the pitch period of the previous valid frame to obtain the pitch period of the current speech frame.
9. The method according to claim 5, wherein determining the sequential evolution class parameter of the current speech frame according to the binary decision class parameters and the sequential evolution class parameters of the previous valid frame and the subsequent valid frame comprises:
if the binary decision class parameters indicate that the signal type of the previous valid frame is an unvoiced frame and the signal type of the subsequent valid frame is a voiced frame, extrapolating from the pitch period of the subsequent valid frame to obtain the pitch period of the current speech frame.
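Claims 8 and 9 both extrapolate from whichever neighbouring valid frame is voiced (the previous frame in claim 8, the subsequent frame in claim 9). A sketch; the optional per-frame drift term is an assumption, since the claims only require extrapolation from that frame's pitch period:

```python
def extrapolate_pitch(anchor_pitch, lost_index, drift_per_frame=0):
    """Extrapolate a pitch period for the lost frame at position
    lost_index (0-based) from the single voiced neighbour's pitch.
    With zero drift this degenerates to repeating the anchor pitch."""
    return anchor_pitch + drift_per_frame * (lost_index + 1)

# Zero-drift extrapolation simply holds the voiced neighbour's pitch.
concealed = [extrapolate_pitch(96, i) for i in range(3)]
```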
10. The method according to claim 2, wherein the parameters of the previous valid frame and the parameters of the subsequent valid frame further comprise non-sequential evolution class parameters, the non-sequential evolution class parameters at least comprising long-term prediction (LTP) coefficients;
and determining the parameters of the current speech frame according to the parameters of the previous valid frame and the parameters of the subsequent valid frame further comprises:
determining the non-sequential evolution class parameters of the current speech frame according to the binary decision class, sequential evolution class and non-sequential evolution class parameters of the previous valid frame and the subsequent valid frame.
11. The method according to claim 10, wherein determining the non-sequential evolution class parameters of the current speech frame according to the binary decision class, sequential evolution class and non-sequential evolution class parameters of the previous valid frame and the subsequent valid frame comprises:
if the binary decision class parameters indicate that both the previous valid frame and the subsequent valid frame are voiced frames, the sequential evolution class parameters indicate no abrupt change between their pitch periods, and the number of lost packets is less than a packet loss threshold: when the current speech frame lies in the first half of all lost speech frames, multiplying the LTP coefficients of the previous valid frame by a scaling factor to obtain the LTP coefficients of the current speech frame; when the current speech frame lies in the second half of all lost speech frames, multiplying the LTP coefficients of the subsequent valid frame by the scaling factor to obtain the LTP coefficients of the current speech frame.
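The scaling rule of claim 11 might look like this in outline. The scaling-factor value used below is an assumption; the claims do not fix it:

```python
def scale_ltp(prev_ltp, next_ltp, lost_index, num_lost, zoom=0.9):
    """Claim 11 case: both neighbours voiced, no pitch jump, and a
    short loss run. The first half of the run scales the previous
    frame's LTP coefficients, the second half scales the subsequent
    frame's. zoom=0.9 is a hypothetical default."""
    source = prev_ltp if lost_index < num_lost / 2 else next_ltp
    return [zoom * c for c in source]
```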
12. The method according to claim 10, wherein determining the non-sequential evolution class parameters of the current speech frame according to the binary decision class, sequential evolution class and non-sequential evolution class parameters of the previous valid frame and the subsequent valid frame comprises:
if the binary decision class parameters indicate that both the previous valid frame and the subsequent valid frame are voiced frames, and either the sequential evolution class parameters indicate an abrupt change between their pitch periods or the number of lost packets is greater than the packet loss threshold: when the current speech frame lies in the first half of all lost speech frames, setting the LTP coefficients of the current speech frame equal to the LTP coefficients of the previous valid frame; when the current speech frame lies in the second half of all lost speech frames, setting the LTP coefficients of the current speech frame equal to the LTP coefficients of the subsequent valid frame.
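The fallback of claim 12 simply reuses the nearer valid frame's coefficients without scaling; a sketch under the same hypothetical 0-based indexing:

```python
def copy_ltp(prev_ltp, next_ltp, lost_index, num_lost):
    """Claim 12 case: pitch jump or long loss run. Do not scale;
    just copy the LTP coefficients of the nearer valid frame
    (previous frame for the first half of the run, subsequent
    frame for the second half)."""
    return list(prev_ltp if lost_index < num_lost / 2 else next_ltp)
```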
13. The method according to claim 10, wherein determining the non-sequential evolution class parameters of the current speech frame according to the binary decision class, sequential evolution class and non-sequential evolution class parameters of the previous valid frame and the subsequent valid frame comprises:
if the binary decision class parameters indicate that the signal type of the previous valid frame is a voiced frame and the signal type of the subsequent valid frame is an unvoiced frame, multiplying the LTP coefficients of the previous valid frame by an attenuation factor to obtain the LTP coefficients of the current speech frame.
14. The method according to claim 10, wherein determining the non-sequential evolution class parameters of the current speech frame according to the binary decision class, sequential evolution class and non-sequential evolution class parameters of the previous valid frame and the subsequent valid frame comprises:
if the binary decision class parameters indicate that the signal type of the previous valid frame is an unvoiced frame and the signal type of the subsequent valid frame is a voiced frame, multiplying the LTP coefficients of the subsequent valid frame by an attenuation factor to obtain the LTP coefficients of the current speech frame.
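Claims 13 and 14 attenuate the LTP coefficients of the single voiced neighbour (the previous frame in claim 13, the subsequent frame in claim 14). In this sketch, compounding the attenuation per concealed frame and the factor value are assumptions; the claims only state multiplication by an attenuation factor:

```python
def decay_ltp(voiced_ltp, lost_index, decay=0.85):
    """Attenuate the voiced neighbour's LTP coefficients for the
    lost frame at position lost_index (0-based), compounding the
    attenuation once per concealed frame so the long-term
    prediction fades out across the loss run."""
    factor = decay ** (lost_index + 1)
    return [factor * c for c in voiced_ltp]
```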
15. A packet loss concealment device in the parameter domain, wherein the device comprises:
a determination module, configured to determine whether a current speech frame to be decoded is lost;
a previous-frame acquisition module, configured to obtain, when the current speech frame is lost, the parameters of the previous valid frame of the current speech frame;
a current-frame determination module, configured to determine the parameters of the current speech frame according to the parameters of the previous valid frame;
a decoding module, configured to decode the current speech frame according to the parameters of the current speech frame.
16. The device according to claim 15, wherein the device further comprises:
a judging module, configured to judge whether a subsequent valid frame of the current speech frame is buffered;
a subsequent-frame acquisition module, configured to obtain the parameters of the subsequent valid frame when the subsequent valid frame is buffered;
wherein the current-frame determination module is further configured to determine the parameters of the current speech frame according to the parameters of the previous valid frame and the parameters of the subsequent valid frame.
17. The device according to claim 16, wherein the parameters of the previous valid frame and the parameters of the subsequent valid frame comprise binary decision class parameters, the binary decision class parameters being used to judge the signal type, the signal type comprising two types, unvoiced and voiced;
the current-frame determination module is configured to determine the signal type of the current speech frame according to the binary decision class parameter of the previous valid frame and the binary decision class parameter of the subsequent valid frame, thereby obtaining the binary decision class parameter of the current speech frame.
18. The device according to claim 17, wherein the current-frame determination module is configured to determine that the signal type of the current speech frame is a voiced frame when either the binary decision class parameter of the previous valid frame or the binary decision class parameter of the subsequent valid frame indicates a voiced frame;
or to determine that the signal type of the current speech frame is an unvoiced frame when both binary decision class parameters indicate unvoiced frames.
19. The device according to claim 16, wherein the parameters of the previous valid frame and the parameters of the subsequent valid frame further comprise sequential evolution class parameters, the sequential evolution class parameters at least comprising a pitch period;
the current-frame determination module is further configured to determine the sequential evolution class parameter of the current speech frame according to the binary decision class parameters and the sequential evolution class parameters of the previous valid frame and the subsequent valid frame.
20. The device according to claim 19, wherein the current-frame determination module is configured, when the binary decision class parameters indicate that both the previous valid frame and the subsequent valid frame are voiced frames and the sequential evolution class parameters indicate no abrupt change between their pitch periods, to perform linear interpolation between the pitch periods of the previous valid frame and the subsequent valid frame to obtain the pitch period of the current speech frame.
21. The device according to claim 19, wherein the current-frame determination module is configured, when the binary decision class parameters indicate that both the previous valid frame and the subsequent valid frame are voiced frames and the sequential evolution class parameters indicate an abrupt change between their pitch periods, to set the pitch period of the current speech frame equal to the pitch period of the previous valid frame when the current speech frame lies in the first half of all lost speech frames, and equal to the pitch period of the subsequent valid frame when the current speech frame lies in the second half of all lost speech frames.
22. The device according to claim 19, wherein the current-frame determination module is configured, when the binary decision class parameters indicate that the signal type of the previous valid frame is a voiced frame and the signal type of the subsequent valid frame is an unvoiced frame, to extrapolate from the pitch period of the previous valid frame to obtain the pitch period of the current speech frame.
23. The device according to claim 19, wherein the current-frame determination module is configured, when the binary decision class parameters indicate that the signal type of the previous valid frame is an unvoiced frame and the signal type of the subsequent valid frame is a voiced frame, to extrapolate from the pitch period of the subsequent valid frame to obtain the pitch period of the current speech frame.
24. The device according to claim 16, wherein the parameters of the previous valid frame and the parameters of the subsequent valid frame further comprise non-sequential evolution class parameters, the non-sequential evolution class parameters at least comprising long-term prediction (LTP) coefficients;
the current-frame determination module is further configured to determine the non-sequential evolution class parameters of the current speech frame according to the binary decision class, sequential evolution class and non-sequential evolution class parameters of the previous valid frame and the subsequent valid frame.
25. The device according to claim 24, wherein the current-frame determination module is configured, when the binary decision class parameters indicate that both the previous valid frame and the subsequent valid frame are voiced frames, the sequential evolution class parameters indicate no abrupt change between their pitch periods, and the number of lost packets is less than a packet loss threshold, to multiply the LTP coefficients of the previous valid frame by a scaling factor to obtain the LTP coefficients of the current speech frame when the current speech frame lies in the first half of all lost speech frames, and to multiply the LTP coefficients of the subsequent valid frame by the scaling factor to obtain the LTP coefficients of the current speech frame when the current speech frame lies in the second half of all lost speech frames.
26. The device according to claim 24, wherein the current-frame determination module is configured, when the binary decision class parameters indicate that both the previous valid frame and the subsequent valid frame are voiced frames and either the sequential evolution class parameters indicate an abrupt change between their pitch periods or the number of lost packets is greater than the packet loss threshold, to set the LTP coefficients of the current speech frame equal to the LTP coefficients of the previous valid frame when the current speech frame lies in the first half of all lost speech frames, and equal to the LTP coefficients of the subsequent valid frame when the current speech frame lies in the second half of all lost speech frames.
27. The device according to claim 24, wherein the current-frame determination module is configured, when the binary decision class parameters indicate that the signal type of the previous valid frame is a voiced frame and the signal type of the subsequent valid frame is an unvoiced frame, to multiply the LTP coefficients of the previous valid frame by an attenuation factor to obtain the LTP coefficients of the current speech frame.
28. The device according to claim 24, wherein the current-frame determination module is configured, when the binary decision class parameters indicate that the signal type of the previous valid frame is an unvoiced frame and the signal type of the subsequent valid frame is a voiced frame, to multiply the LTP coefficients of the subsequent valid frame by an attenuation factor to obtain the LTP coefficients of the current speech frame.
CN201310741180.4A 2013-12-27 2013-12-27 Packet loss hiding method and device of parameter domain Active CN103714820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310741180.4A CN103714820B (en) 2013-12-27 2013-12-27 Packet loss hiding method and device of parameter domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310741180.4A CN103714820B (en) 2013-12-27 2013-12-27 Packet loss hiding method and device of parameter domain

Publications (2)

Publication Number Publication Date
CN103714820A true CN103714820A (en) 2014-04-09
CN103714820B CN103714820B (en) 2017-01-11

Family

ID=50407726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310741180.4A Active CN103714820B (en) 2013-12-27 2013-12-27 Packet loss hiding method and device of parameter domain

Country Status (1)

Country Link
CN (1) CN103714820B (en)


Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1338723A (en) * 1996-11-07 2002-03-06 松下电器产业株式会社 Acoustic vector generator, and acoustic encoding and decoding apparatus
CN1445941A (zh) * 2000-09-30 2003-10-01 华为技术有限公司 Method for recovering lost IP voice packets transmitted in a network
US20050091048A1 (en) * 2003-10-24 2005-04-28 Broadcom Corporation Method for packet loss and/or frame erasure concealment in a voice communication system
EP1724756A2 (en) * 2005-05-20 2006-11-22 Broadcom Corporation Packet loss concealment for block-independent speech codecs
CN101071568A (en) * 2005-11-23 2007-11-14 美国博通公司 Method and system of audio decoder
CN101155140A (en) * 2006-10-01 2008-04-02 华为技术有限公司 Method, device and system for hiding audio stream error
CN101207665A (en) * 2007-11-05 2008-06-25 华为技术有限公司 Method and apparatus for obtaining attenuation factor
CN101325631A (zh) * 2007-06-14 2008-12-17 华为技术有限公司 Method and apparatus for implementing packet loss concealment
CN101437009A (zh) * 2007-11-15 2009-05-20 华为技术有限公司 Packet loss concealment method and system
CN101588341A (en) * 2008-05-22 2009-11-25 华为技术有限公司 Lost frame hiding method and device thereof
CN101616059A (zh) * 2008-06-27 2009-12-30 华为技术有限公司 Packet loss concealment method and apparatus
CN101833954A (en) * 2007-06-14 2010-09-15 华为终端有限公司 Method and device for realizing packet loss concealment
CN103050121A (en) * 2012-12-31 2013-04-17 北京迅光达通信技术有限公司 Linear prediction speech coding method and speech synthesis method


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106251875A (zh) * 2016-08-12 2016-12-21 广州市百果园网络科技有限公司 Frame loss compensation method and terminal
WO2019000178A1 (en) * 2017-06-26 2019-01-03 华为技术有限公司 Frame loss compensation method and device
CN109496333A (zh) * 2017-06-26 2019-03-19 华为技术有限公司 Frame loss compensation method and device
CN113539278A (en) * 2020-04-09 2021-10-22 同响科技股份有限公司 Audio data reconstruction method and system
CN113539278B (en) * 2020-04-09 2024-01-19 同响科技股份有限公司 Audio data reconstruction method and system
CN111554308A (en) * 2020-05-15 2020-08-18 腾讯科技(深圳)有限公司 Voice processing method, device, equipment and storage medium
CN111554308B (en) * 2020-05-15 2024-10-15 腾讯科技(深圳)有限公司 Voice processing method, device, equipment and storage medium
CN112489665A (en) * 2020-11-11 2021-03-12 北京融讯科创技术有限公司 Voice processing method and device and electronic equipment
CN112489665B (en) * 2020-11-11 2024-02-23 北京融讯科创技术有限公司 Voice processing method and device and electronic equipment
WO2022228144A1 (en) * 2021-04-30 2022-11-03 腾讯科技(深圳)有限公司 Audio signal enhancement method and apparatus, computer device, storage medium, and computer program product

Also Published As

Publication number Publication date
CN103714820B (en) 2017-01-11

Similar Documents

Publication Publication Date Title
CN103714820A (en) Packet loss hiding method and device of parameter domain
CN1983909B (zh) Method and device for concealing discarded frames
CN102449690B (en) Systems and methods for reconstructing an erased speech frame
US9978400B2 (en) Method and apparatus for frame loss concealment in transform domain
CN104347076B (en) Network audio packet loss covering method and device
CN101894558A (en) Lost frame recovering method and equipment as well as speech enhancing method, equipment and system
WO2017166800A1 (en) Frame loss compensation processing method and device
CN110931025A (en) Apparatus and method for improved concealment of adaptive codebooks in ACELP-like concealment with improved pulse resynchronization
CN110444224B (en) Voice processing method and device based on generative countermeasure network
CN105408954B (zh) Apparatus and method for improved concealment of adaptive codebooks in ACELP-like concealment with improved pitch lag estimation
CN1367918A (en) Methods and apparatus for generating comfort noise using parametric noise model statistics
AU2018253632B2 (en) Audio coding method and related apparatus
CN109496333A (zh) Frame loss compensation method and device
KR101409305B1 (en) Attenuation of overvoicing, in particular for generating an excitation at a decoder, in the absence of information
US5696873A (en) Vocoder system and method for performing pitch estimation using an adaptive correlation sample window
CN103456307B (zh) Spectrum replacement method and system for frame error concealment in an audio decoder
CN106251875A (zh) Frame loss compensation method and terminal
CN103915097A (en) Voice signal processing method, device and system
TR201909562T4 (en) Methods and devices for DTX residue in audio coding.
Gueham et al. Packet loss concealment method based on interpolation in packet voice coding
CN102903364A (en) Method and device for adaptive discontinuous voice transmission
CN106898356A (zh) Packet loss concealment method and device for Bluetooth voice calls, and Bluetooth voice processing chip
KR20120088297A (en) APPARATUS, METHOD AND RECORDING DEVICE FOR PREDICTION VoIP BASED SPEECH TRANSMISSION QUALITY USING EXTENDED E-MODEL
WO2009064824A1 (en) Method and apparatus for generating fill frames for voice over internet protocol (voip) applications
Rodbro et al. Time-scaling of sinusoids for intelligent jitter buffer in packet based telephony

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 511446 28th floor, Block B1, Wanda Plaza, Wanbo Business District, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Applicant after: Guangzhou Huaduo Network Technology Co., Ltd.

Address before: 510655 Building 3-08, Creative Industrial Park, No. 2 Huangpu Avenue, Guangzhou

Applicant before: Guangzhou Huaduo Network Technology Co., Ltd.

COR Change of bibliographic data
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210113

Address after: 511442 3108, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Patentee after: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 511446 28th floor, block B1, Wanda Plaza, Wanbo business district, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Patentee before: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20140409

Assignee: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

Assignor: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Contract record no.: X2021440000053

Denomination of invention: Packet loss hiding method and device in parameter domain

Granted publication date: 20170111

License type: Common License

Record date: 20210208