CN100525281C

CN100525281C - Method for dynamically adjusting the jitter buffer during voice transmission

Info

Publication number: CN100525281C
Application number: CNB200310121961XA
Authority: CN
Inventors: 王麒; 樊荣
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2003-12-09
Filing date: 2003-12-09
Publication date: 2009-08-05
Anticipated expiration: 2023-12-09
Also published as: CN1627747A

Abstract

The method tracks the average delay and average jitter of the network in real time so that they accurately reflect current network conditions, and it adjusts the jitter buffer (JB) depth in real time according to those statistics as the network changes. The anti-jitter mechanism of the receiving gateway can thus adapt dynamically to a varying IP network, which reduces the packet loss rate during voice transmission. Test results show that, compared with a static JB scheme, the invention provides a better anti-jitter effect during voice transmission and improves VoIP voice quality when network quality is relatively poor.

Description

Method for dynamically adjusting the jitter buffer during voice transmission

Technical field

The present invention relates to the field of voice transmission technology, and in particular to a method for dynamically adjusting the jitter buffer during voice transmission.

Background technology

With the growing popularity of the Internet and the continuing maturation of its technology, it has become possible to carry traditional voice traffic over the Internet, and the various technologies of VoIP (Voice over Internet Protocol) services have emerged accordingly. The main problems facing voice packet transmission over today's IP (Internet Protocol) networks are the delay, jitter, packet loss, and out-of-order delivery inherent in IP networks, all of which affect VoIP voice quality. Network jitter here means variation in delay.

In VoIP applications, the DSP (digital signal processing) module of the sending gateway sends encoded and compressed voice packets at a fixed interval, such as 10 ms, 20 ms, or 30 ms. The packets pick up jitter as they cross the IP network, so by the time they reach the receiving gateway the interval between adjacent packets has changed and is no longer the fixed interval used at transmission. If the receiving gateway played these jittered packets directly as they arrived, the listener would not hear what the speaker actually said, and call quality would suffer.

At present, network jitter is handled by adding an anti-jitter mechanism at the receiving gateway, namely a JB (jitter buffer). The JB buffers received voice packets for a certain time so that the receiving gateway can play them at the same interval at which they were sent. Adding a certain delay at the receiving end in this way removes the jitter introduced by the IP network. As shown in Figure 1, the JB is usually located at the receiving gateway.

Before the JB mechanism used in the prior art is introduced, two key JB concepts need to be explained: JB depth and JB length. The JB depth is the difference between the time the first packet arrives in the JB queue and the time that packet is played (that is, leaves the JB queue). Once the playout time of the first packet is fixed, the playout times of all subsequent packets are fixed as well, so the JB depth describes the initial delay introduced by the JB queue. The JB length is the physical length of the JB queue, that is, the time span corresponding to the number of voice packets the queue can hold in a given implementation; it reflects the queue's capacity to hold packets that arrive early. The JB length should therefore be set greater than or equal to the JB depth.

The technical scheme currently used to solve the jitter problem is mainly the static JB mechanism, in which the JB depth remains unchanged for the whole call. As shown in Figure 2, the staircase curve of the sender shows that the sender transmits packets at equal intervals. Because of jitter in the network, the packets no longer form an equally spaced staircase curve when they reach the receiver. The static JB scheme determines the playout time of the first packet from the arrival time t₁ of the first packet and the preset JB depth, which fixes the dashed line in the figure. After that the JB depth no longer changes, and subsequent packets are played at the times given by the dashed line. A packet whose arrival point falls to the left of the dashed line has arrived early; one that falls to the right is late, meaning it has missed the time at which it should have been played, and is treated as lost. For example, the dashed line t₂ in the figure corresponds to a JB depth of (t₂ - t₁); this fixed depth causes the subsequent 6th, 7th, and 8th packets to be lost. The dashed line t₃ corresponds to a JB depth of (t₃ - t₁); this depth causes no packet loss, but it imposes an unnecessarily large delay on the first 5 packets.
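The static scheme described above can be illustrated with a minimal sketch. The function name `static_playout` and the sample times are hypothetical and are not taken from the patent; the sketch only assumes that the first packet is played one fixed depth after it arrives and that later packets keep their sending-side spacing.

```python
def static_playout(send_times, arrival_times, depth):
    """Return (playout_times, lost_indices) for a fixed-depth jitter buffer."""
    # The first packet is played `depth` after it arrives; this fixes the
    # "dashed line" for the rest of the call.
    first_playout = arrival_times[0] + depth
    playout_times = []
    lost = []
    for k, (ts, arrival) in enumerate(zip(send_times, arrival_times)):
        # Later packets keep their sending-side spacing relative to the first.
        playout = first_playout + (ts - send_times[0])
        playout_times.append(playout)
        if arrival > playout:
            # Arrived after its playout time: treated as lost.
            lost.append(k)
    return playout_times, lost


# Illustrative times in milliseconds (made up for this sketch).
send = [0, 20, 40, 60, 80]
arrive = [50, 72, 95, 150, 135]
shallow = static_playout(send, arrive, depth=20)   # loses the late packet 3
deep = static_playout(send, arrive, depth=60)      # no loss, but 40 ms more delay
```

With the shallow depth the fourth packet misses its playout time, while the deeper setting avoids the loss at the cost of extra delay for every packet, which is the trade-off the text attributes to a fixed depth.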

In the scheme above, the delay a voice packet experiences in the IP network changes constantly with network conditions, and this change cannot be predicted. A static JB scheme with a fixed JB depth therefore cannot achieve a good anti-jitter effect. As Figure 2 shows, if the fixed JB depth is set too small, the loss rate increases; if it is set too large, unnecessary delay is added. Packet loss and delay are precisely the two key parameters for measuring VoIP voice quality, and an increase in either reduces that quality.

Another shortcoming of the static JB scheme is that the playout times of subsequent voice packets depend on the first voice packet. The dashed line in Figure 2 is determined by the arrival time of the first packet and the fixed JB depth, and the arrival time of the first packet is itself random, so an early or late first packet affects the performance of the static JB scheme.

Summary of the invention

In view of the above shortcomings of the prior art, the purpose of the present invention is to provide a method for dynamically adjusting the jitter buffer during voice transmission, so that the anti-jitter mechanism of the receiving gateway adapts dynamically to a varying IP network.

The purpose of the present invention is achieved through the following technical solution:

The invention provides a method for dynamically adjusting the jitter buffer during voice transmission, the method comprising:

A. Measuring and calculating the delay and jitter during voice transmission, where the delay is the time a voice data packet takes to travel from the sender to the receiver and the jitter is the variation in the delay of each voice data packet, specifically comprising:

measuring and calculating the average delay and average jitter during voice transmission as the delay and jitter, where the average delay $d_i$ and the corresponding average jitter $v_i$ are, respectively, $d_i = \alpha d_{i-1} + (1-\alpha) n_i$ and $v_i = \alpha v_{i-1} + (1-\alpha)\,|d_i - n_i|$, where $\alpha$ is a weighting factor obtained by testing, with $0 \le \alpha \le 1$, $i$ is the sequence number of the voice data packet, and $n_i$ is the total delay of the $i$-th voice data packet in the network; or, measuring and calculating the minimum delay and average jitter during voice transmission as the delay and jitter, where the minimum delay $d_{\min}$ and the corresponding average jitter $v_i$ are, respectively, $d_{\min} = \min(d_{\min}, n_i)$ and $v_i = \alpha v_{i-1} + (1-\alpha)(n_i - d_{\min})$;

B. Adjusting the depth of the jitter buffer at the receiving end according to the delay and jitter, this step specifically comprising:

B1. Calculating the actual playout time of the current voice data packet from the current system time, called the old playout time, and calculating the playout time of the current voice data packet from the calculated average delay and average jitter, or from the calculated minimum delay and average jitter, called the new playout time; wherein, if the calculation is based on the average delay and average jitter, the new playout time of the current first voice data packet is $p_i = ts_i + d_i + \gamma v_i$ and the new playout time of a subsequent voice data packet is $p_j = p_i + ts_j - ts_i$, where $i$ is the sequence number of the first voice data packet of the current voice segment, $j$ is the sequence number of a subsequent voice data packet, $\gamma$ is the multiplication factor applied to the average jitter, and $ts_i$ and $ts_j$ are the times at which the sender sent the $i$-th and $j$-th voice data packets; if the calculation is based on the minimum delay $d_{\min}$ and average jitter, the new playout time of the current voice data packet is $p_i = ts_i + d_{\min} + \gamma v_i$ and the new playout time of a subsequent voice data packet is $p_j = p_i + ts_j - ts_i$;

B2. Adjusting the depth of the jitter buffer queue at the receiving end according to the old playout time and the new playout time.

Step A further comprises:

A1. Receiving a voice data packet and calculating the average delay and average jitter values for that packet;

A2. Splitting the voice data packet into several packets whose packetization time equals the base unit, where the base unit is the packetization time of each element in the JB (jitter buffer) queue;

A3. Adding the split packets to the jitter buffer queue while counting the number of lost packets, and restarting the JB queue when the number of lost packets exceeds a set value.

Step B2 comprises:

When voice data packet loss occurs during voice transmission, or during a silence period, adjusting the depth of the jitter buffer queue at the receiving end according to the old playout time and the new playout time of the voice data packet.

Step B2 comprises:

B21. Determining the starting point of the corresponding silence segment during voice transmission;

B22. At the starting point of the silence segment, adjusting the depth of the jitter buffer queue at the receiving end according to the old playout time and the new playout time.

Step B22 comprises:

B221. Calculating the difference between the new playout time and the old playout time and determining whether the absolute value of this difference is greater than the maximum allowed adjustment amplitude; if it is, the maximum allowed adjustment amplitude is used as the amplitude of this JB adjustment; otherwise, the absolute value of the difference is used as the amplitude of this JB adjustment;

B222. Adjusting the depth of the JB according to the amplitude determined for this JB adjustment.

Step B222 comprises:

According to the amplitude determined for this JB adjustment, adjusting the JB depth by moving the dequeue pointer of the JB, including the operation of increasing the JB depth and the operation of shortening the JB depth.

The present invention further provides that, while the operation of shortening the JB depth is being carried out, if a voice data packet is encountered, the shortening operation is stopped.

In the present invention, step B further comprises:

Determining whether the interval between the time of this jitter buffer depth adjustment and the time of the previous one is greater than the configured minimum interval allowed between jitter buffer depth adjustments; if it is, the depth of the jitter buffer at the receiving end is adjusted according to the delay and jitter; otherwise, no depth adjustment is made.

As the technical solution above shows, the present invention tracks the average delay and average jitter of the network in real time so that they accurately reflect current network conditions, and adjusts the JB depth in real time according to the average delay and average jitter of the constantly changing network. The anti-jitter mechanism of the receiving gateway thus adapts dynamically to a varying IP network, which reduces packet loss during voice transmission. The invention overcomes the problem of the original static JB scheme, in which the JB cannot adapt flexibly to changing network conditions and the packet loss rate rises as a result. With the present invention, a suitable dynamic JB adjustment scheme can be selected flexibly at the receiving gateway according to actual needs, effectively improving voice transmission in the network. Test results in the corresponding product show that the invention achieves a good anti-jitter effect during voice transmission and, in particular, significantly improves VoIP voice quality when network quality is poor.

Description of drawings

Fig. 1 is a schematic diagram of the network environment in which the JB is used;

Fig. 2 is a schematic diagram of the sending and receiving delays of voice data packets under a static JB;

Fig. 3 is a schematic diagram of the meaning of each variable in the present invention;

Fig. 4 is a schematic diagram of the random adjustment mode of the present invention;

Fig. 5 is a schematic diagram of the mode of adjusting during silence periods of the present invention;

Fig. 6 is a flow chart of the enqueue processing of the present invention;

Fig. 7A, 7B, and 7C are flow charts of the dequeue processing of the present invention;

Fig. 8 is a flow chart of the dynamic JB depth adjustment processing of the present invention.

Embodiment

The present invention proposes a method for dynamically adjusting the jitter buffer during voice transmission. First, in contrast to the existing static JB, the method introduces the concept of a dynamically adjusted JB. Second, it adjusts the depth of the receiving-end JB dynamically based on the delay and jitter that arise during voice transmission; that is, it takes into account the different demands that both jitter and delay place on the JB depth and adjusts accordingly, which effectively improves voice transmission quality. For IP networks whose delay variation cannot be predicted, a dynamic JB achieves a good anti-jitter effect and reduces unnecessary packet loss and delay. If the adjustment amplitude of the JB depth were based on the change in average jitter alone, then when the average delay of the network increased while the average jitter stayed constant, no adjustment would be made, even though the increased average delay in fact requires a deeper JB. Relying on average jitter alone, without introducing the average delay, therefore cannot adjust the JB depth accurately enough to adapt to different network transmission conditions.

In the present invention, the average delay and average jitter of the current network can be calculated in real time, and the JB depth adjusted dynamically according to these real-time statistics, so that the JB depth adapts to current network conditions, packet loss and delay are reduced, and the voice quality of VoIP (IP-based voice service) is improved. With the present invention, the JB depth no longer depends only on the arrival time of the first voice data packet: if the first voice data packet arrives early or late and leaves the JB depth unsuited to later needs, the subsequent dynamic adjustment process can compensate.

The method of the present invention for dynamically adjusting the jitter buffer during voice transmission comprises the following processing:

Step 1: Measure and calculate, in real time, the delay and jitter during voice transmission, so that the amplitude by which the JB depth needs to be adjusted can be determined from them;

The amplitude of each JB depth adjustment must be measured accurately so that the adjusted JB adapts well to the prevailing network transmission conditions, reduces the loss of voice data packets, and preserves voice quality. To this end, the JB depth can be adjusted according to the average delay and average jitter during voice transmission, or according to the minimum delay and average jitter during voice transmission. How each of these values is calculated is explained below:

(1) Network conditions are tracked in real time and the playout time of each packet is calculated. For each received voice data packet, the average delay $d_i$ and average jitter $v_i$ of the current network are calculated with the formulas below, where $n_i$ is the total delay of the $i$-th voice data packet in the network, $a_i$ is the time at which the $i$-th voice data packet arrives at the receiving gateway (taken from the system time of the receiving gateway), $ts_i$ is the time at which the $i$-th voice data packet was sent (taken from the DSP sampling time of the sending gateway), and $\alpha$ is an adjustable weighting factor:

$n_i = a_i - ts_i$ (formula 1)

$d_i = \alpha d_{i-1} + (1-\alpha) n_i$ (formula 2)

$v_i = \alpha v_{i-1} + (1-\alpha)\,|d_i - n_i|$ (formula 3)

where $0 \le \alpha \le 1$.

The difference between the time at which a voice data packet arrives at the receiving gateway and its timestamp is the relative delay $n_i$ of that packet in the network. Clock synchronization between the sending gateway and the receiving gateway is not considered here: a dynamic JB is concerned with the jitter in the network, that is, the variation in delay, and not with the absolute delay, so it is sufficient for the clocks of the two gateways to stay synchronized relative to each other. In the formulas above, adjusting $\alpha$ sets how strongly the value calculated for the current packet influences the mean; in an implementation, the optimal weight can be chosen according to actual network conditions.
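Formulas 1 to 3 amount to an exponentially weighted running average. The sketch below is one possible reading, not the patented implementation: the class name `DelayJitterEstimator`, the helper `ewma_shift`, and the choice to seed the average with the first sample are all assumptions made for illustration. The integer helper relies only on the arithmetic fact that when $1-\alpha$ is a reciprocal power of two the update reduces to a shift.

```python
class DelayJitterEstimator:
    """Running average delay d_i and average jitter v_i (formulas 1 to 3)."""

    def __init__(self, alpha=0.75):
        self.alpha = alpha   # weighting factor, 0 <= alpha <= 1
        self.d = None        # average delay d_i; None until the first packet
        self.v = 0.0         # average jitter v_i

    def update(self, arrival, timestamp):
        n = arrival - timestamp                                  # formula 1
        if self.d is None:
            # Assumption: seed the average with the first sample.
            self.d = float(n)
        else:
            self.d = self.alpha * self.d + (1 - self.alpha) * n  # formula 2
        # Formula 3 uses the freshly updated d_i.
        self.v = self.alpha * self.v + (1 - self.alpha) * abs(self.d - n)
        return n, self.d, self.v


def ewma_shift(prev, sample, shift=2):
    """Integer form of formula 2 for alpha = 1 - 2**-shift (shift=2 gives 0.75)."""
    return prev + ((sample - prev) >> shift)


est = DelayJitterEstimator(alpha=0.75)
est.update(arrival=100, timestamp=0)    # (100, 100.0, 0.0)
est.update(arrival=140, timestamp=20)   # (120, 105.0, 3.75)
```

A larger `alpha` makes the averages react more slowly to a single packet, which matches the statement that the weight controls how much the current sample influences the mean.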

(2) When network statistics are collected for each voice data packet, the minimum delay $d_{\min}$ and average jitter $v_i$ are calculated with the following formulas:

$n_i = a_i - ts_i$ (formula 4)

$d_{\min} = \min(d_{\min}, n_i)$ (formula 5)

$v_i = \alpha v_{i-1} + (1-\alpha)(n_i - d_{\min})$ (formula 6)

Here $n_i$ is the delay of the $i$-th voice data packet in the network, $d_{\min}$ is the minimum delay, which records the smallest network delay observed, and $v_i$ is the average jitter. Unlike the calculation in (1), the jitter of the $i$-th voice data packet is obtained simply by subtracting the minimum delay from the network delay of that packet, so no absolute-value operation is needed. $\alpha$ is again an adjustable weighting factor.
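The minimum-delay variant of formulas 4 to 6 can be sketched in the same style. The class name `MinDelayEstimator` and the initial jitter value of zero are assumptions for illustration only.

```python
class MinDelayEstimator:
    """Minimum delay d_min and average jitter v_i (formulas 4 to 6)."""

    def __init__(self, alpha=0.75):
        self.alpha = alpha
        self.d_min = None   # smallest network delay seen so far
        self.v = 0.0        # average jitter (assumed to start at zero)

    def update(self, arrival, timestamp):
        n = arrival - timestamp                                   # formula 4
        # Formula 5: only a comparison is needed to track the minimum.
        self.d_min = n if self.d_min is None else min(self.d_min, n)
        # Formula 6: n - d_min is never negative, so no abs() is required.
        self.v = self.alpha * self.v + (1 - self.alpha) * (n - self.d_min)
        return n, self.d_min, self.v


est = MinDelayEstimator(alpha=0.75)
est.update(arrival=100, timestamp=0)    # (100, 100, 0.0)
est.update(arrival=140, timestamp=20)   # (120, 100, 5.0)
est.update(arrival=130, timestamp=40)   # (90, 90, 3.75)
```

Because `d_min` is updated before the jitter term is computed, the difference `n - d_min` is always at least zero, which is why the absolute value can be dropped.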

Step 2: Calculate the playout time of each voice data packet, that is, the new playout time, from the average delay and average jitter, or from the minimum delay and average jitter; at the same time, calculate the actual playout time of the voice data packet, that is, the old playout time;

That is, the playout time of each voice data packet is calculated on the basis of the real-time average delay and average jitter, or the minimum delay and average jitter, of voice transmission in the network, while the actual playout time of the current voice data packet is calculated on the basis of the current system time.

If the playout time of the voice data packets is calculated from the average delay and average jitter, the following formulas can be used:

$p_i = ts_i + d_i + \gamma v_i$ (formula 7)

$p_j = p_i + ts_j - ts_i$ (formula 8)

Here $i$ is the sequence number of the first voice data packet of the current voice segment and $p_i$ is the new playout time of that first packet; $j$ is the sequence number of a subsequent voice data packet of the current voice segment and $p_j$ is the new playout time of that subsequent packet; $\gamma$ is the multiplication factor applied to the average jitter, and in an implementation the optimal factor can be chosen according to actual network conditions;
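As a worked illustration of the playout-time calculation, the sketch below computes the new playout time of the first packet of a voice segment as its send time plus the delay estimate plus the jitter scaled by the multiplication factor, and then offsets later packets by their send-time spacing. The function name `playout_schedule` and the sample values, including the multiplication factor of 4, are hypothetical; the text leaves the factor to be tuned per network.

```python
def playout_schedule(send_times, base_delay, jitter, gamma):
    """New playout times for one voice segment.

    send_times: sender timestamps ts_i, ts_j, ... of the segment's packets
    base_delay: the average delay d_i (or the minimum delay for that variant)
    jitter:     the average jitter v_i
    gamma:      multiplication factor applied to the jitter
    """
    # First packet: send time + delay estimate + gamma * jitter.
    p_first = send_times[0] + base_delay + gamma * jitter
    # Later packets keep their spacing relative to the first packet.
    return [p_first + (ts - send_times[0]) for ts in send_times]


# Illustrative values in milliseconds; gamma=4.0 is only an example.
schedule = playout_schedule([1000, 1020, 1040], base_delay=105.0,
                            jitter=3.75, gamma=4.0)
# schedule == [1120.0, 1140.0, 1160.0]
```

Passing the minimum delay as `base_delay` gives the second variant of the calculation without changing the structure.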

If the playout time of the voice data packets is calculated from the minimum delay and average jitter, the following formulas can be used to calculate the playout time of each voice data packet:

$p_i = ts_i + d_{\min} + \gamma v_i$ (formula 9)

$p_j = p_i + ts_j - ts_i$ (formula 10)

Here $i$ is the sequence number of the first voice data packet of the current voice segment and $p_i$ is the new playout time of that first packet; $j$ is the sequence number of a subsequent voice data packet of the current voice segment and $p_j$ is the new playout time of that subsequent packet; $\gamma$ is the multiplication factor applied to the average jitter, and in an implementation the optimal factor can be chosen according to actual network conditions. The difference from the previous case is that $p_i$ is calculated from the minimum delay and average jitter rather than from the average delay;

When the playout time of the voice data packets is calculated from the minimum delay and average jitter, the amount of computation is reduced, because only a comparison is needed for the minimum delay and no absolute-value operation is needed for the average jitter. However, adjusting the JB depth according to playout times calculated this way may cause the JB depth to keep growing, because the calculated $p_i$ tends to be too large. For example, suppose the minimum delay is $d_{\min}$ when the network is in its normal state, and some time later a router failure on a critical path increases the network delay for a sustained period. If the $d_{\min}$ recorded under normal conditions is still used, the average jitter increases and, after amplification by the factor $\gamma$, every calculated $p_i$ ends up too large. If this method is used, the JB depth must therefore be prevented from growing continuously, for example by forcing an update of $d_{\min}$ whenever its value has not been updated within a set period.
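One way to force an update of the recorded minimum delay is sketched below. The text does not say what value the forced update should take, so resetting the minimum to the current sample is an assumption of this sketch, as are the class name `RefreshingMinDelay` and the `max_age` parameter.

```python
class RefreshingMinDelay:
    """Track the minimum delay, forcing a refresh when the value goes stale."""

    def __init__(self, max_age):
        self.max_age = max_age     # longest time d_min may go without an update
        self.d_min = None
        self.last_update = None    # time at which d_min last changed

    def update(self, n, now):
        stale = (self.last_update is not None
                 and now - self.last_update >= self.max_age)
        if self.d_min is None or n < self.d_min or stale:
            # Assumption: a forced refresh restarts from the current sample.
            self.d_min = n
            self.last_update = now
        return self.d_min


tracker = RefreshingMinDelay(max_age=1000)
tracker.update(n=100, now=0)      # 100: first sample
tracker.update(n=180, now=500)    # 100: larger delay, minimum still fresh
tracker.update(n=180, now=1000)   # 180: minimum is stale, forced refresh
```

After the refresh, the jitter term is again measured against a minimum that reflects the current state of the network rather than a delay recorded before the path degraded.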

Step 3: Adjust the JB depth according to the difference between the new playout time and the old playout time of the current voice data packet, that is, determine from this difference the amplitude by which the JB depth needs to be adjusted;

The JB depth is usually adjusted by moving the dequeue pointer of the JB queue: to increase the JB depth, the dequeue pointer is decremented; to reduce the JB depth, the dequeue pointer is incremented. The amount of the increment or decrement is determined by the amplitude by which the JB depth needs to be adjusted;

To keep the voice transmission process reasonably stable, two aspects usually need to be considered when the JB depth is adjusted. The first is choosing a suitable adjustment frequency: adjusting too often makes the JB depth oscillate, while adjusting too rarely fails to realize the advantage of a dynamic JB. The second is the amplitude of each adjustment: if a single adjustment is too large, the JB depth is suddenly lengthened or shortened by a large amount, which noticeably affects the momentary quality of the transmitted voice;
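The two constraints, a minimum interval between adjustments and a cap on the amplitude of each adjustment, can be combined with the pointer movement in a small sketch. Everything named here (`plan_adjustment`, `move_dequeue_pointer`, the circular-queue model, the 10 ms base unit, and the convention that a later new playout time means a deeper buffer) is an assumption for illustration, not the patent's code.

```python
def plan_adjustment(new_playout, old_playout, now, last_adjust,
                    max_step, min_interval):
    """Return the signed depth change in ms (positive deepens the JB), or 0."""
    if last_adjust is not None and now - last_adjust <= min_interval:
        return 0                       # too soon after the previous adjustment
    diff = new_playout - old_playout
    step = min(abs(diff), max_step)    # cap the amplitude of one adjustment
    return step if diff >= 0 else -step


def move_dequeue_pointer(index, step_ms, queue_len, base_ms=10):
    """Move the dequeue index of a circular queue.

    A positive step deepens the buffer (pointer moves back); a negative
    step shortens it (pointer moves forward).
    """
    units = int(step_ms / base_ms)     # whole base units, truncated toward zero
    return (index - units) % queue_len


step = plan_adjustment(new_playout=1150, old_playout=1100, now=5000,
                       last_adjust=1000, max_step=30, min_interval=2000)
pointer = move_dequeue_pointer(index=2, step_ms=step, queue_len=16)
# step == 30 (the 50 ms difference is capped), pointer == 15
```

Capping the step keeps a single adjustment from lengthening or shortening the buffer abruptly, and the interval check keeps the depth from oscillating.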

In addition, a point in time must be chosen for each dynamic adjustment of the JB depth. In an ongoing call, the voice data exchanged by the two parties usually alternates between a segment of speech and a segment of silence, so the time point for dynamically adjusting the JB depth can be determined in one of two ways:

The first is the random adjustment mode, in which the JB depth is adjusted immediately whenever packet loss occurs during voice transmission. As shown in Figure 4, voice data packet i+1 of voice segment K arrives late and is lost; the JB depth is then increased at once according to the calculated result, so that the subsequent packets i+2, i+3, and i+4 are not discarded. The advantage of the random adjustment mode is that it reduces packet loss, but it can harm the voice quality of the call and the intelligibility of the conversation. For example, when the adjustment point falls within a voice segment, increasing the JB depth produces an unnatural pause in the middle of a coherent sentence; conversely, reducing the JB depth causes part of a sentence to be lost. Both lead to a decline in voice quality;

The second is the mode of adjusting during silence periods, in which the time point of the dynamic adjustment is placed in a silence period; this avoids the decline in voice quality caused by the first scheme. If the dynamic JB adjustment is carried out during silence, increasing or reducing the JB depth only changes the duration of the silence period, and a slight lengthening or shortening of a silence period does not affect voice quality. As shown in Figure 5, no adjustment is made within voice segment K even if packet loss occurs there; the JB depth is increased in silence segment K, so that packet loss no longer occurs in the following voice segment K+1, and the only effect is a slight lengthening of silence segment K.
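The difference between the two timing modes can be sketched as a small deferral object: an adjustment requested during speech is held back until the next silence period. The class `SilenceAdjuster`, its method names, and the use of signed base units for the pending step are hypothetical choices for this illustration.

```python
class SilenceAdjuster:
    """Hold a requested JB depth change until a silence period begins."""

    def __init__(self):
        self.pending = 0   # signed change in base units; 0 means nothing pending

    def request(self, step_units):
        # Called when the statistics say the depth should change,
        # for example after a late packet was dropped in a voice segment.
        self.pending = step_units

    def on_dequeue(self, in_silence):
        """Return the change to apply at this dequeue tick (0 during speech)."""
        if not in_silence or self.pending == 0:
            return 0
        step, self.pending = self.pending, 0
        return step


adj = SilenceAdjuster()
adj.request(3)                      # loss seen inside voice segment K
adj.on_dequeue(in_silence=False)    # 0: still speech, nothing applied
adj.on_dequeue(in_silence=True)     # 3: applied once silence segment K starts
adj.on_dequeue(in_silence=True)     # 0: nothing left to apply
```

Applying the step as soon as it is requested, without the `in_silence` check, would correspond to the random adjustment mode instead.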

Taking the mode of adjusting during silence periods as an example, the method of the present invention for dynamically adjusting the JB based on the average delay and average jitter is described further below. The implementation can comprise three processes: the enqueue processing of voice data packets, the dequeue processing, and the processing that dynamically adjusts the JB depth. The JB queue is ordered by timestamp, and each element of the queue represents the base unit of the packetization time of the codec in use; for example, the base unit specified by G.711 is 10 ms, by G.729 is 10 ms, and by G.723 is 30 ms.

The enqueue processing of voice data packets is described first. Besides adding the voice data packet to the JB queue, the enqueue process must also calculate $n_i$, $d_i$, and $v_i$ for the packet, so that the later processing that dynamically adjusts the JB depth can act on the results. These values can be calculated with the formulas given earlier. For the sake of the algorithm's efficiency on the CPU (central processing unit), the value of $\alpha$ should be based on reciprocals of powers of 2, such as 0.875 (1 - 1/8) or 0.75 (1 - 1/4), so that the calculation can be implemented with shift operations. For example, tests on a real network showed that $\alpha = 0.75$ gives the best results, so $\alpha$ is set to 0.75. Apart from these calculations, the processing that adds a voice data packet to the JB queue comprises the following steps, as shown in Figure 6:

Step 61: Unpack the voice data packet, that is, split a single packet of whatever packetization time and codec into several packets whose packetization time equals the base unit. Besides splitting the payload of the voice packet, this includes recalculating the timestamp in the header of each RTP packet produced by the split, to determine its new timestamp.

Step 62: Define a packet loss counter variable ucLossCounter, used to record the number of consecutive packets that cannot enter the JB queue; that is, the variable starts counting when voice data packets are lost because they arrive too late or too early.

Step 63: Determine whether the packet obtained from the current split can enter the JB queue; if it can, go to step 64; otherwise, go to step 65;

Step 64: Set the packet loss counter variable ucLossCounter = 0 and store the current packet. To make sure that the stored packet is not a duplicate, this step must also check, before storing, whether the packet is a duplicate; if it is, the packet is discarded; if not, it is stored.

Step 65: Determine that the packet is a late or early packet that cannot be added to the JB queue, that is, a discarded packet, and increment the packet loss counter variable ucLossCounter by 1;

Late packets and early packets also need to be counted separately here, as parameters for testing the effectiveness of the JB, so that the adjustable parameters of the present invention, such as $\alpha$ and $\gamma$, can be determined from them.

Step 66: Determine whether the packet loss counter variable ucLossCounter is less than the set packet loss count; if it is, go to step 67 and take no action on the JB queue; otherwise, go to step 68 and restart the JB queue;

For example, the allowed packet loss count can be set to 3, so that the JB queue is restarted when three consecutive packets cannot enter it. To monitor the voice transmission performance of the network over a period of time, the number of JB restarts also needs to be counted. Too many restarts indicate serious delay and jitter in the network, and without maintenance the network can no longer carry voice data.
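The enqueue steps above can be sketched as follows. The sketch makes several assumptions that the text does not specify: timestamps are in milliseconds, a packet "can enter" the queue when its timestamp falls inside a window starting at the next unit to be played, and a restart clears the queue and re-anchors it at the offending packet. The names `split_packet` and `JitterQueue` are hypothetical; `loss_counter` plays the role of ucLossCounter.

```python
def split_packet(timestamp, duration_ms, payload, base_ms=10):
    """Step 61: split one packet into base-unit packets with new timestamps."""
    n_units = duration_ms // base_ms
    size = len(payload) // n_units
    return [(timestamp + k * base_ms, payload[k * size:(k + 1) * size])
            for k in range(n_units)]


class JitterQueue:
    def __init__(self, head_ts, length_units, max_loss=3, base_ms=10):
        self.head_ts = head_ts            # timestamp of the next unit to play
        self.length_units = length_units  # JB length in base units
        self.max_loss = max_loss          # allowed consecutive losses
        self.base_ms = base_ms
        self.slots = {}                   # timestamp -> payload
        self.loss_counter = 0             # ucLossCounter
        self.restarts = 0                 # restart statistics for monitoring

    def enqueue(self, ts, payload):
        window_end = self.head_ts + self.length_units * self.base_ms
        if self.head_ts <= ts < window_end:          # step 63
            self.loss_counter = 0                    # step 64
            if ts not in self.slots:                 # discard duplicates
                self.slots[ts] = payload
            return True
        self.loss_counter += 1                       # step 65: late or early
        if self.loss_counter >= self.max_loss:       # step 66
            self._restart(ts)                        # step 68
        return False

    def _restart(self, ts):
        # Assumption: a restart empties the queue and re-anchors it at `ts`.
        self.slots.clear()
        self.head_ts = ts
        self.loss_counter = 0
        self.restarts += 1


q = JitterQueue(head_ts=0, length_units=4)
for ts, chunk in split_packet(0, 20, b"abcd"):
    q.enqueue(ts, chunk)                  # two 10 ms units are stored
for ts in (500, 510, 520):
    q.enqueue(ts, b"x")                   # three misses in a row trigger a restart
```

Counting restarts separately, as `restarts` does here, gives the monitoring figure that the text suggests for judging whether the network needs maintenance.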

After a packet has entered the JB queue and been buffered for a certain delay, it must be dequeued. The dequeue processing comprises the following steps, as shown in Figure 7:

Step 71: Define static variables: an empty-packet counter variable and an adjustment-frequency variable, used respectively to count consecutive empty packets and to record the time between two successive adjustments;

The empty-packet counter variable is used to determine whether a silence period is currently in progress;

The JB queue initial delay counter is used to implement a configured initial delay for the JB queue; for example, if it is initially set to 50 ms, packets in the JB queue can be dequeued only after 50 ms have elapsed. The counting works as follows: each time a dequeue operation is due, check whether the initial delay counter is less than or equal to 0; if it is not, subtract the base unit of the JB queue element (for example 10 ms) from it. This is repeated until the initial delay counter is less than or equal to 0, at which point dequeuing begins, that is, the following steps are carried out.

Step 72: Judge the type of the data packet in the JB queue, i.e., determine the type from the type field recorded in the packet; if it is an empty packet, execute step 73; if it is a silence packet, execute step 74; if it is a voice packet, execute step 75.

Step 73: Process the empty packet further, comprising:

Step 731: Judge from the silence flag whether the receiver is currently in a silence period: if the flag is true, it is; if the flag is false, it is not. This determines whether the empty packet is a silence packet; if so, execute step 732; otherwise, execute step 733;

Step 732: Set the empty-packet counter to 0, create a no-transmission frame, and execute step 736;

Step 733: Increment the empty-packet counter by 1 and judge whether its count is greater than the set value; if so, execute step 734; otherwise, execute step 735;

Step 734: Set the silence flag to true, determine that the receiver is now in a silence period and that this packet is a silence packet, create a no-transmission frame, and execute step 736;

Step 735: Create a packet-loss indication frame so that the number of lost packets can be counted, and execute step 736;

Steps 731 to 735 mainly distinguish the two possibilities that exist when the dequeued element is empty (an empty packet) and the receiver is not currently in a silence period. In the first, the packet being dequeued should have been a voice packet but arrived later than its playout time and was lost; a packet-loss indication frame should then be issued to the DSP. In the second, a silence frame was lost earlier, so the silence flag was never changed to indicate a silence period; what should then be issued to the DSP is a no-transmission frame rather than a packet-loss indication frame. For example, it can be set that while the number of consecutive occurrences recorded by the empty-packet counter variable ucEmptyCounter is less than 2, the current empty packet is issued as a packet-loss indication frame; once the count is greater than or equal to 2, it is issued as a no-transmission frame and the silence flag is changed to indicate a silence period;

Step 736: Dequeue the data packet, count the number of dequeued packets, and update the dequeue pointer and the dequeue timestamp.
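The empty-packet decision of steps 731 to 735 can be illustrated with the short sketch below. The class name, the frame-type strings, and the function signature are hypothetical; the threshold follows the worked example in the text (fewer than 2 consecutive empty packets yields a packet-loss indication frame, 2 or more yields a no-transmission frame).

```python
EMPTY_THRESHOLD = 2  # example value from the text


class DequeueState:
    def __init__(self):
        self.empty_counter = 0   # ucEmptyCounter in the text
        self.in_silence = False  # the silence flag


def handle_empty(state):
    """Return the frame type to issue to the DSP for an empty queue element."""
    if state.in_silence:                          # step 731: already in a silence period
        state.empty_counter = 0                   # step 732
        return "NO_TRANSMISSION"
    state.empty_counter += 1                      # step 733
    if state.empty_counter >= EMPTY_THRESHOLD:
        state.in_silence = True                   # step 734: a silence frame was probably lost
        return "NO_TRANSMISSION"
    return "PACKET_LOSS"                          # step 735: a voice packet arrived too late
```

Starting outside a silence period, the first empty element is reported as a loss, while the second consecutive one switches the receiver into the silence period.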

Step 74: Process the silence packet further, specifically comprising:

Step 741: Set the empty-packet counter to 0, record the maximum sequence number of the current silence packet, and convert the packet's RTP header into a DSP header;

Step 742: Dequeue the data packet, count the number of dequeued packets, and update the dequeue pointer and the dequeue timestamp;

Step 743: Judge whether the silence flag indicates a silence period; if so, execute step 744, i.e., do nothing further; if not, set the silence flag to indicate a silence period and execute step 745;

Step 745: Judge whether the current value of the adjustment-frequency variable is greater than the set minimum interval permitted between two JB depth adjustments (i.e., the adjustment-frequency control factor); if so, execute step 746; otherwise, execute step 747;

The adjustment-frequency control is implemented with a timer. Each time a JB depth adjustment is performed, the timer is cleared to 0 and restarts timing. When a new JB depth adjustment is called for, judge whether the timer value is greater than the set value (i.e., the set minimum interval permitted between two JB depth adjustments); if so, execute step 746; otherwise, execute step 747;

The adjustment-frequency control factor is obtained by testing, either on a live network or on a simulated one; its value can be, for example, 100 milliseconds, meaning that two dynamic JB adjustments are at least 100 milliseconds apart;

Step 746: Adjust the JB depth; the specific adjustment process is described later, in the procedure for dynamically adjusting the JB depth;

Step 747: Processing of this silence packet ends.
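A hedged sketch of the silence-packet path (steps 741 to 747), focusing on the adjustment-frequency gate, is given below. The names `SilenceState`, `tick`, and `handle_silence`, and the callback `adjust_depth`, are assumptions for illustration; header conversion and the dequeue bookkeeping of step 742 are omitted. The 100 ms interval is the example value from the text.

```python
MIN_ADJUST_INTERVAL_MS = 100  # example adjustment-frequency control factor


class SilenceState:
    def __init__(self):
        self.empty_counter = 0
        self.in_silence = False
        self.ms_since_adjust = 0  # timer cleared on every JB depth adjustment


def tick(state, elapsed_ms):
    """Advance the adjustment timer (hypothetical helper)."""
    state.ms_since_adjust += elapsed_ms


def handle_silence(state, adjust_depth):
    """Process one silence packet; return True if the JB depth was adjusted."""
    state.empty_counter = 0                               # step 741
    if state.in_silence:                                  # step 743 -> step 744: nothing to do
        return False
    state.in_silence = True                               # first silence packet of the period
    if state.ms_since_adjust > MIN_ADJUST_INTERVAL_MS:    # step 745
        adjust_depth()                                    # step 746
        state.ms_since_adjust = 0                         # timer restarts after an adjustment
        return True
    return False                                          # step 747
```

Only the first silence packet of a silence period can trigger an adjustment, and only if more than the minimum interval has elapsed since the previous one.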

Step 75: Process the voice packet, specifically comprising:

Step 751: Set the empty-packet counter variable to 0; if the silence flag indicates a silence period, also change it to indicate a non-silence period;

Step 752: Record the maximum sequence number of the current data packet and convert the RTP header into a DSP header;

Step 753: Dequeue the packet, count the number of dequeued packets, and update the dequeue pointer and the dequeue timestamp; processing of this voice packet ends.
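The initial-delay countdown that precedes dequeuing and the voice-packet bookkeeping of steps 751 and 752 might look like the sketch below. The class and method names are hypothetical, and interpreting "record the maximum sequence number" as keeping a running maximum is an assumption; the 50 ms initial delay and 10 ms base unit are the example values from the text.

```python
BASE_UNIT_MS = 10  # example base unit of one JB queue element


class Dequeuer:
    def __init__(self, initial_delay_ms=50):
        self.delay_left_ms = initial_delay_ms  # JB queue initial-delay counter
        self.empty_counter = 0
        self.in_silence = False
        self.max_seq = None

    def ready(self):
        """Call once per dequeue opportunity; True once the initial delay has elapsed."""
        if self.delay_left_ms <= 0:
            return True
        self.delay_left_ms -= BASE_UNIT_MS
        return False

    def handle_voice(self, seq):
        """Steps 751 and 752 for a voice packet (header conversion omitted)."""
        self.empty_counter = 0
        self.in_silence = False  # a voice packet ends any silence period
        self.max_seq = seq if self.max_seq is None else max(self.max_seq, seq)
```

With these example values, the first five dequeue opportunities only count down the delay, and dequeuing starts on the sixth.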

Finally, the procedure for dynamically adjusting the JB depth is described. The adjustment point is chosen at the start of each silence period, which avoids shortening the JB depth at the start of a talkspurt and thereby discarding its first several voice packets and degrading voice quality. Referring to Fig. 8, the specific processing comprises the following steps:

Step 81: To judge whether a dynamic adjustment of the JB depth is needed, first calculate the actual playout time of the current silence packet, defined as the old playout time. The old playout time is determined from the current system time, i.e., old playout time = current system time × time conversion variable, where the time conversion variable converts the system's time unit of 1/8 millisecond into milliseconds;

Step 82: Using formula 7 or formula 9 given earlier, calculate the playout time of the current silence packet, defined as the new playout time; the new playout time is the playout time that matches the current network jitter;

Step 83: Calculate the difference between the new and old playout times, which is the amount by which the JB depth needs to be adjusted. For ease of adjustment, this difference is also divided by the JB element length (i.e., the base unit of packetization time) to obtain the final amplitude by which the JB depth needs to be adjusted; if the difference is 0, no adjustment is needed;

Step 84: Compare the absolute value of the amplitude with the set maximum amplitude permitted for a single adjustment, and take the smaller of the two as the amplitude actually applied to the JB;

The maximum amplitude permitted for a single adjustment is the adjustment-amplitude control factor, obtained by testing on a live or simulated network. Its value can be 10, in units of the time corresponding to one JB queue cell (i.e., the base unit); for example, with the G.711 codec, a single dynamic JB adjustment may lengthen or shorten the JB depth by at most (10 × 10) milliseconds;

Step 85: Judge whether the calculated amplitude is less than 0; if so, execute step 86; otherwise, execute step 87;

Step 86: Shorten the JB depth, specifically:

Step 861: Create a loop variable and set its initial value to 0;

Step 862: Judge whether the loop variable is less than the amplitude actually applied to the JB; if so, execute step 863; otherwise, execute step 88;

Step 863: Judge whether the current data packet is a voice packet; if it is, execute step 88; otherwise, execute step 864;

That is, when shortening the JB depth, the shortening stops as soon as a voice packet is encountered, which guarantees that no voice packet is deleted during adjustment and avoids the voice-quality degradation that dynamic JB depth adjustment could otherwise cause;

Step 864: Set the current data packet's type to empty packet, increment the dequeue pointer by 1, add 1 to the loop variable, and execute step 862;

Step 87: Increase the JB depth, specifically:

Step 871: Create a loop variable and set its initial value to 0;

Step 872: Judge whether the loop variable is less than the amplitude actually applied to the JB; if so, execute step 873; otherwise, execute step 88;

Step 873: Decrement the dequeue pointer by 1, set the current data packet's type to empty packet, add 1 to the loop variable, and execute step 872;

Step 88: Update the JB depth value according to the result of the adjustment; the process ends.
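Steps 83 to 88 can be summarized in the following sketch. It models the JB queue as a Python list of type strings used as a ring buffer, which is an assumption made for illustration, as are the function name, the truncation toward zero when converting the difference to base units, and the return convention. The cap of 10 units and the 10 ms base unit are the example values from the text.

```python
MAX_STEP_UNITS = 10  # example adjustment-amplitude control factor
BASE_UNIT_MS = 10    # example base unit (G.711 case in the text)


def adjust_depth(queue, out_idx, new_play_ms, old_play_ms):
    """Adjust the JB depth; return (new dequeue index, depth change in base units).

    queue is a list of 'voice', 'silence' or 'empty' entries treated as a ring.
    """
    diff_units = int((new_play_ms - old_play_ms) / BASE_UNIT_MS)  # step 83
    steps = min(abs(diff_units), MAX_STEP_UNITS)                  # step 84
    n = len(queue)
    done = 0
    if diff_units < 0:                                            # step 85 -> step 86: shorten
        # steps 862 and 863: stop at the cap or as soon as a voice packet is met
        while done < steps and queue[out_idx] != "voice":
            queue[out_idx] = "empty"                              # step 864
            out_idx = (out_idx + 1) % n
            done += 1
        return out_idx, -done                                     # step 88
    while done < steps:                                           # step 87: lengthen
        out_idx = (out_idx - 1) % n                               # step 873
        queue[out_idx] = "empty"
        done += 1
    return out_idx, done                                          # step 88
```

Shortening therefore never skips past a voice packet, whereas lengthening inserts empty elements in front of the dequeue pointer so that playout is pushed back.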

As the above detailed description of the invention shows, compared with a static JB, the present invention can adjust the JB depth dynamically and in real time according to constantly changing network conditions, so that the anti-jitter mechanism of the receiving gateway adapts dynamically to a varying IP network. Real-time statistics of the network's average delay and average jitter accurately reflect the current network conditions.
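The real-time statistics mentioned here can be illustrated with a small estimator built on the recursions recited in claim 1: d_i = α·d_(i-1) + (1-α)·n_i, v_i = α·v_(i-1) + (1-α)·|d_i - n_i|, and the playout times p_i = ts_i + d_i + γ·v_i and p_j = p_i + ts_j - ts_i. The sketch below is not the patented implementation: the class and function names are hypothetical, seeding the first sample with d = n and v = 0 is an assumption, and the α and γ values in the usage note are arbitrary, since the text says they are obtained by testing.

```python
class DelayJitterEstimator:
    """Running average delay d and average jitter v."""

    def __init__(self, alpha, gamma):
        self.alpha = alpha  # weighting factor, 0 <= alpha <= 1
        self.gamma = gamma  # multiplication factor applied to the average jitter
        self.d = None       # average delay d_i
        self.v = 0.0        # average jitter v_i

    def update(self, n_i):
        """Feed the total network delay n_i of the i-th voice packet."""
        if self.d is None:
            self.d = float(n_i)  # assumption: seed with the first sample
            return
        a = self.alpha
        self.d = a * self.d + (1 - a) * n_i
        self.v = a * self.v + (1 - a) * abs(self.d - n_i)

    def first_playout(self, ts_i):
        """New playout time of the first packet of a talkspurt."""
        return ts_i + self.d + self.gamma * self.v


def follow_playout(p_i, ts_i, ts_j):
    """New playout time of a subsequent packet j of the same talkspurt."""
    return p_i + ts_j - ts_i
```

For instance, with α = 0.5 and γ = 4, feeding delays of 100 ms and then 120 ms gives d = 110 ms and v = 5 ms, so a talkspurt whose first packet was sent at ts = 1000 ms is played at 1130 ms, and a packet sent 20 ms later is played at 1150 ms.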

The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims

1. A method for dynamically adjusting a jitter buffer during voice transmission, characterized by comprising:

Collecting statistics on and calculating the average delay and average jitter during voice transmission as said delay and jitter, the average delay $d_i$ and the corresponding average jitter $v_i$ being respectively: $d_i = \alpha d_{i-1} + (1-\alpha)\, n_i$ and $v_i = \alpha v_{i-1} + (1-\alpha)\, |d_i - n_i|$, where α is a weighting factor obtained by testing, with 0 ≤ α ≤ 1, i is the sequence number of the voice packet, and $n_i$ is the total delay of the i-th voice packet in the network; or, collecting statistics on and calculating the minimum delay and average jitter during voice transmission as said delay and jitter, the minimum delay $d_{\min}$ and the corresponding average jitter $v_i$ being respectively: $d_{\min} = \min(d_{\min}, n_i)$ and $v_i = \alpha v_{i-1} + (1-\alpha)(n_i - d_{\min})$;

B1. Calculating, from the current system time, the actual playout time of the current voice packet, called the old playout time; and calculating the playout time of the current voice packet from the calculated average delay and average jitter, or from the calculated minimum delay and average jitter, called the new playout time; wherein, if the calculation is based on the average delay and average jitter, the new playout time of the current first voice packet is $p_i = ts_i + d_i + \gamma v_i$ and the new playout time of a subsequent voice packet is $p_j = p_i + ts_j - ts_i$, where i is the sequence number of the first voice packet of the current talkspurt, j is the sequence number of a subsequent voice packet, γ is the multiplication factor applied to the average jitter, and $ts_i$ and $ts_j$ are the times at which the sender sent the i-th and j-th voice packets respectively; if the calculation is based on the minimum delay $d_{\min}$ and average jitter, the new playout time $p_i$ of the current voice packet is $p_i = ts_i + d_{\min} + \gamma v_i$ and the new playout time $p_j$ of a subsequent voice packet is $p_j = p_i + ts_j - ts_i$;

2. The method for dynamically adjusting a jitter buffer during voice transmission according to claim 1, characterized in that said step A comprises:

A2. Splitting the voice packet into several data packets whose packetization time length is the base unit, the base unit being the packetization time length of each element in the jitter buffer (JB) queue;

3. The method for dynamically adjusting a jitter buffer during voice transmission according to claim 1, characterized in that said step B2 comprises:

4. The method for dynamically adjusting a jitter buffer during voice transmission according to claim 1, characterized in that said step B2 comprises:

5. The method for dynamically adjusting a jitter buffer during voice transmission according to claim 4, characterized in that said step B22 comprises:

6. The method for dynamically adjusting a jitter buffer during voice transmission according to claim 5, characterized in that said step B222 comprises:

7. The method for dynamically adjusting a jitter buffer during voice transmission according to claim 6, characterized in that, when the operation of shortening the JB depth is performed, the shortening operation is stopped if a voice packet is encountered.

8. The method for dynamically adjusting a jitter buffer during voice transmission according to claim 1, characterized in that said step B further comprises: