
CN108495182A - Audio quality self-adjusting control method

Info

Publication number: CN108495182A
Application number: CN201810243626.3A
Authority: CN
Inventors: 胡治国; 郭丽峰; 闫涛
Original assignee: Shanxi University
Current assignee: Shanxi University
Priority date: 2018-03-23
Filing date: 2018-03-23
Publication date: 2018-09-04

Abstract

The invention discloses an audio quality self-adjusting control method. The method divides audio quality of experience into grades and determines the preset quality-of-experience threshold intervals and the types of controllable audio parameters in the sending, transmission and receiving processes. It determines the degree to which changes in controllable parameters such as coding type, coding rate, transmission path and receive buffer in the sending, transmission and receiving processes affect the quality-of-experience value, and establishes a mapping between the system's controllable parameters and quality-of-experience grade adjustment. According to the quality condition of the audio, it regulates quality of experience with coding optimization, path optimization, rate optimization and buffer optimization strategies respectively. By combining multiple strategies, the invention effectively improves the quality of experience of audio applications.

Description

An audio quality self-adjusting control method

Technical field

The present invention relates to the field of streaming media and network communication technology, and in particular to an audio quality self-adjusting control method.

Background art

Audio is one of the main services of network streaming media, and guaranteeing good audio quality of experience is a key technical factor for service providers in winning customers and capturing market share. The complexity, diversity and fragility of network systems mean that guaranteeing and optimizing quality of experience faces many uncertain factors. Although some research has addressed the optimization and adjustment of audio quality, on the whole it has the following shortcomings:

Traditional research improves audio quality of experience by regulating a single parameter (for example, by adjusting the coding rate). Our experience shows, however, that when only a single controllable parameter is optimized, the adjustable range of quality of experience is small, and it is difficult to move from the "poor" or "bad" grade to the "excellent" or "good" grade;

There is no specific quantitative metric or reference for how the controllable parameters in each unit should be adjusted, or to what degree;

There is a lack of research on optimizing audio quality with comprehensive strategies that combine multiple methods or means. A complete audio system consists of multiple units, and the units clearly differ both in their influence on audio quality of experience and in their controllability. A traditional single optimization strategy therefore struggles to improve the quality of experience of audio applications effectively.

Summary of the invention

To achieve the above objectives, the technical solution of the present invention is implemented as follows:

The present invention provides an audio quality self-adjusting control method. The system that carries out the method comprises four parts: a sending unit, a transmission unit, a receiving unit and a monitoring unit. The method comprises:

Step 1: the receiving unit divides audio quality into five grades (excellent, good, medium, poor, bad), represented by the numerical set {MOS₅=5, MOS₄=4, MOS₃=3, MOS₂=2, MOS₁=1}, giving four intervals in total: [5,4], (4,3], (3,2], (2,1]. Here MOS₅ denotes the 'excellent' grade, MOS₄ the 'good' grade, MOS₃ the 'medium' grade, MOS₂ the 'poor' grade and MOS₁ the 'bad' grade. MOS₄ is marked as the first threshold, MOS₃ as the second threshold, MOS₂ as the third threshold and MOS₁ as the fourth threshold;
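The grading in Step 1 can be sketched as a small lookup. This is only an illustration: the function name `mos_interval` is hypothetical, and the code assumes each interval is closed at its lower bound, which is how the notation (4,3] reads and how the later threshold comparisons ("greater than or equal to") behave.

```python
# Thresholds from Step 1: MOS4 = first, MOS3 = second, MOS2 = third, MOS1 = fourth.
FIRST_THRESHOLD = 4.0
SECOND_THRESHOLD = 3.0
THIRD_THRESHOLD = 2.0
FOURTH_THRESHOLD = 1.0


def mos_interval(mos: float) -> str:
    """Return the Step 1 interval (in the patent's notation) containing a MOS value."""
    if not FOURTH_THRESHOLD <= mos <= 5.0:
        raise ValueError("MOS must lie between 1 and 5")
    if mos >= FIRST_THRESHOLD:
        return "[5,4]"
    if mos >= SECOND_THRESHOLD:
        return "(4,3]"
    if mos >= THIRD_THRESHOLD:
        return "(3,2]"
    return "(2,1]"
```

For instance, a measured value of 3.04 falls in (4,3], that is, below the first threshold and at or above the second.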

Step 2: the monitoring unit calculates the quality-of-experience value of the audio application in real time. The correspondence between the different parameters in the sending unit, transmission unit and receiving unit and the quality-of-experience value is obtained by experiment, yielding a training data set, and a quality-of-experience assessment model is established with a machine learning algorithm. The model comprises at least two layers, an input layer and an output layer. The input layer has at least six inputs: the coding type, the coding rate, the delay, jitter and packet loss rate of the transmission path, and the receive buffer parameter. The output layer outputs the quality of experience. The model maps the controllable multidimensional parameters in the sending unit, transmission unit and receiving unit to quality of experience;

Step 3: determine the degree to which changes in the controllable parameters (coding type, coding rate, transmission path performance, receive buffer) in the sending unit, transmission unit and receiving unit affect quality of experience, and establish the mapping between changes in the values of the controllable parameters in each component unit of the audio application and the corresponding increase or decrease in the quality-of-experience value;

Step 4: the monitoring unit measures quality of experience at the system terminal and compares the measured value with the preset quality-of-experience threshold grades. Based on the comparison, at least one of the coding type, coding rate, transmission path performance and receive buffer in the sending unit, transmission unit and receiving unit is adjusted. Audio signals are subsequently sent, transmitted and received with the adjusted coding type and/or coding rate and/or transmission path parameters and/or receive buffer, thereby optimizing the quality of experience of the audio application.

Wherein the step of determining how the controllable parameters in the sending unit affect audio quality of experience includes:

The sending unit controls the coding type and the coding rate;

In terms of the voice quality of the different coding types: G.711 > G.726 > G.729 > G.723;

Audio coded with G.711 scores 0.4 MOS higher than G.726, G.726 scores 0.2 MOS higher than G.729, and G.729 scores 0.2 MOS higher than G.723;

In terms of coding rate, within the same codec a higher coding rate gives a higher quality of experience than a lower one: 11.8 kbit/s (G.729) > 8 kbit/s (G.729) > 6.4 kbit/s (G.729); 6.3 kbit/s (G.723.1) > 5.3 kbit/s (G.723.1);

Among the scalable rates of G.729, the 11.8 kbit/s rate scores 0.2 MOS higher than the 8 kbit/s rate, and the 8 kbit/s rate scores 0.1 MOS higher than the 6.4 kbit/s rate; for G.723.1, the 6.3 kbit/s rate scores 0.1 MOS higher than the 5.3 kbit/s rate.
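The codec and rate differences above can be collected into lookup tables. The sketch below is illustrative only: the names `CODEC_OFFSET`, `RATE_OFFSET`, `codec_gain` and `rate_gain` are hypothetical, and it assumes that the pairwise MOS gaps simply add up along the ranking (for example, that G.711 sits 0.4 + 0.2 = 0.6 MOS above G.729), which the text implies but does not state outright.

```python
# MOS offsets relative to G.711, accumulated from the pairwise gaps in the text
# (assumption: the gaps are additive along the ranking).
CODEC_OFFSET = {"G.711": 0.0, "G.726": -0.4, "G.729": -0.6, "G.723": -0.8}

# MOS offsets relative to the highest listed rate of each codec (rates in kbit/s).
RATE_OFFSET = {
    "G.729": {11.8: 0.0, 8.0: -0.2, 6.4: -0.3},
    "G.723.1": {6.3: 0.0, 5.3: -0.1},
}


def codec_gain(current: str, candidate: str) -> float:
    """Expected MOS change when switching coding type, all else equal."""
    return round(CODEC_OFFSET[candidate] - CODEC_OFFSET[current], 2)


def rate_gain(codec: str, current_rate: float, candidate_rate: float) -> float:
    """Expected MOS change when switching coding rate within one codec."""
    offsets = RATE_OFFSET[codec]
    return round(offsets[candidate_rate] - offsets[current_rate], 2)
```

Under these assumptions, `codec_gain("G.729", "G.711")` gives 0.6 and `rate_gain("G.729", 6.4, 11.8)` gives 0.3.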

The transmission unit controls transmission path performance. For the transmission path packet loss rate (PLR) with G.711 coding, taking the case of no network packet loss as reference, a packet loss rate of 0.8% reduces audio quality by 0.1 MOS, 3.3% by 0.5 MOS, 7.6% by 1 MOS, 11.8% by 1.5 MOS and 19% by 2 MOS. With G.729 coding, compared with the case of no network packet loss, a packet loss rate of 0.5% reduces audio quality by 0.1 MOS, 2.1% by 0.5 MOS, 5.7% by 1 MOS, 9.8% by 1.5 MOS and 16% by 2 MOS. From these data, the functional relationship between the packet loss rate (PLR) and the change in MOS value is obtained by interpolation or curve fitting, as follows:

[Formula not reproduced] (G.711 coding; PLR ranges from 0 to 1)

[Formula not reproduced] (G.729 coding; PLR ranges from 0 to 1)
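Since the fitted formulas themselves are not available here, the sketch below falls back on the other option the text names, interpolation, using only the data points quoted above. It is not the patent's fitted function. The name `mos_drop_from_plr` is hypothetical, the (0, 0) point encodes the stated no-loss reference, and clamping beyond the last data point is an added assumption.

```python
from bisect import bisect_right

# (packet loss rate as a fraction, MOS reduction) pairs quoted in the text,
# plus (0, 0) for the no-loss reference case.
PLR_POINTS = {
    "G.711": [(0.0, 0.0), (0.008, 0.1), (0.033, 0.5),
              (0.076, 1.0), (0.118, 1.5), (0.19, 2.0)],
    "G.729": [(0.0, 0.0), (0.005, 0.1), (0.021, 0.5),
              (0.057, 1.0), (0.098, 1.5), (0.16, 2.0)],
}


def mos_drop_from_plr(codec: str, plr: float) -> float:
    """Piecewise-linear estimate of the MOS reduction for a packet loss rate in [0, 1]."""
    if not 0.0 <= plr <= 1.0:
        raise ValueError("PLR must lie in [0, 1]")
    points = PLR_POINTS[codec]
    xs = [x for x, _ in points]
    if plr >= xs[-1]:
        # Assumption: no data beyond the last point, so clamp to its value.
        return points[-1][1]
    i = bisect_right(xs, plr)          # index of the first point strictly above plr
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (plr - x0) / (x1 - x0)
```

For example, a 2% loss rate with G.729 lies between the 0.5% and 2.1% points and interpolates to a reduction of about 0.475 MOS.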

For transmission path delay (Delay), for all coding types and taking the case of no network delay as reference, a transmission delay of 170 ms reduces audio quality by 0.1 MOS, 265 ms by 0.5 MOS, 360 ms by 1 MOS, 480 ms by 1.5 MOS and 700 ms by 2 MOS. From these data, the functional relationship between delay (Delay) and the change in MOS value is obtained by interpolation or curve fitting, as follows:

[Formula not reproduced] (Delay in ms)
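The delay data can also be read in the opposite direction, as a delay budget for a tolerated MOS reduction. The sketch below inverts a piecewise-linear interpolation of the quoted points. It is an illustration, not the patent's fitted function, and the name `max_delay_for_drop` and the (0, 0) reference point are assumptions.

```python
# (delay in ms, MOS reduction) pairs quoted in the text, plus (0, 0) as the
# no-delay reference case. The values apply to all coding types.
DELAY_POINTS = [(0.0, 0.0), (170.0, 0.1), (265.0, 0.5),
                (360.0, 1.0), (480.0, 1.5), (700.0, 2.0)]


def max_delay_for_drop(max_drop: float) -> float:
    """Largest delay (ms) whose interpolated MOS reduction stays within max_drop."""
    if max_drop < 0.0:
        raise ValueError("max_drop must be non-negative")
    for (d0, m0), (d1, m1) in zip(DELAY_POINTS, DELAY_POINTS[1:]):
        if max_drop <= m1:
            # Invert the linear segment between the two neighbouring points.
            return d0 + (d1 - d0) * (max_drop - m0) / (m1 - m0)
    # Beyond the tabulated range: return the last tabulated delay.
    return DELAY_POINTS[-1][0]
```

A tolerated reduction of 0.5 MOS maps back to the quoted 265 ms, and 0.75 MOS falls halfway between the 265 ms and 360 ms points, at 312.5 ms.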

For transmission path jitter (Jitter), taking the no-jitter case as reference in each instance:

With G.711 coding and a jitter buffer of 10 ms, jitter of 1 ms reduces audio quality by 0.1 MOS, 3 ms by 0.5 MOS, 6 ms by 1 MOS, 9 ms by 1.5 MOS and 10 ms by 2 MOS. With a jitter buffer of 20 ms, jitter of 4 ms reduces audio quality by 0.1 MOS, 8 ms by 0.5 MOS, 10 ms by 1 MOS, 12 ms by 1.5 MOS and 18 ms by 2 MOS. With a jitter buffer of 40 ms, jitter of 9 ms reduces audio quality by 0.1 MOS, 12 ms by 0.5 MOS, 16 ms by 1 MOS, 19 ms by 1.5 MOS and 21 ms by 2 MOS;

With G.729 coding and a jitter buffer of 10 ms, jitter of 1 ms reduces audio quality by 0.1 MOS, 3 ms by 0.5 MOS, 5 ms by 1 MOS, 7 ms by 1.5 MOS and 9 ms by 2 MOS. With a jitter buffer of 20 ms, jitter of 5 ms reduces audio quality by 0.1 MOS, 8 ms by 0.5 MOS, 11 ms by 1 MOS, 13 ms by 1.5 MOS and 18 ms by 2 MOS. With a jitter buffer of 40 ms, jitter of 10 ms reduces audio quality by 0.1 MOS, 11 ms by 0.5 MOS, 16 ms by 1 MOS, 18 ms by 1.5 MOS and 20 ms by 2 MOS;

From these data, the functional relationship between jitter (Jitter) and the change in MOS value in the above cases is obtained by interpolation or curve fitting. Taking a jitter buffer of 20 ms as an example, it is as follows:

[Formula not reproduced] (G.711 coding; Jitter in ms; jitter buffer of 20 ms)

[Formula not reproduced] (G.729 coding; Jitter in ms; jitter buffer of 20 ms).
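The jitter data above amount to a three-way table (coding type, jitter buffer, jitter). The sketch below stores it as nested dictionaries and reads off the largest tabulated jitter that keeps the MOS reduction within a given bound. The names `JITTER_DROP` and `jitter_limit` are hypothetical, and the lookup uses only the tabulated points, with no interpolation.

```python
# JITTER_DROP[codec][buffer_ms] = [(jitter in ms, MOS reduction), ...] as quoted in the text.
JITTER_DROP = {
    "G.711": {
        10: [(1, 0.1), (3, 0.5), (6, 1.0), (9, 1.5), (10, 2.0)],
        20: [(4, 0.1), (8, 0.5), (10, 1.0), (12, 1.5), (18, 2.0)],
        40: [(9, 0.1), (12, 0.5), (16, 1.0), (19, 1.5), (21, 2.0)],
    },
    "G.729": {
        10: [(1, 0.1), (3, 0.5), (5, 1.0), (7, 1.5), (9, 2.0)],
        20: [(5, 0.1), (8, 0.5), (11, 1.0), (13, 1.5), (18, 2.0)],
        40: [(10, 0.1), (11, 0.5), (16, 1.0), (18, 1.5), (20, 2.0)],
    },
}


def jitter_limit(codec: str, buffer_ms: int, max_drop: float) -> int:
    """Largest tabulated jitter (ms) whose MOS reduction does not exceed max_drop (0 if none)."""
    rows = JITTER_DROP[codec][buffer_ms]
    allowed = [jitter for jitter, drop in rows if drop <= max_drop]
    return max(allowed, default=0)
```

With a bound of 0.5 MOS, G.711 yields limits of 3 ms, 8 ms and 12 ms for buffers of 10 ms, 20 ms and 40 ms, the same jitter limits the adjustment strategy later quotes for G.711.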

Wherein the step of determining how the controllable parameter in the receiving unit affects quality of experience includes:

The receiving unit controls the receive buffer size. For the receive buffer size (jitter buffer) with G.711-coded audio signals, the MOS value increases relative to a 10 ms jitter buffer as follows. With 2 ms of jitter in the transmission process, a 20 ms jitter buffer increases the MOS value by 0.30 and a 40 ms jitter buffer by 0.36; with 4 ms of jitter, by 0.49 and 0.63 respectively; with 6 ms, by 0.69 and 0.94; with 8 ms, by 0.97 and 1.33; with 10 ms, by 0.77 and 1.53; with 12 ms, by 0.42 and 1.53; with 14 ms, by 0.29 and 1.41; with 16 ms, by 0.27 and 1.18; with 18 ms, by 0.26 and 0.97. From these data, the approximate functional relationship between jitter (Jitter) and the change in MOS value under different jitter buffer conditions, relative to a 10 ms jitter buffer, is obtained by interpolation or curve fitting, as follows:

[Formula not reproduced] (jitter buffer of 20 ms; Jitter in ms)

[Formula not reproduced] (jitter buffer of 40 ms; Jitter in ms)

With G.729 coding, the MOS value increases relative to a 10 ms jitter buffer as follows. With 2 ms of jitter in the transmission process, a 20 ms jitter buffer increases the MOS value by 0.35 and a 40 ms jitter buffer by 0.37; with 4 ms of jitter, by 0.66 and 0.75 respectively; with 6 ms, by 1.15 and 1.33; with 8 ms, by 1.41 and 1.87; with 10 ms, by 1.36 and 2.0; with 12 ms, by 0.92 and 1.87; with 14 ms, by 0.61 and 1.73; with 16 ms, by 0.35 and 1.38; with 18 ms, by 0.29 and 0.87. From these data, the functional relationship between jitter (Jitter) and the change in MOS value under different jitter buffer conditions, relative to a 10 ms jitter buffer, is obtained by interpolation or curve fitting, as follows:

[Formula not reproduced] (jitter buffer of 20 ms; Jitter in ms)

[Formula not reproduced] (jitter buffer of 40 ms; Jitter in ms)
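The receive buffer gains quoted for G.711 and G.729 can be tabulated and queried directly. The sketch below is illustrative, with hypothetical names `BUFFER_GAIN` and `peak_gain`. It finds the jitter level at which enlarging the buffer from 10 ms pays off most, using only the quoted values. When two jitter levels tie, it reports the lower one.

```python
# BUFFER_GAIN[codec][jitter_ms] = (MOS gain of a 20 ms buffer, MOS gain of a 40 ms buffer),
# both relative to a 10 ms jitter buffer, as quoted in the text.
BUFFER_GAIN = {
    "G.711": {2: (0.30, 0.36), 4: (0.49, 0.63), 6: (0.69, 0.94),
              8: (0.97, 1.33), 10: (0.77, 1.53), 12: (0.42, 1.53),
              14: (0.29, 1.41), 16: (0.27, 1.18), 18: (0.26, 0.97)},
    "G.729": {2: (0.35, 0.37), 4: (0.66, 0.75), 6: (1.15, 1.33),
              8: (1.41, 1.87), 10: (1.36, 2.0), 12: (0.92, 1.87),
              14: (0.61, 1.73), 16: (0.35, 1.38), 18: (0.29, 0.87)},
}


def peak_gain(codec: str, buffer_ms: int) -> tuple:
    """Return (jitter in ms, MOS gain) where the given buffer size helps most."""
    if buffer_ms not in (20, 40):
        raise ValueError("gains are tabulated only for 20 ms and 40 ms buffers")
    column = 0 if buffer_ms == 20 else 1
    # max() keeps the first maximal entry, so ties resolve to the lower jitter.
    jitter, gains = max(BUFFER_GAIN[codec].items(), key=lambda item: item[1][column])
    return jitter, gains[column]
```

On these data, the 20 ms buffer helps most at 8 ms of jitter for both coding types, while the 40 ms buffer peaks at 10 ms of jitter.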

The above correspondence between the controllable parameters in the sending unit, transmission unit and receiving unit (coding type, coding rate, transmission path performance, receive buffer and so on) and the change in the quality-of-experience MOS value provides data support for the design of the audio application optimization strategy of the present invention.

Wherein the step in which the monitoring unit adopts the corresponding adjustment and optimization strategy for different situations includes:

Periodically compare the measured audio quality of experience with the preset audio quality-of-experience thresholds, and tune one or more of the performance parameters according to the result, so as to optimize and adjust the audio;

The strategy is adjusted separately for four cases, where the calculated quality-of-experience value is denoted MOS_C:

If MOS_C is greater than or equal to the first threshold, the settings in the sending unit, transmission unit and receiving unit are not adjusted;

If MOS_C is less than the first threshold and greater than or equal to the second threshold, the settings of the sending unit and receiving unit may be adjusted to further optimize quality of experience. That is, according to the quality of experience associated with each coding type and rate, a coding type and coding rate with low quality of experience are changed to a coding type and rate with high quality of experience;

In the specific optimization process, if adjusting the coding type and coding rate raises the quality of experience of the audio application to 0.6 MOS above the second threshold (this value refers to the E-model quality-of-experience classification), no further optimization is performed. Otherwise, in the next audio transmission cycle, the receiving unit doubles its existing receive buffer to further improve the quality-of-experience value; at this point the optimization procedure exits regardless of whether the MOS value improves substantially;

If MOS_C is less than the second threshold and greater than or equal to the third threshold, then for G.711 coding, when the packet loss rate is no greater than 3.3% and the jitter is no greater than 12 ms (with a 40 ms buffer), 8 ms (with a 20 ms buffer) or 3 ms (with a 10 ms buffer), only the settings of the sending unit and receiving unit are adjusted to optimize quality of experience. That is, according to the quality of experience associated with each coding type and rate, a coding type and rate with low quality of experience are changed to a coding type and rate with high quality of experience, until MOS_C is greater than or equal to the second threshold. If quality of experience is still below the second threshold in the next evaluation cycle, the receiving unit doubles its existing receive buffer to further improve the quality-of-experience value. If quality of experience still cannot reach the second threshold, the transmission unit adjusts the audio transmission path to optimize audio quality of experience, until MOS_C is greater than or equal to the second threshold;

If MOS_C is less than the third threshold, the transmission path cannot meet the needs of audio transmission. The audio transmission link must be disconnected and a new route selected, so quality of experience is optimized mainly by adjusting the transmission unit.
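The four cases can be sketched as a dispatcher that returns an ordered escalation list, where each later action is tried only if the earlier ones have not reached the target. The sketch makes several assumptions beyond the text: the action names and the `Measurement` type are hypothetical; the jitter and loss limits are the ones quoted for G.711; and when those limits are exceeded in the third case, the path is adjusted first. That last ordering is an inference from the embodiment's description of repairing the path before optimizing the sender and receiver, not something the strategy text states.

```python
from dataclasses import dataclass

FIRST, SECOND, THIRD = 4.0, 3.0, 2.0           # MOS thresholds from Step 1
MAX_PLR = 0.033                                # loss limit quoted for the third case
JITTER_LIMIT_MS = {10: 3.0, 20: 8.0, 40: 12.0}  # jitter limit per receive buffer size (ms)


@dataclass
class Measurement:
    mos: float        # calculated quality of experience, MOS_C
    plr: float        # packet loss rate as a fraction
    jitter_ms: float
    buffer_ms: int    # current receive (jitter) buffer size


def plan_actions(m: Measurement) -> list:
    """Ordered escalation list for one evaluation cycle (hypothetical action names)."""
    if m.mos >= FIRST:
        return []                                        # case 1: no adjustment
    endpoint_steps = ["upgrade_codec_or_rate", "double_receive_buffer"]
    if m.mos >= SECOND:
        return endpoint_steps                            # case 2: sender, then receiver
    if m.mos >= THIRD:
        limit = JITTER_LIMIT_MS.get(m.buffer_ms)
        path_ok = (limit is not None and m.plr <= MAX_PLR
                   and m.jitter_ms <= limit)
        if path_ok:
            # case 3 within limits: endpoints first, path change as last resort
            return endpoint_steps + ["switch_transmission_path"]
        # Assumption: outside the quoted limits, fix the path before the endpoints.
        return ["switch_transmission_path"] + endpoint_steps
    return ["disconnect_and_reroute"]                    # case 4: path unusable
```

For a measurement with MOS_C of 3.04, the plan is to upgrade the coding type or rate and, if that is not enough, to double the receive buffer.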

The main objects of the present invention are: to divide audio quality of experience into grades, determine the preset quality-of-experience threshold intervals, and determine the types of controllable parameters in the sending unit, transmission unit and receiving unit; to determine the degree to which changes in controllable parameters such as coding type, coding rate, transmission path and receive buffer in the sending unit, transmission unit and receiving unit affect the quality-of-experience value, and to establish a mapping between the system's controllable parameters and quality-of-experience grade adjustment; and to regulate quality of experience in a targeted way with strategies such as coding optimization, path optimization, rate optimization and buffer optimization.

Description of the drawings

Fig. 1 is a diagram of the basic principle of the present invention;

Fig. 2 is a flow chart of the adjustment strategy of the present invention when MOS_C is less than the first threshold and greater than or equal to the second threshold;

Fig. 3 is a flow chart of the adjustment strategy of the present invention when MOS_C is less than the second threshold and greater than or equal to the third threshold;

Fig. 4 is a diagram of the system composition of a specific experimental example of the present invention;

Fig. 5 is a schematic comparison of audio quality of experience after applying the method in one embodiment of the present invention (the case where MOS_C is less than the first threshold and greater than or equal to the second threshold);

Fig. 6 is a schematic comparison of audio quality of experience after applying the method in one embodiment of the present invention (the case where MOS_C is less than the second threshold and greater than or equal to the third threshold).

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application or its use. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

The present invention provides an audio quality self-adjusting control method, the basic principle of which is shown in Fig. 1. The system that carries out the method comprises four parts: a sending unit, a transmission unit, a receiving unit and a monitoring unit.

Further, the step of determining how the controllable parameters in the sending unit affect audio quality of experience includes: the sending unit controls the coding type and the coding rate, as set out in the Summary of the invention.

Further, the steps of determining how the controllable parameters in the transmission unit and the receiving unit affect quality of experience, and the step in which the monitoring unit adopts the corresponding adjustment and optimization strategy for different situations, are likewise as set out in the Summary of the invention.

To address the shortcomings of existing techniques for optimizing and adjusting audio quality, the present invention proposes an optimization and adjustment method for audio quality that adjusts the controllable parameters of multiple units in the system according to audio quality of experience. It mainly involves the sending unit, the transmission unit and the receiving unit, and covers multiple controllable parameters such as audio coding type, audio coding rate, transmission network performance and receive buffer size. The present invention is suitable for scenarios such as VoIP and audio on demand in wireless and wired networks. The system and its functional units are shown in Fig. 2.

The sending unit mainly performs functions such as encoding and packetizing the audio signal. Its controllable parameters are mainly the coding type and the coding rate. It receives the monitoring unit's evaluation of audio quality of experience and adjusts the controllable parameters according to a preset algorithm, so as to optimize quality of experience. The software used in this embodiment is OpenPhone;

The transmission unit mainly performs the transmission of audio packets. Its controllable parameters mainly include bandwidth, packet loss, delay and jitter. Different transmission paths usually differ in these parameters and in network performance, and the specific indicators of a network transmission path can be obtained fairly easily with the Real-time Transport Protocol and network measurement algorithms. In this method, when audio quality is poor, the packet transmission path must be adjusted to improve the performance of the signal transmission path and thereby raise audio quality. Compared with the other two units, however, adjusting the transmission unit is the most complex, so when audio quality is fairly good the transmission unit is generally not adjusted. To make transmission path performance easy to control, this embodiment uses the NISTnet software;

The receiving unit mainly performs functions such as unpacking and decoding the audio signal. Its main controllable parameter is the receive buffer size, which has a very large influence on audio quality. In the specific embodiment, the receive buffer is adjusted mainly according to feedback from the monitoring unit and a preset algorithm, so as to optimize audio quality of experience. The software used at the receiving end in this embodiment is OpenPhone;

The monitoring unit mainly calculates the quality-of-experience value of the audio application in real time. Common machine learning methods such as SVMs, neural networks and decision trees can currently perform this function. This embodiment uses an artificial neural network to establish the mapping: the input layer takes six parameters (coding type, coding rate, delay, jitter, packet loss rate and receive buffer), the hidden layer has 20 nodes, and the output layer has one output.
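The shape of such a network (6 inputs, 20 hidden nodes, 1 output) can be sketched with a plain forward pass. Everything beyond the layer sizes is an assumption made for illustration: the tanh hidden activation, the sigmoid output rescaled to the MOS range 1 to 5, the random untrained weights, the feature scaling in the usage line, and the names `init_weights` and `predict_mos`. The patent does not specify the activations or the training procedure.

```python
import math
import random

N_INPUTS, N_HIDDEN = 6, 20   # layer sizes stated in the embodiment


def init_weights(seed: int = 0):
    """Random, untrained weights; a real model would be fitted to the training set."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
    b1 = [0.0] * N_HIDDEN
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b2 = 0.0
    return w1, b1, w2, b2


def predict_mos(features, weights) -> float:
    """Forward pass: 6 features -> 20 tanh units -> 1 output scaled to [1, 5]."""
    if len(features) != N_INPUTS:
        raise ValueError("expected 6 features: coding type, coding rate, delay, "
                         "jitter, packet loss rate, receive buffer")
    w1, b1, w2, b2 = weights
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w1, b1)]
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 + 4.0 / (1.0 + math.exp(-z))   # sigmoid rescaled to the MOS range


# Usage with assumed feature scaling (coding type index, then values scaled to 0..1).
weights = init_weights()
estimate = predict_mos([2.0, 0.08, 0.10, 0.02, 0.02, 0.20], weights)
```

With untrained weights the estimate is meaningless as a quality score. The sketch only shows how the six controllable parameters flow through the layers to a single MOS output.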

When the sending end codes with G.729, jitter is about 2 ms, packet loss is about 2%, the delay is 100 ms and the jitter buffer is 20 ms, the MOS value is about 3.04. The audio application is then optimized with the method shown in Fig. 3. After optimization the coding is G.711 and the jitter buffer is 40 ms, and the measured audio quality is about 3.60.

When the sending end codes with G.729, jitter is about 8 ms, packet loss is about 6%, the delay is 150 ms and the jitter buffer is 20 ms, the MOS value is about 2.07. With the method shown in Fig. 4, the network transmission path must first be repaired, after which the sending unit and/or the receiving unit optimize the audio application. After optimization the coding is G.711, and the measured audio quality is about 3.50.

Fig. 5 and Fig. 6 are schematic comparisons of audio quality of experience after applying the method. Fig. 5 shows the case where MOS_C is less than the first threshold and greater than or equal to the second threshold, and Fig. 6 the case where MOS_C is less than the second threshold and greater than or equal to the third threshold. The figures show that audio quality of experience improves after adjustment by this method.

It should be understood that the above embodiments are merely exemplary embodiments used to illustrate the principle of the present invention, and the invention is not limited to them. Those of ordinary skill in the art can make various modifications and improvements without departing from the spirit and essence of the invention, and such modifications and improvements are also regarded as falling within the protection scope of the invention.

Claims

1. a kind of audio quality self-adjusting control method, the system for completing this method includes at least transmission unit, transmission unit, connects Receive unit and monitoring unit, which is characterized in that the step of this method includes：

Step 1, receiving unit by audio quality be divided into it is excellent, good, in, poor, bad 5 grades, concrete numerical value set { MOS₅ =5、MOS₄=4、MOS₃=3、MOS₂=2、MOS₁=1 } it indicates, is divided into 4 sections altogether：[5,4]、（4,3]、（3,2]、（2,1], In, MOS₅Indicate that audio grade is ' excellent ' etc., MOS₄Indicate that audio grade is ' good ' etc., MOS₃Indicate that audio grade is ' in ' etc., MOS₂Indicate that audio grade is ' poor ' etc., MOS₁Indicate that audio grade is ' bad ' etc.；By MOS₄Labeled as first threshold, MOS₃Mark It is denoted as second threshold, MOS₂Labeled as third threshold value, MOS₁Labeled as the 4th threshold value；

2. The audio quality self-adjusting control method according to claim 1, characterized in that the step of determining the influence of the controllable parameters in the sending unit on the audio quality of experience comprises:

The sending unit controls the encoding type and the encoding rate;

3. The audio quality self-adjusting control method according to claim 1, characterized in that the step of determining the influence of the controllable parameters in the transmission unit on the audio quality of experience comprises:

(G.711 encoding; PLR ranges from 0 to 1)

(G.729 encoding; PLR ranges from 0 to 1)

(Delay is in ms)

(G.729 encoding; Jitter is in ms; Jitter buffer is 20 ms).

4. The audio quality self-adjusting control method according to claim 1, characterized in that the step of determining the influence of the controllable parameters in the receiving unit on the quality of experience comprises:

The receiving unit controls the receive buffer size. Regarding the receive buffer size (Jitter buffer), for an audio signal with encoding type G.711, the MOS value obtained with a 20-millisecond jitter buffer and with a 40-millisecond jitter buffer increases over the MOS value obtained with a 10-millisecond jitter buffer as follows: with 2 milliseconds of jitter in the transmission process, by 0.30 and 0.36, respectively; with 4 milliseconds of jitter, by 0.49 and 0.63; with 6 milliseconds of jitter, by 0.69 and 0.94; with 8 milliseconds of jitter, by 0.97 and 1.33; with 10 milliseconds of jitter, by 0.77 and 1.53; with 12 milliseconds of jitter, by 0.42 and 1.53; with 14 milliseconds of jitter, by 0.29 and 1.41; with 16 milliseconds of jitter, by 0.27 and 1.18; with 18 milliseconds of jitter, by 0.26 and 0.97. Based on the above data, using interpolation or curve fitting, the functional relationships between jitter (Jitter) and the change in MOS value relative to a 10-millisecond jitter buffer, under different jitter buffer conditions, are as follows:

(Jitter buffer is 20 ms; Jitter is in ms)

(Jitter buffer is 40 ms; Jitter is in ms)

(Jitter buffer is 20 ms; Jitter is in ms)

(Jitter buffer is 40 ms; Jitter is in ms)
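
As a non-limiting illustration of the interpolation approach named in this claim, the G.711 data points listed above can be tabulated and interpolated in Python as follows. This sketch uses plain piecewise-linear interpolation between the measured points; it is not the fitted formulas of the claim, which are not reproduced in this text. The names `JITTER_MS`, `GAIN`, and `mos_gain` are assumptions introduced for illustration.

```python
from bisect import bisect_right

# Jitter values (ms) at which the claim reports MOS gains for G.711.
JITTER_MS = [2, 4, 6, 8, 10, 12, 14, 16, 18]

# MOS gain relative to a 10 ms jitter buffer, keyed by jitter buffer size (ms).
GAIN = {
    20: [0.30, 0.49, 0.69, 0.97, 0.77, 0.42, 0.29, 0.27, 0.26],
    40: [0.36, 0.63, 0.94, 1.33, 1.53, 1.53, 1.41, 1.18, 0.97],
}


def mos_gain(jitter_ms: float, buffer_ms: int) -> float:
    """Linearly interpolate the MOS gain over a 10 ms jitter buffer."""
    gains = GAIN[buffer_ms]  # KeyError for buffer sizes other than 20 or 40
    if not JITTER_MS[0] <= jitter_ms <= JITTER_MS[-1]:
        raise ValueError("jitter outside the measured range of 2 to 18 ms")
    i = bisect_right(JITTER_MS, jitter_ms) - 1
    if i == len(JITTER_MS) - 1:
        return gains[-1]  # exactly at the last measured point
    x0, x1 = JITTER_MS[i], JITTER_MS[i + 1]
    t = (jitter_ms - x0) / (x1 - x0)
    return gains[i] + t * (gains[i + 1] - gains[i])
```

For example, `mos_gain(8, 20)` returns the tabulated 0.97, while `mos_gain(5, 20)` returns a value midway between the 4-millisecond and 6-millisecond entries. The sketch refuses to extrapolate outside the measured range, since the text gives no data there.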

5. The audio quality self-adjusting control method according to claim 1, characterized in that the step in which the monitoring unit adopts the corresponding adjustment and optimization strategy for each situation comprises:

If MOS_C is less than the second threshold and greater than or equal to the third threshold, then, for G.711 encoding, when the packet loss rate is no more than 3.3% and the jitter is no more than 12 milliseconds (with a 40-millisecond buffer), 8 milliseconds (with a 20-millisecond buffer), or 3 milliseconds (with a 10-millisecond buffer), only the settings of the sending unit and the receiving unit are adjusted to optimize the quality of experience; that is, according to the quality of experience corresponding to each encoding type and rate, the encoding type and rate are adjusted from those with low quality of experience to those with high quality of experience, until MOS_C is greater than or equal to the second threshold; if the quality of experience is still below the second threshold in the next evaluation cycle, the receiving unit doubles the existing receive buffer to further improve the quality of experience value; if the quality of experience still cannot reach the second threshold, the transmission unit adjusts the audio transmission path to optimize the audio quality of experience, until MOS_C is greater than or equal to the second threshold;
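
As a non-limiting illustration, the staged adjustment described in the preceding paragraph can be sketched in Python as follows. The `Session` class, the `measure_mos` callback, the candidate lists, and all function names are assumptions introduced for illustration; the original text does not specify how MOS_C is measured or how candidate encodings and paths are enumerated. Only the order of the stages, the 3.3% packet loss limit, and the per-buffer jitter limits are taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, List

SECOND_THRESHOLD = 3.0  # MOS_3
THIRD_THRESHOLD = 2.0   # MOS_2

# Jitter limits (ms) per jitter buffer size (ms) for G.711, as stated above.
JITTER_LIMIT_MS = {40: 12, 20: 8, 10: 3}
MAX_PLR = 0.033  # packet loss rate of 3.3%


@dataclass
class Session:
    codec: str
    buffer_ms: int
    path: str


def endpoint_adjustment_applies(mos_c: float, plr: float,
                                jitter_ms: float, buffer_ms: int) -> bool:
    """Check the conditions under which only the end units are adjusted first."""
    limit = JITTER_LIMIT_MS.get(buffer_ms)
    return (THIRD_THRESHOLD <= mos_c < SECOND_THRESHOLD
            and plr <= MAX_PLR
            and limit is not None and jitter_ms <= limit)


def regulate(session: Session, measure_mos: Callable[[Session], float],
             better_codecs: List[str], alternate_paths: List[str]) -> List[str]:
    """Escalate: encoding, then doubled receive buffer, then transmission path."""
    actions: List[str] = []
    if measure_mos(session) >= SECOND_THRESHOLD:
        return actions
    for codec in better_codecs:          # stage 1: sending unit
        session.codec = codec
        actions.append(f"codec={codec}")
        if measure_mos(session) >= SECOND_THRESHOLD:
            return actions
    session.buffer_ms *= 2               # stage 2: receiving unit
    actions.append(f"buffer={session.buffer_ms}ms")
    if measure_mos(session) >= SECOND_THRESHOLD:
        return actions
    for path in alternate_paths:         # stage 3: transmission unit
        session.path = path
        actions.append(f"path={path}")
        if measure_mos(session) >= SECOND_THRESHOLD:
            return actions
    return actions
```

The returned list records which stages were needed, so a caller can tell whether the adjustment stopped at the end units or had to change the transmission path.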