CN108495182A - A kind of audio quality self-adjusting control method - Google Patents

A kind of audio quality self-adjusting control method

Info

Publication number
CN108495182A
Authority
CN
China
Prior art keywords
milliseconds
values
quality
jitter
mos
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810243626.3A
Other languages
Chinese (zh)
Inventor
胡治国
郭丽峰
闫涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanxi University
Original Assignee
Shanxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi University
Priority to CN201810243626.3A
Publication of CN108495182A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4392 Processing of audio elementary streams involving audio buffer management
    • H04N21/4398 Processing of audio elementary streams involving reformatting operations of audio signals
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/005 Correction of errors induced by the transmission channel, if related to the coding algorithm
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Abstract

The invention discloses an audio quality self-adjusting control method. The method grades audio quality of experience and determines preset quality-of-experience threshold intervals; determines the types of controllable audio parameters in the sending, transmission, and receiving processes; determines how strongly changes in controllable parameters such as coding type, coding rate, transmission path, and receive buffer affect the quality-of-experience value in those processes, and establishes a mapping between the system's controllable parameters and quality-of-experience grade adjustments; and, depending on the quality condition of the audio, regulates quality of experience with coding optimization, path optimization, rate optimization, and buffer optimization strategies. By combining these strategies, the invention effectively improves the quality of experience of audio applications.

Description

A kind of audio quality self-adjusting control method
Technical field
The present invention relates to the field of streaming media and network communication technology, and in particular to an audio quality self-adjusting control method.
Background technology
Audio is one of the main services of network streaming media, and guaranteeing good audio quality of experience is a key technical factor for service providers in winning customers and market share. The complexity, diversity, and fragility of network systems mean that guaranteeing and optimizing quality of experience faces many uncertainties. Although there has been some research on optimizing and adjusting audio quality, overall it has the following shortcomings:
Traditional research improves audio quality of experience by regulating a single parameter (for example, by adjusting the coding rate). Our experience shows, however, that when only a single controllable parameter is optimized, the adjustable range of quality of experience is small, and it is difficult to move from the "poor" or "bad" grade to the "excellent" or "good" grade;
There is no specific quantitative metric or reference for how to adjust the controllable parameters in each unit, or to what degree;
There is little research on optimizing audio quality with a comprehensive strategy that combines multiple methods or means. A complete audio system consists of multiple units, and the units clearly differ both in their influence on audio quality of experience and in their controllability. A traditional single optimization strategy therefore struggles to improve the quality of experience of audio applications effectively.
Invention content
To achieve the above objectives, the technical solution of the present invention is implemented as follows:
The present invention provides an audio quality self-adjusting control method. The method is carried out by four parts, a sending unit, a transmission unit, a receiving unit, and a monitoring unit, and includes:
Step 1: the receiving unit divides audio quality into five grades (excellent, good, fair, poor, and bad), represented by the value set {MOS5=5, MOS4=4, MOS3=3, MOS2=2, MOS1=1}, which gives four intervals: [4, 5], [3, 4), [2, 3), [1, 2). Here MOS5 denotes the 'excellent' grade, MOS4 the 'good' grade, MOS3 the 'fair' grade, MOS2 the 'poor' grade, and MOS1 the 'bad' grade. MOS4 is designated the first threshold, MOS3 the second threshold, MOS2 the third threshold, and MOS1 the fourth threshold;
Step 2: the monitoring unit computes the quality-of-experience value of the audio application in real time. Experiments yield the correspondence between the different parameters in the sending unit, transmission unit, and receiving unit and the quality-of-experience value, which forms a training data set. A machine learning algorithm is then used to build a quality-of-experience assessment model with at least two layers, an input layer and an output layer. The input layer has at least six inputs: coding type, coding rate, the delay, jitter, and packet loss rate of the transmission path, and the receive buffer parameter. The output layer outputs the quality of experience. The model thus maps the multi-dimensional controllable parameters in the sending unit, transmission unit, and receiving unit to quality of experience;
Step 3: determine how changes in the controllable parameters of the sending unit, transmission unit, and receiving unit (coding type, coding rate, transmission path performance, and receive buffer) affect quality of experience, and establish the mapping between a change in each unit's controllable parameter value and the resulting increase or decrease in the quality-of-experience value;
Step 4: the monitoring unit measures quality of experience at the system terminal and compares the measured value with the preset quality-of-experience thresholds. Based on the comparison, at least one of the coding type, coding rate, transmission path performance, and receive buffer in the sending unit, transmission unit, and receiving unit is adjusted, and audio signals are then sent, transmitted, and received with the adjusted coding type and/or coding rate and/or transmission path parameters and/or receive buffer, thereby optimizing the quality of experience of the audio application.
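The grading in Step 1 and the threshold comparison in Step 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `classify_mos` is hypothetical, the intervals are written low-to-high, and each threshold value is taken to belong to the higher interval.

```python
# Thresholds from Step 1: MOS4 = 4 (first), MOS3 = 3 (second),
# MOS2 = 2 (third), MOS1 = 1 (fourth).
FIRST, SECOND, THIRD, FOURTH = 4.0, 3.0, 2.0, 1.0


def classify_mos(mos: float) -> str:
    """Return the Step 1 interval that a measured MOS value falls into.

    Hypothetical helper; the returned labels are illustrative only.
    """
    if not FOURTH <= mos <= 5.0:
        raise ValueError(f"MOS must lie in [1, 5], got {mos}")
    if mos >= FIRST:
        return "[4, 5]"
    if mos >= SECOND:
        return "[3, 4)"
    if mos >= THIRD:
        return "[2, 3)"
    return "[1, 2)"
```

A monitoring loop would call `classify_mos` on each measured value and pick an adjustment strategy according to the interval returned.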
The step of determining how the controllable parameters in the sending unit affect audio quality of experience includes:
The sending unit controls the coding type and coding rate;
For the voice quality of the different coding types: G.711 > G.726 > G.729 > G.723;
G.711-coded audio quality is 0.4 MOS higher than G.726, G.726 is 0.2 MOS higher than G.729, and G.729 is 0.2 MOS higher than G.723;
For different coding rates under the same coding type, a higher coding rate gives higher audio quality than a lower one: 11.8 kbit/s (G.729) > 8 kbit/s (G.729) > 6.4 kbit/s (G.729); 6.3 kbit/s (G.723.1) > 5.3 kbit/s (G.723.1);
Among the scalable rates of G.729 coding, the 11.8 kbit/s rate gives audio quality 0.2 MOS higher than the 8 kbit/s rate, and the 8 kbit/s rate gives audio quality 0.1 MOS higher than the 6.4 kbit/s rate; in G.723.1 coding, the 6.3 kbit/s rate gives audio quality 0.1 MOS higher than the 5.3 kbit/s rate.
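The coding-type and coding-rate gaps listed above can be tabulated so that the expected MOS change of a switch is a simple lookup. The sketch below is illustrative only: the table layout and the function names are assumptions, and the codec offsets are accumulated from the pairwise gaps in the text (relative to G.723). Codec and rate offsets are kept separate because the text does not say how they combine.

```python
# MOS offsets relative to G.723, accumulated from the stated pairwise gaps:
# G.711 is 0.4 above G.726, G.726 is 0.2 above G.729, G.729 is 0.2 above G.723.
CODEC_OFFSET = {"G.723": 0.0, "G.729": 0.2, "G.726": 0.4, "G.711": 0.8}

# MOS offsets between rates (kbit/s) of one codec, relative to its lowest listed rate.
RATE_OFFSET = {
    "G.729": {6.4: 0.0, 8.0: 0.1, 11.8: 0.3},
    "G.723.1": {5.3: 0.0, 6.3: 0.1},
}


def codec_switch_gain(current: str, target: str) -> float:
    """Expected MOS change when switching coding type (hypothetical helper)."""
    return round(CODEC_OFFSET[target] - CODEC_OFFSET[current], 2)


def rate_switch_gain(codec: str, current_kbps: float, target_kbps: float) -> float:
    """Expected MOS change when switching rate within one codec (hypothetical helper)."""
    table = RATE_OFFSET[codec]
    return round(table[target_kbps] - table[current_kbps], 2)
```

For example, `codec_switch_gain("G.729", "G.711")` combines the 0.2 and 0.4 gaps, and `rate_switch_gain("G.729", 6.4, 11.8)` combines the 0.1 and 0.2 gaps.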
The step of determining how the controllable parameters in the transmission unit affect audio quality of experience includes:
The transmission unit controls transmission path performance. For the transmission path packet loss rate (PLR) with G.711 coding, taking the no-loss case as the reference, a packet loss rate of 0.8% reduces audio quality by 0.1 MOS, 3.3% by 0.5 MOS, 7.6% by 1 MOS, 11.8% by 1.5 MOS, and 19% by 2 MOS. For G.729 coding, compared with the no-loss case, a packet loss rate of 0.5% reduces audio quality by 0.1 MOS, 2.1% by 0.5 MOS, 5.7% by 1 MOS, 9.8% by 1.5 MOS, and 16% by 2 MOS. From these data, interpolation or curve fitting gives the functional relationship between the packet loss rate (PLR) and the change in MOS value, as follows:
[formula missing from the source text] (G.711 coding; PLR takes values from 0 to 1)
[formula missing from the source text] (G.729 coding; PLR takes values from 0 to 1)
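Because the fitted formulas did not survive in this text, the sketch below only illustrates the interpolation option on the quoted data points. It is an assumption-laden example: the function name is hypothetical, piecewise-linear interpolation is one choice among several, the packet loss rate is passed in percent, and holding the last value beyond the final data point is a simplification.

```python
from bisect import bisect_right

# (packet loss rate in percent, MOS reduction) pairs quoted in the text,
# with the no-loss reference point (0, 0) prepended.
PLR_POINTS = {
    "G.711": [(0.0, 0.0), (0.8, 0.1), (3.3, 0.5), (7.6, 1.0), (11.8, 1.5), (19.0, 2.0)],
    "G.729": [(0.0, 0.0), (0.5, 0.1), (2.1, 0.5), (5.7, 1.0), (9.8, 1.5), (16.0, 2.0)],
}


def mos_drop_from_plr(codec: str, plr_percent: float) -> float:
    """Piecewise-linear estimate of the MOS reduction caused by packet loss."""
    if plr_percent < 0:
        raise ValueError("packet loss rate cannot be negative")
    points = PLR_POINTS[codec]
    xs = [x for x, _ in points]
    if plr_percent >= xs[-1]:
        # Assumption: beyond the last quoted point, hold the last quoted value.
        return points[-1][1]
    i = bisect_right(xs, plr_percent)  # index of the first point above plr_percent
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (plr_percent - x0) / (x1 - x0)
```

At a quoted point the function returns the quoted reduction, and between two points it returns a proportional value.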
For transmission path delay (Delay), for all coding types and taking the no-delay case as the reference, a transmission delay of 170 ms reduces audio coding quality by 0.1 MOS, 265 ms by 0.5 MOS, 360 ms by 1 MOS, 480 ms by 1.5 MOS, and 700 ms by 2 MOS. From these data, interpolation or curve fitting gives the functional relationship between delay (Delay) and the change in MOS value, as follows:
[formula missing from the source text] (Delay is in ms)
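One way to use the delay data is to invert it: given a tolerable MOS reduction, find the largest delay that stays within it. The sketch below does this by linear interpolation between the quoted points. It is not the patent's fitted formula, which is missing from this text; the function name and the capping at the last quoted delay are assumptions.

```python
# (delay in ms, MOS reduction) pairs quoted in the text,
# with the no-delay reference point (0, 0) prepended.
DELAY_POINTS = [(0.0, 0.0), (170.0, 0.1), (265.0, 0.5),
                (360.0, 1.0), (480.0, 1.5), (700.0, 2.0)]


def max_delay_for_budget(mos_budget: float) -> float:
    """Largest delay (ms) whose interpolated MOS reduction stays within mos_budget."""
    if mos_budget < 0:
        raise ValueError("MOS budget cannot be negative")
    if mos_budget >= DELAY_POINTS[-1][1]:
        # Assumption: the data stop at 700 ms, so the answer is capped there.
        return DELAY_POINTS[-1][0]
    for (d0, m0), (d1, m1) in zip(DELAY_POINTS, DELAY_POINTS[1:]):
        if m0 <= mos_budget < m1:
            return d0 + (d1 - d0) * (mos_budget - m0) / (m1 - m0)
    raise AssertionError("unreachable: budgets below the last point hit a segment")
```

A budget that matches a quoted reduction returns the quoted delay; a budget halfway between 0.5 and 1 MOS returns the midpoint of 265 ms and 360 ms.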
For transmission path jitter (Jitter), the jitter-free case is taken as the reference throughout.
For G.711 coding with a 10 ms jitter buffer, 1 ms of jitter reduces audio quality by 0.1 MOS, 3 ms by 0.5 MOS, 6 ms by 1 MOS, 9 ms by 1.5 MOS, and 10 ms by 2 MOS. With a 20 ms jitter buffer, 4 ms of jitter reduces audio quality by 0.1 MOS, 8 ms by 0.5 MOS, 10 ms by 1 MOS, 12 ms by 1.5 MOS, and 18 ms by 2 MOS. With a 40 ms jitter buffer, 9 ms of jitter reduces audio quality by 0.1 MOS, 12 ms by 0.5 MOS, 16 ms by 1 MOS, 19 ms by 1.5 MOS, and 21 ms by 2 MOS.
For G.729 coding with a 10 ms jitter buffer, 1 ms of jitter reduces audio quality by 0.1 MOS, 3 ms by 0.5 MOS, 5 ms by 1 MOS, 7 ms by 1.5 MOS, and 9 ms by 2 MOS. With a 20 ms jitter buffer, 5 ms of jitter reduces audio quality by 0.1 MOS, 8 ms by 0.5 MOS, 11 ms by 1 MOS, 13 ms by 1.5 MOS, and 18 ms by 2 MOS. With a 40 ms jitter buffer, 10 ms of jitter reduces audio quality by 0.1 MOS, 11 ms by 0.5 MOS, 16 ms by 1 MOS, 18 ms by 1.5 MOS, and 20 ms by 2 MOS.
From these data, interpolation or curve fitting gives the functional relationship between jitter (Jitter) and the change in MOS value in each case. Taking a 20 ms jitter buffer as an example, the relationships are as follows:
[formula missing from the source text] (G.711 coding; Jitter is in ms; jitter buffer is 20 ms)
[formula missing from the source text] (G.729 coding; Jitter is in ms; jitter buffer is 20 ms).
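The jitter data for the three buffer sizes can be used to pick the smallest buffer that keeps the MOS reduction within a budget. The sketch below is an illustration only: the function names are hypothetical, linear interpolation between quoted points stands in for the missing fitted formulas, and values beyond the last quoted point are held constant.

```python
# (jitter in ms, MOS reduction) pairs quoted in the text, keyed by
# (coding type, jitter buffer in ms); the jitter-free point (0, 0) is prepended.
JITTER_POINTS = {
    ("G.711", 10): [(0, 0.0), (1, 0.1), (3, 0.5), (6, 1.0), (9, 1.5), (10, 2.0)],
    ("G.711", 20): [(0, 0.0), (4, 0.1), (8, 0.5), (10, 1.0), (12, 1.5), (18, 2.0)],
    ("G.711", 40): [(0, 0.0), (9, 0.1), (12, 0.5), (16, 1.0), (19, 1.5), (21, 2.0)],
    ("G.729", 10): [(0, 0.0), (1, 0.1), (3, 0.5), (5, 1.0), (7, 1.5), (9, 2.0)],
    ("G.729", 20): [(0, 0.0), (5, 0.1), (8, 0.5), (11, 1.0), (13, 1.5), (18, 2.0)],
    ("G.729", 40): [(0, 0.0), (10, 0.1), (11, 0.5), (16, 1.0), (18, 1.5), (20, 2.0)],
}


def mos_drop_from_jitter(codec: str, buffer_ms: int, jitter_ms: float) -> float:
    """Piecewise-linear estimate of the MOS reduction caused by jitter."""
    if jitter_ms < 0:
        raise ValueError("jitter cannot be negative")
    points = JITTER_POINTS[(codec, buffer_ms)]
    if jitter_ms >= points[-1][0]:
        # Assumption: hold the last quoted value beyond the quoted range.
        return points[-1][1]
    for (j0, m0), (j1, m1) in zip(points, points[1:]):
        if j0 <= jitter_ms < j1:
            return m0 + (m1 - m0) * (jitter_ms - j0) / (j1 - j0)
    raise AssertionError("unreachable")


def smallest_buffer_within(codec: str, jitter_ms: float, mos_budget: float):
    """Return the smallest quoted buffer (10, 20, 40 ms) within budget, else None."""
    for buffer_ms in (10, 20, 40):
        if mos_drop_from_jitter(codec, buffer_ms, jitter_ms) <= mos_budget:
            return buffer_ms
    return None
```

For G.711 with 8 ms of jitter and a 0.5 MOS budget, the 10 ms buffer is rejected and the 20 ms buffer is the first that fits.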
The step of determining how the controllable parameters in the receiving unit affect quality of experience includes:
The receiving unit controls the receive buffer size. For the receive buffer size (Jitter buffer) with G.711 coding, the MOS value increases as follows relative to a 10 ms jitter buffer. With 2 ms of jitter in transmission, a 20 ms jitter buffer raises the MOS value by 0.30 and a 40 ms jitter buffer by 0.36. With 4 ms of jitter, the increases are 0.49 (20 ms buffer) and 0.63 (40 ms buffer). With 6 ms of jitter, they are 0.69 and 0.94. With 8 ms of jitter, they are 0.97 and 1.33. With 10 ms of jitter, they are 0.77 and 1.53. With 12 ms of jitter, they are 0.42 and 1.53. With 14 ms of jitter, they are 0.29 and 1.41. With 16 ms of jitter, they are 0.27 and 1.18. With 18 ms of jitter, they are 0.26 and 0.97. From these data, interpolation or curve fitting gives the approximate functional relationship between jitter (Jitter) and the change in MOS value for the different jitter buffer sizes, relative to a 10 ms jitter buffer, as follows:
[formula missing from the source text] (jitter buffer is 20 ms; Jitter is in ms)
[formula missing from the source text] (jitter buffer is 40 ms; Jitter is in ms)
For G.729 coding, the MOS value increases as follows relative to a 10 ms jitter buffer. With 2 ms of jitter in transmission, a 20 ms jitter buffer raises the MOS value by 0.35 and a 40 ms jitter buffer by 0.37. With 4 ms of jitter, the increases are 0.66 (20 ms buffer) and 0.75 (40 ms buffer). With 6 ms of jitter, they are 1.15 and 1.33. With 8 ms of jitter, they are 1.41 and 1.87. With 10 ms of jitter, they are 1.36 and 2.0. With 12 ms of jitter, they are 0.92 and 1.87. With 14 ms of jitter, they are 0.61 and 1.73. With 16 ms of jitter, they are 0.35 and 1.38. With 18 ms of jitter, they are 0.29 and 0.87. From these data, interpolation or curve fitting gives the functional relationship between jitter (Jitter) and the change in MOS value for the different jitter buffer sizes, relative to a 10 ms jitter buffer, as follows:
[formula missing from the source text] (jitter buffer is 20 ms; Jitter is in ms)
[formula missing from the source text] (jitter buffer is 40 ms; Jitter is in ms)
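The buffer-size gains quoted for G.711 and G.729 can be stored as a lookup table and used to estimate what doubling the receive buffer would buy. The sketch below is illustrative: the table layout and the function name `doubling_gain` are assumptions, only the quoted jitter values (2 ms to 18 ms in 2 ms steps) are supported, and the gain of going from 20 ms to 40 ms is taken as the difference of the two quoted gains.

```python
# jitter (ms) -> (MOS gain with a 20 ms buffer, MOS gain with a 40 ms buffer),
# both relative to a 10 ms jitter buffer, as quoted in the text.
BUFFER_GAIN = {
    "G.711": {2: (0.30, 0.36), 4: (0.49, 0.63), 6: (0.69, 0.94),
              8: (0.97, 1.33), 10: (0.77, 1.53), 12: (0.42, 1.53),
              14: (0.29, 1.41), 16: (0.27, 1.18), 18: (0.26, 0.97)},
    "G.729": {2: (0.35, 0.37), 4: (0.66, 0.75), 6: (1.15, 1.33),
              8: (1.41, 1.87), 10: (1.36, 2.0), 12: (0.92, 1.87),
              14: (0.61, 1.73), 16: (0.35, 1.38), 18: (0.29, 0.87)},
}


def doubling_gain(codec: str, jitter_ms: int, current_buffer_ms: int) -> float:
    """Estimated MOS gain from doubling the receive buffer (hypothetical helper)."""
    gain_20, gain_40 = BUFFER_GAIN[codec][jitter_ms]
    if current_buffer_ms == 10:
        return gain_20
    if current_buffer_ms == 20:
        # Assumption: gains relative to 10 ms are additive along 10 -> 20 -> 40.
        return round(gain_40 - gain_20, 2)
    raise ValueError("only 10 ms and 20 ms starting buffers are covered by the data")
```

The table shows that the 20 ms gain peaks around 8 ms of jitter for both codecs, while the 40 ms gain peaks around 10 ms to 12 ms.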
The correspondences above between the controllable parameters in the sending unit, transmission unit, and receiving unit (coding type, coding rate, transmission path performance, and receive buffer) and changes in the quality-of-experience MOS value provide the data that support the design of the audio application optimization strategy of the present invention.
The step in which the monitoring unit chooses an adjustment and optimization strategy for each situation includes:
Periodically compare the measured audio quality of experience with the preset quality-of-experience thresholds, and tune one or several of the performance parameters according to the comparison, so as to optimize and adjust the audio;
The adjustment strategy is divided into four cases, where the computed quality-of-experience value is denoted MOSC:
If MOSC is greater than or equal to the first threshold, the settings in the sending unit, transmission unit, and receiving unit are not adjusted;
If MOSC is less than the first threshold and greater than or equal to the second threshold, the settings of the sending unit and receiving unit can be adjusted to further optimize quality of experience; that is, according to the quality of experience associated with each coding type and rate, the coding type and coding rate are changed from ones with lower quality of experience to ones with higher quality of experience;
In the specific optimization process, if adjusting the coding type and coding rate raises the quality of experience of the audio application to 0.6 MOS above the second threshold (a figure taken from the E-model quality-of-experience classification criteria), no further optimization is performed. Otherwise, in the next audio transmission cycle, the receiving unit doubles its current receive buffer to further improve the quality-of-experience value, and the optimization procedure then exits regardless of whether the MOS value could still be improved substantially;
If MOSC is less than the second threshold and greater than or equal to the third threshold then, for G.711 coding, when the packet loss rate is no more than 3.3% and the jitter is no more than 12 ms (with a 40 ms buffer), 8 ms (with a 20 ms buffer), or 3 ms (with a 10 ms buffer), only the settings of the sending unit and receiving unit are adjusted at first: according to the quality of experience associated with each coding type and rate, the coding type and rate are changed from lower-quality to higher-quality ones until MOSC is greater than or equal to the second threshold. If quality of experience is still below the second threshold in the next evaluation cycle, the receiving unit doubles its current receive buffer to improve the value further. If quality of experience still cannot reach the second threshold, the transmission unit adjusts the audio transmission path to optimize audio quality of experience until MOSC is greater than or equal to the second threshold;
If MOSC is less than the third threshold, the transmission path cannot meet the needs of audio transmission; the audio transmission link must be disconnected and a new route selected, so quality of experience is optimized mainly through adjustment by the transmission unit.
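The four cases above amount to an escalation plan keyed on the measured value. The sketch below returns the ordered list of actions for each case. The action names and function names are hypothetical, the 0.6 MOS stop margin is read as "at least 0.6 above the second threshold", and the text does not say what happens in the third case when the G.711 conditions fail, so that branch is left to the caller.

```python
from typing import List

FIRST, SECOND, THIRD = 4.0, 3.0, 2.0   # MOS4, MOS3, MOS2 thresholds
STOP_MARGIN = 0.6                      # margin above the second threshold

# Jitter limits (ms) per receive buffer size (ms) quoted for G.711 in the third case.
G711_JITTER_LIMIT = {40: 12, 20: 8, 10: 3}


def plan_actions(mos_c: float) -> List[str]:
    """Ordered escalation steps for a measured MOS value (illustrative names)."""
    if mos_c >= FIRST:
        return []
    if mos_c >= SECOND:
        return ["adjust_coding_type_and_rate", "double_receive_buffer"]
    if mos_c >= THIRD:
        return ["adjust_coding_type_and_rate", "double_receive_buffer",
                "adjust_transmission_path"]
    return ["disconnect_link", "reselect_route"]


def needs_buffer_step(mos_after_coding: float) -> bool:
    """Second case: skip buffer doubling once MOS is 0.6 above the second threshold."""
    return mos_after_coding < SECOND + STOP_MARGIN


def g711_case3_conditions_hold(plr_percent: float, jitter_ms: float,
                               buffer_ms: int) -> bool:
    """Third case: check the quoted G.711 packet loss and jitter limits."""
    return plr_percent <= 3.3 and jitter_ms <= G711_JITTER_LIMIT[buffer_ms]
```

A monitoring unit would call `plan_actions` once per evaluation cycle and work through the returned steps one cycle at a time, stopping as soon as the target threshold is met.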
The main object of the present invention is to: grade audio experience and determine the preset quality-of-experience threshold intervals; determine the types of controllable parameters in the sending unit, transmission unit, and receiving unit; determine how changes in controllable parameters such as coding type, coding rate, transmission path, and receive buffer in those units affect the quality-of-experience value, and establish the mapping between the system's controllable parameters and quality-of-experience grade adjustments; and regulate quality of experience in a targeted way with strategies such as coding optimization, path optimization, rate optimization, and buffer optimization.
Description of the drawings
Fig. 1 is a diagram of the basic principle of the present invention;
Fig. 2 is a flow chart of the adjustment strategy of the present invention when MOSC is less than the first threshold and greater than or equal to the second threshold;
Fig. 3 is a flow chart of the adjustment strategy of the present invention when MOSC is less than the second threshold and greater than or equal to the third threshold;
Fig. 4 is a system composition diagram of a specific experimental example of the present invention;
Fig. 5 is a schematic comparison of audio quality of experience after applying the method in one embodiment of the present invention (the case where MOSC is less than the first threshold and greater than or equal to the second threshold);
Fig. 6 is a schematic comparison of audio quality of experience after applying the method in one embodiment of the present invention (the case where MOSC is less than the second threshold and greater than or equal to the third threshold).
Specific implementation mode
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present invention or its application or use. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below evidently show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
The present invention provides a kind of audio quality self-adjusting control method, the basic principle figure of this method is as shown in Figure 1.It is complete At this method, including four transmission unit, transmission unit, receiving unit, monitoring unit parts, including:
Step 1: the receiving unit divides audio quality into five grades (excellent, good, fair, poor, bad), represented by the numeric set {MOS5 = 5, MOS4 = 4, MOS3 = 3, MOS2 = 2, MOS1 = 1}, which gives four intervals: [5,4], (4,3], (3,2], (2,1]. MOS5 denotes the grade "excellent", MOS4 the grade "good", MOS3 the grade "fair", MOS2 the grade "poor", and MOS1 the grade "bad". MOS4 is marked as the first threshold, MOS3 as the second threshold, MOS2 as the third threshold, and MOS1 as the fourth threshold;
Step 2: the monitoring unit calculates the quality-of-experience value of the audio application in real time. The correspondence between different parameters in the sending, transmission, and receiving units and the quality-of-experience value is obtained by experiment, giving a training data set, and a quality-of-experience assessment model is built with a machine learning algorithm. The model contains at least two layers, an input layer and an output layer. The input layer has at least six inputs: coding type, coding rate, the delay, jitter, and packet loss rate of the transmission path, and the receive buffer parameter. The output layer outputs the quality of experience. The model realizes the mapping from the controllable multidimensional parameters in the sending, transmission, and receiving units to the quality of experience;
Step 3: determine how strongly changes in the controllable parameters of the sending, transmission, and receiving units (coding type, coding rate, transmission path performance, and receive buffer) affect the quality of experience, and establish the mapping between a change in the value of each unit's controllable parameter and the resulting increase or decrease in the quality-of-experience value;
Step 4: the monitoring unit measures the quality of experience at the system terminal and compares the measured value with the preset quality-of-experience threshold levels. According to the result of the comparison, at least one of the coding type, coding rate, transmission path performance, and receive buffer in the sending, transmission, and receiving units is adjusted. The audio signal is then sent, transmitted, and received using the adjusted coding type and/or coding rate and/or transmission path parameters and/or receive buffer, thereby optimizing the quality of experience of the audio application.
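The interval division of Step 1 can be sketched in a few lines of Python. This is only an illustration: the function name `mos_interval` and the constant names are hypothetical, and the handling of boundary values simply follows the bracket notation of the four intervals given above.

```python
# Threshold values taken from Step 1; the constant names are illustrative.
FIRST_THRESHOLD = 4.0   # MOS4
SECOND_THRESHOLD = 3.0  # MOS3
THIRD_THRESHOLD = 2.0   # MOS2
FOURTH_THRESHOLD = 1.0  # MOS1


def mos_interval(mos: float) -> str:
    """Return the Step 1 interval that contains the given MOS value."""
    if not FOURTH_THRESHOLD <= mos <= 5.0:
        raise ValueError(f"MOS value outside [1, 5]: {mos}")
    if mos >= FIRST_THRESHOLD:
        return "[5,4]"
    if mos >= SECOND_THRESHOLD:
        return "(4,3]"
    if mos >= THIRD_THRESHOLD:
        return "(3,2]"
    return "(2,1]"


if __name__ == "__main__":
    for value in (4.5, 3.04, 2.07, 1.0):
        print(value, mos_interval(value))
```

A measured value that falls in a lower interval is what triggers the adjustments of Step 4.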
Further, the step of determining how the controllable parameters in the sending unit affect the audio quality of experience includes:
The sending unit controls the coding type and coding rate;
In terms of the voice quality of the different coding types: G.711 > G.726 > G.729 > G.723;
The audio quality of G.711 coding is 0.4 MOS higher than that of G.726 coding, G.726 coding is 0.2 MOS higher than G.729 coding, and G.729 coding is 0.2 MOS higher than G.723 coding;
In terms of the voice quality of different coding rates, within the same coding type the audio quality at a higher coding rate is higher than at a lower coding rate: 11.8 kbit/s (G.729) > 8 kbit/s (G.729) > 6.4 kbit/s (G.729); 6.3 kbit/s (G.723.1) > 5.3 kbit/s (G.723.1);
Among the scalable rates of G.729 coding, the audio quality at 11.8 kbit/s is 0.2 MOS higher than at 8 kbit/s, and the audio quality at 8 kbit/s is 0.1 MOS higher than at 6.4 kbit/s; in G.723.1 coding, the audio quality at 6.3 kbit/s is 0.1 MOS higher than at 5.3 kbit/s.
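The pairwise differences above can be collected into lookup tables, as in the following minimal sketch. The table layout, the function names, and the choice of G.711 and of the lowest listed rate as reference points are assumptions made for illustration; only the MOS differences themselves come from the text.

```python
# MOS offset of each coding type relative to G.711, accumulated from the
# pairwise differences in the text (0.4, 0.2, 0.2).
CODEC_OFFSET = {"G.711": 0.0, "G.726": -0.4, "G.729": -0.6, "G.723": -0.8}

# MOS offset of each coding rate (kbit/s) relative to the lowest listed rate
# of the same coding type.
RATE_OFFSET = {
    "G.729": {6.4: 0.0, 8.0: 0.1, 11.8: 0.3},
    "G.723.1": {5.3: 0.0, 6.3: 0.1},
}


def codec_switch_gain(current: str, target: str) -> float:
    """Expected MOS change when switching from one coding type to another."""
    return round(CODEC_OFFSET[target] - CODEC_OFFSET[current], 2)


def rate_switch_gain(codec: str, current_rate: float, target_rate: float) -> float:
    """Expected MOS change when switching rate within one coding type."""
    table = RATE_OFFSET[codec]
    return round(table[target_rate] - table[current_rate], 2)


if __name__ == "__main__":
    print(codec_switch_gain("G.729", "G.711"))
    print(rate_switch_gain("G.729", 6.4, 11.8))
```

A positive result means the switch is expected to raise the quality of experience, which is the direction the sending unit adjusts in.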
Further, the step of determining how the controllable parameters in the transmission unit affect the audio quality of experience includes:
The transmission unit controls transmission path performance. For the packet loss rate (PLR) of the transmission path: with G.711 coding, taking the case of no network packet loss as the reference, packet loss rates of 0.8%, 3.3%, 7.6%, 11.8%, and 19% lower audio quality by 0.1, 0.5, 1, 1.5, and 2 MOS respectively; with G.729 coding, compared with the case of no network packet loss, packet loss rates of 0.5%, 2.1%, 5.7%, 9.8%, and 16% lower audio quality by 0.1, 0.5, 1, 1.5, and 2 MOS respectively. From these data, the functional relationship between the packet loss rate (PLR) and the change in MOS value is calculated by interpolation or curve fitting, as follows:
[Formula not reproduced in the source text] (G.711 coding; PLR takes values from 0 to 1)
[Formula not reproduced in the source text] (G.729 coding; PLR takes values from 0 to 1)
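As a sketch of the interpolation option named above, the listed data points can be interpolated piecewise linearly. This is not the patent's own fitted formula, which is not available here. The function names are hypothetical, and two assumptions are added: a loss rate of 0 gives a reduction of 0, and loss rates beyond the last data point are clamped to a reduction of 2 MOS.

```python
from bisect import bisect_right

# (PLR as a fraction, MOS reduction) pairs from the text; the (0, 0) point
# is the no-packet-loss reference case.
PLR_POINTS = {
    "G.711": [(0.0, 0.0), (0.008, 0.1), (0.033, 0.5),
              (0.076, 1.0), (0.118, 1.5), (0.19, 2.0)],
    "G.729": [(0.0, 0.0), (0.005, 0.1), (0.021, 0.5),
              (0.057, 1.0), (0.098, 1.5), (0.16, 2.0)],
}


def interpolate(points, x):
    """Piecewise-linear interpolation, clamped at both ends of the table."""
    xs = [p[0] for p in points]
    if x <= xs[0]:
        return points[0][1]
    if x >= xs[-1]:
        return points[-1][1]
    i = bisect_right(xs, x)              # xs[i - 1] <= x < xs[i]
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def mos_drop_from_plr(codec: str, plr: float) -> float:
    """Estimated MOS reduction for a packet loss rate given as a fraction."""
    if not 0.0 <= plr <= 1.0:
        raise ValueError("PLR must lie between 0 and 1")
    return interpolate(PLR_POINTS[codec], plr)


if __name__ == "__main__":
    print(mos_drop_from_plr("G.711", 0.02))
    print(mos_drop_from_plr("G.729", 0.02))
```

The same helper could be reused for the delay and jitter data by swapping in their data points.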
For the delay of the transmission path (Delay): for all coding types, taking the case of no network delay as the reference, transmission delays of 170, 265, 360, 480, and 700 ms lower the audio quality of every coding type by 0.1, 0.5, 1, 1.5, and 2 MOS respectively. From these data, the functional relationship between the delay (Delay) and the change in MOS value is calculated by interpolation or curve fitting, as follows:
[Formula not reproduced in the source text] (Delay is in ms)
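As a worked illustration of the interpolation option, assume simple linear interpolation between neighbouring data points. This is not the patent's own fitted formula, and the symbol ΔMOS for the reduction in MOS value is notation introduced only for this example. A delay of 300 ms lies between the 265 ms point (reduction 0.5) and the 360 ms point (reduction 1.0), so:

```latex
\Delta MOS(300\,\mathrm{ms}) \approx 0.5 + (1.0 - 0.5)\cdot\frac{300 - 265}{360 - 265}
  = 0.5 + 0.5\cdot\frac{35}{95} \approx 0.68
```

Under this assumption, a 300 ms delay would lower the quality of every coding type by roughly 0.68 MOS relative to the no-delay case.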
For the jitter of the transmission path (Jitter), the reference in every case is the jitter-free situation. With G.711 coding and a 10 ms jitter buffer, jitter of 1, 3, 6, 9, and 10 ms lowers audio quality by 0.1, 0.5, 1, 1.5, and 2 MOS respectively; with a 20 ms jitter buffer, jitter of 4, 8, 10, 12, and 18 ms lowers audio quality by 0.1, 0.5, 1, 1.5, and 2 MOS respectively;
with a 40 ms jitter buffer, jitter of 9, 12, 16, 19, and 21 ms lowers audio quality by 0.1, 0.5, 1, 1.5, and 2 MOS respectively. With G.729 coding and a 10 ms jitter buffer, jitter of 1, 3, 5, 7, and 9 ms lowers audio quality by 0.1, 0.5, 1, 1.5, and 2 MOS respectively; with a 20 ms jitter buffer, jitter of 5, 8, 11, 13, and 18 ms lowers audio quality by 0.1, 0.5, 1, 1.5, and 2 MOS respectively; with a 40 ms jitter buffer, jitter of 10, 11, 16, 18, and 20 ms lowers audio quality by 0.1, 0.5, 1, 1.5, and 2 MOS respectively.
From these data, the functional relationship between the jitter (Jitter) and the change in MOS value in the above situations is calculated by interpolation or curve fitting; taking a jitter buffer of 20 ms as an example, it is as follows:
[Formula not reproduced in the source text] (G.711 coding; Jitter is in ms; jitter buffer is 20 ms)
[Formula not reproduced in the source text] (G.729 coding; Jitter is in ms; jitter buffer is 20 ms).
Further, the step of determining how the controllable parameters in the receiving unit affect the quality of experience includes:
The receiving unit controls the receive buffer size. For the receive buffer size (jitter buffer), with G.711-coded audio and relative to a 10 ms jitter buffer, the MOS value increases with a 20 ms and a 40 ms jitter buffer are, respectively: 0.30 and 0.36 with 2 ms of jitter in transmission; 0.49 and 0.63 with 4 ms; 0.69 and 0.94 with 6 ms; 0.97 and 1.33 with 8 ms; 0.77 and 1.53 with 10 ms; 0.42 and 1.53 with 12 ms; 0.29 and 1.41 with 14 ms;
0.27 and 1.18 with 16 ms; and 0.26 and 0.97 with 18 ms. From these data, the approximate functional relationship between the jitter (Jitter) and the change in MOS value under different jitter buffer conditions, relative to a 10 ms jitter buffer, is obtained by interpolation or curve fitting, as follows:
[Formula not reproduced in the source text] (jitter buffer is 20 ms; Jitter is in ms)
[Formula not reproduced in the source text] (jitter buffer is 40 ms; Jitter is in ms)
With G.729 coding and relative to a 10 ms jitter buffer, the MOS value increases with a 20 ms and a 40 ms jitter buffer are, respectively: 0.35 and 0.37 with 2 ms of jitter in transmission; 0.66 and 0.75 with 4 ms; 1.15 and 1.33 with 6 ms; 1.41 and 1.87 with 8 ms; 1.36 and 2.0 with 10 ms; 0.92 and 1.87 with 12 ms; 0.61 and 1.73 with 14 ms;
0.35 and 1.38 with 16 ms; and 0.29 and 0.87 with 18 ms. From these data, the functional relationship between the jitter (Jitter) and the change in MOS value under different jitter buffer conditions, relative to a 10 ms jitter buffer, is obtained by interpolation or curve fitting, as follows:
[Formula not reproduced in the source text] (jitter buffer is 20 ms; Jitter is in ms)
[Formula not reproduced in the source text] (jitter buffer is 40 ms; Jitter is in ms)
The correspondence described above between the controllable parameters of the sending, transmission, and receiving units (coding type, coding rate, transmission path performance, receive buffer, and so on) and the change in the quality-of-experience MOS value provides the data on which the audio application optimization strategy of the present invention is designed.
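The receive buffer data can be tabulated to estimate what doubling the buffer is expected to gain at a given jitter level. The sketch below is illustrative only. The function name is hypothetical, only the jitter values listed in the text are accepted, and the gain for a 20 ms to 40 ms step is derived here as the difference of the two listed gains, which are both relative to the same 10 ms baseline.

```python
# jitter (ms) -> (MOS gain of 20 ms buffer, MOS gain of 40 ms buffer),
# both relative to a 10 ms jitter buffer, as listed in the text.
BUFFER_GAIN = {
    "G.711": {2: (0.30, 0.36), 4: (0.49, 0.63), 6: (0.69, 0.94),
              8: (0.97, 1.33), 10: (0.77, 1.53), 12: (0.42, 1.53),
              14: (0.29, 1.41), 16: (0.27, 1.18), 18: (0.26, 0.97)},
    "G.729": {2: (0.35, 0.37), 4: (0.66, 0.75), 6: (1.15, 1.33),
              8: (1.41, 1.87), 10: (1.36, 2.0), 12: (0.92, 1.87),
              14: (0.61, 1.73), 16: (0.35, 1.38), 18: (0.29, 0.87)},
}


def gain_from_doubling(codec: str, buffer_ms: int, jitter_ms: int) -> float:
    """Expected MOS gain when the jitter buffer is doubled (10->20 or 20->40 ms)."""
    gain_20, gain_40 = BUFFER_GAIN[codec][jitter_ms]
    if buffer_ms == 10:
        return gain_20
    if buffer_ms == 20:
        # Both gains share the 10 ms baseline, so their difference is the
        # gain of the 20 ms -> 40 ms step.
        return round(gain_40 - gain_20, 2)
    raise ValueError("only 10 ms and 20 ms starting buffers are tabulated")


if __name__ == "__main__":
    print(gain_from_doubling("G.711", 10, 8))
    print(gain_from_doubling("G.729", 20, 10))
```

For jitter values between the listed ones, the table could be combined with interpolation or with the fitted functions referred to above.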
Further, the step in which the monitoring unit applies the corresponding adjustment and optimization strategy to different situations includes:
Periodically comparing the measured audio quality of experience with the preset quality-of-experience thresholds and, according to the result, tuning one or several of the performance parameters so as to optimize the audio;
The adjustment strategy distinguishes four cases, in which the calculated quality-of-experience value is denoted MOS_C:
If MOS_C is greater than or equal to the first threshold, the settings in the sending, transmission, and receiving units are not adjusted;
If MOS_C is less than the first threshold and greater than or equal to the second threshold, the settings of the sending unit and receiving unit may be adjusted to further optimize the quality of experience; that is, using the quality of experience associated with each coding type and rate, a coding type and coding rate of low quality of experience are changed to a coding type and rate of high quality of experience;
In the actual optimization process, if adjusting the coding type and coding rate raises the quality of experience of the audio application to 0.6 MOS above the second threshold (a value taken from the E-model quality-of-experience classification), no further optimization is performed; otherwise, in the next audio transmission cycle, the receiving unit doubles its current receive buffer to further improve the quality-of-experience value, after which the optimization procedure exits regardless of whether the MOS value improves appreciably;
If MOS_C is less than the second threshold and greater than or equal to the third threshold, then for G.711 coding, when the packet loss rate is no more than 3.3% and the jitter is no more than 12 ms (with a 40 ms buffer), 8 ms (with a 20 ms buffer), or 3 ms (with a 10 ms buffer), only the settings of the sending unit and receiving unit are adjusted to optimize the quality of experience; that is, using the quality of experience associated with each coding type and rate, a coding type and rate of low quality of experience are changed to a coding type and rate of high quality of experience until MOS_C is greater than or equal to the second threshold. If the quality of experience is still below the second threshold in the next evaluation cycle, the receiving unit doubles its current receive buffer to further improve the quality-of-experience value. If the quality of experience still does not reach the second threshold, the transmission unit adjusts the audio transmission path to optimize the audio quality of experience until MOS_C is greater than or equal to the second threshold;
If MOS_C is less than the third threshold, the transmission path cannot meet the needs of audio transmission; the audio transmission link must be disconnected and a new route selected, so the quality of experience is optimized mainly through adjustment by the transmission unit.
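The four cases above can be sketched as a small decision routine. This is a simplified illustration rather than the patent's implementation. The action names and the `apply_action` callback, which stands for "perform the adjustment and return the MOS measured in the next evaluation cycle", are hypothetical, and the packet loss and jitter preconditions stated for G.711 in the third case are left out for brevity.

```python
from typing import Callable, List, Optional, Tuple

FIRST_THRESHOLD = 4.0   # MOS4
SECOND_THRESHOLD = 3.0  # MOS3
THIRD_THRESHOLD = 2.0   # MOS2
CASE2_TARGET = 3.6      # second threshold + 0.6 MOS, as stated in the text


def plan(mos_c: float) -> Tuple[List[str], Optional[float]]:
    """Return (ordered actions, MOS target that ends the optimization)."""
    if mos_c >= FIRST_THRESHOLD:                 # case 1: no adjustment
        return [], None
    if mos_c >= SECOND_THRESHOLD:                # case 2
        return ["raise_codec_or_rate", "double_receive_buffer"], CASE2_TARGET
    if mos_c >= THIRD_THRESHOLD:                 # case 3
        return (["raise_codec_or_rate", "double_receive_buffer",
                 "change_transmission_path"], SECOND_THRESHOLD)
    return ["disconnect_and_reroute"], None      # case 4


def optimise(mos_c: float,
             apply_action: Callable[[str], float]) -> Tuple[List[str], float]:
    """Apply the planned actions one evaluation cycle at a time."""
    actions, target = plan(mos_c)
    taken: List[str] = []
    mos = mos_c
    for action in actions:
        mos = apply_action(action)               # new measured MOS_C
        taken.append(action)
        if target is not None and mos >= target:
            break                                # target reached, stop early
    return taken, mos


if __name__ == "__main__":
    simulated = {"raise_codec_or_rate": 3.3, "double_receive_buffer": 3.6}
    print(optimise(3.04, lambda action: simulated[action]))
```

In the second case the loop ends after the buffer step whether or not the target is reached, which mirrors the "exit regardless" behaviour described above.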
To address the shortcomings of existing techniques for optimizing and adjusting audio quality, the present invention proposes a method for optimizing and adjusting audio quality that adjusts controllable parameters in several units of the system according to the audio quality of experience. It mainly involves the sending unit, transmission unit, and receiving unit, and several controllable parameters such as audio coding type, audio coding rate, transmission network performance, and receive buffer size. The invention is suitable for many scenarios in wireless and wired networks, such as VoIP and audio on demand. The system and its functional units are shown in Fig. 2.
The sending unit mainly encodes the audio signal and packetizes it for sending. Its controllable parameters are mainly the coding type and coding rate. It receives the monitoring unit's evaluation of the audio quality of experience and adjusts the controllable parameters according to a preset algorithm in order to optimize the quality of experience. The software used in this embodiment is OpenPhone;
The transmission unit mainly transmits the audio packets. Its controllable parameters mainly include bandwidth, packet loss, delay, and jitter. Different transmission paths usually have different parameters and different network performance, and the specific indicators of a network transmission path can be obtained fairly easily with the Real-time Transport Protocol and network measurement algorithms. In this method, when audio quality is poor, the packet transmission path must be adjusted to improve the performance of the signal transmission path and thereby raise audio quality. Compared with the other two units, however, adjusting the transmission unit is the most complex, so when audio quality is fairly good the transmission unit is generally not adjusted. To make transmission path performance easy to control, the software used in this embodiment is NISTnet;
The receiving unit mainly unpacks and decodes the audio signal. Its main controllable parameter is the receive buffer size, which has a very large effect on audio quality. In the specific embodiment, the receive buffer is adjusted mainly according to feedback from the monitoring unit and a preset algorithm in order to optimize the audio quality of experience. The software used at the receiving end in this embodiment is OpenPhone;
The monitoring unit mainly calculates the quality-of-experience value of the audio application in real time. Commonly used machine learning methods such as SVMs, neural networks, and decision trees can implement this function. This embodiment uses an artificial neural network to establish the mapping: the input layer takes six parameters (coding type, coding rate, delay, jitter, packet loss, and receive buffer), the hidden layer has 20 nodes, and the output layer has one output.
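The 6-20-1 network shape described for the monitoring unit can be sketched with the standard library alone. Everything beyond the layer sizes is an assumption made for illustration: the weights are random and untrained, the tanh and sigmoid activations and the scaling of the output to the MOS range 1 to 5 are choices of this sketch, and the numeric encoding of the six inputs is hypothetical.

```python
import math
import random

N_INPUTS = 6    # coding type, coding rate, delay, jitter, packet loss, receive buffer
N_HIDDEN = 20   # hidden nodes, as in the embodiment


def init_weights(seed: int = 0):
    """Create random, untrained weights for a 6-20-1 network."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
          for _ in range(N_HIDDEN)]
    b1 = [0.0] * N_HIDDEN
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b2 = 0.0
    return w1, b1, w2, b2


def predict_mos(features, weights) -> float:
    """Forward pass: six inputs -> 20 tanh units -> one output in (1, 5)."""
    if len(features) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} inputs, got {len(features)}")
    w1, b1, w2, b2 = weights
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w1, b1)]
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 + 4.0 / (1.0 + math.exp(-z))   # sigmoid scaled to the MOS range


if __name__ == "__main__":
    weights = init_weights()
    # Hypothetical normalized inputs; a real model would be trained on the
    # experimentally obtained data set described in Step 2.
    print(predict_mos([0.0, 0.64, 0.1, 0.02, 0.02, 0.2], weights))
```

Until the weights are fitted to measured data, the printed value has no meaning as a quality estimate; the sketch only shows the data flow.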
When the sending end uses G.729 coding, the jitter is about 2 ms, the packet loss rate is about 2%, the delay is 100 ms, and the jitter buffer is 20 ms, the MOS value is about 3.04. The audio application is optimized with the method shown in Fig. 3; after optimization the coding is G.711, the jitter buffer is 40 ms, and the measured audio quality is about 3.60.
When the sending end uses G.729 coding, the jitter is about 8 ms, the packet loss rate is about 6%, the delay is 150 ms, and the jitter buffer is 20 ms, the MOS value is about 2.07. With the method shown in Fig. 4, the network transmission path must first be repaired, and the sending unit and/or the receiving unit is then used to optimize the audio application; after optimization the coding is G.711 and the measured audio quality is about 3.50.
Fig. 5 and Fig. 6 are schematic comparisons of the audio quality of experience after this method is applied. Fig. 5 shows the case in which MOS_C is less than the first threshold and greater than or equal to the second threshold; Fig. 6 shows the case in which MOS_C is less than the second threshold and greater than or equal to the third threshold. The figures show that the adjustments made by this method improve the audio quality of experience.
The main aims of the present invention are: to divide the audio experience into grades, determine the preset quality-of-experience threshold intervals, and determine the types of controllable parameters in the sending, transmission, and receiving units; to determine how strongly changes in the controllable parameters of those units (coding type, coding rate, transmission path, receive buffer, and so on) affect the quality-of-experience value, and to establish the mapping between the system's controllable parameters and adjustments of the quality-of-experience level; and to regulate the quality of experience in a targeted way with strategies such as coding optimization, path optimization, rate optimization, and buffer optimization.
It should be understood that the above embodiments are merely exemplary implementations used to illustrate the principle of the present invention, and the invention is not limited to them. A person skilled in the art can make various modifications and improvements without departing from the spirit and essence of the invention, and these modifications and improvements are also regarded as falling within the scope of protection of the invention.

Claims (5)

1. An audio quality self-adjusting control method, wherein the system that carries out the method comprises at least a sending unit, a transmission unit, a receiving unit, and a monitoring unit, characterized in that the method comprises the following steps:
Step 1: the receiving unit divides audio quality into five grades (excellent, good, fair, poor, bad), represented by the numeric set {MOS5 = 5, MOS4 = 4, MOS3 = 3, MOS2 = 2, MOS1 = 1}, which gives four intervals: [5,4], (4,3], (3,2], (2,1]. MOS5 denotes the grade "excellent", MOS4 the grade "good", MOS3 the grade "fair", MOS2 the grade "poor", and MOS1 the grade "bad". MOS4 is marked as the first threshold, MOS3 as the second threshold, MOS2 as the third threshold, and MOS1 as the fourth threshold;
Step 2: the monitoring unit calculates the quality-of-experience value of the audio application in real time. The correspondence between different parameters in the sending, transmission, and receiving units and the quality-of-experience value is obtained by experiment, giving a training data set, and a quality-of-experience assessment model is built with a machine learning algorithm. The model contains at least two layers, an input layer and an output layer. The input layer has at least six inputs: coding type, coding rate, the delay, jitter, and packet loss rate of the transmission path, and the receive buffer parameter. The output layer outputs the quality of experience. The model realizes the mapping from the controllable multidimensional parameters in the sending, transmission, and receiving units to the quality of experience;
Step 3: determine how strongly changes in the controllable parameters of the sending, transmission, and receiving units (coding type, coding rate, transmission path performance, and receive buffer) affect the quality of experience, and establish the mapping between a change in the value of each unit's controllable parameter and the resulting increase or decrease in the quality-of-experience value;
Step 4: the monitoring unit measures the quality of experience at the system terminal and compares the measured value with the preset quality-of-experience threshold levels. According to the result of the comparison, at least one of the coding type, coding rate, transmission path performance, and receive buffer in the sending, transmission, and receiving units is adjusted. The audio signal is then sent, transmitted, and received using the adjusted coding type and/or coding rate and/or transmission path parameters and/or receive buffer, thereby optimizing the quality of experience of the audio application.
2. The audio quality self-adjusting control method according to claim 1, characterized in that the step of determining how the controllable parameters in the sending unit affect the audio quality of experience includes:
The sending unit controls the coding type and coding rate;
In terms of the voice quality of the different coding types: G.711 > G.726 > G.729 > G.723;
The audio quality of G.711 coding is 0.4 MOS higher than that of G.726 coding, G.726 coding is 0.2 MOS higher than G.729 coding, and G.729 coding is 0.2 MOS higher than G.723 coding;
In terms of the voice quality of different coding rates, within the same coding type the audio quality at a higher coding rate is higher than at a lower coding rate: 11.8 kbit/s (G.729) > 8 kbit/s (G.729) > 6.4 kbit/s (G.729); 6.3 kbit/s (G.723.1) > 5.3 kbit/s (G.723.1);
Among the scalable rates of G.729 coding, the audio quality at 11.8 kbit/s is 0.2 MOS higher than at 8 kbit/s, and the audio quality at 8 kbit/s is 0.1 MOS higher than at 6.4 kbit/s; in G.723.1 coding, the audio quality at 6.3 kbit/s is 0.1 MOS higher than at 5.3 kbit/s.
3. according to the audio quality self-adjusting control method described in claim 1, which is characterized in that determining can in transmission unit Controlling the step of parameter influences audio experience quality includes:
Transmission unit controlling transmission path performance;For transmission path packet loss(PLR)For, to G.711 encoding, with no net Network packet drop is reference, and 0.8% packet loss can make audio quality reduce by 0.1 MOS values, and 3.3% packet loss can be such that audio quality drops Low 0.5MOS values, 7.6% packet loss make audio quality reduce by 1 MOS values, and 11.8% packet loss makes audio quality reduce by 1.5 MOS Value, 19% packet loss make audio quality reduce 2MOS values;For encoding G.729, the case where compared to no Network Packet Loss, 0.5% loses Packet rate makes audio quality reduce 0.1MOS values, and 2.1% packet loss makes audio quality reduce 0.5MOS values, and 5.7% packet loss makes audio Quality reduces 1MOS values, and 9.8% packet loss makes audio quality reduce 1.5MOS values, and 16% packet loss makes audio quality reduce 2MOS Value;According to above-mentioned data packet loss is calculated using interpolation method or curve-fitting method(PLR)With MOS values variation () Between functional relation, it is as follows:
(G.711 it encodes,PLRValue range is 0~1)
(G.729 it encodes,PLRValue range is 0~1)
For transmission path delay (Delay) for, it is to refer to no network delay situation to all kinds of type of codings, 170 milliseconds Propagation delay time makes all kinds of audio coding quality reduce 0.1MOS values, and 265 milliseconds of propagation delay times make all kinds of audio coding quality reduce 0.5MOS values, 360 milliseconds of propagation delay times make all kinds of audio coding quality reduce 1MOS values, and 480 milliseconds of propagation delay times make each assonance Frequency coding quality reduces 1.5MOS values, and 700 milliseconds of propagation delay times make all kinds of audio coding quality reduce by 2 MOS values;According to above-mentioned Data utilize interpolation method or curve-fitting method, calculation delay(Delay)With MOS values variation () between functional relation, It is as follows:
DelayValue unit is ms)
For transmission-path jitter (Jitter), the reduction in audio quality is given relative to the jitter-free case. For coding type G.711 with a 10 ms jitter buffer, 1 ms of jitter reduces audio quality by 0.1 MOS, 3 ms by 0.5 MOS, 6 ms by 1 MOS, 9 ms by 1.5 MOS, and 10 ms by 2 MOS; with a 20 ms jitter buffer, 4 ms of jitter reduces audio quality by 0.1 MOS, 8 ms by 0.5 MOS, 10 ms by 1 MOS, 12 ms by 1.5 MOS, and 18 ms by 2 MOS; with a 40 ms jitter buffer, 9 ms of jitter reduces audio quality by 0.1 MOS, 12 ms by 0.5 MOS, 16 ms by 1 MOS, 19 ms by 1.5 MOS, and 21 ms by 2 MOS;
For coding type G.729 with a 10 ms jitter buffer, 1 ms of jitter reduces audio quality by 0.1 MOS, 3 ms by 0.5 MOS, 5 ms by 1 MOS, 7 ms by 1.5 MOS, and 9 ms by 2 MOS; with a 20 ms jitter buffer, 5 ms of jitter reduces audio quality by 0.1 MOS, 8 ms by 0.5 MOS, 11 ms by 1 MOS, 13 ms by 1.5 MOS, and 18 ms by 2 MOS; with a 40 ms jitter buffer, 10 ms of jitter reduces audio quality by 0.1 MOS, 11 ms by 0.5 MOS, 16 ms by 1 MOS, 18 ms by 1.5 MOS, and 20 ms by 2 MOS;
From the above data, the functional relationship between jitter (Jitter) and the change in MOS value in each of these cases is calculated by interpolation or curve fitting. Taking a 20 ms jitter buffer as an example, the relationship is as follows:
[formula missing in source] (G.711 coding; Jitter in ms; jitter buffer 20 ms)
[formula missing in source] (G.729 coding; Jitter in ms; jitter buffer 20 ms).
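The fitted functions themselves are not reproduced in this text, so the following is only a minimal sketch of the interpolation option, using the 20 ms jitter-buffer data points listed above. Piecewise-linear interpolation is one possible choice rather than the method the claim necessarily uses. The names `MOS_LOSS_20MS` and `mos_loss` are hypothetical, and the `(0, 0.0)` anchor is an assumption based on the jitter-free case being the reference.

```python
from bisect import bisect_right

# (jitter in ms, MOS reduction) for a 20 ms jitter buffer, from the claim text.
# The (0, 0.0) anchor is an assumption: the jitter-free case is the reference.
MOS_LOSS_20MS = {
    "G.711": [(0, 0.0), (4, 0.1), (8, 0.5), (10, 1.0), (12, 1.5), (18, 2.0)],
    "G.729": [(0, 0.0), (5, 0.1), (8, 0.5), (11, 1.0), (13, 1.5), (18, 2.0)],
}


def mos_loss(codec, jitter_ms):
    """Piecewise-linear estimate of the MOS reduction caused by jitter."""
    points = MOS_LOSS_20MS[codec]
    xs = [x for x, _ in points]
    # Clamp outside the measured range instead of extrapolating.
    if jitter_ms <= xs[0]:
        return points[0][1]
    if jitter_ms >= xs[-1]:
        return points[-1][1]
    # Find the segment [x0, x1) that contains jitter_ms.
    i = bisect_right(xs, jitter_ms)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (jitter_ms - x0) / (x1 - x0)
```

Under these assumptions, `mos_loss("G.711", 9)` falls halfway between the 8 ms and 10 ms data points. Values beyond 18 ms are clamped to the last measured reduction because the claim gives no data there.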
4. The audio quality self-adjusting control method according to claim 1, characterized in that the step of determining the influence of the controllable parameters in the receiving unit on quality of experience comprises:
The receiving unit controls the receive buffer size. For the receive buffer size (Jitter buffer), with audio signals of coding type G.711, the MOS gain of a 20 ms or 40 ms jitter buffer relative to a 10 ms jitter buffer is as follows: with 2 ms of jitter in transmission, the 20 ms jitter buffer raises the MOS value by 0.30 and the 40 ms jitter buffer raises it by 0.36; with 4 ms of jitter, by 0.49 and 0.63 respectively; with 6 ms of jitter, by 0.69 and 0.94; with 8 ms of jitter, by 0.97 and 1.33; with 10 ms of jitter, by 0.77 and 1.53; with 12 ms of jitter, by 0.42 and 1.53; with 14 ms of jitter, by 0.29 and 1.41; with 16 ms of jitter, by 0.27 and 1.18; with 18 ms of jitter, by 0.26 and 0.97. From the above data, interpolation or curve fitting gives the functional relationship between jitter (Jitter) and the change in MOS value, relative to a 10 ms jitter buffer, under the different jitter-buffer conditions:
[formula missing in source] (Jitter buffer 20 ms; Jitter in ms)
[formula missing in source] (Jitter buffer 40 ms; Jitter in ms)
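The claim names curve fitting as an alternative to interpolation but the fitted functions are not reproduced here. As an illustration only, the sketch below fits a least-squares polynomial to the G.711 gains of the 20 ms jitter buffer over the 10 ms jitter buffer. The quadratic degree is an assumption made for brevity, not the form used in the patent, and `polyfit` and `evaluate` are hypothetical helper names.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ..."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y*x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# G.711: jitter (ms) and MOS gain of a 20 ms buffer over a 10 ms buffer.
JITTER_MS = [2, 4, 6, 8, 10, 12, 14, 16, 18]
GAIN_20MS = [0.30, 0.49, 0.69, 0.97, 0.77, 0.42, 0.29, 0.27, 0.26]

coeffs_20ms = polyfit(JITTER_MS, GAIN_20MS, 2)
```

A quadratic captures the rise-then-fall shape of the data only roughly. A higher degree or a piecewise model would follow the peak near 8 ms more closely, at the cost of more parameters.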
For coding type G.729, the MOS gain of a 20 ms or 40 ms jitter buffer relative to a 10 ms jitter buffer is as follows: with 2 ms of jitter in transmission, the 20 ms jitter buffer raises the MOS value by 0.35 and the 40 ms jitter buffer raises it by 0.37; with 4 ms of jitter, by 0.66 and 0.75 respectively; with 6 ms of jitter, by 1.15 and 1.33; with 8 ms of jitter, by 1.41 and 1.87; with 10 ms of jitter, by 1.36 and 2.0; with 12 ms of jitter, by 0.92 and 1.87; with 14 ms of jitter, by 0.61 and 1.73; with 16 ms of jitter, by 0.35 and 1.38; with 18 ms of jitter, by 0.29 and 0.87. From the above data, interpolation or curve fitting gives the functional relationship between jitter (Jitter) and the change in MOS value, relative to a 10 ms jitter buffer, under the different jitter-buffer conditions:
[formula missing in source] (Jitter buffer 20 ms; Jitter in ms)
[formula missing in source] (Jitter buffer 40 ms; Jitter in ms)
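To make the G.729 figures easier to read, the sketch below stores them as a lookup table and extracts two simple summaries: the sampled jitter at which each buffer size yields its largest gain, and the additional gain of the 40 ms buffer over the 20 ms buffer. The names `G729_GAIN`, `peak_gain`, and `extra_gain_of_40ms` are hypothetical; only the numbers come from the claim text.

```python
# MOS gain over a 10 ms jitter buffer for G.729, keyed by jitter in ms.
# Each value is (gain with a 20 ms buffer, gain with a 40 ms buffer).
G729_GAIN = {
    2: (0.35, 0.37), 4: (0.66, 0.75), 6: (1.15, 1.33),
    8: (1.41, 1.87), 10: (1.36, 2.0), 12: (0.92, 1.87),
    14: (0.61, 1.73), 16: (0.35, 1.38), 18: (0.29, 0.87),
}


def peak_gain(table, column):
    """Return (jitter_ms, gain) for the sampled jitter with the largest gain."""
    jitter_ms = max(table, key=lambda j: table[j][column])
    return jitter_ms, table[jitter_ms][column]


def extra_gain_of_40ms(table):
    """Additional MOS gain of the 40 ms buffer over the 20 ms buffer."""
    return {j: round(g40 - g20, 2) for j, (g20, g40) in table.items()}
```

In this table the gain of the 20 ms buffer peaks at 8 ms of jitter and that of the 40 ms buffer at 10 ms, while the difference between the two buffer sizes is largest at 14 ms of jitter.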
The correspondences described above between the controllable parameters of the sending unit, transmission unit, and receiving unit (coding type, coding rate, transmission-path performance, receive buffer, and so on) and the change in the quality-of-experience MOS value provide the data that support the design of the audio application optimization strategy of the present invention.
5. The audio quality self-adjusting control method according to claim 1, characterized in that the step in which the monitoring unit formulates a corresponding adjustment and optimization strategy for different situations comprises:
Periodically comparing the measured audio quality-of-experience value with preset audio quality-of-experience thresholds and, according to the comparison result, tuning one or more of the performance parameters so as to optimize the audio;
The adjustment strategy distinguishes four cases, where the calculated quality-of-experience value is denoted MOS_C:
If MOS_C is greater than or equal to the first threshold, the settings of the sending unit, transmission unit, and receiving unit are not adjusted;
If MOS_C is less than the first threshold and greater than or equal to the second threshold, the settings of the sending unit and receiving unit may be adjusted to further optimize quality of experience; that is, according to the quality of experience associated with each coding type and coding rate, the coding type and coding rate are switched from a setting with lower quality of experience to one with higher quality of experience;
In the specific optimization process, if adjusting the coding type and coding rate can raise the quality of experience of the audio application to 0.6 MOS above the second threshold (a figure taken from the E-model quality-of-experience classification), no further optimization is performed; otherwise, in the next audio transmission cycle, the receiving unit doubles its current receive buffer to further improve quality of experience, and the optimization procedure then exits regardless of whether the MOS value improves substantially;
If MOS_C is less than the second threshold and greater than or equal to the third threshold, then for G.711 coding, where the packet loss rate is no more than 3.3% and the jitter is no more than 12 ms (with a 40 ms buffer), 8 ms (with a 20 ms buffer), or 3 ms (with a 10 ms buffer), only the settings of the sending unit and receiving unit are adjusted to optimize quality of experience; that is, according to the quality of experience associated with each coding type and coding rate, the coding type and coding rate are switched from a setting with lower quality of experience to one with higher quality of experience until MOS_C is greater than or equal to the second threshold; if quality of experience is still below the second threshold in the next evaluation cycle, the receiving unit doubles its current receive buffer to further improve quality of experience; if quality of experience still cannot reach the second threshold, the transmission unit adjusts the audio transmission path to optimize audio quality of experience until MOS_C is greater than or equal to the second threshold;
If MOS_C is less than the third threshold, the transmission path cannot meet the needs of audio transmission; the audio transmission link must be disconnected and a new route selected, so quality of experience is optimized mainly through adjustment by the transmission unit.
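The four cases above can be sketched as a small decision function. This is an illustration under stated assumptions rather than the claimed implementation: the threshold values are not given in this passage and are therefore passed in as parameters, and `Action`, `choose_action`, and `g711_within_limits` are hypothetical names. The G.711 limits are the ones quoted in the third case.

```python
from enum import Enum


class Action(Enum):
    NO_CHANGE = "no adjustment"
    TUNE_ENDPOINTS = "adjust coding type/rate, then double receive buffer"
    TUNE_THEN_PATH = "adjust coding type/rate, receive buffer, then path"
    REROUTE = "disconnect the link and select a new route"


# G.711 limits from the third case: maximum jitter (ms) per buffer size (ms).
G711_JITTER_LIMIT_MS = {40: 12, 20: 8, 10: 3}
G711_MAX_LOSS_PCT = 3.3


def choose_action(mos_c, first, second, third):
    """Map the calculated quality of experience MOS_C to one of four cases."""
    if not first > second > third:
        raise ValueError("thresholds must satisfy first > second > third")
    if mos_c >= first:
        return Action.NO_CHANGE
    if mos_c >= second:
        return Action.TUNE_ENDPOINTS
    if mos_c >= third:
        return Action.TUNE_THEN_PATH
    return Action.REROUTE


def g711_within_limits(loss_pct, jitter_ms, buffer_ms):
    """Check the packet-loss and jitter conditions stated for G.711."""
    limit = G711_JITTER_LIMIT_MS[buffer_ms]
    return loss_pct <= G711_MAX_LOSS_PCT and jitter_ms <= limit
```

For example, with purely hypothetical thresholds of 4.0, 3.6, and 3.1, `choose_action(3.8, 4.0, 3.6, 3.1)` selects endpoint tuning. The claim does not say what happens in the third case when the G.711 conditions are not met, so the sketch leaves that decision to the caller.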
CN201810243626.3A 2018-03-23 2018-03-23 A kind of audio quality self-adjusting control method Pending CN108495182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810243626.3A CN108495182A (en) 2018-03-23 2018-03-23 A kind of audio quality self-adjusting control method

Publications (1)

Publication Number Publication Date
CN108495182A true CN108495182A (en) 2018-09-04

Family

ID=63319570

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810243626.3A Pending CN108495182A (en) 2018-03-23 2018-03-23 A kind of audio quality self-adjusting control method

Country Status (1)

Country Link
CN (1) CN108495182A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110035299A (en) * 2019-04-18 2019-07-19 雷欧尼斯(北京)信息技术有限公司 The compression transmitting method and framework of immersion multi-object audio
CN110706679A (en) * 2019-09-30 2020-01-17 维沃移动通信有限公司 Audio processing method and electronic equipment
CN117409794A (en) * 2023-12-13 2024-01-16 深圳市声菲特科技技术有限公司 Audio signal processing method, system, computer device and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101489304A (en) * 2009-02-27 2009-07-22 南京邮电大学 Media access control method based on differentiate service of wireless multimedia sensor network
CN102044248A (en) * 2009-10-10 2011-05-04 北京理工大学 Objective evaluating method for audio quality of streaming media
CN102324229A (en) * 2011-09-08 2012-01-18 中国科学院自动化研究所 Method and system for detecting abnormal use of voice input equipment
CN103050128A (en) * 2013-01-29 2013-04-17 武汉大学 Vibration distortion-based voice frequency objective quality evaluating method and system
CN103888846A (en) * 2014-03-04 2014-06-25 浙江大学 Wireless video streaming service self-adaption rate control method based on QoE
CN103957216A (en) * 2014-05-09 2014-07-30 武汉大学 Non-reference audio quality evaluation method and system based on audio signal property classification
CN104917671A (en) * 2015-06-10 2015-09-16 腾讯科技(深圳)有限公司 Mobile terminal based audio processing method and device
CN105141404A (en) * 2015-07-28 2015-12-09 西安交通大学 Method for wireless resource allocation based on QoE (Quality of Experience) in LTE-A (Long Term Evolution-Advanced) system
US20170104552A1 (en) * 2015-10-10 2017-04-13 Dolby Laboratories Licensing Corporation Near Optimal Forward Error Correction System and Method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈志伟,胡治国: "IP网络语言质量评价方法研究", 《计算机与现代化》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110035299A (en) * 2019-04-18 2019-07-19 雷欧尼斯(北京)信息技术有限公司 The compression transmitting method and framework of immersion multi-object audio
CN110035299B (en) * 2019-04-18 2021-02-05 雷欧尼斯(北京)信息技术有限公司 Compression transmission method and system for immersive object audio
CN110706679A (en) * 2019-09-30 2020-01-17 维沃移动通信有限公司 Audio processing method and electronic equipment
CN110706679B (en) * 2019-09-30 2022-03-29 维沃移动通信有限公司 Audio processing method and electronic equipment
CN117409794A (en) * 2023-12-13 2024-01-16 深圳市声菲特科技技术有限公司 Audio signal processing method, system, computer device and storage medium
CN117409794B (en) * 2023-12-13 2024-03-15 深圳市声菲特科技技术有限公司 Audio signal processing method, system, computer device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180904
