CN109302603A - Video speech quality assessment method and device - Google Patents
Video speech quality assessment method and device
- Publication number
- CN109302603A (application CN201710614327.1A)
- Authority
- CN
- China
- Prior art keywords
- video
- quality
- audio
- calling
- video calling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
- H04N21/643—Communication protocols
- H04N21/6437—Real-time Transport Protocol [RTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/141—Systems for two-way working between two video terminals, e.g. videophone
Abstract
The invention discloses a video speech quality assessment method and device. The method comprises: when a video call is detected, separately obtaining the video parameters and the audio parameters of the call; determining the video quality of the call according to the video parameters and the audio quality of the call according to the audio parameters; and determining the video speech quality of the call according to the video quality and the audio quality. With the method provided by embodiments of the present invention, video speech quality can be determined from the video call itself, without the involvement of a reference source, and the accuracy of the assessment result is also improved.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a video speech quality assessment method and device.
Background art
At present, methods for assessing the quality of VoLTE (Voice over Long-Term Evolution) video calls mainly fall into the following three categories. (1) Subjective evaluation methods, such as ITU-T P.910 and P.911, which respectively describe the assessment of the video and audiovisual quality of multimedia applications. P.910 specifies non-interactive subjective evaluation of one-way overall video quality for multimedia applications (such as video conferencing, storage and retrieval applications, video-medical applications, etc.) and describes four test methods: ACR (Absolute Category Rating), ACR-HR (Absolute Category Rating with Hidden Reference), DCR (Degradation Category Rating) and PC (Pair Comparison). (2) Full-reference methods, such as the recommended PEVQ (Perceptual Evaluation of Video Quality). (3) Assessment methods based on a packet-layer model, which assess quality using transport-layer packet information; in VoLTE this mainly means analyzing RTP (Real-time Transport Protocol) packets.
Each of the above three approaches has limitations in network-wide VoLTE testing, mainly as follows: (1) subjective evaluation requires human participation, so it can only be used for sampled inspections, cannot cover a large number of users, and cannot provide real-time assessment; (2) the full-reference PEVQ method requires a reference source to be set up, so it is neither suitable for the conversational model nor able to assess all users in the network; (3) methods based on a packet-layer model cannot automatically adjust their parameters for different services, have high computational complexity and a heavy computational load, and cannot perform full-volume calculation over massive numbers of video calls.
In summary, finding an efficient quality assessment method based on the video call itself, which can assess video speech quality without the involvement of a reference source while also improving the accuracy of the assessment result, is one of the technical problems that urgently need to be solved.
Summary of the invention
Embodiments of the present invention provide a video speech quality assessment method and device, so as to assess video speech quality without the involvement of a reference source and to improve the accuracy of the assessment result.
In a first aspect, an embodiment of the present invention provides a video speech quality assessment method, comprising:
when a video call is detected, separately obtaining the video parameters and the audio parameters of the call;
determining the video quality of the call according to the video parameters, and determining the audio quality of the call according to the audio parameters; and
determining the video speech quality of the call according to the video quality and the audio quality.
In a second aspect, an embodiment of the present invention provides a video speech quality assessment device, comprising:
an acquiring unit, configured to separately obtain the video parameters and the audio parameters of a video call when the call is detected;
a first determination unit, configured to determine the video quality of the call according to the video parameters and the audio quality of the call according to the audio parameters; and
a second determination unit, configured to determine the video speech quality of the call according to the video quality and the audio quality.
In a third aspect, an embodiment of the present invention provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the program, the processor implements the video speech quality assessment method provided by the present application.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements any step of the video speech quality assessment method provided by the present application.
Beneficial effects of the present invention:
With the video speech quality assessment method and device provided by embodiments of the present invention, when a video call is detected, the video parameters and the audio parameters of the call are obtained separately; the video quality of the call is determined according to the video parameters and the audio quality according to the audio parameters; and the video speech quality of the call is determined according to the video quality and the audio quality. The method can thus determine video speech quality from the video call itself, without the involvement of a reference source, and also improves the accuracy of the assessment result.
Other features and advantages of the present invention will be set forth in the following description, will in part become apparent from the description, or will be understood through practice of the invention. The objectives and other advantages of the invention can be realized and obtained by the structures particularly pointed out in the written description, the claims, and the accompanying drawings.
Brief description of the drawings
The drawings described herein are provided for a further understanding of the present invention and constitute a part of it; the illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an improper limitation of it. In the drawings:
Fig. 1a is a flow diagram of the video speech quality assessment method provided by Embodiment 1 of the present invention;
Fig. 1b is a flow diagram of separately obtaining the video parameters and the audio parameters of the call, provided by Embodiment 1 of the present invention;
Fig. 1c is a schematic diagram of the configuration process for establishing the parameter configuration of a VoLTE video call using the H.264 protocol, provided by Embodiment 1 of the present invention;
Fig. 2 is a flow diagram of obtaining the video end-to-end delay, provided by Embodiment 1 of the present invention;
Fig. 3 is a flow diagram of obtaining the audio end-to-end delay, provided by Embodiment 1 of the present invention;
Fig. 4 is a flow diagram of determining the one-way delay, provided by Embodiment 1 of the present invention;
Fig. 5 is a flow diagram of determining the audio quality of the call, provided by Embodiment 1 of the present invention;
Fig. 6 is a flow diagram of determining the video speech quality of the call, provided by Embodiment 1 of the present invention;
Fig. 7 is a flow diagram of determining the correlation coefficients of the video speech quality, provided by Embodiment 1 of the present invention;
Fig. 8 is a schematic structural diagram of the video speech quality assessment device provided by Embodiment 2 of the present invention;
Fig. 9 is a schematic diagram of the hardware structure of the electronic device implementing the video speech quality assessment method, provided by Embodiment 4 of the present invention.
Detailed description of embodiments
An embodiment of the present invention provides a video speech quality assessment method and device: when a video call is detected, the video parameters and the audio parameters of the call are obtained separately; the video quality of the call is determined according to the video parameters and the audio quality according to the audio parameters; and the video speech quality of the call is determined according to the video quality and the audio quality. This not only assesses video speech quality without the involvement of a reference source, but also improves the accuracy of the assessment result.
The video speech quality assessment method provided by embodiments of the present invention can be applied to a VoLTE system to assess its video speech quality; for convenience, the following description takes applying the method to a VoLTE system as an example.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described herein are only used to illustrate and explain the present invention and are not intended to limit it; in the absence of conflict, the embodiments of the invention and the features in the embodiments may be combined with each other.
Embodiment 1
As shown in Fig. 1a, the video speech quality assessment method provided by Embodiment 1 of the present invention may include the following steps:
S11: when a video call is detected, separately obtain the video parameters and the audio parameters of the call.
In a specific implementation, the video parameters and the audio parameters of the call can be obtained separately according to the method shown in Fig. 1b:
S111: perform data collection on the video call using at least one interface, so as to collect the call's Real-time Transport Protocol (RTP) data packets.
In a specific implementation, the media plane and the control plane can jointly collect the data of the video call and thereby obtain the RTP data packets.
The interface of the media plane may include, but is not limited to, Mb; the interfaces of the control plane may include, but are not limited to, S1-MME, S6a, S11, Mw, Gx and Rx.
Each interface is arranged between two adjacent transmission nodes on the packet transmission link.
In a specific implementation, the above interfaces are set in advance on the transmission links traversed by the packets generated by the video call: for example, the S1-MME interface is set between the eNodeB base station and the MME (Mobility Management Entity), the S6a interface between the MME and the HSS, and the S11 interface between the MME and the S&P-GW (Serving & Packet Data Network Gateway).
By extracting the RTP data packets on these interfaces, the video and audio parameters of the call can be obtained based on the H.264 video coding standard.
S112: decode the collected video call RTP data packets according to the communication protocol of each interface to obtain decoded video call RTP data packets.
S113: from the decoded video call RTP data packets, separately obtain the video parameters and the audio parameters of the call using a preset video codec standard.
In a specific implementation, the video parameters and the audio parameters can be obtained from control-plane messages, for example from SIP/SDP messages, using the H.264 video codec standard.
Preferably, the video parameters and the audio parameters include coding parameters and transmission performance parameters, wherein the coding parameters include at least one of the following: codec type, end-to-end delay, bit rate, frame rate and maximum transmission bit rate, and the transmission performance parameters include at least one of the following: packet loss rate and transmission rate.
In a specific implementation, video and audio are each transmitted over their own bearer; during bearer establishment, the video and audio transmission rates are available through the bearer parameter (QCI). In addition, by correlating the video RTP stream with the audio RTP stream, the transmission rates of video and audio can be calculated in real time.
In a specific implementation, the transmission bit rate can be determined in the following two ways.
Mode one: determine the maximum transmission bit rate from the bearer parameters of the VoLTE session establishment.
Specifically, during VoLTE session establishment a corresponding bearer is created for both audio and video, and the bearer carries the parameter specifying its maximum transmission bit rate. Several signaling messages can carry the bearer parameters, for example the Initial Context Setup message, from which the maximum transmission bit rate of the audio or the video can be determined.
Mode two: determine the actual transmission bit rate from the RTP data packets.
Specifically, each RTP packet carries count information indicating its data volume; summing the data volume of all RTP packets of the session and dividing by the transmission duration yields the actual transmission bit rate.
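Mode two can be sketched as follows; the `RtpPacket` record and its field names are illustrative, not part of the patent:

```python
# Sketch of "mode two": estimating the actual transmitted bit rate of one
# RTP stream by summing the captured packet sizes and dividing by the
# capture duration.
from dataclasses import dataclass

@dataclass
class RtpPacket:
    arrival_time: float  # capture timestamp, seconds
    size_bytes: int      # packet data volume in bytes

def actual_bit_rate(packets: list[RtpPacket]) -> float:
    """Return the mean transmitted bit rate in bits per second."""
    if len(packets) < 2:
        raise ValueError("need at least two packets to measure a duration")
    duration = packets[-1].arrival_time - packets[0].arrival_time
    total_bits = 8 * sum(p.size_bytes for p in packets)
    return total_bits / duration

# Six 1200-byte packets captured over 0.1 s.
stream = [RtpPacket(0.02 * i, 1200) for i in range(6)]
print(round(actual_bit_rate(stream)))  # 576000
```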
In a specific implementation, regarding the bit rate and frame rate among the audio and video parameters: when a video call is made in a VoLTE system, the session parameters are negotiated during the call's handshake procedure, the main parameters being the bit rate and the frame rate, so the bit rate and frame rate of the call can be obtained using the H.264/AVC protocol.
Specifically, the parameter configuration of the VoLTE video call can be established in advance using the H.264/AVC protocol; the configuration process can refer to Fig. 1c. After the configuration is successfully established, the physical-layer parameters throughout the VoLTE network can be looked up via the parameter profile-level.
In Fig. 1c, m=video indicates the media type; b=AS indicates the required bandwidth (kbps); b=RS and b=RR indicate the bandwidth allocation of the control channel; a=rtpmap indicates the media type and clock rate, which for H.264 is 90000; a=fmtp indicates the additional parameters of the media format, the most important of which is profile-level-id, from which the bit rate, frame rate and maximum video bit rate (Max video bit rate) of the video media can be derived; a=rtcp-fb indicates the feedback parameters of the control channel.
It should be noted that H.264 is a video codec standard; during video session establishment, the H.264 capabilities of the endpoints are negotiated, and the negotiation is identified by capability sets and their numbers. An H.264 capability set is a list containing one or more H.264 capabilities, and each capability includes the two mandatory parameters Profile and Level plus several optional parameters such as CustomMaxMBPS and CustomMaxFS. In H.264, the Profile defines the coding tools and algorithms used to generate the bit stream, while the Level places requirements on certain key parameters. By correlating with the H.264 codec Profile and Level tables, the bit rate and frame rate can be obtained.
Preferably, to determine the packet loss rate, the total number of packets and the total number of discarded packets on the media plane of the call in the VoLTE system are first determined from the RTP data packets, and the packet loss rate of the call is then obtained by dividing the total number of discarded packets by the total number of packets.
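This computation can be sketched as follows; counting discarded packets from gaps in the RTP sequence numbers (the RFC 3550 receiver convention, ignoring wraparound) is an illustrative assumption — the patent itself only specifies dividing discarded packets by total packets:

```python
# Illustrative per-call packet loss rate from RTP sequence numbers:
# expected = highest seq - first seq + 1, lost = expected - received.
def packet_loss_rate(seq_numbers: list[int]) -> float:
    expected = max(seq_numbers) - min(seq_numbers) + 1
    lost = expected - len(set(seq_numbers))
    return lost / expected

# Sequence 10..19 with 13 and 17 missing: 2 lost out of 10 expected.
print(packet_loss_rate([10, 11, 12, 14, 15, 16, 18, 19]))  # 0.2
```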
Preferably, to determine the audio codec type, the audio codec scheme can be extracted from the audio codec negotiation of the SIP/SDP protocol; likewise, to determine the video codec type, the video codec scheme can be extracted from the video codec negotiation of the SIP/SDP protocol.
Preferably, to determine the maximum transmission bit rate included in the audio parameters, the theoretical rate is obtained from the codec format, and the actual maximum transmission bit rate of the VoLTE audio bearer is then obtained by correlating the S1-MME and S11 interfaces.
Preferably, to determine the maximum transmission bit rate included in the video parameters, the theoretical rate is obtained from the codec format, and the actual maximum transmission bit rate of the VoLTE video bearer is then obtained by correlating the S1-MME and S11 interfaces.
Further, the end-to-end delay of the video call includes the audio end-to-end delay and the video end-to-end delay. Determining the audio end-to-end delay included in the audio parameters and the video end-to-end delay included in the video parameters is introduced separately below.
In a specific implementation, to determine the audio end-to-end delay included in the audio parameters, the RTP protocol is parsed on the media-plane Mb interface to obtain the inter-packet gaps and timestamps of the audio RTP packets; the SIP messages of the Mw interface and the channel information obtained from the S1-MME and S11 interfaces are then used to correlate the calling and called parties, from which the audio end-to-end delay is obtained.
In a specific implementation, to determine the video end-to-end delay included in the video parameters, the RTP protocol is parsed on the media-plane Mb interface to obtain the inter-packet gaps and timestamps of the video RTP packets; the SIP messages of the Mw interface and the channel information obtained from the S1-MME and S11 interfaces are then used to correlate the calling and called parties, from which the video end-to-end delay is obtained.
Specifically, a voice or video packet passes through a sequence of stages from sending to reception: from the handset to the eNodeB base station, to the evolved packet core (EPC), to the peer EPC, then to the peer eNodeB base station, and finally to the peer handset. At each stage of this sequence there are link and packet identifiers, such as the GTP tunnel number and the RTP sequence number. By matching each link segment, every packet can be tracked end to end; then, from the timestamps at sending and at reception, the end-to-end delay of the audio or video is obtained.
Preferably, the video end-to-end delay can be determined according to the method shown in Fig. 2, comprising the following steps:
S21: obtain the one-way delay of the video parameters.
In a specific implementation, the one-way delay of the video parameters can be determined according to steps S41 to S43: specifically, the timestamps of the RTP data packets are determined, and the one-way delay is then determined from the time difference between two consecutive RTP packets and the difference between their timestamps.
It should be noted that the timestamp field in the RTP header carries the synchronization information of the packet's media time and is the key to restoring the data in the correct temporal order. The value of the timestamp gives the sampling instant of the first byte of data in the packet; the sender's timestamp clock is required to be continuous and monotonically increasing, even when no data is being input or sent. During silence the sender need not send data, but the timestamp keeps growing; at the receiving end, since the sequence numbers of the received packets show no gaps, it is known that no data has been lost, and the output time interval can be determined simply by comparing the timestamp difference between earlier and later packets.
In addition, RTP stipulates that the initial timestamp of a session must be chosen randomly, but the protocol specifies neither the unit of the timestamp nor the exact interpretation of its value; instead, the granularity of the clock is determined by the payload type, so different kinds of applications can choose an output timing accuracy suited to their needs.
When RTP carries audio data, the timestamp rate is generally chosen equal to the sampling rate; when carrying video data, however, the timestamp clock must tick more than once per frame. If data are sampled at the same instant, the protocol standard also allows multiple packets to have the same timestamp value.
S22: determine the video end-to-end delay according to the one-way delay of the video parameters.
Since a VoLTE call is a bidirectional flow, when determining the one-way delay the media planes of the calling and called parties can be correlated, and the video end-to-end delay can then be determined from the one-way delay.
As shown in Fig. 3, the flow of obtaining the audio end-to-end delay provided by Embodiment 1 of the present invention comprises the following steps:
S31: obtain the one-way delay of the audio parameters.
S32: determine the audio end-to-end delay according to the one-way delay of the audio parameters.
In a specific implementation, reference can be made to the explanation of steps S21 to S22; overlapping content will not be repeated.
In a specific implementation, determining the video end-to-end delay or the audio end-to-end delay requires determining the one-way delay, which can be determined according to the method shown in Fig. 4, comprising the following steps:
S41: determine the Network Time Protocol (NTP) time difference between two adjacent video call RTP data packets.
In a specific implementation, taking adjacent packets RTP1 and RTP2 as an example: the NTP time corresponding to packet RTP1 is denoted NTP1 and the NTP time corresponding to packet RTP2 is denoted NTP2, so the NTP time difference between the two packets is NTP2 − NTP1.
S42: determine the timestamp difference between the two adjacent video call RTP data packets.
In a specific implementation, the timestamps of packets RTP1 and RTP2 are denoted RTP1 and RTP2 respectively, so their timestamp difference is RTP2 − RTP1.
S43: determine the one-way delay from the determined NTP difference and timestamp difference.
In a specific implementation, the one-way delay can be expressed as:
one-way delay = (NTP2 − NTP1) − (RTP2 − RTP1) / clock frequency
where dividing the timestamp difference, which is counted in clock ticks, by the clock frequency converts it into a time difference.
Specifically, the media sampling frequency, and hence the clock frequency, can be looked up from the PT (payload type) field in the RTP packet header.
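Steps S41 to S43 can be sketched as follows; the function and parameter names are illustrative:

```python
# Sketch of steps S41-S43: the one-way delay contribution between two
# adjacent RTP packets, computed as the difference between their NTP
# arrival-time gap and their RTP timestamp gap. RTP timestamps are in
# clock ticks, so the timestamp difference is divided by the clock rate
# (e.g. 90000 Hz for H.264 video, per the payload-type mapping).
def one_way_delay(ntp1: float, ntp2: float,
                  rtp_ts1: int, rtp_ts2: int,
                  clock_rate: int) -> float:
    """All NTP times in seconds; returns delay in seconds."""
    ntp_diff = ntp2 - ntp1                      # S41
    ts_diff = (rtp_ts2 - rtp_ts1) / clock_rate  # S42 (ticks -> seconds)
    return ntp_diff - ts_diff                   # S43

# Two video packets 40 ms apart in media time (3600 ticks at 90 kHz)
# observed 52 ms apart on the wire: 12 ms of added delay.
d = one_way_delay(ntp1=0.000, ntp2=0.052, rtp_ts1=0, rtp_ts2=3600,
                  clock_rate=90000)
print(round(d * 1000))  # 12
```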
It should be noted that the one-way delay referred to in Embodiment 1 of the present invention is the delay incurred by each call transmission from the originating end of the video call to the receiving end, or by each call transmission from the receiving end back to the originating end.
Preferably, the video call RTCP (Real-time Transport Control Protocol) data packets can also be obtained and used to determine the video parameters and the audio parameters, after which the methods of steps S21 to S22, S31 to S32 and S41 to S43 are used to determine the video end-to-end delay and the audio end-to-end delay.
Specifically, the video quality and the audio quality of the call can be determined using ITU-T G.1070, which is described in detail below.
S12: determine the video quality of the call according to the video parameters.
In a specific implementation, the video quality of the call can be determined from the video parameters according to formula (1):
Vq = 1 + Icoding · exp(−PplV / DPplV)    (1)
wherein Vq is the video quality of the call;
Icoding is the video quality under coding distortion;
PplV is the packet loss rate;
DPplV is the degree of robustness of the video quality against packet loss.
In a specific implementation, the packet loss rate PplV is determined by step S113 and can be substituted directly into formula (1). DPplV can be expressed as DPplV = v10 + v11 · exp(−FrV / v8) + v12 · exp(−BrV / v9), where FrV is the frame rate, which can be determined via H.264, BrV is the video bit rate at the encoder, and the coefficients v8, v9, v10, v11, v12 are constants determined by the codec type, the video format, the key frame interval and the video display size.
In a specific implementation, the video quality under coding distortion, Icoding, can be determined according to formula (2):
Icoding = IOfr · exp( −(ln(FrV) − ln(Ofr))² / (2 · DFrV²) )    (2)
wherein Ofr is the optimal frame rate at which the video quality is maximized under the current video bit rate;
IOfr is the maximum value of the video quality under the current video bit rate;
FrV is the current frame rate;
DFrV is the degree of robustness of the video quality against the frame rate.
In a specific implementation, the optimal frame rate in formula (2) can be expressed as Ofr = v1 + v2 · BrV, with 1 ≤ Ofr ≤ 30, where BrV is the video bit rate at the encoder; the maximum video quality under the current bit rate is IOfr = v3 − v3 / (1 + (BrV / v4)^v5); and DFrV in formula (2) is DFrV = v6 + v7 · BrV, with DFrV > 0. The coefficients v1, v2, v3, ..., v7 are constants determined by the codec type, the video format, the key frame interval and the video display size.
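Formulas (1) and (2) can be sketched as follows; the values chosen for the constants v1 to v12 are placeholders for illustration only — in the method they depend on the codec type, video format, key frame interval and display size:

```python
import math

# Minimal sketch of the G.1070-style video quality of formulas (1)-(2).
# The constants v1..v12 below are illustrative placeholders.
v = {1: 1.431, 2: 2.228e-2, 3: 3.759, 4: 184.1, 5: 1.161,
     6: 1.446, 7: 3.881e-4, 8: 2.116, 9: 467.4, 10: 2.736,
     11: 15.28, 12: 4.170}

def video_quality(br_v: float, fr_v: float, ppl_v: float) -> float:
    """br_v: video bit rate (kbit/s); fr_v: frame rate (fps);
    ppl_v: packet loss rate (%). Returns Vq on a 1..5 scale."""
    o_fr = min(max(v[1] + v[2] * br_v, 1.0), 30.0)       # optimal frame rate
    i_ofr = v[3] - v[3] / (1.0 + (br_v / v[4]) ** v[5])  # max quality at br_v
    d_frv = v[6] + v[7] * br_v                           # frame-rate robustness
    i_coding = i_ofr * math.exp(
        -((math.log(fr_v) - math.log(o_fr)) ** 2) / (2.0 * d_frv ** 2))  # (2)
    d_pplv = (v[10] + v[11] * math.exp(-fr_v / v[8])
              + v[12] * math.exp(-br_v / v[9]))          # loss robustness
    return 1.0 + i_coding * math.exp(-ppl_v / d_pplv)    # (1)

print(round(video_quality(br_v=512, fr_v=25, ppl_v=0.0), 2))
```

As expected, the score falls monotonically as the packet loss rate rises and as the frame rate moves away from the optimal frame rate.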
S13: determine the audio quality of the call according to the audio parameters.
In a specific implementation, when executing step S13, the audio quality of the call can be determined according to the method shown in Fig. 5:
S131: determine the quality index of the audio quality according to the audio parameters.
In a specific implementation, the quality index of the audio quality can be determined according to formula (3):
wherein Idte,WB is the degradation caused by talker echo in the call;
Ie-eff,WB is the degradation caused by speech coding and packet loss in the call;
Qx is the quality index of the audio quality.
Further, the degradation Idte,WB caused by talker echo in the video call can be determined according to formula (4):
wherein the expression for Re,WB is Re,WB = 80 + 2.5 · (TERV,WB − 14), and TERV,WB is derived from the talker echo loudness rating and the delay;
TELR is the talker echo loudness rating;
Ts is the audio end-to-end delay of the call.
Further, the degradation Ie-eff,WB caused by speech coding and packet loss in the video call can be determined according to formula (5):
Ie-eff,WB = IeS,WB + (95 − IeS,WB) · PplS / (PplS + BplS)    (5)
wherein IeS,WB is the speech coding distortion factor;
PplS is the speech packet loss rate of the call;
BplS is the robustness of the speech packets against packet loss in the call.
S132: determine the audio quality of the call from the quality index according to formula (6).
In a specific implementation, in formula (6):
Qx is the quality index, determined by formula (3);
Sq is the audio quality of the video call.
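The audio path can be sketched as follows under stated assumptions: the base rating of 93.2, supplying the echo term Idte,WB directly as an input, and the R-to-MOS mapping used for formula (6) are all borrowed from the narrowband E-model (ITU-T G.107), since this excerpt does not reproduce the wideband constants or the body of formulas (3), (4) and (6):

```python
# Minimal sketch of the audio-quality path (formulas (3), (5) and (6)).
# ASSUMPTIONS: r0 = 93.2 and the echo degradation idte_wb passed in as a
# ready-made value are E-model conventions, not values from the patent.
def ie_eff(ie_s: float, ppl_s: float, bpl_s: float) -> float:
    """Formula (5): degradation from speech coding plus packet loss (%)."""
    return ie_s + (95.0 - ie_s) * ppl_s / (ppl_s + bpl_s)

def audio_quality(ie_s: float, ppl_s: float, bpl_s: float,
                  idte_wb: float, r0: float = 93.2) -> float:
    """Return Sq, the MOS-like audio quality of formula (6)."""
    qx = r0 - idte_wb - ie_eff(ie_s, ppl_s, bpl_s)  # formula (3), assumed form
    qx = min(max(qx, 0.0), 100.0)
    # Assumed formula (6): the standard E-model rating-to-MOS mapping.
    return 1.0 + 0.035 * qx + qx * (qx - 60.0) * (100.0 - qx) * 7e-6

# AMR-WB-like example: Ie = 11, Bpl = 10, 1% loss, small echo impairment.
print(round(audio_quality(ie_s=11.0, ppl_s=1.0, bpl_s=10.0, idte_wb=2.0), 2))
```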
Specifically, the embodiment of the present invention does not limit the execution order of steps S12 and S13.
S14: determine the video speech quality of the call according to the video quality and the audio quality.
In a specific implementation, the video speech quality of the call can be determined according to the flow shown in Fig. 6, comprising the following steps:
S141: according to the video quality and the audio quality, respectively determine the audiovisual quality of the call and the degree of audiovisual quality degradation caused by audio-video delay and synchronization.
In a specific implementation, the audiovisual quality of the call can be determined from the video quality and the audio quality according to formula (7):
MMSV = m5 · Sq + m6 · Vq + m7 · Sq · Vq + m8    (7)
wherein MMSV is the audiovisual quality of the call;
Sq is the audio quality;
Vq is the video quality;
m5, m6, m7, m8 are correlation coefficients that depend on the video display size and the conversational task of the call.
Further, the degree of audiovisual quality decline caused by audio-video delay and synchronization in this video call can be determined from the video quality and the audio quality according to formula (8):
MMT = max{AD + MS, 1}    (8)
Wherein, AD = m9*(TS + TV) + m10;
TS is the audio end-to-end delay in this video call;
TV is the video end-to-end delay in this video call;
AD is the absolute audiovisual delay in this video call;
MS is the audio-visual media synchronization value in this video call;
MMT is the degree of audiovisual quality decline caused by audio-video delay and synchronization in this video call;
m9, m10, m11, m12, m13, m14 are correlation coefficients that depend on the video display size and the call task of this video call.
S142: determine the video speech quality of this video call from the audiovisual quality and the degree of audiovisual quality decline caused by audio-video delay and synchronization, according to formula (9):
MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4    (9)
Wherein, MMSV denotes the audiovisual quality;
MMT denotes the degree of audiovisual quality decline caused by audio-video delay and synchronization;
m1, m2, m3, m4 are correlation coefficients that depend on the video display size and the call task of this video call;
MMq denotes the video speech quality of this video call.
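Formulas (7)-(9) are given explicitly, so the combination can be sketched directly. Note that the expression for MS is not reproduced in this text (the coefficients m11..m14 it presumably uses are only listed), so MS is treated as a precomputed input here; all coefficient values in the usage are hypothetical:

```python
def audiovisual_quality(sq, vq, m5, m6, m7, m8):
    # Formula (7): MMSV = m5*Sq + m6*Vq + m7*Sq*Vq + m8
    return m5 * sq + m6 * vq + m7 * sq * vq + m8

def delay_sync_degradation(ts, tv, ms, m9, m10):
    # Formula (8): MMT = max{AD + MS, 1}, with AD = m9*(TS + TV) + m10.
    # The MS expression is not reproduced in the text, so MS is an input.
    ad = m9 * (ts + tv) + m10
    return max(ad + ms, 1.0)

def video_speech_quality(mmsv, mmt, m1, m2, m3, m4):
    # Formula (9): MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4
    return m1 * mmsv + m2 * mmt + m3 * mmsv * mmt + m4
```

For example, with assumed coefficients, Sq = 4.0 and Vq = 3.5 give an audiovisual quality which is then degraded by the delay/synchronization term to yield the overall MMq.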
Thus, the video speech quality assessment method provided by the embodiment of the present invention can assess the video speech quality of VoLTE, and on this basis assess video speech quality at scale across the whole network.
Specifically, the correlation coefficients m1, m2, m3, m4 in formula (9) can be determined from empirical values.
Preferably, in order to improve the accuracy of the determined video speech quality, when determining the video speech quality the present invention may first determine the correlation coefficients in formula (9) with a linear fitting algorithm, and then use the fitted correlation coefficients to determine the video speech quality. Reference may be made to the process shown in Fig. 7, which includes the following steps:
S51: when a video call is detected, extract several video call samples.
S52: determine the subjective scores of the video call samples.
After the video speech quality is obtained in the network by executing steps S11-S14, individual calls are extracted as test sequences, and the known test sequences are scored subjectively according to Recommendation P.800.
In specific implementation, the subjective scores of the video call samples can be determined with ITU-T P.800, which defines a method of subjective testing, namely the MOS (Mean Opinion Score) test. In this test method, the behavior of users watching video is surveyed and quantified: different surveyed users compare their subjective impressions of the original reference video and the degraded video after transmission over the wireless network, and then give scores. Scoring can follow the Absolute Category Rating (ACR) scheme, which generally uses 5 grades, as shown in Table 1:
Subjective feeling | Score value
Excellent | 5 |
Good | 4 |
Fair | 3 |
Poor | 2 |
Bad | 1 |
According to the correspondence between subjective feeling and score in Table 1, the subjective score of each video call sample can be obtained.
S53: based on the audiovisual quality and the degree of audiovisual quality decline caused by audio-video delay and synchronization, and combined with the subjective scores of the video call samples, obtain the correlation coefficients m1, m2, m3, m4 for evaluating the overall quality of this video call.
In specific implementation, the least squares method can be used to fit and estimate the correlation coefficients in formula (9), so that the video speech quality determined from the fitted correlation coefficients is more accurate.
Specifically, MMq, MMSV and MMT in formula (9) are regarded as linearly related to m1, m2, m3, m4: MMSV and MMT are taken as independent variables and MMq as the dependent variable. In the parameter fitting process, the subjective score obtained for each extracted sample is taken as the measured value of MMq, which yields several linear equations; fitting these according to the least squares principle then gives the optimal values of the correlation coefficients m1, m2, m3, m4.
For example, suppose 10 video call samples are extracted and 10 scores are obtained by subjectively scoring these 10 samples according to P.800; these 10 scores are then used as the values of the dependent variable MMq. Using formula (7) and formula (8) of the present invention, the MMSV and MMT values corresponding to these 10 samples can be determined respectively. Substituting these 10 groups of values together with the 10 corresponding scores into formula (9) gives 10 linear equations, from which an optimal group of correlation coefficient values can be obtained. Furthermore, a cross-validation algorithm can also be used to obtain the optimal correlation coefficient values.
After the correlation coefficients m1, m2, m3, m4 are determined, substituting them into formula (9) gives the video speech quality of this video call.
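The fitting step described above can be sketched with an ordinary least-squares solve. The design matrix has one column per coefficient of formula (9); the function name and the synthetic data in the test are this sketch's own:

```python
import numpy as np

def fit_coefficients(mmsv, mmt, mos_scores):
    """Fit m1..m4 of formula (9) by least squares: each subjectively
    scored call sample yields one linear equation
    MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4,
    with the P.800 score standing in for MMq."""
    mmsv = np.asarray(mmsv, dtype=float)
    mmt = np.asarray(mmt, dtype=float)
    y = np.asarray(mos_scores, dtype=float)
    # One column per coefficient; the last column of ones carries the intercept m4.
    a = np.column_stack([mmsv, mmt, mmsv * mmt, np.ones_like(mmsv)])
    coeffs, *_ = np.linalg.lstsq(a, y, rcond=None)
    return coeffs  # [m1, m2, m3, m4]
```

With 10 samples and 4 unknowns the system is overdetermined, which is exactly the situation the least-squares principle handles; cross-validation, as the text notes, can then guard against overfitting the coefficient values.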
With the video speech quality assessment method provided by Embodiment 1 of the present invention, when a video call is detected, the video parameter and the audio parameter of this video call are obtained respectively; the video quality of this video call is determined from the video parameter, and the audio quality of this video call is determined from the audio parameter; and the video speech quality of this video call is determined from the video quality and the audio quality. With the method provided by the embodiment of the present invention, the video speech quality can be determined from the video call itself, without requiring the participation of a reference source, which improves the accuracy of the video speech quality assessment result and makes the method suitable for assessing video speech quality at scale. In addition, when a video call is detected, several video call samples can be extracted and their subjective scores determined; parameter samples are then extracted from the video parameter and the audio parameter respectively and, combined with the subjective scores of the video call samples, processed with a preset algorithm to obtain a parameter set used to indicate the overall quality of this video call. After the parameter set is obtained, the relevant parameters of different services can be adjusted according to the parameter set, thereby improving the accuracy of video speech quality assessment.
Embodiment 2
Based on the same inventive concept, the embodiment of the present invention further provides a video speech quality assessment device. Since the principle by which this device solves the problem is similar to that of the video speech quality assessment method, the implementation of the device may refer to the implementation of the method, and repeated description is omitted.
Fig. 8 is a schematic structural diagram of the video speech quality assessment device provided by Embodiment 2 of the present invention, which includes an acquiring unit 81, a first determination unit 82 and a second determination unit 83, in which:
the acquiring unit 81 is configured to obtain the video parameter and the audio parameter of this video call respectively when a video call is detected;
the first determination unit 82 is configured to determine the video quality of this video call from the video parameter, and to determine the audio quality of this video call from the audio parameter;
the second determination unit 83 is configured to determine the video speech quality of this video call by processing the video quality and the audio quality.
In specific implementation, the acquiring unit 81 is specifically configured to: perform data collection on this video call through at least one interface to collect video call Real-time Transport Protocol (RTP) data packets, the interface being arranged between any two adjacent transmission nodes in the data packet transmission link; decode the collected video call RTP data packets according to the communication protocol of each interface to obtain decoded video call RTP data packets; and obtain the video parameter and the audio parameter of this video call respectively from the decoded video call RTP data packets using a preset video codec standard.
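The parameters the acquiring unit needs (payload type for the codec, timestamps for delay, sequence numbers for loss) live in the fixed RTP header defined by RFC 3550. A minimal parser for that header is sketched below; the dictionary keys are this sketch's own naming, not the patent's:

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte RTP header (RFC 3550) of a collected
    video-call packet into its named fields."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,            # should be 2 for RTP
        "padding": (b0 >> 5) & 0x1,
        "extension": (b0 >> 4) & 0x1,
        "csrc_count": b0 & 0x0F,
        "marker": b1 >> 7,
        "payload_type": b1 & 0x7F,     # identifies the codec in use
        "sequence": seq,               # for packet-loss detection
        "timestamp": timestamp,        # media clock units (e.g. 90 kHz video)
        "ssrc": ssrc,                  # identifies the media stream
    }
```

Gaps in the `sequence` values of one `ssrc` yield the packet loss rate, and the `timestamp` field feeds the delay computations described below.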
Preferably, the video parameter and the audio parameter include coding parameters and transmission performance parameters, wherein the coding parameters include at least one of: codec type, end-to-end delay, bit rate, frame rate and maximum transmission bit rate, and the transmission performance parameters include at least one of: packet loss rate and transmission rate.
In specific implementation, the acquiring unit 81 is specifically configured to obtain the video end-to-end delay as follows: obtain the one-way delay of the video parameter, and determine the video end-to-end delay according to the obtained one-way delay of the video parameter.
Preferably, the acquiring unit 81 is further configured to obtain the audio end-to-end delay as follows: obtain the one-way delay of the audio parameter, and determine the audio end-to-end delay according to the obtained one-way delay of the audio parameter.
Specifically, the acquiring unit 81 is specifically configured to determine the one-way delay as follows: determine the Network Time Protocol (NTP) time difference between two adjacent video call RTP data packets; determine the timestamp difference between the two adjacent video call RTP data packets; and determine the one-way delay according to the determined NTP difference and timestamp difference.
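One plausible reading of this step: the NTP (wall-clock) gap between two adjacent packets of a stream, minus the gap implied by their media timestamps, measures the extra transit delay accumulated between them. The exact combination used by the embodiment is not spelled out, so the formula below is an assumption of this sketch:

```python
def one_way_delay(ntp_time_a, ntp_time_b, rtp_ts_a, rtp_ts_b, clock_rate_hz):
    """One-way delay sketch from two adjacent RTP packets of one stream:
    wall-clock arrival gap (NTP, seconds) minus the media timestamp gap
    (RTP ticks converted via the codec clock rate). A positive result
    indicates delay accumulated between the two packets."""
    ntp_diff = ntp_time_b - ntp_time_a                 # seconds
    ts_diff = (rtp_ts_b - rtp_ts_a) / clock_rate_hz    # seconds of media
    return ntp_diff - ts_diff
```

For 90 kHz video, two packets whose timestamps are 4500 ticks (50 ms of media) apart but which arrive 60 ms apart would show 10 ms of added one-way delay.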
In specific implementation, the first determination unit 82 is specifically configured to determine the video quality of this video call from the video parameter according to the following formula:
Wherein, Vq is the video quality of this video call;
Icoding is the video quality under coding distortion only;
PplV is the packet loss rate;
and the last parameter is the robustness degree of the video quality against packet loss.
Further, the first determination unit 82 is specifically configured to determine the video quality Icoding under coding distortion as follows:
Wherein, Ofr is the optimal frame rate at which the video quality reaches its maximum under the current video bit rate;
IOfr is the maximum value of the video quality under the current video bit rate;
FrV is the current frame rate;
DFrV is the robustness degree of the video quality against frame rate.
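The formula images for Vq and Icoding are not reproduced in this text, but the listed parameters (Ofr, IOfr, FrV, DFrV, PplV and a packet-loss robustness term) match the ITU-T G.1070 video-telephony model; the forms below follow that model and are an assumption of this sketch:

```python
import math

def coding_quality(i_ofr, fr_v, o_fr, d_frv):
    """Icoding sketch: video quality under coding distortion only,
    peaking at the optimal frame rate o_fr and falling off as a
    Gaussian in log frame rate (G.1070-style)."""
    return i_ofr * math.exp(-((math.log(fr_v) - math.log(o_fr)) ** 2)
                            / (2.0 * d_frv ** 2))

def video_quality(i_coding, ppl_v, d_pplv):
    """Vq sketch: coding quality attenuated exponentially by the packet
    loss rate ppl_v, with robustness d_pplv (G.1070-style, 1..5 scale)."""
    return 1.0 + i_coding * math.exp(-ppl_v / d_pplv)
```

At the optimal frame rate Icoding equals IOfr, and with zero packet loss Vq reduces to 1 + Icoding, consistent with the definitions above.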
In specific implementation, the first determination unit 82 is specifically configured to determine the quality index of the audio quality from the audio parameter, and
to determine the audio quality of this video call from the quality index according to the following formula:
Wherein, Qx is the quality index;
Sq is the audio quality of the video call.
Preferably, the first determination unit 82 is specifically configured to determine the quality index of the audio quality from the audio parameter according to the following formula:
Wherein, Idte,WB is the degradation caused by the caller's echo in this video call;
Ie-eff,WB is the degradation caused by speech coding and packet loss in this video call;
Qx is the quality index of the audio quality.
Further, second determination unit, specifically for determining Idte according to following formula:
Wherein, the expression formula of Re, WB are as follows: Re, WB=80+2.5* (TERV, WB-14);And the expression formula of TERV, WB
Are as follows:And
TELR is caller's echo loudness scale;
Ts is this audio frequency in video call end-to-end time delay.
Further, first determination unit 82, specifically for determining Ie-eff, WB according to following formula:
Wherein, IeS, WB is the voice coding distortion factor;
PplSFor voice packet loss in this video calling;
BplSFor the robustness of packets of voice packet loss in this video calling.
When it is implemented, second determination unit 83, is specifically used for according to the video quality and the audio quality,
The degree that the view quality of this video calling declines with audio-video delay and synchronous caused view quality is determined respectively;
Postpone the degree declined with synchronous caused view quality with the audio-video according to the view quality, under
State the video speech quality that formula determines this video calling:
MMq=m1*MMSV+m2*MMT+m3*MMSV*MMT+m4
Wherein, MMSVIndicate the view quality;
MMTIndicate the degree of the audio-video delay with synchronous caused view quality decline;
m1,m2,m3,m4For related coefficient, the video depending on this video calling shows size and call task;
MMqIndicate the video speech quality of this video calling.
Preferably, second determination unit 83, it is specifically used for according to the following equation according to the video quality and described
Audio quality determines the view quality of this video calling:
MMSV=m5*Sq+m6*Vq+m7*Sq*Vq+m8
Wherein, MMSVFor the view quality of this video calling;
SqFor the audio quality;
VqFor the video quality;
m5,m6,m7,m8For related coefficient, the video depending on this video calling shows size and call task.
Further, second determination unit 83 is specifically used for according to the following equation according to the video quality and sound
Frequency quality determines the degree that the audio-video delay of this video calling declines with synchronous caused view quality:
MMT=max { AD+MS, 1 }
Wherein, AD=m9*(TS+TV)+m10,
TSFor this audio frequency in video call end-to-end time delay;
TVFor video end-to-end time delay in this video calling;
AD is audiovisual time delay absolute in this video calling;
MS is audio-visual media synchronization value in this video calling;
MMTThe degree declined for the audio-video delay of this video calling with synchronous caused view quality;
m9,m10,m11,m12,m13,m14For related coefficient, the video depending on this video calling shows that size and call are appointed
Business.
Specifically, second determination unit, specifically for extracting several videos when detecting that video calling occurs
Call sample;And determine the subjective scoring of the video calling sample;And prolonged based on the view quality and the audio-video
The degree declined late with synchronous caused view quality, and in conjunction with the subjective scoring of the video calling sample, it obtains for commenting
The related coefficient m of this video calling total quality of valence1,m2,m3,m4。
Embodiment 3
Embodiment 3 of the present application provides a non-volatile computer storage medium storing computer-executable instructions, which can execute the video speech quality assessment method in any of the above method embodiments.
Embodiment 4
Fig. 9 is a schematic diagram of the hardware structure of an electronic device for implementing the video speech quality assessment method provided by Embodiment 4 of the present invention. As shown in Fig. 9, the electronic device includes:
one or more processors 910 and a memory 920, with one processor 910 taken as an example in Fig. 9.
The electronic device for executing the video speech quality assessment method may further include an input device 930 and an output device 940.
The processor 910, the memory 920, the input device 930 and the output device 940 may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 9.
The memory 920, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules/units corresponding to the video speech quality assessment method in the embodiment of the present application (for example, the acquiring unit 81, the first determination unit 82 and the second determination unit 83 shown in Fig. 8). By running the non-volatile software programs, instructions and modules/units stored in the memory 920, the processor 910 executes the various functional applications and data processing of the server or intelligent terminal, i.e., implements the video speech quality assessment method of the above method embodiments.
The memory 920 may include a program storage area and a data storage area, wherein the program storage area may store the operating system and the application programs required by at least one function, and the data storage area may store data created according to the use of the video speech quality assessment device, etc. In addition, the memory 920 may include a high-speed random access memory, and may also include a non-volatile memory, for example at least one magnetic disk memory device, flash memory device or other non-volatile solid-state memory device. In some embodiments, the memory 920 optionally includes memories remotely located relative to the processor 910, and these remote memories may be connected to the video speech quality assessment device through a network. Examples of such a network include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
The input device 930 can receive input numeric or character information and generate key signal inputs related to the user settings and function control of the video speech quality assessment device. The output device 940 may include a display device such as a display screen.
The one or more modules are stored in the memory 920 and, when executed by the one or more processors 910, execute the video speech quality assessment method in any of the above method embodiments.
The above product can execute the method provided by the embodiment of the present application and has the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present application.
The electronic device of the embodiment of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: such devices are characterized by mobile communication functions, with providing voice and data communication as the main goal. This type of terminal includes smart phones (e.g. iPhone), multimedia phones, functional phones and low-end phones, etc.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile Internet access characteristics. This type of terminal includes PDA, MID and UMPC devices, etc., such as iPad.
(3) Portable entertainment devices: such devices can display and play multimedia content. This type of device includes audio and video players (e.g. iPod), handheld devices, e-books, intelligent toys and portable vehicle-mounted navigation devices.
(4) Servers: devices providing computing services. A server consists of a processor, hard disk, memory, system bus, etc.; its architecture is similar to that of a general-purpose computer, but because highly reliable services must be provided, the requirements on processing capability, stability, reliability, security, scalability, manageability, etc. are higher.
(5) Other electronic devices with data interaction functions.
Embodiment 5
Embodiment 5 of the present application provides a computer program product, wherein the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to execute the video speech quality assessment method in any of the above method embodiments of the present application.
With the video speech quality assessment method and device provided by the embodiment of the present invention, when a video call is detected, the video parameter and the audio parameter of this video call are obtained respectively; the video quality of this video call is determined from the video parameter, and the audio quality of this video call is determined from the audio parameter; and the video speech quality of this video call is determined from the video quality and the audio quality. With the method provided by the embodiment of the present invention, the video speech quality can be determined from the video call itself, without requiring the participation of a reference source, which improves the accuracy of the video speech quality assessment result and makes the method suitable for assessing video speech quality at scale. In addition, when a video call is detected, several video call samples can be extracted and their subjective scores determined; parameter samples are then extracted from the video parameter and the audio parameter respectively and, combined with the subjective scores of the video call samples, processed with a preset algorithm to obtain a parameter set used to indicate the overall quality of this video call. After the parameter set is obtained, the relevant parameters of different services can be adjusted according to the parameter set, thereby improving the accuracy of video speech quality assessment.
The video speech quality assessment device provided by the embodiments of the present application can be implemented by a computer program. Those skilled in the art should understand that the above module division is only one of many possible module divisions; as long as the video speech quality assessment device has the above functions, other module divisions, or no module division at all, should fall within the protection scope of the present application.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system or a computer program product. Therefore, the present invention may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk memories, CD-ROMs, optical memories, etc.) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can guide a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured article including an instruction device, which realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and thus the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they know the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include these modifications and variations.
Claims (34)
1. A video speech quality assessment method, characterized by comprising:
when a video call is detected, obtaining the video parameter and the audio parameter of this video call respectively;
determining the video quality of this video call according to the video parameter, and determining the audio quality of this video call according to the audio parameter; and
determining the video speech quality of this video call according to the video quality and the audio quality.
2. The method according to claim 1, characterized in that obtaining the video parameter and the audio parameter of this video call respectively when a video call is detected specifically comprises:
performing data collection on this video call through at least one interface to collect video call Real-time Transport Protocol (RTP) data packets, the interface being arranged between any two adjacent transmission nodes in the data packet transmission link;
decoding the collected video call RTP data packets according to the communication protocol of each interface to obtain decoded video call RTP data packets;
obtaining the video parameter and the audio parameter of this video call respectively from the decoded video call RTP data packets using a preset video codec standard.
3. The method according to claim 2, characterized in that the video parameter and the audio parameter include coding parameters and transmission performance parameters, wherein the coding parameters include at least one of: codec type, end-to-end delay, bit rate, frame rate and maximum transmission bit rate, and the transmission performance parameters include at least one of: packet loss rate and transmission rate.
4. The method according to claim 3, characterized in that the video end-to-end delay is obtained as follows:
obtaining the one-way delay of the video parameter;
determining the video end-to-end delay according to the one-way delay of the video parameter.
5. The method according to claim 3, characterized in that the audio end-to-end delay is obtained as follows:
obtaining the one-way delay of the audio parameter;
determining the audio end-to-end delay according to the one-way delay of the audio parameter.
6. The method according to claim 4 or 5, characterized in that the one-way delay is determined as follows:
determining the Network Time Protocol (NTP) time difference between two adjacent video call RTP data packets; and
determining the timestamp difference between the two adjacent video call RTP data packets;
determining the one-way delay according to the determined NTP difference and timestamp difference.
7. The method according to claim 3, characterized in that the video quality of this video call is determined from the video parameter according to the following formula:
wherein, Vq is the video quality of this video call;
Icoding is the video quality under coding distortion only;
PplV is the packet loss rate;
and the last parameter is the robustness degree of the video quality against packet loss.
8. The method according to claim 7, characterized in that the video quality Icoding under coding distortion is determined as follows:
wherein, Ofr is the optimal frame rate at which the video quality reaches its maximum under the current video bit rate;
IOfr is the maximum value of the video quality under the current video bit rate;
FrV is the current frame rate;
DFrV is the robustness degree of the video quality against frame rate.
9. The method according to claim 3, characterized in that determining the audio quality of this video call according to the audio parameter specifically comprises:
determining the quality index of the audio quality according to the audio parameter; and
determining the audio quality of this video call from the quality index according to the following formula:
wherein, Qx is the quality index;
Sq is the audio quality of the video call.
10. The method according to claim 9, characterized in that the quality index of the audio quality is determined from the audio parameter according to the following formula:
wherein, Idte,WB is the degradation caused by the caller's echo in this video call;
Ie-eff,WB is the degradation caused by speech coding and packet loss in this video call;
Qx is the quality index of the audio quality.
11. method as claimed in claim 10, which is characterized in that determine Idte, WB according to following formula:
Wherein, the expression formula of Re, WB are as follows: Re, WB=80+2.5* (TERV, WB-14);And the expression formula of TERV, WB are as follows:And
TELR is caller's echo loudness scale;
This audio frequency in video call end-to-end time delay of Ts.
12. The method of claim 10, characterized in that Ie-eff,WB is determined according to the following formula:
wherein IeS,WB is the speech coding distortion factor;
PplS is the speech packet loss rate in this video call;
BplS is the robustness of the speech packets against packet loss in this video call.
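In the wideband E-model (ITU-T G.107/G.107.1), the effective equipment impairment combines the codec impairment with packet loss and the codec's loss-robustness factor. A sketch under that assumption, with the burst ratio taken as 1 (random loss):

```python
def ie_eff_wb(ie_s_wb, ppl_s, bpl_s):
    """Effective equipment impairment Ie-eff,WB (E-model style, assumed).

    ie_s_wb: wideband codec impairment at zero loss;
    ppl_s:   speech packet-loss percentage;
    bpl_s:   packet-loss robustness factor of the codec."""
    return ie_s_wb + (95.0 - ie_s_wb) * ppl_s / (ppl_s + bpl_s)
```

With zero loss the impairment reduces to the pure codec term; as loss grows it approaches the E-model ceiling of 95.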
13. The method of claim 7 or 9, characterized in that determining the video call quality of this video call according to the video quality and the audio quality specifically includes:
determining, according to the video quality and the audio quality, the audio-visual quality of this video call and the degree of audio-visual quality degradation caused by audio-video delay and synchronization, respectively; and
determining the video call quality of this video call from the audio-visual quality and the degree of degradation caused by audio-video delay and synchronization, according to the following formula:
MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4
wherein MMSV denotes the audio-visual quality;
MMT denotes the degree of audio-visual quality degradation caused by audio-video delay and synchronization;
m1, m2, m3, m4 are correlation coefficients that depend on the video display size and the call task of this video call;
MMq denotes the video call quality of this video call.
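The integration formula of claim 13 is given explicitly and is straightforward to compute; the coefficient values below are hypothetical, since the patent leaves m1..m4 to be fitted per display size and call task:

```python
def multimedia_quality(mm_sv, mm_t, m1, m2, m3, m4):
    """MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4 (claim 13's formula)."""
    return m1 * mm_sv + m2 * mm_t + m3 * mm_sv * mm_t + m4
```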
14. The method of claim 13, characterized in that the audio-visual quality of this video call is determined from the video quality and the audio quality according to the following formula:
MMSV = m5*Sq + m6*Vq + m7*Sq*Vq + m8
wherein MMSV is the audio-visual quality of this video call;
Sq is the audio quality;
Vq is the video quality;
m5, m6, m7, m8 are correlation coefficients that depend on the video display size and the call task of this video call.
15. The method of claim 13, characterized in that the degree of audio-visual quality degradation caused by the audio-video delay and synchronization of this video call is determined from the video quality and the audio quality according to the following formulas:
MMT = max{AD + MS, 1}
wherein AD = m9*(TS + TV) + m10;
TS is the audio end-to-end delay in this video call;
TV is the video end-to-end delay in this video call;
AD is the absolute audio-visual delay in this video call;
MS is the audio-visual media synchronization value in this video call;
MMT is the degree of audio-visual quality degradation caused by audio-video delay and synchronization in this video call;
m9, m10, m11, m12, m13, m14 are correlation coefficients that depend on the video display size and the call task of this video call.
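The two formulas claim 15 does give can be sketched directly; the MS expression (presumably involving m11..m14) is not reproduced in this extraction, so MS is taken as an input here:

```python
def delay_sync_degradation(t_s, t_v, ms, m9, m10):
    """MMT = max(AD + MS, 1), with AD = m9*(TS + TV) + m10 (claim 15).

    t_s, t_v: audio and video end-to-end delays; ms: media
    synchronization value (its own formula is omitted in the source)."""
    ad = m9 * (t_s + t_v) + m10
    return max(ad + ms, 1.0)
```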
16. The method of claim 13, characterized in that the correlation coefficients m1, m2, m3, m4 are determined by the following method:
when a video call is detected, extracting several video call samples;
determining subjective scores for the video call samples; and
obtaining the correlation coefficients m1, m2, m3, m4 for evaluating the overall quality of this video call, based on the audio-visual quality and the degree of audio-visual quality degradation caused by audio-video delay and synchronization, combined with the subjective scores of the video call samples.
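Claim 16 fits the coefficients against subjective scores without specifying the fitting procedure. A minimal sketch using ordinary least squares on the design matrix [MMSV, MMT, MMSV*MMT, 1] (numpy assumed; the patent does not name a regression method):

```python
import numpy as np

def fit_coefficients(mm_sv, mm_t, subjective_scores):
    """Fit m1..m4 so that m1*MMSV + m2*MMT + m3*MMSV*MMT + m4 best
    matches the subjective scores, via ordinary least squares."""
    mm_sv = np.asarray(mm_sv, float)
    mm_t = np.asarray(mm_t, float)
    X = np.column_stack([mm_sv, mm_t, mm_sv * mm_t,
                         np.ones_like(mm_sv)])
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(subjective_scores, float),
                                 rcond=None)
    return coeffs  # m1, m2, m3, m4
```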
17. A video call quality assessment device, characterized by comprising:
an acquiring unit, configured to obtain the video parameters and audio parameters of this video call, respectively, when a video call is detected;
a first determination unit, configured to determine the video quality of this video call according to the video parameters, and to determine the audio quality of this video call according to the audio parameters; and
a second determination unit, configured to determine the video call quality of this video call according to the video quality and the audio quality.
18. The device of claim 17, characterized in that
the acquiring unit is specifically configured to: collect Real-time Transport Protocol (RTP) data packets of this video call using at least one interface, each interface being arranged between any two adjacent transmission nodes in the data packet transmission link; decode the collected video call RTP data packets according to the communication protocol of each interface to obtain decoded video call RTP data packets; and obtain the video parameters and audio parameters of this video call, respectively, from the decoded video call RTP data packets using a preset video codec standard.
19. The device of claim 18, characterized in that the video parameters and the audio parameters include coding parameters and transmission performance parameters, wherein the coding parameters include at least one of: codec type, end-to-end delay, bit rate, frame rate, and maximum transmission bit rate; and the transmission performance parameters include at least one of: packet loss rate and transmission rate.
20. The device of claim 19, characterized in that
the acquiring unit is specifically configured to obtain the video end-to-end delay by the following method: obtaining the one-way delay of the video parameters; and determining the video end-to-end delay according to the one-way delay of the video parameters.
21. The device of claim 18, characterized in that
the acquiring unit is further configured to obtain the audio end-to-end delay by the following method: obtaining the one-way delay of the audio parameters; and determining the audio end-to-end delay according to the one-way delay of the audio parameters.
22. The device of claim 20 or 21, characterized in that
the acquiring unit is specifically configured to determine the one-way delay by the following method: determining the Network Time Protocol (NTP) time difference of two adjacent video call RTP data packets; determining the timestamp difference of the two adjacent video call RTP data packets; and determining the one-way delay according to the determined NTP time difference and timestamp difference.
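Claim 22 derives the one-way delay from the NTP time difference and the RTP timestamp difference of two adjacent packets. A sketch under one interpretation of that step (the function name is illustrative): the sender-clock spacing comes from the RTP timestamps, the receiver-observed spacing from the NTP capture times, and their difference is the change in one-way delay between the two packets:

```python
def one_way_delay_delta(ntp1, ntp2, rtp_ts1, rtp_ts2, clock_rate):
    """Change in one-way delay between two adjacent RTP packets.

    ntp1, ntp2:       NTP capture times at the probe, in seconds;
    rtp_ts1, rtp_ts2: RTP timestamps (sender media clock ticks);
    clock_rate:       RTP clock rate in Hz (e.g. 90000 for video).

    Receiver-side gap minus sender-side gap gives the delay variation;
    accumulating it tracks the one-way delay trend."""
    recv_gap = ntp2 - ntp1
    send_gap = (rtp_ts2 - rtp_ts1) / clock_rate
    return recv_gap - send_gap
```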
23. The device of claim 19, characterized in that
the first determination unit is specifically configured to determine the video quality of this video call from the video parameters according to the following formula:
wherein Vq is the video quality of this video call;
Icoding is the video quality under coding distortion;
PplV is the packet loss rate;
DPplV is the degree of robustness of the video quality against packet loss.
24. The device of claim 23, characterized in that
the first determination unit is specifically configured to determine the video quality under coding distortion, Icoding, by the following method:
wherein Ofr is the optimal frame rate at which the video quality is maximized at the current video bit rate;
IOfr is the maximum value of the video quality at the current video bit rate;
FrV is the current frame rate;
DFrV is the degree of robustness of the video quality against frame rate.
25. The device of claim 19, characterized in that
the first determination unit is specifically configured to determine a quality index of the audio quality according to the audio parameters, and to determine the audio quality of this video call from the quality index according to the following formula:
wherein Qx is the quality index;
Sq is the audio quality of the video call.
26. The device of claim 25, characterized in that
the first determination unit is specifically configured to determine the quality index of the audio quality from the audio parameters according to the following formula:
wherein Idte,WB is the degradation caused by talker echo in this video call;
Ie-eff,WB is the degradation caused by speech coding and packet loss in this video call;
Qx is the quality index of the audio quality.
27. The device of claim 26, characterized in that
the first determination unit is specifically configured to determine Idte,WB according to the following formula:
wherein Re,WB has the expression Re,WB = 80 + 2.5*(TERV,WB - 14), and TERV,WB has its own expression; and
TELR is the talker echo loudness rating;
Ts is the audio end-to-end delay in this video call.
28. The device of claim 26, characterized in that
the first determination unit is specifically configured to determine Ie-eff,WB according to the following formula:
wherein IeS,WB is the speech coding distortion factor;
PplS is the speech packet loss rate in this video call;
BplS is the robustness of the speech packets against packet loss in this video call.
29. The device of claim 23 or 25, characterized in that
the second determination unit is specifically configured to: determine, according to the video quality and the audio quality, the audio-visual quality of this video call and the degree of audio-visual quality degradation caused by audio-video delay and synchronization, respectively; and determine the video call quality of this video call from the audio-visual quality and the degree of degradation caused by audio-video delay and synchronization, according to the following formula:
MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4
wherein MMSV denotes the audio-visual quality;
MMT denotes the degree of audio-visual quality degradation caused by audio-video delay and synchronization;
m1, m2, m3, m4 are correlation coefficients that depend on the video display size and the call task of this video call;
MMq denotes the video call quality of this video call.
30. The device of claim 29, characterized in that
the second determination unit is specifically configured to determine the audio-visual quality of this video call from the video quality and the audio quality according to the following formula:
MMSV = m5*Sq + m6*Vq + m7*Sq*Vq + m8
wherein MMSV is the audio-visual quality of this video call;
Sq is the audio quality;
Vq is the video quality;
m5, m6, m7, m8 are correlation coefficients that depend on the video display size and the call task of this video call.
31. The device of claim 29, characterized in that
the second determination unit is specifically configured to determine the degree of audio-visual quality degradation caused by the audio-video delay and synchronization of this video call from the video quality and the audio quality according to the following formulas:
MMT = max{AD + MS, 1}
wherein AD = m9*(TS + TV) + m10;
TS is the audio end-to-end delay in this video call;
TV is the video end-to-end delay in this video call;
AD is the absolute audio-visual delay in this video call;
MS is the audio-visual media synchronization value in this video call;
MMT is the degree of audio-visual quality degradation caused by audio-video delay and synchronization in this video call;
m9, m10, m11, m12, m13, m14 are correlation coefficients that depend on the video display size and the call task of this video call.
32. The device of claim 29, characterized in that
the second determination unit is specifically configured to: extract several video call samples when a video call is detected; determine subjective scores for the video call samples; and obtain the correlation coefficients m1, m2, m3, m4 for evaluating the overall quality of this video call, based on the audio-visual quality and the degree of audio-visual quality degradation caused by audio-video delay and synchronization, combined with the subjective scores of the video call samples.
33. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; characterized in that the processor, when executing the program, implements the video call quality assessment method of any one of claims 1 to 16.
34. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the program implements the steps of the video call quality assessment method of any one of claims 1 to 16.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710614327.1A CN109302603A (en) | 2017-07-25 | 2017-07-25 | A kind of video speech quality appraisal procedure and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109302603A true CN109302603A (en) | 2019-02-01 |
Family
ID=65167398
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710614327.1A Pending CN109302603A (en) | 2017-07-25 | 2017-07-25 | A kind of video speech quality appraisal procedure and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109302603A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101911714A (en) * | 2008-01-08 | 2010-12-08 | 日本电信电话株式会社 | Image quality estimation device, method, and program |
CN101547259A (en) * | 2009-04-30 | 2009-09-30 | 华东师范大学 | VoIP test method based on analog data flow |
CN103379358A (en) * | 2012-04-23 | 2013-10-30 | 华为技术有限公司 | Method and device for assessing multimedia quality |
CN103634577A (en) * | 2012-08-22 | 2014-03-12 | 华为技术有限公司 | Multimedia quality monitoring method and apparatus |
CN104539943A (en) * | 2012-08-22 | 2015-04-22 | 华为技术有限公司 | Method and device for monitoring multimedia quality |
Non-Patent Citations (1)
Title |
---|
HAYASHI T等: "Multimedia quality integration function for videophone services", 《IEEE GLOBAL TELECOMMUNICATIONS CONFERENCE》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109714557A (en) * | 2017-10-25 | 2019-05-03 | 中国移动通信集团公司 | Method for evaluating quality, device, electronic equipment and the storage medium of video calling |
CN111479109A (en) * | 2020-03-12 | 2020-07-31 | 上海交通大学 | Video quality evaluation method, system and terminal based on audio-visual combined attention |
CN111479105B (en) * | 2020-03-12 | 2021-06-04 | 上海交通大学 | Video and audio joint quality evaluation method and device |
CN111479107B (en) * | 2020-03-12 | 2021-06-08 | 上海交通大学 | No-reference audio and video joint quality evaluation method based on natural audio and video statistics |
CN111479106B (en) * | 2020-03-12 | 2021-06-29 | 上海交通大学 | Two-dimensional quality descriptor fused audio and video joint quality evaluation method and terminal |
CN113840131A (en) * | 2020-06-08 | 2021-12-24 | 中国移动通信有限公司研究院 | Video call quality evaluation method and device, electronic equipment and readable storage medium |
CN112118442A (en) * | 2020-09-18 | 2020-12-22 | 平安科技(深圳)有限公司 | AI video call quality analysis method, device, computer equipment and storage medium |
WO2021174879A1 (en) * | 2020-09-18 | 2021-09-10 | 平安科技(深圳)有限公司 | Ai video call quality analysis method and apparatus, computer device, and storage medium |
CN114258069A (en) * | 2021-12-28 | 2022-03-29 | 北京东土拓明科技有限公司 | Voice call quality evaluation method and device, computing equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109302603A (en) | A kind of video speech quality appraisal procedure and device | |
Jelassi et al. | Quality of experience of VoIP service: A survey of assessment approaches and open issues | |
Chen et al. | A lightweight end-side user experience data collection system for quality evaluation of multimedia communications | |
JP4965659B2 (en) | How to determine video quality | |
CN103957216B (en) | Based on characteristic audio signal classification without reference audio quality evaluating method and system | |
US11748643B2 (en) | System and method for machine learning based QoE prediction of voice/video services in wireless networks | |
CN102057634B (en) | Audio quality estimation method and audio quality estimation device | |
CN102014126B (en) | Voice experience quality evaluation platform based on QoS (quality of service) and evaluation method | |
CN113067808B (en) | Data processing method, live broadcast method, authentication server and live broadcast data server | |
da Silva et al. | Quality assessment of interactive voice applications | |
US20100329360A1 (en) | Method and apparatus for svc video and aac audio synchronization using npt | |
CN109714557A (en) | Method for evaluating quality, device, electronic equipment and the storage medium of video calling | |
Goudarzi et al. | Audiovisual quality estimation for video calls in wireless applications | |
Jelassi et al. | A study of artificial speech quality assessors of VoIP calls subject to limited bursty packet losses | |
Wuttidittachotti et al. | Quality evaluation of mobile networks using VoIP applications: a case study with Skype and LINE based-on stationary tests in Bangkok | |
JP2006324865A (en) | Device, method and program for estimating network communication service satisfaction level | |
Osmanovic et al. | Impact of media-related SIFs on QoE for H. 265/HEVC video streaming | |
DE602004004577T2 (en) | Method and device for determining the language latency by a network element of a communication network | |
Daengsi et al. | Speech quality assessment of VoIP: G. 711 VS G. 722 based on interview tests with Thai users | |
Saidi et al. | Audiovisual quality study for videoconferencing on IP networks | |
Duque et al. | Quality assessment for video streaming P2P application over wireless mesh network | |
Rodriguez et al. | Assessment of quality-of-experience in telecommunication services | |
Kumar et al. | Comparison of popular video conferencing apps using client-side measurements on different backhaul networks | |
CN106993308A (en) | A kind of QoS of voice monitoring method, equipment and the system of VoLTE networks | |
Casas et al. | End-2-end evaluation of ip multimedia services, a user perceived quality of service approach |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190201 |