CN109302603A - Video and speech quality assessment method and device - Google Patents

Video and speech quality assessment method and device

Info

Publication number
CN109302603A
Authority
CN
China
Prior art keywords
video
quality
audio
calling
video calling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710614327.1A
Other languages
Chinese (zh)
Inventor
王亚楠
张志敏
李欣然
王嘉
裴夺飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Beijing Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Beijing Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201710614327.1A
Publication of CN109302603A
Pending legal-status Critical Current


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N17/00: Diagnosis, testing or measuring for television systems or their details
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60: Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63: Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STBs; Communication protocols; Addressing
    • H04N21/643: Communication protocols
    • H04N21/6437: Real-time Transport Protocol [RTP]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/141: Systems for two-way working between two video terminals, e.g. videophone

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a video and speech quality assessment method and device. The method comprises: when a video call is detected, separately obtaining the video parameters and audio parameters of the call; determining the video quality of the call from the video parameters and the audio quality of the call from the audio parameters; and determining the video and speech quality of the call from the video quality and the audio quality. With the method provided by the embodiments of the invention, the video and speech quality can be determined from the video call itself, without the involvement of a reference source, and the accuracy of the assessment result is also improved.

Description

Video and speech quality assessment method and device
Technical field
The present invention relates to the technical field of data processing, and in particular to a video and speech quality assessment method and device.
Background technique
At present, quality assessment methods for VoLTE (Voice over Long Term Evolution) video calls mainly fall into the following three categories. (1) Subjective assessment methods: ITU-T P.910 and P.911 respectively describe the assessment of the video quality and the audiovisual quality of multimedia applications. P.910 specifies non-interactive subjective assessment methods for the one-way overall video quality of multimedia applications (such as video conferencing, storage and retrieval applications, and video medical applications) and defines four test methods: ACR (Absolute Category Rating), ACR-HR (Absolute Category Rating with Hidden Reference), DCR (Degradation Category Rating) and PC (Pair Comparison). (2) Full-reference assessment based on the recommended PEVQ (Perceptual Evaluation of Video Quality). (3) Assessment methods based on a data packet-layer model, which evaluate quality from transport-layer packet information; in VoLTE this is mainly done by analysing RTP (Real-time Transport Protocol) packets.
Each of the above three approaches has limitations when applied to network-wide VoLTE testing: (1) subjective assessment requires human participation, so it can only be applied to sampled checks, cannot cover a large number of users and cannot provide real-time assessment; (2) the full-reference PEVQ method requires a reference source to be set up, so it is neither suitable for the conversational model nor able to assess all users in the network; (3) assessment methods based on a packet-layer model cannot automatically adjust their parameters for different services, have high computational complexity and a heavy computational load, and cannot perform full-volume calculation over massive numbers of video calls.
In summary, how to find an efficient video quality assessment method that works directly on the video call, so that the video and speech quality can be assessed without the involvement of a reference source while the accuracy of the assessment result is improved, is one of the technical problems that urgently needs to be solved.
Summary of the invention
Embodiments of the present invention provide a video and speech quality assessment method and device, so that the video and speech quality can be assessed without the involvement of a reference source while the accuracy of the assessment result is improved.
In a first aspect, an embodiment of the present invention provides a video and speech quality assessment method, comprising:
when a video call is detected, separately obtaining the video parameters and audio parameters of the call;
determining the video quality of the call from the video parameters, and determining the audio quality of the call from the audio parameters; and
determining the video and speech quality of the call from the video quality and the audio quality.
In a second aspect, an embodiment of the present invention provides a video and speech quality assessment device, comprising:
an acquiring unit, configured to separately obtain the video parameters and audio parameters of a video call when the call is detected;
a first determination unit, configured to determine the video quality of the call from the video parameters and the audio quality of the call from the audio parameters; and
a second determination unit, configured to determine the video and speech quality of the call from the video quality and the audio quality.
In a third aspect, an embodiment of the present invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and runnable on the processor; when the processor executes the program, it implements the video and speech quality assessment method provided by the present application.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements any of the steps of the video and speech quality assessment method provided by the present application.
Beneficial effects of the present invention:
With the video and speech quality assessment method and device provided by the embodiments of the present invention, when a video call is detected the video parameters and audio parameters of the call are obtained separately; the video quality of the call is determined from the video parameters and the audio quality of the call from the audio parameters; and the video and speech quality of the call is determined from the video quality and the audio quality. Using the method provided by the embodiments of the present invention, the video and speech quality can be determined from the video call itself, without the involvement of a reference source, and the accuracy of the assessment result is also improved.
Other features and advantages of the present invention will be set out in the following description, and will in part become apparent from the description or be understood by practising the invention. The objectives and other advantages of the invention can be realised and obtained by the structures particularly pointed out in the written description, the claims and the accompanying drawings.
Brief description of the drawings
The drawings described herein are provided for a further understanding of the present invention and constitute a part of it; the illustrative embodiments of the invention and their description are used to explain the invention and do not constitute an improper limitation of it. In the drawings:
Fig. 1a is a flow diagram of the video and speech quality assessment method provided by Embodiment 1 of the present invention;
Fig. 1b is a flow diagram of separately obtaining the video parameters and audio parameters of the video call, provided by Embodiment 1;
Fig. 1c is a schematic diagram of the process of configuring the parameters of a VoLTE video call established using the H.264 protocol, provided by Embodiment 1;
Fig. 2 is a flow diagram of obtaining the video end-to-end delay, provided by Embodiment 1;
Fig. 3 is a flow diagram of obtaining the audio end-to-end delay, provided by Embodiment 1;
Fig. 4 is a flow diagram of determining the one-way delay, provided by Embodiment 1;
Fig. 5 is a flow diagram of determining the audio quality of the video call, provided by Embodiment 1;
Fig. 6 is a flow diagram of determining the video and speech quality of the video call, provided by Embodiment 1;
Fig. 7 is a flow diagram of determining the correlation coefficients of the video and speech quality, provided by Embodiment 1;
Fig. 8 is a structural schematic diagram of the video and speech quality assessment device provided by Embodiment 2 of the present invention;
Fig. 9 is a schematic diagram of the hardware structure of the electronic device that implements the video and speech quality assessment method, provided by Embodiment 4 of the present invention.
Specific embodiment
Embodiments of the present invention provide a video and speech quality assessment method and device. When a video call is detected, the video parameters and audio parameters of the call are obtained separately; the video quality of the call is determined from the video parameters and the audio quality of the call from the audio parameters; and the video and speech quality of the call is determined from the video quality and the audio quality. This not only makes it possible to assess the video and speech quality without the involvement of a reference source, but also improves the accuracy of the assessment result.
The video and speech quality assessment method provided by the embodiments of the present invention can be applied to a VoLTE system to assess its video and speech quality; for convenience, the following description takes applying the method to a VoLTE system as an example.
The preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the invention and are not intended to limit it, and that, where no conflict arises, the embodiments of the invention and the features in the embodiments may be combined with each other.
Embodiment 1
As shown in Fig. 1a, the video and speech quality assessment method provided by Embodiment 1 of the present invention may include the following steps:
S11: when a video call is detected, separately obtain the video parameters and audio parameters of the call.
In specific implementation, the video parameters and audio parameters of the call can be obtained separately according to the method shown in Fig. 1b:
S111: use at least one interface to collect data of the video call and obtain the video-call Real-time Transport Protocol (RTP) packets.
In specific implementation, the media plane and the control plane can be used jointly to collect the data of the video call and thereby obtain the RTP (Real-time Transport Protocol) packets.
In specific implementation, the media-plane interface may include, but is not limited to, the Mb interface; the control-plane interfaces may include, but are not limited to, the S1-MME, S6a, S11, Mw, Gx and Rx interfaces.
Each interface is located between two adjacent transmission nodes on the packet transmission link.
In specific implementation, the above interfaces are placed in advance on the transmission links traversed by the packets generated by the video call; for example, the S1-MME interface is placed between the eNodeB and the MME (Mobility Management Entity), the S6a interface between the MME and the HSS, and the S11 interface between the MME and the S&PGW (Serving & Packet Data Network Gateway).
By extracting the RTP packets on these interfaces, the video or audio parameters of the call can be obtained on the basis of the H.264 video coding standard.
S112: decode the collected video-call RTP packets according to the communication protocol of each interface to obtain decoded video-call RTP packets.
S113: from the decoded video-call RTP packets, separately obtain the video parameters and audio parameters of the call using a preset video codec standard.
In specific implementation, the H.264 video codec standard can be used to obtain the video parameters and audio parameters from control-plane messages, for example from SIP/SDP messages; a sketch of such SDP parsing is given below.
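For illustration only, the following minimal Python sketch pulls the negotiated codec, clock rate and fmtp attributes (such as the H.264 profile-level-id) out of an SDP body carried in a SIP message; the regular expressions and the function name are assumptions chosen for a small self-contained example, not the exact parsing used by the invention.

```python
import re

def parse_sdp_media(sdp_text: str) -> dict:
    """Return {payload type: {codec, clock_rate, fmtp}} from an SDP body."""
    media = {}
    # a=rtpmap:<payload type> <codec>/<clock rate>, e.g. "a=rtpmap:99 H264/90000"
    for m in re.finditer(r"a=rtpmap:(\d+) ([\w.\-]+)/(\d+)", sdp_text):
        pt, codec, clock = m.groups()
        media[pt] = {"codec": codec, "clock_rate": int(clock), "fmtp": {}}
    # a=fmtp:<payload type> key=value;key=value, e.g. profile-level-id for H.264
    for m in re.finditer(r"a=fmtp:(\d+) (.+)", sdp_text):
        pt, params = m.groups()
        if pt in media:
            media[pt]["fmtp"] = dict(
                p.strip().split("=", 1) for p in params.split(";") if "=" in p
            )
    return media
```

For example, parse_sdp_media("a=rtpmap:99 H264/90000\na=fmtp:99 profile-level-id=42e01e") would return the codec name, the 90000 Hz clock rate and the profile-level-id used later for the bit-rate and frame-rate lookup.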
Preferably, the video parameters and the audio parameters each include coding parameters and transmission performance parameters, where the coding parameters include at least one of: codec type, end-to-end delay, bit rate, frame rate and maximum transmission bit rate, and the transmission performance parameters include at least one of: packet loss rate and transmission rate.
In specific implementation, video and audio are transmitted over their respective bearers, and during bearer establishment the video and audio transmission rates can be obtained from the bearer parameters (QCI). In addition, by associating the video RTP stream with the audio RTP stream, the transmission rates of video and audio can be calculated in real time.
In specific implementation, the transmission bit rate can be determined in either of the following two ways:
Way 1: determine the maximum transmission bit rate from the bearer parameters of the VoLTE session establishment.
Specifically, during VoLTE session establishment a corresponding bearer is established for both audio and video, and the bearer carries a parameter giving its maximum transmission bit rate. The bearer parameters may be carried by several signalling messages, for example the Initial Context Setup message, and the maximum transmission bit rate of the audio or video can be determined from these bearer parameters.
Way 2: determine the actual transmission bit rate from the RTP packets.
Specifically, each RTP packet carries a count of the amount of data in the packet; summing the data amounts of all RTP packets of the session and dividing by the transmission duration gives the actual transmission bit rate, as sketched below.
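A minimal sketch of way 2 follows, assuming the packet list comes from the decoded captures described above; the tuple layout (arrival time in seconds, payload size in bytes) is an assumption for illustration.

```python
def actual_bitrate_kbps(rtp_packets):
    """rtp_packets: list of (arrival_time_seconds, payload_bytes), in arrival order."""
    if len(rtp_packets) < 2:
        return 0.0
    total_bits = 8 * sum(size for _, size in rtp_packets)   # total payload volume
    duration = rtp_packets[-1][0] - rtp_packets[0][0]       # capture duration
    return total_bits / duration / 1000.0 if duration > 0 else 0.0
```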
In specific implementation, the bit rate and frame rate among the audio and video parameters are obtained as follows. When a video call is made in a VoLTE system, the session parameters are negotiated during call setup, the main parameters being the bit rate and the frame rate, so the H.264/AVC protocol can be used to obtain the bit rate and frame rate of the call.
Specifically, the H.264/AVC protocol can be used in advance to establish the parameter configuration of the VoLTE video call; the configuration process is shown in Fig. 1c. With this configuration, the individual physical-layer parameters in the VoLTE network can be looked up via the profile-level parameters.
In Fig. 1c, m=video indicates the media type; b=AS indicates the required bandwidth (kbps); b=RS and b=RR indicate the bandwidth adjustment of the control channel; a=rtpmap indicates the media type and sampling rate, where the sampling rate of 90000 is an H.264 parameter; a=fmtp indicates the additional parameters of the media format, the most important of which is profile-level-id, from which the bit rate, frame rate and maximum video bit rate (Max video bit rate) of the video media can be derived; a=rtcp-fb indicates the feedback parameters of the control channel.
It should be noted that H.264 is a video codec standard, and H.264 capability is negotiated during video session establishment, the negotiation being identified by a capability set and a number. An H.264 capability set is a list of one or more H.264 capabilities, and each H.264 capability includes the two mandatory parameters Profile and Level plus several optional parameters such as CustomMaxMBPS and CustomMaxFS. In H.264, the Profile defines the coding tools and algorithms used to generate the bit stream, while the Level places requirements on some key parameters. By looking up the Profile and Level tables of the associated H.264 codec, the bit rate and frame rate can be obtained.
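For illustration, the sketch below splits a profile-level-id (three hex bytes: profile_idc, constraint flags, level_idc) and looks up an abbreviated excerpt of the H.264 Annex A level limits; only the maximum video bit rate (kbit/s, Baseline/Main/Extended profiles) is tabulated here, the table covers common levels only, and the function name is illustrative.

```python
# Abbreviated excerpt of the per-level maximum bit rate (kbit/s).
H264_LEVEL_MAX_BITRATE_KBPS = {
    10: 64, 11: 192, 12: 384, 13: 768, 20: 2000, 21: 4000, 22: 4000,
    30: 10000, 31: 14000, 32: 20000, 40: 20000, 41: 50000, 42: 50000,
}

def parse_profile_level_id(plid_hex: str):
    profile_idc = int(plid_hex[0:2], 16)   # e.g. 0x42 = 66 = Baseline
    constraints = int(plid_hex[2:4], 16)   # constraint_set flags
    level_idc = int(plid_hex[4:6], 16)     # e.g. 0x1e = 30 = Level 3.0
    return profile_idc, constraints, level_idc, H264_LEVEL_MAX_BITRATE_KBPS.get(level_idc)
```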
Preferably, when determining the packet loss rate, the total number of packets and the total number of lost packets on the media plane of the video call in the VoLTE system are first determined from the RTP packets, and the packet loss rate of the call is then obtained by dividing the total number of lost packets by the total number of packets, as sketched below.
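A minimal sketch of this computation from the RTP sequence numbers of one media stream follows; for simplicity it assumes the 16-bit sequence number does not wrap within the analysed window.

```python
def rtp_packet_loss(seq_numbers):
    """Fraction of lost packets, from the RTP sequence numbers actually captured."""
    seqs = sorted(set(seq_numbers))
    if not seqs:
        return 0.0
    expected = seqs[-1] - seqs[0] + 1    # total packets that should have arrived
    lost = expected - len(seqs)          # total lost packets
    return lost / expected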
Preferably, when determining the audio codec type, the audio codec scheme is extracted from the audio codec negotiation of the SIP/SDP protocol; likewise, when determining the video codec type, the video codec scheme is extracted from the video codec negotiation of the SIP/SDP protocol.
Preferably, when determining the maximum transmission bit rate included in the audio parameters, the theoretical rate is obtained from the codec format, and the actual maximum transmission bit rate of the VoLTE audio bearer is then obtained by correlating the S1-MME and S11 interfaces.
Preferably, when determining the maximum transmission bit rate included in the video parameters, the theoretical rate is obtained from the codec format, and the actual maximum transmission bit rate of the VoLTE video bearer is then obtained by correlating the S1-MME and S11 interfaces.
Further, the end-to-end delay of the video call includes the audio end-to-end delay and the video end-to-end delay. Determining the audio end-to-end delay included in the audio parameters and the video end-to-end delay included in the video parameters is described separately below.
In specific implementation, when determining the audio end-to-end delay included in the audio parameters, the RTP protocol on the media-plane Mb interface is parsed to obtain the inter-packet intervals and timestamps of the audio RTP packets; the SIP messages on the Mw interface and the channel information obtained on the S1-MME and S11 interfaces are then used to associate the calling and called parties, from which the audio end-to-end delay is obtained.
In specific implementation, when determining the video end-to-end delay included in the video parameters, the RTP protocol on the media-plane Mb interface is parsed to obtain the inter-packet intervals and timestamps of the video RTP packets; the SIP messages on the Mw interface and the channel information obtained on the S1-MME and S11 interfaces are then used to associate the calling and called parties, from which the video end-to-end delay is obtained.
Specifically, from transmission to reception each voice or video packet passes through a sequence of hops: from the handset to the eNodeB, then to the Evolved Packet Core (EPC), then to the peer EPC, then to the peer eNodeB and finally to the peer handset. Each segment of this path has a link identifier and a packet identifier, such as the GTP tunnel ID and the RTP sequence number. By matching these identifiers segment by segment, each packet can be tracked end to end, and the end-to-end delay of the audio or video is then obtained from the sending and receiving timestamps.
Preferably, the video end-to-end delay can be determined according to the method shown in Fig. 2, which includes the following steps:
S21: obtain the one-way delay of the video.
In specific implementation, the one-way delay of the video can be determined according to steps S41 to S43: the timestamps of the RTP packets are first determined, and the one-way delay is then determined from the difference between the arrival times of two successive RTP packets and the difference between their timestamps.
It should be noted that the timestamp field in the RTP header carries the synchronisation information of the packet timing and is the key to restoring the data in the correct temporal order. The timestamp value gives the sampling instant of the first byte of data in the packet, and the sender's timestamp clock is required to be continuous and monotonically increasing, even when no data is being input or sent. During silence the sender need not send data but keeps the timestamp increasing; at the receiving end, since no sequence number of the received packets is missing, it is known that no data has been lost, and comparing the timestamp difference of successive packets is enough to determine the output time interval.
In addition, RTP requires the initial timestamp of a session to be chosen randomly, but the protocol specifies neither the unit of the timestamp nor the precise interpretation of its value; the granularity of the clock is instead determined by the payload type, so that different types of application can choose an output timing accuracy appropriate to their needs.
When RTP carries audio data, the chosen timestamp rate is generally the same as the sampling rate, but when carrying video data the timestamp rate must be greater than one tick per frame. The protocol also allows several packets sampled at the same instant to carry the same timestamp value.
S22: determine the video end-to-end delay from the one-way delay of the video.
Since a VoLTE call is a bidirectional flow, when determining the one-way delay the media planes of the calling and called parties can be associated through calling/called correlation, and the video end-to-end delay can then be determined from the one-way delay.
As shown in Fig. 3, the flow of obtaining the audio end-to-end delay provided by Embodiment 1 includes the following steps:
S31: obtain the one-way delay of the audio.
S32: determine the audio end-to-end delay from the one-way delay of the audio.
In specific implementation, reference can be made to the explanation of steps S21 and S22; repeated content is not described again.
In specific implementation, determining the video end-to-end delay or the audio end-to-end delay requires determining the one-way delay, which can be done according to the method shown in Fig. 4, comprising the following steps:
S41: determine the Network Time Protocol (NTP) time difference of two adjacent video-call RTP packets.
In specific implementation, take two adjacent packets RTP1 and RTP2 as an example: the NTP (Network Time Protocol) time corresponding to packet RTP1 is denoted NTP1 and the NTP time corresponding to packet RTP2 is denoted NTP2, so the NTP time difference of the two packets is NTP2 - NTP1.
S42: determine the timestamp difference of the two adjacent video-call RTP packets.
In specific implementation, the timestamps of packets RTP1 and RTP2 are denoted RTP1 and RTP2 respectively, so the timestamp difference of the two packets is RTP2 - RTP1.
S43: determine the one-way delay from the NTP difference and the timestamp difference so determined.
In specific implementation, the one-way delay can be expressed as:
one-way delay = (NTP2 - NTP1) - (RTP2 - RTP1) / clock frequency
Specifically, the media sampling frequency corresponding to the PT (payload type) field in the RTP header can be looked up to obtain the clock frequency.
It should be noted that the one-way delay referred to in Embodiment 1 is the delay incurred by each transmission from the originating end of the video call to the receiving end, or by each transmission from the receiving end back to the originating end. A sketch of this computation is given below.
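A minimal sketch of steps S41 to S43, assuming the NTP capture times are in seconds and the RTP timestamp difference is converted to seconds with the payload type's clock rate. The clock-rate values shown are illustrative assumptions (90000 Hz is standard for H.264 video; the audio value depends on the negotiated codec).

```python
ASSUMED_CLOCK_RATES = {"video": 90000, "audio": 16000}  # Hz, illustrative only

def one_way_delay(ntp1, ntp2, rtp_ts1, rtp_ts2, clock_rate):
    """Receiver-side spacing of two adjacent packets minus the sender-side spacing
    carried in their RTP timestamps, in seconds."""
    return (ntp2 - ntp1) - (rtp_ts2 - rtp_ts1) / clock_rate

# e.g. one_way_delay(ntp1, ntp2, ts1, ts2, ASSUMED_CLOCK_RATES["video"])
```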
Preferably, the video-call RTCP (Real-time Transport Control Protocol) packets can also be obtained and used to determine the video parameters and audio parameters, after which the methods of steps S21-S22, S31-S32 and S41-S43 are used to determine the video end-to-end delay and the audio end-to-end delay.
Specifically, the video quality and audio quality of the call can be determined using ITU-T G.1070, as described in detail below.
S12: determine the video quality of the call from the video parameters.
In specific implementation, the video quality of the call can be determined from the video parameters according to formula (1):
Vq = 1 + Icoding * exp(-PplV / DPplV)   (1)
where Vq is the video quality of the call;
Icoding is the video quality under coding distortion only;
PplV is the packet loss rate;
DPplV is the degree of robustness of the video quality to packet loss.
In specific implementation, the packet loss rate PplV is determined in step S113 and can be substituted directly into formula (1); the robustness factor can be expressed as DPplV = v10 + v11 * exp(-FrV / v8) + v12 * exp(-BrV / v9), where FrV can be determined via H.264, BrV denotes the video bit rate at the encoder, and the coefficients v8, v9, v10, v11 and v12 are constants that can be determined from the codec type, video format, key-frame interval and video display size.
In specific implementation, the video quality under coding distortion, Icoding, can be determined according to formula (2):
Icoding = IOfr * exp(-(ln(FrV) - ln(Ofr))^2 / (2 * DFrV^2))   (2)
where Ofr is the optimal frame rate at which the video quality is maximised under the current video bit rate;
IOfr is the maximum value of the video quality under the current video bit rate;
FrV is the current frame rate;
DFrV is the degree of robustness of the video quality to the frame rate.
In specific implementation, the optimal frame rate in formula (2) can be expressed as Ofr = v1 + v2 * BrV, with 1 <= Ofr <= 30 and v1, v2 constant, where BrV denotes the video bit rate at the encoder; the maximum video quality under the current video bit rate is IOfr = v3 - v3 / (1 + (BrV / v4)^v5), with v3, v4 and v5 constant; and the robustness factor in formula (2) is DFrV = v6 + v7 * BrV, with 0 < DFrV and v6, v7 constant. The constants v1, v2, v3, ..., v7 can be determined from the codec type, video format, key-frame interval and video display size.
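For illustration, the sketch below evaluates the video quality in the style of formulas (1) and (2). The coefficient values v1..v12 are not specified in this description, so they are passed in as assumed inputs obtained offline for a given codec type, video format, key-frame interval and display size.

```python
import math

def video_quality(br_v, fr_v, ppl_v, v):
    """br_v: video bit rate (kbit/s); fr_v: frame rate (fps); ppl_v: packet loss rate (%);
    v: dict mapping 1..12 to the coefficients v1..v12."""
    o_fr = min(max(v[1] + v[2] * br_v, 1.0), 30.0)            # optimal frame rate
    i_ofr = v[3] - v[3] / (1.0 + (br_v / v[4]) ** v[5])       # best quality at this bit rate
    d_frv = v[6] + v[7] * br_v                                # robustness to frame-rate mismatch
    i_coding = i_ofr * math.exp(
        -((math.log(fr_v) - math.log(o_fr)) ** 2) / (2.0 * d_frv ** 2))   # formula (2)
    d_pplv = v[10] + v[11] * math.exp(-fr_v / v[8]) + v[12] * math.exp(-br_v / v[9])
    return 1.0 + i_coding * math.exp(-ppl_v / d_pplv)         # formula (1)
```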
S13: determine the audio quality of the call from the audio parameters.
In specific implementation, when performing step S13 the audio quality of the call can be determined according to the method shown in Fig. 5:
S131: determine the quality index of the audio quality from the audio parameters.
In specific implementation, the quality index of the audio quality can be determined according to formula (3), in which:
Idte,WB is the degradation caused by talker echo in the call;
Ie-eff,WB is the degradation caused by speech coding and packet loss in the call;
Qx is the quality index of the audio quality.
Further, the degradation Idte,WB caused by talker echo in the call can be determined according to formula (4), in which Re,WB = 80 + 2.5 * (TERV,WB - 14), TERV,WB is computed from TELR and Ts, and:
TELR is the talker echo loudness rating;
Ts is the audio end-to-end delay of the call.
Further, the degradation Ie-eff,WB caused by speech coding and packet loss in the call can be determined according to formula (5):
Ie-eff,WB = IeS,WB + (95 - IeS,WB) * PplS / (PplS + BplS)   (5)
where IeS,WB is the speech coding distortion factor;
PplS is the speech packet loss rate in the call;
BplS is the robustness of the speech packets to packet loss in the call.
S132: determine the audio quality of the call from the quality index according to formula (6).
In specific implementation, formula (6) maps the quality index Qx, as determined by formula (3), to Sq, the audio quality of the video call.
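For illustration, a sketch of the audio side follows. The packet-loss term implements formula (5) as written above; because the exact expressions of formulas (3), (4) and (6) are not reproduced here, the mapping from the quality index Qx to Sq uses the standard E-model rating-to-MOS conversion as an assumed stand-in rather than the formula of the invention.

```python
def ie_eff_wb(ie_s_wb, ppl_s, bpl_s):
    """Formula (5): coding impairment raised by packet loss."""
    return ie_s_wb + (95.0 - ie_s_wb) * ppl_s / (ppl_s + bpl_s)

def speech_quality(qx):
    """Assumed stand-in for formula (6): E-model style mapping of Qx to a MOS-like Sq."""
    if qx <= 0:
        return 1.0
    if qx >= 100:
        return 4.5
    return 1.0 + 0.035 * qx + qx * (qx - 60.0) * (100.0 - qx) * 7.0e-6
```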
Specifically, the embodiment of the present invention places no restriction on the order in which steps S12 and S13 are performed.
S14: determine the video and speech quality of the call from the video quality and the audio quality.
In specific implementation, the video and speech quality of the call can be determined according to the flow shown in Fig. 6, which includes the following steps:
S141: from the video quality and the audio quality, determine the audiovisual quality of the call and the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation, respectively.
In specific implementation, the audiovisual quality of the call can be determined from the video quality and the audio quality according to formula (7):
MMSV = m5*Sq + m6*Vq + m7*Sq*Vq + m8   (7)
where MMSV is the audiovisual quality of the call;
Sq is the audio quality;
Vq is the video quality;
m5, m6, m7, m8 are correlation coefficients that depend on the video display size and the call task of the call.
Further, the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation in the call can be determined from the video quality and audio quality according to formula (8):
MMT = max{AD + MS, 1}   (8)
where AD = m9*(TS + TV) + m10 and MS is determined from the audio-video synchronisation offset;
TS is the audio end-to-end delay of the call;
TV is the video end-to-end delay of the call;
AD is the absolute audiovisual delay of the call;
MS is the audiovisual media synchronisation value of the call;
MMT is the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation in the call;
m9, m10, m11, m12, m13, m14 are correlation coefficients that depend on the video display size and the call task of the call.
S142: determine the video and speech quality of the call from the audiovisual quality and the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation, according to formula (9):
MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4   (9)
where MMSV denotes the audiovisual quality;
MMT denotes the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation;
m1, m2, m3, m4 are correlation coefficients that depend on the video display size and the call task of the call;
MMq denotes the video and speech quality of the call.
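For illustration, the following sketch combines the intermediate quantities according to formulas (7) to (9). The coefficients m1..m10 and the media-synchronisation value MS are assumed to be supplied by the caller, since the expression that would compute MS from m11..m14 is not reproduced here.

```python
def multimedia_quality(sq, vq, ts, tv, ms, m):
    """sq, vq: audio and video quality; ts, tv: audio and video end-to-end delay;
    ms: media synchronisation value; m: dict mapping 1..10 to m1..m10."""
    mm_sv = m[5] * sq + m[6] * vq + m[7] * sq * vq + m[8]            # formula (7)
    ad = m[9] * (ts + tv) + m[10]                                    # absolute audiovisual delay
    mm_t = max(ad + ms, 1.0)                                         # formula (8)
    return m[1] * mm_sv + m[2] * mm_t + m[3] * mm_sv * mm_t + m[4]   # formula (9)
```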
So far, the video and speech quality assessment method provided by the embodiments of the present invention can be used to assess the video and speech quality of VoLTE, and on that basis to assess massive numbers of video calls across the whole network.
Specifically, the correlation coefficients m1, m2, m3, m4 in formula (9) can be determined from empirical values.
Preferably, in order to improve the accuracy of the determined video and speech quality, the present invention can also first determine the correlation coefficients in formula (9) with a linear fitting algorithm and then use the fitted coefficients to determine the video and speech quality. The process, shown in Fig. 7, comprises the following steps:
S51: when a video call is detected, extract several video-call samples.
S52: determine the subjective scores of the video-call samples.
After the video and speech quality in the network has been obtained by performing steps S11 to S14, individual calls are extracted as test sequences, and the known test sequences are scored subjectively according to Recommendation P.800.
In specific implementation, the subjective scores of the video-call samples can be determined with ITU-T P.800, which defines a method of subjective testing, namely the MOS (Mean Opinion Score) test. In this test the behaviour of users watching the video is surveyed and quantified: different users compare their subjective impression of the original reference video with that of the degraded video after transmission over the wireless network, and then give their marks. The marking can follow the ACR (Absolute Category Rating) scale, which normally has 5 grades, as shown in Table 1:
Table 1
Subjective impression    Score
Excellent                5
Good                     4
Fair                     3
Poor                     2
Bad                      1
From the correspondence between subjective impression and score in Table 1, the subjective score of each video-call sample can be obtained.
S53: based on the audiovisual quality and the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation, and combined with the subjective scores of the video-call samples, obtain the correlation coefficients m1, m2, m3, m4 used to evaluate the overall quality of the call.
In specific implementation, the least-squares method can be used to fit the correlation coefficients of formula (9), so that the video and speech quality determined with the fitted coefficients is more accurate.
Specifically, MMq, MMSV and MMT in formula (9) are regarded as being linearly related through m1, m2, m3, m4, with MMSV and MMT as independent variables and MMq as the dependent variable. In the parameter-fitting process, the subjective score of each extracted sample is used as the test value, i.e. is equated with MMq, which yields several linear equations; fitting these by the least-squares principle then gives the optimal values of the correlation coefficients m1, m2, m3, m4.
For example, suppose 10 video-call samples are extracted and P.800 subjective scoring of these 10 samples yields 10 scores, which are used as the values of the dependent variable MMq. Formulas (7) and (8) are used to determine the corresponding MMSV and MMT values of the 10 samples; substituting these 10 value pairs together with the 10 corresponding scores into formula (9) gives 10 linear equations, from which the optimal set of correlation coefficient values is obtained. A cross-validation algorithm can also be used to obtain the optimal coefficient values.
After the correlation coefficients m1, m2, m3, m4 have been determined, substituting them into formula (9) gives the video and speech quality of the call. A sketch of the fitting step is given below.
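For illustration, the following sketch fits m1..m4 by ordinary least squares, using the subjective scores as the target and the MMSV and MMT values from formulas (7) and (8) as regressors; NumPy is assumed to be available.

```python
import numpy as np

def fit_coefficients(mm_sv, mm_t, mos):
    """mm_sv, mm_t, mos: 1-D arrays of equal length, one entry per call sample."""
    mm_sv, mm_t, mos = map(np.asarray, (mm_sv, mm_t, mos))
    # Regressor columns: MMSV, MMT, MMSV*MMT, constant term -> coefficients m1..m4
    X = np.column_stack([mm_sv, mm_t, mm_sv * mm_t, np.ones_like(mm_sv)])
    m, *_ = np.linalg.lstsq(X, mos, rcond=None)
    return m  # m1, m2, m3, m4
```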
With the video and speech quality assessment method provided by Embodiment 1, when a video call is detected the video parameters and audio parameters of the call are obtained separately; the video quality of the call is determined from the video parameters and the audio quality from the audio parameters; and the video and speech quality of the call is determined from the video quality and the audio quality. Using the method provided by the embodiments of the present invention, the video and speech quality can be determined from the video call itself without the involvement of a reference source, the accuracy of the assessment result is improved, and the method is suitable for assessing massive numbers of video calls. In addition, when a video call is detected, several video-call samples can be extracted and their subjective scores determined; parameter samples are then extracted from the video parameters and audio parameters and, combined with the subjective scores of the video-call samples, processed with a preset algorithm to obtain the parameter set used to indicate the overall quality of the call. After this parameter set has been obtained, the relevant parameters of different services can be adjusted according to it, which further improves the accuracy of the video and speech quality assessment.
Embodiment 2
Based on the same inventive concept, an embodiment of the present invention further provides a video and speech quality assessment device. Since the principle by which the device solves the problem is similar to that of the video and speech quality assessment method, the implementation of the device can refer to the implementation of the method, and repeated content is not described again.
As shown in Fig. 8, the video and speech quality assessment device provided by Embodiment 2 of the present invention includes an acquiring unit 81, a first determination unit 82 and a second determination unit 83, in which:
the acquiring unit 81 is configured to separately obtain the video parameters and audio parameters of a video call when the call is detected;
the first determination unit 82 is configured to determine the video quality of the call from the video parameters and the audio quality of the call from the audio parameters;
the second determination unit 83 is configured to determine the video and speech quality of the call from the video quality and the audio quality.
In specific implementation, the acquiring unit 81 is specifically configured to: use at least one interface to collect data of the video call and obtain the video-call Real-time Transport Protocol (RTP) packets, each interface being located between two adjacent transmission nodes on the packet transmission link; decode the collected video-call RTP packets according to the communication protocol of each interface to obtain decoded video-call RTP packets; and, from the decoded video-call RTP packets, separately obtain the video parameters and audio parameters of the call using a preset video codec standard.
Preferably, the video parameters and the audio parameters each include coding parameters and transmission performance parameters, where the coding parameters include at least one of: codec type, end-to-end delay, bit rate, frame rate and maximum transmission bit rate, and the transmission performance parameters include at least one of: packet loss rate and transmission rate.
In specific implementation, the acquiring unit 81 is specifically configured to obtain the video end-to-end delay as follows: obtain the one-way delay of the video, and determine the video end-to-end delay from the obtained one-way delay.
Preferably, the acquiring unit 81 is further configured to obtain the audio end-to-end delay as follows: obtain the one-way delay of the audio, and determine the audio end-to-end delay from the obtained one-way delay.
Specifically, the acquiring unit 81 is specifically configured to determine the one-way delay as follows: determine the Network Time Protocol (NTP) time difference of two adjacent video-call RTP packets; determine the timestamp difference of the two adjacent video-call RTP packets; and determine the one-way delay from the NTP difference and the timestamp difference so determined.
In specific implementation, the first determination unit 82 is specifically configured to determine the video quality of the call from the video parameters according to the following formula:
Vq = 1 + Icoding * exp(-PplV / DPplV)
where Vq is the video quality of the call;
Icoding is the video quality under coding distortion only;
PplV is the packet loss rate;
DPplV is the degree of robustness of the video quality to packet loss.
Further, the first determination unit 82 is specifically configured to determine the video quality under coding distortion, Icoding, as follows:
Icoding = IOfr * exp(-(ln(FrV) - ln(Ofr))^2 / (2 * DFrV^2))
where Ofr is the optimal frame rate at which the video quality is maximised under the current video bit rate;
IOfr is the maximum value of the video quality under the current video bit rate;
FrV is the current frame rate;
DFrV is the degree of robustness of the video quality to the frame rate.
In specific implementation, the first determination unit 82 is specifically configured to determine the quality index of the audio quality from the audio parameters, and to determine the audio quality of the call from the quality index according to formula (6), where:
Qx is the quality index;
Sq is the audio quality of the video call.
Preferably, the first determination unit 82 is specifically configured to determine the quality index of the audio quality from the audio parameters according to formula (3), in which:
Idte,WB is the degradation caused by talker echo in the call;
Ie-eff,WB is the degradation caused by speech coding and packet loss in the call;
Qx is the quality index of the audio quality.
Further, the second determination unit is specifically configured to determine Idte,WB according to formula (4), in which Re,WB = 80 + 2.5 * (TERV,WB - 14), TERV,WB is computed from TELR and Ts, and:
TELR is the talker echo loudness rating;
Ts is the audio end-to-end delay of the call.
Further, the first determination unit 82 is specifically configured to determine Ie-eff,WB according to formula (5):
Ie-eff,WB = IeS,WB + (95 - IeS,WB) * PplS / (PplS + BplS)
where IeS,WB is the speech coding distortion factor;
PplS is the speech packet loss rate in the call;
BplS is the robustness of the speech packets to packet loss in the call.
In specific implementation, the second determination unit 83 is specifically configured to: from the video quality and the audio quality, determine the audiovisual quality of the call and the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation, respectively; and determine the video and speech quality of the call from the audiovisual quality and that degree of degradation according to the following formula:
MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4
where MMSV denotes the audiovisual quality;
MMT denotes the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation;
m1, m2, m3, m4 are correlation coefficients that depend on the video display size and the call task of the call;
MMq denotes the video and speech quality of the call.
Preferably, the second determination unit 83 is specifically configured to determine the audiovisual quality of the call from the video quality and the audio quality according to the following formula:
MMSV = m5*Sq + m6*Vq + m7*Sq*Vq + m8
where MMSV is the audiovisual quality of the call;
Sq is the audio quality;
Vq is the video quality;
m5, m6, m7, m8 are correlation coefficients that depend on the video display size and the call task of the call.
Further, the second determination unit 83 is specifically configured to determine, from the video quality and audio quality, the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation in the call according to the following formula:
MMT = max{AD + MS, 1}
where AD = m9*(TS + TV) + m10 and MS is determined from the audio-video synchronisation offset;
TS is the audio end-to-end delay of the call;
TV is the video end-to-end delay of the call;
AD is the absolute audiovisual delay of the call;
MS is the audiovisual media synchronisation value of the call;
MMT is the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation in the call;
m9, m10, m11, m12, m13, m14 are correlation coefficients that depend on the video display size and the call task of the call.
Specifically, the second determination unit is specifically configured to: extract several video-call samples when a video call is detected; determine the subjective scores of the video-call samples; and, based on the audiovisual quality and the degree of audiovisual-quality degradation caused by audio-video delay and synchronisation, combined with the subjective scores of the video-call samples, obtain the correlation coefficients m1, m2, m3, m4 used to evaluate the overall quality of the call.
Embodiment 3
Embodiment 3 of the present application provides a non-volatile computer storage medium storing computer-executable instructions which can perform the video and speech quality assessment method of any of the above method embodiments.
Embodiment 4
Fig. 9 is a schematic diagram of the hardware structure of the electronic device that implements the video and speech quality assessment method, provided by Embodiment 4 of the present invention. As shown in Fig. 9, the electronic device includes:
one or more processors 910 and a memory 920; in Fig. 9 one processor 910 is taken as an example.
The electronic device that executes the video and speech quality assessment method may further include an input apparatus 930 and an output apparatus 940.
The processor 910, memory 920, input apparatus 930 and output apparatus 940 may be connected by a bus or in other ways; in Fig. 9 connection by a bus is taken as an example.
The memory 920, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules/units corresponding to the video and speech quality assessment method in the embodiments of the present application (for example the acquiring unit 81, the first determination unit 82 and the second determination unit 83 shown in Fig. 8). By running the non-volatile software programs, instructions and modules/units stored in the memory 920, the processor 910 executes the various functional applications and data processing of the server or intelligent terminal, i.e. implements the video and speech quality assessment method of the above method embodiments.
The memory 920 may include a program storage area and a data storage area, where the program storage area can store the operating system and the application programs required for at least one function, and the data storage area can store data created according to the use of the video and speech quality assessment device, etc. In addition, the memory 920 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic-disk storage device, flash-memory device or other non-volatile solid-state storage device. In some embodiments, the memory 920 optionally includes memories located remotely from the processor 910, and these remote memories may be connected to the video and speech quality assessment device over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
The input apparatus 930 can receive input numeric or character information and generate key-signal inputs related to the user settings and function control of the video and speech quality assessment device. The output apparatus 940 may include a display device such as a display screen.
The one or more modules are stored in the memory 920 and, when executed by the one or more processors 910, perform the video and speech quality assessment method of any of the above method embodiments.
The above product can perform the method provided by the embodiments of the present application and has the corresponding functional modules and beneficial effects. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: such devices are characterised by mobile communication functions and have voice and data communication as their main goal. They include smart phones (e.g. the iPhone), multimedia phones, feature phones and low-end phones.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions and generally also have mobile Internet access. They include PDA, MID and UMPC devices, such as the iPad.
(3) Portable entertainment devices: such devices can display and play multimedia content. They include audio and video players (e.g. the iPod), handheld devices, e-book readers, intelligent toys and portable in-vehicle navigation devices.
(4) Servers: devices that provide computing services. A server consists of a processor, hard disk, memory, system bus and so on; its architecture is similar to that of a general-purpose computer, but because it needs to provide highly reliable services it has higher requirements on processing capability, stability, reliability, security, scalability and manageability.
(5) Other electronic devices with data-interaction functions.
Embodiment 5
Embodiment 5 of the present application provides a computer program product, where the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by a computer they cause the computer to perform the video and speech quality assessment method of any of the above method embodiments of the present application.
With the video and speech quality assessment method and device provided by the embodiments of the present invention, when a video call is detected the video parameters and audio parameters of the call are obtained separately; the video quality of the call is determined from the video parameters and the audio quality from the audio parameters; and the video and speech quality of the call is determined from the video quality and the audio quality. Using the method provided by the embodiments of the present invention, the video and speech quality can be determined from the video call itself without the involvement of a reference source, the accuracy of the assessment result is improved, and the method is suitable for assessing massive numbers of video calls. In addition, when a video call is detected, several video-call samples can be extracted and their subjective scores determined; parameter samples are then extracted from the video parameters and audio parameters and, combined with the subjective scores of the video-call samples, processed with a preset algorithm to obtain the parameter set used to indicate the overall quality of the call. After this parameter set has been obtained, the relevant parameters of different services can be adjusted according to it, which further improves the accuracy of the video and speech quality assessment.
The video and speech quality assessment device provided by the embodiments of the present application can be implemented by a computer program. Those skilled in the art should understand that the above division into modules is only one of many possible divisions; if the device is divided into other modules or not divided into modules at all, it should still fall within the protection scope of the present application as long as it has the functions described above.
It should be understood by those skilled in the art that, the embodiment of the present invention can provide as method, system or computer program Product.Therefore, complete hardware embodiment, complete software embodiment or reality combining software and hardware aspects can be used in the present invention Apply the form of example.Moreover, it wherein includes the computer of computer usable program code that the present invention, which can be used in one or more, The computer program implemented in usable storage medium (including but not limited to magnetic disk storage, CD-ROM, optical memory etc.) produces The form of product.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system) and the computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, the instruction device implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include these modifications and variations.

Claims (34)

1. A video call quality assessment method, characterized by comprising:
when a video call is detected, obtaining a video parameter and an audio parameter of the video call respectively;
determining a video quality of the video call according to the video parameter, and determining an audio quality of the video call according to the audio parameter; and
determining a video call quality of the video call according to the video quality and the audio quality.
2. The method according to claim 1, characterized in that, when a video call is detected, obtaining the video parameter and the audio parameter of the video call respectively specifically comprises:
performing data collection on the video call through at least one interface to collect Real-time Transport Protocol (RTP) data packets of the video call, the interface being arranged between any two adjacent transmission nodes in the data packet transmission link;
decoding the collected video call RTP data packets according to the communication protocol of each interface to obtain decoded video call RTP data packets; and
obtaining the video parameter and the audio parameter of the video call respectively from the decoded video call RTP data packets by using a preset video codec standard.
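Claim 2 collects and decodes RTP packets at probe interfaces placed between adjacent transmission nodes before parameters are extracted. As a hedged illustration of the per-packet information such a probe can read, the sketch below parses the fixed 12-byte RTP header defined in RFC 3550: sequence-number gaps yield the packet loss rate, and the media timestamps feed the delay calculations of the later claims. The function name and the assumption that the captured payload begins at the RTP header are this sketch's own.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Parse the fixed 12-byte RTP header (RFC 3550).

    Returns the payload type, sequence number, timestamp and SSRC, which are
    the fields typically used to derive packet loss and delay statistics."""
    if len(packet) < 12:
        raise ValueError("packet shorter than an RTP fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "payload_type": b1 & 0x7F,   # distinguishes the audio and video streams
        "marker": (b1 >> 7) & 0x1,
        "sequence": seq,             # gaps in this field indicate packet loss
        "timestamp": timestamp,      # sender media clock, used for delay/jitter
        "ssrc": ssrc,
    }
```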
3. The method according to claim 2, characterized in that the video parameter and the audio parameter comprise coding parameters and transmission performance parameters, wherein the coding parameters comprise at least one of: codec type, end-to-end delay, bit rate, frame rate and maximum transmission bit rate; and the transmission performance parameters comprise at least one of: packet loss rate and transmission rate.
4. The method according to claim 3, characterized in that the video end-to-end delay is obtained as follows:
obtaining a one-way delay of the video parameter; and
determining the video end-to-end delay according to the one-way delay of the video parameter.
5. The method according to claim 3, characterized in that the audio end-to-end delay is obtained as follows:
obtaining a one-way delay of the audio parameter; and
determining the audio end-to-end delay according to the one-way delay of the audio parameter.
6. The method according to claim 4 or 5, characterized in that a one-way delay is determined as follows:
determining a Network Time Protocol (NTP) time difference between two adjacent video call RTP data packets;
determining a timestamp difference between the two adjacent video call RTP data packets; and
determining the one-way delay according to the determined NTP time difference and timestamp difference.
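Claim 6 derives a one-way delay from the NTP time difference and the timestamp difference of two adjacent RTP packets. A minimal sketch of one way to combine the two quantities is given below, assuming the NTP times are capture timestamps from NTP-synchronized probes and the RTP timestamps reflect the sender's media clock; strictly speaking, each per-packet result is a delay difference, which can be averaged or accumulated into the one-way delay estimate the claim refers to. All names here are illustrative.

```python
def one_way_delay_sample(ntp_prev: float, ntp_curr: float,
                         rtp_ts_prev: int, rtp_ts_curr: int,
                         clock_rate_hz: int) -> float:
    """Combine the NTP time difference and the RTP timestamp difference of two
    adjacent packets (claim 6).

    ntp_prev / ntp_curr: capture times in seconds from NTP-synchronized probes.
    rtp_ts_prev / rtp_ts_curr: 32-bit RTP media timestamps of the same packets.
    clock_rate_hz: RTP clock rate of the stream (e.g. 90000 Hz for video).
    The result is a per-packet delay difference; averaging or accumulating
    such samples gives the one-way delay estimate."""
    ntp_diff = ntp_curr - ntp_prev
    ticks = (rtp_ts_curr - rtp_ts_prev) & 0xFFFFFFFF  # tolerate 32-bit wrap-around
    return ntp_diff - ticks / clock_rate_hz
```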
7. The method according to claim 3, characterized in that the video quality of the video call is determined according to the video parameter by the following formula:
wherein Vq is the video quality of the video call;
Icoding is the video quality under coding distortion only;
PplV is the packet loss rate; and
the remaining parameter in the formula is the degree of robustness of the video quality against packet loss.
8. The method according to claim 7, characterized in that the video quality Icoding under coding distortion is determined as follows:
wherein Ofr is the optimal frame rate at which the video quality reaches its maximum under the current video bit rate;
IOfr is the maximum value of the video quality under the current video bit rate;
FrV is the current frame rate; and
DFrV is the degree of robustness of the video quality against frame-rate changes.
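The formula images of claims 7 and 8 are not reproduced in this text. The variables they define (Icoding, PplV, Ofr, IOfr, FrV, DFrV and a packet-loss robustness factor) match the video-quality part of ITU-T G.1070, so the sketch below uses that model's functional form as an assumption rather than as the patent's exact expression.

```python
import math

def coding_quality(i_ofr: float, o_fr: float, fr_v: float, d_frv: float) -> float:
    """Icoding: video quality under coding distortion only (assumed
    G.1070-style form; the claim's own formula image is not reproduced).
    i_ofr: maximum video quality at the current bit rate (IOfr);
    o_fr:  optimal frame rate at that bit rate (Ofr);
    fr_v:  current frame rate (FrV);
    d_frv: robustness of the video quality against frame-rate changes (DFrV)."""
    return i_ofr * math.exp(-((math.log(fr_v) - math.log(o_fr)) ** 2)
                            / (2.0 * d_frv ** 2))


def video_quality(i_coding: float, ppl_v: float, d_pplv: float) -> float:
    """Vq: coding quality degraded by packet loss (assumed G.1070-style form).
    ppl_v:  video packet loss rate (percent);
    d_pplv: robustness of the video quality against packet loss."""
    return 1.0 + i_coding * math.exp(-ppl_v / d_pplv)
```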
9. The method according to claim 3, characterized in that determining the audio quality of the video call according to the audio parameter specifically comprises:
determining a quality index of the audio quality according to the audio parameter; and
determining the audio quality of the video call from the quality index by the following formula:
wherein Qx is the quality index; and
Sq is the audio quality of the video call.
10. The method according to claim 9, characterized in that the quality index of the audio quality is determined according to the audio parameter by the following formula:
wherein Idte,WB is the degradation caused by talker echo in the video call;
Ie-eff,WB is the degradation caused by speech coding and packet loss in the video call; and
Qx is the quality index of the audio quality.
11. The method according to claim 10, characterized in that Idte,WB is determined by the following formula:
wherein Re,WB is given by Re,WB = 80 + 2.5*(TERV,WB - 14), and TERV,WB is given by its corresponding expression;
TELR is the talker echo loudness rating; and
Ts is the audio end-to-end delay in the video call.
12. The method according to claim 10, characterized in that Ie-eff,WB is determined by the following formula:
wherein IeS,WB is the speech coding distortion factor;
PplS is the speech packet loss rate in the video call; and
BplS is the packet-loss robustness factor of the speech packets in the video call.
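Claims 9 to 12 follow a wideband E-model style: a quality index Qx is reduced by an echo degradation Idte,WB and a coding/packet-loss degradation Ie-eff,WB, and the index is then mapped to the audio quality Sq. Apart from Re,WB = 80 + 2.5*(TERV,WB - 14), which claim 11 states explicitly, the formula images are not reproduced here, so the sketch below substitutes standard ITU-T G.107-style forms for the missing pieces (packet-loss impairment and index-to-MOS mapping) purely as assumptions.

```python
def packet_loss_impairment_wb(ie_wb: float, ppl_s: float, bpl_s: float) -> float:
    """Ie-eff,WB: speech coding impairment increased by packet loss.
    The claim's own expression is not reproduced; this uses the standard
    G.107-style form Ie + (95 - Ie) * Ppl / (Ppl + Bpl) as an assumption.
    ie_wb: speech coding distortion factor (IeS,WB);
    ppl_s: speech packet loss rate (PplS);
    bpl_s: packet-loss robustness factor of the speech packets (BplS)."""
    if ppl_s <= 0.0:
        return ie_wb
    return ie_wb + (95.0 - ie_wb) * ppl_s / (ppl_s + bpl_s)


def audio_quality_from_index(qx: float) -> float:
    """Sq: map the quality index Qx (an R-scale-like value) to a MOS-like
    1..4.5 score, using the standard E-model mapping as an assumed stand-in
    for the claim's unreproduced formula."""
    if qx <= 0.0:
        return 1.0
    if qx >= 100.0:
        return 4.5
    return 1.0 + 0.035 * qx + qx * (qx - 60.0) * (100.0 - qx) * 7e-6
```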
13. The method according to claim 7 or 9, characterized in that determining the video call quality of the video call according to the video quality and the audio quality specifically comprises:
determining, according to the video quality and the audio quality, an audiovisual quality of the video call and a degree of audiovisual quality degradation caused by audio-video delay and synchronization, respectively; and
determining the video call quality of the video call according to the audiovisual quality and the degree of audiovisual quality degradation caused by audio-video delay and synchronization, by the following formula:
MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4
wherein MMSV denotes the audiovisual quality;
MMT denotes the degree of audiovisual quality degradation caused by audio-video delay and synchronization;
m1, m2, m3, m4 are correlation coefficients that depend on the video display size and the call task of the video call; and
MMq denotes the video call quality of the video call.
14. The method according to claim 13, characterized in that the audiovisual quality of the video call is determined according to the video quality and the audio quality by the following formula:
MMSV = m5*Sq + m6*Vq + m7*Sq*Vq + m8
wherein MMSV is the audiovisual quality of the video call;
Sq is the audio quality;
Vq is the video quality; and
m5, m6, m7, m8 are correlation coefficients that depend on the video display size and the call task of the video call.
15. The method according to claim 13, characterized in that the degree of audiovisual quality degradation of the video call caused by audio-video delay and synchronization is determined according to the video quality and the audio quality by the following formula:
MMT = max{AD + MS, 1}
wherein AD = m9*(TS + TV) + m10;
TS is the audio end-to-end delay in the video call;
TV is the video end-to-end delay in the video call;
AD is the absolute audiovisual delay in the video call;
MS is the audio-visual media synchronization value in the video call;
MMT is the degree of audiovisual quality degradation of the video call caused by audio-video delay and synchronization; and
m9, m10, m11, m12, m13, m14 are correlation coefficients that depend on the video display size and the call task of the video call.
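Claim 15 combines the absolute audiovisual delay AD = m9*(TS + TV) + m10 with a media-synchronization value MS into MMT = max{AD + MS, 1}. The expression that yields MS itself (it uses the coefficients m11 to m14) is not reproduced in this text, so the sketch below simply takes MS as an input; the names are illustrative only.

```python
def delay_sync_degradation(t_s: float, t_v: float, ms: float, m: dict) -> float:
    """MMT = max(AD + MS, 1) with AD = m9*(TS + TV) + m10 (claim 15).
    t_s, t_v: audio and video end-to-end delays of the call;
    ms:       audio-visual media synchronization value (its own expression,
              which uses m11..m14, is not reproduced here, so it is passed
              in directly);
    m:        correlation coefficients, display-size and task dependent."""
    ad = m["m9"] * (t_s + t_v) + m["m10"]
    return max(ad + ms, 1.0)
```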
16. The method according to claim 13, characterized in that the correlation coefficients m1, m2, m3, m4 are determined as follows:
extracting a number of video call samples when a video call is detected;
determining subjective scores of the video call samples; and
obtaining, based on the audiovisual quality and the degree of audiovisual quality degradation caused by audio-video delay and synchronization, and in combination with the subjective scores of the video call samples, the correlation coefficients m1, m2, m3, m4 used for evaluating the overall quality of the video call.
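Claim 16 obtains m1 to m4 by fitting the combination of claim 13 to subjective scores of sampled calls. The claim does not name the fitting algorithm, so the ordinary-least-squares sketch below, over the design matrix [MMSV, MMT, MMSV*MMT, 1], is only one conventional choice.

```python
import numpy as np

def fit_coefficients(mm_sv: np.ndarray, mm_t: np.ndarray,
                     subjective: np.ndarray) -> np.ndarray:
    """Fit m1..m4 in MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4 to the
    subjective scores of the sampled calls by ordinary least squares
    (the fitting algorithm itself is an assumption of this sketch).
    mm_sv, mm_t, subjective: 1-D arrays, one entry per video call sample."""
    design = np.column_stack([mm_sv, mm_t, mm_sv * mm_t, np.ones_like(mm_sv)])
    coeffs, *_ = np.linalg.lstsq(design, subjective, rcond=None)
    return coeffs  # [m1, m2, m3, m4]
```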
17. A video call quality assessment device, characterized by comprising:
an acquiring unit, configured to obtain a video parameter and an audio parameter of a video call respectively when the video call is detected;
a first determination unit, configured to determine a video quality of the video call according to the video parameter, and to determine an audio quality of the video call according to the audio parameter; and
a second determination unit, configured to determine a video call quality of the video call according to the video quality and the audio quality.
18. The device according to claim 17, characterized in that
the acquiring unit is specifically configured to: perform data collection on the video call through at least one interface to collect Real-time Transport Protocol (RTP) data packets of the video call, the interface being arranged between any two adjacent transmission nodes in the data packet transmission link; decode the collected video call RTP data packets according to the communication protocol of each interface to obtain decoded video call RTP data packets; and obtain the video parameter and the audio parameter of the video call respectively from the decoded video call RTP data packets by using a preset video codec standard.
19. The device according to claim 18, characterized in that the video parameter and the audio parameter comprise coding parameters and transmission performance parameters, wherein the coding parameters comprise at least one of: codec type, end-to-end delay, bit rate, frame rate and maximum transmission bit rate; and the transmission performance parameters comprise at least one of: packet loss rate and transmission rate.
20. The device according to claim 19, characterized in that
the acquiring unit is specifically configured to obtain the video end-to-end delay as follows: obtaining a one-way delay of the video parameter; and determining the video end-to-end delay according to the one-way delay of the video parameter.
21. The device according to claim 18, characterized in that
the acquiring unit is further configured to obtain the audio end-to-end delay as follows: obtaining a one-way delay of the audio parameter; and determining the audio end-to-end delay according to the one-way delay of the audio parameter.
22. The device according to claim 20 or 21, characterized in that
the acquiring unit is specifically configured to determine a one-way delay as follows: determining a Network Time Protocol (NTP) time difference between two adjacent video call RTP data packets; determining a timestamp difference between the two adjacent video call RTP data packets; and determining the one-way delay according to the determined NTP time difference and timestamp difference.
23. The device according to claim 19, characterized in that
the first determination unit is specifically configured to determine the video quality of the video call according to the video parameter by the following formula:
wherein Vq is the video quality of the video call;
Icoding is the video quality under coding distortion only;
PplV is the packet loss rate; and
the remaining parameter in the formula is the degree of robustness of the video quality against packet loss.
24. The device according to claim 23, characterized in that
the first determination unit is specifically configured to determine the video quality Icoding under coding distortion as follows:
wherein Ofr is the optimal frame rate at which the video quality reaches its maximum under the current video bit rate;
IOfr is the maximum value of the video quality under the current video bit rate;
FrV is the current frame rate; and
DFrV is the degree of robustness of the video quality against frame-rate changes.
25. The device according to claim 19, characterized in that
the first determination unit is specifically configured to determine a quality index of the audio quality according to the audio parameter; and
to determine the audio quality of the video call from the quality index by the following formula:
wherein Qx is the quality index; and
Sq is the audio quality of the video call.
26. The device according to claim 25, characterized in that
the first determination unit is specifically configured to determine the quality index of the audio quality according to the audio parameter by the following formula:
wherein Idte,WB is the degradation caused by talker echo in the video call;
Ie-eff,WB is the degradation caused by speech coding and packet loss in the video call; and
Qx is the quality index of the audio quality.
27. The device according to claim 26, characterized in that
the first determination unit is specifically configured to determine Idte,WB by the following formula:
wherein Re,WB is given by Re,WB = 80 + 2.5*(TERV,WB - 14), and TERV,WB is given by its corresponding expression;
TELR is the talker echo loudness rating; and
Ts is the audio end-to-end delay in the video call.
28. The device according to claim 26, characterized in that
the first determination unit is specifically configured to determine Ie-eff,WB by the following formula:
wherein IeS,WB is the speech coding distortion factor;
PplS is the speech packet loss rate in the video call; and
BplS is the packet-loss robustness factor of the speech packets in the video call.
29. The device according to claim 23 or 25, characterized in that
the second determination unit is specifically configured to determine, according to the video quality and the audio quality, an audiovisual quality of the video call and a degree of audiovisual quality degradation caused by audio-video delay and synchronization, respectively; and
to determine the video call quality of the video call according to the audiovisual quality and the degree of audiovisual quality degradation caused by audio-video delay and synchronization, by the following formula:
MMq = m1*MMSV + m2*MMT + m3*MMSV*MMT + m4
wherein MMSV denotes the audiovisual quality;
MMT denotes the degree of audiovisual quality degradation caused by audio-video delay and synchronization;
m1, m2, m3, m4 are correlation coefficients that depend on the video display size and the call task of the video call; and
MMq denotes the video call quality of the video call.
30. The device according to claim 29, characterized in that
the second determination unit is specifically configured to determine the audiovisual quality of the video call according to the video quality and the audio quality by the following formula:
MMSV = m5*Sq + m6*Vq + m7*Sq*Vq + m8
wherein MMSV is the audiovisual quality of the video call;
Sq is the audio quality;
Vq is the video quality; and
m5, m6, m7, m8 are correlation coefficients that depend on the video display size and the call task of the video call.
31. The device according to claim 29, characterized in that
the second determination unit is specifically configured to determine the degree of audiovisual quality degradation of the video call caused by audio-video delay and synchronization, according to the video quality and the audio quality, by the following formula:
MMT = max{AD + MS, 1}
wherein AD = m9*(TS + TV) + m10;
TS is the audio end-to-end delay in the video call;
TV is the video end-to-end delay in the video call;
AD is the absolute audiovisual delay in the video call;
MS is the audio-visual media synchronization value in the video call;
MMT is the degree of audiovisual quality degradation of the video call caused by audio-video delay and synchronization; and
m9, m10, m11, m12, m13, m14 are correlation coefficients that depend on the video display size and the call task of the video call.
32. The device according to claim 29, characterized in that
the second determination unit is specifically configured to: extract a number of video call samples when a video call is detected; determine subjective scores of the video call samples; and obtain, based on the audiovisual quality and the degree of audiovisual quality degradation caused by audio-video delay and synchronization, and in combination with the subjective scores of the video call samples, the correlation coefficients m1, m2, m3, m4 used for evaluating the overall quality of the video call.
33. An electronic device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the video call quality assessment method according to any one of claims 1 to 16.
34. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the video call quality assessment method according to any one of claims 1 to 16.
CN201710614327.1A 2017-07-25 2017-07-25 A kind of video speech quality appraisal procedure and device Pending CN109302603A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710614327.1A CN109302603A (en) 2017-07-25 2017-07-25 A kind of video speech quality appraisal procedure and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710614327.1A CN109302603A (en) 2017-07-25 2017-07-25 A kind of video speech quality appraisal procedure and device

Publications (1)

Publication Number Publication Date
CN109302603A true CN109302603A (en) 2019-02-01

Family

ID=65167398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710614327.1A Pending CN109302603A (en) 2017-07-25 2017-07-25 A kind of video speech quality appraisal procedure and device

Country Status (1)

Country Link
CN (1) CN109302603A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101911714A (en) * 2008-01-08 2010-12-08 日本电信电话株式会社 Image quality estimation device, method, and program
CN101547259A (en) * 2009-04-30 2009-09-30 华东师范大学 VoIP test method based on analog data flow
CN103379358A (en) * 2012-04-23 2013-10-30 华为技术有限公司 Method and device for assessing multimedia quality
CN103634577A (en) * 2012-08-22 2014-03-12 华为技术有限公司 Multimedia quality monitoring method and apparatus
CN104539943A (en) * 2012-08-22 2015-04-22 华为技术有限公司 Method and device for monitoring multimedia quality

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAYASHI T等: "Multimedia quality integration function for videophone services", 《IEEE GLOBAL TELECOMMUNICATIONS CONFERENCE》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109714557A (en) * 2017-10-25 2019-05-03 中国移动通信集团公司 Method for evaluating quality, device, electronic equipment and the storage medium of video calling
CN111479109A (en) * 2020-03-12 2020-07-31 上海交通大学 Video quality evaluation method, system and terminal based on audio-visual combined attention
CN111479105B (en) * 2020-03-12 2021-06-04 上海交通大学 Video and audio joint quality evaluation method and device
CN111479107B (en) * 2020-03-12 2021-06-08 上海交通大学 No-reference audio and video joint quality evaluation method based on natural audio and video statistics
CN111479106B (en) * 2020-03-12 2021-06-29 上海交通大学 Two-dimensional quality descriptor fused audio and video joint quality evaluation method and terminal
CN113840131A (en) * 2020-06-08 2021-12-24 中国移动通信有限公司研究院 Video call quality evaluation method and device, electronic equipment and readable storage medium
CN112118442A (en) * 2020-09-18 2020-12-22 平安科技(深圳)有限公司 AI video call quality analysis method, device, computer equipment and storage medium
WO2021174879A1 (en) * 2020-09-18 2021-09-10 平安科技(深圳)有限公司 Ai video call quality analysis method and apparatus, computer device, and storage medium
CN114258069A (en) * 2021-12-28 2022-03-29 北京东土拓明科技有限公司 Voice call quality evaluation method and device, computing equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109302603A (en) A kind of video speech quality appraisal procedure and device
Jelassi et al. Quality of experience of VoIP service: A survey of assessment approaches and open issues
Chen et al. A lightweight end-side user experience data collection system for quality evaluation of multimedia communications
JP4965659B2 (en) How to determine video quality
CN103957216B (en) Based on characteristic audio signal classification without reference audio quality evaluating method and system
US11748643B2 (en) System and method for machine learning based QoE prediction of voice/video services in wireless networks
CN102057634B (en) Audio quality estimation method and audio quality estimation device
CN102014126B (en) Voice experience quality evaluation platform based on QoS (quality of service) and evaluation method
CN113067808B (en) Data processing method, live broadcast method, authentication server and live broadcast data server
da Silva et al. Quality assessment of interactive voice applications
US20100329360A1 (en) Method and apparatus for svc video and aac audio synchronization using npt
CN109714557A (en) Method for evaluating quality, device, electronic equipment and the storage medium of video calling
Goudarzi et al. Audiovisual quality estimation for video calls in wireless applications
Jelassi et al. A study of artificial speech quality assessors of VoIP calls subject to limited bursty packet losses
Wuttidittachotti et al. Quality evaluation of mobile networks using VoIP applications: a case study with Skype and LINE based-on stationary tests in Bangkok
JP2006324865A (en) Device, method and program for estimating network communication service satisfaction level
Osmanovic et al. Impact of media-related SIFs on QoE for H. 265/HEVC video streaming
DE602004004577T2 (en) Method and device for determining the language latency by a network element of a communication network
Daengsi et al. Speech quality assessment of VoIP: G. 711 VS G. 722 based on interview tests with Thai users
Saidi et al. Audiovisual quality study for videoconferencing on IP networks
Duque et al. Quality assessment for video streaming P2P application over wireless mesh network
Rodriguez et al. Assessment of quality-of-experience in telecommunication services
Kumar et al. Comparison of popular video conferencing apps using client-side measurements on different backhaul networks
CN106993308A (en) A kind of QoS of voice monitoring method, equipment and the system of VoLTE networks
Casas et al. End-2-end evaluation of ip multimedia services, a user perceived quality of service approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190201)