WO2022155844A1 - Method and apparatus for evaluating video transmission quality - Google Patents

Method and apparatus for evaluating video transmission quality

Info

Publication number
WO2022155844A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
video
mapping relationship
layer
delay
Prior art date
Application number
PCT/CN2021/073091
Other languages
English (en)
French (fr)
Inventor
吴健
李拟珺
吴可镝
魏岳军
沈慧
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to PCT/CN2021/073091 priority Critical patent/WO2022155844A1/zh
Publication of WO2022155844A1 publication Critical patent/WO2022155844A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests

Definitions

  • the present application relates to the field of network transmission, and in particular, to a method and device for evaluating video transmission quality.
  • XR Extended Reality
  • VR virtual reality
  • AR Augmented Reality
  • MR mixed reality
  • Virtual reality presents media such as sound and images to the human senses through head-mounted devices; the technology creates a virtual world and gives people an immersive experience.
  • Augmented reality superimposes the virtual world on the real world to "augment" reality.
  • Mixed reality superimposes real things into the virtual world.
  • Real entities and data entities coexist and interact in real time.
  • The key feature of mixed reality is that synthetic data entities and real entities can interact in real time.
  • the XR video transmission service has been applied in many fields, such as: entertainment field, medical field, education field, retail field, advertising field, etc.
  • NR New Radio 5G
  • High bit rate: since XR video services require higher resolution than general video services, their transmission rate is also higher than that of general video services, for example reaching 30 Mbps to 250 Mbps.
  • the frame rate of common XR video services is 45fps, 60fps or 120fps, and the interval between video frames arriving at the base station is roughly the inverse of the frame rate.
  • the change of the video frame size means that the burst data amount of the video request is changed.
  • the uplink and downlink delay budget (PDB, Packet Delay Budget) is about 10ms.
  • the video frame compressed by the source compression standard is generally composed of multiple IP packets, and there will be a cliff effect in the transmission process.
  • the evaluation of video transmission quality generally uses indicators such as mean square error (MSE, Mean Square Error) and peak signal-to-noise ratio (PSNR, Peak Signal Noise Ratio).
  • MSE and PSNR often cannot accurately reflect subjective image quality.
  • the quality assessment for video transmission services in the prior art is mostly based on packet-level indicators such as packet error rate (PER, Packet Error Rate), block error rate (BLER, Block Error Rate), and packet delay budget (PDB, Packet Delay Budget).
  • PER Packet Error Rate
  • BLER Block Error Rate
  • PDB Packet Delay Budget
  • the present application provides a method and apparatus for evaluating video transmission quality, which can directly evaluate video transmission quality from the network side, thereby improving the accuracy of video transmission quality evaluation.
  • a first aspect of the present application provides a video transmission quality assessment method based on video transmission, including:
  • a first indicator for evaluating video transmission quality is determined based on the frame drop duration and the frame drop times.
  • the technical solution provided by this aspect evaluates the video transmission quality directly from the network side and takes the picture quality of the video transmission service into account, overcoming the prior-art situation in which the video service is not considered and transmission quality is evaluated merely at packet granularity.
  • the method further includes:
  • the incoming packet delay is the difference between the time when the last data packet of the video frame reaches the PDCP layer and the time when the first data packet of the video frame reaches the PDCP layer; the transmission delay is the difference between the time when the last data packet of the video frame is sent and the time when the first data packet of the video frame arrives at the PDCP layer.
  • weighted calculation is performed on the first index and the second index, and the video transmission quality is evaluated by using the calculation result.
  • the evaluation result can be more accurate.
  • the acquisition of the frame loss duration of the video frame in the video transmission process includes:
  • mapping relationship between the data packet of the PDCP layer and the video frame through the mapping relationship between the data packet of the IP layer and the video frame;
  • the frame drop duration is determined according to the frame drop position and the inter-frame decoding reference relationship.
  • mapping relationship between the data packet of the PDCP layer and the video frame through the mapping relationship between the data packet of the IP layer and the video frame, including:
  • the data packets arriving at the core network in each cycle are marked as the data packets of the same video frame to obtain the mapping relationship between the data packets of the IP layer and the video frames;
  • the PDCP layer receives the mapping relationship obtained by parsing the mapping relationship between the data packets of the IP layer and the video frames, so as to obtain the mapping relationship between the data packets of the PDCP layer and the video frames.
  • mapping relationship between the data packet of the PDCP layer and the video frame through the mapping relationship between the data packet of the IP layer and the video frame, including:
  • mapping relationship is transmitted to the IP layer, so as to obtain the mapping relationship between data packets in the IP layer and video frames;
  • the parsed mapping relationship is sent to the PDCP layer to obtain the mapping relationship between the data packets of the PDCP layer and the video frame.
  • the video frame is regarded as lost.
  • determining the frame loss duration according to the frame loss position and the inter-frame decoding reference relationship includes:
  • in each picture group, if the video frames have an inter-frame decoding reference relationship, every frame from the frame drop start position to the last frame of the picture group is regarded as being in the dropped state.
  • determining the frame loss duration according to the frame loss position and the inter-frame decoding reference relationship includes:
  • in each picture group, if there is no inter-frame decoding reference relationship between the video frames, only the frames at the drop positions are regarded as being in the dropped state.
  • the acquisition of the number of frame drops of the video frame during the video transmission process includes:
  • the first indicator is further determined according to the quality of the video frames lost in the frame loss process.
  • the quality of the video frame lost in the frame loss process is determined as follows:
  • f(j) is the quality of the j-th video frame lost in one frame loss process;
  • v1, v2, v3 and v4 are all constants;
  • j indexes the j-th frame lost in one frame loss process, where j ∈ [1, Li] and Li is the frame drop duration of the i-th frame drop.
  • the determining of the second index for evaluating video transmission quality based on the incoming packet delay and the transmission delay includes:
  • the average value of the incoming packet delay, the variance of the incoming packet delay, the average value of the transmission delay and the variance of the transmission delay are weighted and calculated to obtain a second indicator for video transmission quality.
  • the evaluation method provided here combines the video service on the network side, so that the video transmission quality evaluation takes both picture quality and interaction delay into account, and the video transmission quality can be evaluated directly on the network side.
  • a second aspect of the present application provides a device for evaluating video transmission quality based on video transmission, including:
  • the first acquisition module is used to acquire the frame drop duration and frame drop times of the video frame in the video transmission process
  • a first determining module configured to determine a first indicator for evaluating video transmission quality based on the frame loss duration and the number of frame loss.
  • the method further includes:
  • a second acquiring module configured to acquire the incoming packet delay and transmission delay of the video frame being transmitted to the network element
  • a second determining module configured to determine a second index for evaluating video transmission quality based on the incoming packet delay and the transmission delay
  • the incoming packet delay is the difference between the time when the last data packet of the video frame reaches the PDCP layer and the time when the first data packet of the video frame reaches the PDCP layer; the transmission delay is the difference between the time when the last data packet of the video frame is sent and the time when the first data packet of the video frame arrives at the PDCP layer.
  • the method further includes:
  • an evaluation module configured to perform weighted calculation on the first index and the second index, and use the calculation result to evaluate the video transmission quality. As an implementation manner of the second aspect, the first acquisition module includes:
  • the first acquisition unit for obtaining the mapping relationship between the data packet of the PDCP layer and the video frame through the mapping relationship of the data packet of the IP layer to the video frame;
  • a first determining unit configured to determine a frame drop position based on the mapping relationship between the data packets of the PDCP layer and the video frame;
  • the second determining unit is configured to determine the frame drop duration according to the frame drop position and the inter-frame decoding reference relationship.
  • the first obtaining unit is specifically configured to:
  • the data packets arriving at the core network in each cycle are marked as the data packets of the same video frame to obtain the mapping relationship between the data packets of the IP layer and the video frames;
  • the PDCP layer receives the mapping relationship obtained by parsing the mapping relationship between the data packets of the IP layer and the video frames, so as to obtain the mapping relationship between the data packets of the PDCP layer and the video frames.
  • the first obtaining unit is specifically used for:
  • mapping relationship is transmitted to the IP layer, so as to obtain the mapping relationship between data packets in the IP layer and video frames;
  • the parsed mapping relationship is sent to the PDCP layer to obtain the mapping relationship between the data packets of the PDCP layer and the video frame.
  • the second determining unit includes:
  • a first acquisition subunit used for obtaining the frame-dropping start position in each image group in the video based on the frame-dropping position
  • the first determining subunit is configured to, in each picture group, if the video frames have an inter-frame decoding reference relationship, regard every frame from the frame drop start position to the last frame of the picture group as being in the dropped state.
  • the second determining unit includes:
  • a second obtaining subunit used for obtaining the starting position of the dropped frame in each image group in the video based on the dropped frame position
  • the second determining subunit is used for, in each picture group, if each video frame does not have an inter-frame decoding reference relationship, it is only regarded as a frame-dropping state at the frame-dropping position.
  • the first acquisition module further includes:
  • the third determining unit is configured to, in two adjacent picture groups, if both the last frame of the previous picture group and the first frame of the following picture group are in the dropped state, merge the two frame drops and record them as one frame drop.
  • the first indicator is further determined according to the quality of the video frames lost in the frame loss process.
  • the quality of the video frames lost in the frame loss process is determined as follows:
  • f(j) is the quality of the j-th video frame lost in one frame loss process;
  • v1, v2, v3 and v4 are all constants;
  • j indexes the j-th frame lost in one frame loss process, where j ∈ [1, Li] and Li is the frame drop duration of the i-th frame drop.
  • the second determining module is specifically configured to:
  • the average value of the incoming packet delay, the variance of the incoming packet delay, the average value of the transmission delay and the variance of the transmission delay are weighted and calculated to obtain the second indicator for video transmission quality.
  • a third aspect of the present application provides a computing device, comprising:
  • at least one processor, and at least one memory which is connected to the bus and stores program instructions that, when executed by the at least one processor, cause the at least one processor to perform the video transmission quality assessment method based on video transmission according to any one of the first aspects above.
  • a fourth aspect of the present application provides a computer-readable storage medium on which program instructions are stored, wherein the program instructions, when executed by a computer, cause the computer to perform the video transmission quality assessment method based on video transmission according to any one of the above first aspects.
  • the technical solution provided by the present application advances video transmission quality assessment for video transmission services from IP packet granularity to video frame granularity, so that the video transmission quality can be assessed directly on the network side.
  • Fig. 1 is the application field schematic diagram of XR video transmission service in the prior art
  • FIG. 2 is a schematic diagram of a method for evaluating video transmission quality based on XR video transmission in the prior art
  • FIG. 3 is a flowchart of a method for evaluating video transmission quality based on video transmission provided by an embodiment of the present application
  • FIG. 4 is a flowchart of acquiring the frame loss duration of a video frame in a video transmission process provided by an embodiment of the present application
  • FIG. 5 is a flowchart of determining a first indicator for evaluating video transmission quality based on the frame loss duration and the number of frame drops provided by an embodiment of the present application;
  • FIG. 6 is another flowchart of a video transmission quality assessment method based on video transmission provided by an embodiment of the present application.
  • FIG. 7 is an architectural diagram of a specific implementation manner of a method for evaluating video transmission quality in an HEVC application scenario provided by an embodiment of the present application;
  • FIG. 8 is a schematic diagram of an inter-frame reference relationship in an HEVC application scenario provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a freeze in an image group provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of the number of freezes in an HEVC application scenario provided by an embodiment of the present application.
  • FIG. 11 is a first schematic diagram of an incoming packet delay and a transmission delay provided by an embodiment of the present application.
  • FIG. 12 is a second schematic diagram of an incoming packet delay and a transmission delay provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of an inter-frame reference relationship in an SHVC application scenario provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of a frame drop position and frame drop duration in an SHVC application scenario provided by an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of an apparatus for evaluating video transmission quality based on video transmission provided by an embodiment of the present application;
  • FIG. 16 is a schematic structural diagram of a computing device according to an embodiment of the present application.
  • The cliff effect is the phenomenon in which bit-level errors spread within a video frame: a single bit-level error causes the picture quality of the entire video frame to drop sharply. This means that even at the same packet loss rate, picture quality differs greatly depending on whether the lost packets are concentrated in a single frame or spread evenly across multiple frames.
  • MSE Mean Square Error
  • PSNR Peak Signal Noise Ratio
  • SSIM structural similarity
  • MSE and PSNR are two simple image quality evaluation parameters specified by the ITU-R video transmission quality expert group, and they are also the two most commonly used. However, both parameters have limitations and often cannot accurately reflect subjective image quality: high values of these parameters do not guarantee good subjectively perceived quality, and images with better subjectively perceived quality do not necessarily score high on them.
  • ⁇ x is the mean value of the original reference video image
  • ⁇ y is the mean value of the video image after transmission degradation
  • ⁇ x is the standard deviation of the original reference video image
  • ⁇ y is the degraded video image after transmission
  • ⁇ xy is the covariance of the original reference video image and the degraded video image after transmission
  • C 1 and C 2 are constants defined for numerical calculation stability.
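  • For reference, these parameters enter the standard structural-similarity expression known from the literature (reproduced here as a well-known formula, not recovered from this publication's figures):

$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+C_1)(2\sigma_{xy}+C_2)}{(\mu_x^2+\mu_y^2+C_1)(\sigma_x^2+\sigma_y^2+C_2)}$$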
  • the evaluation method of video transmission quality based on structural similarity is an end-to-end evaluation method, and the parameters required for the calculation, such as μx, μy, σx, σy and σxy, are difficult to obtain during the video transmission process.
  • PER Packet Error Rate
  • BLER Block Error Rate
  • PDB Packet Delay Budget
  • FIG. 2 shows the overall architecture of this prior-art scheme.
  • the scheme takes IP packet granularity as the basic evaluation index, obtains the IP packet from the source end, and transmits the IP packet to the terminal through NR to evaluate the video transmission quality.
  • This solution lacks source media information and terminal visualization information, so that the quality assessment of XR video transmission can only use indirect indicators available on the RAN side, such as PER, PDB, etc.
  • using IP packet granularity to evaluate XR video transmission quality is inaccurate: under the same packet loss rate, different packet loss locations result in very different numbers of lost XR video frames.
  • video transmission quality assessment for XR video transmission requires video frame-level granularity, and this scheme does not provide a conversion mechanism from IP granularity packets to video frame granularity.
  • In this solution, the problem of video decoding dependence caused by the inter-frame reference relationship between video frames is not considered. Because of the coding-unit reference relationship in video coding, frame-level error propagation occurs, and different frame drop positions and frame drop durations produce different XR video interactive experiences. In addition, the interaction delay also affects the interactive experience of XR video, and this solution does not consider the key delay indicators that can be perceived on the RAN side.
  • an embodiment of the present application provides a video transmission quality assessment method, which can improve the accuracy of the video transmission quality assessment.
  • the video transmission quality assessment method based on video transmission provided in the embodiments of the present application can be applied to scenarios such as high efficiency video coding (HEVC, High Efficiency Video Coding) and scalable high efficiency video coding (SHVC, Scalable High Efficiency Video Coding).
  • HEVC High Efficiency Video Coding
  • SHVC Scalable High efficiency Video Coding
  • the specific application mode will be described in detail in the following embodiments.
  • using the video transmission quality evaluation method based on video transmission provided in this application, it is also possible to guide operators in network construction and to quantitatively measure the impact of the network on the XR user experience, so as to drive network upgrades.
  • the video transmission quality assessment method based on video transmission provided by this application can also be used for locating and delimiting network problems, guiding network planning, network optimization, and the like.
  • the method mainly includes steps S110-S120, and each step will be sequentially introduced below:
  • the network element acquires the frame drop duration and frame drop times of the video frames during the video transmission process.
  • acquiring the frame loss duration of the video frame in the video transmission process may include steps S210-S230, as shown in FIG. 4, specifically:
  • the network element obtains the mapping relationship between the data packet of the PDCP layer and the video frame through the mapping relationship between the data packet of the IP layer and the video frame.
  • This embodiment provides two methods for implementing this step, specifically:
  • Method 1: the data packets arriving at the core network in each cycle are marked as data packets of the same video frame, so as to obtain the mapping relationship between the data packets of the IP layer and the video frames;
  • the core network parses the mapping relationship between the data packets of the IP layer and the video frame, and sends the parsed mapping relationship to the PDCP layer to obtain the mapping relationship between the data packets of the PDCP layer and the video frame.
  • Method 2: by means of cross-layer labeling, the mapping relationship is transmitted from the layer that has the mapping relationship between data packets and video frames down to the IP layer, so as to obtain the mapping relationship between the data packets of the IP layer and the video frames;
  • the parsed mapping relationship is sent to the PDCP layer to obtain the mapping relationship between the data packets of the PDCP layer and the video frame.
  • the process of cross-layer labeling is a process of adding media information labeling from a high layer to a lower layer.
  • the media information may include the frame number, the size of the GOP in the video, the type of each video frame, and the like (eg, I frame, P frame, etc.).
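  • As a minimal sketch of Method 1 above, the fragment below labels data packets that arrive within the same frame period with the same frame number; the names `Packet` and `label_frames_by_period` and the 60 fps period are illustrative assumptions, not identifiers from this publication.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    seq: int           # packet sequence number
    arrival_ms: float  # arrival time at the core network (IP layer)
    frame_id: int = -1

def label_frames_by_period(packets, frame_period_ms):
    """Method 1: mark all packets arriving within one frame period as
    belonging to the same video frame (IP-layer packet-to-frame mapping)."""
    packets = sorted(packets, key=lambda p: p.arrival_ms)
    frame_id, window_start = 0, packets[0].arrival_ms
    for p in packets:
        # A new frame period begins once the arrival time leaves the window.
        if p.arrival_ms - window_start >= frame_period_ms:
            frame_id += 1
            window_start = p.arrival_ms
        p.frame_id = frame_id
    return packets

# Example: a 60 fps service has a frame period of roughly 16.7 ms.
pkts = [Packet(i, t) for i, t in enumerate([0.0, 2.0, 3.0, 17.0, 18.0, 34.0])]
for p in label_frames_by_period(pkts, 16.7):
    print(p.seq, "-> frame", p.frame_id)
```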
  • the gateway determines the packet loss at the PDCP layer and treats the PDCP-layer packet loss as IP-layer packet loss. The gateway determines that packet loss occurs at the PDCP layer in cases including but not limited to: decoding errors, packet loss caused by the PDCP layer triggering the packet loss timer in the base station buffer, and packet loss caused by an active packet-dropping algorithm.
  • the gateway includes but is not limited to a base station, a router or an AP in a Wi-Fi scenario.
  • the network element determines the frame loss position based on the mapping relationship between the data packets of the PDCP layer and the video frames. Specifically: if any packet of the current video frame is lost, the video frame is regarded as dropped. The number of lost data packets is not limited here; as long as there is packet loss in the current video frame, the video frame is considered lost.
  • S230: The network element determines the frame drop duration according to the frame drop position and the inter-frame decoding reference relationship. Specifically:
  • if the video frames in a picture group have an inter-frame decoding reference relationship, every frame from the frame drop start position to the last frame of the picture group is regarded as dropped; if they have no inter-frame decoding reference relationship, only the frames at the drop positions are regarded as dropped.
  • a video may include multiple image groups, and one image group may include multiple video frames.
  • acquiring the number of frame drops of video frames during video transmission may include the following steps:
  • the network element determines a first indicator for evaluating video transmission quality based on the frame loss duration and the frame loss times.
  • This step may include steps S310-S320, as shown in FIG. 5, specifically:
  • the average value of the frame drop duration can be obtained by dividing the accumulated sum of each frame drop duration by the number of frame drop times.
  • S320 Perform a weighted calculation on the average value of the frame dropping duration and the number of frame dropping times to obtain a first indicator for evaluating video transmission quality.
  • the average value of the frame dropping duration and the weighting coefficient of the frame dropping times can be set according to the actual application.
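  • The exact weighting expression appears only in the publication's figures; a plausible reading of this step, with L̄ denoting the average frame drop duration, n the number of frame drops, and w1, w2 the application-specific weighting coefficients, is:

$$f_1 = w_1\,\bar{L} + w_2\,n, \qquad \bar{L} = \frac{1}{n}\sum_{i=1}^{n} L_i$$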
  • This embodiment evaluates video transmission quality from the picture quality of the transmitted video, moving away from the traditional packet-granularity evaluation, and also provides a packet mapping method, so that the video transmission quality can be evaluated directly from the network side with improved accuracy.
  • Another embodiment of the present application further provides a method for determining the first indicator in step S120, wherein the first indicator is further determined according to the quality of the video frames lost in the frame dropping process.
  • the quality of the video frame lost in the frame dropping process is determined as follows:
  • f(j) is the quality of the j-th video frame lost in one frame loss process;
  • v1, v2, v3 and v4 are all constants;
  • j indexes the j-th frame lost in one frame loss process, where j ∈ [1, Li] and Li is the frame drop duration of the i-th frame drop.
  • the video transmission quality assessment method based on video transmission further includes steps S410-S420, and each step will be sequentially introduced below:
  • S410 Acquire an incoming packet delay and a transmission delay for transmitting the video frame to the network element.
  • acquiring the incoming packet delay for transmitting the video frame to the network element includes steps S510-S530, and each step will be sequentially introduced below:
  • S530: Take the difference between the time t2 when the last data packet of the video frame arrives at the PDCP layer and the time t1 when the first data packet of the video frame arrives at the PDCP layer to obtain the incoming packet delay of the video frame transmitted to the network element.
  • acquiring the transmission delay includes steps S610-S630, and each step will be sequentially introduced below:
  • S630 Use the difference between the time t3 when the last data packet of the video frame is sent and the time t1 when the first data packet of the video frame arrives at the PDCP layer to obtain the transmission delay of the video frame.
  • S420 Determine the second index for evaluating the video transmission quality based on the incoming packet delay and the transmission delay. This step includes steps S710-S730, and each step will be sequentially introduced below:
  • S710 Calculate the average value and variance of the incoming packet delay
  • S730 Perform a weighted calculation on the average of the incoming packet delay, the variance of the incoming packet delay, the average of the transmission delay, and the variance of the transmission delay to obtain a second indicator for video transmission quality.
  • This embodiment also considers an interactive delay evaluation method based on a radio access network (RAN, Radio Access Network) side perceptible delay parameter, so as to improve the evaluation accuracy.
  • RAN Radio Access Network
  • a weighted calculation is performed on the first index in the above embodiment and the second index in the above embodiment, and the video transmission quality is evaluated by using the calculation result.
  • the technical solution of this embodiment establishes a video transmission quality evaluation system that considers both the picture quality and the user interaction experience, so that the video transmission quality evaluation is more reasonable.
  • the evaluation method for video transmission quality provided in this embodiment can also be used to evaluate network quality, and the quality of the network is characterized by the level of the index used for video transmission quality.
  • a method for evaluating video transmission quality based on video transmission provided by the embodiment of the present application is described in detail by taking XR video transmission service evaluation in an HEVC application scenario as an example.
  • the XR video transmission service is a burst service, so the video service has a distinct period. Therefore, the quality evaluation of the XR video transmission service is generally based on the observation period, such as 10s.
  • the video transmission quality evaluation method provided by the embodiment of the present application mainly includes two parts: one part is picture quality evaluation based on frame loss position information, and the other part is evaluation based on RAN-side delay indicator information.
  • Step 1 The base station determines the packet loss situation of the PDCP layer, and equates the PDCP layer data packet loss situation to the IP layer data packet loss situation.
  • the base station (in this embodiment the gateway unit is exemplified by a base station) determines that packet loss occurs at the PDCP layer in cases including but not limited to: decoding errors, packet loss caused by the PDCP layer triggering the packet loss timer in the base station buffer, and packet loss caused by an active packet-dropping algorithm.
  • Step 2 Determine the frame loss position according to the mapping relationship between the data packets of the PDCP layer and the video frames.
  • the mapping relationship indicates which data packets each video frame comprises in one observation period, that is, the video frame to which each data packet in the observation period belongs. For example, in an observation period, data packet No. 1 belongs to video frame No. 1.
  • the mapping relationship between the data packet of the IP layer and the video frame can be obtained in the following ways:
  • the first way is to mark the data packets arriving at the IP layer in an observation period as the data packets of the same video frame, that is, to obtain the mapping relationship between the data packets of the IP layer and the video frames.
  • the second way is to obtain the mapping relationship between the data packets of the IP layer and the video frames by means of cross-layer annotation. The bottom layer cannot know which data packets belong to the same frame, but a higher layer can; cross-layer annotation therefore transfers the mapping relationship from the high layer that has the mapping between data packets and video frames down to the IP layer. For example, the mapping relationship is passed from the Real-time Transport Protocol (RTP, Real-time Transport Protocol) layer to the IP layer, so that the IP layer obtains the mapping relationship between its data packets and the video frames.
  • RTP Real-time Transport Protocol
  • the frame loss position needs to be determined based on the mapping relationship. Specifically, if there is a packet loss in the current video frame, the video frame can be regarded as a lost frame, and the lost frame position of the video frame is recorded regardless of the number of lost packets. Wherein, whether there is a data packet loss can be determined by whether the packet numbers of the data packets in the video frame are continuous.
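  • A small sketch of this continuity check, assuming each frame's data packets carry consecutive packet numbers (the function name is illustrative):

```python
def frame_dropped(seq_numbers):
    """A frame is regarded as dropped if any of its packets is missing,
    i.e. if its packet numbers are not consecutive."""
    s = sorted(seq_numbers)
    return any(b - a != 1 for a, b in zip(s, s[1:]))

print(frame_dropped([101, 102, 103]))  # False: all packets present
print(frame_dropped([101, 103]))       # True: packet 102 was lost
```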
  • Step 3 After the frame drop position is determined, the frame drop duration needs to be determined based on the frame drop position and the inter-frame decoding reference relationship.
  • the inter-frame reference relationship in this embodiment will be described first.
  • FIG. 8 in an XR video transmission service whose source coding mode is HEVC, video frames are coded into I frames and P frames.
  • the referenced relationship between frames is indicated by arrows.
  • the following P frame will refer to the previous I frame or P frame. Among them, the I frame can be decoded independently, and the P frame needs to be decoded by the reference frame.
  • the frame length in a picture group is 8, and the first frame in the picture group is a frame that can be decoded independently, and the decoding of the remaining frames needs to refer to the previous frame for decoding.
  • 1 is used to indicate that the frame at a position is not dropped and 0 that it is dropped; the picture group is thus recorded as 11110011.
  • the frame loss duration in the first picture group is 4 (the 5th, 6th, 7th, and 8th frames are all lost);
  • the starting position of the dropped frame is the first frame. Since there is an inter-frame decoding reference relationship in the picture group, the frame drop duration of the picture group is 8 (the first to eighth frames are all dropped).
  • the size of one GOP is one frame.
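  • The sketch below applies this rule to a received/dropped mask such as the 11110011 example above; the function name and the list representation are illustrative assumptions, not from the publication.

```python
def expand_drops(gop_mask, has_reference):
    """gop_mask: one 1/0 entry per frame of a picture group
    (1 = received, 0 = dropped). With an inter-frame decoding reference
    relationship, every frame from the first drop to the end of the GOP
    is regarded as dropped; without one, only the dropped positions count."""
    out = gop_mask[:]
    if has_reference and 0 in out:
        first = out.index(0)
        for k in range(first, len(out)):
            out[k] = 0
    return out

# The text's 11110011 example: with a reference relationship, frames
# 5-8 are all regarded as dropped, so the frame drop duration is 4.
mask = [1, 1, 1, 1, 0, 0, 1, 1]
expanded = expand_drops(mask, has_reference=True)
print(expanded, "drop duration:", expanded.count(0))
```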
  • Step 4 After determining the frame loss duration, it is necessary to determine the number of frame loss, that is, the number of freezes.
  • FIG. 9 it is a schematic diagram of freezing in an image group.
  • from the first frame drop position Si (that is, the frame loss starting position) to the last frame of the picture group, a freeze is formed over the corresponding time interval.
  • two consecutive frame drops are recorded as one: the frame drop starting position remains unchanged, and the frame drop durations are added together.
  • the shaded part is in the frozen state.
  • in adjacent GOPs, if the last frame of the previous GOP and the first frame of the next GOP are both in the dropped state, the two frame drops are merged and recorded as a single frame drop, i.e., one freeze.
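  • A sketch of the freeze counting just described, in which a drop reaching the tail of one GOP merges with a drop starting at the head of the next GOP (names illustrative):

```python
def count_freezes(gop_masks):
    """gop_masks: per-GOP 1/0 lists after reference-relationship expansion.
    Consecutive dropped frames form one freeze; a freeze ending at the last
    frame of a GOP merges with one starting at the first frame of the next."""
    flat = [f for gop in gop_masks for f in gop]
    freezes, i = [], 0  # each freeze: (start_index, duration)
    while i < len(flat):
        if flat[i] == 0:
            start = i
            while i < len(flat) and flat[i] == 0:
                i += 1
            freezes.append((start, i - start))
        else:
            i += 1
    return freezes

# A drop at the tail of GOP 1 continues into the head of GOP 2 and is
# therefore merged and counted as a single freeze of duration 6.
gops = [[1, 1, 1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 1, 1, 1, 1]]
print(count_freezes(gops))  # [(4, 6)]
```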
  • the first indicator in step 5 may also be determined by the quality of the video frames lost in the frame dropping process.
  • the quality of the video frame lost in the frame dropping process is determined as follows:
  • f(j) is the quality of the j-th video frame lost in one frame loss process;
  • v1, v2, v3 and v4 are all constants;
  • j indexes the j-th frame lost in one frame loss process, where j ∈ [1, Li] and Li is the frame drop duration of the i-th frame drop.
  • the first indicator for video transmission quality is calculated from the quality f(j) of the j-th video frame lost during a frame loss process, the frame drop duration Li obtained in step 3, and the number of frame drops n obtained in step 4, where C1 and C2 are both constants and i indexes the i-th frame drop.
  • a video frame includes multiple data packets.
  • taking a video frame that includes 5 data packets as an example, each of the 5 small rectangles shown in FIG. 11 and FIG. 12 represents one data packet.
  • record the time t1 when the first packet of the video frame arrives at the PDCP layer of the base station and the time t2 when the last packet of the video frame arrives at the base station.
  • after the data packets of the video frame arrive at the base station, they wait in the buffer until scheduled and are then sent over the air interface; the time t3 when all data packets of the video frame have been sent, that is, when the last packet of the video frame is sent out, is recorded.
  • the incoming packet delay of the video frame transmitted to the base station is T1 = t2 - t1. This physical quantity represents the interval between the first packet of the video frame arriving in the PDCP-layer buffer and the last packet arriving in the PDCP-layer buffer, and characterizes the stability of packet arrival at the base station. If T1 is small, such as 2 ms, the base station considers the packet arrival to be stable; if T1 is large, such as 10 ms, the base station considers the packet jitter to be large and the arrival unstable.
  • in this way, service interruptions caused by failures of upstream network element nodes can be excluded, avoiding misjudgment of the video transmission quality on the RAN side.
  • the transmission delay of the video frame to the base station is T2 = t3 - t1. This physical quantity represents the interval between the first packet of the video frame arriving in the PDCP-layer buffer and the last packet completing scheduling and transmission, and characterizes the video transmission quality: if T2 is small, the transmission quality is considered good; if T2 is large, it is considered poor.
  • T1avg = (T11 + T12 + ... + T1m) / m
  • T2avg = (T21 + T22 + ... + T2m) / m
  • where T1m is the incoming packet delay of the m-th video frame transmitted to the base station, T2m is the transmission delay of the m-th video frame transmitted to the base station, and m is the total number of video frames.
  • the second indicator f2 for video transmission quality is calculated from the average incoming packet delay T1avg, the variance of the incoming packet delay D1avg, the average transmission delay T2avg and the variance of the transmission delay D2avg:
  • f2 = r1*T1avg + r2*D1avg + r3*T2avg + r4*D2avg, where r1, r2, r3 and r4 are weighting coefficients.
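  • The sketch below computes T1 and T2 per video frame from the timestamps t1, t2 and t3 defined above, then evaluates f2 with the weighted formula; the equal weights in the example are purely illustrative, since the publication leaves r1 to r4 to be set per application.

```python
from statistics import mean, pvariance

def second_indicator(frames, r1, r2, r3, r4):
    """frames: one (t1, t2, t3) tuple per video frame, where
    t1 = first packet arrives at the PDCP layer,
    t2 = last packet arrives at the PDCP layer,
    t3 = last packet is sent over the air interface.
    Returns f2 = r1*T1avg + r2*D1avg + r3*T2avg + r4*D2avg."""
    T1 = [t2 - t1 for (t1, t2, _) in frames]   # incoming packet delay
    T2 = [t3 - t1 for (t1, _, t3) in frames]   # transmission delay
    T1avg, D1avg = mean(T1), pvariance(T1)
    T2avg, D2avg = mean(T2), pvariance(T2)
    return r1 * T1avg + r2 * D1avg + r3 * T2avg + r4 * D2avg

# Three frames with timestamps in milliseconds.
frames = [(0.0, 2.0, 8.0), (16.7, 18.2, 27.0), (33.4, 36.0, 45.5)]
print(second_indicator(frames, 0.25, 0.25, 0.25, 0.25))
```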
  • the video transmission quality evaluation method provided in this embodiment is basically the same as that of the previous embodiment; the difference lies in the picture quality evaluation based on frame loss position information, so the evaluation based on RAN-side delay indicator information will not be repeated.
  • the video frame is divided into a base layer (BL, Base Layer) and one or more enhancement layers (EL, Enhancement Layer).
  • BL base layer
  • EL enhanced Layer
  • this embodiment takes one base layer and one enhancement layer as an example for description.
  • Step 1: The base station determines the PDCP-layer packet loss of the base layer and the PDCP-layer packet loss of the enhancement layer, and treats the PDCP-layer packet loss of each layer as the corresponding IP-layer packet loss.
  • the base station determines that data packet loss occurs including but not limited to: decoding errors, packet loss caused by the PDCP layer triggering a packet loss timer in the base station buffer, and packet loss caused by an active packet loss algorithm.
  • Step 2 Determine the frame drop position of the base layer and the frame drop position of the enhancement layer respectively according to the mapping relationship between the data packets of the PDCP layer and the base layer and the enhancement layer of the video frame. For the specific implementation manner, refer to step 2 in the previous embodiment.
  • Step 3 After the frame drop position is determined, the frame drop duration needs to be determined based on the frame drop position and the inter-frame decoding reference relationship.
  • the inter-frame reference relationship in this embodiment will be described first.
  • FIG. 13 in an XR video transmission service whose source coding mode is SHVC, video frames are coded into I frames and P frames.
  • arrows are used to indicate the reference relationship between frames.
  • the subsequent P frame refers to the previous I frame or P frame, and the enhancement layer refers to the P frame or I frame of the base layer through inter-layer reference.
  • the I frame can be decoded independently, and the P frame needs to be decoded by the reference frame.
  • the base layer of the first frame or IDR refresh frame in a GOP can be decoded independently. Based on the above inter-frame decoding reference relationship, if an error occurs in decoding a received I frame or P frame, all subsequent inter-frame decoding will be erroneous until the next GOP or IDR refresh frame arrives. Unlike the HEVC scenario, in the SHVC scenario an error in decoding the base layer also causes an error in decoding the enhancement layer of the same frame.
  • the decoding reference relationship in the SHVC scenario can be summarized as follows (see the sketch after this list): when the base layer loses a frame, the base layer from the frame loss starting position to the last frame in the group of pictures is regarded as entirely lost, and the enhancement layer from the frame loss starting position to the last frame in the GOP is likewise regarded as entirely lost; if the enhancement layer loses frames, only the enhancement layer from the frame loss starting position to the last frame in the GOP is regarded as lost.
  • when the enhancement layer is received, the quality of the video frame is the quality of the enhancement layer;
  • when only the base layer is received, the quality of the video frame is the quality of the base layer; if both the enhancement layer and the base layer are lost, a frame drop is triggered and the video freezes.
  • the enhancement-layer frame drops recorded in this embodiment are those in which only the enhancement layer is lost while the base layer is not, together with their frame drop durations. For multiple enhancement layers, the frame drop starting position and the frame drop duration of each enhancement layer are recorded.
  • when the base layer (BL) first loses a frame,
  • the corresponding enhancement layer (EL) of the same frame is also recorded as being in the dropped state.
  • if the enhancement layer starts losing frames before the base layer does, its frame loss starting position is earlier; therefore, the first frame drop position SiEL of the enhancement layer needs to be recorded separately.
  • for the base layer, the frame drop duration is LiBL;
  • for the enhancement layer, the frame drop duration is LiBL + LiEL.
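  • A sketch of this layered rule within one GOP, assuming a single base layer and a single enhancement layer and the same 1/0 mask representation as before (names illustrative):

```python
def expand_shvc_drops(bl_mask, el_mask):
    """Apply the SHVC decoding reference rule within one GOP.
    bl_mask/el_mask: 1/0 per frame (1 = received, 0 = dropped).
    A base-layer drop invalidates both BL and EL from that frame to the
    end of the GOP; an enhancement-layer drop invalidates only the EL."""
    n = len(bl_mask)
    bl, el = bl_mask[:], el_mask[:]
    if 0 in bl:
        s = bl.index(0)
        for k in range(s, n):
            bl[k] = 0
            el[k] = 0  # the EL references the BL of the same frame
    if 0 in el:
        s = el.index(0)
        for k in range(s, n):
            el[k] = 0
    return bl, el

# The EL drops at frame 3 (S_iEL earlier than the BL drop) and the BL
# drops at frame 5: the EL is lost from frame 3 on, the BL from frame 5 on.
bl, el = expand_shvc_drops([1, 1, 1, 1, 0, 1, 1, 1],
                           [1, 1, 0, 1, 1, 1, 1, 1])
print("BL:", bl)  # [1, 1, 1, 1, 0, 0, 0, 0]
print("EL:", el)  # [1, 1, 0, 0, 0, 0, 0, 0]
```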
  • Step 4 in this embodiment is the same as step 4 in the above-mentioned embodiment, so it will not be repeated in this embodiment.
  • Step 5: Calculate the average frame drop duration and the number of frame drops separately for each layer.
  • n' is the number of frame dropping of the enhancement layer
  • n'' is the number of frame dropping of the base layer.
  • the first indicator in step 5 may also be determined according to the quality of the video frames lost in the frame loss process.
  • the quality of the video frame lost in the frame dropping process is determined as follows:
  • f(j) is the quality of the j-th video frame lost in one frame loss process;
  • v1, v2, v3 and v4 are all constants;
  • j indexes the j-th frame lost in one frame loss process, where j ∈ [1, LiBL] and LiBL is the frame drop duration of the i-th base-layer frame drop.
  • the first indicator for video transmission quality is calculated from the frame drop duration LiBL of the i-th base-layer frame drop, the frame drop duration LiEL of the i-th enhancement-layer frame drop, the number of base-layer frame drops n'' and the number of enhancement-layer frame drops n', where C1, C2 and C3 are all constants and i indexes the i-th frame drop.
  • the evaluation method for video transmission quality provided by the above embodiments comprehensively considers evaluation based on both picture quality and interaction delay, and can also be used for network problem locating, delimiting and the like.
  • the apparatus includes:
  • the first obtaining module 810 is used to obtain the frame drop duration and frame drop times of the video frame in the video transmission process
  • a first determining module 820 configured to determine a first indicator for evaluating video transmission quality based on the frame dropping duration and the number of frame dropping.
  • the device also includes:
  • a second acquiring module 830 configured to acquire the incoming packet delay and transmission delay of the video frame being transmitted to the network element
  • the second determination module 840 is configured to determine a second indicator for evaluating video transmission quality based on the packet arrival delay and transmission delay.
  • the device also includes:
  • the evaluation module 850 is configured to perform weighted calculation on the first index and the second index, and use the calculation result to evaluate the video transmission quality.
  • the first obtaining module 810 includes:
  • the first obtaining unit 8110 is used to obtain the mapping relationship between the data packet of the PDCP layer and the video frame through the mapping relationship of the data packet of the IP layer to the video frame;
  • a first determining unit 8120 configured to determine the frame loss position based on the mapping relationship between the data packets of the PDCP layer and the video frames;
  • the second determining unit 8130 is configured to determine the frame drop duration according to the frame drop position and the inter-frame decoding reference relationship.
  • the first obtaining unit is specifically used for:
  • the data packets arriving at the core network in each cycle are marked as the data packets of the same video frame to obtain the mapping relationship between the data packets of the IP layer and the video frames;
  • the core network parses the mapping relationship between the data packets of the IP layer and the video frames, and sends the parsed mapping relationship to the PDCP layer to obtain the mapping relationship between the data packets of the PDCP layer and the video frames.
  • the first obtaining unit is specifically used for:
  • mapping relationship is transmitted to the IP layer, so as to obtain the mapping relationship between data packets in the IP layer and video frames;
  • the parsed mapping relationship is sent to the PDCP layer to obtain the mapping relationship between the data packets of the PDCP layer and the video frame.
  • the first obtaining unit 8110 includes:
  • the first sending subunit 8111 is used to send the mapping relationship between the data packet of the IP layer and the video frame to the core network for analysis;
  • the second sending subunit 8112 is configured to send the parsed mapping relationship to the PDCP layer to obtain the mapping relationship between the data packets of the PDCP layer and the video frame.
  • the first determining unit 8120 is specifically configured to: if there is packet loss in the current video frame, regard the video frame as dropped.
  • the second determining unit 8130 includes:
  • the first acquisition subunit 8131 is used to obtain the frame-dropping start position in each image group in the video based on the frame-dropping position;
  • the first determination subunit 8132 is used for, in each picture group, if each video frame has an inter-frame decoding reference relationship, from the start position of the dropped frame to the last frame of the picture group, it is regarded as a dropped frame state .
  • the second determining unit 8130 includes:
  • the second acquisition subunit 8133 is used to obtain the frame-dropping start position in each image group in the video based on the frame-dropping position;
  • the second determining subunit 8134 is configured to, in each picture group, only consider the frame drop state at the frame drop position if there is no inter-frame decoding reference relationship for each video frame.
  • the first obtaining module 810 further includes:
  • the third determining unit 8140 is configured to, in two adjacent picture groups, if both the last frame of the previous picture group and the first frame of the next picture group are in the dropped state, merge the two frame drops and record them as one frame drop.
  • the first indicator is also determined according to the quality of the video frames lost in the frame loss process.
  • the quality of the video frame lost in the frame dropping process is determined as follows:
  • f(j) is the quality of the j-th video frame lost in one frame loss process;
  • v1, v2, v3 and v4 are all constants;
  • j indexes the j-th frame lost in one frame loss process, where j ∈ [1, Li] and Li is the frame drop duration of the i-th frame drop.
  • the second obtaining module 830 includes:
  • the second obtaining unit 8310 is used to obtain the time t 1 when the first data packet of the video frame arrives at the PDCP layer;
  • the third obtaining unit 8320 is used to obtain the time t 2 when the last data packet of the video frame arrives at the PDCP layer;
  • the third calculation unit 8330 is configured to take the difference between the time t2 when the last data packet of the video frame arrives at the PDCP layer and the time t1 when the first data packet of the video frame arrives at the PDCP layer to obtain the incoming packet delay of the video frame transmitted to the network element.
  • the second obtaining module 830 further includes:
  • the fourth obtaining unit 8340 is used to obtain the time t 1 when the first data packet of the video frame arrives at the PDCP layer;
  • the fifth obtaining unit 8350 is used to obtain the time t 3 when the last data packet of the video frame is sent;
  • the fourth calculation unit 8360 is configured to take the difference between the time t3 when the last data packet of the video frame is sent and the time t1 when the first data packet of the video frame arrives at the PDCP layer to obtain the transmission delay of the video frame.
  • the second determining module 840 includes:
  • a fifth calculation unit 8410 configured to calculate the average value and variance of the incoming packet delay
  • a sixth calculation unit 8420 configured to calculate the average value and variance of the transmission delay
  • the seventh calculation unit 8430 is configured to perform weighted calculation on the average incoming packet delay, the variance of the incoming packet delay, the average transmission delay and the variance of the transmission delay to obtain the second indicator for video transmission quality.
  • FIG. 16 is a schematic structural diagram of a computing device 1500 provided by an embodiment of the present application.
  • the computing device 1500 includes: a processor 1510 , a memory 1520 , a communication interface 1530 , and a bus 1540 .
  • the communication interface 1530 in the computing device 1500 shown in FIG. 16 may be used to communicate with other devices.
  • the processor 1510 can be connected with the memory 1520 .
  • the memory 1520 may be used to store program code and data. The memory 1520 may be a storage unit inside the processor 1510, an external storage unit independent of the processor 1510, or may include both a storage unit inside the processor 1510 and an external storage unit independent of the processor 1510.
  • computing device 1500 may also include bus 1540 .
  • the memory 1520 and the communication interface 1530 may be connected to the processor 1510 through the bus 1540 .
  • the bus 1540 may be a Peripheral Component Interconnect (PCI, Peripheral Component Interconnect) bus or an Extended Industry Standard Architecture (EISA, Extended Industry Standard Architecture) bus or the like.
  • PCI Peripheral Component Interconnect
  • EISA Extended Industry Standard Architecture
  • the bus 1540 can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one line is shown in Figure 16, but it does not mean that there is only one bus or one type of bus.
  • the processor 1510 may adopt a central processing unit (CPU, central processing unit).
  • the processor may also be another general-purpose processor, a digital signal processor (DSP, digital signal processor), an application-specific integrated circuit (ASIC, application specific integrated circuit), a field-programmable gate array (FPGA, field programmable gate array) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • DSP digital signal processors
  • ASIC application specific integrated circuits
  • FPGA field programmable gate array
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the processor 1510 uses one or more integrated circuits to execute related programs to implement the technical solutions provided by the embodiments of the present application.
  • the memory 1520 may include read only memory and random access memory and provides instructions and data to the processor 1510 .
  • a portion of the memory 1520 may also include non-volatile random access memory.
  • the memory 1520 may also store device type information.
  • the processor 1510 executes the computer-executable instructions in the memory 1520 to perform the operation steps of the above method.
  • the computing device 1500 may correspond to the corresponding execution subjects of the methods according to the embodiments of the present application, and the above and other operations and/or functions of the modules in the computing device 1500 respectively implement the corresponding flows of the methods in the embodiments of the present application;
  • for brevity, these are not repeated here.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.
  • Embodiments of the present application further provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program is used to perform a video transmission quality evaluation method including at least one of the solutions described in the foregoing embodiments.
  • the computer storage medium of the embodiments of the present application may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a propagated data signal in baseband or as part of a carrier wave, with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for carrying out the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Abstract

The present application relates to the field of network transmission and is applicable to the evaluation of video transmission quality. It specifically relates to a video transmission quality evaluation method, the method including: obtaining the frame loss duration and the number of frame losses of video frames during video transmission; determining, based on the frame loss duration and the number of frame losses, a first indicator for evaluating video transmission quality; obtaining the packet arrival delay and the transmission delay of the video frames transmitted to a network element; determining, based on the packet arrival delay and the transmission delay, a second indicator for evaluating video transmission quality; and performing a weighted calculation on the first indicator and the second indicator, using the calculation result to evaluate the video transmission quality. Based on the technical solution provided by the present application, video transmission quality can be evaluated more accurately.

Description

A method and apparatus for evaluating video transmission quality

Technical Field
The present application relates to the field of network transmission, and in particular to a method and apparatus for evaluating video transmission quality.
Background
XR (eXtended Reality) is a general term for virtual reality (VR), augmented reality (AR), and mixed reality (MR). Virtual reality produces sound, images, and other media perceptible to the human body through a head-mounted device; this technology can create a virtual world and give people an immersive experience. Augmented reality superimposes the virtual world on the real world, realizing an "augmentation" of the real world. Mixed reality superimposes real things into the virtual world; in a mixed reality environment, real entities and data entities coexist and interact in real time, and the key feature of mixed reality is that synthetic data entities and real entities can interact in real time. As shown in Figure 1, XR video transmission services have been applied in many fields, such as entertainment, medicine, education, retail, and advertising.
Transmitting XR video services over NR (5G New Radio) has the following characteristics: (1) High bit rate: since XR video services require higher resolution than general video services, the transmission rate of XR video services is higher than that of general video services, for example 30 Mbps to 250 Mbps. (2) The frame rate of common XR video services is 45 fps, 60 fps, or 120 fps, and the interval at which video frames arrive at the base station is roughly the inverse of the frame rate. (3) Variation in video frame size means that the burst data volume of a video request varies. (4) Transmission is subject to delay jitter, which can sometimes reach several milliseconds. (5) For data packets, the uplink and downlink packet delay budget (PDB) is about 10 ms. (6) A video frame compressed with a source compression standard generally consists of multiple IP packets, and a cliff effect occurs during transmission.
In traditional techniques, video transmission quality is generally evaluated with indicators such as the mean square error (MSE) and the peak signal-to-noise ratio (PSNR). However, MSE and PSNR often cannot accurately reflect subjective image quality. Building on this, existing quality evaluations for video transmission services are mostly based on the packet error rate (PER), the block error rate (BLER), and the packet delay budget (PDB); these indicators, however, only reflect video transmission quality indirectly, and evaluating video transmission quality with them is an end-to-end approach that cannot assess video transmission quality accurately.
Summary of the Invention
In view of the above problems in the prior art, the present application provides a method and apparatus for evaluating video transmission quality, which can evaluate video transmission quality directly from the network side and improves the accuracy of video transmission quality evaluation.
To achieve the above objective, a first aspect of the present application provides a video transmission quality evaluation method based on video transmission, including:
obtaining the frame loss duration and the number of frame losses of video frames during video transmission;
determining, based on the frame loss duration and the number of frame losses, a first indicator for evaluating video transmission quality.
As can be seen from the above, the technical solution provided by this aspect evaluates video transmission quality directly from the network side and takes the picture quality of the video transmission service into account, addressing the current situation in which the prior art evaluates video transmission quality purely at packet granularity without considering the video service.
As an implementation of the first aspect, the method further includes:
obtaining the packet arrival delay and the transmission delay of the video frames transmitted to a network element;
determining, based on the packet arrival delay and the transmission delay, a second indicator for evaluating video transmission quality;
wherein the packet arrival delay is the difference between the time at which the last data packet of the video frame arrives at the PDCP layer and the time at which the first data packet of the video frame arrives at the PDCP layer; and the transmission delay is the difference between the time at which the last data packet of the video frame has been sent and the time at which the first data packet of the video frame arrives at the PDCP layer.
As an implementation of the first aspect, a weighted calculation is performed on the first indicator and the second indicator, and the calculation result is used to evaluate the video transmission quality.
As can be seen from the above, the video transmission quality evaluation thus takes into account both the picture quality and the interaction delay of the video transmission service, which can make the evaluation result more accurate.
As an implementation of the first aspect, obtaining the frame loss duration of video frames during video transmission includes:
obtaining the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames;
determining frame loss positions based on the mapping relationship from the data packets at the PDCP layer to the video frames;
determining the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship.
As can be seen from the above, a method for determining the frame loss duration based on the frame loss positions and the inter-frame decoding reference relationship is provided; it fully considers the decoding dependency between video frames and can determine the frame loss duration more accurately.
As an implementation of the first aspect, obtaining the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames includes:
marking the data packets arriving at the core network within each period as data packets of the same video frame, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
the PDCP layer receiving the mapping relationship obtained by parsing the mapping relationship from the data packets at the IP layer to the video frames, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
As an implementation of the first aspect, obtaining the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames includes:
transferring, by means of cross-layer labeling, the mapping relationship from a layer that has the mapping relationship from data packets to video frames down to the IP layer, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
sending the mapping relationship from the data packets at the IP layer to the video frames to the core network for parsing;
sending the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
As an implementation of the first aspect, if any data packet of the current video frame is lost, the video frame is regarded as lost.
As an implementation of the first aspect, determining the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship includes:
obtaining, based on the frame loss positions, the frame loss start position in each group of pictures of the video;
in each group of pictures, if the video frames have an inter-frame decoding reference relationship, regarding all frames from the frame loss start position to the last frame of the group of pictures as being in the frame loss state.
As an implementation of the first aspect, determining the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship includes:
obtaining, based on the frame loss positions, the frame loss start position in each group of pictures of the video;
in each group of pictures, if the video frames have no inter-frame decoding reference relationship, regarding only the frame loss positions as being in the frame loss state.
For video services, because of the reference relationships between coding units in video coding, frame-level error propagation occurs, and different frame loss positions lead to different frame loss durations. On this basis, this aspect determines the frame loss duration by considering the inter-frame decoding reference relationship of the video frames and the frame loss start position, which can describe the frame loss duration fairly accurately.
As an implementation of the first aspect, obtaining the number of frame losses of video frames during video transmission includes:
in two adjacent groups of pictures, if the last frame of the preceding group of pictures and the first frame of the following group of pictures are both in the frame loss state, merging the two frame losses and recording them as one frame loss.
As can be seen from the above, for each group of pictures, the earliest frame loss position is found, and one stutter is formed from that frame to the last frame of its group of pictures. However, if the first frame of the next group of pictures is also lost, the user still perceives a single stutter; therefore, if the last frame of the preceding group of pictures and the first frame of the following group of pictures are both in the frame loss state, the two frame losses are merged and recorded as one frame loss, with the frame loss start position unchanged and the frame loss durations added together.
As an implementation of the first aspect, the first indicator is further determined according to the quality of the video frames lost during frame loss. As an implementation of the first aspect, the quality of a video frame lost during frame loss is determined by the following formula:
f(j) = max(v1 * ln(v2 * j + v3) + v4, 0)
where f(j) is the quality of the j-th video frame lost during one frame loss event, v1, v2, v3, and v4 are all constants, and j is the index of the lost frame within one frame loss event, where j ∈ [1, L_i], and L_i is the frame loss duration of the i-th frame loss.
As an implementation of the first aspect, determining, based on the packet arrival delay and the transmission delay, the second indicator for evaluating video transmission quality includes:
calculating the average and the variance of the packet arrival delay;
calculating the average and the variance of the transmission delay;
performing a weighted calculation on the average of the packet arrival delay, the variance of the packet arrival delay, the average of the transmission delay, and the variance of the transmission delay to obtain the second indicator for video transmission quality.
As can be seen from the above, the provided evaluation of network indicators is combined with the video service, so that the video transmission quality evaluation takes both picture quality and interaction delay into account and can evaluate video transmission quality directly on the network side.
A second aspect of the present application provides a video transmission quality evaluation apparatus based on video transmission, including:
a first obtaining module, configured to obtain the frame loss duration and the number of frame losses of video frames during video transmission;
a first determining module, configured to determine, based on the frame loss duration and the number of frame losses, a first indicator for evaluating video transmission quality.
As an implementation of the second aspect, the apparatus further includes:
a second obtaining module, configured to obtain the packet arrival delay and the transmission delay of the video frames transmitted to a network element;
a second determining module, configured to determine, based on the packet arrival delay and the transmission delay, a second indicator for evaluating video transmission quality;
wherein the packet arrival delay is the difference between the time at which the last data packet of the video frame arrives at the PDCP layer and the time at which the first data packet of the video frame arrives at the PDCP layer; and the transmission delay is the difference between the time at which the last data packet of the video frame has been sent and the time at which the first data packet of the video frame arrives at the PDCP layer.
As an implementation of the second aspect, the apparatus further includes:
an evaluation module, configured to perform a weighted calculation on the first indicator and the second indicator and use the calculation result to evaluate the video transmission quality.
As an implementation of the second aspect, the first obtaining module includes:
a first obtaining unit, configured to obtain the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames;
a first determining unit, configured to determine frame loss positions based on the mapping relationship from the data packets at the PDCP layer to the video frames;
a second determining unit, configured to determine the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship.
As an implementation of the second aspect, the first obtaining unit is specifically configured to:
mark the data packets arriving at the core network within each period as data packets of the same video frame, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
the PDCP layer receives the mapping relationship obtained by parsing the mapping relationship from the data packets at the IP layer to the video frames, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
As an implementation of the second aspect, the first obtaining unit is specifically configured to:
transfer, by means of cross-layer labeling, the mapping relationship from a layer that has the mapping relationship from data packets to video frames down to the IP layer, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
send the mapping relationship from the data packets at the IP layer to the video frames to the core network for parsing;
send the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
As an implementation of the second aspect, if any data packet of the current video frame is lost, the video frame is regarded as lost.
As an implementation of the second aspect, the second determining unit includes:
a first obtaining subunit, configured to obtain, based on the frame loss positions, the frame loss start position in each group of pictures of the video;
a first determining subunit, configured to, in each group of pictures, if the video frames have an inter-frame decoding reference relationship, regard all frames from the frame loss start position to the last frame of the group of pictures as being in the frame loss state.
As an implementation of the second aspect, the second determining unit includes:
a second obtaining subunit, configured to obtain, based on the frame loss positions, the frame loss start position in each group of pictures of the video;
a second determining subunit, configured to, in each group of pictures, if the video frames have no inter-frame decoding reference relationship, regard only the frame loss positions as being in the frame loss state.
As an implementation of the second aspect, the first obtaining module further includes:
a third determining unit, configured to, in two adjacent groups of pictures, if the last frame of the preceding group of pictures and the first frame of the following group of pictures are both in the frame loss state, merge the two frame losses and record them as one frame loss.
As an implementation of the second aspect, in the first determining module, the first indicator is further determined according to the quality of the video frames lost during frame loss.
As an implementation of the second aspect, the quality of a video frame lost during frame loss is determined by the following formula:
f(j) = max(v1 * ln(v2 * j + v3) + v4, 0)
where f(j) is the quality of the j-th video frame lost during one frame loss event, v1, v2, v3, and v4 are all constants, and j is the index of the lost frame within one frame loss event, where j ∈ [1, L_i], and L_i is the frame loss duration of the i-th frame loss.
As an implementation of the second aspect, the second determining module is specifically configured to:
calculate the average and the variance of the packet arrival delay;
calculate the average and the variance of the transmission delay;
perform a weighted calculation on the average of the packet arrival delay, the variance of the packet arrival delay, the average of the transmission delay, and the variance of the transmission delay to obtain the second indicator for video transmission quality.
A third aspect of the present application provides a computing device, including:
a bus;
a communication interface connected to the bus;
at least one processor connected to the bus; and
at least one memory connected to the bus and storing program instructions that, when executed by the at least one processor, cause the at least one processor to execute the instructions of the video transmission quality evaluation method based on video transmission according to any one of the first aspect.
A fourth aspect of the present application provides a computer-readable storage medium having program instructions stored thereon, wherein the program instructions, when executed by a computer, cause the computer to execute the instructions of the video transmission quality evaluation method based on video transmission according to any one of the first aspect.
In summary, the technical solution provided by the present application addresses the shift of video transmission quality evaluation for video transmission services from IP packet granularity to frame-granularity picture quality evaluation, so that video transmission quality can be evaluated directly on the network side.
These and other aspects of the present application will be more clearly understood in the following description of the embodiment(s).
Brief Description of the Drawings
The features of the present application and the relationships between them are further described below with reference to the drawings. The drawings are exemplary; some features are not shown to actual scale, and some drawings may omit features that are customary in the field to which the present application relates and not essential to the present application, or additionally show features that are not essential to the present application. The combinations of features shown in the drawings are not intended to limit the present application. In addition, throughout this specification, the same reference numerals refer to the same content. The specific drawings are described as follows:
Figure 1 is a schematic diagram of application fields of XR video transmission services in the prior art;
Figure 2 is a schematic diagram of a video transmission quality evaluation method based on XR video transmission in the prior art;
Figure 3 is a flowchart of a video transmission quality evaluation method based on video transmission provided by an embodiment of the present application;
Figure 4 is a flowchart of obtaining the frame loss duration of video frames during video transmission provided by an embodiment of the present application;
Figure 5 is a flowchart of determining, based on the frame loss duration and the number of frame losses, the first indicator for evaluating video transmission quality provided by an embodiment of the present application;
Figure 6 is another flowchart of the video transmission quality evaluation method based on video transmission provided by an embodiment of the present application;
Figure 7 is an architecture diagram of a specific implementation of the video transmission quality evaluation method in an HEVC application scenario provided by an embodiment of the present application;
Figure 8 is a schematic diagram of inter-frame reference relationships in an HEVC application scenario provided by an embodiment of the present application;
Figure 9 is a schematic diagram of a stutter within one group of pictures provided by an embodiment of the present application;
Figure 10 is a schematic diagram of the number of stutters in an HEVC application scenario provided by an embodiment of the present application;
Figure 11 is a first schematic diagram of the packet arrival delay and the transmission delay provided by an embodiment of the present application;
Figure 12 is a second schematic diagram of the packet arrival delay and the transmission delay provided by an embodiment of the present application;
Figure 13 is a schematic diagram of inter-frame reference relationships in an SHVC application scenario provided by an embodiment of the present application;
Figure 14 is a schematic diagram of frame loss positions and frame loss durations in an SHVC application scenario provided by an embodiment of the present application;
Figure 15 is a schematic structural diagram of a video transmission quality evaluation apparatus based on video transmission quality provided by an embodiment of the present application;
Figure 16 is a schematic structural diagram of a computing device provided by an embodiment of the present application.
Detailed Description of the Embodiments
In the specification and claims, the words "first", "second", "third", etc., or similar terms such as module A, module B, and module C, are used only to distinguish similar objects and do not imply a particular ordering of the objects. It is understood that, where permitted, specific orders or sequences may be interchanged so that the embodiments of the present application described here can be implemented in orders other than those illustrated or described here.
In the following description, the reference numbers indicating steps, such as S110, S120, etc., do not mean that the steps must be executed in that order; where permitted, the order of the steps may be interchanged, or steps may be executed simultaneously.
The term "comprising" used in the specification and claims should not be construed as being limited to what is listed thereafter; it does not exclude other elements or steps. It should therefore be interpreted as specifying the presence of the mentioned features, integers, steps, or components, without excluding the presence or addition of one or more other features, integers, steps, or components, or groups thereof. Thus, the expression "a device comprising means A and B" should not be limited to a device consisting only of components A and B.
Reference in this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places in this specification do not necessarily all refer to the same embodiment, although they may. Furthermore, in one or more embodiments, the particular features, structures, or characteristics can be combined in any suitable manner, as will be apparent to those of ordinary skill in the art from this disclosure.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field to which the present application belongs. In case of inconsistency, the meaning stated in this specification, or the meaning derived from the content recorded in this specification, shall prevail. In addition, the terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
Before the specific embodiments of the present application are described in further detail, the nouns and terms involved in the embodiments of the present application, and their corresponding uses, roles, and functions in the present application, are explained; the nouns and terms involved in the embodiments of the present application are subject to the following interpretations:
Cliff effect: the cliff effect refers to the phenomenon in which bit-level errors spread within a video frame. A single bit-level error can cause the picture quality of the entire video frame to drop sharply, which means that even at the same packet loss rate, the picture quality differs enormously depending on whether the lost packets fall in the same frame or are spread evenly across multiple frames.
The prior art is first analyzed below:
Traditional end-to-end video transmission quality evaluation methods generally evaluate video transmission quality with indicators such as the mean square error (MSE), the peak signal-to-noise ratio (PSNR), and the structural similarity (SSIM). These traditional methods are introduced in detail below.
First, the method of evaluating video transmission quality with the mean square error is introduced. For a digital image, let f(m, n) be the original reference video image, f′(m, n) the video image degraded by transmission, M the length of the video image, and N the width of the video image, so that the size of the video image is M*N. Based on the above definitions, the mean square error of the image is
MSE = (1/(M*N)) * Σ_{m=1..M} Σ_{n=1..N} [f(m,n) − f′(m,n)]^2
Next, the method of evaluating video transmission quality with the peak signal-to-noise ratio is introduced. For a digital image, let A be the maximum gray value of the video image and MSE the mean square error of the video image; the peak signal-to-noise ratio of the image is then
PSNR = 10 * log10(A^2 / MSE)
MSE and PSNR are two simple image quality evaluation parameters specified by the ITU-R video transmission quality expert group, and they are also the two most commonly used. However, both parameters have certain limitations and often cannot accurately reflect subjective image quality: even when the values of these two parameters are high, the subjectively perceived image quality is not necessarily good, while images with good subjectively perceived quality do not necessarily have high values of these two parameters.
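By way of illustration only, the two formulas above can be sketched in a few lines of Python (NumPy-based; the array names and the 8-bit peak value A = 255 are illustrative assumptions, not part of the patent):

```python
import numpy as np

def mse(ref: np.ndarray, deg: np.ndarray) -> float:
    """Mean square error between the reference image and the degraded image."""
    diff = ref.astype(np.float64) - deg.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(ref: np.ndarray, deg: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio: PSNR = 10 * log10(A^2 / MSE)."""
    err = mse(ref, deg)
    return float("inf") if err == 0 else 10.0 * float(np.log10(peak ** 2 / err))
```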
Finally, the method of evaluating video transmission quality with the structural similarity is introduced. Given that the mean square error and the peak signal-to-noise ratio cannot reflect subjective image quality, traditional techniques also propose a method of evaluating video transmission quality based on the structural similarity. The structural similarity indicator fully considers the visual characteristics of the human eye: it compares the original reference video image and the transmission-degraded video image in terms of luminance, contrast, and structure, and computes an overall structural similarity value
SSIM = ((2*μx*μy + C1) * (2*σxy + C2)) / ((μx^2 + μy^2 + C1) * (σx^2 + σy^2 + C2))
where μx is the mean of the original reference video image, μy the mean of the transmission-degraded video image, σx the standard deviation of the original reference video image, σy the standard deviation of the transmission-degraded video image, σxy the covariance of the original reference video image and the transmission-degraded video image, and C1 and C2 constants defined for the stability of the numerical calculation.
The method of evaluating video transmission quality based on the structural similarity is an end-to-end evaluation method, and it is difficult to obtain the parameters required for the calculation during video transmission, such as μx, μy, σx, σy, and σxy.
Given the shortcomings of traditional end-to-end video transmission quality evaluation methods, current methods for evaluating video transmission quality are mostly based on indicators such as the packet error rate (PER), the block error rate (BLER), and the packet delay budget (PDB). However, these are all indirect indicators of video transmission quality perceivable on the radio access network (RAN) side and cannot evaluate video transmission quality accurately.
Next, a prior-art video transmission quality evaluation method based on XR video transmission is presented. Figure 2 shows the overall architecture of this scheme. The scheme uses IP packet granularity as the basic unit of evaluation: IP packets are obtained from the source and transmitted to the terminal over NR for video transmission quality evaluation. The scheme lacks media information from the source and visualization information from the terminal, so the quality evaluation of XR video transmission can only use indirect indicators obtainable on the RAN side, such as PER and PDB. However, because of the cliff effect, evaluating XR video transmission quality at IP packet granularity is not accurate: at the same packet loss rate, different packet loss positions affect the number of lost XR video frames very differently. Quality evaluation of XR video transmission therefore requires video-frame-level granularity, and this scheme provides no mechanism for converting from IP packet granularity to video frame granularity. Moreover, the scheme does not consider the video decoding dependency caused by inter-frame reference relationships between video frames. Because of the reference relationships between coding units in video coding, frame-level error propagation occurs, and different frame loss positions and different frame loss durations lead to different XR video interaction experiences. In addition, the interaction delay also affects the XR video interaction experience, and the scheme does not consider the delay-critical indicators perceivable on the RAN side.
Based on the study of the prior art and its shortcomings, an embodiment of the present application provides a video transmission quality evaluation method that can improve the accuracy of video transmission quality evaluation.
The embodiments of the present application are described in detail below with reference to the drawings. First, the scenarios to which the video transmission quality evaluation method based on video transmission provided by the embodiments of the present application applies are introduced.
The video transmission quality evaluation method based on video transmission provided by the embodiments of the present application can be applied in scenarios such as High Efficiency Video Coding (HEVC) and Scalable High-efficiency Video Coding (SHVC); the specific applications are described in detail in the subsequent embodiments. In addition, the method can guide operators in network construction by quantitatively measuring the impact of the network on the XR user experience, so as to drive network upgrades. It can also be used for locating and delimiting network problems, and for guiding network planning, network optimization, and the like.
A video transmission quality evaluation method based on video transmission provided by an embodiment of the present application is described below with reference to the figures. As shown in Figure 3, the method mainly includes steps S110 and S120, which are introduced in turn:
S110: A network element obtains the frame loss duration and the number of frame losses of video frames during video transmission.
Obtaining the frame loss duration of video frames during video transmission may include steps S210 to S230, as shown in Figure 4. Specifically:
S210: The network element obtains the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames. This embodiment provides two methods for implementing this step. Specifically:
Method 1: mark the data packets arriving at the core network within each period as data packets of the same video frame, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
the core network parses the mapping relationship from the data packets at the IP layer to the video frames and sends the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
Method 2: by means of cross-layer labeling, transfer the mapping relationship from a layer that has the mapping relationship from data packets to video frames down to the IP layer, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
send the mapping relationship from the data packets at the IP layer to the video frames to the core network for parsing;
send the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
Cross-layer labeling is the process of adding media information annotations from a higher layer to a lower layer. The media information may include the frame number, the size of the groups of pictures in the video, the type of each video frame (for example, I frame or P frame), and the like.
In this step, the gateway determines the packet loss situation at the PDCP layer and treats PDCP-layer packet loss as equivalent to IP-layer packet loss. The cases in which the gateway determines that packet loss has occurred at the PDCP layer include, but are not limited to: decoding errors, packet loss caused by the PDCP layer triggering the packet discard timer in the base station buffer, and packet loss caused by active packet dropping algorithms. The gateway includes, but is not limited to, a base station, or a router or AP in a Wi-Fi scenario.
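As a minimal sketch of Method 1 under stated assumptions (packets arrive as (packet_id, arrival_time) records and the frame period is known; neither data layout is specified by the patent):

```python
from typing import Dict, List, Tuple

def map_packets_to_frames(packets: List[Tuple[int, float]],
                          frame_period: float,
                          t0: float = 0.0) -> Dict[int, List[int]]:
    """Label all packets arriving within the same period as one video frame.

    packets: (packet_id, arrival_time) pairs as observed at the core network.
    frame_period: the inter-frame interval, roughly 1 / frame_rate.
    Returns {frame_index: [packet_ids]}, i.e. the IP-layer packet-to-frame mapping.
    """
    mapping: Dict[int, List[int]] = {}
    for pkt_id, arrival in packets:
        frame_idx = int((arrival - t0) // frame_period)
        mapping.setdefault(frame_idx, []).append(pkt_id)
    return mapping
```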
S220: The network element determines frame loss positions based on the mapping relationship from the data packets at the PDCP layer to the video frames. Specifically: if any data packet of the current video frame is lost, the video frame is regarded as lost. The number of lost data packets is not limited here; as long as the current video frame has any packet loss, the video frame is regarded as lost.
S230: The network element determines the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship. Specifically:
obtain, based on the frame loss positions, the frame loss start position in each group of pictures (GOP) of the video;
in each group of pictures, if the video frames have an inter-frame decoding reference relationship, all frames from the frame loss start position to the last frame of the group of pictures are regarded as being in the frame loss state; if the video frames have no inter-frame decoding reference relationship, only the frame loss positions are regarded as being in the frame loss state.
In this step, a video may include multiple groups of pictures, and one group of pictures may include multiple video frames.
In addition, obtaining the number of frame losses of video frames during video transmission may include the following step:
for each group of pictures, find the earliest frame loss position in the group; one stutter is formed within the time interval from that frame to the last frame of the group of pictures to which it belongs. However, if the first frame of the next group of pictures is also lost, the two consecutive frame losses are recorded as one frame loss, with the frame loss start position unchanged and the frame loss durations added together. Therefore, in two adjacent groups of pictures, if the last frame of the preceding group and the first frame of the following group are both in the frame loss state, the two frame losses are merged and recorded as one frame loss.
S120: The network element determines, based on the frame loss duration and the number of frame losses, a first indicator for evaluating video transmission quality.
This step may include steps S310 and S320, as shown in Figure 5. Specifically:
S310: Calculate the average of the frame loss durations.
In this step, the average frame loss duration is obtained by dividing the sum of the frame loss durations by the number of frame losses.
S320: Perform a weighted calculation on the average frame loss duration and the number of frame losses to obtain the first indicator for evaluating video transmission quality. The weighting coefficients of the average frame loss duration and the number of frame losses can be set according to the actual application.
This embodiment evaluates video transmission quality from the picture quality of the video transmission, in particular moving the evaluation of video transmission quality away from the traditional packet-granularity approach; it also provides a packet mapping method, so that video transmission quality can be evaluated directly from the network side to improve the accuracy of the evaluation.
Another embodiment of the present application further provides a method for determining the first indicator in step S120, wherein the first indicator is further determined according to the quality of the video frames lost during frame loss.
The quality of a video frame lost during frame loss is determined by the following formula:
f(j) = max(v1 * ln(v2 * j + v3) + v4, 0)
where f(j) is the quality of the j-th video frame lost during one frame loss event, v1, v2, v3, and v4 are all constants, and j is the index of the lost frame within one frame loss event, where j ∈ [1, L_i], and L_i is the frame loss duration of the i-th frame loss.
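A minimal sketch of this quality curve follows; the constants passed in the example are placeholders of ours, since the patent only states that v1 to v4 are constants:

```python
import math

def lost_frame_quality(j: int, v1: float, v2: float, v3: float, v4: float) -> float:
    """f(j) = max(v1 * ln(v2 * j + v3) + v4, 0) for the j-th lost frame, j in [1, L_i]."""
    return max(v1 * math.log(v2 * j + v3) + v4, 0.0)

# Placeholder constants chosen so that quality decays the longer the loss lasts:
curve = [lost_frame_quality(j, v1=-0.5, v2=1.0, v3=1.0, v4=2.0) for j in range(1, 9)]
```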
As shown in Figure 6, in another embodiment of the present application, the video transmission quality evaluation method based on video transmission further includes steps S410 and S420, which are introduced in turn:
S410: Obtain the packet arrival delay and the transmission delay of the video frames transmitted to the network element.
Obtaining the packet arrival delay of a video frame transmitted to the network element includes steps S510 to S530, which are introduced in turn:
S510: Obtain the time t1 at which the first data packet of the video frame arrives at the PDCP layer.
S520: Obtain the time t2 at which the last data packet of the video frame arrives at the PDCP layer.
S530: Take the difference between the time t2 at which the last data packet of the video frame arrives at the PDCP layer and the time t1 at which the first data packet of the video frame arrives at the PDCP layer, to obtain the packet arrival delay of the video frame transmitted to the network element.
In addition, obtaining the transmission delay includes steps S610 to S630, which are introduced in turn:
S610: Obtain the time t1 at which the first data packet of the video frame arrives at the PDCP layer.
S620: Obtain the time t3 at which the last data packet of the video frame has been sent.
S630: Take the difference between the time t3 at which the last data packet of the video frame has been sent and the time t1 at which the first data packet of the video frame arrives at the PDCP layer, to obtain the transmission delay of the video frame.
S420: Determine, based on the packet arrival delay and the transmission delay, a second indicator for evaluating video transmission quality. This step includes steps S710 to S730, which are introduced in turn:
S710: Calculate the average and the variance of the packet arrival delay;
S720: Calculate the average and the variance of the transmission delay;
S730: Perform a weighted calculation on the average of the packet arrival delay, the variance of the packet arrival delay, the average of the transmission delay, and the variance of the transmission delay, to obtain the second indicator for video transmission quality.
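A minimal sketch of steps S510 through S730, assuming per-frame timestamp records (t1, t2, t3) are collected at the PDCP layer; the record layout and the weighting coefficients are illustrative assumptions:

```python
from statistics import mean, pvariance
from typing import List, Tuple

def second_indicator(frames: List[Tuple[float, float, float]],
                     r1: float, r2: float, r3: float, r4: float) -> float:
    """frames holds per-frame (t1, t2, t3):
    t1 = first packet reaches the PDCP layer,
    t2 = last packet reaches the PDCP layer,
    t3 = last packet has been sent over the air interface.
    """
    arrival = [t2 - t1 for t1, t2, _ in frames]        # packet arrival delay T1
    transmission = [t3 - t1 for t1, _, t3 in frames]   # transmission delay T2
    t1_avg, d1_avg = mean(arrival), pvariance(arrival)
    t2_avg, d2_avg = mean(transmission), pvariance(transmission)
    return r1 * t1_avg + r2 * d1_avg + r3 * t2_avg + r4 * d2_avg
```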
This embodiment additionally considers an interaction delay evaluation method based on delay parameters perceivable on the radio access network (RAN) side, to improve the evaluation precision.
In another embodiment of the present application, a weighted calculation is performed on the first indicator of the above embodiment and the second indicator of the above embodiment, and the calculation result is used to evaluate the video transmission quality.
The technical solution of this embodiment establishes a video transmission quality evaluation system that considers both picture quality and the user's interaction experience, making the video transmission quality evaluation more reasonable.
The video transmission quality evaluation method provided by this embodiment can also be used to evaluate network quality, with the level of the video transmission quality indicator characterizing how good or poor the network quality is. The video transmission quality evaluation method based on video transmission provided by an embodiment of the present application is detailed below with reference to the figures, first taking the evaluation of XR video transmission services in an HEVC application scenario as an example. It should be noted here that the XR video transmission service is a burst service, so the video service has distinct periods; the quality evaluation of XR video transmission services therefore generally takes an observation period, for example 10 s, as its basic unit.
An embodiment of the video transmission quality evaluation method provided by the present application is described with reference to the architecture diagram shown in Figure 7. As shown in Figure 7, the method provided by this embodiment mainly consists of two parts: one part is picture quality evaluation based on frame loss position information, and the other part is evaluation based on RAN-side delay indicator information.
First, the picture quality evaluation based on frame loss position information is introduced in detail.
Step 1: The base station determines the packet loss situation at the PDCP layer and treats PDCP-layer packet loss as equivalent to IP-layer packet loss. The cases in which the base station (the gateway unit is exemplified by a base station in this embodiment) determines that packet loss has occurred at the PDCP layer include, but are not limited to: decoding errors, packet loss caused by the PDCP layer triggering the packet discard timer in the base station buffer, and packet loss caused by active packet dropping algorithms.
Step 2: Determine frame loss positions through the mapping relationship from data packets at the PDCP layer to video frames.
Specifically: first obtain the mapping relationship from data packets at the IP layer to video frames, then transmit this mapping relationship to the core network for parsing and send the parsed mapping relationship to the PDCP layer, thereby obtaining the mapping relationship from data packets at the PDCP layer to video frames. The mapping relationship indicates which data packets a video frame contains within one observation period, that is, to which video frame a given data packet belongs within the observation period. For example: within one observation period, data packet No. 1 belongs to video frame No. 1. In this step, the mapping relationship from data packets at the IP layer to video frames can be obtained in the following ways:
The first way is to mark the data packets arriving at the IP layer within each period as data packets of the same video frame, thereby obtaining the mapping relationship from data packets at the IP layer to video frames.
The second way is to obtain the mapping relationship from data packets at the IP layer to video frames through cross-layer labeling. The lower layers cannot know which data packets belong to the same frame, but the higher layers can. Cross-layer labeling means transferring the mapping relationship from a higher layer that has the packet-to-frame mapping down to the IP layer, for example from the Real-time Transport Protocol (RTP) layer, so that the IP layer obtains the mapping relationship from data packets at the IP layer to video frames.
After the mapping relationship from data packets at the PDCP layer to video frames is obtained, frame loss positions are determined based on this mapping relationship. Specifically: if any data packet of the current video frame is lost, the video frame is regarded as lost, regardless of the number of lost data packets, and the frame loss position of the video frame is recorded. Whether a data packet has been lost can be judged by whether the packet numbers of the data packets in the video frame are consecutive.
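Continuing the mapping sketch above, packet-number continuity within a frame can be checked as follows (a simplification of ours: it only catches gaps interior to a frame's received packets, and a frame whose packets were all lost never appears in the mapping):

```python
from typing import Dict, List

def lost_frames(mapping: Dict[int, List[int]]) -> List[int]:
    """Return the indices of frames whose received packet numbers have a gap.

    mapping: {frame_index: [received packet_ids]}, with packet ids assigned
    consecutively at the sender. A frame counts as lost if any packet is
    missing, regardless of how many.
    """
    lost = []
    for frame_idx, pkt_ids in sorted(mapping.items()):
        ids = sorted(pkt_ids)
        if len(ids) != ids[-1] - ids[0] + 1:  # ids should run consecutively
            lost.append(frame_idx)
    return lost
```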
Step 3: After the frame loss positions are determined, the frame loss duration is determined based on the frame loss positions and the inter-frame decoding reference relationship. First, the inter-frame reference relationship of this embodiment is described. As shown in Figure 8, in an XR video transmission service whose source coding scheme is HEVC, video frames are encoded as I frames and P frames. Referring to Figure 8, the arrows indicate the inter-frame reference relationships: during encoding, a later P frame references an earlier I frame or P frame. An I frame can be decoded independently, whereas a P frame needs its reference frames to assist decoding. In a group of pictures (GOP), the first frame (an I frame) or an Instantaneous Decoding Refresh (IDR) frame can be decoded independently. From the above inter-frame decoding reference relationship, if an error occurs when decoding a received I frame or P frame, all inter-frame decoding will be erroneous until the next group of pictures or IDR refresh frame arrives. Therefore, in a group of pictures, if the video frames have an inter-frame decoding reference relationship, all frames from the frame loss position to the last frame of the group of pictures are regarded as being in the frame loss state. For example: suppose the frame length of a group of pictures is 8, the first frame of the group can be decoded independently, and each remaining frame is decoded with reference to its preceding frame. Within one observation period, using 1 to indicate that the frame at a position is not lost and 0 to indicate that it is lost, the record 11110011|011... shows that, in the first group of pictures, the 5th frame is the frame loss start position; because the video frames in this group have an inter-frame decoding reference relationship, the frame loss duration of the first group of pictures is 4 (frames 5, 6, 7, and 8 are all lost). The record also shows that, in the second group of pictures, the frame loss start position is the first frame; because this group has an inter-frame decoding reference relationship, its frame loss duration is 8 (frames 1 to 8 are all lost). In addition, if a video sequence consists entirely of I frames, the size of a group of pictures is one frame.
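The expansion rule just described can be sketched over a received-frame bitmap like the 11110011|011... record above (1 = frame received, 0 = frame lost); the list encoding is our own:

```python
from typing import List, Tuple

def gop_loss(received: List[int], gop_size: int,
             inter_frame_refs: bool = True) -> List[Tuple[int, int]]:
    """Return (loss_start_index, loss_duration) for every GOP with a loss.

    With inter-frame references, every frame from the first lost frame to the
    end of the GOP counts as lost; without them, only the lost positions count.
    """
    events = []
    for g in range(0, len(received), gop_size):
        gop = received[g:g + gop_size]
        if 0 not in gop:
            continue
        start = gop.index(0)  # earliest loss within this GOP
        duration = len(gop) - start if inter_frame_refs else gop.count(0)
        events.append((g + start, duration))
    return events

# First GOP of the record in the text: loss starts at frame 5 (index 4), duration 4.
events = gop_loss([1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1], gop_size=8)
```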
Step 4: After the frame loss duration is determined, the number of frame losses, that is, the number of stutters, is determined. Figure 9 shows a stutter within one group of pictures. In a group of pictures, the first frame loss position S_i (the frame loss start position) is found; one stutter is formed within the time from that position to the last frame of the group of pictures. If a frame loss occurs at the very first frame of the next group of pictures, this embodiment records the two consecutive frame losses as one, with the frame loss start position unchanged and the frame loss durations added together. As shown in Figure 10, the shaded parts indicate the stutter state; in adjacent groups of pictures, if the last frame of the preceding group and the first frame of the following group are both in the frame loss state, the two frame losses are merged and recorded as one frame loss, that is, one stutter.
Step 5: Calculate the average frame loss duration Savg = (L_1 + L_2 + ... + L_i + ... + L_n) / n from the frame loss durations L_i obtained in step 3 and the number of frame losses n obtained in step 4.
From the average frame loss duration and the number of frame losses, the first indicator for video transmission quality is calculated as f1 = w1 * Savg + w2 * n, where w1 and w2 are weighting coefficients.
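Continuing the sketch, the per-GOP loss events can be merged across GOP boundaries as in step 4 and then weighted into f1; the weighting coefficients remain application-specific:

```python
from typing import List, Tuple

def merge_events(events: List[Tuple[int, int]], gop_size: int) -> List[Tuple[int, int]]:
    """Merge a loss reaching the end of one GOP with a loss starting at the
    first frame of the next GOP: one perceived stutter, durations added."""
    merged: List[Tuple[int, int]] = []
    for start, duration in events:
        if merged:
            prev_start, prev_dur = merged[-1]
            prev_end = prev_start + prev_dur  # index just past the previous loss
            if prev_end % gop_size == 0 and start == prev_end:
                merged[-1] = (prev_start, prev_dur + duration)
                continue
        merged.append((start, duration))
    return merged

def first_indicator(events: List[Tuple[int, int]], w1: float, w2: float) -> float:
    """f1 = w1 * Savg + w2 * n over the merged loss events."""
    n = len(events)
    savg = sum(d for _, d in events) / n if n else 0.0
    return w1 * savg + w2 * n
```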
In the HEVC application scenario of this embodiment, the first indicator in step 5 may also be determined from the quality of the video frames lost during frame loss. The quality of a video frame lost during frame loss is determined by the following formula:
f(j) = max(v1 * ln(v2 * j + v3) + v4, 0)
where f(j) is the quality of the j-th video frame lost during one frame loss event, v1, v2, v3, and v4 are all constants, and j is the index of the lost frame within one frame loss event, where j ∈ [1, L_i], and L_i is the frame loss duration of the i-th frame loss.
Based on the quality f(j) of the j-th video frame lost during one frame loss event, the frame loss duration L_i obtained in step 3, and the number of frame losses n obtained in step 4, the first indicator for video transmission quality is calculated by the formula given in the original as an equation image (PCTCN2021073091-appb-000004), which combines f(j) over j ∈ [1, L_i] across the n frame loss events,
where C1 and C2 are both constants, and i denotes the i-th frame loss.
Next, the evaluation based on RAN-side delay indicator information is introduced in detail.
With reference to Figures 11 and 12: for the base station, one video frame includes multiple data packets. Here, a video frame including 5 data packets is taken as an example; that is, the 5 small rectangles shown in Figures 11 and 12 each represent one data packet. As shown in Figures 11 and 12, the time t1 at which the first packet of a video frame arrives at the PDCP layer of the base station is recorded, and the time t2 at which the last packet of the video frame arrives at the base station is recorded. After the data packets of the video frame arrive at the base station, they wait in the buffer to be scheduled; once scheduled, they are sent over the air interface, and the time t3 at which all data packets of the video frame have been sent out, that is, the time at which the last packet of the video frame is sent out, is recorded. The packet arrival delay of the video frame transmitted to the base station is then calculated as T1 = t2 - t1. This quantity represents the time interval between the first packet of the video frame arriving at the PDCP-layer buffer and the last packet arriving at the PDCP-layer buffer, and characterizes the stability of the packet inflow at the base station: if T1 is small, for example 2 ms, the base station considers the packet inflow fairly stable; if T1 is large, for example 10 ms, the base station considers the packet jitter large and the packet inflow unstable. Based on the packet arrival delay T1 of the video frame transmitted to the base station, misjudging the RAN-side video transmission quality because of a service interruption caused by a fault in an upstream network element node can be ruled out. Next, the transmission delay of the video frame transmitted to the base station is calculated as T2 = t3 - t1. This quantity represents the time interval between the first packet of the video frame arriving at the PDCP-layer buffer and the last packet completing scheduling and being sent out, and characterizes the video transmission quality: if T2 is small, the video transmission quality is considered good; if T2 is large, the video transmission quality is considered poor.
Then, the average T1avg and variance D1avg of the packet arrival delay T1 of the video frames transmitted to the base station are calculated, and the average T2avg and variance D2avg of the transmission delay T2 are calculated, where T1avg = (T1_1 + T1_2 + ... + T1_m) / m and T2avg = (T2_1 + T2_2 + ... + T2_m) / m; in these formulas, T1_m is the packet arrival delay of the m-th video frame transmitted to the base station, T2_m is the transmission delay of the m-th video frame transmitted to the base station, and m is the total number of video frames.
From the average T1avg of the packet arrival delay, the variance D1avg of the packet arrival delay, the average T2avg of the transmission delay, and the variance D2avg of the transmission delay, the second indicator for video transmission quality is calculated as f2 = r1 * T1avg + r2 * D1avg + r3 * T2avg + r4 * D2avg, where r1, r2, r3, and r4 are weighting coefficients.
Based on the first indicator f1 and the second indicator f2 obtained above, XQI = k1 * f1 + k2 * f2 is calculated, and XQI is used to evaluate the video transmission quality.
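Putting the two indicators together (k1 and k2 are again weighting coefficients that the patent leaves application-specific):

```python
def xqi(f1: float, f2: float, k1: float, k2: float) -> float:
    """Overall quality index XQI = k1 * f1 + k2 * f2."""
    return k1 * f1 + k2 * f2

# e.g., combining the earlier sketches:
# score = xqi(first_indicator(merge_events(events, 8), w1, w2),
#             second_indicator(frames, r1, r2, r3, r4), k1, k2)
```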
Next, taking the evaluation of XR video transmission services in an SHVC application scenario as an example, a video transmission quality evaluation method provided by an embodiment of the present application is detailed with reference to the figures. The method provided by this embodiment is basically the same as the previous embodiment, differing only in the picture quality evaluation based on frame loss position information; the evaluation based on RAN-side delay indicator information is therefore not repeated.
The picture quality evaluation based on frame loss position information in the SHVC application scenario provided by this embodiment is introduced in detail below.
In the SHVC application scenario, a video frame is divided into one base layer (BL) and multiple enhancement layers (EL). For convenience of description, this embodiment takes one base layer and one enhancement layer as an example.
Step 1: The base station determines the PDCP-layer packet loss situation in the base layer and in the enhancement layer, and treats the PDCP-layer packet loss of the base layer and the enhancement layer as equivalent to IP-layer packet loss. The cases in which the base station determines that packet loss has occurred include, but are not limited to: decoding errors, packet loss caused by the PDCP layer triggering the packet discard timer in the base station buffer, and packet loss caused by active packet dropping algorithms.
Step 2: Determine the frame loss positions of the base layer and of the enhancement layer through the mapping relationships from data packets at the PDCP layer to the base layer and the enhancement layer of the video frames. For the specific implementation, refer to step 2 of the previous embodiment.
Step 3: After the frame loss positions are determined, the frame loss duration is determined based on the frame loss positions and the inter-frame decoding reference relationship. First, the inter-frame reference relationship of this embodiment is described. As shown in Figure 13, in an XR video transmission service whose source coding scheme is SHVC, video frames are encoded as I frames and P frames. Referring to Figure 13, the arrows indicate the inter-frame reference relationships: during encoding, within the same layer, a later P frame references an earlier I frame or P frame; for inter-layer reference, the enhancement layer references the P frame or I frame of the base layer. An I frame can be decoded independently, whereas a P frame needs its reference frames to assist decoding. The base layer of the first frame of a group of pictures or of an IDR refresh frame can be decoded independently. From the above inter-frame decoding reference relationship, if an error occurs when decoding a received I frame or P frame, all inter-frame decoding will be erroneous until the next group of pictures or IDR refresh frame arrives. Unlike the HEVC scenario, in the SHVC scenario a decoding error in the base layer causes the enhancement layer of the same frame to fail to decode. Based on the above, the decoding reference relationship in the SHVC scenario can be summarized as follows: when the base layer loses a frame, the base layer is regarded as entirely lost from that frame loss start position to the last frame of the group of pictures, and the enhancement layer is likewise regarded as entirely lost from that frame loss start position to the last frame of the group of pictures; if the enhancement layer loses a frame, the enhancement layer is regarded as entirely lost from the frame loss start position to the last frame of the group of pictures. When the enhancement layer is present, the quality of the video frame is the quality of the enhancement layer; if the enhancement layer is lost but the base layer is present, the quality of the video frame is the quality of the base layer; if both the enhancement layer and the base layer are lost, frame loss is triggered and the video stutters. It should also be noted that if the base layer loses a frame, the enhancement layer necessarily loses that frame as well; the enhancement-layer frame loss recorded in this embodiment therefore refers only to the case where the enhancement layer loses a frame while the base layer does not, and the frame loss duration refers to the duration of the enhancement-layer frame loss. For multiple enhancement layers, the frame loss start position of each enhancement layer and the enhancement-layer frame loss duration of each layer are recorded. As shown in Figure 14, when the base layer (BL) first loses a frame, the corresponding enhancement layer (EL) is also recorded as being in the frame loss state at the same frame; in addition, for the enhancement layer, the frame loss start position is earlier than the first frame loss position of the base layer, so the first frame loss position S_iEL of the enhancement layer must also be recorded. For the base layer, the frame loss duration is L_iBL; for the enhancement layer, the frame loss duration is L_iBL + L_iEL.
Step 4 of this embodiment is the same as step 4 of the above embodiment and is therefore not repeated here.
Step 5: Calculate the average frame loss duration and the number of frame losses per layer.
For example: the average frame loss duration of the enhancement layer is S_ELavg = (L_1EL + L_2EL + ... + L_n′EL) / n′;
the average frame loss duration of the base layer is S_BLavg = (L_1BL + L_2BL + ... + L_n″BL) / n″;
where n′ is the number of enhancement-layer frame losses and n″ is the number of base-layer frame losses.
From the average frame loss duration and the number of frame losses of the enhancement layer and the frame loss duration and the number of frame losses of the base layer, the first indicator for video transmission quality is calculated as f1′ = w1 * S_ELavg + w2 * n′ + w3 * S_BLavg + w4 * n″, where w1, w2, w3, and w4 are weighting coefficients.
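A minimal sketch of this layered step 5, assuming the per-event durations of base-layer losses and enhancement-layer-only losses have already been recorded as in step 3 (the list encoding is ours):

```python
from typing import List

def shvc_first_indicator(el_durations: List[int], bl_durations: List[int],
                         w1: float, w2: float, w3: float, w4: float) -> float:
    """f1' = w1 * S_ELavg + w2 * n' + w3 * S_BLavg + w4 * n''.

    el_durations: durations of the n' enhancement-layer-only losses (BL intact).
    bl_durations: durations of the n'' base-layer losses.
    """
    n_el, n_bl = len(el_durations), len(bl_durations)
    s_el_avg = sum(el_durations) / n_el if n_el else 0.0
    s_bl_avg = sum(bl_durations) / n_bl if n_bl else 0.0
    return w1 * s_el_avg + w2 * n_el + w3 * s_bl_avg + w4 * n_bl
```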
In the SHVC application scenario of this embodiment, the first indicator in step 5 may also be determined according to the quality of the video frames lost during frame loss. The quality of a video frame lost during frame loss is determined by the following formula:
f(j) = max(v1 * ln(v2 * j + v3) + v4, 0)
where f(j) is the quality of the j-th video frame lost during one frame loss event, v1, v2, v3, and v4 are all constants, and j is the index of the lost frame within one frame loss event, where j ∈ [1, L_iBL], and L_iBL is the frame loss duration of the i-th base-layer frame loss.
Based on the quality f(j) of the j-th video frame lost during one frame loss event, the frame loss duration L_iBL of the i-th base-layer frame loss, the frame loss duration L_iEL of the i-th enhancement-layer frame loss, the number of base-layer frame losses n″, and the number of enhancement-layer frame losses n′, the first indicator for video transmission quality is calculated by the formula given in the original as an equation image (PCTCN2021073091-appb-000005), which combines f(j) with L_iBL, L_iEL, n′, and n″,
where C1, C2, and C3 are all constants, and i denotes the i-th frame loss.
The video transmission quality evaluation method provided by the above embodiments comprehensively considers evaluation based on both picture quality and interaction delay. This scheme breaks through the current situation in which existing video transmission quality evaluation methods evaluate purely at packet granularity, and can be used for locating and delimiting network problems, among other uses.
A video transmission quality evaluation apparatus based on video transmission quality provided by an embodiment of the present application is described below with reference to Figure 15. As shown in Figure 15, the apparatus includes:
a first obtaining module 810, configured to obtain the frame loss duration and the number of frame losses of video frames during video transmission;
a first determining module 820, configured to determine, based on the frame loss duration and the number of frame losses, a first indicator for evaluating video transmission quality.
The apparatus further includes:
a second obtaining module 830, configured to obtain the packet arrival delay and the transmission delay of the video frames transmitted to a network element;
a second determining module 840, configured to determine, based on the packet arrival delay and the transmission delay, a second indicator for evaluating video transmission quality.
The apparatus further includes:
an evaluation module 850, configured to perform a weighted calculation on the first indicator and the second indicator and use the calculation result to evaluate the video transmission quality.
The first obtaining module 810 includes:
a first obtaining unit 8110, configured to obtain the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames;
a first determining unit 8120, configured to determine frame loss positions based on the mapping relationship from the data packets at the PDCP layer to the video frames;
a second determining unit 8130, configured to determine the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship.
The first obtaining unit is specifically configured to:
mark the data packets arriving at the core network within each period as data packets of the same video frame, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
the core network parses the mapping relationship from the data packets at the IP layer to the video frames and sends the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
Alternatively, the first obtaining unit is specifically configured to:
transfer, by means of cross-layer labeling, the mapping relationship from a layer that has the mapping relationship from data packets to video frames down to the IP layer, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
send the mapping relationship from the data packets at the IP layer to the video frames to the core network for parsing;
send the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
The first obtaining unit 8110 includes:
a first sending subunit 8111, configured to send the mapping relationship from the data packets at the IP layer to the video frames to the core network for parsing;
a second sending subunit 8112, configured to send the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
The first determining unit 8120 is specifically configured to: if any data packet of the current video frame is lost, regard the video frame as lost.
The second determining unit 8130 includes:
a first obtaining subunit 8131, configured to obtain, based on the frame loss positions, the frame loss start position in each group of pictures of the video;
a first determining subunit 8132, configured to, in each group of pictures, if the video frames have an inter-frame decoding reference relationship, regard all frames from the frame loss start position to the last frame of the group of pictures as being in the frame loss state.
Alternatively, the second determining unit 8130 includes:
a second obtaining subunit 8133, configured to obtain, based on the frame loss positions, the frame loss start position in each group of pictures of the video;
a second determining subunit 8134, configured to, in each group of pictures, if the video frames have no inter-frame decoding reference relationship, regard only the frame loss positions as being in the frame loss state.
The first obtaining module 810 further includes:
a third determining unit 8140, configured to, in two adjacent groups of pictures, if the last frame of the preceding group of pictures and the first frame of the following group of pictures are both in the frame loss state, merge the two frame losses and record them as one frame loss.
The first indicator is further determined according to the quality of the video frames lost during frame loss.
The quality of a video frame lost during frame loss is determined by the following formula:
f(j) = max(v1 * ln(v2 * j + v3) + v4, 0)
where f(j) is the quality of the j-th video frame lost during one frame loss event, v1, v2, v3, and v4 are all constants, and j is the index of the lost frame within one frame loss event, where j ∈ [1, L_i], and L_i is the frame loss duration of the i-th frame loss.
The second obtaining module 830 includes:
a second obtaining unit 8310, configured to obtain the time t1 at which the first data packet of a video frame arrives at the PDCP layer;
a third obtaining unit 8320, configured to obtain the time t2 at which the last data packet of the video frame arrives at the PDCP layer;
a third calculating unit 8330, configured to take the difference between the time t2 at which the last data packet of the video frame arrives at the PDCP layer and the time t1 at which the first data packet of the video frame arrives at the PDCP layer, to obtain the packet arrival delay of the video frame transmitted to the network element.
The second obtaining module 830 further includes:
a fourth obtaining unit 8340, configured to obtain the time t1 at which the first data packet of a video frame arrives at the PDCP layer;
a fifth obtaining unit 8350, configured to obtain the time t3 at which the last data packet of the video frame has been sent;
a fourth calculating unit 8360, configured to take the difference between the time t3 at which the last data packet of the video frame has been sent and the time t1 at which the first data packet of the video frame arrives at the PDCP layer, to obtain the transmission delay of the video frame.
The second determining module 840 includes:
a fifth calculating unit 8410, configured to calculate the average and the variance of the packet arrival delay;
a sixth calculating unit 8420, configured to calculate the average and the variance of the transmission delay;
a seventh calculating unit 8430, configured to perform a weighted calculation on the average of the packet arrival delay, the variance of the packet arrival delay, the average of the transmission delay, and the variance of the transmission delay, to obtain the second indicator for video transmission quality.
Figure 16 is a schematic structural diagram of a computing device 1500 provided by an embodiment of the present application. The computing device 1500 includes: a processor 1510, a memory 1520, a communication interface 1530, and a bus 1540.
It should be understood that the communication interface 1530 in the computing device 1500 shown in Figure 16 can be used to communicate with other devices.
The processor 1510 may be connected to the memory 1520. The memory 1520 may be used to store program code and data. Therefore, the memory 1520 may be a storage unit inside the processor 1510, an external storage unit independent of the processor 1510, or a component including both an internal storage unit and an independent external storage unit.
Optionally, the computing device 1500 may further include a bus 1540. The memory 1520 and the communication interface 1530 may be connected to the processor 1510 through the bus 1540. The bus 1540 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 1540 can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one line is shown in Figure 16, but this does not mean that there is only one bus or one type of bus.
It should be understood that, in this embodiment of the present application, the processor 1510 may be a central processing unit (CPU). The processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. Alternatively, the processor 1510 may use one or more integrated circuits to execute related programs, so as to implement the technical solutions provided by the embodiments of the present application.
The memory 1520 may include read-only memory and random access memory, and provides instructions and data to the processor 1510. A portion of the processor 1510 may also include non-volatile random access memory. For example, the processor 1510 may also store device type information.
When the computing device 1500 runs, the processor 1510 executes the computer-executable instructions in the memory 1520 to perform the operation steps of the above methods.
It should be understood that the computing device 1500 according to this embodiment of the present application may correspond to the corresponding execution subjects of the methods according to the embodiments of the present application, and the above-mentioned and other operations and/or functions of the modules in the computing device 1500 are respectively intended to implement the corresponding processes of the methods of the embodiments; for brevity, they are not repeated here.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of the present application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical function division, and in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, or as indirect coupling or communication connection between apparatuses or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program is used to perform a video transmission quality evaluation method including at least one of the solutions described in the above embodiments.
The computer storage medium of the embodiments of the present application may adopt any combination of one or more computer-readable media. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. The computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical cable, RF, and the like, or any suitable combination of the above.
Computer program code for carrying out the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present application. Therefore, although the present application has been described in some detail through the above embodiments, the present application is not limited to the above embodiments and, without departing from the concept of the present application, may include more other equivalent embodiments, all of which fall within the protection scope of the present application.

Claims (28)

  1. A video transmission quality evaluation method, characterized by comprising:
    obtaining the frame loss duration and the number of frame losses of video frames during video transmission;
    determining, based on the frame loss duration and the number of frame losses, a first indicator for evaluating video transmission quality.
  2. The method according to claim 1, characterized by further comprising:
    obtaining the packet arrival delay and the transmission delay of the video frames transmitted to a network element; and determining, based on the packet arrival delay and the transmission delay, a second indicator for evaluating video transmission quality;
    wherein the packet arrival delay is the difference between the time at which the last data packet of the video frame arrives at the PDCP layer and the time at which the first data packet of the video frame arrives at the PDCP layer, and the transmission delay is the difference between the time at which the last data packet of the video frame has been sent and the time at which the first data packet of the video frame arrives at the PDCP layer.
  3. The method according to claim 2, characterized by further comprising:
    performing a weighted calculation on the first indicator and the second indicator, and using the calculation result to evaluate the video transmission quality.
  4. The method according to claim 1, characterized in that obtaining the frame loss duration of video frames during video transmission comprises:
    obtaining the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames;
    determining frame loss positions based on the mapping relationship from the data packets at the PDCP layer to the video frames; and
    determining the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship.
  5. The method according to claim 4, characterized in that obtaining the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames comprises:
    marking the data packets arriving at the core network within each period as data packets of the same video frame, so as to obtain the mapping relationship from data packets at the IP layer to the video frames; and
    receiving, by the PDCP layer, the mapping relationship obtained by parsing the mapping relationship from the data packets at the IP layer to the video frames, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
  6. The method according to claim 4, characterized in that obtaining the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames comprises:
    transferring, by means of cross-layer labeling, the mapping relationship from a layer having the mapping relationship from data packets to video frames down to the IP layer, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
    sending the mapping relationship from the data packets at the IP layer to the video frames to the core network for parsing; and
    sending the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
  7. The method according to claim 4, characterized in that, if any data packet of the current video frame is lost, the video frame is regarded as lost.
  8. The method according to claim 4, characterized in that determining the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship comprises:
    obtaining, based on the frame loss positions, the frame loss start position in each group of pictures of the video; and
    in each group of pictures, if the video frames have an inter-frame decoding reference relationship, regarding all frames from the frame loss start position to the last frame of the group of pictures as being in the frame loss state.
  9. The method according to claim 4, characterized in that determining the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship comprises:
    obtaining, based on the frame loss positions, the frame loss start position in each group of pictures of the video; and
    in each group of pictures, if the video frames have no inter-frame decoding reference relationship, regarding only the frame loss positions as being in the frame loss state.
  10. The method according to claim 8 or 9, characterized in that obtaining the number of frame losses of video frames during video transmission comprises:
    in two adjacent groups of pictures, if the last frame of the preceding group of pictures and the first frame of the following group of pictures are both in the frame loss state, merging the two frame losses and recording them as one frame loss.
  11. The method according to claim 1, characterized in that the first indicator is further determined according to the quality of the video frames lost during frame loss.
  12. The method according to claim 11, characterized in that the quality of a video frame lost during frame loss is determined by the following formula:
    f(j) = max(v1 * ln(v2 * j + v3) + v4, 0)
    where f(j) is the quality of the j-th video frame lost during one frame loss event, v1, v2, v3, and v4 are all constants, and j is the index of the lost frame within one frame loss event, where j ∈ [1, L_i], and L_i is the frame loss duration of the i-th frame loss.
  13. The method according to claim 2, characterized in that determining, based on the packet arrival delay and the transmission delay, the second indicator for evaluating video transmission quality comprises:
    calculating the average and the variance of the packet arrival delay;
    calculating the average and the variance of the transmission delay; and
    performing a weighted calculation on the average of the packet arrival delay, the variance of the packet arrival delay, the average of the transmission delay, and the variance of the transmission delay to obtain the second indicator for video transmission quality.
  14. A video transmission quality evaluation apparatus based on video transmission, characterized by comprising:
    a first obtaining module, configured to obtain the frame loss duration and the number of frame losses of video frames during video transmission; and
    a first determining module, configured to determine, based on the frame loss duration and the number of frame losses, a first indicator for evaluating video transmission quality.
  15. The apparatus according to claim 14, characterized by further comprising:
    a second obtaining module, configured to obtain the packet arrival delay and the transmission delay of the video frames transmitted to a network element; and
    a second determining module, configured to determine, based on the packet arrival delay and the transmission delay, a second indicator for evaluating video transmission quality;
    wherein the packet arrival delay is the difference between the time at which the last data packet of the video frame arrives at the PDCP layer and the time at which the first data packet of the video frame arrives at the PDCP layer, and the transmission delay is the difference between the time at which the last data packet of the video frame has been sent and the time at which the first data packet of the video frame arrives at the PDCP layer.
  16. The apparatus according to claim 15, characterized by further comprising:
    an evaluation module, configured to perform a weighted calculation on the first indicator and the second indicator and use the calculation result to evaluate the video transmission quality.
  17. The apparatus according to claim 14, characterized in that the first obtaining module comprises:
    a first obtaining unit, configured to obtain the mapping relationship from data packets at the PDCP layer to the video frames through the mapping relationship from data packets at the IP layer to the video frames;
    a first determining unit, configured to determine frame loss positions based on the mapping relationship from the data packets at the PDCP layer to the video frames; and
    a second determining unit, configured to determine the frame loss duration according to the frame loss positions and the inter-frame decoding reference relationship.
  18. The apparatus according to claim 17, characterized in that the first obtaining unit is specifically configured to:
    mark the data packets arriving at the core network within each period as data packets of the same video frame, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
    wherein the PDCP layer receives the mapping relationship obtained by parsing the mapping relationship from the data packets at the IP layer to the video frames, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
  19. The apparatus according to claim 17, characterized in that the first obtaining unit is specifically configured to:
    transfer, by means of cross-layer labeling, the mapping relationship from a layer having the mapping relationship from data packets to video frames down to the IP layer, so as to obtain the mapping relationship from data packets at the IP layer to the video frames;
    send the mapping relationship from the data packets at the IP layer to the video frames to the core network for parsing; and
    send the parsed mapping relationship to the PDCP layer, so as to obtain the mapping relationship from the data packets at the PDCP layer to the video frames.
  20. The apparatus according to claim 17, characterized in that, if any data packet of the current video frame is lost, the video frame is regarded as lost.
  21. The apparatus according to claim 17, characterized in that the second determining unit comprises:
    a first obtaining subunit, configured to obtain, based on the frame loss positions, the frame loss start position in each group of pictures of the video; and
    a first determining subunit, configured to, in each group of pictures, if the video frames have an inter-frame decoding reference relationship, regard all frames from the frame loss start position to the last frame of the group of pictures as being in the frame loss state.
  22. The apparatus according to claim 17, characterized in that the second determining unit comprises:
    a second obtaining subunit, configured to obtain, based on the frame loss positions, the frame loss start position in each group of pictures of the video; and
    a second determining subunit, configured to, in each group of pictures, if the video frames have no inter-frame decoding reference relationship, regard only the frame loss positions as being in the frame loss state.
  23. The apparatus according to claim 21 or 22, characterized in that the first obtaining module further comprises:
    a third determining unit, configured to, in two adjacent groups of pictures, if the last frame of the preceding group of pictures and the first frame of the following group of pictures are both in the frame loss state, merge the two frame losses and record them as one frame loss.
  24. The apparatus according to claim 14, characterized in that the first indicator is further determined according to the quality of the video frames lost during frame loss.
  25. The apparatus according to claim 24, characterized in that the quality of a video frame lost during frame loss is determined by the following formula:
    f(j) = max(v1 * ln(v2 * j + v3) + v4, 0)
    where f(j) is the quality of the j-th video frame lost during one frame loss event, v1, v2, v3, and v4 are all constants, and j is the index of the lost frame within one frame loss event, where j ∈ [1, L_i], and L_i is the frame loss duration of the i-th frame loss.
  26. The apparatus according to claim 15, characterized in that the second determining module is specifically configured to:
    calculate the average and the variance of the packet arrival delay;
    calculate the average and the variance of the transmission delay; and
    perform a weighted calculation on the average of the packet arrival delay, the variance of the packet arrival delay, the average of the transmission delay, and the variance of the transmission delay to obtain the second indicator for video transmission quality.
  27. A computing device, characterized by comprising:
    a bus;
    a communication interface connected to the bus;
    at least one processor connected to the bus; and
    at least one memory connected to the bus and storing program instructions that, when executed by the at least one processor, cause the at least one processor to execute the video transmission quality evaluation method based on video transmission according to any one of claims 1 to 13.
  28. A computer-readable storage medium having program instructions stored thereon, characterized in that the program instructions, when executed by a computer, cause the computer to execute the video transmission quality evaluation method based on video transmission according to any one of claims 1 to 13.
PCT/CN2021/073091 2021-01-21 2021-01-21 Method and apparatus for evaluating video transmission quality WO2022155844A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/073091 WO2022155844A1 (zh) 2021-01-21 2021-01-21 Method and apparatus for evaluating video transmission quality

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/073091 WO2022155844A1 (zh) 2021-01-21 2021-01-21 Method and apparatus for evaluating video transmission quality

Publications (1)

Publication Number Publication Date
WO2022155844A1 true WO2022155844A1 (zh) 2022-07-28

Family

ID=82548305

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/073091 WO2022155844A1 (zh) 2021-01-21 2021-01-21 Method and apparatus for evaluating video transmission quality

Country Status (1)

Country Link
WO (1) WO2022155844A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2482558A1 (en) * 2009-11-03 2012-08-01 Huawei Technologies Co., Ltd. Method, apparatus and system for evaluation of video quality
CN104811694A (zh) * 2015-04-28 2015-07-29 Huawei Technologies Co., Ltd. Method and apparatus for evaluating video data quality
CN105100508A (zh) * 2014-05-05 2015-11-25 Huawei Technologies Co., Ltd. Network voice quality evaluation method, apparatus, and system
CN107734360A (zh) * 2017-09-15 2018-02-23 Shenzhen Infinova Technology Co., Ltd. Control method and apparatus for a streaming media server
CN109144858A (zh) * 2018-08-02 2019-01-04 Tencent Technology (Beijing) Co., Ltd. Fluency detection method, apparatus, computing device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21920247

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21920247

Country of ref document: EP

Kind code of ref document: A1