CN113747229A - Slice size prediction and adaptive code rate control method, system and medium

Info

Publication number: CN113747229A
Application number: CN202110885078.6A
Authority: CN
Inventors: 宋利; 袁靖昊; 解蓉; 张文军
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2021-08-03
Filing date: 2021-08-03
Publication date: 2021-12-03
Anticipated expiration: 2041-08-03
Also published as: CN113747229B

Abstract

The invention provides a method, a system and a medium for slice size prediction and adaptive code rate control. The method comprises the following steps: S1, providing two download strategies for the first Start_Num slices, namely optimizing buffer precision and optimizing startup delay, so as to trade off startup delay against buffer precision; S2, on the basis of S1, obtaining buffer information during adaptive streaming, and filling in the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, so as to establish complete reference buffer history information; and S3, predicting the enhancement layer slice size at the current time slot using the reference buffer history information and the download status of the current slice. The invention designs a slice size prediction module that replaces the average-based representation of slice size used in traditional ABR algorithms; the module is added on top of an existing ABR algorithm and brings a substantial performance improvement.

Description

Slice size prediction and adaptive code rate control method, system and medium

Technical Field

The invention relates to the technical field of HTTP adaptive streaming media service, in particular to a method, a system and a medium for slice size prediction and adaptive code rate control.

Background

Over the last decade, video delivery has become a major part of internet traffic, driven by advances in network technology, device capabilities, and audio and video compression schemes. In its annual Visual Networking Index report, Cisco stated that video accounted for 67% of global internet traffic in 2016 and predicted that this share would reach 80% by 2021. This trend requires video transmission services with the best possible QoE (quality of experience) over the existing internet, yet the internet was originally designed as a best-effort, non-real-time data transmission medium. Around 2005, Move Networks introduced an elegant and simple video delivery model that became more popular than progressive download and other proprietary streaming methods because of its better functionality and lower deployment cost. This model, called HTTP Adaptive Streaming (HAS), handles media content like regular Web content and delivers it in small pieces over HTTP. HAS was rapidly adopted by leading service and content providers and became the dominant means of video transmission. Video transmission over the public internet is also referred to as OTT (over-the-top) video streaming, because the content or streaming service provider is typically not the network provider. The advent of mobile end-user devices with high processing and rendering capabilities has played a key role in the growth of streaming video traffic and also provides many application scenarios for HAS.

SVC (Scalable Video Coding) is an extension of AVC; the two best-known pairs of coding schemes are H.264/AVC with H.264/SVC, and H.265/HEVC with H.265/SHVC. SVC encodes video into one Base Layer (BL) and several Enhancement Layers (ELs), with each EL depending on all the layers below it. SVC can reduce server storage consumption and provide better error resilience, and it can improve quality progressively by adding ELs. However, SVC also increases coding complexity and incurs roughly 10% extra overhead per layer.

Some researchers have proposed applying SVC to ABR scenarios and have made promising attempts to reduce client buffer and server storage requirements by combining SVC with DASH, thereby further improving user QoE. The first work to actually perform QoE management on an SVC-DASH system converted the quality selection of SVC slices into an optimization problem, proposed the LBP algorithm to solve it, and verified the effectiveness of the algorithm in simulation and in a real LTE environment. Other work has studied how to exploit QUIC features for adaptive streaming of SHVC-encoded content, and other researchers have provided a DASH/SVC dataset. There is also a first proposal to exploit SVC inter-layer correlation to optimize DASH scheduling. However, the encoded videos they selected have few layers and only the BL is used for prediction, so there is still room for improvement. Moreover, their approach requires an initial buffer to be established first, which results in a large startup delay. Because video scenes change, the real bitrate and the video slice size fluctuate as well. Most algorithms nevertheless use the average bitrate in the MPD file to calculate the video slice size, and this coarse-grained representation of slice size introduces a large error, which offers an entry point for optimization.

Disclosure of Invention

In view of the above-mentioned deficiencies in the prior art, an object of the present invention is to provide a method, a system, and a medium for slice size prediction and adaptive code rate control, which can accurately predict a video slice size and improve transmission performance.

In a first aspect of the present invention, a slice size prediction method is provided, including:

S1, providing two download strategies for the first Start_Num slices, namely optimizing buffer precision and optimizing startup delay, so as to trade off startup delay against buffer precision;

S2, on the basis of S1, obtaining buffer information during adaptive streaming, and filling in the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, so as to establish complete reference buffer history information;

S3, predicting the enhancement layer slice size at the current time slot using the reference buffer history information and the download status of the current slice.

Optionally, optimizing the buffer precision means: with the goal of optimizing buffer precision, all layers of the first Start_Num slices are downloaded, thereby creating a complete initial buffer.

Optionally, optimizing the startup delay means: with the goal of reducing the startup delay, normal ABR decisions are made for the first Start_Num slices, and not all layers are actively downloaded.

Optionally, the filling operation is performed on the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, wherein either of the following two filling strategies is adopted according to the enhancement layer prediction mode:

reference to the base layer: in this strategy, the size of an enhancement layer that was not downloaded is calculated by scaling the base layer size of the same time slot by the ratio of the average bitrates in the MPD file;

reference to all lower layers: in this strategy, the size of an enhancement layer that was not downloaded is calculated by scaling the sizes of all lower layers of the same time slot by the ratio of the average bitrates in the MPD file.

Optionally, the enhancement layer slice size at the current time slot is predicted using the reference buffer history information and the download status of the current slice, wherein any one of the following prediction modes 1-4 is adopted according to the prediction reference information used:

prediction mode 1: referring only to the base layer size of the current time slot, the obtained base layer size is scaled according to the average bitrate of each layer in the MPD file to obtain the enhancement layer slice size;

prediction mode 2: referring only to the current time slot, the sizes of all layers already obtained are scaled according to the average bitrate of each layer in the MPD file to obtain the enhancement layer slice size;

prediction mode 3: referring to the base layer sizes of all previous time slots, a linear regression relationship is established between the acquired base layer sizes of all time slots and the sizes of the enhancement layer to be calculated, and the base layer size of the current time slot is substituted into the regression equation to obtain the enhancement layer slice size;

prediction mode 4: referring to the base layer sizes of all previous time slots, a linear regression relationship is established between the sizes of the enhancement layer to be calculated in all acquired time slots and the sum of the slice sizes of all layers below that enhancement layer, and the base layer size of the current time slot is substituted into the regression equation to obtain the enhancement layer slice size.

In a second aspect of the present invention, there is provided a slice size prediction system, including:

a download strategy module, which provides two download strategies for the first Start_Num slices, namely optimizing buffer precision and optimizing startup delay, and is used to trade off startup delay against buffer precision;

a reference buffer history information establishing module, which obtains buffer information during adaptive streaming on the basis of the download strategy module, and fills in the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, so as to establish complete reference buffer history information;

and a prediction module, which predicts the enhancement layer slice size at the current time slot using the reference buffer history information and the download status of the current slice.

In a third aspect of the present invention, an adaptive code rate control method is provided, including:

predicting the slice size of a certain enhancement layer at the current time slot position by adopting a slice size prediction module in combination with the constructed historical information of the reference buffer area to obtain a predicted value; wherein the slice size prediction module adopts the slice size prediction method;

and adopting a self-adaptive code rate module to carry out decision processing on the predicted value.

In a fourth aspect of the present invention, there is provided an adaptive code rate control system comprising:

the slice size prediction module is used for predicting the slice size of a certain enhancement layer at the current time slot position by combining the constructed historical information of the reference buffer area to obtain a predicted value; wherein the slice size prediction module employs the slice size prediction method;

and the self-adaptive code rate module is used for performing decision processing on the predicted value obtained by the slice size prediction module.

In a fifth aspect of the present invention, there is provided a computer apparatus comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the slice size prediction method or the adaptive code rate control method when executing the computer program.

A sixth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the slice size prediction method or the adaptive rate control method.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

according to the slice size prediction and adaptive code rate control method, system and medium, a slice size prediction module is designed by adopting the established reference buffer historical information to replace the average representation of the slice size in the traditional adaptive code rate (ABR) algorithm, and the module is added on the traditional adaptive code rate (ABR) algorithm, so that the video slice size can be accurately predicted, and the transmission performance is improved.

Drawings

Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a system diagram of adaptive streaming according to an embodiment of the present invention;

FIG. 2 is a graph of the network traces used in the experiments in an embodiment of the present invention;

FIG. 3 shows the evaluation results of streaming media transmission quality according to an embodiment of the present invention;

FIG. 4 is a probability distribution diagram of the prediction accuracy of the slice size prediction module in an embodiment of the present invention.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that persons skilled in the art can make variations and modifications without departing from the spirit of the invention, and all such variations and modifications fall within the scope of the present invention.

DASH is a protocol that adapts to network conditions and heterogeneous client devices. The client requests the MPD file from the server and parses it to obtain basic information, including the average bitrate and the slice URLs. The ABR algorithm then decides the quality of the next slice to request based on network and buffer conditions. However, because the video content changes, the true bitrate fluctuates and can differ greatly from the average bitrate given in the MPD file. This deviation is more pronounced for VBR-encoded video, and it is invisible to the ABR algorithm. The invention first establishes reference buffer history information; on this basis, a slice size prediction module predicts the enhancement layer size, and the prediction is passed to the adaptive code rate decision module, which makes the bitrate decision. Referring to FIG. 1, the invention designs a slice size prediction module that replaces the average-based representation of slice size in conventional ABR algorithms and adds this module to an existing ABR algorithm, so that the video slice size can be predicted accurately and performance is substantially improved.
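To make the gap between the MPD average and the true slice size concrete, the following minimal sketch computes the nominal slice size implied by an average bitrate and its relative error against an actual size. The function names and the example numbers are illustrative assumptions, not values from the experiments.

```python
def nominal_slice_bytes(avg_bitrate_kbps: float, duration_s: float) -> float:
    """Slice size in bytes implied by an MPD average bitrate (1 kbps = 1000 bit/s)."""
    return avg_bitrate_kbps * 1000.0 * duration_s / 8.0


def relative_error(estimate: float, actual: float) -> float:
    """Signed relative error of an estimate with respect to the actual size."""
    return (estimate - actual) / actual


# Hypothetical numbers: a 2-second slice of a layer advertised at 1000 kbps
# whose real encoded size is 400 000 bytes because the scene is complex.
nominal = nominal_slice_bytes(1000, 2.0)
error = relative_error(nominal, 400_000)
print(nominal, error)  # 250000.0 -0.375
```

In this made-up case an ABR algorithm that trusts the MPD average would underestimate the slice by 37.5%, which is the kind of error the prediction module is meant to reduce.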

Specifically, referring to fig. 1, the present embodiment provides a slice size prediction method, including the following steps:

S1, providing two download strategies for the first Start_Num slices, namely optimizing buffer precision and optimizing startup delay, so as to trade off startup delay against buffer precision.

In this step, the two initial strategies are as follows:

Optimizing the buffer precision: the goal of this strategy is to optimize the buffer precision, so all layers of the first Start_Num slices are downloaded, creating a complete initial buffer that provides more reference information for the enhancement layer slice size prediction performed in S3.

Optimizing the startup delay: the goal of this strategy is to reduce the startup delay, so normal ABR decisions are made for the first Start_Num slices and not all layers are actively downloaded. For the higher-layer slices that are not downloaded, the filling strategy described in S2 can be used, so that a complete initial buffer does not have to be downloaded first.
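The two initial strategies can be sketched as a small selection function. This is only an illustration under assumptions: the strategy labels, the function name and the `abr_top_layer` parameter (the highest layer picked by whatever ABR algorithm is in use) are hypothetical and do not come from the original text.

```python
from typing import List


def layers_to_download(slice_index: int, start_num: int, strategy: str,
                       num_layers: int, abr_top_layer: int) -> List[int]:
    """Return the layer indices (0 = base layer) to request for one slice.

    strategy is "buffer_precision" or "startup_delay" (hypothetical labels).
    abr_top_layer is the highest layer chosen by the underlying ABR algorithm.
    """
    if strategy not in ("buffer_precision", "startup_delay"):
        raise ValueError(f"unknown strategy: {strategy}")
    in_startup = slice_index < start_num
    if in_startup and strategy == "buffer_precision":
        # Download every layer to build a complete initial reference buffer.
        return list(range(num_layers))
    # Normal ABR decision: the chosen layer plus all layers it depends on.
    return list(range(abr_top_layer + 1))


print(layers_to_download(0, 3, "buffer_precision", 5, 1))  # [0, 1, 2, 3, 4]
print(layers_to_download(0, 3, "startup_delay", 5, 1))     # [0, 1]
```

With the startup-delay strategy, layers 2 to 4 of the first slice are never fetched, which is why their sizes have to be filled in afterwards.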

S2, on the basis of S1, obtaining buffer information during adaptive streaming, and filling in the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, so as to establish complete reference buffer history information.

In this step, not all enhancement layers may have been requested and downloaded when the history buffer information is obtained. To better predict the enhancement layer size at the current time slot, the history buffer information needs to be completed, and two filling strategies are designed according to the enhancement layer prediction mode:

Reference to the base layer: in this strategy, the size of an enhancement layer that was not downloaded is calculated by scaling the base layer size of the same time slot by the ratio of the average bitrates in the MPD file. The formula is as follows:

where segment[i][j] represents the size of the j-th enhancement layer of the i-th slice, bitrate[j] represents the average bitrate of the j-th enhancement layer, and bitrate[0] represents the average bitrate of the base layer.

Reference to all lower layers: in this strategy, the size of an enhancement layer that was not downloaded is calculated by scaling the sizes of all lower layers of the same time slot by the ratio of the average bitrates in the MPD file. The formula is as follows:

where segment[i][j] represents the size of the j-th enhancement layer of the i-th slice, bitrate[j] represents the average bitrate of the j-th enhancement layer, and bitrate[q] represents the average bitrate of the q-th layer (q = 0 denotes the base layer).
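The formulas themselves are not reproduced in this text. One reading that is consistent with the symbol definitions above is segment[i][j] = segment[i][0] * bitrate[j] / bitrate[0] for the base layer reference, and segment[i][j] = (sum of segment[i][q] for q < j) * bitrate[j] / (sum of bitrate[q] for q < j) for the all-lower-layers reference. The sketch below implements that reading; the exact form, the function name and the use of `None` for a layer that was not downloaded are assumptions.

```python
from typing import List, Optional


def fill_missing_layers(sizes: List[Optional[float]], bitrate: List[float],
                        reference: str = "base") -> List[float]:
    """Fill the sizes of layers that were not downloaded (None) for one slice.

    sizes[0] is the base layer and must be present; bitrate[j] is the
    average bitrate of layer j taken from the MPD file.
    """
    if sizes[0] is None:
        raise ValueError("base layer size is required")
    filled: List[float] = []
    for j, size in enumerate(sizes):
        if size is not None:
            filled.append(float(size))
        elif reference == "base":
            # Scale the base layer size by the bitrate ratio (assumed form).
            filled.append(filled[0] * bitrate[j] / bitrate[0])
        elif reference == "all_lower":
            # Scale the sum of all lower layers, including values filled
            # earlier in this loop, by the bitrate ratio (assumed form).
            filled.append(sum(filled[:j]) * bitrate[j] / sum(bitrate[:j]))
        else:
            raise ValueError(f"unknown reference: {reference}")
    return filled


# Hypothetical slice: layers 0 and 1 downloaded, layer 2 missing.
print(fill_missing_layers([100.0, 220.0, None], [1.0, 2.0, 4.0], "base"))
print(fill_missing_layers([100.0, 220.0, None], [1.0, 2.0, 4.0], "all_lower"))
```

In the example, layer 1 is larger than its nominal ratio suggests (220 instead of 200), and only the all-lower-layers variant lets that information raise the estimate for layer 2.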

S3, predicting the enhancement layer slice size at the current time slot using the reference buffer history information and the download status of the current slice.

In this step, the prediction of the slice size of an enhancement layer at the current time slot falls into four modes according to the prediction reference information used:

Prediction mode 1: referring only to the base layer size of the current time slot, this mode scales the obtained base layer size according to the average bitrate of each layer in the MPD file to obtain the enhancement layer slice size. The calculation formula is as follows:

Prediction mode 2: referring only to the current time slot, this mode scales the sizes of all layers already obtained according to the average bitrate of each layer in the MPD file to obtain the enhancement layer slice size. The calculation formula is as follows:

Prediction mode 3: referring to the base layer sizes of all previous time slots, this mode establishes a linear regression relationship between the acquired base layer sizes of all time slots and the sizes of the enhancement layer to be calculated, and substitutes the base layer size of the current time slot into the regression equation to obtain the enhancement layer slice size. The calculation formula is as follows:

segment[i][j] = a_{i,j} * segment[i][0] + b_{i,j}

where segment[i][j] represents the size of the j-th enhancement layer of the i-th slice, and a_{i,j} and b_{i,j} are linear regression coefficients;

where B represents the vector of base layer slice sizes in the reference buffer history information, E represents the vector of enhancement layer slice sizes in the reference buffer history information, and a and b are the linear regression coefficients.
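The expression for the regression coefficients is not reproduced in this text. As a minimal sketch, assuming an ordinary least squares fit of E against B (a standard choice, not confirmed by the source), prediction mode 3 could look as follows. The function names and sample values are illustrative.

```python
from typing import Sequence, Tuple


def fit_line(B: Sequence[float], E: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of E ~ a * B + b; returns (a, b)."""
    n = len(B)
    if n != len(E) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_b = sum(B) / n
    mean_e = sum(E) / n
    var_b = sum((x - mean_b) ** 2 for x in B)
    if var_b == 0:
        raise ValueError("base layer sizes are all identical")
    cov = sum((x - mean_b) * (y - mean_e) for x, y in zip(B, E))
    a = cov / var_b
    b = mean_e - a * mean_b
    return a, b


def predict_mode3(history_base: Sequence[float], history_enh: Sequence[float],
                  current_base: float) -> float:
    """Predict segment[i][j] from segment[i][0] using the fitted line."""
    a, b = fit_line(history_base, history_enh)
    return a * current_base + b


# Hypothetical history in which E = 2 * B + 50 holds exactly.
print(predict_mode3([100, 200, 300], [250, 450, 650], 400))  # 850.0
```

The fit needs at least two past slices with different base layer sizes, which is one reason a populated reference buffer history is required before this mode can be used.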

Prediction mode 4: referring to the base layer sizes of all previous time slots, this mode establishes a linear regression relationship between the sizes of the enhancement layer to be calculated in all acquired time slots and the sum of the slice sizes of all layers below that enhancement layer, and substitutes the base layer size of the current time slot into the regression equation to obtain the enhancement layer slice size. The calculation formula is as follows:

where segment[i][j] represents the size of the j-th enhancement layer of the i-th slice, B represents the vector of the sums of all lower-layer slice sizes in the reference buffer history information, E represents the vector of enhancement layer slice sizes in the reference buffer history information, and a and b are linear regression coefficients.

Prediction modes 1 and 2 refer only to information from the current time slot, so they can calculate the required enhancement layer size more quickly, but they make no use of historical information. Prediction mode 2 needs to refer to all layers below the enhancement layer, so its computational cost is higher than that of mode 1, but its prediction is more accurate.
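The formulas for modes 1 and 2 are not reproduced in this text. The following sketch assumes they mirror the two filling strategies: mode 1 scales the current base layer size by bitrate[j] / bitrate[0], and mode 2 scales the sum of the current lower-layer sizes by bitrate[j] divided by the sum of the lower-layer bitrates. Names and numbers are illustrative.

```python
from typing import Sequence


def predict_mode1(base_size: float, bitrate: Sequence[float], j: int) -> float:
    """Scale the current base layer size by the MPD bitrate ratio (assumed form)."""
    return base_size * bitrate[j] / bitrate[0]


def predict_mode2(lower_sizes: Sequence[float], bitrate: Sequence[float],
                  j: int) -> float:
    """Scale the sum of current layers 0..j-1 by the bitrate ratio (assumed form)."""
    if j < 1 or len(lower_sizes) < j:
        raise ValueError("need all layers below j for the current slice")
    return sum(lower_sizes[:j]) * bitrate[j] / sum(bitrate[:j])


# Hypothetical current slice: base layer 120 000 bytes, layer 1 260 000 bytes,
# MPD average bitrates 500, 1000 and 2000 kbps for layers 0, 1 and 2.
rates = [500.0, 1000.0, 2000.0]
print(predict_mode1(120_000, rates, 2))             # 480000.0
print(predict_mode2([120_000, 260_000], rates, 2))
```

Mode 2 can only be evaluated after layer 1 of the current slice has arrived, whereas mode 1 is available as soon as the base layer is downloaded.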

Prediction modes 3 and 4 establish a linear regression relationship between the reference layers and the predicted layer and draw on more historical information, so their predictions are more accurate, but their computational cost is considerably higher than that of modes 1 and 2. Prediction mode 4 needs to refer to all layers below the enhancement layer, so its computational cost is higher than that of mode 3, but its prediction is more accurate.

Therefore, prediction mode 1 or 2 can be selected for scenarios with limited computing power, and prediction mode 3 or 4 can be selected when higher prediction accuracy is required. Further, since the difference in computational complexity between modes 1 and 2, and between modes 3 and 4, is small, some preferred embodiments recommend mode 2 and mode 4 respectively. Of course, the specific mode can be selected according to the actual application and requirements. The operations in this step exploit the inter-layer correlation of scalable coding to perform adaptive code rate control.

Based on the same technical concept as the above embodiments, another embodiment of the present invention provides a slice size prediction system, comprising a download strategy module, a reference buffer history information establishing module and a prediction module, wherein: the download strategy module provides two download strategies for the first Start_Num slices, namely optimizing buffer precision and optimizing startup delay, and is used to trade off startup delay against buffer precision; the reference buffer history information establishing module obtains buffer information during adaptive streaming on the basis of the download strategy module, and fills in the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, so as to establish complete reference buffer history information; and the prediction module predicts the enhancement layer slice size at the current time slot using the reference buffer history information and the download status of the current slice.

The specific technology of each module in this embodiment may refer to the implementation technology of the corresponding step in the slice size prediction method embodiment, and is not described herein again.

In another embodiment of the present invention, an adaptive code rate control method is provided, including:

S100, predicting, by a slice size prediction module and in combination with the constructed reference buffer history information, the slice size of a given enhancement layer at the current time slot to obtain a predicted value; the slice size prediction module adopts the following slice size prediction method:

S1, providing two download strategies for the first Start_Num slices, so as to trade off startup delay against buffer precision;

S2, obtaining buffer information during adaptive streaming, and filling in the sizes of the enhancement layer slices that were not downloaded, so as to establish complete reference buffer history information;

S3, predicting the enhancement layer slice size at the current time slot using the reference buffer history information and the download status of the current slice;

S200, performing decision processing on the predicted value by an adaptive code rate module.
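How a predicted value might feed the decision step can be sketched with a deliberately simple throughput-based rule. This rule is a stand-in chosen for illustration; it is not the decision logic of Bola, Throughput or Dynamic, and the function name, parameters and numbers are assumptions.

```python
from typing import Sequence


def choose_top_layer(predicted_sizes: Sequence[float], throughput_bps: float,
                     budget_s: float) -> int:
    """Pick the highest layer whose cumulative predicted size fits the budget.

    predicted_sizes[j] is the predicted size in bytes of layer j of the
    current slice. The base layer is always requested.
    """
    budget_bytes = throughput_bps * budget_s / 8.0
    top, total = 0, 0.0
    for j, size in enumerate(predicted_sizes):
        total += size  # layer j needs every layer below it as well
        if j > 0 and total > budget_bytes:
            break
        top = j
    return top


# Hypothetical predicted layer sizes in bytes for one 2-second slice.
sizes = [100_000, 200_000, 400_000, 800_000]
print(choose_top_layer(sizes, throughput_bps=4_000_000, budget_s=2.0))  # 2
print(choose_top_layer(sizes, throughput_bps=500_000, budget_s=2.0))    # 0
```

Replacing `predicted_sizes` with sizes derived from the MPD average bitrate would leave the rule unchanged, which illustrates how the prediction module can be added to an existing ABR algorithm without modifying its decision logic.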

Based on the same technical concept, another embodiment of the present invention provides an adaptive code rate control system using scalable coding inter-layer correlation, comprising a slice size prediction module and an adaptive code rate module, wherein: the slice size prediction module predicts, in combination with the constructed reference buffer history information, the slice size of a given enhancement layer at the current time slot to obtain a predicted value, and adopts the slice size prediction method of the above embodiment; and the adaptive code rate module performs decision processing on the predicted value obtained by the slice size prediction module.

Based on the same technical concept, in another embodiment of the present invention, there is provided a computer apparatus including a memory and a processor, the memory storing a computer program, and the processor implementing the slice size prediction method of the above-described embodiment or the adaptive code rate control method of the above-described embodiment when executing the computer program.

Based on the same technical concept, in another embodiment of the present invention, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the slice size prediction method of the above-mentioned embodiment or the adaptive code rate control method of the above-mentioned embodiment.

Furthermore, a simulation platform was designed to carry out experiments, and the concept of an accurate prediction interval is introduced to measure slice size prediction accuracy. The simulation platform is based on Sabre, modified and extended to support the request logic of scalable coded content. The accurate prediction interval is defined as 0.8 to 1.2 times the real slice size; that is, a predicted size that falls within this interval is regarded as an accurate prediction.
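The accurate prediction interval translates directly into a hit-rate metric. The sketch below computes the fraction of predictions that fall within 0.8 to 1.2 times the real size; treating the interval bounds as inclusive is an assumption, and the function name and sample values are illustrative.

```python
from typing import Sequence


def accurate_fraction(predicted: Sequence[float], actual: Sequence[float],
                      low: float = 0.8, high: float = 1.2) -> float:
    """Fraction of predictions inside [low * actual, high * actual]."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need equally long, non-empty sequences")
    hits = sum(1 for p, a in zip(predicted, actual)
               if low * a <= p <= high * a)
    return hits / len(actual)


# Hypothetical predictions against real sizes of 100 each:
# 90 and 100 are inside [80, 120]; 130 and 50 are outside.
print(accurate_fraction([90, 130, 100, 50], [100, 100, 100, 100]))  # 0.5
```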

Based on the above-described embodiments, specific experiments are provided below to illustrate the cases of the embodiments of the present invention.

First, experimental configuration

Sabre is a simulation tool for AVC-ABR scenarios. It was adapted to SVC request logic, and three test beds were developed. TB1 is a simple version that provides SVC logic only. Both TB2 and TB3 incorporate the size prediction module; they differ in how the initial buffer is established. TB2 does not change the ABR request logic: the first Start_Num slices go through normal quality decisions and downloads, and the sizes of the EL layers that were not downloaded are filled in. TB3 downloads all layers of the first Start_Num slices completely before ABR decisions and normal playback begin.

For the basic ABR algorithm, Bola (buffer based), Throughput (bandwidth based) and Dynamic (hybrid) were selected for this experiment.

FIG. 2 shows the waveforms of the network traces used in the simulation, which are continuously fluctuating square waves with an average bandwidth of 5182 kbps and an RTT of 20 ms.

Test videos: this experiment uses Big Buck Bunny (BBB), Tears of Steel (TOS) and Elephants Dream (ED), each encoded with 5 layers and segmented into 2-second slices.

System quality evaluation metrics: this experiment uses the average bitrate, the average number of enhancement layers, the average bitrate switching and the average number of buffering events. In addition, the probability of falling within the accurate prediction interval is used to evaluate prediction accuracy.

Second, result evaluation

Under the test configuration of this experiment, the figures show the test results averaged over 3 test beds, 3 basic ABR algorithms, 4 slice size prediction modes and 5 network traces.

Specific result values are given in fig. 3, and a prediction accuracy probability distribution diagram of the prediction module is given in fig. 4.

Comparison shows that the above embodiments of the present invention significantly improve metrics such as average bitrate, average number of enhancement layers, average bitrate switching, average number of buffering events and prediction accuracy compared with other techniques.

It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may refer to the technical solution of the system to implement the step flow of the method, that is, the embodiment in the system may be understood as a preferred example for implementing the method, and details are not described here.

Those skilled in the art will appreciate that, in addition to implementing the system and its various devices provided by the present invention in purely computer readable program code means, the method steps can be fully programmed to enable the system and its various devices provided by the present invention to perform the same functions in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and various devices thereof provided by the present invention can be considered as a hardware component, and the devices included in the system and various devices thereof for realizing various functions can also be considered as structures in the hardware component; means for performing the functions may also be regarded as structures within both software modules and hardware components for performing the methods.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by those skilled in the art within the scope of the appended claims without affecting the spirit of the invention.

Claims

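1. A slice size prediction method, comprising:

S1, providing two download strategies for the first Start_Num slices, namely optimizing buffer precision and optimizing startup delay, so as to trade off startup delay against buffer precision;

S2, on the basis of S1, obtaining buffer information during adaptive streaming, and filling in the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, so as to establish complete reference buffer history information;

S3, predicting the enhancement layer slice size at the current time slot using the reference buffer history information and the download status of the current slice.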

2. The slice size prediction method according to claim 1, wherein optimizing the buffer precision means: with the goal of optimizing buffer precision, all layers of the first Start_Num slices are downloaded, thereby creating a complete initial buffer.

3. The slice size prediction method according to claim 1, wherein optimizing the startup delay means: with the goal of reducing the startup delay, normal ABR decisions are made for the first Start_Num slices, and not all layers are actively downloaded.

4. The slice size prediction method according to claim 1, wherein the filling operation is performed on the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, and either of the following two filling strategies is adopted according to the enhancement layer prediction mode:

reference to the base layer: in this strategy, the size of an enhancement layer that was not downloaded is calculated by scaling the base layer size of the same time slot by the ratio of the average bitrates in the MPD file;

reference to all lower layers: in this strategy, the size of an enhancement layer that was not downloaded is calculated by scaling the sizes of all lower layers of the same time slot by the ratio of the average bitrates in the MPD file.

5. The slice size prediction method according to claim 1, wherein the enhancement layer slice size at the current time slot is predicted using the reference buffer history information and the download status of the current slice, and any one of the following prediction modes 1-4 is adopted according to the prediction reference information used:

prediction mode 1: referring only to the base layer size of the current time slot, the obtained base layer size is scaled according to the average bitrate of each layer in the MPD file to obtain the enhancement layer slice size;

prediction mode 2: referring only to the current time slot, the sizes of all layers already obtained are scaled according to the average bitrate of each layer in the MPD file to obtain the enhancement layer slice size;

prediction mode 3: referring to the base layer sizes of all previous time slots, a linear regression relationship is established between the acquired base layer sizes of all time slots and the sizes of the enhancement layer to be calculated, and the base layer size of the current time slot is substituted into the regression equation to obtain the enhancement layer slice size;

prediction mode 4: referring to the base layer sizes of all previous time slots, a linear regression relationship is established between the sizes of the enhancement layer to be calculated in all acquired time slots and the sum of the slice sizes of all layers below that enhancement layer, and the base layer size of the current time slot is substituted into the regression equation to obtain the enhancement layer slice size.

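6. A slice size prediction system, comprising:

a download strategy module, which provides two download strategies for the first Start_Num slices, namely optimizing buffer precision and optimizing startup delay, and is used to trade off startup delay against buffer precision;

a reference buffer history information establishing module, which obtains buffer information during adaptive streaming on the basis of the download strategy module, and fills in the sizes of the enhancement layer slices that were not downloaded under the startup-delay-optimizing strategy, so as to establish complete reference buffer history information;

a prediction module, which predicts the enhancement layer slice size at the current time slot using the reference buffer history information and the download status of the current slice.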

7. An adaptive code rate control method using inter-layer correlation of scalable coding, comprising:

predicting, by a slice size prediction module and in combination with the constructed reference buffer history information, the slice size of a given enhancement layer at the current time slot to obtain a predicted value, wherein the slice size prediction module adopts the slice size prediction method of any one of claims 1 to 5;

performing decision processing on the predicted value by an adaptive code rate module.

8. An adaptive code rate control system, comprising:

a slice size prediction module, which predicts, in combination with the constructed reference buffer history information, the slice size of a given enhancement layer at the current time slot to obtain a predicted value, wherein the slice size prediction module adopts the slice size prediction method of any one of claims 1 to 5;

an adaptive code rate module, which performs decision processing on the predicted value obtained by the slice size prediction module.

9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the slice size prediction method of any one of claims 1 to 5 or the adaptive code rate control method of claim 7 when executing the computer program.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the slice size prediction method of any one of claims 1 to 5 or the adaptive code rate control method of claim 7.