CN114640851A - Self-adaptive omnidirectional video streaming method based on quality perception - Google Patents

Self-adaptive omnidirectional video streaming method based on quality perception

Info

Publication number
CN114640851A
Authority
CN
China
Prior art keywords: video, bit rate, quality, viewpoint, rate level
Prior art date
Legal status
Granted
Application number
CN202210272188.XA
Other languages
Chinese (zh)
Other versions
CN114640851B (en)
Inventor
王传
吴霄汉
吴岚
梁晶
刘胜
黄寒梅
李靓平
刘鸿谋
莫冬花
李明星
黎菲
Current Assignee
Guangxi Haohua Technology Co ltd
Original Assignee
Guangxi Haohua Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangxi Haohua Technology Co ltd filed Critical Guangxi Haohua Technology Co ltd
Priority to CN202210272188.XA priority Critical patent/CN114640851B/en
Publication of CN114640851A publication Critical patent/CN114640851A/en
Application granted granted Critical
Publication of CN114640851B publication Critical patent/CN114640851B/en
Current legal status: Active

Classifications

    • H04N 19/149: Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical or statistical model (adaptive coding of digital video signals)
    • H04N 19/154: Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion (adaptive coding of digital video signals)
    • H04N 19/167: Position within a video image, e.g. region of interest [ROI] (adaptive coding of digital video signals)
    • H04L 65/80: Responding to QoS (network arrangements, protocols or services for supporting real-time applications in data packet communication)
    • Y02T 10/40: Engine management systems (Y02T: climate change mitigation technologies related to transportation)

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a quality-perception-based adaptive omnidirectional video streaming method. The method considers two quality-determining factors specific to omnidirectional video, viewpoint movement and brightness change, and quantifies their influence on subjective perceived quality through user studies. A user perceived quality model is established by extending the SSIM metric with these two features. On top of the MPC algorithm's bit rate decision for each video block, quality is then allocated to all regions within the block so as to maximize the block's overall SSIM quality. The invention thereby achieves a more reasonable quality allocation across video regions and further improves streaming media resource utilization and the quality of user experience.

Description

Self-adaptive omnidirectional video streaming method based on quality perception
Technical Field
The invention relates to the technical field of streaming media transmission, and in particular to an adaptive omnidirectional video streaming method based on quality perception.
Background
In recent years, omnidirectional video has become one of the major emerging classes of internet traffic. At the same time, transmitting an omnidirectional video stream is more challenging than transmitting a conventional one. To create a panoramic experience, omnidirectional video must stream panoramic content at high resolution and without stalls, which consumes substantial bandwidth and resources. In the adaptive transmission process, the omnidirectional video is first projected into an ordinary two-dimensional planar video, which is then temporally sliced into video blocks. Each video block is transcoded by the encoder into multiple bit rate levels (representing different qualities). Finally, the video blocks at the respective bit rate levels are further spatially cut into video regions. As in conventional adaptive methods, the video player can dynamically switch the quality level at the boundary between two consecutive video blocks to cope with fluctuations in network bandwidth.
Most current adaptive algorithms adopt a viewpoint-driven approach, in which only the video content in the viewpoint area (the area the user is facing) is streamed at high quality. Such methods have the following limitations. First, the viewpoint area is usually much larger than an ordinary screen, so streaming even just the viewpoint area at equal quality still requires at least twice the bandwidth of ordinary video. Second, because the content of the viewpoint area must be prefetched, the player has to predict user behavior (i.e., viewpoint movement), and any prediction error can degrade the user's quality of experience (QoE). Finally, to accommodate viewpoint movement, the omnidirectional video must be spatially sliced and transcoded into multiple quality levels, which greatly increases the total video size. Since users perceive the quality of omnidirectional video differently from ordinary video, and that perception is uniquely influenced by viewpoint movement, this characteristic can be exploited to further reduce bandwidth demand and improve the user experience. Existing adaptive omnidirectional video streaming algorithms therefore still fall short in resource allocation and QoE maximization, and cannot meet the deployment and development needs of today's high-quality omnidirectional streaming services. A more scientific and efficient adaptive omnidirectional video streaming method is thus urgently needed.
Disclosure of Invention
The invention aims to solve the problems of low resource utilization rate, low video service quality and the like of the conventional self-adaptive omnidirectional video streaming method, and provides a self-adaptive omnidirectional video streaming method based on quality perception.
In order to solve the problems, the invention is realized by the following technical scheme:
the self-adaptive omnidirectional video streaming method based on quality perception comprises the following steps:
step 1, projecting an omnidirectional video file to a two-dimensional plane by using a common symmetric projection method; cutting the projected video file into a plurality of video blocks with equal length, and transcoding each video block into different bit rate levels;
step 2, spatially cutting each video frame of the video block at each bit rate level into a plurality of video regions, and further spatially cutting each video region into a plurality of video windows;
step 3, calculating the relative viewpoint moving speed and viewpoint brightness change of each pixel of each video frame of the video blocks at each bit rate level, based on the viewpoint trajectory preset for the omnidirectional video file, wherein:

$v^b(p,q) = \lvert v_u - v_0^b(p,q) \rvert$

$l^b(p,q) = \lvert l_u^b(p,q) - l_m^b \rvert$
step 4, calculating the relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level, wherein:

$\bar{v}_i^b = \frac{1}{N} \sum_{(p,q) \in i} v^b(p,q)$

$\bar{l}_i^b = \frac{1}{N} \sum_{(p,q) \in i} l^b(p,q)$
step 5, calculating the viewpoint-based just-noticeable-difference threshold of each pixel of the video frames at each bit rate level, wherein:

[fitted formula, given as an image in the original: $\mathrm{JND}^b(p,q)$ as a function of $\mathrm{CJND}^b(p,q)$, $v^b(p,q)$, $l^b(p,q)$ and the constant $a$]
step 6, calculating the weight of each video window at each bit rate level, wherein:

[weight formula, given as an image in the original: $w_{i,j}^b$ aggregates the viewpoint-based JND values of the $N_{ij}$ pixels of the $j$-th window of region $i$, with smaller JND yielding larger weight]
step 7, calculating the user perceived quality of each video window at each bit rate level, wherein:

$Q_{i,j}^b = \dfrac{\big(2\,\mu_{i,j}^b\,\mu_{i,j}^o + a_1\big)\big(2\,\sigma_{i,j}^{b,o} + a_2\big)}{\big((\mu_{i,j}^b)^2 + (\mu_{i,j}^o)^2 + a_1\big)\big((\sigma_{i,j}^b)^2 + (\sigma_{i,j}^o)^2 + a_2\big)}$
step 8, calculating the user perceived quality of each video region at each bit rate level, wherein:

$Q_i^b = \dfrac{\sum_{j=1}^{N_i} w_{i,j}^b\, Q_{i,j}^b}{\sum_{j=1}^{N_i} w_{i,j}^b}$
step 9, establishing a user perceived quality lookup table of the relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level, based on the relative viewpoint moving speed and viewpoint brightness change obtained in step 4 and the user perceived quality obtained in step 8;
step 10, modeling the adaptive bit rate decision of the video blocks as an optimization problem based on model predictive control by using an ABR algorithm, and determining the bit rate of each video block by solving the optimization problem, namely correspondingly determining the bit rate of each video frame of the video blocks;
step 11, predicting the viewpoint position at a future time point with an existing linear regression method based on the current user's historical viewpoint positions, using the lowest viewpoint moving speed observed in the past m seconds before the current time point as the predicted viewpoint moving speed at the future time point, and finally calculating the relative viewpoint moving speed and viewpoint brightness change of all video regions at each bit rate level with the same formulas as in steps 3 and 4;
step 12, calculating the Euclidean distance between the relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level obtained in step 11 and those of the corresponding video region at the corresponding bit rate level obtained in step 4, taking the user perceived quality of the lookup-table entry (from the table built in step 9) with the minimum Euclidean distance, and finally obtaining the user perceived quality of all video regions at each bit rate level;
step 13, based on the bit rate of the video block determined in step 10 and the user perceived quality of all video regions at each bit rate level determined in step 12, determining the quality of each video region in all video frames of the video block, with the goal of maximizing the overall user perceived quality of the block, subject to the total size of all video regions of all frames of the block not exceeding the block bit rate;
in the above formulas, $v^b(p,q)$ denotes the relative viewpoint moving speed of pixel $(p,q)$ at bit rate level $b$, and $l^b(p,q)$ the viewpoint brightness change of pixel $(p,q)$ at bit rate level $b$; $v_u$ denotes the moving speed of the user's viewpoint and $v_0^b(p,q)$ the motion speed of the object at pixel $(p,q)$ at bit rate level $b$; $l_u^b(p,q)$ denotes the luminance at pixel $(p,q)$ at bit rate level $b$, and $l_m^b$ the average luminance of all pixels of the video region containing the viewpoint $m$ seconds earlier at bit rate level $b$, where $m$ is a preset value and $\lvert\cdot\rvert$ denotes the absolute value; $\bar{v}_i^b$ denotes the relative viewpoint moving speed and $\bar{l}_i^b$ the viewpoint brightness change of video region $i$ at bit rate level $b$, with $N$ the number of pixels of region $i$; $\mathrm{JND}^b(p,q)$ denotes the viewpoint-based just-noticeable-difference threshold of pixel $(p,q)$ at bit rate level $b$, $\mathrm{CJND}^b(p,q)$ the just-noticeable-difference threshold of pixel $(p,q)$ at bit rate level $b$ based on video content characteristics, and $a$ a given non-zero constant; $w_{i,j}^b$ denotes the weight of the $j$-th video window of region $i$ at bit rate level $b$, where $i$ indexes video regions, $j$ indexes video windows, and $N_{ij}$ is the number of pixels of the $j$-th window of region $i$; $Q_{i,j}^b$ denotes the user perceived quality of the $j$-th window of region $i$ at bit rate level $b$; $\mu_{i,j}^b$ denotes the mean gray level of all pixels of the $j$-th window of region $i$ at bit rate level $b$, and $\mu_{i,j}^o$ the corresponding mean at the source video bit rate $o$; $(\sigma_{i,j}^b)^2$ denotes the gray-level variance of all pixels of the $j$-th window of region $i$ at bit rate level $b$, and $(\sigma_{i,j}^o)^2$ the corresponding variance at the source video bit rate $o$; $\sigma_{i,j}^{b,o}$ denotes the gray-level covariance of the $j$-th window of region $i$ between bit rate level $b$ and the source video bit rate $o$; $a_1$ and $a_2$ are given non-zero constants; $Q_i^b$ denotes the user perceived quality of region $i$ at bit rate level $b$; and $N_i$ is the number of video windows of region $i$.
The optimization problem constructed in step 10 is:

$\max\big(\alpha\, Q(b_t) - \beta\, \mathrm{Rebuf}(b_t) - \gamma\, \mathrm{Smooth}(Q(b_t))\big)$

where $b_t$ is the bit rate to be decided for the current $t$-th video block; $Q(b_t)$, $\mathrm{Rebuf}(b_t)$ and $\mathrm{Smooth}(Q(b_t))$ denote the video quality, stall (rebuffering) time, and quality smoothness of the $t$-th video block at bit rate $b_t$; and $\alpha$, $\beta$ and $\gamma$ denote the video quality weight, stall time weight, and quality smoothness weight, respectively.
The Euclidean distance $D_i$ of step 12 is:

$D_i = \sqrt{\left(\bar{v}_i^b - \hat{v}_i\right)^2 + \left(\bar{l}_i^b - \hat{l}_i\right)^2}$

where $\bar{v}_i^b$ and $\bar{l}_i^b$ denote the relative viewpoint moving speed and viewpoint brightness change of video region $i$ at bit rate level $b$ recorded in the step-9 user perceived quality lookup table, $N$ is the number of pixels of region $i$, and $\hat{v}_i$ and $\hat{l}_i$ denote the relative viewpoint moving speed and viewpoint brightness change of region $i$ predicted in step 11; $D_i$ denotes the Euclidean distance.
The objective function of step 13 is:

$\max \sum_{i=1}^{M} Q_i^{b_{t,i}} \quad \text{subject to} \quad \sum_{i=1}^{M} S_i \le b_t$

where $i$ is the index of the video region currently to be decided, $M$ is the number of video regions in the video block, $S_i$ denotes the size of the video region $i$ currently to be decided, $b_{t,i}$ is the bit rate of that region, $Q_i^{b_{t,i}}$ denotes the user perceived quality of region $i$ at bit rate level $b_{t,i}$, and $b_t$ is the bit rate of the video block.
Compared with the prior art, the invention builds, on top of the traditional viewpoint-driven approach, a user perceived quality model from two quality-determining factors specific to omnidirectional video, so that the video quality a user subjectively perceives can be computed more accurately. An adaptive quality decision model is then built by jointly adjusting the quality of each video block and of all video regions within it. Based on these two models, the invention achieves a more reasonable video quality allocation and further improves streaming media resource utilization and the quality of user experience.
Drawings
Fig. 1 is a diagram of an application scenario of the present invention.
Fig. 2 is a general flow chart of the quality-perception-based adaptive omnidirectional video streaming method.
Fig. 3 is a flow diagram of the user perceived quality model.
Fig. 4 is a flow chart of the adaptive quality decision model.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to specific examples.
Fig. 1 is a diagram of an application scenario of the present invention, which mainly includes a video server, a Content Delivery Network (CDN) and a video player. At the video server side, the projected flat video file is first cut into multiple video blocks of equal length, which are then transcoded into different bit rate levels (representing different sharpness and quality) using an encoder. The content distribution network acquires video blocks of each bit rate level of the flat video file from the video server, spatially cuts the video blocks of each bit rate level into a plurality of video areas for storage, and calculates the user perception quality of all the areas of the video blocks of each bit rate level based on the historical viewpoint data and the user perception quality model. The video player outputs the bit rate level of each video block and the quality of each video region in the block based on a quality decider, and requests a content distribution network to download the corresponding video region through the internet.
An adaptive omni-directional video streaming method based on quality perception, as shown in fig. 2, includes the following steps:
step 1, projecting an omnidirectional video file to a two-dimensional plane by using a common symmetric projection method; and cutting the projected video file into a plurality of video blocks with equal length, and transcoding each video block into different bit rate levels.
The omnidirectional video file is preprocessed and projected onto a two-dimensional plane using a symmetric projection method, the equirectangular projection (ERP). The FFmpeg tool is used to segment the projected video file into multiple video blocks of equal length (e.g., 1 second), and each video block is transcoded into different bit rate levels corresponding to different sharpness and video quality. In the present embodiment, 750 kbps, 1200 kbps and 1850 kbps correspond to low definition, standard definition and high definition, respectively.
Step 2, spatially cutting each video frame of video blocks of respective bit rate levels into a plurality of video regions (e.g., a 6 × 12 grid), and further spatially cutting each video region into a plurality of video windows (e.g., a 3 × 3 grid).
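For illustration, a minimal Python sketch of this spatial hierarchy follows; the grid sizes match the example values above, while the function and variable names are assumptions for exposition only, not part of the patent:

    # Sketch of the step 1-2 spatial hierarchy: ERP frame -> regions -> windows.
    # Grid sizes follow the example values in the text (6x12 regions, 3x3 windows);
    # names are illustrative, not from the patent.
    def region_and_window(p, q, height, width,
                          region_rows=6, region_cols=12,
                          window_rows=3, window_cols=3):
        """Return (region_index, window_index) for pixel (p, q) of an ERP frame."""
        rh, rw = height // region_rows, width // region_cols   # region size in pixels
        r_row = min(p // rh, region_rows - 1)
        r_col = min(q // rw, region_cols - 1)
        region = r_row * region_cols + r_col
        wh, ww = rh // window_rows, rw // window_cols           # window size in pixels
        w_row = min((p % rh) // wh, window_rows - 1)            # clamp edge pixels
        w_col = min((q % rw) // ww, window_cols - 1)
        window = w_row * window_cols + w_col
        return region, window

    # Example: on a 1920x3840 ERP frame, pixel (500, 1000) falls in region 15, window 3.
    print(region_and_window(500, 1000, height=1920, width=3840))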
Step 3, calculating the relative viewpoint moving speed and viewpoint brightness change of each pixel of each video frame of the video blocks at each bit rate level, based on the viewpoint trajectory preset for the omnidirectional video file.
Owing to the characteristics of omnidirectional video, the user's perception of its quality is influenced by the movement of the user's viewpoint. The influence of the two quality-determining factors specific to omnidirectional video, relative viewpoint moving speed and viewpoint brightness change, on user perceived quality is therefore quantified through user studies, so that the omnidirectional video quality a user subjectively perceives is modeled more accurately.
The relative viewpoint moving speed is the difference between the user's viewpoint moving speed $v_u$ (obtained from the motion of the head-mounted glasses while the user watches the omnidirectional video) and the motion speed $v_0^b(p,q)$ of the object at pixel $(p,q)$ of the video frame at bit rate level $b$; the relative viewpoint moving speed $v^b(p,q)$ of each pixel of the video frame at bit rate level $b$ is:

$v^b(p,q) = \lvert v_u - v_0^b(p,q) \rvert$

where $\lvert\cdot\rvert$ denotes the absolute value.
The viewpoint brightness change is the difference between the luminance $l_u^b(p,q)$ at pixel $(p,q)$ of the video frame at bit rate level $b$ and the average luminance $l_m^b$ of all pixels in the viewpoint region $m$ seconds before the current time point; the viewpoint brightness change $l^b(p,q)$ of each pixel of the video frame at bit rate level $b$ is:

$l^b(p,q) = \lvert l_u^b(p,q) - l_m^b \rvert$

where $\lvert\cdot\rvert$ denotes the absolute value and $m$ is a preset value.
Step 4, calculating the relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level.
The relative viewpoint moving speed $\bar{v}_i^b$ of video region $i$ at bit rate level $b$ is:

$\bar{v}_i^b = \frac{1}{N} \sum_{(p,q) \in i} v^b(p,q)$

and the viewpoint brightness change $\bar{l}_i^b$ of video region $i$ at bit rate level $b$ is:

$\bar{l}_i^b = \frac{1}{N} \sum_{(p,q) \in i} l^b(p,q)$

where $N$ is the number of pixels in video region $i$.
Step 5, calculating the viewpoint-based just-noticeable-difference (JND) threshold of each pixel of the video frames at each bit rate level.
The conventional JND threshold depends only on video content characteristics and is referred to here as the content-based JND threshold. According to the findings of the present invention, the JND threshold is related not only to the video content characteristics but also to the relative viewpoint moving speed and the viewpoint brightness change; this is referred to as the viewpoint-based JND threshold. At the same bit rate level $b$, the JND threshold increases as $v^b(p,q)$ increases, and first increases then decreases as $l^b(p,q)$ increases, the two factors acting independently. By fitting the user study data, the viewpoint-based JND threshold is obtained as:

[fitted formula, given as an image in the original: $\mathrm{JND}^b(p,q)$ as a function of $\mathrm{CJND}^b(p,q)$, $v^b(p,q)$, $l^b(p,q)$ and the constant $a$]

where $\mathrm{JND}^b(p,q)$ denotes the viewpoint-based JND threshold at pixel $(p,q)$ at bit rate level $b$, $\mathrm{CJND}^b(p,q)$ the content-based JND threshold at pixel $(p,q)$ at bit rate level $b$, and $a$ is a given non-zero constant.
Step 6, calculating the weight of each video window at each bit rate level.
The smaller the viewpoint-based JND threshold $\mathrm{JND}^b(p,q)$ at pixel $(p,q)$ at bit rate level $b$, the more easily the user perceives a quality difference there, so the pixel receives a higher weight; conversely, the larger $\mathrm{JND}^b(p,q)$, the less perceptible a quality difference, and the lower the weight. Since pixels in different video windows differ in importance, the invention defines the window weight from the new quality-determining factors (relative viewpoint moving speed and viewpoint brightness change); the weight $w_{i,j}^b$ of the $j$-th video window of region $i$ at bit rate level $b$ is:

[weight formula, given as an image in the original: $w_{i,j}^b$ aggregates the viewpoint-based JND values of the $N_{ij}$ pixels of the $j$-th window of region $i$, with smaller JND yielding larger weight]

where $i$ is the region index, $j$ is the window index, and $N_{ij}$ is the number of pixels of the $j$-th window of region $i$.
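A hedged Python sketch of steps 5 and 6 follows. The patent gives the fitted viewpoint-based JND formula and the window-weight formula only as images, so both functional forms below are assumed stand-ins that merely reproduce the stated trends (JND grows with relative viewpoint speed, first rises then falls with brightness change; smaller JND yields larger weight):

    # Assumed stand-ins for the fitted viewpoint-based JND of step 5 and the window
    # weight of step 6 (the exact fitted forms are images in the original patent).
    import numpy as np

    def viewpoint_jnd(cjnd, v_b, l_b, a=1.0, l_peak=50.0):
        """cjnd: content-based JND per pixel (assumed > 0); v_b, l_b: step-3 features;
        a: the patent's non-zero constant; l_peak: assumed location of the brightness
        peak. Monotonically increasing in v_b, rising then falling in l_b."""
        speed_gain = 1.0 + a * v_b
        brightness_gain = (l_b / l_peak) * np.exp(1.0 - l_b / l_peak)  # unimodal in l_b
        return cjnd * speed_gain * (1.0 + brightness_gain)

    def window_weight(jnd, window_mask):
        """Assumed weight: mean inverse JND, so more sensitive pixels weigh more."""
        return float(np.mean(1.0 / jnd[window_mask]))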
Step 7, calculating the user perceived quality of each video window at each bit rate level.
Referring to fig. 3, computed with the Structural Similarity Index Metric (SSIM), the user perceived quality $Q_{i,j}^b$ of the $j$-th video window of video region $i$ at bit rate level $b$ is:

$Q_{i,j}^b = \dfrac{\big(2\,\mu_{i,j}^b\,\mu_{i,j}^o + a_1\big)\big(2\,\sigma_{i,j}^{b,o} + a_2\big)}{\big((\mu_{i,j}^b)^2 + (\mu_{i,j}^o)^2 + a_1\big)\big((\sigma_{i,j}^b)^2 + (\sigma_{i,j}^o)^2 + a_2\big)}$

where $\mu_{i,j}^b$ denotes the mean gray level of all pixels of the $j$-th window of region $i$ at bit rate level $b$, and $\mu_{i,j}^o$ the corresponding mean at the source video bit rate $o$; $(\sigma_{i,j}^b)^2$ and $(\sigma_{i,j}^o)^2$ denote the corresponding gray-level variances at level $b$ and at the source bit rate $o$; $\sigma_{i,j}^{b,o}$ denotes the gray-level covariance of the window between level $b$ and the source bit rate $o$; and $a_1$, $a_2$ are given non-zero constants.
Step 8, calculating the user perceived quality of each video region at each bit rate level.
Computed from the user perceived quality and the weight of all video windows in the region, the user perceived quality $Q_i^b$ of video region $i$ at bit rate level $b$ is:

$Q_i^b = \dfrac{\sum_{j=1}^{N_i} w_{i,j}^b\, Q_{i,j}^b}{\sum_{j=1}^{N_i} w_{i,j}^b}$

where $N_i$ denotes the number of video windows of region $i$.
Step 9, establishing a user perceived quality lookup table of the relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level, based on the relative viewpoint moving speed and viewpoint brightness change obtained in step 4 and the user perceived quality obtained in step 8.
The user perceived quality lookup table has the following format:

bit rate level $b$ | video region $i$ | relative viewpoint moving speed $\bar{v}_i^b$ | viewpoint brightness change $\bar{l}_i^b$ | user perceived quality $Q_i^b$ (SSIM)

[the populated table is given as an image in the original]
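A minimal sketch of such a table as a Python dictionary (the data structure is an assumption; the patent prescribes only the stored fields):

    # Sketch of the step-9 lookup table: per (region, bit rate level) pair, a list of
    # observed (speed, brightness change, SSIM) entries.
    quality_table = {}  # (region_i, level_b) -> [(v_mean, l_mean, ssim), ...]

    def add_entry(region_i, level_b, v_mean, l_mean, ssim):
        quality_table.setdefault((region_i, level_b), []).append((v_mean, l_mean, ssim))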
an adaptive quality decision model is then made based on the constructed user perceived quality look-up table, see fig. 4.
Step 10, modeling the adaptive bit rate decision of the video block as an optimization problem based on Model Predictive Control (MPC) by using an ABR algorithm, and determining the bit rate of each video block by solving the optimization problem, that is, determining the bit rate of each video frame of the video block correspondingly.
The bit rate of a video block is the amount of data transmitted per second, an average quantity: for example, the bit rate of a video block lasting from 3 s to 5 s is its total data volume divided by its duration. This total includes the data of every video frame in the block, and frames differ in size, so a single frame has no exact bit rate of its own; however, once the bit rate of a video block is decided, the data budget of each frame it contains is determined accordingly.
The present invention uses an existing ABR algorithm to decide the bit rate of each video block. The ABR algorithm's optimization goal is to maximize user QoE; since video quality, quality smoothness and stall time have a significant impact on QoE, the invention adopts a linear QoE model over these factors as the optimization target. Specifically, the optimization problem P to be solved is defined as:

$P:\ \max\ \mathrm{QoE} = \alpha\, Q(b_t) - \beta\, \mathrm{Rebuf}(b_t) - \gamma\, \mathrm{Smooth}(Q(b_t))$

where problem P expresses the goal of maximizing quality of experience (QoE), i.e., the adaptive bit rate algorithm; $b_t$ is the bit rate to be decided for the current $t$-th video block; $Q(b_t)$, $\mathrm{Rebuf}(b_t)$ and $\mathrm{Smooth}(Q(b_t))$ denote the video quality, stall time, and quality smoothness of the $t$-th block at bit rate $b_t$; and $\alpha$, $\beta$ and $\gamma$ are the corresponding weights.
The optimization problem P is solved as follows: when video block $t$ is requested for download, all bit rate combinations of the next $T$ video blocks are enumerated, and the bit rate of the current block $t$ is chosen to maximize the overall QoE of those $T$ blocks. After the block is downloaded, the optimization horizon slides forward by one block. This process repeats until all video blocks have been transmitted.
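A sketch of this horizon enumeration in Python; qoe_of_plan stands in for the linear QoE model above, and all names and the horizon length are illustrative:

    # Sketch of the step-10 MPC decision: enumerate every bit rate combination over a
    # horizon of T future blocks and commit only the first bit rate of the best plan.
    from itertools import product

    def mpc_decide(levels, T, qoe_of_plan):
        """levels: available bit rate levels; qoe_of_plan: tuple of T levels -> QoE."""
        best_plan = max(product(levels, repeat=T), key=qoe_of_plan)
        return best_plan[0]  # bit rate for the current block; the horizon then slides on

    # Example cost: 3 levels and a 5-block horizon enumerate 3**5 = 243 plans.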
Step 11, predicting the viewpoint position at a future time point with an existing linear regression method based on the current user's historical viewpoint positions, and using the lowest viewpoint moving speed observed over the past $m$ seconds as the predicted viewpoint moving speed $\hat{v}$ at the future time point; finally, the relative viewpoint moving speed $\hat{v}_i$ and viewpoint brightness change $\hat{l}_i$ of all video regions at each bit rate level are calculated with the same formulas as in steps 3 and 4.
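A sketch of this prediction step in Python, using NumPy's polyfit as a stand-in for the existing linear regression method and treating one viewpoint coordinate at a time (an assumption):

    # Sketch of step 11: least-squares extrapolation of one viewpoint coordinate from
    # its recent history, plus the conservative speed estimate (minimum speed observed
    # over the past m seconds).
    import numpy as np

    def predict_position(times, positions, t_future):
        slope, intercept = np.polyfit(times, positions, 1)  # degree-1 fit
        return slope * t_future + intercept

    def predict_speed(recent_speeds):
        return min(recent_speeds)  # lowest viewpoint speed in the past m seconds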
Step 12, calculating the Euclidean distance between the predicted relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level from step 11 and the values of the corresponding region at the corresponding level from step 4, and taking the user perceived quality of the lookup-table entry (from the table built in step 9) with the minimum Euclidean distance, yielding the user perceived quality (SSIM) of all video regions at each bit rate level.
The SSIM value of each region at each bit rate level is thus determined by the matching or closest entry, over the two features of relative viewpoint moving speed and viewpoint brightness change, stored in the step-9 lookup table. The client can thereby roughly estimate the overall user perceived quality of a video block without downloading its content: the computed features of all regions at each level are simply compared against the corresponding region entries at the corresponding levels in the table.
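A sketch of the table query in Python, reusing the illustrative quality_table above; math.hypot implements the two-dimensional Euclidean distance of step 12:

    # Sketch of the step-12 query: among the stored entries of the same region and bit
    # rate level, return the SSIM of the entry nearest to the predicted features.
    import math

    def lookup_ssim(quality_table, region_i, level_b, v_hat, l_hat):
        entries = quality_table[(region_i, level_b)]
        v, l, ssim = min(entries, key=lambda e: math.hypot(e[0] - v_hat, e[1] - l_hat))
        return ssim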
Step 13, based on the video block bit rate $b_t$ determined in step 10 and the user perceived quality of all video regions at each bit rate level determined in step 12, the quality of each video region is decided with the goal of maximizing the block's overall user perceived quality (SSIM), subject to the total size of all video regions of all frames of the block not exceeding the block bit rate.
Once the bit rate of a video block is decided, the bit rate of all video frames in the block is fixed, which bounds the total quality of all video regions in each frame. Under the constraint that the total size of all regions of all frames stays within the block bit rate, the quality of each region in the block is output by defining and solving an objective function over the overall user perceived quality.
Based on the obtained SSIM values of the video regions at the different bit rate levels, the quality level of each region in the block is chosen to maximize the block's overall SSIM value; the objective function is defined as:

$\max \sum_{i=1}^{M} Q_i^{b_{t,i}} \quad \text{subject to} \quad \sum_{i=1}^{M} S_i \le b_t$

where $i$ is the index of the region currently to be decided, $b_{t,i}$ is the bit rate of region $i$, $M$ is the number of regions in the block, $S_i$ is the size of region $i$, $Q_i^{b_{t,i}}$ is the user perceived quality of region $i$ at bit rate level $b_{t,i}$ determined in step 12, and $b_t$ is the bit rate of block $t$ determined in step 10. The quality decision combines enumeration with a greedy rule: for any pair of video regions (say regions 1 and 2), if one quality assignment $(b_{t,1}, b_{t,2})$ is found to dominate another $(b'_{t,1}, b'_{t,2})$, i.e., it yields a higher total quality and a smaller total size $S_1 + S_2$, the dominated assignment is excluded from the quality-allocation iterations over the remaining regions (3, 4, ..., M). The process repeats until the quality allocation of all regions is complete.
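The patent's allocator combines enumeration with the pairwise dominance pruning just described. The sketch below substitutes a simpler greedy scheme under the same budget constraint, as an illustrative stand-in rather than the patented procedure: start every region at the lowest level and repeatedly apply the upgrade with the best SSIM gain per extra byte:

    # Illustrative greedy stand-in for the step-13 allocator. Assumes region sizes
    # strictly increase with the bit rate level, so every upgrade has a positive cost.
    def allocate_quality(levels, size, ssim, budget):
        """levels: bit rate levels, low to high; size[i][b], ssim[i][b]: size and
        perceived quality of region i at level b; budget: the block's size budget."""
        choice = {i: 0 for i in size}                    # start at the lowest level
        total = sum(size[i][levels[0]] for i in size)
        while True:
            best, best_gain, best_extra = None, 0.0, 0
            for i in size:
                b = choice[i]
                if b + 1 < len(levels):
                    extra = size[i][levels[b + 1]] - size[i][levels[b]]
                    gain = ssim[i][levels[b + 1]] - ssim[i][levels[b]]
                    if extra > 0 and total + extra <= budget and gain / extra > best_gain:
                        best, best_gain, best_extra = i, gain / extra, extra
            if best is None:                             # no affordable, useful upgrade left
                return {i: levels[choice[i]] for i in choice}
            choice[best] += 1
            total += best_extra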
The invention considers two quality-determining factors specific to omnidirectional video, viewpoint moving speed and brightness change, and quantifies their influence on the user's subjective perceived quality through user studies. A user perceived quality model is established by extending the SSIM metric with these two features. On top of the MPC algorithm's bit rate decision for each video block, quality is allocated to all regions within the block to maximize the block's overall SSIM quality. The invention can thus achieve a more reasonable quality allocation across video regions and further improve streaming media resource utilization and the quality of user experience.
It should be noted that, although the above-described embodiments of the present invention are illustrative, the invention is not limited thereto. Other embodiments devised by those skilled in the art in light of the teachings of the present invention, without departing from its principles, are likewise within the scope of the invention.

Claims (4)

1. The self-adaptive omnidirectional video streaming method based on quality perception is characterized by comprising the following steps of:
step 1, projecting an omnidirectional video file to a two-dimensional plane by using a common symmetric projection method; cutting the projected video file into a plurality of video blocks with equal length, and transcoding each video block into different bit rate levels;
step 2, spatially cutting each video frame of the video block at each bit rate level into a plurality of video regions, and further spatially cutting each video region into a plurality of video windows;
step 3, calculating the relative viewpoint moving speed and viewpoint brightness change of each pixel of each video frame of the video blocks at each bit rate level, based on the viewpoint trajectory preset for the omnidirectional video file, wherein:

$v^b(p,q) = \lvert v_u - v_0^b(p,q) \rvert$

$l^b(p,q) = \lvert l_u^b(p,q) - l_m^b \rvert$
step 4, calculating the relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level, wherein:

$\bar{v}_i^b = \frac{1}{N} \sum_{(p,q) \in i} v^b(p,q)$

$\bar{l}_i^b = \frac{1}{N} \sum_{(p,q) \in i} l^b(p,q)$
step 5, calculating the viewpoint-based just-noticeable-difference threshold of each pixel of the video frames at each bit rate level, wherein:

[fitted formula, given as an image in the original: $\mathrm{JND}^b(p,q)$ as a function of $\mathrm{CJND}^b(p,q)$, $v^b(p,q)$, $l^b(p,q)$ and the constant $a$]
step 6, calculating the weight of each video window at each bit rate level, wherein:

[weight formula, given as an image in the original: $w_{i,j}^b$ aggregates the viewpoint-based JND values of the $N_{ij}$ pixels of the $j$-th window of region $i$, with smaller JND yielding larger weight]
step 7, calculating the user perceived quality of each video window at each bit rate level, wherein:

$Q_{i,j}^b = \dfrac{\big(2\,\mu_{i,j}^b\,\mu_{i,j}^o + a_1\big)\big(2\,\sigma_{i,j}^{b,o} + a_2\big)}{\big((\mu_{i,j}^b)^2 + (\mu_{i,j}^o)^2 + a_1\big)\big((\sigma_{i,j}^b)^2 + (\sigma_{i,j}^o)^2 + a_2\big)}$
step 8, calculating the user perceived quality of each video region at each bit rate level, wherein:

$Q_i^b = \dfrac{\sum_{j=1}^{N_i} w_{i,j}^b\, Q_{i,j}^b}{\sum_{j=1}^{N_i} w_{i,j}^b}$
step 9, establishing a user perceived quality lookup table of the relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level, based on the relative viewpoint moving speed and viewpoint brightness change obtained in step 4 and the user perceived quality obtained in step 8;
step 10, modeling adaptive bit rate decision of the video blocks by using an ABR algorithm as an optimization problem based on model predictive control, and determining the bit rate of each video block by solving the optimization problem, namely correspondingly determining the bit rate of each video frame of the video blocks;
step 11, predicting the viewpoint position at a future time point with an existing linear regression method based on the current user's historical viewpoint positions, using the lowest viewpoint moving speed observed in the past m seconds before the current time point as the predicted viewpoint moving speed at the future time point, and finally calculating the relative viewpoint moving speed and viewpoint brightness change of all video regions at each bit rate level with the same formulas as in steps 3 and 4;
step 12, calculating the Euclidean distance between the relative viewpoint moving speed and viewpoint brightness change of each video region at each bit rate level obtained in step 11 and those of the corresponding video region at the corresponding bit rate level obtained in step 4, taking the user perceived quality of the lookup-table entry (from the table built in step 9) with the minimum Euclidean distance, and finally obtaining the user perceived quality of all video regions at each bit rate level;
step 13, based on the bit rate of the video block determined in step 10 and the user perceived quality of all video regions at each bit rate level determined in step 12, determining the quality of each video region in all video frames of the video block, with the goal of maximizing the overall user perceived quality of the block, subject to the total size of all video regions of all frames of the block not exceeding the block bit rate;
in the above formulas, $v^b(p,q)$ denotes the relative viewpoint moving speed of pixel $(p,q)$ at bit rate level $b$, and $l^b(p,q)$ the viewpoint brightness change of pixel $(p,q)$ at bit rate level $b$; $v_u$ denotes the moving speed of the user's viewpoint and $v_0^b(p,q)$ the motion speed of the object at pixel $(p,q)$ at bit rate level $b$; $l_u^b(p,q)$ denotes the luminance at pixel $(p,q)$ at bit rate level $b$, and $l_m^b$ the average luminance of all pixels of the video region containing the viewpoint $m$ seconds earlier at bit rate level $b$, where $m$ is a preset value and $\lvert\cdot\rvert$ denotes the absolute value; $\bar{v}_i^b$ denotes the relative viewpoint moving speed and $\bar{l}_i^b$ the viewpoint brightness change of video region $i$ at bit rate level $b$, with $N$ the number of pixels of region $i$; $\mathrm{JND}^b(p,q)$ denotes the viewpoint-based just-noticeable-difference threshold of pixel $(p,q)$ at bit rate level $b$, $\mathrm{CJND}^b(p,q)$ the just-noticeable-difference threshold of pixel $(p,q)$ at bit rate level $b$ based on video content characteristics, and $a$ a given non-zero constant; $w_{i,j}^b$ denotes the weight of the $j$-th video window of region $i$ at bit rate level $b$, where $i$ indexes video regions, $j$ indexes video windows, and $N_{ij}$ is the number of pixels of the $j$-th window of region $i$; $Q_{i,j}^b$ denotes the user perceived quality of the $j$-th window of region $i$ at bit rate level $b$; $\mu_{i,j}^b$ denotes the mean gray level of all pixels of the $j$-th window of region $i$ at bit rate level $b$, and $\mu_{i,j}^o$ the corresponding mean at the source video bit rate $o$; $(\sigma_{i,j}^b)^2$ denotes the gray-level variance of all pixels of the $j$-th window of region $i$ at bit rate level $b$, and $(\sigma_{i,j}^o)^2$ the corresponding variance at the source video bit rate $o$; $\sigma_{i,j}^{b,o}$ denotes the gray-level covariance of the $j$-th window of region $i$ between bit rate level $b$ and the source video bit rate $o$; $a_1$ and $a_2$ are given non-zero constants; $Q_i^b$ denotes the user perceived quality of region $i$ at bit rate level $b$; and $N_i$ is the number of video windows of region $i$.
2. The adaptive omni-directional video streaming method based on quality perception according to claim 1, wherein the optimization problem constructed in step 10 is:
$\max\big(\alpha\, Q(b_t) - \beta\, \mathrm{Rebuf}(b_t) - \gamma\, \mathrm{Smooth}(Q(b_t))\big)$

where $b_t$ is the bit rate to be decided for the current $t$-th video block; $Q(b_t)$, $\mathrm{Rebuf}(b_t)$ and $\mathrm{Smooth}(Q(b_t))$ denote the video quality, stall (rebuffering) time, and quality smoothness of the $t$-th video block at bit rate $b_t$; and $\alpha$, $\beta$ and $\gamma$ denote the video quality weight, stall time weight, and quality smoothness weight, respectively.
3. The adaptive omnidirectional video streaming method based on quality perception according to claim 1, wherein the Euclidean distance $D_i$ of step 12 is:

$D_i = \sqrt{\left(\bar{v}_i^b - \hat{v}_i\right)^2 + \left(\bar{l}_i^b - \hat{l}_i\right)^2}$

where $\bar{v}_i^b$ and $\bar{l}_i^b$ denote the relative viewpoint moving speed and viewpoint brightness change of video region $i$ at bit rate level $b$ recorded in the step-9 user perceived quality lookup table, $N$ is the number of pixels of region $i$, and $\hat{v}_i$ and $\hat{l}_i$ denote the relative viewpoint moving speed and viewpoint brightness change of region $i$ predicted in step 11; $D_i$ denotes the Euclidean distance.
4. The adaptive omnidirectional video streaming method based on quality perception according to claim 1, wherein the objective function of step 13 is:

$\max \sum_{i=1}^{M} Q_i^{b_{t,i}} \quad \text{subject to} \quad \sum_{i=1}^{M} S_i \le b_t$

where $i$ is the index of the video region currently to be decided, $M$ is the number of video regions in the video block, $S_i$ denotes the size of the video region $i$ currently to be decided, $b_{t,i}$ is the bit rate of that region, $Q_i^{b_{t,i}}$ denotes its user perceived quality at bit rate level $b_{t,i}$, and $b_t$ is the bit rate of the video block.
CN202210272188.XA 2022-03-18 2022-03-18 Self-adaptive omnidirectional video stream transmission method based on quality perception Active CN114640851B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210272188.XA CN114640851B (en) 2022-03-18 2022-03-18 Self-adaptive omnidirectional video stream transmission method based on quality perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210272188.XA CN114640851B (en) 2022-03-18 2022-03-18 Self-adaptive omnidirectional video stream transmission method based on quality perception

Publications (2)

Publication Number Publication Date
CN114640851A (en) 2022-06-17
CN114640851B CN114640851B (en) 2023-06-23

Family

ID=81949829

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210272188.XA Active CN114640851B (en) 2022-03-18 2022-03-18 Self-adaptive omnidirectional video stream transmission method based on quality perception

Country Status (1)

Country Link
CN (1) CN114640851B (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160227172A1 (en) * 2013-08-29 2016-08-04 Smart Services Crc Pty Ltd Quality controller for video image
US20180041788A1 (en) * 2015-02-07 2018-02-08 Zhou Wang Method and system for smart adaptive video streaming driven by perceptual quality-of-experience estimations
US20170155903A1 (en) * 2015-11-30 2017-06-01 Canon Kabushiki Kaisha Method, apparatus and system for encoding and decoding video data according to local luminance intensity
US20190289296A1 (en) * 2017-01-30 2019-09-19 Euclid Discoveries, Llc Video Characterization For Smart Encoding Based On Perceptual Quality Optimization
US20210385502A1 (en) * 2018-10-19 2021-12-09 Samsung Electronics Co., Ltd. Method and device for evaluating subjective quality of video
CN110248212A (en) * 2019-05-27 2019-09-17 上海交通大学 360 degree of video stream server end code rate adaptive transmission methods of multi-user and system
CN112825557A (en) * 2019-11-20 2021-05-21 北京大学 Self-adaptive sensing time-space domain quantization method aiming at video coding
WO2021236059A1 (en) * 2020-05-19 2021-11-25 Google Llc Dynamic parameter selection for quality-normalized video transcoding
CN112929691A (en) * 2021-01-29 2021-06-08 复旦大学 Multi-user panoramic video transmission method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DI YUAN et al.: "Visual JND: A Perceptual Measurement in Video Coding", IEEE Access
翟宇轩; 刘怡桑; 徐艺文; 陈忠辉; 房颖; 赵铁松: "3D video quality assessment based on HTTP adaptive streaming" (基于HTTP自适应流媒体传输的3D视频质量评价), Journal of Beijing University of Aeronautics and Astronautics, no. 12
车慧丽: "Objective video quality assessment based on visual perception" (基于视觉感知的视频质量客观评价), master's thesis, China Master's Theses Electronic Journal

Also Published As

Publication number Publication date
CN114640851B (en) 2023-06-23

Similar Documents

Publication Publication Date Title
US20220030244A1 (en) Content adaptation for streaming
CN107211193B (en) Intelligent adaptive video streaming method and system driven by perception experience quality estimation
US20130304934A1 (en) Methods and systems for controlling quality of a media session
EP2612495B1 (en) Adaptive streaming of video at different quality levels
US8578436B2 (en) Method for two time-scales video stream transmission control
US20170347159A1 (en) Qoe analysis-based video frame management method and apparatus
JPH10257489A (en) Device and method for adjusting amount of bits to be generated for encoding image
CN112584119B (en) Self-adaptive panoramic video transmission method and system based on reinforcement learning
CN112272299A (en) Video coding method, device, equipment and storage medium
CN110099294B (en) Dynamic self-adaptive streaming media code rate allocation method for keeping space-time consistency of 360-degree video
WO2023134523A1 (en) Content adaptive video coding method and apparatus, device and storage medium
Hoang et al. Lexicographic bit allocation for MPEG video
CN112055263A (en) 360-degree video streaming transmission system based on significance detection
Hsu: MEC-assisted FoV-aware and QoE-driven adaptive 360 video streaming for virtual reality
WO2014066975A1 (en) Methods and systems for controlling quality of a media session
Zhang et al. A 360 video adaptive streaming scheme based on multiple video qualities
Chi et al. Region-of-interest video coding based on rate and distortion variations for H. 263+
CN114640851B (en) Self-adaptive omnidirectional video stream transmission method based on quality perception
KR20040062732A (en) Bit rate control system based on object
JP2016510567A (en) Method and apparatus for context-based video quality assessment
Tanjung et al. QoE Optimization in DASH-Based Multiview Video Streaming
Moon et al. An Uniformalized Quality Encoding in Cloud Transcoding System
CN114071121A (en) Image quality evaluation device and image quality evaluation method thereof
Takagi et al. Subjective video quality estimation to determine optimal spatio-temporal resolution
CN114666620B (en) Self-adaptive streaming media method based on visual sensitivity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant