CN109286855B - Panoramic video transmission method, transmission device and transmission system - Google Patents

Panoramic video transmission method, transmission device and transmission system

Info

Publication number
CN109286855B
CN109286855B CN201710590143.6A CN201710590143A
Authority
CN
China
Prior art keywords
video
video clip
definition
code rate
value
Prior art date
Legal status
Expired - Fee Related
Application number
CN201710590143.6A
Other languages
Chinese (zh)
Other versions
CN109286855A (en)
Inventor
谢澜
张行功
郭宗明
Current Assignee
Peking University
Original Assignee
Peking University
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date
Filing date
Publication date
Application filed by Peking University, Peking University Founder Group Co Ltd, Beijing Founder Electronics Co Ltd filed Critical Peking University
Priority to CN201710590143.6A priority Critical patent/CN109286855B/en
Publication of CN109286855A publication Critical patent/CN109286855A/en
Application granted granted Critical
Publication of CN109286855B publication Critical patent/CN109286855B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/8456 Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/26216 Content or additional data distribution scheduling performed under constraints involving the channel capacity, e.g. network bandwidth
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H04N21/6587 Control parameters, e.g. trick play commands, viewpoint selection
    • H04N21/8586 Linking data to content, e.g. by linking an URL to a video object, by using a URL

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention provides a panoramic video transmission method, transmission device and transmission system. The transmission method, applied at the client, comprises the following steps: acquiring and parsing a media description file from the server, and downloading video clips according to the media description file; during downloading, acquiring the user's head position information and predicting the probability of each video clip being viewed from that information; calculating the size of the video playback buffer, and calculating the upper limit of the total code rate of the video clips from the network bandwidth estimate and the buffer size; and calculating the spatial quality variation and the expected distortion of the video clips, establishing a QoE model from these two quantities, and selecting the video clips to be downloaded. The technical scheme of the invention reduces the data volume of panoramic video transmission, improves video quality, and reduces spatial quality fluctuation and video playback stalling.

Description

Panoramic video transmission method, transmission device and transmission system
Technical Field
The present invention relates to the field of multimedia technologies, and in particular, to a panoramic video transmission method, a panoramic video transmission apparatus, a panoramic video transmission system, a computer device, and a computer-readable storage medium.
Background
With the development of multimedia technology, virtual reality technology (VR) has received a great deal of attention from the industry and academia. Among them, panoramic videos, such as 360-degree videos and omni-directional videos, are widely used.
The data volume of virtual reality video is usually very large, which makes compression and network transmission of panoramic video especially challenging. For example, after compression coding, the code rate of a 4K×2K panoramic video mapped in ERP mode can reach 15 Mbps to 20 Mbps. The high resolution and high code rate of panoramic video hinder the development of its internet applications. In addition, when a user watches a panoramic video, only the video content inside the window is actually seen, and the content in other areas is not seen by the user. Therefore, transmitting the entire content of the panoramic video (both in-window and out-of-window content) to the client wastes bandwidth.
HTTP dynamic streaming technology can realize window-based adaptive transmission: content inside the user's window is transmitted at high quality, and content outside the window at low quality, thereby reducing the total amount of transmitted data. The block-based transmission method spatially divides the video into blocks, so the client can selectively download video content, for example downloading a high-quality version for blocks inside the window, and a low-quality version, or nothing at all, for blocks outside the window.
However, providing high-quality block-based transmission still presents challenges: (1) video block acquisition errors. Because the client must predict the user's viewing orientation to acquire future video content in advance, prediction errors can cause some video blocks not to be downloaded; if that content is needed during actual playback, its absence degrades the user experience. (2) Video playback stalling. Prediction of the user's viewing orientation is only valid over a short horizon, that is, the accuracy of the predicted orientation drops sharply when the prediction interval is too long, so the video playback buffer must be kept very small (for example, 3 seconds), and with a small buffer the playback stalls easily. (3) Boundary effects caused by mixed code rates. Because block-based transmission spatially divides the video, blocks selected at different code rates may show obvious boundaries or inconsistent quality when rendered.
Therefore, how to provide a block-based panoramic video transmission method that improves video quality and reduces spatial quality fluctuation and playback stalling has become a pressing technical problem.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art or the related art.
To this end, a first aspect of the present invention is to provide a method for transmitting a panoramic video, which is used for a server.
The second aspect of the present invention is to provide a method for transmitting a panoramic video, which is used for a client.
A third aspect of the present invention is to provide a panoramic video transmission apparatus for a server.
The fourth aspect of the present invention is to provide a device for transmitting panoramic video, which is used for a client.
A fifth aspect of the present invention is to provide a transmission system of panoramic video.
A sixth aspect of the invention is directed to a computer device.
A seventh aspect of the present invention is directed to a computer-readable storage medium.
In view of this, a first aspect of the present invention provides a panoramic video transmission method which blocks, encodes and slices the panoramic video according to preset configuration information to obtain video clips and a media description file, and stores the video clips and the media description file in a server.
According to this panoramic video transmission method, at the server end, parameters such as the number of spatial blocks, the width and height of the blocks, the encoding parameters, and the duration of the video segments can be predefined as a configuration file. The panoramic video is then blocked, encoded and sliced according to this configuration to obtain the video clips and the media description file, which are stored in an HTTP server for later use. When a client downloads the panoramic video, it performs viewpoint adaptation and code rate adaptation on the video segments and the media description file, and solves for the video segments to be downloaded through an optimization method, thereby improving video quality and reducing spatial quality fluctuation and playback stalling.
In the above technical solution, preferably, the format of the panoramic video includes an ERP format and a CMP format; the configuration information includes: the number, width, height, playing duration, encoding parameters, and code rate level of the video segments.
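As a concrete illustration, the configuration information listed above could be written as a small file read by the server-side slicer. All field names below are illustrative assumptions; the patent only names the kinds of parameters (block count, width, height, playing duration, encoding parameters, code rate levels):

```python
import json

# Hypothetical configuration for blocking/encoding/slicing a panoramic video.
# Every key is an assumed name, not part of the invention.
config = {
    "format": "ERP",            # source mapping: ERP or CMP
    "tiles_horizontal": 6,      # number of spatial blocks per row
    "tiles_vertical": 4,        # number of spatial blocks per column
    "tile_width": 640,          # block width in pixels
    "tile_height": 480,         # block height in pixels
    "segment_duration_s": 1.0,  # playing duration of each video segment
    "encoder": {"codec": "hevc", "gop": 30},
    "bitrate_levels_kbps": [500, 1500, 4000],  # code rate levels per tile
}

print(json.dumps(config, indent=2))
```

The server would read such a file once, then block, encode and slice the source video accordingly.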
In this technical scheme, the generation of panoramic video content is explained first. In 2016, MPEG proposed a draft standard for an omnidirectional media application format. When a panoramic video is produced, multiple cameras are usually used to record the visual scene of the real world, and the video frames (images) output by the cameras at the same moment must be stitched, projected and mapped, and then packed into a two-dimensional plane data frame for video coding. Stitching restores the real-world visual field from the images captured by the multiple cameras at the same moment through techniques such as feature point matching and fusion; the stitched images are projected onto a three-dimensional projection structure, such as a sphere or a cube. Since the projection structure is three-dimensional but widely used encoders encode two-dimensional plane video, the image on the projection structure must be further mapped to a two-dimensional plane, and video compression coding is performed after the two-dimensional mapped data frame is obtained. Currently common mapping methods include equirectangular projection (ERP), cubemap projection (CMP), and the like.
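For concreteness, the ERP mapping takes a point on the sphere, given by latitude and longitude, to a pixel of the two-dimensional plane frame. A minimal sketch follows; the frame size and the angle conventions (longitude spanning the width, north pole at the top row) are assumptions:

```python
def erp_pixel(lat_deg: float, lon_deg: float, width: int, height: int) -> tuple:
    """Map a spherical point (latitude, longitude in degrees) to an ERP
    pixel coordinate. Longitude [-180, 180) maps to [0, width);
    latitude [90, -90] maps to [0, height) (top row = north pole)."""
    u = (lon_deg + 180.0) / 360.0 * width
    v = (90.0 - lat_deg) / 180.0 * height
    return (u, v)

# The point at latitude 0, longitude 0 maps to the center of a 4K x 2K frame.
print(erp_pixel(0.0, 0.0, 3840, 1920))  # (1920.0, 960.0)
```

Because ERP stretches area near the poles, tiles covering the same pixel area correspond to different sphere areas, which is why the later QoE model weights tiles by sphere area.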
In any of the above technical solutions, preferably, the media description file includes spatial position information, encoding information, a bitrate, a quality distortion value, and a Uniform Resource Locator (URL) of the video clip.
In this embodiment, the media description file includes, but is not limited to, spatial location information, encoding information, bitrate, quality distortion value, and Uniform Resource Locator (URL) of the video clip.
In a second aspect of the present invention, a panoramic video transmission method is provided for a client, used in cooperation with the server-side transmission method of any of the above technical solutions. The transmission method comprises: acquiring and parsing a media description file from the server, and downloading video clips according to the media description file; during downloading, acquiring the user's head position information and predicting the probability of each video clip being viewed from that information; calculating the size of the video playback buffer, and calculating the upper limit of the total code rate of the video clips from the network bandwidth estimate and the buffer size; and calculating the spatial quality variation and the expected distortion of the video clips, establishing a QoE model from these two quantities, and selecting the video clips to be downloaded.
According to this panoramic video transmission method, when a client in the network downloads the panoramic video, it obtains the uniform resource locators of the panoramic video from the media description file and downloads accordingly. During downloading it performs viewpoint adaptation and code rate adaptation, and establishes an optimization model that maximizes QoE to determine the segments to download. Specifically, in viewpoint adaptation, the probability that a future block is viewed is predicted from the user's historical head position information; the total video clip code rate that can be transmitted is calculated from the network bandwidth and the size of the playback buffer; finally, the video clips to be downloaded are solved by an optimization method that considers the overall quality of the video, the spatial quality fluctuation caused by spatial blocking, and the expected distortion of the video clips under their viewing probabilities. By establishing this optimization model the client can selectively obtain video clips, which reduces the data volume of panoramic video transmission, improves video quality, reduces spatial quality fluctuation and playback stalling, and improves the experience of watching panoramic video in bandwidth-limited network environments.
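The per-stage client flow described above can be sketched as a loop over adaptation stages. Everything below is schematic: the helper functions are stubs standing in for the real prediction, bandwidth estimation, and optimization steps, and all names and numbers are assumptions:

```python
# Schematic of one client adaptation stage. Stubs replace the real steps.
def predict_view_probabilities(head_history):
    # Stub: the real system fits a linear regression to the head-position
    # history and scores predictions with a Gaussian error model.
    return [0.8, 0.2]

def rate_budget(bandwidth_bps, buffer_s, target_buffer_s, seg_dur_s, r_min):
    # Buffer-based code rate budget (see the rate adaptation section).
    return max(r_min,
               bandwidth_bps * (buffer_s + seg_dur_s - target_buffer_s) / seg_dur_s)

def select_segments(probs, budget):
    # Stub: the real system solves the QoE optimization; here, likely-viewed
    # tiles simply get the high-quality version.
    return [(f"tile{i}", "high" if p > 0.5 else "low") for i, p in enumerate(probs)]

def adaptation_stage(head_history, bandwidth_bps, buffer_s):
    probs = predict_view_probabilities(head_history)
    budget = rate_budget(bandwidth_bps, buffer_s,
                         target_buffer_s=3.0, seg_dur_s=1.0, r_min=500_000)
    return select_segments(probs, budget)

print(adaptation_stage(head_history=[], bandwidth_bps=8_000_000, buffer_s=2.5))
```

After the selected segments are downloaded, the client advances to the next stage and repeats the loop until playback ends.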
In the above technical solution, preferably, the step of obtaining the head position information of the user and predicting the probability of the video segment being viewed according to the head position information specifically includes: defining $(\alpha, \beta, \gamma)$ as the Euler angles of the user orientation, where $\alpha$ is the yaw angle, $\beta$ is the pitch angle, and $\gamma$ is the roll angle; defining $t_0$ as the current time and $\delta$ as the prediction interval; defining $(\varphi, \theta)$ as a point on the sphere corresponding to the video segment, where $\varphi$ is the latitude and $\theta$ is the longitude; defining $U_i$ as the set of points on the sphere corresponding to the $i$-th video clip; defining $A(\varphi, \theta)$ as the set of user orientations from which the point $(\varphi, \theta)$ can be seen; calculating, from the Euler angles $(\alpha, \beta, \gamma)$ of the user orientation, the predicted user orientation $(\hat\alpha, \hat\beta, \hat\gamma)$ at time $t_0 + \delta$ and the correct probability $P_E(\alpha, \beta, \gamma)$ of the predicted orientation; and calculating, according to $P_E(\alpha, \beta, \gamma)$, the probability $p(\varphi, \theta)$ that the point $(\varphi, \theta)$ on the sphere corresponding to the video clip is viewed. The calculation formula is:

$$p(\varphi, \theta) = \frac{1}{|A(\varphi, \theta)|} \sum_{(\alpha, \beta, \gamma) \in A(\varphi, \theta)} P_E(\alpha, \beta, \gamma)$$

The probability $P_i$ of the video segment being viewed is the mean of the viewing probabilities of the points on the sphere corresponding to the video clip:

$$P_i = \frac{1}{|U_i|} \sum_{(\varphi, \theta) \in U_i} p(\varphi, \theta)$$
in the technical scheme, specific steps that the client needs to perform the viewpoint adaptation in each adaptation stage are limited. Specifically, to calculate the probability that a video block is seen, the probability that a point on a spherical surface is seen is calculated, and for one point on the spherical surface
Figure GDA0002470094180000051
Since this point is likely to be seen through several orientations of the user, the probability that this point is seen
Figure GDA0002470094180000052
Set of orientations calculated to see this point
Figure GDA0002470094180000053
The corresponding probability mean, namely:
Figure GDA0002470094180000054
probability P that a video block is viewediCalculate the spherical point U as this video blockiThe mean of the probabilities that the corresponding points on the sphere are viewed, i.e.:
Figure GDA0002470094180000055
wherein, PE(α, γ) is at (t)0Time + correct probability of predicted value of user orientation.
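The two averaging steps described above can be sketched directly: average the orientation correctness probabilities over the orientations that can see a point, then average over a tile's sphere points. The inputs here are toy values:

```python
def point_view_probability(orientation_probs):
    """p(phi, theta): mean of P_E over the set A(phi, theta) of
    orientations from which the point can be seen."""
    return sum(orientation_probs) / len(orientation_probs)

def tile_view_probability(points_orientation_probs):
    """P_i: mean of p(phi, theta) over the tile's sphere points U_i."""
    point_probs = [point_view_probability(a) for a in points_orientation_probs]
    return sum(point_probs) / len(point_probs)

# Toy tile with two sphere points; each inner list holds P_E for the
# orientations that can see that point.
tile = [[0.9, 0.7], [0.5, 0.3]]
print(tile_view_probability(tile))  # (0.8 + 0.4) / 2, i.e. approximately 0.6
```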
In any of the above technical solutions, preferably, the predicted user orientation $(\hat\alpha, \hat\beta, \hat\gamma)$ at time $t_0 + \delta$ is calculated as:

$$\hat\alpha = \alpha(t_0) + m_\alpha \delta, \qquad \hat\beta = \beta(t_0) + m_\beta \delta, \qquad \hat\gamma = \gamma(t_0) + m_\gamma \delta$$

where $m_\alpha$, $m_\beta$, $m_\gamma$ are linear regression parameters. The correct probability $P_E(\alpha, \beta, \gamma)$ of the predicted orientation is calculated as $P_E(\alpha, \beta, \gamma) = P_{yaw}(\alpha)\, P_{pitch}(\beta)\, P_{roll}(\gamma)$, where $P_{yaw}(\alpha)$, $P_{pitch}(\beta)$, $P_{roll}(\gamma)$ are the correct probabilities of the predicted yaw, pitch and roll angles respectively, each computed from the Gaussian distribution of its prediction error; for the yaw angle:

$$P_{yaw}(\alpha) = \frac{1}{\sqrt{2\pi}\,\sigma_\alpha} \exp\!\left(-\frac{(\alpha - \hat\alpha - \mu_\alpha)^2}{2\sigma_\alpha^2}\right)$$

where $\mu_\alpha$ and $\sigma_\alpha$ are the mean and standard deviation of the yaw-angle prediction error, $\mu_\beta$ and $\sigma_\beta$ are those of the pitch-angle prediction error, and $\mu_\gamma$ and $\sigma_\gamma$ are those of the roll-angle prediction error; $P_{pitch}(\beta)$ and $P_{roll}(\gamma)$ are computed analogously.
In this embodiment, first, linear regression is used to calculate the predicted user orientation $(\hat\alpha, \hat\beta, \hat\gamma)$ at time $t_0 + \delta$, where $m_\alpha$, $m_\beta$, $m_\gamma$ are the parameters of the linear regression; each of the three parameters can be solved by the least squares method from the historical yaw, pitch and roll data in the window $[t_0 - 1, t_0]$. Second, data statistics show that the predicted user orientation at time $t_0 + \delta$ differs from the true value, and the differences follow Gaussian distributions, i.e. $e_\alpha \sim N(\mu_\alpha, \sigma_\alpha)$, $e_\beta \sim N(\mu_\beta, \sigma_\beta)$, $e_\gamma \sim N(\mu_\gamma, \sigma_\gamma)$. From this, the per-angle prediction correct probabilities $P_{yaw}(\alpha)$, $P_{pitch}(\beta)$, $P_{roll}(\gamma)$ can be calculated. Since the yaw, pitch and roll angles are independent, the Euler angle prediction correct probability is $P_E(\alpha, \beta, \gamma) = P_{yaw}(\alpha)\, P_{pitch}(\beta)\, P_{roll}(\gamma)$.
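The prediction step can be sketched as follows: a least-squares line fitted over the recent yaw samples is extrapolated by the prediction interval, and the prediction is scored with a Gaussian error model. The sample window, the error statistics, and the use of the Gaussian density as the correctness score are assumptions made for illustration:

```python
import math

def linear_predict(times, values, t_pred):
    """Least-squares fit value = m*t + c over the history window,
    extrapolated to t_pred (e.g. t0 + delta)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    m = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values)) / \
        sum((t - mean_t) ** 2 for t in times)
    c = mean_v - m * mean_t
    return m * t_pred + c

def gaussian_correct_prob(angle, predicted, mu, sigma):
    """Score the prediction with the Gaussian density of its error
    e = angle - predicted, assumed ~ N(mu, sigma)."""
    z = (angle - predicted - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Yaw samples over the last second (one every 0.25 s), head turning 10 deg/s.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
yaws = [0.0, 2.5, 5.0, 7.5, 10.0]
print(linear_predict(times, yaws, 1.5))  # 15.0: straight-line extrapolation
```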
In any of the above technical solutions, preferably, the step of calculating the size of the video playback buffer and calculating the upper limit of the total code rate of the video clips according to the network bandwidth estimate and the buffer size specifically includes: defining $T$ as the duration of the video clips, where every video clip has equal duration; defining $b_k$ as the size of the video playback buffer after the $k$-th video clip set is downloaded; defining $R_k$ as the upper limit of the total code rate for downloading the $k$-th video clip set; defining $R_{min}$ as the minimum value of the total code rate for downloading the $k$-th video clip set; defining $C_k$ as the estimated network bandwidth while downloading the $k$-th video clip set; and defining $B_{target}$ as the target size of the video playback buffer. The size $b_k$ of the video playback buffer after the $k$-th video clip set is downloaded is calculated as:

$$b_k = b_{k-1} + T - \frac{T \cdot R_k}{C_k}$$

The upper limit $R_k$ of the total code rate for downloading the $k$-th video clip set is calculated as:

$$R_k = \max\!\left(R_{min},\ \frac{C_k \,(b_{k-1} + T - B_{target})}{T}\right)$$
in the technical scheme, the target-based cache is limitedAnd determining the total available code rate of the video clip by using a code rate control algorithm of the buffer area, thereby avoiding pause of playing under the condition of a small buffer area. Specifically, the change of the size of the video buffer is determined by the network bandwidth and the total code rate of the downloaded video clip set, when the video clip set is downloaded, the size of the video buffer is increased, and the size of the video buffer is decreased along with the playing of the video. Therefore, the video buffer size after downloading the kth video segment set can be calculated as:
Figure GDA0002470094180000063
to avoid video playing jamming, we control the video buffer size to a target value, i.e. let bk=BtargetThen, the total bitrate of the kth video segment set can be found as:
Figure GDA0002470094180000064
to avoid negative results, RkSetting a minimum value RminThen R iskCan be calculated as:
Figure GDA0002470094180000065
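A minimal sketch of this buffer-based rate budget; all the numbers are toy values:

```python
def rate_budget(c_k, b_prev, seg_dur, b_target, r_min):
    """Upper limit R_k on the total code rate of the k-th clip set:
    drive the post-download buffer toward B_target, floored at R_min."""
    return max(r_min, c_k * (b_prev + seg_dur - b_target) / seg_dur)

def buffer_after_download(b_prev, seg_dur, r_k, c_k):
    """b_k: downloading takes T*R_k/C_k seconds of playback time, then the
    finished clip set adds T seconds of content."""
    return b_prev + seg_dur - seg_dur * r_k / c_k

c_k, b_prev, T, b_target, r_min = 10.0, 2.0, 1.0, 3.0, 1.0  # Mbps / seconds
r_k = rate_budget(c_k, b_prev, T, b_target, r_min)
print(r_k)                                         # floored at R_min = 1.0
print(buffer_after_download(b_prev, T, r_k, c_k))  # 2.9: buffer refills toward target
```

Note how a buffer below target forces a low code rate, letting the buffer refill before quality is raised again.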
in any of the above technical solutions, preferably, N is defined as the number of video segments; defining M as the code rate grade of the video clip; definition of ri,jThe code rate of the ith video clip at the jth code rate level is shown, wherein i is more than or equal to 1 and less than or equal to N, and j is more than or equal to 1 and less than or equal to M; definition of di,jThe quality distortion value of the ith video clip at the jth code rate level is shown, wherein i is more than or equal to 1 and less than or equal to N, and j is more than or equal to 1 and less than or equal to M; definition of xi,jWhether the ith video clip is selected at the jth code rate level or not, x i,j1 represents the selection, x i,j0 represents no choice, where 1 ≦ i ≦ N,1 ≦ j ≦ M, any i and any j form the set X, X ═ Xi,jI is more than or equal to 1 and less than or equal to N, and j is more than or equal to 1 and less than or equal to M }; definition siIs the area of the sphere corresponding to the ith video segment,
Figure GDA0002470094180000071
defining phi (X) as the period of a video segmentDistortion is observed; defining Ψ (X) as a spatial quality variation of the video segment; defining R as the upper limit value of the total code rate of the video clip; the expected distortion Φ (X) of a video segment is calculated as:
Figure GDA0002470094180000072
Piis the probability of a video segment being viewed;
the spatial quality variation Ψ (X) of a video segment is calculated by the following formula:
Figure GDA0002470094180000073
Piis the probability of a video segment being viewed;
the QoE model is:
Figure GDA0002470094180000074
wherein, η for the optimization of the target weights,
Figure GDA0002470094180000075
in the technical scheme, when a video clip is selected, two QoE factors need to be considered, wherein one factor is expected distortion and represents an expected value of the distortion under the condition of considering the watching probability of the video clip, and the other factor is spatial quality change and represents quality smoothness of the video clipkThen, the optimization problem can be defined as:
Figure GDA0002470094180000076
wherein the content of the first and second substances,
Figure GDA0002470094180000077
in each self-adaptive stage, the client solves the objective function and the conditional function to obtain a video segment required to be obtained, then sends a request to the server for downloading, and enters the next self-adaptive stage after downloading is completed until the client finishes watching the video.
In a third aspect of the present invention, a transmission apparatus for panoramic video is provided, where the transmission apparatus is used for a server, and includes: the processing unit is used for carrying out blocking, coding and slicing processing on the panoramic video according to preset configuration information to obtain video clips and media description files; a storage unit, configured to store the video clip and the media description file in the server.
According to this panoramic video transmission device, at the server end, parameters such as the number of spatial blocks, the width and height of the blocks, the encoding parameters, and the duration of the video segments can be predefined as a configuration file. The panoramic video is then blocked, encoded and sliced according to this configuration to obtain the video clips and the media description file, which are stored in an HTTP server for later use. When a client downloads the panoramic video, it performs viewpoint adaptation and code rate adaptation on the video segments and the media description file, and solves for the video segments to be downloaded through an optimization method, thereby improving video quality and reducing spatial quality fluctuation and playback stalling.
In the above technical solution, preferably, the format of the panoramic video includes an ERP format and a CMP format; the configuration information includes: the number, width, height, playing duration, encoding parameters, and code rate level of the video segments.
First, the generation of panoramic video content is explained. In June 2016, MPEG (Moving Picture Experts Group) proposed a draft standard for an omnidirectional media application format. When a panoramic video is produced, multiple cameras are usually used to record the visual scene of the real world, and the video frames (images) output by the cameras at the same moment must be stitched, projected and mapped, and then packed into a two-dimensional plane data frame for video coding. Stitching restores the real-world visual field from the images captured by the multiple cameras at the same moment through techniques such as feature point matching and fusion; the stitched images are projected onto a three-dimensional projection structure, such as a sphere or a cube. Since the projection structure is three-dimensional but widely used encoders encode two-dimensional plane video, the image on the projection structure must be further mapped to a two-dimensional plane, and video compression coding is performed after the two-dimensional mapped data frame is obtained. Currently common mapping methods include equirectangular projection (ERP), cubemap projection (CMP), and the like.
In any of the above technical solutions, preferably, the media description file includes spatial position information, encoding information, a code rate, a quality distortion value, and a Uniform Resource Locator (URL) of the video segment.
In this embodiment, the media description file includes, but is not limited to, spatial location information, encoding information, bitrate, quality distortion value, and Uniform Resource Locator (URL) of the video clip.
In a fourth aspect of the present invention, a panoramic video transmission apparatus is provided for a client, used in cooperation with the panoramic video transmission apparatus for a server in any one of the above technical solutions. The apparatus includes: a downloading unit, configured to acquire and parse the media description file from the server and download video segments according to the media description file; a viewpoint adaptation unit, configured to acquire head position information of the user during downloading and predict the probability of each video segment being watched according to the head position information; a code rate adaptation unit, configured to calculate the size of the video playing buffer and calculate the upper limit of the total code rate of the video segments according to the network bandwidth estimate and the buffer size; and an optimized video selection unit, configured to calculate the spatial quality variation and the expected distortion of the video segments, establish a QoE model from them, and select the video segments to be downloaded.
According to the panoramic video transmission apparatus of the present invention, when a client user in a network downloads the panoramic video, the client acquires the uniform resource locator of the panoramic video from the media description file and downloads accordingly. During downloading, viewpoint adaptation, code rate adaptation, and the establishment of an optimization model that maximizes QoE determine which segments are downloaded. Specifically, in viewpoint adaptation, the probability that a future block is observed is predicted from the user's historical head position information; the total video segment code rate that can be transmitted is calculated from the network bandwidth and the size of the playing buffer; finally, the video segments to be downloaded are solved by an optimization method that considers the overall video quality, the spatial quality fluctuation caused by spatial blocking, and the expected distortion of each video segment under its viewing probability. By establishing this optimization model the client obtains video segments selectively, which reduces the data volume of panoramic video transmission, improves video quality, reduces spatial quality fluctuation and playback stalling, and improves the user experience of watching panoramic video in a bandwidth-limited network environment.
In the above technical solution, preferably, the viewpoint adaptation unit is specifically configured to: define (α, β, γ) as the Euler angles of the user orientation, where α is the yaw angle, β is the pitch angle, and γ is the roll angle, the head position information including these Euler angles; define t0 as the current time and t as the prediction interval; define (φ, θ) as a point on the sphere corresponding to a video segment, where φ is the latitude and θ is the longitude; define U_i as the number of points on the sphere corresponding to the i-th video segment; define A(φ, θ) as the set of user orientations from which the point (φ, θ) can be seen; calculate, from the Euler angles (α, β, γ) of the user orientation, the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t) and the correct probability P_E(α, β, γ) of that prediction; and calculate, from P_E(α, β, γ), the probability p(φ, θ) that the point (φ, θ) on the sphere is viewed, using the formula:

p(φ, θ) = (1 / |A(φ, θ)|) · Σ_{(α,β,γ) ∈ A(φ,θ)} P_E(α, β, γ).

The probability P_i of the i-th video segment being viewed is the mean of the viewing probabilities of the points on the sphere corresponding to that segment:

P_i = (1 / U_i) · Σ_{(φ,θ) ∈ segment i} p(φ, θ).
This technical scheme defines the specific steps the client performs for viewpoint adaptation in each adaptation stage. Specifically, to calculate the probability that a video block is seen, the probability that each point on the sphere is seen is calculated first. For a point (φ, θ) on the sphere, since the point may be visible from several user orientations, the probability p(φ, θ) that it is seen is calculated as the mean of the prediction-correctness probabilities over the set A(φ, θ) of orientations from which the point is visible, namely:

p(φ, θ) = (1 / |A(φ, θ)|) · Σ_{(α,β,γ) ∈ A(φ,θ)} P_E(α, β, γ).

The probability P_i that a video block is viewed is then calculated as the mean of the viewing probabilities of the U_i sphere points corresponding to that block, i.e.:

P_i = (1 / U_i) · Σ_{(φ,θ) ∈ block i} p(φ, θ),

where P_E(α, β, γ) is the correct probability of the user orientation predicted for time (t0 + t).
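The two averaging steps above can be sketched as follows (an illustrative reading of the formulas, not the patent's reference implementation; the data layout — one list of visible orientations per sphere point — and the function names are assumptions):

```python
def point_view_probability(orientations, p_e):
    """p(phi, theta): mean of P_E over the set A(phi, theta) of
    orientations from which the sphere point is visible."""
    return sum(p_e(a, b, g) for (a, b, g) in orientations) / len(orientations)

def tile_view_probability(visible_orients_per_point, p_e):
    """P_i: mean of the per-point viewing probabilities over the
    U_i sphere points belonging to tile (video segment) i."""
    return sum(point_view_probability(orients, p_e)
               for orients in visible_orients_per_point) / len(visible_orients_per_point)
```

With a constant P_E the tile probability reduces to that constant, which is a quick sanity check of the double averaging.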
In any of the above technical solutions, preferably, the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t) is calculated as:

α̂(t0 + t) = α(t0) + m_α · t,  β̂(t0 + t) = β(t0) + m_β · t,  γ̂(t0 + t) = γ(t0) + m_γ · t,

where m_α, m_β, m_γ are linear regression parameters. The correct probability P_E(α, β, γ) of the predicted user orientation is calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ), where P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles, respectively, each obtained from the Gaussian distribution of its prediction error, e.g.:

P_yaw(α) = (1 / (σ_α √(2π))) · exp(−(α − α̂ − μ_α)² / (2σ_α²)),

and analogously for P_pitch(β) and P_roll(γ), where μ_α and σ_α are the mean and standard deviation of the yaw-angle prediction error, μ_β and σ_β those of the pitch-angle prediction error, and μ_γ and σ_γ those of the roll-angle prediction error.
In this embodiment, linear regression is first used to calculate the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t), where m_α, m_β, m_γ are the linear regression parameters; these three parameters are solved by the least squares method from the historical yaw, pitch, and roll data within the window [t0 − 1, t0]. Second, data statistics show that the predicted user orientation at time (t0 + t) differs from the true value, and that the difference follows a Gaussian distribution, i.e., e_α ~ N(μ_α, σ_α), e_β ~ N(μ_β, σ_β), e_γ ~ N(μ_γ, σ_γ). From this, the per-angle prediction-correctness probabilities P_yaw(α), P_pitch(β), P_roll(γ) can be calculated. Since the yaw, pitch, and roll angles are independent, the Euler-angle prediction-correctness probability can be calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ).
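The linear-regression prediction and the Gaussian error model described above can be sketched as follows (a minimal illustration with hypothetical function names; the window handling and error statistics are assumptions, and the Gaussian density is used directly as the per-angle correctness score):

```python
import math

def fit_slope(ts, values):
    """Least-squares slope m of values ~ m*t + c over a history window;
    plays the role of m_alpha, m_beta, or m_gamma."""
    n = len(ts)
    mt, mv = sum(ts) / n, sum(values) / n
    num = sum((t - mt) * (v - mv) for t, v in zip(ts, values))
    den = sum((t - mt) ** 2 for t in ts)
    return num / den

def predict_angle(history_ts, history_vals, dt):
    """Linear-regression prediction of one Euler angle dt seconds
    ahead of the last sample."""
    return history_vals[-1] + fit_slope(history_ts, history_vals) * dt

def gaussian_pdf(e, mu, sigma):
    """Density of a prediction error e under N(mu, sigma); used as the
    per-angle correctness term (P_yaw, P_pitch, or P_roll)."""
    return math.exp(-((e - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def orientation_correct_probability(errors, stats):
    """P_E = P_yaw * P_pitch * P_roll, assuming the three angles are
    independent; errors = (e_a, e_b, e_g), stats = three (mu, sigma) pairs."""
    p = 1.0
    for e, (mu, sigma) in zip(errors, stats):
        p *= gaussian_pdf(e, mu, sigma)
    return p
```

For perfectly linear head motion the prediction is exact, which makes the slope fit easy to verify in isolation.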
In any of the above technical solutions, preferably, the code rate adaptation unit is specifically configured to: define T as the duration of each video segment, all segments having equal duration; define b_k as the size of the video playing buffer after the k-th video segment set is downloaded; define R_k as the upper limit of the total code rate for downloading the k-th video segment set; define R_min as the minimum value of that total code rate; define C_k as the estimated network bandwidth while downloading the k-th video segment set; and define B_target as the target value of the video playing buffer. The buffer size b_k after downloading the k-th video segment set is calculated as:

b_k = b_{k−1} + T − (R_k · T) / C_k,

and the upper limit R_k of the total code rate for downloading the k-th video segment set is calculated as:

R_k = max(R_min, C_k · (T + b_{k−1} − B_target) / T).
This technical scheme defines a code rate control algorithm based on a target buffer to determine the total available code rate of the video segments, avoiding playback stalls when the buffer is small. Specifically, the change in video buffer size is determined by the network bandwidth and the total code rate of the downloaded video segment set: the buffer grows by one segment duration T when a segment set finishes downloading, and shrinks by the download time (R_k · T) / C_k as the video plays. The buffer size after downloading the k-th video segment set can therefore be calculated as:

b_k = b_{k−1} + T − (R_k · T) / C_k.

To avoid playback stalls, the buffer size is steered toward a target value, i.e., setting b_k = B_target, from which the total code rate of the k-th video segment set follows as:

R_k = C_k · (T + b_{k−1} − B_target) / T.

To avoid a negative result, R_k is lower-bounded by a minimum value R_min, so that:

R_k = max(R_min, C_k · (T + b_{k−1} − B_target) / T).
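The target-buffer code rate control above can be sketched as follows (illustrative only; T and buffer sizes are in seconds, and bandwidth and code rates share one unit, so their ratio is dimensionless):

```python
def next_buffer(b_prev, T, R_k, C_k):
    """Buffer after downloading the k-th segment set: gains one
    segment duration T, loses the download time R_k*T/C_k."""
    return b_prev + T - (R_k * T) / C_k

def rate_limit(b_prev, T, C_k, B_target, R_min):
    """Total-code-rate upper bound that steers the buffer toward
    B_target, clamped below by R_min to avoid negative rates."""
    R_k = C_k * (T + b_prev - B_target) / T
    return max(R_min, R_k)
```

When the buffer already sits at the target, the chosen rate keeps it there; a drained buffer drives the rate down to R_min.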
In any of the above technical solutions, preferably, the optimized video selection unit is specifically configured to: define N as the number of video segments; define M as the number of code rate levels of the video segments; define r_{i,j} as the code rate of the i-th video segment at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define d_{i,j} as the quality distortion value of the i-th video segment at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define x_{i,j} as whether the i-th video segment is selected at the j-th code rate level, x_{i,j} = 1 meaning selected and x_{i,j} = 0 not selected, all x_{i,j} forming the set X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M}; define s_i as the area of the sphere corresponding to the i-th video segment, with Σ_{i=1}^{N} s_i = 1; define Φ(X) as the expected distortion of the video segments; define Ψ(X) as the spatial quality variation of the video segments; and define R as the upper limit of the total code rate of the video segments. The expected distortion Φ(X) of the video segments is calculated as:

Φ(X) = Σ_{i=1}^{N} P_i · Σ_{j=1}^{M} x_{i,j} · d_{i,j},

where P_i is the probability of the i-th video segment being viewed. The spatial quality variation Ψ(X) of the video segments is calculated as:

Ψ(X) = Σ_{i=1}^{N} P_i · | Σ_{j=1}^{M} x_{i,j} · d_{i,j} − Σ_{i'=1}^{N} s_{i'} · Σ_{j=1}^{M} x_{i',j} · d_{i',j} |,

i.e., the viewing-probability-weighted deviation of each segment's distortion from the area-weighted average distortion, where P_i is the probability of the i-th video segment being viewed. The QoE model is:

min_X Φ(X) + η · Ψ(X),

where η is the optimization target weight, subject to Σ_{i=1}^{N} Σ_{j=1}^{M} x_{i,j} · r_{i,j} ≤ R and Σ_{j=1}^{M} x_{i,j} = 1 for each i.
In this technical scheme, two QoE factors are considered when selecting video segments. The first is the expected distortion, representing the expected value of the distortion given the probability that each video segment is viewed; the second is the spatial quality variation, representing the quality smoothness across segments. The target when solving for the video segments the client should obtain is therefore to maximize QoE by establishing an optimization model. Specifically, Φ(X) is defined as the expected distortion of the video segments, Ψ(X) as their spatial quality variation, η as the optimization target weight, and R as the upper limit of the total code rate of the video segments, which the total code rate of the client's selection cannot exceed; in each adaptation stage this value is the total code rate R_k of the k-th video segment set obtained in the code rate adaptation stage. The optimization problem can then be defined as:

min_X Φ(X) + η · Ψ(X),

subject to Σ_{i=1}^{N} Σ_{j=1}^{M} x_{i,j} · r_{i,j} ≤ R and Σ_{j=1}^{M} x_{i,j} = 1, 1 ≤ i ≤ N.
in each self-adaptive stage, the client solves the objective function and the conditional function to obtain a video segment required to be obtained, then sends a request to the server for downloading, and enters the next self-adaptive stage after downloading is completed until the client finishes watching the video.
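The per-stage selection above can be sketched with a brute-force search over the rate-level choices, practical only for small N and M (the Ψ term below — deviation from the area-weighted mean distortion — is one plausible reading of "spatial quality variation" and is an assumption, as are the function and parameter names):

```python
from itertools import product

def select_segments(rates, dists, probs, areas, R, eta):
    """Exhaustively pick one code rate level per segment minimizing
    Phi(X) + eta * Psi(X) under the total-code-rate budget R.
    rates[i][j], dists[i][j]: code rate / distortion of segment i at
    level j; probs[i]: viewing probability P_i; areas[i]: sphere
    area s_i (summing to 1)."""
    n = len(rates)
    best_cost, best_choice = float("inf"), None
    for choice in product(*[range(len(r)) for r in rates]):
        if sum(rates[i][j] for i, j in enumerate(choice)) > R:
            continue  # violates the total-code-rate constraint
        d = [dists[i][j] for i, j in enumerate(choice)]
        mean_d = sum(areas[i] * d[i] for i in range(n))  # area-weighted mean distortion
        phi = sum(probs[i] * d[i] for i in range(n))     # expected distortion
        psi = sum(probs[i] * abs(d[i] - mean_d) for i in range(n))  # spatial variation
        cost = phi + eta * psi
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_choice  # None if no selection fits the budget
```

The M^N search space is exponential; a real client would use a knapsack-style or greedy solver, but the brute force makes the objective and constraint easy to check on toy inputs.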
In a fifth aspect of the present invention, a panoramic video transmission system is provided, including: the panoramic video transmission device in any one of the above technical solutions is used for a client; and the panoramic video transmission device in any one of the technical schemes is used for a server side.
According to the transmission system of the panoramic video of the present invention, the transmission device of the panoramic video for the server in any one of the above technical solutions and the transmission device of the panoramic video for the client in any one of the above technical solutions are adopted, so that all the beneficial effects of the transmission device of the panoramic video for the server and the transmission device of the panoramic video for the client are provided, and are not described herein again.
In a sixth aspect of the present invention, a computer device is provided, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor is configured to execute the steps of the method for transmitting panoramic video for a server according to any one of the above technical solutions; or the processor is configured to perform the steps of the transmission method for the client-side panoramic video according to any one of the above technical solutions.
A seventh aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the transmission method for panoramic video of a server according to any one of the above technical solutions; or the computer program when executed by the processor, implements the steps of the method for transmitting panoramic video for a client as in any of the above-mentioned technical solutions.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 shows a schematic flow diagram of a transmission method of panoramic video according to an embodiment of the first aspect of the present invention;
fig. 2 shows a flow chart of a transmission method of a panoramic video according to an embodiment of the second aspect of the present invention;
fig. 3 shows a schematic block diagram of a transmission apparatus of panoramic video according to an embodiment of a third aspect of the present invention;
fig. 4 shows a schematic block diagram of a transmission apparatus of a panoramic video according to an embodiment of a fourth aspect of the present invention;
fig. 5 shows a schematic block diagram of a transmission system of panoramic video according to an embodiment of a fifth aspect of the present invention;
FIG. 6 shows a schematic diagram of a computer device according to an embodiment of a sixth aspect of the present invention;
FIG. 7 shows a block-based panoramic video transport framework in accordance with an embodiment of the present invention;
FIG. 8 shows a schematic diagram of a blocking process for panoramic video according to one embodiment of the invention;
FIG. 9 shows a schematic diagram of user orientation in terms of Euler angles, according to one embodiment of the present invention;
FIG. 10 shows a schematic diagram of user orientation prediction and true value difference data statistics, according to one embodiment of the invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
As shown in fig. 1, a flow chart of a transmission method of a panoramic video according to an embodiment of the first aspect of the present invention is schematically shown. The transmission method is used for a server and comprises the following steps:
102, partitioning, coding and slicing a panoramic video according to preset configuration information to obtain video clips and a media description file;
step 104, storing the video clip and the media description file in the server.
According to the panoramic video transmission method provided by the present invention, parameters such as the number of spatial blocks, the block width and height, the encoding parameters, and the video segment duration can be predefined at the server as a configuration file of configuration information. The panoramic video and the configuration file are then retrieved and processed by a processor executing the video blocking; specifically, the panoramic video is blocked, encoded, and sliced to obtain video segments and a media description file, and the processed video segments and media description file are stored in an HTTP server for later use. When a client downloads the panoramic video, viewpoint adaptation and code rate adaptation are performed on the video segments and the media description file, and the video segments to be downloaded are solved by an optimization method, thereby improving video quality, reducing spatial quality fluctuation, and reducing video playback stalling.
In the above embodiment, preferably, the format of the panoramic video includes an ERP format and a CMP format; the configuration information includes: the number, width, height, playing duration, encoding parameters, and code rate level of the video segments.
In this embodiment, the format of the obtained panoramic video includes, but is not limited to, an ERP format and a CMP format, and after the processor that executes video blocking obtains the panoramic video in the ERP format or the CMP format, the processor performs processing according to preset configuration information, where the configuration information includes, but is not limited to, the number of blocks, the width and height of the blocks, encoding parameters, video segment duration, and other parameters.
In any of the above embodiments, preferably, the media description file includes, but is not limited to, spatial location information, encoding information, bitrate, quality distortion value, and Uniform Resource Locator (URL) of the video clip.
As shown in fig. 2, a flowchart of a transmission method of a panoramic video according to an embodiment of the second aspect of the present invention is shown. The transmission method is used for a client, and is used in cooperation with a server in combination with the transmission method of the panoramic video in any one of the embodiments, and the transmission method comprises the following steps:
step 202, acquiring and analyzing a media description file from a server side, and downloading a video clip according to the media description file;
step 204, in the downloading process, obtaining the head position information of the user, and predicting the probability of the video clip being watched according to the head position information;
step 206, calculating the size of a video playing buffer area, and calculating the upper limit value of the total code rate of the video clip according to the network bandwidth estimation value and the size of the video playing buffer area;
and step 208, calculating the spatial quality change of the video clip and the expected distortion of the video clip, establishing a QoE model according to the spatial quality change and the expected distortion of the video clip, and selecting the video clip needing to be downloaded.
In this embodiment, when downloading a panoramic video, a client user in a network acquires the uniform resource locator of the panoramic video from the media description file and downloads accordingly. During downloading, viewpoint adaptation, code rate adaptation, and the establishment of an optimization model that maximizes QoE determine which segments are downloaded. Specifically, in viewpoint adaptation, the probability that a future block is observed is predicted from the user's historical head position information; the total video segment code rate that can be transmitted is calculated from the network bandwidth and the size of the playing buffer; finally, the video segments to be downloaded are solved by an optimization method that considers the overall video quality, the spatial quality fluctuation caused by spatial blocking, and the expected distortion of each video segment under its viewing probability. By establishing this optimization model the client obtains video segments selectively, which reduces the data volume of panoramic video transmission, improves video quality, reduces spatial quality fluctuation and playback stalling, and improves the user experience of watching panoramic video in a bandwidth-limited network environment.
In the above embodiment, preferably, the step of obtaining the head position information of the user and predicting the probability of the video segments being viewed according to the head position information specifically includes: define (α, β, γ) as the Euler angles of the user orientation, where α is the yaw angle, β is the pitch angle, and γ is the roll angle, the head position information including these Euler angles; define t0 as the current time and t as the prediction interval; define (φ, θ) as a point on the sphere corresponding to a video segment, where φ is the latitude and θ is the longitude; define U_i as the number of points on the sphere corresponding to the i-th video segment; define A(φ, θ) as the set of user orientations from which the point (φ, θ) can be seen; calculate, from the Euler angles (α, β, γ) of the user orientation, the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t) and the correct probability P_E(α, β, γ) of that prediction; and calculate, from P_E(α, β, γ), the probability p(φ, θ) that the point (φ, θ) on the sphere is viewed, using the formula:

p(φ, θ) = (1 / |A(φ, θ)|) · Σ_{(α,β,γ) ∈ A(φ,θ)} P_E(α, β, γ).

The probability P_i of the i-th video segment being viewed is the mean of the viewing probabilities of the points on the sphere corresponding to that segment:

P_i = (1 / U_i) · Σ_{(φ,θ) ∈ segment i} p(φ, θ).
This embodiment defines the specific steps the client performs for viewpoint adaptation in each adaptation stage. Specifically, to calculate the probability that a video block is seen, the probability that each point on the sphere is seen is calculated first. For a point (φ, θ) on the sphere, since the point may be visible from several user orientations, the probability p(φ, θ) that it is seen is calculated as the mean of the prediction-correctness probabilities over the set A(φ, θ) of orientations from which the point is visible, namely:

p(φ, θ) = (1 / |A(φ, θ)|) · Σ_{(α,β,γ) ∈ A(φ,θ)} P_E(α, β, γ).

The probability P_i that a video block is viewed is then calculated as the mean of the viewing probabilities of the U_i sphere points corresponding to that block, i.e.:

P_i = (1 / U_i) · Σ_{(φ,θ) ∈ block i} p(φ, θ),

where P_E(α, β, γ) is the correct probability of the user orientation predicted for time (t0 + t).
In any of the above embodiments, preferably, the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t) is calculated as:

α̂(t0 + t) = α(t0) + m_α · t,  β̂(t0 + t) = β(t0) + m_β · t,  γ̂(t0 + t) = γ(t0) + m_γ · t,

where m_α, m_β, m_γ are linear regression parameters. The correct probability P_E(α, β, γ) of the predicted user orientation is calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ), where P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles, respectively, each obtained from the Gaussian distribution of its prediction error, e.g.:

P_yaw(α) = (1 / (σ_α √(2π))) · exp(−(α − α̂ − μ_α)² / (2σ_α²)),

and analogously for P_pitch(β) and P_roll(γ), where μ_α and σ_α are the mean and standard deviation of the yaw-angle prediction error, μ_β and σ_β those of the pitch-angle prediction error, and μ_γ and σ_γ those of the roll-angle prediction error.
In this embodiment, linear regression is first used to calculate the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t), where m_α, m_β, m_γ are the linear regression parameters; these three parameters are solved by the least squares method from the historical yaw, pitch, and roll data within the window [t0 − 1, t0]. Second, data statistics show that the predicted user orientation at time (t0 + t) differs from the true value, and that the difference follows a Gaussian distribution, i.e., e_α ~ N(μ_α, σ_α), e_β ~ N(μ_β, σ_β), e_γ ~ N(μ_γ, σ_γ). From this, the per-angle prediction-correctness probabilities P_yaw(α), P_pitch(β), P_roll(γ) can be calculated. Since the yaw, pitch, and roll angles are independent, the Euler-angle prediction-correctness probability can be calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ).
In any of the above embodiments, preferably, the step of calculating the size of the video playing buffer and calculating the upper limit of the total code rate of the video segments according to the network bandwidth estimate and the buffer size specifically includes: define T as the duration of each video segment, all segments having equal duration; define b_k as the size of the video playing buffer after the k-th video segment set is downloaded; define R_k as the upper limit of the total code rate for downloading the k-th video segment set; define R_min as the minimum value of that total code rate; define C_k as the estimated network bandwidth while downloading the k-th video segment set; and define B_target as the target value of the video playing buffer. The buffer size b_k after downloading the k-th video segment set is calculated as:

b_k = b_{k−1} + T − (R_k · T) / C_k,

and the upper limit R_k of the total code rate for downloading the k-th video segment set is calculated as:

R_k = max(R_min, C_k · (T + b_{k−1} − B_target) / T).
In this embodiment, a code rate control algorithm based on the target buffer is defined to determine the total available code rate of the video segments, avoiding playback stalls when the buffer is small. Specifically, the change in video buffer size is determined by the network bandwidth and the total code rate of the downloaded video segment set: the buffer grows by one segment duration T when a segment set finishes downloading, and shrinks by the download time (R_k · T) / C_k as the video plays. The buffer size after downloading the k-th video segment set can therefore be calculated as:

b_k = b_{k−1} + T − (R_k · T) / C_k.

To avoid playback stalls, the buffer size is steered toward a target value, i.e., setting b_k = B_target, from which the total code rate of the k-th video segment set follows as:

R_k = C_k · (T + b_{k−1} − B_target) / T.

To avoid a negative result, R_k is lower-bounded by a minimum value R_min, so that:

R_k = max(R_min, C_k · (T + b_{k−1} − B_target) / T).
In any of the above embodiments, preferably: define N as the number of video segments; define M as the number of code rate levels of the video segments; define r_{i,j} as the code rate of the i-th video segment at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define d_{i,j} as the quality distortion value of the i-th video segment at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define x_{i,j} as whether the i-th video segment is selected at the j-th code rate level, x_{i,j} = 1 meaning selected and x_{i,j} = 0 not selected, all x_{i,j} forming the set X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M}; define s_i as the area of the sphere corresponding to the i-th video segment, with Σ_{i=1}^{N} s_i = 1; define Φ(X) as the expected distortion of the video segments; define Ψ(X) as the spatial quality variation of the video segments; and define R as the upper limit of the total code rate of the video segments. The expected distortion Φ(X) of the video segments is calculated as:

Φ(X) = Σ_{i=1}^{N} P_i · Σ_{j=1}^{M} x_{i,j} · d_{i,j},

where P_i is the probability of the i-th video segment being viewed. The spatial quality variation Ψ(X) of the video segments is calculated as:

Ψ(X) = Σ_{i=1}^{N} P_i · | Σ_{j=1}^{M} x_{i,j} · d_{i,j} − Σ_{i'=1}^{N} s_{i'} · Σ_{j=1}^{M} x_{i',j} · d_{i',j} |,

i.e., the viewing-probability-weighted deviation of each segment's distortion from the area-weighted average distortion, where P_i is the probability of the i-th video segment being viewed. The QoE model is:

min_X Φ(X) + η · Ψ(X),

where η is the optimization target weight, subject to Σ_{i=1}^{N} Σ_{j=1}^{M} x_{i,j} · r_{i,j} ≤ R and Σ_{j=1}^{M} x_{i,j} = 1 for each i.
In this embodiment, two QoE factors need to be considered when selecting video segments. The first is the expected distortion, representing the expected value of the distortion given the probability that each video segment is viewed; the second is the spatial quality variation, representing the quality smoothness across segments. The target when solving for the video segments the client should obtain is therefore to maximize QoE by establishing an optimization model. Specifically, Φ(X) is defined as the expected distortion of the video segments, Ψ(X) as their spatial quality variation, η as the optimization target weight, and R as the upper limit of the total code rate of the video segments, which the total code rate of the client's selection cannot exceed; in each adaptation stage this value is the total code rate R_k of the k-th video segment set obtained in the code rate adaptation stage. The optimization problem can then be defined as:

min_X Φ(X) + η · Ψ(X),

subject to Σ_{i=1}^{N} Σ_{j=1}^{M} x_{i,j} · r_{i,j} ≤ R and Σ_{j=1}^{M} x_{i,j} = 1, 1 ≤ i ≤ N.
in each self-adaptive stage, the client solves the objective function and the conditional function to obtain a video segment required to be obtained, then sends a request to the server for downloading, and enters the next self-adaptive stage after downloading is completed until the client finishes watching the video.
As shown in fig. 3, a schematic block diagram of a transmission apparatus of a panoramic video according to an embodiment of the third aspect of the present invention is illustrated. The transmission apparatus 300 is used for a server and includes:
a processing unit 302, configured to perform blocking, encoding, and slicing processing on the panoramic video according to preset configuration information to obtain a video clip and a media description file;
a storage unit 304, configured to store the video segment and the media description file in the server.
According to the panoramic video transmission apparatus 300 of the present invention, at the server side, parameters such as the number of spatial blocks, the width and height of the blocks, the coding parameters, and the video segment duration can be predefined and used as a configuration file of configuration information. A processor executing the video blocking then retrieves the panoramic video and the configuration file and processes them; specifically, the panoramic video is blocked, coded, and sliced to obtain video clips and a media description file, and the processed video clips and media description file are stored in an HTTP server for later use. When a client downloads the panoramic video, it performs viewpoint adaptation and code rate adaptation on the video segments according to the media description file and solves for the video segments to be downloaded through an optimization method, thereby improving video quality and reducing spatial quality fluctuation and video playing stalls.
In the above embodiment, preferably, the format of the panoramic video includes an ERP format and a CMP format; the configuration information includes: the number, width, height, playing duration, encoding parameters, and code rate level of the video segments.
In this embodiment, the format of the obtained panoramic video includes, but is not limited to, an ERP format and a CMP format, and after the processor that executes video blocking obtains the panoramic video in the ERP format or the CMP format, the processor performs processing according to preset configuration information, where the configuration information includes, but is not limited to, the number of blocks, the width and height of the blocks, encoding parameters, video segment duration, and other parameters.
In any of the above embodiments, preferably, the media description file includes spatial location information, encoding information, bitrate, quality distortion value, and Uniform Resource Locator (URL) of the video clip.
In this embodiment, the media description file includes, but is not limited to, spatial location information, encoding information, bitrate, quality distortion value, Uniform Resource Locator (URL) of the video clip.
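For illustration only, a media description carrying these fields might be represented client-side as the following structure; the field names and values here are hypothetical, not the patent's actual file schema.

```python
# Hypothetical client-side view of a media description file; the field
# names and values are illustrative, not the patent's actual schema.
media_description = {
    "segment_duration_s": 1.0,
    "tiles": [
        {
            "tile_id": 0,
            # spatial location information of the video clip
            "position": {"row": 0, "col": 0, "width": 640, "height": 640},
            # encoding information
            "codec": "hevc",
            # one entry per code-rate level: bitrate, quality distortion
            # value, and the Uniform Resource Locator of the clip
            "representations": [
                {"bitrate_kbps": 500,  "distortion": 0.042, "url": "tile0_500k.m4s"},
                {"bitrate_kbps": 1000, "distortion": 0.021, "url": "tile0_1000k.m4s"},
            ],
        },
    ],
}

# After parsing, the client can e.g. look up the URL of the
# lowest-distortion representation of a tile:
best = min(media_description["tiles"][0]["representations"],
           key=lambda rep: rep["distortion"])
```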
As shown in fig. 4, a schematic block diagram of a transmission apparatus of a panoramic video according to an embodiment of the fourth aspect of the present invention is illustrated. The transmission apparatus 400 is used for a client, cooperates with the panoramic video transmission apparatus of any of the above embodiments, and includes:
a downloading unit 402, configured to acquire and parse the media description file from the server, and download the video segment according to the media description file;
a viewpoint adaptive unit 404, configured to acquire head position information of the user during the downloading process, and predict the probability that the video segment is viewed according to the head position information;
a code rate self-adapting unit 406, configured to calculate the size of a video playing buffer, and calculate an upper limit value of a total code rate of a video segment according to the network bandwidth estimation value and the size of the video playing buffer;
and an optimized video selection unit 408, configured to calculate a spatial quality change of the video segment and an expected distortion of the video segment, establish a QoE model according to the spatial quality change and the expected distortion of the video segment, and select a video segment to be downloaded.
According to the panoramic video transmission apparatus 400, when downloading a panoramic video, a client user in the network can obtain the uniform resource locators of the panoramic video from the media description file and download accordingly. During downloading, the client performs viewpoint adaptation and code rate adaptation, and establishes an optimization model that maximizes QoE to determine which segments to download. Specifically, in the viewpoint adaptation, the probability that a future block is viewed is predicted from the user's historical head position information; in the code rate adaptation, the total transmittable video clip code rate is calculated according to the network bandwidth and the size of the play buffer; finally, the video clips to be downloaded are solved for by an optimization method that considers the overall quality of the video, the spatial quality fluctuation caused by spatial blocking, and the expected distortion of the video clips under their viewing probabilities. By establishing this optimization model, the client obtains video clips selectively, which reduces the data volume of panoramic video transmission, improves video quality, reduces spatial quality fluctuation and video playing stalls, and improves the user experience of watching panoramic video in bandwidth-limited network environments.
In the above embodiment, preferably, the viewpoint adaptive unit 404 is specifically configured to: define (α, β, γ) as the Euler angle of the user orientation, where α is the yaw angle, β is the pitch angle, and γ is the roll angle; define t0 as the current time; define a prediction interval; define P(φ, θ) as a point on the sphere corresponding to the video segment, where φ is the latitude and θ is the longitude; define U_i as the number of points on the sphere corresponding to the i-th video clip; define O_{P(φ,θ)} as the set of orientations from which the point P(φ, θ) can be seen. From the Euler angle (α, β, γ) of the user orientation, calculate the predicted value (α̂, β̂, γ̂) of the user orientation at time (t0 + prediction interval) and the correct probability P_E(α, β, γ) of the predicted value of the user orientation. According to the correct probability P_E(α, β, γ) of the predicted value of the user orientation at time (t0 + prediction interval), calculate the probability p(P(φ, θ)) that the point P(φ, θ) on the sphere corresponding to the video clip is viewed, by the formula:

p(P(φ, θ)) = (1 / |O_{P(φ,θ)}|) · Σ_{(α,β,γ) ∈ O_{P(φ,θ)}} P_E(α, β, γ)

The probability P_i that a video segment is viewed is calculated as the mean of the viewed probabilities of the points on the sphere corresponding to the video clip:

P_i = (1 / U_i) · Σ_{P(φ,θ) ∈ clip i} p(P(φ, θ))
In this embodiment, the specific steps of the viewpoint adaptation that the client performs at each adaptation stage are defined. To calculate the probability that a video block is seen, the probability that each point on the sphere is seen is calculated first. For a point P(φ, θ) on the sphere, since the point is likely to be seen from several user orientations, the probability p(P(φ, θ)) that the point is seen is calculated as the mean of the correct probabilities over the set O_{P(φ,θ)} of orientations from which the point is visible, namely:

p(P(φ, θ)) = (1 / |O_{P(φ,θ)}|) · Σ_{(α,β,γ) ∈ O_{P(φ,θ)}} P_E(α, β, γ)

The probability P_i that a video block is viewed is calculated as the mean of the viewed probabilities of the U_i points on the sphere corresponding to this video block, namely:

P_i = (1 / U_i) · Σ_{P(φ,θ) ∈ block i} p(P(φ, θ))

where P_E(α, β, γ) is the correct probability of the predicted user orientation at time (t0 + prediction interval).
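The two averaging steps described above can be sketched directly; the function names below are ours, for illustration only.

```python
def point_view_probability(orientation_probs):
    """Probability that a sphere point P(phi, theta) is seen: the mean of
    the orientation-prediction correct probabilities P_E over the set O_P
    of orientations from which the point is visible."""
    return sum(orientation_probs) / len(orientation_probs)

def block_view_probability(point_probs):
    """Probability P_i that a video block is viewed: the mean of the
    viewed probabilities of the U_i sphere points inside the block."""
    return sum(point_probs) / len(point_probs)

# e.g. a block covered by two sphere points, each visible from a few
# predicted orientations:
p1 = point_view_probability([1.0, 0.5, 0.0])
p2 = point_view_probability([0.2, 0.4])
P_i = block_view_probability([p1, p2])
```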
In any of the above embodiments, preferably, the predicted value (α̂, β̂, γ̂) of the user orientation is calculated as:

α̂ = α(t0) + m_α · Δt,  β̂ = β(t0) + m_β · Δt,  γ̂ = γ(t0) + m_γ · Δt

where m_α, m_β, m_γ are linear regression parameters and Δt is the prediction interval. The correct probability P_E(α, β, γ) of the predicted value of the user orientation is calculated by the formula P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ), where P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles respectively, calculated as:

P_yaw(α) = (1 / (√(2π) · σ_α)) · exp(−(α − α̂ − μ_α)² / (2σ_α²))

and analogously for P_pitch(β) and P_roll(γ), where μ_α is the mean of the yaw angle error distribution and σ_α is the standard deviation of the predicted yaw angle; μ_β is the mean of the pitch angle error distribution and σ_β is the standard deviation of the predicted pitch angle; μ_γ is the mean of the roll angle error distribution and σ_γ is the standard deviation of the predicted roll angle.
In this embodiment, linear regression is first used to calculate the predicted value (α̂, β̂, γ̂) of the user orientation at time (t0 + prediction interval), where m_α, m_β, m_γ are the linear regression parameters; these three parameters can be solved by the least squares method from the historical yaw, pitch, and roll data in the window [t0 − 1, t0]. Second, data statistics show that the predicted value of the user orientation at time (t0 + prediction interval) differs from its true value, and that the prediction errors follow Gaussian distributions, i.e. e_α ~ N(μ_α, σ_α), e_β ~ N(μ_β, σ_β), e_γ ~ N(μ_γ, σ_γ). From this, the Euler-angle prediction correct-probability components P_yaw(α), P_pitch(β), P_roll(γ) can be calculated. Since the yaw, pitch, and roll angles are independent, the Euler-angle prediction correct probability P_E(α, β, γ) can be calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ).
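A minimal sketch of the per-angle prediction pipeline described above, assuming a simple least-squares line fit over the history window and a Gaussian density as the error model; the function names and the exact regression form are our assumptions, not the patent's.

```python
import math

def predict_angle(times, angles, t_pred):
    """Extrapolate one Euler angle to time t_pred with a least-squares
    line fit over the history window (a sketch of the per-angle linear
    regression; the exact form is an assumption)."""
    n = len(times)
    t_mean = sum(times) / n
    a_mean = sum(angles) / n
    slope = sum((t - t_mean) * (a - a_mean) for t, a in zip(times, angles)) \
            / sum((t - t_mean) ** 2 for t in times)
    intercept = a_mean - slope * t_mean
    return slope * t_pred + intercept

def angle_correct_prob(error, mu, sigma):
    """Correct-probability component of one angle, assuming the prediction
    error follows the Gaussian N(mu, sigma) found by data statistics."""
    return math.exp(-((error - mu) ** 2) / (2.0 * sigma ** 2)) \
           / (math.sqrt(2.0 * math.pi) * sigma)

def euler_correct_prob(errors, mus, sigmas):
    """P_E as the product of the independent yaw/pitch/roll components."""
    p = 1.0
    for e, m, s in zip(errors, mus, sigmas):
        p *= angle_correct_prob(e, m, s)
    return p
```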
In any of the above embodiments, preferably, the code rate adaptation unit 406 is specifically configured to: define T as the duration of a video clip, where the durations of all video clips are equal; define b_k as the size of the video playing buffer after the k-th video clip set is downloaded; define R_k as the upper limit value of the total code rate for downloading the k-th video clip set; define R_min as the minimum value of the total code rate for downloading the k-th video clip set; define C_k as the estimated network bandwidth during the download of the k-th video clip set; define B_target as the target value of the video playing buffer. The size b_k of the video playing buffer after the k-th video clip set is downloaded is calculated by the formula:

b_k = b_{k−1} − (R_k · T) / C_k + T

The upper limit value R_k of the total code rate for downloading the k-th video clip set is calculated by the formula:

R_k = max(R_min, C_k · (b_{k−1} + T − B_target) / T)
In this embodiment, a code rate control algorithm based on the target buffer is defined to determine the total available code rate of the video clips, thereby avoiding playback stalls when the buffer is small. Specifically, the change in the video buffer size is determined by the network bandwidth and the total code rate of the downloaded video clip set: downloading the k-th video clip set of total code rate R_k occupies (R_k · T) / C_k seconds, during which the buffer drains as the video plays, and the completed download adds T seconds of playable video to the buffer. Therefore, the video buffer size after downloading the k-th video clip set can be calculated as:

b_k = b_{k−1} − (R_k · T) / C_k + T

To avoid video playback stalls, the video buffer size is controlled toward a target value, i.e. b_k = B_target is imposed, from which the total code rate of the k-th video clip set is found to be:

R_k = C_k · (b_{k−1} + T − B_target) / T

To avoid negative results, a minimum value R_min is set for R_k, so R_k can be calculated as:

R_k = max(R_min, C_k · (b_{k−1} + T − B_target) / T)
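The target-buffer control described in this embodiment can be sketched as follows; the function names are ours, and the formulas follow the buffer-evolution reasoning above.

```python
def total_bitrate_limit(b_prev, T, C_k, B_target, R_min):
    """Upper limit R_k on the total code rate of the k-th clip set: pick
    the rate that steers the playback buffer toward B_target, clamped
    below by R_min to avoid a negative result."""
    return max(R_min, C_k * (b_prev + T - B_target) / T)

def buffer_after_download(b_prev, T, R_k, C_k):
    """Buffer evolution: downloading R_k * T bits at bandwidth C_k drains
    R_k * T / C_k seconds of playback, then the finished set adds T
    seconds of playable video."""
    return b_prev - R_k * T / C_k + T

# With the buffer above target, the controller picks a high rate so the
# buffer drains back to exactly B_target:
R_k = total_bitrate_limit(b_prev=5.0, T=1.0, C_k=8.0, B_target=4.0, R_min=1.0)
b_k = buffer_after_download(5.0, 1.0, R_k, 8.0)
```

When the buffer is already below target, the raw formula would go negative, and the R_min clamp keeps the request at a small but positive rate.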
In any of the above embodiments, preferably, the optimized video selection unit 408 is specifically configured to: define N as the number of video clips; define M as the number of code rate levels of a video clip; define r_{i,j} as the code rate of the i-th video clip at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define d_{i,j} as the quality distortion value of the i-th video clip at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define x_{i,j} as whether the i-th video clip is selected at the j-th code rate level, where x_{i,j} = 1 represents selected and x_{i,j} = 0 represents not selected, 1 ≤ i ≤ N, 1 ≤ j ≤ M, and all i and j form the set X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M}; define s_i as the area of the sphere corresponding to the i-th video segment, with Σ_{i=1}^{N} s_i = 1; define Φ(X) as the expected distortion of the video segment; define Ψ(X) as the spatial quality variation of the video segment; define R as the upper limit value of the total code rate of the video clips. The expected distortion Φ(X) of a video segment is calculated as:

Φ(X) = Σ_{i=1}^{N} Σ_{j=1}^{M} P_i · s_i · d_{i,j} · x_{i,j}

where P_i is the probability of the video segment being viewed;

the spatial quality variation Ψ(X) of a video segment is calculated by the following formula:

Ψ(X) = Σ_{i=1}^{N} Σ_{j=1}^{M} P_i · s_i · (d_{i,j} − Φ(X))² · x_{i,j}

where P_i is the probability of the video segment being viewed;

the QoE model is:

max_X QoE(X) = −(Φ(X) + η · Ψ(X))

where η ≥ 0 is the optimization target weight.
In this embodiment, two QoE factors need to be considered when selecting a video clip: the expected distortion and the spatial quality variation. Therefore, the video clips to be obtained by the client are solved for by establishing an optimization model whose objective is to maximize QoE. Specifically, Φ(X) is defined as the expected distortion of the video segment, Ψ(X) as the spatial quality variation of the video segment, η as the optimization target weight, and R as the upper limit value of the total code rate of the video clips; the total code rate of the video clips selected by the client cannot exceed this value, and at each adaptive stage it is the total code rate R_k of the k-th video clip set obtained in the code rate adaptation stage. The optimization problem can then be defined as:

min_X Φ(X) + η · Ψ(X)

subject to:

Σ_{i=1}^{N} Σ_{j=1}^{M} r_{i,j} · x_{i,j} ≤ R,
Σ_{j=1}^{M} x_{i,j} = 1 for every 1 ≤ i ≤ N,
x_{i,j} ∈ {0, 1}.
In each adaptive stage, the client solves this objective function under its constraints to obtain the video segments it needs, sends a request to the server to download them, and enters the next adaptive stage once the download completes, until the client finishes watching the video.
As shown in fig. 5, a schematic block diagram of a transmission system of panoramic video according to an embodiment of the fifth aspect of the present invention is illustrated. The transmission system 500 includes: the panoramic video transmission apparatus 502 of any one of the above embodiments, used for a server; and the panoramic video transmission apparatus 504 of any one of the above embodiments, used for a client.
As shown in FIG. 6, a schematic diagram of a computer device according to an embodiment of the sixth aspect of the present invention is illustrated. The computer device 1 comprises: a memory 12, a processor 14, and a computer program stored on the memory 12 and executable on the processor 14, the processor 14 being configured to perform the steps of the panoramic video transmission method for a server as in any of the above embodiments, or to perform the steps of the panoramic video transmission method for a client as in any of the above embodiments.
As shown in fig. 7, a block-based panoramic video transmission framework according to an embodiment of the present invention is illustrated.
In this embodiment, first, parameters such as the number of spatial blocks, the width and height of the blocks, coding parameters, video segment duration, and the like are predefined as configuration files of configuration information; then, a processor executing the video blocking acquires the panoramic video and the configuration file in the ERP format and performs required processing. Specifically, the original panoramic video is partitioned, encoded and sliced to obtain a video clip and a media description file, wherein the media description file comprises information such as spatial position information of the video clip, encoding information of the video clip, data size of the video clip, quality distortion of the video clip and the like; and then, storing the generated media description file and media segment in an HTTP server for standby.
When a client in the network downloads the panoramic video, it needs to perform viewpoint adaptation and code rate adaptation. The code rate adaptation is based on a target-buffer code rate control algorithm; specifically, the total transmittable video clip code rate is calculated according to the network bandwidth and the size of the play buffer. The viewpoint adaptation predicts the viewing probability of each video block based on a viewing probability model; specifically, the probability that a future block is viewed is predicted from the user's historical head position information. Finally, the client solves for the video segments to be downloaded by an optimization method which considers the total quality of the video and the spatial quality fluctuation caused by spatial blocking, thereby improving video quality and reducing spatial quality fluctuation and video playing stalls.
As shown in fig. 8, a schematic diagram of a blocking process for a panoramic video according to an embodiment of the present invention is illustrated.
In this embodiment, the original panoramic video is blocked, encoded, and sliced to obtain video clips and a media description file. The processing involves the following definitions: define W as the width of the panoramic video; define H as the height of the panoramic video; define N as the number of video blocks; define M as the number of code rate levels of the video; define r_{i,j} as the actual code rate of the video clip of the i-th video block at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define d_{i,j} as the distortion value of the video clip of the i-th video block at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M.
As shown in fig. 9, a schematic diagram of user orientation in euler angles according to one embodiment of the present invention. The orientation of the user's head is represented by euler angles, which are yaw, pitch, and roll angles, respectively.
As shown in fig. 10, a diagram of user orientation prediction value and true value difference data statistics according to one embodiment of the present invention.
In this embodiment, the data statistics show that the predicted value (α̂, β̂, γ̂) of the user head orientation at time (t0 + prediction interval) differs from its true value, and that the prediction errors follow Gaussian distributions, i.e. e_α ~ N(μ_α, σ_α), e_β ~ N(μ_β, σ_β), e_γ ~ N(μ_γ, σ_γ), where μ_α is the mean of the yaw angle error distribution and σ_α is the standard deviation of the predicted yaw angle; μ_β is the mean of the pitch angle error distribution and σ_β is the standard deviation of the predicted pitch angle; μ_γ is the mean of the roll angle error distribution and σ_γ is the standard deviation of the predicted roll angle. From this, the Euler-angle prediction correct-probability components P_yaw(α), P_pitch(β), P_roll(γ) can be calculated. Since the yaw, pitch, and roll angles are independent, the Euler-angle prediction correct probability P_E(α, β, γ) can be calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ).
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A transmission method of panoramic video is used for a client, and is characterized in that the transmission method comprises the following steps:
acquiring and analyzing a media description file from a server, and downloading a video clip according to the media description file, wherein the media description file and the video clip are obtained by the server by blocking, coding and slicing the panoramic video according to preset configuration information;
in the downloading process, acquiring head position information of a user, and predicting the probability of the video clip being watched according to the head position information;
calculating the size of a video playing buffer area, and calculating the upper limit value of the total code rate of the video clip according to the network bandwidth estimation value and the size of the video playing buffer area;
calculating the space quality change of the video clip and the expected distortion of the video clip, establishing a QoE model according to the space quality change and the expected distortion of the video clip, and selecting the video clip needing to be downloaded;
the method for predicting the probability of the video clip being watched according to the head position information comprises the following steps:
the head position information comprises the Euler angle of the user orientation, defined as (α, β, γ), wherein α is the yaw angle, β is the pitch angle, and γ is the roll angle;
defining t0 as the current time; defining a prediction interval;
defining P(φ, θ) as a point on the sphere corresponding to the video segment, wherein φ is the latitude and θ is the longitude;
defining U_i as the number of points on the sphere corresponding to the i-th video clip;
defining O_{P(φ,θ)} as the set of orientations from which the point P(φ, θ) can be seen;
calculating, from the Euler angle (α, β, γ) of the user orientation, the predicted value (α̂, β̂, γ̂) of the user orientation at time (t0 + prediction interval) and the correct probability P_E(α, β, γ) of the predicted value of the user orientation;
calculating, according to the correct probability P_E(α, β, γ) of the predicted value of the user orientation at time (t0 + prediction interval), the probability p(P(φ, θ)) that the point P(φ, θ) on the sphere corresponding to the video clip is viewed, by the formula:
p(P(φ, θ)) = (1 / |O_{P(φ,θ)}|) · Σ_{(α,β,γ) ∈ O_{P(φ,θ)}} P_E(α, β, γ);
the probability P_i that the video segment is viewed is calculated as the mean of the viewed probabilities of the points on the sphere corresponding to the video clip:
P_i = (1 / U_i) · Σ_{P(φ,θ) ∈ clip i} p(P(φ, θ)).
2. The method for transmitting a panoramic video according to claim 1, wherein
the predicted value (α̂, β̂, γ̂) of the user orientation is calculated as:
α̂ = α(t0) + m_α · Δt, β̂ = β(t0) + m_β · Δt, γ̂ = γ(t0) + m_γ · Δt,
wherein m_α, m_β, m_γ are linear regression parameters and Δt is the prediction interval;
the correct probability P_E(α, β, γ) of the predicted value of the user orientation is calculated by the formula:
P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ),
wherein P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles respectively, calculated as:
P_yaw(α) = (1 / (√(2π) · σ_α)) · exp(−(α − α̂ − μ_α)² / (2σ_α²)),
and analogously for P_pitch(β) and P_roll(γ),
wherein:
μ_α is the mean of the yaw angle error distribution, and σ_α is the standard deviation of the predicted yaw angle;
μ_β is the mean of the pitch angle error distribution, and σ_β is the standard deviation of the predicted pitch angle;
μ_γ is the mean of the roll angle error distribution, and σ_γ is the standard deviation of the predicted roll angle.
3. The method for transmitting a panoramic video according to claim 1, wherein the step of calculating the size of the video playing buffer and calculating the upper limit value of the total bitrate of the video clip according to the network bandwidth estimation value and the size of the video playing buffer specifically comprises:
defining T as the video segment duration, wherein the durations of all video segments are equal;
defining b_k as the size of the video playing buffer after the k-th video clip set is downloaded;
defining R_k as the upper limit value of the total code rate for downloading the k-th video clip set;
defining R_min as the minimum value of the total code rate for downloading the k-th video clip set;
defining C_k as the estimated network bandwidth during the download of the k-th video clip set;
defining B_target as the target value of the video playing buffer;
the size b_k of the video playing buffer after the k-th video clip set is downloaded is calculated by the formula:
b_k = b_{k−1} − (R_k · T) / C_k + T;
the upper limit value R_k of the total code rate for downloading the k-th video clip set is calculated by the formula:
R_k = max(R_min, C_k · (b_{k−1} + T − B_target) / T).
4. The method for transmitting a panoramic video according to claim 1, wherein:
defining N as the number of the video clips;
defining M as the number of code rate levels of the video clips;
defining r_{i,j} as the code rate of the i-th video clip at the j-th code rate level, wherein 1 ≤ i ≤ N and 1 ≤ j ≤ M;
defining d_{i,j} as the quality distortion value of the i-th video clip at the j-th code rate level, wherein 1 ≤ i ≤ N and 1 ≤ j ≤ M;
defining x_{i,j} as whether the i-th video clip is selected at the j-th code rate level, wherein x_{i,j} = 1 represents selected and x_{i,j} = 0 represents not selected, 1 ≤ i ≤ N, 1 ≤ j ≤ M, and all i and j form the set X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M};
defining s_i as the area of the sphere corresponding to the i-th video segment, with Σ_{i=1}^{N} s_i = 1;
defining Φ(X) as the expected distortion of the video segment;
defining Ψ(X) as the spatial quality variation of the video segment;
defining R as the upper limit value of the total code rate of the video clips;
the expected distortion Φ(X) of the video segment is calculated as:
Φ(X) = Σ_{i=1}^{N} Σ_{j=1}^{M} P_i · s_i · d_{i,j} · x_{i,j},
wherein P_i is the probability of the video segment being viewed;
the spatial quality variation Ψ(X) of the video segment is calculated as:
Ψ(X) = Σ_{i=1}^{N} Σ_{j=1}^{M} P_i · s_i · (d_{i,j} − Φ(X))² · x_{i,j},
wherein P_i is the probability of the video segment being viewed;
the QoE model is:
max_X QoE(X) = −(Φ(X) + η · Ψ(X)),
wherein η ≥ 0 is the optimization target weight.
5. a transmission apparatus of panoramic video for a client, the transmission apparatus comprising:
the downloading unit is used for acquiring and analyzing a media description file from a server side and downloading a video clip according to the media description file, wherein the media description file and the video clip are obtained by the server side by blocking, coding and slicing the panoramic video according to preset configuration information;
the viewpoint adaptive unit is used for acquiring the head position information of a user during the downloading process, and predicting the probability that the video clip is viewed according to the head position information;
the code rate self-adaption unit is used for calculating the size of a video playing buffer area and calculating the upper limit value of the total code rate of the video clip according to the network bandwidth estimation value and the size of the video playing buffer area;
the optimized video selection unit is used for calculating the space quality change of the video clip and the expected distortion of the video clip, establishing a QoE model according to the space quality change and the expected distortion of the video clip, and selecting the video clip needing to be downloaded;
the viewpoint adaptive unit is specifically configured to:
the head position information comprises the Euler angle of the user orientation, defined as (α, β, γ), wherein α is the yaw angle, β is the pitch angle, and γ is the roll angle;
defining t0 as the current time; defining a prediction interval;
defining P(φ, θ) as a point on the sphere corresponding to the video segment, wherein φ is the latitude and θ is the longitude;
defining U_i as the number of points on the sphere corresponding to the i-th video clip;
defining O_{P(φ,θ)} as the set of orientations from which the point P(φ, θ) can be seen;
calculating, from the Euler angle (α, β, γ) of the user orientation, the predicted value (α̂, β̂, γ̂) of the user orientation at time (t0 + prediction interval) and the correct probability P_E(α, β, γ) of the predicted value of the user orientation;
calculating, according to the correct probability P_E(α, β, γ) of the predicted value of the user orientation at time (t0 + prediction interval), the probability p(P(φ, θ)) that the point P(φ, θ) on the sphere corresponding to the video clip is viewed, by the formula:
p(P(φ, θ)) = (1 / |O_{P(φ,θ)}|) · Σ_{(α,β,γ) ∈ O_{P(φ,θ)}} P_E(α, β, γ);
the probability P_i that the video segment is viewed is calculated as the mean of the viewed probabilities of the points on the sphere corresponding to the video clip:
P_i = (1 / U_i) · Σ_{P(φ,θ) ∈ clip i} p(P(φ, θ)).
6. The panoramic video transmission apparatus according to claim 5, wherein
the predicted value (α̂, β̂, γ̂) of the user orientation is calculated as:
α̂ = α(t0) + m_α · Δt, β̂ = β(t0) + m_β · Δt, γ̂ = γ(t0) + m_γ · Δt,
wherein m_α, m_β, m_γ are linear regression parameters and Δt is the prediction interval;
the correct probability P_E(α, β, γ) of the predicted value of the user orientation is calculated by the formula:
P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ),
wherein P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles respectively, calculated as:
P_yaw(α) = (1 / (√(2π) · σ_α)) · exp(−(α − α̂ − μ_α)² / (2σ_α²)),
and analogously for P_pitch(β) and P_roll(γ),
wherein:
μ_α is the mean of the yaw angle error distribution, and σ_α is the standard deviation of the predicted yaw angle;
μ_β is the mean of the pitch angle error distribution, and σ_β is the standard deviation of the predicted pitch angle;
μ_γ is the mean of the roll angle error distribution, and σ_γ is the standard deviation of the predicted roll angle.
7. The panoramic video transmission apparatus according to claim 5, wherein the code rate adaptation unit is specifically configured to:
define T as the video clip duration, the duration of each video clip being equal;
define b_k as the size of the video playing buffer after the k-th video clip set is downloaded;
define R_k as the upper limit value of the total code rate for downloading the k-th video clip set;
define R_min as the minimum value of the total code rate for downloading the k-th video clip set;
define C_k as the estimated value of the network bandwidth while downloading the k-th video clip set;
define B_target as the target value of the video playing buffer;
the size b_k of the video playing buffer after the k-th video clip set is downloaded is calculated as:
Figure FDA0002527134380000052
the upper limit value R_k of the total code rate for downloading the k-th video clip set is calculated as:
Figure FDA0002527134380000053
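A minimal sketch of the rate adaptation in claim 7, under assumed formulas: the buffer gains T seconds of video per downloaded clip set and drains during the download, and the rate budget R_k scales the bandwidth estimate C_k by how full the buffer is relative to B_target, floored at R_min. The patent's exact formulas for b_k and R_k are in the figure images, so both update rules below are illustrative.

```python
def next_buffer_level(b_prev, T, download_time):
    # b_k: previous level minus playback consumed during the download,
    # plus T seconds of newly buffered video (never below zero).
    return max(0.0, b_prev - download_time) + T

def rate_budget(c_k, b_prev, b_target, r_min):
    # R_k: spend more than the bandwidth estimate when the buffer is above
    # target, less when it is below; never drop under the floor R_min.
    r_k = c_k * (b_prev / b_target)
    return max(r_min, r_k)
```

This buffer-proportional control is a common shape for segment-based streaming: it pushes the buffer toward B_target while keeping the requested total rate tied to the measured bandwidth.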
8. The panoramic video transmission apparatus according to claim 5, wherein the optimized video selection unit is specifically configured to:
define N as the number of video clips;
define M as the number of code rate levels of the video clips;
define r_{i,j} as the code rate of the i-th video clip at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M;
define d_{i,j} as the quality distortion value of the i-th video clip at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M;
define x_{i,j} as whether the i-th video clip is selected at the j-th code rate level, where x_{i,j} = 1 denotes selected and x_{i,j} = 0 denotes not selected; for 1 ≤ i ≤ N and 1 ≤ j ≤ M, all x_{i,j} form the set X, X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M};
define s_i as the area of the sphere region corresponding to the i-th video clip,
Figure FDA0002527134380000065
define Φ(X) as the expected distortion of the video clips;
define Ψ(X) as the spatial quality variation of the video clips;
define R as the upper limit value of the total code rate of the video clips;
the expected distortion Φ(X) of the video clips is calculated as:
Figure FDA0002527134380000061
where P_i is the probability that the video clip is viewed;
the spatial quality variation Ψ(X) of the video clips is calculated as:
Figure FDA0002527134380000062
where P_i is the probability that the video clip is viewed;
the QoE model is:
Figure FDA0002527134380000063
where η is the optimization target weight,
Figure FDA0002527134380000064
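The QoE model of claim 8 is a constrained selection problem: pick one code-rate level j per clip i so that Φ(X) + η·Ψ(X) is minimized under the total-rate budget R. The small solver below is illustrative: Φ is taken as probability-weighted distortion and Ψ as probability-weighted variation of distortion around the mean, which are assumed standard forms since the patent's exact formulas are in the figure images, and brute-force enumeration is used for clarity, not efficiency.

```python
from itertools import product

def qoe_select(r, d, p, R, eta):
    # r[i][j]: code rate, d[i][j]: distortion, p[i]: viewing probability.
    n, m = len(r), len(r[0])
    best, best_choice = float("inf"), None
    for choice in product(range(m), repeat=n):  # one level j per clip i
        total_rate = sum(r[i][choice[i]] for i in range(n))
        if total_rate > R:
            continue  # violates the total code rate budget R
        # Phi: expected (viewing-probability weighted) distortion.
        phi = sum(p[i] * d[i][choice[i]] for i in range(n))
        # Psi: probability-weighted spread of distortion across clips.
        mean_d = sum(d[i][choice[i]] for i in range(n)) / n
        psi = sum(p[i] * (d[i][choice[i]] - mean_d) ** 2 for i in range(n))
        score = phi + eta * psi
        if score < best:
            best, best_choice = score, choice
    return best_choice

# Two clips, two levels; clip 0 is far more likely to be viewed, and the
# budget R = 4 only allows one clip at the high rate.
rates = [[1.0, 3.0], [1.0, 3.0]]
dist = [[0.8, 0.2], [0.8, 0.2]]
probs = [0.9, 0.1]
choice = qoe_select(rates, dist, probs, R=4.0, eta=0.0)
```

With the budget only covering one high-rate clip, the solver spends it on the clip the viewer is likely to see, which is exactly the behavior the probability weighting in Φ(X) is meant to produce.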
9. A panoramic video transmission system, comprising: the panoramic video transmission apparatus of any one of claims 5 to 8, for a client, and
a panoramic video transmission apparatus for a server side, comprising:
a processing unit, configured to block, encode and slice the panoramic video according to preset configuration information to obtain video clips and media description files;
a storage unit, configured to store the video clips and the media description files on the server.
10. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor is configured to perform the steps of the panoramic video transmission method of any one of claims 1 to 4.
11. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the panoramic video transmission method of any one of claims 1 to 4.
CN201710590143.6A 2017-07-19 2017-07-19 Panoramic video transmission method, transmission device and transmission system Expired - Fee Related CN109286855B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710590143.6A CN109286855B (en) 2017-07-19 2017-07-19 Panoramic video transmission method, transmission device and transmission system


Publications (2)

Publication Number Publication Date
CN109286855A CN109286855A (en) 2019-01-29
CN109286855B true CN109286855B (en) 2020-10-13

Family

ID=65184903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710590143.6A Expired - Fee Related CN109286855B (en) 2017-07-19 2017-07-19 Panoramic video transmission method, transmission device and transmission system

Country Status (1)

Country Link
CN (1) CN109286855B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109714631A (en) * 2019-02-26 2019-05-03 华南理工大学 One kind being based on HTTP video flowing dynamic self-adaptation bit-rate selection method
CN110099294B (en) * 2019-06-11 2021-05-07 山东大学 Dynamic self-adaptive streaming media code rate allocation method for keeping space-time consistency of 360-degree video
CN110602506B (en) * 2019-09-25 2023-04-28 咪咕视讯科技有限公司 Video processing method, network device and computer readable storage medium
CN112714315B (en) * 2019-10-24 2023-02-28 上海交通大学 Layered buffering method and system based on panoramic video
CN112738646B (en) * 2019-10-28 2023-06-23 阿里巴巴集团控股有限公司 Data processing method, device, system, readable storage medium and server
CN113453076B (en) * 2020-03-24 2023-07-14 中国移动通信集团河北有限公司 User video service quality evaluation method, device, computing equipment and storage medium
CN112055263B (en) * 2020-09-08 2021-08-13 西安交通大学 360-degree video streaming transmission system based on significance detection
CN112584119B (en) * 2020-11-24 2022-07-22 鹏城实验室 Self-adaptive panoramic video transmission method and system based on reinforcement learning
CN112565208A (en) * 2020-11-24 2021-03-26 鹏城实验室 Multi-user panoramic video cooperative transmission method, system and storage medium
CN112822564B (en) * 2021-01-06 2023-03-24 鹏城实验室 Viewpoint-based panoramic video adaptive streaming media transmission method and system
CN112929691B (en) * 2021-01-29 2022-06-14 复旦大学 Multi-user panoramic video transmission method
CN114640870B (en) * 2022-03-21 2023-10-03 陕西师范大学 QoE-driven wireless VR video self-adaptive transmission optimization method and system
WO2023197811A1 (en) * 2022-04-12 2023-10-19 北京字节跳动网络技术有限公司 Video downloading method and apparatus, video transmission method and apparatus, terminal device, server and medium
CN114979799A (en) * 2022-05-20 2022-08-30 北京字节跳动网络技术有限公司 Panoramic video processing method, device, equipment and storage medium
CN115278354A (en) * 2022-06-14 2022-11-01 北京大学 Method and system for evaluating and optimizing video transmission quality based on user behavior index
CN117596376B (en) * 2024-01-18 2024-04-19 深圳大学 360-Degree video intelligent edge transmission method, system, wearable device and medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10204658B2 (en) * 2014-07-14 2019-02-12 Sony Interactive Entertainment Inc. System and method for use in playing back panorama video content
GB2536025B (en) * 2015-03-05 2021-03-03 Nokia Technologies Oy Video streaming method
CN105323552B (en) * 2015-10-26 2019-03-12 北京时代拓灵科技有限公司 A kind of panoramic video playback method and system
JP6058184B1 (en) * 2016-03-10 2017-01-11 株式会社コロプラ Method and program for controlling head mounted display system
CN105915937B (en) * 2016-05-10 2019-12-13 上海乐相科技有限公司 Panoramic video playing method and device
CN106060515B (en) * 2016-07-14 2018-11-06 腾讯科技(深圳)有限公司 Panorama pushing method for media files and device

Also Published As

Publication number Publication date
CN109286855A (en) 2019-01-29

Similar Documents

Publication Publication Date Title
CN109286855B (en) Panoramic video transmission method, transmission device and transmission system
Xie et al. 360ProbDASH: Improving QoE of 360 video streaming using tile-based HTTP adaptive streaming
US20230283653A1 (en) Methods and apparatus to reduce latency for 360-degree viewport adaptive streaming
Xiao et al. Optile: Toward optimal tiling in 360-degree video streaming
CN108833880B (en) Method and device for predicting viewpoint and realizing optimal transmission of virtual reality video by using cross-user behavior mode
Zhou et al. Clustile: Toward minimizing bandwidth in 360-degree video streaming
US10979663B2 (en) Methods and apparatuses for image processing to optimize image resolution and for optimizing video streaming bandwidth for VR videos
US10440407B2 (en) Adaptive control for immersive experience delivery
CN110248212B (en) Multi-user 360-degree video stream server-side code rate self-adaptive transmission method and system
US11095936B2 (en) Streaming media transmission method and client applied to virtual reality technology
EP1994759B1 (en) A method and device for adapting a temporal frequency of a sequence of video images
Jiang et al. Plato: Learning-based adaptive streaming of 360-degree videos
EP3406310A1 (en) Method and apparatuses for handling visual virtual reality content
CN112584119B (en) Self-adaptive panoramic video transmission method and system based on reinforcement learning
US11539985B2 (en) No reference realtime video quality assessment
US8311128B2 (en) Method of processing a coded data stream
US20200404241A1 (en) Processing system for streaming volumetric video to a client device
CN115037962A (en) Video adaptive transmission method, device, terminal equipment and storage medium
KR102129115B1 (en) Method and apparatus for transmitting adaptive video in real time using content-aware neural network
CN111988661A (en) Incorporating visual objects into video material
WO2018120857A1 (en) Streaming media technology-based method and apparatus for processing video data
Begen Quality-aware HTTP adaptive streaming
US11917327B2 (en) Dynamic resolution switching in live streams based on video quality assessment
Koch et al. Increasing the Quality of 360 {\deg} Video Streaming by Transitioning between Viewport Quality Adaptation Mechanisms
US11490094B2 (en) 360-degree video streaming method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230414

Address after: 100871 No. 5, the Summer Palace Road, Beijing, Haidian District

Patentee after: Peking University

Address before: 100871 No. 5, the Summer Palace Road, Beijing, Haidian District

Patentee before: Peking University

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201013