CN109286855B - Panoramic video transmission method, transmission device and transmission system - Google Patents

Panoramic video transmission method, transmission device and transmission system

Info

Publication number
CN109286855B
CN109286855B CN201710590143.6A CN201710590143A
Authority
CN
China
Prior art keywords
video
video clip
definition
code rate
value
Prior art date
Legal status
Expired - Fee Related
Application number
CN201710590143.6A
Other languages
Chinese (zh)
Other versions
CN109286855A (en)
Inventor
谢澜
张行功
郭宗明
Current Assignee
Peking University
Original Assignee
Peking University
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date
Filing date
Publication date
Application filed by Peking University, Peking University Founder Group Co Ltd, Beijing Founder Electronics Co Ltd filed Critical Peking University
Priority to CN201710590143.6A priority Critical patent/CN109286855B/en
Publication of CN109286855A publication Critical patent/CN109286855A/en
Application granted granted Critical
Publication of CN109286855B publication Critical patent/CN109286855B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/8456 Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/26216 Content or additional data distribution scheduling performed under constraints involving the channel capacity, e.g. network bandwidth
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H04N21/6587 Control parameters, e.g. trick play commands, viewpoint selection
    • H04N21/8586 Linking data to content, e.g. by linking an URL to a video object, by using a URL

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention provides a panoramic video transmission method, transmission device and transmission system. The transmission method, applied at the client, comprises the following steps: acquiring and parsing a media description file from the server, and downloading video clips according to the media description file; during downloading, acquiring the user's head position information and predicting the probability of each video clip being viewed from that information; calculating the size of the video playback buffer, and calculating the upper limit of the total code rate of the video clips from the network bandwidth estimate and the buffer size; and calculating the spatial quality variation and the expected distortion of the video clips, establishing a QoE model from these two quantities, and selecting the video clips to be downloaded. The technical scheme of the invention reduces the data volume of panoramic video transmission, improves video quality, and reduces spatial quality fluctuation and video playback stalling.

Description

Panoramic video transmission method, transmission device and transmission system
Technical Field
The present invention relates to the field of multimedia technologies, and in particular, to a panoramic video transmission method, a panoramic video transmission apparatus, a panoramic video transmission system, a computer device, and a computer-readable storage medium.
Background
With the development of multimedia technology, virtual reality technology (VR) has received a great deal of attention from the industry and academia. Among them, panoramic videos, such as 360-degree videos and omni-directional videos, are widely used.
The data volume of virtual reality video is usually very large, which makes compression and network transmission of panoramic video especially challenging. For example, after compression coding, the code rate of a 4K×2K panoramic video mapped in ERP mode can reach 15 Mbps to 20 Mbps. The high resolution and high code rate of panoramic video hinder the development of its internet applications. In addition, when a user watches a panoramic video, only the video content inside the window is actually seen, and the content in other areas is not seen by the user. Therefore, transmitting the entire content of the panoramic video (both in-window and out-of-window content) to the client wastes bandwidth.
HTTP dynamic streaming technology can realize window-based adaptive transmission: content inside the user's window is transmitted at high quality, and content outside the window at low quality, thereby reducing the total amount of transmitted data. The block-based transmission method spatially divides the video into blocks, so the client can selectively download video content, for example downloading a high-quality version for blocks inside the window, and a low-quality version, or nothing at all, for blocks outside the window.
However, providing high-quality block-based transmission still presents challenges: (1) video block acquisition errors. Because the client must predict the user's viewing orientation to acquire future video content in advance, prediction errors can cause some video blocks not to be downloaded; if that content is needed during actual playback, its absence degrades the user experience. (2) Video playback stalling. Prediction of the user's viewing orientation is only valid over a short horizon, that is, the accuracy of the predicted orientation drops sharply when the prediction interval is too long, so the video playback buffer must be kept very small (for example, 3 seconds), and with a small buffer the playback stalls easily. (3) Boundary effects caused by mixed code rates. Because block-based transmission spatially divides the video, blocks selected at different code rates may show obvious boundaries or inconsistent quality when rendered.
Therefore, how to provide a block-based panoramic video transmission method that improves video quality and reduces spatial quality fluctuation and playback stalling has become a pressing technical problem.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art or the related art.
To this end, a first aspect of the present invention is to provide a method for transmitting a panoramic video, which is used for a server.
The second aspect of the present invention is to provide a method for transmitting a panoramic video, which is used for a client.
A third aspect of the present invention is to provide a panoramic video transmission apparatus for a server.
The fourth aspect of the present invention is to provide a device for transmitting panoramic video, which is used for a client.
A fifth aspect of the present invention is to provide a transmission system of panoramic video.
A sixth aspect of the invention is directed to a computer device.
A seventh aspect of the present invention is directed to a computer-readable storage medium.
In view of this, a first aspect of the present invention provides a panoramic video transmission method which blocks, encodes and slices the panoramic video according to preset configuration information to obtain video clips and a media description file, and stores the video clips and the media description file in a server.
According to this panoramic video transmission method, at the server end, parameters such as the number of spatial blocks, the width and height of the blocks, the encoding parameters, and the duration of the video segments can be predefined as a configuration file. The panoramic video is then blocked, encoded and sliced according to this configuration to obtain the video clips and the media description file, which are stored in an HTTP server for later use. When a client downloads the panoramic video, it performs viewpoint adaptation and code rate adaptation on the video segments and the media description file, and solves for the video segments to be downloaded through an optimization method, thereby improving video quality and reducing spatial quality fluctuation and playback stalling.
In the above technical solution, preferably, the format of the panoramic video includes an ERP format and a CMP format; the configuration information includes: the number, width, height, playing duration, encoding parameters, and code rate level of the video segments.
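As a concrete illustration, the configuration information listed above could be written as a small file read by the server-side slicer. All field names below are illustrative assumptions; the patent only names the kinds of parameters (block count, width, height, playing duration, encoding parameters, code rate levels):

```python
import json

# Hypothetical configuration for blocking/encoding/slicing a panoramic video.
# Every key is an assumed name, not part of the invention.
config = {
    "format": "ERP",            # source mapping: ERP or CMP
    "tiles_horizontal": 6,      # number of spatial blocks per row
    "tiles_vertical": 4,        # number of spatial blocks per column
    "tile_width": 640,          # block width in pixels
    "tile_height": 480,         # block height in pixels
    "segment_duration_s": 1.0,  # playing duration of each video segment
    "encoder": {"codec": "hevc", "gop": 30},
    "bitrate_levels_kbps": [500, 1500, 4000],  # code rate levels per tile
}

print(json.dumps(config, indent=2))
```

The server would read such a file once, then block, encode and slice the source video accordingly.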
In this technical scheme, the generation of panoramic video content is explained first. In 2016, MPEG proposed a draft standard for an omnidirectional media application format. When a panoramic video is produced, multiple cameras are usually used to record the visual scene of the real world, and the video frames (images) output by the cameras at the same moment must be stitched, projected and mapped, and then packed into a two-dimensional plane data frame for video coding. Stitching restores the real-world visual field from the images captured by the multiple cameras at the same moment through techniques such as feature point matching and fusion; the stitched images are projected onto a three-dimensional projection structure, such as a sphere or a cube. Since the projection structure is three-dimensional but widely used encoders encode two-dimensional plane video, the image on the projection structure must be further mapped to a two-dimensional plane, and video compression coding is performed after the two-dimensional mapped data frame is obtained. Currently common mapping methods include equirectangular projection (ERP), cubemap projection (CMP), and the like.
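For concreteness, the ERP mapping takes a point on the sphere, given by latitude and longitude, to a pixel of the two-dimensional plane frame. A minimal sketch follows; the frame size and the angle conventions (longitude spanning the width, north pole at the top row) are assumptions:

```python
def erp_pixel(lat_deg: float, lon_deg: float, width: int, height: int) -> tuple:
    """Map a spherical point (latitude, longitude in degrees) to an ERP
    pixel coordinate. Longitude [-180, 180) maps to [0, width);
    latitude [90, -90] maps to [0, height) (top row = north pole)."""
    u = (lon_deg + 180.0) / 360.0 * width
    v = (90.0 - lat_deg) / 180.0 * height
    return (u, v)

# The point at latitude 0, longitude 0 maps to the center of a 4K x 2K frame.
print(erp_pixel(0.0, 0.0, 3840, 1920))  # (1920.0, 960.0)
```

Because ERP stretches area near the poles, tiles covering the same pixel area correspond to different sphere areas, which is why the later QoE model weights tiles by sphere area.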
In any of the above technical solutions, preferably, the media description file includes spatial position information, encoding information, a bitrate, a quality distortion value, and a Uniform Resource Locator (URL) of the video clip.
In this embodiment, the media description file includes, but is not limited to, spatial location information, encoding information, bitrate, quality distortion value, and Uniform Resource Locator (URL) of the video clip.
In a second aspect of the present invention, a panoramic video transmission method is provided for a client, used in cooperation with the server-side transmission method of any of the above technical solutions. The transmission method comprises: acquiring and parsing a media description file from the server, and downloading video clips according to the media description file; during downloading, acquiring the user's head position information and predicting the probability of each video clip being viewed from that information; calculating the size of the video playback buffer, and calculating the upper limit of the total code rate of the video clips from the network bandwidth estimate and the buffer size; and calculating the spatial quality variation and the expected distortion of the video clips, establishing a QoE model from these two quantities, and selecting the video clips to be downloaded.
According to this panoramic video transmission method, when a client in the network downloads the panoramic video, it obtains the uniform resource locators of the panoramic video from the media description file and downloads accordingly. During downloading it performs viewpoint adaptation and code rate adaptation, and establishes an optimization model that maximizes QoE to determine the segments to download. Specifically, in viewpoint adaptation, the probability that a future block is viewed is predicted from the user's historical head position information; the total video clip code rate that can be transmitted is calculated from the network bandwidth and the size of the playback buffer; finally, the video clips to be downloaded are solved by an optimization method that considers the overall quality of the video, the spatial quality fluctuation caused by spatial blocking, and the expected distortion of the video clips under their viewing probabilities. By establishing this optimization model the client can selectively obtain video clips, which reduces the data volume of panoramic video transmission, improves video quality, reduces spatial quality fluctuation and playback stalling, and improves the experience of watching panoramic video in bandwidth-limited network environments.
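The per-stage client flow described above can be sketched as a loop over adaptation stages. Everything below is schematic: the helper functions are stubs standing in for the real prediction, bandwidth estimation, and optimization steps, and all names and numbers are assumptions:

```python
# Schematic of one client adaptation stage. Stubs replace the real steps.
def predict_view_probabilities(head_history):
    # Stub: the real system fits a linear regression to the head-position
    # history and scores predictions with a Gaussian error model.
    return [0.8, 0.2]

def rate_budget(bandwidth_bps, buffer_s, target_buffer_s, seg_dur_s, r_min):
    # Buffer-based code rate budget (see the rate adaptation section).
    return max(r_min,
               bandwidth_bps * (buffer_s + seg_dur_s - target_buffer_s) / seg_dur_s)

def select_segments(probs, budget):
    # Stub: the real system solves the QoE optimization; here, likely-viewed
    # tiles simply get the high-quality version.
    return [(f"tile{i}", "high" if p > 0.5 else "low") for i, p in enumerate(probs)]

def adaptation_stage(head_history, bandwidth_bps, buffer_s):
    probs = predict_view_probabilities(head_history)
    budget = rate_budget(bandwidth_bps, buffer_s,
                         target_buffer_s=3.0, seg_dur_s=1.0, r_min=500_000)
    return select_segments(probs, budget)

print(adaptation_stage(head_history=[], bandwidth_bps=8_000_000, buffer_s=2.5))
```

After the selected segments are downloaded, the client advances to the next stage and repeats the loop until playback ends.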
In the above technical solution, preferably, the step of obtaining the head position information of the user and predicting the probability of the video segment being viewed according to the head position information specifically includes: defining $(\alpha, \beta, \gamma)$ as the Euler angles of the user orientation, where $\alpha$ is the yaw angle, $\beta$ is the pitch angle, and $\gamma$ is the roll angle; defining $t_0$ as the current time and $\delta$ as the prediction interval; defining $(\varphi, \theta)$ as a point on the sphere corresponding to the video segment, where $\varphi$ is the latitude and $\theta$ is the longitude; defining $U_i$ as the set of points on the sphere corresponding to the $i$-th video clip; defining $A(\varphi, \theta)$ as the set of user orientations from which the point $(\varphi, \theta)$ can be seen; calculating, from the Euler angles $(\alpha, \beta, \gamma)$ of the user orientation, the predicted user orientation $(\hat\alpha, \hat\beta, \hat\gamma)$ at time $t_0 + \delta$ and the correct probability $P_E(\alpha, \beta, \gamma)$ of the predicted orientation; and calculating, according to $P_E(\alpha, \beta, \gamma)$, the probability $p(\varphi, \theta)$ that the point $(\varphi, \theta)$ on the sphere corresponding to the video clip is viewed. The calculation formula is:

$$p(\varphi, \theta) = \frac{1}{|A(\varphi, \theta)|} \sum_{(\alpha, \beta, \gamma) \in A(\varphi, \theta)} P_E(\alpha, \beta, \gamma)$$

The probability $P_i$ of the video segment being viewed is the mean of the viewing probabilities of the points on the sphere corresponding to the video clip:

$$P_i = \frac{1}{|U_i|} \sum_{(\varphi, \theta) \in U_i} p(\varphi, \theta)$$
in the technical scheme, specific steps that the client needs to perform the viewpoint adaptation in each adaptation stage are limited. Specifically, to calculate the probability that a video block is seen, the probability that a point on a spherical surface is seen is calculated, and for one point on the spherical surface
Figure GDA0002470094180000051
Since this point is likely to be seen through several orientations of the user, the probability that this point is seen
Figure GDA0002470094180000052
Set of orientations calculated to see this point
Figure GDA0002470094180000053
The corresponding probability mean, namely:
Figure GDA0002470094180000054
probability P that a video block is viewediCalculate the spherical point U as this video blockiThe mean of the probabilities that the corresponding points on the sphere are viewed, i.e.:
Figure GDA0002470094180000055
wherein, PE(α, γ) is at (t)0Time + correct probability of predicted value of user orientation.
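The two averaging steps described above can be sketched directly: average the orientation correctness probabilities over the orientations that can see a point, then average over a tile's sphere points. The inputs here are toy values:

```python
def point_view_probability(orientation_probs):
    """p(phi, theta): mean of P_E over the set A(phi, theta) of
    orientations from which the point can be seen."""
    return sum(orientation_probs) / len(orientation_probs)

def tile_view_probability(points_orientation_probs):
    """P_i: mean of p(phi, theta) over the tile's sphere points U_i."""
    point_probs = [point_view_probability(a) for a in points_orientation_probs]
    return sum(point_probs) / len(point_probs)

# Toy tile with two sphere points; each inner list holds P_E for the
# orientations that can see that point.
tile = [[0.9, 0.7], [0.5, 0.3]]
print(tile_view_probability(tile))  # (0.8 + 0.4) / 2, i.e. approximately 0.6
```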
In any of the above technical solutions, preferably, the predicted user orientation $(\hat\alpha, \hat\beta, \hat\gamma)$ at time $t_0 + \delta$ is calculated as:

$$\hat\alpha = \alpha(t_0) + m_\alpha \delta, \qquad \hat\beta = \beta(t_0) + m_\beta \delta, \qquad \hat\gamma = \gamma(t_0) + m_\gamma \delta$$

where $m_\alpha$, $m_\beta$, $m_\gamma$ are linear regression parameters. The correct probability $P_E(\alpha, \beta, \gamma)$ of the predicted orientation is calculated as $P_E(\alpha, \beta, \gamma) = P_{yaw}(\alpha)\, P_{pitch}(\beta)\, P_{roll}(\gamma)$, where $P_{yaw}(\alpha)$, $P_{pitch}(\beta)$, $P_{roll}(\gamma)$ are the correct probabilities of the predicted yaw, pitch and roll angles respectively, each computed from the Gaussian distribution of its prediction error; for the yaw angle:

$$P_{yaw}(\alpha) = \frac{1}{\sqrt{2\pi}\,\sigma_\alpha} \exp\!\left(-\frac{(\alpha - \hat\alpha - \mu_\alpha)^2}{2\sigma_\alpha^2}\right)$$

where $\mu_\alpha$ and $\sigma_\alpha$ are the mean and standard deviation of the yaw-angle prediction error, $\mu_\beta$ and $\sigma_\beta$ are those of the pitch-angle prediction error, and $\mu_\gamma$ and $\sigma_\gamma$ are those of the roll-angle prediction error; $P_{pitch}(\beta)$ and $P_{roll}(\gamma)$ are computed analogously.
In this embodiment, first, linear regression is used to calculate the predicted user orientation $(\hat\alpha, \hat\beta, \hat\gamma)$ at time $t_0 + \delta$, where $m_\alpha$, $m_\beta$, $m_\gamma$ are the parameters of the linear regression; each of the three parameters can be solved by the least squares method from the historical yaw, pitch and roll data in the window $[t_0 - 1, t_0]$. Second, data statistics show that the predicted user orientation at time $t_0 + \delta$ differs from the true value, and the differences follow Gaussian distributions, i.e. $e_\alpha \sim N(\mu_\alpha, \sigma_\alpha)$, $e_\beta \sim N(\mu_\beta, \sigma_\beta)$, $e_\gamma \sim N(\mu_\gamma, \sigma_\gamma)$. From this, the per-angle prediction correct probabilities $P_{yaw}(\alpha)$, $P_{pitch}(\beta)$, $P_{roll}(\gamma)$ can be calculated. Since the yaw, pitch and roll angles are independent, the Euler angle prediction correct probability is $P_E(\alpha, \beta, \gamma) = P_{yaw}(\alpha)\, P_{pitch}(\beta)\, P_{roll}(\gamma)$.
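The prediction step can be sketched as follows: a least-squares line fitted over the recent yaw samples is extrapolated by the prediction interval, and the prediction is scored with a Gaussian error model. The sample window, the error statistics, and the use of the Gaussian density as the correctness score are assumptions made for illustration:

```python
import math

def linear_predict(times, values, t_pred):
    """Least-squares fit value = m*t + c over the history window,
    extrapolated to t_pred (e.g. t0 + delta)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    m = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values)) / \
        sum((t - mean_t) ** 2 for t in times)
    c = mean_v - m * mean_t
    return m * t_pred + c

def gaussian_correct_prob(angle, predicted, mu, sigma):
    """Score the prediction with the Gaussian density of its error
    e = angle - predicted, assumed ~ N(mu, sigma)."""
    z = (angle - predicted - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Yaw samples over the last second (one every 0.25 s), head turning 10 deg/s.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
yaws = [0.0, 2.5, 5.0, 7.5, 10.0]
print(linear_predict(times, yaws, 1.5))  # 15.0: straight-line extrapolation
```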
In any of the above technical solutions, preferably, the step of calculating the size of the video playback buffer and calculating the upper limit of the total code rate of the video clips according to the network bandwidth estimate and the buffer size specifically includes: defining $T$ as the duration of the video clips, where every video clip has equal duration; defining $b_k$ as the size of the video playback buffer after the $k$-th video clip set is downloaded; defining $R_k$ as the upper limit of the total code rate for downloading the $k$-th video clip set; defining $R_{min}$ as the minimum value of the total code rate for downloading the $k$-th video clip set; defining $C_k$ as the estimated network bandwidth while downloading the $k$-th video clip set; and defining $B_{target}$ as the target size of the video playback buffer. The size $b_k$ of the video playback buffer after the $k$-th video clip set is downloaded is calculated as:

$$b_k = b_{k-1} + T - \frac{T \cdot R_k}{C_k}$$

The upper limit $R_k$ of the total code rate for downloading the $k$-th video clip set is calculated as:

$$R_k = \max\!\left(R_{min},\ \frac{C_k \,(b_{k-1} + T - B_{target})}{T}\right)$$
in the technical scheme, the target-based cache is limitedAnd determining the total available code rate of the video clip by using a code rate control algorithm of the buffer area, thereby avoiding pause of playing under the condition of a small buffer area. Specifically, the change of the size of the video buffer is determined by the network bandwidth and the total code rate of the downloaded video clip set, when the video clip set is downloaded, the size of the video buffer is increased, and the size of the video buffer is decreased along with the playing of the video. Therefore, the video buffer size after downloading the kth video segment set can be calculated as:
Figure GDA0002470094180000063
to avoid video playing jamming, we control the video buffer size to a target value, i.e. let bk=BtargetThen, the total bitrate of the kth video segment set can be found as:
Figure GDA0002470094180000064
to avoid negative results, RkSetting a minimum value RminThen R iskCan be calculated as:
Figure GDA0002470094180000065
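A minimal sketch of this buffer-based rate budget; all the numbers are toy values:

```python
def rate_budget(c_k, b_prev, seg_dur, b_target, r_min):
    """Upper limit R_k on the total code rate of the k-th clip set:
    drive the post-download buffer toward B_target, floored at R_min."""
    return max(r_min, c_k * (b_prev + seg_dur - b_target) / seg_dur)

def buffer_after_download(b_prev, seg_dur, r_k, c_k):
    """b_k: downloading takes T*R_k/C_k seconds of playback time, then the
    finished clip set adds T seconds of content."""
    return b_prev + seg_dur - seg_dur * r_k / c_k

c_k, b_prev, T, b_target, r_min = 10.0, 2.0, 1.0, 3.0, 1.0  # Mbps / seconds
r_k = rate_budget(c_k, b_prev, T, b_target, r_min)
print(r_k)                                         # floored at R_min = 1.0
print(buffer_after_download(b_prev, T, r_k, c_k))  # 2.9: buffer refills toward target
```

Note how a buffer below target forces a low code rate, letting the buffer refill before quality is raised again.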
in any of the above technical solutions, preferably, N is defined as the number of video segments; defining M as the code rate grade of the video clip; definition of ri,jThe code rate of the ith video clip at the jth code rate level is shown, wherein i is more than or equal to 1 and less than or equal to N, and j is more than or equal to 1 and less than or equal to M; definition of di,jThe quality distortion value of the ith video clip at the jth code rate level is shown, wherein i is more than or equal to 1 and less than or equal to N, and j is more than or equal to 1 and less than or equal to M; definition of xi,jWhether the ith video clip is selected at the jth code rate level or not, x i,j1 represents the selection, x i,j0 represents no choice, where 1 ≦ i ≦ N,1 ≦ j ≦ M, any i and any j form the set X, X ═ Xi,jI is more than or equal to 1 and less than or equal to N, and j is more than or equal to 1 and less than or equal to M }; definition siIs the area of the sphere corresponding to the ith video segment,
Figure GDA0002470094180000071
defining phi (X) as the period of a video segmentDistortion is observed; defining Ψ (X) as a spatial quality variation of the video segment; defining R as the upper limit value of the total code rate of the video clip; the expected distortion Φ (X) of a video segment is calculated as:
Figure GDA0002470094180000072
Piis the probability of a video segment being viewed;
the spatial quality variation Ψ (X) of a video segment is calculated by the following formula:
Figure GDA0002470094180000073
Piis the probability of a video segment being viewed;
the QoE model is:
Figure GDA0002470094180000074
wherein, η for the optimization of the target weights,
Figure GDA0002470094180000075
in the technical scheme, when a video clip is selected, two QoE factors need to be considered, wherein one factor is expected distortion and represents an expected value of the distortion under the condition of considering the watching probability of the video clip, and the other factor is spatial quality change and represents quality smoothness of the video clipkThen, the optimization problem can be defined as:
Figure GDA0002470094180000076
wherein the content of the first and second substances,
Figure GDA0002470094180000077
in each self-adaptive stage, the client solves the objective function and the conditional function to obtain a video segment required to be obtained, then sends a request to the server for downloading, and enters the next self-adaptive stage after downloading is completed until the client finishes watching the video.
In a third aspect of the present invention, a transmission apparatus for panoramic video is provided, where the transmission apparatus is used for a server, and includes: the processing unit is used for carrying out blocking, coding and slicing processing on the panoramic video according to preset configuration information to obtain video clips and media description files; a storage unit, configured to store the video clip and the media description file in the server.
According to this panoramic video transmission device, at the server end, parameters such as the number of spatial blocks, the width and height of the blocks, the encoding parameters, and the duration of the video segments can be predefined as a configuration file. The panoramic video is then blocked, encoded and sliced according to this configuration to obtain the video clips and the media description file, which are stored in an HTTP server for later use. When a client downloads the panoramic video, it performs viewpoint adaptation and code rate adaptation on the video segments and the media description file, and solves for the video segments to be downloaded through an optimization method, thereby improving video quality and reducing spatial quality fluctuation and playback stalling.
In the above technical solution, preferably, the format of the panoramic video includes an ERP format and a CMP format; the configuration information includes: the number, width, height, playing duration, encoding parameters, and code rate level of the video segments.
First, the generation of panoramic video content is explained. In June 2016, MPEG (Moving Picture Experts Group) proposed a draft standard for an omnidirectional media application format. When a panoramic video is produced, multiple cameras are usually used to record the visual scene of the real world, and the video frames (images) output by the cameras at the same moment must be stitched, projected and mapped, and then packed into a two-dimensional plane data frame for video coding. Stitching restores the real-world visual field from the images captured by the multiple cameras at the same moment through techniques such as feature point matching and fusion; the stitched images are projected onto a three-dimensional projection structure, such as a sphere or a cube. Since the projection structure is three-dimensional but widely used encoders encode two-dimensional plane video, the image on the projection structure must be further mapped to a two-dimensional plane, and video compression coding is performed after the two-dimensional mapped data frame is obtained. Currently common mapping methods include equirectangular projection (ERP), cubemap projection (CMP), and the like.
In any of the above technical solutions, preferably, the media description file includes spatial position information, encoding information, a code rate, a quality distortion value, and a Uniform Resource Locator (URL) of the video segment.
In this embodiment, the media description file includes, but is not limited to, spatial location information, encoding information, bitrate, quality distortion value, and Uniform Resource Locator (URL) of the video clip.
In a fourth aspect of the present invention, a panoramic video transmission apparatus is provided for a client, used in cooperation with the panoramic video transmission apparatus for a server in any one of the above technical solutions. The apparatus includes: a downloading unit, configured to acquire and parse the media description file from the server and download video segments according to the media description file; a viewpoint adaptation unit, configured to acquire head position information of the user during downloading and predict the probability of each video segment being watched according to the head position information; a code rate adaptation unit, configured to calculate the size of the video playing buffer and calculate the upper limit of the total code rate of the video segments according to the network bandwidth estimate and the buffer size; and an optimized video selection unit, configured to calculate the spatial quality variation and the expected distortion of the video segments, establish a QoE model from them, and select the video segments to be downloaded.
According to the panoramic video transmission apparatus of the present invention, when a client user in a network downloads the panoramic video, the client acquires the uniform resource locator of the panoramic video from the media description file and downloads accordingly. During downloading, viewpoint adaptation, code rate adaptation, and the establishment of an optimization model that maximizes QoE determine which segments are downloaded. Specifically, in viewpoint adaptation, the probability that a future block is observed is predicted from the user's historical head position information; the total video segment code rate that can be transmitted is calculated from the network bandwidth and the size of the playing buffer; finally, the video segments to be downloaded are solved by an optimization method that considers the overall video quality, the spatial quality fluctuation caused by spatial blocking, and the expected distortion of each video segment under its viewing probability. By establishing this optimization model the client obtains video segments selectively, which reduces the data volume of panoramic video transmission, improves video quality, reduces spatial quality fluctuation and playback stalling, and improves the user experience of watching panoramic video in a bandwidth-limited network environment.
In the above technical solution, preferably, the viewpoint adaptation unit is specifically configured to: define (α, β, γ) as the Euler angles of the user orientation, where α is the yaw angle, β is the pitch angle, and γ is the roll angle, the head position information including these Euler angles; define t0 as the current time and t as the prediction interval; define (φ, θ) as a point on the sphere corresponding to a video segment, where φ is the latitude and θ is the longitude; define U_i as the number of points on the sphere corresponding to the i-th video segment; define A(φ, θ) as the set of user orientations from which the point (φ, θ) can be seen; calculate, from the Euler angles (α, β, γ) of the user orientation, the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t) and the correct probability P_E(α, β, γ) of that prediction; and calculate, from P_E(α, β, γ), the probability p(φ, θ) that the point (φ, θ) on the sphere is viewed, using the formula:

p(φ, θ) = (1 / |A(φ, θ)|) · Σ_{(α,β,γ) ∈ A(φ,θ)} P_E(α, β, γ).

The probability P_i of the i-th video segment being viewed is the mean of the viewing probabilities of the points on the sphere corresponding to that segment:

P_i = (1 / U_i) · Σ_{(φ,θ) ∈ segment i} p(φ, θ).
This technical scheme defines the specific steps the client performs for viewpoint adaptation in each adaptation stage. Specifically, to calculate the probability that a video block is seen, the probability that each point on the sphere is seen is calculated first. For a point (φ, θ) on the sphere, since the point may be visible from several user orientations, the probability p(φ, θ) that it is seen is calculated as the mean of the prediction-correctness probabilities over the set A(φ, θ) of orientations from which the point is visible, namely:

p(φ, θ) = (1 / |A(φ, θ)|) · Σ_{(α,β,γ) ∈ A(φ,θ)} P_E(α, β, γ).

The probability P_i that a video block is viewed is then calculated as the mean of the viewing probabilities of the U_i sphere points corresponding to that block, i.e.:

P_i = (1 / U_i) · Σ_{(φ,θ) ∈ block i} p(φ, θ),

where P_E(α, β, γ) is the correct probability of the user orientation predicted for time (t0 + t).
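The two averaging steps above can be sketched as follows (an illustrative reading of the formulas, not the patent's reference implementation; the data layout — one list of visible orientations per sphere point — and the function names are assumptions):

```python
def point_view_probability(orientations, p_e):
    """p(phi, theta): mean of P_E over the set A(phi, theta) of
    orientations from which the sphere point is visible."""
    return sum(p_e(a, b, g) for (a, b, g) in orientations) / len(orientations)

def tile_view_probability(visible_orients_per_point, p_e):
    """P_i: mean of the per-point viewing probabilities over the
    U_i sphere points belonging to tile (video segment) i."""
    return sum(point_view_probability(orients, p_e)
               for orients in visible_orients_per_point) / len(visible_orients_per_point)
```

With a constant P_E the tile probability reduces to that constant, which is a quick sanity check of the double averaging.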
In any of the above technical solutions, preferably, the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t) is calculated as:

α̂(t0 + t) = α(t0) + m_α · t,  β̂(t0 + t) = β(t0) + m_β · t,  γ̂(t0 + t) = γ(t0) + m_γ · t,

where m_α, m_β, m_γ are linear regression parameters. The correct probability P_E(α, β, γ) of the predicted user orientation is calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ), where P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles, respectively, each obtained from the Gaussian distribution of its prediction error, e.g.:

P_yaw(α) = (1 / (σ_α √(2π))) · exp(−(α − α̂ − μ_α)² / (2σ_α²)),

and analogously for P_pitch(β) and P_roll(γ), where μ_α and σ_α are the mean and standard deviation of the yaw-angle prediction error, μ_β and σ_β those of the pitch-angle prediction error, and μ_γ and σ_γ those of the roll-angle prediction error.
In this embodiment, linear regression is first used to calculate the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t), where m_α, m_β, m_γ are the linear regression parameters; these three parameters are solved by the least squares method from the historical yaw, pitch, and roll data within the window [t0 − 1, t0]. Second, data statistics show that the predicted user orientation at time (t0 + t) differs from the true value, and that the difference follows a Gaussian distribution, i.e., e_α ~ N(μ_α, σ_α), e_β ~ N(μ_β, σ_β), e_γ ~ N(μ_γ, σ_γ). From this, the per-angle prediction-correctness probabilities P_yaw(α), P_pitch(β), P_roll(γ) can be calculated. Since the yaw, pitch, and roll angles are independent, the Euler-angle prediction-correctness probability can be calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ).
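The linear-regression prediction and the Gaussian error model described above can be sketched as follows (a minimal illustration with hypothetical function names; the window handling and error statistics are assumptions, and the Gaussian density is used directly as the per-angle correctness score):

```python
import math

def fit_slope(ts, values):
    """Least-squares slope m of values ~ m*t + c over a history window;
    plays the role of m_alpha, m_beta, or m_gamma."""
    n = len(ts)
    mt, mv = sum(ts) / n, sum(values) / n
    num = sum((t - mt) * (v - mv) for t, v in zip(ts, values))
    den = sum((t - mt) ** 2 for t in ts)
    return num / den

def predict_angle(history_ts, history_vals, dt):
    """Linear-regression prediction of one Euler angle dt seconds
    ahead of the last sample."""
    return history_vals[-1] + fit_slope(history_ts, history_vals) * dt

def gaussian_pdf(e, mu, sigma):
    """Density of a prediction error e under N(mu, sigma); used as the
    per-angle correctness term (P_yaw, P_pitch, or P_roll)."""
    return math.exp(-((e - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def orientation_correct_probability(errors, stats):
    """P_E = P_yaw * P_pitch * P_roll, assuming the three angles are
    independent; errors = (e_a, e_b, e_g), stats = three (mu, sigma) pairs."""
    p = 1.0
    for e, (mu, sigma) in zip(errors, stats):
        p *= gaussian_pdf(e, mu, sigma)
    return p
```

For perfectly linear head motion the prediction is exact, which makes the slope fit easy to verify in isolation.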
In any of the above technical solutions, preferably, the code rate adaptation unit is specifically configured to: define T as the duration of each video segment, all segments having equal duration; define b_k as the size of the video playing buffer after the k-th video segment set is downloaded; define R_k as the upper limit of the total code rate for downloading the k-th video segment set; define R_min as the minimum value of that total code rate; define C_k as the estimated network bandwidth while downloading the k-th video segment set; and define B_target as the target value of the video playing buffer. The buffer size b_k after downloading the k-th video segment set is calculated as:

b_k = b_{k−1} + T − (R_k · T) / C_k,

and the upper limit R_k of the total code rate for downloading the k-th video segment set is calculated as:

R_k = max(R_min, C_k · (T + b_{k−1} − B_target) / T).
This technical scheme defines a code rate control algorithm based on a target buffer to determine the total available code rate of the video segments, avoiding playback stalls when the buffer is small. Specifically, the change in video buffer size is determined by the network bandwidth and the total code rate of the downloaded video segment set: the buffer grows by one segment duration T when a segment set finishes downloading, and shrinks by the download time (R_k · T) / C_k as the video plays. The buffer size after downloading the k-th video segment set can therefore be calculated as:

b_k = b_{k−1} + T − (R_k · T) / C_k.

To avoid playback stalls, the buffer size is steered toward a target value, i.e., setting b_k = B_target, from which the total code rate of the k-th video segment set follows as:

R_k = C_k · (T + b_{k−1} − B_target) / T.

To avoid a negative result, R_k is lower-bounded by a minimum value R_min, so that:

R_k = max(R_min, C_k · (T + b_{k−1} − B_target) / T).
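The target-buffer code rate control above can be sketched as follows (illustrative only; T and buffer sizes are in seconds, and bandwidth and code rates share one unit, so their ratio is dimensionless):

```python
def next_buffer(b_prev, T, R_k, C_k):
    """Buffer after downloading the k-th segment set: gains one
    segment duration T, loses the download time R_k*T/C_k."""
    return b_prev + T - (R_k * T) / C_k

def rate_limit(b_prev, T, C_k, B_target, R_min):
    """Total-code-rate upper bound that steers the buffer toward
    B_target, clamped below by R_min to avoid negative rates."""
    R_k = C_k * (T + b_prev - B_target) / T
    return max(R_min, R_k)
```

When the buffer already sits at the target, the chosen rate keeps it there; a drained buffer drives the rate down to R_min.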
In any of the above technical solutions, preferably, the optimized video selection unit is specifically configured to: define N as the number of video segments; define M as the number of code rate levels of the video segments; define r_{i,j} as the code rate of the i-th video segment at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define d_{i,j} as the quality distortion value of the i-th video segment at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define x_{i,j} as whether the i-th video segment is selected at the j-th code rate level, x_{i,j} = 1 meaning selected and x_{i,j} = 0 not selected, all x_{i,j} forming the set X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M}; define s_i as the area of the sphere corresponding to the i-th video segment, with Σ_{i=1}^{N} s_i = 1; define Φ(X) as the expected distortion of the video segments; define Ψ(X) as the spatial quality variation of the video segments; and define R as the upper limit of the total code rate of the video segments. The expected distortion Φ(X) of the video segments is calculated as:

Φ(X) = Σ_{i=1}^{N} P_i · Σ_{j=1}^{M} x_{i,j} · d_{i,j},

where P_i is the probability of the i-th video segment being viewed. The spatial quality variation Ψ(X) of the video segments is calculated as:

Ψ(X) = Σ_{i=1}^{N} P_i · | Σ_{j=1}^{M} x_{i,j} · d_{i,j} − Σ_{i'=1}^{N} s_{i'} · Σ_{j=1}^{M} x_{i',j} · d_{i',j} |,

i.e., the viewing-probability-weighted deviation of each segment's distortion from the area-weighted average distortion, where P_i is the probability of the i-th video segment being viewed. The QoE model is:

min_X Φ(X) + η · Ψ(X),

where η is the optimization target weight, subject to Σ_{i=1}^{N} Σ_{j=1}^{M} x_{i,j} · r_{i,j} ≤ R and Σ_{j=1}^{M} x_{i,j} = 1 for each i.
In this technical scheme, two QoE factors are considered when selecting video segments. The first is the expected distortion, representing the expected value of the distortion given the probability that each video segment is viewed; the second is the spatial quality variation, representing the quality smoothness across segments. The target when solving for the video segments the client should obtain is therefore to maximize QoE by establishing an optimization model. Specifically, Φ(X) is defined as the expected distortion of the video segments, Ψ(X) as their spatial quality variation, η as the optimization target weight, and R as the upper limit of the total code rate of the video segments, which the total code rate of the client's selection cannot exceed; in each adaptation stage this value is the total code rate R_k of the k-th video segment set obtained in the code rate adaptation stage. The optimization problem can then be defined as:

min_X Φ(X) + η · Ψ(X),

subject to Σ_{i=1}^{N} Σ_{j=1}^{M} x_{i,j} · r_{i,j} ≤ R and Σ_{j=1}^{M} x_{i,j} = 1, 1 ≤ i ≤ N.
in each self-adaptive stage, the client solves the objective function and the conditional function to obtain a video segment required to be obtained, then sends a request to the server for downloading, and enters the next self-adaptive stage after downloading is completed until the client finishes watching the video.
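The per-stage selection above can be sketched with a brute-force search over the rate-level choices, practical only for small N and M (the Ψ term below — deviation from the area-weighted mean distortion — is one plausible reading of "spatial quality variation" and is an assumption, as are the function and parameter names):

```python
from itertools import product

def select_segments(rates, dists, probs, areas, R, eta):
    """Exhaustively pick one code rate level per segment minimizing
    Phi(X) + eta * Psi(X) under the total-code-rate budget R.
    rates[i][j], dists[i][j]: code rate / distortion of segment i at
    level j; probs[i]: viewing probability P_i; areas[i]: sphere
    area s_i (summing to 1)."""
    n = len(rates)
    best_cost, best_choice = float("inf"), None
    for choice in product(*[range(len(r)) for r in rates]):
        if sum(rates[i][j] for i, j in enumerate(choice)) > R:
            continue  # violates the total-code-rate constraint
        d = [dists[i][j] for i, j in enumerate(choice)]
        mean_d = sum(areas[i] * d[i] for i in range(n))  # area-weighted mean distortion
        phi = sum(probs[i] * d[i] for i in range(n))     # expected distortion
        psi = sum(probs[i] * abs(d[i] - mean_d) for i in range(n))  # spatial variation
        cost = phi + eta * psi
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_choice  # None if no selection fits the budget
```

The M^N search space is exponential; a real client would use a knapsack-style or greedy solver, but the brute force makes the objective and constraint easy to check on toy inputs.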
In a fifth aspect of the present invention, a panoramic video transmission system is provided, including: the panoramic video transmission device in any one of the above technical solutions is used for a client; and the panoramic video transmission device in any one of the technical schemes is used for a server side.
According to the transmission system of the panoramic video of the present invention, the transmission device of the panoramic video for the server in any one of the above technical solutions and the transmission device of the panoramic video for the client in any one of the above technical solutions are adopted, so that all the beneficial effects of the transmission device of the panoramic video for the server and the transmission device of the panoramic video for the client are provided, and are not described herein again.
In a sixth aspect of the present invention, a computer device is provided, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor is configured to execute the steps of the method for transmitting panoramic video for a server according to any one of the above technical solutions; or the processor is configured to perform the steps of the transmission method for the client-side panoramic video according to any one of the above technical solutions.
A seventh aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the transmission method for panoramic video of a server according to any one of the above technical solutions; or the computer program when executed by the processor, implements the steps of the method for transmitting panoramic video for a client as in any of the above-mentioned technical solutions.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 shows a schematic flow diagram of a transmission method of panoramic video according to an embodiment of the first aspect of the present invention;
fig. 2 shows a flow chart of a transmission method of a panoramic video according to an embodiment of the second aspect of the present invention;
fig. 3 shows a schematic block diagram of a transmission apparatus of panoramic video according to an embodiment of a third aspect of the present invention;
fig. 4 shows a schematic block diagram of a transmission apparatus of a panoramic video according to an embodiment of a fourth aspect of the present invention;
fig. 5 shows a schematic block diagram of a transmission system of panoramic video according to an embodiment of a fifth aspect of the present invention;
FIG. 6 shows a schematic diagram of a computer device according to an embodiment of a sixth aspect of the present invention;
FIG. 7 shows a block-based panoramic video transport framework in accordance with an embodiment of the present invention;
FIG. 8 shows a schematic diagram of a blocking process for panoramic video according to one embodiment of the invention;
FIG. 9 shows a schematic diagram of user orientation in terms of Euler angles, according to one embodiment of the present invention;
FIG. 10 shows a schematic diagram of user orientation prediction and true value difference data statistics, according to one embodiment of the invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
As shown in fig. 1, a flow chart of a transmission method of a panoramic video according to an embodiment of the first aspect of the present invention is schematically shown. The transmission method is used for a server and comprises the following steps:
102, partitioning, coding and slicing a panoramic video according to preset configuration information to obtain video clips and a media description file;
step 104, storing the video clip and the media description file in the server.
According to the panoramic video transmission method provided by the present invention, parameters such as the number of spatial blocks, the block width and height, the encoding parameters, and the video segment duration can be predefined at the server as a configuration file of configuration information. The panoramic video and the configuration file are then retrieved and processed by a processor executing the video blocking; specifically, the panoramic video is blocked, encoded, and sliced to obtain video segments and a media description file, and the processed video segments and media description file are stored in an HTTP server for later use. When a client downloads the panoramic video, viewpoint adaptation and code rate adaptation are performed on the video segments and the media description file, and the video segments to be downloaded are solved by an optimization method, thereby improving video quality, reducing spatial quality fluctuation, and reducing video playback stalling.
In the above embodiment, preferably, the format of the panoramic video includes an ERP format and a CMP format; the configuration information includes: the number, width, height, playing duration, encoding parameters, and code rate level of the video segments.
In this embodiment, the format of the obtained panoramic video includes, but is not limited to, an ERP format and a CMP format, and after the processor that executes video blocking obtains the panoramic video in the ERP format or the CMP format, the processor performs processing according to preset configuration information, where the configuration information includes, but is not limited to, the number of blocks, the width and height of the blocks, encoding parameters, video segment duration, and other parameters.
In any of the above embodiments, preferably, the media description file includes, but is not limited to, spatial location information, encoding information, bitrate, quality distortion value, and Uniform Resource Locator (URL) of the video clip.
As shown in fig. 2, a flowchart of a transmission method of a panoramic video according to an embodiment of the second aspect of the present invention is shown. The transmission method is used for a client, and is used in cooperation with a server in combination with the transmission method of the panoramic video in any one of the embodiments, and the transmission method comprises the following steps:
step 202, acquiring and analyzing a media description file from a server side, and downloading a video clip according to the media description file;
step 204, in the downloading process, obtaining the head position information of the user, and predicting the probability of the video clip being watched according to the head position information;
step 206, calculating the size of a video playing buffer area, and calculating the upper limit value of the total code rate of the video clip according to the network bandwidth estimation value and the size of the video playing buffer area;
and step 208, calculating the spatial quality change of the video clip and the expected distortion of the video clip, establishing a QoE model according to the spatial quality change and the expected distortion of the video clip, and selecting the video clip needing to be downloaded.
In this embodiment, when downloading a panoramic video, a client user in a network acquires the uniform resource locator of the panoramic video from the media description file and downloads accordingly. During downloading, viewpoint adaptation, code rate adaptation, and the establishment of an optimization model that maximizes QoE determine which segments are downloaded. Specifically, in viewpoint adaptation, the probability that a future block is observed is predicted from the user's historical head position information; the total video segment code rate that can be transmitted is calculated from the network bandwidth and the size of the playing buffer; finally, the video segments to be downloaded are solved by an optimization method that considers the overall video quality, the spatial quality fluctuation caused by spatial blocking, and the expected distortion of each video segment under its viewing probability. By establishing this optimization model the client obtains video segments selectively, which reduces the data volume of panoramic video transmission, improves video quality, reduces spatial quality fluctuation and playback stalling, and improves the user experience of watching panoramic video in a bandwidth-limited network environment.
In the above embodiment, preferably, the step of obtaining the head position information of the user and predicting the probability of the video segments being viewed according to the head position information specifically includes: define (α, β, γ) as the Euler angles of the user orientation, where α is the yaw angle, β is the pitch angle, and γ is the roll angle, the head position information including these Euler angles; define t0 as the current time and t as the prediction interval; define (φ, θ) as a point on the sphere corresponding to a video segment, where φ is the latitude and θ is the longitude; define U_i as the number of points on the sphere corresponding to the i-th video segment; define A(φ, θ) as the set of user orientations from which the point (φ, θ) can be seen; calculate, from the Euler angles (α, β, γ) of the user orientation, the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t) and the correct probability P_E(α, β, γ) of that prediction; and calculate, from P_E(α, β, γ), the probability p(φ, θ) that the point (φ, θ) on the sphere is viewed, using the formula:

p(φ, θ) = (1 / |A(φ, θ)|) · Σ_{(α,β,γ) ∈ A(φ,θ)} P_E(α, β, γ).

The probability P_i of the i-th video segment being viewed is the mean of the viewing probabilities of the points on the sphere corresponding to that segment:

P_i = (1 / U_i) · Σ_{(φ,θ) ∈ segment i} p(φ, θ).
This embodiment defines the specific steps the client performs for viewpoint adaptation in each adaptation stage. Specifically, to calculate the probability that a video block is seen, the probability that each point on the sphere is seen is calculated first. For a point (φ, θ) on the sphere, since the point may be visible from several user orientations, the probability p(φ, θ) that it is seen is calculated as the mean of the prediction-correctness probabilities over the set A(φ, θ) of orientations from which the point is visible, namely:

p(φ, θ) = (1 / |A(φ, θ)|) · Σ_{(α,β,γ) ∈ A(φ,θ)} P_E(α, β, γ).

The probability P_i that a video block is viewed is then calculated as the mean of the viewing probabilities of the U_i sphere points corresponding to that block, i.e.:

P_i = (1 / U_i) · Σ_{(φ,θ) ∈ block i} p(φ, θ),

where P_E(α, β, γ) is the correct probability of the user orientation predicted for time (t0 + t).
In any of the above embodiments, preferably, the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t) is calculated as:

α̂(t0 + t) = α(t0) + m_α · t,  β̂(t0 + t) = β(t0) + m_β · t,  γ̂(t0 + t) = γ(t0) + m_γ · t,

where m_α, m_β, m_γ are linear regression parameters. The correct probability P_E(α, β, γ) of the predicted user orientation is calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ), where P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles, respectively, each obtained from the Gaussian distribution of its prediction error, e.g.:

P_yaw(α) = (1 / (σ_α √(2π))) · exp(−(α − α̂ − μ_α)² / (2σ_α²)),

and analogously for P_pitch(β) and P_roll(γ), where μ_α and σ_α are the mean and standard deviation of the yaw-angle prediction error, μ_β and σ_β those of the pitch-angle prediction error, and μ_γ and σ_γ those of the roll-angle prediction error.
In this embodiment, linear regression is first used to calculate the predicted user orientation (α̂, β̂, γ̂) at time (t0 + t), where m_α, m_β, m_γ are the linear regression parameters; these three parameters are solved by the least squares method from the historical yaw, pitch, and roll data within the window [t0 − 1, t0]. Second, data statistics show that the predicted user orientation at time (t0 + t) differs from the true value, and that the difference follows a Gaussian distribution, i.e., e_α ~ N(μ_α, σ_α), e_β ~ N(μ_β, σ_β), e_γ ~ N(μ_γ, σ_γ). From this, the per-angle prediction-correctness probabilities P_yaw(α), P_pitch(β), P_roll(γ) can be calculated. Since the yaw, pitch, and roll angles are independent, the Euler-angle prediction-correctness probability can be calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ).
In any of the above embodiments, preferably, the step of calculating the size of the video playing buffer and calculating the upper limit of the total code rate of the video segments according to the network bandwidth estimate and the buffer size specifically includes: define T as the duration of each video segment, all segments having equal duration; define b_k as the size of the video playing buffer after the k-th video segment set is downloaded; define R_k as the upper limit of the total code rate for downloading the k-th video segment set; define R_min as the minimum value of that total code rate; define C_k as the estimated network bandwidth while downloading the k-th video segment set; and define B_target as the target value of the video playing buffer. The buffer size b_k after downloading the k-th video segment set is calculated as:

b_k = b_{k−1} + T − (R_k · T) / C_k,

and the upper limit R_k of the total code rate for downloading the k-th video segment set is calculated as:

R_k = max(R_min, C_k · (T + b_{k−1} − B_target) / T).
In this embodiment, a code rate control algorithm based on the target buffer is defined to determine the total available code rate of the video segments, avoiding playback stalls when the buffer is small. Specifically, the change in video buffer size is determined by the network bandwidth and the total code rate of the downloaded video segment set: the buffer grows by one segment duration T when a segment set finishes downloading, and shrinks by the download time (R_k · T) / C_k as the video plays. The buffer size after downloading the k-th video segment set can therefore be calculated as:

b_k = b_{k−1} + T − (R_k · T) / C_k.

To avoid playback stalls, the buffer size is steered toward a target value, i.e., setting b_k = B_target, from which the total code rate of the k-th video segment set follows as:

R_k = C_k · (T + b_{k−1} − B_target) / T.

To avoid a negative result, R_k is lower-bounded by a minimum value R_min, so that:

R_k = max(R_min, C_k · (T + b_{k−1} − B_target) / T).
In any of the above embodiments, preferably: define N as the number of video segments; define M as the number of code rate levels of the video segments; define r_{i,j} as the code rate of the i-th video segment at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define d_{i,j} as the quality distortion value of the i-th video segment at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define x_{i,j} as whether the i-th video segment is selected at the j-th code rate level, x_{i,j} = 1 meaning selected and x_{i,j} = 0 not selected, all x_{i,j} forming the set X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M}; define s_i as the area of the sphere corresponding to the i-th video segment, with Σ_{i=1}^{N} s_i = 1; define Φ(X) as the expected distortion of the video segments; define Ψ(X) as the spatial quality variation of the video segments; and define R as the upper limit of the total code rate of the video segments. The expected distortion Φ(X) of the video segments is calculated as:

Φ(X) = Σ_{i=1}^{N} P_i · Σ_{j=1}^{M} x_{i,j} · d_{i,j},

where P_i is the probability of the i-th video segment being viewed. The spatial quality variation Ψ(X) of the video segments is calculated as:

Ψ(X) = Σ_{i=1}^{N} P_i · | Σ_{j=1}^{M} x_{i,j} · d_{i,j} − Σ_{i'=1}^{N} s_{i'} · Σ_{j=1}^{M} x_{i',j} · d_{i',j} |,

i.e., the viewing-probability-weighted deviation of each segment's distortion from the area-weighted average distortion, where P_i is the probability of the i-th video segment being viewed. The QoE model is:

min_X Φ(X) + η · Ψ(X),

where η is the optimization target weight, subject to Σ_{i=1}^{N} Σ_{j=1}^{M} x_{i,j} · r_{i,j} ≤ R and Σ_{j=1}^{M} x_{i,j} = 1 for each i.
In this embodiment, two QoE factors need to be considered when selecting video segments. The first is the expected distortion, representing the expected value of the distortion given the probability that each video segment is viewed; the second is the spatial quality variation, representing the quality smoothness across segments. The target when solving for the video segments the client should obtain is therefore to maximize QoE by establishing an optimization model. Specifically, Φ(X) is defined as the expected distortion of the video segments, Ψ(X) as their spatial quality variation, η as the optimization target weight, and R as the upper limit of the total code rate of the video segments, which the total code rate of the client's selection cannot exceed; in each adaptation stage this value is the total code rate R_k of the k-th video segment set obtained in the code rate adaptation stage. The optimization problem can then be defined as:

min_X Φ(X) + η · Ψ(X),

subject to Σ_{i=1}^{N} Σ_{j=1}^{M} x_{i,j} · r_{i,j} ≤ R and Σ_{j=1}^{M} x_{i,j} = 1, 1 ≤ i ≤ N.
in each self-adaptive stage, the client solves the objective function and the conditional function to obtain a video segment required to be obtained, then sends a request to the server for downloading, and enters the next self-adaptive stage after downloading is completed until the client finishes watching the video.
As shown in fig. 3, a schematic block diagram of a transmission apparatus of a panoramic video according to an embodiment of the third aspect of the present invention is illustrated. The transmission apparatus 300 is used for a server and includes:
a processing unit 302, configured to perform blocking, encoding, and slicing processing on the panoramic video according to preset configuration information to obtain a video clip and a media description file;
a storage unit 304, configured to store the video segment and the media description file in the server.
According to the panoramic video transmission apparatus 300 of the present invention, at the server side, parameters such as the number of spatial blocks, the width and height of the blocks, the coding parameters, and the video segment duration can be predefined and used as a configuration file of configuration information. A processor executing the video blocking then retrieves the panoramic video and the configuration file and processes them; specifically, the panoramic video is blocked, coded, and sliced to obtain video clips and a media description file, and the processed video clips and media description file are stored in an HTTP server for later use. When a client downloads the panoramic video, it performs viewpoint adaptation and code rate adaptation on the video segments according to the media description file and solves for the video segments to be downloaded through an optimization method, thereby improving video quality and reducing spatial quality fluctuation and video playing stalls.
In the above embodiment, preferably, the format of the panoramic video includes an ERP format and a CMP format; the configuration information includes: the number, width, height, playing duration, encoding parameters, and code rate level of the video segments.
In this embodiment, the format of the obtained panoramic video includes, but is not limited to, an ERP format and a CMP format, and after the processor that executes video blocking obtains the panoramic video in the ERP format or the CMP format, the processor performs processing according to preset configuration information, where the configuration information includes, but is not limited to, the number of blocks, the width and height of the blocks, encoding parameters, video segment duration, and other parameters.
In any of the above embodiments, preferably, the media description file includes spatial location information, encoding information, bitrate, quality distortion value, and Uniform Resource Locator (URL) of the video clip.
In this embodiment, the media description file includes, but is not limited to, spatial location information, encoding information, bitrate, quality distortion value, Uniform Resource Locator (URL) of the video clip.
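For illustration only, a media description carrying these fields might be represented client-side as the following structure; the field names and values here are hypothetical, not the patent's actual file schema.

```python
# Hypothetical client-side view of a media description file; the field
# names and values are illustrative, not the patent's actual schema.
media_description = {
    "segment_duration_s": 1.0,
    "tiles": [
        {
            "tile_id": 0,
            # spatial location information of the video clip
            "position": {"row": 0, "col": 0, "width": 640, "height": 640},
            # encoding information
            "codec": "hevc",
            # one entry per code-rate level: bitrate, quality distortion
            # value, and the Uniform Resource Locator of the clip
            "representations": [
                {"bitrate_kbps": 500,  "distortion": 0.042, "url": "tile0_500k.m4s"},
                {"bitrate_kbps": 1000, "distortion": 0.021, "url": "tile0_1000k.m4s"},
            ],
        },
    ],
}

# After parsing, the client can e.g. look up the URL of the
# lowest-distortion representation of a tile:
best = min(media_description["tiles"][0]["representations"],
           key=lambda rep: rep["distortion"])
```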
As shown in fig. 4, a schematic block diagram of a transmission apparatus of a panoramic video according to an embodiment of the fourth aspect of the present invention is illustrated. The transmission apparatus 400 is used for a client, cooperates with the panoramic video transmission apparatus of any of the above embodiments, and includes:
a downloading unit 402, configured to acquire and parse the media description file from the server, and download the video segment according to the media description file;
a viewpoint adaptive unit 404, configured to acquire head position information of the user during the downloading process, and predict the probability that the video segment is viewed according to the head position information;
a code rate self-adapting unit 406, configured to calculate the size of a video playing buffer, and calculate an upper limit value of a total code rate of a video segment according to the network bandwidth estimation value and the size of the video playing buffer;
and an optimized video selection unit 408, configured to calculate a spatial quality change of the video segment and an expected distortion of the video segment, establish a QoE model according to the spatial quality change and the expected distortion of the video segment, and select a video segment to be downloaded.
According to the panoramic video transmission apparatus 400, when downloading a panoramic video, a client user in the network can obtain the uniform resource locators of the panoramic video from the media description file and download accordingly. During downloading, the client performs viewpoint adaptation and code rate adaptation, and establishes an optimization model that maximizes QoE to determine which segments to download. Specifically, in the viewpoint adaptation, the probability that a future block is viewed is predicted from the user's historical head position information; in the code rate adaptation, the total transmittable video clip code rate is calculated according to the network bandwidth and the size of the play buffer; finally, the video clips to be downloaded are solved for by an optimization method that considers the overall quality of the video, the spatial quality fluctuation caused by spatial blocking, and the expected distortion of the video clips under their viewing probabilities. By establishing this optimization model, the client obtains video clips selectively, which reduces the data volume of panoramic video transmission, improves video quality, reduces spatial quality fluctuation and video playing stalls, and improves the user experience of watching panoramic video in bandwidth-limited network environments.
In the above embodiment, preferably, the viewpoint adaptive unit 404 is specifically configured to: define (α, β, γ) as the Euler angle of the user orientation, where α is the yaw angle, β is the pitch angle, and γ is the roll angle; define t0 as the current time; define a prediction interval; define P(φ, θ) as a point on the sphere corresponding to the video segment, where φ is the latitude and θ is the longitude; define U_i as the number of points on the sphere corresponding to the i-th video clip; define O_{P(φ,θ)} as the set of orientations from which the point P(φ, θ) can be seen. From the Euler angle (α, β, γ) of the user orientation, calculate the predicted value (α̂, β̂, γ̂) of the user orientation at time (t0 + prediction interval) and the correct probability P_E(α, β, γ) of the predicted value of the user orientation. According to the correct probability P_E(α, β, γ) of the predicted value of the user orientation at time (t0 + prediction interval), calculate the probability p(P(φ, θ)) that the point P(φ, θ) on the sphere corresponding to the video clip is viewed, by the formula:

p(P(φ, θ)) = (1 / |O_{P(φ,θ)}|) · Σ_{(α,β,γ) ∈ O_{P(φ,θ)}} P_E(α, β, γ)

The probability P_i that a video segment is viewed is calculated as the mean of the viewed probabilities of the points on the sphere corresponding to the video clip:

P_i = (1 / U_i) · Σ_{P(φ,θ) ∈ clip i} p(P(φ, θ))
In this embodiment, the specific steps of the viewpoint adaptation that the client performs at each adaptation stage are defined. To calculate the probability that a video block is seen, the probability that each point on the sphere is seen is calculated first. For a point P(φ, θ) on the sphere, since the point is likely to be seen from several user orientations, the probability p(P(φ, θ)) that the point is seen is calculated as the mean of the correct probabilities over the set O_{P(φ,θ)} of orientations from which the point is visible, namely:

p(P(φ, θ)) = (1 / |O_{P(φ,θ)}|) · Σ_{(α,β,γ) ∈ O_{P(φ,θ)}} P_E(α, β, γ)

The probability P_i that a video block is viewed is calculated as the mean of the viewed probabilities of the U_i points on the sphere corresponding to this video block, namely:

P_i = (1 / U_i) · Σ_{P(φ,θ) ∈ block i} p(P(φ, θ))

where P_E(α, β, γ) is the correct probability of the predicted user orientation at time (t0 + prediction interval).
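The two averaging steps described above can be sketched directly; the function names below are ours, for illustration only.

```python
def point_view_probability(orientation_probs):
    """Probability that a sphere point P(phi, theta) is seen: the mean of
    the orientation-prediction correct probabilities P_E over the set O_P
    of orientations from which the point is visible."""
    return sum(orientation_probs) / len(orientation_probs)

def block_view_probability(point_probs):
    """Probability P_i that a video block is viewed: the mean of the
    viewed probabilities of the U_i sphere points inside the block."""
    return sum(point_probs) / len(point_probs)

# e.g. a block covered by two sphere points, each visible from a few
# predicted orientations:
p1 = point_view_probability([1.0, 0.5, 0.0])
p2 = point_view_probability([0.2, 0.4])
P_i = block_view_probability([p1, p2])
```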
In any of the above embodiments, preferably, the predicted value (α̂, β̂, γ̂) of the user orientation is calculated as:

α̂ = α(t0) + m_α · Δt,  β̂ = β(t0) + m_β · Δt,  γ̂ = γ(t0) + m_γ · Δt

where m_α, m_β, m_γ are linear regression parameters and Δt is the prediction interval. The correct probability P_E(α, β, γ) of the predicted value of the user orientation is calculated by the formula P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ), where P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles respectively, calculated as:

P_yaw(α) = (1 / (√(2π) · σ_α)) · exp(−(α − α̂ − μ_α)² / (2σ_α²))

and analogously for P_pitch(β) and P_roll(γ), where μ_α is the mean of the yaw angle error distribution and σ_α is the standard deviation of the predicted yaw angle; μ_β is the mean of the pitch angle error distribution and σ_β is the standard deviation of the predicted pitch angle; μ_γ is the mean of the roll angle error distribution and σ_γ is the standard deviation of the predicted roll angle.
In this embodiment, linear regression is first used to calculate the predicted value (α̂, β̂, γ̂) of the user orientation at time (t0 + prediction interval), where m_α, m_β, m_γ are the linear regression parameters; these three parameters can be solved by the least squares method from the historical yaw, pitch, and roll data in the window [t0 − 1, t0]. Second, data statistics show that the predicted value of the user orientation at time (t0 + prediction interval) differs from its true value, and that the prediction errors follow Gaussian distributions, i.e. e_α ~ N(μ_α, σ_α), e_β ~ N(μ_β, σ_β), e_γ ~ N(μ_γ, σ_γ). From this, the Euler-angle prediction correct-probability components P_yaw(α), P_pitch(β), P_roll(γ) can be calculated. Since the yaw, pitch, and roll angles are independent, the Euler-angle prediction correct probability P_E(α, β, γ) can be calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ).
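A minimal sketch of the per-angle prediction pipeline described above, assuming a simple least-squares line fit over the history window and a Gaussian density as the error model; the function names and the exact regression form are our assumptions, not the patent's.

```python
import math

def predict_angle(times, angles, t_pred):
    """Extrapolate one Euler angle to time t_pred with a least-squares
    line fit over the history window (a sketch of the per-angle linear
    regression; the exact form is an assumption)."""
    n = len(times)
    t_mean = sum(times) / n
    a_mean = sum(angles) / n
    slope = sum((t - t_mean) * (a - a_mean) for t, a in zip(times, angles)) \
            / sum((t - t_mean) ** 2 for t in times)
    intercept = a_mean - slope * t_mean
    return slope * t_pred + intercept

def angle_correct_prob(error, mu, sigma):
    """Correct-probability component of one angle, assuming the prediction
    error follows the Gaussian N(mu, sigma) found by data statistics."""
    return math.exp(-((error - mu) ** 2) / (2.0 * sigma ** 2)) \
           / (math.sqrt(2.0 * math.pi) * sigma)

def euler_correct_prob(errors, mus, sigmas):
    """P_E as the product of the independent yaw/pitch/roll components."""
    p = 1.0
    for e, m, s in zip(errors, mus, sigmas):
        p *= angle_correct_prob(e, m, s)
    return p
```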
In any of the above embodiments, preferably, the code rate adaptation unit 406 is specifically configured to: define T as the duration of a video clip, where the durations of all video clips are equal; define b_k as the size of the video playing buffer after the k-th video clip set is downloaded; define R_k as the upper limit value of the total code rate for downloading the k-th video clip set; define R_min as the minimum value of the total code rate for downloading the k-th video clip set; define C_k as the estimated network bandwidth during the download of the k-th video clip set; define B_target as the target value of the video playing buffer. The size b_k of the video playing buffer after the k-th video clip set is downloaded is calculated by the formula:

b_k = b_{k−1} − (R_k · T) / C_k + T

The upper limit value R_k of the total code rate for downloading the k-th video clip set is calculated by the formula:

R_k = max(R_min, C_k · (b_{k−1} + T − B_target) / T)
In this embodiment, a code rate control algorithm based on the target buffer is defined to determine the total available code rate of the video clips, thereby avoiding playback stalls when the buffer is small. Specifically, the change in the video buffer size is determined by the network bandwidth and the total code rate of the downloaded video clip set: downloading the k-th video clip set of total code rate R_k occupies (R_k · T) / C_k seconds, during which the buffer drains as the video plays, and the completed download adds T seconds of playable video to the buffer. Therefore, the video buffer size after downloading the k-th video clip set can be calculated as:

b_k = b_{k−1} − (R_k · T) / C_k + T

To avoid video playback stalls, the video buffer size is controlled toward a target value, i.e. b_k = B_target is imposed, from which the total code rate of the k-th video clip set is found to be:

R_k = C_k · (b_{k−1} + T − B_target) / T

To avoid negative results, a minimum value R_min is set for R_k, so R_k can be calculated as:

R_k = max(R_min, C_k · (b_{k−1} + T − B_target) / T)
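The target-buffer control described in this embodiment can be sketched as follows; the function names are ours, and the formulas follow the buffer-evolution reasoning above.

```python
def total_bitrate_limit(b_prev, T, C_k, B_target, R_min):
    """Upper limit R_k on the total code rate of the k-th clip set: pick
    the rate that steers the playback buffer toward B_target, clamped
    below by R_min to avoid a negative result."""
    return max(R_min, C_k * (b_prev + T - B_target) / T)

def buffer_after_download(b_prev, T, R_k, C_k):
    """Buffer evolution: downloading R_k * T bits at bandwidth C_k drains
    R_k * T / C_k seconds of playback, then the finished set adds T
    seconds of playable video."""
    return b_prev - R_k * T / C_k + T

# With the buffer above target, the controller picks a high rate so the
# buffer drains back to exactly B_target:
R_k = total_bitrate_limit(b_prev=5.0, T=1.0, C_k=8.0, B_target=4.0, R_min=1.0)
b_k = buffer_after_download(5.0, 1.0, R_k, 8.0)
```

When the buffer is already below target, the raw formula would go negative, and the R_min clamp keeps the request at a small but positive rate.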
In any of the above embodiments, preferably, the optimized video selection unit 408 is specifically configured to: define N as the number of video clips; define M as the number of code rate levels of a video clip; define r_{i,j} as the code rate of the i-th video clip at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define d_{i,j} as the quality distortion value of the i-th video clip at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define x_{i,j} as whether the i-th video clip is selected at the j-th code rate level, where x_{i,j} = 1 represents selected and x_{i,j} = 0 represents not selected, 1 ≤ i ≤ N, 1 ≤ j ≤ M, and all i and j form the set X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M}; define s_i as the area of the sphere corresponding to the i-th video segment, with Σ_{i=1}^{N} s_i = 1; define Φ(X) as the expected distortion of the video segment; define Ψ(X) as the spatial quality variation of the video segment; define R as the upper limit value of the total code rate of the video clips. The expected distortion Φ(X) of a video segment is calculated as:

Φ(X) = Σ_{i=1}^{N} Σ_{j=1}^{M} P_i · s_i · d_{i,j} · x_{i,j}

where P_i is the probability of the video segment being viewed;

the spatial quality variation Ψ(X) of a video segment is calculated by the following formula:

Ψ(X) = Σ_{i=1}^{N} Σ_{j=1}^{M} P_i · s_i · (d_{i,j} − Φ(X))² · x_{i,j}

where P_i is the probability of the video segment being viewed;

the QoE model is:

max_X QoE(X) = −(Φ(X) + η · Ψ(X))

where η ≥ 0 is the optimization target weight.
In this embodiment, two QoE factors need to be considered when selecting a video clip: the expected distortion and the spatial quality variation. Therefore, the video clips to be obtained by the client are solved for by establishing an optimization model whose objective is to maximize QoE. Specifically, Φ(X) is defined as the expected distortion of the video segment, Ψ(X) as the spatial quality variation of the video segment, η as the optimization target weight, and R as the upper limit value of the total code rate of the video clips; the total code rate of the video clips selected by the client cannot exceed this value, and at each adaptive stage it is the total code rate R_k of the k-th video clip set obtained in the code rate adaptation stage. The optimization problem can then be defined as:

min_X Φ(X) + η · Ψ(X)

subject to:

Σ_{i=1}^{N} Σ_{j=1}^{M} r_{i,j} · x_{i,j} ≤ R,
Σ_{j=1}^{M} x_{i,j} = 1 for every 1 ≤ i ≤ N,
x_{i,j} ∈ {0, 1}.
In each adaptive stage, the client solves this objective function under its constraints to obtain the video segments it needs, sends a request to the server to download them, and enters the next adaptive stage once the download completes, until the client finishes watching the video.
As shown in fig. 5, a schematic block diagram of a transmission system of panoramic video according to an embodiment of the fifth aspect of the present invention is illustrated. The transmission system 500 includes: the panoramic video transmission apparatus 502 of any one of the above embodiments, used for a server; and the panoramic video transmission apparatus 504 of any one of the above embodiments, used for a client.
As shown in FIG. 6, a schematic diagram of a computer device according to an embodiment of the sixth aspect of the present invention is illustrated. The computer device 1 comprises: a memory 12, a processor 14, and a computer program stored on the memory 12 and executable on the processor 14, the processor 14 being configured to perform the steps of the panoramic video transmission method for a server as in any of the above embodiments, or to perform the steps of the panoramic video transmission method for a client as in any of the above embodiments.
As shown in fig. 7, a block-based panoramic video transmission framework according to an embodiment of the present invention is illustrated.
In this embodiment, first, parameters such as the number of spatial blocks, the width and height of the blocks, coding parameters, video segment duration, and the like are predefined as configuration files of configuration information; then, a processor executing the video blocking acquires the panoramic video and the configuration file in the ERP format and performs required processing. Specifically, the original panoramic video is partitioned, encoded and sliced to obtain a video clip and a media description file, wherein the media description file comprises information such as spatial position information of the video clip, encoding information of the video clip, data size of the video clip, quality distortion of the video clip and the like; and then, storing the generated media description file and media segment in an HTTP server for standby.
When a client in the network downloads the panoramic video, it needs to perform viewpoint adaptation and code rate adaptation. The code rate adaptation is based on a target-buffer code rate control algorithm; specifically, the total transmittable video clip code rate is calculated according to the network bandwidth and the size of the play buffer. The viewpoint adaptation predicts the viewing probability of each video block based on a viewing probability model; specifically, the probability that a future block is viewed is predicted from the user's historical head position information. Finally, the client solves for the video segments to be downloaded by an optimization method which considers the total quality of the video and the spatial quality fluctuation caused by spatial blocking, thereby improving video quality and reducing spatial quality fluctuation and video playing stalls.
As shown in fig. 8, a schematic diagram of a blocking process for a panoramic video according to an embodiment of the present invention is illustrated.
In this embodiment, the original panoramic video is blocked, encoded, and sliced to obtain video clips and a media description file. The processing involves the following definitions: define W as the width of the panoramic video; define H as the height of the panoramic video; define N as the number of video blocks; define M as the number of code rate levels of the video; define r_{i,j} as the actual code rate of the video clip of the i-th video block at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M; define d_{i,j} as the distortion value of the video clip of the i-th video block at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M.
As shown in fig. 9, a schematic diagram of user orientation in euler angles according to one embodiment of the present invention. The orientation of the user's head is represented by euler angles, which are yaw, pitch, and roll angles, respectively.
As shown in fig. 10, a diagram of user orientation prediction value and true value difference data statistics according to one embodiment of the present invention.
In this embodiment, the data statistics show that the predicted value (α̂, β̂, γ̂) of the user head orientation at time (t0 + prediction interval) differs from its true value, and that the prediction errors follow Gaussian distributions, i.e. e_α ~ N(μ_α, σ_α), e_β ~ N(μ_β, σ_β), e_γ ~ N(μ_γ, σ_γ), where μ_α is the mean of the yaw angle error distribution and σ_α is the standard deviation of the predicted yaw angle; μ_β is the mean of the pitch angle error distribution and σ_β is the standard deviation of the predicted pitch angle; μ_γ is the mean of the roll angle error distribution and σ_γ is the standard deviation of the predicted roll angle. From this, the Euler-angle prediction correct-probability components P_yaw(α), P_pitch(β), P_roll(γ) can be calculated. Since the yaw, pitch, and roll angles are independent, the Euler-angle prediction correct probability P_E(α, β, γ) can be calculated as P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ).
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A transmission method of panoramic video is used for a client, and is characterized in that the transmission method comprises the following steps:
acquiring and analyzing a media description file from a server, and downloading a video clip according to the media description file, wherein the media description file and the video clip are obtained by the server by blocking, coding and slicing the panoramic video according to preset configuration information;
in the downloading process, acquiring head position information of a user, and predicting the probability of the video clip being watched according to the head position information;
calculating the size of a video playing buffer area, and calculating the upper limit value of the total code rate of the video clip according to the network bandwidth estimation value and the size of the video playing buffer area;
calculating the space quality change of the video clip and the expected distortion of the video clip, establishing a QoE model according to the space quality change and the expected distortion of the video clip, and selecting the video clip needing to be downloaded;
the method for predicting the probability of the video clip being watched according to the head position information comprises the following steps:
the head position information comprises the Euler angle of the user orientation, defined as (α, β, γ), wherein α is the yaw angle, β is the pitch angle, and γ is the roll angle;
defining t0 as the current time; defining a prediction interval;
defining P(φ, θ) as a point on the sphere corresponding to the video segment, wherein φ is the latitude and θ is the longitude;
defining U_i as the number of points on the sphere corresponding to the i-th video clip;
defining O_{P(φ,θ)} as the set of orientations from which the point P(φ, θ) can be seen;
calculating, from the Euler angle (α, β, γ) of the user orientation, the predicted value (α̂, β̂, γ̂) of the user orientation at time (t0 + prediction interval) and the correct probability P_E(α, β, γ) of the predicted value of the user orientation;
calculating, according to the correct probability P_E(α, β, γ) of the predicted value of the user orientation at time (t0 + prediction interval), the probability p(P(φ, θ)) that the point P(φ, θ) on the sphere corresponding to the video clip is viewed, by the formula:
p(P(φ, θ)) = (1 / |O_{P(φ,θ)}|) · Σ_{(α,β,γ) ∈ O_{P(φ,θ)}} P_E(α, β, γ);
the probability P_i that the video segment is viewed is calculated as the mean of the viewed probabilities of the points on the sphere corresponding to the video clip:
P_i = (1 / U_i) · Σ_{P(φ,θ) ∈ clip i} p(P(φ, θ)).
2. The method for transmitting a panoramic video according to claim 1, wherein
the predicted value (α̂, β̂, γ̂) of the user orientation is calculated as:
α̂ = α(t0) + m_α · Δt, β̂ = β(t0) + m_β · Δt, γ̂ = γ(t0) + m_γ · Δt,
wherein m_α, m_β, m_γ are linear regression parameters and Δt is the prediction interval;
the correct probability P_E(α, β, γ) of the predicted value of the user orientation is calculated by the formula:
P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ),
wherein P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles respectively, calculated as:
P_yaw(α) = (1 / (√(2π) · σ_α)) · exp(−(α − α̂ − μ_α)² / (2σ_α²)),
and analogously for P_pitch(β) and P_roll(γ),
wherein:
μ_α is the mean of the yaw angle error distribution, and σ_α is the standard deviation of the predicted yaw angle;
μ_β is the mean of the pitch angle error distribution, and σ_β is the standard deviation of the predicted pitch angle;
μ_γ is the mean of the roll angle error distribution, and σ_γ is the standard deviation of the predicted roll angle.
3. The method for transmitting a panoramic video according to claim 1, wherein the step of calculating the size of the video playing buffer and calculating the upper limit value of the total bitrate of the video clip according to the network bandwidth estimation value and the size of the video playing buffer specifically comprises:
defining T as the video segment duration, wherein the durations of all video segments are equal;
defining b_k as the size of the video playing buffer after the k-th video clip set is downloaded;
defining R_k as the upper limit value of the total code rate for downloading the k-th video clip set;
defining R_min as the minimum value of the total code rate for downloading the k-th video clip set;
defining C_k as the estimated network bandwidth during the download of the k-th video clip set;
defining B_target as the target value of the video playing buffer;
the size b_k of the video playing buffer after the k-th video clip set is downloaded is calculated by the formula:
b_k = b_{k−1} − (R_k · T) / C_k + T;
the upper limit value R_k of the total code rate for downloading the k-th video clip set is calculated by the formula:
R_k = max(R_min, C_k · (b_{k−1} + T − B_target) / T).
4. The method for transmitting a panoramic video according to claim 1, wherein:
defining N as the number of the video clips;
defining M as the number of code rate levels of the video clips;
defining r_{i,j} as the code rate of the i-th video clip at the j-th code rate level, wherein 1 ≤ i ≤ N and 1 ≤ j ≤ M;
defining d_{i,j} as the quality distortion value of the i-th video clip at the j-th code rate level, wherein 1 ≤ i ≤ N and 1 ≤ j ≤ M;
defining x_{i,j} as whether the i-th video clip is selected at the j-th code rate level, wherein x_{i,j} = 1 represents selected and x_{i,j} = 0 represents not selected, 1 ≤ i ≤ N, 1 ≤ j ≤ M, and all i and j form the set X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M};
defining s_i as the area of the sphere corresponding to the i-th video segment, with Σ_{i=1}^{N} s_i = 1;
defining Φ(X) as the expected distortion of the video segment;
defining Ψ(X) as the spatial quality variation of the video segment;
defining R as the upper limit value of the total code rate of the video clips;
the expected distortion Φ(X) of the video segment is calculated as:
Φ(X) = Σ_{i=1}^{N} Σ_{j=1}^{M} P_i · s_i · d_{i,j} · x_{i,j},
wherein P_i is the probability of the video segment being viewed;
the spatial quality variation Ψ(X) of the video segment is calculated as:
Ψ(X) = Σ_{i=1}^{N} Σ_{j=1}^{M} P_i · s_i · (d_{i,j} − Φ(X))² · x_{i,j},
wherein P_i is the probability of the video segment being viewed;
the QoE model is:
max_X QoE(X) = −(Φ(X) + η · Ψ(X)),
wherein η ≥ 0 is the optimization target weight.
5. a transmission apparatus of panoramic video for a client, the transmission apparatus comprising:
the downloading unit is used for acquiring and analyzing a media description file from a server side and downloading a video clip according to the media description file, wherein the media description file and the video clip are obtained by the server side by blocking, coding and slicing the panoramic video according to preset configuration information;
the viewpoint adaptive unit is used for acquiring the head position information of a user during the downloading process, and predicting the probability that the video clip is viewed according to the head position information;
the code rate self-adaption unit is used for calculating the size of a video playing buffer area and calculating the upper limit value of the total code rate of the video clip according to the network bandwidth estimation value and the size of the video playing buffer area;
the optimized video selection unit is used for calculating the space quality change of the video clip and the expected distortion of the video clip, establishing a QoE model according to the space quality change and the expected distortion of the video clip, and selecting the video clip needing to be downloaded;
the viewpoint adaptive unit is specifically configured to:
the head position information comprises the Euler angle of the user orientation, defined as (α, β, γ), wherein α is the yaw angle, β is the pitch angle, and γ is the roll angle;
defining t0 as the current time; defining a prediction interval;
defining P(φ, θ) as a point on the sphere corresponding to the video segment, wherein φ is the latitude and θ is the longitude;
defining U_i as the number of points on the sphere corresponding to the i-th video clip;
defining O_{P(φ,θ)} as the set of orientations from which the point P(φ, θ) can be seen;
calculating, from the Euler angle (α, β, γ) of the user orientation, the predicted value (α̂, β̂, γ̂) of the user orientation at time (t0 + prediction interval) and the correct probability P_E(α, β, γ) of the predicted value of the user orientation;
calculating, according to the correct probability P_E(α, β, γ) of the predicted value of the user orientation at time (t0 + prediction interval), the probability p(P(φ, θ)) that the point P(φ, θ) on the sphere corresponding to the video clip is viewed, by the formula:
p(P(φ, θ)) = (1 / |O_{P(φ,θ)}|) · Σ_{(α,β,γ) ∈ O_{P(φ,θ)}} P_E(α, β, γ);
the probability P_i that the video segment is viewed is calculated as the mean of the viewed probabilities of the points on the sphere corresponding to the video clip:
P_i = (1 / U_i) · Σ_{P(φ,θ) ∈ clip i} p(P(φ, θ)).
6. The panoramic video transmission apparatus according to claim 5, wherein
the predicted value (α̂, β̂, γ̂) of the user orientation is calculated as:
α̂ = α(t0) + m_α · Δt, β̂ = β(t0) + m_β · Δt, γ̂ = γ(t0) + m_γ · Δt,
wherein m_α, m_β, m_γ are linear regression parameters and Δt is the prediction interval;
the correct probability P_E(α, β, γ) of the predicted value of the user orientation is calculated by the formula:
P_E(α, β, γ) = P_yaw(α) · P_pitch(β) · P_roll(γ),
wherein P_yaw(α), P_pitch(β), P_roll(γ) are the correct probabilities of the predicted yaw, pitch, and roll angles respectively, calculated as:
P_yaw(α) = (1 / (√(2π) · σ_α)) · exp(−(α − α̂ − μ_α)² / (2σ_α²)),
and analogously for P_pitch(β) and P_roll(γ),
wherein:
μ_α is the mean of the yaw angle error distribution, and σ_α is the standard deviation of the predicted yaw angle;
μ_β is the mean of the pitch angle error distribution, and σ_β is the standard deviation of the predicted pitch angle;
μ_γ is the mean of the roll angle error distribution, and σ_γ is the standard deviation of the predicted roll angle.
7. The panoramic video transmission apparatus according to claim 5, wherein the code rate adaptation unit is specifically configured to:
define T as the video clip duration, the duration of each video clip being equal;
define b_k as the size of the video playing buffer after the k-th video clip set is downloaded;
define R_k as the upper limit value of the total code rate for downloading the k-th video clip set;
define R_min as the minimum value of the total code rate for downloading the k-th video clip set;
define C_k as the estimated value of the network bandwidth while downloading the k-th video clip set;
define B_target as the target value of the video playing buffer;
the size b_k of the video playing buffer after the k-th video clip set is downloaded is calculated as:
Figure FDA0002527134380000052
the upper limit value R_k of the total code rate for downloading the k-th video clip set is calculated as:
Figure FDA0002527134380000053
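A minimal sketch of the rate adaptation in claim 7, under assumed formulas: the buffer gains T seconds of video per downloaded clip set and drains during the download, and the rate budget R_k scales the bandwidth estimate C_k by how full the buffer is relative to B_target, floored at R_min. The patent's exact formulas for b_k and R_k are in the figure images, so both update rules below are illustrative.

```python
def next_buffer_level(b_prev, T, download_time):
    # b_k: previous level minus playback consumed during the download,
    # plus T seconds of newly buffered video (never below zero).
    return max(0.0, b_prev - download_time) + T

def rate_budget(c_k, b_prev, b_target, r_min):
    # R_k: spend more than the bandwidth estimate when the buffer is above
    # target, less when it is below; never drop under the floor R_min.
    r_k = c_k * (b_prev / b_target)
    return max(r_min, r_k)
```

This buffer-proportional control is a common shape for segment-based streaming: it pushes the buffer toward B_target while keeping the requested total rate tied to the measured bandwidth.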
8. The panoramic video transmission apparatus according to claim 5, wherein the optimized video selection unit is specifically configured to:
define N as the number of video clips;
define M as the number of code rate levels of the video clips;
define r_{i,j} as the code rate of the i-th video clip at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M;
define d_{i,j} as the quality distortion value of the i-th video clip at the j-th code rate level, where 1 ≤ i ≤ N and 1 ≤ j ≤ M;
define x_{i,j} as whether the i-th video clip is selected at the j-th code rate level, where x_{i,j} = 1 denotes selected and x_{i,j} = 0 denotes not selected; for 1 ≤ i ≤ N and 1 ≤ j ≤ M, all x_{i,j} form the set X, X = {x_{i,j} | 1 ≤ i ≤ N, 1 ≤ j ≤ M};
define s_i as the area of the sphere region corresponding to the i-th video clip,
Figure FDA0002527134380000065
define Φ(X) as the expected distortion of the video clips;
define Ψ(X) as the spatial quality variation of the video clips;
define R as the upper limit value of the total code rate of the video clips;
the expected distortion Φ(X) of the video clips is calculated as:
Figure FDA0002527134380000061
where P_i is the probability that the video clip is viewed;
the spatial quality variation Ψ(X) of the video clips is calculated as:
Figure FDA0002527134380000062
where P_i is the probability that the video clip is viewed;
the QoE model is:
Figure FDA0002527134380000063
where η is the optimization target weight,
Figure FDA0002527134380000064
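The QoE model of claim 8 is a constrained selection problem: pick one code-rate level j per clip i so that Φ(X) + η·Ψ(X) is minimized under the total-rate budget R. The small solver below is illustrative: Φ is taken as probability-weighted distortion and Ψ as probability-weighted variation of distortion around the mean, which are assumed standard forms since the patent's exact formulas are in the figure images, and brute-force enumeration is used for clarity, not efficiency.

```python
from itertools import product

def qoe_select(r, d, p, R, eta):
    # r[i][j]: code rate, d[i][j]: distortion, p[i]: viewing probability.
    n, m = len(r), len(r[0])
    best, best_choice = float("inf"), None
    for choice in product(range(m), repeat=n):  # one level j per clip i
        total_rate = sum(r[i][choice[i]] for i in range(n))
        if total_rate > R:
            continue  # violates the total code rate budget R
        # Phi: expected (viewing-probability weighted) distortion.
        phi = sum(p[i] * d[i][choice[i]] for i in range(n))
        # Psi: probability-weighted spread of distortion across clips.
        mean_d = sum(d[i][choice[i]] for i in range(n)) / n
        psi = sum(p[i] * (d[i][choice[i]] - mean_d) ** 2 for i in range(n))
        score = phi + eta * psi
        if score < best:
            best, best_choice = score, choice
    return best_choice

# Two clips, two levels; clip 0 is far more likely to be viewed, and the
# budget R = 4 only allows one clip at the high rate.
rates = [[1.0, 3.0], [1.0, 3.0]]
dist = [[0.8, 0.2], [0.8, 0.2]]
probs = [0.9, 0.1]
choice = qoe_select(rates, dist, probs, R=4.0, eta=0.0)
```

With the budget only covering one high-rate clip, the solver spends it on the clip the viewer is likely to see, which is exactly the behavior the probability weighting in Φ(X) is meant to produce.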
9. A panoramic video transmission system, comprising: the panoramic video transmission apparatus of any one of claims 5 to 8, for a client, and
a panoramic video transmission apparatus for a server side, comprising:
a processing unit, configured to block, encode and slice the panoramic video according to preset configuration information to obtain video clips and media description files;
a storage unit, configured to store the video clips and the media description files on the server.
10. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor is configured to perform the steps of the panoramic video transmission method of any one of claims 1 to 4.
11. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the panoramic video transmission method of any one of claims 1 to 4.
CN201710590143.6A 2017-07-19 2017-07-19 Panoramic video transmission method, transmission device and transmission system Expired - Fee Related CN109286855B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710590143.6A CN109286855B (en) 2017-07-19 2017-07-19 Panoramic video transmission method, transmission device and transmission system


Publications (2)

Publication Number Publication Date
CN109286855A CN109286855A (en) 2019-01-29
CN109286855B true CN109286855B (en) 2020-10-13

Family

ID=65184903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710590143.6A Expired - Fee Related CN109286855B (en) 2017-07-19 2017-07-19 Panoramic video transmission method, transmission device and transmission system

Country Status (1)

Country Link
CN (1) CN109286855B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109714631A (en) * 2019-02-26 2019-05-03 华南理工大学 One kind being based on HTTP video flowing dynamic self-adaptation bit-rate selection method
CN110099294B (en) * 2019-06-11 2021-05-07 山东大学 Dynamic self-adaptive streaming media code rate allocation method for keeping space-time consistency of 360-degree video
CN110602506B (en) * 2019-09-25 2023-04-28 咪咕视讯科技有限公司 Video processing method, network device and computer readable storage medium
CN112714315B (en) * 2019-10-24 2023-02-28 上海交通大学 Layered buffering method and system based on panoramic video
CN112738646B (en) * 2019-10-28 2023-06-23 阿里巴巴集团控股有限公司 Data processing method, device, system, readable storage medium and server
CN113453076B (en) * 2020-03-24 2023-07-14 中国移动通信集团河北有限公司 User video service quality evaluation method, device, computing equipment and storage medium
CN112055263B (en) * 2020-09-08 2021-08-13 西安交通大学 360-degree video streaming transmission system based on significance detection
CN112584119B (en) * 2020-11-24 2022-07-22 鹏城实验室 Self-adaptive panoramic video transmission method and system based on reinforcement learning
CN112565208A (en) * 2020-11-24 2021-03-26 鹏城实验室 Multi-user panoramic video cooperative transmission method, system and storage medium
CN112822564B (en) * 2021-01-06 2023-03-24 鹏城实验室 Viewpoint-based panoramic video adaptive streaming media transmission method and system
CN112929691B (en) * 2021-01-29 2022-06-14 复旦大学 Multi-user panoramic video transmission method
CN114640870B (en) * 2022-03-21 2023-10-03 陕西师范大学 QoE-driven wireless VR video self-adaptive transmission optimization method and system
WO2023197811A1 (en) * 2022-04-12 2023-10-19 北京字节跳动网络技术有限公司 Video downloading method and apparatus, video transmission method and apparatus, terminal device, server and medium
CN114979799A (en) * 2022-05-20 2022-08-30 北京字节跳动网络技术有限公司 Panoramic video processing method, device, equipment and storage medium
CN115278354A (en) * 2022-06-14 2022-11-01 北京大学 Method and system for evaluating and optimizing video transmission quality based on user behavior index
CN117596376B (en) * 2024-01-18 2024-04-19 深圳大学 360-Degree video intelligent edge transmission method, system, wearable device and medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10204658B2 (en) * 2014-07-14 2019-02-12 Sony Interactive Entertainment Inc. System and method for use in playing back panorama video content
GB2536025B (en) * 2015-03-05 2021-03-03 Nokia Technologies Oy Video streaming method
CN105323552B (en) * 2015-10-26 2019-03-12 北京时代拓灵科技有限公司 A kind of panoramic video playback method and system
JP6058184B1 (en) * 2016-03-10 2017-01-11 株式会社コロプラ Method and program for controlling head mounted display system
CN105915937B (en) * 2016-05-10 2019-12-13 上海乐相科技有限公司 Panoramic video playing method and device
CN106060515B (en) * 2016-07-14 2018-11-06 腾讯科技(深圳)有限公司 Panorama pushing method for media files and device

Also Published As

Publication number Publication date
CN109286855A (en) 2019-01-29

Similar Documents

Publication Publication Date Title
CN109286855B (en) Panoramic video transmission method, transmission device and transmission system
Xie et al. 360ProbDASH: Improving QoE of 360 video streaming using tile-based HTTP adaptive streaming
US20230283653A1 (en) Methods and apparatus to reduce latency for 360-degree viewport adaptive streaming
Xiao et al. Optile: Toward optimal tiling in 360-degree video streaming
CN108833880B (en) Method and device for predicting viewpoint and realizing optimal transmission of virtual reality video by using cross-user behavior mode
Zhou et al. Clustile: Toward minimizing bandwidth in 360-degree video streaming
US10979663B2 (en) Methods and apparatuses for image processing to optimize image resolution and for optimizing video streaming bandwidth for VR videos
US10440407B2 (en) Adaptive control for immersive experience delivery
CN110248212B (en) Multi-user 360-degree video stream server-side code rate self-adaptive transmission method and system
US11095936B2 (en) Streaming media transmission method and client applied to virtual reality technology
EP1994759B1 (en) A method and device for adapting a temporal frequency of a sequence of video images
Jiang et al. Plato: Learning-based adaptive streaming of 360-degree videos
EP3406310A1 (en) Method and apparatuses for handling visual virtual reality content
CN112584119B (en) Self-adaptive panoramic video transmission method and system based on reinforcement learning
US11539985B2 (en) No reference realtime video quality assessment
US8311128B2 (en) Method of processing a coded data stream
US20200404241A1 (en) Processing system for streaming volumetric video to a client device
CN115037962A (en) Video adaptive transmission method, device, terminal equipment and storage medium
KR102129115B1 (en) Method and apparatus for transmitting adaptive video in real time using content-aware neural network
CN111988661A (en) Incorporating visual objects into video material
WO2018120857A1 (en) Streaming media technology-based method and apparatus for processing video data
Begen Quality-aware HTTP adaptive streaming
US11917327B2 (en) Dynamic resolution switching in live streams based on video quality assessment
Koch et al. Increasing the Quality of 360 {\deg} Video Streaming by Transitioning between Viewport Quality Adaptation Mechanisms
US11490094B2 (en) 360-degree video streaming method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230414

Address after: 100871 No. 5, the Summer Palace Road, Beijing, Haidian District

Patentee after: Peking University

Address before: 100871 No. 5, the Summer Palace Road, Beijing, Haidian District

Patentee before: Peking University

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201013