CN112468828B

CN112468828B - Code rate distribution method and device for panoramic video, mobile terminal and storage medium

Info

Publication number: CN112468828B
Application number: CN202011337673.8A
Authority: CN
Inventors: 张磊; 索琰琰; 伍曦明; 崔来中
Original assignee: Shenzhen University
Current assignee: Shenzhen University
Priority date: 2020-11-25
Filing date: 2020-11-25
Publication date: 2022-06-17
Anticipated expiration: 2040-11-25
Also published as: CN112468828A

Abstract

The embodiments of the invention disclose a code rate allocation method and device for a panoramic video, a mobile terminal and a storage medium. The method comprises the following steps: acquiring the historical user visual area prediction error degree in real time; determining the optimal image block segmentation mode according to the historical user visual area prediction error degree; determining the target image blocks in the currently predicted user visual area and the weights of the target image blocks according to the optimal image block segmentation mode; and substituting the weights of the target image blocks into a preset maximized experience quality model, and determining the target video clip to be downloaded and the code rate of the target video clip according to the preset maximized experience quality model. The technical scheme provided by the embodiments of the invention tolerates prediction errors better, reduces the influence of visual area prediction errors, and overcomes the problem that the user visual area is difficult to predict accurately on a mobile terminal, thereby reducing pauses during viewing and improving the user's quality of experience.

Description

Code rate distribution method and device for panoramic video, mobile terminal and storage medium

Technical Field

The embodiment of the invention relates to the technical field of video coding, in particular to a code rate allocation method and device for a panoramic video, a mobile terminal and a storage medium.

Background

In recent years, panoramic video has become widely available. Its panoramic nature gives viewers a unique viewing experience, so it is increasingly popular with users. However, the panoramic nature also makes a panoramic video much larger than a conventional video of the same perceived quality, so transmitting it consumes more bandwidth, which is very difficult in wireless networks. In addition, compared with conventional video, panoramic video imposes higher computation, battery and other overheads on resource-limited mobile terminals, which greatly reduces the quality of experience and slows the adoption of panoramic video on mobile terminals. When a user watches a panoramic video, however, only a limited portion of the spherical image is visible at any one time, and that portion is determined by the user's visual area. Therefore, each frame of the panoramic video can be divided into smaller non-overlapping rectangular areas (image blocks), each of which can be transmitted and decoded independently. The mobile terminal then only needs to download and decode the image blocks within the user's visual area to ensure the user's quality of experience, which greatly saves bandwidth and reduces resource consumption on the mobile terminal.

The existing panoramic video transmission method based on the image blocks mainly adopts a fixed image block segmentation mode, and after the image blocks are segmented, the image blocks to be downloaded and the code rates thereof are adjusted according to the current prediction visual area, the bandwidth and the occupation condition of a buffer area.

However, the most important precondition of a panoramic video transmission method that uses a fixed image block segmentation mode and adjusts the image blocks to be downloaded according to the user visual area is that the user's future visual area can be predicted accurately. This is difficult on a mobile terminal: to ensure that the prediction algorithm runs fast enough to meet the real-time requirement of the panoramic video stream, existing methods usually deploy a lightweight prediction algorithm on the mobile terminal, which yields low prediction accuracy and is more easily affected by the video type. If the user's future visual area is predicted incorrectly, the mobile terminal may fail to download in advance some image blocks within the user's visual area, so playback pauses, which seriously affects the user's quality of experience.

Disclosure of Invention

Embodiments of the present invention provide a code rate allocation method and device for a panoramic video, a mobile terminal, and a storage medium, so as to reduce the influence of visual area prediction errors, overcome the problem that the user visual area is difficult to predict accurately on a mobile terminal, reduce pauses during viewing, and improve the user's quality of experience.

In a first aspect, an embodiment of the present invention provides a method for allocating a bitrate of a panoramic video, where the method includes:

acquiring the degree of visual area prediction error of a historical user in real time;

determining an optimal image block segmentation mode according to the visual area prediction error degrees of the historical user;

determining a target image block in a current prediction user visual area and the weight of the target image block according to the optimal image block segmentation mode;

and substituting the weight of the target image block into a preset maximized experience quality model, and determining a target video clip needing to be downloaded and the code rate of the target video clip according to the preset maximized experience quality model.

In a second aspect, an embodiment of the present invention further provides a device for allocating a bitrate for a panoramic video, where the device includes:

the historical error acquisition module is used for acquiring the degree of the visual area prediction error of the historical user in real time;

the segmentation mode determining module is used for determining the optimal image block segmentation mode according to the visual area prediction error degrees of the historical user;

the image block weight determining module is used for determining a target image block in a current prediction user visual area and the weight of the target image block according to the optimal image block segmentation mode;

and the code rate determining module is used for substituting the weight of the target image block into a preset maximized experience quality model and determining a target video clip to be downloaded and the code rate of the target video clip according to the preset maximized experience quality model.

In a third aspect, an embodiment of the present invention further provides a mobile terminal, where the mobile terminal includes:

one or more processors;

a memory for storing one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors implement the rate allocation method for panoramic video according to any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the rate allocation method for panoramic video provided in any embodiment of the present invention.

The embodiment of the invention provides a code rate allocation method of a panoramic video, which comprises the steps of firstly collecting historical user visual area prediction error degrees in real time, then determining an optimal image block segmentation mode according to the historical user visual area prediction error degrees, further determining target image blocks in a current prediction user visual area and weights of the target image blocks according to the optimal image block segmentation mode, and substituting the weights of the target image blocks into a preset maximized experience quality model to determine a target video clip required to be downloaded and the code rate of the target video clip. According to the technical scheme provided by the embodiment of the invention, the segmentation mode of the image blocks is timely adjusted according to the historical user visual area prediction error, so that better tolerance to the prediction error is realized, the influence of the visual area prediction error is reduced, and the problem that the visual area of the user is difficult to accurately predict on the mobile terminal is solved, so that the pause in the user viewing is reduced, and the experience quality of the user is improved.

Drawings

Fig. 1 is a flowchart of a code rate allocation method for a panoramic video according to a first embodiment of the present invention;

Fig. 2 is a flowchart of a code rate allocation method for a panoramic video according to a second embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a code rate allocation device for a panoramic video according to a third embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a mobile terminal according to a fourth embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.

Example one

Fig. 1 is a flowchart of a code rate allocation method for a panoramic video according to an embodiment of the present invention. The method can be executed by the code rate allocation device for the panoramic video provided by the embodiment of the invention, and the device can be realized in a hardware and/or software mode and can be generally integrated in the mobile terminal. As shown in fig. 1, the method specifically comprises the following steps:

S11, acquiring the historical user visual area prediction error degree in real time.

Specifically, during playback of a panoramic video, in order to ensure the user's quality of experience, save bandwidth and reduce resource consumption, each video frame is generally segmented into image blocks that can be transmitted and decoded independently, the user visual area is predicted, and the image blocks to be downloaded are determined from the predicted user visual area. The user visual area may be predicted by linear regression, ridge regression, support vector regression, long short-term memory (LSTM) regression, and the like, which is not limited in this embodiment. When the user watches the video, an actual user visual area is produced. During playback, the mobile terminal can therefore acquire in real time the error degree between the user visual area predicted at a historical moment and the actual user visual area, namely the historical user visual area prediction error degree.
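The embodiment does not specify how the error degree is measured. As a minimal sketch, assuming each visual area is summarised by the (yaw, pitch) of its centre and the error degree is the great-circle angle between the predicted and actual centres, averaged over a recent window, it could be computed as follows. The function names and the averaging window are hypothetical.

```python
import math


def angular_error_deg(pred, actual):
    """Great-circle angle (degrees) between two (yaw, pitch) directions in degrees."""
    yaw_p, pitch_p = (math.radians(v) for v in pred)
    yaw_a, pitch_a = (math.radians(v) for v in actual)
    cos_angle = (math.sin(pitch_p) * math.sin(pitch_a)
                 + math.cos(pitch_p) * math.cos(pitch_a) * math.cos(yaw_p - yaw_a))
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def historical_error_deg(pred_history, actual_history):
    """Mean angular error over a window of past (predicted, actual) centres.

    Averaging is an assumption; the source only says the error is acquired in real time.
    """
    errors = [angular_error_deg(p, a) for p, a in zip(pred_history, actual_history)]
    return sum(errors) / len(errors)
```

A maximum or a percentile over the window would be an equally plausible choice if a more conservative estimate is wanted.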

S12, determining the optimal image block segmentation mode according to the historical user visual area prediction error degree.

Specifically, different image block segmentation modes produce image blocks of different sizes, so for the same currently predicted user visual area, the region covered by the image blocks that intersect it differs from one segmentation mode to another. When selecting image blocks, the mobile terminal downloads, as far as possible, the image blocks that intersect the currently predicted user visual area. Different image block segmentation modes tolerate prediction error to different extents: larger image blocks absorb more prediction error and therefore reduce video pause time, but they also increase resource usage. Therefore, in this embodiment, an image block segmentation mode with a suitable tolerance may be selected as the optimal image block segmentation mode according to the historical user visual area prediction error degree.

Optionally, before determining the optimal image block segmentation mode according to the historical user visual area prediction error degree, the method further includes: determining, for each of a plurality of preset image block segmentation modes, the average evaluation index at each of a plurality of preset prediction error degrees, wherein the evaluation index balances the precision and the recall of the image blocks selected under the corresponding preset image block segmentation mode; and establishing a lookup table that records, for each average evaluation index and each preset image block segmentation mode, the maximum preset prediction error degree that can be tolerated. Correspondingly, determining the optimal image block segmentation mode according to the historical user visual area prediction error degree includes: determining, in the lookup table and according to the historical user visual area prediction error degree, the optimal image block segmentation mode corresponding to a preset evaluation index.
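The lookup step can be sketched as follows. This is only an illustration: the table values are placeholders rather than measured results, and the tie-break rule (prefer the finest grid that still tolerates the error, since smaller image blocks waste fewer resources) is an assumption that the source does not state. The names `LOOKUP` and `choose_grid` are hypothetical.

```python
# LOOKUP[index_threshold][n] = largest preset error (degrees) at which an n x n grid
# still reaches the given average evaluation index. Placeholder values only.
LOOKUP = {
    0.7: {4: 60, 6: 50, 8: 40, 10: 30},
}


def choose_grid(error_deg, index_threshold, table=LOOKUP):
    """Pick a grid size for the observed historical prediction error degree."""
    row = table[index_threshold]
    tolerant = [n for n, max_err in row.items() if max_err >= error_deg]
    if not tolerant:
        # No grid tolerates this error: fall back to the coarsest grid (largest blocks).
        return min(row)
    # Assumed tie-break: the finest grid that still tolerates the error.
    return max(tolerant)
```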

Specifically, to obtain the tolerance of each image block segmentation mode to prediction error, the performance of different image block segmentation modes can be tested under different prediction errors. Random deviations can be added to the users' real viewing trajectories in a data set to serve as the preset prediction error degrees, and the trajectories with the added deviation are used as the predicted visual area results. The average evaluation index of each preset prediction error degree under each preset image block segmentation mode is then calculated, and a lookup table is built from these averages; the table stores, for each average evaluation index, the maximum preset prediction error degree that each preset image block segmentation mode can tolerate. The evaluation index is defined to balance precision against recall when image blocks are selected under the corresponding preset image block segmentation mode. Optionally, the evaluation index may be obtained by the following formula:

$$F_\beta = \frac{(1+\beta^2)\cdot \mathrm{Precision}\cdot \mathrm{Recall}}{\beta^2\cdot \mathrm{Precision} + \mathrm{Recall}}$$

wherein $F_\beta$ represents the evaluation index, Precision represents the precision, Recall represents the recall, and $\beta$ represents a weighting parameter; a larger $\beta$ gives higher weight to the user's quality of experience, and a smaller $\beta$ gives higher weight to transmission efficiency. Preferably, $\beta$ may be set to 1, the preset prediction error degrees include, but are not limited to, 30°, 40°, 50°, 60°, 70° and 80°, and the preset image block segmentation modes include, but are not limited to, 4 × 4, 5 × 5, 6 × 6, 7 × 7, 8 × 8, 9 × 9 and 10 × 10. The precision and the recall can be obtained by the following formulas:

$$\mathrm{Precision} = \frac{TP}{TP+FP}, \qquad \mathrm{Recall} = \frac{TP}{TP+FN}$$

wherein TP represents the number of image blocks selected according to the predicted user visual area that are in the actual user visual area, FP represents the number of image blocks selected according to the predicted user visual area that are not in the actual user visual area, and FN represents the number of image blocks in the actual user visual area that were not selected. After the lookup table is obtained, the optimal image block segmentation mode that can tolerate the historical user visual area prediction error degree can be determined in the lookup table according to the preset evaluation index set by the user. Preferably, the preset evaluation index may include, but is not limited to, 0.5, 0.6, 0.7, 0.8 and 0.9.
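As a small worked illustration of the evaluation index, the sketch below counts TP, FP and FN from two sets of image block identifiers and combines precision and recall in the standard F-beta form. The function name and the set-based representation are assumptions made for the example.

```python
def f_beta(selected, actual, beta=1.0):
    """Evaluation index of the selected image blocks against the blocks actually viewed."""
    selected, actual = set(selected), set(actual)
    tp = len(selected & actual)   # selected and actually in the visual area
    fp = len(selected - actual)   # selected but not in the visual area
    fn = len(actual - selected)   # in the visual area but not selected
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

For example, selecting blocks {1, 2, 3, 4} when blocks {3, 4, 5} are actually viewed gives TP = 2, FP = 2 and FN = 1, so the precision is 1/2, the recall is 2/3, and the index with beta = 1 is 4/7.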

S13, determining the target image blocks in the currently predicted user visual area and the weights of the target image blocks according to the optimal image block segmentation mode.

Specifically, after the optimal image block segmentation mode is determined, the target image blocks located in the currently predicted user visual area can be picked out according to that mode, and a weight is set for each target image block so that the code rate of the target video segment to be downloaded can be determined from these weights. For example, the weight of the target image blocks may be set to 0.5.
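A minimal sketch of this selection step is given below. It assumes an equirectangular frame cut into an n x n grid and approximates the visual area as a rectangle in yaw and pitch (a simplification, since the real visual area is distorted near the poles). The field-of-view defaults and all names are hypothetical.

```python
def tiles_in_view(center_yaw, center_pitch, grid, fov_h=100.0, fov_v=100.0):
    """Return {(row, col)} of image blocks overlapping a rectangular visual area.

    Yaw is in [-180, 180), pitch in [-90, 90]; row 0 is the top of the frame.
    """
    tile_w, tile_h = 360.0 / grid, 180.0 / grid
    top = min(90.0, center_pitch + fov_v / 2)
    bottom = max(-90.0, center_pitch - fov_v / 2)
    left = center_yaw - fov_h / 2
    selected = set()
    for row in range(grid):
        t_top = 90.0 - row * tile_h
        t_bottom = t_top - tile_h
        if t_bottom >= top or t_top <= bottom:
            continue  # no vertical overlap
        for col in range(grid):
            t_left = -180.0 + col * tile_w
            # Offset of the block's left edge from the view's left edge, wrapped to [0, 360).
            offset = (t_left - left) % 360.0
            if offset < fov_h or offset + tile_w > 360.0:
                selected.add((row, col))
    return selected


def target_weights(tiles, weight=0.5):
    """Assign the same weight to every target image block."""
    return {tile: weight for tile in tiles}
```

The modulo on the yaw offset handles a visual area that straddles the ±180° seam of the frame.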

S14, substituting the weights of the target image blocks into the preset maximized experience quality model, and determining the target video clip to be downloaded and the code rate of the target video clip according to the preset maximized experience quality model.

Specifically, when the mobile terminal needs to allocate code rates to the image blocks to be downloaded, it performs code rate adaptation and establishes a preset maximized quality of experience (QoE) model to determine the code rate of each image block. After the model is built, the mobile terminal can traverse all the video clips, find the code rates that maximize the QoE score, and then download the corresponding target video clip. When the preset maximized quality of experience model is established, the weight of each target image block can be used as one of its parameters, so that the importance of each image block, as determined from the currently predicted user visual area, is taken into account.
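The traversal can be pictured as an exhaustive search over candidate code rates. The sketch below is an assumption about how such a search might look, not the patented procedure: `score` stands in for whatever QoE model is used, and the function name is hypothetical.

```python
from itertools import product


def best_allocation(tiles, rates, score):
    """Try every assignment of a candidate code rate to each image block.

    `score` is a caller-supplied function mapping {tile: rate} to a QoE score.
    """
    best, best_score = None, float("-inf")
    for combo in product(rates, repeat=len(tiles)):
        allocation = dict(zip(tiles, combo))
        s = score(allocation)
        if s > best_score:
            best, best_score = allocation, s
    return best, best_score
```

The number of combinations grows exponentially with the number of image blocks, which is one reason to bound the search space with constraints.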

Optionally, before substituting the weights of the target image blocks into the preset maximized quality of experience model and determining the target video segment to be downloaded and the code rate of the target video segment according to the preset maximized quality of experience model, the method further includes: determining the quality level of each video segment in the panoramic video, the quality fluctuation between every two consecutive video segments, and the pause time required for downloading each video segment; and establishing the preset maximized experience quality model according to the quality level, the quality fluctuation and the pause time. Further optionally, determining the quality level of each video segment in the panoramic video includes:

wherein,

represents the quality level, $w_{i,j}$ represents the weight of the jth image block in the ith video segment of the panoramic video, $q(b_{i,j})$ represents a non-decreasing mapping function between the code rate of the jth image block in the ith video segment and the user's quality of experience, $b_{i,j}$ represents the code rate of the jth image block in the ith video segment, and n represents the total number of image blocks in the ith video segment;

determining quality fluctuations between each two consecutive video segments in the panoramic video, comprising:

wherein,

indicating a quality fluctuation;

determining the required pause time for downloading each video clip in the panoramic video, comprising:

wherein,

indicates the pause time, $C_i$ represents the predicted bandwidth throughput when downloading the ith video segment, $B_i$ indicates the buffer size at the start of downloading the ith video segment, and $t_{miss}$ represents the time taken to re-download image blocks that were not downloaded in advance;

establishing the preset maximized experience quality model according to the quality level, the quality fluctuation and the pause time, comprising:

wherein $Q_i$ represents the quality of experience, and p, q and r represent weight coefficients.
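The formula images for these four quantities are not reproduced in this text, so the sketch below is only one plausible reading that is consistent with the symbol definitions. It assumes the quality level is the weighted sum of mapped code rates, the quality fluctuation is the absolute change in quality level from the previous segment, the pause time is the download time in excess of the buffer plus $t_{miss}$, and the score is p times the quality level minus q times the fluctuation minus r times the pause time. All of these forms, and the function and parameter names, are assumptions.

```python
def segment_qoe(rates, weights, prev_quality, bandwidth, buffer_s, t_miss=0.0,
                seg_len=1.0, p=1.0, q=1.0, r=1.0, quality_of=lambda b: b):
    """Score one video segment under the assumed model; returns (score, quality level).

    rates:     {tile: code rate b_ij}, in the same units per second as `bandwidth`
    weights:   {tile: weight w_ij}
    bandwidth: predicted throughput C_i; buffer_s: buffer B_i in seconds
    quality_of: stand-in for the non-decreasing mapping q(b)
    """
    quality = sum(weights[t] * quality_of(b) for t, b in rates.items())
    fluctuation = abs(quality - prev_quality)
    download_time = sum(rates.values()) * seg_len / bandwidth
    pause = max(download_time - buffer_s, 0.0) + t_miss
    return p * quality - q * fluctuation - r * pause, quality
```

With code rates {a: 2, b: 1}, weights {a: 0.5, b: 0.25}, a previous quality level of 1.0, unit bandwidth and a 2-second buffer, the quality level is 1.25, the fluctuation is 0.25, the pause time is 1.0, and the score with unit coefficients is 0.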

Specifically, the preset maximized experience quality model takes into account the total quality of the video within the visual area, the quality fluctuation of the video in the spatial and temporal domains, and the pause time of the video. Through this model, the mobile terminal can calculate the target video clips that actually need to be downloaded and the code rate of each target video clip, thereby further reducing the video pause time, reducing quality fluctuation in the spatial and temporal domains, and improving the user's quality of experience. The non-decreasing mapping function maps the code rate to a discrete level, where level 0 indicates that the image block is not downloaded; $B_1 = B_{default}$, where $B_{default}$ denotes the default buffer size of the startup phase;

represents the time required to download video segment i, and the buffer size after video segment i is downloaded can be calculated as

If the remaining buffer size is 0, a pause occurs. Here L represents the length of video segment i, that is, the buffer grows by L seconds after video segment i is downloaded.
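Because the buffer formula itself is not reproduced in this text, the following sketch shows one common way to model the update that matches the description: the buffer drains while the segment downloads, cannot go below zero, and then gains L seconds. The exact form, the download-time expression and the names are assumptions.

```python
def next_buffer(buffer_s, rates, bandwidth, seg_len=1.0):
    """Return (buffer after downloading the segment, pause time), both in seconds.

    rates: code rates of the image blocks downloaded for this segment,
           in the same units per second as `bandwidth`.
    """
    download_time = sum(rates) * seg_len / bandwidth
    pause = max(download_time - buffer_s, 0.0)      # playback stalls once the buffer is empty
    remaining = max(buffer_s - download_time, 0.0)  # buffer left when the download finishes
    return remaining + seg_len, pause
```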

Optionally, a model predictive control (MPC) framework may also be applied to optimize the QoE of multiple video segments over a finite horizon. Specifically, the QoE may first be optimized using the prediction information in the optimization window [t, t + k - 1]; the optimization window is then moved to [t + 1, t + k] to optimize the QoE of the next video segment, and so on.
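The sliding window can be sketched as a receding-horizon loop: plan k segments ahead, commit only the decision for the first segment, then shift the window by one. The optimizer is passed in as a hypothetical callable, and committing only the first decision is the usual MPC convention rather than something the source spells out.

```python
def mpc_schedule(num_segments, k, optimise_window):
    """Receding-horizon control over a sequence of video segments.

    `optimise_window(t, horizon)` is a caller-supplied function returning a list of
    per-segment decisions for segments t .. t + horizon - 1.
    """
    decisions = []
    for t in range(num_segments):
        horizon = min(k, num_segments - t)  # the window shrinks near the end of the video
        plan = optimise_window(t, horizon)
        decisions.append(plan[0])           # commit only the first planned decision
    return decisions
```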

On the basis of the foregoing technical solution, optionally, after determining the target tile and the weight of the target tile in the current prediction user view area according to the optimal tile splitting manner, the method further includes: expanding the visual area of the current predicted user according to a preset step length, and determining the weight of a new target image block added after expansion, wherein the weight of the new target image block is lower than that of the target image block; correspondingly, substituting the weight of the target image block into the preset maximized experience quality model, and determining the target video clip to be downloaded and the code rate of the target video clip according to the preset maximized experience quality model, including: and substituting the weight of the target image block and the weight of the new target image block into a preset maximized experience quality model, and determining the target video clip and the code rate of the target video clip according to the preset maximized experience quality model.

Specifically, the currently predicted user visual area may be expanded step by step to account for potential errors in the user visual area prediction, and a lower weight, including but not limited to 0.25 and 0.125, may be set for the new target image blocks added by each expansion. Optionally, $2^{-k}$ is defined as the weight function that assigns a weight to an image block at the kth level, where the level of the target image blocks is set to 1 and the level of the new target image blocks obtained by each subsequent expansion increases by one in turn. For each expansion, the preset step length of the visual area expansion may be a fixed value, or a dynamic value set according to factors such as the prediction error and the moving speed of the user's head or line of sight. Correspondingly, when the target video clip and its code rate are determined, the weights of all determined target image blocks and new target image blocks can be substituted into the preset maximized quality of experience model for calculation.
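The level-based weighting can be illustrated with a short sketch. It assumes the image blocks covered after each expansion are already available as a list of sets (how each set is obtained is left out), and the function name is hypothetical.

```python
def level_weights(tile_sets_by_level):
    """Assign weight 2**-k to each image block at the level k where it first appears.

    tile_sets_by_level[0] holds the target image blocks of the predicted visual area
    (level 1); each later entry holds the blocks covered after one more expansion.
    """
    weights = {}
    for k, tiles in enumerate(tile_sets_by_level, start=1):
        for tile in tiles:
            # setdefault keeps the weight from the lowest level that contains the block.
            weights.setdefault(tile, 2.0 ** -k)
    return weights
```

With two expansions this yields 0.5 for the original target image blocks, 0.25 for blocks added by the first expansion and 0.125 for blocks added by the second, matching the values given in the text.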

On the basis of the above technical solution, optionally, before the method provided in this embodiment is applied, the panoramic video may first be initialized at the server side, so that the mobile terminal downloads the determined target video segments according to the media description file obtained by initialization. Specifically: panoramic videos with different code rates are first partitioned according to different image block segmentation modes; the resulting image blocks are then encoded and sliced to obtain video segments and an initial media description file; the total frame number of the panoramic video and the image block segmentation mode used are added to the initial media description file to obtain the final media description file; and finally all video segments and the corresponding media description files are stored on the server for later use. Adding the total frame number and the image block segmentation mode to the media description file makes it better suited to multi-threaded and asynchronous decoding. Preferably, the duration of a video segment is 1 second, the projection format of the panoramic video is equirectangular projection, the image block segmentation modes include, but are not limited to, 2 × 2, 3 × 3, 4 × 4, 5 × 5, 6 × 6, 7 × 7, 8 × 8, 9 × 9 and 10 × 10, and the media description file includes, but is not limited to, the video code rate, coding information, number of image blocks, video duration, total frame number of the video, and the image block segmentation mode used.
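For illustration only, the fields listed above might be serialised as in the sketch below. The source does not give the file format or field names, so every key and value here is a hypothetical placeholder.

```python
import json

# Hypothetical field names and placeholder values; the actual file format is not specified.
media_description = {
    "video_code_rate_kbps": 4000,
    "codec": "h264",
    "tile_count": 16,
    "duration_s": 60,
    "segment_duration_s": 1,
    "total_frames": 1800,     # added during initialization
    "tile_grid": "4x4",       # added during initialization
    "projection": "equirectangular",
}

print(json.dumps(media_description, indent=2))
```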

According to the technical scheme provided by the embodiment of the invention, the historical user visual area prediction error degree is first collected in real time; the optimal image block segmentation mode is then determined according to that error degree; the target image blocks in the currently predicted user visual area and their weights are determined according to the optimal image block segmentation mode; and the weights of the target image blocks are substituted into the preset maximized experience quality model to determine the target video clips to be downloaded and their code rates. Because the image block segmentation mode is adjusted in time according to the historical user visual area prediction error, prediction errors are better tolerated, the influence of visual area prediction errors is reduced, and the problem that the user visual area is difficult to predict accurately on a mobile terminal is overcome, so that pauses during viewing are reduced and the user's quality of experience is improved.

Example two

Fig. 2 is a flowchart of a code rate allocation method for a panoramic video according to a second embodiment of the present invention. This embodiment further refines the above technical solution: optionally, when the target video segment to be downloaded and its code rate are determined, the available hardware resources, bandwidth and buffer status of the mobile terminal are additionally considered, so as to ensure that the mobile terminal can complete decoding in a short time, thereby avoiding interruption of video playback and further reducing video pauses. Specifically, in this embodiment, before substituting the weights of the target image blocks into the preset maximized quality of experience model and determining the target video segment to be downloaded and the code rate of the target video segment according to the preset maximized quality of experience model, the method further includes: determining the maximum video code rate that supports real-time decoding according to the currently available hardware resources of the mobile terminal and the optimal image block segmentation mode; and determining the total downloadable video code rate according to the current bandwidth of the mobile terminal and the current buffer occupancy. Correspondingly, substituting the weights of the target image blocks into the preset maximized experience quality model and determining the target video clip to be downloaded and the code rate of the target video clip according to the preset maximized experience quality model includes: determining the target video clip and the code rate of the target video clip according to the preset maximized experience quality model, the maximum video code rate and the total video code rate. Correspondingly, as shown in Fig. 2, the method specifically includes the following steps:

S21, acquiring the historical user visual area prediction error degree in real time.

S22, determining the optimal image block segmentation mode according to the historical user visual area prediction error degree.

S23, determining the target image blocks in the currently predicted user visual area and the weights of the target image blocks according to the optimal image block segmentation mode.

S24, determining the maximum video code rate that supports real-time decoding according to the currently available hardware resources of the mobile terminal and the optimal image block segmentation mode.

S25, determining the total downloadable video code rate according to the current bandwidth of the mobile terminal and the current buffer occupancy.

S26, substituting the weights of the target image blocks into the preset maximized experience quality model, and determining the target video clip and the code rate of the target video clip according to the preset maximized experience quality model, the maximum video code rate and the total video code rate.

Specifically, because each video frame is segmented into tiles, the number of decoding tasks grows with the number of tiles, which makes decoding harder for a mobile terminal with limited hardware resources. The maximum video code rate that can be decoded without stalling can therefore be calculated from the hardware resources currently available on the mobile terminal, and the total downloadable video code rate can be obtained from the current bandwidth and the current buffer occupancy. After the target video segment and its code rate are determined according to the preset maximized experience quality model, the code rate is adjusted against these two limits: if the determined code rate of a target video segment exceeds the maximum video code rate, it is reduced to be less than or equal to the maximum video code rate; and if the sum of the determined code rates of all target video segments exceeds the total video code rate, some or all of the code rates are reduced so that the sum is less than or equal to the total video code rate.

Meanwhile, because the preset maximized experience quality model is called frequently, has a large search space and is difficult to solve, constraint conditions can be set for the model. First, the code rate of the panoramic video is confined to a bounded search space according to the maximum video code rate. Second, the code rate selection takes the network throughput and the buffer state into account, that is, the code rate of each target video segment is limited according to the total video code rate except when the buffer has reached its maximum limit.

In addition, the constraint conditions may further include: the decoding time is less than the playback length of the video; the code rate selection takes the tile weights into account, so that within the same video segment a tile with a lower weight cannot have a higher code rate than a tile with a higher weight, and tiles of the same level select the same code rate; and when the throughput and the user behavior are relatively stable within an optimization window, all future video segments in that optimization window are given the same optimization result.
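The two limits and the weight-ordering constraint described above can be sketched as follows. This is a minimal illustration, not the procedure of the embodiment: the function name `clamp_rates`, the proportional scaling used when the total budget is exceeded, and the sample numbers are all assumptions, since the text only requires that some or all code rates be reduced until the limits hold.

```python
from itertools import groupby
from typing import List


def clamp_rates(rates: List[float], weights: List[float],
                max_rate: float, total_rate: float) -> List[float]:
    """Adjust candidate tile code rates so that the stated constraints hold.

    rates[k] and weights[k] describe tile k. Proportional scaling in
    step 2 is an assumed policy, not one taken from the source text.
    """
    # 1. No tile may exceed the maximum rate decodable in real time.
    capped = [min(r, max_rate) for r in rates]

    # 2. Keep the sum within the downloadable total code rate.
    total = sum(capped)
    if total > total_rate:
        scale = total_rate / total
        capped = [r * scale for r in capped]

    # 3. Tiles with equal weight share one rate, and a lower-weight tile
    #    never gets a higher rate than a higher-weight tile. Rates are
    #    only ever lowered here, so steps 1 and 2 remain satisfied.
    order = sorted(range(len(capped)), key=lambda k: -weights[k])
    ceiling = float("inf")
    for _, group in groupby(order, key=lambda k: weights[k]):
        members = list(group)
        level = min(ceiling, min(capped[k] for k in members))
        for k in members:
            capped[k] = level
        ceiling = level
    return capped


# Hypothetical example: four tiles, rates in Mbps.
example = clamp_rates([8.0, 6.0, 4.0, 9.0], [1.0, 1.0, 0.5, 0.2],
                      max_rate=6.0, total_rate=16.0)
```

A real player would normally snap each resulting value down to the nearest code rate actually offered by the server; that step is omitted here.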

Optionally, before determining the maximum video code rate supporting real-time decoding according to the currently available hardware resources of the mobile terminal and the optimal tile segmentation mode, the method further includes: decoding, in a parallel asynchronous decoding mode, the panoramic videos corresponding to multiple preset code rates and multiple preset tile segmentation modes, and determining the decoding time required by each panoramic video; and determining the influence relationship between the decoding time and each of the number of tiles, the video code rate and the number of decoding threads. Correspondingly, the determining the maximum video code rate supporting real-time decoding according to the currently available hardware resources of the mobile terminal and the optimal tile segmentation mode includes: determining the maximum number of decoding threads currently supported by the mobile terminal according to the currently available hardware resources; and determining the maximum video code rate according to the maximum number of decoding threads, the optimal tile segmentation mode and the influence relationship.

Specifically, panoramic videos with different preset code rates and different preset tile segmentation modes are decoded in a parallel asynchronous decoding mode, and the time required to decode each whole video is recorded, so as to determine the relationship between decoding time and the number of tiles, the relationship between decoding time and video code rate, and the relationship between decoding time and the decoding resources available on the current device (namely, the number of decoding threads). Because the influences of these three factors on the decoding time are almost independent of one another, $F_n(x)$, $F_r(x)$ and $F_c(x)$ can be defined, respectively, as the ratio between the decoding time when the corresponding factor takes the value $x$ and the baseline decoding time, and an analytic model $D = D_0 \times F_n(x_1) \times F_r(x_2) \times F_c(x_3)$ can be constructed to calculate the decoding time $D$ when the number of tiles is $x_1$, the video code rate is $x_2$ and the number of decoding threads is $x_3$. Here $D_0$ denotes the baseline decoding time, measured under the following configuration: a 2 × 2 tile segmentation mode, a video code rate of 480P and one decoding thread. The decoding time increases as the number of tiles or the video code rate increases, and decreases as the number of decoding threads increases.
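As an illustration of how the analytic model could be used to pick the maximum video code rate, the sketch below evaluates $D = D_0 \times F_n(x_1) \times F_r(x_2) \times F_c(x_3)$ for each level of a code rate ladder and keeps the highest level whose decoding time stays below the playback length. The ratio functions `f_n`, `f_r` and `f_c`, the ladder and every number are hypothetical placeholders; in the embodiment these ratios come from the recorded decoding times, not from closed-form expressions.

```python
import math
from typing import Callable, Optional, Sequence


def decode_time(d0: float, f_n: Callable[[int], float],
                f_r: Callable[[str], float], f_c: Callable[[int], float],
                tiles: int, rate: str, threads: int) -> float:
    """Analytic model D = D0 * F_n(x1) * F_r(x2) * F_c(x3)."""
    return d0 * f_n(tiles) * f_r(rate) * f_c(threads)


def max_realtime_rate(ladder: Sequence[str], d0: float,
                      f_n: Callable[[int], float],
                      f_r: Callable[[str], float],
                      f_c: Callable[[int], float],
                      tiles: int, threads: int,
                      playback_len: float) -> Optional[str]:
    """Return the highest ladder level decodable within playback_len.

    The ladder must be ordered from lowest to highest code rate.
    Returns None when even the lowest level is too slow to decode.
    """
    best = None
    for rate in ladder:
        if decode_time(d0, f_n, f_r, f_c, tiles, rate, threads) < playback_len:
            best = rate
    return best


# Hypothetical ratio functions, normalised to 1.0 at the baseline
# configuration (2 x 2 tiles, 480P, one decoding thread).
RATE_RATIO = {"480P": 1.0, "720P": 1.6, "1080P": 2.8}


def f_n(tiles: int) -> float:
    return math.sqrt(tiles / 4)


def f_r(rate: str) -> float:
    return RATE_RATIO[rate]


def f_c(threads: int) -> float:
    return 1.0 / threads


LADDER = ["480P", "720P", "1080P"]
choice = max_realtime_rate(LADDER, d0=0.4, f_n=f_n, f_r=f_r, f_c=f_c,
                           tiles=16, threads=2, playback_len=1.0)
```

With these placeholder values a 4 × 4 segmentation decoded on two threads supports the middle ladder level; reducing the thread count to one drops the result to the lowest level.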

According to the technical scheme provided by the embodiment of the invention, the available hardware resources, bandwidth and buffer state of the mobile terminal are taken into consideration when determining the target video segment to be downloaded and the code rate of the target video segment, so that the mobile terminal can finish decoding in a short time, interruptions of video playback are avoided, and stalling is further reduced.

EXAMPLE III

Fig. 3 is a schematic structural diagram of a device for allocating a bitrate for a panoramic video according to a third embodiment of the present invention, where the device may be implemented by hardware and/or software, and may be generally integrated in a mobile terminal. As shown in fig. 3, the apparatus includes:

the historical error acquisition module 31 is used for acquiring the visual area prediction error degrees of the historical user in real time;

the segmentation mode determining module 32 is used for determining the optimal tile segmentation mode according to the visual area prediction error degrees of the historical user;

the image block weight determining module 33 is configured to determine a target image block and a weight of the target image block in the currently predicted user view region according to the optimal image block splitting manner;

and the code rate determining module 34 is configured to substitute the weight of the target image block into the preset maximized quality of experience model, and determine the target video segment to be downloaded and the code rate of the target video segment according to the preset maximized quality of experience model.

On the basis of the foregoing technical solution, optionally, the apparatus for allocating a bitrate of a panoramic video further includes:

the maximum video code rate determining module is used for determining the maximum video code rate supporting real-time decoding according to the current available hardware resources of the mobile terminal and the optimal picture block segmentation mode before substituting the weight of the target picture block into the preset maximized experience quality model and determining the target video segment to be downloaded and the code rate of the target video segment according to the preset maximized experience quality model;

the total video code rate determining module is used for determining the downloadable total video code rate according to the current bandwidth of the mobile terminal and the occupation condition of the current buffer area;

correspondingly, the code rate determining module 34 is specifically configured to:

and determining the target video clip and the code rate of the target video clip according to the preset maximized experience quality model, the maximum video code rate and the total video code rate.

On the basis of the foregoing technical solution, optionally, the apparatus for allocating a bitrate for a panoramic video further includes:

the decoding time determining module is used for determining the decoding time required by each panoramic video in a parallel asynchronous decoding mode for the panoramic videos corresponding to various preset code rates and various preset picture block segmentation modes before determining the maximum video code rate supporting real-time decoding according to the current available hardware resources of the mobile terminal and the optimal picture block segmentation mode;

the influence relation determining module is used for determining the influence relation between the decoding time and the number of the image blocks, the video code rate and the number of the decoding threads;

correspondingly, the maximum video rate determining module comprises:

the maximum decoding thread number determining unit is used for determining the maximum decoding thread number currently supported by the mobile terminal according to the currently available hardware resources;

and the maximum video code rate determining unit is used for determining the maximum video code rate according to the maximum decoding thread number, the optimal pattern block segmentation mode and the influence relation.

On the basis of the foregoing technical solution, optionally, the apparatus for allocating a bitrate for a panoramic video further includes:

the evaluation index determining module is used for respectively determining, before determining the optimal tile segmentation mode according to the visual area prediction error degrees of the historical user, average evaluation indexes of a plurality of preset prediction error degrees under a plurality of preset tile segmentation modes, where the average evaluation index is used for balancing the precision and the recall of the tiles selected based on the corresponding preset tile segmentation mode;

the lookup table establishing module is used for establishing a lookup table according to the relationship among the average evaluation index, the preset tile segmentation mode and the maximum tolerable preset prediction error degree;

correspondingly, the segmentation mode determining module 32 is specifically configured to:

and determining the optimal pattern block segmentation mode corresponding to the preset evaluation index in the lookup table according to the visual area prediction error degrees of the historical user.
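A lookup of this kind might be organised as in the following sketch. Several points are assumptions made only for illustration: the evaluation index is taken to be the F1 score (the text says only that it balances precision and recall), the index is assumed to fall as the prediction error grows, and among the segmentation modes that tolerate the observed error the one with the most tiles is preferred. The names `build_lookup` and `select_mode` and all sample values are hypothetical.

```python
from typing import Dict, Tuple


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed form of the index)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def build_lookup(avg_index: Dict[Tuple[str, int], float],
                 threshold: float) -> Dict[str, int]:
    """Map each segmentation mode to the largest preset error degree
    whose average evaluation index still meets the preset threshold."""
    table: Dict[str, int] = {}
    for (mode, err_deg), index in avg_index.items():
        if index >= threshold:
            table[mode] = max(table.get(mode, 0), err_deg)
    return table


def select_mode(table: Dict[str, int], tile_counts: Dict[str, int],
                observed_err_deg: float) -> str:
    """Pick a segmentation mode for the observed prediction error.

    Assumes the table is non-empty. If no mode tolerates the observed
    error, fall back to the most tolerant mode.
    """
    tolerant = [m for m, tol in table.items() if tol >= observed_err_deg]
    if tolerant:
        return max(tolerant, key=lambda m: tile_counts[m])
    return max(table, key=lambda m: table[m])


# Hypothetical offline measurements: (mode, error in degrees) -> index.
AVG_INDEX = {
    ("2x2", 10): 0.70, ("2x2", 20): 0.68, ("2x2", 30): 0.66,
    ("4x4", 10): 0.85, ("4x4", 20): 0.72, ("4x4", 30): 0.55,
    ("6x6", 10): 0.90, ("6x6", 20): 0.60, ("6x6", 30): 0.40,
}
TILE_COUNTS = {"2x2": 4, "4x4": 16, "6x6": 36}
TABLE = build_lookup(AVG_INDEX, threshold=0.65)
```

Because the table is built once from offline measurements, the online step reduces to a dictionary scan, which suits a component that runs repeatedly during playback.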

On the basis of the foregoing technical solution, optionally, the apparatus for allocating a bitrate for a panoramic video further includes:

the visual area expanding module is used for expanding the current prediction user visual area according to a preset step length after determining the target image block in the current prediction user visual area and the weight of the target image block according to the optimal tile segmentation mode, and determining the weight of a new target image block added after the expansion, where the weight of the new target image block is lower than that of the target image block;

correspondingly, the code rate determining module 34 is specifically configured to:

substituting the weight of the target image block and the weight of the new target image block into the preset maximized experience quality model, and determining the target video segment and the code rate of the target video segment according to the preset maximized experience quality model.
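The expansion step can be pictured with the sketch below, which treats the panoramic frame as a 360° × 180° rectangle divided into a uniform grid and the predicted visual area as an axis-aligned rectangle in degrees. The geometry, the number of expansion rings, the halving of the weight per ring and the function names are all assumptions for illustration; wrap-around at the left and right frame edges is ignored for brevity.

```python
from typing import Dict, Set, Tuple

Tile = Tuple[int, int]                       # (row, column)
Region = Tuple[float, float, float, float]   # (left, top, right, bottom)


def tiles_in_region(rows: int, cols: int, region: Region) -> Set[Tile]:
    """Tiles of a rows x cols grid that overlap the region (in degrees)."""
    left, top, right, bottom = region
    left, right = max(0.0, left), min(360.0, right)
    top, bottom = max(0.0, top), min(180.0, bottom)
    tile_w, tile_h = 360.0 / cols, 180.0 / rows
    found = set()
    for r in range(rows):
        for c in range(cols):
            overlaps_x = c * tile_w < right and (c + 1) * tile_w > left
            overlaps_y = r * tile_h < bottom and (r + 1) * tile_h > top
            if overlaps_x and overlaps_y:
                found.add((r, c))
    return found


def weighted_tiles(rows: int, cols: int, viewport: Region, step: float,
                   rings: int = 1, base_weight: float = 1.0,
                   decay: float = 0.5) -> Dict[Tile, float]:
    """Weight the tiles of the predicted viewport, then expand it by
    `step` degrees per ring and give newly covered tiles a lower weight."""
    weights = {t: base_weight for t in tiles_in_region(rows, cols, viewport)}
    left, top, right, bottom = viewport
    weight = base_weight
    for k in range(1, rings + 1):
        weight *= decay
        grown = (left - k * step, top - k * step,
                 right + k * step, bottom + k * step)
        for tile in tiles_in_region(rows, cols, grown):
            # Tiles already weighted keep their higher weight.
            weights.setdefault(tile, weight)
    return weights


# Hypothetical example: 4 x 8 grid (45-degree tiles), one 20-degree ring.
result = weighted_tiles(4, 8, viewport=(100.0, 50.0, 170.0, 130.0), step=20.0)
```

The lower-weight ring acts as a safety margin: if the prediction is off by less than the step, the tiles the user actually sees have still been requested, only at a lower code rate.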

On the basis of the foregoing technical solution, optionally, the apparatus for allocating a bitrate for a panoramic video further includes:

the parameter determination module is used for determining the quality level of each video segment in the panoramic video, the quality fluctuation between every two consecutive video segments and the pause time required for downloading each video segment before substituting the weight of the target image block into the preset maximized experience quality model and determining the target video segment to be downloaded and the code rate of the target video segment according to the preset maximized experience quality model;

and the model establishing module is used for establishing the preset maximized experience quality model according to the quality level, the quality fluctuation and the pause time.

On the basis of the above technical solution, optionally, the parameter determining module includes:

the quality level determining unit is used for determining the quality level of each video clip in the panoramic video and comprises the following steps:

wherein,

indicates the quality level, $w_{i,j}$ represents the weight of the jth tile in the ith video segment of the panoramic video, $q(b_{i,j})$ represents a non-decreasing function mapping the code rate of the jth tile in the ith video segment to the quality of experience of the user, $b_{i,j}$ represents the code rate of the jth tile in the ith video segment, and $n$ represents the total number of tiles in the ith video segment;

a quality fluctuation determination unit for determining a quality fluctuation between every two consecutive video segments in the panoramic video, comprising:

wherein,

indicating a quality fluctuation;

the pause time determining unit is used for determining pause time required by downloading each video clip in the panoramic video, and comprises the following steps:

wherein,

denotes the pause time, $C_i$ represents the predicted bandwidth throughput when downloading the ith video segment, $B_i$ represents the buffer size at the start of downloading the ith video segment, and $t_{miss}$ represents the time taken to re-download tiles that were not downloaded previously;

the model building module is specifically configured to:

wherein $Q_i$ represents the quality of experience, and $p$, $q$ and $r$ represent weighting coefficients.
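The formulas themselves are not reproduced in this text, so the sketch below shows only one plausible way of combining the three quantities, built from the symbol definitions above. It assumes that the quality level is the weighted sum of the mapped tile code rates, that the quality fluctuation is the absolute difference between consecutive quality levels, that the pause time is the download time in excess of the buffered playback time plus $t_{miss}$, and that the quality of experience rewards quality and penalises fluctuation and pauses. None of these forms is confirmed by the source, and all names are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def quality_level(weights: Sequence[float], rates: Sequence[float],
                  q_map: Callable[[float], float]) -> float:
    """Assumed form: sum over tiles of w_ij * q(b_ij)."""
    return sum(w * q_map(b) for w, b in zip(weights, rates))


def pause_time(segment_bits: float, throughput: float,
               buffer_s: float, t_miss: float) -> float:
    """Assumed form: download time beyond the buffered time, plus t_miss."""
    return max(segment_bits / throughput - buffer_s, 0.0) + t_miss


def segment_qoe(weights: Sequence[float], rates: Sequence[float],
                prev_quality: float, segment_len_s: float,
                throughput: float, buffer_s: float, t_miss: float,
                p: float, q: float, r: float,
                q_map: Callable[[float], float]) -> Tuple[float, float]:
    """Return (Q_i, quality level) under the assumed linear combination
    Q_i = p * quality - q * fluctuation - r * pause."""
    quality = quality_level(weights, rates, q_map)
    fluctuation = abs(quality - prev_quality)
    bits = sum(rates) * segment_len_s
    stall = pause_time(bits, throughput, buffer_s, t_miss)
    return p * quality - q * fluctuation - r * stall, quality


# Hypothetical example: two tiles, rates and throughput in Mbps,
# identity mapping in place of q(b).
qoe, level = segment_qoe(weights=[1.0, 0.5], rates=[4.0, 2.0],
                         prev_quality=4.0, segment_len_s=1.0,
                         throughput=3.0, buffer_s=1.5, t_miss=0.0,
                         p=1.0, q=1.0, r=4.0, q_map=lambda b: b)
```

An optimiser would evaluate a function like this for each candidate code rate assignment that satisfies the constraints and keep the assignment with the highest value.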

The code rate allocation device for the panoramic video, provided by the embodiment of the invention, can execute the code rate allocation method for the panoramic video, provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.

It should be noted that, in the embodiment of the apparatus for allocating a bitrate to a panoramic video, each unit and each module included in the apparatus are only divided according to functional logic, but are not limited to the above division as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

EXAMPLE IV

Fig. 4 is a schematic structural diagram of a mobile terminal according to a fourth embodiment of the present invention, which shows a block diagram of an exemplary mobile terminal suitable for implementing the embodiment of the present invention. The mobile terminal shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention. As shown in fig. 4, the mobile terminal includes a processor 41, a memory 42, an input device 43, and an output device 44; the number of the processors 41 in the mobile terminal may be one or more, one processor 41 is taken as an example in fig. 4, the processor 41, the memory 42, the input device 43 and the output device 44 in the mobile terminal may be connected by a bus or in other manners, and the connection by the bus is taken as an example in fig. 4.

The memory 42 is a computer-readable storage medium, and can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the code rate allocation method for panoramic video in the embodiment of the present invention (for example, the historical error acquisition module 31, the segmentation mode determining module 32, the image block weight determining module 33, and the code rate determining module 34 in the code rate allocation device for panoramic video). The processor 41 executes the various functional applications and data processing of the mobile terminal by running the software programs, instructions and modules stored in the memory 42, that is, it implements the above-described code rate allocation method for panoramic video.

The memory 42 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the mobile terminal, and the like. Further, the memory 42 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 42 may further include memory located remotely from processor 41, which may be connected to the mobile terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 43 may be used to acquire the media description file and the video clip to be downloaded from the server side, and to generate key signal inputs related to user settings and function control of the mobile terminal, etc. The output device 44 may be used to play panoramic video or the like to the user.

EXAMPLE V

An embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a method for allocating a bitrate for a panoramic video, the method including:

acquiring the visual area prediction error degrees of the historical user in real time;

determining the optimal tile segmentation mode according to the visual area prediction error degrees of the historical user;

determining a target image block in the current prediction user visual area and the weight of the target image block according to the optimal tile segmentation mode;

and substituting the weight of the target image block into a preset maximized experience quality model, and determining a target video segment to be downloaded and the code rate of the target video segment according to the preset maximized experience quality model.

The storage medium may be any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory, such as flash memory, magnetic media (e.g., a hard disk), or optical storage; registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in the computer system in which the program is executed, or may be located in a different second computer system connected to the computer system through a network (such as the internet). The second computer system may provide the program instructions to the computer for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems that are connected by a network. The storage medium may store program instructions (e.g., embodied as a computer program) that are executable by one or more processors.

Of course, in the storage medium containing computer-executable instructions provided by the embodiments of the present invention, the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the code rate allocation method for panoramic video provided by any embodiment of the present invention.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A code rate allocation method for panoramic video is characterized by comprising the following steps:

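acquiring the visual area prediction error degrees of a historical user in real time;

determining an optimal tile segmentation mode according to the visual area prediction error degrees of the historical user;

determining a target image block in a current prediction user visual area and the weight of the target image block according to the optimal tile segmentation mode;

substituting the weight of the target image block into a preset maximized experience quality model, and determining a target video segment to be downloaded and the code rate of the target video segment according to the preset maximized experience quality model;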

before substituting the weight of the target image block into a preset maximized experience quality model and determining a target video segment to be downloaded and the code rate of the target video segment according to the preset maximized experience quality model, the method further comprises the following steps:

determining the quality level of each video clip in the panoramic video, the quality fluctuation between every two continuous video clips and the pause time required for downloading each video clip;

establishing the preset maximized experience quality model according to the quality grade, the quality fluctuation and the pause time;

the determining the quality level of each video clip in the panoramic video comprises the following steps:

wherein,

represents the quality level, $w_{i,j}$ represents the weight of the jth tile in the ith video segment of the panoramic video, $q(b_{i,j})$ represents a non-decreasing function mapping the code rate of the jth tile in the ith video segment to the quality of experience of the user, $b_{i,j}$ represents the code rate of the jth tile in the ith video segment, and $n$ represents the total number of tiles in the ith video segment;

the determining quality fluctuation between every two consecutive video segments in the panoramic video comprises:

wherein,

representing the quality fluctuation;

the determining the required pause time for downloading each video clip in the panoramic video comprises:

wherein,

represents the pause time, $C_i$ represents the predicted bandwidth throughput when downloading the ith video segment, $B_i$ represents the buffer size at the start of downloading the ith video segment, and $t_{miss}$ represents the time taken to re-download tiles that were not downloaded previously;

the establishing the preset maximized quality of experience model according to the quality level, the quality fluctuation and the pause time comprises:

2. The method for allocating bitrate of a panoramic video according to claim 1, wherein before the substituting the weight of the target tile into a preset maximized quality of experience model and determining a target video clip to be downloaded and a bitrate of the target video clip according to the preset maximized quality of experience model, the method further comprises:

determining the maximum video code rate supporting real-time decoding according to the current available hardware resources of the mobile terminal and the optimal picture block segmentation mode;

determining a downloadable total video code rate according to the current bandwidth of the mobile terminal and the current buffer area occupation condition;

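correspondingly, the substituting the weight of the target image block into a preset maximized quality of experience model, and determining the target video segment to be downloaded and the code rate of the target video segment according to the preset maximized quality of experience model includes:

determining the target video segment and the code rate of the target video segment according to the preset maximized quality of experience model, the maximum video code rate and the total video code rate.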

3. The method for allocating bitrate of panoramic video according to claim 2, wherein before determining the maximum video bitrate supporting real-time decoding according to the currently available hardware resources of the mobile terminal and the optimal tile slicing manner, the method further comprises:

respectively determining the decoding time required by each panoramic video by adopting a parallel asynchronous decoding mode for the panoramic videos corresponding to various preset code rates and various preset picture block segmentation modes;

determining the influence relationship between the decoding time and the number of the image blocks, the video code rate and the number of decoding threads;

correspondingly, the determining the maximum video bitrate supporting real-time decoding according to the currently available hardware resources of the mobile terminal and the optimal tile segmentation mode includes:

determining the maximum decoding thread number currently supported by the mobile terminal according to the currently available hardware resources;

and determining the maximum video code rate according to the maximum decoding thread number, the optimal pattern block segmentation mode and the influence relation.

4. The method for allocating bitrate of panoramic video according to claim 1, wherein before determining the optimal tile slicing manner according to the degree of view prediction error of the historical user, the method further comprises:

respectively determining average evaluation indexes of a plurality of preset prediction error degrees under a plurality of preset image block segmentation modes, wherein the average evaluation indexes are used for coordinating the relationship between the accuracy and the recall rate of the image blocks selected based on the corresponding preset image block segmentation modes;

establishing a lookup table according to the relation between the average evaluation index and the preset pattern block segmentation mode and the tolerable maximum preset prediction error degree;

correspondingly, the determining the optimal tile segmentation mode according to the visual area prediction error degrees of the historical user comprises the following steps:

and determining the optimal pattern block segmentation mode corresponding to a preset judgment index in the lookup table according to the visual area prediction error degrees of the historical user.

5. The method for allocating bitrate of panoramic video according to claim 1, wherein after the determining target tiles in the current predicted user view region and the weights of the target tiles according to the optimal tile slicing manner, the method further comprises:

expanding the current prediction user visual area according to a preset step length, and determining the weight of a new target image block added after expansion, wherein the weight of the new target image block is lower than that of the target image block;

correspondingly, the substituting the weight of the target image block into a preset maximized quality of experience model, and determining the target video clip to be downloaded and the code rate of the target video clip according to the preset maximized quality of experience model includes:

and substituting the weight of the target image block and the weight of the new target image block into the preset maximized experience quality model, and determining the target video clip and the code rate of the target video clip according to the preset maximized experience quality model.

6. An apparatus for allocating bitrate of a panoramic video, comprising:

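the historical error acquisition module is used for acquiring the visual area prediction error degrees of a historical user in real time;

the segmentation mode determining module is used for determining an optimal tile segmentation mode according to the visual area prediction error degrees of the historical user;

the image block weight determining module is used for determining a target image block in a current prediction user visual area and the weight of the target image block according to the optimal tile segmentation mode;

the code rate determining module is used for substituting the weight of the target image block into a preset maximized experience quality model and determining a target video segment to be downloaded and the code rate of the target video segment according to the preset maximized experience quality model;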

the device further comprises:

the parameter determination module is used for determining the quality grade of each video clip in the panoramic video, the quality fluctuation between every two continuous video clips and the pause time required for downloading each video clip before substituting the weight of the target image block into a preset maximized experience quality model and determining the target video clip to be downloaded and the code rate of the target video clip according to the preset maximized experience quality model;

a model establishing module for establishing the preset maximized experience quality model according to the quality grade, the quality fluctuation and the pause time;

the parameter determination module comprises:

wherein,

represents the quality level, $w_{i,j}$ represents the weight of the jth tile in the ith video segment of the panoramic video, $q(b_{i,j})$ represents a non-decreasing function mapping the code rate of the jth tile in the ith video segment to the quality of experience of the user, $b_{i,j}$ represents the code rate of the jth tile in the ith video segment, and $n$ represents the total number of tiles in the ith video segment;

wherein,

representing the quality fluctuation;

wherein,

the model building module is specifically configured to:

7. A mobile terminal, comprising:

one or more processors;

a memory for storing one or more programs;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the code rate allocation method for panoramic video of any of claims 1-5.

8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements a rate allocation method for panoramic video according to any one of claims 1 to 5.