WO2018010653A1

WO2018010653A1 - Panoramic media file push method and device

Info

Publication number: WO2018010653A1
Application number: PCT/CN2017/092562
Authority: WO
Inventors: 袁梓瑾
Original assignee: 腾讯科技（深圳）有限公司
Priority date: 2016-07-14
Filing date: 2017-07-12
Publication date: 2018-01-18
Also published as: CN106060515B; CN106060515A

Abstract

Disclosed in the present invention are a panoramic media file push method and device. The method comprises: obtaining a panoramic media file to be pushed, the panoramic media file comprising at least one panoramic image frame; dividing each panoramic image frame into multiple view regions according to a preset condition; obtaining a first central view region on the multiple view regions of each panoramic image frame, a region occupied by the first central view region being larger than or equal to a region occupied by one view region; coding the panoramic image frame according to the first central view region; and pushing the coded panoramic image frame. The present invention resolves the technical problem of low push accuracy caused when an existing panoramic media file push mode is used.

Description

Panoramic media file pushing method and device

The present application claims priority to Chinese Patent Application No. 201610557007.2, filed on Jul. 14, 2016, the entire disclosure of which is incorporated herein by reference.

Technical field

The present invention relates to the field of computers, and in particular to a method and device for pushing a panoramic media file.

Background art

Because panoramic media files can provide users with a more realistic and immersive viewing experience than the traditional limited field of view, they have gradually become one of the main types of content in the virtual reality field. However, compared with traditional media files, panoramic media files pose considerable engineering difficulties and challenges.

At present, a common technique for playing panoramic media files is to use a quadrangular pyramid. Specifically, the panoramic sphere is fitted into a quadrangular pyramid, the center of the viewer's field of view is aligned perpendicularly with the center of the base of the pyramid, and the image to be played is projected onto the surface of the pyramid by a projective geometric transformation. The image within the viewer's field of view is projected onto the base of the pyramid and keeps its high definition, while the rest of the field of view is projected onto the side faces of the pyramid and is rapidly compressed, which greatly reduces the bandwidth pressure when pushing panoramic media files.

However, since the panoramic media file needs to provide a 360-degree panoramic picture, if only a fixed predefined viewing angle of the picture content is transmitted, when the viewer's field of view moves, a part of the picture in the new field of view cannot be normally rendered due to compression. Therefore, the above manner of pushing the panoramic media file only for a predefined perspective will make the pushed panoramic media file inaccurate, thereby causing the problem that the panoramic media file is distorted during playback.

In response to the above problems, no effective solution has been proposed yet.

Summary of the invention

Embodiments of the present invention provide a method and a device for pushing a panoramic media file, to solve at least the technical problem of low push accuracy caused by existing ways of pushing panoramic media files.

According to an aspect of the present invention, a method for pushing a panoramic media file is provided, which is applied to a terminal having a display device and comprises: acquiring a panoramic media file to be pushed, wherein the panoramic media file includes at least one panoramic image frame; dividing the panoramic image frame into a plurality of view areas according to a predetermined condition; determining a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area; encoding the panoramic image frame according to the determined first central view area; and pushing the encoded panoramic image frame to the display device.

According to another aspect of the present invention, a panoramic media file pushing device is further provided, which is applied to a terminal having a display device and includes: a first acquiring unit, configured to acquire a panoramic media file to be pushed, wherein the panoramic media file includes at least one panoramic image frame; a dividing unit, configured to divide the panoramic image frame into a plurality of view areas according to a predetermined condition; a second acquiring unit, configured to determine a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area; an encoding unit, configured to encode the panoramic image frame according to the determined first central view area; and a pushing unit, configured to push the encoded panoramic image frame.

In the embodiments of the present invention, after at least one panoramic image frame in the panoramic media file to be pushed is acquired, the panoramic image frame is divided into a plurality of view areas according to a predetermined condition, a first central view area is determined on the plurality of view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. That is, the central view area is obtained by using the plurality of view areas divided on the panoramic image frame, so that the central view area is accurately located and the accuracy of the acquired picture of the central view area is ensured. This overcomes the technical problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the plurality of view areas to quickly obtain the central view area greatly improves acquisition efficiency, which in turn improves the push efficiency of the panoramic media file.

Brief description of the drawings

The drawings described herein are provided for a further understanding of the invention and constitute a part of this application. The illustrative embodiments of the present invention and their description are intended to explain the invention and do not limit it. In the drawings:

FIG. 1 is a schematic diagram of an application environment of an optional panoramic media file pushing method according to an embodiment of the present invention;

FIG. 2 is a flowchart of an optional panoramic media file pushing method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an optional panoramic media file pushing method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of another optional panoramic media file pushing method according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of still another optional panoramic media file pushing method according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of still another optional panoramic media file pushing method according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of an optional panoramic media file pushing device according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of an optional panoramic media file push terminal according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the scope of the present invention.

It is to be understood that the terms "first", "second" and the like in the specification and claims of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It is to be understood that the data so used may be interchanged where appropriate, so that the embodiments of the invention described herein can be implemented in sequences other than those illustrated or described herein. In addition, the terms "comprise" and "have", and any variations of them, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.

In the embodiments of the present invention, an embodiment of the above method for pushing a panoramic media file is provided. As an optional implementation, the method may be, but is not limited to being, applied to the application environment shown in FIG. 1. As shown in FIG. 1, the terminal 106 obtains a panoramic media file to be pushed from the server 102 through the network 104, where the panoramic media file includes at least one panoramic image frame. The panoramic image frame is divided into a plurality of view areas according to a predetermined condition, a first central view area is determined on the plurality of view areas of the panoramic image frame, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed.

As another optional implementation, the method may also be applied to another application environment, for example only to the terminal, with the division, encoding, and push operations on the panoramic image frames in the panoramic media file all performed in the terminal. For details, reference may be made to the foregoing embodiment; they are not described again here.

In this embodiment, after at least one panoramic image frame in the panoramic media file to be pushed is acquired, the panoramic image frame is divided into a plurality of view areas according to a predetermined condition, a first central view area is determined on the plurality of view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. That is, the first central view area is obtained by using the plurality of view areas divided on the panoramic image frame, so that the first central view area is accurately located and the accuracy of the acquired picture of the central view area is ensured. This overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the plurality of view areas to quickly obtain the central view area greatly improves acquisition efficiency, which in turn improves the push efficiency of the panoramic media file.

Optionally, in this embodiment, the foregoing terminal may include, but is not limited to, at least one of the following: a mobile phone, a tablet computer, a notebook computer, a desktop PC, smart glasses, and other hardware devices for playing panoramic media files. The above network may include, but is not limited to, at least one of the following: a wide area network, a metropolitan area network, and a local area network. The above is only an example, and the embodiment does not limit this.

According to an embodiment of the present invention, a method for pushing a panoramic media file is provided. As shown in FIG. 2, the method includes:

S202. Acquire a panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame;

S204. Divide the panoramic image frame into multiple view areas according to a predetermined condition;

S206. Determine a first central view area on the multiple view areas of the panoramic image frame, where the area occupied by the first central view area is greater than or equal to the area occupied by one view area;

S208. Encode the panoramic image frame according to the first central view area;

S210. Push the encoded panoramic image frame.

Optionally, in this embodiment, the method for pushing the panoramic media file may be, but is not limited to being, applied to a virtual reality (VR) process, where VR may be, but is not limited to, a technology that combines a computer graphics system with various interface devices, such as control devices, to provide an immersive sensation in a three-dimensional environment generated on a computer. If the above method is applied to VR glasses, the panoramic media file is divided into view areas and encoded, so that a more accurate and clearer panoramic media file can be provided quickly during playback. The panoramic media file may include, but is not limited to, at least one of the following: a panoramic image, a panoramic video, and the like. The above is only an example and is not limited in this embodiment.

It is to be noted that, after at least one panoramic image frame in the panoramic media file to be pushed is acquired, the panoramic image frame is divided into a plurality of view areas according to a predetermined condition, the first central view area is determined on the plurality of view areas, and the panoramic image frame is encoded according to the determined first central view area so that the encoded panoramic image frame can be pushed. That is, the first central view area is obtained by using the plurality of view areas divided on the panoramic image frame, so that the first central view area is accurately located and the accuracy of the acquired picture of the central view area is ensured. This overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the plurality of view areas to quickly obtain the central view area greatly improves acquisition efficiency, which in turn improves the push efficiency of the panoramic media file.

Optionally, in this embodiment, dividing each panoramic image frame according to the predetermined condition may include, but is not limited to: dividing the panoramic image frame, which is uniformly distributed on the panoramic sphere, into multiple rectangular view areas of the same size according to a pre-configured specification.

For example, take the center of the panoramic sphere as the origin. Since the distance from any point on the sphere to the center (origin) is the radius of the panoramic sphere, a polar coordinate system can be used to represent positions on the panoramic image frame. For example, (θ_x, θ_y) represents a position on the panoramic image frame, where θ_x is the angle measured counterclockwise in the horizontal direction, with zero directly ahead, and θ_y is the angle measured counterclockwise in the vertical direction, with zero directly above. Further, as shown in FIG. 3, one panoramic image frame is divided into a plurality of view areas, such as view area A to view area P. That is, with the center of the panoramic sphere as the origin, the 360 degrees of counterclockwise rotation in the horizontal direction are divided into 6 parts of 60 degrees each, and the 180 degrees of counterclockwise rotation in the vertical direction are divided into 3 parts of 60 degrees each. The angle range of each view area can be defined as:

Where x and y are the indices of the 18 view areas in the x and y directions, respectively, with x ∈ [1, 6] and y ∈ [1, 3].
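The angle-range formula referred to above does not appear in this text. A reconstruction consistent with the stated partition (6 horizontal parts and 3 vertical parts of 60 degrees each) would be the following; the half-open intervals are an assumption, chosen so that each position belongs to exactly one view area.

```latex
\theta_x \in \bigl[\,60^\circ (x-1),\; 60^\circ x\,\bigr), \qquad
\theta_y \in \bigl[\,60^\circ (y-1),\; 60^\circ y\,\bigr), \qquad
x \in \{1,\dots,6\},\; y \in \{1,2,3\}
```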

Further, in this embodiment, the first central view area acquired on the plurality of view areas may be, but is not limited to, the view area formed by the viewer's field of view. It should be noted that the area of the view areas into which the panoramic image frame is divided may be, but is not limited to being, determined according to the area of the first central view area; for example, the size of the area occupied by the first central view area may be greater than or equal to the size of the area occupied by each view area divided on the panoramic image frame. The above is only an example and is not limited in this embodiment.

Optionally, in this embodiment, acquiring the central view area on the plurality of view areas of each panoramic image frame may include, but is not limited to, at least one of the following:

1) determining a coordinate range of the first central view area according to the motion data detected by the sensor;

2) acquiring a play mode of the panoramic media file, and determining a coordinate range of a second central view area on the plurality of view areas after a predetermined time period according to the play mode of the panoramic media file and the coordinate range of the first central view area.

Optionally, in this embodiment, in manner 1) above, after the coordinate range of the first central view area is determined according to the motion data detected by the sensor, that coordinate range is used to quickly determine the target view area occupied by the first central view area among the plurality of view areas, where the target view area includes at least one view area that overlaps the coordinate range of the first central view area. The picture of the first central view area is then extracted from the target view area. That is, in this embodiment, the motion data detected by the sensor on the terminal can be used to quickly acquire the coordinates of the first central view area at the current time, and those coordinates are then used to determine the target view area occupied by the first central view area, so that the picture in the first central view area can be quickly extracted from the target view area and pushed, which improves the push efficiency of the panoramic media file. For example, as shown in FIG. 4, the target view area determined from the coordinates of the first central view area includes view area A, view area B, view area E, and view area F (the shaded part of FIG. 4). The picture of the first central view area is extracted from the target view area, and the target view area in which the first central view area is located can conveniently be encoded and pushed at a resolution higher than that of the other view areas.

Optionally, in this embodiment, in manner 2) above, the coordinates of the second central view area may be predicted; that is, the coordinates of the second central view area after the predetermined time period t are predicted according to the play mode of the panoramic media file and the coordinates of the first central view area. The picture that will be played after the predetermined time period t is thus pushed in advance on the basis of the prediction, which overcomes the picture delay during playback caused by network communication delay and improves push efficiency.

Optionally, in this embodiment, encoding the panoramic image frame according to the central view area may include, but is not limited to: providing different resolution levels for different view areas, for example making the resolution level of the view area in which the central view area is located higher than the resolution levels of the other view areas of the panoramic image frame. It should be noted that the encoding in this embodiment may be, but is not limited to being, performed separately for each view area to obtain the code streams to be pushed, so that different view areas are encoded at different resolution levels, which saves bandwidth and reduces transmission overhead.

Optionally, in this embodiment, the code stream of an encoded view area may be, but is not limited to being, sliced by time. For example, in response to each play request, the server side always pushes one time slice of the panoramic media file at a time. A view area may be sliced using, but not limited to, any of the classic Moving Picture Experts Group (MPEG) video segmentation techniques, and the streaming service is then provided according to an adaptive bitrate push strategy.

With the embodiment provided by the present application, the first central view area is obtained by using the plurality of view areas divided on the panoramic image frame, so that the first central view area is accurately located and the accuracy of the acquired picture in the first central view area is ensured. This overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the plurality of view areas to quickly acquire the first central view area greatly improves acquisition efficiency, which in turn improves the push efficiency of the panoramic media file.

As an alternative, determining the first central view area on the plurality of view areas of the panoramic image frame comprises:

S1. Determine a coordinate range of the first central view area according to the motion data detected by the sensor;

S2. Determine a target view area from the plurality of view areas by using the coordinate range of the first central view area, where the target view area includes at least one view area overlapping the coordinate range of the first central view area;

S3. Extract a picture corresponding to the first central view area from the target view area.

Optionally, in this embodiment, determining the target view area from the plurality of view areas by using the coordinate range of the first central view area comprises: obtaining the identifiers of the view areas that fall within the coordinate range of the first central view area; and stitching the view areas indicated by those identifiers to obtain the target view area.
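The lookup of the view areas that overlap a coordinate range can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the 6 x 3 grid of 60-degree view areas described earlier, identifies each view area by a hypothetical 1-based (x, y) index pair rather than by the letters of FIG. 4, and assumes that the horizontal angle wraps around at 360 degrees while the vertical angle is clamped to 0 to 180 degrees. The function name `target_view_areas` is invented for this sketch.

```python
import math

N_COLS, N_ROWS = 6, 3   # grid from the description: 6 x 3 view areas
STEP = 60.0             # each view area spans 60 degrees in both directions


def target_view_areas(x_min, x_max, y_min, y_max):
    """Return the 1-based (x, y) indices of all view areas that overlap the
    coordinate range [x_min, x_max] x [y_min, y_max], given in degrees."""
    eps = 1e-9  # keeps a range ending exactly on a boundary out of the next area

    # Horizontal direction: the sphere wraps around, so take indices modulo 6.
    first_col = math.floor(x_min / STEP)
    last_col = math.floor((x_max - eps) / STEP)
    cols = {(c % N_COLS) + 1 for c in range(first_col, last_col + 1)}

    # Vertical direction: no wrap-around (assumption), so clamp to [0, 180].
    y_lo = max(y_min, 0.0)
    y_hi = min(y_max, N_ROWS * STEP)
    first_row = math.floor(y_lo / STEP)
    last_row = math.floor((y_hi - eps) / STEP)
    rows = {r + 1 for r in range(first_row, last_row + 1)}

    return sorted((x, y) for x in cols for y in rows)
```

Under these assumptions, a range from 40 to 100 degrees horizontally and 30 to 90 degrees vertically overlaps four view areas, which mirrors the four-area case of FIG. 4, while a range that coincides exactly with one 60-degree cell returns only that cell.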

Optionally, in this embodiment, the motion data detected by the foregoing sensor may include, but is not limited to, at least one of the following: an angle of the head rotation and an eye rotation parameter. The foregoing is only an example, and the motion data may further include other motion data for detecting a field of view of the viewer, which is not limited in this embodiment.

Specifically, as illustrated by the following example, the coordinate range of the first central view area is determined according to the motion data detected by the sensor, and this coordinate range is used to determine the identifiers of the view areas that overlap the first central view area and make up the target view area. In the example shown in FIG. 4, the target view area includes view area A, view area B, view area E, and view area F (shaded in FIG. 4). As shown in FIG. 4, the first central view area lies mainly in view area F and covers parts of the adjacent view areas A, B, and E, so the picture content in the first central view area is obtained by stitching the pictures of the four view areas A, B, E, and F.

When the picture in the first central view area is played, the decoded pictures of the four view areas A, B, E, and F may be stitched according to the panoramic sphere image projection method to obtain the target view area, and the picture in the first central view area is then quickly extracted from the target view area according to the relative position of the first central view area.

It should be noted that, in this embodiment, when the size of the area occupied by the first central view area is equal to the size of the area occupied by each view area, the first central view area may fall within a target view area made up of multiple view areas, or may coincide exactly with one view area, in which case the picture in the first central view area is obtained directly.

With the embodiment provided by the present application, the target view area in which the first central view area is located is obtained from the plurality of view areas according to the coordinates of the first central view area determined from the sensor data, so that the picture in the first central view area can be quickly extracted from the target view area, pushed, and played, which improves the push efficiency of the panoramic media file.

As an alternative, encoding the panoramic image frame according to the central view area includes:

S1. Encode the target view area in which the first central view area is located at a first resolution, and encode the view areas of the panoramic image frame other than the target view area at a second resolution, wherein the first resolution is higher than the second resolution.
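The two-level resolution assignment in this step can be sketched as below. The sketch only decides which resolution each view area gets; the actual encoding is outside its scope. The function name `assign_resolutions`, the use of letter identifiers, and the representation of a resolution as a single comparable number are assumptions made for illustration.

```python
def assign_resolutions(all_areas, target_areas, first_resolution, second_resolution):
    """Map each view area identifier to the resolution at which it is encoded.

    View areas in the target view area get first_resolution; all other view
    areas get second_resolution, which must be lower.
    """
    if first_resolution <= second_resolution:
        raise ValueError("first resolution must be higher than second resolution")
    target = set(target_areas)
    return {
        area: first_resolution if area in target else second_resolution
        for area in all_areas
    }
```

For the FIG. 4 example, passing A, B, E, and F as the target view area would assign them the first resolution and every remaining view area the second.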

Optionally, in this embodiment, each divided view area may be, but is not limited to being, encoded at multiple scales to obtain code streams at multiple resolution levels. Among the plurality of view areas in the panoramic image frame, the resolution of the central view area (identified by its resolution level) may be, but is not limited to being, higher than the resolution of the other view areas. The picture of the central view area of interest can therefore be played clearly and faithfully while the pictures of the other view areas are played blurred, which reduces transmission overhead and saves bandwidth.

With the embodiments provided by the present application, different view areas in a panoramic image frame are encoded at different resolutions, so that the picture in the central view area is played clearly while the pictures of the other view areas are blurred, which saves bandwidth.

As an alternative, determining the central view area on the plurality of view areas of each panoramic image frame includes:

S1. Acquire the play mode of the panoramic media file;

S2. Determine a coordinate range of the second central view area on the plurality of view areas after the predetermined time period according to the play mode of the panoramic media file and the coordinate range of the first central view area.

Optionally, in this embodiment, the coordinate range of the second central view area on the plurality of view areas after the predetermined time period may be calculated according to the following formula:

Where (x₀, y₀) represents the coordinates of the first central view area; (x_t, y_t) represents the coordinates of the second central view area after the predetermined time period t; v_mod denotes the play mode; v_mod_x(t) denotes the offset angle in the x direction after the predetermined time period t in that play mode, the x direction being the horizontal direction; and v_mod_y(t) denotes the offset angle in the y direction after the predetermined time period t in that play mode, the y direction being the vertical direction.
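The formula itself does not appear in this text. Judging from the symbol definitions and from the special case of a zero offset angle, in which the predicted coordinates equal the current ones, it is presumably the additive form below; this is a reconstruction, not a quotation.

```latex
x_t = x_0 + v_{\mathrm{mod},x}(t), \qquad
y_t = y_0 + v_{\mathrm{mod},y}(t)
```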

Optionally, in this embodiment, the play mode may include, but is not limited to, at least one of the following: a first play mode for playing the picture in the first central view area, a second play mode for searching for a third central view area, and a third play mode for playing the picture in the third central view area. The above is only an example and is not limited in this embodiment.

It should be noted that, in this embodiment, the play mode may, but is not limited to, affect the offset angle after the predetermined time period t. For example, in the first play mode (also referred to as the viewing main mode), one viewing angle is maintained for a long time, that is, the view stays in the first central view area for a long time, so the offset angle after the predetermined time period t can be predicted to be 0 according to the play mode. The coordinates of the second central view area after the predetermined time period can then be predicted to be the same as the coordinates of the first central view area, that is, x_t = x₀, y_t = y₀. For the second play mode, the search motion may be a uniform motion, in which case the offset angle is the product of the moving speed v and the moving time t, or it may be a non-uniform motion, in which case the offset angle in this play mode is obtained by a corresponding calculation. This embodiment does not limit this.

Specifically, in combination with the above formula, assume that the coordinates (x₀, y₀) of the first central view area have been obtained and that the current play mode is v_mod. The offset angles from the current position in the x and y directions after the time period t, v_mod_x(t) and v_mod_y(t), are first obtained according to the play mode v_mod. The above formula is then used to predict the coordinates (x_t, y_t) of the second central view area after the predetermined time period t.
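The prediction step can be sketched as follows, under several stated assumptions: the coordinates and offsets are added term by term; the viewing main mode (written "ma" here) has zero offset; the search mode ("ms") is modeled as uniform motion with per-axis angular speeds `vx` and `vy` in degrees per second; the horizontal angle wraps at 360 degrees and the vertical angle is clamped to 0 to 180 degrees. The function names, the mode strings, and the speed parameters are hypothetical, and non-uniform motion is not covered.

```python
def offset_angles(mode, t, vx=0.0, vy=0.0):
    """Return the (x, y) offset angles in degrees after time t for a play mode."""
    if mode == "ma":            # viewing main mode: the view does not move
        return 0.0, 0.0
    if mode == "ms":            # search mode, modeled here as uniform motion
        return vx * t, vy * t
    # The text gives no offset rule for other modes, so none is guessed here.
    raise ValueError(f"no offset rule for play mode {mode!r}")


def predict_second_center(mode, x0, y0, t, vx=0.0, vy=0.0):
    """Predict the center coordinates (x_t, y_t) after the time period t."""
    dx, dy = offset_angles(mode, t, vx, vy)
    x_t = (x0 + dx) % 360.0                  # horizontal angle wraps (assumption)
    y_t = min(max(y0 + dy, 0.0), 180.0)      # vertical angle clamped (assumption)
    return x_t, y_t
```

In the viewing main mode the function returns the input coordinates unchanged, which matches the case x_t = x₀, y_t = y₀ described in the text.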

With the embodiment provided by the present application, the coordinates of the second central view area on the plurality of view areas after the predetermined time period are determined according to the play mode of the panoramic media file and the coordinates of the first central view area, so that the field of view of interest after the predetermined time period is accurately predicted. This ensures that the picture of the second central view area to be pushed is obtained in advance and avoids the playback delay caused by network transmission delay.

As an alternative, acquiring the play mode of the panoramic media file includes:

1) when the variation amplitude of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, determining that the play mode is the first play mode, wherein the first play mode is used to play the picture in the first central view area;

2) when the variation amplitude of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, determining that the play mode is the second play mode, wherein the second play mode is used to search for a third central view area;

3) when the variation amplitude of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode is the second play mode, determining that the play mode is the third play mode, wherein the third play mode is used to play the picture in the third central view area.

Specifically, in the following example, the first play mode is denoted the micro-swing viewing main mode (ma), the second play mode the new content search mode (ms), and the third play mode the new content focus mode (mf). The play modes may be as shown in FIG. 5 and may include:

1) Micro-swing viewing main mode (ma): in this mode the view stays on the picture played in the first central view area, and the hardware device used for viewing (such as a glasses terminal) is relatively stationary or swings slightly (that is, the swing amplitude within the predetermined period is less than the predetermined threshold) without actually leaving the first central view area.

2) New content search mode (ms): in this mode the view leaves the micro-swing viewing main mode and moves quickly to search for new content in a new field of view (such as the third central view area); the hardware device used for viewing (such as a glasses terminal) moves quickly and deviates from its original motion trajectory, that is, the swing amplitude within the predetermined period is greater than or equal to the predetermined threshold.

3) New content focus mode (mf): in this mode the view may stay in the third central view area for a short time and then re-enter the new content search mode, or may enter the micro-swing viewing main mode and remain in the third central view area. That is, the motion data detected by the sensor indicates that the swing amplitude within the predetermined period is less than the predetermined threshold, and the previous play mode is the second play mode.

In the present embodiment, the play mode is judged from the movement trajectory within a predetermined period (e.g., time window T). A trajectory that only swings back and forth over a short distance corresponds to the micro-swing viewing main mode (ma); a trajectory that moves faster corresponds to the new content search mode (ms); and a trajectory that is relatively stationary or slightly oscillating within the predetermined period (e.g., time window T) when the previous mode was the new content search mode corresponds to the new content focus mode (mf).
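
The three-way classification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `PlayMode` and `classify_play_mode`, the use of a single orientation angle per sample, and the peak-to-peak definition of the "change range" are all hypothetical choices made for the example.

```python
from enum import Enum
from typing import Optional, Sequence


class PlayMode(Enum):
    MA = "micro-swing viewing main mode"
    MS = "new content search mode"
    MF = "new content focus mode"


def classify_play_mode(samples: Sequence[float], threshold: float,
                       previous: Optional[PlayMode]) -> PlayMode:
    """Classify one time window T of sensor readings.

    `samples` is a non-empty sequence of hypothetical one-axis orientation
    readings (degrees) collected during the window. The change range is
    assumed here to be the peak-to-peak amplitude of those readings.
    """
    change_range = max(samples) - min(samples)
    if change_range >= threshold:
        # Fast movement: the viewer is searching for new content.
        return PlayMode.MS
    if previous is PlayMode.MS:
        # Small movement right after a search: focusing on new content.
        return PlayMode.MF
    # Small movement otherwise: staying in the current central view area.
    return PlayMode.MA
```

A window with a small change range is classified as mf only when the preceding window was ms; in every other case it is treated as ma, which is consistent with the description that mf may settle into the micro-swing viewing main mode.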

It should be noted that the foregoing third central view area may be, but is not limited to, the second central view area, or another view area among the plurality of view areas other than the first central view area and the second central view area.

Through the embodiment provided by the present application, the play mode of the panoramic media file is acquired and used to predict the offset angle of the field of view within the predetermined time period, so that the coordinates of the second central view area after the predetermined time period are determined according to the coordinates of the first central view area and the offset angle.

S10, repeating the following steps until traversing a plurality of view regions in the panoramic image frame after a predetermined period of time:

S12. Acquire multiple sub-view areas divided in the current view area from multiple view areas.

S14. Acquire reference values of the multiple sub-view areas, where the reference value of a sub-view area is the larger of the saliency level indicated by the salient feature of the sub-view area and the probability that the second central view area falls within the sub-view area;

S16. Determine a third resolution of the current view area according to a maximum value of the reference values of the multiple sub-view areas.

S18. Encode the current view area according to the third resolution.

Optionally, in this embodiment, acquiring the reference values of the multiple sub-view areas includes repeating the following steps until the multiple sub-view areas have been traversed: acquiring the current sub-view area from the multiple sub-view areas; acquiring the saliency level indicated by the salient feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area; and using the larger of the saliency level and the probability as the reference value of the current sub-view area.

Optionally, in this embodiment, each of the plurality of view regions may be, but is not limited to, divided into four sub-view regions of the same size. As shown in FIG. 6, the current view area includes a sub-view area a, a sub-view area b, a sub-view area c, and a sub-view area d.

Optionally, in this embodiment, obtaining the reference value of the current sub-view area may include, but is not limited to, taking the larger of the saliency level indicated by the salient feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area.

It should be noted that, in this embodiment, the above salient feature may be, but is not limited to being, used to indicate a visually salient region. For example, a region with a high probability of attention, such as the center of a stage, may be configured as a visually salient region with a high saliency level, while regions with a low probability of attention, such as dark areas, the auditorium, and the sky, may be configured as visually salient regions with a low saliency level. The saliency level may be, but is not limited to being, represented by Sa(t, θ_x, θ_y), where θ_x ∈ [0°, 360°) and θ_y ∈ [-90°, 90°], and may be, but is not limited to being, calculated a priori using a classical visual saliency detection algorithm. From Sa(t, θ_x, θ_y), the saliency level of each sub-view area can be aggregated; for example, RSa(t, sx, sy) represents the saliency level of the sub-view area (sx, sy) after the predetermined time period t. As an optional calculation method:

The x-direction index sx of a sub-view area ranges over sx ∈ [1, 12], and the y-direction index sy ranges over sy ∈ [1, 6].

It should be noted that, in this embodiment, taking the current view area (x, y) as an example, the probability that the second central view area falls within the sub-view area (sx, sy) is represented by Pi(t, sx, sy), and the reference values of the four sub-view areas included in the current view area can be expressed as follows:

The reference value of the subview area a is:

aPi(t,x,y)=max(RSa(t,2x-1,2y-1),Pi(t,2x-1,2y-1))

The reference value of sub-view area b is:

bPi(t,x,y)=max(RSa(t,2x,2y-1),Pi(t,2x,2y-1))

The reference value of the subview area c is:

cPi(t,x,y)=max(RSa(t,2x-1,2y),Pi(t,2x-1,2y))

The reference value of the sub-view area d is:

dPi(t,x,y)=max(RSa(t,2x,2y),Pi(t,2x,2y))

Further, the resolution of the current view area is determined according to the maximum value mPi(t, x, y) of the above four reference values, wherein:

mPi(t,x,y)=max(aPi(t,x,y),bPi(t,x,y),cPi(t,x,y),dPi(t,x,y)) (4)

That is to say, the resolution of the current view area is updated and adjusted according to the resolution of the sub-view area with the largest reference value to ensure high definition of the content of interest.
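
The four reference values and their maximum can be sketched as below. The function names and the dictionary representation of RSa and Pi (keyed by the sub-view index (sx, sy) for a fixed t) are assumptions made for illustration only; the index mapping from the view area (x, y) to its sub-view areas a, b, c, and d follows the formulas above.

```python
from typing import Dict, Mapping, Tuple

Index = Tuple[int, int]


def sub_view_reference_values(rsa: Mapping[Index, float],
                              pi: Mapping[Index, float],
                              x: int, y: int) -> Dict[str, float]:
    """Return the reference values aPi, bPi, cPi, dPi of view area (x, y).

    `rsa` and `pi` map a 1-based sub-view index (sx, sy) to the saliency
    level and the fall-in probability for one fixed time t (hypothetical
    data layout).
    """
    cells = {
        "a": (2 * x - 1, 2 * y - 1),
        "b": (2 * x, 2 * y - 1),
        "c": (2 * x - 1, 2 * y),
        "d": (2 * x, 2 * y),
    }
    # Each reference value is the larger of saliency level and probability.
    return {name: max(rsa[idx], pi[idx]) for name, idx in cells.items()}


def max_reference_value(rsa: Mapping[Index, float],
                        pi: Mapping[Index, float],
                        x: int, y: int) -> float:
    """Return mPi(t, x, y), the maximum of the four reference values."""
    return max(sub_view_reference_values(rsa, pi, x, y).values())
```

For example, with `rsa = {(1, 1): 0.2, (2, 1): 0.1, (1, 2): 0.7, (2, 2): 0.3}` and `pi = {(1, 1): 0.5, (2, 1): 0.05, (1, 2): 0.4, (2, 2): 0.9}`, view area (1, 1) has reference values 0.5, 0.1, 0.7, and 0.9 for a, b, c, and d, so its maximum is 0.9.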

In the embodiment provided by the present application, the maximum of the reference values of the plurality of sub-view areas included in a view area is obtained, and the resolution of the view area is determined according to that maximum, so that different view areas are encoded at different resolutions after the predetermined time period, which saves bandwidth. In addition, the sub-view area most likely to be pushed after the predetermined time period t is predicted from the saliency level indicated by the salient feature and the probability that the second central view area falls within the sub-view area, and the resolution of the other sub-view areas in the same view area is raised to that highest resolution, to ensure playback clarity of the content being watched.

As an optional solution, determining the third resolution of the current view area according to the maximum value of the reference values of the plurality of sub-view areas includes:

S1, the resolution level of the third resolution of the current view area is calculated by the following formula:

S(t,x,y)=1+(n-1)*mPi(t,x,y)*Qnet (5)

where (x, y) are the coordinates of the current view area; S(t, x, y) indicates the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t; mPi(t, x, y) indicates the maximum of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t; Qnet indicates the current network bandwidth level; and n indicates the number of resolution levels, where Qnet ∈ [0, 1] and S(t, x, y) ∈ {1, 2, ..., n};

S2, determining a third resolution according to a resolution level at which the third resolution is located.

It should be noted that Qnet indicates the current network bandwidth level. The higher the level, the more the method tends to push high-resolution content; the lower the level, the more it tends to push low-resolution content to ensure a smooth viewing experience. In addition, S(t, x, y) represents the resolution level of the third resolution. The higher the level, the higher the resolution of the pushed version, up to the highest-resolution version n; conversely, the lower the level, the lower the resolution, down to the lowest-resolution version 1.
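
Formula (5) can be evaluated as in the sketch below. The formula yields a real number between 1 and n, while S(t, x, y) is stated to take integer values; rounding to the nearest level is therefore an assumption of this sketch, as are the function name and the requirement that mPi lie in [0, 1].

```python
def resolution_level(m_pi: float, q_net: float, n: int) -> int:
    """Evaluate S = 1 + (n - 1) * mPi * Qnet and map it to {1, ..., n}."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not (0.0 <= m_pi <= 1.0 and 0.0 <= q_net <= 1.0):
        raise ValueError("m_pi and q_net are assumed to lie in [0, 1]")
    raw = 1 + (n - 1) * m_pi * q_net
    # Assumption: round half up to the nearest integer level, then clamp.
    return min(n, max(1, int(raw + 0.5)))
```

With n = 4, a view area whose maximum reference value is 1 on a fully available network (Qnet = 1) gets level 4, while any view area gets level 1 when Qnet = 0, regardless of its reference value.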

Through the embodiments provided by the present application, the pictures in the multiple view areas are encoded at different resolutions, so that the clearest picture is seen in the area occupied by the central view area while a relatively blurred picture is seen in the other view areas. Playing the panoramic image frame with such differentiated pictures reduces the transmission overhead, saves bandwidth, and improves push efficiency.

As an optional solution, acquiring the probability that the second central view area falls within the current sub-view area includes calculating the probability by the following formula:

P(t, sx, sy) = exp(-((sx - x_t)² + (sy - y_t)²)) (6)

where (sx, sy) represents the coordinates of the current sub-view area, P(t, sx, sy) indicates the probability that the second central view area falls within the current sub-view area after the predetermined time period t, and (x_t, y_t) indicates the coordinates of the second central view area after the predetermined time period t.

It should be noted that the above formula is a decaying exponential function with base e: the closer the current sub-view area is to the second central view area, the larger the function value and the larger the corresponding probability.
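
A direct transcription of formula (6) is shown below as a sketch; the function name `focus_probability` is hypothetical. The formula as written uses plain index differences, so this sketch does the same and does not treat the x direction as wrapping around the panorama.

```python
import math


def focus_probability(sx: float, sy: float, x_t: float, y_t: float) -> float:
    """Return exp(-((sx - x_t)^2 + (sy - y_t)^2)) as in formula (6)."""
    squared_distance = (sx - x_t) ** 2 + (sy - y_t) ** 2
    return math.exp(-squared_distance)
```

The value is 1 when the sub-view area coincides with the predicted second central view area and falls off quickly with distance: one index step away it is exp(-1), and two steps away it is exp(-4).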

It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should understand that the present invention is not limited by the described action sequence, because certain steps may be performed in other sequences or concurrently in accordance with the present invention. In addition, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.

Through the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by means of software plus a necessary general hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on such understanding, the part of the technical solution of the present invention that is essential or that contributes to the prior art may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and includes a plurality of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the invention.

According to an embodiment of the present invention, there is also provided a panoramic media file pushing apparatus for implementing the above-described panoramic media file pushing method, which is applied to a terminal having a display device. As shown in Figure 7, the device includes:

The first obtaining unit 702 is configured to obtain a panoramic media file to be pushed, where the panoramic media file includes at least one frame of the panoramic image frame;

a dividing unit 704, configured to divide the panoramic image frame into a plurality of view regions according to a predetermined condition;

The second obtaining unit 706 is configured to determine a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area;

The encoding unit 708 is configured to encode the panoramic image frame according to the determined first central view area;

The pushing unit 710 is configured to push the encoded panoramic image frame.

Optionally, in this embodiment, the panoramic media file pushing device may be, but is not limited to being, applied to virtual reality (VR) playback, where VR may refer to, but is not limited to, the comprehensive use of a computer graphics system together with various interface devices, such as display and control devices, to provide an immersive sensation in an interactive three-dimensional environment generated on a computer. If the above device is applied to VR glasses, the panoramic media file is divided into view areas and encoded, so that a more accurate and clear panoramic media file can be provided quickly while the panoramic media file is played. The above panoramic media file may include, but is not limited to, at least one of the following: a panoramic image, a panoramic video, and the like. The above is only an example, and is not limited in this embodiment.

It should be noted that, after at least one panoramic image frame in the panoramic media file to be pushed is acquired, the panoramic image frame is divided into multiple view areas according to a predetermined condition, the first central view area is obtained on the multiple view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. That is to say, the first central view area is obtained by means of the multiple view areas divided on the panoramic image frame, so that the central view area is located accurately and the accuracy of the acquired central view area is ensured, which overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the multiple view areas to quickly acquire the first central view area greatly improves acquisition efficiency and thereby improves the push efficiency of the panoramic media file.

Further, in the present embodiment, the first center view area acquired on the plurality of view areas may be, but is not limited to, a view area formed for a viewer's viewing field of view. It should be noted that the area of the plurality of view areas divided by the panoramic image frame may be, but is not limited to, determined according to the area of the first central view area, such as the size of the area occupied by each view area divided on the panoramic image frame. It may be, but not limited to, less than or equal to the size of the area occupied by the central view area, and greater than a quarter of the area occupied by the central view area. The above is only an example, and is not limited in this embodiment.

Optionally, in this embodiment, in the foregoing manner 1), after the coordinates of the first central view area are determined according to the motion data detected by the sensor, the coordinates are used to quickly acquire, from the plurality of view areas, the target view area in which the first central view area is located, wherein the target view area includes at least one view area that overlaps the coordinate range of the first central view area. Further, the picture of the first central view area is extracted from the target view area. That is, in this embodiment, the motion data detected by the sensor on the terminal can be used to quickly acquire the coordinates of the first central view area at the current time, and the coordinates are then used to determine the target view area occupied by the first central view area, so that the picture in the first central view area can be quickly extracted from the target view area and pushed, which improves the push efficiency of the panoramic media file. For example, as shown in FIG. 4, the target view area obtained according to the coordinates of the first central view area includes the A view area, the B view area, the E view area, and the F view area. Further, the picture of the first central view area is extracted from the target view area, and the target view area in which the first central view area is located can then conveniently be encoded and pushed at a resolution higher than that of the other view areas.

Through the embodiments provided by the present application, the first central view area is acquired by means of the multiple view areas divided on the panoramic image frame, so that the first central view area is located accurately and the accuracy of the acquired first central view area is ensured, which overcomes the problem of highly compressed, distorted pictures in the related art. Further, using the multiple view areas to quickly acquire the first central view area greatly improves acquisition efficiency and thereby improves the push efficiency of the panoramic media file.

As an optional solution, the second obtaining unit includes:

1) a first determining module, configured to determine a coordinate range of the first central view area according to the motion data detected by the sensor;

2) a first obtaining module, configured to obtain a target view area from the plurality of view areas by using a coordinate range of the first central view area, wherein the target view area includes at least one view area overlapping the coordinate range of the first central view area;

3) an extraction module, configured to extract a picture corresponding to the first central view area from the target view area.

Optionally, in this embodiment, the first obtaining module includes: (1) an acquiring sub-module, configured to acquire the identifiers of the view areas in which the coordinate range of the first central view area is located; and (2) a splicing sub-module, configured to obtain the target view area by splicing the view areas indicated by the identifiers.

Specifically, in the following example, the coordinates of the first central view area are determined according to the motion data detected by the sensor, and the coordinates are used to obtain the identifiers of the view areas making up the target view area in which the first central view area is located: the A view area, the B view area, the E view area, and the F view area (shown shaded in FIG. 4). As shown in FIG. 4, the first central view area lies mainly in the F view area and covers part of the adjacent A, B, and E view areas, so the picture content in the first central view area is obtained by stitching the pictures in the four areas A, B, E, and F.

It should be noted that, in this embodiment, when the size of the area occupied by the first central view area is equal to the size of the area occupied by each view area, the first central view area may be contained in a target view area formed by multiple view areas, or may coincide exactly with one of the view areas, in which case the picture in the first central view area can be acquired directly.
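
Finding the view areas that overlap the coordinate range of the first central view area can be sketched as follows. The equal-size rectangular grid, the top-left origin, the 0-based (column, row) identifiers, and the function name are all assumptions made for this example; the source does not specify the layout of FIG. 4, and wrap-around at the frame edge is not handled here.

```python
import math
from typing import List, Tuple


def overlapping_view_areas(x_min: float, x_max: float,
                           y_min: float, y_max: float,
                           cell_w: float, cell_h: float,
                           cols: int, rows: int) -> List[Tuple[int, int]]:
    """Return 0-based (col, row) indices of grid cells overlapping the range.

    Assumes an equal-size grid whose origin is the top-left corner of the
    frame and a coordinate range that does not wrap around the frame edge.
    """
    first_col = max(0, math.floor(x_min / cell_w))
    last_col = min(cols - 1, math.ceil(x_max / cell_w) - 1)
    first_row = max(0, math.floor(y_min / cell_h))
    last_row = min(rows - 1, math.ceil(y_max / cell_h) - 1)
    # Row-major order, so the cells can be stitched left to right, top down.
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]
```

On a hypothetical 4 x 2 grid of 90° x 90° cells, a central view range of 45° to 135° in both directions straddles four cells, analogous to the A/B/E/F case above, whereas a range of 90° to 180° by 0° to 90° coincides exactly with a single cell and needs no stitching.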

As an alternative, the coding unit includes:

1) a first encoding module, configured to encode the target view area in which the first central view area is located according to a first resolution, and to encode the view areas in the panoramic image frame other than the target view area according to a second resolution, wherein the first resolution is higher than the second resolution.
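
The two-resolution assignment performed by the first encoding module can be sketched as a simple lookup. The function name, the use of string identifiers for view areas, and the representation of a resolution as a single integer (for example, a vertical pixel count) are assumptions for illustration; the actual encoding step is outside the scope of this sketch.

```python
from typing import Dict, Hashable, Iterable, Set


def assign_resolutions(all_areas: Iterable[Hashable],
                       target_areas: Set[Hashable],
                       first_resolution: int,
                       second_resolution: int) -> Dict[Hashable, int]:
    """Map every view area to the resolution at which it would be encoded."""
    if first_resolution <= second_resolution:
        raise ValueError("the first resolution must be higher than the second")
    return {
        area: first_resolution if area in target_areas else second_resolution
        for area in all_areas
    }
```

For instance, with view areas labelled A to H and a target view area made up of A, B, E, and F, those four areas are assigned the first resolution and the remaining four the second.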

As an optional solution, the second obtaining unit includes:

1) a second acquiring module, configured to acquire a play mode of the panoramic media file;

2) The second determining module is configured to determine a coordinate range of the second central view area on the plurality of view areas after the predetermined time period according to the play mode of the panoramic media file and the coordinate range of the first central view area.

Optionally, in the second determining module in this embodiment, the coordinate range of the second central view area on the plurality of view areas after the predetermined time period may be calculated according to the following formula:

As an alternative, the second acquisition module includes:

1) a third determining submodule, configured to determine that the play mode is the first play mode when the change range of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, wherein the first play mode is used to play the picture in the first central view area;

2) a fourth determining submodule, configured to determine that the play mode is the second play mode when the change range of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, wherein the second play mode is used to search for the third central view area;

3) a fifth determining submodule, configured to determine that the play mode is the third play mode when the change range of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode is the second play mode, wherein the third play mode is used to play the picture in the third central view area.

Specifically, in the following example the first play mode is referred to as the micro-swing viewing main mode (ma), the second play mode as the new content search mode (ms), and the third play mode as the new content focus mode (mf). The play modes may be as shown in FIG. 5 and may include:

1) Micro-swing viewing main mode (ma): in this mode the viewer stays on the picture played in the first central view area. The hardware device used for viewing (such as a glasses terminal) is relatively stationary or swings slightly (that is, the amplitude of the swing within the predetermined period is less than the predetermined threshold) and does not actually leave the first central view area.

That is, in the present embodiment, the play mode is judged from the movement trajectory within a predetermined period (e.g., time window T). A trajectory that only swings back and forth over a short distance corresponds to the micro-swing viewing main mode (ma); a trajectory that moves faster corresponds to the new content search mode (ms); and a trajectory that is relatively stationary or slightly oscillating within the predetermined period (e.g., time window T) when the previous mode was the new content search mode corresponds to the new content focus mode (mf).

As an alternative, the coding unit includes:

1) A processing module for repeatedly performing the following steps until traversing a plurality of view regions in the panoramic image frame after a predetermined period of time:

S1. Obtain, from multiple view areas, multiple sub-view areas divided in the current view area;

S2. Acquire reference values of the multiple sub-view areas, where the reference value of a sub-view area is the larger of the saliency level indicated by the salient feature of the sub-view area and the probability that the second central view area falls within the sub-view area;

S3. Determine a third resolution of the current view area according to a maximum value of the reference values of the multiple sub-view areas.

S4, encoding the current view area according to the third resolution.

Optionally, in this embodiment, the processing module is configured to obtain the reference values of the plurality of sub-view areas by repeating the following steps until the plurality of sub-view areas have been traversed: acquiring the current sub-view area from the plurality of sub-view areas; acquiring the saliency level indicated by the salient feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area; and using the larger of the saliency level and the probability as the reference value of the current sub-view area.

The reference value of the subview area a is:

aPi(t,x,y)=max(RSa(t,2x-1,2y-1),Pi(t,2x-1,2y-1))

The reference value of sub-view area b is:

bPi(t,x,y)=max(RSa(t,2x,2y-1),Pi(t,2x,2y-1))

The reference value of the subview area c is:

cPi(t,x,y)=max(RSa(t,2x-1,2y),Pi(t,2x-1,2y))

mPi(t,x,y)=max(aPi(t,x,y),bPi(t,x,y),cPi(t,x,y),dPi(t,x,y)) (10)

As an optional solution, the processing module determines the third resolution of the current view area according to the maximum value of the reference values by the following formula:

S(t,x,y)=1+(n-1)*mPi(t,x,y)*Qnet (11)

where (x, y) are the coordinates of the current view area; S(t, x, y) indicates the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t; mPi(t, x, y) indicates the maximum of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t; Qnet indicates the current network bandwidth level; and n indicates the number of resolution levels, where Qnet ∈ [0, 1] and S(t, x, y) ∈ {1, 2, ..., n}.

As an optional solution, the processing module obtains the probability that the second central view area falls in the current sub-view area by the following steps:

P(t, sx, sy) = exp(-((sx - x_t)² + (sy - y_t)²)) (12)

According to an embodiment of the present invention, a panoramic media file pushing terminal for implementing the above-mentioned panoramic media file pushing method is further provided. As shown in FIG. 8, the terminal includes:

1) a communication interface 802, configured to obtain a panoramic media file to be pushed, wherein the panoramic media file includes at least one frame of the panoramic image frame; and is further configured to push the encoded panoramic image frame;

2) a processor 804, connected to the communication interface 802 and configured to divide the panoramic image frame into a plurality of view areas according to a predetermined condition; further configured to determine the first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area; and further configured to encode the panoramic image frame according to the first central view area;

3) A memory 806, coupled to the communication interface 802 and the processor 804, configured to store the panoramic media file and the determined first central view area.

Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments; details are not described herein again.

Embodiments of the present invention also provide a storage medium. Optionally, in this embodiment, the foregoing storage medium may be located in at least one of the plurality of network devices in the network.

Optionally, in the present embodiment, the storage medium is arranged to store program code for performing the following steps:

S1, the panoramic media file to be pushed is obtained, where the panoramic media file includes at least one frame of the panoramic image frame;

S2, dividing the panoramic image frame into a plurality of view areas according to a predetermined condition;

S3, determining a first central view area on a plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area;

S4, encoding the panoramic image frame according to the first central view area;

S5, pushing the encoded panoramic image frame.

Optionally, the storage medium is further arranged to store program code for performing the following steps:

S2. Acquire a target view area from the plurality of view areas by using a coordinate range of the first central view area, where the target view area includes at least one view area overlapping the coordinate range of the first central view area;

Optionally, the storage medium is further configured to store program code for performing the following step: encoding the target view area in which the first central view area is located according to a first resolution, and encoding the view areas in the panoramic image frame other than the target view area according to a second resolution, wherein the first resolution is higher than the second resolution.

S1, acquiring a play mode of the panoramic media file;

Optionally, the storage medium is further arranged to store program code for repeating the following steps until the plurality of view areas in the panoramic image frame after a predetermined time period have been traversed: determining, from the plurality of view areas, the plurality of sub-view areas into which the current view area is divided; obtaining reference values of the plurality of sub-view areas, wherein the reference value of a sub-view area is the larger of the saliency level indicated by the salient feature of the sub-view area and the probability that the second central view area falls within the sub-view area; determining a third resolution of the current view area based on the maximum of the reference values of the plurality of sub-view areas; and encoding the current view area according to the third resolution.

Optionally, in this embodiment, the foregoing storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a removable hard disk, a magnetic disk, an optical disk, and the like.

The serial numbers of the embodiments of the present invention are merely for the description, and do not represent the advantages and disadvantages of the embodiments.

The integrated unit in the above embodiment, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in the above-described computer readable storage medium. Based on such understanding, the part of the technical solution of the present invention that is essential or that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including a number of instructions that cause one or more computer devices (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention.

In the above-mentioned embodiments of the present invention, the description of each embodiment has its own emphasis; for parts that are not detailed in a certain embodiment, reference may be made to the related descriptions of other embodiments.

In the several embodiments provided by the present application, it should be understood that the disclosed client may be implemented in other manners. The device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and the actual implementation may use another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit, or module, and may be electrical or otherwise.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principles of the present invention, and such improvements and refinements should also be considered within the scope of protection of the present invention.

Claims

A panoramic media file pushing method, applied to a terminal having a display device, comprising:

obtaining a panoramic media file to be pushed, wherein the panoramic media file includes at least one panoramic image frame;

dividing the panoramic image frame into a plurality of view areas according to a predetermined condition;

determining a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one of the view areas;

encoding the panoramic image frame according to the determined first central view area; and

pushing the encoded panoramic image frame to the display device.
The method of claim 1, wherein the determining the first central view area on the plurality of view areas of the panoramic image frame comprises:

determining a coordinate range of the first central view area according to motion data detected by a sensor;

determining a target view area from the plurality of view areas by using the coordinate range of the first central view area, wherein the target view area includes at least one view area overlapping the coordinate range of the first central view area; and

extracting a picture corresponding to the first central view area from the target view area.
The method of claim 2, wherein the determining the target view area from the plurality of view areas by using the coordinate range of the first central view area comprises:

obtaining an identifier of a view area within the coordinate range of the first central view area; and

splicing the view area indicated by the identifier of the view area to obtain the target view area.
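By way of illustration only, the identification and splicing steps above can be sketched as follows. The sketch assumes that the panoramic image frame is divided into a regular grid of equal-size view areas identified by (row, column), that the coordinate range of the first central view area is an axis-aligned rectangle in pixel coordinates, and that horizontal wrap-around of the panorama is ignored. The names `Rect`, `overlapping_view_ids` and `target_view_area` are hypothetical and do not appear in the embodiments.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned coordinate range: [left, right) x [top, bottom)."""
    left: float
    top: float
    right: float
    bottom: float


def overlapping_view_ids(center: Rect, frame_w: int, frame_h: int,
                         cols: int, rows: int) -> List[Tuple[int, int]]:
    """Return the (row, col) identifiers of grid view areas overlapping `center`."""
    cell_w = frame_w / cols
    cell_h = frame_h / rows
    ids = []
    for r in range(rows):
        for c in range(cols):
            cell = Rect(c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)
            # Two half-open rectangles overlap when they intersect on both axes.
            if (cell.left < center.right and center.left < cell.right
                    and cell.top < center.bottom and center.top < cell.bottom):
                ids.append((r, c))
    return ids


def target_view_area(ids: List[Tuple[int, int]], frame_w: int, frame_h: int,
                     cols: int, rows: int) -> Rect:
    """Splice the identified view areas into one rectangle (their bounding box)."""
    if not ids:
        raise ValueError("no view area overlaps the coordinate range")
    cell_w = frame_w / cols
    cell_h = frame_h / rows
    row_ids = [r for r, _ in ids]
    col_ids = [c for _, c in ids]
    return Rect(min(col_ids) * cell_w, min(row_ids) * cell_h,
                (max(col_ids) + 1) * cell_w, (max(row_ids) + 1) * cell_h)
```

For a 1600 x 800 frame divided into 4 columns and 2 rows, a coordinate range of (300, 100) to (700, 400) overlaps the two view areas in the top-left of the grid, and the spliced target view area covers (0, 0) to (800, 400). The picture of the first central view area would then be cropped from that target view area.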
The method of claim 2, wherein the encoding the panoramic image frame according to the determined first central view area comprises:

encoding the target view area at a first resolution, and encoding the view areas of the panoramic image frame other than the target view area at a second resolution, wherein the first resolution is higher than the second resolution.
The method of claim 2, further comprising:

obtaining a play mode of the panoramic media file; and

determining a coordinate range of a second central view area on the plurality of view areas after a predetermined time period, according to the play mode of the panoramic media file and the coordinate range of the first central view area.
The method according to claim 5, wherein the determining the coordinate range of the second central view area on the plurality of view areas after the predetermined time period according to the play mode of the panoramic media file and the coordinate range of the first central view area comprises calculating the coordinate range of the second central view area according to the following formula:

where $(x_0, y_0)$ represents the coordinates of the first central view area, and $(x_t, y_t)$ represents the coordinates of the second central view area after the predetermined time period $t$; mod represents the play mode; $v_{\mathrm{mod}\,x}(t)$ indicates the offset angle in the x direction after the predetermined time period $t$ in the play mode, the x direction being the horizontal direction; and $v_{\mathrm{mod}\,y}(t)$ indicates the offset angle in the y direction after the predetermined time period $t$ in the play mode, the y direction being the vertical direction.
The method of claim 5, wherein the obtaining a play mode of the panoramic media file comprises:

when the change range of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, determining that the play mode is a first play mode, wherein the first play mode is used to play the picture in the first central view area;

when the change range of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, determining that the play mode is a second play mode, wherein the second play mode is used to search for a third central view area; and

when the change range of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode is the second play mode, determining that the play mode is a third play mode, wherein the third play mode is used to play the picture in the third central view area.
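By way of illustration only, the play mode determination above can be sketched as a small state function. The sketch assumes that the motion data sampled within the predetermined period is a non-empty sequence of scalar readings (for example, a yaw angle in degrees) and that the change range is the difference between the largest and smallest reading. Remaining in the third play mode while the motion stays below the threshold is a further assumption, since the claim only defines the transition from the second play mode. The names `PlayMode`, `change_range` and `determine_play_mode` are hypothetical.

```python
from enum import Enum
from typing import Optional, Sequence


class PlayMode(Enum):
    FIRST = 1   # play the picture in the first central view area
    SECOND = 2  # search for a third central view area
    THIRD = 3   # play the picture in the third central view area


def change_range(samples: Sequence[float]) -> float:
    """Change range of the motion data, taken here as max minus min (assumption)."""
    return max(samples) - min(samples)


def determine_play_mode(samples: Sequence[float], threshold: float,
                        previous: Optional[PlayMode]) -> PlayMode:
    """Pick the play mode for one predetermined period."""
    if change_range(samples) >= threshold:
        return PlayMode.SECOND
    # Below the threshold: THIRD after SECOND follows the claim;
    # staying in THIRD once reached is an assumption of this sketch.
    if previous in (PlayMode.SECOND, PlayMode.THIRD):
        return PlayMode.THIRD
    return PlayMode.FIRST
```

With a threshold of 2.0, a steady head yields the first play mode, a sweep from 10 to 40 yields the second play mode, and settling again after the sweep yields the third play mode.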
The method of claim 5, wherein the encoding the panoramic image frame according to the determined first central view area comprises:

repeating the following steps until the plurality of view areas in the panoramic image frame after the predetermined time period have been traversed:

obtaining a current view area from the plurality of view areas, and obtaining a plurality of sub-view areas into which the current view area is divided;

obtaining reference values of the plurality of sub-view areas, wherein the reference value of a sub-view area is the larger of the saliency level indicated by the saliency feature of the sub-view area and the probability that the second central view area falls within the sub-view area;

determining a third resolution of the current view area based on the maximum of the reference values of the plurality of sub-view areas; and

encoding the current view area at the third resolution.
The method of claim 8, wherein the obtaining the reference values of the plurality of sub-view areas comprises:

repeating the following steps until the plurality of sub-view areas have been traversed:

obtaining a current sub-view area from the plurality of sub-view areas;

obtaining the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area; and

using the larger of the saliency level and the probability as the reference value of the current sub-view area.
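By way of illustration only, the traversal of the sub-view areas above can be sketched as follows. The sketch assumes that the saliency levels have been normalized to [0, 1] so that they are comparable with the probabilities; the claims do not state how saliency is scaled. The names `reference_values` and `view_reference_max` are hypothetical.

```python
from typing import List, Sequence


def reference_values(saliency: Sequence[float],
                     probability: Sequence[float]) -> List[float]:
    """Reference value of each sub-view area: the larger of its saliency level
    and the probability that the second central view area falls within it."""
    if len(saliency) != len(probability):
        raise ValueError("one saliency level and one probability per sub-view area")
    return [max(s, p) for s, p in zip(saliency, probability)]


def view_reference_max(saliency: Sequence[float],
                       probability: Sequence[float]) -> float:
    """Maximum reference value over the sub-view areas of the current view area."""
    return max(reference_values(saliency, probability))
```

For three sub-view areas with saliency levels 0.2, 0.9 and 0.1 and probabilities 0.5, 0.3 and 0.05, the reference values are 0.5, 0.9 and 0.1, so the current view area is encoded on the basis of the value 0.9.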
The method according to claim 8 or 9, wherein the determining the third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas comprises:

calculating the resolution level of the third resolution of the current view area by the following formula:

S(t,x,y)=1+(n-1)*mPi(t,x,y)*Qnet,

where (x, y) is the coordinates of the current view area; S(t,x,y) indicates the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t; mPi(t,x,y) indicates the maximum of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t; Qnet indicates the current network bandwidth level; and n indicates the number of resolution levels, where Qnet∈[0,1] and S(t,x,y)∈{1,2,…,n}; and

determining the third resolution according to the resolution level of the third resolution.
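By way of illustration only, the resolution level formula above can be evaluated as follows. The formula generally yields a non-integer value, whereas the level is stated to lie in {1, 2, …, n}; rounding half up and clamping to that set are assumptions of this sketch, as is the function name `resolution_level`.

```python
import math


def resolution_level(max_reference: float, q_net: float, n: int) -> int:
    """Evaluate S = 1 + (n - 1) * mPi * Qnet and map it to a level in 1..n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= q_net <= 1.0:
        raise ValueError("Qnet must lie in [0, 1]")
    raw = 1 + (n - 1) * max_reference * q_net
    # Rounding half up and clamping are assumptions; the claim only states
    # that S takes a value in {1, 2, ..., n}.
    level = math.floor(raw + 0.5)
    return max(1, min(n, level))
```

With n = 4 levels, a view area whose maximum reference value is 0.9 is assigned level 4 when Qnet = 1 and level 2 when Qnet = 0.5 (the raw value being 2.35), which shows how a poorer network bandwidth level lowers the resolution of even a highly relevant view area.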
The method according to any one of claims 8 to 10, wherein the probability that the second central view area falls within the current sub-view area is calculated by the following formula:

$$P(t, sx, sy) = \exp\left(-\left((sx - x_t)^2 + (sy - y_t)^2\right)\right),$$

where $(sx, sy)$ represents the coordinates of the current sub-view area, $P(t, sx, sy)$ indicates the probability that the second central view area falls within the current sub-view area after the predetermined time period $t$, and $(x_t, y_t)$ represents the coordinates of the second central view area after the predetermined time period $t$.
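By way of illustration only, the probability formula above can be evaluated directly. The claims do not state the unit of the coordinates, so the scale on which `sx`, `sy`, `xt` and `yt` are expressed is left to the caller; the function name `fall_probability` is hypothetical.

```python
import math


def fall_probability(sx: float, sy: float, xt: float, yt: float) -> float:
    """P(t, sx, sy) = exp(-((sx - xt)^2 + (sy - yt)^2)).

    (sx, sy) are the coordinates of the current sub-view area and (xt, yt)
    those of the second central view area after the predetermined time period.
    """
    return math.exp(-((sx - xt) ** 2 + (sy - yt) ** 2))
```

The value is 1 when the sub-view area coincides with the predicted center and decays with the squared distance from it, so sub-view areas far from the predicted center contribute little unless their saliency level is high.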
A panoramic media file pushing device, applied to a terminal having a display device, comprising:

a first acquiring unit, configured to acquire a panoramic media file to be pushed, wherein the panoramic media file includes at least one panoramic image frame;

a dividing unit, configured to divide the panoramic image frame into a plurality of view areas according to a predetermined condition;

a second acquiring unit, configured to determine a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one of the view areas;

an encoding unit, configured to encode the panoramic image frame according to the determined first central view area; and

a pushing unit, configured to push the encoded panoramic image frame.
The apparatus of claim 12, wherein the second acquiring unit comprises:

a first determining module, configured to determine a coordinate range of the first central view area according to motion data detected by a sensor;

a first acquiring module, configured to acquire a target view area from the plurality of view areas by using the coordinate range of the first central view area, wherein the target view area includes at least one view area overlapping the coordinate range of the first central view area; and

an extracting module, configured to extract, from the target view area, a picture corresponding to the first central view area.
The apparatus of claim 13, wherein the first acquiring module comprises:

an acquiring submodule, configured to acquire an identifier of a view area within the coordinate range of the first central view area; and

a splicing submodule, configured to splice the view area indicated by the identifier of the view area to obtain the target view area.
The apparatus of claim 13, wherein the encoding unit comprises:

a first encoding module, configured to encode the target view area at a first resolution, and to encode the view areas of the panoramic image frame other than the target view area at a second resolution, wherein the first resolution is higher than the second resolution.
The apparatus of claim 13, wherein the second acquiring unit further comprises:

a second acquiring module, configured to acquire a play mode of the panoramic media file; and

a second determining module, configured to determine a coordinate range of a second central view area on the plurality of view areas after a predetermined time period, according to the play mode of the panoramic media file and the coordinate range of the first central view area.
The apparatus of claim 16, wherein the second determining module is configured to calculate the coordinate range of the second central view area according to the following formula:

where $(x_0, y_0)$ represents the coordinates of the first central view area, and $(x_t, y_t)$ represents the coordinates of the second central view area after the predetermined time period $t$; mod represents the play mode; $v_{\mathrm{mod}\,x}(t)$ indicates the offset angle in the x direction after the predetermined time period $t$ in the play mode, the x direction being the horizontal direction; and $v_{\mathrm{mod}\,y}(t)$ indicates the offset angle in the y direction after the predetermined time period $t$ in the play mode, the y direction being the vertical direction.
The apparatus of claim 16, wherein the second acquiring module comprises:

a third determining submodule, configured to determine that the play mode is a first play mode when the change range of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, wherein the first play mode is used to play the picture in the first central view area;

a fourth determining submodule, configured to determine that the play mode is a second play mode when the change range of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, wherein the second play mode is used to search for a third central view area; and

a fifth determining submodule, configured to determine that the play mode is a third play mode when the change range of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode is the second play mode, wherein the third play mode is used to play the picture in the third central view area.
The apparatus of claim 16, wherein the encoding unit comprises:

a processing module, configured to repeatedly perform the following steps until the plurality of view areas in the panoramic image frame after the predetermined time period have been traversed:

obtaining a current view area from the plurality of view areas, and obtaining a plurality of sub-view areas into which the current view area is divided;

obtaining reference values of the plurality of sub-view areas, wherein the reference value of a sub-view area is the larger of the saliency level indicated by the saliency feature of the sub-view area and the probability that the second central view area falls within the sub-view area;

determining a third resolution of the current view area based on the maximum of the reference values of the plurality of sub-view areas; and

encoding the current view area at the third resolution.
The apparatus according to claim 19, wherein the processing module obtains the reference values of the plurality of sub-view areas by:

repeating the following steps until the plurality of sub-view areas have been traversed:

obtaining a current sub-view area from the plurality of sub-view areas;

obtaining the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area; and

using the larger of the saliency level and the probability as the reference value of the current sub-view area.
The apparatus according to claim 19 or 20, wherein the processing module determines the third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas by:

calculating the resolution level of the third resolution of the current view area by the following formula:

S(t,x,y)=1+(n-1)*mPi(t,x,y)*Qnet,

where (x, y) is the coordinates of the current view area; S(t,x,y) indicates the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t; mPi(t,x,y) indicates the maximum of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t; Qnet indicates the current network bandwidth level; and n indicates the number of resolution levels, where Qnet∈[0,1] and S(t,x,y)∈{1,2,…,n}; and

determining the third resolution according to the resolution level of the third resolution.
The apparatus according to any one of claims 19 to 21, wherein the processing module calculates the probability that the second central view area falls within the current sub-view area by the following formula:

$$P(t, sx, sy) = \exp\left(-\left((sx - x_t)^2 + (sy - y_t)^2\right)\right),$$

where $(sx, sy)$ represents the coordinates of the current sub-view area, $P(t, sx, sy)$ indicates the probability that the second central view area falls within the current sub-view area after the predetermined time period $t$, and $(x_t, y_t)$ represents the coordinates of the second central view area after the predetermined time period $t$.