WO2018010653A1 - Panoramic media file push method and device - Google Patents

Panoramic media file push method and device

Publication number
WO2018010653A1
Authority
WO
WIPO (PCT)
Prior art keywords
view area
view
central
area
play mode
Prior art date
Application number
PCT/CN2017/092562
Other languages
French (fr)
Chinese (zh)
Inventor
袁梓瑾
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2018010653A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/111 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H04N13/117 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation, the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
    • H04N13/122 Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
    • H04N13/30 Image reproducers
    • H04N13/349 Multi-view displays for displaying three or more geometrical viewpoints without viewer tracking
    • H04N13/351 Multi-view displays for displaying three or more geometrical viewpoints without viewer tracking, for displaying simultaneously

Definitions

  • the present invention relates to the field of computers, and in particular to a method and device for pushing a panoramic media file.
  • Because a panoramic media file can provide users with a more realistic and immersive viewing experience than the traditional limited field of view, it has gradually become one of the main types of content in the virtual reality field.
  • Compared with traditional media files, however, panoramic media files pose considerable engineering difficulties and challenges.
  • a common technical means is to use a quadrangular pyramid.
  • The panoramic sphere is enclosed in a quadrangular pyramid so that the center of the viewer's field of view is aligned perpendicularly with the center of the base of the pyramid, and the image to be played is projected onto the surfaces of the pyramid by a projective geometric transformation. The part of the image in the viewer's field of view is projected onto the base and keeps its high definition, while the rest of the image is projected onto the side faces and is heavily compressed, which greatly reduces the bandwidth pressure when pushing panoramic media files.
  • However, a panoramic media file needs to provide a 360-degree panoramic picture. If only the picture content for a fixed, predefined viewing angle is transmitted, then when the viewer's field of view moves, part of the picture in the new field of view cannot be rendered normally because it has been compressed. Pushing the panoramic media file only for a predefined viewing angle therefore makes the pushed file inaccurate, so that the panoramic media file is distorted during playback.
  • the embodiment of the invention provides a method and a device for pushing a panoramic media file to solve at least the technical problem of low push accuracy caused by the push mode of the existing panoramic media file.
  • A method for pushing a panoramic media file is provided, which is applied to a terminal having a display device, comprising: acquiring a panoramic media file to be pushed, wherein the panoramic media file includes at least one panoramic image frame; dividing the panoramic image frame into a plurality of view areas according to a predetermined condition; determining a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area; encoding the panoramic image frame according to the determined first central view area; and pushing the encoded panoramic image frame to the display device.
  • A panoramic media file pushing apparatus is provided, which is applied to a terminal having a display device, and includes: a first acquiring unit, configured to acquire a panoramic media file to be pushed, wherein the panoramic media file includes at least one panoramic image frame; a dividing unit, configured to divide the panoramic image frame into a plurality of view areas according to a predetermined condition; a second acquiring unit, configured to determine a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area; a coding unit, configured to encode the panoramic image frame according to the determined first central view area; and a pushing unit, configured to push the encoded panoramic image frame.
  • The panoramic image frame is divided into a plurality of view areas according to a predetermined condition, a first central view area is acquired on the plurality of view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. That is, by using the plurality of view areas divided on the panoramic image frame to obtain the central view area, the central view area is positioned accurately and the accuracy of the acquired picture of the central view area is ensured, which overcomes the technical problem in the related art that only a highly compressed, distorted picture can be obtained.
  • Further, using the plurality of view areas to obtain the central view area quickly greatly improves acquisition efficiency and thereby improves the push efficiency of panoramic media files.
  • FIG. 1 is a schematic diagram of an application environment of an optional panoramic media file pushing method according to an embodiment of the present invention
  • FIG. 2 is a flowchart of an optional panoramic media file pushing method according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of an optional panoramic media file pushing method according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of another optional panoramic media file pushing method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of still another optional panoramic media file pushing method according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of still another optional panoramic media file pushing method according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of an optional panoramic media file pushing device according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of an optional panoramic media file push terminal according to an embodiment of the present invention.
  • the panoramic media file pushing method may be, but is not limited to, applied to an application environment as shown in FIG. 1.
  • The terminal 106 obtains a panoramic media file to be pushed from the server 102 through the network 104, wherein the panoramic media file includes at least one panoramic image frame. The terminal divides the panoramic image frame into a plurality of view areas according to a predetermined condition, determines a first central view area on the plurality of view areas of the panoramic image frame, and encodes the panoramic image frame according to the first central view area so as to push the encoded panoramic image frame.
  • The panoramic media file pushing method may also be applied to another application environment, for example applied only to the terminal, with the processing of the panoramic image frames in the panoramic media file implemented in the terminal.
  • The panoramic image frame is divided into a plurality of view areas according to a predetermined condition, the first central view area is determined on the plurality of view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. That is, by acquiring the first central view area from the plurality of view areas divided on the panoramic image frame, the first central view area is positioned accurately and the accuracy of the acquired picture of the central view area is ensured, which overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the plurality of view areas to obtain the central view area quickly greatly improves acquisition efficiency and thereby improves the push efficiency of the panoramic media file.
  • the foregoing terminal may include, but is not limited to, at least one of the following: a mobile phone, a tablet computer, a notebook computer, a desktop PC, smart glasses, and other hardware devices for playing panoramic media files.
  • the above network may include, but is not limited to, at least one of the following: a wide area network, a metropolitan area network, and a local area network. The above is only an example, and the embodiment does not limit this.
  • a method for pushing a panoramic media file includes:
  • the panoramic media file to be pushed is obtained, where the panoramic media file includes at least one frame of the panoramic image frame;
  • The method for pushing the panoramic media file may be, but is not limited to being, applied to a virtual reality (VR) process, where VR may refer to, but is not limited to, the combined use of a computer graphics system and various interface devices, such as control devices, to provide an immersive sensation in a three-dimensional environment generated on a computer.
  • By dividing the panoramic media file into view areas and encoding them, a more accurate and clearer panoramic media file can be provided quickly while the panoramic media file is being played.
  • the above panoramic media file may include, but is not limited to, at least one of the following: a panoramic image, a panoramic video, and the like. The above is only an example, and is not limited in this embodiment.
  • The panoramic image frame is divided into a plurality of view areas according to a predetermined condition, the first central view area is determined on the plurality of view areas, and the panoramic image frame is encoded according to the determined first central view area so that the encoded panoramic image frame can be pushed. That is, by acquiring the first central view area from the plurality of view areas divided on the panoramic image frame, the first central view area is positioned accurately and the accuracy of the acquired picture of the central view area is ensured, which overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the plurality of view areas to obtain the central view area quickly greatly improves acquisition efficiency and thereby improves the push efficiency of the panoramic media file.
  • Dividing each panoramic image frame according to the predetermined condition may include, but is not limited to, dividing the panoramic image frame, which is uniformly distributed on the panoramic sphere, into a plurality of rectangular view areas of the same size according to a pre-configured specification.
  • A polar coordinate system can be used to represent positions on the panoramic image frame.
  • $(\theta_x, \theta_y)$ is used to represent a position on a panoramic image frame, where $\theta_x$ is the horizontal angle, which is zero straight ahead and increases counterclockwise in the horizontal direction, and $\theta_y$ is the vertical angle, which is zero at the horizontal and increases counterclockwise in the vertical direction.
  • One panoramic image frame is divided into a plurality of view areas, such as an A view area to a P view area.
  • The angle range of each view area can be defined as:
  • where x and y are the numbers of the 18 view areas in the x and y directions, respectively, with $x \in [1, 6]$ and $y \in [1, 3]$.
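  • The angle-range formula itself is not preserved in this text. As a minimal sketch, assuming the 18 view areas form a uniform 6 × 3 grid over a horizontal range of [0°, 360°) and a vertical range of [−90°, 90°], each view area spans 60° × 60°. The function names, and the choice of y = 1 as the bottom row, are assumptions made for illustration and are not taken from the patent.

```python
from typing import Tuple

COLS, ROWS = 6, 3            # x in [1, 6], y in [1, 3], as stated in the text
STEP_X = 360.0 / COLS        # 60 degrees of horizontal angle per view area
STEP_Y = 180.0 / ROWS        # 60 degrees of vertical angle per view area


def view_area_range(x: int, y: int) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Return ((x_min, x_max), (y_min, y_max)) in degrees for view area (x, y)."""
    if not (1 <= x <= COLS and 1 <= y <= ROWS):
        raise ValueError("view area index out of range")
    x_min = (x - 1) * STEP_X
    # Assumption: y = 1 is the bottom row, starting at -90 degrees.
    y_min = -90.0 + (y - 1) * STEP_Y
    return (x_min, x_min + STEP_X), (y_min, y_min + STEP_Y)


def view_area_of(theta_x: float, theta_y: float) -> Tuple[int, int]:
    """Return the (x, y) index of the view area containing an angular position."""
    x = int((theta_x % 360.0) // STEP_X) + 1
    # Clamp so that theta_y = +90 falls in the top row rather than a fourth row.
    y = min(int((theta_y + 90.0) // STEP_Y) + 1, ROWS)
    return x, y
```

  • For example, under these assumptions `view_area_of(0.0, 0.0)` falls in the first column of the middle row.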
  • The first central view area acquired on the plurality of view areas may be, but is not limited to, the view area formed by the viewer's field of view.
  • The area of each of the view areas into which the panoramic image frame is divided may be, but is not limited to being, determined according to the area of the first central view area; for example, the area occupied by the first central view area may be greater than or equal to the area occupied by each view area divided on the panoramic image frame.
  • Acquiring the central view area on the plurality of view areas of each panoramic image frame may include, but is not limited to, at least one of the following:
  • The coordinates of the first central view area are used to quickly obtain the target view area in which the first central view area is located among the plurality of view areas, wherein the target view area includes at least one view area that overlaps the coordinate range of the first central view area.
  • The picture of the first central view area is then extracted from the target view area. That is, in this embodiment, the motion data detected by the sensor on the terminal can be used to quickly acquire the coordinates of the first central view area at the current time, and these coordinates are then used to determine the target view area occupied by the first central view area.
  • For example, the target view area in which the first central view area is located is determined from the coordinates of the first central view area to include the A, B, E and F view areas (shaded in FIG. 4). The picture of the first central view area is extracted from the target view area; in addition, the target view area in which the first central view area is located can conveniently be encoded and pushed at a resolution higher than that of the other view areas.
  • The coordinates of the second central view area may be predicted; that is, the coordinates of the second central view area after a predetermined time period t are predicted according to the play mode of the panoramic media file and the coordinates of the first central view area. The picture that will be played after the predetermined time period t is thus pushed in advance, which overcomes the picture delay during playback caused by network communication delay and improves push efficiency.
  • Encoding the panoramic image frame according to the central view area may include, but is not limited to, providing different resolution levels for different view areas; for example, the resolution level of the view area where the central view area is located is higher than the resolution level of the other view areas among the plurality of view areas of the panoramic image frame.
  • Encoding may be, but is not limited to being, performed separately for each view area to obtain the code stream to be pushed, so that different view areas are encoded at different resolution levels, which saves bandwidth and reduces transmission overhead.
  • the code stream of the encoded view area may be, but is not limited to, sliced according to time.
  • One time slice of the panoramic media file is pushed at a time. The view areas may be, but are not limited to being, sliced using any of the classic Moving Picture Experts Group (MPEG) video segmentation techniques, and the streaming service is then provided according to an adaptive code stream push strategy.
  • The first central view area is obtained from the plurality of view areas divided on the panoramic image frame, so that the first central view area is positioned accurately and the accuracy of the acquired picture of the first central view area is ensured, which overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the plurality of view areas to acquire the first central view area quickly greatly improves acquisition efficiency and thereby improves the push efficiency of panoramic media files.
  • Determining the first central view area on the plurality of view areas of the panoramic image frame comprises:
  • Determining the target view area from the plurality of view areas by using the coordinate range of the first central view area comprises: obtaining the identifiers of the view areas within the coordinate range of the first central view area; and stitching the view areas indicated by those identifiers to obtain the target view area.
  • the motion data detected by the foregoing sensor may include, but is not limited to, at least one of the following: an angle of the head rotation and an eye rotation parameter.
  • the motion data may further include other motion data for detecting a field of view of the viewer, which is not limited in this embodiment.
  • The coordinate range of the first central view area is determined according to the motion data detected by the sensor, and this coordinate range is used to determine the identifiers of the view areas that overlap the first central view area and thus make up the target view area.
  • The target view area includes the A, B, E and F view areas (shaded in FIG. 4).
  • The first central view area lies mainly in the F view area and covers part of the adjacent A, B and E view areas, so the picture content of the first central view area is obtained by stitching the pictures of the four view areas A, B, E and F.
  • The decoded pictures of the four view areas A, B, E and F may be stitched according to the panoramic sphere image projection method to obtain the target view area, and the picture in the first central view area is then quickly extracted from the target view area according to the relative position of the first central view area.
  • The first central view area may also be contained in, or coincide exactly with, one of the plurality of view areas, in which case the picture in the first central view area is obtained directly.
  • The target view area where the first central view area is located is obtained from the plurality of view areas according to the coordinates of the first central view area detected by the sensor, so that the picture in the first central view area is quickly extracted from the target view area and pushed for playback, which improves the efficiency of pushing the panoramic media file.
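  • The selection of the target view area described above can be sketched as follows. This is an illustrative sketch only: it assumes a uniform 6 × 3 grid of 60° × 60° view areas, a rectangular central view area given by its center and angular width and height, and horizontal wrap-around at 360°. The function and parameter names are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

COLS, ROWS = 6, 3
STEP = 60.0  # assumed angular size of each view area, in degrees


def target_view_areas(center_x: float, center_y: float,
                      fov_x: float, fov_y: float) -> List[Tuple[int, int]]:
    """List (x, y) indices of view areas overlapping the central view area."""
    # Vertical extent, clamped to the valid range [-90, 90].
    y_lo = max(center_y - fov_y / 2.0, -90.0)
    y_hi = min(center_y + fov_y / 2.0, 90.0)
    row_lo = min(int((y_lo + 90.0) // STEP) + 1, ROWS)
    row_hi = min(int((y_hi + 90.0) // STEP) + 1, ROWS)

    # Horizontal extent; column numbers may be negative or exceed COLS
    # here and are wrapped with a modulo below.
    col_lo = int((center_x - fov_x / 2.0) // STEP)
    col_hi = int((center_x + fov_x / 2.0) // STEP)

    areas: List[Tuple[int, int]] = []
    for row in range(row_lo, row_hi + 1):
        for col in range(col_lo, col_hi + 1):
            idx = (col % COLS + 1, row)
            if idx not in areas:      # avoid duplicates for very wide views
                areas.append(idx)
    return areas
```

  • In this sketch a view area whose boundary is merely touched by the central view area is also counted as overlapping. A central view area that straddles one horizontal and one vertical boundary yields four view areas, matching the A/B/E/F example above.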
  • Encoding the panoramic image frame according to the central view area includes:
  • The target view area in which the first central view area is located is encoded at a first resolution;
  • The view areas of the panoramic image frame other than the target view area are encoded at a second resolution, wherein the first resolution is higher than the second resolution.
  • Each divided view area may be, but is not limited to being, encoded at multiple scales to obtain code streams at multiple resolution levels.
  • The resolution (identified by a resolution level) of the view area where the central view area is located may be, but is not limited to being, higher than the resolution of the other view areas. The picture of the central view area that the viewer is watching can therefore be played clearly and faithfully while the pictures of the other view areas are played blurred, which reduces transmission overhead and saves bandwidth.
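  • A minimal sketch of the two-level assignment described above, assuming resolutions are represented by integer resolution levels and view areas by (x, y) index pairs. Both representations and the function name are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

ViewArea = Tuple[int, int]


def assign_resolutions(all_areas: Iterable[ViewArea],
                       target_areas: Iterable[ViewArea],
                       first_resolution: int,
                       second_resolution: int) -> Dict[ViewArea, int]:
    """Map each view area to the resolution level it should be encoded at."""
    if first_resolution <= second_resolution:
        raise ValueError("the first resolution must be higher than the second")
    targets = set(target_areas)
    # Target view areas get the first (higher) resolution, all others the second.
    return {area: first_resolution if area in targets else second_resolution
            for area in all_areas}
```

  • Each view area would then be encoded separately at its assigned level; the encoder itself is outside the scope of this sketch.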
  • obtaining the central view area on multiple view areas of each frame of the panoramic image frame includes:
  • The coordinates of the second central view area on the plurality of view areas after the predetermined time period may be calculated according to the following formula:
  • $(x_t, y_t) = \bigl(x_0 + v_{mod}^{x}(t),\ y_0 + v_{mod}^{y}(t)\bigr)$
  • where $(x_0, y_0)$ represents the coordinates of the first central view area;
  • $(x_t, y_t)$ represents the coordinates of the second central view area after the predetermined time period t;
  • $v_{mod}$ indicates the play mode;
  • $v_{mod}^{x}(t)$ indicates the offset angle in the x direction after the predetermined time period t in the play mode, the x direction being the horizontal direction;
  • $v_{mod}^{y}(t)$ indicates the offset angle in the y direction after the predetermined time period t in the play mode, the y direction being the vertical direction.
  • The foregoing play mode may include, but is not limited to, at least one of: a first play mode for playing a picture in the first central view area, a second play mode for searching for the third central view area, and a third play mode for playing the picture in the third central view area.
  • The play mode may affect, but is not limited to affecting, the offset angle over the predetermined time period t.
  • The first play mode is also referred to as the viewing main mode.
  • The search motion in the search process can be a uniform motion, in which case the offset angle can be the product of the moving speed v and the moving time t; it can also be a non-uniform motion, in which case the offset angle is obtained by a corresponding calculation.
  • This embodiment does not limit the offset angle in this play mode.
  • The coordinates of the second central view area on the plurality of view areas after the predetermined time period are determined according to the play mode of the panoramic media file and the coordinates of the first central view area. The field of view that the viewer will attend to after the predetermined time period is thus predicted accurately, so that the picture of the second central view area to be pushed is obtained in advance and the playback delay caused by network transmission delay can be avoided.
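  • The prediction step can be sketched as below, assuming the second central view area is obtained by adding a mode-dependent offset angle to the coordinates of the first central view area. The uniform-motion offset (speed × time) follows the text; wrapping the horizontal angle at 360° and clamping the vertical angle to [−90°, 90°] are assumptions of this sketch, as are all function names.

```python
from typing import Callable, Tuple

# Maps the time period t to (offset_x, offset_y) in degrees.
Offset = Callable[[float], Tuple[float, float]]


def uniform_offset(vx: float, vy: float) -> Offset:
    """Offset for uniform motion: angular speed (degrees per second) times time."""
    return lambda t: (vx * t, vy * t)


def predict_center(x0: float, y0: float, offset: Offset, t: float) -> Tuple[float, float]:
    """Predict the coordinates (x_t, y_t) of the second central view area."""
    dx, dy = offset(t)
    xt = (x0 + dx) % 360.0                 # assumption: horizontal angle wraps around
    yt = max(-90.0, min(90.0, y0 + dy))    # assumption: vertical angle is clamped
    return xt, yt
```

  • A non-uniform search motion would only require a different `Offset` function; the rest of the sketch is unchanged.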
  • Obtaining the play mode of the panoramic media file includes:
  • When the variation of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold and the previous play mode is the second play mode, determining that the play mode is the third play mode, wherein the third play mode is used for playing the picture in the third central view area.
  • The first play mode is represented by a micro-swing viewing main mode (ma), the second play mode is represented by a new content search mode (ms), and the third play mode is represented by a new content focus mode (mf).
  • The foregoing play modes may be as shown in FIG. 5, and may include:
  • Micro-swing viewing main mode (ma): in this mode the viewer stays on the picture played in the first central view area; the hardware device used for viewing (such as a glasses terminal) is relatively stationary or swings slightly (that is, the swing amplitude within the predetermined period is less than the predetermined threshold) and does not actually leave the first central view area;
  • New content search mode (ms): this mode leaves the micro-swing viewing main mode and moves quickly to search for new content in a new field of view (such as the third central view area); the hardware device used for viewing (such as a glasses terminal) moves quickly and deviates from its original motion track, that is, the swing amplitude within the predetermined period is greater than or equal to the predetermined threshold.
  • New content focus mode (mf): this mode may stay in the third central view area for a short time and then enter the new content search mode again, or may actually enter the micro-swing viewing main mode and stay in the third central view area. That is, the motion data detected by the sensor indicates that the swing amplitude within the predetermined period is less than the predetermined threshold, and the previous play mode is the second play mode.
  • The play mode is judged from the movement trajectory within a predetermined period (e.g., time window T). Movement that only swings back and forth over a short distance corresponds to the micro-swing viewing main mode (ma); faster movement corresponds to the new content search mode (ms); and movement that is relatively stationary or swings slightly within the predetermined period (such as time window T) when the previous mode was the new content search mode corresponds to the new content focus mode (mf).
  • The third central view area may be, but is not limited to, the second central view area, or may be, but is not limited to, another view area among the plurality of view areas other than the first central view area and the second central view area.
  • The play mode is used to predict the offset angle of the field of view within the predetermined time period, so that the coordinates of the second central view area after the predetermined time period are determined from the coordinates of the first central view area and the offset angle.
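  • The mode decision described above can be sketched as a small state machine. This is a sketch under stated assumptions: the swing amplitude is taken as the peak-to-peak change of a single sensor angle over the time window T (angle wrap-around is ignored), and a small swing following the focus mode is treated as a return to the viewing main mode. The class and function names are hypothetical.

```python
from enum import Enum
from typing import Sequence


class PlayMode(Enum):
    MA = "micro-swing viewing main mode"
    MS = "new content search mode"
    MF = "new content focus mode"


def swing_amplitude(angles: Sequence[float]) -> float:
    """Peak-to-peak change of a sensor angle over the time window T."""
    return max(angles) - min(angles)


def next_play_mode(previous: PlayMode, angles: Sequence[float],
                   threshold: float) -> PlayMode:
    """Decide the play mode from the previous mode and the recent motion data."""
    if swing_amplitude(angles) >= threshold:
        return PlayMode.MS          # fast movement: searching for new content
    if previous is PlayMode.MS:
        return PlayMode.MF          # motion settled right after a search
    # Assumption: a small swing after ma or mf means staying in / entering ma.
    return PlayMode.MA
```

  • The chosen mode would then select the offset model used to predict the second central view area.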
  • encoding the panoramic image frame according to the central view area includes:
  • Acquiring the reference values of the plurality of sub-view areas includes repeating the following steps until the plurality of sub-view areas have been traversed: acquiring the current sub-view area from the plurality of sub-view areas; acquiring the saliency level indicated by the salient feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area; and using the maximum of the saliency level and the probability as the reference value of the current sub-view area.
  • each of the plurality of view regions may be, but is not limited to, divided into four sub-view regions of the same size.
  • the current view area includes a sub-view area a, a sub-view area b, a sub-view area c, and a sub-view area d.
  • Obtaining the reference value of the current sub-view area may include, but is not limited to, obtaining the saliency level indicated by the salient feature of the current sub-view area and the probability that the second central view area falls in the current sub-view area, and taking the maximum of the two.
  • The salient feature may be used to indicate, but is not limited to indicating, a visually salient region. For example, a region with a high probability of attention, such as the center of a stage, can be configured as a visually salient region with a high saliency level, whereas regions with a low probability of attention, such as dark areas, the auditorium and the sky, can be configured as visually salient regions with a low saliency level.
  • The saliency level can be, but is not limited to being, represented by $Sa(t, \theta_x, \theta_y)$, where $\theta_x \in [0^\circ, 360^\circ)$ and $\theta_y \in [-90^\circ, 90^\circ]$, and can be, but is not limited to being, calculated a priori with a classical visual saliency detection algorithm.
  • The saliency level of each sub-view area can then be computed; for example, $RSa(t, sx, sy)$ represents the saliency level of the sub-view area $(sx, sy)$ after the predetermined time period t.
  • The sub-view area number in the x direction, sx, ranges over $sx \in [1, 12]$, and the sub-view area number in the y direction, sy, ranges over $sy \in [1, 6]$.
  • The probability that the second central view area falls in a sub-view area is represented by $Pi(t, sx, sy)$, and the reference values of the four sub-view areas included in the current view area $(x, y)$ can be expressed as follows.
  • The reference value of sub-view area a is:
  • $aPi(t, x, y) = \max\bigl(RSa(t, 2x-1, 2y-1),\ Pi(t, 2x-1, 2y-1)\bigr)$
  • The reference values of sub-view areas b and c, $bPi(t, x, y)$ and $cPi(t, x, y)$, are obtained in the same way from the two remaining sub-view areas, $(2x, 2y-1)$ and $(2x-1, 2y)$.
  • The reference value of sub-view area d is:
  • $dPi(t, x, y) = \max\bigl(RSa(t, 2x, 2y),\ Pi(t, 2x, 2y)\bigr)$
  • The resolution of the current view area is determined according to the maximum value $mPi(t, x, y)$ of the above four reference values, where:
  • $mPi(t, x, y) = \max\bigl(aPi(t, x, y),\ bPi(t, x, y),\ cPi(t, x, y),\ dPi(t, x, y)\bigr)$ (4)
  • the resolution of the current view area is updated according to the sub-view area with the largest reference value, to ensure high definition of the content of interest.
  • in this embodiment, the maximum of the reference values of the sub-view areas in a view area is obtained, and the resolution of the view area is determined from that maximum, so that after the predetermined time period different view areas are encoded at different resolutions, which saves bandwidth.
  • the sub-view area most likely to be pushed after the predetermined time period t is predicted from the saliency level indicated by the saliency feature and the probability that the second central view area falls in it, and the resolution of the whole view area, including its other sub-view areas, is then adjusted to that highest resolution to ensure playback clarity of the content being watched.
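The four reference values and their maximum can be sketched as follows. This is a minimal illustration, assuming the saliency levels RSa and the fall probabilities Pi at time t are already available as dictionaries keyed by sub-view area number (sx, sy); the function name and the data layout are hypothetical. Because mPi is the maximum over all four sub-view areas, the result does not depend on which of them is labelled a, b, c, or d.

```python
def max_reference_value(x, y, rsa, pi):
    """Return (mPi, best_sub_area) for view area (x, y), numbered from 1.

    rsa and pi map a sub-view area number (sx, sy) to its saliency level
    and fall probability at time t (hypothetical data layout).
    """
    # View area (x, y) contains sub-view areas {2x-1, 2x} x {2y-1, 2y}.
    sub_areas = [(sx, sy) for sx in (2 * x - 1, 2 * x) for sy in (2 * y - 1, 2 * y)]
    # Reference value of a sub-view area: max of saliency level and probability.
    refs = {s: max(rsa[s], pi[s]) for s in sub_areas}
    best = max(refs, key=refs.get)
    return refs[best], best


if __name__ == "__main__":
    rsa = {(3, 1): 0.2, (3, 2): 0.1, (4, 1): 0.5, (4, 2): 0.3}
    pi = {(3, 1): 0.4, (3, 2): 0.05, (4, 1): 0.1, (4, 2): 0.9}
    print(max_reference_value(2, 1, rsa, pi))
```

The returned sub-view area is the one whose resolution would drive the resolution of the whole view area.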
  • determining the third resolution of the current view area according to the maximum value of the reference values of the plurality of sub-view areas includes:
  • the resolution level of the third resolution of the current view area is calculated by the following formula:
  • (x, y) is the coordinates of the current view area
  • S(t, x, y) is used to indicate the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t
  • mPi (t, x, y) is used to indicate the maximum value of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t
  • Qnet is used to indicate the current network bandwidth level
  • n is used to indicate the highest resolution level, where Qnet ∈ [0, 1] and S(t, x, y) ∈ {1, 2, ..., n};
  • Qnet indicates the current network bandwidth level: the higher the level, the more the push favors high-resolution versions of the content; the lower the level, the more it favors low-resolution versions, to keep viewing smooth.
  • S(t, x, y) represents the resolution level of the third resolution: the higher the level, the higher the resolution of the pushed version, up to the highest-resolution version n; the lower the level, the lower the resolution, down to the lowest-resolution version 1.
  • in this way, the pictures in the view areas are encoded at different resolutions, so that the clearest picture is seen in the area occupied by the central view area and relatively blurred pictures are seen in the other view areas; the panoramic image frame is thus played with differentiated picture quality, which reduces transmission overhead, saves bandwidth, and improves push efficiency.
  • obtaining the probability that the second central view area falls in the current sub-view area includes:
  • (sx, sy) is used to represent the coordinates of the current sub-view area
  • P(t, sx, sy) is used to indicate the probability that the second central view area falls within the current sub-view area after the predetermined time period t
  • (x_t, y_t) is used to indicate the coordinates of the second central view area after the predetermined time period t.
  • the above formula is a negative exponential function with base e; that is, the closer the current sub-view area is to the second central view area, the larger the function value and the larger the corresponding probability.
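The qualitative behaviour described above (a negative exponential in the distance between the sub-view area and the predicted central view area) can be illustrated with the sketch below. It is not the formula of the embodiment: the distance measure, the `scale` constant, the 30° sub-view area size (derived from sx ∈ [1, 12] and sy ∈ [1, 6] over 360° × 180°), the numbering direction, and the treatment of (x_t, y_t) as angles in degrees are all assumptions made for illustration.

```python
import math

SUB_COLS, SUB_ROWS = 12, 6                            # sx in [1, 12], sy in [1, 6]
SUB_W, SUB_H = 360.0 / SUB_COLS, 180.0 / SUB_ROWS     # 30 degrees per sub-view area


def sub_area_center(sx, sy):
    """Center angles of sub-view area (sx, sy); numbering direction is assumed."""
    return (sx - 0.5) * SUB_W, -90.0 + (sy - 0.5) * SUB_H


def fall_probability(sx, sy, xt, yt, scale=30.0):
    """Assumed score in (0, 1]: exp(-distance / scale), larger when closer."""
    cx, cy = sub_area_center(sx, sy)
    dx = abs(cx - xt) % 360.0
    dx = min(dx, 360.0 - dx)          # horizontal angles wrap around at 360 degrees
    dy = abs(cy - yt)
    return math.exp(-math.hypot(dx, dy) / scale)


if __name__ == "__main__":
    for sx in (1, 2, 3, 12):
        print(sx, round(fall_probability(sx, 3, 15.0, -15.0), 3))
```

The score equals 1 when the predicted center coincides with the center of the sub-view area and decays as the two move apart, which is the only property the text states.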
  • the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, but in many cases the former is the better implementation.
  • the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product.
  • the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and includes a plurality of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the methods described in the various embodiments of the invention.
  • a panoramic media file pushing apparatus for implementing the above-described panoramic media file pushing method is also provided, which is applied to a terminal having a display device.
  • the apparatus includes:
  • a first obtaining unit 702, configured to obtain a panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame;
  • a dividing unit 704, configured to divide the panoramic image frame into a plurality of view areas according to a predetermined condition;
  • a second obtaining unit 706, configured to determine a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area;
  • an encoding unit 708, configured to encode the panoramic image frame according to the determined first central view area;
  • a pushing unit 710, configured to push the encoded panoramic image frame.
  • the pushing apparatus for the panoramic media file may be, but is not limited to being, applied to a virtual reality (VR) process, where VR may be, but is not limited to, the combined use of a computer graphics system and various interface devices, such as display and control devices, to provide an immersive sensation in an interactive three-dimensional environment generated on a computer. If the above apparatus is applied to VR glasses, the panoramic media file is divided into view areas and encoded accordingly, so that a more accurate and clearer picture can be provided quickly while the panoramic media file is played.
  • the above panoramic media file may include, but is not limited to, at least one of the following: a panoramic image, a panoramic video, and the like. The above is only an example, and is not limited in this embodiment.
  • after the panoramic media file to be pushed is obtained, each panoramic image frame is divided into multiple view areas according to a predetermined condition, the first central view area is obtained on the plurality of view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. That is, the view areas divided on the panoramic image frame are used to locate the central view area precisely, which ensures the accuracy of the acquired central view area, greatly improves acquisition efficiency, and thereby improves the push efficiency of the panoramic media file.
  • dividing each panoramic image frame according to the predetermined condition may include, but is not limited to, dividing the panoramic image frame, which is uniformly distributed on the panoramic sphere, into multiple rectangular view areas of the same size according to a pre-configured specification.
  • a polar coordinate system can be used to represent positions on the panoramic image frame.
  • (θx, θy) is used to represent a position on a panoramic image frame, where θx represents the angle in the horizontal direction, which is zero straight ahead and increases counterclockwise, and θy represents the angle in the vertical direction measured from the horizontal, which is taken as zero.
  • one panoramic image frame is divided into a plurality of view areas, such as view area A to view area P.
  • the angle range of each view area can be defined as:
  • x and y are the numbers of the 18 view areas in the x and y directions, respectively, with x ∈ [1, 6] and y ∈ [1, 3].
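As an illustration of the numbering above, the following sketch maps a position (θx, θy) to a view area number (x, y). It assumes the 18 view areas form a uniform 6 × 3 grid, so that each one spans 60° horizontally and 60° vertically, and that x = 1 starts at θx = 0° and y = 1 starts at θy = -90°; the numbering direction and the function name are assumptions, not taken from the embodiment.

```python
def view_area_of(theta_x, theta_y, cols=6, rows=3):
    """Return the 1-based view area number (x, y) containing (theta_x, theta_y).

    Assumes a uniform grid over [0, 360) x [-90, 90] degrees.
    """
    if not -90.0 <= theta_y <= 90.0:
        raise ValueError("theta_y must lie in [-90, 90] degrees")
    theta_x %= 360.0                       # wrap the horizontal angle into [0, 360)
    width, height = 360.0 / cols, 180.0 / rows
    x = int(theta_x // width) + 1
    # theta_y == 90 sits on the upper edge; keep it in the last row.
    y = min(int((theta_y + 90.0) // height) + 1, rows)
    return x, y


if __name__ == "__main__":
    print(view_area_of(0.0, -90.0))
    print(view_area_of(60.0, 0.0))
    print(view_area_of(-10.0, 0.0))
```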
  • the first central view area acquired on the plurality of view areas may be, but is not limited to, the view area formed by the viewer's field of view.
  • the size of the view areas into which the panoramic image frame is divided may be, but is not limited to being, determined from the size of the first central view area; for example, the area occupied by each view area may be less than or equal to the area occupied by the central view area and greater than a quarter of it. The above is only an example, and is not limited in this embodiment.
  • acquiring the central view area on the plurality of view areas of each panoramic image frame may include, but is not limited to, the following:
  • the coordinate range of the first central view area is used to quickly acquire a target view area from the plurality of view areas, wherein the target view area includes at least one view area that overlaps the coordinate range of the first central view area.
  • the picture of the first central view area is extracted from the target view area. That is, in this embodiment, the motion data detected by the sensor on the terminal can be used to quickly acquire the coordinates of the first central view area at the current time, and these coordinates are then used to determine the target view area occupied by the first central view area.
  • for example, the target view area obtained from the coordinates of the first central view area includes view area A, view area B, view area E, and view area F (as shown in the figure).
  • the picture of the first central view area is extracted from the target view area, and the target view area where the first central view area is located can then conveniently be encoded and pushed at a resolution higher than that of the other view areas.
  • the coordinates of the second central view area may be predicted; that is, the coordinates of the second central view area after the predetermined time period t are predicted according to the play mode of the panoramic media file and the coordinates of the first central view area. The picture that will be played after the predetermined time period t is thus pushed in advance, which overcomes the picture delay during playback caused by network communication delay and improves push efficiency.
  • encoding the panoramic image frame according to the central view area may include, but is not limited to, assigning different resolution levels to different view areas; for example, the resolution level of the view area where the central view area is located is higher than that of the other view areas of the panoramic image frame.
  • encoding may be, but is not limited to being, performed separately for each view area to obtain the code stream to be pushed, so that different view areas are encoded at different resolution levels, which saves bandwidth and reduces transmission overhead.
  • the code stream of an encoded view area may be, but is not limited to being, sliced by time.
  • one time slice of the panoramic media file is pushed at a time. A view area may be, but is not limited to being, sliced using any of the classic Moving Picture Experts Group (MPEG) video segmentation techniques, and the streaming service is then performed according to an adaptive code stream push strategy.
  • the second obtaining unit includes:
  • a first determining module configured to determine a coordinate range of the first central view area according to the motion data detected by the sensor
  • a first obtaining module, configured to obtain a target view area from the plurality of view areas by using the coordinate range of the first central view area, wherein the target view area includes at least one view area overlapping the coordinate range of the first central view area;
  • an extraction module, configured to extract the picture corresponding to the first central view area from the target view area.
  • the first obtaining module includes: (1) an acquiring sub-module, configured to acquire the identifiers of the view areas in which the coordinate range of the first central view area is located; and (2) a splicing sub-module, configured to obtain the target view area by splicing the view areas indicated by those identifiers.
  • the motion data detected by the foregoing sensor may include, but is not limited to, at least one of the following: an angle of the head rotation and an eye rotation parameter.
  • the motion data may further include other motion data for detecting a field of view of the viewer, which is not limited in this embodiment.
  • for example, the coordinates of the first central view area are determined from the motion data detected by the sensor, and these coordinates are used to obtain the identifiers of the view areas that make up the target view area where the first central view area is located: view area A, view area B, view area E, and view area F (the shaded part of Figure 4). As shown in Figure 4, the first central view area lies mainly in view area F and covers part of the adjacent view areas A, B, and E, so the picture content of the first central view area is obtained by stitching the pictures of the four view areas A, B, E, and F.
  • the decoded pictures of the four view areas A, B, E, and F may be stitched according to the panoramic sphere image projection method to obtain the target view area, and the picture in the first central view area is then quickly extracted from the target view area according to the relative position of the first central view area.
  • when the area occupied by the first central view area is equal to the area occupied by each view area, the first central view area may be contained in a target view area formed by several view areas, or may coincide exactly with one view area, in which case the picture in the first central view area is obtained directly.
  • in this embodiment, the target view area where the first central view area is located is obtained from the plurality of view areas according to the coordinates of the first central view area detected by the sensor, so that the picture in the first central view area can be quickly extracted from the target view area, pushed, and played, which improves the push efficiency of the panoramic media file.
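The step of collecting the view areas that overlap the coordinate range of the first central view area can be sketched as follows. The sketch assumes a uniform grid of view areas over [0°, 360°) × [-90°, 90°], a rectangular central view area given by its center and angular size, and view area numbers (x, y) counted from 1; the function name, the parameters, and the numbering direction are hypothetical.

```python
import math


def target_view_areas(cx, cy, fov_w, fov_h, cols=6, rows=3):
    """Return the sorted (x, y) numbers of all view areas overlapped by a
    central view area centered at (cx, cy) with angular size fov_w x fov_h."""
    if fov_w <= 0 or fov_h <= 0:
        raise ValueError("the central view area must have a positive size")
    width, height = 360.0 / cols, 180.0 / rows
    # Horizontal range, allowed to run past 0/360 and wrapped with the modulo.
    first_col = math.floor((cx - fov_w / 2) / width)
    last_col = math.ceil((cx + fov_w / 2) / width) - 1
    xs = {k % cols + 1 for k in range(first_col, last_col + 1)}
    # Vertical range, clipped to the sphere and shifted to [0, 180].
    lo = max(-90.0, cy - fov_h / 2) + 90.0
    hi = min(90.0, cy + fov_h / 2) + 90.0
    first_row = math.floor(lo / height)
    last_row = min(math.ceil(hi / height) - 1, rows - 1)
    ys = range(first_row + 1, last_row + 2)
    return sorted((x, y) for x in xs for y in ys)


if __name__ == "__main__":
    # A view that straddles a corner overlaps four view areas.
    print(target_view_areas(70.0, -25.0, 40.0, 40.0))
    # A view that straddles the 0/360 seam overlaps the first and last columns.
    print(target_view_areas(5.0, 0.0, 30.0, 30.0))
```

Splicing the decoded pictures of the returned view areas then gives the target view area from which the central picture is cropped.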
  • the coding unit includes:
  • a first encoding module, configured to encode the target view area in which the first central view area is located at the first resolution, and to encode the view areas in the panoramic image frame other than the target view area at the second resolution, where the first resolution is higher than the second resolution.
  • each divided view area may be, but is not limited to being, encoded at multiple scales to obtain code streams at multiple resolution levels.
  • the resolution of the central view area (identified by its resolution level) may be, but is not limited to being, higher than the resolution of the other view areas. The picture of the central view area of interest can therefore be played clearly and faithfully while the pictures of the other view areas are played blurred, which reduces transmission overhead and saves bandwidth.
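The two-level assignment performed by the first encoding module can be sketched as a simple plan that maps every view area to a resolution level. The sketch assumes resolution levels are integers where a larger value means a higher resolution, and that view areas are identified by (x, y) numbers on a 6 × 3 grid; the function name and the dictionary layout are hypothetical, and the actual encoding of each view area is outside its scope.

```python
def assign_resolutions(target_areas, first_level, second_level, cols=6, rows=3):
    """Map each view area (x, y) to a resolution level: first_level for the
    target view area, second_level for every other view area."""
    if first_level <= second_level:
        raise ValueError("the first resolution must be higher than the second")
    targets = set(target_areas)
    return {
        (x, y): first_level if (x, y) in targets else second_level
        for x in range(1, cols + 1)
        for y in range(1, rows + 1)
    }


if __name__ == "__main__":
    plan = assign_resolutions([(1, 1), (2, 1)], first_level=3, second_level=1)
    print(plan[(1, 1)], plan[(6, 3)])
```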
  • the second obtaining unit includes:
  • a second acquiring module configured to acquire a play mode of the panoramic media file
  • the second determining module is configured to determine a coordinate range of the second central view area on the plurality of view areas after the predetermined time period according to the play mode of the panoramic media file and the coordinate range of the first central view area.
  • the coordinate range of the second central view area on the plurality of view areas after the predetermined time period may be calculated according to the following formula:
  • (x_0, y_0) is used to represent the coordinates of the first central view area
  • (x_t, y_t) is used to represent the coordinates of the second central view area after the predetermined time period t
  • v_mod is used to indicate the play mode
  • v_mod,x(t) is used to indicate the offset angle in the x direction after the predetermined time period t in the play mode
  • the x direction is the horizontal direction
  • v_mod,y(t) is used to indicate the offset angle in the y direction after the predetermined time period t in the play mode, and the y direction is the vertical direction.
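The definitions above suggest that the second central view area is reached from the first by applying the offset angles of the play mode. The following sketch works under that assumption: the predicted coordinates are the current coordinates plus the per-direction offsets, with the horizontal angle wrapped into [0°, 360°) and the vertical angle clamped to [-90°, 90°]. The additive form, the wrapping and clamping, and the helper for uniform motion (offset equal to speed times time) are illustrative assumptions, and both function names are hypothetical.

```python
def uniform_offset(speed_deg_per_s, t):
    """Offset angle for uniform motion: moving speed multiplied by time."""
    return speed_deg_per_s * t


def predict_center(x0, y0, offset_x, offset_y):
    """Predict (x_t, y_t) from (x_0, y_0) and the offset angles in degrees."""
    xt = (x0 + offset_x) % 360.0                 # horizontal angle wraps around
    yt = max(-90.0, min(90.0, y0 + offset_y))    # vertical angle stays on the sphere
    return xt, yt


if __name__ == "__main__":
    dx = uniform_offset(40.0, 0.5)               # 40 degrees per second for 0.5 s
    print(predict_center(350.0, 10.0, dx, -5.0))
```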
  • the foregoing play mode may include, but is not limited to, at least one of: a first play mode for playing the picture in the first central view area, a second play mode for searching for the third central view area, and a third play mode for playing the picture in the third central view area.
  • the above play mode may, but is not limited to, affect the offset angle within the predetermined time period t.
  • the first play mode is also referred to as the viewing main mode.
  • the search motion in the search process (the second play mode) can be a uniform motion, in which case the offset angle is the product of the moving speed v and the moving time t, or a non-uniform motion, in which case the offset angle is obtained by a corresponding calculation.
  • this embodiment does not limit how the offset angle is obtained in a given play mode.
  • in this embodiment, the coordinates of the second central view area on the plurality of view areas after the predetermined time period are determined according to the play mode of the panoramic media file and the coordinates of the first central view area, so that the field of view of interest after the predetermined time period is predicted accurately and the picture of the second central view area to be pushed is obtained in advance, which avoids the playback delay caused by network transmission delay.
  • the second acquisition module includes:
  • a third determining submodule, configured to determine that the play mode is the first play mode when the change range of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, wherein the first play mode is used to play the picture in the first central view area;
  • a fourth determining submodule, configured to determine that the play mode is the second play mode when the change range of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, wherein the second play mode is used to search for the third central view area;
  • a fifth determining submodule, configured to determine that the play mode is the third play mode when the change range of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode is the second play mode, wherein the third play mode is used to play the picture in the third central view area.
  • the first play mode is represented by a micro-swing viewing main mode (ma)
  • the second play mode is represented by a new content search mode (ms)
  • the third play mode is represented by a new content focus mode (mf).
  • the foregoing play mode may be specifically as shown in FIG. 5, and may include:
  • Micro-swing viewing main mode (ma): in this mode the viewer stays on the picture played in the first central view area, and the hardware device used for viewing (such as a glasses terminal) is relatively stationary or swings slightly (that is, the amplitude of the swing within the predetermined period is less than the predetermined threshold) without actually leaving the first central view area.
  • New content search mode (ms): in this mode the viewer leaves the micro-swing viewing main mode and moves quickly to search for new content in a new field of view (such as the third central view area), and the hardware device used for viewing (such as a glasses terminal) moves quickly and deviates from the original motion track; that is, the amplitude of the swing within the predetermined period is greater than or equal to the predetermined threshold.
  • New content focus mode (mf): in this mode the viewer may stay in the third central view area for a short time and then enter the new content search mode again, or may enter the micro-swing viewing main mode and remain in the third central view area; that is, the motion data detected by the sensor indicates that the amplitude of the swing within the predetermined period is less than the predetermined threshold, and the previous play mode is the second play mode.
  • the play mode is judged from the movement trajectory within a predetermined period (e.g., time window T): a trajectory that only swings back and forth over a short distance corresponds to the micro-swing viewing main mode (ma); a faster-moving trajectory corresponds to the new content search mode (ms); and a trajectory that is relatively stationary or slightly oscillating within the predetermined period, when the previous mode was the new content search mode, corresponds to the new content focus mode (mf).
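The mode decision described above can be sketched as a small state function. The sketch assumes the "change range" of the motion data is the peak-to-peak swing of the view angles sampled within the time window T (ignoring wrap-around at 0°/360°), and that a device that stays below the threshold after the focus mode returns to the main mode; both are assumptions, as are the names used.

```python
from enum import Enum


class PlayMode(Enum):
    MA = "micro-swing viewing main mode"
    MS = "new content search mode"
    MF = "new content focus mode"


def swing_amplitude(samples):
    """Peak-to-peak change of (theta_x, theta_y) samples within one window."""
    xs, ys = zip(*samples)
    return max(max(xs) - min(xs), max(ys) - min(ys))


def next_play_mode(amplitude, threshold, previous):
    """Decide the play mode for the current window from the swing amplitude."""
    if amplitude >= threshold:
        return PlayMode.MS          # fast movement: searching for new content
    if previous is PlayMode.MS:
        return PlayMode.MF          # settled right after a search: focus on new content
    return PlayMode.MA              # otherwise: keep watching with slight swings


if __name__ == "__main__":
    mode = PlayMode.MA
    for window in ([(10, 0), (12, 1)], [(12, 1), (80, 5)], [(80, 5), (81, 5)]):
        mode = next_play_mode(swing_amplitude(window), 5.0, mode)
        print(mode.name)
```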
  • the third central view area may be, but is not limited to, the second central view area, or may be, but is not limited to, another view area among the plurality of view areas other than the first central view area and the second central view area.
  • in this embodiment, the play mode is used to predict the offset angle of the field of view within the predetermined time period, so that the coordinates of the second central view area after the predetermined time period are determined from the coordinates of the first central view area and the offset angle.
  • the coding unit includes:
  • the processing module is configured to obtain the reference values of the plurality of sub-view areas by repeating the following steps until all of the sub-view areas have been traversed: acquiring the current sub-view area from the plurality of sub-view areas; obtaining the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area; and taking the maximum of the saliency level and the probability as the reference value of the current sub-view area.
  • each of the plurality of view regions may be, but is not limited to, divided into four sub-view regions of the same size.
  • the current view area includes a sub-view area a, a sub-view area b, a sub-view area c, and a sub-view area d.
  • obtaining the reference value of the current sub-view area may include, but is not limited to, obtaining the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls in the current sub-view area, and taking the maximum of the two.
  • the above-mentioned saliency features may be, but are not limited to being, used to indicate visually salient regions. For example, regions with a high probability of attention, such as the center of the stage, can be configured as visually salient regions of a high saliency level, while regions with a low probability of attention, such as dark areas, the auditorium, and the sky, can be configured as visually salient regions of a low saliency level.
  • the above-mentioned saliency level may be, but is not limited to being, represented by Sa(t, θx, θy), where θx ∈ [0°, 360°) and θy ∈ [-90°, 90°]; the saliency level may be, but is not limited to being, calculated a priori using a classical visual saliency detection algorithm.
  • the saliency level of each sub-view area can then be computed; for example, RSa(t, sx, sy) represents the saliency level of the sub-view area (sx, sy) after the predetermined time period t.
  • the sub-view area number in the x direction, sx, ranges over sx ∈ [1, 12], and the sub-view area number in the y direction, sy, ranges over sy ∈ [1, 6].
  • the probability that the second central view area falls in a sub-view area is represented by Pi(t, sx, sy), and the current view area (x, y) includes four sub-view areas a, b, c, and d, whose reference values are defined as follows:
  • the reference value of sub-view area a is:
  • aPi(t, x, y) = max(RSa(t, 2x-1, 2y-1), Pi(t, 2x-1, 2y-1))
  • the reference values bPi(t, x, y) and cPi(t, x, y) of sub-view areas b and c are defined in the same way over their own sub-view area numbers;
  • the reference value of sub-view area d is:
  • dPi(t, x, y) = max(RSa(t, 2x, 2y), Pi(t, 2x, 2y))
  • the resolution of the current view area is determined according to the maximum value mPi(t, x, y) of the above four reference values, wherein:
  • mPi(t, x, y) = max(aPi(t, x, y), bPi(t, x, y), cPi(t, x, y), dPi(t, x, y))    (10)
  • the resolution of the current view area is updated according to the sub-view area with the largest reference value, to ensure high definition of the content of interest.
  • in this embodiment, the maximum of the reference values of the sub-view areas in a view area is obtained, and the resolution of the view area is determined from that maximum, so that after the predetermined time period different view areas are encoded at different resolutions, which saves bandwidth.
  • the sub-view area most likely to be pushed after the predetermined time period t is predicted from the saliency level indicated by the saliency feature and the probability that the second central view area falls in it, and the resolution of the whole view area, including its other sub-view areas, is then adjusted to that highest resolution to ensure playback clarity of the content being watched.
  • the processing module determines the third resolution of the current view area according to the maximum of the reference values through the following steps:
  • the resolution level of the third resolution of the current view area is calculated by the following formula:
  • (x, y) is the coordinates of the current view area
  • S(t, x, y) is used to indicate the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t
  • mPi (t, x, y) is used to indicate the maximum value of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t
  • Qnet is used to indicate the current network bandwidth level
  • n is used to indicate the highest resolution level, where Qnet ∈ [0, 1] and S(t, x, y) ∈ {1, 2, ..., n};
  • Qnet indicates the current network bandwidth level: the higher the level, the more the push favors high-resolution versions of the content; the lower the level, the more it favors low-resolution versions, to keep viewing smooth.
  • S(t, x, y) represents the resolution level of the third resolution: the higher the level, the higher the resolution of the pushed version, up to the highest-resolution version n; the lower the level, the lower the resolution, down to the lowest-resolution version 1.
  • in this way, the pictures in the view areas are encoded at different resolutions, so that the clearest picture is seen in the area occupied by the central view area and relatively blurred pictures are seen in the other view areas; the panoramic image frame is thus played with differentiated picture quality, which reduces transmission overhead, saves bandwidth, and improves push efficiency.
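The dependence of the resolution level on mPi and Qnet can be illustrated with a purely hypothetical mapping. The sketch below is not the formula of the embodiment; it only reproduces the stated properties: the result lies in {1, 2, ..., n}, and it does not decrease when either the maximum reference value or the network bandwidth level increases. It additionally assumes that mPi is normalized to [0, 1].

```python
import math


def resolution_level(m_pi, q_net, n):
    """Hypothetical mapping from (mPi, Qnet) to a resolution level in 1..n.

    Assumes m_pi and q_net both lie in [0, 1]; the product form is an
    illustrative choice, not taken from the embodiment.
    """
    if not (0.0 <= m_pi <= 1.0 and 0.0 <= q_net <= 1.0) or n < 1:
        raise ValueError("expected m_pi, q_net in [0, 1] and n >= 1")
    return max(1, min(n, math.ceil(n * m_pi * q_net)))


if __name__ == "__main__":
    for m_pi, q_net in ((1.0, 1.0), (0.9, 0.5), (0.0, 1.0)):
        print(m_pi, q_net, resolution_level(m_pi, q_net, 4))
```

With a mapping of this kind, a view area that is both likely to be watched and served over a good network gets the highest-resolution version, while a low bandwidth level pulls every view area toward the lowest-resolution version.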
  • the processing module obtains the probability that the second central view area falls in the current sub-view area by the following steps:
  • (sx, sy) is used to represent the coordinates of the current sub-view area
  • P(t, sx, sy) is used to indicate the probability that the second central view area falls within the current sub-view area after the predetermined time period t
  • (x_t, y_t) is used to indicate the coordinates of the second central view area after the predetermined time period t.
  • the above formula is a negative exponential function with base e; that is, the closer the current sub-view area is to the second central view area, the larger the function value and the larger the corresponding probability.
  • a panoramic media file pushing terminal for implementing the above-mentioned panoramic media file pushing method is further provided.
  • the terminal includes:
  • a communication interface 802, configured to obtain a panoramic media file to be pushed, wherein the panoramic media file includes at least one panoramic image frame, and further configured to push the encoded panoramic image frame;
  • a processor 804, connected to the communication interface 802 and configured to divide the panoramic image frame into a plurality of view areas according to a predetermined condition; further configured to determine the first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one view area; and further configured to encode the panoramic image frame according to the first central view area.
  • Embodiments of the present invention also provide a storage medium.
  • the foregoing storage medium may be located in at least one of the plurality of network devices in the network.
  • the storage medium is arranged to store program code for performing the following steps:
  • obtaining the panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame;
  • the storage medium is further arranged to store program code for performing the following steps:
  • the storage medium is further configured to store program code for: encoding the target view area in which the first central view area is located at the first resolution, and encoding the view areas in the panoramic image frame other than the target view area at the second resolution, wherein the first resolution is higher than the second resolution.
  • the storage medium is further arranged to store program code for performing the following steps:
  • the storage medium is further arranged to store program code for repeating the following steps until the plurality of view areas in the panoramic image frame after the predetermined time period have been traversed: determining, from the plurality of view areas, the plurality of sub-view areas into which the current view area is divided; obtaining reference values of the plurality of sub-view areas, wherein the reference value is the maximum of the saliency level indicated by the saliency feature of the sub-view area and the probability that the second central view area falls within the sub-view area; determining a third resolution of the current view area based on the maximum of the reference values of the plurality of sub-view areas; and encoding the current view area at the third resolution.
  • the foregoing storage medium may include, but is not limited to, a U disk, a Read-Only Memory (ROM), a removable hard disk, a magnetic disk, or an optical disk, and the like.
  • the integrated unit in the above embodiment if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in the above-described computer readable storage medium.
  • the technical solution of the present invention, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes a number of instructions that cause one or more computer devices (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • the disclosed client may be implemented in other manners.
  • the device embodiments described above are merely illustrative; for example, the division into units is only a logical functional division, and an actual implementation may divide them differently.
  • multiple units or components may be combined or may be integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed in the present invention are a panoramic media file push method and device. The method comprises: obtaining a panoramic media file to be pushed, the panoramic media file comprising at least one panoramic image frame; dividing each panoramic image frame into multiple view regions according to a preset condition; obtaining a first central view region on the multiple view regions of each panoramic image frame, a region occupied by the first central view region being larger than or equal to a region occupied by one view region; coding the panoramic image frame according to the first central view region; and pushing the coded panoramic image frame. The present invention resolves the technical problem of low push accuracy caused when an existing panoramic media file push mode is used.

Description

Panoramic media file push method and device
This application claims priority to Chinese Patent Application No. 201610557007.2, entitled "Panoramic media file push method and device" and filed with the Chinese Patent Office on July 14, 2016, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of computers, and in particular to a panoramic media file push method and device.
Background
Because panoramic media files can offer users a more realistic, immersive viewing experience than the traditional limited field of view, they have gradually become one of the main forms of content in the virtual reality field. Compared with traditional media files, however, panoramic media files pose considerable engineering difficulties and challenges.
At present, a common technique for playing panoramic media files is to use a quadrangular pyramid. Specifically, the panoramic sphere is placed inside a quadrangular pyramid, the center of the viewer's field of view is aligned perpendicular to the center of the pyramid's base, and the picture to be played is projected onto the faces of the pyramid by a projective geometric transformation. The part of the picture projected onto the base stays in high definition within the viewer's field of view, while the remaining content outside the field of view is projected onto the side faces and is compressed sharply, which greatly reduces the bandwidth pressure of pushing panoramic media files.
However, a panoramic media file must provide a 360-degree panoramic picture. If only picture content for one fixed, predefined viewing angle is transmitted, then when the center of the viewer's field of view moves, part of the picture in the new field of view cannot be presented properly because it has been compressed. Pushing the panoramic media file for only one predefined viewing angle therefore makes the pushed panoramic media file inaccurate, which causes distortion when the panoramic media file is played.
No effective solution to the above problem has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a panoramic media file push method and device, to solve at least the technical problem of low push accuracy caused by existing ways of pushing panoramic media files.
According to one aspect of the embodiments of the present invention, a panoramic media file push method is provided, applied to a terminal having a display device, including: acquiring a panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame; dividing the panoramic image frame into multiple view areas according to a predetermined condition; acquiring a first central view area on the multiple view areas of the panoramic image frame, where the area occupied by the first central view area is greater than or equal to the area occupied by one view area; encoding the panoramic image frame according to the determined first central view area; and pushing the encoded panoramic image frame to the display device.
According to another aspect of the embodiments of the present invention, a panoramic media file push device is further provided, applied to a terminal having a display device, including: a first acquiring unit, configured to acquire a panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame; a dividing unit, configured to divide the panoramic image frame into multiple view areas according to a predetermined condition; a second acquiring unit, configured to determine a first central view area on the multiple view areas of the panoramic image frame, where the area occupied by the first central view area is greater than or equal to the area occupied by one view area; an encoding unit, configured to encode the panoramic image frame according to the determined first central view area; and a pushing unit, configured to push the encoded panoramic image frame.
In the embodiments of the present invention, after at least one panoramic image frame in the panoramic media file to be pushed is acquired, the panoramic image frame is divided into multiple view areas according to a predetermined condition, a first central view area is acquired on those view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. In other words, the central view area is obtained from the multiple view areas into which the panoramic image frame is divided, so the central view area is located accurately and the picture obtained for it is accurate, which overcomes the technical problem in the related art that only a heavily compressed, distorted picture can be obtained. Further, using the multiple view areas to obtain the central view area quickly greatly improves acquisition efficiency, and thereby improves the efficiency of pushing panoramic media files.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of an application environment of an optional panoramic media file push method according to an embodiment of the present invention;
FIG. 2 is a flowchart of an optional panoramic media file push method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an optional panoramic media file push method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another optional panoramic media file push method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of still another optional panoramic media file push method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of still another optional panoramic media file push method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an optional panoramic media file push device according to an embodiment of the present invention; and
FIG. 8 is a schematic diagram of an optional panoramic media file push terminal according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present invention are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the invention described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variants of them are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
An embodiment of the above panoramic media file push method is provided in the embodiments of the present invention. As an optional implementation, the method may be, but is not limited to being, applied in the application environment shown in FIG. 1. As shown in FIG. 1, the terminal 106 acquires the panoramic media file to be pushed from the server 102 through the network 104, where the panoramic media file includes at least one panoramic image frame. The panoramic image frame is divided into multiple view areas according to a predetermined condition, a first central view area is determined on the multiple view areas of the panoramic image frame, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed.
As another optional implementation, the method may also be, but is not limited to being, applied in another application environment, for example in the terminal alone, where the dividing, encoding, and pushing of the panoramic image frames in the panoramic media file are all performed within the terminal. For details, refer to the implementation above; they are not repeated here.
In this embodiment, after at least one panoramic image frame in the panoramic media file to be pushed is acquired, the panoramic image frame is divided into multiple view areas according to a predetermined condition, a first central view area is determined on those view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. In other words, the first central view area is obtained from the multiple view areas into which the panoramic image frame is divided, so the first central view area is located accurately and the picture obtained for the central view area is accurate, which overcomes the problem in the related art that only a heavily compressed, distorted picture can be obtained. Further, using the multiple view areas to obtain the central view area quickly greatly improves acquisition efficiency, and thereby improves the efficiency of pushing panoramic media files.
Optionally, in this embodiment, the terminal may include, but is not limited to, at least one of the following: a mobile phone, a tablet computer, a notebook computer, a desktop PC, smart glasses, and other hardware devices for playing panoramic media files. The network may include, but is not limited to, at least one of the following: a wide area network, a metropolitan area network, and a local area network. The above is only an example, and this embodiment imposes no limitation in this respect.
According to an embodiment of the present invention, a panoramic media file push method is provided. As shown in FIG. 2, the method includes:
S202: acquire a panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame;
S204: divide the panoramic image frame into multiple view areas according to a predetermined condition;
S206: determine a first central view area on the multiple view areas of the panoramic image frame, where the area occupied by the first central view area is greater than or equal to the area occupied by one view area;
S208: encode the panoramic image frame according to the first central view area;
S210: push the encoded panoramic image frame.
Optionally, in this embodiment, the above panoramic media file push method may be, but is not limited to being, applied in virtual reality (VR). VR may be, but is not limited to, a technology that combines a computer graphics system with various interface devices, such as control devices, to provide a sense of immersion in an interactive three-dimensional environment generated on a computer. For example, when the method is applied to VR glasses, the panoramic media file is divided into view areas and encoded so that a more accurate and clearer panoramic media file can be provided quickly during playback. The panoramic media file may include, but is not limited to, at least one of the following: a panoramic image, a panoramic video, and the like. The above is only an example, and this embodiment imposes no limitation in this respect.
It should be noted that, after at least one panoramic image frame in the panoramic media file to be pushed is acquired, the panoramic image frame is divided into multiple view areas according to a predetermined condition, a first central view area is determined on those view areas, and the panoramic image frame is encoded according to the determined first central view area so that the encoded panoramic image frame can be pushed. In other words, the first central view area is obtained from the multiple view areas into which the panoramic image frame is divided, so the first central view area is located accurately and the picture obtained for the central view area is accurate, which overcomes the problem in the related art that only a heavily compressed, distorted picture can be obtained. Further, using the multiple view areas to obtain the central view area quickly greatly improves acquisition efficiency, and thereby improves the efficiency of pushing panoramic media files.
Optionally, in this embodiment, dividing each panoramic image frame according to a predetermined condition may include, but is not limited to: dividing the panoramic image frame, which is uniformly distributed over the panoramic sphere, into multiple rectangular view areas of the same size according to preconfigured specifications.
For example, take the center of the panoramic sphere as the origin. Because every point on the sphere is at a distance from the center (the origin) equal to the radius of the panoramic sphere, each position on the panoramic image frame can be expressed in a polar coordinate system. For example, (θx, θy) denotes a position on the panoramic image frame, where θx is the angle swept counterclockwise in the horizontal direction, with zero at the horizontal straight-ahead direction, and θy is the angle swept counterclockwise in the direction perpendicular to the horizontal, with zero directly above the horizontal. Further, as shown in FIG. 3, one panoramic image frame is divided into multiple view areas, for example view area A to view area P. That is, with the center of the panoramic sphere as the origin, the 360 degrees of counterclockwise rotation in the horizontal direction are divided equally into 6 parts of 60 degrees each, and the 180 degrees of counterclockwise rotation in the direction perpendicular to the horizontal are divided equally into 3 parts of 60 degrees each. The angle range of each view area can then be defined as:
[Formula image PCTCN2017092562-appb-000001, not reproduced: the angle range of each view area in terms of x and y]
Here x and y are the indices of the 18 view areas in the x direction and the y direction, respectively, with x ∈ [1, 6] and y ∈ [1, 3].
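The equal division described in this example can be sketched in code. The following Python snippet is a minimal illustration rather than the formula from the original figure: it assumes that view area (x, y) covers θx from 60(x − 1) to 60x degrees and θy from 60(y − 1) to 60y degrees, which is what the division into 6 horizontal and 3 vertical parts of 60 degrees each suggests. The function name view_area_range and the constants are hypothetical.

```python
from typing import Tuple

H_PARTS = 6       # horizontal 360 degrees split into 6 parts
V_PARTS = 3       # vertical 180 degrees split into 3 parts
STEP_DEG = 60.0   # each part spans 60 degrees in either direction

Range = Tuple[float, float]


def view_area_range(x: int, y: int) -> Tuple[Range, Range]:
    """Return ((theta_x_min, theta_x_max), (theta_y_min, theta_y_max)) in degrees.

    Assumption: indices are 1-based, x in [1, 6] and y in [1, 3].
    """
    if not (1 <= x <= H_PARTS and 1 <= y <= V_PARTS):
        raise ValueError("x must be in [1, 6] and y must be in [1, 3]")
    theta_x = (STEP_DEG * (x - 1), STEP_DEG * x)
    theta_y = (STEP_DEG * (y - 1), STEP_DEG * y)
    return theta_x, theta_y
```

Under these assumptions, the 18 ranges tile the full 360 by 180 degree sphere without gaps.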
In addition, in this embodiment, the first central view area acquired on the multiple view areas may be, but is not limited to, the view area formed by the viewer's field of view. It should be noted that the area of the view areas into which the panoramic image frame is divided may be, but is not limited to being, determined according to the area of the first central view area. For example, the area occupied by each view area may be, but is not limited to being, less than or equal to the area occupied by the first central view area and greater than one quarter of that area. The above is only an example, and this embodiment imposes no limitation in this respect.
Optionally, in this embodiment, acquiring the central view area on the multiple view areas of each panoramic image frame may include, but is not limited to, at least one of the following:
1) determining the coordinate range of the first central view area according to motion data detected by a sensor;
2) acquiring the play mode of the panoramic media file, and determining, according to the play mode and the coordinate range of the first central view area, the coordinate range of a second central view area on the multiple view areas after a predetermined time period.
Optionally, in this embodiment, in manner 1) above, after the coordinates of the first central view area are determined according to the motion data detected by the sensor, those coordinates may be, but are not limited to being, used to quickly obtain the target view area of the first central view area among the multiple view areas, where the target view area includes at least one view area that overlaps the coordinate range of the first central view area. The picture of the first central view area is then extracted from the target view area. In other words, in this embodiment the motion data detected by the sensor on the terminal can be used to quickly obtain the coordinates of the first central view area at the current time, and those coordinates are used to determine the target view area occupied by the first central view area, so that the picture in the first central view area can be quickly extracted from the target view area and pushed, which improves the efficiency of pushing the panoramic media file. For example, as shown in FIG. 4, the target view area in which the first central view area is located, obtained from the coordinates of the first central view area, includes view area A, view area B, view area E, and view area F (shaded in FIG. 4). The picture of the first central view area is then extracted from this target view area. This also makes it convenient to encode and push the target view area in which the first central view area is located at a higher resolution than the other view areas.
Optionally, in this embodiment, in manner 2) above, the coordinates of the second central view area may be, but are not limited to being, predicted. That is, the coordinates of the second central view area after a predetermined time period t are predicted according to the play mode of the panoramic media file and the coordinates of the first central view area. The picture that will be played after the predetermined time period t is thus pushed in advance by prediction, which overcomes the picture lag during playback caused by network communication delay and thereby improves push efficiency.
Optionally, in this embodiment, encoding the panoramic image frame according to the central view area may include, but is not limited to: encoding different view areas at different resolution levels, for example with the resolution level of the view area in which the central view area is located higher than the resolution level of the other view areas of the panoramic image frame. It should be noted that, in the encoding process provided in this embodiment, encoding may be, but is not limited to being, performed separately for each view area to obtain the bitstream to be pushed, so that different view areas are encoded at their corresponding resolution levels, which saves bandwidth and reduces transmission overhead.
Optionally, in this embodiment, the bitstream of an encoded view area may be, but is not limited to being, sliced by time. For example, the server responds to each play request by always pushing one time slice of the panoramic media file at a time. A view area may be sliced using, but not limited to, any of the classic Moving Picture Experts Group (MPEG) video segmentation techniques, and the stream is then served according to an adaptive bitstream push strategy.
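To make the idea of time slicing concrete, the following Python sketch groups the encoded frames of one view area into fixed-duration slices. It is only an illustration under stated assumptions: the source does not specify a slice length or data layout, so the (timestamp, payload) frame representation, the slice_seconds parameter, and the function name are hypothetical, and plain timestamp bucketing stands in here for an actual MPEG segmentation technique.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of one encoded frame of a view area:
# (presentation timestamp in seconds, encoded bytes).
Frame = Tuple[float, bytes]


def slice_by_time(frames: List[Frame], slice_seconds: float) -> Dict[int, List[bytes]]:
    """Group encoded frames into time slices of slice_seconds each.

    Slice i holds the frames with timestamps in
    [i * slice_seconds, (i + 1) * slice_seconds).
    """
    if slice_seconds <= 0:
        raise ValueError("slice_seconds must be positive")
    slices: Dict[int, List[bytes]] = {}
    for timestamp, payload in frames:
        index = int(timestamp // slice_seconds)
        slices.setdefault(index, []).append(payload)
    return slices
```

A server following the description above could then answer each play request with the payloads of a single slice index, one slice per request.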
With the embodiment provided in this application, the first central view area is obtained from the multiple view areas into which the panoramic image frame is divided, so the first central view area is located accurately and the picture obtained for it is accurate, which overcomes the problem in the related art that only a heavily compressed, distorted picture can be obtained. Further, using the multiple view areas to obtain the first central view area quickly greatly improves acquisition efficiency, and thereby improves the efficiency of pushing panoramic media files.
As an optional solution, determining the first central view area on the multiple view areas of the panoramic image frame includes:
S1: determine the coordinate range of the first central view area according to motion data detected by a sensor;
S2: use the coordinate range of the first central view area to determine a target view area from the multiple view areas, where the target view area includes at least one view area that overlaps the coordinate range of the first central view area; and
S3: extract the picture corresponding to the first central view area from the target view area.
Optionally, in this embodiment, using the coordinate range of the first central view area to determine the target view area from the multiple view areas includes: obtaining the identifiers of the view areas that fall within the coordinate range of the first central view area; and stitching together the view areas indicated by those identifiers to obtain the target view area.
Optionally, in this embodiment, the motion data detected by the sensor may include, but is not limited to, at least one of the following: the angle of head rotation and eyeball rotation parameters. The above is only an example; the motion data may also include other motion data used to detect the viewer's field of view, and this embodiment imposes no limitation in this respect.
This is explained with the following example. The coordinate range of the first central view area is determined according to the motion data detected by the sensor, and that coordinate range is used to determine the identifiers of the view areas that make up the target view area overlapping the first central view area. In the example shown in FIG. 4, the target view area includes view area A, view area B, view area E, and view area F (shaded in FIG. 4). As shown in FIG. 4, the first central view area lies mainly within view area F and also covers part of the adjacent view areas A, B, and E; the picture content of the first central view area is obtained by stitching together the pictures of the four view areas A, B, E, and F.
When the picture in the first central view area is played, the decoded pictures of the four view areas A, B, E, and F can be unfolded and stitched according to the panoramic sphere image projection method to obtain the target view area, and the picture in the first central view area is then quickly extracted from the target view area according to the relative position of the first central view area.
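As an illustration of how the view areas overlapped by the first central view area might be identified, the following Python sketch assumes the grid of 6 by 3 view areas of 60 degrees each from the earlier example, and identifies each view area by its (x, y) indices rather than by the letters used in FIG. 4. The function name and the representation of a coordinate range as a (start, end) pair in degrees are assumptions made for illustration; the horizontal range is allowed to wrap past 360 degrees.

```python
from typing import List, Tuple

STEP_DEG = 60.0          # assumed size of one view area in each direction
H_PARTS, V_PARTS = 6, 3  # assumed 6 x 3 grid of view areas


def overlapping_view_areas(
    range_x: Tuple[float, float], range_y: Tuple[float, float]
) -> List[Tuple[int, int]]:
    """Return the 1-based (x, y) indices of view areas overlapping the range.

    range_x is the (start, end) horizontal range in degrees, with
    0 < end - start <= 360; it may wrap around 360.
    range_y is the (start, end) vertical range in degrees within [0, 180].
    """
    start_x, end_x = range_x
    width = end_x - start_x
    if not 0 < width <= 360:
        raise ValueError("horizontal width must be in (0, 360]")
    a = start_x % 360.0  # normalized start in [0, 360)
    b = a + width        # end, possibly beyond 360 when the range wraps
    y0, y1 = range_y

    result: List[Tuple[int, int]] = []
    for y in range(1, V_PARTS + 1):
        lo_y, hi_y = STEP_DEG * (y - 1), STEP_DEG * y
        if not max(lo_y, y0) < min(hi_y, y1):
            continue  # no vertical overlap with this row
        for x in range(1, H_PARTS + 1):
            lo_x, hi_x = STEP_DEG * (x - 1), STEP_DEG * x
            # Test the cell as-is and shifted by 360 to handle wrap-around.
            if any(max(lo_x + k, a) < min(hi_x + k, b) for k in (0.0, 360.0)):
                result.append((x, y))
    return result
```

For instance, a central view area spanning 50 to 110 degrees horizontally and 40 to 100 degrees vertically overlaps four view areas, which mirrors the four-area situation described for FIG. 4.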
需要说明的是,在本实施例中,在第一中心视图区所占区域的大小等于每一个视图区所占区域的大小的情况下,上述第一中心视图区可以包含在多个视图区构成的目标视图区中,也可以与其中一个视图区严格重合,从而实现直接获取该第一中心视图区中的画面。It should be noted that, in this embodiment, in a case where the size of the area occupied by the first central view area is equal to the size of the area occupied by each view area, the first central view area may be included in multiple view areas. In the target view area, one of the view areas can also be strictly coincident, thereby directly obtaining the picture in the first central view area.
With the embodiment provided in this application, the target view area in which the first central view area is located is obtained from the multiple view areas according to the coordinates of the first central view area detected by the sensor, so that the picture of the first central view area can be quickly extracted from the target view area and pushed for playback, which improves the efficiency of pushing the panoramic media file.
As an optional solution, encoding the panoramic image frame according to the central view area includes:
S1: encoding the target view area in which the first central view area is located at a first resolution, and encoding the view areas of the panoramic image frame other than the target view area at a second resolution, where the first resolution is higher than the second resolution.
Optionally, in this embodiment, each divided view area may be, but is not limited to being, encoded at multiple scales to obtain code streams at multiple resolution levels. Among the multiple view areas of the panoramic image frame, the resolution of the central view area (identified by a resolution level) may be, but is not limited to being, higher than the resolution of the other view areas. The picture of the central view area, which is the focus of attention, can thus be played clearly and realistically, while the pictures of the other view areas are played blurred, which reduces transmission overhead and saves bandwidth.
With the embodiment provided in this application, different view areas of the panoramic image frame are encoded at different resolutions, so that the picture in the central view area is played prominently and clearly while the pictures of the other view areas are blurred, thereby saving bandwidth.
As an optional solution, obtaining the central view area on the multiple view areas of each panoramic image frame includes:
S1: obtaining the play mode of the panoramic media file;
S2: determining, according to the play mode of the panoramic media file and the coordinate range of the first central view area, the coordinate range of a second central view area on the multiple view areas after a predetermined time period.
Optionally, in this embodiment, the coordinate range of the second central view area on the multiple view areas after the predetermined time period may be calculated according to the following formula:
Figure PCTCN2017092562-appb-000002
where (x_0, y_0) denotes the coordinates of the first central view area; (x_t, y_t) denotes the coordinates of the second central view area after the predetermined time period t; v_mod denotes the play mode; v_mod_x(t) denotes the offset angle in the x direction after the predetermined time period t in the play mode, the x direction being horizontal; and v_mod_y(t) denotes the offset angle in the y direction after the predetermined time period t in the play mode, the y direction being vertical.
Optionally, in this embodiment, the play mode may include, but is not limited to, at least one of the following: a first play mode for playing the picture in the first central view area, a second play mode for searching for a third central view area, and a third play mode for playing the picture in the third central view area. This is only an example and is not limited in this embodiment.
It should be noted that, in this embodiment, the play mode may, but is not limited to, affect the offset angle after the predetermined time period t. For example, in the first play mode (also called the main viewing mode), one viewing angle is held for a long time, that is, the view stays in the first central view area for a long time. The offset angle after the predetermined time period t can then be predicted to be 0, so the coordinates of the second central view area after the predetermined time period are predicted to be the same as those of the first central view area, that is, x_t = x_0 and y_t = y_0. For the second play mode, the search motion may be uniform, in which case the offset angle may be the product of the moving speed v and the moving time t; it may also be non-uniform, in which case the offset angle in this play mode is obtained by a corresponding calculation. This is not limited in this embodiment.
With reference to the above formula: suppose the coordinates (x_0, y_0) of the first central view area have been obtained and the current play mode is v_mod. The offset angles relative to the current position after the predetermined time period t are first obtained in the x and y directions according to the play mode v_mod: v_mod_x(t) and v_mod_y(t). The above formula is then used to predict the coordinates (x_t, y_t) of the second central view area after the predetermined time period t.
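The prediction step can be sketched as follows. The formula itself is not reproduced in this text, so the sketch assumes the additive form suggested by the definitions, namely that the predicted coordinates are the current coordinates plus the mode-dependent offsets. It covers only the two cases described above: zero offset in the main viewing mode, and speed times time for a uniform search motion. The function names, the mode labels "ma" and "ms" used as strings, and the `speed` parameter are hypothetical.

```python
def mode_offset(mode, t, speed=(0.0, 0.0)):
    """Return the assumed (x, y) offset angles after time t for a play mode.

    "ma": the view is held, so the offset is zero.
    "ms": uniform search motion, so the offset is speed * t.
    Other modes are left unspecified here, as in the description.
    """
    if mode == "ma":
        return 0.0, 0.0
    if mode == "ms":
        return speed[0] * t, speed[1] * t
    raise ValueError(f"no offset model for play mode {mode!r}")


def predict_center(x0, y0, mode, t, speed=(0.0, 0.0)):
    """Predict (x_t, y_t) as (x_0, y_0) plus the mode offset (assumed form)."""
    dx, dy = mode_offset(mode, t, speed)
    return x0 + dx, y0 + dy
```

A non-uniform search motion would replace the `speed * t` term with whatever offset model is chosen; angle wrap-around is deliberately left out of this sketch.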
With the embodiment provided in this application, the coordinates of the second central view area on the multiple view areas after the predetermined time period are determined according to the play mode of the panoramic media file and the coordinates of the first central view area, so that the field of view that will be watched after the predetermined time period is accurately predicted and the picture of the second central view area to be pushed is obtained in advance. This also avoids the playback delay caused by network transmission delay.
As an optional solution, obtaining the play mode of the panoramic media file includes:
1) when the variation range of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, determining that the play mode is the first play mode, where the first play mode is used to play the picture in the first central view area;
2) when the variation range of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, determining that the play mode is the second play mode, where the second play mode is used to search for the third central view area;
3) when the variation range of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode is the second play mode, determining that the play mode is the third play mode, where the third play mode is used to play the picture in the third central view area.
This is described with the following example. Suppose the first play mode is denoted as the micro-swing main viewing mode (ma), the second play mode as the new-content search mode (ms), and the third play mode as the new-content focus mode (mf). As shown in FIG. 5, the play modes may include:
1) Micro-swing main viewing mode (ma): the view stays on the picture played in the first central view area, and the viewing hardware (such as a glasses terminal) is relatively stationary or swings slightly (that is, the swing amplitude within the predetermined period is less than the predetermined threshold) without actually leaving the first central view area;
2) New-content search mode (ms): the view leaves the micro-swing main viewing mode and moves quickly to search for new content in a new field of view (such as the third central view area); the viewing hardware (such as a glasses terminal) moves quickly and deviates from its original trajectory. That is, the swing amplitude within the predetermined period is greater than or equal to the predetermined threshold.
3) New-content focus mode (mf): the view may stay briefly in the third central view area and then leave and re-enter the new-content search mode, or it may actually enter the micro-swing main viewing mode and remain in the third central view area. That is, the motion data detected by the sensor indicates that the swing amplitude within the predetermined period is less than the predetermined threshold, and the previous play mode is the second play mode.
In this embodiment, the motion mode is determined from the movement trajectory within a predetermined period (such as a time window T). If there is only short-distance back-and-forth swinging, the mode is the micro-swing main viewing mode (ma). If there is fast movement over a large distance, the mode is the new-content search mode (ms). If the previous mode was the new-content search mode and the view has been relatively stationary or swinging slightly over the past predetermined period (such as the time window T), the mode is the new-content focus mode (mf).
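The three conditions above can be sketched as a small classifier. This is an illustrative reading rather than the implementation of the application: it assumes the motion data within the time window T is a sequence of scalar angles (for example head yaw in degrees) and takes the variation range to be the difference between the largest and smallest sample. Because conditions 1) and 3) both require a range below the threshold, the sketch checks the previous mode first to tell them apart. The function name and parameters are hypothetical.

```python
def classify_play_mode(samples, threshold, previous_mode):
    """Classify the play mode from motion samples within one time window.

    samples: angles observed during the window (assumed scalar, in degrees).
    threshold: the predetermined threshold on the variation range.
    previous_mode: "ma", "ms", or "mf" from the preceding window.
    """
    variation = max(samples) - min(samples)
    if variation >= threshold:
        return "ms"  # large, fast movement: searching for new content
    if previous_mode == "ms":
        return "mf"  # settled right after a search: focusing on new content
    return "ma"      # small swing, no preceding search: main viewing mode
```

With this ordering, a view that remains still after a window classified as mf falls back to ma, which matches the description of mf as a state that either returns to searching or settles into the main viewing mode.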
It should be noted that the third central view area may be, but is not limited to being, the second central view area, or may be, but is not limited to being, a view area among the multiple view areas other than the first central view area and the second central view area.
With the embodiment provided in this application, the play mode of the panoramic media file is obtained and used to predict the offset angle of the field of view within the predetermined time period, so that the coordinates of the second central view area after the predetermined time period are determined from the coordinates of the first central view area and the offset angle.
As an optional solution, encoding the panoramic image frame according to the central view area includes:
S10: repeating the following steps until the multiple view areas of the panoramic image frame after the predetermined time period have been traversed:
S12: obtaining, from the multiple view areas, the multiple sub-view areas into which the current view area is divided;
S14: obtaining reference values of the multiple sub-view areas, where the reference value of a sub-view area is the larger of the saliency level indicated by the saliency feature of the sub-view area and the probability that the second central view area falls in the sub-view area;
S16: determining a third resolution of the current view area according to the maximum of the reference values of the multiple sub-view areas;
S18: encoding the current view area at the third resolution.
Optionally, in this embodiment, obtaining the reference values of the multiple sub-view areas includes repeating the following steps until the multiple sub-view areas have been traversed: obtaining the current sub-view area from the multiple sub-view areas; obtaining the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls in the current sub-view area; and using the larger of the saliency level and the probability as the reference value of the current sub-view area.
Optionally, in this embodiment, each of the multiple view areas may be, but is not limited to being, divided into four sub-view areas of equal size. As shown in FIG. 6, the current view area includes sub-view area a, sub-view area b, sub-view area c, and sub-view area d.
Optionally, in this embodiment, obtaining the reference value of the current sub-view area may include, but is not limited to, taking the larger of the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls in the current sub-view area.
It should be noted that, in this embodiment, the saliency feature may be, but is not limited to being, used to represent a distribution of visually salient regions. Regions with a high probability of being watched, such as the center of a stage, may be configured as visually salient regions with a high saliency level, whereas regions with a low probability of being watched, such as dark regions, the auditorium, or the sky, may be configured as visually salient regions with a low saliency level. The saliency level may be, but is not limited to being, denoted Sa(t, θ_x, θ_y), where θ_x ∈ [0, 360°) and θ_y ∈ [-90°, 90°], and may be, but is not limited to being, computed a priori with a classical visual saliency detection algorithm. The saliency level of each sub-view area can be derived from Sa(t, θ_x, θ_y); for example, RSa(t, sx, sy) denotes the saliency level of sub-view area (sx, sy) after the predetermined time period t. One optional calculation is:
Figure PCTCN2017092562-appb-000003
where the sub-view area number sx in the x direction ranges over sx ∈ [1, 12], and the sub-view area number sy in the y direction ranges over sy ∈ [1, 6].
It should be noted that, in this embodiment, taking the current view area as an example and denoting the probability of falling in a sub-view area by Pi(t, sx, sy), the reference values of the four sub-view areas of the current view area can be expressed as follows:
The reference value of sub-view area a is:
aPi(t,x,y)=max(RSa(t,2x-1,2y-1),Pi(t,2x-1,2y-1))
The reference value of sub-view area b is:
bPi(t,x,y)=max(RSa(t,2x,2y-1),Pi(t,2x,2y-1))
The reference value of sub-view area c is:
cPi(t,x,y)=max(RSa(t,2x-1,2y),Pi(t,2x-1,2y))
The reference value of sub-view area d is:
dPi(t,x,y)=max(RSa(t,2x,2y),Pi(t,2x,2y))
Further, the resolution of the current view area is determined from the maximum mPi(t, x, y) of the four reference values, where:
mPi(t,x,y)=max(aPi(t,x,y),bPi(t,x,y),cPi(t,x,y),dPi(t,x,y))  (4)
That is, the resolution of the current view area is updated to match the resolution of the sub-view area with the largest reference value, to ensure high definition for the content being watched.
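The reference values and formula (4) can be sketched as follows for a fixed time t. The sketch assumes that the saliency levels RSa and the probabilities Pi are already available as dictionaries keyed by the sub-view area numbers (sx, sy); how they are produced is outside the sketch. The function name and the dictionary representation are hypothetical.

```python
def view_area_max_reference(rsa, pi, x, y):
    """Return mPi for view area (x, y) at one time instant.

    rsa, pi: dicts mapping (sx, sy) to the saliency level and to the
    probability of the second central view area falling there (assumed
    to be precomputed, values in [0, 1]).
    """
    # Sub-view areas a, b, c, d of view area (x, y), in that order.
    sub_view_areas = [
        (2 * x - 1, 2 * y - 1),
        (2 * x, 2 * y - 1),
        (2 * x - 1, 2 * y),
        (2 * x, 2 * y),
    ]
    # Per sub-view area: the larger of saliency and probability;
    # per view area: the largest of those four reference values.
    return max(max(rsa[s], pi[s]) for s in sub_view_areas)
```

Taking the maximum twice means a single salient or likely-to-be-watched sub-view area is enough to raise the resolution of the whole view area that contains it.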
With the embodiment provided in this application, the maximum of the reference values of the multiple sub-view areas of a view area is obtained and the resolution of the view area is determined from that maximum, so that different view areas are encoded at different resolutions after the predetermined time period, thereby saving bandwidth. In addition, the sub-view area most likely to be pushed after the predetermined time period t is predicted from the saliency level indicated by the saliency feature and the probability that the second central view area falls there, and the resolution of the other sub-view areas in the same view area is raised to that highest resolution, to ensure playback clarity for the content being watched.
As an optional solution, determining the third resolution of the current view area according to the maximum of the reference values of the multiple sub-view areas includes:
S1: calculating the resolution level of the third resolution of the current view area by the following formula:
S(t,x,y)=1+(n-1)*mPi(t,x,y)*Qnet  (5)
where (x, y) are the coordinates of the current view area; S(t, x, y) denotes the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t; mPi(t, x, y) denotes the maximum of the reference values of the multiple sub-view areas of the current view area after the predetermined time period t; Qnet denotes the current network bandwidth level; and n denotes the number of resolution levels, with Qnet ∈ [0, 1] and S(t, x, y) ∈ {1, 2, ..., n};
S2: determining the third resolution according to the resolution level of the third resolution.
It should be noted that Qnet denotes the current network bandwidth level: the higher the level, the more the high-quality version of the content is favored for pushing, and the lower the level, the more the low-resolution version is favored, which keeps viewing smooth. In addition, S(t, x, y) denotes the resolution level of the third resolution: the higher the level, the higher the resolution version that is pushed, up to the highest resolution version n, and conversely down to the lowest resolution version 1.
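Formula (5) can be sketched as follows. The formula yields a real number, while the resolution level is stated to lie in {1, 2, ..., n}; the text does not say how the value is mapped to an integer, so rounding to the nearest level and clamping to the valid range are assumptions of this sketch. The function name is hypothetical.

```python
def resolution_level(m_pi, q_net, n):
    """Return the resolution level S = 1 + (n - 1) * mPi * Qnet as an integer.

    m_pi: maximum reference value of the view area, assumed in [0, 1].
    q_net: current network bandwidth level in [0, 1].
    n: number of resolution levels (level 1 is lowest, level n is highest).
    Rounding and clamping are assumptions; the source gives only the formula.
    """
    raw = 1 + (n - 1) * m_pi * q_net
    return min(n, max(1, round(raw)))
```

Under this reading, a view area reaches the highest level n only when both its reference value and the bandwidth level are at their maximum, and falls back to level 1 when either is zero.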
With the embodiment provided in this application, the pictures in the multiple view areas are encoded at different resolutions, so that the clearest picture is seen in the area occupied by the central view area while a relatively blurred picture is seen in the areas of the other view areas. The panoramic image frame is thus played with differentiated quality, which reduces transmission overhead, saves bandwidth, and improves push efficiency.
As an optional solution, the probability that the second central view area falls in the current sub-view area is obtained as:
P(t,sx,sy)=exp(-((sx-x_t)^2+(sy-y_t)^2))  (6)
where (sx, sy) denotes the coordinates of the current sub-view area, P(t, sx, sy) denotes the probability that the second central view area falls in the current sub-view area after the predetermined time period t, and (x_t, y_t) denotes the coordinates of the second central view area after the predetermined time period t.
It should be noted that the above formula is a decaying exponential function with base e: the closer the current sub-view area is to the second central view area, the larger the function value and the higher the corresponding probability.
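Formula (6) can be sketched directly. The sketch assumes that (x_t, y_t) is expressed in the same units as the sub-view area coordinates (sx, sy); the text does not spell out the conversion between angular coordinates and sub-view area numbers. The function name is hypothetical.

```python
import math


def center_probability(sx, sy, xt, yt):
    """Return exp(-((sx - x_t)^2 + (sy - y_t)^2)) as in formula (6).

    (sx, sy): coordinates of the current sub-view area.
    (xt, yt): predicted coordinates of the second central view area,
    assumed to be in the same units as (sx, sy).
    """
    return math.exp(-((sx - xt) ** 2 + (sy - yt) ** 2))
```

The value is 1 when the sub-view area coincides with the predicted center and decreases monotonically with distance from it.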
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
From the description of the above implementations, those skilled in the art can clearly understand that the methods according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
According to an embodiment of the present invention, a panoramic media file pushing apparatus for implementing the above panoramic media file pushing method is also provided, applied to a terminal having a display device. As shown in FIG. 7, the apparatus includes:
a first obtaining unit 702, configured to obtain a panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame;
a dividing unit 704, configured to divide the panoramic image frame into multiple view areas according to a predetermined condition;
a second obtaining unit 706, configured to determine a first central view area on the multiple view areas of the panoramic image frame, where the area occupied by the first central view area is greater than or equal to the area occupied by one view area;
an encoding unit 708, configured to encode the panoramic image frame according to the determined first central view area;
a pushing unit 710, configured to push the encoded panoramic image frame.
Optionally, in this embodiment, the panoramic media file pushing apparatus may be, but is not limited to being, applied in a virtual reality (VR) process, where virtual reality may be, but is not limited to being, a technology that combines a computer graphics system with various interface devices, such as control devices, to provide a sense of immersion in an interactive three-dimensional environment generated on a computer. For example, when the apparatus is applied in VR glasses, the panoramic media file is divided into view areas and encoded so that a more accurate and clearer panoramic media file can be provided quickly during playback. The panoramic media file may include, but is not limited to, at least one of the following: a panoramic image, a panoramic video, and the like. This is only an example and is not limited in this embodiment.
It should be noted that, after at least one panoramic image frame of the panoramic media file to be pushed is obtained, the panoramic image frame is divided into multiple view areas according to a predetermined condition, a first central view area is obtained on the multiple view areas, and the panoramic image frame is encoded according to the first central view area so that the encoded panoramic image frame can be pushed. In other words, the first central view area is obtained by using the multiple view areas into which the panoramic image frame is divided, so the central view area is located accurately and the picture obtained for it is accurate, which overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Furthermore, using the multiple view areas to obtain the first central view area quickly greatly improves acquisition efficiency and thereby improves the efficiency of pushing the panoramic media file.
Optionally, in this embodiment, dividing each panoramic image frame according to the predetermined condition may include, but is not limited to: dividing the panoramic image frame, which is uniformly distributed over the panoramic sphere, into multiple rectangular view areas of the same size according to a preconfigured specification.
For example, with the center of the panoramic sphere as the origin, the distance from the sphere surface to the center (origin) is the radius of the panoramic sphere, so a polar coordinate system can be used to represent each position on the panoramic image frame. For example, (θ_x, θ_y) denotes a position on the panoramic image frame, where θ_x denotes the angle measured counterclockwise in the horizontal direction, with zero directly ahead horizontally, and θ_y denotes the angle measured counterclockwise in the direction perpendicular to the horizontal, with zero directly above. Further, as shown in FIG. 3, one panoramic image frame is divided into multiple view areas, for example view area A to view area P. That is, with the center of the panoramic sphere as the origin, the 360 degrees of counterclockwise rotation in the horizontal direction are divided equally into 6 parts of 60 degrees each, and the 180 degrees of counterclockwise rotation in the direction perpendicular to the horizontal are divided equally into 3 parts of 60 degrees each. The angle range of each view area may then be defined as:
Figure PCTCN2017092562-appb-000004
where x and y are the numbers of the 18 view areas in the x direction and the y direction respectively, with x ∈ [1, 6] and y ∈ [1, 3].
In addition, in this embodiment, the first central view area obtained on the multiple view areas may be, but is not limited to being, the view area formed by the viewer's field of view. It should be noted that the area of the view areas into which the panoramic image frame is divided may be, but is not limited to being, determined according to the area of the first central view area. For example, the area occupied by each view area of the panoramic image frame may be, but is not limited to being, less than or equal to the area occupied by the central view area and greater than one quarter of the area occupied by the central view area. This is only an example and is not limited in this embodiment.
Optionally, in this embodiment, obtaining the central view area on the multiple view areas of each panoramic image frame may include, but is not limited to, at least one of the following:
1) determining the coordinate range of the first central view area according to the motion data detected by the sensor;
2) obtaining the play mode of the panoramic media file, and determining, according to the play mode of the panoramic media file and the coordinate range of the first central view area, the coordinate range of the second central view area on the multiple view areas after the predetermined time period.
可选地,在本实施例中,在上述方式1)中可以但不限于根据传感器检测到的运动数据确定第一中心视图区的坐标之后,利用该第一中心视图区的坐 标快速获取第一中心视图区在多个视图区中的目标视图区,其中,目标视图区包括与第一中心视图区的坐标范围重叠的至少一个视图区。进一步,从上述目标视图区中提取第一中心视图区的画面。也就是说,在本实施例中,可以利用终端上的传感器检测到的运动数据快速获取当前时间第一中心视图区的坐标,进而利用该坐标确定第一中心视图区所占的目标视图区,以实现快速从目标视图区中提取并推送第一中心视图区中的画面,进而达到提高全景媒体文件的推送效率的目的。例如,如图4所示,根据第一中心视图区的坐标获取到第一中心视图区所在目标视图区包括:A视图区、B视图区、E视图区及F视图区(如图4阴影所示)。进一步,从上述目标视图区中提取第一中心视图区的画面,此外,还可以便于对第一中心视图区所在目标视图区按照高于其他视图区的分辨率进行编码推送。Optionally, in this embodiment, in the foregoing manner 1), after the coordinates of the first central view area are determined according to the motion data detected by the sensor, the first central view area is used. The target quickly acquires a target view area of the first central view area in the plurality of view areas, wherein the target view area includes at least one view area that overlaps the coordinate range of the first central view area. Further, the picture of the first central view area is extracted from the target view area. That is, in the embodiment, the motion data detected by the sensor on the terminal can be used to quickly acquire the coordinates of the first central view area at the current time, and then use the coordinates to determine the target view area occupied by the first central view area. In order to quickly extract and push the picture in the first central view area from the target view area, the purpose of improving the push efficiency of the panoramic media file is achieved. For example, as shown in FIG. 4, the target view area in which the first central view area is obtained according to the coordinates of the first central view area includes: an A view area, a B view area, an E view area, and an F view area (as shown in FIG. Show). Further, the picture of the first central view area is extracted from the target view area, and further, the target view area where the first central view area is located may be conveniently coded and pushed according to a resolution higher than other view areas.
Optionally, in this embodiment, manner 2) may include, but is not limited to, predicting the coordinates of the second central view area; that is, the coordinates of the second central view area after the predetermined time period t are predicted according to the play mode of the panoramic media file and the coordinates of the first central view area. The picture that will be played after the predetermined time period t is thus pushed in advance on the basis of the prediction, which overcomes the picture delay during playback caused by network communication latency and improves push efficiency.
Optionally, in this embodiment, encoding the panoramic image frame according to the central view area may include, but is not limited to, encoding different view areas at different resolution levels; for example, the resolution level of the view area in which the central view area is located is higher than that of the other view areas among the plurality of view areas of the panoramic image frame. It should be noted that, in the encoding process provided in this embodiment, encoding may be, but is not limited to being, performed separately for each view area to obtain the code stream to be pushed, so that different view areas are encoded at their respective resolution levels, which saves bandwidth and reduces transmission overhead.
Optionally, in this embodiment, the code stream of an encoded view area may be, but is not limited to being, sliced by time. For example, the server responds to each play request by pushing one time slice of the panoramic media file. A view area may be sliced using, but not limited to, any of the classic Moving Picture Experts Group (MPEG) video segmentation techniques, and the stream is then served according to an adaptive code stream push strategy.
Through the embodiments provided by the present application, the first central view area is obtained by using the plurality of view areas into which the panoramic image frame is divided. The first central view area is thereby accurately located, which ensures the accuracy of the obtained picture of the first central view area and overcomes the problem in the related art that only a highly compressed, distorted picture can be obtained. Further, using the plurality of view areas to quickly obtain the first central view area greatly improves acquisition efficiency and thus the push efficiency of the panoramic media file.
As an optional solution, the second acquiring unit includes:
1) a first determining module, configured to determine the coordinate range of the first central view area according to the motion data detected by the sensor;
2) a first acquiring module, configured to acquire the target view area from the plurality of view areas by using the coordinate range of the first central view area, where the target view area includes at least one view area that overlaps the coordinate range of the first central view area;
3) an extraction module, configured to extract the picture corresponding to the first central view area from the target view area.
Optionally, in this embodiment, the first acquiring module includes: (1) an acquiring submodule, configured to acquire the identifiers of the view areas in which the coordinate range of the first central view area is located; and (2) a splicing submodule, configured to splice the view areas indicated by those identifiers to obtain the target view area.
Optionally, in this embodiment, the motion data detected by the sensor may include, but is not limited to, at least one of the following: the angle of head rotation and eye rotation parameters. This is only an example; the motion data may also include other motion data used to detect the viewer's field of view, which is not limited in this embodiment.
This is described with reference to the following example. The coordinates of the first central view area are determined according to the motion data detected by the sensor, and those coordinates are used to obtain the identifiers of the view areas that make up the target view area in which the first central view area is located: the A view area, the B view area, the E view area, and the F view area (shaded in FIG. 4). As shown in FIG. 4, the first central view area lies mainly in the F view area and also covers part of the adjacent A, B, and E view areas, so the picture content in the first central view area is obtained by splicing the pictures of the four areas A, B, E, and F.
When the picture in the first central view area is played, the decoded pictures of the four view areas A, B, E, and F may be unfolded and spliced according to a panoramic sphere image projection method to obtain the target view area, and the picture in the first central view area is then quickly extracted from the target view area according to the relative position of the first central view area.
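The lookup of the target view area from a coordinate range can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: it assumes a frame spanning 360° by 180° that is divided into a uniform grid of 4 columns and 3 rows labelled row by row (A to D, E to H, I to L), with y measured downward from the top of the frame and no wrap-around at the 0°/360° seam. The function name `target_view_areas` and all of its parameters are hypothetical.

```python
from string import ascii_uppercase


def target_view_areas(x_range, y_range, cols=4, rows=3, width=360.0, height=180.0):
    """Return the labels of the grid tiles that overlap a rectangular range.

    x_range and y_range are (min, max) pairs in degrees. The grid layout
    and the row-major letter labels are illustrative assumptions.
    """
    tile_w = width / cols
    tile_h = height / rows
    x0, x1 = x_range
    y0, y1 = y_range
    labels = []
    for r in range(rows):
        for c in range(cols):
            tx0, ty0 = c * tile_w, r * tile_h
            # Two axis-aligned rectangles overlap when they overlap on both axes.
            overlaps_x = x0 < tx0 + tile_w and x1 > tx0
            overlaps_y = y0 < ty0 + tile_h and y1 > ty0
            if overlaps_x and overlaps_y:
                labels.append(ascii_uppercase[r * cols + c])
    return labels


# A range that straddles four tiles, similar in spirit to the A/B/E/F case.
print(target_view_areas((45, 135), (30, 90)))
# A range that coincides exactly with one tile.
print(target_view_areas((90, 180), (60, 120)))
```

Strict inequalities are used so that a range that merely touches a tile border does not pull in the neighbouring tile, which keeps the exact-coincidence case down to a single view area.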
It should be noted that, in this embodiment, when the size of the area occupied by the first central view area is equal to the size of the area occupied by each view area, the first central view area may be contained in a target view area formed by multiple view areas, or it may coincide exactly with one view area, in which case the picture in the first central view area is obtained directly.
Through the embodiments provided by the present application, the target view area in which the first central view area is located is acquired from the plurality of view areas according to the coordinates of the first central view area obtained from the sensor, so that the picture of the first central view area is quickly extracted from the target view area and pushed for playback, which improves the push efficiency of the panoramic media file.
As an optional solution, the encoding unit includes:
1) a first encoding module, configured to encode the target view area in which the first central view area is located at a first resolution, and to encode the view areas in the panoramic image frame other than the target view area at a second resolution, where the first resolution is higher than the second resolution.
Optionally, in this embodiment, each divided view area may be, but is not limited to being, encoded at multiple scales to obtain code streams at multiple resolution levels. Among the plurality of view areas in the panoramic image frame, the resolution of the central view area (identified by a resolution level) may be, but is not limited to being, higher than that of the other view areas. The picture of the central view area, which is the focus of attention, can thus be played clearly and faithfully, while the pictures of the other view areas are played at reduced sharpness, which reduces transmission overhead and saves bandwidth.
Through the embodiments provided by the present application, different view areas in the panoramic image frame are encoded at different resolutions, so that the picture in the central view area is played prominently and clearly while the pictures of the other view areas are blurred, which saves bandwidth.
As an optional solution, the second acquiring unit includes:
1) a second acquiring module, configured to acquire the play mode of the panoramic media file;
2) a second determining module, configured to determine, according to the play mode of the panoramic media file and the coordinate range of the first central view area, the coordinate range of the second central view area on the plurality of view areas after the predetermined time period.
Optionally, in this embodiment, the second determining module may calculate the coordinate range of the second central view area on the plurality of view areas after the predetermined time period according to the following formula:
(x_t, y_t) = (x_0 + v_mod_x(t), y_0 + v_mod_y(t))
Here, (x_0, y_0) denotes the coordinates of the first central view area, and (x_t, y_t) denotes the coordinates of the second central view area after the predetermined time period t; v_mod denotes the play mode; v_mod_x(t) denotes the offset angle in the x direction after the predetermined time period t in the play mode, the x direction being horizontal; and v_mod_y(t) denotes the offset angle in the y direction after the predetermined time period t in the play mode, the y direction being vertical.
Optionally, in this embodiment, the play mode may include, but is not limited to, at least one of the following: a first play mode for playing the picture in the first central view area, a second play mode for searching for a third central view area, and a third play mode for playing the picture in the third central view area. This is only an example and is not limited in this embodiment.
It should be noted that, in this embodiment, the play mode may, but does not necessarily, affect the offset angle after the predetermined time period t. For example, in the first play mode (also called the main viewing mode), one viewing angle is held for a long time, that is, the view stays in the first central view area for a long time. The offset angle after the predetermined time period t can then be predicted to be 0, so the coordinates of the second central view area after the predetermined time period are predicted to be the same as those of the first central view area: x_t = x_0, y_t = y_0. For the second play mode, the search motion may be uniform, in which case the offset angle is the product of the moving speed v and the moving time t; it may also be non-uniform, in which case the offset angle in that play mode is obtained by a corresponding calculation. This is not limited in this embodiment.
This is described with reference to the above formula. Suppose the coordinates (x_0, y_0) of the first central view area have been obtained and the current play mode is v_mod. The offset angles relative to the current position after the predetermined time period t are first obtained in the x and y directions according to the play mode v_mod: v_mod_x(t) and v_mod_y(t). The above formula is then used to predict the coordinates (x_t, y_t) of the second central view area after the predetermined time period t.
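The prediction step just described can be sketched in a few lines. This is only an illustration under assumptions: the predicted position is taken to be the current position plus the per-mode offset, the horizontal angle is wrapped into [0, 360) and the vertical angle is clamped to [-90, 90] (the wrapping and clamping are not stated in the text), and the uniform-motion offset is the product of speed and time as mentioned for the second play mode. The names `uniform_offset` and `predict_center` are hypothetical.

```python
def uniform_offset(speed_deg_per_s, t):
    """Offset angle for uniform motion: moving speed multiplied by moving time."""
    return speed_deg_per_s * t


def predict_center(x0, y0, offset_x, offset_y):
    """Predict (x_t, y_t) from (x_0, y_0) and the offsets for the play mode.

    Wrapping x into [0, 360) and clamping y to [-90, 90] are assumptions
    added here so that the result stays inside the panorama.
    """
    x_t = (x0 + offset_x) % 360.0
    y_t = max(-90.0, min(90.0, y0 + offset_y))
    return x_t, y_t


# First play mode: the view is held, so both offsets are 0.
print(predict_center(100.0, 0.0, 0.0, 0.0))
# Second play mode with uniform horizontal motion of 20 degrees per second for 1 s.
print(predict_center(350.0, 10.0, uniform_offset(20.0, 1.0), 0.0))
```

For non-uniform motion, only the way the offsets are computed would change; `predict_center` itself stays the same.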
Through the embodiments provided by the present application, the coordinates of the second central view area on the plurality of view areas after the predetermined time period are determined according to the play mode of the panoramic media file and the coordinates of the first central view area. The field of view that will be of interest after the predetermined time period is thereby accurately predicted, so that the picture of the second central view area to be pushed is obtained in advance and in time, which also avoids the playback delay caused by network transmission latency.
As an optional solution, the second acquiring module includes:
1) a third determining submodule, configured to determine that the play mode is the first play mode when the range of variation of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, where the first play mode is used to play the picture in the first central view area;
2) a fourth determining submodule, configured to determine that the play mode is the second play mode when the range of variation of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, where the second play mode is used to search for the third central view area;
3) a fifth determining submodule, configured to determine that the play mode is the third play mode when the range of variation of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode was the second play mode, where the third play mode is used to play the picture in the third central view area.
This is described with reference to the following example. Suppose the first play mode is denoted the micro-swing main viewing mode (ma), the second play mode is denoted the new content search mode (ms), and the third play mode is denoted the new content focus mode (mf). As shown in FIG. 5, the play modes may include:
1) Micro-swing main viewing mode (ma): the view stays on the picture played in the first central view area, and the viewing hardware (such as a glasses terminal) is relatively stationary or swings slightly (that is, the swing amplitude within the predetermined period is less than the predetermined threshold) without actually leaving the first central view area.
2) New content search mode (ms): the view leaves the micro-swing viewing mode and moves quickly to search for new content in a new field of view (such as the third central view area); the viewing hardware (such as a glasses terminal) moves quickly and deviates from its original trajectory. That is, the swing amplitude within the predetermined period is greater than or equal to the predetermined threshold.
3) New content focus mode (mf): the view may stay briefly in the third central view area and then leave to re-enter the new content search mode, or it may actually enter the micro-swing main viewing mode and remain in the third central view area. That is, the motion data detected by the sensor indicates that the swing amplitude within the predetermined period is less than the predetermined threshold, and the previous play mode was the second play mode.
That is, in this embodiment, the motion mode is determined from the movement trajectory within a predetermined period (for example, a time window T). If there is only a short-distance back-and-forth swing, the mode is the micro-swing main viewing mode (ma). If there is a large, relatively fast movement, the mode is the new content search mode (ms). If the previous mode was the new content search mode and the view has been relatively stationary or swinging slightly within the past predetermined period (for example, the time window T), the mode is the new content focus mode (mf).
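The three-way decision above can be sketched as a small classifier. This is an illustration under assumptions rather than the application's implementation: the motion data in the time window is reduced to a list of angle samples, the "range of variation" is taken as the difference between the largest and smallest sample, and a focus mode followed by another quiet window is treated as a return to the main viewing mode. The function name `classify_play_mode` and its parameters are hypothetical.

```python
MA = "ma"  # micro-swing main viewing mode
MS = "ms"  # new content search mode
MF = "mf"  # new content focus mode


def classify_play_mode(samples, threshold, previous_mode):
    """Classify the play mode from angle samples in one time window.

    samples: angles (degrees) observed within the predetermined period.
    threshold: predetermined threshold on the range of variation.
    previous_mode: the mode determined for the preceding window.
    """
    variation = max(samples) - min(samples)
    if variation >= threshold:
        return MS  # large, fast movement: searching for new content
    if previous_mode == MS:
        return MF  # movement just settled after a search
    return MA      # steady viewing (assumed also after a focus window)


mode = MA
for window in ([10.0, 11.0, 10.5], [10.0, 40.0], [40.0, 41.0], [40.5, 41.0]):
    mode = classify_play_mode(window, threshold=5.0, previous_mode=mode)
    print(mode)
```

Carrying the previous mode forward as explicit state is what lets the same small variation be read as either steady viewing or a fresh focus on new content.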
It should be noted that the third central view area may be, but is not limited to being, the second central view area, or a view area among the plurality of view areas other than the first central view area and the second central view area.
Through the embodiments provided by the present application, the play mode of the panoramic media file is acquired and used to predict the offset angle of the field of view within the predetermined time period, so that the coordinates of the second central view area after the predetermined time period are determined from the coordinates of the first central view area and the offset angle.
As an optional solution, the encoding unit includes:
1) a processing module, configured to repeat the following steps until the plurality of view areas in the panoramic image frame after the predetermined time period have been traversed:
S1, acquiring, for the current view area among the plurality of view areas, the plurality of sub-view areas into which it is divided;
S2, acquiring the reference values of the plurality of sub-view areas, where the reference value of a sub-view area is the larger of the saliency level indicated by the saliency feature of the sub-view area and the probability that the second central view area falls in the sub-view area;
S3, determining a third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas;
S4, encoding the current view area at the third resolution.
Optionally, in this embodiment, the processing module acquires the reference values of the plurality of sub-view areas by repeating the following steps until the plurality of sub-view areas have been traversed: acquiring the current sub-view area from the plurality of sub-view areas; acquiring the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls in the current sub-view area; and taking the larger of the saliency level and the probability as the reference value of the current sub-view area.
Optionally, in this embodiment, each of the plurality of view areas may be, but is not limited to being, divided into four sub-view areas of the same size. As shown in FIG. 6, the current view area includes sub-view area a, sub-view area b, sub-view area c, and sub-view area d.
Optionally, in this embodiment, acquiring the reference value of the current sub-view area may include, but is not limited to, taking the larger of the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls in the current sub-view area.
It should be noted that, in this embodiment, the saliency feature may be, but is not limited to being, used to represent a distribution of visually salient regions. Regions with a high probability of attracting attention, such as the center of a stage, can be configured as visually salient regions with a high saliency level, while regions with a low probability of attracting attention, such as dark areas, the auditorium, or the sky, can be configured as visually salient regions with a low saliency level. The saliency level may be, but is not limited to being, denoted Sa(t, θ_x, θ_y), where θ_x ∈ [0, 360°) and θ_y ∈ [-90°, 90°]; it may be, but is not limited to being, computed a priori with a classic visual saliency detection algorithm. From Sa(t, θ_x, θ_y), the saliency level of each sub-view area can be computed; for example, RSa(t, sx, sy) denotes the saliency level of sub-view area (sx, sy) after the predetermined time period t. As an optional calculation method:
Figure PCTCN2017092562-appb-000006
Here, the x-direction index sx of a sub-view area ranges over sx ∈ [1, 12], and the y-direction index sy of a sub-view area ranges over sy ∈ [1, 6].
It should be noted that, in this embodiment, taking the current view area as an example and denoting by Pi(t, sx, sy) the probability of falling in a sub-view area, the reference values of the four sub-view areas included in the current view area can be expressed as follows:
The reference value of sub-view area a is:
aPi(t, x, y) = max(RSa(t, 2x-1, 2y-1), Pi(t, 2x-1, 2y-1))
The reference value of sub-view area b is:
bPi(t, x, y) = max(RSa(t, 2x, 2y-1), Pi(t, 2x, 2y-1))
The reference value of sub-view area c is:
cPi(t, x, y) = max(RSa(t, 2x-1, 2y), Pi(t, 2x-1, 2y))
The reference value of sub-view area d is:
dPi(t, x, y) = max(RSa(t, 2x, 2y), Pi(t, 2x, 2y))
Further, the resolution of the current view area is determined according to the maximum mPi(t, x, y) of the four reference values above, where:
mPi(t, x, y) = max(aPi(t, x, y), bPi(t, x, y), cPi(t, x, y), dPi(t, x, y))  (10)
That is, the resolution of the current view area is updated to the resolution of the sub-view area with the largest reference value, which ensures high definition for the content of interest.
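The reference values of the four sub-view areas and their maximum can be sketched as follows. This is a minimal illustration under assumptions: the saliency levels RSa and the probabilities Pi for one time instant are supplied as dictionaries keyed by the sub-view index (sx, sy), view areas and sub-view areas are indexed from 1, and each view area (x, y) owns the 2 by 2 block of sub-view areas given by the index pattern in the formulas above. The function names `sub_view_references` and `view_reference_max` are hypothetical.

```python
def sub_view_references(rsa, pi, x, y):
    """Return the reference values of sub-view areas a, b, c, d of view area (x, y).

    rsa and pi map a sub-view index (sx, sy) to a saliency level and a
    probability respectively; both containers are illustrative assumptions.
    """
    indices = {
        "a": (2 * x - 1, 2 * y - 1),
        "b": (2 * x, 2 * y - 1),
        "c": (2 * x - 1, 2 * y),
        "d": (2 * x, 2 * y),
    }
    # Each reference value is the larger of saliency level and probability.
    return {name: max(rsa[idx], pi[idx]) for name, idx in indices.items()}


def view_reference_max(rsa, pi, x, y):
    """Return mPi for view area (x, y): the largest of its four reference values."""
    return max(sub_view_references(rsa, pi, x, y).values())


rsa = {(1, 1): 0.2, (2, 1): 0.1, (1, 2): 0.3, (2, 2): 0.0}
pi = {(1, 1): 0.1, (2, 1): 0.9, (1, 2): 0.2, (2, 2): 0.4}
print(sub_view_references(rsa, pi, 1, 1))
print(view_reference_max(rsa, pi, 1, 1))
```

In this toy input, sub-view area b dominates because of its high probability, so the whole view area would be encoded according to b's reference value.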
Through the embodiments provided by the present application, the maximum of the reference values of the plurality of sub-view areas included in a view area is acquired, and the resolution of the view area is determined according to that maximum, so that different view areas are encoded at different resolutions after the predetermined time period, which saves bandwidth. In addition, the sub-view area most likely to be pushed after the predetermined time period t is predicted from the saliency level indicated by the saliency feature and the probability that the second central view area falls in the sub-view area, and the resolution of the other sub-view areas in the same view area is then raised to that highest resolution, which ensures the playback clarity of the content of interest.
As an optional solution, the processing module determines the third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas through the following steps:
S1, calculating the resolution level of the third resolution of the current view area by the following formula:
S(t, x, y) = 1 + (n-1) * mPi(t, x, y) * Qnet  (11)
where (x, y) denotes the coordinates of the current view area; S(t, x, y) denotes the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t; mPi(t, x, y) denotes the maximum of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t; Qnet denotes the current network bandwidth level; and n denotes the number of resolution levels, with Qnet ∈ [0, 1] and S(t, x, y) ∈ {1, 2, ..., n};
S2, determining the third resolution according to the resolution level of the third resolution.
It should be noted that Qnet denotes the current network bandwidth level: the higher the level, the more the push favors high-quality versions of the content; the lower the level, the more it favors low-resolution versions, which keeps playback smooth. In addition, S(t, x, y) denotes the resolution level of the third resolution: the higher the level, the higher the resolution version pushed, up to the highest-resolution version n; conversely, the lowest level corresponds to the lowest-resolution version 1.
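Formula (11) can be tried out with a short sketch. Two points here are assumptions rather than statements from the text: mPi is taken to lie in [0, 1] like Qnet, and the real-valued result is rounded to the nearest integer (halves rounded up) so that the level lands in {1, 2, ..., n}; the text does not say how the value is quantized. The function name `resolution_level` is hypothetical.

```python
import math


def resolution_level(m_pi, q_net, n):
    """Resolution level S = 1 + (n - 1) * mPi * Qnet, quantized to 1..n.

    m_pi: maximum reference value of the view area, assumed in [0, 1].
    q_net: current network bandwidth level in [0, 1].
    n: number of resolution levels (1 is lowest, n is highest).
    """
    if not (0.0 <= m_pi <= 1.0 and 0.0 <= q_net <= 1.0):
        raise ValueError("m_pi and q_net must lie in [0, 1]")
    level = 1 + (n - 1) * m_pi * q_net
    # Rounding to the nearest integer is an assumed quantization rule.
    return min(n, max(1, math.floor(level + 0.5)))


print(resolution_level(1.0, 1.0, 4))  # most relevant area on the best network
print(resolution_level(1.0, 0.5, 5))  # same area when bandwidth is halved
print(resolution_level(0.0, 1.0, 4))  # area that is neither salient nor likely
```

Because the two factors are multiplied, either a low reference value or a poor network is enough to pull a view area down toward level 1.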
Through the embodiments provided by the present application, the pictures in the plurality of view areas are encoded at different resolutions, so that the clearest picture is seen in the area occupied by the central view area and a relatively blurred picture is seen in the other view areas. The panoramic image frame is thus played with differentiated quality, which reduces transmission overhead, saves bandwidth, and improves push efficiency.
As an optional solution, the processing module acquires the probability that the second central view area falls in the current sub-view area by the following formula:
P(t, sx, sy) = exp(-((sx - x_t)^2 + (sy - y_t)^2))  (12)
Here, (sx, sy) denotes the coordinates of the current sub-view area, P(t, sx, sy) denotes the probability that the second central view area falls in the current sub-view area after the predetermined time period t, and (x_t, y_t) denotes the coordinates of the second central view area after the predetermined time period t.
It should be noted that the above formula is a decaying exponential function with base e: the closer the current sub-view area is to the position of the second central view area, the larger the function value and the larger the corresponding probability.
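The behaviour of formula (12) can be checked with a short sketch. It assumes that the sub-view coordinates (sx, sy) and the predicted center (x_t, y_t) are expressed in the same units (for example, sub-view grid indices), which the text does not spell out. The function name `fall_probability` is hypothetical.

```python
import math


def fall_probability(sx, sy, x_t, y_t):
    """P = exp(-((sx - x_t)^2 + (sy - y_t)^2)), a value in (0, 1].

    All four arguments are assumed to share one coordinate system.
    """
    squared_distance = (sx - x_t) ** 2 + (sy - y_t) ** 2
    return math.exp(-squared_distance)


# The value falls off quickly as the sub-view area moves away from the center.
for sx in (3, 4, 5):
    print(sx, fall_probability(sx, 2, 3, 2))
```

The value is exactly 1 when the sub-view area sits on the predicted center and drops to exp(-1) at a distance of one unit.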
According to an embodiment of the present invention, a panoramic media file push terminal for implementing the above panoramic media file push method is further provided. As shown in FIG. 8, the terminal includes:
1) a communication interface 802, configured to acquire a panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame, and further configured to push the encoded panoramic image frame;
2) a processor 804, connected to the communication interface 802 and configured to divide the panoramic image frame into a plurality of view areas according to a predetermined condition; further configured to determine a first central view area on the plurality of view areas of the panoramic image frame, where the area occupied by the first central view area is greater than or equal to the area occupied by one view area; and further configured to encode the panoramic image frame according to the central view area;
3) a memory 806, connected to the communication interface 802 and the processor 804 and configured to store the panoramic media file and the determined first central view area.
Optionally, for specific examples in this embodiment, refer to the examples described in the foregoing embodiments; details are not repeated here.
An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be located in at least one of a plurality of network devices in a network.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
S1, acquiring a panoramic media file to be pushed, where the panoramic media file includes at least one panoramic image frame;
S2, dividing the panoramic image frame into a plurality of view areas according to a predetermined condition;
S3, determining a first central view area on the plurality of view areas of the panoramic image frame, where the area occupied by the first central view area is greater than or equal to the area occupied by one view area;
S4, encoding the panoramic image frame according to the first central view area;
S5, pushing the encoded panoramic image frame.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: determine a coordinate range of the first central view area according to motion data detected by a sensor;
S2: acquire a target view area from the plurality of view areas by using the coordinate range of the first central view area, wherein the target view area includes at least one view area that overlaps the coordinate range of the first central view area;
S3: extract the picture corresponding to the first central view area from the target view area.
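The selection of the target view area in steps S1 to S3 above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the frame is divided into a uniform rectangular grid and that the coordinate range of the central view area is an axis-aligned rectangle in pixel coordinates; the names `Rect`, `split_into_view_areas` and `target_view_areas` are hypothetical and do not come from the application. Wrap-around at the panorama seam is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned coordinate range [x0, x1) x [y0, y1) in frame coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float

    def overlaps(self, other: "Rect") -> bool:
        # Two ranges overlap when they intersect on both axes.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


def split_into_view_areas(width, height, cols, rows):
    """Divide the frame into a cols x rows grid (assumed 'predetermined condition')."""
    w, h = width / cols, height / rows
    return {(c, r): Rect(c * w, r * h, (c + 1) * w, (r + 1) * h)
            for r in range(rows) for c in range(cols)}


def target_view_areas(view_areas, central):
    """Identifiers of every view area overlapping the central view area's range."""
    return sorted(key for key, rect in view_areas.items() if rect.overlaps(central))


# A 3600 x 1800 frame split into 6 x 3 view areas of 600 x 600 pixels each.
areas = split_into_view_areas(3600, 1800, cols=6, rows=3)
central = Rect(500, 500, 1300, 1100)  # hypothetical range derived from sensor data
print(target_view_areas(areas, central))
```

Because the central view area may straddle several grid cells, the target view area here is the union of all overlapping cells, from which the picture of the central view area would then be cropped.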
Optionally, the storage medium is further configured to store program code for performing the following step: encoding, at a first resolution, the target view area in which the first central view area is located, and encoding, at a second resolution, the view areas of the panoramic image frame other than the target view area, wherein the first resolution is higher than the second resolution.
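As a hedged illustration of the two-resolution encoding described above, the following Python sketch only builds the per-view-area resolution plan; the actual encoder call is out of scope. The function name `assign_resolutions` and the representation of a resolution as a `(width, height)` tuple are assumptions made for this example.

```python
def assign_resolutions(view_area_ids, target_ids, first_resolution, second_resolution):
    """Map each view area to the resolution it would be encoded at.

    Target view areas get the first (higher) resolution; all remaining
    view areas get the second (lower) resolution.
    """
    first_pixels = first_resolution[0] * first_resolution[1]
    second_pixels = second_resolution[0] * second_resolution[1]
    if first_pixels <= second_pixels:
        raise ValueError("the first resolution must be higher than the second")
    targets = set(target_ids)
    return {vid: (first_resolution if vid in targets else second_resolution)
            for vid in view_area_ids}


plan = assign_resolutions(["a", "b", "c"], ["b"], (1920, 1080), (640, 360))
print(plan)
```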
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: acquire a play mode of the panoramic media file;
S2: determine, according to the play mode of the panoramic media file and the coordinate range of the first central view area, a coordinate range of a second central view area on the plurality of view areas after a predetermined time period.
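The claims of this application (claims 7 and 18) determine the play mode by comparing the variation range of the sensor's motion data within a predetermined period against a predetermined threshold. A minimal Python sketch of that decision follows. It assumes the motion data can be reduced to scalar samples whose variation range is `max - min`, and it lets the more specific third-mode rule take precedence over the first-mode rule; both choices, and the names `PlayMode` and `determine_play_mode`, are illustrative assumptions rather than details given in the source.

```python
from enum import Enum


class PlayMode(Enum):
    FIRST = 1   # play the picture in the first central view area
    SECOND = 2  # search for a third central view area
    THIRD = 3   # play the picture in the third central view area


def determine_play_mode(motion_samples, threshold, previous_mode=None):
    """Pick a play mode from the motion samples of one predetermined period."""
    # Assumed definition of "variation range": spread of the scalar samples.
    variation = max(motion_samples) - min(motion_samples)
    if variation >= threshold:
        return PlayMode.SECOND
    if previous_mode is PlayMode.SECOND:
        # The user has just stopped searching and settled on a new view.
        return PlayMode.THIRD
    # Literal reading of the claims: any other low-motion case is the first mode.
    return PlayMode.FIRST


mode = determine_play_mode([0.0, 0.5, 1.0], threshold=5.0)
print(mode)
```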
Optionally, the storage medium is further configured to store program code for repeating the following steps until the plurality of view areas in the panoramic image frame after the predetermined time period have been traversed: determining, from the plurality of view areas, the plurality of sub-view areas into which the current view area is divided; acquiring reference values of the plurality of sub-view areas, wherein the reference value is the larger of the saliency level indicated by the saliency feature of the sub-view area and the probability that the second central view area falls within the sub-view area; determining a third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas; and encoding the current view area at the third resolution.
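The per-view-area computation above can be sketched in Python using the two formulas recited in the claims: the probability P(t, sx, sy) = exp(-((sx - x_t)^2 + (sy - y_t)^2)) and the resolution level S(t, x, y) = 1 + (n - 1) * mPi(t, x, y) * Qnet. This is a sketch under stated assumptions: saliency levels are taken to be normalized to [0, 1], coordinates are taken to be in units for which the exponential is meaningful, and the result is rounded and clamped to an integer level in {1, ..., n}. The source does not specify the rounding rule, and all function names are hypothetical.

```python
import math


def landing_probability(sx, sy, xt, yt):
    """Probability that the second central view area (x_t, y_t) falls on sub-view (sx, sy)."""
    return math.exp(-((sx - xt) ** 2 + (sy - yt) ** 2))


def reference_value(saliency, probability):
    """Reference value of one sub-view area: the larger of saliency and probability."""
    return max(saliency, probability)


def resolution_level(sub_views, xt, yt, n, q_net):
    """Resolution level of a view area given its sub-views as (sx, sy, saliency) tuples."""
    m = max(reference_value(s, landing_probability(sx, sy, xt, yt))
            for sx, sy, s in sub_views)
    level = 1 + (n - 1) * m * q_net
    # Assumption: round to the nearest level and clamp into {1, ..., n}.
    return min(n, max(1, round(level)))


# One view area with two sub-views; the predicted view centre sits on the first one.
sub_views = [(0.0, 0.0, 0.2), (1.0, 0.0, 0.1)]
print(resolution_level(sub_views, xt=0.0, yt=0.0, n=5, q_net=1.0))
```

Taking the maximum means a view area is encoded at high resolution either because the viewer is likely to look there or because its content is salient, and the bandwidth factor Qnet scales the whole range down when the network is constrained.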
Optionally, in this embodiment, the storage medium may include, but is not limited to, any medium capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a removable hard disk, a magnetic disk, or an optical disc.
Optionally, for specific examples in this embodiment, refer to the examples described in the foregoing embodiments; details are not repeated here.
The sequence numbers of the foregoing embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
If the integrated unit in the foregoing embodiments is implemented in the form of a software functional unit and is sold or used as a stand-alone product, it may be stored in the computer-readable storage medium described above. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause one or more computer devices (which may be personal computers, servers, network devices, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention.
In the foregoing embodiments of the present invention, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The foregoing describes only preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also fall within the scope of protection of the present invention.

Claims (22)

  1. A panoramic media file push method, applied to a terminal having a display device, comprising:
    acquiring a panoramic media file to be pushed, wherein the panoramic media file comprises at least one panoramic image frame;
    dividing the panoramic image frame into a plurality of view areas according to a predetermined condition;
    determining a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one of the view areas;
    encoding the panoramic image frame according to the determined first central view area; and
    pushing the encoded panoramic image frame to the display device.
  2. The method according to claim 1, wherein determining the first central view area on the plurality of view areas of the panoramic image frame comprises:
    determining a coordinate range of the first central view area according to motion data detected by a sensor;
    determining a target view area from the plurality of view areas by using the coordinate range of the first central view area, wherein the target view area comprises at least one view area that overlaps the coordinate range of the first central view area; and
    extracting the picture corresponding to the first central view area from the target view area.
  3. The method according to claim 2, wherein determining the target view area from the plurality of view areas by using the coordinate range of the first central view area comprises:
    acquiring identifiers of the view areas within the coordinate range of the first central view area; and
    stitching together the view areas indicated by the identifiers to obtain the target view area.
  4. The method according to claim 2, wherein encoding the panoramic image frame according to the determined central view area comprises:
    encoding the target view area at a first resolution, and encoding the view areas of the panoramic image frame other than the target view area at a second resolution, wherein the first resolution is higher than the second resolution.
  5. The method according to claim 2, further comprising:
    acquiring a play mode of the panoramic media file; and
    determining, according to the play mode of the panoramic media file and the coordinate range of the first central view area, a coordinate range of a second central view area on the plurality of view areas after a predetermined time period.
  6. The method according to claim 5, wherein determining, according to the play mode of the panoramic media file and the coordinate range of the first central view area, the coordinate range of the second central view area on the plurality of view areas after the predetermined time period comprises calculating the coordinate range of the second central view area according to the following formula:
    Figure PCTCN2017092562-appb-100001
    wherein (x_0, y_0) denotes the coordinates of the first central view area; (x_t, y_t) denotes the coordinates of the second central view area after the predetermined time period t; v_mod denotes the play mode; v_mod_x(t) denotes the offset angle in the x direction after the predetermined time period t in the play mode, the x direction being the horizontal direction; and v_mod_y(t) denotes the offset angle in the y direction after the predetermined time period t in the play mode, the y direction being the vertical direction.
  7. The method according to claim 5, wherein acquiring the play mode of the panoramic media file comprises:
    when the variation range of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, determining that the play mode is a first play mode, wherein the first play mode is used to play the picture in the first central view area;
    when the variation range of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, determining that the play mode is a second play mode, wherein the second play mode is used to search for a third central view area; and
    when the variation range of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode was the second play mode, determining that the play mode is a third play mode, wherein the third play mode is used to play the picture in the third central view area.
  8. The method according to claim 5, wherein encoding the panoramic image frame according to the determined first central view area comprises:
    repeating the following steps until the plurality of view areas in the panoramic image frame after the predetermined time period have been traversed:
    acquiring, from the plurality of view areas, the plurality of sub-view areas into which the current view area is divided;
    acquiring reference values of the plurality of sub-view areas, wherein the reference value is the larger of the saliency level indicated by the saliency feature of the sub-view area and the probability that the second central view area falls within the sub-view area;
    determining a third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas; and
    encoding the current view area at the third resolution.
  9. The method according to claim 8, wherein acquiring the reference values of the plurality of sub-view areas comprises:
    repeating the following steps until the plurality of sub-view areas have been traversed:
    acquiring a current sub-view area from the plurality of sub-view areas;
    acquiring the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area; and
    taking the larger of the saliency level and the probability as the reference value of the current sub-view area.
  10. The method according to claim 8 or 9, wherein determining the third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas comprises:
    calculating the resolution level of the third resolution of the current view area by the following formula:
    S(t, x, y) = 1 + (n - 1) * mPi(t, x, y) * Qnet,
    wherein (x, y) are the coordinates of the current view area; S(t, x, y) denotes the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t; mPi(t, x, y) denotes the maximum of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t; Qnet denotes the current network bandwidth level; and n denotes the resolution level, wherein Qnet ∈ [0, 1] and S(t, x, y) ∈ {1, 2, …, n}; and
    determining the third resolution according to the resolution level of the third resolution.
  11. The method according to any one of claims 8 to 10, wherein the probability that the second central view area falls within the current sub-view area is calculated by the following formula:
    P(t, sx, sy) = exp(-((sx - x_t)^2 + (sy - y_t)^2)),
    wherein (sx, sy) denotes the coordinates of the current sub-view area; P(t, sx, sy) denotes the probability that the second central view area falls within the current sub-view area after the predetermined time period t; and (x_t, y_t) denotes the coordinates of the second central view area after the predetermined time period t.
  12. A panoramic media file push apparatus, applied to a terminal having a display device, comprising:
    a first acquiring unit, configured to acquire a panoramic media file to be pushed, wherein the panoramic media file comprises at least one panoramic image frame;
    a dividing unit, configured to divide the panoramic image frame into a plurality of view areas according to a predetermined condition;
    a second acquiring unit, configured to determine a first central view area on the plurality of view areas of the panoramic image frame, wherein the area occupied by the first central view area is greater than or equal to the area occupied by one of the view areas;
    an encoding unit, configured to encode the panoramic image frame according to the determined first central view area; and
    a pushing unit, configured to push the encoded panoramic image frame.
  13. The apparatus according to claim 12, wherein the second acquiring unit comprises:
    a first determining module, configured to determine a coordinate range of the first central view area according to motion data detected by a sensor;
    a first acquiring module, configured to acquire a target view area from the plurality of view areas by using the coordinate range of the first central view area, wherein the target view area comprises at least one view area that overlaps the coordinate range of the first central view area; and
    an extracting module, configured to extract the picture corresponding to the first central view area from the target view area.
  14. The apparatus according to claim 13, wherein the first acquiring module comprises:
    an acquiring submodule, configured to acquire identifiers of the view areas within the coordinate range of the first central view area; and
    a stitching submodule, configured to stitch together the view areas indicated by the identifiers to obtain the target view area.
  15. The apparatus according to claim 13, wherein the encoding unit comprises:
    a first encoding module, configured to encode the target view area at a first resolution and to encode the view areas of the panoramic image frame other than the target view area at a second resolution, wherein the first resolution is higher than the second resolution.
  16. The apparatus according to claim 13, wherein the second acquiring unit comprises:
    a second acquiring module, configured to acquire a play mode of the panoramic media file; and
    a second determining module, configured to determine, according to the play mode of the panoramic media file and the coordinate range of the first central view area, a coordinate range of a second central view area on the plurality of view areas after a predetermined time period.
  17. The apparatus according to claim 16, wherein the second determining module is configured to calculate the coordinate range of the second central view area according to the following formula:
    Figure PCTCN2017092562-appb-100002
    wherein (x_0, y_0) denotes the coordinates of the first central view area; (x_t, y_t) denotes the coordinates of the second central view area after the predetermined time period t; v_mod denotes the play mode; v_mod_x(t) denotes the offset angle in the x direction after the predetermined time period t in the play mode, the x direction being the horizontal direction; and v_mod_y(t) denotes the offset angle in the y direction after the predetermined time period t in the play mode, the y direction being the vertical direction.
  18. The apparatus according to claim 16, wherein the second acquiring module comprises:
    a third determining submodule, configured to determine that the play mode is a first play mode when the variation range of the motion data detected by the sensor within a predetermined period is less than a predetermined threshold, wherein the first play mode is used to play the picture in the first central view area;
    a fourth determining submodule, configured to determine that the play mode is a second play mode when the variation range of the motion data detected by the sensor within the predetermined period is greater than or equal to the predetermined threshold, wherein the second play mode is used to search for a third central view area; and
    a fifth determining submodule, configured to determine that the play mode is a third play mode when the variation range of the motion data detected by the sensor within the predetermined period is less than the predetermined threshold and the previous play mode was the second play mode, wherein the third play mode is used to play the picture in the third central view area.
  19. The apparatus according to claim 16, wherein the encoding unit comprises:
    a processing module, configured to repeat the following steps until the plurality of view areas in the panoramic image frame after the predetermined time period have been traversed:
    acquiring, from the plurality of view areas, the plurality of sub-view areas into which the current view area is divided;
    acquiring reference values of the plurality of sub-view areas, wherein the reference value is the larger of the saliency level indicated by the saliency feature of the sub-view area and the probability that the second central view area falls within the sub-view area;
    determining a third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas; and
    encoding the current view area at the third resolution.
  20. The apparatus according to claim 19, wherein the processing module acquires the reference values of the plurality of sub-view areas by:
    repeating the following steps until the plurality of sub-view areas have been traversed:
    acquiring a current sub-view area from the plurality of sub-view areas;
    acquiring the saliency level indicated by the saliency feature of the current sub-view area and the probability that the second central view area falls within the current sub-view area; and
    taking the larger of the saliency level and the probability as the reference value of the current sub-view area.
  21. The apparatus according to claim 19 or 20, wherein the processing module determines the third resolution of the current view area according to the maximum of the reference values of the plurality of sub-view areas by:
    calculating the resolution level of the third resolution of the current view area by the following formula:
    S(t, x, y) = 1 + (n - 1) * mPi(t, x, y) * Qnet,
    wherein (x, y) are the coordinates of the current view area; S(t, x, y) denotes the resolution level of the third resolution of the current view area in the panoramic image frame after the predetermined time period t; mPi(t, x, y) denotes the maximum of the reference values of the plurality of sub-view areas in the current view area after the predetermined time period t; Qnet denotes the current network bandwidth level; and n denotes the resolution level, wherein Qnet ∈ [0, 1] and S(t, x, y) ∈ {1, 2, …, n}; and
    determining the third resolution according to the resolution level of the third resolution.
  22. The apparatus according to any one of claims 19 to 21, wherein the processing module calculates the probability that the second central view area falls within the current sub-view area by the following formula:
    P(t, sx, sy) = exp(-((sx - x_t)^2 + (sy - y_t)^2)),
    wherein (sx, sy) denotes the coordinates of the current sub-view area; P(t, sx, sy) denotes the probability that the second central view area falls within the current sub-view area after the predetermined time period t; and (x_t, y_t) denotes the coordinates of the second central view area after the predetermined time period t.
PCT/CN2017/092562 2016-07-14 2017-07-12 Panoramic media file push method and device WO2018010653A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610557007.2 2016-07-14
CN201610557007.2A CN106060515B (en) 2016-07-14 2016-07-14 Panorama pushing method for media files and device

Publications (1)

Publication Number Publication Date
WO2018010653A1 true WO2018010653A1 (en) 2018-01-18

Family

ID=57186887

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/092562 WO2018010653A1 (en) 2016-07-14 2017-07-12 Panoramic media file push method and device

Country Status (2)

Country Link
CN (1) CN106060515B (en)
WO (1) WO2018010653A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3389281A1 (en) * 2017-04-16 2018-10-17 Facebook, Inc. Systems and methods for provisioning content

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106060515B (en) * 2016-07-14 2018-11-06 腾讯科技(深圳)有限公司 Panorama pushing method for media files and device
WO2018093840A1 (en) 2016-11-17 2018-05-24 Intel Corporation Spherical rotation for encoding wide view video
CN112738530B (en) 2016-11-17 2024-02-23 英特尔公司 Suggested viewport indication for panoramic video
US20180160119A1 (en) 2016-12-01 2018-06-07 Mediatek Inc. Method and Apparatus for Adaptive Region-Based Decoding to Enhance User Experience for 360-degree VR Video
US20180192063A1 (en) * 2017-01-03 2018-07-05 Black Sails Technology, Inc. Method and System for Virtual Reality (VR) Video Transcode By Extracting Residual From Different Resolutions
CN108693953A (en) * 2017-02-28 2018-10-23 华为技术有限公司 A kind of augmented reality AR projecting methods and cloud server
US10579898B2 (en) * 2017-04-16 2020-03-03 Facebook, Inc. Systems and methods for provisioning content using barrel projection representation
CN108810574B (en) * 2017-04-27 2021-03-12 腾讯科技(深圳)有限公司 Video information processing method and terminal
CN107277474B (en) * 2017-06-26 2019-06-25 深圳看到科技有限公司 Panorama generation method and generating means
CN109286855B (en) * 2017-07-19 2020-10-13 北京大学 Panoramic video transmission method, transmission device and transmission system
US10893261B2 (en) * 2017-12-06 2021-01-12 Dolby Laboratories Licensing Corporation Positional zero latency
CN108650460B (en) * 2018-05-10 2021-03-30 深圳视点创新科技有限公司 Server, panoramic video storage and transmission method and computer storage medium
CN110312170B (en) * 2019-07-12 2022-03-04 青岛一舍科技有限公司 Video playing method and device capable of intelligently adjusting visual angle
CN112406706B (en) * 2020-11-20 2022-07-22 上海华兴数字科技有限公司 Vehicle scene display method and device, readable storage medium and electronic equipment
CN115529449A (en) * 2021-06-26 2022-12-27 华为技术有限公司 Virtual reality video transmission method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040086186A1 (en) * 2002-08-09 2004-05-06 Hiroshi Kyusojin Information providing system and method, information supplying apparatus and method, recording medium, and program
US20140123162A1 (en) * 2012-10-26 2014-05-01 Mobitv, Inc. Eye tracking based defocusing
CN104735464A (en) * 2015-03-31 2015-06-24 华为技术有限公司 Panorama video interactive transmission method, server and client end
US20160165309A1 (en) * 2013-07-29 2016-06-09 Koninklijke Kpn N.V. Providing tile video streams to a client
CN106060515A (en) * 2016-07-14 2016-10-26 腾讯科技(深圳)有限公司 Panoramic media file push method and apparatus

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101350920A (en) * 2007-07-17 2009-01-21 北京华辰广正科技发展有限公司 Method for estimating global motion facing to panorama video
US8355041B2 (en) * 2008-02-14 2013-01-15 Cisco Technology, Inc. Telepresence system for 360 degree video conferencing
CN104010225B (en) * 2014-06-20 2016-02-10 合一网络技术(北京)有限公司 The method and system of display panoramic video
CN105323552B (en) * 2015-10-26 2019-03-12 北京时代拓灵科技有限公司 A kind of panoramic video playback method and system

Cited By (1)

Publication number Priority date Publication date Assignee Title
EP3389281A1 (en) * 2017-04-16 2018-10-17 Facebook, Inc. Systems and methods for provisioning content

Also Published As

Publication number Publication date
CN106060515B (en) 2018-11-06
CN106060515A (en) 2016-10-26

Similar Documents

Publication Publication Date Title
WO2018010653A1 (en) Panoramic media file push method and device
US10609284B2 (en) Controlling generation of hyperlapse from wide-angled, panoramic videos
EP2481023B1 (en) 2d to 3d video conversion
US20200288098A1 (en) Method, apparatus, medium, terminal, and device for multi-angle free-perspective interaction
US10861159B2 (en) Method, system and computer program product for automatically altering a video stream
WO2018006825A1 (en) Video coding method and apparatus
Wang et al. Motion-aware temporal coherence for video resizing
US9237330B2 (en) Forming a stereoscopic video
WO2015192585A1 (en) Method and apparatus for playing advertisement in video
EP2999221A1 (en) Image processing method and device
US20130127988A1 (en) Modifying the viewpoint of a digital image
US20130127993A1 (en) Method for stabilizing a digital video
Maugey et al. Saliency-based navigation in omnidirectional image
KR102551713B1 (en) Electronic apparatus and image processing method thereof
CN105960800A (en) Image display device and image display system
WO2019080792A1 (en) Panoramic video image playing method and device, storage medium and electronic device
Tan et al. 360-degree virtual-reality cameras for the masses
US9847102B2 (en) Method and device for bounding an object in a video
Dedhia et al. Saliency prediction for omnidirectional images considering optimization on sphere domain
CN113810755B (en) Panoramic video preview method and device, electronic equipment and storage medium
CN111885417B (en) VR video playing method, device, equipment and storage medium
CN111669603B (en) Multi-angle free visual angle data processing method and device, medium, terminal and equipment
US11617024B2 (en) Dual camera regions of interest display
US20150382065A1 (en) Method, system and related selection device for navigating in ultra high resolution video content
Lee Novel video stabilization for real-time optical character recognition applications

Legal Events

Date Code Title Description
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 17826989; Country of ref document: EP; Kind code of ref document: A1)