WO2018233661A1 - Method and apparatus of inter prediction for immersive video coding - Google Patents

Method and apparatus of inter prediction for immersive video coding

Info

Publication number
WO2018233661A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
sphere
frame
projection
rotation
Application number
PCT/CN2018/092142
Other languages
French (fr)
Inventor
Cheng-Hsuan Shih
Jian-Liang Lin
Original Assignee
Mediatek Inc.
Application filed by Mediatek Inc. filed Critical Mediatek Inc.
Priority to CN201880002044.4A priority Critical patent/CN109691104B/en
Priority to TW107121492A priority patent/TWI690193B/en
Publication of WO2018233661A1 publication Critical patent/WO2018233661A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/55 Motion estimation with spatial constraints, e.g. at image or region borders

Definitions

  • the present invention relates to image/video processing or coding for 360-degree virtual reality (VR) images/sequences.
  • the present invention relates to Inter prediction for three-dimensional (3D) contents in various projection formats.
  • the 360-degree video, also known as immersive video, is an emerging technology that can provide a “sense of presence” .
  • the sense of immersion is achieved by surrounding a user with a wrap-around scene covering a panoramic view, in particular, a 360-degree field of view.
  • the “sense of presence” can be further improved by stereographic rendering. Accordingly, panoramic video is being widely used in Virtual Reality (VR) applications.
  • Immersive video involves capturing a scene using multiple cameras to cover a panoramic view, such as a 360-degree field of view.
  • the immersive camera usually uses a panoramic camera or a set of cameras arranged to capture a 360-degree field of view. Typically, two or more cameras are used for the immersive camera. All videos must be taken simultaneously, and separate fragments (also called separate perspectives) of the scene are recorded. Furthermore, the set of cameras is often arranged to capture views horizontally, while other arrangements of the cameras are possible.
  • the 360-degree virtual reality (VR) images may be captured using a 360-degree spherical panoramic camera or multiple images arranged to cover the full 360-degree field of view.
  • the three-dimensional (3D) spherical image is difficult to process or store using the conventional image/video processing devices. Therefore, the 360-degree VR images are often converted to a two-dimensional (2D) format using a 3D-to-2D projection method.
  • equirectangular projection (ERP) and cubemap projection (CMP) are commonly used projection methods. Accordingly, a 360-degree image can be stored in an equirectangular projected format.
  • the equirectangular projection maps the entire surface of a sphere onto a flat image.
  • Fig. 1 illustrates an example of projecting a sphere 110 into a rectangular image 120 according to equirectangular projection (ERP) , where each longitude line is mapped to a vertical line of the ERP picture.
  • the areas near the north and south poles of the sphere are stretched more severely (i.e., from a single point to a line) than areas near the equator.
  • due to distortions introduced by the stretching, especially near the two poles, predictive coding tools often fail to make good predictions, causing a reduction in coding efficiency.
  • FIG. 2 illustrates a cube 210 with six faces, where a 360-degree virtual reality (VR) image can be projected to the six faces on the cube according to cubemap projection (CMP) .
  • the example shown in Fig. 2 divides the six faces into two parts (220a and 220b) , where each part consists of three connected faces.
  • the two parts can be unfolded into two strips (230a and 230b) , where each strip corresponds to a continuous-face picture.
  • the two strips can be combined into a compact rectangular frame according to a selected layout format.
  • both the ERP and CMP formats have been included in the projection format conversion considered for next-generation video coding, as described in JVET-F1003 (Y. Ye, et al., “Algorithm descriptions of projection format conversion and video quality metrics in 360Lib” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 6th Meeting: Hobart, AU, 31 March–7 April 2017, Document: JVET-F1003) .
  • besides the ERP and CMP formats, various other VR projection formats, such as Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) and Rotated Sphere Projection (RSP) , are widely used in the field.
  • Fig. 3 illustrates an example of octahedron projection (OHP) , where a sphere is projected onto faces of an 8-face octahedron 310.
  • the eight faces 320 lifted from the octahedron 310 can be converted to an intermediate format 330 by cutting open the face edge between faces 1 and 5 and rotating faces 1 and 5 to connect to faces 2 and 6 respectively, and applying a similar process to faces 3 and 7.
  • the intermediate format can be packed into a rectangular picture 340.
  • Fig. 4 illustrates an example of icosahedron projection (ISP) , where a sphere is projected onto faces of a 20-face icosahedron 410.
  • the twenty faces 420 from the icosahedron 410 can be packed into a rectangular picture 430 (referred to as a projection layout) .
  • Segmented sphere projection has been disclosed in JVET-E0025 (Zhang et al., “AHG8: Segmented Sphere Projection for 360-degree video” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 5th Meeting: Geneva, CH, 12–20 January 2017, Document: JVET-E0025) as a method to convert a spherical image into an SSP format.
  • Fig. 5 illustrates an example of segmented sphere projection, where a spherical image 500 is mapped into a North Pole image 510, a South Pole image 520 and an equatorial segment image 530.
  • the boundaries of 3 segments correspond to latitudes 45°N (502) and 45°S (504) , where 0° corresponds to the equator (506) .
  • the North and South Poles are mapped into 2 circular areas (i.e., 510 and 520) , and the projection of the equatorial segment can be the same as ERP or equal-area projection (EAP) .
  • the diameter of the circle is equal to the width of the equatorial segments because both Pole segments and equatorial segment have a 90° latitude span.
  • the North Pole image 510, South Pole image 520 and the equatorial segment image 530 can be packed into a rectangular image.
  • Fig. 6 illustrates an example of rotated sphere projection (RSP) , where the sphere 610 is partitioned into a middle 270°x90° region 620 and a residual part 622. Each part of RSP can be further stretched on the top side and the bottom side to generate a deformed part having an oval shape. The two oval-shaped parts can be fitted into a rectangular frame 630 as shown in Fig. 6.
  • the Adjusted Cubemap Projection (ACP) format is based on the CMP. If the two-dimensional coordinate (u’, v’) for CMP is determined, the two-dimensional coordinate (u, v) for ACP can be calculated by adjusting (u’, v’) according to a pair of adjustment equations.
  • the 3D coordinates (X, Y, Z) can be derived using a table given the position (u, v) and the face index f.
  • the (u’, v’) and face index f can be calculated according to a table for CMP.
  • the 2D coordinates for ACP can be calculated according to a set of equations.
  • similar to ERP, the EAP also maps a sphere surface to one face. In the (u, v) plane, u and v are in the range [0, 1] . For 2D-to-3D coordinate conversion, given the sampling position (m, n) , 2D coordinates (u, v) are first calculated in the same way as ERP. Then, the longitude and latitude (φ, θ) on the sphere can be calculated from (u, v) as φ = (u − 0.5) * 2π and θ = sin⁻¹ (1.0 − 2*v) .
  • inversely, the longitude and latitude (φ, θ) can be evaluated from the (X, Y, Z) coordinates using φ = tan⁻¹ (−Z/X) and θ = sin⁻¹ (Y / (X² + Y² + Z²)^(1/2)) .
  • Inter prediction has been a powerful coding tool that exploits inter-frame redundancy using motion estimation/compensation. If conventional Inter prediction is applied to the 2D frames converted from a 3D space, the motion estimation/compensation techniques may not work properly, since an object in the 3D space may become distorted or deformed in the 2D frames due to object movement or relative motion between an object and a camera. Accordingly, various Inter prediction techniques are developed to improve the accuracy of Inter prediction for 2D frames converted from a 3D space.
  • Methods and apparatus of processing 360-degree virtual reality images are disclosed.
  • input data for a current block in a 2D (two-dimensional) frame are received, where the 2D frame is projected from a 3D (three-dimensional) sphere.
  • a motion vector associated with a source block in the 2D frame is determined, where the motion vector points from a source location in the source block to a destination location in the 2D frame.
  • the source location, the destination location and the source block in the 2D frame are projected onto the 3D sphere according to a target projection.
  • the source block in the 3D sphere is rotated along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a deformed reference block in the 3D sphere.
  • the deformed reference block in the 3D sphere is mapped back to the 2D frame according to an inverse target projection.
  • the current block in the 2D frame is then encoded or decoded using the deformed reference block in the 2D frame as a predictor.
  • the rotation circle corresponds to a largest circle on the surface of the 3D sphere. In another embodiment, the rotation circle is smaller than a largest circle on the surface of the 3D sphere.
  • the rotation circle on the surface of the 3D sphere around a rotation axis is determined according to the source location and the destination location on the 3D sphere.
  • a rotation axis a and a rotation angle θ_a associated with the rotation circle are derived according to θ_a = arccos (s_c · d_c) and a = (s_c × d_c) / |s_c × d_c| , wherein s_c and d_c correspond to the source location and the destination location on a surface of the 3D sphere respectively.
  • a rotation axis and a rotation angle associated with the rotation circle are derived based on motion vectors in a reference frame.
  • the rotation axis a and the rotation angle θ′ associated with the rotation circle are derived according to (a, θ′) = argmin_(a, θ′) Σ_i || mv (s_i) − mv_(a, θ′) (s_i) ||_F , wherein s_i corresponds to one source block in the reference frame, mv (s_i) corresponds to a motion vector of source block s_i, mv_(a, θ′) (s_i) corresponds to one motion vector caused by rotating one location in source block s_i by the rotation angle θ′ around the rotation axis a, and ||·||_F is the Frobenius norm.
  • the rotation axis and a rotation angle associated with the rotation circle are derived based on motion vectors in an already coded region in a current frame according to the same equation shown above.
  • the rotation axis associated with the rotation circle can be pre-defined or the rotation axis can be indicated in a bitstream for indicating a path of rotation.
  • the target projection corresponds to Equirectangular Projection (ERP) , Cubemap Projection (CMP) , Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) , Rotated Sphere Projection (RSP) , or Cylindrical Projection (CLP) .
  • deformation based on displacement of camera is applied to generate an Inter predictor.
  • two 2D (two-dimensional) frames corresponding to two different viewpoints are received, where said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere, and wherein a current block, a predicted block for the current block and a neighbouring block are located in said two 2D frames.
  • a forward point of camera is determined based on said two 2D frames and moving flows in said two 2D frames are determined.
  • One or more second motion vectors associated with the predicted block are derived either by referring to one or more first motion vectors of the neighbouring block based on the forward point of camera and the moving flows or according to velocity of camera and depth of background.
  • An Inter prediction block is derived based on the predicted block and said one or more second motion vectors.
  • the current block in the 2D frame is encoded or decoded using the Inter prediction block.
  • Said deriving one or more second motion vectors associated with the predicted block may comprise determining displacement of camera based on said one or more first motion vectors associated with the neighbouring block. Said one or more second motion vectors associated with the predicted block can be derived from said one or more first motion vectors based on the displacement of camera and the moving flows.
  • Fig. 1 illustrates an example of projecting a sphere into a rectangular image according to equirectangular projection, where each longitude line is mapped to a vertical line of the ERP picture.
  • Fig. 2 illustrates a cube with six faces, where a 360-degree virtual reality (VR) image can be projected to the six faces on the cube according to cubemap projection (CMP) .
  • Fig. 3 illustrates an example of octahedron projection (OHP) , where a sphere is projected onto faces of an 8-face octahedron.
  • Fig. 4 illustrates an example of icosahedron projection (ISP) , where a sphere is projected onto faces of a 20-face icosahedron.
  • Fig. 5 illustrates an example of segmented sphere projection (SSP) , where a spherical image is mapped into a North Pole image, a South Pole image and an equatorial segment image.
  • Fig. 6 illustrates an example of rotated sphere projection (RSP) , where the sphere is partitioned into a middle 270°x90° region and a residual part. These two parts of RSP can be further stretched on the top side and the bottom side to generate deformed parts having oval-shaped boundary on the top part and bottom part.
  • Fig. 7 illustrates a case that deformation occurs in an ERP frame due to movement, where the North Pole is mapped to a horizontal line on the top of the frame and the equator is mapped to a horizontal line in the middle of the frame.
  • Fig. 8 illustrates an example of deformation caused by movement in the 3D sphere for the ERP frame.
  • Fig. 9A to Fig. 9I illustrate examples of deformation in 2D frames projected using various projections.
  • Fig. 9A is for an ERP frame
  • Fig. 9B is for a CMP frame
  • Fig. 9C is for an SSP frame
  • Fig. 9D is for an OHP frame
  • Fig. 9E is for an ISP frame
  • Fig. 9F is for an EAP frame
  • Fig. 9G is for an ACP frame
  • Fig. 9H is for an RSP frame
  • Fig. 9I is for a Cylindrical Projection frame.
  • Fig. 10 illustrates the concept of Inter prediction that takes deformation based on rotation into account, according to a method of the present invention.
  • Fig. 11 illustrates examples in which a source block is moved to a destination location through different paths. Due to the different paths, the block at the destination location may have different orientations.
  • Fig. 12A illustrates an example to describe the 3D movement on a sphere by the yaw, pitch and roll rotations, where a source block is moved to a destination location as prescribed by the three axes.
  • Fig. 12B illustrates an example to describe the 3D movement on a sphere by rotating around a big circle, where a source block is moved to a destination location along a big circle.
  • Fig. 12C illustrates an example to describe the 3D movement on a sphere by rotating around a small circle, where a source block is moved to a destination location along a small circle.
  • Fig. 12D illustrates yet another way of describing object movement on the surface of the sphere, where a source block is first moved to the destination block by rotating along a big circle 1253 on the sphere around a rotation axis, and the destination block is then rotated around another axis.
  • Fig. 13 illustrates an exemplary procedure for Inter prediction that takes deformation based on rotation into account.
  • Fig. 14 illustrates the step of generating samples for a destination block by rotating samples in a source block by a rotation angle and axis of rotation after the rotation angle and axis of rotation are determined.
  • Fig. 15 compares an exemplary Inter prediction procedure according to a method of the present invention, which takes deformation based on rotation into account, with conventional Inter prediction.
  • Fig. 16 compares the two different deformation methods based on rotation, where the upper part corresponds to a case of rotation along a big circle and the lower one corresponds to rotation around a new rotation axis.
  • Fig. 17 illustrates a method to derive the axis of rotation using motion vectors associated with blocks of a reference picture.
  • Fig. 18 illustrates a method to derive the axis of rotation using motion vectors associated with processed blocks in the current picture.
  • Fig. 19A illustrates an exemplary process of the deformation based on displacement of camera, where an example of an object (i.e., a tree) is projected onto the surface of a sphere at different camera locations.
  • Fig. 19B illustrates the locations of an object projected onto a 2D frame for different camera locations.
  • Fig. 20 illustrates an example of an ERP frame overlaid with the pattern of moving flows, where the flow of background (i.e., static object) can be determined if the camera forward point is known.
  • Fig. 21 illustrates an example of the pattern of moving flows based on displacement of viewpoint for a CMP frame in the 2x3 layout format.
  • Fig. 22 illustrates an example of using deformation based on displacement of camera for Inter prediction.
  • Fig. 23 illustrates an exemplary procedure of using deformation based on displacement of camera for Inter prediction.
  • Fig. 24 illustrates exemplary moving flows in the 2D frame for various projection formats.
  • Fig. 25 illustrates an exemplary flowchart of a system that applies rotation of sphere to deform a reference block for Inter prediction according to a method of the present invention.
  • Fig. 26 illustrates an exemplary flowchart of a system that applies displacement of camera to deform a reference block for Inter prediction according to a method of the present invention.
  • motion estimation/compensation is widely used to explore correlation in video data in order to reduce transmitted information.
  • the conventional video contents correspond to 2D video data and the motion estimation and compensation techniques often assume a translational motion.
  • more advanced motion models such as affine model, are considered. Nevertheless, these techniques derive a 3D motion model based on the 2D images.
  • an ERP frame 720 is generated by projecting a 3D sphere 710 onto a rectangular frame as shown in Fig. 7, where the North Pole 712 is mapped to a horizontal line 722 on the top of the frame and the equator 714 is mapped to a horizontal line 724 in the middle of the frame.
  • Map 730 illustrates the effect of deformation in ERP, where a circle close to the North Pole or South Pole is mapped to an oval-shaped area, while a circle in the middle of the frame remains a circle. The horizontal stretch grows with latitude, as the short check below illustrates.
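The horizontal stretch factor in ERP is 1/cos (latitude) , since every latitude circle is mapped to the full frame width; a quick numeric check:

```python
import numpy as np

# ERP maps every latitude circle to the full frame width, so a feature's
# horizontal magnification at latitude `lat` is 1/cos(lat): 1x at the
# equator and unbounded toward the poles -- which is why circles near the
# poles in map 730 turn into oval shapes.
for lat_deg in (0, 45, 80):
    stretch = 1.0 / np.cos(np.radians(lat_deg))
    print(f"latitude {lat_deg:2d} deg: horizontal stretch = {stretch:.2f}x")
```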
  • Fig. 8 illustrates an example of deformation caused by movement in the 3D sphere for the ERP frame.
  • block 801 on a 3D sphere 800 is moved to become block 802.
  • the two corresponding blocks become blocks 804 and 805 in the ERP frame.
  • while the blocks (i.e., 801 and 802) on the 3D sphere correspond to the same block, the two blocks (i.e., 804 and 805) have different shapes in the ERP frame.
  • Fig. 9A to Fig. 9I illustrate examples of deformation in 2D frames projected using various projections.
  • Fig. 9A illustrates an example for an ERP frame 910, where a square block 912 becomes deformed when the block is moved to a location 914 closer to the North Pole.
  • Fig. 9B illustrates an example for a CMP frame 920, where a square block 922 becomes deformed when the block is moved to a location 924.
  • Fig. 9C illustrates an example for an SSP frame 930, where a square block 932 becomes deformed when the block is moved to a location 934.
  • Fig. 9D illustrates an example for an OHP frame 940, where a square block 942 becomes deformed when the block is moved to a location 944.
  • FIG. 9E illustrates an example for an ISP frame 950, where a square block 952 becomes deformed when the block is moved to a location 954.
  • Fig. 9F illustrates an example for an EAP frame 970, where a square block 972 becomes deformed when the block is moved to a location 974.
  • Fig. 9G illustrates an example for an ACP frame 980, where a square block 982 becomes deformed when the block is moved to a location 984.
  • Fig. 9H illustrates an example for an RSP frame 980, where a square block 982 becomes deformed when the block is moved to a location 984.
  • Fig. 9I illustrates an example for a Cylindrical Projection frame 990, where a square block 992 becomes deformed when the block is moved to a location 994.
  • object motion in a 3D space may cause object deformation in a 2D frame projected from the 3D sphere.
  • various methods of Inter prediction for VR Video Processing are disclosed.
  • a method to handle the deformation issue for Inter prediction is to project a block in the 2D frame back to the 3D sphere.
  • the block may be a corresponding block in a reference picture prior to motion compensation for a current block.
  • the corresponding block is moved in the 2D frame according to the motion vector to point to a reference block and the reference block is used as Inter predictor for the current block.
  • the block is moved to a designated location on the surface of the 3D sphere.
  • the block is moved to a new location by rotating the sphere.
  • the object moved on the surface of the 3D sphere is projected back to the 2D frame.
  • Fig. 10 illustrates the concept of Inter prediction that takes deformation based on rotation into account.
  • block 1013 corresponds to a source block in a 2D frame 1010.
  • Motion vector 1015 of the source block points from a location s_c 1012 in the source block 1013 to a destination location d_c 1014.
  • the data in the 2D frame are projected to the 3D sphere according to a corresponding projection type.
  • the 2D frame is generated from ERP
  • the ERP projection is used to project data in the 2D frame to the 3D sphere.
  • location s_c 1012 and location d_c 1014 in the 2D frame are projected to locations 1022 and 1024 in the 3D sphere 1020 respectively.
  • location 1022 is rotated to location 1024.
  • the same rotation is also applied to other locations of the source block 1013 to generate the destination block.
  • the data in the destination block in the 3D sphere are then projected back to the 2D frame using an inverse ERP projection.
  • Fig. 11 illustrates examples in which a source block is moved to a destination location through different paths. Due to the different paths, the block at the destination location may have different orientations.
  • the source block 1112 is moved to the destination location 1114 through path 1113 with slight right turn.
  • the source block 1122 is moved to the destination location 1124 through straight path 1123.
  • the source block 1132 is moved to the destination location 1134 through path 1133 with slight left turn.
  • a source block 1212 is moved to a destination location 1214.
  • the axes of yaw 1216, pitch 1217 and roll 1218 are shown.
  • Another way to describe the 3D movement on a sphere 1220 is rotation around a big circle, as shown in Fig. 12B, where a source block 1222 is moved along a big circle 1221 to a destination location 1224.
  • the rotation 1226 is shown.
  • the big circle 1221 corresponds to the largest circle on the surface of the sphere 1220.
  • Fig. 12C illustrates an example of rotation of the sphere 1230 from a source block 1232 to a destination block 1234 along a small circle 1233, where the small circle 1233 corresponds to a circle smaller than the largest circle (e.g. circle 1236) on the surface of the sphere 1235.
  • the centre point of rotation is shown as a dot 1237 in Fig. 12C.
  • the axis of rotation is shown as an arrow 1247 in Fig. 12C.
  • Fig. 12D illustrates yet another way of describing object movement on the surface of sphere 1250, where source block 1252 is first moved to destination block 1254 by rotating along a big circle 1253 on the sphere 1250 around axis-a 1256. After the destination block reaches the final location, the destination block is rotated around axis-b 1257, where axis-b points from the centre of the big circle 1258 to the centre of destination block 1254.
  • Fig. 13 illustrates an exemplary procedure for Inter prediction that takes deformation based on rotation into account.
  • block 1313 corresponds to a source block in a 2D frame 1310.
  • Motion vector 1315 of the source block points from a location s_c 1312 in the source block 1313 to a destination location d_c 1314.
  • the data in the 2D frame are projected to the 3D sphere according to a corresponding projection type.
  • the 2D frame is generated from ERP
  • the ERP projection is used to project data in the 2D frame to the 3D sphere.
  • location s_c 1312 and location d_c 1314 in the 2D frame are projected to locations 1322 and 1324 in the 3D sphere 1320 respectively.
  • location 1322 is rotated to location 1324 around big circle 1326.
  • the same rotation is also applied to other locations of the source block 1313 to generate the destination block.
  • the rotation angle θ from the projected source location s_c′ to the projected destination location d_c′ is calculated according to θ = arccos (s_c′ · d_c′) .
  • the axis of rotation is calculated as a = (s_c′ × d_c′) / |s_c′ × d_c′| , as sketched below.
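A minimal numpy sketch of this axis/angle derivation, assuming the projected locations are given as unit 3D vectors (the function name is illustrative, not from the patent):

```python
import numpy as np

def rotation_axis_angle(s, d):
    """Axis and angle of the rotation carrying unit vector s to unit
    vector d along the big (great) circle through both points."""
    s = np.asarray(s, dtype=float); s /= np.linalg.norm(s)
    d = np.asarray(d, dtype=float); d /= np.linalg.norm(d)
    axis = np.cross(s, d)
    n = np.linalg.norm(axis)
    if n < 1e-12:                       # coincident or antipodal points
        return None, 0.0
    angle = np.arccos(np.clip(np.dot(s, d), -1.0, 1.0))
    return axis / n, angle
```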
  • samples s_mn in a block 1410 of the 2D frame are identified as shown in Fig. 14.
  • samples s_mn are mapped to samples 1422 in the 3D sphere 1420.
  • the samples are rotated by θ around the a-axis to obtain samples 1424 at the destination location according to Rodrigues' rotation formula: v_rot = v cos θ + (a × v) sin θ + a (a · v) (1 − cos θ) .
  • if a second rotation is specified (e.g. around axis-b in Fig. 12D) , the samples will be further rotated; otherwise, these are the final rotated samples in the 3D sphere. A sketch of this rotation step follows.
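The Rodrigues step itself can be sketched as below for an N x 3 array of sample positions on the sphere (a sketch, not the patent's implementation):

```python
import numpy as np

def rodrigues_rotate(samples, axis, theta):
    """Rotate an N x 3 array of sphere points by angle theta around the
    unit vector axis: v' = v cos t + (a x v) sin t + a (a.v)(1 - cos t)."""
    a = np.asarray(axis, dtype=float)
    a /= np.linalg.norm(a)
    v = np.atleast_2d(np.asarray(samples, dtype=float))
    return (v * np.cos(theta)
            + np.cross(a, v) * np.sin(theta)
            + np.outer(v @ a, a) * (1.0 - np.cos(theta)))
```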
  • the rotated samples 1512 in 3D sphere 1510 are projected back to a deformed block 1514 in the 2D frame and used as a new Inter predictor for a source block 1516 as shown in Fig. 15.
  • the source block 1516 may be a corresponding block in a reference picture prior to motion compensation for a current block.
  • the destination block 1514 corresponds to a deformed reference block for Inter prediction according to the present invention.
  • the source block 1526 is moved in the 2D frame according to the motion vector to point to a reference block 1524 and the reference block is used as Inter predictor for the current block.
  • the Inter predictor 1522 for a source block 1526 in the 2D frame maintains its shape. In the 3D space 1520, the Inter predictor 1522 becomes deformed. Therefore, the conventional Inter predictor does not perform properly due to deformation caused by movement in 3D space.
  • in Method 1, the rotation axis is the normal vector of the big circle.
  • in Method 2, a new rotation axis is used.
  • Fig. 16 compares the two different deformation methods based on rotation. The upper part corresponds to the case of Method 1, where a source block 1612 in the 2D frame 1610 is mapped to block 1622 in 3D sphere 1620. A motion vector 1616 in the 2D frame is mapped to the 3D sphere 1620 for determining the location of the destination block 1624. The source block 1622 is then rotated along the big circle 1626 around its rotation axis to generate the destination block 1624. The destination block 1624 is then mapped back to the 2D frame 1610 to generate the deformed block 1614 as an Inter predictor for the source block 1612.
  • the lower part corresponds to the deformation based on rotation according to Method 2, where the source block 1632 in the 2D frame 1630 is mapped to block 1642 in 3D sphere 1640.
  • a motion path 1636 in the 2D frame is mapped to the 3D sphere 1640 for determining the location of the destination block 1644.
  • the source block 1642 is then rotated along the small circle 1646 with a new rotation axis to generate the destination block 1644.
  • the destination block 1644 is then mapped back to the 2D frame 1630 to generate the deformed block 1634 as an Inter predictor for the source block 1632.
  • compared to the example in the upper part of Fig. 16 (i.e., rotation along a big circle) , the method in the lower part of Fig. 16 requires determining the small circle, or equivalently the axis of rotation.
  • a method to derive the axis of rotation is shown in Fig. 17, where s_c is the block centre of a source block 1712 and the mv pointing from s_c to d_c for the block is already known. Both s_c and d_c are mapped to locations s_c′ and d_c′ in the 3D sphere 1720 respectively.
  • a motion vector from s_c′ to d_c′ can be determined in the 3D sphere.
  • the motion vector mapping to the 3D sphere can be applied to all source blocks of the 2D frame as shown in 3D sphere 1730.
  • An axis of rotation a and an angle of rotation θ′ that rotate each motion vector mv (s_i) to mv_(a, θ′) (s_i) are selected based on a performance criterion, i.e. (a, θ′) = argmin_(a, θ′) Σ_i || mv (s_i) − mv_(a, θ′) (s_i) ||_F .
  • the true mv of s_i is mv (s_i) , and mv_(a, θ′) (s_i) is the motion vector caused by rotating by θ′ around axis a at s_i. A brute-force sketch of this search follows.
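A brute-force sketch of this selection criterion. The search grid, the names, and the modelling of a motion vector as mv (s_i) = rotate (s_i) − s_i on the unit sphere are assumptions for illustration; an encoder could use a finer search or a closed-form fit:

```python
import numpy as np
from itertools import product
from scipy.spatial.transform import Rotation

def estimate_rotation(points, mvs, n_dir=16, n_ang=33):
    """Search for the (axis, angle) minimizing the Frobenius norm of
    mv(s_i) - mv_axis,angle(s_i), where the rotation-induced motion of
    point p is modelled as rotate(p) - p."""
    pts = np.asarray(points, dtype=float)
    mvs = np.asarray(mvs, dtype=float)
    best, best_cost = (None, 0.0), np.inf
    lats = np.linspace(-np.pi / 2, np.pi / 2, n_dir)
    lons = np.linspace(0.0, 2 * np.pi, n_dir, endpoint=False)
    angs = np.linspace(-np.pi / 8, np.pi / 8, n_ang)
    for lat, lon, ang in product(lats, lons, angs):
        axis = np.array([np.cos(lat) * np.cos(lon),
                         np.sin(lat),
                         -np.cos(lat) * np.sin(lon)])
        induced = Rotation.from_rotvec(ang * axis).apply(pts) - pts
        cost = np.linalg.norm(induced - mvs)    # Frobenius norm over blocks
        if cost < best_cost:
            best, best_cost = (axis, ang), cost
    return best
```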
  • Fig. 18 illustrates a similar method to derive the axis of rotation, using motion vectors associated with processed blocks in the current picture instead of motion vectors in a reference picture.
  • Fig. 19A illustrates an example of an object (i.e., a tree) projected onto the surface of a sphere at different camera locations.
  • the tree is projected onto sphere 1910 to form an image 1940 of the tree.
  • the tree is projected onto sphere 1920 to form an image 1950 of the tree.
  • the image 1941 of the tree corresponding to camera position A is also shown on sphere 1920 for comparison.
  • the tree is projected onto sphere 1930 to form an image 1960 of the tree.
  • Image 1942 of the tree corresponding to camera position A and image 1951 of the tree corresponding to camera position B are also shown on sphere 1930 for comparison.
  • the direction of movement of the camera in 3D space can be represented by latitude and longitude coordinates, which correspond to the intersection of the motion vector and the 3D sphere.
  • this intersection point is projected onto the 2D target projection plane, and the projected point is referred to as a forward point (e.g. 1934) , as sketched below.
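A sketch of locating the forward point for the ERP case, reusing the coordinate conventions of equations (8) and (9) in the description below (the pixel mapping is an assumed convention):

```python
import numpy as np

def forward_point_erp(motion_dir, width, height):
    """Intersect the camera motion direction with the unit sphere and map
    it to ERP pixel coordinates (longitude -> x, latitude -> y)."""
    x, y, z = np.asarray(motion_dir, dtype=float) / np.linalg.norm(motion_dir)
    theta = np.arcsin(y)            # latitude, in the style of eq. (9)
    phi = np.arctan2(-z, x)         # longitude, in the style of eq. (8)
    u = (phi / (2 * np.pi) + 0.5) * width
    v = (0.5 - theta / np.pi) * height
    return u, v
```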
  • Fig. 19B illustrates the locations of the tree projected onto a 2D frame 1970 for camera locations A and B in 3D space as shown in Fig. 19A, where the tree image at location 1972 corresponds to camera location A and the tree image at location 1974 corresponds to camera location B.
  • the locations of the tree projected onto a 2D frame 1980 for camera locations B and C in 3D space is also shown in Fig. 19B, where the tree image at location 1982 corresponds to camera location A, the tree image at location 1984 corresponds to camera location B and the tree image at location 1986 corresponds to camera location C.
  • Fig. 20 illustrates an example of an ERP frame overlaid with the pattern of moving flows, where the flow of background (i.e., static object) can be determined if the camera forward point is known.
  • the flows are indicated by arrows.
  • the camera forward point 2010 and camera backward point 2020 are shown.
  • Moving flows correspond to the moving direction of video content when the camera moves in a given direction.
  • the movement of the camera causes relative movement of the static background objects, and the moving direction of a background object in the 2D frame captured by the camera can be represented as moving flows, as sketched below.
  • Multiple frames 2030 may be used to derive the moving flows.
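Geometrically, for a camera translating toward a forward direction f, a static background point at unit direction p appears to drift away from the forward point along the great circle through p and f. A hedged sketch of this flow direction (not the patent's exact formula):

```python
import numpy as np

def background_flow_dir(p, f):
    """Unit tangent at sphere point p of the apparent motion of static
    background when the camera translates toward unit direction f."""
    p = np.asarray(p, dtype=float); p /= np.linalg.norm(p)
    f = np.asarray(f, dtype=float); f /= np.linalg.norm(f)
    t = -(f - np.dot(f, p) * p)     # tangential component of -f at p
    n = np.linalg.norm(t)
    # flow is undefined exactly at the forward and backward points
    return t / n if n > 1e-12 else np.zeros(3)
```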
  • the pattern of moving flows based on displacement of viewpoint can be applied to various projection methods.
  • the moving flow in the 2D frame 2110 is shown for a CMP frame in the 2x3 layout format in Fig. 21.
  • Fig. 22 illustrates an example of using deformation based on displacement of camera for Inter prediction.
  • the moving flows are indicated by arrows.
  • the deformation of a source block (e.g. blocks 2231-2235) can be determined using the moving flows of the background for background objects.
  • Fig. 23 illustrates an exemplary procedure of using deformation based on displacement of camera for Inter prediction. The steps of this exemplary procedure are as follows:
  • the MV of a neighbouring block can determine the camera displacement based on the forward point of the camera and the moving flow of the frame.
  • the camera displacement and moving flow can determine the MV of each pixel of the predicted block.
  • the images for two different camera locations can be captured as shown in arrangement 2310.
  • the moving flows (indicated by arrows) and the camera forward point 2322 in the 2D frame 2320 can be determined.
  • the camera displacement and moving flow can then be used to determine the MV of each pixel of the predicted block 2324. Accordingly, a deformed block 2326 is derived and used as the Inter predictor for the current block 2324, as sketched below.
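A heavily hedged sketch of this step. It assumes a static background at roughly constant depth, models the apparent angular speed at point p as proportional to the camera displacement times sin (angular distance from the forward point f) divided by depth, calibrates the displacement from one neighbouring MV, and then predicts each pixel's MV along its flow line. All names and the speed model are illustrative assumptions, not the patent's equations:

```python
import numpy as np

def predict_block_mvs(block_pts, neigh_pt, neigh_mv, f, depth=1.0):
    """Per-pixel MVs of a predicted block from one neighbouring block's MV."""
    f = np.asarray(f, dtype=float); f /= np.linalg.norm(f)

    def flow(p):                        # unit flow direction at p (away from f)
        t = -(f - np.dot(f, p) * p)
        n = np.linalg.norm(t)
        return t / n if n > 1e-12 else np.zeros(3)

    def speed(p, disp):                 # assumed: |mv| ~ disp * sin(dist to f) / depth
        return disp * np.linalg.norm(np.cross(p, f)) / depth

    # camera displacement that reproduces the neighbouring block's observed MV
    disp = np.linalg.norm(neigh_mv) / max(speed(neigh_pt, 1.0), 1e-12)
    return [speed(p, disp) * flow(p) for p in np.asarray(block_pts, dtype=float)]
```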
  • the deformation based on displacement of camera for Inter prediction can be applied to various projections.
  • the moving flow in the 2D frame can be mapped to 3D sphere.
  • the moving flow on the 3D sphere 2410 is shown in Fig. 24, where the forward point and two different lines of moving flow (2412 and 2414) are shown.
  • the moving flow on 3D sphere associated with ERP 2420 is shown in Fig. 24, where the moving flows are shown for an ERP frame 2426.
  • the moving flow on 3D sphere associated with CMP 2430 is shown in Fig. 24, where the moving flows are shown for six faces of a CMP frame 2436 in a 2x3 layout format.
  • the moving flow on 3D sphere associated with OHP 2440 is shown in Fig. 24, where the moving flows are shown for eight faces of an OHP frame 2446.
  • the moving flow on 3D sphere associated with ISP 2450 is shown in Fig. 24, where the moving flows are shown for twenty faces of an ISP frame 2456.
  • the moving flow on 3D sphere associated with SSP 2460 is shown in Fig. 24, where the moving flows are shown for segmented faces of an SSP frame 2466.
  • Fig. 25 illustrates an exemplary flowchart of a system that applies rotation of sphere to deform a reference block for Inter prediction according to a method of the present invention.
  • the steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side.
  • the steps shown in the flowchart may also be implemented based on hardware, such as one or more electronic devices or processors arranged to perform the steps in the flowchart.
  • input data for a current block in a 2D (two-dimensional) frame are received in step 2510, where the 2D frame is projected from a 3D (three-dimensional) sphere.
  • a motion vector associated with a source block in the 2D frame is determined in step 2520, where the motion vector points from a source location in the source block to a destination location in the 2D frame.
  • the source location, the destination location and the source block in the 2D frame are projected onto the 3D sphere according to a target projection in step 2530.
  • the source block in the 3D sphere is rotated along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a deformed reference block in the 3D sphere in step 2540.
  • the deformed reference block in the 3D sphere is mapped back to the 2D frame according to an inverse target projection in step 2550.
  • the current block in the 2D frame is encoded or decoded using the deformed reference block in the 2D frame as an Inter predictor in step 2560.
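Putting steps 2510 to 2560 together for the ERP case, the sketch below generates the deformed reference block: project the block and the motion vector endpoints onto the sphere, rotate along the big circle, and map back to the 2D frame. Helper names are illustrative, and a real codec would additionally resample the rotated positions onto the pixel grid:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def erp_to_sphere(u, v, w, h):
    """ERP pixel -> unit vector (longitude and latitude linear in u, v)."""
    phi = (u / w - 0.5) * 2 * np.pi
    theta = (0.5 - v / h) * np.pi
    return np.array([np.cos(theta) * np.cos(phi),
                     np.sin(theta),
                     -np.cos(theta) * np.sin(phi)])

def sphere_to_erp(p, w, h):
    """Unit vector -> ERP pixel (inverse of erp_to_sphere)."""
    x, y, z = p / np.linalg.norm(p)
    phi, theta = np.arctan2(-z, x), np.arcsin(y)
    return (phi / (2 * np.pi) + 0.5) * w, (0.5 - theta / np.pi) * h

def deformed_reference(block_uv, src_uv, dst_uv, w, h):
    """Steps 2530-2550: project the block and the MV endpoints onto the
    sphere, rotate along the big circle, and map back to the ERP frame."""
    s = erp_to_sphere(*src_uv, w, h)
    d = erp_to_sphere(*dst_uv, w, h)
    axis = np.cross(s, d)
    axis /= np.linalg.norm(axis)                  # assumes s != d
    angle = np.arccos(np.clip(np.dot(s, d), -1.0, 1.0))
    rot = Rotation.from_rotvec(angle * axis)
    pts = np.array([erp_to_sphere(u, v, w, h) for u, v in block_uv])
    return [sphere_to_erp(p, w, h) for p in rot.apply(pts)]
```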
  • Fig. 26 illustrates an exemplary flowchart of a system that applies displacement of camera to deform a reference block for Inter prediction according to a method of the present invention.
  • Two 2D (two-dimensional) frames corresponding to two different viewpoints are received in step 2610, where said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere, and wherein a current block, a predicted block for the current block and a neighbouring block are located in said two 2D frames.
  • a forward point of camera is determined based on said two 2D frames in step 2620. Moving flows in said two 2D frames are determined in step 2630.
  • One or more second motion vectors associated with the predicted block are derived either by referring to one or more first motion vectors of the neighbouring block based on the forward point of camera and the moving flows or according to velocity of camera and depth of background in step 2640.
  • An Inter prediction block is derived based on the predicted block and said one or more second motion vectors in step 2650.
  • the current block in the 2D frame is encoded or decoded using the Inter prediction block in step 2660.
  • Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both.
  • an embodiment of the present invention can be one or more electronic circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein.
  • An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein.
  • DSP Digital Signal Processor
  • the invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) .
  • These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
  • the software code or firmware code may be developed in different programming languages and different formats or styles.
  • the software code may also be compiled for different target platforms.
  • different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Methods and apparatus of processing 360-degree virtual reality images are disclosed. According to one method, deformation along a circle on the sphere is used for Inter prediction of a 2D frame that is projected from a 3D space. A source block in a 2D frame is projected onto a 3D sphere. The source block on the 3D sphere is then rotated to a destination block, which is projected back to the 2D frame and used as an Inter predictor. In one embodiment, the rotation axis can be derived based on motion vectors associated with samples or blocks in a reference picture. In another embodiment, the rotation axis can be derived based on motion vectors associated with processed samples or blocks in the current picture. According to another method, deformation is derived from viewpoint displacement.

Description

METHOD AND APPARATUS OF INTER PREDICTION FOR IMMERSIVE VIDEO CODING
CROSS REFERENCE TO RELATED APPLICATIONS
The present invention claims priority to U.S. Provisional Patent Application, Serial No. 62/523,883, filed on June 23, 2017 and U.S. Provisional Patent Application, Serial No. 62/523,885, filed on June 23, 2017. The U.S. Provisional Patent Applications are hereby incorporated by reference in their entireties.
FIELD OF THE INVENTION
The present invention relates to image/video processing or coding for 360-degree virtual reality (VR) images/sequences. In particular, the present invention relates to Inter prediction for three-dimensional (3D) contents in various projection formats.
BACKGROUND AND RELATED ART
The 360-degree video, also known as immersive video, is an emerging technology that can provide a “sense of presence” . The sense of immersion is achieved by surrounding a user with a wrap-around scene covering a panoramic view, in particular, a 360-degree field of view. The “sense of presence” can be further improved by stereographic rendering. Accordingly, panoramic video is being widely used in Virtual Reality (VR) applications.
Immersive video involves capturing a scene using multiple cameras to cover a panoramic view, such as a 360-degree field of view. The immersive camera usually uses a panoramic camera or a set of cameras arranged to capture a 360-degree field of view. Typically, two or more cameras are used for the immersive camera. All videos must be taken simultaneously, and separate fragments (also called separate perspectives) of the scene are recorded. Furthermore, the set of cameras is often arranged to capture views horizontally, while other arrangements of the cameras are possible.
The 360-degree virtual reality (VR) images may be captured using a 360-degree spherical panoramic camera or multiple images arranged to cover the full 360-degree field of view. The three-dimensional (3D) spherical image is difficult to process or store using conventional image/video processing devices. Therefore, the 360-degree VR images are often converted to a two-dimensional (2D) format using a 3D-to-2D projection method. For example, equirectangular projection (ERP) and cubemap projection (CMP) are commonly used projection methods. Accordingly, a 360-degree image can be stored in an equirectangular projected format. The equirectangular projection maps the entire surface of a sphere onto a flat image. The vertical axis is latitude and the horizontal axis is longitude. Fig. 1 illustrates an example of projecting a sphere 110 into a rectangular image 120 according to equirectangular projection (ERP) , where each longitude line is mapped to a vertical line of the ERP picture. For the ERP projection, the areas near the north and south poles of the sphere are stretched more severely (i.e., from a single point to a line) than areas near the equator. Furthermore, due to distortions introduced by the stretching, especially near the two poles, predictive coding tools often fail to make good predictions, causing a reduction in coding efficiency. Fig. 2 illustrates a cube 210 with six faces, where a 360-degree virtual reality (VR) image can be projected to the six faces on the cube according to cubemap projection (CMP) . There are various ways to lift the six faces off the cube and repack them into a rectangular picture. The example shown in Fig. 2 divides the six faces into two parts (220a and 220b) , where each part consists of three connected faces. The two parts can be unfolded into two strips (230a and 230b) , where each strip corresponds to a continuous-face picture. The two strips can be combined into a compact rectangular frame according to a selected layout format.
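As an illustration of the cube mapping, the sketch below selects the CMP face and in-face coordinates for a viewing direction; the face labels and (u, v) sign conventions are assumptions for illustration, while the normative CMP tables are given in JVET-F1003:

```python
import numpy as np

def cmp_face(p):
    """Map a unit direction p to a cube face label and in-face (u, v)
    in [-1, 1]; the dominant coordinate axis selects the face."""
    x, y, z = np.asarray(p, dtype=float) / np.linalg.norm(p)
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return ('+X' if x > 0 else '-X', -z / x, y / ax)
    if ay >= ax and ay >= az:
        return ('+Y' if y > 0 else '-Y', x / ay, -z / y)
    return ('+Z' if z > 0 else '-Z', x / az, y / az)
```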
Both ERP and CMP formats have been included in the projection format conversion being considered for the next generation video coding as described in JVET-F1003 (Y. Ye, et al., “Algorithm descriptions of projection format conversion and video quality metrics in 360Lib” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 6th Meeting: Hobart, AU, 31 March–7 April 2017, Document: JVET-F1003) . Besides the ERP and CMP formats, there are various other VR projection formats, such as Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) and Rotated Sphere Projection (RSP) that are widely used in the field.
Fig. 3 illustrates an example of octahedron projection (OHP) , where a sphere is projected onto faces of an 8-face octahedron 310. The eight faces 320 lifted from the octahedron 310 can be converted to an intermediate format 330 by cutting open the face edge between faces 1 and 5 and rotating faces 1 and 5 to connect to faces 2 and 6 respectively, and applying a similar process to faces 3 and 7. The intermediate format can be packed into a rectangular picture 340.
Fig. 4 illustrates an example of icosahedron projection (ISP) , where a sphere is projected onto faces of a 20-face icosahedron 410. The twenty faces 420 from the icosahedron 410 can be packed into a rectangular picture 430 (referred to as a projection layout) .
Segmented sphere projection (SSP) has been disclosed in JVET-E0025 (Zhang et al., “AHG8: Segmented Sphere Projection for 360-degree video” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 5th Meeting: Geneva, CH, 12–20 January 2017, Document: JVET-E0025) as a method to convert a spherical image into an SSP format. Fig. 5 illustrates an example of segmented sphere projection, where a spherical image 500 is mapped into a North Pole image 510, a South Pole image 520 and an equatorial segment image 530. The boundaries of 3 segments correspond to latitudes 45°N (502) and 45°S (504) , where 0° corresponds to the equator (506) . The North and South Poles are mapped into 2 circular areas (i.e., 510 and 520) , and the projection of the equatorial segment can be the same as ERP or equal-area projection (EAP) . The diameter of the circle is equal to the width of the equatorial segments because both Pole segments and equatorial segment have a 90° latitude span. The North Pole image 510, South Pole image 520 and the equatorial segment image 530 can be packed into a rectangular image.
Fig. 6 illustrates an example of rotated sphere projection (RSP) , where the sphere 610 is partitioned into a middle 270°x90° region 620 and a residual part 622. Each part of RSP can be further stretched on the top side and the bottom side to generate a deformed part having an oval shape. The two oval-shaped parts can be fitted into a rectangular frame 630 as shown in Fig. 6.
The Adjusted Cubemap Projection format (ACP) is based on the CMP. If the two-dimensional coordinate (u’, v’) for CMP is determined, the two-dimensional coordinate (u, v) for ACP can be calculated by adjusting (u’, v’) according to a set of equations:
[The ACP adjustment equations (1) and (2) appear as images in the original filing and are not reproduced here.]
The 3D coordinates (X, Y, Z) can be derived using a table given the position (u, v) and the face index f. For 3D-to-2D coordinate conversion, given (X, Y, Z) , the (u’, v’) and face index f can be calculated according to a table for CMP. The 2D coordinates for ACP can be calculated according to a set of equations.
Similar to ERP, the EAP also maps a sphere surface to one face. In the (u, v) plane, u and v are in the range [0, 1] . For 2D-to-3D coordinate conversion, given the sampling position (m, n) , 2D coordinates (u, v) are first calculated in the same way as ERP. Then, the longitude and latitude (φ, θ) on the sphere can be calculated from (u, v) as:
φ = (u − 0.5) * (2*π)                           (3)
θ = sin⁻¹ (1.0 − 2*v)                            (4)
Finally, (X, Y, Z) can be calculated using the same equations as used for ERP:
X = cos (θ) cos (φ)                              (5)
Y = sin (θ)                                      (6)
Z = −cos (θ) sin (φ)                             (7)
Inversely, the longitude and latitude (φ, θ) can be evaluated from the (X, Y, Z) coordinates using:
φ = tan⁻¹ (−Z/X)                                 (8)
θ = sin⁻¹ (Y / (X² + Y² + Z²)^(1/2))             (9)
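A short sketch of the EAP 2D-to-3D conversion implementing equations (3) - (7) ; the sampling convention u = (m + 0.5) /W is an assumption in line with common practice:

```python
import numpy as np

def eap_to_sphere(m, n, w, h):
    """EAP sampling position (m, n) -> (X, Y, Z) via equations (3)-(7)."""
    u = (m + 0.5) / w
    v = (n + 0.5) / h
    phi = (u - 0.5) * 2 * np.pi              # eq. (3)
    theta = np.arcsin(1.0 - 2.0 * v)         # eq. (4)
    return np.array([np.cos(theta) * np.cos(phi),    # eq. (5)
                     np.sin(theta),                  # eq. (6)
                     -np.cos(theta) * np.sin(phi)])  # eq. (7)
```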
Since the images or video associated with virtual reality may take a lot of space to store or a lot of bandwidth to transmit, image/video compression is often used to reduce the required storage space or transmission bandwidth. Inter prediction has been a powerful coding tool that exploits inter-frame redundancy using motion estimation/compensation. If conventional Inter prediction is applied to the 2D frames converted from a 3D space, the motion estimation/compensation techniques may not work properly, since an object in the 3D space may become distorted or deformed in the 2D frames due to object movement or relative motion between an object and a camera. Accordingly, various Inter prediction techniques are developed to improve the accuracy of Inter prediction for 2D frames converted from a 3D space.
BRIEF SUMMARY OF THE INVENTION
Methods and apparatus of processing 360-degree virtual reality images are disclosed. According to one method, input data for a current block in a 2D (two-dimensional) frame are received, where the 2D frame is projected from a 3D (three-dimensional) sphere. A motion vector associated with a source block in the 2D frame is determined, where the motion vector points from a source location in the source block to a destination location in the 2D frame. The source location, the destination location and the source block in the 2D frame are projected onto the 3D sphere according to a target projection. The source block in the 3D sphere is rotated along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a deformed reference block in the 3D sphere. The deformed reference block in the 3D sphere is mapped back to the 2D frame according to an inverse target projection. The current block in the 2D frame is then encoded or decoded using the deformed reference  block in the 2D frame as a predictor.
In one embodiment, the rotation circle corresponds to a largest circle on the surface of the 3D sphere. In another embodiment, the rotation circle is smaller than a largest circle on the surface of the 3D sphere.
In one embodiment, the rotation circle on the surface of the 3D sphere around a rotation axis is determined according to the source location and the destination location on the 3D sphere. For example, a rotation axis a and a rotation angle θ_a associated with the rotation circle are derived according to θ_a = arccos (s_c · d_c) and a = (s_c × d_c) / |s_c × d_c| , wherein s_c and d_c correspond to the source location and the destination location on a surface of the 3D sphere respectively. In another embodiment, a rotation axis and a rotation angle associated with the rotation circle are derived based on motion vectors in a reference frame. For example, the rotation axis a and the rotation angle θ’ associated with the rotation circle are derived according to (a, θ’) = argmin_(a, θ’) Σ_i || mv (s_i) − mv_(a, θ’) (s_i) ||_F , wherein s_i corresponds to one source block in the reference frame, mv (s_i) corresponds to a motion vector of source block s_i, mv_(a, θ’) (s_i) corresponds to one motion vector caused by rotating one location in source block s_i by the rotation angle θ’ around the rotation axis a, and ||·||_F is the Frobenius norm. In yet another embodiment, the rotation axis and a rotation angle associated with the rotation circle are derived based on motion vectors in an already coded region in a current frame according to the same equation shown above.
The rotation axis associated with the rotation circle can be pre-defined or the rotation axis can be indicated in a bitstream for indicating a path of rotation.
The target projection corresponds to Equirectangular Projection (ERP) , Cubemap Projection (CMP) , Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) , Rotated Sphere Projection (RSP) , or Cylindrical Projection (CLP) .
In another method of the present invention, deformation based on displacement of camera is applied to generate an Inter predictor. According to this method, two 2D (two-dimensional) frames corresponding to two different viewpoints are received, where said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere, and wherein a current block, a predicted block for the current block and a neighbouring block are located in said two 2D frames. A forward point of camera is determined based on said two 2D frames and moving flows in said two 2D frames are determined. One or more second motion vectors associated with the predicted block are derived either by referring to one or more first motion vectors of the neighbouring block based on the forward point of camera and the moving flows or according to velocity of camera and depth of background. An Inter prediction block is derived based on the predicted block and said one or more second motion vectors. The current block in the 2D frame is encoded or decoded using the Inter prediction block.
Said deriving one or more second motion vectors associated with the predicted block may comprise determining displacement of camera based on said one or more first motion vectors associated with the neighbouring block. Said one or more second motion vectors associated with the predicted block can be derived from said one or more first motion vectors based on the displacement of camera and the moving flows.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates an example of projecting a sphere into a rectangular image according to equirectangular projection, where each longitude line is mapped to a vertical line of the ERP picture.
Fig. 2 illustrates a cube with six faces, where a 360-degree virtual reality (VR) image can be projected to the six faces on the cube according to cubemap projection (CMP) .
Fig. 3 illustrates an example of octahedron projection (OHP) , where a sphere is projected onto faces of an 8-face octahedron.
Fig. 4 illustrates an example of icosahedron projection (ISP) , where a sphere is projected onto faces of a 20-face icosahedron.
Fig. 5 illustrates an example of segmented sphere projection (SSP) , where a spherical image is mapped into a North Pole image, a South Pole image and an equatorial segment image.
Fig. 6 illustrates an example of rotated sphere projection (RSP) , where the sphere is partitioned into a middle 270°x90° region and a residual part. These two parts of RSP can be further stretched on the top side and the bottom side to generate deformed parts having oval-shaped boundary on the top part and bottom part.
Fig. 7 illustrates a case that deformation occurs in an ERP frame due to movement, where the North Pole is mapped to a horizontal line on the top of the frame and the equator is mapped to a horizontal line in the middle of the frame.
Fig. 8 illustrates an example of deformation caused by movement in the 3D sphere for the ERP frame.
Fig. 9A to Fig. 9I illustrate examples of deformation in 2D frames projected using various projections, where Fig. 9A is for an ERP frame, Fig. 9B is for a CMP frame, Fig. 9C is for an SSP frame, Fig. 9D is for an OHP frame, Fig. 9E is for an ISP frame, Fig. 9F is for an EAP frame, Fig. 9G is for an ACP frame, Fig. 9H is for an RSP frame, and Fig. 9I is for a Cylindrical Projection frame.
Fig. 10 illustrates the concept of Inter prediction taking into account deformation based on rotation according to a method of the present invention.
Fig. 11 illustrates examples in which a source block is moved to a destination location through different paths. Due to the different paths, the block at the destination location may have different orientations.
Fig. 12A illustrates an example to describe the 3D movement on a sphere by the yaw, pitch and roll rotations, where a source block is moved to a destination location as prescribed by the three axes.
Fig. 12B illustrates an example to describe the 3D movement on a sphere by rotating around a big circle, where a source block is moved to a destination location along a big circle.
Fig. 12C illustrates an example to describe the 3D movement on a sphere by rotating around a small circle, where a source block is moved to a destination location along a small circle.
Fig. 12D illustrates yet another way of describing object movement on the surface of a sphere, where a source block is first moved to a destination block by rotating on a big circle 1253 on the sphere around a rotation axis, and the destination block is then rotated around another axis.
Fig. 13 illustrates an exemplary procedure for Inter prediction taking into account deformation based on rotation.
Fig. 14 illustrates the step of generating samples for a destination block by rotating samples in a source block by a rotation angle and axis of rotation after the rotation angle and axis of rotation are determined.
Fig. 15 compares an exemplary procedure for Inter prediction according to a method of the present invention, taking into account deformation based on rotation, with conventional Inter prediction.
Fig. 16 compares the two different deformation methods based on rotation, where the upper part corresponds to a case of rotation along a big circle and the lower one corresponds to rotation around a new rotation axis.
Fig. 17 illustrates a method to derive the axis of rotation using motion vectors associated with blocks of a reference picture.
Fig. 18 illustrates a method to derive the axis of rotation using motion vectors associated with processed blocks in the current picture.
Fig. 19A illustrates an exemplary process of the deformation based on displacement of camera, where an example of an object (i.e., a tree) is projected onto the surface of a sphere at different camera locations.
Fig. 19B illustrates the locations of an object projected onto a 2D frame for different camera locations.
Fig. 20 illustrates an example of an ERP frame overlaid with the pattern of moving flows, where the flow of background (i.e., static object) can be determined if the camera forward point is known.
Fig. 21 illustrates an example of the pattern of moving flows based on displacement of viewpoint for a CMP frame in the 2x3 layout format.
Fig. 22 illustrates an example of using deformation based on displacement of camera for Inter prediction.
Fig. 23 illustrates an exemplary procedure of using deformation based on displacement of camera for Inter prediction.
Fig. 24 illustrates exemplary moving flows in the 2D frame for various projection formats.
Fig. 25 illustrates an exemplary flowchart of a system that applies rotation of sphere to deform a reference block for Inter prediction according to a method of the present invention.
Fig. 26 illustrates an exemplary flowchart of a system that applies displacement of camera to deform a reference block for Inter prediction according to a method of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
For conventional Inter prediction in video coding, motion estimation/compensation is widely used to exploit correlation in video data in order to reduce the transmitted information. Conventional video contents correspond to 2D video data, and the motion estimation and compensation techniques often assume a translational motion. In next-generation video coding, more advanced motion models, such as the affine model, are considered. Nevertheless, these techniques derive a 3D motion model based on the 2D images.
When contents are projected from a 3D space onto a 2D frame, deformation occurs for various projection types. For example, an ERP frame 720 is generated by projecting a 3D sphere 710 onto a rectangular frame as shown in Fig. 7, where the North Pole 712 is mapped to a horizontal line 722 on the top of the frame and the equator 714 is mapped to a horizontal line 724 in the middle of the frame. As shown in Fig. 7, a single point at the North Pole and the entire equator are both mapped to horizontal lines of the same length. Therefore, the ERP projection substantially stretches an object in the longitudinal direction when the object is close to the North Pole. Map 730 illustrates the effect of deformation in ERP, where a circle close to the North Pole or South Pole is mapped to an oval-shaped area, while a circle in the middle of the frame remains a circle.
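To make the stretch concrete: every row of an ERP frame has the same pixel width, but a circle of latitude has circumference proportional to cos(latitude), so content is stretched horizontally by 1/cos(latitude). The following is a minimal Python sketch, illustrative only; the frame size W×H and the pixel-to-angle mapping conventions are our assumptions, not part of the disclosure:

```python
import numpy as np

def erp_pixel_to_lonlat(u, v, W, H):
    """Map ERP pixel (u, v) to longitude/latitude, assuming longitude spans
    [-pi, pi) across the width and latitude spans [pi/2, -pi/2] top to bottom."""
    lon = (u + 0.5) / W * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (v + 0.5) / H * np.pi
    return lon, lat

def horizontal_stretch(lat):
    """ERP stretch factor in the longitudinal direction at a given latitude:
    a latitude circle has circumference 2*pi*cos(lat) on the unit sphere but
    is mapped to a full-width row, so content is stretched by 1/cos(lat)."""
    return 1.0 / np.cos(lat)

# Content at 60 degrees north is stretched about 2x; near the pole it diverges.
print(horizontal_stretch(np.deg2rad(60)))   # ~2.0
print(horizontal_stretch(np.deg2rad(89)))   # ~57.3
```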
Fig. 8 illustrates an example of deformation caused by movement in the 3D sphere for the ERP frame. In Fig. 8, block 801 on a 3D sphere 800 is moved to become block 802. When block 801 and the moved block 802 are mapped to an ERP frame 803, the two corresponding blocks become blocks 804 and 805 in the ERP frame. While the blocks (i.e., 801 and 802) on the 3D sphere correspond to the same block, the two blocks (i.e., 804 and 805) have different shapes in the ERP frame.
Fig. 9A to Fig. 9I illustrate examples of deformation in 2D frames projected using various projections. Fig. 9A illustrates an example for an ERP frame 910, where a square block 912 becomes deformed when the block is moved to a location 914 closer to the North Pole. Fig. 9B illustrates an example for a CMP frame 920, where a square block 922 becomes deformed when the block is moved to a location 924. Fig. 9C illustrates an example for an SSP frame 930, where a square block 932 becomes deformed when the block is moved to a location 934. Fig. 9D illustrates an example for an OHP frame 940, where a square block 942 becomes deformed when the block is moved to a location 944. Fig. 9E illustrates an example for an ISP frame 950, where a square block 952 becomes deformed when the block is moved to a location 954. Fig. 9F illustrates an example for an EAP frame 970, where a square block 972 becomes deformed when the block is moved to a location 974. Fig. 9G illustrates an example for an ACP frame 980, where a square block 982 becomes deformed when the block is moved to a location 984. Fig. 9H illustrates an example for an RSP frame 980, where a square block 982 becomes deformed when the block is moved to a location 984. Fig. 9I illustrates an example for a Cylindrical Projection frame 990, where a square block 992 becomes deformed when the block is moved to a location 994.
As described above, object motion in a 3D space may cause object deformation in a 2D frame projected from the 3D sphere. In order to overcome the issue associated with deformation, various methods of Inter prediction for VR Video Processing are disclosed.
Method 1 - Deformation based on Rotation
A method to handle the deformation issue for Inter prediction is to project a block in the 2D frame back to the 3D sphere. The block may be a corresponding block in a reference picture prior to motion compensation for a current block. In conventional Inter prediction, the corresponding block is moved in the 2D frame according to the motion vector to point to a reference block, and the reference block is used as the Inter predictor for the current block. According to this method of the present invention, the block is moved to a designated location on the surface of the 3D sphere. In particular, the block is moved to a new location by rotating the sphere. Finally, the object moved on the surface of the 3D sphere is projected back to the 2D frame. Fig. 10 illustrates the concept of Inter prediction taking into account deformation based on rotation. In Fig. 10, block 1013 corresponds to a source block in a 2D frame 1010. Motion vector 1015 of the source block points from a location $s_c$ 1012 in the source block 1013 to a destination location $d_c$ 1014. According to the present method, the data in the 2D frame are projected to the 3D sphere according to a corresponding projection type. For example, if the 2D frame is generated from ERP, the ERP projection is used to project data in the 2D frame to the 3D sphere. Accordingly, location $s_c$ 1012 and location $d_c$ 1014 in the 2D frame are projected to locations $\vec{s}_c$ 1022 and $\vec{d}_c$ 1024 in the 3D sphere 1020 respectively. In the 3D space, location $\vec{s}_c$ 1022 is rotated to location $\vec{d}_c$ 1024. The same rotation is also applied to the other locations of the source block 1013 to generate the destination block. The data in the destination block in the 3D sphere are then projected back to the 2D frame using an inverse ERP projection.
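The 2D-to-sphere projection and its inverse used in this procedure can be sketched concretely for ERP; the unit-sphere axis convention below (z toward the North Pole) is our assumption, as the disclosure does not fix one:

```python
import numpy as np

def erp_pixel_to_vec(u, v, W, H):
    """Project an ERP pixel to a unit vector on the 3D sphere
    (z axis toward the North Pole, assumed convention)."""
    lon = (u + 0.5) / W * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (v + 0.5) / H * np.pi
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def vec_to_erp_pixel(p, W, H):
    """Inverse ERP projection: unit vector on the sphere back to a pixel."""
    lon = np.arctan2(p[1], p[0])
    lat = np.arcsin(np.clip(p[2], -1.0, 1.0))
    u = (lon + np.pi) / (2.0 * np.pi) * W - 0.5
    v = (np.pi / 2.0 - lat) / np.pi * H - 0.5
    return u, v

# Round trip for one pixel of a 4096x2048 ERP frame:
p = erp_pixel_to_vec(1024, 512, 4096, 2048)
print(vec_to_erp_pixel(p, 4096, 2048))  # ~(1024.0, 512.0)
```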
Fig. 11 illustrates examples in which a source block is moved to a destination location through different paths. Due to the different paths, the block at the destination location may have different orientations. In 3D sphere 1110, the source block 1112 is moved to the destination location 1114 through path 1113 with a slight right turn. In 3D sphere 1120, the source block 1122 is moved to the destination location 1124 through the straight path 1123. In 3D sphere 1130, the source block 1132 is moved to the destination location 1134 through path 1133 with a slight left turn.
One way to describe the 3D movement on a sphere is by the yaw, pitch and roll rotations 1210 as shown in Fig. 12A, where a source block 1212 is moved to a destination location 1214. The axes of yaw 1216, pitch 1217 and roll 1218 are shown. Another way to describe the 3D movement on a sphere 1220 is rotation around a big circle as shown in Fig. 12B, where a source block 1222 is moved to a destination location 1224 along a big circle 1221. The rotation 1226 is shown. The big circle 1221 corresponds to the largest circle on the surface of the sphere 1220. Fig. 12C illustrates an example of rotation of the sphere 1230 from source block 1232 to destination block 1234 on a small circle 1233 on the sphere 1235, where the small circle 1233 corresponds to a circle smaller than the largest circle (e.g. circle 1236) on the surface of the sphere 1235. The centre point of rotation is shown as a dot 1237 in Fig. 12C. Another perspective of the rotation of the sphere 1240 from source block 1242 to destination block 1244 on a small circle 1243 on the sphere 1245 is also shown in Fig. 12C, where the small circle 1243 corresponds to a circle smaller than the largest circle (e.g. circle 1246) on the surface of the sphere 1245; the axis of rotation is shown as an arrow 1247. Fig. 12D illustrates yet another way of describing object movement on the surface of sphere 1250, where source block 1252 is first moved to destination block 1254 by rotating on a big circle 1253 on the sphere 1250 around axis-a 1256. After the destination block reaches the final location, the destination block is rotated around axis-b 1257, where axis-b points from the centre of the big circle 1258 to the centre of destination block 1254.
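By Euler's rotation theorem, any yaw/pitch/roll composition (Fig. 12A) is equivalent to a single rotation around one axis (Fig. 12B or Fig. 12C). A short numpy sketch of that equivalence follows; the axis conventions and angle values are our illustration only:

```python
import numpy as np

def rot_z(a):  # yaw around z
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(a):  # pitch around y
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_x(a):  # roll around x
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Compose a yaw/pitch/roll movement ...
R = rot_z(np.deg2rad(30)) @ rot_y(np.deg2rad(10)) @ rot_x(np.deg2rad(5))

# ... and recover the single equivalent axis (eigenvector of R for eigenvalue 1,
# up to sign) and angle (from the trace of R).
w, V = np.linalg.eig(R)
axis = np.real(V[:, np.argmin(np.abs(w - 1.0))])
angle = np.arccos((np.trace(R) - 1.0) / 2.0)
print(axis, np.rad2deg(angle))
```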
Fig. 13 illustrates an exemplary procedure for Inter prediction taking into account deformation based on rotation. In Fig. 13, block 1313 corresponds to a source block in a 2D frame 1310. Motion vector 1315 of the source block points from a location $s_c$ 1312 in the source block 1313 to a destination location $d_c$ 1314. According to the present method, the data in the 2D frame are projected to the 3D sphere according to a corresponding projection type. For example, if the 2D frame is generated from ERP, the ERP projection is used to project data in the 2D frame to the 3D sphere. Accordingly, location $s_c$ 1312 and location $d_c$ 1314 in the 2D frame are projected to locations $\vec{s}_c$ 1322 and $\vec{d}_c$ 1324 in the 3D sphere 1320 respectively. In the 3D space, location $\vec{s}_c$ 1322 is rotated to location $\vec{d}_c$ 1324 around the big circle 1326. The same rotation is also applied to the other locations of the source block 1313 to generate the destination block. In Fig. 13, the rotation angle $\theta$ from $\vec{s}_c$ to $\vec{d}_c$ is calculated according to:

$\theta = \cos^{-1}\left(\frac{\vec{s}_c \cdot \vec{d}_c}{\|\vec{s}_c\|\,\|\vec{d}_c\|}\right)$
The axis of rotation $\vec{a}$ is calculated as:

$\vec{a} = \frac{\vec{s}_c \times \vec{d}_c}{\|\vec{s}_c \times \vec{d}_c\|}$
After the rotation angle $\theta$ and the axis of rotation $\vec{a}$ are determined, samples $s_{mn}$ in a block 1410 of the 2D frame are identified as shown in Fig. 14. Samples $s_{mn}$ are mapped to $\vec{s}_{mn}$ 1422 in the 3D sphere 1420. Samples $\vec{s}_{mn}$ are rotated by $\theta$ around the $\vec{a}$-axis to obtain $\vec{d}_{mn}$ 1424 at the destination location according to Rodrigues' rotation formula:

$\vec{d}_{mn} = \vec{s}_{mn}\cos\theta + (\vec{a}\times\vec{s}_{mn})\sin\theta + \vec{a}\,(\vec{a}\cdot\vec{s}_{mn})(1-\cos\theta)$

If the block at the destination location is further rotated around axis-b (i.e., $\varphi \neq 0$) as shown in Fig. 12D, samples $\vec{d}_{mn}$ will be further rotated. Otherwise, the samples $\vec{d}_{mn}$ are the final rotated samples in the 3D sphere.
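The angle, axis and Rodrigues formulas above can be exercised with a minimal numeric sketch, assuming unit-sphere points; the function names are ours:

```python
import numpy as np

def rotation_from_vectors(s_c, d_c):
    """Rotation angle and unit axis that move s_c to d_c along the big circle.
    Degenerate when s_c and d_c are parallel (no unique axis)."""
    s_c = s_c / np.linalg.norm(s_c)
    d_c = d_c / np.linalg.norm(d_c)
    theta = np.arccos(np.clip(np.dot(s_c, d_c), -1.0, 1.0))
    axis = np.cross(s_c, d_c)
    return theta, axis / np.linalg.norm(axis)

def rodrigues(p, axis, theta):
    """Rotate point p by theta around the unit axis (Rodrigues' formula)."""
    return (p * np.cos(theta)
            + np.cross(axis, p) * np.sin(theta)
            + axis * np.dot(axis, p) * (1.0 - np.cos(theta)))

# Rotating (1,0,0) onto (0,1,0): axis is z, angle is 90 degrees.
s = np.array([1.0, 0.0, 0.0])
d = np.array([0.0, 1.0, 0.0])
theta, axis = rotation_from_vectors(s, d)
print(rodrigues(s, axis, theta))  # ~[0, 1, 0]
```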
According to the present method, the rotated samples $\vec{d}_{mn}$ 1512 in the 3D sphere 1510 are projected back to a deformed block 1514 in the 2D frame and used as a new Inter predictor for a source block 1516 as shown in Fig. 15. The source block 1516 may be a corresponding block in a reference picture prior to motion compensation for a current block. The destination block 1514 corresponds to a deformed reference block for Inter prediction according to the present invention. In conventional Inter prediction, the source block 1526 is moved in the 2D frame according to the motion vector to point to a reference block 1524, and the reference block is used as the Inter predictor for the current block. Compared to the conventional approach, the Inter predictor 1522 for a source block 1526 in the 2D frame maintains its shape; in the 3D space 1520, the Inter predictor 1522 becomes deformed. Therefore, the conventional Inter predictor does not perform properly due to deformation caused by movement in 3D space.
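Putting the pieces together, one possible, non-normative sketch of building the deformed predictor for one block: project each sample to the sphere, rotate it, project back, and fetch from the reference frame. This reuses erp_pixel_to_vec, vec_to_erp_pixel and rodrigues from the sketches above and uses a simple clamped bilinear fetch; a real ERP implementation would also wrap horizontally:

```python
import numpy as np

def bilinear_sample(frame, u, v):
    """Bilinear fetch at a fractional location, clamped at frame borders."""
    h, w = frame.shape
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    fx, fy = u - x0, v - y0
    x0c, x1c = np.clip(x0, 0, w - 1), np.clip(x0 + 1, 0, w - 1)
    y0c, y1c = np.clip(y0, 0, h - 1), np.clip(y0 + 1, 0, h - 1)
    top = (1 - fx) * frame[y0c, x0c] + fx * frame[y0c, x1c]
    bot = (1 - fx) * frame[y1c, x0c] + fx * frame[y1c, x1c]
    return (1 - fy) * top + fy * bot

def deformed_predictor(ref_frame, x0, y0, bw, bh, axis, theta):
    """Deformed Inter predictor for a bw x bh block at (x0, y0):
    2D -> sphere -> rotate -> sphere -> 2D, one sample at a time."""
    H, W = ref_frame.shape
    pred = np.zeros((bh, bw))
    for m in range(bh):
        for n in range(bw):
            s = erp_pixel_to_vec(x0 + n, y0 + m, W, H)  # sample on the sphere
            r = rodrigues(s, axis, theta)               # rotated sample
            u, v = vec_to_erp_pixel(r, W, H)            # back to the 2D frame
            pred[m, n] = bilinear_sample(ref_frame, u, v)
    return pred
```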
Method 2 - Deformation based on Rotation
In this disclosure, another deformation based on rotation is proposed. In Method 1, the rotation axis is the normal vector of the big circle (i.e., $\vec{a} = \vec{s}_c \times \vec{d}_c$). However, in Method 2, a new rotation axis $\vec{a}'$ is used. Fig. 16 compares the two different deformation methods based on rotation. The upper part corresponds to the case of Method 1, where a source block 1612 in the 2D frame 1610 is mapped to block 1622 in the 3D sphere 1620. A motion vector 1616 in the 2D frame is mapped to the 3D sphere 1620 for determining the location of the destination block 1624. The source block 1622 is then rotated along the big circle 1626 around the rotation axis $\vec{a}$ to generate the destination block 1624. The destination block 1624 is then mapped back to the 2D frame 1610 to generate the deformed block 1614 as an Inter predictor for the source block 1612.
In Fig. 16, the lower part corresponds to the deformation based on rotation according to Method 2, where the source block 1632 in the 2D frame 1630 is mapped to block 1642 in the 3D sphere 1640. A motion path 1636 in the 2D frame is mapped to the 3D sphere 1640 for determining the location of the destination block 1644. The source block 1642 is then rotated along the small circle 1646 around a new rotation axis $\vec{a}'$ to generate the destination block 1644. The destination block 1644 is then mapped back to the 2D frame 1630 to generate the deformed block 1634 as an Inter predictor for the source block 1632. It is observed that the example in the upper part of Fig. 16 (i.e., rotation along a big circle) is a special case of the example in the lower part of Fig. 16.
In Fig. 16, the small circle, or equivalently the axis of rotation, needs to be determined. A method to derive the axis of rotation is shown in Fig. 17, where $s_c$ is the block centre of a source block 1712 and the mv pointing from $s_c$ to $d_c$ is already known. Both $s_c$ and $d_c$ are mapped to $\vec{s}_c$ and $\vec{d}_c$ in the 3D sphere 1720 respectively. A motion vector from $\vec{s}_c$ to $\vec{d}_c$ can then be determined in the 3D sphere. The same motion-vector mapping to the 3D sphere can be applied to all source blocks of the 2D frame, as shown in 3D sphere 1730. An axis of rotation that rotates motion vector $mv(s_i)$ to $mv_{\vec{a},\theta}(s_i)$ is selected based on a performance criterion, where $\vec{a}'$ corresponds to the axis of rotation and $\theta'$ corresponds to the angle of rotation. For block centres $s_i$, i = 1, 2, …, n, the true mv of $s_i$ is $mv(s_i)$, and $mv_{\vec{a},\theta}(s_i)$ is the motion vector obtained by rotating by $\theta$ around axis $\vec{a}$ at $s_i$. Solve:

$(\vec{a}', \theta') = \arg\min_{\vec{a},\theta} \sum_{i=1}^{n} \left\| mv(s_i) - mv_{\vec{a},\theta}(s_i) \right\|_F$

where $\|\cdot\|_F$ is the F-norm (Frobenius norm).
The equation provides a way to select an axis of rotation and a rotation angle that achieve the best match between the set of mapped motion vectors and the set of rotated motion vectors. In practice, different ways can be applied to find the rotation axis. Fig. 18 illustrates ways to derive the axis of rotation, including using motion vectors associated with processed blocks in the current picture, as follows (a coarse search implementing the last two options is sketched after the list):
● Pre-define a rotation (e.g. yaw = 0°/pitch = -90°, or any value encoded in a bitstream), as shown in the 3D space 1810,
● Find the best-matching rotation axis based on motion vectors in the reference frame, as shown in the 3D space 1820, or
● Find the best-matching rotation axis based on motion vectors of already compressed blocks in the current frame, as shown in the 3D space 1830.
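A coarse grid search over candidate axes and angles is one way to approximate the arg-min above. This sketch reuses rodrigues from the Method 1 sketch, and models each mv(s_i) as a 3D displacement of the block centre on the sphere, which is our assumption rather than a representation fixed by the disclosure:

```python
import numpy as np

def best_rotation(centres, mvs, cand_axes, cand_angles):
    """Pick the candidate (axis, angle) minimizing
    sum_i || mv(s_i) - mv_{a,theta}(s_i) ||, where the induced vector
    mv_{a,theta}(s_i) is rodrigues(s_i, a, theta) - s_i."""
    best_axis, best_angle, best_cost = None, None, np.inf
    for a in cand_axes:
        a = a / np.linalg.norm(a)
        for th in cand_angles:
            cost = sum(np.linalg.norm(mv - (rodrigues(s, a, th) - s))
                       for s, mv in zip(centres, mvs))
            if cost < best_cost:
                best_axis, best_angle, best_cost = a, th, cost
    return best_axis, best_angle
```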
Method 3 - Deformation Based on Displacement of Camera
According to another method, deformation based on displacement of camera is disclosed. Fig. 19A illustrates an example of an object (i.e., a tree) projected onto the surface of a sphere at different camera locations. At camera position A, the tree is projected onto sphere 1910 to form an image 1940 of the tree. At a forward position B, the tree is projected onto sphere 1920 to form an image 1950 of the tree. The image 1941 of the tree corresponding to camera position A is also shown on sphere 1920 for comparison. At a further forward position C, the tree is projected onto sphere 1930 to form an image 1960 of the tree. Image 1942 of the tree corresponding to camera position A and image 1951 of the tree corresponding to camera position B are also shown on sphere 1930 for comparison. In Fig. 19A, for a video captured by a camera moving in a straight line, the direction of movement of the camera in 3D space (as indicated by arrows 1912, 1922 and 1932 for three different camera locations in Fig. 19A) can be represented by latitude and longitude coordinates $(\theta, \varphi)$, where $(\theta, \varphi)$ corresponds to the intersection of the motion vector and the 3D sphere. The point $(\theta, \varphi)$ is projected to the 2D target projection plane, and the projected point is referred to as the forward point (e.g. 1934).
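A small sketch of the forward-point computation for ERP, assuming the camera moving direction is given as a 3D vector and using the same axis conventions as in the earlier sketches:

```python
import numpy as np

def forward_point_erp(cam_dir, W, H):
    """Project the camera's 3D moving direction to its ERP forward point.
    The intersection of the motion vector with the unit sphere is simply the
    normalized direction; its longitude/latitude locate the forward point."""
    d = cam_dir / np.linalg.norm(cam_dir)
    lon = np.arctan2(d[1], d[0])
    lat = np.arcsin(d[2])
    u = (lon + np.pi) / (2.0 * np.pi) * W
    v = (np.pi / 2.0 - lat) / np.pi * H
    return u, v

# A camera moving along +x maps to the centre of a 4096x2048 ERP frame.
print(forward_point_erp(np.array([1.0, 0.0, 0.0]), 4096, 2048))  # (2048, 1024)
```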
Fig. 19B illustrates the locations of the tree projected onto a 2D frame 1970 for camera locations A and B in 3D space as shown in Fig. 19A, where the tree image at location 1972 corresponds to camera location A and the tree image at location 1974 corresponds to camera location B. The locations of the tree projected onto a 2D frame 1980 for camera locations B and C in 3D space are also shown in Fig. 19B, where the tree image at location 1982 corresponds to camera location A, the tree image at location 1984 corresponds to camera location B and the tree image at location 1986 corresponds to camera location C.
Fig. 20 illustrates an example of an ERP frame overlaid with the pattern of moving flows, where the flow of the background (i.e., static objects) can be determined if the camera forward point is known. The flows are indicated by arrows. The camera forward point 2010 and the camera backward point 2020 are shown. Moving flows correspond to the moving direction of the video content when the camera moves in a given direction: the movement of the camera causes an apparent relative movement of the static background objects, and the moving direction of a background object in the 2D frame captured by the camera can be represented as a moving flow. Multiple frames 2030 may be used to derive the moving flows.
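On the sphere, the background flow at a point follows the great circle through the forward point and that point, directed away from the forward point. This geometric reading (ours, not a formula given in the disclosure) can be sketched as:

```python
import numpy as np

def background_flow(p, f, eps=1e-9):
    """Unit 3D flow direction of the static background at sphere point p,
    for unit camera moving direction f (the forward point). The flow is the
    tangent of the great circle through f and p, pointing away from f, i.e.
    minus the component of f perpendicular to p; it vanishes exactly at the
    forward and backward points."""
    f_perp = f - np.dot(f, p) * p
    n = np.linalg.norm(f_perp)
    if n < eps:
        return np.zeros(3)
    return -f_perp / n
```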
The pattern of moving flows based on displacement of viewpoint can be applied to various projection methods. The moving flow in the 2D frame 2110 is shown for a CMP frame in the 2x3 layout format in Fig. 21.
Fig. 22 illustrates an example of using deformation based on displacement of camera for Inter prediction. In the 2D frame 2210, the moving flows are indicated by arrows. For each source block (e.g. blocks 2221-2225), the deformation of the source block (e.g. blocks 2231-2235) can be determined using the moving flows of the background for background objects.
Fig. 23 illustrates an exemplary procedure of using deformation based on displacement of camera for Inter prediction. The steps of this exemplary procedure are as follows:
1. Find the forward point of camera. An example to derive the forward point has been described in Fig. 19A and the associated text.
2. Calculate the moving flow of the frame, which corresponds to the tangent direction at each pixel.
3. Determine the motion vector of each pixel by referring to the motion vector of a neighbouring block, or according to the velocity of camera and the depth of background:
■ The MV of a neighbouring block can determine the camera displacement based on the forward point of camera and the moving flow of the frame.
■ The camera displacement and the moving flow can then determine the MV of each pixel of the predicted block (a sketch follows the example below).
For example, the images for two different camera locations can be captured as shown in arrangement 2310. The moving flows (indicated by arrows) and the camera forward point 2322 in the 2D frame 2320 can be determined. The camera displacement and moving flow can then be used to determine the MV of each pixel of the predicted block 2324. Accordingly, a deformed block 2326 is derived and used as the Inter predictor for the current block 2324.
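One plausible, non-normative reading of step 3 in code: project the neighbouring block's MV onto the local flow direction to estimate a scalar camera displacement, then reapply that scalar along the unit flow direction at each pixel of the predicted block. Every name below is our own illustration:

```python
import numpy as np

def block_pixel_mvs(mv_neighbour, flow_at_neighbour, flow_at_pixels):
    """Estimate per-pixel MVs of the predicted block from one neighbour.
    flow_at_pixels has shape (bh, bw, 2): one 2D flow direction per pixel."""
    f = flow_at_neighbour / np.linalg.norm(flow_at_neighbour)
    displacement = np.dot(mv_neighbour, f)       # scalar along the flow
    unit = flow_at_pixels / np.linalg.norm(flow_at_pixels, axis=-1, keepdims=True)
    return displacement * unit                   # one MV per pixel
```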
The deformation based on displacement of camera for Inter prediction can be applied to various projections. The moving flow in the 2D frame can be mapped to the 3D sphere. The moving flow on the 3D sphere 2410 is shown in Fig. 24, where the forward point and two different lines of moving flow (2412 and 2414) are shown. The moving flow on the 3D sphere associated with ERP 2420 is shown in Fig. 24, where the moving flows are shown for an ERP frame 2426. The moving flow on the 3D sphere associated with CMP 2430 is shown in Fig. 24, where the moving flows are shown for six faces of a CMP frame 2436 in a 2x3 layout format. The moving flow on the 3D sphere associated with OHP 2440 is shown in Fig. 24, where the moving flows are shown for eight faces of an OHP frame 2446. The moving flow on the 3D sphere associated with ISP 2450 is shown in Fig. 24, where the moving flows are shown for twenty faces of an ISP frame 2456. The moving flow on the 3D sphere associated with SSP 2460 is shown in Fig. 24, where the moving flows are shown for segmented faces of an SSP frame 2466.
Fig. 25 illustrates an exemplary flowchart of a system that applies rotation of the sphere to deform a reference block for Inter prediction according to a method of the present invention. The steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side. The steps shown in the flowchart may also be implemented in hardware, such as one or more electronic devices or processors arranged to perform the steps in the flowchart. According to this method, input data for a current block in a 2D (two-dimensional) frame are received in step 2510, where the 2D frame is projected from a 3D (three-dimensional) sphere. A motion vector associated with a source block in the 2D frame is determined in step 2520, where the motion vector points from a source location in the source block to a destination location in the 2D frame. The source location, the destination location and the source block in the 2D frame are projected onto the 3D sphere according to a target projection in step 2530. The source block in the 3D sphere is rotated along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a deformed reference block in the 3D sphere in step 2540. The deformed reference block in the 3D sphere is mapped back to the 2D frame according to an inverse target projection in step 2550. The current block in the 2D frame is encoded or decoded using the deformed reference block in the 2D frame as an Inter predictor in step 2560.
Fig. 26 illustrates an exemplary flowchart of a system that applies displacement of camera to deform a reference block for Inter prediction according to a method of the present invention. Two 2D (two-dimensional) frames corresponding to two different viewpoints are received in step 2610, where said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere, and wherein a current block, a predicted block for the current block and a neighbouring block are located in said two 2D frames. A forward point of camera is determined based on said two 2D frames in step 2620. Moving flows in said two 2D frames are determined in step 2630. One or more second motion vectors associated with the predicted block are derived either by referring to one or more first motion vectors of the neighbouring block based on the forward point of camera and the moving flows or according to velocity of camera and depth of background in step 2640. An Inter prediction block is derived based on the predicted block and said one or more second motion vectors in step 2650. The current block in the 2D frame is encoded or decoded using the Inter prediction block in step 2660.
The flowcharts shown above are intended to serve as examples illustrating embodiments of the present invention. A person skilled in the art may practice the present invention by modifying individual steps or by splitting or combining steps without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirements. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced without such specific details.
Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more electronic circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Claims (16)

  1. A method of processing 360-degree virtual reality images, the method comprising:
    receiving input data for a current block in a 2D (two-dimensional) frame, wherein the 2D frame is projected from a 3D (three-dimensional) sphere;
    determining a motion vector associated with a source block in the 2D frame, wherein the motion vector points from a source location in the source block to a destination location in the 2D frame;
    projecting the source location, the destination location and the source block in the 2D frame onto the 3D sphere according to a target projection;
    rotating the source block in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a deformed reference block in the 3D sphere;
    mapping the deformed reference block in the 3D sphere back to the 2D frame according to an inverse target projection; and
    encoding or decoding the current block in the 2D frame using the deformed reference block in the 2D frame as an Inter predictor.
  2. The method of Claim 1, wherein the rotation circle corresponds to a largest circle on the surface of the 3D sphere.
  3. The method of Claim 1, wherein the rotation circle is smaller than a largest circle on the surface of the 3D sphere.
  4. The method of Claim 1, wherein the rotation circle on the surface of the 3D sphere around a rotation axis is determined according to the source location and the destination location on the 3D sphere.
  5. The method of Claim 4, wherein a rotation axis $\vec{a}$ and a rotation angle $\theta_a$ associated with the rotation circle are derived according to $\theta_a = \cos^{-1}(\vec{s}_c \cdot \vec{d}_c)$ and $\vec{a} = \vec{s}_c \times \vec{d}_c$, and wherein $\vec{s}_c$ and $\vec{d}_c$ correspond to the source location and the destination location on a surface of the 3D sphere respectively.
  6. The method of Claim 1, wherein a rotation axis and a rotation angle associated with the rotation circle are derived based on motion vectors in a reference frame.
  7. The method of Claim 6, wherein the rotation axis $\vec{a}'$ and the rotation angle $\theta'$ associated with the rotation circle are derived according to:

    $(\vec{a}', \theta') = \arg\min_{\vec{a},\theta} \sum_i \left\| mv(s_i) - mv_{\vec{a},\theta}(s_i) \right\|_F$

    and wherein $s_i$ corresponds to one source block in the reference frame, $mv(s_i)$ corresponds to a motion vector of source block $s_i$, $mv_{\vec{a},\theta}(s_i)$ corresponds to one motion vector caused by rotating one location in source block $s_i$ by the rotation angle $\theta'$ around the rotation axis $\vec{a}'$, and $\|\cdot\|_F$ is the F-norm.
  8. The method of Claim 1, wherein a rotation axis and a rotation angle associated with the rotation circle are derived based on motion vectors in an already coded region in a current frame.
  9. The method of Claim 8, wherein the rotation axis $\vec{a}'$ and the rotation angle $\theta'$ associated with the rotation circle are derived according to:

    $(\vec{a}', \theta') = \arg\min_{\vec{a},\theta} \sum_i \left\| mv(s_i) - mv_{\vec{a},\theta}(s_i) \right\|_F$

    and wherein $s_i$ corresponds to one source block in the already coded region in the current frame, $mv(s_i)$ corresponds to a motion vector of source block $s_i$, and $mv_{\vec{a},\theta}(s_i)$ corresponds to one motion vector caused by rotating one location in source block $s_i$ by the rotation angle $\theta'$ around the rotation axis $\vec{a}'$.
  10. The method of Claim 1, wherein a rotation axis associated with the rotation circle is pre-defined or the rotation axis is indicated in a bitstream for indicating a path of rotation.
  11. The method of Claim 1, wherein the target projection corresponds to Equirectangular Projection (ERP), Cubemap Projection (CMP), Adjusted Cubemap Projection (ACP), Equal-Area Projection (EAP), Octahedron Projection (OHP), Icosahedron Projection (ISP), Segmented Sphere Projection (SSP), Rotated Sphere Projection (RSP), or Cylindrical Projection (CLP).
  12. An apparatus for processing 360-degree virtual reality images, the apparatus comprising one or more electronic devices or processors configured to:
    receive input data for a current block in a 2D (two-dimensional) frame, wherein the 2D frame is projected from a 3D (three-dimensional) sphere;
    determine a motion vector associated with a source block in the 2D frame, wherein the motion vector points from a source location in the source block to a destination location in the 2D frame;
    project the source location, the destination location and the source block in the 2D frame onto the 3D sphere according to a target projection;
    rotate the source block in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a deformed reference block in the 3D sphere;
    map the deformed reference block in the 3D sphere back to the 2D frame according to an inverse target projection; and
    encode or decode the current block in the 2D frame using the deformed reference block in the 2D frame as an Inter predictor.
  13. A method of processing 360-degree virtual reality images, the method comprising:
    receiving two 2D (two-dimensional) frames corresponding to two different viewpoints, wherein said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere, and wherein a current block, a predicted block for the current block and a neighbouring block are located in said two 2D frames;
    determining a forward point of camera based on said two 2D frames;
    determining moving flows in said two 2D frames;
    deriving one or more second motion vectors associated with the predicted block either by referring to one or more first motion vectors of the neighbouring block based on the forward point of camera and the moving flows or according to velocity of camera and depth of background;
    deriving an Inter prediction block based on the predicted block and said one or more second motion vectors; and
    encoding or decoding the current block in the 2D frame using the Inter prediction block.
  14. The method of Claim 13, wherein said deriving one or more second motion vectors associated with the predicted block comprises determining displacement of camera based on said one or more first motion vectors associated with the neighbouring block; and wherein said one or more second motion vectors associated with the predicted block are derived from said one or more first motion vectors based on the displacement of camera and the moving flows.
  15. The method of Claim 13, wherein the target projection corresponds to Equirectangular Projection (ERP), Cubemap Projection (CMP), Adjusted Cubemap Projection (ACP), Equal-Area Projection (EAP), Octahedron Projection (OHP), Icosahedron Projection (ISP), Segmented Sphere Projection (SSP), Rotated Sphere Projection (RSP), or Cylindrical Projection (CLP).
  16. An apparatus for processing 360-degree virtual reality images, the apparatus comprising one or more electronic devices or processors configured to:
    receive two 2D (two-dimensional) frames corresponding to two different viewpoints, wherein said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere, and wherein a current block, a predicted block for the current block and a neighbouring block are located in said two 2D frames;
    determine a forward point of camera based on said two 2D frames;
    determine moving flows in said two 2D frames;
    derive one or more second motion vectors associated with the predicted block either by referring to one or more first motion vectors of the neighbouring block based on the forward point of camera and the moving flows or according to velocity of camera and depth of background;
    derive an Inter prediction block based on the predicted block and said one or more second motion vectors; and
    encode or decode the current block in the 2D frame using the Inter prediction block.