WO2018233662A1 - Method and apparatus of motion vector derivations in immersive video coding - Google Patents

Method and apparatus of motion vector derivations in immersive video coding Download PDF

Info

Publication number
WO2018233662A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
sphere
projection
frame
location
Prior art date
Application number
PCT/CN2018/092143
Other languages
French (fr)
Inventor
Cheng-Hsuan Shih
Jian-Liang Lin
Original Assignee
Mediatek Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mediatek Inc. filed Critical Mediatek Inc.
Priority to CN201880001715.5A priority Critical patent/CN109429561B/en
Priority to TW107121493A priority patent/TWI686079B/en
Publication of WO2018233662A1 publication Critical patent/WO2018233662A1/en

Links

Images

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/55: Motion estimation with spatial constraints, e.g. at image or region borders

Definitions

  • the present invention relates to image/video processing or coding for 360-degree virtual reality (VR) images/sequences.
  • the present invention relates to deriving motion vectors for three-dimensional (3D) contents in various projection formats.
  • the 360-degree video, also known as immersive video, is an emerging technology which can provide the “feeling as sensation of present” .
  • the sense of immersion is achieved by surrounding a user with a wrap-around scene covering a panoramic view, in particular a 360-degree field of view.
  • the “feeling as sensation of present” can be further improved by stereographic rendering. Accordingly, the panoramic video is being widely used in Virtual Reality (VR) applications.
  • Immersive video involves capturing a scene using multiple cameras to cover a panoramic view, such as a 360-degree field of view.
  • the immersive camera usually uses a panoramic camera or a set of cameras arranged to capture a 360-degree field of view. Typically, two or more cameras are used for the immersive camera. All videos must be taken simultaneously and separate fragments (also called separate perspectives) of the scene are recorded. Furthermore, the set of cameras are often arranged to capture views horizontally, while other arrangements of the cameras are possible.
  • the 360-degree virtual reality (VR) images may be captured using a 360-degree spherical panoramic camera or multiple images arranged to cover all fields of view around 360 degrees.
  • the three-dimensional (3D) spherical image is difficult to process or store using the conventional image/video processing devices. Therefore, the 360-degree VR images are often converted to a two-dimensional (2D) format using a 3D-to-2D projection method.
  • equirectangular projection (ERP) and cubemap projection (CMP) have been commonly used projection methods. Accordingly, a 360-degree image can be stored in an equirectangular projected format.
  • the equirectangular projection maps the entire surface of a sphere onto a flat image.
  • Fig. 1 illustrates an example of projecting a sphere 110 into a rectangular image 120 according to equirectangular projection (ERP) , where each longitude line is mapped to a vertical line of the ERP picture.
  • the areas in the north and south poles of the sphere are stretched more severely (i.e., from a single point to a line) than areas near the equator.
  • due to distortions introduced by the stretching, especially near the two poles, predictive coding tools often fail to make good prediction, causing reduction in coding efficiency.
  • FIG. 2 illustrates a cube 210 with six faces, where a 360-degree virtual reality (VR) image can be projected to the six faces on the cube according to cubemap projection (CMP) .
  • the example shown in Fig. 2 divides the six faces into two parts (220a and 220b) , where each part consists of three connected faces.
  • the two parts can be unfolded into two strips (230a and 230b) , where each strip corresponds to a continuous-face picture.
  • the two strips can be combined into a compact rectangular frame according to a selected layout format.
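As a rough illustration of the face-assignment step that underlies CMP, the following Python sketch picks the cube face from the dominant coordinate of a 3D point on the sphere and normalizes the remaining two coordinates to face coordinates in [-1, 1]. The face indexing and axis orientations here are illustrative assumptions, not the exact 360Lib convention.

```python
def cmp_face_and_uv(X, Y, Z):
    """Map a 3D point on the unit sphere to a cube face index and (u, v) in [-1, 1].
    Face indexing and axis orientations are illustrative assumptions only."""
    ax, ay, az = abs(X), abs(Y), abs(Z)
    if ax >= ay and ax >= az:            # +X or -X face dominates
        face = 0 if X > 0 else 1
        u, v = (-Z / ax, -Y / ax) if X > 0 else (Z / ax, -Y / ax)
    elif ay >= ax and ay >= az:          # +Y or -Y face dominates
        face = 2 if Y > 0 else 3
        u, v = (X / ay, Z / ay) if Y > 0 else (X / ay, -Z / ay)
    else:                                # +Z or -Z face dominates
        face = 4 if Z > 0 else 5
        u, v = (X / az, -Y / az) if Z > 0 else (-X / az, -Y / az)
    return face, u, v
```

For example, cmp_face_and_uv(1.0, 0.0, 0.0) returns face 0 with (u, v) = (0, 0), i.e., the centre of the assumed +X face.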
  • Both ERP and CMP formats have been included in the projection format conversion being considered for the next generation video coding as described in JVET-F1003 (Y. Ye, et al., “Algorithm descriptions of projection format conversion and video quality metrics in 360Lib” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 6th Meeting: Hobart, AU, 31 March –7 April 2017, Document: JVET-F1003) .
  • Besides the ERP and CMP formats, there are various other VR projection formats, such as Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) and Rotated Sphere Projection (RSP) , that are widely used in the field.
  • Fig. 3 illustrates an example of octahedron projection (OHP) , where a sphere is projected onto faces of an 8-face octahedron 310.
  • the eight faces 320 lifted from the octahedron 310 can be converted to an intermediate format 330 by cutting open the face edge between faces 1 and 5 and rotating faces 1 and 5 to connect to faces 2 and 6 respectively, and applying a similar process to faces 3 and 7.
  • the intermediate format can be packed into a rectangular picture 340.
  • Fig. 4 illustrates an example of icosahedron projection (ISP) , where a sphere is projected onto faces of a 20-face icosahedron 410.
  • the twenty faces 420 from the icosahedron 410 can be packed into a rectangular picture 430 (referred to as a projection layout) .
  • Segmented sphere projection has been disclosed in JVET-E0025 (Zhang et al., “AHG8: Segmented Sphere Projection for 360-degree video” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 5th Meeting: Geneva, CH, 12–20 January 2017, Document: JVET-E0025) as a method to convert a spherical image into an SSP format.
  • Fig. 5 illustrates an example of segmented sphere projection, where a spherical image 500 is mapped into a North Pole image 510, a South Pole image 520 and an equatorial segment image 530.
  • the boundaries of 3 segments correspond to latitudes 45°N (502) and 45°S (504) , where 0° corresponds to the equator (506) .
  • the North and South Poles are mapped into 2 circular areas (i.e., 510 and 520) , and the projection of the equatorial segment can be the same as ERP or equal-area projection (EAP) .
  • the diameter of the circle is equal to the width of the equatorial segments because both Pole segments and equatorial segment have a 90° latitude span.
  • the North Pole image 510, South Pole image 520 and the equatorial segment image 530 can be packed into a rectangular image.
  • Fig. 6 illustrates an example of rotated sphere projection (RSP) , where the sphere 610 is partitioned into a middle 270°x90° region 620, and a residual part 622. Each part of RSP can be further stretched on the top side and the bottom side to generate a deformed part having an oval shape. The two oval-shaped parts can be fitted into a rectangular frame 630 as shown in Fig. 6.
  • the Adjusted Cubemap Projection format is based on the CMP. If the two-dimensional coordinate (u’, v’) for CMP is determined, the two-dimensional coordinate (u, v) for ACP can be calculated by adjusting (u’, v’) according to a set of adjustment equations.
  • the 3D coordinates (X, Y, Z) can be derived using a table given the position (u, v) and the face index f.
  • the (u’, v’) and face index f can be calculated according to a table for CMP.
  • the 2D coordinates for ACP can be calculated according to a set of equations.
  • Similar to ERP, the EAP also maps a sphere surface to one face. In the (u, v) plane, u and v are in the range [0, 1]. For 2D-to-3D coordinate conversion, given the sampling position (m, n) , 2D coordinates (u, v) are first calculated in the same way as ERP. Then, the longitude and latitude (φ, θ) on the sphere can be calculated from (u, v) .
  • Inversely, the longitude and latitude (φ, θ) can be evaluated from the (X, Y, Z) coordinates.
  • Inter prediction has been a powerful coding tool to explore the inter-frame redundancy using motion estimation/compensation. If conventional Inter prediction is applied to the 2D frames converted from a 3D space, the motion estimation/compensation techniques may not work properly since an object in the 3D space may become distorted or deformed in the 2D frames due to object movement or relative motion between an object and a camera. In order to improve Inter prediction for 2D frames converted from a 3D space, various Inter prediction techniques are developed to improve the accuracy of Inter prediction for 2D frames converted from a 3D space.
  • Methods and apparatus of processing 360-degree virtual reality images are disclosed.
  • input data for a current block in a 2D (two-dimensional) frame are received, where the 2D frame is projected from a 3D (three-dimensional) sphere.
  • a first motion vector associated with a neighbouring block in the 2D frame is determined, where the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame.
  • the first motion vector is projected onto the 3D sphere according to a target projection.
  • the first motion vector is rotated in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a second motion vector in the 3D sphere.
  • the second motion vector in the 3D sphere is mapped back to the 2D frame according to an inverse target projection.
  • the current block in the 2D frame is encoded or decoded using the second motion vector.
  • the second motion vector may be included as a candidate in a Merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list for encoding or decoding of the current block.
  • the rotation circle corresponds to a largest circle on the surface of the 3D sphere. In another embodiment, the rotation circle is smaller than a largest circle on the surface of the 3D sphere.
  • the target projection may correspond to Equirectangular Projection (ERP) and Cubemap Projection (CMP) , Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) , Rotated Sphere Projection (RSP) , or Cylindrical Projection (CLP) .
  • said projecting the first motion vector onto the 3D sphere comprises projecting the first start location, the first end location and a second start location in the 2D frame onto the 3D sphere according to the target projection, where the second start location is at a corresponding location in the current block corresponding to the first start location in the neighbouring block.
  • Said rotating the first motion vector in the 3D sphere along the rotation circle comprises determining a target rotation for rotating from the first start location to the second start location in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis and rotating the first end location to a second end location on the 3D sphere using the target rotation.
  • Said mapping the second motion vector in the 3D sphere back to the 2D frame comprises mapping the second end location on the 3D sphere back to the 2D frame according to the inverse target projection and determining the second motion vector in the 2D frame based on the second start location and the second end location in the 2D frame.
  • two 2D frames are received, where said two 2D frames are projected, using a target projection, from a 3D sphere corresponding to two different viewpoints, and a current block and a neighbouring block are located in said two 2D frames.
  • a forward point of camera is determined based on said two 2D frames. Moving flows in said two 2D frames are determined. Displacement of camera based on a first motion vector associated with the neighbouring block is determined. A second motion vector associated with the current block is determined based on the displacement of camera. The current block in the 2D frame is then encoded or decoded using the second motion vector.
  • the moving flows in said two 2D frames can be calculated from a tangent direction at each pixel in said two 2D frames.
  • the second motion vector can be included as a candidate in a Merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list for encoding or decoding of the current block.
  • the target projection may correspond to Equirectangular Projection (ERP) and Cubemap Projection (CMP) , Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) , Rotated Sphere Projection (RSP) , or Cylindrical Projection (CLP) .
  • input data for a current block in a 2D frame are received, wherein the 2D frame is projected from a 3D sphere according to a target projection.
  • a first motion vector associated with a neighbouring block in the 2D frame is determined, where the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame.
  • the first motion vector is scaled to generate a second motion vector.
  • the current block in the 2D frame is then encoded or decoded using the second motion vector.
  • said scaling the first motion vector to generate the second motion vector comprises projecting the first start location, the first end location and a second start location in the 2D frame onto the 3D sphere according to the target projection, where the second start location is at a corresponding location in the current block corresponding to the first start location in the neighbouring block.
  • Said scaling the first motion vector to generate the second motion vector further comprises scaling a longitude component of the first motion vector to generate a scaled longitude component of the first motion vector; scaling a latitude component of the first motion vector to generate a scaled latitude component of the first motion vector; and determining a second end location corresponding to the second start location based on the scaled longitude component of the first motion vector and the scaled latitude component of the first motion vector.
  • Said scaling the first motion vector to generate the second motion vector further comprises mapping the second end location on the 3D sphere back to the 2D frame according to an inverse target projection; and determining the second motion vector in the 2D frame based on the second start location and the second end location in the 2D frame.
  • said scaling the first motion vector to generate the second motion vector comprises applying a first combined function to generate an x-component of the second motion vector and applying a second combined function to generate a y-component of the second motion vector; where the first combined function and the second combined function are dependent on the first start location, a second start location in the current block associated with corresponding first start location, the first motion vector and a target projection; and wherein the first combined function and the second combined function combine the target projection for projecting first data in the 2D frame into second data in the 3D sphere, scaling a selected motion vector in the 3D sphere into a scaled motion vector in the 3D sphere, and inverse target projection for projecting the scaled motion vector into the 2D frame.
  • the first combined function corresponds to an expression given as an equation image in the original (a scaling by the ratio of the cosines of the two latitudes) and the second combined function corresponds to an identity function, wherein θ1 corresponds to a first latitude associated with the first starting location and θ2 corresponds to a second latitude associated with the second starting location.
  • Fig. 1 illustrates an example of projecting a sphere into a rectangular image according to equirectangular projection, where each longitude line is mapped to a vertical line of the ERP picture.
  • Fig. 2 illustrates a cube with six faces, where a 360-degree virtual reality (VR) image can be projected to the six faces on the cube according to cubemap projection (CMP) .
  • Fig. 3 illustrates an example of octahedron projection (OHP) , where a sphere is projected onto faces of an 8-face octahedron.
  • Fig. 4 illustrates an example of icosahedron projection (ISP) , where a sphere is projected onto faces of a 20-face icosahedron.
  • Fig. 5 illustrates an example of segmented sphere projection (SSP) , where a spherical image is mapped into a North Pole image, a South Pole image and an equatorial segment image.
  • Fig. 6 illustrates an example of rotated sphere projection (RSP) , where the sphere is partitioned into a middle 270°x90° region and a residual part. These two parts of RSP can be further stretched on the top side and the bottom side to generate deformed parts having oval-shaped boundary on the top part and bottom part.
  • Fig. 7A and Fig. 7B illustrate examples of deformation of 2D image due to the rotation of sphere.
  • Fig. 8 illustrates an example of rotation of the sphere from point a to point b on a big circle on the sphere, where the big circle corresponds to the largest circle on the surface of the sphere.
  • Fig. 9 illustrates an example of rotation of the sphere from point a to point b on a small circle on the sphere, where the small circle corresponds to a circle smaller than the largest circle on the surface of the sphere.
  • Fig. 10 illustrates an example of deriving the motion vector for 2D projected pictures using the rotation of the sphere model.
  • Fig. 11 illustrates an example of using the MV derived based on rotation of sphere as a Merge or AMVP candidate.
  • Fig. 12 illustrates an example of an object (i.e., a tree) projected onto the surface of a sphere at different camera locations.
  • Fig. 13 illustrates an example of an ERP frame overlaid with the pattern of moving flows, where the flow of background (i.e., static object) can be determined if the camera forward point is known.
  • Fig. 14 illustrates an exemplary procedure of MV derivation based on displacement of viewpoint.
  • Fig. 15 illustrates exemplary MV derivation based on displacement of viewpoint for various projection methods.
  • Fig. 16 illustrates an example of deformation associated with motion in the ERP (Equirectangular Projection) frame.
  • Fig. 17 illustrates an exemplary procedure for the MV scaling technique in 3D sphere.
  • Fig. 18 illustrates an exemplary procedure of MV scaling in a 2D frame.
  • Fig. 19 illustrates an exemplary procedure of MV scaling in the ERP (Equirectangular Projection) frame.
  • Fig. 20 illustrates an exemplary procedure of MV scaling in an Equirectangular Projection (ERP) frame, where a current block and a neighbouring block in an ERP picture are shown.
  • Fig. 21 illustrates an exemplary procedure of MV scaling in a Cubemap Projection (CMP) frame, where a current block and a neighbouring block in a CMP picture are shown.
  • Fig. 22 illustrates an exemplary procedure of MV scaling in a Segmented Sphere Projection (SSP) frame, where a current block and a neighbouring block in an SSP picture are shown.
  • Fig. 23 illustrates an exemplary procedure of MV scaling in an Octahedron Projection (OHP) frame, where a current block and a neighbouring block in an OHP picture are shown.
  • Fig. 24 illustrates an exemplary procedure of MV scaling in an Icosahedron Projection (ISP) frame, where a current block and a neighbouring block in an ISP picture are shown.
  • Fig. 25 illustrates an exemplary procedure of MV scaling in an Equal-Area Projection (EAP) frame, where a current block and a neighbouring block in an EAP picture are shown.
  • Fig. 26 illustrates an exemplary procedure of MV scaling in an Adjusted Cubemap Projection (ACP) frame, where a current block and a neighbouring block in an ACP picture are shown.
  • Fig. 27 illustrates an exemplary procedure of MV scaling in a Rotated Sphere Projection (RSP) frame, where a current block and a neighbouring block in a RSP picture are shown.
  • Fig. 28 illustrates an exemplary procedure of MV scaling in a Cylindrical Projection (CLP) frame, where a current block and a neighbouring block in a CLP picture are shown.
  • Fig. 29 illustrates an exemplary flowchart of a system that applies rotation of sphere to adjust motion vector for processing 360-degree virtual reality images according to an embodiment of the present invention.
  • Fig. 30 illustrates an exemplary flowchart of a system that derives motion vector from displacement of viewpoint for processing 360-degree virtual reality images according to an embodiment of the present invention.
  • Fig. 31 illustrates an exemplary flowchart of a system that applies scaling to adjust motion vector for processing 360-degree virtual reality images according to an embodiment of the present invention.
  • motion estimation/compensation is widely used to explore correlation in video data in order to reduce transmitted information.
  • the conventional video contents correspond to 2D video data and the motion estimation and compensation techniques often assume a translational motion.
  • more advanced motion models such as affine model, are considered. Nevertheless, these techniques derive a 3D motion model based on the 2D images.
  • the motion information is derived in the 3D domain so that more accurate motion information may be derived.
  • rotation of the sphere is assumed to be the cause of block deformation in the 2D video contents.
  • Fig. 7A and Fig. 7B illustrate examples of deformation of 2D image due to the rotation of sphere.
  • block 701 on a 3D sphere 700 is moved to become block 702.
  • the two corresponding blocks become blocks 704 and 705 in the ERP frame.
  • although the blocks (i.e., 701 and 702) on the 3D sphere correspond to a same block, the two blocks (i.e., 704 and 705) have different shapes in the ERP frame.
  • object movement or rotation in the 3D space may cause object distortion in the 2D frame (i.e., the ERP frame in this example) .
  • an area 711 is on the surface of a sphere 710.
  • the area 711 is moved to area 714 due to the rotation of the sphere along path 713.
  • a motion vector 712 associated with area 711 is mapped to motion vector 715 due to the rotation of sphere.
  • the correspondences of parts 711-715 in the 2D domain 720 are shown as parts 721-725.
  • the parts corresponding to 713-715 for trajectory 733 on the surface of a sphere 730 are shown as parts 733-735 respectively, and the correspondences of parts 733-735 in the 2D domain 740 are shown as parts 743-745.
  • the parts corresponding to 713-715 for trajectory 753 on the surface of a sphere 750 are shown as parts 753-755 respectively, and the correspondences of parts 753-755 in the 2D domain 760 are shown as parts 763-765.
  • Fig. 8 illustrates an example of rotation of the sphere 800 from point a 830 to point b 840 on a big circle 820 on the sphere 810, where the big circle 820 corresponds to the largest circle on the surface of the sphere 810.
  • the axis of rotation 850 is shown as an arrow in Fig. 8.
  • the angle of rotation is θa .
  • Fig. 9 illustrates an example of rotation of the sphere 900 from point a to point b on a small circle 910 on the sphere 920, where the small circle 910 corresponds to a circle smaller than the largest circle (e.g. circle 930) on the surface of the sphere 920.
  • the centre point of rotation is shown as a dot 912 in Fig. 9.
  • a big circle 980 (i.e., the largest circle) is also shown on the sphere in Fig. 9.
  • the axis of rotation 990 is shown as an arrow in Fig. 9.
  • Fig. 10 illustrates an example of deriving the motion vector for 2D projected pictures using the rotation of the sphere model.
  • illustration 1010 depicts an example of deriving the motion vector for 2D projected pictures using the rotation of the sphere model according to an embodiment of the present invention.
  • Locations a and b are two locations in a 2D projected picture.
  • Motion vector mv a points from a to a′. The goal is to find the motion vector mv b for location b.
  • the present invention projects the 2D projected picture onto a 3D sphere 1020 using 2D-to-3D projection. Rotation around a small circle or big circle is applied to rotate location a to location b.
  • Location a’ is rotated to location b’ according to the same selected rotation of the sphere.
  • An inverse projection is then applied to the 3D sphere to convert the location b’ to the 2D frame 1030.
  • Motion vectors mv a and mv b are two-dimensional vectors in the (x, y) domain.
  • Locations a, a’, b and b’ are two-dimensional coordinates in the (x, y) domain.
  • the corresponding notations on the 3D sphere (the projections of a, a’, b and b’) are three-dimensional coordinates in the (φ, θ) domain.
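A minimal Python sketch of the Fig. 10 procedure, assuming ERP as the target projection and a rotation along a big (great) circle; the helper names, the ERP pixel conventions and the use of Rodrigues' formula are illustrative assumptions rather than the patent's exact formulation.

```python
import numpy as np

def erp_to_unit_vec(x, y, W, H):
    """Assumed ERP convention: x in [0, W) maps to longitude [-pi, pi),
    y in [0, H) maps to latitude [pi/2, -pi/2]."""
    lon = (x / W - 0.5) * 2.0 * np.pi
    lat = (0.5 - y / H) * np.pi
    return np.array([np.cos(lat) * np.cos(lon),
                     np.sin(lat),
                     -np.cos(lat) * np.sin(lon)])

def unit_vec_to_erp(p, W, H):
    lat = np.arcsin(np.clip(p[1], -1.0, 1.0))
    lon = np.arctan2(-p[2], p[0])
    return ((lon / (2.0 * np.pi) + 0.5) * W, (0.5 - lat / np.pi) * H)

def rotate_about_axis(v, k, angle):
    """Rodrigues' rotation of vector v about unit axis k by 'angle' radians."""
    return (v * np.cos(angle)
            + np.cross(k, v) * np.sin(angle)
            + k * np.dot(k, v) * (1.0 - np.cos(angle)))

def derive_mv_by_sphere_rotation(a, mv_a, b, W, H):
    """Rotate the MV at 2D location a (with vector mv_a) to location b along the
    big circle through a and b, and return the derived mv_b in the 2D ERP frame."""
    A  = erp_to_unit_vec(a[0], a[1], W, H)
    A2 = erp_to_unit_vec(a[0] + mv_a[0], a[1] + mv_a[1], W, H)   # end point a'
    B  = erp_to_unit_vec(b[0], b[1], W, H)
    axis = np.cross(A, B)
    n = np.linalg.norm(axis)
    if n < 1e-12:
        return mv_a                       # a and b coincide (or are antipodal)
    k = axis / n
    angle = np.arccos(np.clip(np.dot(A, B), -1.0, 1.0))
    B2 = rotate_about_axis(A2, k, angle)  # rotate a' by the same sphere rotation
    bx, by = unit_vec_to_erp(B2, W, H)
    return (bx - b[0], by - b[1])
```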
  • the derived motion vector can be used as a candidate for video coding using Merge or AMVP (advanced motion vector prediction) mode, where the Merge mode and the AMVP modes are techniques to code a block or motion information of the block predictively as disclosed in HEVC (High Efficiency Video Coding) .
  • the current block uses the motion information of a block indicated by a Merge index pointing to a selected candidate in a Merge candidate list.
  • FIG. 11 illustrates an example of using the MV derived based on rotation of sphere as a Merge or AMVP candidate.
  • layout 1110 of neighbouring blocks that are used to derive Merge candidates is shown for block 1112, where the neighbouring blocks include spatial neighbouring blocks A 0 , A 1 , B 0 , B 1 and B 2 and temporal neighbouring blocks Col 0 or Col 1 .
  • the motion vectors from block A 0 and B 0 can be used to derive motion candidate for the current block.
  • the motion vector mv a of neighbouring block A 0 can be used to derive a corresponding motion vector mv a’ for the current block according to the rotation of sphere.
  • motion vector mv b of neighbouring block B 0 can be used to derive a corresponding motion vector mv b’ at the current block according to the rotation of sphere.
  • Motion vectors mv a’ and mv b’ can be included in the Merge or AMVP candidate list.
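A simplified sketch of how such derived motion vectors could be gathered into a candidate list; real HEVC Merge/AMVP list construction also involves temporal candidates, pruning rules and zero-MV padding, so this is only an outline under assumed data representations.

```python
def build_merge_candidates(cur_pos, neighbours, derive_adjusted_mv, max_cands=5):
    """neighbours: list of (start_pos, mv) for spatial blocks such as A0, A1, B0, B1, B2.
    derive_adjusted_mv(start_pos, mv, cur_pos) is any of the derivation functions
    sketched in this document (sphere rotation, viewpoint displacement, MV scaling)."""
    candidates = []
    for start_pos, mv in neighbours:
        if mv is None:                         # neighbour not inter-coded
            continue
        adj = derive_adjusted_mv(start_pos, mv, cur_pos)
        if adj not in candidates:              # simple duplicate pruning
            candidates.append(adj)
        if len(candidates) == max_cands:
            break
    return candidates
```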
  • FIG. 12 illustrates an example of an object (i.e., a tree) projected onto the surface of a sphere at different camera locations.
  • the tree is projected onto sphere 1210 to form an image 1240 of the tree.
  • the tree is projected onto sphere 1220 to form an image 1250 of the tree.
  • the image 1241 of the tree corresponding to camera position A is also shown on sphere 1220 for comparison.
  • the tree is projected onto sphere 1230 to form an image 1260 of the tree.
  • Image 1242 of the tree corresponding to camera position A and image 1251 of the tree corresponding to camera position B are also shown on sphere 1230 for comparison.
  • the direction of movement of the camera in 3D space can be represented by latitude and longitude coordinates, which correspond to the intersection of the camera motion vector with the 3D sphere.
  • this intersection point is projected onto the 2D target projection plane, and the projected point is referred to as the forward point.
  • Fig. 13 illustrates an example of an ERP frame overlaid with the pattern of moving flows, where the flow of background (i.e., static object) can be determined if the camera forward point is known.
  • the flows are indicated by arrows.
  • the camera forward point 1310 and camera backward point 1320 are shown.
  • Moving flows correspond to the moving direction of the video content induced by the camera moving in a given direction.
  • the movement of the camera causes the relative movement of the static background objects, and the moving direction of a background object in the 2D frame captured by the camera can be represented as moving flows.
  • Fig. 14 illustrates an exemplary procedure of MV derivation based on displacement of viewpoint.
  • In Fig. 14, an object 1410 is projected to the sphere 1420 to form an image 1422 on the surface of the sphere corresponding to camera location 1424.
  • the object 1410 is projected to the sphere 1430 to form an image 1432 on the surface of the sphere.
  • the corresponding location of the object is also shown on the surface of the sphere 1430 at location 1423.
  • Calculate moving flows 1450 in the 2D frame, i.e., the tangent direction at each pixel (a sketch of this step follows this list).
  • the MV of neighbouring block can be used to determine the displacement of camera, as indicated by arrow 1460;
  • the displacement of camera can be used to determine the MV of current block, as indicated by arrow 1470.
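A minimal sketch of the moving-flow step referenced in the list above, assuming the camera forward point and the pixel are both expressed as unit vectors on the 3D sphere; for a camera moving toward the forward point, the static background appears to flow along the great circle away from it. The function name and sign convention are assumptions for illustration.

```python
import numpy as np

def background_flow_direction(p, f):
    """p: unit vector of the sphere point for the current pixel.
    f: unit vector of the camera forward point.
    Returns the unit tangent at p along which static background appears to move
    (away from the forward point, toward the backward point -f)."""
    t = np.dot(f, p) * p - f          # tangential component of -f at point p
    n = np.linalg.norm(t)
    return t / n if n > 1e-12 else np.zeros(3)
```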
  • the MV derivation based on displacement of viewpoint can be applied to various projection methods.
  • the moving flow in the 2D frame can be mapped to 3D sphere.
  • the moving flow on the 3D sphere 1510 is shown in Fig. 15, where the forward point and two different lines of moving flow (1512 and 1514) are shown.
  • the moving flow on 3D sphere associated with ERP 1520 is shown in Fig. 15, where the moving flows are shown for an ERP frame 1526.
  • the moving flow on 3D sphere associated with CMP 1530 is shown in Fig. 15, where the moving flows are shown for six faces of a CMP frame 1536 in a 2x3 layout format.
  • the moving flow on 3D sphere associated with OHP 1540 is shown in Fig. 15, where the moving flows are shown for eight faces of an OHP frame 1546.
  • the moving flow on 3D sphere associated with ISP 1550 is shown in Fig. 15, where the moving flows are shown for twenty faces of an ISP frame 1556.
  • the moving flow on 3D sphere associated with SSP 1560 is shown in Fig. 15, where the moving flows are shown for segmented faces of an SSP frame 1566.
  • The same derivation can be applied to mappings such as ERP, CMP, SSP, OHP and ISP.
  • An MV scaling technique on 3D sphere is disclosed to perform deformation by scaling MV on a 3D sphere to minimize effects of projection types on deformation.
  • An MV scaling technique on 2D frame is also disclosed, which can be directly applied to 2D frames by combining projection, scaling MV and inverse projection into a single function.
  • Fig. 16 illustrates an example of deformation associated with motion in the ERP frame.
  • three motion vectors (1612, 1614 and 1616) along three latitude lines are shown on the surface of 3D sphere 1610, where the three motion vectors have about the same length. If the surface of the 3D sphere is unfolded into a 2D plane 1620, the three motion vectors (1622, 1624 and 1626) maintain their equal-length property. For the ERP frame, the unfolded image needs to be stretched, where more stretching is needed at higher latitudes. Therefore, the three motion vectors (1632, 1634 and 1636) have different lengths as shown in Fig. 16.
  • motion vector 1634 has to be properly scaled before being used for coding, e.g., as a Merge or AMVP candidate.
  • Map a, a’, and b from 2D frame 1710 to 3D sphere 1720 as shown in Fig. 17.
  • the mapping function Pprojection type (x, y) maps the pixel at (x, y) in a 2D frame to a corresponding location on the 3D sphere.
  • the scaling functions for the longitude component φ and the latitude component θ are scaleφ and scaleθ, respectively.
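A sketch of this three-step pipeline (project, scale on the sphere, inverse project), assuming ERP so that the forward and inverse mappings have simple closed forms; the scaling functions are passed in as parameters since their exact form is projection dependent and is given only as equation images in the original.

```python
import math

def erp_to_sphere(x, y, W, H):
    """Assumed ERP convention: returns (longitude phi, latitude theta) in radians."""
    return ((x / W - 0.5) * 2.0 * math.pi, (0.5 - y / H) * math.pi)

def sphere_to_erp(phi, theta, W, H):
    return ((phi / (2.0 * math.pi) + 0.5) * W, (0.5 - theta / math.pi) * H)

def scale_mv_on_sphere(a, mv_a, b, W, H, scale_phi, scale_theta):
    """Project a, a' = a + mv_a and b onto the sphere, scale the (phi, theta)
    displacement, and map the new end point b' back to the 2D frame."""
    phi_a,  theta_a  = erp_to_sphere(a[0], a[1], W, H)
    phi_a2, theta_a2 = erp_to_sphere(a[0] + mv_a[0], a[1] + mv_a[1], W, H)
    phi_b,  theta_b  = erp_to_sphere(b[0], b[1], W, H)
    d_phi   = scale_phi(phi_a2 - phi_a, theta_a, theta_b)       # scaled longitude step
    d_theta = scale_theta(theta_a2 - theta_a, theta_a, theta_b)  # scaled latitude step
    bx, by = sphere_to_erp(phi_b + d_phi, theta_b + d_theta, W, H)
    return (bx - b[0], by - b[1])
```

For ERP, a distance-preserving choice would be scale_phi = lambda d, ta, tb: d * math.cos(ta) / math.cos(tb) together with an identity scale_theta, matching the behaviour described for Fig. 19 and Fig. 20 below.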
  • the MV scaling in the 3D domain can also be performed in a 2D domain by combining projection function, scaling function, and inverse projection function into a single function f (a, b, mv a , projection type) . Therefore, the mapping between 3D and 2D can be skipped.
  • fx (a, b, mva, projection type) is the single function to generate the x-component of mvb
  • fy (a, b, mva, projection type) is the single function to generate the y-component of mvb
  • Fig. 18 illustrates an exemplary procedure of MV scaling in a 2D frame. Two blocks (A and B) in a 2D frame 1810 are shown. The motion vector for location a in block A is marked. The end position for location a associated with mv a is denoted as a’ in Fig. 18. Location b in block B is marked. The end position for location b associated with mv b is denoted as b’ in Fig. 18.
  • the steps of projection function (i.e., forward projection 1830) , scaling function in the 3D sphere 1820, and inverse projection function (i.e., inverse projection 1840) can be combined into a single function according to the above equation.
  • An example of MV scaling in an ERP frame is shown in Fig. 19.
  • the surface of 3D sphere 1910 is unfolded into a 2D plane 1920.
  • the unfolded image needs to be stretched so that all latitude lines have the same length.
  • the length of horizontal line is enlarged more when it is closer to North or South Pole.
  • a neighbouring block 1932 has a motion vector mv 1 .
  • a motion vector mv 2 needs to be derived based on mv 1 for the current block 1934.
  • the derived motion vector can be used for coding the current block.
  • the derived motion vector can be used as a Merge or AMVP candidate.
  • the motion vector associated with the neighbouring block needs to be scaled before it is used for coding the current block.
  • the motion vector of the neighbouring block is stretched more in the x-direction than a motion vector for the current block since the neighbouring block is in higher latitude. Therefore, the x-component of mv 1 needs to be scaled down before it is used for the current block.
  • the MV scaling process 1940 is shown in Fig. 19, where the motion vector mv 1 at location a of a neighbouring block is scaled to mv 2 and used for coding the current block.
  • a scaling function to preserve the motion distance is disclosed, where mv 2 is a function of mv 1 , θ1 and θ2 (the latitudes associated with the neighbouring and current block locations).
  • the y-component of the derived motion vector is the same as the y-component of the mv 1 .
  • the scaling function for the y-component is an identity function.
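For ERP, the projection, scaling and inverse projection collapse into a simple closed form. The sketch below scales the x-component by the ratio of the latitude cosines so that the motion distance on the sphere is approximately preserved, and leaves the y-component unchanged (the identity function mentioned above); since the exact expression in the patent is given as an equation image, this cosine-ratio form is an assumption consistent with the described behaviour.

```python
import math

def scale_mv_erp(a, mv_a, b, H):
    """a, b: (x, y) positions of the neighbouring and current block in an ERP frame
    of height H; mv_a: MV of the neighbouring block. Returns the scaled MV for b.
    Assumes the current location is not exactly at a pole (cos(theta_b) != 0)."""
    theta_a = (0.5 - a[1] / H) * math.pi      # latitude of the neighbouring location
    theta_b = (0.5 - b[1] / H) * math.pi      # latitude of the current location
    scale = math.cos(theta_a) / math.cos(theta_b)
    return (mv_a[0] * scale, mv_a[1])         # x scaled, y kept (identity)
```

With a neighbour at a higher latitude than the current block, cos(theta_a) is smaller than cos(theta_b), so the x-component is scaled down, as described for Fig. 19.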
  • MV scaling for other projections are also disclosed.
  • In Fig. 20, the MV scaling for ERP is disclosed, where a current block 2012 and a neighbouring block 2014 in an ERP picture 2010 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • the MV scaling for CMP is disclosed, where a current block 2112 and a neighbouring block 2114 in a CMP picture 2110 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • the MV scaling for SSP is disclosed, where a current block 2212 and a neighbouring block 2214 in an SSP picture 2210 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • the MV scaling for OHP is disclosed, where a current block 2312 and a neighbouring block 2314 in an OHP picture 2310 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • the MV scaling for ISP is disclosed, where a current block 2412 and a neighbouring block 2414 in an ISP picture 2410 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • MV scaling for EAP is disclosed, where a current block 2512 and a neighbouring block 2514 in an EAP picture 2510 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • MV scaling for ACP is disclosed, where a current block 2612 and a neighbouring block 2614 in an ACP picture 2610 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • MV scaling for RSP is disclosed, where a current block 2712 and a neighbouring block 2714 in an RSP picture 2710 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • cylindrical projection has also been used to project a 3D sphere into a 2D frame.
  • cylindrical projections are created by wrapping a cylinder 2820 around a globe 2830 and projecting light through the globe onto the cylinder as shown in Fig. 28.
  • Cylindrical projections represent meridians as straight, evenly-spaced, vertical lines and parallels as straight horizontal lines. Meridians and parallels intersect at right angles, as they do on the globe.
  • In this way, various CLPs are generated.
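As one concrete member of this family, the central cylindrical projection (light source at the sphere centre, as described above for Fig. 28) can be sketched as follows; the scaling and orientation conventions are assumptions for illustration.

```python
import math

def central_cylindrical(lon, lat, R=1.0):
    """Central cylindrical projection: meridians map to vertical lines and
    parallels to horizontal lines. Latitudes near +/-90 degrees diverge,
    so callers should clamp lat to a usable range."""
    x = R * lon                 # longitude -> horizontal position
    y = R * math.tan(lat)       # latitude  -> vertical position
    return x, y
```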
  • the MV scaling for Cylindrical Projection is disclosed, where a current block 2812 and a neighbouring block 2814 in a CLP picture 2810 are shown.
  • the neighbouring block has a motion vector mv 1 .
  • the motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block.
  • Fig. 29 illustrates an exemplary flowchart of a system that applies rotation of sphere to adjust motion vector for processing 360-degree virtual reality images according to an embodiment of the present invention.
  • the steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side.
  • the steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart.
  • input data for a current block in a 2D (two-dimensional) frame are received in step 2910, where the 2D frame is projected from a 3D (three-dimensional) sphere.
  • the input data may correspond to pixel data of a 2D frame to be encoded.
  • a first motion vector associated with a neighbouring block in the 2D frame is determined in step 2920, where the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame.
  • the first motion vector is projected onto the 3D sphere according to a target projection in step 2930.
  • the first motion vector in the 3D sphere is rotated along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a second motion vector in the 3D sphere in step 2940.
  • the second motion vector in the 3D sphere is mapped back to the 2D frame according to an inverse target projection in step 2950.
  • the current block in the 2D frame is encoded or decoded using the second motion vector in step 2960.
  • Fig. 30 illustrates an exemplary flowchart of a system that derives motion vector from displacement of viewpoint for processing 360-degree virtual reality images according to an embodiment of the present invention.
  • two 2D (two-dimensional) frames are received in step 3010, where said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere corresponding to two different viewpoints, and where a current block and a neighbouring block are located in said two 2D frames.
  • a forward point of camera is determined based on said two 2D frames in step 3020.
  • Moving flows in said two 2D frames are determined in step 3030.
  • Displacement of camera is determined based on a first motion vector associated with the neighbouring block in step 3040.
  • a second motion vector associated with the current block is derived based on the displacement of camera in step 3050.
  • the current block in the 2D frame is encoded or decoded using the second motion vector in step 3060.
  • Fig. 31 illustrates an exemplary flowchart of a system that applies scaling to adjust motion vector for processing 360-degree virtual reality images according to an embodiment of the present invention.
  • input data for a current block in a 2D (two-dimensional) frame are received in step 3110, where the 2D frame is projected from a 3D (three-dimensional) sphere according to a target projection.
  • a first motion vector associated with a neighbouring block in the 2D frame is determined in step 3120, where the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame.
  • the first motion vector is scaled to generate a second motion vector in step 3130.
  • the current block in the 2D frame is encoded or decoded using the second motion vector in step 3140.
  • Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both.
  • an embodiment of the present invention can be one or more electronic circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein.
  • An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein.
  • the invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) .
  • These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
  • the software code or firmware code may be developed in different programming languages and different formats or styles.
  • the software code may also be compiled for different target platforms.
  • different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.

Abstract

Methods and apparatus of processing 360-degree virtual reality images are disclosed. According to one method, a system maps a motion vector in a 2D frame into a 3D space and then applies rotation of the sphere to adjust the motion vector in the 3D space. The rotated motion vector is then mapped back to the 2D frame for processing 360-degree virtual reality images. According to another method, a system that derives the motion vector from displacement of the viewpoint is disclosed. According to yet another method, a system that applies scaling to adjust the motion vector is disclosed. The motion vector scaling can be performed in the 3D space by projecting a motion vector from a 2D frame to a 3D sphere. After motion vector scaling, the scaled motion vector is projected back to the 2D frame. Alternatively, the forward projection, motion vector scaling and inverse projection can be combined into a single function.

Description

METHOD AND APPARATUS OF MOTION VECTOR DERIVATIONS IN IMMERSIVE VIDEO CODING
CROSS REFERENCE TO RELATED PATENT APPLICATION (S)
The present invention claims priority to U.S. Provisional Patent Application, Serial No. 62/523,883, filed on June 23, 2017 and U.S. Provisional Patent Application, Serial No. 62/523,885, filed on June 23, 2017. The U.S. Provisional Patent Applications are hereby incorporated by reference in their entireties.
TECHNICAL FIELD
The present invention relates to image/video processing or coding for 360-degree virtual reality (VR) images/sequences. In particular, the present invention relates to deriving motion vectors for three-dimensional (3D) contents in various projection formats.
BACKGROUND
The 360-degree video, also known as immersive video, is an emerging technology which can provide the “feeling as sensation of present” . The sense of immersion is achieved by surrounding a user with a wrap-around scene covering a panoramic view, in particular a 360-degree field of view. The “feeling as sensation of present” can be further improved by stereographic rendering. Accordingly, the panoramic video is being widely used in Virtual Reality (VR) applications.
Immersive video involves capturing a scene using multiple cameras to cover a panoramic view, such as a 360-degree field of view. The immersive camera usually uses a panoramic camera or a set of cameras arranged to capture a 360-degree field of view. Typically, two or more cameras are used for the immersive camera. All videos must be taken simultaneously and separate fragments (also called separate perspectives) of the scene are recorded. Furthermore, the set of cameras are often arranged to capture views horizontally, while other arrangements of the cameras are possible.
The 360-degree virtual reality (VR) images may be captured using a 360-degree spherical panoramic camera or multiple images arranged to cover all fields of view around 360 degrees. The three-dimensional (3D) spherical image is difficult to process or store using the conventional image/video processing devices. Therefore, the 360-degree VR images are often converted to a two-dimensional (2D) format using a 3D-to-2D projection method. For example, equirectangular projection (ERP) and cubemap projection (CMP) have been commonly used projection methods. Accordingly, a 360-degree image can be stored in an equirectangular projected format. The equirectangular projection maps the entire surface of a sphere onto a flat image. The vertical axis is latitude and the horizontal axis is longitude. Fig. 1 illustrates an example of projecting a sphere 110 into a rectangular image 120 according to equirectangular projection (ERP) , where each longitude line is mapped to a vertical line of the ERP picture. For the ERP projection, the areas in the north and south poles of the sphere are stretched more severely (i.e., from a single point to a line) than areas near the equator. Furthermore, due to distortions introduced by the stretching, especially near the two poles, predictive coding tools often fail to make good prediction, causing reduction in coding efficiency. Fig. 2 illustrates a cube 210 with six faces, where a 360-degree virtual reality (VR) image can be projected to the six faces on the cube according to cubemap projection (CMP) . There are various ways to lift the six faces off the cube and repack them into a rectangular picture. The example shown in Fig. 2 divides the six faces into two parts (220a and 220b) , where each part consists of three connected faces. The two parts can be unfolded into two strips (230a and 230b) , where each strip corresponds to a continuous-face picture. The two strips can be combined into a compact rectangular frame according to a selected layout format.
Both ERP and CMP formats have been included in the projection format conversion being considered for the next generation video coding as described in JVET-F1003 (Y. Ye, et al., “Algorithm descriptions of projection format conversion and video quality metrics in 360Lib” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 6th Meeting: Hobart, AU, 31 March –7 April 2017, Document: JVET-F1003) . Besides the ERP and CMP formats, there are various other VR projection formats, such as Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) and Rotated Sphere Projection (RSP) that are widely used in the field.
Fig. 3 illustrates an example of octahedron projection (OHP) , where a sphere is projected onto faces of an 8-face octahedron 310. The eight faces 320 lifted from the octahedron 310 can be converted to an intermediate format 330 by cutting open the face edge between  faces  1 and 5 and rotating  faces  1 and 5 to connect to faces 2 and 6 respectively, and applying a similar process to faces 3 and 7. The intermediate format can be packed into a rectangular picture 340.
Fig. 4 illustrates an example of icosahedron projection (ISP) , where a sphere is projected onto faces of a 20-face icosahedron 410. The twenty faces 420 from the icosahedron 410 can be packed into a rectangular picture 430 (referred to as a projection layout) .
Segmented sphere projection (SSP) has been disclosed in JVET-E0025 (Zhang et al., “AHG8: Segmented Sphere Projection for 360-degree video” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 5th Meeting: Geneva, CH, 12–20 January 2017, Document: JVET-E0025) as a method to convert a spherical image into an SSP format. Fig. 5 illustrates an example of segmented sphere projection, where a spherical image 500 is mapped into a North Pole image 510, a South Pole image 520 and an equatorial segment image 530. The boundaries of 3 segments correspond to latitudes 45°N (502) and 45°S (504) , where 0° corresponds to the equator (506) . The North and South Poles are mapped into 2 circular areas (i.e., 510 and 520) , and the projection of the equatorial segment can be the same as ERP or equal-area projection (EAP) . The diameter of the circle is equal to the width of the equatorial segments because both Pole segments and equatorial segment have a 90° latitude span. The North Pole image 510, South Pole image 520 and the equatorial segment image 530 can be packed into a rectangular image.
Fig. 6 illustrates an example of rotated sphere projection (RSP) , where the sphere 610 is partitioned into a middle 270°x90° region 620, and a residual part 622. Each part of RSP can be further stretched on the top side and the bottom side to generate a deformed part having an oval shape. The two oval-shaped parts can be fitted into a rectangular frame 630 as shown in Fig. 6.
The Adjusted Cubemap Projection format (ACP) is based on the CMP. If the two-dimensional coordinate (u’, v’) for CMP is determined, the two-dimensional coordinate (u, v) for ACP can be calculated by adjusting (u’, v’) according to a set of equations:
[Equations (1) and (2), the ACP coordinate adjustment formulas, appear as images in the original document.]
The 3D coordinates (X, Y, Z) can be derived using a table given the position (u, v) and the face index f. For 3D-to-2D coordinate conversion, given (X, Y, Z) , the (u’, v’) and face index f can be calculated according to a table for CMP. The 2D coordinates for ACP can be calculated according to a set of equations.
Similar to ERP, the EAP also maps a sphere surface to one face. In the (u, v) plane, u and v are in the range [0, 1]. For 2D-to-3D coordinate conversion, given the sampling position (m, n) , 2D coordinates (u, v) are first calculated in the same way as ERP. Then, the longitude and latitude (φ, θ) on the sphere can be calculated from (u, v) as:
φ = (u-0.5) * (2*π)                 (3)
θ = sin⁻¹ (1.0-2*v)                   (4)
Finally, (X, Y, Z) can be calculated using the same equations as used for ERP:
X = cos (θ) cos (φ)                 (5)
Y = sin (θ)                          (6)
Z = -cos (θ) sin (φ)                (7)
Inversely, the longitude and latitude (φ, θ) can be evaluated from (X, Y, Z) coordinates using:
φ = tan⁻¹ (-Z/X)                         (8)
θ = sin⁻¹ (Y/ (X²+Y²+Z²) ^(1/2))           (9)
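A direct transcription of equations (3) to (9) into Python, assuming u and v in [0, 1]; this is a sketch for checking the conversions, not part of the reference 360Lib implementation. Equation (8) is written with tan⁻¹ in the text; atan2 is used here to keep the correct quadrant.

```python
import math

def eap_2d_to_3d(u, v):
    """Equations (3)-(7): map EAP plane coordinates (u, v) in [0, 1] to (X, Y, Z)."""
    phi = (u - 0.5) * 2.0 * math.pi            # (3)
    theta = math.asin(1.0 - 2.0 * v)           # (4)
    X = math.cos(theta) * math.cos(phi)        # (5)
    Y = math.sin(theta)                        # (6)
    Z = -math.cos(theta) * math.sin(phi)       # (7)
    return X, Y, Z

def xyz_to_lonlat(X, Y, Z):
    """Equations (8)-(9): recover longitude and latitude from (X, Y, Z)."""
    phi = math.atan2(-Z, X)                                    # (8)
    theta = math.asin(Y / math.sqrt(X * X + Y * Y + Z * Z))    # (9)
    return phi, theta
```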
Since the images or video associated with virtual reality may take a lot of space to store or a lot of bandwidth to transmit, image/video compression is often used to reduce the required storage space or transmission bandwidth. Inter prediction has been a powerful coding tool to explore the inter-frame redundancy using motion estimation/compensation. If conventional Inter prediction is applied to the 2D frames converted from a 3D space, the motion estimation/compensation techniques may not work properly since an object in the 3D space may become distorted or deformed in the 2D frames due to object movement or relative motion between an object and a camera. In order to improve Inter prediction for 2D frames converted from a 3D space, various Inter prediction techniques are developed to improve the accuracy of Inter prediction for 2D frames converted from a 3D space.
SUMMARY
Methods and apparatus of processing 360-degree virtual reality images are disclosed. According to one method, input data for a current block in a 2D (two-dimensional) frame are received, where the 2D frame is projected from a 3D (three-dimensional) sphere. A first motion vector associated with a neighbouring block in the 2D frame is determined, where the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame. The first motion vector is projected onto the 3D sphere according to a target projection. The first motion vector is rotated in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a second motion vector in the 3D sphere.  The second motion vector in the 3D sphere is mapped back to the 2D frame according to an inverse target projection. The current block in the 2D frame is encoded or decoded using the second motion vector. The second motion vector may be included as a candidate in a Merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list for encoding or decoding of the current block.
In one embodiment, the rotation circle corresponds to a largest circle on the surface of the 3D sphere. In another embodiment, the rotation circle is smaller than a largest circle on the surface of the 3D sphere. The target projection may correspond to Equirectangular Projection (ERP) and Cubemap Projection (CMP) , Adjusted Cubemap Projection (ACP) , Equal-Area Projection (EAP) , Octahedron Projection (OHP) , Icosahedron Projection (ISP) , Segmented Sphere Projection (SSP) , Rotated Sphere Projection (RSP) , or Cylindrical Projection (CLP) .
In one embodiment, said projecting the first motion vector onto the 3D sphere comprises projecting the first start location, the first end location and a second start location in the 2D frame onto the 3D sphere according to the target projection, where the second start location is at a corresponding location in the current block corresponding to the first start location in the neighbouring block. Said rotating the first motion vector in the 3D sphere along the rotation circle comprises determining a target rotation for rotating from the first start location to the second start location in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis and rotating the first end location to a second end location on the 3D sphere using the target rotation. Said mapping the second motion vector in the 3D sphere back to the 2D frame comprises mapping the second end location on the 3D sphere back to the 2D frame according to the inverse target projection and determining the second motion vector in the 2D frame based on the second start location and the second end location in the 2D frame.
According to another method, two 2D frames are received, where said two 2D frames are projected, using a target projection, from a 3D sphere corresponding to two different viewpoints, and a current block and a neighbouring block are located in said two 2D frames. A forward point of camera is determined based on said two 2D frames. Moving flows in said two 2D frames are determined. Displacement of camera based on a first motion vector associated with the neighbouring block is determined. A second motion vector associated with the current block is determined based on the displacement of camera. The current block in the 2D frame is then encoded or decoded using the second motion vector.
For the above method, the moving flows in said two 2D frames can be calculated from a tangent direction at each pixel in said two 2D frames. The second motion vector can be included as a candidate in a Merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list for encoding or decoding of the current block. The target projection may correspond to Equirectangular Projection (ERP), Cubemap Projection (CMP), Adjusted Cubemap Projection (ACP), Equal-Area Projection (EAP), Octahedron Projection (OHP), Icosahedron Projection (ISP), Segmented Sphere Projection (SSP), Rotated Sphere Projection (RSP), or Cylindrical Projection (CLP).
According to yet another method, input data for a current block in a 2D frame are received, wherein the 2D frame is projected from a 3D sphere according to a target projection. A first motion vector associated with a neighbouring block in the 2D frame is determined, where the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame. The first motion vector is scaled to generate a second motion vector. The current block in the 2D frame is then encoded or decoded using the second motion vector.
In one embodiment of the above method, said scaling the first motion vector to generate the second motion vector comprises projecting the first start location, the first end location and a second start location in the 2D frame onto the 3D sphere according to the target projection, where the second start location is at a corresponding location in the current block corresponding to the first start location in the neighbouring block. Said scaling the first motion vector to generate the second motion vector further comprises scaling a longitude component of the first motion vector to generate a scaled longitude component of the first motion vector; scaling a latitude component of the first motion vector to generate a scaled latitude component of the first motion vector; and determining a second end location corresponding to the second start location based on the scaled longitude component of the first motion vector and the scaled latitude component of the first motion vector. Said scaling the first motion vector to generate the second motion vector further comprises mapping the second end location on the 3D sphere back to the 2D frame according to an inverse target projection; and determining the second motion vector in the 2D frame based on the second start location and the second end location in the 2D frame.
In another embodiment, said scaling the first motion vector to generate the second motion vector comprises applying a first combined function to generate an x-component of the second motion vector and applying a second combined function to generate a y-component of the second motion vector; where the first combined function and the second combined function are dependent on the first start location, a second start location in the current block associated with corresponding first start location, the first motion vector and a target projection; and wherein the first combined function and the second combined function combine the target projection for projecting first data in the 2D frame into second data in the 3D sphere, scaling a selected motion vector in the 3D sphere into a scaled motion vector in the 3D sphere, and inverse target projection for projecting the scaled motion vector into the 2D frame. In one embodiment, when the target projection corresponds to Equirectangular Projection (ERP) , the first combined function corresponds to 
mv x2 = mv x1 × cos (φ1) / cos (φ2) and the second combined function corresponds to an identity function, wherein φ1 corresponds to a first latitude associated with the first starting location and φ2 corresponds to a second latitude associated with the second starting location.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates an example of projecting a sphere into a rectangular image according to equirectangular projection, where each longitude line is mapped to a vertical line of the ERP picture.
Fig. 2 illustrates a cube with six faces, where a 360-degree virtual reality (VR) image can be projected to the six faces on the cube according to cubemap projection (CMP) .
Fig. 3 illustrates an example of octahedron projection (OHP) , where a sphere is projected onto faces of an 8-face octahedron.
Fig. 4 illustrates an example of icosahedron projection (ISP) , where a sphere is projected onto faces of a 20-face icosahedron.
Fig. 5 illustrates an example of segmented sphere projection (SSP) , where a spherical image is  mapped into a North Pole image, a South Pole image and an equatorial segment image.
Fig. 6 illustrates an example of rotated sphere projection (RSP) , where the sphere is partitioned into a middle 270°x90° region and a residual part. These two parts of RSP can be further stretched on the top side and the bottom side to generate deformed parts having oval-shaped boundary on the top part and bottom part.
Fig. 7A and Fig. 7B illustrate examples of deformation of 2D image due to the rotation of sphere.
Fig. 8 illustrates an example of rotation of the sphere from point a to point b on a big circle on the sphere, where the big circle corresponds to the largest circle on the surface of the sphere.
Fig. 9 illustrates an example of rotation of the sphere from point a to point b on a small circle on the sphere, where the small circle corresponds to a circle smaller than the largest circle on the surface of the sphere.
Fig. 10 illustrates an example of deriving the motion vector for 2D projected pictures using the rotation of the sphere model.
Fig. 11 illustrates an example of using the MV derived based on rotation of sphere as a Merge or AMVP candidate.
Fig. 12 illustrates an example of an object (i.e., a tree) projected onto the surface of a sphere at different camera locations.
Fig. 13 illustrates an example of an ERP frame overlaid with the pattern of moving flows, where the flow of background (i.e., static object) can be determined if the camera forward point is known.
Fig. 14 illustrates an exemplary procedure of MV derivation based on displacement of viewpoint.
Fig. 15 illustrates exemplary MV derivation based on displacement of viewpoint for various projection methods.
Fig. 16 illustrates an example of deformation associated with motion in the ERP (Equirectangular Projection) frame.
Fig. 17 illustrates an exemplary procedure for the MV scaling technique in 3D sphere.
Fig. 18 illustrates an exemplary procedure of MV scaling in a 2D frame.
Fig. 19 illustrates an exemplary procedure of MV scaling in the ERP (Equirectangular Projection) frame.
Fig. 20 illustrates an exemplary procedure of MV scaling in an Equirectangular Projection (ERP) frame, where a current block and a neighbouring block in an ERP picture are shown.
Fig. 21 illustrates an exemplary procedure of MV scaling in a Cubemap Projection (CMP) frame, where a current block and a neighbouring block in a CMP picture are shown.
Fig. 22 illustrates an exemplary procedure of MV scaling in a Segmented Sphere Projection (SSP) frame, where a current block and a neighbouring block in an SSP picture are shown.
Fig. 23 illustrates an exemplary procedure of MV scaling in an Octahedron Projection (OHP) frame, where a current block and a neighbouring block in an OHP picture are shown.
Fig. 24 illustrates an exemplary procedure of MV scaling in an Icosahedron Projection (ISP) frame, where a current block and a neighbouring block in an ISP picture are shown.
Fig. 25 illustrates an exemplary procedure of MV scaling in an Equal-Area Projection (EAP) frame, where a current block and a neighbouring block in an EAP picture are shown.
Fig. 26 illustrates an exemplary procedure of MV scaling in an Adjusted Cubemap Projection  (ACP) frame, where a current block and a neighbouring block in an ACP picture are shown.
Fig. 27 illustrates an exemplary procedure of MV scaling in a Rotated Sphere Projection (RSP) frame, where a current block and a neighbouring block in an RSP picture are shown.
Fig. 28 illustrates an exemplary procedure of MV scaling in a Cylindrical Projection (CLP) frame, where a current block and a neighbouring block in a CLP picture are shown.
Fig. 29 illustrates an exemplary flowchart of a system that applies rotation of sphere to adjust motion vector for processing 360-degree virtual reality images according to an embodiment of the present invention.
Fig. 30 illustrates an exemplary flowchart of a system that derives motion vector from displacement of viewpoint for processing 360-degree virtual reality images according to an embodiment of the present invention.
Fig. 31 illustrates an exemplary flowchart of a system that applies scaling to adjust motion vector for processing 360-degree virtual reality images according to an embodiment of the present invention.
DETAILED DESCRIPTION OF PREFERRED IMPLEMENTATIONS
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
For conventional Inter prediction in video coding, motion estimation/compensation is widely used to explore correlation in video data in order to reduce transmitted information. The conventional video contents correspond to 2D video data and the motion estimation and compensation techniques often assume a translational motion. In the next generation video coding, more advanced motion models, such as affine model, are considered. Nevertheless, these techniques derive a 3D motion model based on the 2D images.
In the present invention, the motion information is derived in the 3D domain so that more accurate motion information may be obtained. According to one method, the rotation of the sphere is assumed to be the cause of block deformation in the 2D video contents. Fig. 7A and Fig. 7B illustrate examples of deformation of a 2D image due to the rotation of the sphere. In Fig. 7A, block 701 on a 3D sphere 700 is moved to become block 702. When block 701 and moved block 702 are mapped to an ERP frame 703, the two corresponding blocks become blocks 704 and 705 in the ERP frame. While the blocks (i.e., 701 and 702) on the 3D sphere correspond to a same block, the two blocks (i.e., 704 and 705) have different shapes in the ERP frame. In other words, object movement or rotation in the 3D space may cause object distortion in the 2D frame (i.e., the ERP frame in this example). In Fig. 7B, an area 711 is on the surface of a sphere 710. The area 711 is moved to area 714 by the rotation of the sphere along path 713. A motion vector 712 associated with area 711 is mapped to motion vector 715 due to the rotation of the sphere. The correspondences of parts 711-715 in the 2D domain 720 are shown as parts 721-725. In another example, the parts corresponding to 713-715 for trajectory 733 on the surface of a sphere 730 are shown as parts 733-735 respectively, and the correspondences of parts 733-735 in the 2D domain 740 are shown as parts 743-745. In yet another example, the parts corresponding to 713-715 for trajectory 753 on the surface of a sphere 750 are shown as parts 753-755 respectively, and the correspondences of parts 753-755 in the 2D domain 760 are shown as parts 763-765.
Fig. 8 illustrates an example of rotation of the sphere 800 from point a 830 to point b 840 on a big circle 820 on the sphere 810, where the big circle 820 corresponds to the largest circle on the surface of the sphere 810. The axis of rotation 850 is shown as an arrow in Fig. 8. The angle of rotation is θ a.
Fig. 9 illustrates an example of rotation of the sphere 900 from point a to point b on a small circle 910 on the sphere 920, where the small circle 910 corresponds to a circle smaller than the largest circle (e.g. circle 930) on the surface of the sphere 920. The centre point of rotation is shown as a dot 912 in Fig. 9. Another example shows rotation of the sphere 950 from point a to point b on a small circle 960 on the sphere 970. A big circle 980 (i.e., the largest circle) is indicated in Fig. 9. The axis of rotation 990 is shown as an arrow in Fig. 9.
Fig. 10 illustrates an example of deriving the motion vector for 2D projected pictures using the rotation of the sphere model. In Fig. 10, illustration 1010 depicts an example of deriving the motion vector for 2D projected pictures using the rotation of the sphere model according to an embodiment of the present invention. Locations a and b are two locations in a 2D projected picture. Motion vector mv a points from a to a'. The goal is to find the motion vector mv b for location b. The present invention projects the 2D projected picture onto a 3D sphere 1020 using a 2D-to-3D projection. Rotation around a small circle or big circle is applied to rotate location a to location b. Location a' is rotated to location b' according to the same selected rotation of the sphere. An inverse projection is then applied to the 3D sphere to convert the location b' to the 2D frame 1030. The motion vector for location b can be calculated according to mv b = b' – b. Motion vectors mv a and mv b are two-dimensional vectors in the (x, y) domain. Locations a, a', b and b' are two-dimensional coordinates in the (x, y) domain. The locations of a, a', b and b' on the 3D sphere are expressed as three-dimensional coordinates in the (θ, φ) domain.
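To make the geometry concrete, the following is a minimal numerical sketch of this derivation for an ERP frame of width W and height H, assuming a unit sphere and a rotation along the big circle through a and b; the helper names (erp_to_sphere, sphere_to_erp, rotate) are illustrative and not part of this disclosure:

    import numpy as np

    def erp_to_sphere(p, W, H):
        # Map an ERP pixel (x, y) to a unit vector on the sphere.
        lon = (p[0] / W) * 2.0 * np.pi - np.pi          # longitude in [-pi, pi)
        lat = np.pi / 2.0 - (p[1] / H) * np.pi          # latitude in [-pi/2, pi/2]
        return np.array([np.cos(lat) * np.cos(lon),
                         np.cos(lat) * np.sin(lon),
                         np.sin(lat)])

    def sphere_to_erp(v, W, H):
        # Map a point on the sphere back to an ERP pixel (x, y).
        v = v / np.linalg.norm(v)
        lon = np.arctan2(v[1], v[0])
        lat = np.arcsin(np.clip(v[2], -1.0, 1.0))
        return np.array([(lon + np.pi) / (2.0 * np.pi) * W,
                         (np.pi / 2.0 - lat) / np.pi * H])

    def rotate(v, axis, angle):
        # Rodrigues' rotation of v around the unit axis by the given angle.
        axis = axis / np.linalg.norm(axis)
        return (v * np.cos(angle)
                + np.cross(axis, v) * np.sin(angle)
                + axis * np.dot(axis, v) * (1.0 - np.cos(angle)))

    def derive_mv_by_rotation(a, mv_a, b, W, H):
        # Big-circle case: the rotation that moves a to b (axis = a x b) is applied to a'.
        # Seam wrap-around and the degenerate case a == b are ignored in this sketch.
        a3, b3 = erp_to_sphere(a, W, H), erp_to_sphere(b, W, H)
        a3_end = erp_to_sphere(a + mv_a, W, H)
        axis = np.cross(a3, b3)
        angle = np.arccos(np.clip(np.dot(a3, b3), -1.0, 1.0))
        b3_end = rotate(a3_end, axis, angle)
        return sphere_to_erp(b3_end, W, H) - b           # mv_b = b' - b

For a rotation along a small circle, only the choice of the rotation axis changes; the projection, rotation and inverse-projection steps remain the same.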
The derived motion vector can be used as a candidate for video coding using the Merge or AMVP (advanced motion vector prediction) mode, where the Merge mode and the AMVP mode are techniques to code a block or the motion information of the block predictively as disclosed in HEVC (High Efficiency Video Coding). When a block is coded in the Merge mode, the current block reuses the motion information of a block indicated by a Merge index pointing to a selected candidate in a Merge candidate list. When a block is coded in the AMVP mode, the motion vector of the current block is coded predictively using a predictor indicated by an AMVP index pointing to a selected candidate in the AMVP candidate list. Fig. 11 illustrates an example of using the MV derived based on rotation of the sphere as a Merge or AMVP candidate. In Fig. 11, layout 1110 of the neighbouring blocks that are used to derive Merge candidates is shown for block 1112, where the neighbouring blocks include spatial neighbouring blocks A0, A1, B0, B1 and B2 and temporal neighbouring blocks Col0 or Col1. For a current block 1120 in a 2D frame, the motion vectors from blocks A0 and B0 can be used to derive motion candidates for the current block. As disclosed above, the motion vector mv a of neighbouring block A0 can be used to derive a corresponding motion vector mv a' for the current block according to the rotation of the sphere. Similarly, the motion vector mv b of neighbouring block B0 can be used to derive a corresponding motion vector mv b' for the current block according to the rotation of the sphere. Motion vectors mv a' and mv b' can be included in the Merge or AMVP candidate list.
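As a purely schematic sketch (not the normative HEVC candidate construction order or pruning rule), the sphere-adjusted motion vectors could be collected into a candidate list as follows, reusing derive_mv_by_rotation from the earlier sketch; the function name and the rounding-based pruning are illustrative assumptions:

    import numpy as np

    def build_candidate_list(neighbour_mvs, current_pos, W, H, max_candidates=5):
        # neighbour_mvs: list of (start location a, mv_a) pairs from A0, A1, B0, B1, B2, ...
        candidates = []
        for start, mv in neighbour_mvs:
            derived = derive_mv_by_rotation(np.asarray(start, float),
                                            np.asarray(mv, float),
                                            np.asarray(current_pos, float), W, H)
            key = tuple(np.round(derived, 2))            # simple duplicate pruning
            if key not in candidates:
                candidates.append(key)
            if len(candidates) == max_candidates:
                break
        return candidates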
In the present invention, motion vector derivation based on displacement of viewpoint is disclosed. Fig. 12 illustrates an example of an object (i.e., a tree) projected onto the surface of a sphere at different camera locations. At camera position A, the tree is projected onto sphere 1210 to form an image 1240 of the tree. At a forward position B, the tree is projected onto sphere 1220 to form an image 1250 of the tree. The image 1241 of the tree corresponding to camera position A is also shown on sphere 1220 for comparison. At a further forward position C, the tree is projected onto sphere 1230 to form an image 1260 of the tree. Image 1242 of the tree corresponding to camera position A and image 1251 of the tree corresponding to camera position B are also shown on sphere 1230 for comparison. In Fig. 12, for a video captured by a camera moving in a straight line, the direction of movement of the camera in the 3D space (as indicated by arrows 1212, 1222 and 1232 for three different camera locations in Fig. 12) can be represented by latitude and longitude coordinates (θ, φ), where (θ, φ) corresponds to the intersection of the camera motion direction and the 3D sphere. The point (θ, φ) is projected onto the 2D target projection plane and the projected point is referred to as the forward point.
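Under the same illustrative ERP helpers as in the earlier sketch, the forward point can be obtained by normalising the camera displacement and mapping the resulting direction to the 2D frame. This is a sketch assuming the 3D camera displacement between the two viewpoints is available:

    import numpy as np

    def forward_point_erp(camera_motion, W, H):
        # camera_motion: 3D displacement of the camera between the two viewpoints.
        d = np.asarray(camera_motion, float)
        return sphere_to_erp(d / np.linalg.norm(d), W, H)   # intersection with the sphere, mapped to ERP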
Fig. 13 illustrates an example of an ERP frame overlaid with the pattern of moving flows, where the flow of the background (i.e., static objects) can be determined if the camera forward point is known. The flows are indicated by arrows. The camera forward point 1310 and camera backward point 1320 are shown. Moving flows correspond to the moving direction of video content when the camera moves in a given direction. The movement of the camera causes the relative movement of the static background objects, and the moving direction of a background object in the 2D frame captured by the camera can be represented as moving flows. Fig. 14 illustrates an exemplary procedure of MV derivation based on displacement of viewpoint. In Fig. 14, an object 1410 is projected to the sphere 1420 to form an image 1422 on the surface of the sphere corresponding to camera location 1424. At camera location 1434, the object 1410 is projected to the sphere 1430 to form an image 1432 on the surface of the sphere. The corresponding location of the object is also shown on the surface of the sphere 1430 at location 1423. The procedure to derive the motion vector based on displacement of viewpoint is as follows (a simplified numerical sketch is given after this list):
·Find the forward point 1440 of the camera.
·Calculate the moving flows 1450 in the 2D frame (i.e., the tangent direction at each pixel).
·Determine the motion vector of the current block by using the MV of the neighbouring block:
a. The MV of the neighbouring block can be used to determine the displacement of the camera, as indicated by arrow 1460;
b. The displacement of the camera can be used to determine the MV of the current block, as indicated by arrow 1470.
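The following sketch makes steps a and b concrete under a deliberately simplified model that is not stated in this disclosure: the background is static and lies at a known (here, unit) range from the camera, and the camera moves along the known forward direction. The ERP helpers are reused from the earlier sketch and all names are illustrative:

    import numpy as np

    def mv_from_camera_displacement(pixel, cam_disp, W, H, depth=1.0):
        # Apparent MV of a static background point when the camera moves by cam_disp.
        p = erp_to_sphere(np.asarray(pixel, float), W, H)
        p_new = depth * p - np.asarray(cam_disp, float)     # point seen from the moved camera
        return sphere_to_erp(p_new, W, H) - np.asarray(pixel, float)

    def estimate_camera_displacement(pixel_a, mv_a, forward_dir, W, H, depth=1.0, n=200):
        # Step a: search the displacement magnitude along forward_dir that best explains mv_a.
        ts = np.linspace(0.0, 0.9 * depth, n)
        errs = [np.linalg.norm(mv_from_camera_displacement(pixel_a, t * forward_dir, W, H, depth)
                               - np.asarray(mv_a, float)) for t in ts]
        return ts[int(np.argmin(errs))]

    def derive_mv_current(pixel_b, forward_dir, t_hat, W, H, depth=1.0):
        # Step b: apply the estimated camera displacement at the current block location.
        return mv_from_camera_displacement(pixel_b, t_hat * np.asarray(forward_dir, float), W, H, depth)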
The MV derivation based on displacement of viewpoint can be applied to various projection methods. The moving flow in the 2D frame can be mapped to the 3D sphere. The moving flow on the 3D sphere 1510 is shown in Fig. 15, where the forward point and two different lines of moving flow (1512 and 1514) are shown. The moving flow on the 3D sphere associated with ERP 1520 is shown in Fig. 15, where the moving flows are shown for an ERP frame 1526. The moving flow on the 3D sphere associated with CMP 1530 is shown in Fig. 15, where the moving flows are shown for six faces of a CMP frame 1536 in a 2x3 layout format. The moving flow on the 3D sphere associated with OHP 1540 is shown in Fig. 15, where the moving flows are shown for eight faces of an OHP frame 1546. The moving flow on the 3D sphere associated with ISP 1550 is shown in Fig. 15, where the moving flows are shown for twenty faces of an ISP frame 1556. The moving flow on the 3D sphere associated with SSP 1560 is shown in Fig. 15, where the moving flows are shown for segmented faces of an SSP frame 1566.
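A small sketch of the moving-flow (tangent-direction) computation for ERP is given below; it assumes that the flow of the static background points away from the forward point along the great circle through the pixel direction and the forward point, and it reuses the ERP helpers from the earlier sketch:

    import numpy as np

    def moving_flow_erp(pixel, forward_dir, W, H, eps=1e-3):
        # Tangent direction of the background flow at `pixel` for a camera moving toward forward_dir.
        p = erp_to_sphere(np.asarray(pixel, float), W, H)
        f = np.asarray(forward_dir, float)
        f = f / np.linalg.norm(f)
        toward_f = f - np.dot(f, p) * p                  # component of f tangent to the sphere at p
        n = np.linalg.norm(toward_f)
        if n < 1e-9:                                     # flow is undefined at the forward/backward point
            return np.zeros(2)
        q = p - eps * (toward_f / n)                     # background flows away from the forward point
        return (sphere_to_erp(q, W, H) - np.asarray(pixel, float)) / eps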
As mentioned before, there are various mappings, such as ERP, CMP, SSP, OHP and ISP, to project from a 3D sphere to a 2D frame. When an object in the 3D sphere moves, the corresponding object in a 2D frame may become deformed. The deformation is projection-type dependent: the length, angle, and area may not remain uniform at all mapping positions in the 2D frame. An MV scaling technique on the 3D sphere is disclosed, which handles the deformation by scaling the MV on the 3D sphere so as to minimize the effect of the projection type on deformation. An MV scaling technique on the 2D frame is also disclosed, which can be applied directly to 2D frames by combining the projection, MV scaling and inverse projection into a single function.
Fig. 16 illustrates an example of deformation associated with motion in the ERP frame. In Fig. 16, three motion vectors (1612, 1614 and 1616) along three latitude lines are shown on the surface of the 3D sphere 1610, where the three motion vectors have about the same length. If the surface of the 3D sphere is unfolded into a 2D plane 1620, the three motion vectors (1622, 1624 and 1626) maintain their equal-length property. For the ERP frame, the unfolded image needs to be stretched, where more stretching is needed at higher latitudes. Therefore, the three motion vectors (1632, 1634 and 1636) have different lengths as shown in Fig. 16. If a neighbouring block having motion vector 1634 is used for a current block (i.e., a motion vector at location 1632), motion vector 1634 has to be properly scaled before being used for coding, such as for a Merge or AMVP candidate. The example in Fig. 16 illustrates the need for MV scaling.
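As a rough numerical illustration, the ERP horizontal stretch factor is 1/cos(latitude): a spherical displacement that appears as a 16-pixel horizontal motion vector at the equator appears as roughly 32 pixels at 60° latitude (since 1/cos 60° = 2), so reusing the 32-pixel vector at the equator without scaling would roughly double the intended motion.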
An exemplary procedure for the MV scaling technique on the 3D sphere is disclosed as follows. Suppose the starting point of mv a is a, and the ending point of mv a is a'. The motion vector mv b at point b can be predicted by the following procedure (a code sketch is given after the numbered steps):
1. Map a, a', and b from the 2D frame 1710 to the 3D sphere 1720 as shown in Fig. 17. Let the mapping function be (θ, φ) = P_projection_type (x, y), which maps the pixel at (x, y) in the 2D frame to (θ, φ) on the 3D sphere, so that a, a' and b are mapped to (θ_a, φ_a), (θ_a', φ_a') and (θ_b, φ_b) respectively.
2. Calculate Δθ and Δφ: Δθ = θ_a' – θ_a and Δφ = φ_a' – φ_a.
3. Apply scaling functions scaleθ () and scaleφ () to Δθ and Δφ: Δθ' = scaleθ (Δθ) and Δφ' = scaleφ (Δφ).
4. Calculate (θ_b', φ_b') according to the following formula: θ_b' = θ_b + Δθ' and φ_b' = φ_b + Δφ'.
5. Map (θ_b', φ_b') from the 3D sphere 1720 to the 2D frame 1730 to produce b'. Let the inverse function be (x, y) = IP_projection_type (θ, φ), which maps the point at (θ, φ) on the 3D sphere to (x, y) in the 2D frame.
6. MV mv b can be determined according to mv b = b' – b.
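A minimal sketch of steps 1 to 6 for the ERP case is given below. The default scaling functions are identities; a latitude-compensated variant can be passed in, e.g. scale_theta=lambda dt, lat_a, lat_b: dt * np.cos(lat_a) / np.cos(lat_b). The function names and signatures are illustrative assumptions rather than part of this disclosure:

    import numpy as np

    def scale_mv_on_sphere(a, mv_a, b, W, H,
                           scale_theta=lambda dt, lat_a, lat_b: dt,
                           scale_phi=lambda dp, lat_a, lat_b: dp):
        to_lonlat = lambda p: ((p[0] / W) * 2.0 * np.pi - np.pi,
                               np.pi / 2.0 - (p[1] / H) * np.pi)
        to_pixel = lambda lon, lat: np.array([(lon + np.pi) / (2.0 * np.pi) * W,
                                              (np.pi / 2.0 - lat) / np.pi * H])
        a, mv_a, b = (np.asarray(v, float) for v in (a, mv_a, b))
        th_a, ph_a = to_lonlat(a)                        # step 1: map a, a' and b onto the sphere
        th_a2, ph_a2 = to_lonlat(a + mv_a)
        th_b, ph_b = to_lonlat(b)
        d_th, d_ph = th_a2 - th_a, ph_a2 - ph_a          # step 2: delta theta / delta phi
        d_th2 = scale_theta(d_th, ph_a, ph_b)            # step 3: apply the scaling functions
        d_ph2 = scale_phi(d_ph, ph_a, ph_b)
        b2 = to_pixel(th_b + d_th2, ph_b + d_ph2)        # steps 4-5: offset b and map back to 2D
        return b2 - b                                    # step 6: mv_b = b' - b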
In the above procedure, the scaling functions for Δθ and Δφ are scaleθ () and scaleφ (), respectively. Some examples of scaling functions are shown as follows.
Example 1: scaleθ (Δθ) = Δθ and scaleφ (Δφ) = Δφ. Therefore, Δθ' = Δθ and Δφ' = Δφ.
Example 2: scaleθ (Δθ) = Δθ × cos (φ_a) / cos (φ_b) and scaleφ (Δφ) = Δφ. Therefore, Δθ' = Δθ × cos (φ_a) / cos (φ_b) and Δφ' = Δφ.
Example 3: Generic form of scaling function–projection independent
Δθ' = scaleθ (Δθ, θ_a, φ_a, θ_b, φ_b) and Δφ' = scaleφ (Δφ, θ_a, φ_a, θ_b, φ_b).
Example 4: Generic form of scaling function–projection dependent
Δθ' = scaleθ (Δθ, θ_a, φ_a, θ_b, φ_b, projection type) and Δφ' = scaleφ (Δφ, θ_a, φ_a, θ_b, φ_b, projection type).
The MV scaling in the 3D domain can also be performed in a 2D domain by combining projection function, scaling function, and inverse projection function into a single function f (a, b, mv a, projection type) . Therefore, the mapping between 3D and 2D can be skipped.
mv b = (x mvb, y mvb) = f (a, b, mv a, projection type)
x mvb = f x (a, b, mv a, projection type)
y mvb = f y (a, b, mv a, projection type)
In the above equation, f x (a, b, mv a, projection type) is the single function to generate x mvb and f y (a, b, mv a, projection type) is the single function to generate y mvb. Fig. 18 illustrates an exemplary procedure of MV scaling in a 2D frame. Two blocks (A and B) in a 2D frame 1810 are shown. The motion vector for location a in block A is marked. The end position for location a associated with mv a is denoted as a' in Fig. 18. Location b in block B is marked. The end position for location b associated with mv b is denoted as b' in Fig. 18. The steps of the projection function (i.e., forward projection 1830), the scaling function in the 3D sphere 1820, and the inverse projection function (i.e., inverse projection 1840) can be combined into a single function according to the above equation.
An example of MV scaling in an ERP frame is shown in Fig. 19. In Fig. 19, the surface of the 3D sphere 1910 is unfolded into a 2D plane 1920. For the ERP frame 1930, the unfolded image needs to be stretched so that all latitude lines have the same length. Based on the features of ERP, a horizontal line is enlarged more when it is closer to the North or South Pole. In the ERP frame, a neighbouring block 1932 has a motion vector mv 1. A motion vector mv 2 needs to be derived based on mv 1 for the current block 1934. The derived motion vector can be used for coding the current block. For example, the derived motion vector can be used as a Merge or AMVP candidate. Since the neighbouring block is located at a different location on the 3D sphere, the motion vector associated with the neighbouring block needs to be scaled before it is used for coding the current block. In particular, the motion vector of the neighbouring block is stretched more in the x-direction than a motion vector for the current block since the neighbouring block is at a higher latitude. Therefore, the x-component of mv 1 needs to be scaled down before it is used for the current block. The MV scaling process 1940 is shown in Fig. 19, where the motion vector mv 1 at location a of a neighbouring block is scaled to mv 2 and used for coding the current block. In one embodiment of the present invention, a scaling function to preserve the motion distance is disclosed, where mv 2 is a function of mv 1, θ1, φ1, θ2 and φ2:
mv x2 = mv x1 × cos (φ1) / cos (φ2) , and
mv y2 = mv y1.
In the above equations, (θ1, φ1) corresponds to the longitude and latitude of location a in the neighbouring block and (θ2, φ2) corresponds to the longitude and latitude of location b in the current block. Location a and location b may correspond to the centre of the respective blocks or another location of the respective blocks. The y-component of the derived motion vector is the same as the y-component of mv 1. In other words, the scaling function for the y-component is an identity function.
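A compact sketch of this ERP scaling in its combined single-function form, i.e. f (a, b, mv 1, ERP) of the previous section, assuming latitude varies linearly with the row index of an H-pixel-tall ERP frame, could look as follows:

    import numpy as np

    def combined_scale_mv_erp(a, b, mv1, H):
        # x-component is scaled by cos(lat_a)/cos(lat_b); y-component is kept (identity function).
        lat = lambda y: np.pi / 2.0 - (y / H) * np.pi
        lat_a, lat_b = lat(a[1]), lat(b[1])
        return np.array([mv1[0] * np.cos(lat_a) / np.cos(lat_b), mv1[1]])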
The MV scaling for other projections is also disclosed. In Fig. 20, the MV scaling for ERP is disclosed, where a current block 2012 and a neighbouring block 2014 in an ERP picture 2010 are shown. The neighbouring block has a motion vector mv 1. The motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, ERP) , where (x 1, y 1) corresponds to the (x, y) coordinates of location a in the neighbouring block and (x 2, y 2) corresponds to the (x, y) coordinates of location b in the current block.
In Fig. 21, the MV scaling for CMP is disclosed, where a current block 2112 and a neighbouring block 2114 in a CMP picture 2110 are shown. The neighbouring block has a motion vector mv 1. The motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, CMP) .
In Fig. 22, the MV scaling for SSP is disclosed, where a current block 2212 and a neighbouring block 2214 in an SSP picture 2210 are shown. The neighbouring block has a motion vector mv 1. The motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, SSP) .
In Fig. 23, the MV scaling for OHP is disclosed, where a current block 2312 and a neighbouring block 2314 in an OHP picture 2310 are shown. The neighbouring block has a motion vector mv 1. The motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, OHP) .
In Fig. 24, the MV scaling for ISP is disclosed, where a current block 2412 and a neighbouring block 2414 in an ISP picture 2410 are shown. The neighbouring block has a motion vector mv 1. The motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, ISP) .
In Fig. 25, the MV scaling for EAP is disclosed, where a current block 2512 and a neighbouring block 2514 in an EAP picture 2510 are shown. The neighbouring block has a motion vector mv 1. The motion  vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, EAP) .
In Fig. 26, the MV scaling for ACP is disclosed, where a current block 2612 and a neighbouring block 2614 in an ACP picture 2610 are shown. The neighbouring block has a motion vector mv 1. The motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, ACP) .
In Fig. 27, the MV scaling for RSP is disclosed, where a current block 2712 and a neighbouring block 2714 in an RSP picture 2710 are shown. The neighbouring block has a motion vector mv 1. The motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, RSP) .
Besides these projections mentioned above, cylindrical projection has also been used to project a 3D sphere into a 2D frame. Conceptually, cylindrical projections are created by wrapping a cylinder 2820 around a globe 2830 and projecting light through the globe onto the cylinder as shown in Fig. 28. Cylindrical projections represent meridians as straight, evenly-spaced, vertical lines and parallels as straight horizontal lines. Meridians and parallels intersect at right angles, as they do on the globe. Depending on the placement of the light source, various CLPs are generated. In Fig. 28, the MV scaling for Cylindrical Projection (CLP) is disclosed, where a current block 2812 and a neighbouring block 2814 in a CLP picture 2810 are shown. The neighbouring block has a motion vector mv 1. The motion vector mv 1 from the neighbouring block is used to derive a motion vector mv 2 for the current block. According to the present invention, the motion vector mv 1 from the neighbouring block needs to be scaled using a scaling function according to mv 2 = f (mv 1, x 1, y 1, x 2, y 2, CLP) .
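As one concrete variant (a sketch only; as noted above, different light-source placements give different CLPs), the central cylindrical projection with the light source at the globe centre maps longitude and latitude as follows:

    import numpy as np

    def central_cylindrical(lon, lat, R=1.0):
        # Meridians map to evenly spaced vertical lines; parallels map to horizontal lines at R*tan(lat).
        return np.array([R * lon, R * np.tan(lat)])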
Fig. 29 illustrates an exemplary flowchart of a system that applies rotation of the sphere to adjust a motion vector for processing 360-degree virtual reality images according to an embodiment of the present invention. The steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side. The steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart. According to this method, input data for a current block in a 2D (two-dimensional) frame are received in step 2910, where the 2D frame is projected from a 3D (three-dimensional) sphere. The input data may correspond to pixel data of a 2D frame to be encoded. A first motion vector associated with a neighbouring block in the 2D frame is determined in step 2920, where the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame. The first motion vector is projected onto the 3D sphere according to a target projection in step 2930. The first motion vector in the 3D sphere is rotated along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a second motion vector in the 3D sphere in step 2940. The second motion vector in the 3D sphere is mapped back to the 2D frame according to an inverse target projection in step 2950. The current block in the 2D frame is encoded or decoded using the second motion vector in step 2960.
Fig. 30 illustrates an exemplary flowchart of a system that derives motion vector from displacement of viewpoint for processing 360-degree virtual reality images according to an embodiment of the present invention. According to another method, two 2D (two-dimensional) frames are received in step 3010, where said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere corresponding to two different viewpoints, and where a current block and a neighbouring block are located in said two 2D frames. A forward point of camera is determined based on said two 2D frames in step 3020. Moving flows in said two 2D frames are determined in step 3030. Displacement of camera is determined based on a first motion vector associated with the neighbouring block in step 3040. A second motion vector associated with the current block is derived based on the displacement of camera in step 3050. The current block in the 2D frame is encoded or decoded using the second motion vector in step 3060.
Fig. 31 illustrates an exemplary flowchart of a system that applies scaling to adjust motion vector for processing 360-degree virtual reality images according to an embodiment of the present invention. According to this method, input data for a current block in a 2D (two-dimensional) frame are received in step 3110, where the 2D frame is projected from a 3D (three-dimensional) sphere according to a target projection. A first motion vector associated with a neighbouring block in the 2D frame is determined in step 3120, where the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame. The first motion vector is scaled to generate a second motion vector in step 3130. The current block in the 2D frame is encoded or decoded using the second motion vector in step 3140.
The flowcharts shown above are intended to serve as examples to illustrate embodiments of the present invention. A person skilled in the art may practice the present invention by modifying individual steps, or by splitting or combining steps, without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced without such specific details.
Embodiment of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more electronic circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) . These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats  or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Claims (21)

  1. A method of processing 360-degree virtual reality images, the method comprising:
    receiving input data for a current block in a 2D (two-dimensional) frame, wherein the 2D frame is projected from a 3D (three-dimensional) sphere;
    determining a first motion vector associated with a neighbouring block in the 2D frame, wherein the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame;
    projecting the first motion vector onto the 3D sphere according to a target projection;
    rotating the first motion vector in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a second motion vector in the 3D sphere;
    mapping the second motion vector in the 3D sphere back to the 2D frame according to an inverse target projection; and
    encoding or decoding the current block in the 2D frame using the second motion vector.
  2. The method of Claim 1, wherein the second motion vector is included as a candidate in a Merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list for encoding or decoding of the current block.
  3. The method of Claim 1, wherein the rotation circle corresponds to a largest circle on the surface of the 3D sphere.
  4. The method of Claim 1, wherein the rotation circle is smaller than a largest circle on the surface of the 3D sphere.
  5. The method of Claim 1, wherein the target projection corresponds to Equirectangular Projection (ERP), Cubemap Projection (CMP), Adjusted Cubemap Projection (ACP), Equal-Area Projection (EAP), Octahedron Projection (OHP), Icosahedron Projection (ISP), Segmented Sphere Projection (SSP), Rotated Sphere Projection (RSP), or Cylindrical Projection (CLP).
  6. The method of Claim 1, wherein said projecting the first motion vector onto the 3D sphere comprises projecting the first start location, the first end location and a second start location in the 2D frame onto the 3D sphere according to the target projection, wherein the second start location is at a corresponding location in the current block corresponding to the first start location in the neighbouring block.
  7. The method of Claim 6, wherein said rotating the first motion vector in the 3D sphere along the rotation circle comprises determining a target rotation for rotating from the first start location to the second start location in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis and rotating the first end location to a second end location on the 3D sphere using the target rotation.
  8. The method of Claim 7, wherein said mapping the second motion vector in the 3D sphere back to the 2D frame comprises mapping the second end location on the 3D sphere back to the 2D frame according to the inverse target projection and determining the second motion vector in the 2D frame based on the second start location and the second end location in the 2D frame.
  9. An apparatus for processing 360-degree virtual reality images, the apparatus comprising one or more electronic devices or processors configured to:
    receive input data for a current block in a 2D (two-dimensional) frame, wherein the 2D frame is projected  from a 3D (three-dimensional) sphere;
    determine a first motion vector associated with a neighbouring block in the 2D frame, wherein the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame;
    project the first motion vector onto the 3D sphere according to a target projection;
    rotate the first motion vector in the 3D sphere along a rotation circle on a surface of the 3D sphere around a rotation axis to generate a second motion vector in the 3D sphere;
    map the second motion vector in the 3D sphere back to the 2D frame according to an inverse target projection; and
    encode or decode the current block in the 2D frame using the second motion vector.
  10. A method of processing 360-degree virtual reality images, the method comprising:
    receiving two 2D (two-dimensional) frames, wherein said two 2D frames are projected, using a target projection, from a 3D (three-dimensional) sphere corresponding to two different viewpoints, and wherein a current block and a neighbouring block are located in said two 2D frames;
    determining a forward point of camera based on said two 2D frames;
    determining moving flows in said two 2D frames;
    determining displacement of camera based on a first motion vector associated with the neighbouring block;
    deriving a second motion vector associated with the current block based on the displacement of camera; and
    encoding or decoding the current block in the 2D frame using the second motion vector.
  11. The method of Claim 10, wherein the moving flows in said two 2D frames are calculated from a tangent direction at each pixel in said two 2D frames.
  12. The method of Claim 10, wherein the second motion vector is included as a candidate in a Merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list for encoding or decoding of the current block.
  13. The method of Claim 10, wherein the target projection corresponds to Equirectangular Projection (ERP), Cubemap Projection (CMP), Adjusted Cubemap Projection (ACP), Equal-Area Projection (EAP), Octahedron Projection (OHP), Icosahedron Projection (ISP), Segmented Sphere Projection (SSP), Rotated Sphere Projection (RSP), or Cylindrical Projection (CLP).
  14. A method of processing 360-degree virtual reality images, the method comprising:
    receiving input data for a current block in a 2D (two-dimensional) frame, wherein the 2D frame is projected from a 3D (three-dimensional) sphere according to a target projection;
    determining a first motion vector associated with a neighbouring block in the 2D frame, wherein the first motion vector points from a first start location in the neighbouring block to a first end location in the 2D frame;
    scaling the first motion vector to generate a second motion vector; and
    encoding or decoding the current block in the 2D frame using the second motion vector.
  15. The method of Claim 14, wherein said scaling the first motion vector to generate the second motion  vector comprises projecting the first start location, the first end location and a second start location in the 2D frame onto the 3D sphere according to the target projection, wherein the second start location is at a corresponding location in the current block corresponding to the first start location in the neighbouring block.
  16. The method of Claim 15, wherein said scaling the first motion vector to generate the second motion vector further comprises scaling a longitude component of the first motion vector to generate a scaled longitude component of the first motion vector; scaling a latitude component of the first motion vector to generate a scaled latitude component of the first motion vector; and determining a second end location corresponding to the second start location based on the scaled longitude component of the first motion vector and the scaled latitude component of the first motion vector.
  17. The method of Claim 16, wherein said scaling the first motion vector to generate the second motion vector further comprises mapping the second end location on the 3D sphere back to the 2D frame according to an inverse target projection; and determining the second motion vector in the 2D frame based on the second start location and the second end location in the 2D frame.
  18. The method of Claim 14, wherein said scaling the first motion vector to generate the second motion vector comprises applying a first combined function to generate an x-component of the second motion vector and applying a second combined function to generate a y-component of the second motion vector; wherein the first combined function and the second combined function are dependent on the first start location, a second start location in the current block associated with corresponding first start location, the first motion vector and a target projection; and wherein the first combined function and the second combined function combine the target projection for projecting first data in the 2D frame into second data in the 3D sphere, scaling a selected motion vector in the 3D sphere into a scaled motion vector in the 3D sphere, and inverse target projection for projecting the scaled motion vector into the 2D frame.
  19. The method of Claim 18, wherein the target projection corresponds to Equirectangular Projection (ERP) and the first combined function corresponds to mv x2 = mv x1 × cos (φ1) / cos (φ2) , wherein φ1 corresponds to a first latitude associated with the first starting location and φ2 corresponds to a second latitude associated with the second starting location, and the second combined function corresponds to an identity function.
  20. The method of Claim 14, wherein the second motion vector is included as a candidate in a Merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list for encoding or decoding of the current block.
  21. The method of Claim 14, wherein the target projection corresponds to Equirectangular Projection (ERP), Cubemap Projection (CMP), Adjusted Cubemap Projection (ACP), Equal-Area Projection (EAP), Octahedron Projection (OHP), Icosahedron Projection (ISP), Segmented Sphere Projection (SSP), Rotated Sphere Projection (RSP), or Cylindrical Projection (CLP).
PCT/CN2018/092143 2017-06-23 2018-06-21 Method and apparatus of motion vector derivations in immersive video coding WO2018233662A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880001715.5A CN109429561B (en) 2017-06-23 2018-06-21 Method and device for processing 360-degree virtual reality image
TW107121493A TWI686079B (en) 2017-06-23 2018-06-22 Method and apparatus of processing 360-degree virtual reality images

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201762523883P 2017-06-23 2017-06-23
US201762523885P 2017-06-23 2017-06-23
US62/523,883 2017-06-23
US62/523,885 2017-06-23

Publications (1)

Publication Number Publication Date
WO2018233662A1 true WO2018233662A1 (en) 2018-12-27

Family

ID=64735503

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/CN2018/092143 WO2018233662A1 (en) 2017-06-23 2018-06-21 Method and apparatus of motion vector derivations in immersive video coding
PCT/CN2018/092142 WO2018233661A1 (en) 2017-06-23 2018-06-21 Method and apparatus of inter prediction for immersive video coding

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/092142 WO2018233661A1 (en) 2017-06-23 2018-06-21 Method and apparatus of inter prediction for immersive video coding

Country Status (3)

Country Link
CN (2) CN109429561B (en)
TW (2) TWI686079B (en)
WO (2) WO2018233662A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114145023A (en) * 2019-04-26 2022-03-04 腾讯美国有限责任公司 Method and apparatus for motion compensation for 360video coding

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110248212B (en) * 2019-05-27 2020-06-02 上海交通大学 Multi-user 360-degree video stream server-side code rate self-adaptive transmission method and system
US11095912B2 (en) 2019-10-28 2021-08-17 Mediatek Inc. Video decoding method for decoding part of bitstream to generate projection-based frame with constrained guard band size, constrained projection face size, and/or constrained picture size
US11263722B2 (en) * 2020-06-10 2022-03-01 Mediatek Inc. Video processing method for remapping sample locations in projection-based frame with hemisphere cubemap projection layout to locations on sphere and associated video processing apparatus
CN115423812B (en) * 2022-11-05 2023-04-18 松立控股集团股份有限公司 Panoramic monitoring planarization display method
CN116540872A (en) * 2023-04-28 2023-08-04 中广电广播电影电视设计研究院有限公司 VR data processing method, device, equipment, medium and product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103039075A (en) * 2010-05-21 2013-04-10 Jvc建伍株式会社 Image encoding apparatus, image encoding method, image encoding program, image decoding apparatus, image decoding method and image decoding program
CN104063843A (en) * 2014-06-18 2014-09-24 长春理工大学 Method for generating integrated three-dimensional imaging element images on basis of central projection
WO2017027884A1 (en) * 2015-08-13 2017-02-16 Legend3D, Inc. System and method for removing camera rotation from a panoramic video

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102333221B (en) * 2011-10-21 2013-09-04 北京大学 Panoramic background prediction video coding and decoding method
KR102432085B1 (en) * 2015-09-23 2022-08-11 노키아 테크놀로지스 오와이 A method, an apparatus and a computer program product for coding a 360-degree panoramic video

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103039075A (en) * 2010-05-21 2013-04-10 Jvc建伍株式会社 Image encoding apparatus, image encoding method, image encoding program, image decoding apparatus, image decoding method and image decoding program
CN104063843A (en) * 2014-06-18 2014-09-24 长春理工大学 Method for generating integrated three-dimensional imaging element images on basis of central projection
WO2017027884A1 (en) * 2015-08-13 2017-02-16 Legend3D, Inc. System and method for removing camera rotation from a panoramic video

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JILL BOYCE ET AL.: "Spherical rotation orientation SEI for HEVC and AVC coding of 360 video", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG 16 WP3, 20 January 2017 (2017-01-20), Geneva, pages 1 - 2, XP030118131 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114145023A (en) * 2019-04-26 2022-03-04 腾讯美国有限责任公司 Method and apparatus for motion compensation for 360video coding
EP3939309A4 (en) * 2019-04-26 2022-05-11 Tencent America Llc Method and apparatus for motion compensation for 360 video coding

Also Published As

Publication number Publication date
CN109429561B (en) 2022-01-21
TW201911867A (en) 2019-03-16
CN109691104B (en) 2021-02-23
TWI686079B (en) 2020-02-21
CN109691104A (en) 2019-04-26
WO2018233661A1 (en) 2018-12-27
TW201911861A (en) 2019-03-16
TWI690193B (en) 2020-04-01
CN109429561A (en) 2019-03-05

Similar Documents

Publication Publication Date Title
WO2018233662A1 (en) Method and apparatus of motion vector derivations in immersive video coding
US10600233B2 (en) Parameterizing 3D scenes for volumetric viewing
US10264282B2 (en) Method and apparatus of inter coding for VR video using virtual reference frames
CN109804633B (en) Method and apparatus for omni-directional video encoding and decoding using adaptive intra-prediction
WO2017125030A1 (en) Apparatus of inter prediction for spherical images and cubic images
US10212411B2 (en) Methods of depth based block partitioning
EP3610647B1 (en) Apparatuses and methods for encoding and decoding a panoramic video signal
US20170118475A1 (en) Method and Apparatus of Video Compression for Non-stitched Panoramic Contents
WO2018196682A1 (en) Method and apparatus for mapping virtual-reality image to a segmented sphere projection format
CN108377377A (en) The spherical surface either Video coding of cube image sequence or coding/decoding method and device
US9736498B2 (en) Method and apparatus of disparity vector derivation and inter-view motion vector prediction for 3D video coding
CN108886598A (en) The compression method and device of panoramic stereoscopic video system
WO2017220012A1 (en) Method and apparatus of face independent coding structure for vr video
TWI702835B (en) Method and apparatus of motion vector derivation for vr360 video coding
US20180338160A1 (en) Method and Apparatus for Reduction of Artifacts in Coded Virtual-Reality Images
WO2019037656A1 (en) Method and apparatus of signalling syntax for immersive video coding
CN109961395B (en) Method, device and system for generating and displaying depth image and readable medium
KR101946715B1 (en) Adaptive search ragne determination method for motion estimation of 360 degree video
Cheng et al. Texture plus depth video coding using camera global motion information
KR20170114160A (en) Decoding method for video data including stitching information and encoding method for video data including stitching information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18820869

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18820869

Country of ref document: EP

Kind code of ref document: A1