WO2018074813A1

WO2018074813A1 - Device and method for encoding or decoding image

Info

Publication number: WO2018074813A1
Application number: PCT/KR2017/011457
Authority: WO
Inventors: 임정연; 이선영; 손세훈; 신재섭; 김형덕; 이경택
Original assignee: 에스케이텔레콤 주식회사
Priority date: 2016-10-17
Filing date: 2017-10-17
Publication date: 2018-04-26

Abstract

The present invention relates to a method for encoding prediction information of a current block located on a first plane to be encoded when encoding each plane of a two-dimensional image projected from a 360-degree image. The method comprises: a step of generating prediction information candidates by using surrounding blocks of the current block; and a step of encoding a syntax element associated with the prediction information of the current block by using the prediction information candidates, wherein, if a boundary of the current block coincides with a boundary of the first plane, blocks adjacent to the current block on the basis of the 360-degree image, not the two-dimensional image, are set as at least some of the surrounding blocks.

Description

Apparatus and method for image encoding or decoding

The present invention relates to image encoding or decoding for efficiently encoding an image.

Because video data is much larger than audio data or still-image data, storing or transmitting it without compression requires a large amount of hardware resources, including memory. Accordingly, video data is typically compressed by an encoder before it is stored or transmitted, and a decoder receives the compressed video data, decompresses it, and plays it back. Such video compression techniques include H.264/AVC and High Efficiency Video Coding (HEVC), which was established in early 2013 and improved coding efficiency by about 40% over H.264/AVC.

However, the size, resolution, and frame rate of an image are gradually increasing, and accordingly, the amount of data to be encoded is also increasing. Therefore, a compression technique with better coding efficiency than a conventional compression technique is required.

In addition to the existing 2D natural video generated by the camera, the demand for video content such as games or 360-degree video (hereinafter referred to as "360 video") is increasing. Since such games or 360 images include features different from existing 2D natural images, there is a limit to compression using existing compression techniques based on 2D images.

A 360 video is captured in multiple directions with multiple cameras. To compress and transmit video of multiple scenes, the images output from the cameras are stitched into a single 2D image, and the stitched image is compressed and transmitted to a decoding apparatus. The decoding apparatus decodes the compressed image and then maps it into 3D for playback.

An example of a projection format for a 360 image is the equirectangular projection shown in FIG. 1. FIG. 1(a) shows a spherical 360 image mapped in 3D, and FIG. 1(b) shows the spherical 360 image projected into the equirectangular format.

The equirectangular projection has the disadvantage that the pixels in the upper and lower portions are stretched and therefore severely distorted, and the stretched portions increase both the amount of data and the encoding throughput when the image is compressed. Accordingly, there is a need for an image compression technology capable of efficiently encoding 360 images.

The present invention provides an image encoding or decoding technique for efficiently encoding an image or a 360 image having a high resolution or frame rate.

An aspect of the present invention provides a method of encoding prediction information of a current block located on a first surface to be encoded when encoding each surface of a 2D image projected from a 360 image, the method comprising: generating prediction information candidates by using neighboring blocks of the current block; and encoding a syntax element for the prediction information of the current block by using the prediction information candidates, wherein, when a boundary of the current block coincides with a boundary of the first surface, a block adjacent to the current block on the basis of the 360 image is set as at least some of the neighboring blocks.

Another aspect of the present invention provides a method of decoding prediction information of a current block located on a first surface to be decoded in a 360 image encoded as a 2D image, the method comprising: decoding a syntax element for the prediction information of the current block from a bitstream; generating prediction information candidates by using neighboring blocks of the current block; and reconstructing the prediction information of the current block by using the prediction information candidates and the decoded syntax element, wherein, when a boundary of the current block coincides with a boundary of the first surface, a block adjacent to the current block on the basis of the 360 image is set as at least some of the neighboring blocks.

Another aspect of the present invention provides an apparatus for decoding prediction information of a current block located on a first surface to be decoded in a 360 image encoded as a 2D image, the apparatus comprising: a decoder for decoding a syntax element for the prediction information of the current block from a bitstream; a prediction information candidate generator for generating prediction information candidates by using neighboring blocks of the current block; and a prediction information determiner for reconstructing the prediction information of the current block by using the prediction information candidates and the decoded syntax element, wherein, when a boundary of the current block coincides with a boundary of the first surface, the prediction information candidate generator sets a block adjacent to the current block on the basis of the 360 image as at least some of the neighboring blocks.

FIG. 1 is an exemplary diagram of the equirectangular projection format of a 360 image;

FIG. 2 is a block diagram of an image encoding apparatus according to an embodiment of the present invention;

FIG. 3 is an exemplary diagram of block partitioning using a QuadTree plus BinaryTree (QTBT) structure;

FIG. 4 is an exemplary diagram of a plurality of intra prediction modes;

FIG. 5 is an exemplary diagram of neighboring blocks of a current block;

FIG. 6 is an exemplary diagram of various projection formats of 360 images;

FIG. 7 is an exemplary diagram of a layout of a cube projection format;

FIG. 8 is an exemplary diagram for explaining a rearrangement of a layout in a cube projection format;

FIG. 9 is a block diagram of an apparatus for generating a syntax element for prediction information of a current block in a 360 image according to an embodiment of the present invention;

FIG. 10 is an exemplary diagram for describing a method of determining a neighboring block of a current block in a cube format to which a compact layout is applied;

FIG. 11 is a diagram illustrating a detailed configuration of the intra predictor of FIG. 2 when the apparatus of FIG. 9 is applied to intra prediction;

FIG. 12 is an exemplary diagram for explaining a method of setting reference pixels for intra prediction in a cube format;

FIG. 13 is an exemplary diagram for explaining a method of setting reference pixels for intra prediction in various projection formats;

FIG. 14 is a diagram illustrating a detailed configuration of the inter predictor of FIG. 2 when the apparatus of FIG. 9 is applied to inter prediction;

FIG. 15 is a block diagram illustrating an image decoding apparatus according to an embodiment of the present invention;

FIG. 16 is a block diagram of an apparatus for decoding prediction information of a current block in a 360 image, according to an embodiment of the present invention;

FIG. 17 is a diagram illustrating a detailed configuration of the intra predictor of FIG. 15 when the apparatus of FIG. 16 is applied to intra prediction;

FIG. 18 is a diagram illustrating a detailed configuration of the inter predictor of FIG. 15 when the apparatus of FIG. 16 is applied to inter prediction.

Hereinafter, some embodiments of the present invention are described in detail with reference to the exemplary drawings. In assigning reference numerals to the components of each drawing, the same components are given the same reference numerals wherever possible, even when they appear in different drawings. In addition, detailed descriptions of related well-known configurations or functions are omitted where they could obscure the gist of the present invention.

FIG. 2 is a block diagram of an image encoding apparatus according to an embodiment of the present invention.

The image encoding apparatus includes a block divider 210, a predictor 220, a subtractor 230, a transformer 240, a quantizer 245, an encoder 250, an inverse quantizer 260, an inverse transformer 265, an adder 270, a filter unit 280, and a memory 290. Each component of the image encoding apparatus may be implemented as a hardware chip, or may be implemented in software such that a microprocessor executes the software function corresponding to each component.

The block divider 210 divides each picture constituting the image into a plurality of coding tree units (CTUs) and then recursively divides each CTU using a tree structure. A leaf node of the tree structure becomes a coding unit (CU), which is the basic unit of coding. The tree structure may be a QuadTree (QT), in which a parent node is split into four child nodes, or a QuadTree plus BinaryTree (QTBT) structure, which combines the QT structure with a BinaryTree (BT) structure in which a parent node is split into two child nodes.

In a QTBT (QuadTree plus BinaryTree) structure, the CTU is first divided into a QT structure. The leaf nodes of the QT may then be further partitioned by BT. The partition information generated by the block divider 210 by dividing the CTU by the QTBT structure is encoded by the encoder 250 and transmitted to the decoding apparatus.

In QT, a first flag (QT split flag, QT_split_flag) indicating whether the block of the corresponding node is split is encoded. If the first flag is 1, the block of the node is divided into four blocks of the same size. If the first flag is 0, the node is no longer divided by QT.

In BT, a second flag (BT split flag, BT_split_flag) indicating whether the block of the corresponding node is split is encoded. In BT, there may be a plurality of partition types. For example, there may be two types: one that splits the block of a node horizontally into two blocks of the same size, and one that splits it vertically. Alternatively, there may be an additional type in which the block of the corresponding node is divided into two blocks of asymmetric shape. The asymmetric shape may include dividing the block of a node into two rectangular blocks with a size ratio of 1:3, or dividing the block of the node in a diagonal direction. When BT has a plurality of partition types in this manner and a second flag indicating that a block is split is encoded, partition type information indicating the partition type of the corresponding block is additionally encoded.

FIG. 3 is an exemplary diagram of block division using a QTBT structure. FIG. 3(a) shows an example in which a block is divided by the QTBT structure, and FIG. 3(b) shows the same division as a tree. In FIG. 3, solid lines indicate division by the QT structure, and dotted lines indicate division by the BT structure. In FIG. 3(b), a layer label without parentheses indicates a QT layer, and a layer label in parentheses indicates a BT layer. In the BT structure represented by dotted lines, the numbers represent partition type information.

In FIG. 3, the CTU, which is the highest layer of QT, is divided into four nodes of layer 1. Accordingly, the block dividing unit 210 generates a QT splitting flag (QT_split_flag = 1) indicating that the CTU is divided. The block corresponding to the first node of layer 1 is no longer divided by QT. Accordingly, the block divider 210 generates QT_split_flag = 0.

Thereafter, the block corresponding to the first node of layer 1 of the QT proceeds to BT. In the present embodiment, BT is assumed to have two partition types, which split the block of a node into two blocks of the same size either horizontally or vertically. The first node of layer 1 of the QT becomes the root node of the BT, (layer 0). Since the block corresponding to the root node of the BT is further divided into blocks of (layer 1), the block divider 210 generates BT_split_flag = 1, indicating that the block is split by BT. Subsequently, partition type information indicating whether the corresponding block is divided horizontally or vertically is generated. In FIG. 3, the block corresponding to the root node of the BT is divided vertically, so 1, indicating vertical division, is generated as the partition type information. Since the first of the (layer 1) blocks divided from the root node is split again and its partition type is vertical, BT_split_flag = 1 and partition type information 1 are generated. On the other hand, since the second (layer 1) block divided from the root node of the BT is not split further, BT_split_flag = 0 is generated.
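The flag generation walked through above can be sketched in Python as a depth-first traversal of a partition tree. This is only an illustrative sketch: the `Node` class, the `emit_flags` function, the depth-first emission order, and the value 0 for horizontal division are assumptions (the text only states that 1 indicates vertical division), and the size limits that allow some flags to be omitted are ignored here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    # None for a leaf, otherwise "QT", "BT_HOR" or "BT_VER" (hypothetical labels).
    split: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def emit_flags(node: Node, in_bt: bool = False) -> List[Tuple[str, int]]:
    """Return the (syntax element, value) pairs for a node, depth first."""
    out: List[Tuple[str, int]] = []
    if not in_bt:
        # QT stage: signal whether the node is split into four blocks.
        out.append(("QT_split_flag", 1 if node.split == "QT" else 0))
        if node.split == "QT":
            for child in node.children:
                out += emit_flags(child, in_bt=False)
            return out
    # BT stage: a QT leaf becomes the root node of a BT.
    is_bt = node.split in ("BT_HOR", "BT_VER")
    out.append(("BT_split_flag", 1 if is_bt else 0))
    if is_bt:
        # 1 = vertical as in the text; 0 = horizontal is an assumption.
        out.append(("BT_split_type", 1 if node.split == "BT_VER" else 0))
        for child in node.children:
            out += emit_flags(child, in_bt=True)
    return out


# A QT leaf that is split vertically by BT; its first child is split
# vertically again into two leaves (assumed), its second child is a leaf.
example = Node("BT_VER", [Node("BT_VER", [Node(), Node()]), Node()])
flags = emit_flags(example)
```

For this hypothetical node, the sequence starts with QT_split_flag = 0, BT_split_flag = 1 and partition type 1, matching the order in which the flags are described above.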

Meanwhile, in order to efficiently signal the information about block division by the QTBT structure to the decoding apparatus, the following information may be additionally encoded. This information is encoded as header information of the image, for example, in a sequence parameter set (SPS) or a picture parameter set (PPS).

CTU size: the top layer of QTBT, that is, the block size of the root node

MinQTSize: the minimum block size of leaf nodes allowed in QT

MaxBTSize: the maximum block size of the root node allowed by BT

MaxBTDepth: the maximum depth allowed by BT

MinBTSize: the minimum block size of leaf nodes allowed in BT

In QT, a block having the same size as MinQTSize is not split further, so the split information (first flag) about QT for that block is not encoded. In addition, a block in QT that is larger than MaxBTSize has no BT, so the split information (second flag, partition type information) about BT for that block is not encoded. Likewise, when the depth of a BT node reaches MaxBTDepth, the block of that node is not split further, and the split information (second flag, partition type information) about BT for that node is not encoded. A block in BT having the same size as MinBTSize is also not split further, and the corresponding split information (second flag, partition type information) about BT is not encoded. By defining, at a high level such as the sequence parameter set (SPS) or the picture parameter set (PPS), the maximum or minimum block size that a root or leaf node of the QT and BT can have, the amount of bits needed to encode the information indicating whether and how the CTU is divided can be reduced.
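The conditions under which the split flags are omitted can be sketched as two small predicates. This is a minimal sketch under stated assumptions: the `QtbtParams` class, the function names, and the default parameter values are illustrative and not taken from the text, and for non-square BT blocks the comparison uses the longer side, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtbtParams:
    # Illustrative values only; the actual values are signalled in the SPS or PPS.
    ctu_size: int = 128
    min_qt_size: int = 16
    max_bt_size: int = 64
    max_bt_depth: int = 3
    min_bt_size: int = 4


def qt_flag_is_coded(block_size: int, p: QtbtParams) -> bool:
    """QT_split_flag is coded only while the block is larger than MinQTSize."""
    return block_size > p.min_qt_size


def bt_flag_is_coded(width: int, height: int, bt_depth: int, p: QtbtParams) -> bool:
    """BT_split_flag (and the partition type) is omitted when no BT split is possible."""
    longer_side = max(width, height)
    if longer_side > p.max_bt_size:   # block too large to be a BT node
        return False
    if bt_depth >= p.max_bt_depth:    # maximum BT depth reached
        return False
    if longer_side <= p.min_bt_size:  # block already at the minimum BT size
        return False
    return True


params = QtbtParams()
```

With these example values, a 128x128 CTU needs a QT_split_flag but no BT_split_flag, because it exceeds MaxBTSize.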

Meanwhile, the luma component and the chroma component of the CTU may be divided using the same QTBT structure. However, the present invention is not limited thereto, and the luma component and the chroma component may each be divided using a separate QTBT structure. For example, in the case of an I (Intra) slice, the luma component and the chroma component may be divided using different QTBT structures.

Hereinafter, a block corresponding to a CU to be encoded or decoded is called a 'current block'.

The prediction unit 220 generates a prediction block by predicting the current block. The predictor 220 includes an intra predictor 222 and an inter predictor 224.

The intra predictor 222 predicts pixels in the current block by using pixels (reference pixels) positioned around the current block in the current picture including the current block. There are a plurality of intra prediction modes according to the prediction direction, and the peripheral pixels to be used and the equations are defined differently according to each prediction mode.

FIG. 4 illustrates an example of a plurality of intra prediction modes.

As shown in FIG. 4, the plurality of intra prediction modes may include two non-directional modes (planar mode and DC mode) and 65 directional modes.

The intra predictor 222 selects one intra prediction mode from among the plurality of intra prediction modes, and predicts the current block by using a neighboring pixel (reference pixel) and an operation formula determined according to the selected intra prediction mode. Information about the selected intra prediction mode is encoded by the encoder 250 and transmitted to the decoding apparatus.

Meanwhile, in order to efficiently encode the intra prediction mode information indicating which of the plurality of intra prediction modes is used as the intra prediction mode of the current block, the intra predictor 222 determines, among the plurality of intra prediction modes, some modes that are most likely to be the intra prediction mode of the current block as most probable modes (MPMs). Then, mode information indicating whether the intra prediction mode of the current block is selected from the MPMs is generated and transmitted to the encoder 250. When the intra prediction mode of the current block is selected from the MPMs, first intra identification information indicating which of the MPMs is selected as the intra prediction mode of the current block is transmitted to the encoder. On the other hand, when the intra prediction mode of the current block is not selected from the MPMs, second intra identification information indicating which of the modes other than the MPMs is selected as the intra prediction mode of the current block is transmitted to the encoder.

Hereinafter, a method of constructing the MPM list will be described. In the present specification, the configuration of the MPM list with six MPMs is described as an example, but the present invention is not limited thereto, and the number of MPMs included in the MPM list may be selected within a range of 3 to 10.

First, MPM candidates are constructed using the intra prediction modes of neighboring blocks of the current block. The neighboring blocks may include, for example, all or some of the left block L, the upper block A, the lower left block BL, the upper right block AR, and the upper left block AL of the current block. Here, the left block L means the block including the pixel located one pixel to the left of the leftmost bottom pixel in the current block, and the upper block A means the block including the pixel located one pixel above the rightmost top pixel in the current block. The lower left block BL means the block including the pixel reached by moving one pixel downward and then one pixel to the left from the leftmost bottom pixel in the current block; the upper right block AR means the block including the pixel reached by moving one pixel upward and then one pixel to the right from the rightmost top pixel in the current block; and the upper left block AL means the block including the pixel reached by moving one pixel upward and then one pixel to the left from the leftmost top pixel in the current block.
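The neighbor definitions can be expressed as pixel coordinates. The following sketch assumes a coordinate system with x increasing to the right and y increasing downward, with (x, y) the top-left pixel of the current block; the function name is hypothetical, and BL is taken to be the pixel diagonally below and to the left of the leftmost bottom pixel, in line with its name.

```python
from typing import Dict, Tuple


def neighbor_pixel_positions(x: int, y: int, w: int, h: int) -> Dict[str, Tuple[int, int]]:
    """Return the pixel that identifies each neighboring block of a w x h block at (x, y)."""
    return {
        "L": (x - 1, y + h - 1),   # left of the leftmost bottom pixel
        "A": (x + w - 1, y - 1),   # above the rightmost top pixel
        "BL": (x - 1, y + h),      # below and left of the leftmost bottom pixel
        "AR": (x + w, y - 1),      # above and right of the rightmost top pixel
        "AL": (x - 1, y - 1),      # above and left of the leftmost top pixel
    }


positions = neighbor_pixel_positions(16, 8, 8, 4)
```

The block containing each returned pixel is the corresponding neighbor; a position outside the picture or in a not-yet-coded area would mark that neighbor as unavailable.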

The intra prediction modes of these neighboring blocks are included in the MPM list. Here, the intra prediction modes of the valid blocks are added to the MPM list in the order of the left block L, the upper block A, the lower left block BL, the upper right block AR, and the upper left block AL. Alternatively, the planar mode and the DC mode may be added to the intra prediction modes of the neighboring blocks to form the candidates, and the valid modes may then be added to the MPM list in the order of the left block L, the upper block A, the planar mode, the DC mode, the lower left block BL, the upper right block AR, and the upper left block AL.

The MPM list includes only different intra prediction modes. That is, when a duplicated mode is present, only one of them is included in the MPM list.

If the number of MPMs in the list is smaller than the predetermined number (e.g., 6), additional MPMs may be derived by adding -1 or +1 to the directional modes in the list. In addition, if the number of MPMs in the list is still smaller than the predetermined number, the missing modes may be added to the MPM list in the order of the vertical mode, the horizontal mode, the diagonal mode, and so on.
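The list construction described above can be sketched as follows, using the variant that inserts the planar and DC modes after L and A. The mode numbering (0 = planar, 1 = DC, 2 to 66 directional, with 18 horizontal, 34 diagonal and 50 vertical), the wrap-around of the -1/+1 derivation within the directional range, and the extra default modes 2 and 66 standing in for "and so on" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

PLANAR, DC = 0, 1
HOR, DIA, VER = 18, 34, 50   # assumed numbering of the 65 directional modes (2..66)
NUM_DIRECTIONAL = 65


def build_mpm_list(neighbor_modes: Dict[str, Optional[int]], num_mpm: int = 6) -> List[int]:
    """neighbor_modes maps 'L', 'A', 'BL', 'AR', 'AL' to a mode, or None if not valid."""
    mpm: List[int] = []

    def push(mode: Optional[int]) -> None:
        # Only valid, non-duplicate modes are added, up to num_mpm entries.
        if mode is not None and mode not in mpm and len(mpm) < num_mpm:
            mpm.append(mode)

    # Step 1: neighbors and non-directional modes in the stated order.
    for mode in (neighbor_modes.get("L"), neighbor_modes.get("A"), PLANAR, DC,
                 neighbor_modes.get("BL"), neighbor_modes.get("AR"),
                 neighbor_modes.get("AL")):
        push(mode)

    # Step 2: derive -1 / +1 neighbors of the directional modes already listed.
    for mode in list(mpm):
        if mode > DC:
            push(2 + (mode - 2 - 1) % NUM_DIRECTIONAL)
            push(2 + (mode - 2 + 1) % NUM_DIRECTIONAL)

    # Step 3: default modes; 2 and 66 are assumed extra defaults.
    for mode in (VER, HOR, DIA, 2, 66):
        push(mode)
    return mpm


mpm_a = build_mpm_list({"L": 50, "A": 50, "BL": None, "AR": None, "AL": None})
mpm_b = build_mpm_list({})
```

In the first call both valid neighbors use the same mode, so the duplicate is dropped and the list is completed by the derived and default modes.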

The inter predictor 224 searches a reference picture, which was encoded and decoded before the current picture, for the block most similar to the current block, and generates a prediction block for the current block using the block found. It then generates a motion vector corresponding to the displacement between the current block in the current picture and the prediction block in the reference picture. The motion information, which includes the information about the reference picture and the motion vector used to predict the current block, is encoded by the encoder 250 and transmitted to the decoding apparatus.

Various methods may be used to minimize the amount of bits required to encode motion information.

For example, when the reference picture and the motion vector of the current block are the same as the reference picture and the motion vector of the neighboring block, the motion information of the current block can be transmitted to the decoding apparatus by encoding information for identifying the neighboring block. This method is called 'merge mode'.

In the merge mode, the inter prediction unit 224 selects a predetermined number of merge candidate blocks (hereinafter, referred to as 'merge candidates') from neighboring blocks of the current block.

As neighboring blocks for deriving merge candidates, all or some of the left block L, the upper block A, the upper right block AR, the lower left block BL, and the upper left block AL that are adjacent to the current block in the current picture may be used, as shown in FIG. 5. In addition, a block located in a reference picture other than the current picture in which the current block is located (which may be the same as or different from the reference picture used to predict the current block) may be used as a merge candidate. For example, the co-located block, which is at the same position as the current block in the reference picture, or a block adjacent to the co-located block may additionally be used as a merge candidate.

The inter prediction unit 224 constructs a merge list including a predetermined number of merge candidates using these neighboring blocks. The merge candidate to be used as the motion information of the current block is selected from the merge candidates included in the merge list, and merge index information for identifying the selected candidate is generated. The generated merge index information is encoded by the encoder 250 and transmitted to the decoding apparatus.
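A minimal sketch of merge list construction and merge index selection is shown below. The representation of motion information as a tuple, the maximum of five candidates, and the removal of duplicate candidates are assumptions for illustration; the text only states that a predetermined number of candidates is taken from the neighboring blocks.

```python
from typing import List, Optional, Tuple

# Motion information as (reference picture index, mv_x, mv_y); a hypothetical layout.
Motion = Tuple[int, int, int]

MAX_MERGE_CANDIDATES = 5  # assumed value for the "predetermined number"


def build_merge_list(neighbors: List[Optional[Motion]]) -> List[Motion]:
    """neighbors holds the motion of L, A, AR, BL, AL (None if not available)."""
    merge_list: List[Motion] = []
    for motion in neighbors:
        if motion is None or motion in merge_list:  # skip invalid or duplicate entries
            continue
        merge_list.append(motion)
        if len(merge_list) == MAX_MERGE_CANDIDATES:
            break
    return merge_list


def merge_index(current: Motion, merge_list: List[Motion]) -> Optional[int]:
    """Return the index to signal, or None if merge mode cannot represent the motion."""
    return merge_list.index(current) if current in merge_list else None


candidates = build_merge_list([(0, 4, -2), None, (0, 4, -2), (1, 0, 0), None])
```

When `merge_index` returns None, the encoder in this sketch would fall back to another way of encoding the motion information.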

Another way to encode motion information is to encode differential motion vectors.

In this method, the inter predictor 224 derives predictive motion vector candidates for the motion vector of the current block by using the neighboring blocks of the current block. As neighboring blocks used to derive the predictive motion vector candidates, all or some of the left block L, the upper block A, the upper right block AR, the lower left block BL, and the upper left block AL adjacent to the current block in the current picture, shown in FIG. 5, may be used. In addition, a block located in a reference picture other than the current picture in which the current block is located (which may be the same as or different from the reference picture used to predict the current block) may be used as a neighboring block for deriving the predictive motion vector candidates. For example, the co-located block, which is at the same position as the current block in the reference picture, or a block adjacent to the co-located block may be used.

The inter prediction unit 224 derives the predictive motion vector candidates using the motion vectors of the neighboring blocks, and determines the predictive motion vector for the motion vector of the current block using the predictive motion vector candidates. The difference motion vector is calculated by subtracting the predicted motion vector from the motion vector of the current block.

The predicted motion vector may be obtained by applying a predefined function (e.g., a median or average calculation) to the predictive motion vector candidates. In this case, the image decoding apparatus also knows the predefined function. In addition, since the neighboring blocks used to derive the predictive motion vector candidates have already been encoded and decoded, the image decoding apparatus also already knows the motion vectors of those neighboring blocks. Therefore, the image encoding apparatus does not need to encode information for identifying a predictive motion vector candidate, and only the information about the differential motion vector and the information about the reference picture used to predict the current block are encoded.

Meanwhile, the predicted motion vector may be determined by selecting any one of the predicted motion vector candidates. In this case, the information for identifying the selected predicted motion vector candidate is further encoded along with the information about the differential motion vector and the reference picture used for predicting the current block.
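Both ways of determining the predicted motion vector can be sketched as follows. The function names are hypothetical, the median is taken per component (for an even number of candidates the upper middle value is used), and choosing the candidate with the smallest differential motion vector is an assumed selection criterion, not one stated in the text.

```python
from typing import List, Tuple

MV = Tuple[int, int]


def median_mvp(candidates: List[MV]) -> MV:
    """Predefined function known to both sides: component-wise median."""
    xs = sorted(mv[0] for mv in candidates)
    ys = sorted(mv[1] for mv in candidates)
    mid = len(candidates) // 2
    return (xs[mid], ys[mid])


def select_mvp_index(mv: MV, candidates: List[MV]) -> int:
    """Alternative: pick one candidate and signal its index (assumed criterion)."""
    costs = [abs(mv[0] - c[0]) + abs(mv[1] - c[1]) for c in candidates]
    return costs.index(min(costs))


def encode_mvd(mv: MV, mvp: MV) -> MV:
    """Differential motion vector = motion vector - predicted motion vector."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(mvd: MV, mvp: MV) -> MV:
    """The decoder adds the differential motion vector back to the prediction."""
    return (mvd[0] + mvp[0], mvd[1] + mvp[1])


cands = [(4, -2), (6, 0), (5, 3)]
mv = (7, -1)
mvp = median_mvp(cands)
mvd = encode_mvd(mv, mvp)
```

With the median variant only `mvd` and the reference picture information are transmitted; with the selection variant the index returned by `select_mvp_index` is transmitted as well.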

The subtractor 230 subtracts the prediction block generated by the intra predictor 222 or the inter predictor 224 from the current block to generate a residual block.

The transformer 240 transforms the residual signals in the residual block, which has pixel values in the spatial domain, into transform coefficients in the frequency domain. The transformer 240 may transform the residual signals in the residual block using the size of the current block as the transform unit, or may divide the residual block into a plurality of smaller subblocks and transform the residual signals in transform units of the subblock size. There may be various ways of dividing the residual block into smaller subblocks. For example, the residual block may be divided into subblocks of the same predetermined size, or a quadtree (QT) scheme with the residual block as the root node may be used.

The quantization unit 245 quantizes the transform coefficients output from the transform unit 240, and outputs the quantized transform coefficients to the encoder 250.

The encoder 250 generates a bitstream by encoding the quantized transform coefficients using an encoding method such as CABAC. In addition, the encoder 250 encodes information related to block division, such as the CTU size, MinQTSize, MaxBTSize, MaxBTDepth, MinBTSize, the QT split flag, the BT split flag, and the partition type, so that the decoding apparatus can divide blocks in the same way as the encoding apparatus.

The encoder 250 encodes information about a prediction type indicating whether the current block is encoded by intra prediction or inter prediction, and encodes intra prediction information or inter prediction information according to the prediction type.

When the current block is intra predicted, a syntax element for the intra prediction mode is encoded as intra prediction information. The syntax element for intra prediction mode includes the following.

(1) mode information indicating whether the intra prediction mode of the current block is selected from MPMs,

(2) when the intra prediction mode of the current block is selected from the MPMs, first intra identification information for indicating which mode of the MPM is selected as the intra prediction mode of the current block,

(3) when the intra prediction mode of the current block is not selected from the MPMs, second intra identification information for indicating which mode other than the MPMs is selected as the intra prediction mode of the current block.

Meanwhile, when the current block is inter predicted, the encoder 250 encodes a syntax element for inter prediction information. The syntax element for inter prediction information includes the following.

(1) Mode information indicating whether motion information of the current block is encoded in a merge mode or a mode of encoding a differential motion vector.

(2) syntax element for motion information

When the motion information is encoded in the merge mode, the encoder 250 encodes, as the syntax element for the motion information, merge index information indicating which of the merge candidates is selected as the candidate from which the motion information of the current block is extracted.

On the other hand, when the motion information is encoded in the mode for encoding the differential motion vector, the information about the differential motion vector and the information about the reference picture are encoded as syntax elements for the motion information. If the predicted motion vector is determined by selecting any one of the plurality of predictive motion vector candidates, the syntax elements for the motion information further include predicted motion vector identification information for identifying the selected candidate.

The inverse quantizer 260 inversely quantizes the quantized transform coefficients output from the quantizer 245 to generate transform coefficients. The inverse transformer 265 restores the residual block by converting the transform coefficients output from the inverse quantizer 260 from the frequency domain to the spatial domain.

The adder 270 reconstructs the current block by adding the reconstructed residual block and the prediction block generated by the predictor 220. The pixels in the reconstructed current block are used as reference pixels for intra prediction of the blocks that follow in coding order.

The filter unit 280 applies deblocking filtering to the boundaries between the reconstructed blocks in order to remove the blocking artifacts caused by block-based encoding and decoding. When all the blocks in a picture have been reconstructed, the reconstructed picture is used as a reference picture for inter prediction of blocks in pictures to be encoded later.

The image encoding technique described above is also applied when encoding a 2D image after projecting a 360 image in 2D.

The equirectangular projection, which is a representative projection format used for 360 images, has the disadvantage that the upper and lower pixels are severely distorted when a 360 image is projected into a 2D image, which increases the amount of data and the encoding throughput. Accordingly, the present invention seeks to provide an image encoding technique that supports various projection formats. Also, regions that are not adjacent to each other in the 2D image may be adjacent to each other in the 360 image. For example, the left boundary and the right boundary of the 2D image illustrated in FIG. 1A are adjacent to each other when projected as a 360 image. Accordingly, the present invention provides a method for efficiently encoding an image by reflecting such a feature of a 360 image.
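The adjacency of the left and right picture boundaries in the equirectangular format can be illustrated with a horizontal wrap-around of pixel coordinates. This is only a sketch of the idea: the function name is hypothetical, and it covers only the left/right boundary case of the equirectangular format, not the face-based formats.

```python
def wrap_erp_x(x: int, pic_width: int) -> int:
    """Map a horizontal coordinate outside the 2D picture back into it.

    In a 360 image the column left of x = 0 is the last column of the
    picture, and the column right of the last column is x = 0.
    """
    return x % pic_width


# A block at the left picture boundary (x = 0) in a 3840-pixel-wide picture:
left_neighbor_x = wrap_erp_x(0 - 1, 3840)
# A block ending at the right picture boundary:
right_neighbor_x = wrap_erp_x(3839 + 1, 3840)
```

A neighboring block located this way is adjacent on the basis of the 360 image even though it lies at the opposite edge of the 2D image.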

Metadata for 360 images

Table 1 below shows an example of metadata of 360 images encoded into a bitstream to support various projection formats.

The metadata of the 360 image is encoded in any one of a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), and supplemental enhancement information (SEI).

1-1) projection_format_idx

This syntax element means an index indicating a projection format of a 360 image. The projection format according to this index value may be defined as shown in Table 2.

Index	Projection format	Description
0	ERP	Equirectangular projection
1	CMP	Cube map projection
2	ISP	Icosahedron projection
3	OHP	Octahedron projection
4	EAP	Equal-area projection
5	TSP	Truncated square pyramid projection
6	SSP	Segmented sphere projection

Equirectangular projection is as shown in FIG. 1, and examples of various other projection formats are shown in FIG. 6.

1-2) compact_layout_flag

This syntax element is a flag indicating whether to change the layout of the 2D image projected from the 360 image. If this flag is 0, a non-compact layout, in which the layout is unchanged, is used. If the flag is 1, the faces are rearranged into a rectangular compact layout with no blank areas.

FIG. 7 is a diagram illustrating layouts of the cube format. FIG. 7 (a) shows a non-compact layout in which the layout is unchanged, and FIG. 7 (b) shows a compact layout in which the layout has been changed.

1-3) num_face_rows_minus1 and num_face_columns_minus1

num_face_rows_minus1 represents the number of face rows in the layout minus 1, and num_face_columns_minus1 represents the number of face columns in the layout minus 1. For example, in FIG. 7 (a), num_face_rows_minus1 is 2 and num_face_columns_minus1 is 3, and in FIG. 7 (b), num_face_rows_minus1 is 1 and num_face_columns_minus1 is 2.

1-4) face_width and face_height

These syntax elements indicate the width of a face (the number of pixels in the width direction) and the height of a face (the number of pixels in the height direction). However, since the face resolution given by these syntax elements can be sufficiently inferred from num_face_rows_minus1 and num_face_columns_minus1, these syntax elements may not be encoded.
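As a hedged illustration of how the face resolution might be inferred when face_width and face_height are not encoded, the sketch below divides the picture size by the face grid. The function name `infer_face_size` and the picture dimensions used in the example are assumptions; the description only states that the face resolution can be inferred from num_face_rows_minus1 and num_face_columns_minus1.

```python
def infer_face_size(pic_width, pic_height,
                    num_face_rows_minus1, num_face_columns_minus1):
    """Return (face_width, face_height) for a picture tiled by equal faces."""
    rows = num_face_rows_minus1 + 1
    cols = num_face_columns_minus1 + 1
    if pic_width % cols or pic_height % rows:
        raise ValueError("picture size is not a multiple of the face grid")
    return pic_width // cols, pic_height // rows


# Non-compact cube layout of FIG. 7 (a): 3 face rows x 4 face columns.
print(infer_face_size(2048, 1536, 2, 3))
# Compact cube layout of FIG. 7 (b): 2 face rows x 3 face columns.
print(infer_face_size(1536, 1024, 1, 2))
```

Both calls yield 512 x 512 faces for the hypothetical picture sizes chosen here.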

1-5) face_idx

This syntax element is an index representing the position of each face in the 360 image. This index can be defined as shown in Table 3.

face_idx	Location
0	Top
1	Bottom
2	Front
3	Right
4	Back
5	Left
6	Null

When there is a blank area, as in the non-compact layout of FIG. 7 (a), an index value meaning 'null' (for example, 6) is set for the blank area, and encoding of a face set to null may be omitted. For example, in the non-compact layout of FIG. 7 (a), the index values of the faces in raster scan order are 0 (top), 6 (null), 6 (null), 6 (null), 2 (front), 3 (right), 4 (back), 5 (left), 1 (bottom), 6 (null), 6 (null), and 6 (null).
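The raster-scan example above can be sketched as follows. The helper name `faces_to_encode` and the tuple output format are illustrative assumptions; the index values and the 3 x 4 grid are those given for the non-compact layout of FIG. 7 (a).

```python
NULL_FACE = 6
FACE_NAMES = {0: "top", 1: "bottom", 2: "front",
              3: "right", 4: "back", 5: "left", 6: "null"}

# face_idx values of the non-compact layout, in raster scan order.
face_idx_raster = [0, 6, 6, 6, 2, 3, 4, 5, 1, 6, 6, 6]


def faces_to_encode(face_idx_list, num_face_columns_minus1):
    """List (row, column, name) of every face that is not null."""
    cols = num_face_columns_minus1 + 1
    coded = []
    for pos, idx in enumerate(face_idx_list):
        if idx == NULL_FACE:
            continue  # encoding of a null face is omitted
        coded.append((pos // cols, pos % cols, FACE_NAMES[idx]))
    return coded


for entry in faces_to_encode(face_idx_raster, 3):
    print(entry)
```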

1-6) face_rotation_idx

This syntax element is an index representing rotation information of each face. Rotating a face in the 2D layout can increase the correlation between adjacent faces. For example, in FIG. 8 (a), the upper boundary of the left face and the left boundary of the top face contact each other in the 360 image. Therefore, if the left face is rotated by 270 degrees (-90 degrees) after the layout of FIG. 8 (a) is changed to a compact layout as shown in FIG. 7 (b), continuity between the left face and the top face can be maintained, as shown in (b) of the figure. Thus, face_rotation_idx is defined as a syntax element for the rotation of each face. This index may be defined as shown in Table 4.

Index	Face rotation (counter-clockwise, degrees)
0	0
1	90
2	180
3	270
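A minimal sketch of applying the rotation indicated by face_rotation_idx to the samples of one face is given below. The function names and the nested-list representation of a face are assumptions made for illustration; only the mapping of index values to counter-clockwise angles comes from Table 4.

```python
def rotate_ccw_90(face):
    """Rotate a 2D list of samples by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*face)][::-1]


def rotate_face(face, face_rotation_idx):
    """Rotate a face by face_rotation_idx * 90 degrees counter-clockwise."""
    if face_rotation_idx not in (0, 1, 2, 3):
        raise ValueError("face_rotation_idx must be 0, 1, 2, or 3")
    for _ in range(face_rotation_idx):
        face = rotate_ccw_90(face)
    return face


face = [[1, 2],
        [3, 4]]
print(rotate_face(face, 1))  # 90 degrees counter-clockwise
print(rotate_face(face, 3))  # 270 degrees counter-clockwise (-90 degrees)
```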

Table 1 illustrates that the syntax elements of 1-3) to 1-6) are encoded when the projection format is the cube projection format, but this may be extended to other projection formats. In addition, not all of the syntax elements defined in Table 1 need always be encoded. Depending on how far the metadata of the 360 image is defined, some syntax elements may not be encoded. For example, when neither the compact layout nor face rotation is adopted, syntax elements such as compact_layout_flag and face_rotation_idx may be omitted.

Prediction of 360 images

In a 2D layout of a 360 image, a region consisting of one face, or of several adjacent faces grouped together, is treated as a tile, a slice, or a picture. In image coding, tiles and slices are processed independently because they have little dependency on one another. When a block included in a tile or slice is predicted, information of other tiles or slices is not used. Therefore, when a block located at a tile or slice boundary is predicted, some neighboring blocks of that block may not exist. An existing video encoding apparatus pads the pixel values of neighboring blocks at non-existent positions with predetermined values, or regards those blocks as invalid.

However, regions that are not adjacent to each other in the 2D layout may be adjacent to each other in the 360 image. Accordingly, the present invention predicts the current block to be encoded, or encodes the prediction information of the current block, by reflecting this characteristic of the 360 image.

FIG. 9 illustrates an apparatus for generating a syntax element for prediction information of a current block in a 360 image according to an embodiment of the present invention.

The apparatus 900 includes a prediction information candidate generator 910 and a syntax generator 920.

The prediction information candidate generator 910 generates prediction information candidates using neighboring blocks of the current block located on the first face of the 2D layout projected from the 360 image. The neighboring blocks are blocks located at predetermined positions around the current block and, as shown in FIG. 5, may include all or some of the left block (L), the upper block (A), the lower left block (BL), the upper right block (AR), and the upper left block (AL).

When the current block touches the boundary of the first face, that is, when a boundary of the current block coincides with a boundary of the first face, some of the neighboring blocks at the predetermined positions may not exist within the first face. For example, when the current block touches the upper boundary of the first face, the upper block A, the upper right block AR, and the upper left block AL of FIG. 5 do not exist within the first face. In conventional image coding, such neighboring blocks are regarded as invalid and are not used. In the present invention, however, when a boundary of the current block coincides with a boundary of the first face, the neighboring blocks are determined based on the 360 image rather than the 2D layout. That is, a block adjacent to the current block in the 360 image is determined as a neighboring block. Here, the prediction information candidate generator 910 may identify the block adjacent to the current block in the 360 image by using one or more of the projection format of the 360 image, the face index, and the face rotation information. For example, in the equirectangular projection format there is only one face, so the block adjacent to the current block can be identified using only the projection format, without the face index or the face rotation information. In a projection format having a plurality of faces, unlike equirectangular projection, the block adjacent to the current block can be identified using the face index in addition to the projection format. When face rotation is adopted, the block adjacent to the current block can be identified by additionally using the face rotation information along with the face index.

For example, when a boundary of the current block coincides with a boundary of the first face, the prediction information candidate generator 910 identifies a second face that has already been encoded and that contacts the boundary of the current block in the 360 image. Here, whether a boundary of the current block coincides with a boundary of the first face may be determined from the position of the current block, for example, the position of the top-left pixel of the current block. The second face is identified using one or more of the projection format, the face index, and the face rotation information. The prediction information candidate generator 910 determines a block that is located on the second face and is adjacent to the current block in the 360 image as a neighboring block of the current block.
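For the single-face equirectangular case mentioned above, the neighbor position can be sketched with a horizontal wrap-around, since the left and right boundaries of the 2D image are adjacent in the 360 image. This is an illustrative assumption of how such a lookup could be written: the function name `erp_neighbor_pos` is hypothetical, wrap-around across the top and bottom of the picture is not modeled, and whether the returned position has already been encoded must be checked separately.

```python
def erp_neighbor_pos(x, y, dx, dy, pic_width, pic_height):
    """Return the sample position (x + dx, y + dy) in an ERP picture.

    The horizontal coordinate wraps around because the left and right
    picture boundaries touch in the 360 image. Positions above or below
    the picture return None (not modeled in this sketch).
    """
    ny = y + dy
    if ny < 0 or ny >= pic_height:
        return None
    return ((x + dx) % pic_width, ny)


# Upper right neighbor sample of a 16x16 block at the right picture boundary.
print(erp_neighbor_pos(1008, 16, 16, -1, 1024, 512))
# Left neighbor sample of a block at the left picture boundary.
print(erp_neighbor_pos(0, 16, -1, 0, 1024, 512))
```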

FIG. 10 is an exemplary diagram for describing a method of determining neighboring blocks of a current block in a cube format to which a compact layout is applied.

In FIG. 10, the number on each face indicates the face index. As shown in Table 3, 0 denotes the top face, 1 the bottom face, 2 the front face, 3 the right face, 4 the back face, and 5 the left face. In the compact layout of FIG. 10 (b), when the current block X touches the upper boundary of the front face 2, the left neighboring block L of the current block exists in the front face 2, but the upper neighboring block A does not. However, as shown in FIG. 10 (a), when the compact layout is projected to a 360 image according to the cube format, the upper boundary of the front face 2, which the current block touches, is in contact with the lower boundary of the top face 0, and a block A adjacent to the current block X exists at the lower boundary of the top face 0. Therefore, block A of the top face 0 is used as a neighboring block of the current block.

Meanwhile, the encoder 250 of the encoding apparatus illustrated in FIG. 2 may further encode a flag indicating whether reference between different faces is allowed. Determining the neighboring blocks of the current block based on the 360 image creates a dependency between faces, which may reduce the execution speed of the encoder and the decoder. To prevent this, the flag may be encoded in a header such as a sequence parameter set (SPS) or a picture parameter set (PPS). When the flag is on (e.g., flag = 1), the prediction information candidate generator 910 determines the neighboring blocks of the current block based on the 360 image; when the flag is off (e.g., flag = 0), it determines the neighboring blocks independently for each face based on the 2D image rather than the 360 image, as in the prior art.

The syntax generator 920 encodes a syntax element for the prediction information of the current block by using the prediction information candidates generated by the prediction information candidate generator 910. Here, the prediction information may be inter prediction information or intra prediction information.
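The FIG. 10 example can be sketched as a lookup of the upper neighboring block in face-local coordinates. Several points here are assumptions made for illustration: the table `ADJACENT_ABOVE` contains only the front/top relation stated in the text, all blocks are taken to have the same size, and the front and top faces are assumed to share the same horizontal orientation so that the x coordinate carries over unchanged.

```python
# Hypothetical table: face_idx -> face_idx of the face touching its upper
# boundary in the 360 image. Only front (2) -> top (0) is from the text.
ADJACENT_ABOVE = {2: 0}


def upper_neighbor(face_idx, x, y, face_size, block_size):
    """Return (face_idx, x, y) of the block above the block at (x, y).

    Coordinates are local to the face. Returns None when no neighbor
    can be identified with the information available in this sketch.
    """
    if y >= block_size:
        return (face_idx, x, y - block_size)   # neighbor inside same face
    above = ADJACENT_ABOVE.get(face_idx)
    if above is None:
        return None
    # Block at the lower boundary of the adjacent face (assumed same
    # horizontal orientation, no face rotation).
    return (above, x, face_size - block_size)


print(upper_neighbor(2, 32, 0, 256, 16))   # current block X on the front face
```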
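The effect of the flag can be sketched as a simple selection rule. The names `select_neighbor` and `cross_face_ref_flag` are hypothetical; the description does not name the flag, and treating an unavailable neighbor as `None` is an assumption of this sketch.

```python
def select_neighbor(in_face_neighbor, neighbor_in_360, cross_face_ref_flag):
    """Choose the neighboring block used for candidate generation.

    in_face_neighbor: neighbor found inside the current face, or None.
    neighbor_in_360:  neighbor found on another face via the 360 image,
                      or None.
    """
    if in_face_neighbor is not None:
        return in_face_neighbor
    if cross_face_ref_flag == 1:
        return neighbor_in_360      # reference between faces is allowed
    return None                     # each face is handled independently


print(select_neighbor(None, ("face 0", "block A"), 1))
print(select_neighbor(None, ("face 0", "block A"), 0))
```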

An example of a case where the apparatus of FIG. 9 is applied to intra prediction and inter prediction is described.

FIG. 11 is a diagram illustrating a detailed configuration of the intra predictor 222 when the apparatus of FIG. 9 is applied to intra prediction.

The intra prediction unit 222 of the present embodiment includes an MPM generator 1110 and a syntax generator 1120, and these components correspond to the prediction information candidate generator 910 and the syntax generator 920 of FIG. 9, respectively.

As described above, the MPM generator 1110 generates MPMs from the intra prediction mode of the neighboring blocks of the current block to form an MPM list. Since the method of constructing the MPM list has already been described in the intra prediction unit 222 of FIG. 2, further description thereof will be omitted.

When a boundary of the current block coincides with a boundary of the face on which the current block is located, the MPM generator 1110 determines a block adjacent to the current block in the 360 image as a neighboring block of the current block. For example, as in FIG. 10, when the current block X touches the upper boundary of the front face 2, the upper block A, the upper right block AR, and the upper left block AL do not exist on the front face 2. Therefore, the top face 0, which is in contact with the upper boundary of the front face 2 in the 360 image, is identified, and the blocks corresponding to the upper block A, the upper right block AR, and the upper left block AL are identified on the top face 0 based on the position of the current block and are used as neighboring blocks.

The syntax generator 1120 generates a syntax element for the intra prediction mode of the current block by using the MPMs included in the MPM list and outputs the syntax element to the encoder 250. That is, the syntax generator 1120 determines whether the intra prediction mode of the current block is the same as any one of the MPMs in the MPM list, and generates mode information indicating whether the intra prediction mode of the current block is selected from the MPMs in the MPM list. When the intra prediction mode of the current block is selected from the MPMs, first identification information indicating which of the MPMs is selected as the intra prediction mode of the current block is generated. When the intra prediction mode of the current block is not selected from the MPMs, second identification information indicating the intra prediction mode of the current block among the remaining modes, that is, the plurality of intra prediction modes excluding the MPMs, is generated. The generated mode information and the first and/or second identification information are output to the encoder 250 and encoded by the encoder 250.

The intra predictor 222 may further include a reference pixel generator 1130 and a prediction block generator 1140.

The reference pixel generator 1130 sets pixels in previously encoded blocks located around the current block as reference pixels. For example, pixels located above and to the upper right of the current block and pixels located to the left, upper left, and lower left of the current block may be set as reference pixels. The pixels located above and to the upper right may include one or more rows of pixels around the current block. The pixels located to the left, upper left, and lower left may include one or more columns of pixels around the current block.

When a boundary of the current block coincides with a boundary of the face on which the current block is located, the reference pixel generator 1130 sets the reference pixels of the current block based on the 360 image. The principle is the same as that described above for determining neighboring blocks. For example, referring to FIG. 12, in the 2D layout, reference pixels exist at the left and lower left of the current block X located on the front face 2, but not at the top, upper right, and upper left. However, when the compact layout is projected to a 360 image according to the cube format, the upper boundary of the front face 2, which the current block touches, is in contact with the lower boundary of the top face 0. Therefore, the pixels at the lower boundary of the top face 0 that correspond to the top, upper right, and upper left of the current block are set as reference pixels.
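The three pieces of information generated by the syntax generator 1120 can be sketched as follows. The dictionary keys `mode_info`, `first_id`, and `second_id`, the function name, and the representation of intra prediction modes as integers 0 to num_modes - 1 are assumptions made for illustration; binarization and entropy coding are not shown.

```python
def encode_intra_mode(mode, mpm_list, num_modes):
    """Build the syntax values describing the intra prediction mode.

    mode_info = 1: the mode is one of the MPMs, first_id is its index
                   in the MPM list.
    mode_info = 0: second_id is the index of the mode among the
                   remaining (non-MPM) modes, in ascending order.
    """
    if mode in mpm_list:
        return {"mode_info": 1, "first_id": mpm_list.index(mode)}
    remaining = [m for m in range(num_modes) if m not in mpm_list]
    return {"mode_info": 0, "second_id": remaining.index(mode)}


mpms = [0, 1, 10]
print(encode_intra_mode(10, mpms, 35))
print(encode_intra_mode(5, mpms, 35))
```

Indexing the remaining modes after removing the MPMs keeps the second identification information within a smaller range than the full set of modes.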
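A minimal sketch of fetching the row of reference pixels above the current block is shown below. The function name, the `faces` dictionary of reconstructed samples, and the `adjacent_above` mapping are hypothetical, and the sketch assumes the adjacent face has the same horizontal orientation as the current face (no face rotation), so that column positions carry over unchanged.

```python
def top_reference_row(faces, face_idx, x, y, width, adjacent_above):
    """Return `width` reference pixels located directly above the block.

    faces:          dict face_idx -> 2D list of reconstructed samples
    adjacent_above: dict face_idx -> face_idx of the face touching the
                    upper boundary in the 360 image
    """
    if y > 0:
        return faces[face_idx][y - 1][x:x + width]   # inside same face
    above = adjacent_above.get(face_idx)
    if above is None:
        return None                                  # no reference pixels
    return faces[above][-1][x:x + width]             # lowest row of face above


faces = {
    0: [[1, 2, 3, 4], [5, 6, 7, 8]],         # top face
    2: [[9, 10, 11, 12], [13, 14, 15, 16]],  # front face
}
print(top_reference_row(faces, 2, 1, 0, 2, {2: 0}))
```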

FIG. 13 is an exemplary diagram for describing a method of setting reference pixels for intra prediction in various projection formats. As shown in (a) to (e) of FIG. 13, a position where no reference pixel exists is padded with pixels located around the current block in the 360 image. The padding is determined in consideration of the positions at which pixels touch each other in the 360 image. For example, in the cube format of FIG. 13 (b), pixels 1 to 8, located in order from bottom to top along the left boundary within the back face, are padded in that order, from right to left, as the neighboring pixels located above the left face. However, the present invention is not limited thereto, and in some cases padding in the reverse direction is also possible. For example, in FIG. 13 (b), pixels 1 to 8 of the back face may instead be padded in order from left to right as the neighboring pixels located above the left face.

The prediction block generator 1140 predicts the current block by using the reference pixels set by the reference pixel generator 1130 and determines an intra prediction mode of the current block. The determined intra prediction mode is input to the MPM generator 1110, and the MPM generator 1110 and the syntax generator 1120 generate syntax elements for the determined intra prediction mode and output them to the encoder.

FIG. 14 is a diagram illustrating a detailed configuration of the inter prediction unit 224 when the apparatus of FIG. 9 is applied to inter prediction.

When the apparatus of FIG. 9 is applied to inter prediction, the inter prediction unit 224 includes a prediction block generator 1410, a merge candidate generator 1420, and a syntax generator 1430. The merge candidate generator 1420 and the syntax generator 1430 correspond to the prediction information candidate generator 910 and the syntax generator 920 of FIG. 9, respectively.

The prediction block generator 1410 searches for a block having a pixel value most similar to that of the current block in the reference picture, and generates a motion vector and a prediction block of the current block. The prediction block is output to the subtractor 230 and the adder 270, and the motion information including the motion vector and the information about the reference picture is output to the syntax generator 1430.

The merge candidate generator 1420 generates a merge list including merge candidates using neighboring blocks of the current block. As described above, all or some of the left block (L), the upper block (A), the upper right block (AR), the lower left block (BL), and the upper left block (AL) adjacent to the current block may be used as neighboring blocks for generating merge candidates.

When the boundary of the current block coincides with the boundary of the first surface on which the current block is located, the merge candidate generator 1420 determines a block adjacent to the current block as the neighboring block based on the 360 image. The merge candidate generator 1420 corresponds to the prediction information candidate generator 910 of FIG. 9. Therefore, since all the functions of the prediction information candidate generator 910 may be applied to the merge candidate generator 1420, a detailed description thereof will be omitted.

The syntax generator 1430 generates a syntax element for inter prediction information of the current block by using the merge candidates included in the merge list. First, mode information indicating whether the motion information of the current block is encoded in the merge mode is generated. When the motion information of the current block is encoded in the merge mode, the syntax generator 1430 generates merge index information indicating which merge candidate in the merge list provides the motion information that is set as the motion information of the current block.

If the motion information is not encoded in the merge mode, the syntax generator 1430 generates information about the differential motion vector and information about the reference picture used for predicting the current block (that is, the reference picture referred to by the motion vector of the current block).

The syntax generator 1430 determines a predicted motion vector for the motion vector of the current block in order to generate the differential motion vector. As described for the inter prediction unit 224 of FIG. 2, the syntax generator 1430 derives predicted motion vector candidates using the neighboring blocks of the current block, and may determine the predicted motion vector for the motion vector of the current block from the predicted motion vector candidates. In this case, when a boundary of the current block coincides with a boundary of the first face on which the current block is located, a block adjacent to the current block in the 360 image is determined as the neighboring block, in the same manner as in the merge candidate generator 1420.

If the predicted motion vector for the motion vector of the current block is determined by selecting one of the predicted motion vector candidates, the syntax generator 1430 further generates predicted motion vector identification information for identifying the candidate selected as the predicted motion vector from among the predicted motion vector candidates.
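The syntax elements described above for the merge mode and the differential motion vector mode can be sketched as follows. The key names (`merge_flag`, `merge_idx`, `mvd`, `ref_idx`, `mvp_idx`) are hypothetical. Two simplifications are assumptions of this sketch rather than part of the description: the merge mode is chosen whenever the motion information equals a merge candidate, and the predicted motion vector is the candidate with the smallest absolute difference from the motion vector.

```python
def encode_motion(mv, ref_idx, merge_list, mvp_candidates):
    """Build syntax values for the motion information of the current block.

    mv:             (x, y) motion vector of the current block
    merge_list:     list of ((x, y), ref_idx) merge candidates
    mvp_candidates: list of (x, y) predicted motion vector candidates
    """
    motion = (mv, ref_idx)
    if motion in merge_list:
        return {"merge_flag": 1, "merge_idx": merge_list.index(motion)}

    def cost(i):
        cand = mvp_candidates[i]
        return abs(mv[0] - cand[0]) + abs(mv[1] - cand[1])

    best = min(range(len(mvp_candidates)), key=cost)
    mvp = mvp_candidates[best]
    return {
        "merge_flag": 0,
        "mvd": (mv[0] - mvp[0], mv[1] - mvp[1]),  # differential motion vector
        "ref_idx": ref_idx,                       # reference picture info
        "mvp_idx": best,                          # identification information
    }


print(encode_motion((4, -2), 0, [((4, -2), 0), ((1, 1), 1)], [(0, 0)]))
print(encode_motion((5, 3), 1, [((4, -2), 0)], [(0, 0), (4, 4)]))
```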

The syntax element generated by the syntax generator 1430 is encoded by the encoder 250 and transferred to the decoding apparatus.

Hereinafter, an image decoding apparatus will be described.

FIG. 15 illustrates an image decoding apparatus according to an embodiment of the present invention.

The image decoding apparatus includes a decoder 1510, an inverse quantizer 1520, an inverse transformer 1530, a predictor 1540, an adder 1550, a filter 1560, and a memory 1570. Like the image encoding apparatus of FIG. 2, the image decoding apparatus may be implemented by each component as a hardware chip, or may be implemented by software and a microprocessor to execute a function of software corresponding to each component.

The decoder 1510 decodes the bitstream received from the image encoding apparatus, extracts information related to block division to determine the current block to be decoded, and extracts the prediction information and the residual signal information necessary for reconstructing the current block.

The decoder 1510 extracts information on the CTU size from a Sequence Parameter Set (SPS) or Picture Parameter Set (PPS) to determine the size of the CTU, and divides the picture into a CTU of the determined size. The CTU is determined as the highest layer of the tree structure, that is, the root node, and the CTU is partitioned using the tree structure by extracting partition information about the CTU. For example, when splitting a CTU using a QTBT structure, first, a first flag (QT_split_flag) related to splitting of QT is extracted, and each node is divided into four nodes of a lower layer. For the node corresponding to the leaf node of the QT, the second flag BT_split_flag and the split type information related to the splitting of the BT are extracted to split the corresponding leaf node into the BT structure.

Taking the block division structure of FIG. 3 as an example, the QT division flag QT_split_flag corresponding to the node of the highest layer of the QTBT structure is extracted. Since the value of the extracted QT split flag QT_split_flag is 1, the node of the highest layer is divided into four nodes of the lower layer (layer 1 of QT). Then, the QT splitting flag QT_split_flag for the first node of layer 1 is extracted. Since the extracted QT split flag (QT_split_flag) has a value of 0, the first node of layer 1 is no longer split into a QT structure.

Since the first node of layer 1 of the QT becomes a leaf node of the QT, the node proceeds to BT having the first node of layer 1 of the QT as the root node of the BT. A BT split flag (BT_split_flag) corresponding to the root node of BT, that is, (layer 0), is extracted. Since the BT split flag BT_split_flag is 1, the root node of BT is split into two nodes of (layer 1). Since the root node of the BT is split, split type information indicating whether a block corresponding to the root node of the BT is split vertically or horizontally is extracted. Since the split type information is 1, the block corresponding to the root node of BT is split vertically. Then, the BT split flag (BT_split_flag) for the first node of (layer 1) split from the root node of BT is extracted. Since the BT split flag BT_split_flag is 1, the split type information of the block of the first node of (layer 1) is extracted. Since the partition type information of the block of the first node of (layer 1) is 1, the block of the first node of (layer 1) is vertically divided. Then, the BT split flag BT_split_flag of the second node of (layer 1) divided from the root node of BT is extracted. Since the BT split flag BT_split_flag is 0, it is no longer split by BT.

In this way, the decoder 1510 first recursively extracts the QT split flag (QT_split_flag) to split the CTU into a QT structure. For each leaf node of the QT, the BT split flag (BT_split_flag) is extracted, and when the BT split flag indicates a split, the split type information is extracted. In this manner, the decoder 1510 can determine the structure into which the CTU has been divided.

On the other hand, if information such as MinQTSize, MaxBTSize, MaxBTDepth, and MinBTSize is additionally defined in the SPS or PPS, the decoder 1510 extracts this information and may take it into account when extracting the split information for the QT and the BT.

For example, a block in the QT whose size equals MinQTSize is not split further. Therefore, the decoder 1510 does not extract the QT split information (QT split flag) of that block from the bitstream (that is, the QT split flag of the block is not present in the bitstream) and automatically sets its value to 0. In addition, a QT block larger than MaxBTSize is not split by BT. Accordingly, the decoder 1510 does not extract the BT split flag for a QT leaf node whose block is larger than MaxBTSize, and automatically sets the BT split flag to 0. In addition, when the depth of a BT node reaches MaxBTDepth, the block of that node is not split further. Therefore, the BT split flag of that node is not extracted from the bitstream, and its value is automatically set to 0. Also, a block in the BT whose size equals MinBTSize is not split further. Accordingly, the decoder 1510 does not extract from the bitstream the BT split flag of a block whose size equals MinBTSize, and automatically sets its value to 0.

On the other hand, once the decoder 1510 has determined the current block to be decoded by splitting the tree structure, it extracts information about a prediction type indicating whether the current block is intra predicted or inter predicted.
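The recursive extraction described above can be sketched as a small parser over a list of already-decoded flag values. This is an illustration under stated assumptions: flags are consumed depth-first in the order shown, a split type of 1 is taken to mean a vertical split into left and right halves (the text only states that 1 means "split vertically"), size constraints such as MinQTSize are ignored, and the flag sequence in the example is hypothetical beyond the first QT node walked through in the text.

```python
def parse_bt(bits, x, y, w, h, leaves):
    """Parse the BT part of the tree rooted at block (x, y, w, h)."""
    if next(bits) == 0:                 # BT_split_flag
        leaves.append((x, y, w, h))
        return
    if next(bits) == 1:                 # split type: 1 = vertical
        parse_bt(bits, x, y, w // 2, h, leaves)
        parse_bt(bits, x + w // 2, y, w // 2, h, leaves)
    else:                               # 0 = horizontal
        parse_bt(bits, x, y, w, h // 2, leaves)
        parse_bt(bits, x, y + h // 2, w, h // 2, leaves)


def parse_qt(bits, x, y, size, leaves):
    """Parse the QT part of the tree rooted at a square block."""
    if next(bits) == 1:                 # QT_split_flag
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                parse_qt(bits, x + dx, y + dy, half, leaves)
    else:
        parse_bt(bits, x, y, size, size, leaves)   # QT leaf is the BT root


def parse_ctu(flags, ctu_size):
    """Return the leaf blocks (x, y, width, height) of one CTU."""
    leaves = []
    parse_qt(iter(flags), 0, 0, ctu_size, leaves)
    return leaves


# Root QT split; first QT node: no QT split, BT split vertically, its first
# child split vertically again; all other nodes are left unsplit.
flags = [1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
for block in parse_ctu(flags, 64):
    print(block)
```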
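The inference rules above can be sketched as two flag readers that return 0 without consuming anything from the bitstream when a constraint applies. The function names and the iterator of decoded flag values are hypothetical. As an assumption of this sketch, the size of a BT block is compared against MaxBTSize using its larger dimension, and a block is considered to have reached MinBTSize only when both dimensions have.

```python
def read_qt_split_flag(bits, size, min_qt_size):
    """Return the QT split flag, inferring 0 at MinQTSize."""
    if size <= min_qt_size:
        return 0                 # flag is not present in the bitstream
    return next(bits)


def read_bt_split_flag(bits, width, height, bt_depth,
                       max_bt_size, max_bt_depth, min_bt_size):
    """Return the BT split flag, inferring 0 when a constraint applies."""
    if max(width, height) > max_bt_size:
        return 0                 # block larger than MaxBTSize: no BT split
    if bt_depth >= max_bt_depth:
        return 0                 # MaxBTDepth reached
    if width <= min_bt_size and height <= min_bt_size:
        return 0                 # MinBTSize reached
    return next(bits)


bits = iter([1])
print(read_qt_split_flag(bits, 16, 16))   # inferred, nothing is read
print(read_qt_split_flag(bits, 32, 16))   # read from the flag sequence
```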

When the prediction type information indicates intra prediction, the decoder 1510 extracts a syntax element for the intra prediction information (intra prediction mode) of the current block. First, mode information indicating whether the intra prediction mode of the current block is selected from the MPMs is extracted. When the mode information indicates that the intra prediction mode of the current block is selected from the MPMs, first intra identification information indicating which of the MPMs is selected as the intra prediction mode of the current block is extracted. On the other hand, when the mode information indicates that the intra prediction mode of the current block is not selected from the MPMs, second intra identification information indicating which mode other than the MPMs is selected as the intra prediction mode of the current block is extracted.

When the prediction type information indicates inter prediction, the decoder 1510 extracts a syntax element for the inter prediction information. First, mode information indicating which of a plurality of encoding modes was used to encode the motion information of the current block is extracted. Here, the plurality of encoding modes include a merge mode and a differential motion vector encoding mode. When the mode information indicates the merge mode, the decoder 1510 extracts, as a syntax element for the motion information, merge index information indicating from which of the merge candidates the motion vector of the current block is to be derived. On the other hand, when the mode information indicates the differential motion vector encoding mode, the decoder 1510 extracts, as syntax elements for the motion vector, information about the differential motion vector and information about the reference picture to which the motion vector of the current block refers. Meanwhile, when the image encoding apparatus uses one of a plurality of predicted motion vector candidates as the predicted motion vector of the current block, predicted motion vector identification information is included in the bitstream. In this case, therefore, the predicted motion vector identification information is extracted as a syntax element for the motion vector, in addition to the information about the differential motion vector and the reference picture.

Meanwhile, the decoder 1510 extracts information on the quantized transform coefficients of the current block as information on the residual signal.

The inverse quantization unit 1520 inversely quantizes the quantized transform coefficients, and the inverse transform unit 1530 inversely transforms the inverse quantized transform coefficients from the frequency domain to the spatial domain to generate a residual block for the current block.

The predictor 1540 includes an intra predictor 1542 and an inter predictor 1544. The intra predictor 1542 is activated when the prediction type of the current block is intra prediction, and the inter predictor 1544 is activated when the prediction type of the current block is inter prediction.

The intra predictor 1542 determines the intra prediction mode of the current block among the plurality of intra prediction modes from the syntax elements for the intra prediction mode extracted by the decoder 1510, and predicts the current block using reference pixels around the current block according to the intra prediction mode.

In order to determine the intra prediction mode of the current block, the intra prediction unit 1542 constructs an MPM list including a predetermined number of MPMs from neighboring blocks of the current block. The method of constructing the MPM list is the same as that of the intra predictor 222 of FIG. 2. When the mode information of the intra prediction indicates that the intra prediction mode of the current block is selected from the MPMs, the intra prediction unit 1542 selects, as the intra prediction mode of the current block, the MPM indicated by the first intra identification information among the MPMs in the MPM list. On the other hand, if the mode information indicates that the intra prediction mode of the current block is not selected from the MPMs, the intra prediction mode of the current block is determined from among the remaining intra prediction modes, excluding the MPMs in the MPM list, using the second intra identification information.

The inter prediction unit 1544 determines motion information of the current block using the syntax elements for the inter prediction information extracted by the decoder 1510, and predicts the current block using the determined motion information.
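The decoder-side selection described above can be sketched as follows. The dictionary keys `mode_info`, `first_id`, and `second_id` and the representation of modes as integers 0 to num_modes - 1 are assumptions made for illustration; the remaining modes are indexed in ascending order after removing the MPMs, which is one possible convention.

```python
def decode_intra_mode(syntax, mpm_list, num_modes):
    """Recover the intra prediction mode from the extracted syntax values."""
    if syntax["mode_info"] == 1:
        # Mode is the MPM indicated by the first identification information.
        return mpm_list[syntax["first_id"]]
    # Otherwise pick from the modes that are not in the MPM list.
    remaining = [m for m in range(num_modes) if m not in mpm_list]
    return remaining[syntax["second_id"]]


mpms = [0, 1, 10]
print(decode_intra_mode({"mode_info": 1, "first_id": 2}, mpms, 35))
print(decode_intra_mode({"mode_info": 0, "second_id": 3}, mpms, 35))
```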

First, the inter prediction unit 1544 checks mode information in the inter prediction extracted from the decoder 1510. When the mode information indicates the merge mode, the inter prediction unit 1544 constructs a merge list including a predetermined number of merge candidates using neighboring blocks of the current block. The inter prediction unit 1544 configures the merge list in the same way as the inter prediction unit 224 of the image encoding apparatus. Then, one merge candidate is selected from among the merge candidates in the merge list by using the merge index information transmitted from the decoder 1510. The motion information of the selected merge candidate, that is, the motion vector and the reference picture of the merge candidate are set as the motion vector and the reference picture of the current block.

On the other hand, when the mode information indicates the differential motion vector encoding mode, the inter prediction unit 1544 derives predicted motion vector candidates using the motion vectors of the neighboring blocks of the current block, and uses the predicted motion vector candidates to determine the predicted motion vector for the motion vector of the current block. The inter prediction unit 1544 derives the predicted motion vector candidates in the same manner as the inter prediction unit 224 of the image encoding apparatus. If the image encoding apparatus uses any one of a plurality of predicted motion vector candidates as the predicted motion vector of the current block, the syntax element for the motion information includes predicted motion vector identification information. In this case, the inter prediction unit 1544 may select, as the predicted motion vector, the candidate indicated by the predicted motion vector identification information among the predicted motion vector candidates. However, when the image encoding apparatus determines the predicted motion vector by applying a predefined function to the plurality of predicted motion vector candidates, the inter prediction unit 1544 may determine the predicted motion vector by applying the same function as the image encoding apparatus. When the predicted motion vector of the current block is determined, the inter prediction unit 1544 adds the predicted motion vector and the differential motion vector transmitted from the decoder 1510 to determine the motion vector of the current block. The reference picture referred to by the motion vector of the current block is determined by using information about the reference picture transmitted from the decoder 1510.
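The motion vector reconstruction in this mode can be sketched as below. The text does not specify the predefined function, so the component-wise median used here is only an assumed example; all function and parameter names are likewise hypothetical.

```python
from statistics import median_low
from typing import Callable, List, Optional, Tuple

MV = Tuple[int, int]


def median_mvp(candidates: List[MV]) -> MV:
    """Example predefined function (an assumption): component-wise median."""
    return (median_low(c[0] for c in candidates),
            median_low(c[1] for c in candidates))


def reconstruct_mv(
    candidates: List[MV],
    mvd: MV,
    mvp_index: Optional[int] = None,
    predefined: Callable[[List[MV]], MV] = median_mvp,
) -> MV:
    """Motion vector = predicted motion vector + differential motion vector."""
    if mvp_index is not None:
        # Identification information was signalled: pick that candidate.
        mvp = candidates[mvp_index]
    else:
        # Nothing signalled: apply the same function as the encoder.
        mvp = predefined(candidates)
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

For candidates `[(4, -2), (6, 0), (5, 8)]` and a differential vector `(1, 1)`, signalling index 1 gives `(7, 1)`, whereas the median variant gives `(6, 1)`.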

When the motion vector and the reference picture of the current block are determined in the merge mode or the differential motion vector encoding mode, the inter prediction unit 1544 generates the prediction block of the current block using the block indicated by the motion vector in the reference picture.

The adder 1550 reconstructs the current block by adding the residual block output from the inverse transformer and the prediction block output from the inter predictor or the intra predictor. The pixels in the reconstructed current block are used as reference pixels for intra prediction of a block to be decoded later.
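The addition performed by the adder can be sketched in a few lines. Clipping the sum to the valid sample range is common practice in video codecs but is not stated in the text, so it is an assumption here, as are the function name and the 8-bit default.

```python
from typing import List

Block = List[List[int]]


def reconstruct_block(pred: Block, resid: Block, bit_depth: int = 8) -> Block:
    """Add the residual block to the prediction block, sample by sample."""
    max_val = (1 << bit_depth) - 1
    return [
        # Assumption: results are clipped to [0, 2^bit_depth - 1].
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]
```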

The filter unit 1560 applies deblocking filtering to the boundaries between the reconstructed blocks in order to remove blocking artifacts caused by block-by-block decoding, and stores the filtered blocks in the memory 290. When all the blocks in a picture are reconstructed, the reconstructed picture is used as a reference picture for inter prediction of a block in a picture to be decoded later.

The image decoding technique described above is also applied when decoding a 360 image that has been projected onto a 2D layout and then encoded.

In the case of the 360 image, as described above, the metadata of the 360 image is encoded at any one of a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), and supplemental enhancement information (SEI). Accordingly, the decoder 1510 extracts the metadata of the 360 image from the corresponding position. The extracted metadata is used to reconstruct the 360 image. In particular, the metadata may be used when predicting the current block or decoding the prediction information about the current block.

FIG. 16 illustrates an apparatus for determining prediction information of a current block in a 360 image, according to an embodiment of the present invention.

The apparatus 1600 includes a prediction information candidate generator 1610 and a prediction information determiner 1620.

The prediction information candidate generator 1610 generates prediction information candidates using neighboring blocks of the current block located on the first surface of the 2D layout projected from the 360 image. In particular, when the boundary of the current block coincides with the boundary of the first surface, that is, when the current block is in contact with the boundary of the first surface, the prediction information candidate generator 1610 sets a block that is adjacent to the current block in the 360 image as a neighboring block of the current block, even if that block is not adjacent to the current block in the 2D layout. For example, when the boundary of the current block coincides with the boundary of the first surface, the prediction information candidate generator 1610 identifies a previously coded second surface that adjoins the boundary of the current block. The second surface is identified using one or more of the projection format, the face index, and the rotation information of the face among the metadata of the 360 image. The method by which the prediction information candidate generator 1610 determines a neighboring block of the current block based on the 360 image is the same as that of the prediction information candidate generator 910 described above.
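The boundary handling can be illustrated with a small lookup for the left neighbor only. This is a sketch under several assumptions: the adjacency table stands in for whatever the decoder derives from the projection format, face index, and rotation metadata; the neighboring face is assumed to be unrotated relative to the current one; and `allow_cross_face` models the flag indicating whether reference between different planes is allowed. None of these names come from the original text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: (face index, edge) -> face that adjoins that edge in the 360 image.
Adjacency = Dict[Tuple[int, str], int]


def left_neighbor(
    face: int,
    x: int,
    y: int,
    face_size: int,
    adjacency: Adjacency,
    allow_cross_face: bool = True,
) -> Optional[Tuple[int, int, int]]:
    """Return (face, x, y) of the position left of (x, y), or None if unavailable.

    Coordinates are relative to the top-left corner of each face.
    """
    if x > 0:
        # Not on the face boundary: the neighbor lies on the same face.
        return (face, x - 1, y)
    if not allow_cross_face:
        return None
    neighbor_face = adjacency.get((face, "left"))
    if neighbor_face is None:
        return None
    # Assumption: no relative rotation, so the right-most column of the
    # neighboring face touches the left edge of the current face.
    return (neighbor_face, face_size - 1, y)
```

For example, with `adjacency = {(0, "left"): 4}` and 64x64 faces, position `(0, 16)` on face 0 maps to `(4, 63, 16)`, a position that need not be adjacent to the current block in the 2D layout. A full implementation would also have to handle the other edges and remap coordinates for rotated faces.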

The prediction information determiner 1620 restores the prediction information of the current block using the prediction information candidates generated by the prediction information candidate generator 1610 and the syntax element for the prediction information extracted by the decoder 1510, that is, a syntax element for the intra prediction information or a syntax element for the inter prediction information.

Hereinafter, an embodiment in the case where the apparatus of FIG. 16 is applied to intra prediction and inter prediction is described.

FIG. 17 is a diagram illustrating a detailed configuration of the intra prediction unit 1542 when the apparatus of FIG. 16 is applied to intra prediction.

When the apparatus of FIG. 16 is applied to intra prediction, the intra prediction unit 1542 includes an MPM generator 1710, an intra prediction mode determiner 1720, a reference pixel generator 1730, and a prediction block generator 1740. The MPM generator 1710 and the intra prediction mode determiner 1720 correspond to the prediction information candidate generator 1610 and the prediction information determiner 1620 of FIG. 16, respectively.

The MPM generator 1710 constructs an MPM list by deriving MPMs from the intra prediction modes of the neighboring blocks of the current block. In particular, when the boundary of the current block coincides with the boundary of the first surface on which the current block is located, the MPM generator 1710 determines the neighboring block of the current block based on the 360 image rather than the 2D layout. That is, even if the current block has no neighboring block in the 2D layout, if there is a block adjacent to the current block in the 360 image, that block is set as the neighboring block of the current block. The method of determining the neighboring block by the MPM generator 1710 is the same as that of the MPM generator 1110 of FIG. 11.
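To show why the choice of neighboring block matters, the following simplified sketch builds an MPM list from neighbor modes. It is not the derivation rule of this document or of any standard: the fill order, the list size of 3, and the default modes (0, 1, 26, following HEVC numbering for planar, DC, and vertical) are assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence


def build_mpm_list(
    neighbor_modes: Sequence[Optional[int]],
    num_mpm: int = 3,
    defaults: Sequence[int] = (0, 1, 26),  # assumption: planar, DC, vertical
) -> List[int]:
    """Collect distinct neighbor modes, then pad with default modes.

    A neighbor mode of None means that no neighboring block is available.
    """
    mpm: List[int] = []
    for mode in list(neighbor_modes) + list(defaults):
        if mode is not None and mode not in mpm:
            mpm.append(mode)
        if len(mpm) == num_mpm:
            break
    return mpm
```

If the left neighbor is missing in the 2D layout, `build_mpm_list([10, None])` falls back to `[10, 0, 1]`. If the 360-based lookup supplies a left neighbor with mode 18, `build_mpm_list([10, 18])` gives `[10, 18, 0]`, so the list reflects an actual spatial neighbor instead of a default.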

The intra prediction mode determiner 1720 determines the intra prediction mode of the current block from the MPMs in the MPM list generated by the MPM generator 1710 and the syntax elements for the intra prediction mode extracted by the decoder 1510. That is, if the mode information of the intra prediction indicates that the intra prediction mode of the current block is determined from the MPM list, the intra prediction mode determiner 1720 determines the mode identified by the first intra identification information, among the MPM candidates included in the MPM list, as the intra prediction mode of the current block. On the other hand, if the mode information of the intra prediction indicates that the intra prediction mode of the current block is not determined from the MPM list, the intra prediction mode determiner 1720 uses the second intra identification information to determine the intra prediction mode of the current block from among the remaining intra prediction modes, that is, the plurality of intra prediction modes usable for intra prediction of the current block excluding the MPMs in the MPM list.

The reference pixel generator 1730 sets pixels in previously decoded blocks located around the current block as reference pixels. If the boundary of the current block coincides with the boundary of the first surface on which the current block is located, the reference pixel generator 1730 sets the reference pixels based on the 360 image rather than the 2D layout. The method of setting the reference pixels by the reference pixel generator 1730 is the same as that of the reference pixel generator 1130 of FIG. 11.

The prediction block generator 1740 selects, from among the reference pixels, the reference pixels corresponding to the intra prediction mode of the current block, and generates the prediction block of the current block by applying the equation corresponding to the intra prediction mode of the current block to the selected reference pixels.
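As a simple illustration of applying a mode-specific equation to reference pixels, the sketch below covers three common intra modes (vertical, horizontal, and DC). The mode names, the function signature, and the rounding in the DC average are assumptions; the text does not enumerate the equations.

```python
from typing import List, Sequence


def predict_block(top: Sequence[int], left: Sequence[int], mode: str, size: int) -> List[List[int]]:
    """Generate a size x size prediction block from reference pixels.

    top holds the reference pixels above the block, left those to its left.
    """
    if mode == "vertical":
        # Each column repeats the reference pixel directly above it.
        return [list(top[:size]) for _ in range(size)]
    if mode == "horizontal":
        # Each row repeats the reference pixel directly to its left.
        return [[left[r]] * size for r in range(size)]
    if mode == "dc":
        # Rounded average of the top and left reference pixels.
        dc = (sum(top[:size]) + sum(left[:size]) + size) // (2 * size)
        return [[dc] * size for _ in range(size)]
    raise ValueError(f"unsupported mode: {mode}")
```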

FIG. 18 is a diagram illustrating a detailed configuration of the inter prediction unit 1544 when the apparatus of FIG. 16 is applied to inter prediction.

When the apparatus of FIG. 16 is applied to inter prediction, the inter prediction unit 1544 includes a merge candidate generator 1810, a motion vector predictor (MVP) candidate generator 1820, a motion information determiner 1830, and a prediction block generator 1840. The merge candidate generator 1810 and the MVP candidate generator 1820 correspond to the prediction information candidate generator 1610 of FIG. 16. The motion information determiner 1830 corresponds to the prediction information determiner 1620 of FIG. 16.

The merge candidate generator 1810 is activated when the mode information of the inter prediction extracted by the decoder 1510 indicates the merge mode. The merge candidate generator 1810 generates a merge list including merge candidates using neighboring blocks of the current block. In particular, when the boundary of the current block coincides with the boundary of the first surface on which the current block is located, the merge candidate generator 1810 determines a block adjacent to the current block based on the 360 image as a neighboring block. That is, a block that is adjacent to the current block in the 360 image is set as a neighboring block of the current block even if it is not adjacent to the current block in the 2D layout. The merge candidate generator 1810 operates in the same manner as the merge candidate generator 1420 of FIG. 14.

The MVP candidate generator 1820 is activated when the mode information of the inter prediction extracted by the decoder 1510 indicates the differential motion vector encoding mode. The MVP candidate generator 1820 determines candidates for the predicted motion vector of the current block (predicted motion vector candidates) by using the motion vectors of neighboring blocks of the current block. The MVP candidate generator 1820 determines the predicted motion vector candidates in the same manner as the syntax generator 1430 of FIG. 14. For example, like the syntax generator 1430 of FIG. 14, when the boundary of the current block coincides with the boundary of the first surface on which the current block is located, the MVP candidate generator 1820 determines a block adjacent to the current block as a neighboring block of the current block based on the 360 image rather than the 2D layout.

The motion information determiner 1830 reconstructs the motion information of the current block by using the merge candidates or the predicted motion vector candidates, depending on the inter prediction mode information, together with the motion information syntax element extracted by the decoder 1510. For example, when the inter prediction mode information indicates the merge mode, the motion information determiner 1830 sets the motion vector and the reference picture of the candidate indicated by the merge index information, among the merge candidates in the merge list, as the motion vector and the reference picture of the current block. On the other hand, when the inter prediction mode information indicates the differential motion vector encoding mode, the motion information determiner 1830 determines the predicted motion vector for the motion vector of the current block by using the predicted motion vector candidates, and determines the motion vector of the current block by adding the determined predicted motion vector and the differential motion vector transmitted from the decoder 1510. Then, the reference picture is determined by using information about the reference picture transmitted from the decoder 1510.

The prediction block generator 1840 predicts the current block by using the motion vector and the reference picture of the current block determined by the motion information determiner 1830. That is, the prediction block for the current block is generated using the block indicated by the motion vector of the current block in the reference picture.
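The block fetch described above can be sketched as follows. The example assumes integer-precision motion vectors and clamps coordinates at the picture border; the text specifies neither, and a real decoder would typically also interpolate fractional sample positions. The function and parameter names are hypothetical.

```python
from typing import List, Tuple


def motion_compensate(
    ref: List[List[int]],
    x: int,
    y: int,
    mv: Tuple[int, int],
    width: int,
    height: int,
) -> List[List[int]]:
    """Copy the width x height block that mv points to in the reference picture.

    (x, y) is the top-left position of the current block; mv is (dx, dy).
    """
    ref_h, ref_w = len(ref), len(ref[0])
    block = []
    for r in range(height):
        # Assumption: positions outside the picture are clamped to the border.
        src_y = min(max(y + mv[1] + r, 0), ref_h - 1)
        row = []
        for c in range(width):
            src_x = min(max(x + mv[0] + c, 0), ref_w - 1)
            row.append(ref[src_y][src_x])
        block.append(row)
    return block
```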

The above description is merely illustrative of the technical idea of the present embodiment, and those skilled in the art to which the present embodiment belongs may make various modifications and changes without departing from the essential characteristics of the present embodiment. Therefore, the present embodiments are not intended to limit the technical idea of the present embodiment but to describe the present invention, and the scope of the technical idea of the present embodiment is not limited by these embodiments. The scope of protection of the present embodiment should be interpreted by the following claims, and all technical ideas within the scope equivalent thereto should be construed as being included in the scope of the present embodiment.

CROSS-REFERENCE TO RELATED APPLICATION

This patent application claims priority under 35 U.S.C. §119(a) to Korean Patent Application No. 10-2016-0134654, filed in Korea on October 17, 2016, and Korean Patent Application No. 10-2017-0003154, filed in Korea on January 9, 2017, all of which are hereby incorporated by reference in this patent application. In addition, this patent application claims priority in countries other than the United States for the same reasons, and all of the contents thereof are incorporated herein by reference.

Claims

A method of encoding prediction information of a current block located on a first surface to be encoded when encoding each surface of a 2D image projected from a 360 image, the method comprising:

generating prediction information candidates using neighboring blocks of the current block; and

encoding a syntax element for the prediction information of the current block by using the prediction information candidates,

wherein, when a boundary of the current block coincides with a boundary of the first surface, a block adjacent to the current block based on the 360 image is set as at least some of the neighboring blocks.
The method of claim 1,

wherein, when the boundary of the current block coincides with the boundary of the first surface on which the current block is located, setting at least some of the neighboring blocks comprises:

identifying a second surface that is adjacent to the boundary of the current block in the 360 image and has already been encoded; and

setting one or more blocks that are on the second surface and adjacent to the current block in the 360 image as at least some of the neighboring blocks.
The method of claim 2,

wherein whether the boundary of the current block coincides with the boundary of the first surface is determined based on the position of the current block.
The method of claim 1,

wherein the block adjacent to the current block based on the 360 image is identified by at least one of a projection format, an index for each surface, and rotation information of each surface.
The method of claim 1,

wherein the prediction information is an intra prediction mode, and the prediction information candidates are MPMs (most probable modes).
The method of claim 5,

wherein the MPMs are derived from intra prediction modes of neighboring blocks at predetermined positions around the current block, and the predetermined positions include a plurality of positions among the left, upper, lower-left, upper-right, and upper-left sides of the current block.
The method of claim 5,

wherein encoding the syntax element for the prediction information of the current block comprises:

encoding mode information indicating whether the intra prediction mode of the current block is selected from the MPMs;

if the intra prediction mode of the current block is selected from the MPMs, encoding first intra identification information indicating which of the MPMs is selected as the intra prediction mode of the current block; and

if the intra prediction mode of the current block is not selected from the MPMs, encoding second intra identification information indicating the intra prediction mode of the current block among a plurality of intra prediction modes excluding the MPMs.
The method of claim 1,

further comprising encoding a flag indicating whether reference between different planes is allowed,

wherein a block adjacent to the current block based on the 360 image is set as at least some of the neighboring blocks when the flag indicates that reference between different planes is allowed.
A method of decoding prediction information of a current block located on a first surface to be decoded from a 360 image encoded as a 2D image, the method comprising:

decoding a syntax element for the prediction information of the current block from a bitstream;

generating prediction information candidates using neighboring blocks of the current block; and

restoring the prediction information of the current block by using the prediction information candidates and the decoded syntax element,

wherein, when a boundary of the current block coincides with a boundary of the first surface, a block adjacent to the current block based on the 360 image is set as at least some of the neighboring blocks.
The method of claim 9,

further comprising, when the boundary of the current block coincides with the boundary of the first surface:

identifying a second surface that is adjacent to the boundary of the current block in the 360 image and has already been encoded; and

setting a block that is included in the second surface and adjacent to the current block in the 360 image as at least some of the neighboring blocks.
The method of claim 9,

further comprising decoding, from a bitstream, metadata of the 360 image including at least one of projection format information, index information for each surface, and rotation information of each surface,

wherein the block adjacent to the current block based on the 360 image is identified by at least one of the projection format information, the index information for each surface, and the rotation information of each surface.
The method of claim 1,

Wherein the prediction information is an intra prediction mode, and the prediction information candidates are MPMs (most probable modes).
The method of claim 1,

Encoding a flag indicating whether reference between different planes is allowed,

And when the flag indicates that reference between different planes is allowed, setting a block adjacent to the current block as at least some of the neighboring blocks based on the 360 image.
An apparatus for decoding prediction information of a current block located on a first surface to be decoded from a 360 image encoded as a 2D image, the apparatus comprising:

a decoder configured to decode a syntax element for the prediction information of the current block from a bitstream;

a prediction information candidate generator configured to generate prediction information candidates using neighboring blocks of the current block; and

a prediction information determiner configured to reconstruct the prediction information of the current block by using the prediction information candidates and the decoded syntax element,

wherein the prediction information candidate generator sets a block adjacent to the current block based on the 360 image as at least some of the neighboring blocks when the boundary of the current block coincides with the boundary of the first surface.