WO2019009590A1

WO2019009590A1 - Method and device for decoding image by using partition unit including additional region

Info

Publication number: WO2019009590A1
Application number: PCT/KR2018/007520
Authority: WO
Inventors: 김기백
Original assignee: 김기백
Priority date: 2017-07-03
Filing date: 2018-07-03
Publication date: 2019-01-10
Also published as: US20230051471A1

Abstract

Disclosed are a method and a device for decoding an image by using a partition unit including an additional region. The method for decoding an image by using a partition unit including an additional region comprises the steps of: partitioning, by referring to a syntax element acquired from a received bit stream, an encoded image, which is included in the bitstream, into at least one partition unit; setting an additional region for at least one partition unit; and decoding the encoded image on the basis of the partition unit for which the additional region is set. Therefore, the image encoding efficiency can be improved.

Description

Image decoding method and apparatus using divided unit including additional region

The present invention relates to an image decoding method and apparatus using a division unit including an additional region, and more particularly, to a technique for improving coding efficiency by setting an additional region for a division unit and referring to image data in the additional region.

Recently, demand for multimedia data such as video has been increasing rapidly on the Internet. However, channel bandwidth is not growing fast enough to keep pace with the rapidly increasing amount of multimedia data, so the data must be compressed efficiently. The ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG) are studying video compression standards through joint research.

Meanwhile, when an image is encoded independently, encoding is generally performed independently for each division unit, such as a tile, so image data of other division units that are temporally or spatially adjacent cannot be referred to.

Accordingly, there is a need for a method of referring to adjacent image data while maintaining parallel processing by independent encoding as in the conventional art.

In addition, intra-picture prediction according to the conventional image encoding/decoding method constructs reference pixels from the pixels closest to the current block, but depending on the type of image, using only the nearest pixels as reference pixels may not be desirable.

Accordingly, there is a need for a method of improving intra-picture prediction efficiency by configuring reference pixels differently from conventional ones.

The present invention has been made to solve the above problems, and an object of the present invention is to provide an image decoding method and apparatus using a division unit including an additional region.

It is another object of the present invention to provide a method and apparatus for encoding an image using a division unit including an additional region.

In order to solve the above problems, an object of the present invention is to provide a video decoding method supporting a plurality of reference pixel layers.

It is another object of the present invention to provide an image decoding apparatus supporting a plurality of reference pixel layers.

According to an aspect of the present invention, there is provided an image decoding method using a division unit including an additional region.

Here, the image decoding method using a division unit including an additional region may include dividing an encoded image included in a received bitstream into at least one division unit by referring to a syntax element obtained from the bitstream, setting an additional region for the at least one division unit, and decoding the encoded image based on the division unit for which the additional region is set.

The step of decoding the encoded image may include determining a reference block for a current block to be decoded in the encoded image according to information, included in the bitstream, that indicates whether reference is possible.

Here, the reference block may be a block belonging to a position overlapping an additional region set in a division unit to which the reference block belongs.

According to another aspect of the present invention, there is provided an image decoding method supporting a plurality of reference pixel layers.

The image decoding method supporting a plurality of reference pixel layers may include the steps of: checking, in a bitstream, whether a plurality of reference pixel layers are supported; if a plurality of reference pixel layers are supported, determining a reference pixel layer to be used for a current block by referring to syntax information included in the bitstream; constructing reference pixels using pixels belonging to the determined reference pixel layer; and performing intra prediction on the current block using the constructed reference pixels.

The method may further include confirming whether or not the adaptive reference pixel filtering method is supported in the bitstream, after confirming whether the plurality of reference pixel layers are supported.

If the plurality of reference pixel layers are not supported after the step of checking whether the plurality of reference pixel layers are supported, the step of configuring reference pixels using the predetermined reference pixel layer may be included.

When the image decoding method and apparatus using a division unit including an additional region according to the present invention as described above are used, image compression efficiency can be improved because more image data is available for reference.

When the image decoding method and apparatus supporting a plurality of reference pixel layers according to the present invention as described above are used, intra prediction accuracy can be increased because a plurality of reference pixels are used.

In addition, according to the present invention, since adaptive reference pixel filtering is supported, optimum reference pixel filtering can be performed according to characteristics of an image.

In addition, there is an advantage that the compression efficiency of image encoding/decoding can be increased.

FIG. 1 is a conceptual diagram of an image encoding and decoding system according to an embodiment of the present invention.

FIG. 2 is a block diagram of an image encoding apparatus according to an embodiment of the present invention.

FIG. 3 is a block diagram of an image decoding apparatus according to an embodiment of the present invention.

FIGS. 4A to 4D are conceptual diagrams illustrating a projection format according to an embodiment of the present invention.

FIGS. 5A to 5C are conceptual diagrams for explaining the surface arrangement according to an embodiment of the present invention.

FIGS. 6A and 6B are exemplary diagrams for explaining a partition according to an embodiment of the present invention.

FIG. 7 is an example of dividing one picture into a plurality of tiles.

FIGS. 8A to 8I are first exemplary views for setting an additional area for each tile according to FIG. 7.

FIGS. 9A to 9I are second exemplary views for setting an additional area for each tile according to FIG. 7.

FIG. 10 is an exemplary diagram illustrating the use of an additional area, generated according to an embodiment of the present invention, in the encoding/decoding process of another area.

FIGS. 11 and 12 are diagrams for explaining an encoding/decoding method for a division unit according to an embodiment of the present invention.

FIGS. 13A to 13G are exemplary diagrams for explaining an area that a specific division unit can refer to.

FIGS. 14A to 14E are exemplary diagrams for explaining the possibility of reference to an additional region in a division unit according to an embodiment of the present invention.

FIG. 15 is an example of a block belonging to a division unit of a current image and a block belonging to a division unit of another image.

FIG. 16 is a hardware block diagram of an image encoding/decoding apparatus according to an embodiment of the present invention.

FIG. 17 is an exemplary view illustrating intra prediction modes according to an embodiment of the present invention.

FIG. 18 is a first exemplary view of a reference pixel configuration used in intra prediction according to an embodiment of the present invention.

FIGS. 19A to 19C are second exemplary views of a reference pixel configuration according to an embodiment of the present invention.

FIG. 20 is a third exemplary view of a reference pixel configuration according to an embodiment of the present invention.

FIG. 21 is a fourth exemplary view of a reference pixel configuration according to an embodiment of the present invention.

FIGS. 22A and 22B are exemplary diagrams illustrating a method of filling a reference pixel at a preset position in an unusable reference candidate block.

FIGS. 23A to 23C are views illustrating a method of performing interpolation in fractional pixel units on reference pixels constructed according to an embodiment of the present invention.

FIGS. 24A and 24B are first exemplary views for explaining an adaptive reference pixel filtering method according to an embodiment of the present invention.

FIG. 25 is a second exemplary view for explaining an adaptive reference pixel filtering method according to an embodiment of the present invention.

FIGS. 26A and 26B are views illustrating an example of using one reference pixel layer in reference pixel filtering according to an embodiment of the present invention.

FIG. 27 is a diagram illustrating an example of using a plurality of reference pixel layers in reference pixel filtering according to an embodiment of the present invention.

FIG. 28 is a block diagram for explaining an encoding/decoding method for an intra prediction mode according to an embodiment of the present invention.

FIG. 29 is a first exemplary diagram for explaining a bitstream configuration for intra prediction according to the reference pixel configuration.

FIG. 30 is a second exemplary diagram for explaining a bitstream configuration for intra prediction according to the reference pixel configuration.

FIG. 31 is a third exemplary diagram for explaining a bitstream configuration for intra prediction according to the reference pixel configuration.

FIG. 32 is a flowchart illustrating an image decoding method supporting a plurality of reference pixel layers according to an embodiment of the present invention.

While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the invention is not intended to be limited to the particular embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention. Like reference numerals are used for like elements in describing each drawing.

The terms first, second, A, B, etc. may be used to describe various elements, but the elements should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as a second component, and similarly, the second component may also be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of a plurality of related listed items.

It is to be understood that when an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or other elements may be present in between. On the other hand, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that there are no other elements in between.

The terminology used in this application is used only to describe specific embodiments and is not intended to limit the invention. Singular expressions include plural expressions unless the context clearly dictates otherwise. In the present application, terms such as "comprises" or "having" are used to specify the presence of a feature, number, step, operation, element, component, or combination thereof described in the specification, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an ideal or overly formal sense unless explicitly so defined in the present application.

In general, a video may be composed of a series of still images. The still images may be grouped into groups of pictures (GOPs), and each still image may be referred to as a picture or a frame. As higher-level concepts, units such as a GOP and a sequence may exist, and each picture may be divided into predetermined areas such as slices, tiles, and blocks. In addition, one GOP may include units such as I pictures, P pictures, and B pictures. An I picture is a picture that is encoded/decoded by itself without using a reference picture, while P and B pictures are pictures that are encoded/decoded using a reference picture through processes such as motion estimation and motion compensation. In general, a P picture can use I pictures and P pictures as reference pictures, and a B picture can use I pictures and P pictures as reference pictures, but this definition may be changed by the encoding/decoding setting.

Here, a picture referred to in encoding / decoding is referred to as a reference picture, and a block or pixel to be referred to is referred to as a reference block and a reference pixel. The reference data may be not only the pixel values of the spatial domain but also the coefficient values of the frequency domain and various encoding / decoding information generated and determined during the encoding / decoding process.

The minimum unit constituting an image is a pixel, and the number of bits used to represent one pixel is called the bit depth. In general, the bit depth is 8 bits, and other bit depths may be supported depending on the encoding setting. At least one bit depth may be supported for each color space. An image may also consist of at least one color space according to its color format. Depending on the color format, it may consist of one or more pictures of a fixed size or one or more pictures of different sizes. For example, YCbCr 4:2:0 consists of one luminance component (Y in this example) and two chrominance components (Cb/Cr in this example), and the composition ratio of the chrominance components to the luminance component may be 1:2 both horizontally and vertically. As another example, 4:4:4 has the same composition ratio horizontally and vertically. When an image consists of more than one color space as in the above examples, the picture may be divided for each color space.
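
As an illustration of the composition ratios described above, the following minimal sketch derives the chrominance plane size from the luminance plane size. The function name, the table, and the round-up rule for odd dimensions are assumptions made for this example and are not taken from the specification.

```python
# Hypothetical helper (not part of the specification): horizontal and vertical
# divisors of the chrominance planes relative to the luminance plane.
SUBSAMPLING = {
    "4:2:0": (2, 2),  # half width, half height
    "4:2:2": (2, 1),  # half width, full height
    "4:4:4": (1, 1),  # same size as the luminance plane
}

def chroma_plane_size(luma_width, luma_height, color_format):
    """Return (width, height) of each chrominance plane."""
    div_w, div_h = SUBSAMPLING[color_format]
    # Assumption: odd luminance dimensions are rounded up.
    return ((luma_width + div_w - 1) // div_w,
            (luma_height + div_h - 1) // div_h)

print(chroma_plane_size(1920, 1080, "4:2:0"))
```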

In the present invention, the description is based on some color space (Y in this example) of some color format (YCbCr in this example), and the same or a similar application (a setting dependent on a specific color space) can be made to the other color spaces of the color format (Cb and Cr in this example). However, partial differences (settings independent of a specific color space) may also be placed in each color space. That is, a setting dependent on each color space means a setting that is proportional to or dependent on the composition ratio of each component (for example, as determined by 4:2:0, 4:2:2, or 4:4:4), and a setting independent of each color space means a setting for only the corresponding color space, regardless of or independently of the composition ratio of each component. In the present invention, depending on the encoder/decoder, some configurations may have independent settings and others may have dependent settings.

The setting information or syntax elements required in the image encoding process can be determined at a unit level such as a video, a sequence, a picture, a slice, a tile, or a block. They can be included in the bitstream in units such as a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), a slice header, a tile header, or a block header, and transmitted to the decoder. The decoder parses the setting information transmitted from the encoder at the same unit level, restores it, and uses it in the image decoding process. In addition, related information can be transmitted and parsed in the bitstream in the form of supplemental enhancement information (SEI) or metadata. Each parameter set has a unique ID value, and a lower parameter set can hold the ID value of the upper parameter set to be referred to. For example, a lower parameter set may refer to the information of the upper parameter set whose ID value matches, among one or more upper parameter sets. When any one of the various units mentioned above includes one or more other units, the including unit may be referred to as an upper unit and the included unit as a lower unit.
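
The ID-based reference between parameter sets can be sketched as follows. This is a minimal sketch under stated assumptions: the class `ParameterSet`, the function `resolve_settings`, the setting names, and the rule that a lower set overrides same-named settings of the upper set are all hypothetical and only illustrate the lookup by matching ID.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ParameterSet:
    """Hypothetical parameter set: its own ID plus the ID of the upper set it refers to."""
    set_id: int
    parent_id: Optional[int] = None
    settings: dict = field(default_factory=dict)

def resolve_settings(lower, upper_sets):
    """Find the upper set whose ID matches lower.parent_id and merge settings.

    Assumption for this sketch: settings in the lower set take precedence.
    """
    for upper in upper_sets:
        if upper.set_id == lower.parent_id:
            return {**upper.settings, **lower.settings}
    raise KeyError(f"no upper parameter set with ID {lower.parent_id}")

# Two upper sets (for example, sequence level) and one lower set referring to ID 1.
upper_sets = [
    ParameterSet(0, settings={"bit_depth": 8}),
    ParameterSet(1, settings={"bit_depth": 10, "tiles_enabled": False}),
]
lower = ParameterSet(0, parent_id=1, settings={"tiles_enabled": True})
print(resolve_settings(lower, upper_sets))
```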

The setting information generated in a unit may contain independent settings for that unit, or settings that depend on a previous, subsequent, or upper unit. Here, a dependent setting can be understood as indicating the setting information of the unit with flag information that tells whether to follow the setting of the previous, subsequent, or upper unit (for example, a 1-bit flag: follow if 1, do not follow if 0). The setting information in the present invention is described mainly with examples of independent settings, but examples in which content is added to or replaced by a relation that depends on the setting information of a previous, subsequent, or upper unit of the current unit may also be included.

Hereinafter, preferred embodiments according to the present invention will be described in detail with reference to the accompanying drawings.

Referring to FIG. 1, the image encoding apparatus 105 and the decoding apparatus 100 may each be a user terminal such as a personal computer (PC), a notebook computer, a personal digital assistant (PDA), a portable multimedia player (PMP), a PlayStation Portable (PSP), a wireless communication terminal, a smart phone, or a TV, or a server terminal such as an application server or a service server. Each may include a communication device such as a communication modem for communicating with various devices or a wired/wireless communication network, a memory (120, 125) for storing various programs and data for inter or intra prediction for encoding or decoding an image, and a processor (110, 115) for executing the programs to perform computation and control. The image encoded into a bitstream by the image encoding apparatus 105 can be transmitted, in real time or non-real time, to the image decoding apparatus 100 through a wired or wireless communication network such as the Internet, a short-range wireless communication network, a wireless network, a WiBro network, or a mobile communication network, or through various communication interfaces such as a cable or a universal serial bus (USB), and is then decoded by the image decoding apparatus 100 and restored and reproduced as an image. In addition, the image encoded into a bitstream by the image encoding apparatus 105 can be transferred from the image encoding apparatus 105 to the image decoding apparatus 100 via a computer-readable recording medium.

Referring to FIG. 2, the image encoding apparatus 20 according to the present embodiment includes a predictor 200, a subtractor 205, a transformer 210, a quantizer 215, an inverse quantizer 220, an inverse transform unit 225, an adder 230, a filter unit 235, an encoding picture buffer 240, and an entropy encoding unit 245.

The prediction unit 200 may include an intra prediction unit for performing intra prediction and an inter prediction unit for performing inter prediction. Intra prediction generates a prediction block by performing spatial prediction using pixels of blocks adjacent to the current block. Inter prediction generates a prediction block by finding the area in a reference image that best matches the current block and performing motion compensation. Whether to use intra prediction or inter prediction is determined for the corresponding unit (coding unit or prediction unit), and the specific information for each prediction method (for example, intra prediction mode, motion vector, reference image) can be determined. The processing unit in which prediction is performed and the processing unit in which the prediction method and its specific contents are determined can be set according to the encoding/decoding setting. For example, the prediction method and prediction mode may be determined per prediction unit, while the prediction itself is performed per transform unit.

The intra prediction unit may use directional prediction modes, such as the horizontal and vertical modes, which are applied according to a prediction direction, and non-directional prediction modes, such as DC and Planar, which use the average or interpolation of reference pixels. The directional and non-directional modes together form the intra prediction mode candidate group, and one of various candidates such as 35 prediction modes (33 directional + 2 non-directional), 67 prediction modes (65 directional + 2 non-directional), or 131 prediction modes (129 directional + 2 non-directional) can be used as the candidate group.

The intra prediction unit may include a reference pixel unit, a reference pixel filter unit, a reference pixel interpolation unit, a prediction mode determination unit, a prediction block generation unit, and a prediction mode coding unit. The reference pixel unit may configure pixels that belong to blocks neighboring the current block and are adjacent to the current block as reference pixels for intra prediction. Depending on the encoding setting, the single nearest reference pixel line, another single adjacent reference pixel line, or a plurality of reference pixel lines may be configured as reference pixels. If some of the reference pixels are not available, they may be generated from the available reference pixels, and if none of the reference pixels is available, they may be generated using a predetermined value (e.g., the median of the pixel value range).
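
The substitution of unavailable reference pixels can be sketched as follows. This is only a sketch under assumptions: the text does not specify how available pixels are used to fill unavailable ones, so the example copies the most recent available pixel in scan order (and the first available pixel for any leading gap). The function name and the use of `None` for an unavailable pixel are hypothetical.

```python
def build_reference_pixels(samples, bit_depth=8):
    """Fill unavailable reference pixels (marked as None).

    - If no pixel is available, use the median of the pixel value range,
      i.e. 1 << (bit_depth - 1), for every position.
    - Otherwise copy the nearest previously seen available pixel; positions
      before the first available pixel take that first pixel's value.
      (This propagation rule is an assumption made for the sketch.)
    """
    if all(s is None for s in samples):
        return [1 << (bit_depth - 1)] * len(samples)
    last = next(s for s in samples if s is not None)
    filled = []
    for s in samples:
        if s is not None:
            last = s
        filled.append(last)
    return filled

print(build_reference_pixels([None, 10, None, 20]))
print(build_reference_pixels([None, None, None], bit_depth=10))
```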

The reference pixel filter unit of the intra prediction unit can filter the reference pixels in order to reduce the degradation remaining from the encoding process. The filters used may be low-pass filters such as a 3-tap filter [1/4, 1/2, 1/4] or a 5-tap filter [2/16, 3/16, 6/16, 3/16, 2/16]. Whether filtering is applied and the type of filter can be determined according to the encoding information (e.g., the size, shape, and prediction mode of the block).
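
A minimal sketch of the 3-tap [1/4, 1/2, 1/4] low-pass filter applied to one line of reference pixels is shown below. The integer rounding offset and the choice to leave the two end pixels unfiltered are assumptions of this sketch; the function name is hypothetical.

```python
def filter_reference_pixels_3tap(pixels):
    """Apply the [1/4, 1/2, 1/4] filter to the interior pixels of a line.

    Integer form: (left + 2 * center + right + 2) >> 2, where +2 rounds to
    nearest. End pixels are copied unchanged (assumption for this sketch).
    """
    if len(pixels) < 3:
        return list(pixels)
    out = [pixels[0]]
    for i in range(1, len(pixels) - 1):
        out.append((pixels[i - 1] + 2 * pixels[i] + pixels[i + 1] + 2) >> 2)
    out.append(pixels[-1])
    return out

print(filter_reference_pixels_3tap([0, 0, 100, 0, 0]))
```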

The reference pixel interpolation unit of the intra prediction unit may generate pixels at fractional positions through linear interpolation of the reference pixels according to the prediction mode, and the interpolation filter to be applied may be determined according to the encoding information. The interpolation filters used may include a 4-tap Cubic filter, a 4-tap Gaussian filter, a 6-tap Wiener filter, an 8-tap Kalman filter, and the like. Interpolation is generally performed separately from the low-pass filtering process, but the filters applied in the two processes may also be combined into a single filtering step.
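
Linear interpolation between two neighboring reference pixels can be sketched as follows. The 1/32-pixel precision is an assumption chosen for the example (the text does not fix the precision), and the function name is hypothetical.

```python
PRECISION_BITS = 5            # assumption: positions are in 1/32-pixel units
SCALE = 1 << PRECISION_BITS   # 32

def interpolate_reference(left, right, frac):
    """Linearly interpolate between two integer-position reference pixels.

    frac is the fractional offset from `left` in 1/32 units (0..31).
    The added SCALE // 2 rounds the result to the nearest integer.
    """
    if not 0 <= frac < SCALE:
        raise ValueError("frac must be in the range 0..31")
    return ((SCALE - frac) * left + frac * right + SCALE // 2) >> PRECISION_BITS

print(interpolate_reference(100, 132, 8))
```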

The prediction mode determination unit of the intra prediction unit can select the optimal prediction mode from the prediction mode candidate group in consideration of the coding cost, and the prediction block generation unit can generate a prediction block using that prediction mode. The prediction mode coding unit may encode the optimal prediction mode based on a predicted value of the mode. The prediction information can then be encoded adaptively according to whether the predicted value is correct.

In the intra prediction unit, the predicted value is referred to as the MPM (Most Probable Mode), and some of the modes belonging to the prediction mode candidate group may be configured as the MPM candidate group. The MPM candidate group may include predetermined prediction modes (e.g., DC, Planar, vertical, horizontal, and diagonal modes) or the prediction modes of spatially adjacent blocks (e.g., the left, top, top-left, and top-right blocks, etc.). In addition, a mode derived from a mode already included in the MPM candidate group (for a directional mode, a difference of +1 or -1) can be added to the MPM candidate group.

There may be a priority among prediction modes for configuring the MPM candidate group. The order of inclusion in the MPM candidate group is determined by the priority, and the configuration is complete once the number of MPM candidates (determined according to the number of prediction mode candidates) has been filled. The priority may be, in order, the prediction modes of spatially adjacent blocks, the predetermined prediction modes, and the modes derived from prediction modes already included in the MPM candidate group, but other variations are also possible.

For example, the modes of spatially adjacent blocks may be included in the candidate group in the order of the left, upper, lower-left, upper-right, and upper-left blocks, then the predetermined prediction modes in the order DC, Planar, vertical, horizontal, and then modes obtained by adding +1, -1, etc. to the modes already included, so that a total of six modes form the candidate group. Alternatively, the candidates may be included under a single priority such as left, upper, DC, Planar, lower-left, upper-right, upper-left, (left + 1), (left - 1), (upper + 1), and so on.

A validity check may be performed during candidate group configuration. If a candidate is valid, it is included in the candidate group; if not, the process moves on to the next candidate. A candidate may be invalid if the neighboring block is located outside the picture, belongs to a different division unit from the current block, or was coded with inter-picture prediction.
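
The MPM candidate group configuration with priority and validity checking can be sketched as follows. This is a minimal sketch under assumptions: the mode numbering (Planar = 0, DC = 1, directional modes 2 to 66, horizontal = 18, vertical = 50) is chosen for the example and is not given in the text, an invalid neighbor is represented by `None`, and the function name is hypothetical. The order follows the first example above: neighboring blocks (left, upper, lower-left, upper-right, upper-left), then DC, Planar, vertical, horizontal, then +1/-1 of directional modes already included, up to six candidates.

```python
# Assumed mode numbering for a 67-mode candidate group (not from the text).
PLANAR, DC, HORIZONTAL, VERTICAL = 0, 1, 18, 50
FIRST_DIRECTIONAL, LAST_DIRECTIONAL = 2, 66

def build_mpm_list(neighbor_modes, size=6):
    """Build an MPM candidate list.

    neighbor_modes: modes of the left, upper, lower-left, upper-right and
    upper-left blocks, in that order; None marks an invalid neighbor (outside
    the picture, other division unit, or inter-coded).
    """
    mpm = []

    def push(mode):
        # Skip invalid or duplicate candidates; stop once the list is full.
        if mode is not None and mode not in mpm and len(mpm) < size:
            mpm.append(mode)

    for mode in neighbor_modes:                      # 1) spatial neighbors
        push(mode)
    for mode in (DC, PLANAR, VERTICAL, HORIZONTAL):  # 2) predetermined modes
        push(mode)
    for mode in list(mpm):                           # 3) derived +1 / -1 modes
        if FIRST_DIRECTIONAL <= mode <= LAST_DIRECTIONAL:
            for derived in (mode + 1, mode - 1):
                if FIRST_DIRECTIONAL <= derived <= LAST_DIRECTIONAL:
                    push(derived)
    return mpm

print(build_mpm_list([None, 30, None, None, 30]))
```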

A spatially adjacent block among the candidates may consist of a single block, but it may also consist of several blocks (sub-blocks). Therefore, for the left block in a (left, upper) order of the candidate group configuration, the validity check may be performed at one position (for example, the bottom sub-block of the left block) before moving on to the upper block, or at several positions (for example, one or more sub-blocks starting from the top sub-block of the left block and proceeding downward) before moving on to the upper block; which approach is used can be determined according to the encoding setting.

The inter-picture prediction unit can be classified into a translational motion model and a non-translational motion model according to the motion prediction method. With the translational motion model, prediction considers only parallel movement; with the non-translational motion model, prediction can consider motion such as rotation, perspective, and zoom in/out as well as parallel movement. Assuming unidirectional prediction, one motion vector may be needed for the translational motion model, but more than one may be needed for the non-translational motion model. In the non-translational motion model, each motion vector may be information applied at a predetermined position of the current block, such as its upper-left and upper-right vertices, and from these motion vectors the position of the prediction block may be obtained on a pixel-by-pixel or sub-block-by-sub-block basis. Depending on the motion model, some of the processes of the inter-picture prediction unit described below may be applied in common and others individually.

The inter picture prediction unit may include a reference picture construction unit, a motion estimation unit, a motion compensation unit, a motion information determination unit, and a motion information coding unit. The reference picture constructing unit may include the pictures coded before or after the current picture in the reference picture lists L0 and L1. The prediction block may be obtained from the reference picture included in the reference picture list. The current picture may also be composed of the reference picture and be included in at least one of the reference picture list according to the coding setting.

In the inter-picture prediction unit, the reference picture construction unit may include a reference picture interpolation unit, which may perform interpolation for fractional pixels according to the interpolation precision. For example, an 8-tap DCT-based interpolation filter may be applied for luminance components, and a 4-tap DCT-based interpolation filter may be applied for chrominance components.

In the inter-picture prediction unit, the motion estimation unit searches a reference picture for a block having a high correlation with the current block. Various methods such as a full search-based block matching algorithm (FBMA) and a three step search (TSS) may be used for this. The motion compensation unit obtains a prediction block based on the result of the motion estimation process.
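
A full-search block matching step can be sketched as follows. The sum of absolute differences (SAD) as the matching cost, the function names, and the rule of skipping candidate positions that fall outside the reference picture are assumptions of this sketch; the text names the full-search approach but does not define these details.

```python
def sad(block, ref, y, x):
    """Sum of absolute differences between `block` and the same-sized area
    of `ref` whose top-left corner is at (y, x)."""
    return sum(abs(block[i][j] - ref[y + i][x + j])
               for i in range(len(block))
               for j in range(len(block[0])))

def full_search(block, ref, top, left, search_range):
    """Test every displacement within +/- search_range around (top, left).

    Returns (cost, dy, dx) for the lowest-cost displacement, or None if no
    candidate position lies inside the reference picture.
    """
    height, width = len(block), len(block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + height > len(ref) or x + width > len(ref[0]):
                continue  # candidate area is outside the reference picture
            cost = sad(block, ref, y, x)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best

# Reference picture with a distinct value at every position.
reference = [[10 * r + c for c in range(6)] for r in range(6)]
current_block = [[32, 33], [42, 43]]  # equals the reference area at row 3, column 2
print(full_search(current_block, reference, top=2, left=2, search_range=2))
```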

In the inter-picture prediction unit, the motion information determination unit may select the optimal motion information for the current block. The motion information may be encoded using a motion information encoding mode such as a skip mode, a merge mode, or a competition mode. The modes may be configured by combining the supported modes with the motion models; examples include skip mode (translational), skip mode (non-translational), merge mode (translational), merge mode (non-translational), competition mode (translational), and competition mode (non-translational). Depending on the encoding setting, some of these modes may be included in the candidate group.

The motion information encoding mode obtains a predicted value of the motion information (motion vector, reference picture, prediction direction, and the like) of the current block from at least one candidate block, and when two or more candidate blocks are supported, candidate selection information may be generated. The skip mode (no residual signal) and the merge mode (residual signal present) use the predicted value directly as the motion information of the current block, while the competition mode generates difference information between the motion information of the current block and the predicted value.
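
The difference between using the predicted value as is and coding a difference against it can be sketched for motion vectors as follows. This is only a sketch: the function names are hypothetical, and choosing the candidate by the smallest absolute difference is an assumption made for the example (an actual encoder would normally weigh the bit cost as well).

```python
def encode_competition(mv, candidates):
    """Competition mode sketch: pick a predictor and code the difference.

    Returns (candidate_index, (dx, dy)). The index plays the role of the
    candidate selection information; the pair is the difference value.
    """
    def cost(pred):
        return abs(mv[0] - pred[0]) + abs(mv[1] - pred[1])

    index = min(range(len(candidates)), key=lambda i: cost(candidates[i]))
    pred = candidates[index]
    return index, (mv[0] - pred[0], mv[1] - pred[1])

def decode_competition(index, mvd, candidates):
    """Rebuild the motion vector from the selected predictor and difference."""
    pred = candidates[index]
    return (pred[0] + mvd[0], pred[1] + mvd[1])

def decode_merge(index, candidates):
    """Skip/merge mode sketch: the predicted value is used directly."""
    return candidates[index]

candidates = [(0, 0), (4, -2)]
index, mvd = encode_competition((5, -3), candidates)
print(index, mvd, decode_competition(index, mvd, candidates))
```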

The candidate group for the motion information predicted value of the current block may be configured adaptively and in various ways according to the motion information encoding mode. Motion information of blocks spatially adjacent to the current block (e.g., the left, upper, upper-left, upper-right, and lower-left blocks) may be included in the candidate group; motion information of temporally adjacent blocks (e.g., the block in another image corresponding to the current block <center> and its left, right, upper, lower, upper-left, upper-right, lower-left, and lower-right blocks) may be included in the candidate group; and mixed motion information of spatial and temporal candidates (e.g., information obtained from the motion information of spatially adjacent blocks and the motion information of temporally adjacent blocks, in units of the current block or of sub-blocks of the current block) may be included in the candidate group.

There may be a priority order for constructing the motion information prediction value candidate group. Candidates may be added to the group in that priority order, and construction may be completed once the number of candidates (determined according to the motion information encoding mode) has been filled. Here, priority may be given in the order of motion information of spatially adjacent blocks, motion information of temporally adjacent blocks, and mixed motion information of spatial and temporal candidates, but other variations are also possible.

For example, among the spatially adjacent blocks, the candidate blocks may be included in the order of the left-upper-right-upper-left-lower-left block, and the right-lower-left- .

A validity check may be performed when configuring the candidate group. If a candidate is valid, it is included in the candidate group; if it is invalid, the process moves on to the next candidate. A candidate may be invalid if the neighboring block is located outside the picture, belongs to a division unit different from that of the current block, or was encoded using intra prediction.
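The priority-ordered construction with a validity check can be sketched as follows. This is a minimal illustration, not the method of the present invention: the `Neighbor` record, its field names, and the representation of motion information as a tuple are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Neighbor:
    """Hypothetical description of one candidate block."""
    name: str
    inside_picture: bool
    same_division_unit: bool
    is_intra: bool
    motion: Optional[Tuple[int, int, int]] = None  # assumed (mv_x, mv_y, ref_idx)


def is_valid(nb):
    # Invalid if outside the picture, in another division unit, or intra coded.
    return nb.inside_picture and nb.same_division_unit and not nb.is_intra


def build_candidates(neighbors_in_priority_order, max_candidates):
    """Fill the candidate group in priority order, skipping invalid blocks."""
    candidates = []
    for nb in neighbors_in_priority_order:
        if len(candidates) == max_candidates:
            break  # the candidate group is complete
        if is_valid(nb):
            candidates.append(nb.motion)
    return candidates
```

The caller supplies the neighbors already sorted by priority, so the same routine can serve spatial, temporal, or mixed candidates.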

Among the candidates, a spatially or temporally adjacent block may consist of a single block or of several blocks (sub-blocks). Therefore, for the left block in the above spatial candidate order (left, upper), the process may move on to the upper block after the validity of a single position (for example, the bottom sub-block of the left block) is checked, or after the validity of several positions (for example, one or more sub-blocks proceeding downward from the top sub-block of the left block) is checked. Likewise, for the center block in the temporal candidate order (center, right), the process may move on to the next block after the validity of a single position (for example, one sub-block when the center block is divided into 4 × 4 regions) is checked, or after the validity of several sub-block positions is checked; which approach is used may be determined according to the encoding setting.

The subtraction unit 205 subtracts the prediction block from the current block to generate a residual block. That is, the subtraction unit 205 calculates the difference between the pixel value of each pixel of the current block to be encoded and the predicted pixel value of the corresponding pixel of the prediction block generated by the prediction unit, thereby generating a residual block, which is a residual signal in block form.

The transform unit 210 transforms the residual block into the frequency domain, converting each pixel value of the residual block into a frequency coefficient. Here, the transform unit 210 may use various transform techniques that convert a spatial-domain image signal into the frequency domain, such as the Hadamard transform, DCT-based transforms, DST-based transforms, and KLT-based transforms. The residual signal transformed into the frequency domain becomes frequency coefficients. The transform may be performed with one-dimensional transform matrices, and each transform matrix may be used adaptively in the horizontal and vertical directions. For example, in intra prediction, if the prediction mode is horizontal, a DCT-based transform matrix may be used in the vertical direction and a DST-based transform matrix in the horizontal direction; if the prediction mode is vertical, a DCT-based transform matrix may be used in the horizontal direction and a DST-based transform matrix in the vertical direction.
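The direction-dependent choice of transform matrix in the example above can be written as a small lookup. The function name and the string labels are hypothetical, and the fallback to DCT in both directions for other prediction modes is an assumption of this sketch, not something the description specifies.

```python
def select_transforms(intra_mode):
    """Return (horizontal_transform, vertical_transform) for an intra mode."""
    if intra_mode == "horizontal":
        # Horizontal prediction: DST horizontally, DCT vertically.
        return ("DST", "DCT")
    if intra_mode == "vertical":
        # Vertical prediction: DCT horizontally, DST vertically.
        return ("DCT", "DST")
    # Assumption: every other mode uses DCT in both directions.
    return ("DCT", "DCT")
```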

The quantization unit 215 quantizes the residual block whose frequency coefficients were obtained by the transform unit 210. Here, the quantization unit 215 may quantize the transformed residual block using dead zone uniform threshold quantization, a quantization weighted matrix, or an improved variant of these techniques. One or more quantization schemes may be provided as candidates, and the scheme to be used may be determined by the encoding mode, prediction mode information, and the like.
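As a minimal sketch of dead zone uniform threshold quantization, the following uses a rounding offset smaller than 0.5 so that small coefficients fall into a widened zero bin. The function names, the default offset of 1/3, and the scalar interface are assumptions chosen for illustration.

```python
import math


def deadzone_quantize(coeff, step, rounding=1 / 3):
    """Quantize one coefficient; rounding < 0.5 widens the zero bin."""
    sign = -1 if coeff < 0 else 1
    return sign * math.floor(abs(coeff) / step + rounding)


def dequantize(level, step):
    """Inverse quantization: scale the level back by the step size."""
    return level * step
```

With a step of 10, a coefficient of 5 is quantized to level 0 here, whereas plain rounding to the nearest level would give 1.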

The entropy encoding unit 245 scans the generated quantized frequency coefficients according to one of various scanning methods to generate a quantized coefficient sequence, and encodes and outputs the sequence using an entropy encoding technique or the like. The scan pattern may be set to one of various patterns such as zigzag, diagonal, and raster. In addition, encoded data including the encoding information transmitted from each component may be generated and output as a bitstream.
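A zigzag scan over a square block of quantized coefficients can be sketched as follows. The helper names `zigzag_order` and `scan` are hypothetical, and the traversal direction follows the commonly used zigzag convention rather than anything fixed by the description.

```python
def zigzag_order(n):
    """Return the (row, col) visiting order of a zigzag scan on an n x n block."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even anti-diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(diag[::-1] if s % 2 == 0 else diag)
    return order


def scan(block, order):
    """Flatten a 2-D block into a coefficient sequence following the given order."""
    return [block[r][c] for r, c in order]


def raster_order(n):
    """Row-by-row order, for comparison with the zigzag pattern."""
    return [(r, c) for r in range(n) for c in range(n)]
```

Passing `raster_order(n)` instead of `zigzag_order(n)` to `scan` switches the pattern without changing the rest of the pipeline.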

The inverse quantization unit 220 inversely quantizes the residual block quantized by the quantization unit 215. That is, the inverse quantization unit 220 inversely quantizes the quantized frequency coefficient sequence to generate a residual block having frequency coefficients.

The inverse transform unit 225 inversely transforms the residual block inversely quantized by the inverse quantization unit 220. That is, the inverse transform unit 225 inversely transforms the frequency coefficients of the inversely quantized residual block to generate a residual block having pixel values, that is, a reconstructed residual block. Here, the inverse transform unit 225 may perform the inverse transform by applying, in reverse, the transform used by the transform unit 210.

The adder 230 restores the current block by adding the prediction block generated by the prediction unit 200 and the residual block reconstructed by the inverse transform unit 225. The reconstructed current block is stored in the encoded picture buffer 240 as a reference picture (or reference block) and may be referenced when encoding the next block after the current block, another block, or another picture.

The filter unit 235 may include one or more post-processing filter processes such as a deblocking filter, a sample adaptive offset (SAO), and an adaptive loop filter (ALF). The deblocking filter can remove block distortion occurring at the boundary between the blocks in the reconstructed picture. The ALF can perform filtering based on a comparison between the reconstructed image and the original image after the block is filtered through the deblocking filter. The SAO can recover the offset difference from the original image on a pixel-by-pixel basis with respect to the residual block to which the deblocking filter is applied. Such a post-processing filter may be applied to the restored picture or block.

The deblocking filter in the filter unit may be applied to pixels in a few columns or rows of the two blocks on either side of a block boundary. The filter may be applied to the boundaries of encoding blocks, prediction blocks, and transform blocks, and may be limited to blocks of at least a predetermined minimum size (for example, 8 × 8).

Whether filtering is applied, and with what strength, may be determined in consideration of the characteristics of the block boundary, and the strength may be selected from candidates such as strong filtering, intermediate filtering, and weak filtering. In addition, when the block boundary corresponds to the boundary of a division unit, whether to apply the filter may be determined according to the in-loop filter application flag for division unit boundaries, and may also be determined according to the various cases described later in the present invention.

In the filter unit, SAO may be applied based on the difference between the reconstructed image and the original image. Edge offset and band offset may be supported as offset types, and filtering may be performed by selecting one of them according to the image characteristics. In addition, the offset-related information may be encoded on a block-by-block basis and may be encoded using a prediction value. In this case, the related information may be encoded adaptively depending on whether the prediction value matches. The prediction value may be the offset information of an adjacent block (e.g., the left, upper, upper-left, or upper-right block), and selection information indicating which block's offset information is used may be generated.

A validity check may be performed when configuring the candidate group. If a candidate is valid, it is included in the candidate group; if it is not valid, the process moves on to the next candidate. A candidate may be invalid if the neighboring block is located outside the picture or belongs to a division unit different from that of the current block, and also in the non-referenceable cases described later in the present invention.

The encoded picture buffer 240 may store a block or picture restored through the filter unit 235. The reconstructed block or picture stored in the encoded picture buffer 240 may be provided to the prediction unit 200, which performs intra-picture prediction or inter-picture prediction.

Referring to FIG. 3, the image decoding apparatus 30 may include an entropy decoding unit 305, a prediction unit 310, an inverse quantization unit 315, an inverse transform unit 320, an adder/subtracter 325, a filter 330, and a decoded picture buffer 335.

In addition, the prediction unit 310 may include an intra prediction module and an inter prediction module.

First, when an image bitstream transmitted from the image encoding apparatus 20 is received, the image bitstream can be transmitted to the entropy decoding unit 305.

The entropy decoding unit 305 may decode the bitstream to obtain decoded data including the quantized coefficients and the decoding information to be transmitted to each component.

The prediction unit 310 may generate a prediction block based on the data transmitted from the entropy decoding unit 305. At this time, a reference picture list may be constructed using a default construction technique, based on the reference images stored in the decoded picture buffer 335.

The intra-picture prediction unit may include a reference pixel construction unit, a reference pixel filter unit, a reference pixel interpolation unit, a prediction block generation unit, and a prediction mode decoding unit. The inter-picture prediction unit may include a reference picture construction unit, a motion compensation unit, and the like. Some of these units may perform the same process as in the encoder, and some may perform the inverse process.

The inverse quantization unit 315 may inversely quantize the quantized transform coefficients that are provided in the bitstream and decoded by the entropy decoding unit 305.

The inverse transform unit 320 may apply inverse DCT, inverse integer transform, or similar inverse transformation techniques to the transform coefficients to generate residual blocks.

The inverse quantization unit 315 and the inverse transform unit 320 perform, in reverse, the processes performed by the transform unit 210 and the quantization unit 215 of the image encoding apparatus 20 described above, and may be implemented in various ways. For example, the same processes and inverse transforms shared with the transform unit 210 and the quantization unit 215 may be used, and the transform and quantization processes may be reversed using information about those processes (for example, transform size, transform shape, quantization type, etc.).

The residual block subjected to the inverse quantization and inverse transform process may be added to the prediction block derived by the prediction unit 310 to generate a reconstructed image block. This addition may be performed by the adder / subtracter 325.

The filter 330 may apply a deblocking filter to the reconstructed image block to remove blocking artifacts where necessary, and other loop filters may additionally be used before and after the decoding process to improve video quality.

The reconstructed and filtered image block may be stored in the decoded picture buffer 335.

Although not shown in the figure, the image decoding apparatus 30 may further include a division unit, wherein the division unit may include a picture division unit and a block division unit. The division is the same as or equivalent to that of the image encoding apparatus according to FIG. 2, and can be easily understood by a person skilled in the art, so a detailed description will be omitted.

FIG. 4A shows the Equirectangular Projection (ERP) format, in which a 360-degree image is projected onto a two-dimensional plane. FIG. 4B shows the CubeMap Projection (CMP) format, in which a 360-degree image is projected onto a cube. FIG. 4C shows the Octahedron Projection (OHP) format, in which a 360-degree image is projected onto an octahedron. FIG. 4D shows the Icosahedral Projection (ISP) format, in which a 360-degree image is projected onto an icosahedron. However, the projection format is not limited to these, and various other formats may be used, for example Truncated Square Pyramid Projection (TSP) and Segmented Sphere Projection (SSP). FIGS. 4A to 4D show examples in which a three-dimensional model is transformed into a two-dimensional space through a projection process. A figure projected in two dimensions may be composed of one or more surfaces (faces), and each surface may have a shape such as a circle, a triangle, or a square.

Referring to FIGS. 4A to 4D, a projection format may have one surface (e.g., ERP) or multiple surfaces (e.g., CMP, OHP, ISP). In addition, the surfaces may be classified by shape, such as rectangular or triangular. Such a classification may be one example of the image type, characteristics, and the like in the present invention, and may be applied when the encoding/decoding settings differ according to the projection format. For example, the type of image may be a 360-degree image, and a characteristic of the image may be one of the above distinctions (e.g., the individual projection format, or whether the projection format has one surface or a plurality of surfaces).

A two-dimensional plane coordinate system {e.g., (i, j)} may be defined on each surface of the two-dimensional projection image, and the characteristics of the coordinate system may vary depending on the projection format, the position of each surface, and the like. ERP may have one two-dimensional plane coordinate system, while other projection formats may have a plurality of two-dimensional plane coordinate systems according to the number of surfaces. In that case, the coordinate system may be expressed as (k, i, j), where k is the index of each surface.

In the present invention, for convenience of explanation, the description focuses on the case where the shape of each surface is a rectangle, and the number of surfaces projected in two dimensions may range from one (for example, Equirectangular Projection) to two or more (for example, Cube Map Projection).

In a projection format in which a three-dimensional image is projected in two dimensions, the arrangement of the surfaces needs to be determined. The surfaces may be arranged so as to maintain the image continuity of the three-dimensional space, or may be packed as closely as possible even if the image continuity between some adjacent surfaces is impaired. Further, when the surfaces are arranged, some surfaces may be rotated by a certain angle (0, 90, 180, 270 degrees, etc.).

Referring to FIG. 5A, examples of surface layouts for the CMP format are shown. When the surfaces are arranged to maintain image continuity in three-dimensional space, a 4x3 layout may be used in which four surfaces are arranged horizontally and one surface each is placed above and below, as shown in the left figure. Alternatively, as shown in the right figure, a 3x2 layout may be used in which the surfaces are arranged so that no empty space remains in the two-dimensional plane, even if image continuity between some adjacent surfaces is impaired.

Referring to FIG. 5B, surface arrangements for the OHP format are shown. When the surfaces are arranged to maintain image continuity in three-dimensional space, the arrangement in the upper figure is obtained. When the surfaces are arranged so that no empty space remains in the projected two-dimensional plane, even if image continuity is partially impaired, they may be arranged as shown in the lower figure.

Referring to FIG. 5C, surface arrangements for the ISP format are shown. The surfaces may be arranged to maintain image continuity in three-dimensional space, as shown in the upper part of FIG. 5C, or may be packed closely together so that no empty space remains between them.

Here, the process of closely adhering the surfaces so that there is no empty space can be referred to as frame packing. By rotating the surface, it is possible to reduce the degradation of image continuity as much as possible. Hereinafter, changing the above-described surface arrangement to another surface arrangement is referred to as surface rearrangement.

In the following, the term continuity can be interpreted to refer to the continuity of the visible scene in three-dimensional space, or the continuity of the actual image or scene in the two-dimensional space projected. The presence of continuity may be expressed as a high correlation between regions. In a typical two-dimensional image, the correlation between regions may be high or low, but a 360-degree image may have a region in which there is no continuity even though it is spatially adjacent. There may also be regions where there is continuity but not spatially contiguous due to prior surface placement or rearrangement.

Relocation of the surface can be performed for the purpose of improving the coding performance. For example, surface relocations can be performed so that surfaces with image continuity are disposed adjacent to each other.

Here, surface rearrangement does not necessarily mean that the surfaces are rearranged after an initial arrangement; it may also be understood as a process of setting a specific surface arrangement from the beginning. (This may be performed in the region-wise packing step of the 360-degree image encoding/decoding process.)

Also, surface arrangement or rearrangement may include rotation of a surface as well as repositioning of each surface (in this example, a simple movement of the surface, such as moving it from the upper left of the image to the lower left or lower right). The rotation of a surface may be expressed as 0 degrees when there is no rotation, 45 degrees to the right, 90 degrees to the left, and so on; alternatively, 360 degrees may be divided into k (or 2^k) sections, and the rotation angle may be indicated by selecting one of the sections.
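Indicating a rotation angle by selecting one of k equal sections of 360 degrees can be sketched as below. The function name and the convention that index 0 means no rotation are assumptions for illustration.

```python
def rotation_angle(section_index, k):
    """Map a selected section index to an angle when 360 degrees is split into k sections."""
    if not 0 <= section_index < k:
        raise ValueError("section index out of range")
    return section_index * 360 / k
```

For example, with k = 4 the selectable angles are 0, 90, 180, and 270 degrees, matching the rotation angles mentioned earlier for surface arrangement.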

The encoder/decoder may perform surface arrangement (or rearrangement) according to surface layout information (the shape of the surfaces, the number of surfaces, the position of each surface, the rotation angle of each surface, etc.) and/or surface relocation information (information indicating the position or movement angle of each surface, etc.). In addition, the encoder may generate the surface layout information and/or the surface relocation information according to the input image, and the decoder may perform the surface arrangement (or rearrangement) after receiving and decoding this information from the encoder.

In the following description, unless otherwise stated, the surfaces are assumed to follow the 3x2 layout of FIG. 5A. In this case, the surfaces may be numbered 0 to 5 in raster scan order starting from the upper left.

Hereinafter, the continuity between the surfaces according to FIG. 5A is presupposed. Unless otherwise noted, surfaces 0 to 2 are assumed to be continuous with one another, surfaces 3 to 5 are assumed to be continuous with one another, and surfaces 0 and 3, 1 and 4, and 2 and 5 are assumed to have no continuity. The presence or absence of continuity between surfaces may be confirmed from settings such as the characteristics, type, and format of the image.
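The raster-order numbering of the 3x2 layout, the (k, i, j) surface coordinates, and the assumed continuity relation can be sketched together. The function names are hypothetical, and treating i as the horizontal and j as the vertical coordinate within a surface is an assumption of this sketch.

```python
COLS, ROWS = 3, 2  # 3x2 layout; surfaces numbered 0..5 in raster scan order


def face_index(x, y, face_w, face_h):
    """Index k of the surface that contains picture pixel (x, y)."""
    return (y // face_h) * COLS + (x // face_w)


def to_face_coords(x, y, face_w, face_h):
    """Convert picture coordinates to (k, i, j); i horizontal, j vertical (assumed)."""
    return (face_index(x, y, face_w, face_h), x % face_w, y % face_h)


def faces_continuous(a, b):
    """Continuity as assumed in the text: surfaces in the same row are continuous,
    while vertically stacked pairs (0 and 3, 1 and 4, 2 and 5) are not."""
    return a // COLS == b // COLS
```

A check such as `faces_continuous` could, for instance, feed a validity test that decides whether a neighboring block across a surface boundary is a useful reference.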

According to the encoding/decoding process for a 360-degree image, the encoding apparatus may acquire an input image, preprocess the acquired image, encode the preprocessed image, and transmit the encoded bitstream to the decoding apparatus. The preprocessing may include image stitching, projection of the three-dimensional image onto a two-dimensional plane, surface arrangement and rearrangement (which may also be referred to as region-wise packing), and the like. The decoding apparatus may receive the bitstream, decode it, and perform post-processing (image rendering, etc.) on the decoded image to generate an output image.

At this time, information (SEI message or meta data) generated in the preprocessing process and information (video encoded data) generated in the encoding process can be recorded and transmitted in the bitstream.

The image encoding/decoding apparatus according to FIG. 2 or FIG. 3 may further include a division unit, which may include a picture division unit and a block division unit. The picture division unit may divide a picture into at least one processing unit (e.g., color space <YCbCr, RGB, XYZ, etc.>, sub-picture, slice, tile, basic encoding unit <or maximum encoding unit>, etc.), and the block division unit may divide a basic encoding unit into at least one processing unit (e.g., a unit for encoding, prediction, transform, quantization, entropy coding, in-loop filtering, etc.).

The basic encoding unit may be obtained by dividing a picture at regular intervals in the horizontal and vertical directions, and it may also serve as the unit on which sub-pictures, tiles, slices, surfaces, and the like are based. That is, such a unit may be configured as an integer multiple of the basic encoding unit, but is not limited thereto.

For example, the size of the basic encoding unit may be applied differently to some division units (a tile, a sub-picture, etc. in this example), and each such division unit may use its own basic encoding unit size. That is, the basic encoding unit of a division unit may be the same as or different from the basic encoding unit of the picture and the basic encoding unit of another division unit.

For convenience of explanation, the basic encoding unit and other processing units (encoding, prediction, conversion, and the like) are referred to as a block in the present invention.

The size or shape of a block may be an N × N square whose horizontal and vertical lengths are expressed as a power of two (2^n × 2^n: 256 × 256, 128 × 128, 64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4, etc., where n is an integer between 2 and 8), or an M × N rectangle (2^m × 2^n). For example, an input image may be divided into blocks of 256 × 256 for a high-resolution 8K UHD image, 128 × 128 for a 1080p HD image, and 16 × 16 for a WVGA image.

The picture can be divided into at least one slice. In the case of a slice, the slice may be formed of a bundle of at least one block that continues in accordance with a scan pattern. Each slice may be divided into at least one slice segment, and each slice segment may be divided into basic encoding units.

The picture may be divided into at least one sub-picture or tile. A sub-picture or tile may have a rectangular (oblong or square) shape and may be divided into basic encoding units. A sub-picture is similar to a tile in that it has the same division shape (rectangular). However, a sub-picture may be distinguished from a tile in that, unlike the tile, it has its own encoding/decoding settings. That is, a tile is assigned the setting information for encoding/decoding from a higher unit (for example, a picture), whereas a sub-picture may obtain at least one item of setting information for encoding/decoding directly from its own information. In other words, unlike a sub-picture, a tile is a unit obtained by dividing the image but may not be a unit in which data is transmitted (for example, a basic unit of the Video Coding Layer (VCL)).

In addition, a tile may be a division unit supported from the viewpoint of parallel processing, whereas a sub-picture may be a division unit supported from the viewpoint of individual encoding/decoding. In detail, for a sub-picture, not only may the encoding/decoding settings be made per sub-picture, but whether to encode/decode the sub-picture may also be determined. This means that a display may be composed around the sub-pictures corresponding to a region of interest, and the related setting may be determined in units of a sequence, a picture, and the like.

The above example may also be modified so that a sub-picture follows the encoding/decoding settings of a higher unit while a tile has individual encoding/decoding settings. In the present invention, for convenience of explanation, it is assumed that a tile may either have individual settings or depend on the settings of a higher unit; both are possible.

The division information generated when the picture is divided into a rectangular shape may have various shapes.

Referring to FIG. 6A, rectangular division units may be obtained by dividing a picture with a horizontal line (b7) and vertical lines (in this case, b1 and b3, and b2 and b4, respectively). For example, information on the number of rectangles in the horizontal and vertical directions may be generated. If the rectangles are divided evenly, the horizontal and vertical lengths of each rectangle can be derived by dividing the horizontal and vertical lengths of the picture by the number of rectangles in each direction; if the rectangles are not divided evenly, information indicating the horizontal and vertical lengths of the rectangles may be additionally generated. The horizontal and vertical lengths may be expressed in units of one pixel or in units of a plurality of pixels. For example, when the basic encoding unit is M × N and the size of a rectangle is 8M × 4N, the horizontal and vertical lengths may be expressed as 8 and 4 (when the basic encoding unit of the division unit is M × N in this example), or as 16 and 8 (when the basic encoding unit of the division unit is M/2 × N/2 in this example).
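Deriving the rectangle size for an even division, expressed in multiples of a basic encoding unit, can be sketched as follows. The function name and the concrete picture size used in the usage note are hypothetical values chosen to reproduce the 8M × 4N example.

```python
def uniform_division(pic_w, pic_h, num_cols, num_rows, unit_w=1, unit_h=1):
    """Size of each rectangle of an even division, in multiples of (unit_w, unit_h).

    With unit_w = unit_h = 1 the lengths are expressed in single pixels.
    """
    if pic_w % (num_cols * unit_w) or pic_h % (num_rows * unit_h):
        raise ValueError("picture cannot be divided evenly in these units")
    return (pic_w // num_cols // unit_w, pic_h // num_rows // unit_h)
```

Assuming M = N = 64 and a 1536 × 512 picture split into 3 columns and 2 rows, each rectangle is 512 × 256 pixels, that is, 8 and 4 in units of 64 × 64, or 16 and 8 in units of 32 × 32.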

Meanwhile, referring to FIG. 6B, unlike FIG. 6A, rectangular division units are obtained by dividing the picture individually. For example, information on the number of rectangles in the image, information on the horizontal and vertical starting position of each rectangle (the positions indicated by reference symbols z0 to z5, which may be expressed as x and y coordinates in the picture), and information on the horizontal and vertical lengths of each rectangle may be generated. The starting position may be expressed in units of one pixel or of a plurality of pixels, and the horizontal and vertical lengths may likewise be expressed in units of one pixel or of a plurality of pixels.

FIG. 6A may be an example of the division information of a tile or a sub-picture, and FIG. 6B may be an example of the division information of a sub-picture, but they are not limited thereto. Hereinafter, for convenience of explanation, a tile is assumed as the rectangular division unit, but the tile-related description may be applied in the same or a similar manner to a sub-picture (and also to a surface). That is, in the present invention, a description given for a tile may also serve as a description of a sub-picture, and a description given for a sub-picture may also serve as a description of a tile.

Some of the above-mentioned division units need not be included; some or all of them may be included selectively depending on the encoding/decoding setting, and other additional units (for example, a surface) may be supported.

Meanwhile, the block division unit may divide a picture into encoding units (or blocks) of various sizes. An encoding unit may be composed of a plurality of encoding blocks (for example, one luminance encoding block and two chrominance encoding blocks) according to the color format. For ease of explanation, a unit of one color component is assumed. The encoding block may have a variable size such as M × M (e.g., M is 4, 8, 16, 32, 64, 128, etc.). Alternatively, according to the partitioning scheme (for example, tree-based partitioning such as quad tree <QT>, binary tree <BT>, and ternary tree <TT>), the encoding block may have a variable size such as M × N (e.g., M and N are 4, 8, 16, 32, 64, 128, etc.). The encoding block may be the unit on which intra-picture prediction, inter-picture prediction, transform, quantization, entropy coding, and the like are based.

In the present invention, the description assumes that a plurality of sub-blocks of the same size and shape (symmetric) are obtained by each partitioning scheme, but it may also apply to asymmetric sub-blocks (for example, for a binary tree, a horizontal ratio <with the vertical length unchanged> of 1:3 or 3:1, or a vertical ratio <with the horizontal length unchanged> of 1:3 or 3:1; for a ternary tree, a horizontal ratio <with the vertical length unchanged> of 1:2:1, or a vertical ratio <with the horizontal length unchanged> of 1:2:1, and the like).

The division of an encoding block (M × N) may have a recursive tree-based structure. Whether to divide may be indicated by a split flag. For example, when the split flag of an encoding block at division depth k is 0, encoding is performed on that encoding block at depth k; when the split flag of an encoding block at division depth k is 1, encoding is performed on four sub-encoding blocks (quad tree partitioning), two sub-encoding blocks (binary tree partitioning), or three sub-encoding blocks (ternary tree partitioning) at division depth k+1, according to the partitioning scheme.

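A sub-encoding block may in turn be set as an encoding block (k+1) and divided again into sub-encoding blocks (k+2) through the above process. In the case of quad tree partitioning, a split flag (for example, indicating whether to split) may be supported.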
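The recursive structure described above can be sketched with a callback that plays the role of the split flags. This is an illustrative sketch covering only symmetric quad tree and binary tree splits; the function names, the (x, y, width, height) block tuple, and the `decide` callback are assumptions and are not part of the signalled syntax.

```python
def split_block(x, y, w, h, kind, direction=None):
    """Return the sub-blocks (x, y, width, height) produced by one split."""
    if kind == "quad":
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if kind == "binary" and direction == "vertical":    # left and right halves
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if kind == "binary" and direction == "horizontal":  # top and bottom halves
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    raise ValueError("unsupported split in this sketch")


def partition(block, depth, decide, max_depth):
    """Recursively partition a block.

    decide(block, depth) returns None (split flag 0) or a (kind, direction)
    pair (split flag 1); recursion stops at max_depth.
    """
    choice = decide(block, depth) if depth < max_depth else None
    if choice is None:
        return [block]  # encoded as a block of division depth `depth`
    leaves = []
    for sub in split_block(*block, *choice):
        # Each sub-block is set as an encoding block of depth + 1.
        leaves.extend(partition(sub, depth + 1, decide, max_depth))
    return leaves
```

An encoder would typically drive `decide` from a cost comparison, while a decoder would drive it from the parsed split flags; both are outside this sketch.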

For binary tree partitioning, a split flag and a split direction flag (horizontal or vertical) may be supported. If binary tree partitioning supports more than one partition ratio (e.g., ratios other than 1:1 in the horizontal or vertical direction, i.e., asymmetric partitioning is also supported), a partition ratio flag (e.g., selecting one of the ratio candidates <1:1, 1:2, 2:1, 1:3, 3:1>) may be supported, or another type of flag may be supported (e.g., a flag indicating whether the partition is symmetric; if 1, no additional information on the partition ratio is needed, and if 0, additional information on the ratio of the asymmetric partition is required).

For ternary tree partitioning, a split flag and a split direction flag may be supported. If ternary tree partitioning supports more than one partition ratio, additional partition information may be needed, as in the binary tree case.

The above examples describe the partition information generated when only one tree-based partitioning scheme is available; when a plurality of tree-based partitioning schemes are available, the partition information may be configured as follows.

For example, when a plurality of tree-based partitioning schemes are supported and a predetermined partitioning priority exists, the partition information of the higher-priority scheme may be configured first. If the split flag of the higher-priority scheme is true (split is performed), additional partition information of that scheme may follow; if the split flag is false (split is not performed), the partition information of the next-priority scheme (split flag, split direction flag, etc.) may follow.

Alternatively, in the case where a plurality of tree segmentation is supported, selection information on the partitioning scheme may be additionally generated, and may be configured with partitioning information according to the selected partitioning scheme.

Some of the split flags may be omitted depending on the result of a higher-level or preceding partition.

The block division may start from the maximum encoding block and proceed to the minimum encoding block. Alternatively, it may start at the minimum division depth (0) and proceed to the maximum division depth. That is, partitioning may be performed recursively until the block size reaches the minimum encoding block size or the division depth reaches the maximum division depth. Here, the size of the maximum encoding block, the size of the minimum encoding block, and the maximum division depth may be set adaptively according to the encoding/decoding setting (for example, image <slice, tile> type <I/P/B>, coding mode <Intra/Inter>, color component <Y/Cb/Cr>, etc.).

For example, when the maximum coding block is 128 x 128, quadtree partitioning can be performed in the range of 32 x 32 to 128 x 128, binary tree partitioning can be performed in the range of 16 x 16 to 64 x 64, and ternary tree partitioning can be performed in the range of 8 x 8 to 32 x 32 with a maximum partition depth of 3. Alternatively, quadtree partitioning may be performed in the range of 8 x 8 to 128 x 128, and binary tree and ternary tree partitioning may be performed in the range of 4 x 4 to 128 x 128 with a maximum partition depth of 3. The former setting may apply to an I image type (for example, a slice), and the latter to a P or B image type.

As described in the above example, partition settings such as the size of the maximum coding block, the size of the minimum coding block, and the maximum partition depth can be shared or supported individually according to the partitioning scheme and the encoding/decoding settings described above.

If a plurality of partitioning schemes are supported, partitioning is performed within the block support range of each scheme, and if the block support ranges of the schemes overlap, a priority among the partitioning schemes may exist. For example, a quadtree partition may precede a binary tree partition.

Alternatively, when the block support ranges overlap, selection information indicating which partitioning scheme is performed can be generated. For example, selection information on which of binary tree partitioning and ternary tree partitioning is performed may be generated.

Also, if a plurality of partitioning schemes are supported, whether to perform a subsequent partition may be determined according to the result of the preceding partition. For example, if the result of the preceding partition (quadtree in this example) indicates that partitioning is performed, the subsequent partition (binary tree or ternary tree in this example) is not performed, and each sub-coding block produced by the preceding partition is set again as a coding block so that partitioning can be performed on it.

Alternatively, if the result of the preceding partition indicates that partitioning is not performed, partitioning may be performed according to the result of the subsequent partition. If the result of the subsequent partition (binary tree or ternary tree in this example) indicates that partitioning is performed, the resulting sub-coding blocks are set as coding blocks again and partitioned; if it indicates that partitioning is not performed, no further partitioning is performed. When the subsequent partition indicates that partitioning is performed and the resulting sub-coding blocks are set again as coding blocks, and a plurality of partitioning schemes are supported, only the subsequent partition may be supported for those blocks, without the preceding partition. That is, when a plurality of partitioning schemes are supported, once the result of the preceding partition indicates that partitioning is not performed, the preceding partition is no longer performed.

For example, if quadtree partitioning and binary tree partitioning are both possible, the quadtree split flag of an M × N coding block can be checked first. If that split flag is 1, the block is partitioned into four sub-coding blocks of size (M >> 1) × (N >> 1), and each sub-coding block is set as a coding block again to perform partitioning (quadtree or binary tree partitioning). If the split flag is 0, the binary tree split flag can be checked. If that flag is 1, the block is partitioned into two sub-coding blocks of size (M >> 1) × N or M × (N >> 1), and each sub-coding block is set as a coding block again to perform partitioning (binary tree partitioning). If the split flag is 0, the partitioning process is terminated and the coding process proceeds.
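
The flag-checking order described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the function names `partition` and `make_reader`, the flag names, the `min_size` parameter, and the representation of a block as an (x, y, width, height) tuple are all assumptions made for illustration, and the flag sequence is assumed to be valid for the given block sizes.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical layout


def make_reader(flags: List[int]) -> Callable[[str], bool]:
    """Return a flag reader over a fixed list; the name is for readability only."""
    it = iter(flags)
    return lambda name: next(it) == 1


def partition(block: Block, read_flag: Callable[[str], bool],
              min_size: int = 4, allow_quad: bool = True) -> List[Block]:
    """Recursively split a coding block: quadtree flag first, then binary tree."""
    x, y, w, h = block
    # The quadtree split flag is checked first, while quadtree is still allowed.
    if allow_quad and w > min_size and h > min_size and read_flag("qt_split"):
        hw, hh = w >> 1, h >> 1
        subs = [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
        # Each sub-block becomes a coding block again (quadtree or binary tree).
        return [leaf for s in subs for leaf in partition(s, read_flag, min_size, True)]
    # Quadtree flag was 0 (or quadtree is unavailable): check the binary tree flag.
    if (w > min_size or h > min_size) and read_flag("bt_split"):
        if read_flag("bt_vertical"):      # (M >> 1) x N
            subs = [(x, y, w >> 1, h), (x + (w >> 1), y, w >> 1, h)]
        else:                             # M x (N >> 1)
            subs = [(x, y, w, h >> 1), (x, y + (h >> 1), w, h >> 1)]
        # After a binary split, only binary tree partitioning continues.
        return [leaf for s in subs for leaf in partition(s, read_flag, min_size, False)]
    return [block]  # both flags 0: partitioning ends and coding proceeds


if __name__ == "__main__":
    reader = make_reader([1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
    for leaf in partition((0, 0, 16, 16), reader):
        print(leaf)
```

In this sketch a block produced by a binary tree split is never offered the quadtree flag again, which mirrors the rule that the preceding partition is no longer performed once it has been declined.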

Although the above example describes a case where a plurality of partitioning schemes are performed, the present invention is not limited thereto, and various combinations of supported schemes are possible. For example, partitioning schemes such as quadtree / binary tree / ternary tree / quadtree + binary tree / quadtree + binary tree + ternary tree may be used. Here, information on whether an additional partitioning scheme is supported may be implicitly determined, or may be explicitly included in units such as a sequence, picture, sub-picture, slice, or tile.

In the above example, partition-related information such as the size information of the coding block, the support range of the coding block, and the maximum partition depth may be included in units such as a sequence, picture, sub-picture, slice, or tile, or may be implicitly determined. In summary, the range of allowable blocks can be determined by the size of the maximum coding block, the range of supported blocks, the maximum partition depth, and the like.

The coding block obtained by partitioning through the above process can be set as the maximum size for intra prediction or inter prediction. That is, the coding block after block partitioning may be the starting size for partitioning the prediction block for intra prediction or inter prediction. For example, if the coding block is 2M x 2N, the prediction block may have a size of 2M x 2N or M x N, equal to or smaller than the coding block. Alternatively, it may have a size of 2M x 2N, 2M x N, M x 2N, or M x N. Alternatively, it may have the same size as the coding block, 2M x 2N. In this case, the coding block and the prediction block having the same size may mean that prediction is performed at the size obtained through partitioning of the coding block, without partitioning the prediction block. That is, partition information for the prediction block is not generated. Such a setting may also be applied to the transform block, and the transform may be performed in units of the partitioned coding block.

Various configurations are possible according to the encoding/decoding settings. For example, at least one prediction block and at least one transform block may be obtained based on the coding block (after the coding block is determined). Alternatively, one prediction block having the same size as the coding block can be obtained, and at least one transform block can be obtained based on the coding block. Alternatively, one prediction block and one transform block having the same size as the coding block can be obtained. In the above examples, when at least one block is obtained, partition information for each block may be generated, and when exactly one block is obtained, partition information for that block is not generated.

The square or rectangular blocks of various sizes obtained according to the above result may be blocks used for intra prediction or inter prediction, and may be blocks used for transforming and quantizing residual components.

The partition units obtained by dividing a picture through the picture partitioning unit can be encoded/decoded independently or dependently, according to the encoding/decoding settings.

Independent encoding/decoding can mean that, when encoding/decoding a given partition unit (or region), data of other units cannot be referenced. In detail, the information used or generated in texture encoding and entropy encoding of a unit (e.g., pixel values and encoding/decoding information such as intra prediction related information, inter prediction related information, and entropy encoding/decoding related information) is encoded independently, without reference between units. Likewise, in the decoder, the parsing information and reconstruction information of other units may not be referenced for texture decoding and entropy decoding of a unit.

Dependent encoding/decoding can mean that data of other units can be referenced when encoding/decoding a given unit. In detail, the information used or generated in texture encoding and entropy encoding of a unit is encoded dependently, with reference between units. Likewise, in the decoder, the parsing information and reconstruction information of other units may be referenced for texture decoding and entropy decoding of a unit.

In general, the previously mentioned partition units (e.g., sub-pictures, tiles, slices, etc.) may follow independent encoding/decoding settings. Reference between units may be disallowed for parallelization purposes. Reference may also be disallowed for the purpose of improving encoding/decoding performance. For example, when a 360-degree image is divided into a plurality of surfaces in a three-dimensional space and placed in a two-dimensional space, the correlation (for example, image continuity) with adjacent surfaces may be low depending on the surface arrangement. That is, when there is no correlation between surfaces, the need to refer to each other is low, so independent encoding/decoding settings can be followed.

Further, for the purpose of improving encoding/decoding performance, reference between partition units can be allowed. For example, even if a 360-degree image is divided into surface units, a surface may be highly correlated with an adjacent surface depending on the surface arrangement setting, in which case dependent encoding/decoding settings can be followed.

Meanwhile, in the present invention, independent or dependent encoding/decoding can be applied not only to the spatial domain but also to the time domain. That is, independent or dependent encoding/decoding may be performed not only with respect to other partition units existing at the same time as the current partition unit, but also with respect to partition units existing at a time different from that of the current partition unit (in this example, a partition unit at the same position in an image at a different time is assumed to be a different partition unit).

For example, when a bitstream A containing data obtained by encoding a 360-degree image at high image quality and a bitstream B containing data encoded at normal image quality are transmitted simultaneously, the decoder may parse and decode bitstream A, transmitted at high image quality, for the region of interest (for example, a viewport or the area to be displayed), and parse and decode bitstream B, transmitted at normal image quality, outside the region of interest.

In detail, when the image is divided into a plurality of units (for example, sub-pictures, tiles, slices, surfaces, etc.; in this example, a surface is assumed to be processed in the same manner as a tile or sub-picture), the decoder can decode the data (bitstream A) of the partition units belonging to the region of interest (or partition units overlapping the viewport by at least one pixel) and the data (bitstream B) of the partition units belonging to regions other than the region of interest.
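
As a rough illustration of this per-unit selection, the following Python sketch assigns each tile to bitstream A when its rectangle overlaps the viewport by at least one pixel, and to bitstream B otherwise. The rectangle format (x, y, width, height), the function names `overlaps` and `select_bitstreams`, and the 3 x 3 grid of 32 x 16 tiles are assumptions made for the example, not values defined in the specification.

```python
from typing import Dict, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical layout


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one pixel."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def select_bitstreams(tiles: Dict[int, Rect], viewport: Rect) -> Dict[int, str]:
    """Map each tile index to 'A' (high quality) or 'B' (normal quality)."""
    return {idx: ("A" if overlaps(rect, viewport) else "B")
            for idx, rect in tiles.items()}


if __name__ == "__main__":
    # Assumed 96 x 48 picture split into a 3 x 3 raster-ordered tile grid.
    tiles = {r * 3 + c: (c * 32, r * 16, 32, 16) for r in range(3) for c in range(3)}
    print(select_bitstreams(tiles, viewport=(40, 10, 30, 10)))
```

A decoder following this kind of rule would parse only the selected units from each bitstream, which is why the reference settings between units matter for random access.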

Alternatively, a bitstream containing data obtained by encoding the entire image may be transmitted. In the decoder, the region of interest may be parsed from a bitstream and decoded. Specifically, only the data of the division unit belonging to the region of interest can be decoded.

In summary, an encoder generates bitstreams of one or more image qualities, and a decoder may decode only a specific bitstream to obtain a whole or partial image, or may selectively decode from each bitstream for each portion of the image to obtain a whole or partial image. The above example uses a 360-degree image, but the description may also apply to general images.

When encoding/decoding is performed as in the above example, the decoder may reconstruct only some of the data (in this example, the encoder does not know where the region of interest is located, and random access is performed according to the region of interest), so encoding/decoding should be performed after checking the reference settings in the time domain and the like.

For example, if it is determined that the decoder performs decoding in units of a single partition unit, the current partition unit may perform independent encoding in the spatial domain and limited dependent encoding in the time domain (for example, reference is allowed only to the partition unit at the same position at another time corresponding to the current partition unit, and reference to other partition units is restricted; this contrasts with the general case, in which there is no restriction in the time domain and dependent encoding is unrestricted).

Alternatively, if it is determined that the decoder performs decoding in units of a plurality of partition units (obtained by grouping horizontally adjacent partition units, vertically adjacent partition units, or both horizontally and vertically adjacent partition units), the current partition unit may perform independent or dependent encoding in the spatial domain and limited dependent encoding in the time domain (for example, reference is allowed not only to the partition unit at the same position at another time corresponding to the current partition unit but also to some other partition units).

In the present invention, a surface is generally a partition unit whose arrangement and shape differ according to the projection format and whose characteristics differ from those of the other partition units mentioned above, such as having no separate encoding/decoding settings; in some respects (for example, having a rectangular shape), however, it may be regarded as a unit obtained by the picture partitioning unit.

In the case of the spatial domain, it has been described that independent encoding/decoding can be performed for each partition unit for purposes such as parallelization. However, independent encoding/decoding cannot refer to other partition units, so encoding/decoding efficiency can be reduced. Therefore, as a step before encoding/decoding, a partition unit subject to independent encoding/decoding can be extended using (or by adding) data of adjacent partition units. A partition unit to which data of adjacent partition units has been added has more data that can be referenced, so encoding/decoding efficiency can increase. Encoding/decoding of the extended partition unit can also be regarded as dependent encoding/decoding, in that it can refer to data of adjacent partition units.

The information on the reference settings between partition units may be stored in a bitstream in units of video, sequence, picture, sub-picture, slice, tile, etc., and transmitted to the decoder; the decoder can reconstruct the setting information transmitted from the encoder by parsing it at the same unit level. Related information can also be transmitted and parsed in the bitstream in the form of SEI (Supplemental Enhancement Information) or metadata. Also, according to a definition agreed in advance between the encoder and the decoder, encoding/decoding can be performed according to the reference settings without transmitting the information.

FIG. 7 is an example of dividing one picture into a plurality of tiles. FIGS. 8A to 8I are first exemplary views for setting an additional area for each tile according to FIG. 7. FIGS. 9A to 9I are second exemplary views for setting an additional area for each tile according to FIG. 7.

If an image is divided into two or more partition units (or regions) through the picture partitioning unit and encoding/decoding is performed independently for each partition unit, there are advantages such as parallel processing. On the other hand, the data that each partition unit can reference is reduced, so encoding performance may deteriorate. To address this, dependent encoding/decoding settings between partition units can be used (the description in this example is based on tiles, but the same or similar settings can be applied to other units).

Independent encoding/decoding is commonly performed so that reference between partition units is not possible. Thus, a pre-processing or post-processing step for dependent encoding/decoding can be performed. For example, before encoding/decoding, an extended area may be created at the outer periphery of each partition unit and filled with the data of another partition unit to be referenced.

In this method, encoding/decoding is the same as independent encoding/decoding except that each partition unit is extended beforehand. However, it can be understood as an example of dependent encoding/decoding in that the existing partition unit acquires in advance, from another partition unit, the data it will reference.

In addition, after encoding/decoding, filtering can be applied across the boundary between partition units using data of a plurality of partition units. That is, when filtering is applied, the process is dependent because it uses data of other partition units, and when filtering is not applied, it may be independent.

The following examples focus on the case where dependent encoding/decoding is achieved through a pre-processing step before encoding/decoding (extension in this example). Further, in the present invention, a boundary between partition units may be referred to as an inner boundary, and the outer edge of the picture may be referred to as an outer boundary.

According to an embodiment of the present invention, an additional area for the current tile can be set. More specifically, an additional area can be set for at least one tile (this example includes the case where one picture consists of one tile, that is, where the picture is not divided into two or more partition units; a picture is assumed to be recognized as one partition unit even if it is not divided).

For example, an additional area can be set in at least one of the up / down / left / right directions of the current tile. The additional area can be filled using arbitrary values. Alternatively, the additional area can be filled using some data of the current tile, either by padding with the outer pixels of the current tile or by copying pixels inside the current tile.

Further, the additional area can be filled using image data of tiles other than the current tile. Specifically, image data of a tile adjacent to a current tile can be used, and image data of adjacent tiles in a specific direction of the current tile can be copied and filled.

Here, the size (length) of the image data to be acquired may have a value common to all directions or an individual value per direction, and may be determined according to the encoding/decoding settings.

For example, in FIG. 6A, all or some of the boundaries of b0 to b8 can be extended in the boundary direction. They may be extended by m in all boundary directions of the partition unit, or by m_i (i is an index of each direction) depending on the boundary direction. The value m or m_i may be applied to all partition units of the image or may be set individually per partition unit.

Here, setting information on the additional area can be generated. The setting information of the additional area may include whether the additional area is supported, whether the additional area is supported for each partition unit, the shape of the additional area in the entire image (for example, in which of the up / down / left / right directions of a partition unit it extends; in this example, setting information applied commonly to all partition units in the image), the shape of the additional area in each partition unit (in this example, setting information applied to an individual partition unit in the image), the size of the additional area in the entire image (for example, how far to extend in each direction once the shape is determined; in this example, setting information applied commonly to all partition units in the image), the size of the additional area in each partition unit (in this example, setting information applied individually to each partition unit in the image), a method of filling the additional area in the entire image, a method of filling the additional area in each partition unit, and the like.

The additional-area settings may be determined proportionally according to the color format, or may be set independently. Setting information for the additional area in the luminance component can be generated, and the settings for the additional area in the chrominance components can be implicitly determined according to the color format. Alternatively, setting information for the additional area in the chrominance components can be generated explicitly.

For example, if the size of the additional region of the luminance component is m, the size of the additional region of the chrominance component may be m / 2 when determined according to the color format (4:2:0 in this example). As another example, if the size of the additional region of the luminance component is m and the chrominance component has independent settings, size information for the additional region of the chrominance component can be generated (n in this example; n may be used in common, or n1, n2, n3, etc. are possible). As another example, a method of filling the additional region of the luminance component may be generated, and the method of filling the additional region of the chrominance component may follow the luminance method or have its own related information generated.
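
The implicit derivation from the color format can be sketched in Python as follows. The table `CHROMA_SHIFT` and the function `chroma_extension` are hypothetical names; the subsampling factors are the usual ones for 4:4:4, 4:2:2, and 4:2:0, and the sketch assumes the luminance sizes are given separately for the horizontal and vertical directions.

```python
from typing import Dict, Tuple

# (horizontal shift, vertical shift) applied to luminance sizes per color format.
CHROMA_SHIFT: Dict[str, Tuple[int, int]] = {
    "4:4:4": (0, 0),  # chrominance has the same resolution as luminance
    "4:2:2": (1, 0),  # chrominance is halved horizontally only
    "4:2:0": (1, 1),  # chrominance is halved in both directions
}


def chroma_extension(m_horizontal: int, m_vertical: int,
                     color_format: str) -> Tuple[int, int]:
    """Derive the chrominance additional-region size from the luminance size."""
    shift_x, shift_y = CHROMA_SHIFT[color_format]
    return m_horizontal >> shift_x, m_vertical >> shift_y


if __name__ == "__main__":
    print(chroma_extension(8, 8, "4:2:0"))
```

When the chrominance component has independent settings instead, a decoder would read its own size value (n in the text) rather than call a derivation of this kind.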

Information related to the additional area settings may be recorded in a bitstream in units of video, sequence, picture, sub-picture, slice, and the like, and transmitted; during decoding, the setting information can be reconstructed by parsing the related information from that unit. The embodiments described below assume that additional area support is activated.

Referring to FIG. 7, one picture is divided into tiles numbered 0 to 8. FIGS. 8A to 8I show the additional areas set for each tile of FIG. 7 according to an embodiment of the present invention.

Referring to FIGS. 7 and 8A, tile 0 (of size T0_W x T0_H) can be extended to additionally have an area of E0_R to the right and E0_D to the bottom. The additional areas can be obtained from adjacent tiles. Specifically, the right extension area can be obtained from tile 1, and the lower extension area from tile 3. In addition, tile 0 can set an additional area using the tile adjacent at the lower right (tile 4). That is, additional areas may be set in the directions of the inner boundaries (boundaries between partition units) of the tile, excluding the outer boundaries (picture boundaries).

In FIGS. 7 and 8E, tile 4 (of size T4_W x T4_H) has no outer boundary, so it can be extended to additionally have left, right, upper, and lower areas. The left extension area can be obtained from tile 3, the right from tile 5, the upper from tile 1, and the lower from tile 7. Tile 4 can also set additional areas at the upper left, lower left, upper right, and lower right. The upper-left extension area can be obtained from tile 0, the lower-left from tile 6, the upper-right from tile 2, and the lower-right from tile 8.
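
The source tile for each extension direction can be derived from the tile index, as in the following Python sketch. It assumes a 3 x 3 grid numbered in raster order, which matches the tile numbers given in the text for FIG. 7; the names `OFFSETS` and `neighbor_tiles` are hypothetical. Directions that would cross an outer boundary of the picture are simply omitted.

```python
from typing import Dict, Tuple

# (row offset, column offset) for each extension direction.
OFFSETS: Dict[str, Tuple[int, int]] = {
    "left": (0, -1), "right": (0, 1), "up": (-1, 0), "down": (1, 0),
    "up_left": (-1, -1), "down_left": (1, -1),
    "up_right": (-1, 1), "down_right": (1, 1),
}


def neighbor_tiles(tile: int, rows: int = 3, cols: int = 3) -> Dict[str, int]:
    """Return the source tile per direction, skipping outer-boundary directions."""
    r, c = divmod(tile, cols)
    result: Dict[str, int] = {}
    for name, (dr, dc) in OFFSETS.items():
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:  # inner boundary only
            result[name] = nr * cols + nc
    return result


if __name__ == "__main__":
    print(neighbor_tiles(0))
    print(neighbor_tiles(4))
```

For tile 0 only the right, lower, and lower-right directions remain, while tile 4 has a source tile in all eight directions.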

In FIG. 8C, the L2 block is adjacent to the tile boundary, so in principle there is no data that can be referenced from its left, upper-left, and lower-left blocks. However, according to an embodiment of the present invention, when an additional area is set for tile 2, the L2 block can be encoded/decoded by referring to the additional area. That is, the L2 block can refer to the data of the blocks located at its left and upper left through the additional area (which may be an area obtained from tile 1), and can likewise refer to the data of the block located at its lower left through the additional area.

The data included in the additional area in the above embodiment can be included in the current tile and encoded/decoded together with it. In this case, because the data of the additional area is located at the boundary of the tile (in this example, the tile as updated or extended by the additional area), there is no data for it to reference during encoding, so its encoding performance may be degraded. However, since this area is added only for reference by the existing tile boundary region, it can be understood as a temporary memory format for enhancing encoding performance. That is, it helps improve the image quality of the final output image and is ultimately removed, so degraded encoding performance in this area is not a problem. This applies for the same or a similar purpose in the following embodiments.

Referring to FIGS. 9A to 9I, a 360-degree image is converted into a two-dimensional image through a surface arrangement (or rearrangement) process according to the projection format, and the tiles (or surfaces) obtained by dividing the two-dimensional image can be seen. If the 360-degree image is equirectangular, the two-dimensional image consists of one surface, so that one surface is divided into tiles. For convenience of explanation, it is assumed that the tile division for the two-dimensional image is the same as the tile division according to FIG. 7.

The divided tiles can be classified into tiles having only inner boundaries and tiles having at least one outer boundary, and an additional area can be set for each tile as in FIGS. 8A to 8I. However, in a 360-degree image converted into a two-dimensional image, areas adjacent to each other in the two-dimensional image may lack continuity in the actual image, and areas that are not adjacent may have continuity in the actual image (refer to the description of FIGS. 5A to 5C). Thus, even where a boundary of a tile is an outer boundary, a region having continuity with that outer boundary region of the tile may exist in the picture. Specifically, referring to FIG. 9B, although the upper end of tile 1 corresponds to an outer boundary of the picture, a region having continuity in the actual image may exist in the same picture, so an additional area can also be set toward the upper end of tile 1. That is, unlike FIGS. 8A to 8I, in FIGS. 9A to 9I additional areas can be set for all or some of the outer boundary directions of a tile.

Referring to FIG. 9E, tile 4 is a tile having only inner boundaries as tile boundaries. Therefore, additional areas for tile 4 can be set in the upper, lower, left, right, upper-left, lower-left, upper-right, and lower-right directions. Here, the left extension area may be image data obtained from tile 3, the right from tile 5, the upper from tile 1, the lower from tile 7, the upper-left from tile 0, the lower-left from tile 6, the upper-right from tile 2, and the lower-right from tile 8.

Referring to FIG. 9A, tile 0 is a tile having at least one outer boundary (in the left and upper directions). Accordingly, tile 0 may have additional areas extended not only in the spatially adjacent right, lower, and lower-right directions but also in the outer boundary directions (left, upper, and upper-left). The additional areas in the spatially adjacent right, lower, and lower-right directions can be set using the data of the adjacent tiles, but the additional areas in the outer boundary directions require a different source. These can be set using data that is not spatially adjacent in the picture but has continuity in the actual image. For example, if the projection format of the 360-degree image is equirectangular, the left boundary of the picture has continuity in the actual image with the right boundary of the picture, and the upper boundary of the picture has continuity with the lower boundary of the picture, so the left boundary of tile 0 is continuous with the right boundary of tile 2, and the upper boundary of tile 0 is continuous with the lower boundary of tile 6. Therefore, for tile 0, the left extension area can be obtained from tile 2, the right from tile 1, the upper from tile 6, and the lower from tile 3. Likewise, the upper-left extension area can be obtained from tile 8, the lower-left from tile 5, the upper-right from tile 7, and the lower-right from tile 4.
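
Under the continuity assumption stated in this paragraph (left edge continuous with right edge, upper edge continuous with lower edge), the source tile for every direction can be found by wrapping the row and column indices. The Python sketch below is an illustration only: it assumes a 3 x 3 raster-ordered tile grid, and the names `OFFSETS` and `wrapped_neighbor_tiles` are hypothetical.

```python
from typing import Dict, Tuple

# (row offset, column offset) for each extension direction.
OFFSETS: Dict[str, Tuple[int, int]] = {
    "left": (0, -1), "right": (0, 1), "up": (-1, 0), "down": (1, 0),
    "up_left": (-1, -1), "down_left": (1, -1),
    "up_right": (-1, 1), "down_right": (1, 1),
}


def wrapped_neighbor_tiles(tile: int, rows: int = 3, cols: int = 3) -> Dict[str, int]:
    """Return the source tile per direction, wrapping across picture boundaries."""
    r, c = divmod(tile, cols)
    # Python's % yields a non-negative result for a positive modulus,
    # so an index of -1 wraps to the last row or column.
    return {name: ((r + dr) % rows) * cols + (c + dc) % cols
            for name, (dr, dc) in OFFSETS.items()}


if __name__ == "__main__":
    print(wrapped_neighbor_tiles(0))
```

For projection formats with several surfaces, a plain wrap of this kind would not be enough, since continuity then depends on how the surfaces are arranged.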

Since the L0 block in FIG. 9A is located at the boundary of a tile, there is no data that can be referenced from its left, upper-left, lower-left, upper, and upper-right blocks (the same applies to U0). However, a block having continuity in the actual image may exist in the two-dimensional image (or picture) even if it is not spatially adjacent in the two-dimensional image. Therefore, if the projection format of the 360-degree image is equirectangular, so that the left boundary of the picture has continuity in the actual image with the right boundary of the picture and the upper boundary of the picture has continuity with the lower boundary of the picture, the left and lower-left blocks of the L0 block can be obtained from tile 2, the upper-left block from tile 8, and the upper and upper-right blocks from tile 6.

Table 1 below is a pseudo code that acquires data corresponding to an additional area from another area having continuity.

Referring to the pseudocode in Table 1, i_pos (corresponding to variable A) is the input pixel position, i_pos' is the output pixel position, minI (corresponding to variable B) is the minimum value of the pixel position range, maxI (corresponding to variable C) is the maximum value of the pixel position range, and i is a position component (horizontal, vertical, etc. in this example). In this example, minI is 0, and maxI is Pic_width (horizontal width of the picture) - 1 or Pic_height (vertical height of the picture) - 1.

For example, assume that the vertical positions of a picture (a general image) range from 0 to 47, and the picture is divided as shown in FIG. 7. When an additional area of size m is set below tile 4 and is to be filled with data from the top of tile 7, the above equation indicates where the data is to be obtained.

Tile 4 has a vertical range of 16 to 30, and when an additional area of size 4 is set below it, the data at positions 31, 32, 33, and 34 can be filled into the additional area of tile 4. Since min and max in the above equation are 0 and 47, positions 31 to 34 are output unchanged as 31 to 34. That is, the data to be filled into the additional area is the data at positions 31 to 34.

Alternatively, assume that the horizontal positions of a picture (a 360-degree image; an equirectangular image has continuity at both ends) range from 0 to 95, and the picture is divided as shown in FIG. 7. When an additional area of size m is set to the left of tile 3 and is to be filled with data from the right side of tile 5, the above equation indicates where the data is to be obtained.

Tile 3 has a horizontal range of 0 to 31, and when an additional area of size 4 is set to its left, the data corresponding to positions -4, -3, -2, and -1 can be filled into the additional area of tile 3. Since these positions do not exist within the width of the picture, the position to read from is calculated with the above equation. Since min and max in the above equation are 0 and 95, positions -4 to -1 are output as 92 to 95, respectively. That is, the data to be filled into the additional area is the data at positions 92 to 95.
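
The behavior of the two worked examples can be reproduced with a short Python sketch. Table 1 itself is not reproduced in this text, so the function below is an assumed form that matches the described inputs and outputs rather than the pseudocode of the specification; the parameter names a, b, and c follow the variables A, B, and C mentioned above.

```python
def overlap(a: int, b: int, c: int) -> int:
    """Wrap position a into the inclusive range [b, c].

    a: input pixel position (variable A)
    b: minimum of the position range, minI (variable B)
    c: maximum of the position range, maxI (variable C)
    """
    span = c - b + 1
    # Python's % returns a non-negative result for a positive span,
    # so positions below b wrap around to the far end of the range.
    return b + (a - b) % span


if __name__ == "__main__":
    # General image, vertical range 0..47: in-range positions are unchanged.
    print([overlap(p, 0, 47) for p in (31, 32, 33, 34)])
    # Equirectangular image, horizontal range 0..95: positions left of the
    # picture wrap to its right end.
    print([overlap(p, 0, 95) for p in (-4, -3, -2, -1)])
```

Positions beyond the right end wrap in the same way, so position 96 in a 0 to 95 range maps back to 0.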

Specifically, when the area of size m falls in the range of 360 to 380 degrees (assuming the pixel position range is 0 to 360 degrees), it is adjusted into the picture's range and obtained from the region of 0 to 20 degrees. That is, the data can be obtained based on the pixel position range of 0 to Pic_width - 1.

In summary, in order to obtain data of an additional area, it is possible to confirm the position of data to be acquired through an overlapping process.
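The overlapping process in the two worked examples above can be sketched as follows. Table 1 is not reproduced in this text, so the modular form below is an assumption chosen to agree with the stated numbers; the function name `overlap` and its parameter names are hypothetical.

```python
def overlap(pos: int, min_i: int, max_i: int) -> int:
    """Map a position onto [min_i, max_i], treating the range as circular.

    Assumed form: positions outside the range wrap around to the opposite
    end, as for a picture with continuity at both ends.
    """
    span = max_i - min_i + 1
    # Python's % returns a non-negative result for a positive modulus,
    # so negative positions wrap to the far end of the range.
    return min_i + (pos - min_i) % span


# General picture, vertical range 0..47: positions already inside the
# range are returned unchanged.
inside = [overlap(p, 0, 47) for p in range(31, 35)]

# 360-degree picture, horizontal range 0..95: positions left of the
# picture wrap around to the right end.
wrapped = [overlap(p, 0, 95) for p in range(-4, 0)]

print(inside, wrapped)
```

With these inputs the first list is the positions 31 to 34 themselves and the second list is 92 to 95, matching the two examples in the text.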

In the above example, on the assumption that a single surface is obtained in the case of a 360-degree image, spatially adjacent areas in a picture are assumed to have continuity with each other (with the exception that the picture boundary has continuity at both ends). However, there may be two or more surfaces depending on the projection format (e.g., a cube map), and if each surface has undergone an arrangement or rearrangement process, areas may have no continuity even though they are spatially adjacent in the picture. In this case, the additional area can be generated by identifying the data at the position that has continuity in the actual image, using the arrangement or rearrangement information of the surfaces.

Table 2 below is pseudocode that generates an additional area for a specific division unit using the internal data of that division unit.

The meaning of each variable in Table 2 is the same as in Table 1, so a detailed explanation is omitted. However, in this example, minI may be the left or top coordinate of the specific division unit, and maxI may be the right or bottom coordinate of that unit.

For example, when the picture is divided as shown in FIG. 7, the width of tile 2 is in the range of 32 to 47, and the additional area is set to the right of tile 2 by m, the positions 48, 49, and onward in the additional area may be filled with the data at position 47 (inside tile 2), which is output by the above equation. That is, according to Table 2, the additional area for a specific division unit can be generated by copying the outermost pixels of the division unit.

In summary, in order to obtain data of an additional area, a position to be acquired can be confirmed through a clipping process.
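The clipping process can be sketched in the same way. Table 2 is not reproduced in this text, so the form below is an assumption that agrees with the tile 2 example; the function name `clip` is hypothetical.

```python
def clip(pos: int, min_i: int, max_i: int) -> int:
    """Clamp a position into [min_i, max_i].

    Assumed form: positions beyond the division unit map to its nearest
    edge, so the additional area repeats the outermost pixel of the unit.
    """
    return max(min_i, min(pos, max_i))


# Tile 2 spans 32..47 horizontally; an additional area of 4 samples to
# its right (positions 48..51, the size 4 is chosen for illustration).
sources = [clip(p, 32, 47) for p in range(48, 52)]
print(sources)
```

Every position in the additional area resolves to position 47, the right edge of tile 2, which is the padding behavior described for Table 2.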

The detailed configuration according to Table 1 or Table 2 is not fixed and can be changed. For example, in the case of a 360-degree image, the overlapping process can be modified and applied in consideration of the arrangement (or rearrangement) of the surfaces and the coordinate system characteristics between the surfaces.

Meanwhile, since the additional area according to the embodiment of the present invention is generated using the image data of another area, the additional area may correspond to duplicated image data. Therefore, to avoid keeping unnecessary duplicate data, the additional area can be removed after encoding/decoding is performed. However, it may also be considered to make use of the additional area in encoding/decoding before it is removed.

Referring to FIG. 10, it can be seen that B, which is an additional area of the division unit J, is generated by using the area A of the division unit I. At this time, before the generated area B is removed, it can be utilized in the encoding/decoding (specifically, the restoration or correction process) of the area A included in the division unit I.

More specifically, assuming that the division unit I and the division unit J are tile 0 and tile 1 of FIG. 7, the right portion of the division unit I and the left portion of the division unit J have image continuity with each other. Here, the additional area B can be used in the encoding/decoding of the area A after being referred to in the encoding/decoding of the division unit J. In particular, although the area A and the area B both hold data obtained from the area A at the time the additional area is created, they may be restored to somewhat different values (including quantization error) in the encoding/decoding process. Therefore, when restoring the division unit I, the portion corresponding to the area A can be restored by using the restored image data of the area A together with the image data of the area B. For example, a partial region C of the division unit I can be replaced by an average or weighted sum of the area A and the area B. Since two or more pieces of data exist for the same area, the reconstructed image (region C) can be obtained by using the data of the two areas (this process is labeled Rec_Process in the drawing).

Further, the partial region C belonging to the division unit I can be replaced by using the area A or the area B depending on which division unit each part is close to. Specifically, image data within a certain range (for example, M pixels) on the left side of the region C is adjacent to the division unit I, so it is restored by using (or copying) the data of the area A, and image data within a certain range (for example, N pixels) on the right side of the region C is adjacent to the division unit J, so it can be restored by using (or copying) the data of the area B. This can be expressed by the following equation (1).

In addition, the partial region C belonging to the division unit I can be replaced by assigning weights to the image data of the areas A and B according to which division unit each part is close to. That is, for image data in the region C close to the division unit I, a high weight is given to the image data of the area A, and for image data close to the division unit J, a high weight is given to the image data of the area B. In other words, the weight can be set based on the distance between the horizontal width of the region C and the x-coordinate of the pixel value to be corrected.

The following equation (2) can be derived as an equation for setting an adaptive weight for the area A and the area B.

Referring to Equation (2), w denotes a weight given to the pixel coordinates (x, y) of the A region and the B region. In this case, the weighted average of the A region and the B region means that the pixels of the A region are multiplied by the weight w and the pixels of the B region are multiplied by 1-w. However, besides the weighted average, weights of different values may also be given to the regions A and B, respectively.
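The two replacement strategies described above (selection by proximity and distance-based weighting) can be illustrated on a single row of the region C. Equations (1) and (2) are not reproduced in this text, so the sketch below is only one possible reading: the function names, the split point `m`, and in particular the linear weight `w = (width - x) / width` are assumptions introduced for illustration.

```python
def select_row(a_row, b_row, m):
    """Proximity selection: the first m samples (near division unit I)
    come from area A, the remaining samples (near division unit J) from
    area B. The split point m is a hypothetical parameter."""
    return list(a_row[:m]) + list(b_row[m:])


def blend_row(a_row, b_row):
    """Distance-based weighting: A is multiplied by w and B by 1 - w.

    Assumption: w falls linearly from 1 at the left edge of region C
    (deepest inside division unit I) toward the right edge (nearest
    division unit J).
    """
    width = len(a_row)
    out = []
    for x, (a, b) in enumerate(zip(a_row, b_row)):
        w = (width - x) / width
        out.append(w * a + (1 - w) * b)
    return out


# Reconstructed samples of the same four positions, as decoded in
# area A of unit I and in additional area B of unit J (made-up values).
a = [100, 100, 100, 100]
b = [60, 60, 60, 60]
print(select_row(a, b, 2))
print(blend_row(a, b))
```

A non-linear weight, or a fixed weight pair that does not sum to one, could be substituted in `blend_row` without changing the surrounding flow.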

When the use of the additional area is completed according to the foregoing description, the additional area B may be removed in the resizing process for the division unit J, and the division unit may then be stored in a memory (which may be a DPB, that is, a decoded picture buffer). In this example, the process of setting an additional area is assumed to be resizing; this process can be carried out as in some of the above embodiments, for example by checking an additional area flag and then checking the size information. It is assumed that an enlarging resizing process was performed when the additional area was set, and that a reducing resizing process is performed when it is removed.

Alternatively, the division unit can be stored in the memory without resizing (specifically, once the encoding/decoding of the image is completed), and the additional area can be removed by performing the resizing process at the output stage. This can be applied to all or some of the division units belonging to the image.

The related setting information may be processed implicitly or explicitly according to the encoding/decoding setting. In the implicit case, it may be determined without generating a related syntax element (specifically, according to the characteristics, type, format, etc. of the image, or based on other encoding/decoding settings <in this example, settings related to the additional area>). In the explicit case, the setting for removal of the additional area can be adjusted through generation of the related syntax element, and the unit for this may be a video, a sequence, a picture, a sub-picture, a slice, a tile, and the like.

On the other hand, the conventional encoding method according to the division unit includes the steps of: 1) dividing the picture into one or more tiles (which may be collectively referred to as division units) and generating division information; 2) performing encoding in units of the divided tiles; 3) performing filtering based on information indicating whether to allow an in-loop filter on the tile boundary; and 4) storing the filtered tiles in a memory.

The conventional decoding method according to the division unit includes the steps of: 1) dividing a picture into one or more tiles based on tile division information; 2) performing decoding in units of the divided tiles; 3) performing filtering based on information indicating whether to allow an in-loop filter on the tile boundary; and 4) storing the filtered tiles in a memory.

Here, the third step of the encoding/decoding method is a post-processing step of encoding/decoding, and the method may be dependent encoding/decoding if filtering is performed and independent encoding/decoding if filtering is not performed.

The encoding method of a division unit according to an embodiment of the present invention includes the steps of: 1) dividing a picture into one or more tiles and generating division information; 2) setting an additional area for at least one divided tile unit and filling the additional area using adjacent tile units; 3) performing encoding on the tile unit having the additional area; 4) removing the additional area from the tile unit and performing filtering based on information indicating whether to allow an in-loop filter on the tile boundary; and 5) storing the filtered tiles in a memory.

The decoding method of a division unit according to an embodiment of the present invention includes the steps of: 1) dividing a picture into one or more tiles based on tile division information; 2) setting an additional area for the divided tile units and filling the additional area with preset information or with another (adjacent) tile unit restored in advance; 3) performing decoding, using the decoding information received from the encoding device, on the tile unit in which the additional area is created; 4) removing the additional area from the tile unit and performing filtering based on information indicating whether to allow an in-loop filter on the tile boundary; and 5) storing the filtered tiles in a memory.

The second step in the encoding/decoding method according to an embodiment of the present invention may be a pre-processing step of encoding/decoding (dependent encoding/decoding if an additional area is set, and independent otherwise). In addition, the fourth step may be a post-processing step of encoding/decoding (dependent if filtering is performed, and independent if not). In this example, the additional area is used in the encoding/decoding process, and the tile is resized to its initial size before being stored in the memory.
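The order of the decoding steps listed above can be sketched on a toy one-dimensional "picture". This is only an illustration of the flow under stated assumptions: every name is hypothetical, the additional area is filled by reading neighbouring positions of an already available sample array (standing in for previously restored tile units), the actual decoding is replaced by a pass-through stub, and in-loop filtering is omitted.

```python
def extend_tile(picture, start, end, m):
    """Step 2: return samples [start, end] plus m extra samples on each
    side, read from neighbouring positions and clipped at the picture
    boundary (an assumed filling rule)."""
    last = len(picture) - 1
    return [picture[max(0, min(p, last))] for p in range(start - m, end + m + 1)]


def remove_additional_area(extended, m):
    """Step 4 (first half): resize the tile back to its initial size."""
    return extended[m:len(extended) - m]


def decode_picture(picture, tile_ranges, m, decode_tile=lambda samples: samples):
    memory = []
    for start, end in tile_ranges:            # step 1: tiles from division info
        extended = extend_tile(picture, start, end, m)
        decoded = decode_tile(extended)       # step 3: stub, not a real decoder
        memory.append(remove_additional_area(decoded, m))
    return memory                             # step 5: stored tiles


picture = list(range(12))
tiles = [(0, 3), (4, 7), (8, 11)]
print(extend_tile(picture, 4, 7, 2))
print(decode_picture(picture, tiles, 2))
```

Because the additional area is removed before storage, the stored tiles have the same size as the initial tiles regardless of the value of `m`.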

First, describing the encoder, a picture is divided into a plurality of tiles. Depending on the explicit or implicit setting, an additional area is set for each tile and the related data is acquired from adjacent areas. Encoding is then performed in units of updated tiles, each consisting of the existing tile and the additional area. After the encoding is completed, the additional area is removed and filtering is performed according to the in-loop filtering application setting.

At this time, the filtering setting may differ according to the method of filling and the method of removing the additional area. For example, in the case of simple removal, the above in-loop filtering application setting may be followed. In the case of removal using the overlapped area, filtering may not be applied, or another filtering setting may be applied. That is, since the distortion of the tile boundary region can be reduced by using the overlapped data, filtering may be skipped regardless of whether the in-loop filter is applied at the tile boundary, or a setting different from the filtering setting used inside the tile may be applied (for example, applying a filter with weak filter strength to the tile boundary). After the above process, the tile is stored in the memory.

Next, describing the decoder, a picture is divided into a plurality of tiles according to the tile division information transmitted from the encoder. The additional-area-related information is checked explicitly or implicitly, and the encoded information of the updated tile, including the additional area, transmitted from the encoder is parsed. Decoding is then performed in units of updated tiles. After the decoding is completed, the additional area is removed and filtering is performed according to the in-loop filtering application setting, as in the encoder. The various cases related to this were described for the encoder, so a detailed description is omitted. After the above process, the tile is stored in the memory.

On the other hand, it may be considered that the additional area for the division unit is used in the encoding/decoding process and is stored in the memory without being removed. For example, in the case of a 360-degree image, the accuracy of some prediction processes (for example, inter-picture prediction) may deteriorate depending on the surface arrangement setting or the like (for example, because a match is difficult to find in places where there is no continuity). Therefore, the additional area can be stored in the memory and utilized in the prediction process to improve prediction accuracy. If utilized in inter-picture prediction, the additional area (or a picture including the additional area) can be used as a reference picture for inter-picture prediction.

The encoding method for the case of preserving the additional area includes the steps of: 1) dividing the picture into one or more tiles and generating division information; 2) setting an additional area for at least one divided tile unit; 3) performing encoding on the tile unit having the additional area; 4) preserving the additional area for the tile unit (in which case applying in-loop filtering may be omitted); and 5) storing the encoded tiles in a memory.

The decoding method for the case of preserving the additional area includes the steps of: 1) dividing the picture into one or more tiles based on the tile division information; 2) setting an additional area for the divided tile units and filling the additional area with another (adjacent) tile unit restored in advance; 3) performing decoding, using the decoding information received from the encoding device, on the tile unit in which the additional area is created; 4) preserving the additional area for the tile unit (in which case in-loop filtering may be omitted); and 5) storing the decoded tiles in a memory.

Describing the encoder for the case of preserving the additional area, the picture is divided into a plurality of tiles. Depending on the explicit or implicit setting, an additional area is set for each tile and the related data is acquired from a predetermined area. Since the predetermined area means another area having a correlation according to the surface arrangement setting in the 360-degree image, it may be a region adjacent to the current tile or a region not adjacent to it. Encoding is then performed in units of updated tiles. Since the additional area will be preserved after the encoding is completed, filtering is not performed regardless of the in-loop filtering application setting. The reason is that, because of the additional areas, the boundaries of the updated tiles do not coincide with the actual tile boundaries. After the above process, the tile is stored in the memory.

Describing the decoder for the case of preserving the additional area, the tile division information transmitted from the encoder is checked and the picture is divided into a plurality of tiles accordingly. The additional-area-related information is checked, and the encoded information of the updated tile, including the additional area, transmitted from the encoder is parsed. Decoding is then performed in units of updated tiles. After the decoding is completed, the tile is stored in the memory together with the additional area, without applying the in-loop filter.

Hereinafter, an encoding/decoding method for a division unit according to an embodiment of the present invention will be described with reference to the drawings.

FIGS. 11 and 12 are diagrams for explaining an encoding/decoding method for a division unit according to an embodiment of the present invention. Specifically, as an example in which an additional area is generated for each division unit and encoding/decoding is performed, FIG. 11 shows an encoding method including an additional area, and FIG. 12 shows a decoding method that removes the additional area. In the case of a 360-degree image, a pre-processing process (stitching, projection, etc.) may be performed in a step preceding FIG. 11, and a post-processing process (rendering, etc.) may be performed in a step following FIG. 12.

Referring to FIG. 11, when an input image is obtained in the encoder (step A), the input image is divided into two or more division units through a picture division unit (at this time, setting information on the division method can be generated; step B), an additional area is set for each division unit according to the encoding setting (step C), and a bitstream can be generated by performing encoding on the division units including the additional area (step D). After the bitstream is generated, whether to resize each division unit (that is, whether to delete the additional area; step E) can be decided according to the encoding setting, and the encoded data with the additional area included or removed (the data according to D or E) can be stored in the memory (step F).

Referring to FIG. 12, in the decoder, the image to be decoded is divided into two or more division units by referring to division-related setting information obtained by parsing the received bitstream (step B), the size of the additional area for each division unit is set according to the decoding setting (step C), and the image data included in the bitstream is decoded to acquire image data having the additional area (step D). Next, the restored picture is generated by deleting the additional area (step E), and the restored picture can be output to the display (step F). At this time, whether to delete the additional area can be determined according to the decoding setting, and the decoded picture or image data (the data according to D or E) can be stored in the memory. Meanwhile, in step F, a process of restoring the reconstructed picture to a 360-degree image through surface rearrangement may be included.

On the other hand, in-loop filtering (assumed to be a deblocking filter in this example, although other in-loop filters can also be applied) may be performed adaptively at the division unit boundary depending on whether the additional area is removed in the above drawings. In addition, in-loop filtering can be performed adaptively depending on whether generation of an additional area is allowed.

When the additional area is removed before storage in the memory, the in-loop filter may or may not be applied explicitly according to a flag indicating whether the in-loop filter is applied at the boundary of the division unit (specifically, of the division unit in its initial state), such as loop_filter_across_enabled_flag (for tiles in this example).

Alternatively, a flag indicating whether the in-loop filter is applied at the boundary of the division unit may not be supported, and whether filtering is applied, the filtering setting, and the like can be determined implicitly as in the following examples.

First, even if there is image continuity between division units, once an additional area is generated for each division unit, the image continuity across the boundary between the division units in which the additional area is created may be lost. Applying an in-loop filter in this case causes an unnecessary increase in the amount of computation and a degradation of encoding performance, so the in-loop filter may implicitly not be applied.

In addition, depending on the arrangement of surfaces in a 360-degree image, there may be no image continuity between division units that are adjacent in the two-dimensional space. When in-loop filtering is performed on a boundary between division units having no image continuity, image quality deterioration may occur. Therefore, in-loop filtering may implicitly not be performed on boundaries between division units that do not have image continuity.

In addition, in the case of replacing a partial region of the current division unit by assigning weights to the two regions as in the description of FIG. 10, the boundary of each division unit becomes an inner boundary within the additional area, to which in-loop filtering could be applied. However, in-loop filtering may be unnecessary there, because the coding error can already be reduced by a method such as a weighted sum with the part of the current area that belongs to the additional area of another division unit. Therefore, in this case, the in-loop filter may implicitly not be performed.
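The implicit cases above can be combined into a single decision, sketched below. The text states only that filtering "may" be skipped in each case, so the priority order, the default of filtering when no flag is signalled, and all names are assumptions made for illustration.

```python
from typing import Optional


def boundary_filter_enabled(has_continuity: bool,
                            blended_with_additional_area: bool,
                            explicit_flag: Optional[bool] = None) -> bool:
    """Decide whether to apply the in-loop filter at a division unit boundary.

    has_continuity: the two units show continuous image content across
        the boundary (False, for example, for unrelated surfaces of a
        360-degree image).
    blended_with_additional_area: the boundary region was already
        corrected by a weighted sum with an additional area.
    explicit_flag: value of a signalled boundary flag, or None when no
        such flag is supported (assumed convention).
    """
    if not has_continuity:
        return False            # filtering would degrade image quality
    if blended_with_additional_area:
        return False            # coding error already reduced by blending
    if explicit_flag is not None:
        return explicit_flag    # explicit signalling decides
    return True                 # assumed default when nothing is signalled


print(boundary_filter_enabled(True, False, explicit_flag=True))
print(boundary_filter_enabled(False, False, explicit_flag=True))
print(boundary_filter_enabled(True, True))
```

An encoder could equally let the explicit flag take priority over the implicit cases; the ordering here is a design choice of the sketch, not something the text fixes.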

Further, whether to apply the in-loop filter may be determined based on a flag indicating whether the in-loop filter is applied (specifically, a flag provided in addition to the above cases). When the flag is activated, filtering may be applied according to the in-loop filter settings, conditions, etc. that are applied inside the division unit, or filtering defined differently from them may be applied, that is, in-loop filter settings, conditions, etc. defined for the boundary of the division unit (specifically, in addition to the in-loop filter settings, conditions, etc. used where the boundary is not a division unit boundary).

In the above embodiment, it is assumed that the additional area is removed before storage in the memory. However, part of the above may be a process that can be performed at another output stage (more specifically, for example, in the in-loop filter unit).

The above example assumes that additional areas are supported in every direction of each division unit; when additional areas are supported only in some directions according to the setting of the additional area, only part of the above may be applicable. For example, a boundary where an additional area is not supported may follow the existing setting, and a boundary where an additional area is supported may follow the above example, among various other cases. That is, the application can be determined adaptively for all or part of the unit boundaries according to the setting of the additional area.

The related setting information may be processed implicitly or explicitly according to the encoding/decoding setting. In the implicit case, it may be determined without generating a related syntax element (specifically, according to the characteristics, type, format, etc. of the image, or based on other encoding/decoding settings <in this example, settings related to the additional area>). In the explicit case, it can be adjusted through generation of the related syntax element, and the unit for this may be a video, a sequence, a picture, a sub-picture, a slice, a tile, and the like.

The following describes how to determine whether a division unit and its additional area can be referenced. At this time, the process is dependent encoding/decoding if reference is possible and independent encoding/decoding if reference is not possible.

The additional area according to an embodiment of the present invention may be referenced in the encoding/decoding process of the current image or of another image, or reference to it may be restricted. Specifically, an additional area that is removed before being stored in the memory may be referenced, or restricted from reference, in the encoding/decoding process of the current image. Further, an additional area stored in the memory may be referenced, or restricted from reference, in the encoding/decoding process of temporally different images as well as the current image.

In summary, whether the additional area can be referenced, the range of reference, and the like can be determined according to the encoding/decoding setting. According to some of the above settings, the additional area of the current image is encoded/decoded and stored in the memory, which means that it can be included in a reference image for another image, or that reference to it can be restricted. This can be applied to all or some of the division units belonging to the image. The cases described in the following examples are also applicable to the present example.

The setting information on whether the additional area can be referenced may be processed implicitly or explicitly according to the encoding/decoding setting. In the implicit case, it may be determined without generating a related syntax element (specifically, according to the characteristics, type, format, etc. of the image, or based on other encoding/decoding settings <in this example, settings related to the additional area>). In the explicit case, the setting for the reference possibility of the additional area can be adjusted through generation of the related syntax element, and the unit for this may be a video, a sequence, a picture, a sub-picture, a slice, a tile, and the like.

Generally, some units in the current image (in this example, division units obtained through the picture division unit are assumed) can refer to their own data and cannot refer to the data of other units. In addition, some units in the current image can refer to the data of all units existing in other images. The above may be an example of the general properties of units obtained through a picture division unit, and additional properties can be defined.

In addition, a flag can be defined that indicates whether or not another division unit can be referred to in the current image, and whether or not the division unit belonging to another image can be referred to.

For example, a division unit that belongs to another image and is located at the same position as the current division unit can be referenced, while a division unit at a different position from the current division unit may be restricted from reference. For example, when a plurality of bitstreams obtained by encoding the same image under different encoding settings are transmitted and the decoder selectively determines the bitstream used to decode each area (division unit) of the image (decoding in units of tiles is assumed in this example), the reference possibility between division units must be limited not only within the same space but also in other spaces, so encoding/decoding can be carried out with reference only to the same area of another image.

In one example, reference may be allowed or restricted according to the identifier information of the division unit. For example, division units can refer to each other when the identifier information allocated to them is the same. At this time, the identifier information may mean information indicating that (dependent) encoding/decoding has been performed in a mutually referable environment.

The related setting information can be processed implicitly or explicitly according to the encoding/decoding setting. In the implicit case, it can be determined without generating a related syntax element. In the explicit case, it can be processed through generation of the related syntax element, and the unit for this may be a video, a sequence, a picture, a sub-picture, a slice, a tile, and the like.
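The two reference rules described above (same position only, and matching identifier information) can be sketched as a simple check. The `Tile` record, its field names, and the `mode` strings are hypothetical and do not correspond to syntax elements defined in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    picture: int   # temporal index of the picture containing the tile
    index: int     # position of the tile within its picture
    ident: int     # identifier information allocated to the tile


def can_reference(current: Tile, candidate: Tile, mode: str) -> bool:
    """Return True if `current` may refer to data in `candidate`.

    mode "same_position": only the co-located tile may be referenced.
    mode "identifier": tiles with equal identifier information may
        refer to each other.
    """
    if mode == "same_position":
        return candidate.index == current.index
    if mode == "identifier":
        return candidate.ident == current.ident
    raise ValueError(f"unknown mode: {mode}")


cur = Tile(picture=1, index=0, ident=7)
co_located = Tile(picture=0, index=0, ident=3)
same_group = Tile(picture=0, index=2, ident=7)

print(can_reference(cur, co_located, "same_position"))
print(can_reference(cur, same_group, "same_position"))
print(can_reference(cur, same_group, "identifier"))
```

In the identifier mode a tile at a different position becomes referable as long as it carries the same identifier, which is the behavior the paragraph above describes.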

FIGS. 13A to 13G are exemplary diagrams for explaining the area that a specific division unit can refer to. In FIGS. 13A to 13G, the area indicated by the bold line may denote a referenceable area.

Referring to FIG. 13A, various reference arrows for performing inter-picture prediction can be identified. The C0 and C1 blocks show unidirectional inter-picture prediction: the C0 block can acquire the reference block RP0 in a picture preceding the current picture, and the C1 block can acquire the reference block RF0 in a picture following the current picture. The C2 block shows bidirectional inter-picture prediction, and the reference blocks RP1 and RF1 can be obtained in a picture preceding the current picture and in a picture following the current picture. Although the figure shows an example of acquiring one reference block in each of the forward and backward directions, it may also be possible to acquire reference blocks only in the forward direction or only in the backward direction. The C3 block shows non-directional inter-picture prediction, and the reference block RC0 can be obtained in the current picture. Although the figure shows an example of acquiring one reference block, it is also possible to acquire two or more reference blocks.

In the following examples, reference to pixel values and prediction mode information in inter-picture prediction according to the division unit will be mainly described. However, the description may also be understood to cover other encoding/decoding information that can be referred to spatially or temporally (for example, transform and quantization information, in-loop filter information, and the like).

Referring to FIG. 13B, the current picture (Current (t)) is divided into two or more tiles, and a block C0 of some tile can perform inter-picture prediction in one direction to obtain the reference blocks P0 and P1. A block C1 of some tile can perform inter-picture prediction in both directions to obtain the reference blocks P3 and F0. That is, this may be an example in which reference to a block at a different position in another image is permitted, without restrictions such as a limit on position or permitting reference only within the current picture.

Referring to FIG. 13C, the picture is divided into two or more tile units, and a block C1 of some tile can perform inter-picture prediction in one direction to obtain the reference blocks P2 and P3. A block C0 of some tile may perform inter-picture prediction in both directions to obtain the reference blocks P0, P1, F0, and F1. A block C3 of some tile may perform non-directional inter-picture prediction to obtain the reference block FC0.

In FIGS. 13B and 13C, it is thus possible to refer to a block at a different position in another image without restrictions such as a limit on position or permitting reference only within the current picture.

Referring to FIG. 13D, the current picture is divided into two or more tile units, and a block C0 of some tile can perform inter-picture prediction in the forward direction to obtain the reference block P0, but cannot obtain the reference blocks P1, P2, and P3. A block C4 of some tile can obtain the reference blocks F0 and F1 by performing inter-picture prediction in the backward direction, but cannot obtain the reference blocks F2 and F3. A block C3 of some tile may perform non-directional inter-picture prediction to obtain the reference block FC0, but cannot obtain the reference block FC1.

That is, in FIG. 13D, reference may be allowed or restricted depending on the division of the pictures (t-1, t, and t+1 in this example) or the encoding/decoding setting of the division units of the pictures. Specifically, reference may be possible only to blocks belonging to a tile having the same identifier information as the current tile.

Referring to FIG. 13E, the picture is divided into two or more tile units, and a block C0 of some tile can perform inter-picture prediction in both directions to obtain the reference blocks P0 and F0, but cannot obtain the reference blocks P1, P2, P3, F1, F2, and F3. That is, in FIG. 13E, reference may be allowed only to a tile located at the same position as the tile to which the current block belongs.

Referring to FIG. 13F, the picture is divided into two or more tile units, and a block C0 of some tile can perform inter-picture prediction in both directions to obtain the reference blocks P1 and F2, but cannot obtain the reference blocks P0, P2, P3, F0, F1, and F3. In FIG. 13F, information indicating the tiles that can be referenced by the current division unit is included in the bitstream, and the referenceable tiles can be identified with reference to that information.

Referring to FIG. 13G, the picture is divided into two or more tiles, and a block C0 of some tile can perform inter-picture prediction in one direction to obtain the reference blocks P0, P3, and P5, but cannot obtain some other reference blocks. A block C1 of some tile can perform inter-picture prediction in both directions to obtain the reference blocks P1, F0, and F2, but cannot obtain the reference blocks P2 and F1.

In FIG. 13G, reference may be allowed or restricted depending on the division of the pictures (t-3, t-2, t-1, t, t+1, t+2, and t+3 in this example) or the encoding/decoding setting of the division units of the pictures (in this example, determined by the identifier information of the division unit, the identifier information of the picture unit, whether the same area is divided, whether the area is similar, the bitstream information of the division unit, and the like). Here, a tile that can be referenced may be at the same or a similar location as the current tile in the image, may have the same identifier information as the current tile (specifically, in picture units or division units), and may be obtained from the same bitstream as the current tile.

FIGS. 14A to 14E are exemplary diagrams for explaining the possibility of reference to an additional area in a division unit according to an embodiment of the present invention. In FIGS. 14A to 14E, the area indicated by the bold line denotes a referenceable area, and the area indicated by the dotted line denotes an additional area for the division unit.

According to an embodiment of the present invention, the possibility of reference to some pictures (other pictures located temporally before or after) may be limited or allowed. Further, the possibility of reference to the entire extended division unit including the additional area can be limited or allowed. In addition, the possibility of reference can be allowed or limited only for the initial division unit excluding the additional area. Further, the possibility of reference to the boundary between the additional area and the initial division unit can be allowed or limited.

Referring to FIG. 14A, a block C0 of a tile may perform unidirectional inter-picture prediction to obtain reference blocks P0 and P1. A block C2 of a tile may perform bidirectional inter-picture prediction to obtain reference blocks P2, P3, F0, and F1. A block C1 of a tile may perform non-directional inter-picture prediction to obtain a reference block FC0. Here, the blocks may obtain reference blocks P0, P1, P2, P3, and F0 in the initial tile areas (basic tiles excluding the additional areas) of the reference pictures t-1 and t+1, and block C2 not only obtains reference blocks P2 and P3 in the initial tile area of reference picture t-1 but may also obtain reference block F1 in the tile area including the additional area of reference picture t+1. As can be seen from reference block F1, a reference block that straddles the boundary between the additional area and the initial tile area may be obtained.

Referring to FIG. 14B, blocks C0, C1, and C3 of some tiles may perform unidirectional inter-picture prediction to obtain reference blocks (P0, P1, P2), (F0, F2), and (F1, F3, F4), respectively. A block C2 of a tile may perform non-directional inter-picture prediction to obtain reference blocks FC0, FC1, and FC2.

Blocks C0, C1, and C3 may obtain reference blocks (P0, F0, F3) in the initial tile areas of the reference pictures (t-1 and t+1 in this example), obtain reference blocks (P1, x, F4) at the updated tile area boundaries, and obtain reference blocks (P2, F2, F1) outside the updated tile area boundaries.

Block C2 may obtain reference block FC1 in the initial tile area of a reference picture (t in this example), obtain reference block FC2 at the updated tile area boundary, and obtain reference block FC0 outside the updated tile area boundary.

Here, block C0 may be a block located in the initial tile area, block C1 may be a block located at the updated tile area boundary, and block C3 may be a block located outside the updated tile area boundary.

Referring to FIG. 14C, the picture is divided into two or more tile units; some pictures have additional areas in some tiles, some pictures lack additional areas in some tiles, and some pictures have no additional areas at all. Blocks C0 and C1 of some tiles may perform unidirectional inter-picture prediction to obtain reference blocks P2, F1, F2, and F3, but cannot obtain reference blocks P0, P1, P3, and F0. A block C2 of a tile may perform non-directional inter-picture prediction to obtain reference blocks FC1 and FC2, but cannot obtain reference block FC0.

Block C2 cannot acquire reference block FC0 in the initial tile area of a reference picture (t in this example), but can acquire reference block FC1 in the updated tile area. (Under some additional-area filling methods, FC0 and FC1 may be the same area: FC0 cannot be referenced under the initial tile partitioning, but it can be referenced once that area is brought into the current tile through the additional area.)

Block C2 may acquire reference block FC2 in another tile area of a reference picture (t in this example). (Basically, data of another tile of the current picture cannot be referred to, but here it is assumed that reference is possible because the tile has been set as referable according to the identifier information of the above embodiment.)

Referring to FIG. 14D, the picture is divided into two or more tile units and has additional areas. A block C0 of a tile may perform bidirectional inter-picture prediction to acquire reference blocks P0, F0, F1, and F3, but cannot acquire reference blocks P1, P2, P3, and F2.

Block C0 may obtain reference block P0 in the initial tile area (tile 0) of reference picture t-1, but is restricted so that it cannot obtain reference block P3 at the extended tile area boundary or reference block P2 outside the extended tile area boundary (i.e., beyond the additional area).

Block C0 may obtain reference block F0 in the initial tile area (tile 0) of reference picture t0, obtain reference block F1 at the extended tile area boundary, and obtain reference block F3 outside the extended tile area boundary.

Referring to FIG. 14E, a picture is divided into two or more tile units and has additional areas of at least one size and shape. A block C0 of a tile may perform unidirectional inter-picture prediction to acquire reference blocks P0, P3, P5, and F0, but cannot acquire reference block P2, which is located at the boundary between the additional area and the basic tile. A block C1 of a tile may perform bidirectional inter-picture prediction to acquire reference blocks P1, F2, and F3, but cannot acquire reference blocks P4, F1, and F5.

As described above, pixel values may be the target of reference, and reference restrictions may likewise be applied to other encoding/decoding information.

For example, when the prediction unit searches spatially adjacent blocks for an intra prediction mode candidate group in intra prediction, whether the division unit to which the current block belongs can refer to the division unit to which the adjacent block belongs may be checked by the methods shown in FIGS. 13A to 14E.

For example, when the prediction unit searches temporally and spatially adjacent blocks for a motion information candidate group in inter-picture prediction, whether the division unit to which the current block belongs can refer to the division unit to which a spatially adjacent block in the current picture, or a temporally adjacent block in another picture, belongs may be checked by the methods shown in FIGS. 13A to 14E.

For example, when in-loop-filter-related setting information is searched for in an adjacent block, whether the division unit to which the current block belongs can refer to the division unit to which the adjacent block belongs may be checked by the methods shown in FIGS. 13A to 14E.

Referring to FIG. 15, in this example the left, upper left, lower left, upper, and upper right blocks around the current block may be spatially adjacent reference candidate blocks. In addition, the left, upper left, lower left, upper, upper right, right, lower right, lower, and center blocks around the collocated block (the block at the same or corresponding position as the current block in a different picture temporally adjacent to the current picture) may be temporal reference candidate blocks. In FIG. 15, a thick outline is a boundary line indicating a division unit.

When the current block is M, all of the spatially adjacent blocks (G, H, I, L, Q) can be referenced.

When the current block is G, some of the spatially adjacent blocks (A, B, C, F, and K) can be referenced, and the rest of the blocks may be restricted. The reference availability can be determined according to the reference-related setting of the division unit (UC, ULC, LC) to which the spatially adjacent block belongs and the division unit to which the current block belongs.

When the current block is S, some of the blocks (s, r, m, w, n, x, t, o, y) located around the same position as the current block in the temporally adjacent image can be referenced, and the rest of the blocks may be restricted from reference. The reference availability can be determined according to the reference-related setting between the division unit (RD, DRD, DD) to which the block around the same position as the current block in the temporally adjacent image belongs and the division unit to which the current block belongs.

If there is a reference-restricted candidate according to the position of the current block, the candidate of the next order among the priorities of the candidate group configuration may be filled, or another candidate adjacent to the candidate whose reference is restricted may be substituted.

For example, in intra prediction, if the current block is G, the upper left block is restricted from reference, and the MPM candidate group is constructed in the order P - D - A - E - U, then A cannot be referenced. In that case, the candidate group may be constructed by checking the validity of the remaining candidates in order (E, then U), or a block spatially adjacent to A (for example, B) may be substituted for A.

Alternatively, in inter-picture prediction, if the current block is S, the temporally adjacent lower left block is restricted from reference, and the temporal candidate of the skip mode candidate group is y, then y cannot be referenced. In that case, the candidate group may be constructed by checking the validity of the candidates that follow y in the priority order, or y may be replaced with a block adjacent to y, such as t, x, or s.
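The candidate-filling behaviour described above can be sketched as follows. This is a minimal illustration in Python, not the normative derivation: the function name `build_candidates`, the `substitutes` table, and the candidate count of three are assumptions introduced for the example, while the priority order P - D - A - E - U is taken from the text.

```python
from typing import Callable, Dict, List, Optional

def build_candidates(priority: List[str],
                     is_referable: Callable[[str], bool],
                     substitutes: Optional[Dict[str, List[str]]] = None,
                     max_count: int = 3) -> List[str]:
    """Walk the priority order, skipping or substituting restricted blocks."""
    substitutes = substitutes or {}
    result: List[str] = []
    for block in priority:
        if len(result) >= max_count:
            break
        chosen = None
        if is_referable(block):
            chosen = block
        else:
            # Option 2 in the text: try a block adjacent to the restricted one.
            for alt in substitutes.get(block, []):
                if is_referable(alt):
                    chosen = alt
                    break
        # Option 1 in the text: otherwise simply move on to the next priority.
        if chosen is not None and chosen not in result:
            result.append(chosen)
    return result

restricted = {"A"}
referable = lambda b: b not in restricted
print(build_candidates(["P", "D", "A", "E", "U"], referable))
print(build_candidates(["P", "D", "A", "E", "U"], referable, {"A": ["B"]}))
```

With A restricted, the first call fills the third slot with the next-priority candidate E, and the second call fills it with the substitute B.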

Referring to FIG. 16, an apparatus 200 for encoding/decoding an image according to an exemplary embodiment of the present invention may include at least one processor 210 and a memory 220 for storing instructions that direct the at least one processor 210 to perform at least one step.

Here, the at least one processor 210 may be a central processing unit (CPU), a graphics processing unit (GPU), or a dedicated processor on which methods in accordance with embodiments of the present invention are performed. Each of the memory 220 and the storage device 260 may be constituted by at least one of a volatile storage medium and a non-volatile storage medium. For example, the memory 220 may comprise at least one of read-only memory (ROM) and random access memory (RAM).

In addition, the image encoding / decoding apparatus 200 may include a transceiver 230 for performing communication via a wireless network. The image encoding / decoding device 200 may further include an input interface device 240, an output interface device 250, a storage device 260, and the like. Each component included in the image encoding / decoding apparatus 200 may be connected by a bus 270 to perform communication with each other.

Here, the at least one step may include: dividing an encoded image included in a received bitstream into at least one division unit by referring to a syntax element obtained from the bitstream; setting an additional area for the at least one division unit; and decoding the encoded image based on the division unit in which the additional area is set.

Referring to FIG. 17, 35 prediction modes can be identified, and 35 prediction modes can be classified into 33 directional modes and 2 non-directional modes (DC, Planar). At this time, the directional mode can be identified by a slope (for example, dy / dx) or angle information. The above example may mean a prediction mode candidate group for a luminance component or a chrominance component. Alternatively, the chrominance components may be supported in some prediction modes (e.g., DC, Planar, vertical, horizontal, diagonal mode, etc.). In addition, when the prediction mode of the luminance component is determined, the mode may be included in the prediction mode of the chrominance component, or a mode derived from the chrominance component may be included in the prediction mode.

In addition, a reconstructed block of another color space, for which encoding/decoding has been completed, may be used for prediction of the current block by exploiting the correlation between color spaces, and a prediction mode supporting this may be included. For example, in the case of a chrominance component, the reconstructed block of the luminance component corresponding to the current block may be used to generate the prediction block of the current block.

The prediction mode candidate group can be determined adaptively according to the encoding/decoding setting. The number of candidates can be increased to improve prediction accuracy, or reduced to lower the number of bits spent on signalling the prediction mode.

For example, one of candidate group A (67 modes: 65 directional modes and 2 non-directional modes), candidate group B (35 modes: 33 directional modes and 2 non-directional modes), and candidate group C (19 modes: 17 directional modes and 2 non-directional modes) can be used. Unless otherwise described in the present invention, it is assumed that intra prediction is performed with one predetermined prediction mode candidate group (candidate group A).

The intra prediction method in image encoding according to an embodiment of the present invention may include a reference pixel forming step, a prediction block generating step using one or more prediction modes with reference to the constructed reference pixels, a step of determining an optimal prediction mode, and a step of encoding the determined prediction mode. In addition, the image encoding apparatus may be configured to include a reference pixel forming unit, a prediction block generating unit, a prediction mode determining unit, and a prediction mode encoding unit that implement the reference pixel forming step, the prediction block generating step, the prediction mode determining step, and the prediction mode encoding step, respectively. Some of the above-described processes may be omitted, other processes may be added, and the order may be changed from the order described above.

Meanwhile, the intra prediction method in image decoding according to an embodiment of the present invention may include constructing reference pixels and generating a prediction block of the current block according to a prediction mode obtained through a syntax element received from the image encoding apparatus.

The size and shape (M x N) of the current block on which intra prediction is performed may be obtained from the block division unit, and the size may range from 4 x 4 to 256 x 256. Intra prediction may generally be performed in units of prediction blocks, but may also be performed in units of coding blocks (or coding units), transform blocks (or transform units), etc., depending on the setting of the block division unit. After confirming the block information, the reference pixel forming unit may construct the reference pixels used for predicting the current block. In this case, the reference pixels may be managed through a temporary memory (e.g., a one-dimensional or two-dimensional array) that is generated and removed for each intra prediction process of a block, and the size of the temporary memory may be determined according to the configuration of the reference pixels.

The reference pixels may be pixels belonging to adjacent blocks (which may be referred to as reference blocks) positioned at the left, top, top-left, top-right, and bottom-left of the current block, but the present invention is not limited thereto, and other pixels may also be used for prediction. Here, the adjacent blocks at the left, top, top-left, top-right, and bottom-left positions are the blocks selected when encoding/decoding follows a raster or Z scan; if the scan order is different, pixels belonging to adjacent blocks at other positions (for example, right, bottom, or bottom-right) may serve as reference pixels.

In addition, the reference block may be a block corresponding to the current block in a color space different from the color space to which the current block belongs. Here, when the Y / Cb / Cr format is taken as an example, the color space may mean one of Y, Cb and Cr. The block corresponding to the current block may have the same positional coordinates as the current block or may have a positional coordinate corresponding to the current block according to the color component composition ratio.

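For convenience of explanation, it is assumed that the reference block at each of the predetermined positions (left, top, top-left, top-right, bottom-left) is composed of one block. However, depending on the block division, the reference block at a position may be composed of a plurality of blocks.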

In summary, the region adjacent to the current block may be the position of the reference pixels for intra prediction of the current block, and, depending on the prediction mode, the region corresponding to the current block in another color space may also be considered as the position of the reference pixels. In addition to the above examples, the position of the reference pixels may be defined according to the prediction mode, method, and the like. For example, when a prediction block is generated through a method such as block matching, the reference pixel position may be the region of the current image that has already been encoded/decoded before the current block, or a search range within that region (for example, an area to the left of, above, above-left of, or above-right of the current block).

Referring to FIG. 18, the reference pixels used for intra prediction of the current block (of size M × N) may consist of the left, top, top-left, top-right, and bottom-left adjacent pixels (Ref_L, Ref_T, Ref_TL, Ref_TR, Ref_BL). In FIG. 18, the notation P(x, y) denotes pixel coordinates.

Meanwhile, the pixels adjacent to the current block can be classified into at least one reference pixel layer. The pixels closest to the current block form ref_0 {pixels at a distance of 1 from the boundary pixels of the current block: p(-1, -1) to p(2M-1, -1) and p(-1, 0) to p(-1, 2N-1)}, the next adjacent pixels form ref_1 {pixels at a distance of 2 from the boundary pixels of the current block: p(-2, -2) to p(2M, -2) and p(-2, -1) to p(-2, 2N)}, and the next adjacent pixels form ref_2 {pixels at a distance of 3 from the boundary pixels of the current block: p(-3, -3) to p(2M+1, -3) and p(-3, -2) to p(-3, 2N+1)}. That is, the reference pixels can be classified into a plurality of reference pixel layers according to their pixel distance from the boundary pixels of the current block.
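The layer geometry can be made concrete with a short sketch. The Python function below enumerates the coordinates of layer ref_k for an M x N block, assuming that layer k lies at distance k+1 from the block boundary and spans p(-(k+1), -(k+1)) to p(2M-1+k, -(k+1)) along the top and p(-(k+1), -k) to p(-(k+1), 2N-1+k) along the left, which matches the ranges given for ref_2. The function name `ref_layer_coords` is hypothetical.

```python
from typing import List, Tuple

def ref_layer_coords(k: int, M: int, N: int) -> List[Tuple[int, int]]:
    """Return the (x, y) coordinates of reference pixel layer ref_k.

    The current block occupies x in [0, M-1], y in [0, N-1].
    Layer k sits at distance d = k + 1 from the block boundary (assumption
    generalised from the ranges listed for ref_0, ref_1 and ref_2).
    """
    d = k + 1
    # Top row, including the top-left corner and the top-right extension.
    top = [(x, -d) for x in range(-d, 2 * M + k)]
    # Left column, below the corner, including the bottom-left extension.
    left = [(-d, y) for y in range(-d + 1, 2 * N + k)]
    return top + left

layer2 = ref_layer_coords(2, 4, 4)
print(layer2[0], layer2[-1], len(layer2))
```

For a 4 x 4 block this gives 2M + 2N + 4k + 1 pixels per layer, so each additional layer costs four more reference pixels than the one before it.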

Here, the reference pixel layer can be set differently for each neighboring block. For example, reference pixels of the ref_0 layer may be used when one neighboring block of the current block is used as the reference block, and reference pixels of the ref_1 layer may be used when a different neighboring block is used as the reference block.

In general, the reference pixel set referred to when intra prediction is performed consists of the pixels that belong to the blocks neighboring the current block at the lower left, left, upper left, upper, and upper right and that lie in the ref_0 layer (the pixels nearest to the boundary pixels); such pixels are assumed unless otherwise described below. However, only the pixels belonging to some of the above-mentioned neighboring blocks may be used as the reference pixel set, or pixels belonging to two or more layers may be used as the reference pixel set. Here, the reference pixel set or layer may be determined implicitly (preset in the encoder/decoder) or explicitly (determined from information signalled by the encoder).

Here, it is assumed that a maximum of three reference pixel layers are supported, but more than three may be supported. The number of reference pixel layers and the number of reference pixel sets according to the positions of the referable neighboring blocks may be set differently according to the image type (I/P/B; here the image may be a picture, a slice, a tile, etc.), the color component, and the like, and the related information may be included in units of a sequence, a picture, a slice, a tile, or the like.

In the present invention, it is presupposed that lower indices (starting at 0 and incremented by 1) are allocated starting from the reference pixel layer closest to the current block, but the present invention is not limited thereto. Further, the reference-pixel-configuration-related information described later can be generated under this index setting (for example, a binarization in which shorter codes are assigned to smaller indices when one of a plurality of reference pixel sets is selected).

In addition, when two or more reference pixel layers are supported, a weighted average or the like can be applied to each reference pixel included in two or more reference pixel layers.

For example, a prediction block can be generated using reference pixels obtained by a weighted sum of the ref_0 and ref_1 layers in the figure. In this case, the pixels to which the weighted sum is applied in each reference pixel layer may be fractional-unit pixels as well as integer-unit pixels, depending on the prediction mode (for example, the directionality of the prediction mode). Alternatively, the prediction blocks obtained from each reference pixel layer may be combined with weights (for example, in ratios such as 7:1, 3:1, 2:1, or 1:1) to obtain one prediction block. In this case, a higher weight may be given to the prediction block obtained from the reference pixel layer closer to the current block.
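A weighted sum of two layers can be sketched as below. This Python example assumes that the samples of ref_0 and ref_1 have already been aligned along the prediction direction so that they can be combined position by position, and it uses integer arithmetic with rounding; the function name `blend_layers` and the default 3:1 ratio are illustrative choices, the ratio being one of those listed above.

```python
from typing import List

def blend_layers(ref0: List[int], ref1: List[int],
                 w0: int = 3, w1: int = 1) -> List[int]:
    """Combine co-located samples of two reference layers with weights w0:w1.

    ref0 is the layer nearest to the current block and therefore normally
    receives the larger weight. Rounding is to nearest (half rounds up).
    """
    if len(ref0) != len(ref1):
        raise ValueError("layers must be aligned to the same length")
    total = w0 + w1
    return [(w0 * a + w1 * b + total // 2) // total for a, b in zip(ref0, ref1)]

print(blend_layers([100, 120], [80, 120]))          # 3:1 weighting
print(blend_layers([100, 120], [80, 120], 1, 1))    # 1:1 weighting
```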

Assuming that information related to the reference pixel configuration is explicitly generated, indication information permitting an adaptive reference pixel configuration (adaptive_intra_ref_sample_enabled_flag in this example) may be generated in units of a video, a sequence, a picture, a slice, a tile, or the like.

If the indication information indicates that the adaptive reference pixel configuration is permitted (adaptive_intra_ref_sample_enabled_flag = 1 in this example), adaptive reference pixel configuration information (adaptive_intra_ref_sample_flag in this example) may be generated in units of a picture, a slice, a tile, a block, or the like.

If the configuration information indicates an adaptive reference pixel configuration (adaptive_intra_ref_sample_flag = 1 in this example), reference-pixel-configuration-related information (e.g., selection information for the reference pixel layer and set, such as intra_ref_idx in this example) may be generated in units of a picture, a slice, a tile, a block, or the like.

At this time, if the adaptive reference pixel configuration is not permitted or is not used, the reference pixels may be configured according to a predetermined setting. The predetermined setting is not limited to a single layer; for example, an implicit case may be possible in which ref_0 and ref_1 are selected as the reference pixel layers and a predicted pixel value is generated by a method such as a weighted sum of ref_0 and ref_1.
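The dependency between the three syntax elements can be sketched as follows. This Python example is a simplification: it reads all three elements from a single sequence of bits, whereas the text allows them to be signalled at different levels (sequence, picture, slice, tile, block). The truncated-unary binarization of intra_ref_idx and the function name `select_ref_layers` are assumptions made for illustration; only the syntax element names come from the text.

```python
from typing import Iterator, Tuple

def select_ref_layers(bits: Iterator[int], num_layers: int = 3,
                      default: Tuple[int, ...] = (0,)) -> Tuple[int, ...]:
    """Return the indices of the reference pixel layers to use."""
    enabled = next(bits)        # adaptive_intra_ref_sample_enabled_flag
    if not enabled:
        return default          # adaptive configuration not permitted
    adaptive = next(bits)       # adaptive_intra_ref_sample_flag
    if not adaptive:
        return default          # permitted but not used here
    # intra_ref_idx: truncated unary (assumed), so small indices cost fewer bits.
    idx = 0
    while idx < num_layers - 1 and next(bits) == 1:
        idx += 1
    return (idx,)

print(select_ref_layers(iter([0])))            # falls back to the preset layer
print(select_ref_layers(iter([1, 1, 1, 0])))   # explicit selection of ref_1
```

Passing a different `default`, such as `(0, 1)`, would model the implicit case in which ref_0 and ref_1 are both selected and combined by a weighted sum.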

In addition, the reference-pixel-configuration-related information (e.g., selection information for the reference pixel layer or set) may be configured over candidates such as ref_1, ref_2, and ref_3, but is not limited thereto.

Some examples of the reference pixel configuration have been described above, and these may be combined with various encoding/decoding information to determine the intra prediction settings. Here, the encoding/decoding information may include the image type, the color component, the size and shape of the current block, the prediction mode {the type of prediction mode (directional or non-directional), the direction of the prediction mode (vertical, horizontal, diagonal 1, diagonal 2, etc.)}, and the like, and the intra prediction setting (the reference pixel configuration in this example) may be determined according to the encoding/decoding information of a neighboring block or a combination of the encoding/decoding information of the current block and the neighboring block.

Referring to FIG. 19A, the reference pixels are composed only of the ref_0 reference pixel layer. After the reference pixels are constructed using the ref_0 pixels belonging to the neighboring blocks (for example, lower left, left, upper left, upper, and upper right), the subsequent intra prediction processes may be performed (reference pixel generation, reference pixel filtering, reference pixel interpolation, prediction block generation, post-processing filtering, etc.; some intra prediction processes may be performed adaptively depending on the reference pixel configuration). This example shows a case in which one preset reference pixel layer is used, so that setting information for the reference pixel layer is not generated, and intra prediction is performed using a non-directional mode.

Referring to FIG. 19B, the reference pixels are formed using both of the two supported reference pixel layers. That is, intra prediction may be performed after the reference pixels are constructed using the pixels belonging to layer ref_0 and layer ref_1 (or using a weighted average of the pixels belonging to the two layers). This example shows a case in which setting information for the reference pixel layer is not generated and intra prediction is performed using a certain directional prediction mode (from upper right to lower left, or the opposite direction, in the drawing).

Referring to FIG. 19C, the reference pixels are constructed using only one of the three supported reference pixel layers. This example shows a case in which there are a plurality of reference pixel layer candidates, setting information indicating which of them is used is generated, and intra prediction is performed using a certain directional prediction mode (from upper left to lower right in the drawing).

Referring to FIG. 20, drawing symbol a is a block having a size of 64 × 64 or more, drawing symbol b is a block having a size of 16 × 16 or more and less than 64 × 64, and drawing symbol c is a block having a size of less than 16 × 16.

If the block according to the drawing symbol a is a current block to be subjected to in-picture prediction, intra-picture prediction can be performed using the nearest neighbor reference pixel ref_0.

In addition, if the block according to the drawing symbol b is a current block to be subjected to in-frame prediction, intra prediction can be performed using two supportable reference pixel layers ref_0 and ref_1.

In addition, if the block according to the drawing symbol c is the current block to be subjected to in-frame prediction, intra prediction can be performed using three supportable reference pixel layers ref_0, ref_1, ref_2.

As the descriptions of drawing symbols a to c show, the number of supportable reference pixel layers can be determined differently according to the size of the current block to be subjected to intra prediction. In FIG. 20, the larger the current block, the higher the probability that the neighboring blocks are small, which may be the result of division caused by different image characteristics. Therefore, it is assumed that the number of supported reference pixel layers decreases as the block size increases. However, other variants, including the opposite case, are also possible.
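The size-dependent rule of FIG. 20 can be sketched as a small lookup. In this Python sketch the block size is compared by area (width x height) against 64 x 64 and 16 x 16, which is an assumption for handling non-square blocks that the text does not specify; the function name `num_ref_layers` is hypothetical.

```python
def num_ref_layers(width: int, height: int) -> int:
    """Number of supported reference pixel layers for a block (FIG. 20 rule)."""
    area = width * height
    if area >= 64 * 64:
        return 1      # symbol a: nearest layer ref_0 only
    if area >= 16 * 16:
        return 2      # symbol b: ref_0 and ref_1
    return 3          # symbol c: ref_0, ref_1 and ref_2

for size in (64, 32, 8):
    print(size, num_ref_layers(size, size))
```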

Referring to FIG. 21, the current block on which intra prediction is performed is rectangular. If the current block is rectangular with asymmetric horizontal and vertical lengths, the number of supported reference pixel layers adjacent to the longer (horizontal) boundary of the current block may be set large, and the number of supported reference pixel layers adjacent to the shorter (vertical) boundary may be set small. In the figure, two reference pixel layers are set adjacent to the horizontal boundary of the current block, and one reference pixel layer is set adjacent to the vertical boundary. This is because pixels adjacent to the short side of the current block are often far from many of the pixels included in the current block (because the other side is long), so prediction accuracy from them can be reduced. Accordingly, the number of supported reference pixel layers adjacent to the short side is set small here, although the opposite case may also be possible.
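The shape-dependent rule of FIG. 21 can be sketched as follows. This Python example assigns the larger layer count to the side along the longer dimension; the treatment of square blocks (both sides receive the larger count) and the function name `layers_per_side` are assumptions, while the counts of two and one follow the figure description.

```python
from typing import Dict

def layers_per_side(width: int, height: int,
                    many: int = 2, few: int = 1) -> Dict[str, int]:
    """Return the number of reference pixel layers for the top and left sides."""
    if width > height:
        # Wide block: the horizontal (top) boundary is the long side.
        return {"top": many, "left": few}
    if height > width:
        # Tall block: the vertical (left) boundary is the long side.
        return {"top": few, "left": many}
    return {"top": many, "left": many}   # square block (assumed behaviour)

print(layers_per_side(32, 8))
print(layers_per_side(8, 32))
```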

In addition, the reference pixel layer to be used for prediction can be determined differently depending on the type of the intra prediction mode or the position of the neighboring block adjacent to the current block. For example, a directional mode that uses pixels belonging to blocks adjacent to the top and top-right of the current block as reference pixels may use two or more reference pixel layers, whereas a directional mode that uses pixels belonging to blocks adjacent to the left and bottom-left of the current block as reference pixels may use only the nearest reference pixel layer.

On the other hand, if the prediction blocks generated through the reference pixel layers in the plurality of reference pixel layers are the same or similar to each other, generating the setting information of the reference pixel layer may be a result of additionally generating unnecessary data.

For example, if the distribution characteristics of the pixels constituting each reference pixel layer are similar or identical to each other, a similar or identical prediction block is generated regardless of which reference pixel layer is used. Therefore, there is no need to generate data for selecting a reference pixel layer. At this time, the distribution characteristics of the pixels constituting a reference pixel layer can be determined by comparing the average or variance of the pixels with a predetermined threshold value.

That is, if the reference pixel layers are the same or similar based on the finally determined intra-picture prediction mode, the reference pixel layer can be selected by a predetermined method (for example, selecting the nearest reference pixel layer).
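One way to realise this similarity check is sketched below in Python. The text only states that the average or variance is compared with a predetermined threshold; comparing the spread of per-layer means and variances across layers, the threshold values, and the function names `layers_similar` and `choose_layer` are all assumptions for illustration.

```python
from statistics import mean, pvariance
from typing import List, Optional

def layers_similar(layers: List[List[int]],
                   mean_thr: float = 2.0, var_thr: float = 4.0) -> bool:
    """True when all layers have nearly the same mean and variance."""
    means = [mean(layer) for layer in layers]
    variances = [pvariance(layer) for layer in layers]
    return (max(means) - min(means) <= mean_thr
            and max(variances) - min(variances) <= var_thr)

def choose_layer(layers: List[List[int]],
                 signalled_idx: Optional[int] = None) -> int:
    """Use the nearest layer when layers are similar, else the signalled one."""
    if layers_similar(layers):
        return 0                      # predetermined choice, nothing signalled
    if signalled_idx is None:
        raise ValueError("layer index must be signalled for dissimilar layers")
    return signalled_idx

print(choose_layer([[100, 102, 104], [101, 103, 105]]))
print(choose_layer([[100, 102, 104], [10, 200, 30]], signalled_idx=1))
```

Because the decoder can compute the same statistics from already reconstructed pixels, it can reach the same decision about whether selection information is present.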

At this time, the decoder may receive intra-picture prediction information (or intra-picture prediction mode information) from the encoding apparatus and determine whether to receive information for selecting a reference pixel hierarchy based on the received information.

Although reference pixels are formed using a plurality of reference pixel layers through the various examples, the present invention is not limited thereto, and various modifications may be possible and may be combined with other additional configurations.

The reference pixel forming unit of the intra prediction may include a reference pixel generating unit, a reference pixel interpolating unit, a reference pixel filter unit, and the like, and may include all or a part of the above configuration. Herein, a block including pixels which can be reference pixels may be referred to as a reference candidate block. Also, the reference candidate block may be a neighboring block that is generally adjacent to the current block.

It is possible to determine whether a pixel belonging to the reference candidate block can be used as a reference pixel according to the reference pixel availability (Availability) set for the reference candidate block in the reference pixel block.

The reference pixels can be determined to be unusable when at least one of the following conditions is satisfied. For example, if the reference candidate block is located outside the picture boundary, does not belong to the same division unit (for example, slice, tile, etc.) as the current block, or has not yet been encoded/decoded, it is determined that the pixels belonging to that reference candidate block cannot be referred to. If none of the above conditions is satisfied, the pixels can be judged to be usable.

Further, the use of reference pixels may be restricted by the encoding/decoding settings. For example, when a flag that restricts reference to a reference candidate block (for example, constrained_intra_pred_flag) is activated, pixels belonging to that reference candidate block may be restricted from being used as reference pixels. The flag may be applied when the reference candidate block is a block reconstructed by referring to an image temporally different from the current picture, in order to perform encoding/decoding that is robust to errors caused by various external factors including the communication environment.

Here, if the flag for limiting the reference is deactivated (for example, constrained_intra_pred_flag = 0 in the I picture type, or in the P or B picture type), the pixels of the reference candidate block can be used as reference pixels. In addition, when the flag for limiting the reference is activated (for example, constrained_intra_pred_flag = 1 in the P or B picture type), whether reference is possible can be determined depending on whether the reference candidate block was coded by intra prediction or by inter prediction. That is, if the reference candidate block was coded by intra prediction, it can be referred to regardless of whether the flag is activated, and if it was coded by inter prediction, whether it can be referred to is determined by whether the flag is activated.
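As an illustrative sketch only, the availability determination described above could be expressed as follows. The class name CandidateBlock, its fields, and the function name is_reference_usable are hypothetical and are not taken from the disclosure; in particular, the decoded field (whether encoding/decoding of the candidate block has been completed) is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class CandidateBlock:
    # All field names are hypothetical, chosen for this sketch only.
    inside_picture: bool       # block lies inside the picture boundary
    same_division_unit: bool   # same slice/tile as the current block
    decoded: bool              # assumption: encoding/decoding already completed
    is_intra: bool             # True: intra-coded, False: inter-coded


def is_reference_usable(block: CandidateBlock, constrained_intra_pred_flag: int) -> bool:
    """Return True if pixels of the candidate block may serve as reference pixels."""
    # Any single failing condition makes the block unusable.
    if not block.inside_picture:
        return False
    if not block.same_division_unit:
        return False
    if not block.decoded:
        return False
    # With the restricting flag activated, inter-coded blocks cannot be referred to;
    # intra-coded blocks remain usable regardless of the flag.
    if constrained_intra_pred_flag == 1 and not block.is_intra:
        return False
    return True
```

In this sketch an inter-coded neighbor is usable with constrained_intra_pred_flag = 0 and unusable with constrained_intra_pred_flag = 1, while an intra-coded neighbor is usable in both cases.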

Also, a reconstructed block at the position corresponding to the current block in another color space may be a reference candidate block. In this case, whether the reference candidate block can be referred to can be determined according to the coding mode of the reference candidate block. For example, if the current block belongs to a chrominance component (Cb, Cr), whether reference is possible can be determined according to the coding mode of the block at the position corresponding to the current block in the luminance component (Y). This may be an example corresponding to the case where the coding mode is determined independently for each color space.

The flag to limit the reference may be a setting applied in some video types (e.g., P or B slice / tile type, etc.).

Based on the reference pixel availability, the reference candidate blocks can be classified as all usable, partially usable, or all unusable. In every case except the one in which all are usable, the reference pixels at the positions of the unusable candidate blocks can be filled or generated.

If a reference candidate block is available, a pixel at a predetermined position of the block (or a pixel adjacent to the current block) can be stored in the reference pixel memory of the current block. At this time, the pixel data of the corresponding block position may be copied as it is or may be stored in the reference pixel memory through a process such as reference pixel filtering.

If the reference candidate block is unavailable, the pixel obtained through the reference pixel generation process can be included in the reference pixel memory of the current block.

In summary, a reference pixel can be configured from the block when the reference candidate block is usable, and a reference pixel can be generated when the reference candidate block is unusable.

A method of filling a reference pixel at a predetermined position in an unusable reference candidate block is as follows. First, a reference pixel can be generated using an arbitrary pixel value. Here, the arbitrary pixel value is a specific pixel value belonging to the pixel value range, and may be the minimum value, the maximum value, or the median value of the pixel values used in a pixel value adjustment process based on bit depth or in a pixel value adjustment process based on pixel value range information of the image, or may be a value derived from those values. Generating a reference pixel with an arbitrary pixel value may be applied when all the reference candidate blocks are unusable.
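A minimal sketch of the bit-depth-based case is given here for illustration. The function name and the choice argument are hypothetical, and taking 1 << (bit_depth - 1) as the median of the range is an assumption of this sketch rather than a value stated in the description.

```python
def default_reference_value(bit_depth: int, choice: str = "median") -> int:
    """Pick an arbitrary fill value from the pixel value range given by the bit depth."""
    min_value = 0
    max_value = (1 << bit_depth) - 1
    if choice == "min":
        return min_value
    if choice == "max":
        return max_value
    if choice == "median":
        # Assumption: the mid-point of the range is used as the median value.
        return 1 << (bit_depth - 1)
    raise ValueError("choice must be 'min', 'max', or 'median'")


def fill_all_unusable(num_pixels: int, bit_depth: int) -> list:
    """Fill every reference pixel position when no reference candidate block is usable."""
    return [default_reference_value(bit_depth)] * num_pixels
```

For an 8-bit image this sketch fills every position with 128, and for a 10-bit image with 512.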

Next, a reference pixel can be generated by using pixels belonging to a block adjacent to the unusable reference candidate block. Specifically, the pixels belonging to the adjacent block can be extrapolated, interpolated, or copied to fill a predetermined position in the unusable reference candidate block. The direction in which the copying or extrapolation is performed may be clockwise or counterclockwise, and may be determined according to the encoding/decoding settings. For example, the reference pixel generation direction may follow one predetermined direction or may be determined adaptively according to the position of the unusable block.

Referring to FIG. 22A, a method of filling pixels belonging to an unusable reference candidate block among reference pixels made up of one reference pixel layer can be seen. In FIG. 22A, when the neighboring block adjacent to the upper right of the current block is an unusable reference candidate block, the reference pixels (denoted by < 1 >) belonging to that neighboring block can be generated by extrapolating, or linearly extrapolating, in the clockwise direction the reference pixels belonging to the neighboring block adjacent to the upper end of the current block.

In FIG. 22A, when the neighboring block adjacent to the left of the current block is an unusable reference candidate block, the reference pixels (denoted by < 2 >) belonging to that neighboring block can be generated by extrapolating, or linearly extrapolating, in the counterclockwise direction the reference pixels belonging to a neighboring block that corresponds to a usable block. If the extrapolation or linear extrapolation is instead performed in the clockwise direction, the reference pixels belonging to the neighboring block adjacent to the lower left end of the current block can be used.

Also, in FIG. 22A, some of the reference pixels (denoted by < 3 >) belonging to the neighboring block on the upper side of the current block can be generated by interpolating, or linearly interpolating, the usable reference pixels on both sides. That is, a setting is also possible in which only some, rather than all, of the reference pixels belonging to a neighboring block are unusable. In this case, the usable reference pixels can be used to fill the unusable reference pixels.
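The filling behavior for a single reference pixel layer can be sketched as follows, under stated assumptions. The reference pixels are assumed to be stored as a one-dimensional list ordered along the block boundary, with None marking unusable positions; the function name fill_unavailable, the copy-based extrapolation, and the rounding rule are choices of this sketch and not details given in the description.

```python
from typing import List, Optional


def fill_unavailable(ref: List[Optional[int]], default: int = 128) -> List[int]:
    """Fill unusable (None) reference pixels from usable neighbors in the same layer."""
    n = len(ref)
    out = list(ref)
    if all(p is None for p in out):
        # No usable pixel at all: fall back to an arbitrary value.
        return [default] * n
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # index of the first usable pixel after the run, or n
        left = out[start - 1] if start > 0 else None
        right = out[end] if end < n else None
        span = end - start + 1
        for k in range(start, end):
            if left is not None and right is not None:
                # Usable pixels on both sides: linear interpolation with rounding.
                t = k - start + 1
                out[k] = (left * (span - t) + right * t + span // 2) // span
            elif left is not None:
                # Usable pixel on one side only: extrapolate by copying it forward.
                out[k] = left
            else:
                # Usable pixel on the other side only: extrapolate by copying it backward.
                out[k] = right
    return out
```

With this sketch, a run of unusable pixels between two usable pixels is filled by interpolation, while a run at either end of the list is filled by copying the nearest usable pixel, which corresponds to filling in one scan direction.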

Referring to FIG. 22B, a method of filling unusable reference pixels when some of the reference pixels composed of a plurality of reference pixel layers are unusable can be seen. In FIG. 22B, when the neighboring block adjacent to the upper right of the current block is an unusable reference candidate block, the pixels (denoted by < 1 >) of the three reference pixel layers belonging to that neighboring block can be generated in the clockwise direction using pixels belonging to the neighboring block adjacent to the upper end of the current block (corresponding to a usable block).

In FIG. 22B, if the neighboring block adjacent to the left of the current block is an unusable reference candidate block and the neighboring blocks adjacent to the upper left end and the lower left end of the current block are usable reference candidate blocks, the reference pixels of the unusable reference candidate block can be generated in the clockwise direction, the counterclockwise direction, or both directions.

At this time, the unusable reference pixels of each reference pixel layer can be generated using pixels of the same reference pixel layer, but the use of pixels of a different reference pixel layer is not excluded. For example, in FIG. 22B, assume that the reference pixels (denoted by < 3 >) across the three reference pixel layers belonging to the neighboring block adjacent to the upper end of the current block are unusable reference pixels. In this case, the pixels belonging to the reference pixel layer ref_0 closest to the current block can be generated using usable reference pixels belonging to the same reference pixel layer. In addition, the pixels belonging to the reference pixel layer ref_1, separated from the current block by a distance of one pixel, can be generated using pixels belonging to the same reference pixel layer ref_1 as well as pixels belonging to the other reference pixel layers ref_0 and ref_2. Here, the unusable reference pixels can be filled from the usable reference pixels on both sides by a method such as quadratic linear interpolation.

The above example shows how reference pixels are generated when some reference candidate blocks are unusable in a configuration where a plurality of reference pixel layers are used as reference pixels. Alternatively, depending on the encoding/decoding settings (for example, when at least one reference candidate block is unusable, or when all reference candidate blocks are unusable), a setting that does not allow the adaptive reference pixel configuration (in this example, adaptive_intra_ref_sample_flag = 0) may be possible. That is, the reference pixels can be configured according to a predetermined setting without any additionally generated information.

The reference pixel interpolation unit can generate reference pixels in fractional units through linear interpolation of the reference pixels. In the present invention, this process is assumed to be part of the reference pixel forming unit, but it may instead be included in the prediction block generating unit and can be understood as a process performed before the prediction block is generated.
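As a minimal sketch, fractional-unit reference pixels could be obtained by 2-tap linear interpolation between the two nearest integer-unit pixels. The function name, the representation of the position in units of 1/precision pixel, and the default precision of 32 are assumptions of this sketch.

```python
def interpolate_linear(ref, pos, precision=32):
    """Return the reference pixel at position pos, given in units of 1/precision pixel."""
    idx, frac = divmod(pos, precision)
    if frac == 0:
        # Integer-unit position: no interpolation is needed.
        return ref[idx]
    a, b = ref[idx], ref[idx + 1]
    # Weighted average of the two neighboring integer-unit pixels, with rounding.
    return ((precision - frac) * a + frac * b + precision // 2) // precision
```

For example, with integer-unit pixels 100 and 132, the half-pixel position (pos = 16) yields 116 and the quarter-pixel position (pos = 8) yields 108.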

Although this process is assumed to be separate from the reference pixel filter unit described later, the two can be integrated into one process. This may address the case in which distortion arises in the reference pixels because the number of filtering operations applied to them increases when filtering is applied multiple times through the reference pixel interpolation unit and the reference pixel filter unit.

The reference pixel interpolation process is not performed in certain prediction modes (for example, a horizontal mode, a vertical mode, and some diagonal modes such as a diagonal down left mode; that is, modes in which fractional-unit interpolation is not required when generating a prediction block), and may be performed in the other prediction modes (modes in which fractional-unit interpolation is required when generating a prediction block).

The interpolation precision (for example, pixel units of 1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, etc.) can be determined according to the prediction mode. For example, in the case of a prediction mode having a 45-degree angle, an interpolation process is not required, and in the case of a prediction mode having an angle of 22.5 degrees or 67.5 degrees, half-pixel interpolation is required. As described above, at least one interpolation precision and a maximum interpolation precision can be determined according to the prediction mode.

For reference pixel interpolation, one of a plurality of interpolation filters (for example, a 2-tap linear interpolation filter, a 4-tap cubic filter, a 4-tap Gaussian filter, a 6-tap Wiener filter, an 8-tap Kalman filter, etc.) can be used according to the encoding/decoding settings. Here, the interpolation filters can be distinguished by differences in the number of filter taps (i.e., the number of pixels to which filtering is applied) and in the filter coefficients.

The interpolation may be performed step by step in order from low precision to high precision (e.g., 1/2 -> 1/4 -> 1/8), or may be performed all at once. In the former case, interpolation is performed based on pixels in integer units and pixels in fractional units (pixels previously interpolated at a lower precision than the pixel currently being interpolated). In the latter case, interpolation is performed based on pixels in integer units.

When one of a plurality of filter candidates is used, the filter selection information can be explicitly generated or implicitly determined, and can be determined according to the encoding/decoding settings (e.g., interpolation precision, block size, block shape, prediction mode, etc.). In the explicit case, the unit in which the information is generated may be a video, sequence, picture, slice, tile, block, or the like.

For example, for an interpolation precision of 1/4 or more (1/2, 1/4), an 8-tap Kalman filter is applied to reference pixels in integer units; for an interpolation precision of less than 1/4 and 1/16 or more (1/8, 1/16), a 4-tap Gaussian filter is applied to reference pixels in integer units and to interpolated reference pixels in units of 1/4 or more; and for an interpolation precision of less than 1/16 (1/32, 1/64), a 2-tap linear filter can be applied to reference pixels in integer units and to interpolated reference pixels in units of 1/16 or more.
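One possible reading of this precision-based selection is sketched here. The function name is hypothetical, the returned strings are labels only, and the threshold boundaries (1/4 and 1/16) reflect an interpretation of the example rather than a normative rule.

```python
from fractions import Fraction


def select_filter_by_precision(precision: Fraction) -> str:
    """Map an interpolation precision (e.g. Fraction(1, 8)) to a filter label."""
    if precision >= Fraction(1, 4):
        # Coarse precisions (1/2, 1/4): longest filter in this example.
        return "8-tap Kalman"
    if precision >= Fraction(1, 16):
        # Intermediate precisions (1/8, 1/16).
        return "4-tap Gaussian"
    # Fine precisions (1/32, 1/64).
    return "2-tap linear"
```

A comparable mapping could be written for the block-size-based or prediction-mode-based selection, with the thresholds replaced accordingly.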

Alternatively, an 8-tap Kalman filter can be applied to blocks of 64 × 64 or larger, a 6-tap Wiener filter to blocks smaller than 64 × 64 and of 16 × 16 or larger, and a 4-tap Gaussian filter to blocks smaller than 16 × 16.

Alternatively, a 4-tap Gaussian filter may be applied for a prediction mode with a 22.5 degree angular difference based on a vertical or horizontal mode and a 4-tap Gaussian filter for a prediction mode with a 22.5 degree angular difference or more.

In addition, the filter candidate group may be composed of a 4-tap cubic filter, a 6-tap Wiener filter, and an 8-tap Kalman filter under some encoding/decoding settings, and of a 2-tap linear filter and a Wiener filter under other encoding/decoding settings.

Referring to FIG. 23A, a method of interpolating a pixel in a fractional unit when one reference pixel layer ref_i is supported as reference pixels can be seen. Specifically, interpolation can be performed by applying filtering (the filtering function is denoted by int_func_1D) to the pixels adjacent to the interpolation target pixel (denoted by x); in this example, the filter is assumed to be applied to integer-unit pixels. Here, since one reference pixel layer is used as reference pixels, interpolation can be performed using adjacent pixels belonging to the same reference pixel layer as the interpolation target pixel x.

Referring to FIG. 23B, a method of obtaining interpolated pixels in a fractional unit when two or more reference pixel layers ref_i, ref_j, and ref_k are supported as reference pixels can be seen. In FIG. 23B, when the reference pixel interpolation process is performed in the reference pixel layer ref_j, the interpolation target pixel in a fractional unit can be interpolated by additionally using the other reference pixel layers ref_k and ref_i. Specifically, first interpolation pixels (x_k, x_j, x_i) are obtained by applying filtering (an interpolation process; the int_func_1D function) to the adjacent pixels (a_k ~ h_k, a_j ~ h_j, a_i ~ h_i) of the interpolation target pixel (at the position of x_j) and of the pixels (x_k, x_i) that belong to the other reference pixel layers at positions corresponding to the interpolation target pixel. The final interpolation pixel (x) in the reference pixel layer ref_j can then be obtained by performing additional filtering on the obtained first interpolation pixels (x_k, x_j, x_i); this additional filtering need not be an interpolation process and may be, for example, a weighted average such as a [1, 2, 1] / 4 or [1, 6, 1] / 8 filter. In this example, the pixels (x_k, x_i) of the other reference pixel layers corresponding to the interpolation target pixel are assumed to be fractional-unit pixels that can be obtained through an interpolation process.

In the above example, the first interpolation pixel is obtained through filtering in each reference pixel layer, and the final interpolation pixel is obtained by performing additional filtering on the first interpolation pixels. However, the adjacent pixels a_k to h_k, a_j to h_j, and a_i to h_i of the respective reference pixel layers may instead be filtered together to obtain the final interpolation pixel at once.
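The two-stage variant can be sketched as follows. The function int_func_1d mirrors the int_func_1D notation of the description, while the function name interpolate_multi_layer, the rounding rule, and the coefficient set are assumptions of this sketch. The 8-tap coefficients used in the usage note are those of the HEVC half-sample luma filter, chosen only as a familiar stand-in and not as the filter of the disclosure.

```python
def int_func_1d(pixels, coeffs):
    """Apply a 1-D filter: rounded weighted average of pixels with the given coefficients."""
    total = sum(coeffs)
    acc = sum(p * c for p, c in zip(pixels, coeffs))
    return (acc + total // 2) // total


def interpolate_multi_layer(layer_k, layer_j, layer_i, coeffs, cross=(1, 2, 1)):
    """Two-stage interpolation across three reference pixel layers.

    Each layer argument holds the adjacent pixels (a..h) of that layer.
    """
    # Stage 1: first interpolation pixel in each reference pixel layer.
    x_k = int_func_1d(layer_k, coeffs)
    x_j = int_func_1d(layer_j, coeffs)
    x_i = int_func_1d(layer_i, coeffs)
    # Stage 2: additional filtering (weighted average) across the layers.
    return int_func_1d([x_k, x_j, x_i], cross)
```

For instance, with coeffs = [-1, 4, -11, 40, 40, -11, 4, -1] and three flat layers of values 80, 100, and 120, stage 1 returns 80, 100, and 120, and the [1, 2, 1] / 4 stage returns 100 for the layer ref_j.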

Of the three reference pixel layers supported in FIG. 23B, the layer actually used as reference pixels may be ref_j. In other words, this may be the case in which the other reference pixel layers are included in the candidate group for interpolating the one reference pixel layer that is configured as reference pixels (here, not being configured as reference pixels means that the pixels of that reference pixel layer are not used for prediction, although, strictly speaking, those pixels are referred to for interpolation and so may be regarded as being used).

Referring to FIG. 23C, an example in which both of the two supported reference pixel layers are used as reference pixels can be seen. The final interpolation pixel x can be obtained by taking, as input pixels, the pixels adjacent to the fractional-unit position to be interpolated in each of the supported reference pixel layers (d_i, d_j, e_i, e_j in this example) and performing filtering on them. However, as in FIG. 23B, it is also possible to obtain a first interpolation pixel in each reference pixel layer and perform additional filtering on the first interpolation pixels to obtain the final interpolation pixel x.

The above example is not limited to the reference pixel interpolation process but can be understood as a process that is combined with another process of intra-picture prediction (for example, a reference pixel filter process, a prediction block generation process, etc.).

The reference pixel filter unit is generally intended for smoothing with a low-pass filter (for example, a 3-tap or 5-tap filter such as [1, 2, 1] / 4 or [2, 3, 6, 3, 2] / 16), but other types of filters (e.g., a high-pass filter) may be used depending on the purpose of the filter application (e.g., sharpening). In the present invention, the description focuses on filtering for the purpose of smoothing, performed to reduce the degradation that occurs in the encoding/decoding process.
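A minimal sketch of 3-tap smoothing with the [1, 2, 1] / 4 filter follows. The function name is hypothetical, and leaving the two end pixels unfiltered is an assumption of this sketch, since the description does not state how the ends of the reference pixel line are handled.

```python
def smooth_reference_pixels(ref):
    """Smooth a line of reference pixels with the [1, 2, 1] / 4 low-pass filter."""
    if len(ref) < 3:
        return list(ref)
    out = list(ref)
    for i in range(1, len(ref) - 1):
        # Rounded weighted average of the pixel and its two neighbors.
        out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
    return out
```

Applied to [10, 20, 60, 20, 10], this sketch returns [10, 28, 40, 28, 10], which shows the isolated peak being attenuated.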

Whether reference pixel filtering is performed can be determined according to the encoding/decoding settings. However, since applying or skipping filtering uniformly does not reflect the local characteristics of the image, filtering based on the local characteristics of the image may be advantageous for improving coding performance. Here, the characteristics of the image can be determined not only from the image type, the color component, the quantization parameter, and the encoding/decoding information of the current block (for example, the size, shape, division information, and prediction mode of the current block), but also from the encoding/decoding information of neighboring blocks and from combinations of the encoding/decoding information of the current block and the neighboring blocks. Further, they can be determined according to the reference pixel distribution characteristics (for example, the variance or standard deviation of the reference pixel region, flat regions, discontinuous regions, etc.).

Referring to FIG. 24A, filtering is not applied when the block belongs to a classification (category 0) according to some encoding/decoding settings (for example, block size range A, prediction mode B, color component C, etc.), and filtering can be applied when it belongs to a classification (category 1) according to other encoding/decoding settings (for example, the prediction mode A of the current block, the prediction mode B of a predetermined neighboring block, etc.).

Referring to FIG. 24B, filtering is not applied when the block belongs to a classification (category 0) according to some encoding/decoding settings (for example, the size A of the current block, the size B of a neighboring block, the prediction mode C of the current block, etc.); filtering is performed using filter A when it belongs to a classification (category 1) according to some encoding/decoding settings (for example, the size A of the current block, the shape B of the current block, the size C of a neighboring block, etc.); and filtering can be performed using filter B when it belongs to a classification (category 2) according to some encoding/decoding settings (for example, the parent block A of the current block, the parent block B of a neighboring block, etc.).

Therefore, whether filtering is applied, the type of filter, whether filter information is encoded (explicit / implicit), the number of filtering passes, and the like can be determined according to the sizes of the current block and the neighboring blocks, the prediction mode, and so on, and the types of filter can be distinguished by differences in the number of taps, the filter coefficients, and the like. When filtering is applied two or more times, the same filter may be applied several times or a different filter may be applied each time.

In the above example, the reference pixel filtering may be preset according to the characteristics of the image. That is, the filter-related information may be implicitly determined. However, if the characteristics of the image are not judged appropriately, coding efficiency may be adversely affected, so this possibility needs to be taken into account.

To prevent the above case, the reference pixel filtering can be set explicitly. For example, information on whether filtering is applied may be generated. In this case, if there is only one filter, no filter selection information is generated, and if there are a plurality of filter candidates, filter selection information may be generated.
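The explicit setting can be sketched on the decoder side as follows. The bit reader, the function names, the one-bit flag for whether filtering is applied, and the fixed-length coding of the filter index are all assumptions of this sketch; the description does not specify the syntax or its binarization.

```python
def make_bit_reader(bits):
    """Return a read_bits(n) function over a sequence of 0/1 values (toy bitstream)."""
    it = iter(bits)

    def read_bits(n):
        value = 0
        for _ in range(n):
            value = (value << 1) | next(it)
        return value

    return read_bits


def parse_reference_filter_info(read_bits, num_filter_candidates):
    """Parse whether filtering is applied and, if needed, which filter is selected."""
    apply_filter = read_bits(1) == 1
    filter_idx = None
    if apply_filter:
        if num_filter_candidates > 1:
            # Several candidates: filter selection information is present.
            n_bits = (num_filter_candidates - 1).bit_length()
            filter_idx = read_bits(n_bits)
        else:
            # A single filter: no selection information is generated.
            filter_idx = 0
    return apply_filter, filter_idx
```

With three filter candidates and the bits 1, 1, 0, this sketch reports that filtering is applied with filter index 2; with a single filter, only the one-bit flag is read.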

Although implicit and explicit settings for reference pixel filtering have been described separately in the above examples, a mixed configuration may also be possible, in which the setting is determined explicitly in some cases and implicitly in others. Here, implicit means that the decoder can derive the information related to the reference pixel filter (e.g., information on whether filtering is applied, filter type information).

Referring to FIG. 25, categories may be classified according to the characteristics of the image identified through the encoding/decoding information, and reference pixel filtering may be performed adaptively according to the classified category.

For example, if the block is classified as category 0, filtering is not applied, and if it is classified as category 1, filtering A is applied. In the case of

Claims

A video decoding method using a division unit including an additional region, the method comprising:

dividing an encoded image included in a received bitstream into at least one division unit by referring to a syntax element obtained from the bitstream;

setting an additional region for the at least one division unit; and

decoding the encoded image based on the division unit for which the additional region is set.

The method of claim 1,

wherein the decoding of the encoded image comprises:

determining a reference block for a current block to be decoded in the encoded image according to information, included in the bitstream, indicating whether reference is possible.

The method of claim 2,

wherein the reference block is a block located at a position overlapping an additional region set in the division unit to which the reference block belongs.