KR20130050899A - Encoding method and apparatus, and decoding method and apparatus of image

Encoding method and apparatus, and decoding method and apparatus of image

Info

Publication number
KR20130050899A
Authority
KR
South Korea
Prior art keywords
block
prediction
reference block
size
interpolation
Prior art date
Application number
KR1020120125803A
Other languages
Korean (ko)
Inventor
이배근
권재철
Original Assignee
주식회사 케이티
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 케이티 filed Critical 주식회사 케이티
Priority to PCT/KR2012/009372 priority Critical patent/WO2013069974A1/en
Publication of KR20130050899A publication Critical patent/KR20130050899A/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/527 Global motion vector estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed are a method and apparatus for encoding an image, and a method and apparatus for decoding an image. The image decoding method includes entropy decoding to acquire motion information of a current block, and generating a prediction block corresponding to the current block based on the motion information, wherein the motion information includes a motion vector calculated based on sub-pixels at sub-integer positions obtained through extrapolation and interpolation of a reference block.

Description

A video encoding method and apparatus, and a video decoding method and apparatus {ENCODING METHOD AND APPARATUS, AND DECODING METHOD AND APPARATUS OF IMAGE}

The present invention relates to encoding/decoding of an image and, more particularly, to an interpolation method in inter prediction.

Recently, demand for high-resolution, high-quality images such as high definition (HD) and ultra high definition (UHD) images has been increasing in various applications. As video data reaches higher resolution and higher quality, the amount of data increases relative to existing video data, so transmitting the data over a conventional wired/wireless broadband line or storing it on an existing medium raises transmission and storage costs. High-efficiency image compression techniques can be used to solve these problems arising from high-resolution, high-quality image data.

Image compression technology includes various techniques such as inter prediction, which predicts pixel values included in the current picture from pictures before or after the current picture; intra prediction, which predicts pixel values included in the current picture using pixel information within the current picture; and entropy encoding, which assigns short codes to frequently occurring values and long codes to rarely occurring values. Using such techniques, image data can be effectively compressed for transmission or storage.

The present invention provides a method of performing extrapolation and interpolation of a reference block in inter prediction to increase encoding / decoding efficiency of an image.

The present invention provides an apparatus for performing an extrapolation and interpolation method of a reference block during inter prediction to increase encoding / decoding efficiency of an image.

According to one aspect of the present invention, an image decoding method is provided. The method includes entropy decoding to obtain motion information for a current block, and generating a prediction block corresponding to the current block based on the motion information, wherein the motion information includes a motion vector calculated based on sub-pixels at sub-integer positions obtained through extrapolation and interpolation of a reference block.

In the generating of the prediction block, extrapolation may be performed based on reference pixels in the reference block to obtain a final reference block, and interpolation may be performed based on the final reference block to calculate the motion vector.

The size of the reference block may be greater than or equal to the size of the current block and smaller than (the size of the current block + the interpolation filter tap length - 1), and the size of the final reference block may be larger than the size of the reference block.

The extrapolation may generate an extrapolated reference pixel based on at least one reference pixel in the reference block, and the final reference block may include the extrapolated reference pixel and a reference pixel in the reference block.

The interpolation may be performed using the extrapolated reference pixel located in the horizontal direction or the vertical direction of the subpixel to be interpolated and the reference pixel in the reference block.

According to another aspect of the present invention, an image decoding apparatus is provided. The apparatus includes an entropy decoder that obtains motion information about a current block, and a predictor that generates a prediction block corresponding to the current block based on the motion information, wherein the motion information includes a motion vector calculated based on sub-pixels at sub-integer positions obtained through extrapolation and interpolation of a reference block.

According to another aspect of the present invention, a video encoding method is provided. The method includes performing prediction on a current block based on a motion vector calculated using sub-pixels at sub-integer positions, and entropy encoding information on the prediction, wherein the sub-pixels at sub-integer positions are generated through extrapolation and interpolation of a reference block.

In the performing of the prediction, a final reference block may be obtained by performing extrapolation based on reference pixels in the reference block, and the motion vector may be calculated by performing interpolation based on the final reference block.

The size of the reference block may be greater than or equal to the size of the current block and smaller than (the size of the current block + the interpolation filter tap length - 1), and the size of the final reference block may be larger than the size of the reference block.

The extrapolation may generate an extrapolated reference pixel based on at least one reference pixel in the reference block, and the final reference block may include the extrapolated reference pixel and a reference pixel in the reference block.

The interpolation may be performed using the extrapolated reference pixel located in the horizontal direction or the vertical direction of the subpixel to be interpolated and the reference pixel in the reference block.

According to another aspect of the present invention, an image encoding apparatus is provided. The apparatus includes a predictor that performs prediction on a current block based on a motion vector calculated using sub-pixels at sub-integer positions, and an entropy encoder that entropy encodes information on the prediction, wherein the sub-pixels at sub-integer positions are generated through extrapolation and interpolation of a reference block.

By applying extrapolation to the reference block, the amount of reference block data read from and written to memory for inter prediction can be reduced. Therefore, the cost of memory input/output, that is, memory bandwidth, may be reduced when encoding/decoding an image, and ultimately the power consumption of the encoder/decoder may be reduced. In addition, extrapolation of the reference block solves the problem of image quality degradation caused by not filtering pixels located at the boundary of the reference block.

FIG. 1 is a block diagram illustrating an image encoding apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an image decoding apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram schematically illustrating an example of quarter-unit interpolation of luminance pixels of a reference picture to compensate for motion of a 4×4 block.
FIG. 4 is a flowchart schematically illustrating a method of performing interpolation based on a reference block obtained through extrapolation according to an embodiment of the present invention.
FIG. 5 is a diagram schematically illustrating quarter-unit interpolation of luminance pixels of a reference picture for inter prediction of a 4×4 block according to an embodiment of the present invention.
FIG. 6 is a diagram for describing a method of performing extrapolation using a reference block according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating a video encoding method to which the present invention is applied.
FIG. 8 is a flowchart illustrating an image decoding method to which the present invention is applied.

While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the invention is not intended to be limited to the particular embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention. Like reference numerals are used for like elements in describing each drawing.

The terms first, second, etc. may be used to describe various components, but the components should not be limited by these terms. The terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, a first component may be referred to as a second component, and similarly, a second component may also be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of a plurality of related listed items.

It is to be understood that when an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In contrast, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that no other element is present in between.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, the terms "comprise" or "have" indicate the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and do not exclude the possibility of the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The same reference numerals are used for the same components throughout the drawings, and duplicate descriptions of the same components are omitted.

FIG. 1 is a block diagram illustrating an image encoding apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the image encoding apparatus 100 may include a picture splitter 110, predictors 120 and 125, a transformer 130, a quantizer 135, a reordering unit 160, an entropy encoder 165, an inverse quantizer 140, an inverse transformer 145, a filter unit 150, and a memory 155.

Each of the components shown in FIG. 1 is shown independently to represent a distinct function within the image encoding apparatus; this does not mean that each component is implemented as separate hardware or as a single software unit. That is, the components are listed separately for convenience of description, and at least two components may be combined into one component, or one component may be divided into a plurality of components that each perform part of the function. Embodiments in which components are integrated and embodiments in which components are separated are both included within the scope of the present invention without departing from its essence.

In addition, some components may not be essential components that perform essential functions of the present invention but optional components that merely improve performance. The present invention may be implemented with only the components essential to realizing its essence, excluding the components used merely for performance improvement, and a structure including only the essential components without the optional performance-improving components is also included within the scope of the present invention.

The picture splitter 110 may divide an input picture into at least one processing unit. In this case, the processing unit may be a prediction unit (PU), a transform unit (TU), or a coding unit (CU). The picture splitter 110 may divide one picture into a combination of a plurality of coding units, prediction units, and transform units, and encode the picture by selecting one combination of a coding unit, prediction units, and transform units according to a predetermined criterion (for example, a cost function).

For example, one picture may be divided into a plurality of coding units. A recursive tree structure, such as a quad-tree structure, may be used to divide a picture into coding units. A coding unit that is split into other coding units, with one image or the largest coding unit as the root, may be split with as many child nodes as the number of divided coding units. A coding unit that is no longer split due to certain constraints becomes a leaf node. That is, when it is assumed that only square division is possible for one coding unit, one coding unit may be split into at most four other coding units, as sketched below.
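As an illustration of this recursive quad-tree splitting, the following is a minimal sketch (illustrative Python, not part of the patent; the function names and the split-decision callback are assumptions):

```python
def split_cu(x, y, size, min_size, should_split):
    """Recursively split a square coding unit (CU) in a quad-tree.

    x, y         -- top-left position of the CU in the picture
    size         -- current CU size (assumed square, power of two)
    min_size     -- smallest allowed CU; a CU of this size is a leaf
    should_split -- callback deciding (e.g., by a cost function) whether to split
    Returns a list of (x, y, size) leaf CUs.
    """
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]  # leaf node: no further split
    half = size // 2
    leaves = []
    # Square-only split: one CU becomes at most four smaller CUs.
    for dy in (0, half):
        for dx in (0, half):
            leaves += split_cu(x + dx, y + dy, half, min_size, should_split)
    return leaves

# Example: split a 64x64 largest CU down to 16x16 everywhere.
cus = split_cu(0, 0, 64, 16, lambda x, y, s: s > 16)
```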

Hereinafter, in embodiments of the present invention, the term coding unit may mean not only a unit for encoding but also a unit for decoding.

A prediction unit may be one of prediction units of the same size divided from one coding unit in a square or rectangular shape, or the prediction units divided from one coding unit may be divided such that one prediction unit has a shape different from that of another prediction unit.

When a prediction unit on which intra prediction is performed is generated based on a coding unit and the coding unit is not a minimum coding unit, intra prediction may be performed without splitting the coding unit into a plurality of N×N prediction units.

The predictors 120 and 125 may include an inter predictor 120 that performs inter prediction and an intra predictor 125 that performs intra prediction. Whether to use inter prediction or intra prediction is determined for each prediction unit, and specific information according to each prediction method (e.g., intra prediction mode, motion vector, reference picture) may be determined. At this time, the processing unit in which the prediction is performed may differ from the processing unit for which the prediction method and its details are determined. For example, the prediction method and the prediction mode may be determined per prediction unit, and the prediction itself may be performed per transform unit. The residual value (residual block) between the generated prediction block and the original block may be input to the transformer 130. In addition, the prediction mode information, motion vector information, and the like used for prediction may be encoded by the entropy encoder 165 together with the residual value and transmitted to the decoder. When a specific encoding mode is used, the original block may be encoded as it is and transmitted to the decoder without generating a prediction block through the predictors 120 and 125.

The inter prediction unit 120 may predict the prediction unit based on the information of at least one picture of the previous picture or the next picture of the current picture. The inter predictor 120 may include a reference picture interpolator, a motion predictor, and a motion compensator.

The reference picture interpolator may receive reference picture information from the memory 155 and generate pixel information at sub-integer positions within the reference picture. In the case of luminance pixels, a DCT-based 8-tap interpolation filter with varying filter coefficients may be used to generate pixel information at sub-integer positions in units of 1/4 pixel. In the case of chrominance signals, a DCT-based 4-tap interpolation filter with varying filter coefficients may be used to generate pixel information at sub-integer positions in units of 1/8 pixel.

The motion predictor may perform motion prediction based on the reference picture interpolated by the reference picture interpolator. Various methods may be used to calculate the motion vector, such as a full search-based block matching algorithm (FBMA), a three-step search (TSS), and a new three-step search algorithm (NTS). The motion vector may have a motion vector value in units of 1/2 or 1/4 pixel based on the interpolated pixels. The motion predictor may predict the current prediction unit using any of several motion prediction methods, such as a skip method, a merge method, and an advanced motion vector prediction (AMVP) method.
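For illustration, a minimal sketch of the full search-based block matching (FBMA) mentioned above, assuming an integer-pel search and a SAD criterion; this is a generic example, not the patent's implementation:

```python
import numpy as np

def full_search(cur_block, ref_picture, bx, by, search_range):
    """Full search block matching: return the integer motion vector
    (dx, dy) that minimizes the sum of absolute differences (SAD)
    within +/- search_range pixels of the block position (bx, by)."""
    n = cur_block.shape[0]
    H, W = ref_picture.shape
    best, best_sad = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y0, x0 = by + dy, bx + dx
            if y0 < 0 or x0 < 0 or y0 + n > H or x0 + n > W:
                continue  # candidate falls outside the picture
            ref = ref_picture[y0:y0 + n, x0:x0 + n]
            sad = np.abs(cur_block.astype(int) - ref.astype(int)).sum()
            if sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best, best_sad
```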

The intra predictor 125 may generate a prediction unit based on reference pixel information around the current block, which is pixel information in the current picture. When a neighboring block of the current prediction unit is a block on which inter prediction has been performed, so that its reference pixels are pixels reconstructed through inter prediction, the reference pixels included in the inter-predicted block may be replaced with reference pixel information of a neighboring block on which intra prediction has been performed. That is, when a reference pixel is not available, the unavailable reference pixel information may be replaced with at least one of the available reference pixels.

In intra prediction, the prediction modes may include directional prediction modes, which use reference pixel information according to a prediction direction, and non-directional modes, which do not use directional information when performing prediction. The mode for predicting luminance information may differ from the mode for predicting chrominance information, and the intra prediction mode information used to predict the luminance information, or the predicted luminance signal information, may be used to predict the chrominance information.

When performing intra prediction, if the size of the prediction unit and the size of the transform unit are the same, intra prediction for the prediction unit may be performed based on the pixels to the left of the prediction unit, the pixel at its upper left, and the pixels above it. However, if the size of the prediction unit and the size of the transform unit differ, intra prediction may be performed using reference pixels based on the transform unit. In addition, intra prediction using N×N splitting may be used only for the minimum coding unit.

In the intra prediction method, an adaptive intra smoothing (AIS) filter may be applied to the reference pixels according to the prediction mode before the prediction block is generated. The type of AIS filter applied to the reference pixels may vary. To perform the intra prediction method, the intra prediction mode of the current prediction unit may be predicted from the intra prediction modes of prediction units neighboring the current prediction unit. When the prediction mode of the current prediction unit is predicted using mode information obtained from a neighboring prediction unit, if the intra prediction modes of the current prediction unit and the neighboring prediction unit are the same, predetermined flag information may be transmitted to indicate that the prediction modes of the current prediction unit and the neighboring prediction unit are the same. If the prediction modes of the current prediction unit and the neighboring prediction unit are different, entropy encoding may be performed to encode the prediction mode information of the current block.

In addition, a residual block including residual information, which is the difference between the prediction unit generated by the predictors 120 and 125 and the original block of that prediction unit, may be generated. The generated residual block may be input to the transformer 130.

The transformer 130 may transform the residual block, which includes the residual information between the original block and the prediction unit generated by the predictors 120 and 125, using a transform method such as a discrete cosine transform (DCT) or a discrete sine transform (DST). Whether to apply DCT or DST to transform the residual block may be determined based on the intra prediction mode information of the prediction unit used to generate the residual block.

The quantizer 135 may quantize the values transformed into the frequency domain by the transformer 130. The quantization coefficients may vary depending on the block or the importance of the image. The values calculated by the quantizer 135 may be provided to the inverse quantizer 140 and the reordering unit 160.

The reordering unit 160 may reorder coefficient values with respect to the quantized residual value.

The reordering unit 160 may change the coefficients of the two-dimensional block form into a one-dimensional vector form through a coefficient scanning method. For example, the reordering unit 160 may scan from the DC coefficient to coefficients in the high-frequency region using a zig-zag scan and change them into a one-dimensional vector form. Depending on the size of the transform unit and the intra prediction mode, a vertical scan, which scans the two-dimensional block-form coefficients in the column direction, or a horizontal scan, which scans them in the row direction, may be used instead of the zig-zag scan. That is, which of the zig-zag scan, vertical scan, and horizontal scan is used may be determined according to the size of the transform unit and the intra prediction mode.
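The following sketch (illustrative, not from the patent) shows one way the zig-zag and vertical scan orders described above could be generated and applied to a coefficient block:

```python
def zigzag_scan_order(n):
    """Return the zig-zag scan order for an n x n coefficient block as
    a list of (row, col) pairs, from the DC coefficient toward the
    high-frequency region."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  # alternate traversal direction on each
                                  # anti-diagonal
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def vertical_scan_order(n):
    """Column-direction scan; a horizontal scan is the row-major analog."""
    return [(r, c) for c in range(n) for r in range(n)]

def scan(block, order):
    """Flatten a 2-D coefficient block into a 1-D vector along 'order'."""
    return [block[r][c] for r, c in order]
```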

The entropy encoder 165 may perform entropy encoding based on the values calculated by the reordering unit 160. For entropy encoding, various encoding methods such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC) may be used.

The entropy encoder 165 may encode various information received from the reordering unit 160 and the predictors 120 and 125, such as residual coefficient information and block type information of the coding unit, prediction mode information, partition unit information, prediction unit information, transmission unit information, motion vector information, reference frame information, block interpolation information, and filtering information.

The entropy encoder 165 may entropy encode a coefficient value of a coding unit input from the reordering unit 160.

The inverse quantizer 140 and the inverse transformer 145 inversely quantize the values quantized by the quantizer 135 and inversely transform the values transformed by the transformer 130. The residual values generated by the inverse quantizer 140 and the inverse transformer 145 may be combined with the prediction unit predicted by the motion estimator, motion compensator, and intra predictor included in the predictors 120 and 125 to generate a reconstructed block.

The filter unit 150 may include at least one of a deblocking filter, an offset correction unit, and an adaptive loop filter (ALF).

The deblocking filter may remove block distortion caused by boundaries between blocks in the reconstructed picture. To determine whether to perform deblocking, whether to apply the deblocking filter to the current block may be decided based on the pixels included in several columns or rows of the block. When the deblocking filter is applied to a block, a strong filter or a weak filter may be applied according to the required deblocking filtering strength. In applying the deblocking filter, the vertical filtering and the horizontal filtering may be processed in parallel.

The offset corrector may correct the offset between the deblocked image and the original image in units of pixels. To perform offset correction for a specific picture, a method of dividing the pixels included in the image into a predetermined number of regions, determining a region to which an offset is to be applied, and applying the offset to that region may be used, or a method of applying an offset in consideration of the edge information of each pixel may be used.

Adaptive loop filtering (ALF) may be performed based on a comparison between the filtered reconstructed image and the original image. After the pixels included in the image are divided into predetermined groups, one filter to be applied to each group may be determined, and filtering may be performed differently for each group. Information on whether to apply ALF may be transmitted for each coding unit (CU), and the shape and filter coefficients of the ALF filter to be applied may vary from block to block. Alternatively, an ALF filter of the same form (fixed form) may be applied regardless of the characteristics of the target block.

The memory 155 may store reconstructed blocks or pictures calculated by the filter unit 150, and the stored reconstructed blocks or pictures may be provided to the predictors 120 and 125 when performing inter prediction.

FIG. 2 is a block diagram illustrating an image decoding apparatus according to an embodiment of the present invention.

Referring to FIG. 2, the image decoder 200 may include an entropy decoder 210, a reordering unit 215, an inverse quantizer 220, an inverse transformer 225, predictors 230 and 235, a filter unit 240, and a memory 245.

When an image bitstream is input from the image encoder, the input bitstream may be decoded by a procedure opposite to that of the image encoder.

The entropy decoding unit 210 can perform entropy decoding in a procedure opposite to that in which entropy encoding is performed in the entropy encoding unit of the image encoder. For example, various methods such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC) may be applied in accordance with the method performed by the image encoder.

The entropy decoding unit 210 can decode information related to intra-picture prediction and inter-picture prediction performed by the encoder.

The reordering unit 215 may reorder the bitstream entropy-decoded by the entropy decoder 210 based on the reordering method used by the encoder. The coefficients expressed in one-dimensional vector form may be reconstructed and rearranged into coefficients of two-dimensional block form. The reordering unit 215 may receive information related to the coefficient scanning performed by the encoder and perform reordering by inversely scanning based on the scanning order performed by that encoder.

The inverse quantization unit 220 can perform inverse quantization based on the quantization parameters provided by the encoder and the coefficient values of the re-arranged blocks.

The inverse transformer 225 may perform inverse DCT and inverse DST on the quantization result produced by the image encoder, corresponding to the DCT and DST performed by its transformer. The inverse transform may be performed based on the transmission unit determined by the image encoder. In the transformer of the image encoder, DCT and DST may be selectively performed according to a plurality of pieces of information, such as the prediction method, the size of the current block, and the prediction direction, and the inverse transformer 225 of the image decoder may perform the inverse transform based on the transform information used by the transformer of the image encoder.

The prediction units 230 and 235 may generate the prediction block based on the prediction block generation related information provided by the entropy decoder 210 and previously decoded blocks or picture information provided by the memory 245.

As described above, in the same manner as in the image encoder, when intra prediction is performed and the size of the prediction unit and the size of the transform unit are the same, intra prediction for the prediction unit is performed based on the pixels to the left of the prediction unit, the pixel at its upper left, and the pixels above it. When the size of the prediction unit and the size of the transform unit differ, intra prediction may be performed using reference pixels based on the transform unit. In addition, intra prediction using N×N splitting may be used only for the minimum coding unit.

The predictors 230 and 235 may include a prediction unit determiner, an inter predictor, and an intra predictor. The prediction unit determiner receives various information input from the entropy decoder 210, such as prediction unit information, prediction mode information of the intra prediction method, and motion prediction related information of the inter prediction method, identifies the prediction unit within the current coding unit, and determines whether the prediction unit performs inter prediction or intra prediction. The inter predictor 230 may perform inter prediction on the current prediction unit based on information included in at least one of the pictures before or after the current picture containing the current prediction unit, using the information necessary for inter prediction of the current prediction unit provided by the image encoder.

In order to perform inter prediction, it may be determined, based on the coding unit, whether the motion prediction method of the prediction unit included in the coding unit is the skip mode, the merge mode, or the AMVP mode.

The intra predictor 235 may generate a prediction block based on pixel information in the current picture. When the prediction unit is a prediction unit on which intra prediction is performed, intra prediction may be performed based on the intra prediction mode information of the prediction unit provided by the image encoder. The intra predictor 235 may include an adaptive intra smoothing (AIS) filter, a reference pixel interpolator, and a DC filter. The AIS filter performs filtering on the reference pixels of the current block and may determine whether to apply the filter according to the prediction mode of the current prediction unit. AIS filtering may be performed on the reference pixels of the current block using the prediction mode of the prediction unit and the AIS filter information provided by the image encoder. When the prediction mode of the current block is a mode in which AIS filtering is not performed, the AIS filter may not be applied.

When the prediction mode of the prediction unit is a mode in which intra prediction is performed based on pixel values obtained by interpolating the reference pixels, the reference pixel interpolator may interpolate the reference pixels to generate reference pixels at sub-integer positions. When the prediction mode of the current prediction unit is a mode in which the prediction block is generated without interpolating the reference pixels, the reference pixels may not be interpolated. The DC filter may generate the prediction block through filtering when the prediction mode of the current block is the DC mode.

The reconstructed block or picture may be provided to the filter unit 240. The filter unit 240 may include a deblocking filter, an offset correction unit, and an ALF.

Information on whether the deblocking filter was applied to the corresponding block or picture, and, if applied, whether a strong filter or a weak filter was applied, may be provided by the image encoder. The deblocking filter of the image decoder receives the deblocking filter related information provided by the image encoder, and the image decoder may perform deblocking filtering on the corresponding block.

The offset correction unit may perform offset correction on the reconstructed image based on the type of offset correction applied to the image and the offset value information during encoding.

ALF may be applied to a coding unit based on the information on whether ALF is applied and the ALF coefficient information provided by the encoder. Such ALF information may be included in a specific parameter set and provided.

The memory 245 may store the reconstructed picture or block to use as a reference picture or reference block, and may provide the reconstructed picture to the output unit.

Hereinafter, a block may mean a unit of image encoding and decoding. Accordingly, in this specification, a block may mean a coding unit (CU), a prediction unit (PU), or a transform unit (TU), depending on the context. In addition, the term encoding/decoding target block is used in this specification to cover both a transform/inverse-transform target block when transform/inverse transform is performed and a prediction target block when prediction is performed.

Meanwhile, in the case of inter prediction, a prediction block may be generated by performing prediction on a prediction target block of the current picture based on at least one picture (reference picture) before or after the current picture. That is, motion estimation (ME) is performed on the prediction target block of the current picture based on a reference block in the reference picture, and as a result, motion information including a motion vector (MV), a reference block index, a prediction mode, and the like may be generated. In addition, motion compensation (MC) is performed based on the motion information and the reference block to generate a prediction block corresponding to the current prediction target block from the reference block.

The motion vector is the displacement between the current prediction target block and the reference block and may have a resolution finer than integer units. For example, the luminance component may have a resolution of 1/4 pixel and the chrominance component a resolution of 1/8 pixel. Accordingly, interpolation is performed to calculate sub-pixel values of the reference picture at non-integer positions such as 1/2-pixel, 1/4-pixel, and 1/8-pixel positions. Interpolation applies an interpolation filter based on the pixels at integer positions (integer-unit pixels) of the reference picture to generate sub-pixels (pixels at sub-integer positions) at non-integer positions. By using these sub-integer sub-pixels, a reference block more similar to the current prediction target block may be selected, enabling better motion estimation and motion compensation.

In performing such interpolation, a plurality of reference pixels in the reference picture are required to calculate one sub-pixel value. For example, to calculate sub-pixel values by applying an 8-tap filter for motion compensation of a 4×4 block, a reference block of size 11×11 = (horizontal size of the block + interpolation filter length - 1) × (vertical size of the block + interpolation filter length - 1) is required. Since the reference block is loaded into memory and used to calculate the sub-pixel values, a large reference block increases the memory bandwidth required to load it and also increases bus usage. Conversely, a small reference block loaded into memory reduces the memory bandwidth and thus the power consumption. Accordingly, there is a need for a method capable of reducing the size of the reference block loaded into memory.
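A quick sketch of this size arithmetic (illustrative only):

```python
def ref_block_size(block_w, block_h, filter_taps):
    """Reference area needed to interpolate every sub-pixel of a block:
    (block size + filter length - 1) in each direction."""
    return block_w + filter_taps - 1, block_h + filter_taps - 1

# A 4x4 block with an 8-tap filter needs an 11x11 reference block:
# 121 reference pixels loaded for a 16-pixel block (~7.6x the block).
w, h = ref_block_size(4, 4, 8)   # -> (11, 11)
print(w * h, "reference pixels for a 4x4 block")
```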

FIG. 3 is a diagram schematically illustrating an example of interpolation of quarter units of luminance pixels of a reference picture to compensate for motion of a 4 × 4 block.

Referring to FIG. 3, interpolation may be performed by applying an 8-tap interpolation filter to compensate for motion of a block 310 having a size of 4 × 4 in a reference picture.

For example, the pixel value of sub-pixel a in the 4×4 block 310 may be calculated by applying an 8-tap interpolation filter based on the four integer-unit pixels located to the left of sub-pixel a and the four integer-unit pixels located to its right, that is, the eight pixels 321 positioned in the horizontal direction. Likewise, the pixel value of sub-pixel b in the 4×4 block 310 may be calculated by applying an 8-tap interpolation filter based on the four integer-unit pixels located to the left of sub-pixel b and the four integer-unit pixels located to its right, that is, the eight pixels positioned in the horizontal direction. The pixel values of the other sub-pixels in the 4×4 block 310 may also be calculated by applying an 8-tap interpolation filter based on the eight pixels adjacent to each sub-pixel in the horizontal or vertical direction.

When interpolation is performed by applying an 8-tap interpolation filter to the 4×4 block 310 in this manner, the eight reference pixels needed for interpolation shift as the sub-pixel being interpolated moves from left to right or from top to bottom. That is, the group of reference pixels needed to interpolate one sub-pixel, that is, the filtering window, moves according to the position of the sub-pixel being interpolated, and applying the 8-tap interpolation filter to the 4×4 block 310 in this way requires a reference block 320 with a minimum size of 11×11.
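For illustration, the sketch below applies a sliding 8-tap filter along one row; the coefficients are the HEVC-style half-pel filter and are an assumption, not necessarily the patent's filter. It shows why a 4-pixel-wide block needs 4 + 8 - 1 = 11 input pixels:

```python
import numpy as np

# Illustrative half-pel coefficients (HEVC-style 8-tap); sums to 64.
HALF_PEL_8TAP = [-1, 4, -11, 40, 40, -11, 4, -1]

def interp_row_half_pel(row, coeffs=HALF_PEL_8TAP):
    """Compute half-pel samples between integer pixels of one row.
    For each output sample the 8-pixel filtering window slides by one."""
    taps = len(coeffs)
    out = []
    for i in range(len(row) - taps + 1):
        acc = sum(c * int(p) for c, p in zip(coeffs, row[i:i + taps]))
        out.append((acc + 32) >> 6)  # round and normalize by 64
    return out

row = np.arange(11) * 10          # 11 integer pixels of a reference row
print(interp_row_half_pel(row))   # 4 half-pel samples for a 4-wide block
```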

On the other hand, as described above, a large reference block increases the memory bandwidth and consequently the power consumption. Therefore, a method for reducing the size of the reference block has recently been proposed. The proposed method performs filtering by shifting the phase of the finite impulse response (FIR) filter for regions beyond the boundary of the reference block used for motion compensation. Interpolation filtering can therefore be performed within a fixed filtering window regardless of the position of the sub-pixel being interpolated.

Referring back to FIG. 3, interpolation may be performed in a fixed filtering window by applying a 7-tap interpolation filter to compensate for motion of a block 310 having a size of 4x4 in the reference picture.

For example, the pixel value of sub-pixel c in the 4×4 block 310 may be calculated by applying a 7-tap interpolation filter based on the two integer-unit pixels located to the left of sub-pixel c and the five integer-unit pixels located to its right, that is, the seven pixels 331 positioned in the horizontal direction. The pixel value of sub-pixel d in the 4×4 block 310 may be calculated by applying a 7-tap interpolation filter based on the five integer-unit pixels located to the left of sub-pixel d and the two integer-unit pixels located to its right, that is, the seven pixels 332 positioned in the horizontal direction. The pixel values of the other sub-pixels in the 4×4 block 310 may also be calculated by applying a 7-tap interpolation filter based on the seven pixels adjacent to each sub-pixel in the horizontal or vertical direction.

When the 7-tap interpolation filter is applied to the 4×4 block 310 in this manner, interpolation can be performed using only the reference pixels in the 7×7 reference block 330, regardless of the position of the sub-pixel being interpolated. In this case, different filter coefficients must be used for each sub-pixel position; Table 1 below shows the interpolation filter coefficients for each sub-pixel.

Table 1

Distance from block boundary | Tap length | Sub-pel | Filter coefficients
0 | 7 | 1/4 | -5, 54, 21, -9, 5, -3, 1
0 | 7 | 2/4 | -6, 36, 44, -15, 8, -4, 1
0 | 7 | 3/4 | -3, 16, 59, -12, 6, -3, 1
1 | 7 | 1/4 | 2, -9, 56, 20, -8, 4, -1
1 | 7 | 2/4 | 3, -10, 39, 41, -13, 6, -2
1 | 7 | 3/4 | 2, -6, 18, 58, -11, 4, -1
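As a sketch of how the position-dependent coefficients of Table 1 could be applied within a fixed 7-pixel window (illustrative code; the lookup structure is an assumption):

```python
# Table 1 coefficients, indexed by [distance from block boundary][sub-pel
# position]. Each coefficient set sums to 64.
TABLE1 = {
    0: {"1/4": [-5, 54, 21, -9, 5, -3, 1],
        "2/4": [-6, 36, 44, -15, 8, -4, 1],
        "3/4": [-3, 16, 59, -12, 6, -3, 1]},
    1: {"1/4": [2, -9, 56, 20, -8, 4, -1],
        "2/4": [3, -10, 39, 41, -13, 6, -2],
        "3/4": [2, -6, 18, 58, -11, 4, -1]},
}

def filter_7tap(window, boundary_distance, subpel):
    """Apply the position-dependent 7-tap filter to a fixed 7-pixel
    window; the window does not slide, only the coefficients change."""
    coeffs = TABLE1[boundary_distance][subpel]
    acc = sum(c * int(p) for c, p in zip(coeffs, window))
    return (acc + 32) >> 6  # normalize by 64 with rounding
```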

As described above, when interpolation filtering is performed using a 7×7 reference block to compensate for the motion of the 4×4 block 310, image quality degradation occurs because pixels beyond the boundary of the reference block are not filtered. Therefore, the present invention provides a method of performing interpolation after generating extrapolated reference pixels outside the boundary of the reference block through extrapolation, preventing image quality degradation while keeping the memory bandwidth at the level of using an existing 7×7 reference block.

FIG. 4 is a flowchart schematically showing a method of performing interpolation based on a reference block obtained through extrapolation according to an embodiment of the present invention, and FIG. 5 is a diagram schematically illustrating quarter-unit interpolation of luminance pixels of a reference picture for inter prediction of a 4×4 block.

Referring to FIG. 4, according to an embodiment of the present invention, the interpolation process may include obtaining a reference block (S400), performing extrapolation based on the reference block to obtain a final reference block (S410), and performing interpolation using the final reference block (S420).

In obtaining the reference block (S400), a reference block for performing motion compensation and inter prediction is obtained from the reference picture based on the motion information of the current block. For example, the reference block obtained to perform motion compensation and inter prediction may be loaded into memory. The size of the reference block may be smaller than (the horizontal size of the current block + the interpolation filter tap length - 1) × (the vertical size of the current block + the interpolation filter tap length - 1).

For example, when inter prediction is performed on a 4×4 block, a reference block is obtained from the reference picture. In this case, the reference block is smaller than (the horizontal size of the current block + the interpolation filter tap length - 1) × (the vertical size of the current block + the interpolation filter tap length - 1); for example, it may have the same size as the prediction unit (PU). As illustrated in FIG. 5, a 7×7 block 510 may be extracted from the reference picture as the reference block.

In obtaining the final reference block (S410), extrapolation is performed based on at least one reference pixel in the reference block, generating extrapolated reference pixels outside the region of the reference block to obtain the final reference block. The final reference block is therefore larger than the reference block obtained from the reference picture, and includes both the reference pixels in the reference block and the extrapolated reference pixels. Extrapolation is a method of estimating values beyond a known range of data: for example, given two points A and B on a curve and several points in the section bounded by them, extrapolation can be used to estimate a point located outside the section bounded by A and B.

FIG. 6 is a diagram for describing a method of performing extrapolation using a reference block according to an embodiment of the present invention.

Referring to FIG. 6, extrapolated reference pixels 620 may be generated by extrapolation based on at least one of the reference pixels 610 in the reference block. In this case, linear extrapolation may be used, which can be expressed by Equation 1 below.

$$y(x^{*}) = y(x_{k-1}) + \frac{x^{*} - x_{k-1}}{x_{k} - x_{k-1}}\left(y(x_{k}) - y(x_{k-1})\right) \tag{1}$$

where x* is the coordinate of the pixel to be extrapolated and y(x*) is its value calculated by linear extrapolation; x_k and x_{k-1} are the coordinates of reference pixels in the reference block, and y(x_k) and y(x_{k-1}) are the values of the pixels at x_k and x_{k-1}, respectively.

For example, as shown in FIG. 6, when linear extrapolation is performed using the reference pixels x_0 and x_1 in the reference block as in Equation 1, the value y(x_2) of the extrapolated reference pixel x_2 can be obtained.
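A minimal sketch of Equation 1 as code (illustrative; the pixel coordinates and values are made up):

```python
def linear_extrapolate(x_km1, y_km1, x_k, y_k, x_star):
    """Linear extrapolation (Equation 1): extend the line through
    (x_{k-1}, y_{k-1}) and (x_k, y_k) to the outside point x_star."""
    return y_km1 + (x_star - x_km1) * (y_k - y_km1) / (x_k - x_km1)

# Extrapolate one pixel beyond the block edge from the two edge pixels
# x_0 = 0 (value 50) and x_1 = 1 (value 54): the pixel at x_2 = 2.
print(linear_extrapolate(0, 50, 1, 54, 2))  # -> 58.0
```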

As shown in FIG. 5, an 11×11 final reference block 520 may be obtained through this extrapolation method, and interpolation may be performed based on the final reference block 520.

Referring back to FIG. 4, in performing interpolation (S420), an interpolation filter is applied based on the pixels in the final reference block to calculate sub-pixel values at sub-integer positions.

For example, as illustrated in FIG. 5, the pixel value of sub-pixel a may be calculated by applying an 8-tap interpolation filter based on four reference pixels to the left of sub-pixel a, including an extrapolated reference pixel, and four reference pixels to its right. The pixel value of sub-pixel b may be calculated by applying an 8-tap interpolation filter based on the four reference pixels above sub-pixel b and four reference pixels below it, including extrapolated reference pixels.

Although the method of extrapolating and interpolating the reference block has been described above for generating 1/4-unit pixel information for luminance pixels, the present invention is not limited thereto and is applicable to other sub-integer units. In addition, for the chrominance component, sub-integer pixel information may likewise be generated by applying the extrapolation and interpolation method to the reference block.

As described above, according to an embodiment of the present invention, extrapolation may be applied to the reference block to reduce the amount of reference block data read from and written to memory for inter prediction. Therefore, the cost of memory input/output, that is, memory bandwidth, may be reduced when encoding/decoding an image, and ultimately the power consumption of the encoder/decoder may be reduced. In addition, extrapolation of the reference block solves the problem of image quality degradation caused by not filtering pixels located at the boundary of the reference block.

FIG. 7 is a flowchart illustrating a video encoding method to which the present invention described above is applied. Each step of FIG. 7 may be performed by the corresponding component of the image encoding apparatus described with reference to FIG. 1.

Referring to FIG. 7, a new coding unit (CU) of the current picture is input to the encoder (S700). When the input coding unit is in an inter prediction mode, a coding unit in an inter prediction mode (hereinafter, 'inter CU') may consist of several prediction units (PUs) in inter prediction modes (hereinafter, 'inter PU'), and may have one of two prediction modes (PredMode): the skip mode (MODE_SKIP, hereinafter 'MODE_SKIP') or the inter mode (MODE_INTER, hereinafter 'MODE_INTER').

A CU having MODE_SKIP is not divided into smaller PUs, and motion information for a PU whose partition mode (PartMode) is PART_2N×2N is allocated to it.

A CU with MODE_INTER can be partitioned into four types of PUs: in the CU-level syntax, information that the prediction mode is MODE_INTER (PredMode == MODE_INTER) and information indicating which of PART_2Nx2N, PART_2NxN, PART_Nx2N, and PART_NxN is used (PartMode == PART_2Nx2N, PartMode == PART_2NxN, PartMode == PART_Nx2N, or PartMode == PART_NxN) may be delivered to the decoder.

The encoder performs motion prediction on the current inter PU (S710). When a CU is partitioned into multiple PUs, the PU to be encoded next (hereinafter, 'current PU') is input. Motion prediction for the current PU may be performed using the previous frame, the next frame, or both the previous and next frames of the current frame. Through motion prediction, motion information (motion vector, reference picture index, prediction direction index) for the current PU is obtained.

The encoder calculates a motion vector predictor (MVP) of the current PU in the inter prediction mode (S720). The motion information of the current PU is not sent to the decoder as it is; instead, to improve compression efficiency, its difference from predicted values obtained from spatially and temporally adjacent blocks is transmitted to the decoder. There are two motion prediction modes, a merge mode and an advanced motion vector prediction (AMVP) mode, and motion vector predictors may be calculated using these two modes.

The merge mode obtains merge candidates from the motion information of blocks adjacent to the current PU in time and space. If among the candidates there is one whose motion information equals that of the current PU, a flag (Merge_Flag) indicating that the merge mode is used and the index of the candidate with the same motion information as the current PU may be transmitted to the decoder. More specifically, an available temporal motion vector predictor is calculated using the reference picture index refIdxLX, which indicates the reference picture obtained during motion prediction, and a merge candidate list (MergeCandList) is created. If there is a candidate in the created merge candidate list having the same motion information as the current PU, the value of Merge_Flag is set to 1, and the index (Merge_Idx) of that candidate is encoded.

The AMVP mode calculates AMVP candidates from the motion information of blocks adjacent to the current PU in time and space. That is, the motion vector predictor mvpLX of the luma component is calculated. More specifically, spatial motion vector candidates (MVPs) are extracted from the neighboring PUs adjacent to the current PU, and a temporal motion vector candidate of the co-located block is extracted using the reference picture index (refIdxLX) obtained during motion prediction. An MVP list (mvpListLX) is created from the spatial motion vector candidates and the temporal motion vector candidate. If several motion vectors in the created MVP list have the same value, all but the one with the highest priority are deleted from the list. Here, motion vectors are prioritized in the order of the motion vector of the left neighboring block of the current PU (mvLXA), the motion vector of the upper neighboring block of the current PU (mvLXB), and the motion vector of the temporal co-located block (mvLXCol). The motion vector of the best predictor among the motion vector candidates in the MVP list is selected as the motion vector predictor mvpLX. The best predictor is the candidate block that minimizes a rate-distortion (RD) cost function, for example J_MotSAD, which considers the bit cost and the sum of absolute differences (SAD).
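For illustration, a simplified sketch of such a rate-distortion selection, assuming a cost of the form J = SAD + lambda * R with an illustrative lambda (the actual cost function and bit estimation are encoder-specific):

```python
def motion_cost(sad, mvd_bits, lam):
    """Simplified motion RD cost J = SAD + lambda * R, where R is the
    bit cost of signaling the motion vector difference. A sketch of
    the J_MotSAD criterion, not the exact encoder formula."""
    return sad + lam * mvd_bits

def best_predictor(candidates, lam=4.0):
    """candidates: list of (sad, mvd_bits, mv) tuples; return the
    motion vector of the candidate minimizing the cost."""
    return min(candidates, key=lambda c: motion_cost(c[0], c[1], lam))[2]
```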

The encoder generates a prediction signal based on the motion information (S730). The motion information includes a motion vector calculated using sub-pixel values at sub-integer positions. In this case, the sub-pixel values at sub-integer positions may be calculated through extrapolation and interpolation of the reference block.

More specifically, a reference block for performing inter prediction on the current block is obtained from the reference picture. Extrapolation is performed based on at least one reference pixel in the obtained reference block, generating extrapolated reference pixels outside the reference block to derive the final reference block. An interpolation filter is then applied based on the pixels in the final reference block to calculate sub-pixel values at sub-integer positions. Since a specific embodiment of this has been described above with reference to FIGS. 4 to 6, the description is omitted here.

The encoder encodes the motion information of the current PU (S740). When the merge mode is used for motion prediction of the current PU and a candidate having the same motion information as the current PU exists among the merge candidates, the current PU is declared to be in merge mode, and a flag (Merge_Flag) indicating that the merge mode is used and the index (Merge_Idx) of the candidate having the same motion information as the current PU may be encoded and transmitted to the decoder.

When the AMVP mode is used for motion prediction of the current PU, the candidate that minimizes the cost function is determined by comparing the AMVP candidates with the motion vector information of the current PU. A residual signal after motion compensation may be obtained using the candidate that minimizes the cost function and the difference between that candidate's motion information and the motion information of the current PU. That is, the encoder entropy encodes the motion vector difference (MVD) between the motion vector of the current PU and the motion vector of the best predictor.

The encoder obtains a residual signal by computing, pixel by pixel, the difference between the pixel values of the current block and the pixel values of the prediction block obtained through motion compensation (S750), and transforms the obtained residual signal (S760).

The residual signal is encoded through a transform and may be transformed by applying a transform encoding kernel. The size of the transform encoding kernel may be 2×2, 4×4, 8×8, 16×16, 32×32, or 64×64, and the kernels usable for the transform may be limited in advance. The transform generates transform coefficients, which take the form of a two-dimensional block. For example, the transform coefficients C for an n×n block may be calculated as in Equation 2 below.

[Equation 2]
C(n, n) = T(n, n) × B(n, n) × T(n, n)^T

where C(n, n) is the n×n matrix of transform coefficients, T(n, n) is the n×n transform kernel matrix, and B(n, n) is the n×n matrix of the residual block.
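
The following numpy sketch evaluates Equation 2 for a 4×4 block. The orthonormal DCT-II basis is used only as an example kernel; the disclosure allows any predefined transform encoding kernel of the listed sizes:

```python
import numpy as np

def dct_kernel(n):
    """Orthonormal DCT-II basis matrix, used here as an example kernel T."""
    k = np.arange(n)
    T = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    T[0, :] *= 1 / np.sqrt(2)
    return T * np.sqrt(2 / n)

n = 4
T = dct_kernel(n)
B = np.arange(n * n, dtype=float).reshape(n, n)  # example residual block
C = T @ B @ T.T                                  # Equation 2
print(np.round(C, 2))
```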

The transform coefficients calculated by Equation 2 above are then quantized.

The encoder determines whether to transmit the residual signal or the transform coefficients based on RDO (S770). If the prediction is good, the residual signal can be transmitted without transform coding. In this case, the cost functions before and after transform coding may be compared, and the method that minimizes the cost may be selected.

In addition, the type of signal to be transmitted for the current block (residual signal or transform coefficients) may be signaled to the decoder. For example, if transmitting the residual signal without transform coding minimizes the cost, the residual signal is signaled for the current block; if transmitting the transform coefficients minimizes the cost, the transform coefficients are signaled for the current block.
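
A sketch of this signal-type decision under a standard cost model of the form distortion + lambda * rate; the numbers and the lambda value are illustrative:

```python
def rd_cost(distortion, rate, lam=10.0):
    """Illustrative rate-distortion cost: distortion + lambda * rate."""
    return distortion + lam * rate

def choose_signal_type(residual_cost, coeff_cost):
    """Return which signal type to encode for the current block."""
    if rd_cost(*residual_cost) <= rd_cost(*coeff_cost):
        return "residual"  # transmit the residual without transform coding
    return "transform_coefficients"

# (distortion, rate) pairs measured before and after transform coding.
print(choose_signal_type(residual_cost=(120, 40), coeff_cost=(90, 35)))
```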

The encoder scans the transform coefficients (S780). The quantized transform coefficients in two-dimensional block form are scanned and rearranged into transform coefficients in one-dimensional vector form.
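
As an illustration, the sketch below reorders a quantized 4×4 block with a zig-zag scan, one common pattern for this step; the disclosure does not fix the scan order here:

```python
import numpy as np

def zigzag_scan(block):
    """Reorder a square block into a 1-D vector along anti-diagonals,
    alternating direction so low-frequency coefficients come first."""
    n = block.shape[0]
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return np.array([block[i, j] for i, j in order])

q = np.array([[9, 3, 1, 0],
              [4, 2, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 0]])
print(zigzag_scan(q))  # low-frequency coefficients first, trailing zeros last
```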

The encoder performs entropy encoding on the information to be transmitted (S790). For example, the scanned transform coefficients and the information on the inter prediction mode are entropy encoded. The encoded information forms a compressed bitstream and may be transmitted or stored through a network abstraction layer (NAL).

FIG. 8 is a flowchart illustrating an image decoding method to which the present invention described above is applied. Each step of FIG. 8 may be performed by the corresponding component of the image decoding apparatus described with reference to FIG. 2.

Referring to FIG. 8, the decoder entropy decodes the received bitstream (S800). The decoder can determine the block type from a variable length coding (VLC) table and can thereby obtain the prediction mode of the current block. In addition, the decoder may check whether the information transmitted for the current block is a residual signal or transform coefficients, and accordingly obtain the residual signal or the transform coefficients for the current block.

The decoder inverse scans the entropy decoded residual signal or transform coefficients (S810).

The decoder inverse scans the residual signal to generate a residual block; in the case of transform coefficients, it generates a transform block in two-dimensional block form. When a transform block is generated, the decoder may dequantize and inverse transform the transform block to obtain the residual block. The process of obtaining the residual block through the inverse transform of the transform block is shown in Equation 3.

[Equation 3]
B(n, n) = T(n, n)^T × C(n, n) × T(n, n)

where B(n, n) is the n×n matrix of the residual block, T(n, n) is the n×n transform kernel matrix, and C(n, n) is the n×n matrix of transform coefficients.
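
The sketch below evaluates Equation 3 with the same example DCT-II kernel used for Equation 2 above, confirming that the inverse transform recovers the residual block exactly when the kernel is orthonormal:

```python
import numpy as np

def dct_kernel(n):
    """Orthonormal DCT-II basis matrix, the same example kernel as before."""
    k = np.arange(n)
    T = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    T[0, :] *= 1 / np.sqrt(2)
    return T * np.sqrt(2 / n)

n = 4
T = dct_kernel(n)
B = np.arange(n * n, dtype=float).reshape(n, n)  # example residual block
C = T @ B @ T.T          # forward transform (Equation 2)
B_rec = T.T @ C @ T      # inverse transform (Equation 3)
print(np.allclose(B, B_rec))  # True
```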

The decoder performs inter prediction (S820). The decoder may decode information about the prediction mode and perform inter prediction according to the prediction mode.

For example, when the prediction mode (PredMode) is the merge mode (e.g., PredMode == MODE_SKIP && Merge_Flag == 1), the motion vector (mvLX) of the luma component and the reference picture index (refIdxLX) need to be obtained through the merge mode. To this end, merge candidates are extracted from the partitions of PUs spatially adjacent to the current PU. Then, the reference picture index (refIdxLX) is obtained in order to derive the temporal merge candidate of the current PU, and the available temporal motion vector prediction value (MVP) can be obtained using this reference picture index. If the number of candidates (NumMergeCand) in the merge candidate list (MergeCandList) created from the spatial and temporal merge candidates is 1, the merge candidate index (Merge_Idx) is set to 0; otherwise, the merge candidate index (Merge_Idx) is set to the received merge index value. The motion vector (mvLX) and the reference picture index (refIdxLX) of the merge candidate indicated by the merge index are extracted and used for motion compensation.
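
A minimal sketch of this merge index handling, with candidates represented as hypothetical (motion vector, refIdxLX) pairs:

```python
def derive_merge_motion(merge_cand_list, received_merge_idx):
    """Select the merge candidate whose motion information is reused."""
    merge_idx = 0 if len(merge_cand_list) == 1 else received_merge_idx
    mv_lx, ref_idx_lx = merge_cand_list[merge_idx]
    return mv_lx, ref_idx_lx

cands = [((4, -2), 0), ((3, 0), 1)]  # (motion vector, refIdxLX) pairs
print(derive_merge_motion(cands, received_merge_idx=1))  # ((3, 0), 1)
```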

When the prediction mode (PredMode) is the Advanced Motion Vector Prediction (AMVP) mode, the reference picture index (refIdxLX) of the current PU is extracted, and the motion vector prediction value (mvpLX) of the luma component is obtained using it. More specifically, a spatial motion vector candidate (MVP) is extracted from PUs adjacent to the current PU, and the temporal motion vector candidate (MVP) of the co-located block indicated by the reference picture index (refIdxLX) is extracted. An MVP list (mvpListLX) is created based on the extracted spatial and temporal motion vector candidates. If several motion vectors in the created MVP list have the same value, all of them except the one with the highest priority are deleted from the list. Here, as described above, the priority of the motion vectors is the order of the motion vector of the left neighboring block of the current PU (mvLXA), the motion vector of the upper neighboring block of the current PU (mvLXB), and the motion vector of the temporal co-located block (mvLXCol), limited to the available vectors. If the number of MVP candidates (NumMVPCand(LX)) in the MVP list (mvpListLX) is 1, the MVP candidate index (mvpIdx) is set to 0; otherwise (that is, when there are two or more MVP candidates), the MVP candidate index (mvpIdx) is set to the received index value. The motion vector indicated by the MVP candidate index (mvpIdx) among the motion vector candidates in the MVP list (mvpListLX) is determined as the motion vector prediction value (mvpLX). The motion vector (mvLX) may then be calculated from the motion vector prediction value (mvpLX) using Equation 4 below.

[Equation 4]
mvLX[0] = mvdLX[0] + mvpLX[0]
mvLX[1] = mvdLX[1] + mvpLX[1]

where mvLX[0], mvdLX[0], and mvpLX[0] are the x components of the LX motion vector, its difference, and its prediction value, respectively, and mvLX[1], mvdLX[1], and mvpLX[1] are the corresponding y components.
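
In code form, Equation 4 is a per-component addition; the sketch below is a direct transcription:

```python
def reconstruct_mv(mvd_lx, mvp_lx):
    """mvLX = mvdLX + mvpLX, applied to the x and y components separately."""
    return [mvd_lx[0] + mvp_lx[0],  # x component
            mvd_lx[1] + mvp_lx[1]]  # y component

print(reconstruct_mv(mvd_lx=[1, 1], mvp_lx=[4, -2]))  # -> [5, -1]
```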

The decoder generates a prediction signal based on the motion information (S830). The motion information includes a motion vector calculated using sub-pixel values in sub-integer units. In this case, the sub-pixel values in sub-integer units may be calculated through extrapolation and interpolation of the reference block.

More specifically, a reference block for performing inter prediction on the current block is obtained from the reference picture. Extrapolation is performed based on at least one reference pixel in the obtained reference block, thereby generating extrapolated reference pixels outside the reference block to derive the final reference block. An interpolation filter is then applied based on the pixels in the final reference block to calculate the sub-pixel values in sub-integer units. Since specific embodiments thereof have been described above with reference to FIGS. 3 and 6, a detailed description is omitted here.

The decoder generates a reproduction signal (S840). For example, the decoder may add the residual signal and a signal from the previous frame to generate the reproduction signal. Specifically, the reproduction signal may be generated by adding the motion-compensated prediction signal from the previous frame, obtained using the calculated motion vector, to the decoded residual signal of the current PU.
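
A sketch of this reproduction step, assuming 8-bit samples and clipping to the valid range (the clipping is a common implementation detail, not something this step prescribes):

```python
import numpy as np

def reconstruct_block(prediction, residual, bit_depth=8):
    """Add the residual to the motion-compensated prediction, pixel by pixel."""
    recon = prediction.astype(np.int32) + residual.astype(np.int32)
    return np.clip(recon, 0, (1 << bit_depth) - 1).astype(np.uint8)

pred = np.full((4, 4), 120, dtype=np.uint8)              # motion-compensated block
resid = np.arange(16, dtype=np.int32).reshape(4, 4) - 8  # decoded residual block
print(reconstruct_block(pred, resid))
```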

In the above embodiments, the methods are described as a series of steps or blocks based on flowcharts, but the present invention is not limited to the order of the steps; certain steps may occur in a different order from, or simultaneously with, other steps described above. It will also be understood by those skilled in the art that the steps shown in the flowcharts are not exclusive, that other steps may be included, or that one or more steps in a flowchart may be deleted without affecting the scope of the present invention.

The foregoing description is merely illustrative of the technical idea of the present invention, and those skilled in the art may make various changes and modifications without departing from its essential characteristics. Therefore, the embodiments disclosed herein are intended to illustrate rather than limit the technical idea of the present invention, and the scope of that idea is not limited by these embodiments. The protection scope of the present invention should be interpreted according to the claims, and all technical ideas within their equivalent scope should be interpreted as falling within the scope of the present invention.

Claims (12)

An image decoding method comprising:
entropy decoding to obtain motion information for a current block; and
generating a prediction block corresponding to the current block based on the motion information,
wherein the motion information includes a motion vector calculated based on sub-pixels in sub-integer units obtained through extrapolation and interpolation of a reference block.
The method of claim 1,
wherein, in the generating of the prediction block, extrapolation is performed based on reference pixels in the reference block to obtain a final reference block, and interpolation is performed based on the final reference block to calculate the motion vector.
The method of claim 2,
wherein the size of the reference block is greater than or equal to the size of the current block and less than the size of the current block plus the length of the interpolation filter tap minus 1, and
wherein the size of the final reference block is larger than the size of the reference block.
The method of claim 2,
wherein the extrapolation generates extrapolated reference pixels based on at least one reference pixel in the reference block, and the final reference block includes the extrapolated reference pixels and the reference pixels in the reference block.
The method of claim 4,
wherein the interpolation is performed using the extrapolated reference pixels located in the horizontal or vertical direction of the sub-pixel to be interpolated and the reference pixels in the reference block.
An image decoding apparatus comprising:
an entropy decoder configured to obtain motion information for a current block; and
a prediction unit configured to generate a prediction block corresponding to the current block based on the motion information,
wherein the motion information includes a motion vector calculated based on sub-pixels in sub-integer units obtained through extrapolation and interpolation of a reference block.
An image encoding method comprising:
performing prediction on a current block based on a motion vector calculated using sub-pixels in sub-integer units; and
entropy encoding information about the prediction,
wherein the sub-pixels in sub-integer units are generated through extrapolation and interpolation of a reference block.
The method of claim 7,
wherein, in the performing of the prediction, extrapolation is performed based on reference pixels in the reference block to obtain a final reference block, and interpolation is performed based on the final reference block to calculate the motion vector.
The method of claim 8,
wherein the size of the reference block is greater than or equal to the size of the current block and less than the size of the current block plus the length of the interpolation filter tap minus 1, and
wherein the size of the final reference block is larger than the size of the reference block.
The method of claim 8,
wherein the extrapolation generates extrapolated reference pixels based on at least one reference pixel in the reference block, and the final reference block includes the extrapolated reference pixels and the reference pixels in the reference block.
The method of claim 10,
wherein the interpolation is performed using the extrapolated reference pixels located in the horizontal or vertical direction of the sub-pixel to be interpolated and the reference pixels in the reference block.
An image encoding apparatus comprising:
a prediction unit configured to perform prediction on a current block based on a motion vector calculated using sub-pixels in sub-integer units; and
an entropy encoder configured to entropy encode information about the prediction,
wherein the sub-pixels in sub-integer units are generated through extrapolation and interpolation of a reference block.
KR1020120125803A 2011-11-08 2012-11-08 Encoding method and apparatus, and decoding method and apparatus of image KR20130050899A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/KR2012/009372 WO2013069974A1 (en) 2011-11-08 2012-11-08 Method and apparatus for encoding image, and method and apparatus for decoding image

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020110116132 2011-11-08
KR20110116132 2011-11-08

Publications (1)

Publication Number Publication Date
KR20130050899A true KR20130050899A (en) 2013-05-16

Family

ID=48661125

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020120125803A KR20130050899A (en) 2011-11-08 2012-11-08 Encoding method and apparatus, and decoding method and apparatus of image

Country Status (1)

Country Link
KR (1) KR20130050899A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110720220B (en) * 2017-08-29 2023-11-07 株式会社Kt Video signal processing method and device
CN110720220A (en) * 2017-08-29 2020-01-21 株式会社Kt Video signal processing method and device
US11825117B2 (en) 2018-01-15 2023-11-21 Samsung Electronics Co., Ltd. Encoding method and apparatus therefor, and decoding method and apparatus therefor
KR20200039005A (en) * 2018-01-15 2020-04-14 삼성전자주식회사 Coding method and apparatus, decoding method and apparatus
CN111602393A (en) * 2018-01-15 2020-08-28 三星电子株式会社 Encoding method and apparatus thereof, and decoding method and apparatus thereof
KR20210010657A (en) * 2018-01-15 2021-01-27 삼성전자주식회사 Encoding method and apparatus therefor, and decoding method and apparatus therefor
WO2019139309A1 (en) * 2018-01-15 2019-07-18 삼성전자 주식회사 Encoding method and apparatus therefor, and decoding method and apparatus therefor
CN113196777A (en) * 2018-12-17 2021-07-30 北京字节跳动网络技术有限公司 Reference pixel filling for motion compensation
CN113228637A (en) * 2018-12-17 2021-08-06 北京字节跳动网络技术有限公司 Shape dependent interpolation filter
CN113196777B (en) * 2018-12-17 2024-04-19 北京字节跳动网络技术有限公司 Reference pixel padding for motion compensation
CN110662071B (en) * 2019-09-27 2023-10-24 腾讯科技(深圳)有限公司 Video decoding method and device, storage medium and electronic device
CN110662071A (en) * 2019-09-27 2020-01-07 腾讯科技(深圳)有限公司 Video decoding method and apparatus, storage medium, and electronic apparatus
WO2024143566A1 (en) * 2022-12-26 2024-07-04 광운대학교 산학협력단 Method and apparatus for inter-video screen prediction using event frames
CN116600119A (en) * 2023-07-18 2023-08-15 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium
CN116600119B (en) * 2023-07-18 2023-11-03 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
KR101552634B1 (en) Method of coefficient scan based on partition mode of prediction unit and apparatus using the same
KR101539107B1 (en) Method for encoding image, method for decoding image, image encoder, and image decoder
CN107197272B (en) Method for encoding image in merge mode
KR101911012B1 (en) Method for managing a reference picture list, and apparatus using same
KR101620620B1 (en) Method for predicting quantization parameter based on intra prediction mode and apparatus using the same
KR20130050899A (en) Encoding method and apparatus, and decoding method and apparatus of image
KR20210153547A (en) method and apparatus for encoding/decoding a VIDEO SIGNAL, and a recording medium storing a bitstream
KR20130050898A (en) Encoding method and apparatus, and decoding method and apparatus of image

Legal Events

Date Code Title Description
WITN Withdrawal due to no request for examination