KR20130107414A

KR20130107414A - Video coding method using adaptive division transform

Info

Publication number: KR20130107414A
Application number: KR1020120029156A
Authority: KR
Inventors: 김연희; 전동산; 정순흥; 최진수; 김진웅
Original assignee: 한국전자통신연구원
Priority date: 2012-03-22
Filing date: 2012-03-22
Publication date: 2013-10-02

Abstract

The present invention relates to a video encoding method in a video encoding apparatus.
According to an embodiment of the present invention, an image encoding method includes: acquiring a prediction block for each of a plurality of candidate encoding modes; generating a difference block based on a difference between the prediction block and an input block; determining whether to split-transform the difference block; determining a split transform size of the difference block; split-transforming the difference block at the split transform size to generate a split transform block; calculating a rate-distortion based encoding cost of the split transform block; and determining a final encoding mode of the input block from among the plurality of candidate encoding modes based on the rate-distortion based encoding cost.

Description

VIDEO CODING METHOD USING ADAPTIVE DIVISION TRANSFORM

The present invention relates to a method of encoding an image, and more particularly, to a method of determining an encoding mode of an image.

Recently, as broadcasting systems supporting HD (High Definition) resolution have expanded not only in Korea but around the world, many users have become accustomed to high-resolution, high-quality images, and many organizations are accelerating the development of next-generation video equipment. In addition, as interest grows in Ultra High Definition (UHD), which offers at least four times the resolution of HDTV, a compression technology for higher-resolution and higher-quality images is required.

Image compression may use an inter prediction technique that predicts pixel values in the current picture from a preceding and/or succeeding picture, an intra prediction technique that predicts pixel values using pixel information within the picture, and an entropy coding technique that assigns a short code to a symbol with a high frequency of occurrence and a long code to a symbol with a low frequency of occurrence.

The present invention provides a method of encoding an image using an adaptive split transform.

The present invention provides a method for adaptively determining whether or not to perform split transformation on an encoding target block.

The present invention provides a method for adaptively determining the size of a split transform of a block to be encoded.

According to an embodiment of the present invention, an image encoding method in an image encoding apparatus is provided. The image encoding method may include: obtaining a prediction block of each of a plurality of candidate encoding modes; generating a difference block based on a difference between the prediction block and an input block; determining whether to split-transform the difference block; determining a split transform size of the difference block; generating a split transform block by split-transforming the difference block at the split transform size; calculating a rate-distortion based encoding cost of the split transform block; and determining a final encoding mode of the input block from among the plurality of candidate encoding modes based on the rate-distortion based encoding cost.

According to the present invention, the complexity of the image encoding apparatus can be reduced.

FIG. 1 is a block diagram illustrating an example of the structure of an image encoding apparatus.
FIG. 2 is a block diagram illustrating an example of the structure of an image decoding apparatus.
FIG. 3 is a flowchart illustrating an image encoding method according to an embodiment of the present invention.
FIG. 4 illustrates the intra prediction modes for the 33 directions currently supported by the HEVC test model (HM).
FIG. 5 is a conceptual diagram schematically illustrating an embodiment in which one block is divided into a plurality of sub-blocks.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. However, in describing the embodiments of the present invention, when it is determined that the detailed description of the known configuration or function may obscure the gist of the present invention, the detailed description thereof will be omitted.

When a component is described as being "connected" or "coupled" to another component, it may be directly connected or coupled to that component, but another component may also be present in between. In addition, when the present invention is described as "including" a specific component, this does not exclude components other than that component; it means that additional components may be included in the embodiments or within the technical spirit of the present invention.

Terms such as "first" and "second" may be used to describe various components, but the components are not limited by the terms. In other words, the terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as the second component, and likewise, the second component may be referred to as the first component.

In addition, the components shown in the embodiments of the present invention are shown independently to indicate that they perform different characteristic functions; this does not mean that each component cannot be implemented as a single piece of hardware or software. That is, each component is listed separately for convenience of description, and a plurality of components may be combined to operate as one component, or one component may be divided to operate as a plurality of components. Such combined and divided embodiments are included in the scope of the present invention as long as they do not depart from its essence.

In addition, some components may be optional components for improving performance rather than essential components for performing essential functions of the present invention. The present invention may be implemented in a structure including only essential components except for optional components, and a structure including only essential components is also included in the scope of the present invention.

FIG. 1 is a block diagram illustrating an example of the structure of an image encoding apparatus.

Referring to FIG. 1, the image encoding apparatus 100 includes a motion predictor 111, a motion compensator 112, an intra predictor 120, a switch 115, a subtractor 125, a transformer 130, a quantization unit 140, an entropy encoding unit 150, an inverse quantization unit 160, an inverse transform unit 170, an adder 175, a filter unit 180, and a reference picture buffer 190.

The image encoding apparatus 100 encodes an input image in an intra prediction mode or an inter prediction mode and outputs a bitstream. Intra prediction means prediction within a picture, and inter prediction means prediction between pictures. The image encoding apparatus 100 switches between the intra prediction mode and the inter prediction mode by means of the switch 115. The image encoding apparatus 100 generates a prediction block for an input block of the input image and then encodes the residual between the input block and the prediction block.

In the intra prediction mode, the intra predictor 120 generates a prediction block by performing spatial prediction using pixel values of blocks already encoded around the current block.
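As a rough illustration of spatial prediction from neighboring samples, the sketch below implements a DC-style predictor, which fills the block with the rounded mean of the samples above and to the left. The function name `dc_intra_prediction` and the sample values are assumptions made for illustration; the source does not specify which intra modes the intra predictor 120 evaluates or how.

```python
def dc_intra_prediction(top, left, size):
    """Fill a size x size block with the rounded mean of neighboring samples.

    `top` and `left` hold already-encoded samples above and to the left of
    the current block (hypothetical inputs for this sketch).
    """
    neighbors = list(top[:size]) + list(left[:size])
    # Integer mean with rounding to nearest.
    dc = (sum(neighbors) + len(neighbors) // 2) // len(neighbors)
    return [[dc] * size for _ in range(size)]


top_row = [10, 10, 10, 10]
left_col = [20, 20, 20, 20]
pred = dc_intra_prediction(top_row, left_col, 4)
```

Directional modes would instead propagate the neighboring samples along an angle; the DC case is shown only because it is the simplest to verify by hand.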

In the inter prediction mode, the motion predictor 111 searches the reference picture stored in the reference picture buffer 190 for the region that best matches the input block and obtains the corresponding motion vector. The motion compensator 112 generates a prediction block by performing motion compensation using the motion vector. Here, the motion vector is a two-dimensional vector used for inter prediction, and represents the offset between the block currently being encoded/decoded and the reference block.
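The search for a best match can be sketched as an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). This is only one possible matching criterion and search strategy, assumed here for illustration; the names `sad`, `crop`, and `full_search`, and the toy frame, are hypothetical and not taken from the source.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(frame, x, y, size):
    """Extract the size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def full_search(cur_block, ref_frame, x0, y0, search_range):
    """Return ((dx, dy), cost) of the best match around position (x0, y0)."""
    size = len(cur_block)
    height, width = len(ref_frame), len(ref_frame[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = x0 + dx, y0 + dy
            # Skip candidates that fall outside the reference picture.
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue
            cost = sad(cur_block, crop(ref_frame, x, y, size))
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Toy reference picture; the current block at (3, 3) matches the content at (4, 2).
ref = [[x + 10 * y for x in range(8)] for y in range(8)]
cur = crop(ref, 4, 2, 2)
mv, cost = full_search(cur, ref, x0=3, y0=3, search_range=2)
```

The returned pair `(dx, dy)` plays the role of the motion vector described above: the offset from the current block position to the reference block.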

The subtractor 125 generates a difference block (residual block) based on the difference between the input block and the prediction block, and the transformer 130 transforms the difference block and outputs transform coefficients. The quantization unit 140 quantizes the transform coefficients and outputs quantized coefficients.

The entropy encoder 150 outputs a bitstream by performing entropy encoding based on information obtained in the encoding/quantization process. Entropy encoding reduces the size of the bit string for the symbols to be encoded by representing frequently occurring symbols with fewer bits. Entropy encoding is therefore expected to improve the compression performance of the image. The entropy encoder 150 may use an encoding method such as exponential Golomb coding, context-adaptive variable length coding (CAVLC), or context-adaptive binary arithmetic coding (CABAC) for entropy encoding.
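To make the "fewer bits for frequent symbols" idea concrete, the following sketch implements order-0 unsigned exponential Golomb coding, one of the methods named above. It assumes that smaller values are the more frequent ones; the function names `ue_encode` and `ue_decode` are chosen for this illustration only.

```python
def ue_encode(value: int) -> str:
    """Encode a non-negative integer as an order-0 Exp-Golomb bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]          # binary form of value + 1
    return "0" * (len(binary) - 1) + binary  # zero prefix signals the length


def ue_decode(bits: str) -> int:
    """Decode a single order-0 Exp-Golomb codeword."""
    zeros = len(bits) - len(bits.lstrip("0"))
    return int(bits[zeros:2 * zeros + 1], 2) - 1


codewords = [ue_encode(v) for v in range(4)]
```

Value 0 costs a single bit while value 3 costs five, so a source dominated by small values is represented compactly.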

The coded picture needs to be decoded and stored again to be used as a reference picture for performing inter prediction coding. Accordingly, the inverse quantization unit 160 inverse quantizes the quantized coefficients, and the inverse transform unit 170 inverse transforms the inverse quantized coefficients to output the reconstructed difference block. The adder 175 adds the reconstructed difference block to the prediction block to generate a reconstruction block.
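The path from subtractor to adder can be sketched end to end as follows. The sketch assumes a floating-point orthonormal DCT and a uniform quantizer with step `qstep`; real codecs use integer transforms and more elaborate quantization, and none of the function names below come from the source.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def encode_and_reconstruct(input_block, pred_block, qstep):
    """Return (quantized levels, reconstruction block) for a square block."""
    n = len(input_block)
    c = dct_matrix(n)
    ct = transpose(c)
    # Subtractor: difference block.
    diff = [[input_block[y][x] - pred_block[y][x] for x in range(n)]
            for y in range(n)]
    # Transformer and quantizer.
    coeffs = matmul(matmul(c, diff), ct)
    levels = [[round(v / qstep) for v in row] for row in coeffs]
    # Inverse quantizer and inverse transformer.
    dequant = [[level * qstep for level in row] for row in levels]
    recon_diff = matmul(matmul(ct, dequant), c)
    # Adder: reconstructed difference block plus prediction block.
    recon = [[pred_block[y][x] + round(recon_diff[y][x]) for x in range(n)]
             for y in range(n)]
    return levels, recon


pred_block = [[100] * 4 for _ in range(4)]
input_block = [[104] * 4 for _ in range(4)]
levels, recon = encode_and_reconstruct(input_block, pred_block, qstep=8)
```

Because the encoder reconstructs from the quantized levels rather than from the original samples, its reference pictures stay identical to those the decoder will produce.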

The filter unit 180 may also be referred to as an adaptive in-loop filter, and applies at least one of deblocking filtering, sample adaptive offset (SAO) compensation, and adaptive loop filtering (ALF) to the reconstruction block. Deblocking filtering removes block distortion at inter-block boundaries, and SAO compensation adds an appropriate offset to pixel values to compensate for coding errors. ALF performs filtering based on a value obtained by comparing the reconstructed image with the original image.

The reference picture buffer 190 stores the reconstructed block that has passed through the filter unit 180.

FIG. 2 is a block diagram illustrating an example of the structure of an image decoding apparatus.

Referring to FIG. 2, the image decoding apparatus 200 includes an entropy decoder 210, an inverse quantizer 220, an inverse transformer 230, an intra predictor 240, a motion compensator 250, an adder 255, a filter unit 260, and a reference picture buffer 270.

The image decoding apparatus 200 outputs a reconstructed image by decoding the bitstream in an intra prediction mode or an inter prediction mode. The image decoding apparatus 200 transitions between the intra prediction mode and the inter prediction mode by switching a switch. The image decoding apparatus 200 obtains a difference block from a bitstream, generates a prediction block, and then adds the difference block and the prediction block to generate a reconstruction block.

The entropy decoder 210 performs entropy decoding based on a probability distribution. The entropy decoding process is the reverse of the entropy encoding process described above. That is, the entropy decoder 210 generates symbols, including quantized coefficients, from a bitstream in which frequently occurring symbols are represented with a small number of bits.

The inverse quantizer 220 inversely quantizes the quantized coefficients, and the inverse transformer 230 inversely transforms the inverse quantized coefficients to generate a difference block.

In the intra prediction mode, the intra predictor 240 generates a predictive block by performing spatial prediction using pixel values of blocks already decoded around the current block.

In the inter-prediction mode, the motion compensation unit 250 performs motion compensation using a motion vector and a reference picture stored in the reference picture buffer 270 to generate a prediction block.

The adder 255 adds the prediction block to the difference block, and the filter unit 260 outputs the reconstructed image by applying at least one of deblocking filtering, SAO compensation, and ALF to the block that has passed through the adder.

The reconstructed image may be stored in the reference picture buffer 270 to be used for motion compensation.

Hereinafter, a block means a unit of encoding/decoding. In the encoding/decoding process, an image is divided into units of a predetermined size, which are then encoded/decoded. Therefore, a block may also be called a macroblock (MB), a coding unit (CU), a prediction unit (PU), a transform unit (TU), or the like. One block may be further divided into sub-blocks of smaller size.

Here, the prediction unit means a basic unit of performing prediction and / or motion compensation. The prediction unit may be divided into a plurality of partitions, and each partition is called a prediction unit partition. When the prediction unit is divided into a plurality of partitions, the prediction unit partition may be a basic unit of performing prediction and / or motion compensation. Hereinafter, in an embodiment of the present invention, the prediction unit may mean a prediction unit partition.

Meanwhile, in the image encoding method, an optimal encoding mode is determined using a rate-distortion based optimization technique. That is, for every candidate coding mode, the coding unit is transformed at each possible size, and the candidate coding mode with the minimum rate-distortion based coding cost is determined as the final coding mode. This process is essential for encoding high-resolution video because it maximizes coding efficiency. However, because of its high complexity, it is difficult to apply to a broadcast video encoding apparatus that must encode in real time.

Within the rate-distortion based optimization technique, the present invention provides a method for determining, from features of the image, whether a coding unit is split-transformed when the rate-distortion based encoding cost is calculated. It also provides a method for adaptively determining the split transform size. Therefore, according to the present invention, part of the rate-distortion based encoding cost calculation is eliminated by using the adaptive split transform, so that coding efficiency can be maintained while the complexity of the image encoding apparatus is reduced.

Meanwhile, the following description is based on an encoding method and apparatus, but a person having ordinary knowledge in the art to which the present invention belongs can easily apply the technical idea of the present invention to a decoding method and apparatus.

FIG. 3 is a flowchart illustrating an image encoding method according to an embodiment of the present invention.

Referring to FIG. 3, the image encoding method according to an embodiment of the present invention includes a prediction block obtaining step (S310), a difference block generation step (S320), a split transform determination step (S330), a split transform size determination step (S340), a transform block generation step (S350), an encoding cost calculation step (S360), and a final encoding mode determination step (S370).

The image encoding apparatus obtains prediction blocks of each of the plurality of candidate encoding modes (S310). That is, the image encoding apparatus generates a prediction block corresponding to the candidate mode from the neighboring samples for each of the plurality of candidate encoding modes.

In the intra prediction mode, a prediction block of the block currently being encoded is generated using already decoded, spatially neighboring information as reference samples. Compared with the intra prediction techniques of past video compression standards, HEVC (High Efficiency Video Coding), which is currently being standardized, supports a larger number of prediction modes, as shown in Table 1.

Table 1. Number of intra prediction modes by prediction unit (PU) size

PU size    Number of intra prediction modes
4x4        18
8x8        35
16x16      35
32x32      35
64x64      4

H.264/AVC, one of the past video compression standards, supports up to nine intra prediction modes depending on the size of the prediction unit, while HEVC supports up to 35 intra prediction modes. FIG. 4 illustrates the intra prediction modes for the 33 directions currently supported by the HEVC test model (HM). The coding tools adopted for the standard are integrated into a single piece of software for fair verification and easy development of coding tools; this software is called the HM.

Referring back to FIG. 3, the image encoding apparatus generates a difference block based on the difference between the prediction block and the input block obtained through the prediction block obtaining step S310 (S320). That is, the difference image between the original image and the predicted image is obtained.

In addition, the image encoding apparatus determines whether to split-transform the difference block generated in the difference block generation step S320 (S330). In this case, whether to split-transform the difference block may be determined in consideration of the complexity of the image, whether a neighboring block was split-transformed, or a boundary error.

The rate-distortion based encoding cost is calculated from the result of encoding the coefficients obtained by transforming the difference block generated in the difference block generation step S320. Accordingly, the rate-distortion based encoding cost varies with the split transform size of the difference block.

In the conventional method of determining a coding mode using a rate-distortion based optimization technique, a split transform is performed at a plurality of split transform sizes for all candidate coding modes, and the mode and split transform size with the lowest cost are then determined as the final coding mode and the final split transform size. However, since calculating the rate-distortion based encoding cost is highly complex, the existing encoding mode determination method, which obtains the cost for every split transform size, has the problem that its complexity keeps increasing.

An image encoding apparatus according to an embodiment of the present invention performs the split transform adaptively in order to improve coding efficiency relative to complexity. For example, in a method of determining an optimal encoding mode using a rate-distortion based optimization technique according to an embodiment of the present invention, whether to perform the split transform is determined according to the complexity of the image and whether a neighboring block was split-transformed.
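The source does not give a concrete decision rule, so the following is only a minimal sketch of one way such a decision might be written. It assumes that image complexity is measured by the sample variance of the difference block and that the split transform is attempted when any neighboring block was split; the function names, the variance measure, and the threshold value are all hypothetical.

```python
def block_variance(block):
    """Population variance of all samples in a block."""
    samples = [v for row in block for v in row]
    mean = sum(samples) / len(samples)
    return sum((v - mean) ** 2 for v in samples) / len(samples)


def should_split_transform(diff_block, neighbor_split_flags,
                           variance_threshold=25.0):
    """Hypothetical rule: split if a neighbor split or the block is complex."""
    if any(neighbor_split_flags):
        return True
    return block_variance(diff_block) > variance_threshold


flat_block = [[1, 1], [1, 1]]
busy_block = [[0, 20], [20, 0]]
decisions = (
    should_split_transform(flat_block, []),
    should_split_transform(busy_block, []),
    should_split_transform(flat_block, [True, False]),
)
```

When the rule returns False, the encoder can skip the cost calculations for the smaller transform sizes, which is where the complexity saving described above would come from.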

Referring back to FIG. 3, when it is determined that the difference block is to be split-transformed, the image encoding apparatus determines the split transform size of the difference block (S340). In this case, the split transform size of the difference block may be a single predetermined size or a predetermined smaller size. That is, the difference block may be transformed at a predetermined size, or divided into blocks of a predetermined smaller size and then transformed.

In addition, the image encoding apparatus generates a split transform block by split-transforming the difference block at the split transform size determined in the split transform size determination step S340 (S350), and calculates the rate-distortion based encoding cost of the split transform block (S360).

For example, when the encoding mode is determined in consideration of the rate-distortion based encoding cost of the split-transformed difference block, the rate-distortion based encoding cost of the difference block is calculated first, the difference block is then split-transformed down to a predetermined depth, and the rate-distortion based encoding cost of each split transform block is obtained.

FIG. 5 is a conceptual diagram schematically illustrating an embodiment in which one block is divided into a plurality of sub-blocks.

The depth of a block means the degree to which the block is divided. For example, as shown in FIG. 5, the root node that is the highest node in the tree structure is the shallowest, and the level 3 leaf node has the deepest depth.

According to an embodiment of the present invention, when the predetermined split transform depth is 3, the image encoding apparatus may split-transform the difference block into the blocks (a, b, j), (c, h, i), and (d, e, f, g), and obtain the rate-distortion based encoding cost of each split transform block.
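The notion of depth can be illustrated with a recursive quadtree partition that lists each leaf block with its position, size, and depth. This is a sketch under assumptions: the 32x32 root size, the `wants_split` predicate, and the function name are hypothetical, and the actual block positions of FIG. 5 are not reproduced here.

```python
def quadtree_blocks(x, y, size, max_depth, wants_split, depth=0):
    """Return the leaf blocks of a quadtree as (x, y, size, depth) tuples."""
    if depth < max_depth and size > 1 and wants_split(x, y, size, depth):
        half = size // 2
        leaves = []
        for oy in (0, half):
            for ox in (0, half):
                leaves += quadtree_blocks(x + ox, y + oy, half,
                                          max_depth, wants_split, depth + 1)
        return leaves
    return [(x, y, size, depth)]


# Hypothetical predicate: keep splitting only the block at the origin.
leaves = quadtree_blocks(0, 0, 32, 3, lambda x, y, size, depth: (x, y) == (0, 0))
leaves_per_depth = {d: sum(1 for leaf in leaves if leaf[3] == d) for d in (1, 2, 3)}
```

With this predicate the partition has three leaves at depth 1, three at depth 2, and four at depth 3, which is consistent with the grouping (a, b, j), (c, h, i), (d, e, f, g) mentioned in the text, although which quadrant is split at each level is an assumption.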

Meanwhile, the rate-distortion based encoding cost is calculated from the viewpoint of both rate and distortion, and the rate-distortion optimization technique selects an optimal encoding mode in consideration of this cost.

Therefore, the image encoding apparatus calculates the rate-distortion based encoding cost from the amount of bits produced when the transform coefficients of the split transform block generated in the transform block generation step S350 are encoded, and from the difference in image quality between the input block and the reconstruction block.
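A common way to combine bit amount and quality difference is the Lagrangian cost J = D + λ·R. The source does not state its exact cost formula, so the sketch below assumes this form, with distortion D measured as the sum of squared differences (SSD) and a caller-supplied multiplier; the names `ssd`, `rd_cost`, and `lagrange_multiplier` are illustrative.

```python
def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def rd_cost(input_block, recon_block, bits, lagrange_multiplier):
    """Assumed cost J = D + lambda * R, with D as SSD and R in bits."""
    return ssd(input_block, recon_block) + lagrange_multiplier * bits


example_cost = rd_cost([[1, 2], [3, 4]], [[1, 2], [3, 6]],
                       bits=10, lagrange_multiplier=0.5)
```

A larger multiplier penalizes bits more heavily and therefore favors candidates that spend fewer bits at the expense of higher distortion.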

Referring to FIG. 3 again, the apparatus for encoding an image determines a final encoding mode based on the rate-distortion based encoding cost calculated through the encoding cost calculating step S360 (S370). That is, the encoding mode having the lowest rate-distortion based encoding cost among the plurality of candidate encoding modes is determined as the final encoding mode.

In this case, whether or not to perform a split transform and a split transform size may be encoded or stored together with the determined mode.
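Step S370 together with the stored split information can be sketched as a minimum-cost selection over candidate records. The record layout, the mode names, and the cost values below are hypothetical and serve only to show the split flag and split transform size being carried alongside the chosen mode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    mode: str                  # candidate encoding mode (hypothetical names)
    split: bool                # whether the split transform was performed
    split_size: Optional[int]  # split transform size, if any
    cost: float                # rate-distortion based encoding cost


def select_final_mode(candidates):
    """Return the candidate with the lowest rate-distortion based cost."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(candidates, key=lambda c: c.cost)


candidates = [
    Candidate("intra_dc", False, None, 120.0),
    Candidate("intra_angular_10", True, 8, 95.5),
    Candidate("inter_2Nx2N", True, 16, 101.0),
]
best = select_final_mode(candidates)
```

Returning the whole record rather than only the mode name keeps the split decision and split transform size available for encoding or storage with the final mode.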

Meanwhile, although the above-described embodiments are described with reference to a flowchart represented as a series of steps or blocks, the present invention is not limited to the order of these steps, and some steps may occur in a different order from, or at the same time as, other steps. In addition, one of ordinary skill in the art will appreciate that the steps shown in the flowchart are not exclusive, and that other steps may be included or some steps may be deleted.

In addition, the above-described embodiments include examples of various aspects. While not all possible combinations can be described to illustrate various aspects, one of ordinary skill in the art will recognize that other combinations are possible. Accordingly, it is intended that the invention include all alternatives, modifications and variations that fall within the scope of the following claims.

Claims

An image encoding method in an image encoding apparatus, the method comprising:
obtaining a prediction block of each of a plurality of candidate encoding modes;
generating a difference block based on a difference between the prediction block and an input block;
determining whether to split-transform the difference block;
when the difference block is to be split-transformed,
determining a split transform size of the difference block;
generating a split transform block by split-transforming the difference block at the split transform size;
calculating a rate-distortion based encoding cost of the split transform block; and
determining a final encoding mode of the input block from among the plurality of candidate encoding modes based on the rate-distortion based encoding cost.