KR20160108958A - Video Coding Method and Apparatus thereof

Info

Publication number: KR20160108958A
Application number: KR1020150032457A
Authority: KR
Inventors: 전동산; 김종호; 이진호; 이하현; 임성창; 강정원; 김휘용; 최진수
Original assignee: 한국전자통신연구원
Priority date: 2015-03-09
Filing date: 2015-03-09
Publication date: 2016-09-21

Abstract

The present invention relates to a video coding method and an apparatus therefor. According to the present invention, the video coding method comprises: determining a minimum depth and a maximum depth for the division depth of a block to be currently coded, based on the division depth of an already encoded neighboring block; determining the division depth of the block to be currently coded based on the determined minimum depth, the maximum depth, and an image characteristic of the block to be currently coded; and outputting the encoded image according to the determined division depth.

Description

[0001] The present invention relates to a video coding method and an apparatus therefor.

JCT-VC, which was recently formed jointly by the ISO/IEC MPEG and ITU-T VCEG standardization groups, has completed the standardization of HEVC (High Efficiency Video Coding) as the next-generation video standard. HEVC achieves a subjective image quality improvement of about 50% or more compared with H.264/AVC, which had provided the highest compression ratio among existing image compression technologies. To this end, sophisticated new coding tools capable of improving compression efficiency were proposed.

Like conventional image compression methods, HEVC uses an inter prediction coding technique that predicts pixel values in the current picture from temporally previous or subsequent reference pictures, an intra prediction coding technique that predicts pixel values in the current picture using reference pixel information within the current picture, and an entropy coding technique that assigns short codes to symbols with a high frequency of occurrence and long codes to symbols with a low frequency of occurrence.

Conventional video codecs encode in units of macroblocks (16 × 16), whereas HEVC, in order to improve compression efficiency for high-resolution video, uses a block structure consisting of the CU (Coding Unit), PU (Prediction Unit), and TU (Transform Unit), which allows optimal compression efficiency to be achieved per unit.

HEVC supports a quadtree structure as its coding structure. In HEVC, a coding unit (hereinafter referred to as a block) has a variable size and is organized hierarchically from 64 × 64 down to 8 × 8 in a quadtree structure, as shown in FIG. 1. As a result, block sizes in HEVC form layers of 64 × 64, 32 × 32, 16 × 16, and 8 × 8. A block larger than a block of a given size is referred to as a parent block, and a smaller block as a sub-block. A 64 × 64 block is the largest block supported by HEVC and may be called a root node, a largest block, and so on.

Each layer in the quadtree structure can have depth or level information. The depth indicates the number of times and/or the degree to which a block has been divided, and therefore may carry information about the size of a sub-block. In HEVC, the depth of each layer of the quadtree is expressed as a constant, as follows.

- 64 × 64 = Depth "0"

- 32 × 32 = Depth "1"

- 16 × 16 = Depth "2"

- 8 × 8 = Depth "3"

According to the above description, the larger the block size, the lower the depth, and the smaller the block size, the higher the depth. The largest block (64x64 block) supported by HEVC has minimum depth (0) and the smallest block (8x8 block) has maximum depth (3).

In the following embodiments, the terms "depth", "division depth", "maximum depth", and "minimum depth" are used in the sense described above.

When encoding is performed, the maximum and minimum sizes of a block are set by input parameters at encoding time, and once the maximum and minimum sizes are set, the division depth is determined. The division depth may mean the range from the depth corresponding to the maximum block size to the depth corresponding to the minimum block size. For example, if a block is to be divided from a maximum size of 32x32 down to a minimum size of 8x8, the division depth ranges from a minimum depth of 1 to a maximum depth of 3.
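The relationship between block size, depth, and division depth range described above can be illustrated with a minimal Python sketch. The function names `depth_of`, `size_of`, and `division_depth_range` are hypothetical and chosen only for illustration; the sketch assumes square blocks whose sizes are power-of-two fractions of the 64x64 largest block.

```python
LARGEST_BLOCK_SIZE = 64  # largest block supported by HEVC, depth 0


def depth_of(block_size: int, largest: int = LARGEST_BLOCK_SIZE) -> int:
    """Return the quadtree depth of a square block of the given size."""
    if block_size <= 0 or largest % block_size != 0:
        raise ValueError("block size must evenly divide the largest block size")
    ratio = largest // block_size
    if ratio & (ratio - 1):
        raise ValueError("block size must be a power-of-two fraction of the largest block")
    # Each quadtree split halves the block width, so depth = log2(ratio).
    return ratio.bit_length() - 1


def size_of(depth: int, largest: int = LARGEST_BLOCK_SIZE) -> int:
    """Return the block width corresponding to a quadtree depth."""
    return largest >> depth


def division_depth_range(max_size: int, min_size: int) -> tuple:
    """Return (minimum depth, maximum depth) for the configured block sizes."""
    return depth_of(max_size), depth_of(min_size)


if __name__ == "__main__":
    # Example from the text: 32x32 down to 8x8 gives depths 1 through 3.
    print(division_depth_range(32, 8))
```

Note that the larger block size maps to the smaller depth, so the maximum block size yields the minimum depth.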

In encoding based on the quadtree structure, a block is recursively partitioned from the maximum size to the minimum size according to the division depth. After calculating the rate-distortion cost for a maximum-size block (for example, a 64x64 block), the encoder divides the block into sub-blocks (for example, 32x32 blocks) and calculates the rate-distortion cost for each sub-block. The rate-distortion cost calculation and the block division are repeated until the rate-distortion cost has been calculated for blocks of the size corresponding to the division depth (for example, 8x8 blocks). The rate-distortion costs are then compared to determine the optimal block size, and encoding is performed on blocks of the determined size.

Although this conventional scheme can improve coding efficiency, it increases the computational complexity of the encoder, because the rate-distortion cost must be calculated for every block from the maximum-size block down to the minimum-size blocks.
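To make the cost of the conventional scheme concrete, the following Python sketch counts the rate-distortion evaluations of an exhaustive quadtree search. It is an illustration under stated assumptions, not the reference encoder: `exhaustive_split_search` is a hypothetical name, and `rd_cost` is a caller-supplied placeholder standing in for the real rate-distortion cost calculation.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block


def exhaustive_split_search(
    block: Block,
    depth: int,
    max_depth: int,
    rd_cost: Callable[[Block], float],
    counter: List[int],
) -> float:
    """Return the best cost for `block`, evaluating every depth up to max_depth."""
    x, y, size = block
    counter[0] += 1  # one rate-distortion evaluation for this block
    cost_unsplit = rd_cost(block)
    if depth == max_depth:
        return cost_unsplit
    half = size // 2
    # Recurse into the four sub-blocks of the next depth.
    cost_split = sum(
        exhaustive_split_search((x + dx, y + dy, half), depth + 1,
                                max_depth, rd_cost, counter)
        for dy in (0, half)
        for dx in (0, half)
    )
    # Keep whichever alternative has the lower rate-distortion cost.
    return min(cost_unsplit, cost_split)


if __name__ == "__main__":
    evaluations = [0]
    # Placeholder cost (block area); a real encoder would compute J here.
    exhaustive_split_search((0, 0, 64), 0, 3, lambda b: float(b[2] * b[2]), evaluations)
    print(evaluations[0])  # 1 + 4 + 16 + 64 evaluations
```

For a single 64x64 block searched down to 8x8, the sketch performs 1 + 4 + 16 + 64 = 85 cost evaluations, which is the overhead the scheme described in this document aims to reduce.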

The present invention has been made to solve the above-mentioned problems, and an object of the present invention is to provide a video coding method, and an apparatus therefor, that reduce the complexity of an encoder by efficiently determining the depth to which a current block is to be divided.

The present invention relates to a video coding method and apparatus for determining a division depth based on division depth information of a neighboring block that has been already encoded and an image characteristic of a block to be coded in a coding based on a quad tree structure.

According to an aspect of the present invention, there is provided a video coding method comprising: determining a minimum depth and a maximum depth for the division depth of a current block to be coded, based on the division depth of a neighboring block that has already been coded; determining the division depth of the current block to be coded based on the determined minimum depth, the maximum depth, and an image characteristic of the current block to be coded; and outputting the coded image according to the determined division depth.

The video coding method and apparatus according to the present invention can quickly determine the division depth of a block in the quad tree-based block coding process supported by the HEVC.

Also, the video encoding method and apparatus according to the present invention can reduce the complexity of the encoder by determining the minimum block size to be 16x16 instead of 8x8 for any size block.

FIG. 1 is a diagram for explaining a quadtree structure supported by HEVC.
FIG. 2 is a flowchart illustrating a video encoding method according to the present invention.
FIG. 3 is a diagram showing an example of a neighboring block that has already been encoded.
FIG. 4 is a flowchart illustrating a divided block determination method according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining the calculation of the inner boundary absolute difference accumulation value.
FIG. 6 is a block diagram illustrating a structure of an encoder according to the present invention.

In describing the embodiments of the present invention, if it is determined that a detailed description of a related known structure or function may obscure the gist of the present invention, the detailed description thereof may be omitted.

As used herein, a coding unit refers to a basic unit of video encoding and decoding. The basic unit refers to a unit into which one picture is divided for encoding or decoding, and the coding unit may also be called a unit, a block, a macroblock, and the like. The coding unit may correspond to a prediction unit (PU) or a transform unit (TU). One coding unit can be divided into sub-coding units of smaller size.

As used herein, a block means an M x N array of samples, where M and N are arbitrary positive integers. A block as described in the present invention may mean a coding unit.

When an element is referred to herein as being "connected" or "coupled" to another element, the element may be directly connected or coupled to the other element, but it should be understood that other elements may exist between the two.

Terms such as "include" and "may include" as used herein indicate the existence of the disclosed function, operation, component, and the like, and do not exclude one or more additional functions, operations, components, and the like. Also, in this specification, terms such as "include" and "have" are intended to designate the presence of stated features, numbers, steps, operations, components, parts, or combinations thereof, and do not preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The constituent parts of the present invention are shown separately to represent different characteristic functions, which does not mean that each constituent part is composed of separate hardware or a single software unit. That is, the constituent parts are listed separately for convenience of explanation; at least two constituent parts may be combined into one, or one constituent part may be divided into a plurality of constituent parts that each perform a function. Embodiments in which the constituent parts are integrated and embodiments in which they are separated are both included within the scope of the present invention, as long as they do not depart from its essence.

Some components may not be essential components that perform essential functions of the present invention, but optional components that merely improve performance. The present invention can be implemented with only the components essential for realizing its essence, excluding the components used merely for performance improvement, and a structure that includes only the essential components and excludes the optional performance-improving components is also included in the scope of the present invention.

As used herein, the singular forms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise.

Hereinafter, the present invention will be described with reference to the accompanying drawings.

To provide high coding efficiency, a video encoder must select among coding parameters such as the coding unit size (which may further include the prediction unit size, the transform unit size, and the like), the coding mode (including the prediction mode), and motion information. The video encoder uses a rate-distortion optimization scheme to select an optimal combination of these factors. In general, the rate-distortion optimization scheme selects the combination that minimizes the rate-distortion cost.

The rate-distortion optimization scheme calculates the rate-distortion cost in order to select the optimal combination. The rate-distortion cost J can be calculated according to the following Equation (1).

In Equation (1), D denotes the distortion, namely the sum of squared differences (SSD) between the original transform coefficients and the reconstructed transform coefficients in the corresponding block. R denotes the number of bits required to encode the block, i.e., the bit rate, estimated using the related context information. R includes not only coding parameter information such as the prediction mode, motion information, and coded block flag, but also the bits generated when coding the transform coefficients. λ denotes the Lagrangian multiplier.
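Equation (1) itself is not reproduced in this text. Given the definitions of J, D, R, and the Lagrangian multiplier above, it presumably takes the standard Lagrangian form used in rate-distortion optimization. The form below is stated as an assumption, not as a quotation of the original equation:

```latex
J = D + \lambda \cdot R \tag{1}
```

Under this form, a larger multiplier weights the bit rate more heavily against the distortion, so the combination that minimizes J balances the two.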

To calculate D and R accurately, the video encoder must perform intra/inter prediction, transformation, quantization, entropy encoding, inverse quantization, and inverse transformation. This process greatly increases the complexity of the video encoder.

To solve the complexity problem of the rate-distortion optimization method, the present invention determines the division depth based on the division depth information of neighboring blocks that have already been coded and on the image characteristics of the block to be coded, thereby making it possible to determine the encoding mode quickly.

Hereinafter, the encoding method according to the present invention will be described in more detail.

FIG. 2 is a flowchart illustrating a video encoding method according to the present invention.

Referring to FIG. 2, the encoder according to the present invention determines a minimum depth and a maximum depth for the division depth of the block to be currently coded (201). The encoder determines the minimum depth (i.e., the minimum value of the division depth) and the maximum depth (i.e., the maximum value of the division depth) of the block to be currently coded based on the division depth information of neighboring blocks that have already been encoded. The division depth information of an already encoded neighboring block is obtained in advance, at the time the neighboring block is encoded.

In one embodiment, the neighboring blocks that have already been encoded may be the left and top blocks of the block to be currently coded, as shown in FIG. 3. Because coding generally proceeds from left to right and from top to bottom, the left and top blocks have already been encoded relative to the block to be currently coded. Therefore, the minimum and maximum division depths of the block to be currently coded can be determined based on the division depth information obtained when the left and top blocks were encoded.

In various embodiments of the present invention, the minimum depth of a block to be currently encoded may be determined according to the following equation (2).

In Equation (2), $D_{min}^{CB}$ is the minimum depth of the block to be currently coded and has a constant value. $D_{min}^{HEVC}$ is the minimum depth supported by HEVC and, as described above, can have a constant value of 0. $D_{min}^{neighbor}$ is the minimum division depth of a neighboring block that has already been coded and has a constant value.

In one embodiment, if the neighboring block that has already been encoded is a left and an upper block, Equation (2) can be expressed as Equation (3).

In Equation (3), $D_{min}^{Left}$ and $D_{min}^{Above}$ are the minimum depths of the left and top blocks, respectively, and have constant values.

Also, in various embodiments of the present invention, the maximum depth of a block to be currently encoded may be determined according to Equation (4) below.

In Equation (4), $D_{max}^{CB}$ is the maximum depth of the block to be currently coded and has a constant value. $D_{max}^{HEVC}$ is the maximum depth supported by HEVC and, as described above, can have a constant value of 3. $D_{max}^{neighbor}$ is the maximum depth of a neighboring block that has already been encoded and has a constant value.

In one embodiment, if the neighboring blocks that have already been encoded are the left and upper blocks, Equation (4) can be expressed as Equation (5).

In Equation (5), $D_{max}^{Left}$ and $D_{max}^{Above}$ are the maximum depths of the left and top blocks, respectively, and have constant values.
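Equations (2) through (5) are not reproduced in this text, so their exact form is unknown. The Python sketch below shows one plausible way to derive a depth range from the left and top neighbors, and it rests entirely on assumptions: the smallest neighbor minimum depth and the largest neighbor maximum depth are used, each clamped to the HEVC limits of 0 and 3, and a missing neighbor (for example at a picture boundary) falls back to the HEVC limits. The function name `depth_range_from_neighbors` is hypothetical.

```python
from typing import Optional, Sequence, Tuple

HEVC_MIN_DEPTH = 0  # 64x64 block
HEVC_MAX_DEPTH = 3  # 8x8 block

DepthRange = Tuple[int, int]  # (minimum depth, maximum depth) used by a coded block


def depth_range_from_neighbors(
    neighbors: Sequence[Optional[DepthRange]],
) -> DepthRange:
    """Derive the current block's depth range from coded neighbors.

    ASSUMED rule, not the document's equations: combine the neighbor
    ranges and clamp the result to the depths supported by HEVC.
    """
    available = [n for n in neighbors if n is not None]
    if not available:
        # No coded neighbor: search the full HEVC depth range.
        return HEVC_MIN_DEPTH, HEVC_MAX_DEPTH
    d_min = max(HEVC_MIN_DEPTH, min(lo for lo, _ in available))
    d_max = min(HEVC_MAX_DEPTH, max(hi for _, hi in available))
    return d_min, d_max


if __name__ == "__main__":
    left = (1, 2)   # left block was divided between depths 1 and 2
    above = (0, 1)  # top block was divided between depths 0 and 1
    print(depth_range_from_neighbors([left, above]))
```

Whatever the exact equations are, the intent described in the text is the same: the neighbors' division depths narrow the range of depths that has to be searched for the current block.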

Next, the encoder determines the division depth of the block to be currently coded (202). The encoder determines the division depth based on the determined minimum depth and maximum depth. That is, the encoder selects, within the range from the determined minimum depth to the determined maximum depth, the maximum depth to be actually applied, and thereby finally determines the division depth.

In various embodiments of the present invention, the encoder determines the division depth based on the image characteristics of the block that is currently being encoded. Even when the range of the division depth has been determined as running from the minimum depth to the maximum depth, the encoder can decide whether to divide the block to the next depth after completing the rate-distortion cost calculation for the block at the minimum depth. For example, even when the minimum depth is determined to be 0 and the maximum depth to be 2, the encoder can decide whether to divide the block into 32x32 blocks after completing the rate-distortion cost calculation for the 64x64 block.

In one embodiment, the image characteristic may be the inner boundary absolute difference. If the accumulated pixel-wise absolute difference (SAD) along the inner boundary at a certain depth is smaller than a preset threshold value, the encoder can determine that the division depth extends only up to the current depth. This is described in detail below.

Referring to FIG. 4, the encoder performs a rate-distortion cost calculation for the 2N × 2N block of the current depth (401). If the current depth is not the previously determined maximum depth (402), the encoder calculates the sum of absolute differences (SAD) along the inner boundary of the 2N × 2N block corresponding to the current depth, as shown in FIG. 5 (403).

If the calculated SAD is less than a predetermined threshold (404), the encoder does not divide the block at the current depth any further. That is, the encoder determines that the division depth extends only up to the current depth.

On the other hand, if the calculated SAD is not smaller than the preset threshold value, the encoder divides the block into blocks of the size corresponding to the next depth (405) and recursively performs the rate-distortion cost calculation for the divided blocks (401). That is, the encoder does not take the current depth as the limit of the division depth.

If the current depth reaches the maximum depth through the above procedure (402), the encoder determines that the division depth extends up to the maximum depth.
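The flow of FIG. 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. FIG. 5 is not available in this text, so `inner_boundary_sad` assumes the inner boundary consists of the two center lines that would separate the four sub-blocks and sums the absolute differences of the pixel pairs straddling those lines. Step 402 is modeled as reaching the deepest depth of the allowed range (`max_depth`), since the block cannot be divided further there. All function names, the `rd_cost` placeholder, and the threshold value are hypothetical.

```python
from typing import Callable, List, Tuple

Picture = List[List[int]]     # luma samples, indexed as picture[y][x]
Block = Tuple[int, int, int]  # (x, y, size) of a square block


def inner_boundary_sad(pic: Picture, block: Block) -> int:
    """ASSUMED definition: SAD across the vertical and horizontal center lines."""
    x, y, size = block
    half = size // 2
    sad = 0
    for i in range(size):
        # Pixel pair straddling the vertical center line in row y + i.
        sad += abs(pic[y + i][x + half - 1] - pic[y + i][x + half])
        # Pixel pair straddling the horizontal center line in column x + i.
        sad += abs(pic[y + half - 1][x + i] - pic[y + half][x + i])
    return sad


def early_terminated_search(
    pic: Picture,
    block: Block,
    depth: int,
    max_depth: int,
    threshold: int,
    rd_cost: Callable[[Block], float],
    counter: List[int],
) -> float:
    """Quadtree search that stops dividing when the inner boundary is smooth."""
    x, y, size = block
    counter[0] += 1
    cost_unsplit = rd_cost(block)                    # step 401
    if depth == max_depth:                           # step 402
        return cost_unsplit
    if inner_boundary_sad(pic, block) < threshold:   # steps 403 and 404
        return cost_unsplit                          # stop at the current depth
    half = size // 2                                 # step 405
    cost_split = sum(
        early_terminated_search(pic, (x + dx, y + dy, half), depth + 1,
                                max_depth, threshold, rd_cost, counter)
        for dy in (0, half)
        for dx in (0, half)
    )
    return min(cost_unsplit, cost_split)


if __name__ == "__main__":
    flat = [[128] * 64 for _ in range(64)]
    evaluations = [0]
    early_terminated_search(flat, (0, 0, 64), 0, 3, 100,
                            lambda b: float(b[2] * b[2]), evaluations)
    print(evaluations[0])  # a flat block is never divided
```

For a completely flat 64x64 block the search ends after a single cost evaluation, compared with the 85 evaluations an exhaustive search down to 8x8 would need. In practice the search would start at the block size corresponding to the minimum depth derived from the neighboring blocks.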

Returning to FIG. 2, once the division depth has been determined according to the above-described procedure, the encoder performs video coding according to the determined division depth (203). The encoder can then output the encoded image according to the determined division depth.

FIG. 6 is a block diagram illustrating a structure of an encoder according to the present invention.

Referring to FIG. 6, a video encoder 600 according to the present invention may include an input unit 601, a control unit 602, and a storage unit 603.

The input unit 601 can receive a video to be encoded. The video can be input in units of a coding block having a predetermined size, and can be input in blocks of CU, PU, TU size when the HEVC standard is applied.

The control unit 602 encodes the input video. Following the coding method according to the present invention, the control unit 602 determines the division depth based on the division depth information of the already encoded neighboring blocks and the image characteristic of the block to be currently coded, and can perform video coding according to the determined division depth. The specific operation of the control unit 602 is as described above.

In various embodiments, the control unit 602 may be logically divided into a predictor, an encoder, a decoder, a threshold determiner, an optimal mode determiner, a filter, a subtracter, and the like; FIG. 6 shows only the physical structure. However, the control unit 602 may have various logical and physical structures, and as long as the control unit 602 performs the encoding operation according to the technical idea of the present invention, such structures evidently fall within the scope of the present invention.

The storage unit 603 may store the rate-distortion cost, the threshold value, and the like determined by the control unit 602, and may temporarily or permanently store data encoded or decoded by the control unit 602.

It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. It is to be understood that the foregoing description of the present invention is exemplary and explanatory and is intended to provide further explanation of the invention as claimed. Accordingly, all changes or modifications derived from the technical idea of the present invention should be construed as being included in the scope of the present invention.

600: encoder
601: input unit
602: control unit
603: storage unit

Claims

A video coding method comprising:
determining a minimum depth and a maximum depth for a division depth of a current block to be coded, based on a division depth of a neighboring block that has already been encoded;
determining the division depth of the current block to be coded based on the determined minimum depth, the maximum depth, and an image characteristic of the current block to be coded; and
outputting an encoded image according to the determined division depth.