WO2018124332A1

WO2018124332A1 - Intra prediction mode-based image processing method, and apparatus therefor

Info

Publication number: WO2018124332A1
Application number: PCT/KR2016/015431
Authority: WO
Inventors: 강제원; 이재호; 류수경; 임재현; 강민주
Original assignee: 엘지전자(주); 이화여자대학교 산학협력단
Priority date: 2016-12-28
Filing date: 2016-12-28
Publication date: 2018-07-05
Also published as: KR102287302B1; KR20190090867A

Abstract

Disclosed are an intra prediction mode-based image processing method and an apparatus therefor. Specifically, the intra prediction mode-based method for decoding an image comprises the steps of: searching for a leaf node in a decision tree having a binary tree structure by sequentially selecting a child node, starting from the root node, using a split function stored in each node; selecting a class to which the leaf node and one or more intra prediction modes are mapped; deriving the intra prediction mode of the current block using the class; and generating a prediction sample of the current block on the basis of the intra prediction mode, wherein the split function comprises a split parameter and, taking as input a reference sample neighboring the current block, can output a value corresponding to a left child node or a right child node.

Description

Intra prediction mode based image processing method and apparatus therefor

The present invention relates to a still image or moving image processing method, and more particularly, to a method for encoding / decoding a still image or moving image based on an intra prediction mode and an apparatus supporting the same.

Compression coding refers to a series of signal processing techniques for transmitting digitized information through a communication line or for storing it in a form suitable for a storage medium. Media such as video, images, and audio may be targets of compression coding. In particular, a technique of performing compression coding on video is called video compression.

Next-generation video content will be characterized by high spatial resolution, high frame rate and high dimensionality of scene representation. Processing such content would result in a tremendous increase in terms of memory storage, memory access rate, and processing power.

Accordingly, there is a need to design coding tools for more efficiently processing next generation video content.

An object of the present invention is to propose a method of encoding / decoding by optimizing signaling of intra prediction mode using a random forest method, which is one of machine learning techniques.

Another object of the present invention is to propose a method of deriving an intra prediction mode using a random forest without signaling the intra prediction mode.

In addition, an object of the present invention is to propose a method for configuring a plurality of intra prediction modes into one class and estimating the class using a random forest.

The technical problems to be achieved in the present invention are not limited to the technical problems mentioned above, and other technical problems not mentioned above will be clearly understood by those skilled in the art from the following description.

An aspect of the present invention provides a method of decoding an image based on an intra prediction mode, comprising: searching for a leaf node in a decision tree having a binary tree structure by sequentially performing, starting from a root node, a process of selecting a child node using a split function stored in each node; selecting a class to which the leaf node and at least one intra prediction mode are mapped; deriving an intra prediction mode of the current block using the class; and generating a prediction sample of the current block based on the intra prediction mode, wherein the split function is composed of a split parameter, receives a reference sample neighboring the current block as an input, and may output a value corresponding to a left child node or a right child node.

Another aspect of the present invention provides an apparatus for decoding an image based on an intra prediction mode, comprising: a leaf node search unit configured to search for a leaf node in a decision tree having a binary tree structure by sequentially performing, starting from a root node, a process of selecting a child node using a split function stored in each node; a class selector configured to select a class to which the leaf node and at least one intra prediction mode are mapped; a prediction mode derivation unit configured to derive an intra prediction mode of the current block using the class; and a prediction sample generator configured to generate a prediction sample of the current block based on the intra prediction mode, wherein the split function is composed of a split parameter, receives a reference sample neighboring the current block as an input, and may output a value corresponding to a left child node or a right child node.
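The leaf node search described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Node` layout, the form of the split function (the difference of two reference samples compared against a threshold), and the toy tree are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Internal nodes hold split parameters; leaf nodes hold a class id.
    # idx_a, idx_b, and threshold are hypothetical split parameters.
    idx_a: int = 0
    idx_b: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    class_id: Optional[int] = None


def split(node: Node, ref: Sequence[int]) -> int:
    """Assumed split function: 0 selects the left child, 1 the right child."""
    return 0 if ref[node.idx_a] - ref[node.idx_b] < node.threshold else 1


def find_leaf_class(root: Node, ref: Sequence[int]) -> int:
    """Walk from the root to a leaf and return the class mapped to that leaf."""
    node = root
    while node.class_id is None:
        node = node.left if split(node, ref) == 0 else node.right
    return node.class_id


# Toy tree with two internal nodes and three leaves (illustrative only).
tree = Node(idx_a=0, idx_b=3, threshold=0.0,
            left=Node(class_id=0),
            right=Node(idx_a=1, idx_b=2, threshold=5.0,
                       left=Node(class_id=1),
                       right=Node(class_id=2)))
```

Because the split function takes only reconstructed reference samples as input, an encoder and a decoder holding the same tree would reach the same leaf without any extra signaling.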

Preferably, the split parameter may be learned so as to maximize the reduction in uncertainty calculated when each node of the decision tree is split into a left child node and a right child node.
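One common way to quantify such a reduction in uncertainty is information gain based on Shannon entropy. The sketch below assumes that measure, a scalar feature per training sample, and a threshold-type split; none of these choices is stated in the text above, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    return (entropy(parent)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def best_threshold(features, labels, candidates):
    """Return (threshold, gain) for the candidate that maximizes the gain."""
    best = None
    for t in candidates:
        left = [y for x, y in zip(features, labels) if x < t]
        right = [y for x, y in zip(features, labels) if x >= t]
        gain = information_gain(labels, left, right)
        if best is None or gain > best[1]:
            best = (t, gain)
    return best
```

For example, with features `[-3, -1, 2, 4]` and labels `[0, 0, 1, 1]`, the threshold 0 separates the two classes perfectly and removes all of the parent's uncertainty.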

Preferably, when a plurality of classes are selected by searching for a leaf node of each decision tree in a random forest composed of a plurality of decision trees, the method may further include determining the most frequently selected class among the plurality of classes as the class of the random forest, and the intra prediction mode of the current block may be derived using the class of the random forest.
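The majority decision over the trees of a forest can be sketched as below. The input is assumed to be the list of classes already selected by the individual trees; how ties are broken is not specified in the text, so this sketch simply keeps whichever class `Counter.most_common` returns first.

```python
from collections import Counter


def forest_class(tree_classes):
    """Return the class selected by the largest number of decision trees."""
    if not tree_classes:
        raise ValueError("at least one tree result is required")
    return Counter(tree_classes).most_common(1)[0][0]
```

For instance, if five trees select classes 3, 1, 3, 2, and 3, the class of the random forest is 3.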

Preferably, the split parameter may be determined using two reference samples among the reference samples neighboring the current block.

Preferably, when multiple reference sample lines are referred to in order to generate a prediction sample of the current block, the split parameter may be determined using a plurality of reference samples among the multiple reference sample lines.

Preferably, when multiple reference sample lines are referred to in order to generate a prediction sample of the current block, the split parameter may be determined using a plurality of reference samples of the two reference sample lines adjacent to the current block among the multiple reference sample lines.

Preferably, when multiple reference sample lines are referred to in order to generate a prediction sample of the current block, the split parameter may be determined using a difference value between the sample value of a specific reference sample of the multiple reference sample lines and the sample value of the reference sample adjacent to its right in the horizontal direction, and a difference value between the sample value of the specific reference sample and the sample value of the reference sample adjacent below it in the vertical direction.

Preferably, when multiple reference sample lines are referred to in order to generate a prediction sample of the current block, the split parameter may be determined using the sum of the difference values between the sample value of each reference sample of the multiple reference sample lines and the sample values of the reference samples adjacent to it in a specific angular direction.
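The difference-based features mentioned above can be sketched as follows. For simplicity the multiple reference sample lines are modeled as a small rectangular 2-D list (an assumption; the actual reference region around a block is L-shaped), a direction is expressed as a hypothetical row/column offset, and absolute differences are summed, which is also an assumption.

```python
def hv_differences(lines, r, c):
    """Differences between lines[r][c] and its right and lower neighbors."""
    dh = lines[r][c] - lines[r][c + 1]   # horizontal direction
    dv = lines[r][c] - lines[r + 1][c]   # vertical direction
    return dh, dv


def directional_sum(lines, dr, dc):
    """Sum of absolute differences between each sample and its neighbor
    at offset (dr, dc); samples without such a neighbor are skipped."""
    rows, cols = len(lines), len(lines[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                total += abs(lines[r][c] - lines[rr][cc])
    return total
```

A small directional sum along a given offset suggests that the texture continues along that direction, which is the kind of cue a split function could threshold.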

Preferably, when a plurality of intra prediction modes are mapped to the class, the method may further include decoding index information indicating the intra prediction mode of the current block within the class, and the intra prediction mode of the current block may be derived using the class and the index information.
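Deriving the mode from a class and an optional index can be sketched as below. The class-to-mode table and its mode numbers are purely illustrative, and `index` stands in for the decoded index information.

```python
from typing import Optional

# Hypothetical mapping from class id to the intra prediction modes it covers.
CLASS_TO_MODES = {0: [0, 1], 1: [10], 2: [24, 25, 26, 27]}


def derive_intra_mode(class_id: int, index: Optional[int] = None) -> int:
    """Return the intra prediction mode for a class.

    If only one mode is mapped to the class, no index is needed; otherwise
    the index selects one mode within the class.
    """
    modes = CLASS_TO_MODES[class_id]
    if len(modes) == 1:
        return modes[0]
    if index is None:
        raise ValueError("index information is required for this class")
    return modes[index]
```

With four modes in a class, the index needs only two bits, compared with signaling one mode out of the full mode set.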

Preferably, selecting a class to which the one or more intra prediction modes are mapped may comprise: searching for a leaf node in a first decision tree to determine an intra prediction mode group consisting of a plurality of intra prediction modes that can be applied to the current block; and searching for a leaf node in a second decision tree determined according to the intra prediction mode group to select, within the intra prediction mode group, a class to which one or more intra prediction modes are mapped.

Another aspect of the present invention provides a method of encoding an image based on an intra prediction mode, comprising: searching for a leaf node in a decision tree having a binary tree structure by sequentially performing, starting from a root node, a process of selecting a child node using a split function stored in each node; selecting a class to which the leaf node and at least one intra prediction mode are mapped; and encoding the intra prediction mode of the current block using the class, wherein the split function is composed of a split parameter, receives a reference sample neighboring the current block as an input, and may output a value corresponding to a left child node or a right child node.

Preferably, the split parameter may be determined using one reference sample among the reference samples neighboring the current block and one prediction sample among the prediction samples of the current block.

Preferably, the split parameter may be determined using one or more reference samples among the reference samples neighboring the current block and one or more prediction samples among the prediction samples of the current block.

Preferably, the split parameter may be determined using a difference value between the sample value of a specific prediction sample among the prediction samples of the current block and the sample value of the prediction sample adjacent to its right in the horizontal direction, and a difference value between the sample value of the specific prediction sample and the sample value of the prediction sample adjacent below it in the vertical direction.

Preferably, the split parameter may be determined using the sum of the difference values between the sample value of each prediction sample of the current block and the sample values of the prediction samples adjacent to it in a specific angular direction.

According to an embodiment of the present invention, by using the random forest method, the number of bits used to express the intra prediction mode may be reduced, thereby improving coding efficiency.

In addition, according to an embodiment of the present invention, by mapping one intra prediction mode to a class determined through the random forest, the intra prediction mode may be derived at the decoder without being signaled, so that the bits used to represent the mode can be greatly reduced.

In addition, according to an embodiment of the present invention, by mapping a set of a plurality of prediction modes to a class determined through a random forest, the accuracy of intra prediction may be increased, thereby improving coding efficiency.

The effects obtainable in the present invention are not limited to the above-mentioned effects, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, included as part of the detailed description in order to provide a thorough understanding of the present invention, provide embodiments of the present invention and, together with the description, describe the technical features of the present invention.

FIG. 1 is a schematic block diagram of an encoder in which encoding of a still image or video signal is performed according to an embodiment to which the present invention is applied.

FIG. 2 is a schematic block diagram of a decoder in which decoding of a still image or video signal is performed according to an embodiment to which the present invention is applied.

FIG. 3 is a diagram for describing a partition structure of a coding unit that may be applied to the present invention.

FIG. 4 is a diagram for explaining a prediction unit applicable to the present invention.

FIG. 5 is a diagram illustrating an intra prediction method as an embodiment to which the present invention is applied.

FIG. 6 illustrates a prediction direction according to an intra prediction mode.

FIG. 7 is a flowchart illustrating a method of decoding an intra prediction mode according to an embodiment to which the present invention is applied.

FIG. 8 is a diagram for describing a method of determining an MPM mode according to an embodiment to which the present invention is applied.

FIG. 9 is a diagram for describing a random forest and a decision tree as an embodiment to which the present invention is applied.

FIGS. 10 to 22 are diagrams illustrating a method of determining a split function as an embodiment to which the present invention is applied.

FIG. 23 is a diagram illustrating a method of mapping a plurality of intra prediction modes to a class according to an embodiment to which the present invention may be applied.

FIG. 24 is a diagram illustrating a method of determining an intra prediction mode of a hierarchical structure based on a random forest according to an embodiment of the present invention.

FIG. 25 is a diagram for describing a method of determining probability values for intra prediction modes according to an embodiment to which the present invention may be applied.

FIG. 26 is a diagram illustrating an intra prediction method according to an embodiment of the present invention.

FIG. 27 is a diagram illustrating an intra prediction unit according to an embodiment of the present invention.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings. The detailed description, which will be given below with reference to the accompanying drawings, is intended to explain exemplary embodiments of the present invention and is not intended to represent the only embodiments in which the present invention may be practiced. The following detailed description includes specific details in order to provide a thorough understanding of the present invention. However, one of ordinary skill in the art appreciates that the present invention may be practiced without these specific details.

In some instances, well-known structures and devices may be omitted or shown in block diagram form centering on the core functions of the structures and devices in order to avoid obscuring the concepts of the present invention.

In addition, the terms used in the present invention have been selected, as far as possible, from general terms that are currently in wide use; in specific cases, however, terms arbitrarily selected by the applicant are used. In such cases, since the meaning is clearly described in the detailed description of the relevant part, the term should not be interpreted simply by its name but should be understood and interpreted according to its intended meaning.

Specific terms used in the following description are provided to help the understanding of the present invention, and the use of such specific terms may be changed to other forms without departing from the technical spirit of the present invention. For example, signals, data, samples, pictures, frames, blocks, etc. may be appropriately replaced and interpreted in each coding process.

Hereinafter, in the present specification, the 'processing unit' refers to a unit in which a process of encoding / decoding such as prediction, transformation, and / or quantization is performed. Hereinafter, for convenience of description, the processing unit may be referred to as a 'processing block' or 'block'.

The processing unit may be interpreted to include a unit for the luma component and a unit for the chroma component. For example, the processing unit may correspond to a Coding Tree Unit (CTU), a Coding Unit (CU), a Prediction Unit (PU), or a Transform Unit (TU).

In addition, the processing unit may be interpreted as a unit for a luma component or a unit for a chroma component. For example, the processing unit may correspond to a coding tree block (CTB), a coding block (CB), a prediction block (PB), or a transform block (TB) for a luma component. Alternatively, it may correspond to a CTB, a CB, a PB, or a TB for a chroma component. In addition, the present invention is not limited thereto, and the processing unit may be interpreted to include a unit for a luma component and a unit for a chroma component.

In addition, the processing unit is not necessarily limited to square blocks, but may also be configured in a polygonal form having three or more vertices.

In the following specification, a pixel, a pel, and the like are referred to collectively as a sample. In addition, using a sample may mean using a pixel value or the like.

Referring to FIG. 1, the encoder 100 may include an image divider 110, a subtractor 115, a transform unit 120, a quantizer 130, an inverse quantizer 140, an inverse transform unit 150, a filtering unit 160, a decoded picture buffer (DPB) 170, a predictor 180, and an entropy encoder 190. The predictor 180 may include an inter predictor 181 and an intra predictor 182.

The image divider 110 divides an input video signal (or a picture or a frame) input to the encoder 100 into one or more processing units.

The subtractor 115 generates a residual signal (or residual block) by subtracting the prediction signal (or prediction block) output from the predictor 180 (that is, the inter predictor 181 or the intra predictor 182) from the input image signal. The generated residual signal (or residual block) is transmitted to the transform unit 120.

The transform unit 120 may generate transform coefficients by applying a transform scheme (e.g., a discrete cosine transform (DCT), a discrete sine transform (DST), a graph-based transform (GBT), or a Karhunen-Loeve transform (KLT)) to the residual signal (or residual block). In this case, the transform unit 120 may generate the transform coefficients by performing the transform using a transform scheme determined according to the prediction mode applied to the residual block and the size of the residual block.

The quantization unit 130 quantizes the transform coefficients and transmits the transform coefficients to the entropy encoding unit 190, and the entropy encoding unit 190 entropy codes the quantized signals and outputs them as bit streams.

Meanwhile, the quantized signal output from the quantization unit 130 may be used to generate a prediction signal. For example, the residual signal may be reconstructed by applying inverse quantization and inverse transformation to the quantized signal through the inverse quantization unit 140 and the inverse transform unit 150 in the loop. A reconstructed signal may be generated by adding the reconstructed residual signal to the prediction signal output from the inter predictor 181 or the intra predictor 182.
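The in-loop reconstruction can be illustrated with a deliberately simplified sketch. The transform and inverse transform are omitted, and a uniform scalar quantizer with a hypothetical step size `step` is assumed, so the code only shows how quantization error enters the reconstructed signal.

```python
def quantize(residual, step):
    """Uniform scalar quantization of residual values (transform omitted)."""
    return [round(v / step) for v in residual]


def dequantize(levels, step):
    """Inverse quantization: scale the levels back by the step size."""
    return [q * step for q in levels]


def reconstruct(prediction, residual, step):
    """Prediction plus the quantized-then-dequantized residual."""
    rec_residual = dequantize(quantize(residual, step), step)
    return [p + r for p, r in zip(prediction, rec_residual)]
```

With a prediction of 100 for every sample, a residual of `[7, -3, 12, 0]`, and a step of 4, the reconstructed residual is `[8, -4, 12, 0]`, so the reconstruction differs from the original by at most one in this case.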

Meanwhile, in the compression process described above, adjacent blocks are quantized with different quantization parameters, which may cause degradation in which block boundaries become visible. This phenomenon is called blocking artifacts and is one of the important factors in evaluating image quality. In order to reduce such degradation, a filtering process may be performed. Through this filtering process, the image quality can be improved by removing the blocking degradation and reducing the error of the current picture.

The filtering unit 160 applies filtering to the reconstructed signal and outputs it to a reproduction apparatus or transmits it to the decoded picture buffer 170. The filtered signal transmitted to the decoded picture buffer 170 may be used as a reference picture in the inter prediction unit 181. As such, by using the filtered picture as a reference picture in the inter prediction mode, not only image quality but also encoding efficiency may be improved.

The decoded picture buffer 170 may store the filtered picture for use as a reference picture in the inter prediction unit 181.

The inter prediction unit 181 performs temporal prediction and/or spatial prediction to remove temporal redundancy and/or spatial redundancy with reference to a reconstructed picture. Here, since the reference picture used to perform the prediction is a transformed signal that has been quantized and dequantized in units of blocks at the time of encoding/decoding, blocking artifacts or ringing artifacts may exist.

Accordingly, the inter prediction unit 181 may interpolate the signals between pixels in sub-pixel units by applying a lowpass filter to solve performance degradation due to discontinuity or quantization of such signals. Herein, the subpixel refers to a virtual pixel generated by applying an interpolation filter, and the integer pixel refers to an actual pixel existing in the reconstructed picture. As the interpolation method, linear interpolation, bi-linear interpolation, wiener filter, or the like may be applied.

The interpolation filter may be applied to a reconstructed picture to improve the precision of prediction. For example, the inter prediction unit 181 may generate interpolated pixels by applying an interpolation filter to integer pixels, and may perform prediction using an interpolated block composed of the interpolated pixels as a prediction block.

The intra predictor 182 predicts the current block by referring to samples neighboring the block currently being encoded. The intra prediction unit 182 may perform the following process to perform intra prediction. First, the reference samples necessary for generating a prediction signal may be prepared. Next, the prediction signal may be generated using the prepared reference samples. Then, the prediction mode is encoded. In this case, the reference samples may be prepared through reference sample padding and/or reference sample filtering. Since the reference samples have been predicted and reconstructed, there may be a quantization error. Accordingly, the reference sample filtering process may be performed for each prediction mode used for intra prediction to reduce such an error.

The prediction signal (or prediction block) generated by the inter prediction unit 181 or the intra prediction unit 182 may be used to generate a reconstructed signal (or reconstructed block) or to generate a residual signal (or residual block).

Referring to FIG. 2, the decoder 200 may include an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 230, an adder 235, a filtering unit 240, a decoded picture buffer (DPB) unit 250, and a prediction unit 260. The predictor 260 may include an inter predictor 261 and an intra predictor 262.

The reconstructed video signal output through the decoder 200 may be reproduced through the reproducing apparatus.

The decoder 200 receives a signal (ie, a bit stream) output from the encoder 100 of FIG. 1, and the received signal is entropy decoded through the entropy decoding unit 210.

The inverse quantization unit 220 obtains a transform coefficient from the entropy decoded signal using the quantization step size information.

The inverse transform unit 230 applies an inverse transform scheme to inverse transform the transform coefficients to obtain a residual signal (or a differential block).

The adder 235 generates a reconstructed signal (or reconstructed block) by adding the obtained residual signal (or residual block) to the prediction signal (or prediction block) output from the prediction unit 260 (that is, the inter prediction unit 261 or the intra prediction unit 262).

The filtering unit 240 applies filtering to the reconstructed signal (or reconstructed block) and outputs it to a reproduction device or transmits it to the decoded picture buffer unit 250. The filtered signal transmitted to the decoded picture buffer unit 250 may be used as a reference picture in the inter predictor 261.

In the present specification, the embodiments described for the filtering unit 160, the inter prediction unit 181, and the intra prediction unit 182 of the encoder 100 may be applied equally to the filtering unit 240, the inter prediction unit 261, and the intra prediction unit 262 of the decoder, respectively.

Processing unit partition structure

In general, a still image or video compression technique (eg, HEVC) uses a block-based image compression method. The block-based image compression method is a method of processing an image by dividing the image into specific block units, and may reduce memory usage and calculation amount.

The encoder splits one image (or picture) into rectangular coding tree units (CTUs). The CTUs are then encoded sequentially, one at a time, in raster scan order.

In HEVC, the size of the CTU may be set to any one of 64 × 64, 32 × 32, and 16 × 16. The encoder may select and use the size of the CTU according to the resolution of the input video or the characteristics of the input video. The CTU includes a coding tree block (CTB) for luma components and a CTB for two chroma components corresponding thereto.

One CTU may be divided into a quad-tree structure. That is, one CTU, which has a square shape, may be divided into four units, each having half the horizontal size and half the vertical size, to generate coding units (CUs). This partitioning of the quad-tree structure can be performed recursively. That is, a CU is hierarchically divided into quad-tree structures from one CTU.

CU refers to a basic unit of coding in which an input image is processed, for example, intra / inter prediction is performed. The CU includes a coding block (CB) for a luma component and a CB for two chroma components corresponding thereto. In HEVC, the size of a CU may be set to any one of 64 × 64, 32 × 32, 16 × 16, and 8 × 8.

Referring to FIG. 3, the root node of the quad-tree is associated with the CTU. The quad-tree is split until it reaches a leaf node, which corresponds to a CU.

More specifically, the CTU corresponds to a root node and has a smallest depth (ie, depth = 0). The CTU may not be divided according to the characteristics of the input image. In this case, the CTU corresponds to a CU.

The CTU may be divided into quad tree shapes, resulting in lower nodes having a depth of 1 (depth = 1). In addition, a node that is no longer divided (ie, a leaf node) in a lower node having a depth of 1 corresponds to a CU. For example, in FIG. 3 (b), CU (a), CU (b), and CU (j) corresponding to nodes a, b, and j are divided once in the CTU and have a depth of one.

At least one of the nodes having a depth of 1 may be split into a quad tree again, resulting in lower nodes having a depth of 2 (ie, depth = 2). In addition, a node (ie, a leaf node) that is no longer divided in a lower node having a depth of 2 corresponds to a CU. For example, in FIG. 3 (b), CU (c), CU (h) and CU (i) corresponding to nodes c, h and i are divided twice in the CTU and have a depth of two.

In addition, at least one of the nodes having a depth of 2 may be divided into quad tree shapes, resulting in lower nodes having a depth of 3 (ie, depth = 3). And, a node that is no longer partitioned (ie, a leaf node) in a lower node having a depth of 3 corresponds to a CU. For example, in FIG. 3 (b), CU (d), CU (e), CU (f), and CU (g) corresponding to nodes d, e, f, and g are divided three times in the CTU and have a depth of three.

In the encoder, the maximum size or the minimum size of the CU may be determined according to characteristics (eg, resolution) of the video image or in consideration of encoding efficiency. Information about this or information capable of deriving the information may be included in the bitstream. A CU having a maximum size may be referred to as a largest coding unit (LCU), and a CU having a minimum size may be referred to as a smallest coding unit (SCU).

In addition, a CU having a tree structure may be hierarchically divided with predetermined maximum depth information (or maximum level information). Each partitioned CU may have depth information. Since the depth information indicates the number and / or degree of division of the CU, the depth information may include information about the size of the CU.

Since the LCU is divided into quad tree shapes, the size of the SCU can be obtained by using the size and maximum depth information of the LCU. Or conversely, using the size of the SCU and the maximum depth information of the tree, the size of the LCU can be obtained.
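Because each quad-tree split halves the width and height, the relation between the LCU size, the SCU size, and the maximum depth reduces to a shift by the depth. A minimal sketch (function names are illustrative, and sizes are assumed to be powers of two):

```python
def scu_size_from_lcu(lcu_size: int, max_depth: int) -> int:
    """Each split level halves the CU side length."""
    return lcu_size >> max_depth


def lcu_size_from_scu(scu_size: int, max_depth: int) -> int:
    """Inverse relation: each level up doubles the CU side length."""
    return scu_size << max_depth
```

For example, a 64 × 64 LCU with a maximum depth of 3 gives an 8 × 8 SCU, and vice versa.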

For one CU, information indicating whether the corresponding CU is split (for example, a split CU flag split_cu_flag) may be transmitted to the decoder. This split information is included in all CUs except the SCU. For example, if the flag indicating whether to split is '1', the CU is divided into four CUs again. If the flag is '0', the CU is not divided further, and processing may be performed on that CU.
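The way a decoder could rebuild the CU layout from such flags can be sketched as a recursive parse. The sketch assumes that the flags arrive as a simple iterator in depth-first order and that no flag is read for a CU of the minimum size; real bitstream parsing (entropy decoding, picture-boundary handling) is not modeled.

```python
def parse_cus(flags, x, y, size, min_size, out):
    """Recursively read split_cu_flag values and collect leaf CUs.

    flags: iterator yielding 0/1 split flags in depth-first order (assumed).
    out:   list receiving (x, y, size) for every CU that is not split.
    """
    # A CU of the minimum size (SCU) carries no split flag.
    is_split = size > min_size and next(flags) == 1
    if not is_split:
        out.append((x, y, size))
        return
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            parse_cus(flags, x + dx, y + dy, half, min_size, out)
```

For a 64 × 64 CTU with an 8 × 8 SCU, the flag sequence 1, 0, 1, 0, 0, 0, 0, 0, 0 yields three 32 × 32 CUs and four 16 × 16 CUs.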

As described above, a CU is a basic unit of coding in which intra prediction or inter prediction is performed. HEVC divides a CU into prediction units (PUs) in order to code an input image more effectively.

The PU is a basic unit for generating a prediction block, and may generate different prediction blocks in PU units within one CU. However, PUs belonging to one CU are not mixed with intra prediction and inter prediction, and PUs belonging to one CU are coded by the same prediction method (ie, intra prediction or inter prediction).

The PU is not divided into quad-tree structures, but is divided once in a predetermined form in one CU. This will be described with reference to the drawings below.

The PU is divided differently according to whether an intra prediction mode or an inter prediction mode is used as a coding mode of a CU to which the PU belongs.

FIG. 4A illustrates a PU when an intra prediction mode is used, and FIG. 4B illustrates a PU when an inter prediction mode is used.

Referring to FIG. 4 (a), assuming that the size of one CU is 2N × 2N (N = 4, 8, 16, 32), one CU may be divided into two types (ie, 2N × 2N or N × N).

Here, when divided into 2N × 2N type PU, it means that only one PU exists in one CU.

On the other hand, when divided into N × N type PU, one CU is divided into four PUs, and different prediction blocks are generated for each PU unit. However, the division of the PU may be performed only when the size of the CB for the luminance component of the CU is the minimum size (that is, the CU is the SCU).

Referring to FIG. 4 (b), assuming that the size of one CU is 2N × 2N (N = 4, 8, 16, 32), one CU may be divided into eight PU types (ie, 2N × 2N, N × N, 2N × N, N × 2N, nL × 2N, nR × 2N, 2N × nU, 2N × nD).

Similar to intra prediction, PU partitioning in the form of N × N may be performed only when the size of the CB for the luminance component of the CU is the minimum size (that is, the CU is the SCU).

In inter prediction, 2N × N splitting in the horizontal direction and N × 2N splitting in the vertical direction are supported.

In addition, PU partitioning of the nL × 2N, nR × 2N, 2N × nU, and 2N × nD types, which are Asymmetric Motion Partitions (AMP), is supported. Here, 'n' means a 1/4 value of 2N. However, AMP cannot be used when the CU to which the PU belongs is a CU of the minimum size.
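The resulting PU dimensions for each partition type can be tabulated as in the sketch below, where `cu_size` is the CU side length 2N and each PU is given as (width, height). The string keys are informal labels chosen for this example.

```python
def pu_partitions(cu_size: int, part_type: str):
    """Return the (width, height) of each PU for a CU of side cu_size (= 2N)."""
    s = cu_size       # 2N
    h = s // 2        # N
    q = s // 4        # n, a quarter of 2N
    table = {
        "2Nx2N": [(s, s)],
        "NxN":   [(h, h)] * 4,
        "2NxN":  [(s, h)] * 2,
        "Nx2N":  [(h, s)] * 2,
        "nLx2N": [(q, s), (s - q, s)],   # narrow PU on the left
        "nRx2N": [(s - q, s), (q, s)],   # narrow PU on the right
        "2NxnU": [(s, q), (s, s - q)],   # short PU on top
        "2NxnD": [(s, s - q), (s, q)],   # short PU at the bottom
    }
    return table[part_type]
```

For a 32 × 32 CU, nL × 2N produces an 8 × 32 PU and a 24 × 32 PU; in every type the PU areas add up to the CU area.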

In order to efficiently encode an input image within one CTU, the optimal partition structure of the coding unit (CU), prediction unit (PU), and transform unit (TU) may be determined based on the minimum rate-distortion value through the following process. For example, for the optimal CU partitioning process in a 64 × 64 CTU, the rate-distortion cost can be calculated while partitioning from a 64 × 64 CU down to 8 × 8 CUs. The specific process is as follows.

1) The partition structure of the optimal PU and TU that generates the minimum rate-distortion value is determined by performing inter / intra prediction, transform / quantization, inverse quantization / inverse transform, and entropy encoding for a 64 × 64 CU.

2) Divide the 64 × 64 CU into four 32 × 32 CUs and determine the optimal PU and TU partitioning structure that generates the minimum rate-distortion value for each 32 × 32 CU.

3) The 32 × 32 CU is subdivided into four 16 × 16 CUs, and a partition structure of an optimal PU and TU that generates a minimum rate-distortion value for each 16 × 16 CU is determined.

4) Subdivide the 16 × 16 CU into four 8 × 8 CUs and determine the optimal PU and TU partitioning structure that generates the minimum rate-distortion value for each 8 × 8 CU.

5) Determine the optimal CU partition structure within the 16 × 16 block by comparing the rate-distortion value of the 16 × 16 CU calculated in 3) above with the sum of the rate-distortion values of the four 8 × 8 CUs calculated in 4) above. This process is similarly performed for the remaining three 16 × 16 CUs.

6) Determine the optimal CU partition structure within the 32 × 32 block by comparing the rate-distortion value of the 32 × 32 CU calculated in 2) above with the sum of the rate-distortion values of the four 16 × 16 CUs obtained in 5) above. This process is similarly performed for the remaining three 32 × 32 CUs.

7) Finally, determine the optimal CU partition structure within the 64 × 64 block by comparing the rate-distortion value of the 64 × 64 CU calculated in 1) above with the sum of the rate-distortion values of the four 32 × 32 CUs obtained in 6) above.
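Steps 1) to 7) amount to a recursive comparison between the cost of an unsplit CU and the summed cost of its four sub-CUs. The following is a minimal sketch under stated assumptions: `rd_cost` is a hypothetical callback standing in for the full prediction, transform, quantization, and entropy coding chain, and the tuple-based tree representation is chosen only for illustration.

```python
def best_partition(x, y, size, rd_cost, min_size=8):
    """Return (cost, tree) for the square CU whose top-left corner is (x, y).

    tree is ("leaf", x, y, size) or ("split", [four sub-trees]).
    rd_cost(x, y, size) is a hypothetical rate-distortion cost callback.
    """
    cost_whole = rd_cost(x, y, size)
    if size <= min_size:
        # The smallest CU cannot be split further.
        return cost_whole, ("leaf", x, y, size)
    half = size // 2
    children = [best_partition(x + dx, y + dy, half, rd_cost, min_size)
                for dy in (0, half) for dx in (0, half)]
    cost_split = sum(cost for cost, _ in children)
    # Keep the split only if the four sub-CUs together are cheaper.
    if cost_split < cost_whole:
        return cost_split, ("split", [tree for _, tree in children])
    return cost_whole, ("leaf", x, y, size)
```

Calling `best_partition(0, 0, 64, rd_cost)` evaluates the 64 × 64 CU first and resolves the 16 × 16, 32 × 32, and 64 × 64 comparisons on the way back up the recursion.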

In the intra prediction mode, a prediction mode is selected in units of PUs, and prediction and reconstruction are performed in units of actual TUs for the selected prediction mode.

TU means a basic unit in which actual prediction and reconstruction are performed. The TU includes a transform block (TB) for a luma component and a TB for two chroma components corresponding thereto.

In the example of FIG. 3, as one CTU is divided into quad-tree structures to generate CUs, the TUs are hierarchically divided into quad-tree structures from one CU to be coded.

Since the TU is divided into quad-tree structures, the TU divided from the CU can be divided into smaller lower TUs. In HEVC, the size of the TU may be set to any one of 32 × 32, 16 × 16, 8 × 8, and 4 × 4.

Referring again to FIG. 3, it is assumed that a root node of the quad-tree is associated with a CU. The quad-tree is split until it reaches a leaf node, which corresponds to a TU.

In more detail, a CU corresponds to a root node and has a smallest depth (that is, depth = 0). The CU may not be divided according to the characteristics of the input image. In this case, the CU corresponds to a TU.

The CU may be divided into a quad-tree shape, resulting in lower nodes having a depth of 1 (depth = 1). In addition, a node (i.e., a leaf node) that is no longer divided among the lower nodes having a depth of 1 corresponds to a TU. For example, in FIG. 3 (b), TU (a), TU (b), and TU (j) corresponding to nodes a, b, and j are divided once in a CU and have a depth of 1.

At least one of the nodes having a depth of 1 may be split into a quad-tree shape again, resulting in lower nodes having a depth of 2 (i.e., depth = 2). In addition, a node (i.e., a leaf node) that is no longer divided among the lower nodes having a depth of 2 corresponds to a TU. For example, in FIG. 3 (b), TU (c), TU (h), and TU (i) corresponding to nodes c, h, and i are divided twice in a CU and have a depth of 2.

In addition, at least one of the nodes having a depth of 2 may be divided into a quad-tree shape again, resulting in lower nodes having a depth of 3 (i.e., depth = 3). A node (i.e., a leaf node) that is no longer divided among the lower nodes having a depth of 3 corresponds to a TU. For example, in FIG. 3 (b), TU (d), TU (e), TU (f), and TU (g) corresponding to nodes d, e, f, and g are divided three times in a CU and have a depth of 3.

A TU having a tree structure may be hierarchically divided with predetermined maximum depth information (or maximum level information). Each divided TU may have depth information. Since the depth information indicates the number and / or degree of division of the TU, it may include information about the size of the TU.

For one TU, information indicating whether the corresponding TU is split (for example, split TU flag split_transform_flag) may be delivered to the decoder. This partitioning information is included in all TUs except the smallest TU. For example, if the value of the flag indicating whether to split is '1', the corresponding TU is divided into four TUs again. If the value of the flag indicating whether to split is '0', the corresponding TU is no longer divided.
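The way the split flag drives the TU quad-tree can be sketched as follows. This is a simplified illustration and not the HEVC parsing process itself: the function name `parse_transform_tree` is hypothetical, the flags are assumed to arrive in depth-first order, and maximum-depth or maximum-size constraints are ignored.

```python
def parse_transform_tree(flags, size, min_size=4, x=0, y=0):
    """Consume split flags depth-first and return the leaf TUs as (x, y, size).

    flags is an iterator over 0/1 values (hypothetical stand-in for
    split_transform_flag read from the bitstream).
    """
    # The smallest TU carries no flag because it cannot be split further.
    split = size > min_size and next(flags) == 1
    if not split:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += parse_transform_tree(flags, half, min_size, x + dx, y + dy)
    return leaves
```

For example, the flag sequence 1, 0, 1, 0, 0, 0, 0, 0, 0 applied to a 32 × 32 block yields three 16 × 16 TUs and four 8 × 8 TUs.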

Prediction

The decoded portion of the current picture or other pictures in which the current processing unit is included may be used to reconstruct the current processing unit in which decoding is performed.

A picture (slice) that uses only the current picture for reconstruction, i.e., performs only intra prediction, may be referred to as an intra picture or I picture (slice). A picture (slice) that uses at most one motion vector and one reference index to predict each unit may be referred to as a predictive picture or P picture (slice), and a picture (slice) that uses up to two motion vectors and reference indices may be referred to as a bi-predictive picture or B picture (slice).

Intra prediction means a prediction method that derives the current processing block from data elements (eg, sample values, etc.) of the same decoded picture (or slice). That is, a method of predicting pixel values of the current processing block by referring to reconstructed regions in the current picture.

Inter prediction means a prediction method of deriving a current processing block based on data elements (eg, sample values or motion vectors, etc.) of pictures other than the current picture. That is, a method of predicting pixel values of the current processing block by referring to reconstructed regions in other reconstructed pictures other than the current picture.

Hereinafter, the intra prediction will be described in more detail.

Intra prediction (or in-screen prediction)

Referring to FIG. 5, the decoder derives the intra prediction mode of the current processing block (S501).

In intra prediction, a prediction mode may have a prediction direction with respect to the position of the reference sample used for prediction. An intra prediction mode having a prediction direction is referred to as an intra directional prediction mode. On the other hand, the intra prediction modes having no prediction direction are the intra planar (INTRA_PLANAR) prediction mode and the intra DC (INTRA_DC) prediction mode.

Table 1 illustrates an intra prediction mode and related names, and FIG. 6 illustrates a prediction direction according to the intra prediction mode.

Intra prediction performs prediction on the current processing block based on the derived prediction mode. Since the reference samples used for prediction and the specific prediction method vary depending on the prediction mode, when the current block is encoded in an intra prediction mode, the decoder derives the prediction mode of the current block in order to perform the prediction.

The decoder checks whether neighboring samples of the current processing block can be used for prediction and constructs reference samples to be used for prediction (S502).

In intra prediction, the neighboring samples of the current processing block of size nS × nS are a total of 2 × nS samples adjacent to the left boundary and the bottom-left of the current processing block, a total of 2 × nS samples adjacent to the top boundary and the top-right of the current processing block, and one sample adjacent to the top-left of the current processing block.

However, some of the neighboring samples of the current processing block may not have been decoded yet or may not be available. In this case, the decoder can construct the reference samples to be used for prediction by substituting available samples for the samples that are not available.

The decoder may perform filtering of reference samples based on the intra prediction mode (S503).

Whether filtering of the reference sample is performed may be determined based on the size of the current processing block. In addition, the filtering method of the reference sample may be determined by the filtering flag transmitted from the encoder.

The decoder generates a prediction block for the current processing block based on the intra prediction mode and the reference samples (S504). That is, the decoder generates a prediction block for the current processing block (i.e., generates prediction samples in the current processing block) based on the intra prediction mode derived in the intra prediction mode derivation step S501 and the reference samples obtained through the reference sample configuration step S502 and the reference sample filtering step S503.

When the current processing block is encoded in the INTRA_DC mode, in order to minimize the discontinuity of the boundary between processing blocks, the left boundary samples of the prediction block (i.e., the samples in the prediction block adjacent to the left boundary) and the top boundary samples (i.e., the samples in the prediction block adjacent to the top boundary) may be filtered in step S504.

In addition, in operation S504, filtering may be applied to the left boundary sample or the upper boundary sample in the vertical direction mode and the horizontal mode among the intra directional prediction modes similarly to the INTRA_DC mode.

In more detail, when the current processing block is encoded in the vertical mode or the horizontal mode, the value of the prediction sample may be derived based on a reference sample located in the prediction direction. In this case, a boundary sample which is not located in the prediction direction among the left boundary sample or the upper boundary sample of the prediction block may be adjacent to a reference sample which is not used for prediction. That is, the distance from the reference sample not used for prediction may be much closer than the distance from the reference sample used for prediction.

Thus, the decoder may adaptively apply filtering to left boundary samples or upper boundary samples depending on whether the intra prediction direction is vertical or horizontal. That is, when the intra prediction direction is the vertical direction, the filtering may be applied to the left boundary samples, and when the intra prediction direction is the horizontal direction, the filtering may be applied to the upper boundary samples.

Most Probable Mode (MPM)

As described above, in HEVC, the prediction block of the current block is generated using a total of 35 prediction methods for intra prediction (or in-screen prediction): 33 directional prediction methods and two non-directional prediction methods. In HEVC, the statistical characteristics of the intra prediction modes are used to represent (or signal) these 35 prediction modes with fewer bits.

In general, since a coding block has image characteristics similar to those of a neighboring block, the intra prediction mode of the current block also has a high probability of being the same as or similar to that of the neighboring block. Considering these characteristics, the prediction mode of the current PU is encoded based on the intra prediction modes of the left PU and the upper PU of the current PU. In this case, the encoder / decoder determines the prediction modes of the neighboring blocks and the most frequently occurring prediction modes as the MPM modes (Most Probable Modes).

If the prediction mode of the current PU is determined to be an MPM mode, the bits used to represent the prediction mode can be saved (it is represented within 2 bits). If it is determined to be a prediction mode other than an MPM mode, it is encoded as one of the 32 modes remaining after excluding the three MPM modes, so the intra prediction mode can be expressed using 5 bits instead of 6 bits.

If the current block is encoded in the intra prediction mode, the decoder parses an MPM flag (S701).

The decoder determines whether the current block is encoded in the MPM mode from the MPM flag (S702).

As a result of the determination in step S702, when the current block is encoded in the MPM mode, the decoder parses the MPM index (S703).

The decoder sets the mode derived from the MPM index to the intra prediction mode of the current block (S704).

That is, the decoder determines which neighboring block's intra prediction mode is to be used as the MPM (or whether to use a non-directional prediction mode), and performs decoding in the determined intra prediction mode.

As a result of the determination in step S702, when the current block is not encoded in the MPM mode, the decoder parses the intra prediction mode (S705).

As described above, the prediction mode signaled by the encoder is parsed among the remaining 32 prediction modes except the MPM mode, and decoding is performed in the parsed intra prediction mode.
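The decision flow of steps S701 to S705 can be sketched as follows. This is an illustrative sketch only: the function and parameter names are hypothetical, the three MPM candidates are assumed to be already known, and the mapping of the 5-bit remaining-mode value onto the 32 non-MPM modes follows the ascending-order convention of HEVC, which the text above does not spell out.

```python
def decode_intra_mode(mpm_flag, mpm_list, mpm_idx=None, rem_mode=None):
    """Return the intra prediction mode (0..34) of the current block.

    mpm_flag, mpm_idx and rem_mode stand for already-parsed syntax values
    (hypothetical names); mpm_list holds the three MPM candidates.
    """
    if mpm_flag:
        # S703 / S704: the MPM index selects one of the three candidates.
        return mpm_list[mpm_idx]
    # S705: rem_mode is a value in 0..31 that skips over the MPM modes.
    mode = rem_mode
    for cand in sorted(mpm_list):
        if mode >= cand:
            mode += 1
    return mode
```

With the MPM candidates {0, 1, 26}, for instance, the 32 remaining-mode values map one-to-one onto the modes 2 to 25 and 27 to 34.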

A method of determining a specific MPM mode will be described with reference to the drawings below.

The decoder determines whether the prediction mode of the block adjacent to the left side of the current block (hereinafter referred to as 'L mode') and the prediction mode of the block adjacent to the top of the current block (hereinafter referred to as 'A mode') are the same (S801).

As a result of the determination in step S801, when the L mode and the A mode are different, the decoder sets the first MPM mode (MPM [0]) and the second MPM mode (MPM [1]) to the L mode and the A mode, respectively, and sets the third MPM mode (MPM [2]) to a mode other than the L mode and the A mode among the planar, DC, and vertical modes (S802).

As a result of the determination in step S801, when the L mode and the A mode are the same, the decoder determines whether the L mode is smaller than 2 (see FIG. 6 above) (S803).

As a result of the determination in step S803, if the L mode is not less than 2, the decoder sets the first MPM mode (MPM [0]), the second MPM mode (MPM [1]), and the third MPM mode (MPM [2]) to L mode, L mode - 1, and L mode + 1, respectively (S804).

As a result of the determination in step S803, if the L mode is less than 2, the decoder sets the first MPM mode (MPM [0]), the second MPM mode (MPM [1]), and the third MPM mode (MPM [2]) to the planar, DC, and vertical modes, respectively (S805).
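Steps S801 to S805 can be sketched as a short function. The sketch assumes the HEVC mode numbering (planar = 0, DC = 1, vertical = 26), and, as in HEVC, it wraps L mode - 1 and L mode + 1 around within the directional range 2 to 34; the wrap-around is an assumption, since the text above states only "L mode - 1" and "L mode + 1".

```python
PLANAR, DC, VERTICAL = 0, 1, 26  # HEVC mode numbers, assumed here


def derive_mpm(l_mode, a_mode):
    """Return [MPM[0], MPM[1], MPM[2]] from the left (L) and above (A) modes."""
    if l_mode != a_mode:
        # S802: the third candidate is the first of planar, DC, vertical
        # that is not already in the list.
        third = next(m for m in (PLANAR, DC, VERTICAL)
                     if m not in (l_mode, a_mode))
        return [l_mode, a_mode, third]
    if l_mode < 2:
        # S805: both neighbors are non-directional.
        return [PLANAR, DC, VERTICAL]
    # S804: L mode and its two angular neighbors, wrapped into 2..34.
    return [l_mode, 2 + (l_mode + 29) % 32, 2 + (l_mode - 2 + 1) % 32]
```

For example, when both neighbors use mode 10, the candidates are 10, 9, and 11; when the left neighbor uses mode 10 and the above neighbor uses mode 26, the candidates are 10, 26, and planar.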

Random Forest (RF)

A random forest is a set of decision trees (or random trees) and is a kind of machine learning technique used for classification and regression analysis. It is described with reference to the following drawing.

FIG. 9A illustrates an example of a decision tree. The decision tree may have a binary tree structure in which all nodes except the leaf nodes 903 have two or fewer child nodes. One decision tree is composed of nodes and edges, and has a hierarchical structure in which each node is connected by edges to nodes of the next layer.

The nodes within the decision tree may be classified into a root node 901, internal nodes (or split nodes) 902, and leaf nodes (or terminal nodes) 903, as follows. A parent node and a child node are connected by an edge.

Root node 901: refers to the top node of the decision tree. It consists of a split function and the split parameters constituting the split function. The split function is a function for determining whether to send an input to the left child node or the right child node. The split parameters may be determined and stored through off-line training.

Internal node 902: refers to a node other than the leaf nodes 903 and the root node 901 among the nodes constituting the decision tree. Like the root node 901, it is composed of a split function and split parameters.

Leaf node 903: refers to the last node of the decision tree. It stores the class information of the input that reaches it and uses it to vote for the final class decision.

The splitting parameter of each node may be learned in a direction maximizing information gain. In information theory, information gain is defined as the reduction of uncertainty calculated when the training data reaching the current node is divided into child nodes.

That is, the information gain may be defined as in Equation 1.

Here, S_i represents a data set including a subset S_i ^ L to be divided into a left child node and a subset S_i ^ R to be divided into a right child node in an i-th node. H (S) represents the entropy of the random variable S.
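Equation 1 is not reproduced in this text. Assuming the standard definition, which matches the symbols defined above, the information gain of the i-th node is I = H(S_i) - Σ_{j ∈ {L, R}} (|S_i^j| / |S_i|) · H(S_i^j). The following minimal sketch computes this quantity for class labels; the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(S), in bits, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Reduction of entropy when parent is divided into left and right."""
    n = len(parent)
    return (entropy(parent)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))
```

A split that separates two equally frequent classes perfectly yields a gain of 1 bit, while a split that leaves the class mix unchanged in both child nodes yields a gain of 0.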

Intra prediction mode-based image processing method

The present invention proposes a method of encoding / decoding by optimizing signaling of intra prediction mode using a random forest method, which is one of machine learning techniques. By using the random forest method, the number of bits used to express the intra prediction mode can be reduced, thereby improving the coding efficiency.

The present invention proposes a method of estimating an intra prediction mode using a random forest without signaling the intra prediction mode.

In addition, the present invention proposes a method of configuring a plurality of intra prediction modes into one class and estimating the class using a random forest.

In addition, the present invention proposes a method of decoding an intra prediction mode by applying one or more random forests hierarchically.

In addition, the present invention proposes a method for reconfiguring or adding a most probable mode using a random forest.

Embodiment 1

The present embodiment proposes a method of estimating an intra prediction mode (or in-screen prediction mode) using a random forest without signaling the intra prediction mode.

In detail, it proposes a method of deriving an intra prediction mode by determining a class in each decision tree (or random tree) in a random forest through a search from the root node to a leaf node, and determining the final class by combining the classes determined in the individual decision trees.

Here, the random forest consists of one or more decision trees (or random trees). In addition, the probability information of the class may be stored in each leaf node of the decision tree, and the class output through the search in each decision tree may be determined as the class having the highest probability among the classes stored in the leaf node.

In the present embodiment, one intra prediction mode may be mapped and stored for each class. In other words, each leaf node may correspond to a specific intra prediction mode.

The method proposed in the present embodiment may be applied to the encoder and the decoder in the same way, and for convenience of description, the method performed in the decoder will be described.

The random tree (or decision tree) structure constituting the random forest is as described above with reference to FIG. 9A.

Referring again to FIG. 9A, the random tree is composed of nodes and edges. The root node 901 or the internal node 902 may receive or store pixel or feature information of the input block.

The encoder / decoder splits (or searches) from the root node 901 of each random tree constituting the random forest to a leaf node 903, and may use the class information (or intra prediction mode) stored in the reached leaf node 903 for voting for the final class determination (or the determination of the intra prediction mode of the current block).

When the random forest consists of a plurality of random trees, the class (or in-picture prediction mode) determined in each random tree is used for class decision voting of the random forest. In addition, the class having the most votes in the random forest may be determined as an optimal class (that is, a class of the random forest).

If the random forest consists of a single random tree, the class (or intra prediction mode) determined in the random tree may be determined as the optimal class (or intra prediction mode of the current block).

The encoder / decoder may determine a split function by considering the following to design a random tree.

Split function input: The input of the split function should be a feature that allows the optimal encoding mode (i.e., the intra prediction mode of the current block) to be determined accurately. For example, the encoder / decoder may use the neighboring samples of a coding block, a CU, a PU, or the like as the input of the split function.

Split parameter: The split parameters constituting the split function may be determined through learning. The learning method for determining the split parameters will be described later. A split parameter may be the position of a neighboring reference sample (or the value of the reference sample at that position), the intra prediction mode of a neighboring block, the decision threshold t, the number of reference samples used in the split function, or a combination thereof.

Split function output: The output has a value of 0 or 1, and each value may correspond to a left child node or a right child node. That is, the input may be sent to the left child node or the right child node according to the output value of the split function.

A set of a plurality of split parameters is referred to as a split parameter set. Hereinafter, in the description of the present invention, the split parameter set is referred to as theta.

The split function of each node in one random tree has the same form, but the values of the parameters or parameter sets used in the split function of each node may be different. In addition, a parameter set may be shared within one random tree, and the split function of each node may be configured using all or part of the entire parameter set.

Random forests can be designed through a learning phase and a test phase. Through the learning phase, the split parameters of the split function stored in each node of the random tree may be determined, and through the test phase, the classification results of the data (or the probability distribution of each class) obtained using the determined split parameters may be accumulated.

In this case, the sample data used in the learning phase and the testing phase may be different or the same. Hereinafter, for convenience of description, the sample data used in the learning step is referred to as the training sample data and the sample data used in the test step is referred to as test sample data.

The split function in the random forest (or random tree) can be determined by a variety of methods. Examples of how to design a split function are described below. However, the present invention is not limited to the examples described below.

1) The split function h according to the first method may be defined as in Equation 2.

Referring to Equation 2, a reference sample neighboring the current block may be used as an input of the split function. Here, v represents each node of the random tree, and as described above, theta refers to the split parameter set used to construct the split function.

Specifically, the theta constituting the split function of Equation 2 includes the decision threshold t and two reference samples R_(x0, y0) and R_(x1, y1) located at the coordinates (x0, y0) and (x1, y1) among the reference samples neighboring the current block.
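Equation 2 is not reproduced in this text. The worked example given for FIG. 11 suggests that the split function outputs 1 when |R_(x0, y0) - R_(x1, y1)| < t and 0 otherwise; the following minimal sketch assumes that form. The function name `split_eq2` and the dictionary-based representation of the reference samples are hypothetical.

```python
def split_eq2(ref, theta):
    """Binary test of the first method, under the assumed form of Equation 2.

    ref maps a coordinate (x, y) to the reference sample value at that
    position; theta is ((x0, y0), (x1, y1), t).
    Returns 1 (left child node) or 0 (right child node).
    """
    p0, p1, t = theta
    return 1 if abs(ref[p0] - ref[p1]) < t else 0
```

With R_(3,0) = 40 and R_(0,5) = 210, the absolute difference is 170, so the output is 0 for t = 100 and 1 for t = 175.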

FIG. 10 is a diagram illustrating a method of determining a split function as an embodiment to which the present invention is applied.

Referring to FIG. 10, the reference samples neighboring an N × N sized current block (for example, a PU in HEVC) 1001 may consist of a total of 2 × N samples adjacent to the left and bottom-left of the current block, a total of 2 × N samples adjacent to the top and top-right of the current block, and one sample adjacent to the top-left of the current block.

As described above, theta may be determined as a result of learning using the learning sample data. In this case, the encoder / decoder may learn using the same learning sample data, or the encoder / decoder may share the theta determined as a result of the learning.

In the learning phase, for each node, two pixels are randomly selected N times among the reference samples neighboring the current block of size N × N, and the two reference samples and the threshold value t that maximize the information gain among the selected candidates may be determined as the split parameters.
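The learning of one node can be sketched as a random search. The sketch below rests on several assumptions not stated in the text: the split function has the absolute-difference form used in the FIG. 11 example, the threshold is chosen from a given list of candidate values, and the names `learn_node`, `num_trials`, and `thresholds` are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def learn_node(samples, positions, num_trials, thresholds, rng):
    """Pick the (p0, p1, t) with the largest information gain for one node.

    samples is a list of (ref, label) pairs, where ref maps (x, y) to a
    reference sample value; positions lists the selectable coordinates.
    Returns (best_gain, (p0, p1, t)).
    """
    labels = [label for _, label in samples]
    best_gain, best_theta = -1.0, None
    for _ in range(num_trials):
        p0, p1 = rng.sample(positions, 2)  # two distinct reference samples
        for t in thresholds:
            left = [lab for ref, lab in samples if abs(ref[p0] - ref[p1]) < t]
            right = [lab for ref, lab in samples if abs(ref[p0] - ref[p1]) >= t]
            gain = entropy(labels) - (len(left) * entropy(left)
                                      + len(right) * entropy(right)) / len(labels)
            if gain > best_gain:
                best_gain, best_theta = gain, (p0, p1, t)
    return best_gain, best_theta
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes the search reproducible, which matters if the encoder and the decoder are to learn identical parameters from the same learning sample data.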

After the partitioning parameter is determined in the learning step, in the test step, a test procedure for accumulating probability values (that is, classification results of data) stored in each leaf node may be performed. In the test phase, one leaf node may be searched through partitioning (or searching) at each node using the theta determined through learning. A detailed method of the test procedure will be described with reference to the drawings below.

FIG. 11 is a diagram illustrating a method of determining a partition function as an embodiment to which the present invention is applied.

Referring to FIG. 11, assume that a random tree in a random forest for estimating (or predicting) the intra prediction mode of the current block is configured as shown in FIG. 11 (a), and that the split parameters of each node in the random tree have been learned as shown in Table 2 below. In addition, it is assumed that the test sample data (i.e., the neighboring reference samples) used in the test procedure are as shown in FIG. 11 (b).

FIG. 11 (a) illustrates the path along which a test sample moves through the learned random tree. Table 2 shows the split parameters (the positions R (x0, y0) and R (x1, y1) of two reference samples and the threshold t) stored in each node in the random tree.

At node 0 (1101), a binary test determines whether the test sample data moves to the left child node or the right child node. The result of the binary test at node 0 (1101) is | R (3,0) - R (0,5) | = | 40 - 210 | = 170, which is not smaller than the threshold of 100 and therefore does not satisfy Equation 2. Accordingly, a value of 0 is output, and the test sample data moves to node 2 (1102), which is the right child node.

The result of the binary test at node 2 (1102) is | R (0,4) - R (1,0) | = | 200 - 20 | = 180, which is not smaller than the threshold of 50 and therefore does not satisfy Equation 2. Accordingly, a value of 0 is output, and the test sample data moves to node 6 (1103), which is the right child node.

The result of the binary test at node 6 (1103) is | R (0,5) - R (3,0) | = | 210 - 40 | = 170 < 175, which satisfies Equation 2. Accordingly, a value of 1 is output, and the test sample data reaches node 11 (1104), which is the left child node and a leaf node.
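The traversal described above can be reproduced with a few lines of code. The sketch below uses only the split parameters and sample values quoted in the text for nodes 0, 2, and 6; the node numbers 1, 5, and 12 given to the children that are not visited are hypothetical placeholders, as are the names `TREE`, `REF`, and `find_leaf`.

```python
# Split parameters quoted in the text: (position 0, position 1, threshold t).
TREE = {
    0: {"theta": ((3, 0), (0, 5), 100), "left": 1, "right": 2},
    2: {"theta": ((0, 4), (1, 0), 50), "left": 5, "right": 6},
    6: {"theta": ((0, 5), (3, 0), 175), "left": 11, "right": 12},
}

# Reference sample values quoted in the text, keyed by (x, y).
REF = {(3, 0): 40, (0, 5): 210, (0, 4): 200, (1, 0): 20}


def find_leaf(tree, ref, node=0):
    """Follow the binary tests from the given node and return the visited path."""
    path = [node]
    while node in tree:  # a node without an entry is treated as a leaf
        p0, p1, t = tree[node]["theta"]
        output = 1 if abs(ref[p0] - ref[p1]) < t else 0
        node = tree[node]["left"] if output == 1 else tree[node]["right"]
        path.append(node)
    return path
```

Evaluating `find_leaf(TREE, REF)` visits nodes 0, 2, and 6 and ends at node 11, matching the path described for FIG. 11 (a).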

As described above, in the test step, the probability value for each class may be accumulated in each leaf node of the random tree in which the splitting parameter of the split function is determined in the learning step.

Assume that there are 9 classes. For example, through the test procedure, a probability distribution over the classes such as {0.1, 0.2, 0.5, 0, 0.05, 0, 0.15, 0, 0} may be stored (or accumulated) in node 11 (1104).

In the encoding / decoding step, it may reach from the root node to the leaf node of the random tree in the same manner as in the test procedure.

That is, by using the partitioning parameter and the partitioning function determined in the learning step, each node may be searched for the child node to reach the leaf node. The class having the maximum probability among the classes stored in the leaf node may be finally determined.

If a random forest is composed of a plurality of random trees, the encoder / decoder may determine the most determined class among the classes determined in each random tree as the intra prediction mode of the current block. In other words, through the voting step, the class determined in each random tree is voted and the class having the most votes can be finally determined. The intra prediction mode of the current block may be determined using the determined class information. In other words, the intra prediction mode mapped to the determined class may be determined as the intra prediction mode of the current block.

For example, in the encoding / decoding step, when node 11 (1104) is reached by splitting (or searching) from the root node (node 0) 1101 of the random tree as shown in FIG. 11, class 3, which is the class having the maximum probability among the classes stored in node 11 (1104), may be predicted (or determined).

Further, for example, assuming that 10 random trees exist in a random forest, if class 3 is determined (or predicted) in 7 of the 10 random trees, class 2 in 2 random trees, and class 9 in 1 random tree, class 3, which received the most votes through the voting procedure, may finally be determined as the class of the random forest, and the intra prediction mode of the current block may be determined from the determined class.
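The two decisions in these examples, choosing the class with the maximum probability in a leaf node and voting across the random trees, can be sketched as follows. The function names are hypothetical, and classes are numbered from 1 so that the result matches the class numbers used in the text.

```python
from collections import Counter


def leaf_class(probs):
    """Return the 1-based class with the maximum probability in a leaf node."""
    return max(range(len(probs)), key=probs.__getitem__) + 1


def forest_vote(tree_classes):
    """Return the class that received the most votes across the random trees."""
    return Counter(tree_classes).most_common(1)[0][0]
```

The distribution {0.1, 0.2, 0.5, 0, 0.05, 0, 0.15, 0, 0} gives class 3, and ten votes made up of seven for class 3, two for class 2, and one for class 9 also give class 3. How a tie between classes would be resolved is not specified in the text.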

2) Unlike the first method, which uses one reference sample line (or reference sample array) among the neighboring reference samples, the split function according to the second method can use two reference sample lines of neighboring reference samples as an input.

That is, when referring to multiple reference sample lines (or reference sample arrays) to generate a prediction block of the current block, a partition function may also be designed using reference samples of the multiple reference sample lines.

Hereinafter, a method of designing a division function using two neighboring reference sample lines will be described by way of example with reference to FIGS. 12 to 17. However, the present invention is not limited only to the examples described below.

FIG. 12 is a diagram illustrating a method of determining a split function as an embodiment of the present invention.

Referring to FIG. 12, for example, the division function h may be defined as in Equation 3 below.

Referring to FIG. 12, of the two reference sample lines neighboring the current block, the reference sample line closer to the current block (i.e., located inside) may consist of a total of 2 × N samples (R^I_(0,1), R^I_(0,2), ..., R^I_(0,2N-1), R^I_(0,2N)) adjacent to the left and bottom-left of the N × N sized current block, a total of 2 × N samples (R^I_(1,0), R^I_(2,0), ..., R^I_(2N-1,0), R^I_(2N,0)) adjacent to the top and top-right of the current block, and one sample (R^I_(0,0)) adjacent to the top-left of the current block.

Of the two reference sample lines neighboring the current block, the reference sample line farther from the current block (i.e., located outside) may consist of a total of 2 × N samples (R^O_(-1,1), R^O_(-1,2), ..., R^O_(-1,2N-1), R^O_(-1,2N)) neighboring the left and bottom-left of the N × N sized current block, a total of 2 × N samples (R^O_(1,-1), R^O_(2,-1), ..., R^O_(2N-1,-1), R^O_(2N,-1)) neighboring the top and top-right of the current block, and three samples (R^O_(-1,-1), R^O_(0,-1), and R^O_(-1,0)) neighboring the top-left of the current block.

Referring to Equation 3, R^O_(x0, y0) refers to the reference sample at position (x0, y0) of the reference sample line located outside of the two reference sample lines, and R^I_(x1, y1) refers to the reference sample at position (x1, y1) of the reference sample line located inside. Theta may be composed of {R^O_(x0, y0), R^I_(x1, y1), t}.

FIG. 12 illustrates the case where R^O_(0,-1) 1201 is selected as one sample in the outer reference sample line and R^I_(N+1,0) 1202 is selected as one sample in the inner reference sample line.

In the learning phase, as in the first method, one sample (R^O_(x0, y0)) in the outer reference sample line and one sample (R^I_(x1, y1)) in the inner reference sample line, of the two reference sample lines neighboring the current block, are arbitrarily selected N times, and the split parameters maximizing the information gain may be determined.

In the same manner as the method described above with reference to FIG. 11, the leaf node of the random tree having the partition parameter determined through the learning procedure may be searched to accumulate the result of classification of data in the leaf node.

As described above, the encoder / decoder may reach the leaf node in the same manner as the test procedure, and may determine the class having the maximum probability among the classes stored in the leaf node.

If a random forest is composed of a plurality of random trees, the encoder / decoder may determine the class determined most frequently among the classes determined in each random tree as the class of the random forest.
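The vote among the random trees may be sketched as follows. This is a minimal sketch; the function name is hypothetical, and the tie-breaking rule (the class encountered first wins) is an assumption, since the text does not specify how ties are resolved.

```python
from collections import Counter


def forest_class(tree_classes):
    """Return the class chosen by the largest number of trees.

    tree_classes: the class determined by each random tree, in tree order.
    Ties are resolved in favor of the class that appears first (assumption).
    """
    return Counter(tree_classes).most_common(1)[0][0]
```

For example, if five trees output the classes 26, 10, 26, 1 and 26, the class of the random forest is 26.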

Another example of a partitioning function using two neighboring reference sample lines will be described with reference to the drawings below.

FIG. 13 is a diagram illustrating a method of determining a partition function as an embodiment to which the present invention is applied.

Referring to FIG. 13, the partition function h may be defined as in Equation 4.

Here, two reference sample lines may be used as an input of the division function, and the reference sample lines may be configured in the same manner as in the case of FIG. 12.

Referring to Equation 4, R^O_(x0,y0) and R^O_(x2,y2) refer to the reference samples at positions (x0, y0) and (x2, y2) of the reference sample line located outside of the two reference sample lines, and R^I_(x1,y1) refers to the reference sample at position (x1, y1) of the reference sample line located inside. Theta may be composed of {R^O_(x0,y0), R^O_(x2,y2), R^I_(x1,y1), t_1, t_2}.

FIG. 13 illustrates the case where R^O_(-1,2) 1301 and R^O_(2,-1) 1303 are selected as two samples in the outer reference sample line and R^I_(N+1,0) 1302 is selected as one sample in the inner reference sample line.

When the difference between R^O_(x0,y0) and R^I_(x1,y1) is greater than the first threshold t_1, or the difference between R^O_(x2,y2) and R^I_(x1,y1) is greater than the second threshold t_2, a value of 0 may be output and the search moves to the right child node. On the other hand, if both difference values are smaller than their respective thresholds, a value of 1 may be output and the search moves to the left child node.
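The decision rule described above may be sketched as follows. The sketch assumes that each reference sample line is a mapping from an (x, y) position to a sample value, that the difference is taken as an absolute difference, and that a difference exactly equal to a threshold is treated like a smaller one. These points, as well as the function and parameter names, are assumptions for illustration and are not stated in the text.

```python
def split_two_outer_one_inner(outer, inner, pos0, pos2, pos1, t1, t2):
    """Return 0 (right child) or 1 (left child) for one block.

    outer, inner: dicts mapping (x, y) to the sample value of the outer and
    inner reference sample lines. pos0 and pos2 index the outer line, pos1
    indexes the inner line; t1 and t2 are the two thresholds.
    """
    d0 = abs(outer[pos0] - inner[pos1])  # absolute difference (assumption)
    d2 = abs(outer[pos2] - inner[pos1])
    return 0 if (d0 > t1 or d2 > t2) else 1
```

With sample values 100 and 104 in the outer line, 102 in the inner line, and t_1 = t_2 = 4, both differences are 2, so the output is 1 and the left child node is selected.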

In the learning phase, as shown in FIG. 13, the splitting parameter maximizing the information gain may be determined by randomly selecting, N times, two samples (R^O_(x0,y0) and R^O_(x2,y2)) in the reference sample line located outside of the two reference sample lines neighboring the current block and one sample (R^I_(x1,y1)) in the reference sample line located inside.

In the same manner as the method described with reference to FIG. 11, the data classification result may be accumulated in the leaf node by dividing from the root node to the leaf node of the random tree in which the splitting parameter is determined through the learning procedure.

FIG. 14 is a diagram illustrating a method of determining a partition function as an embodiment to which the present invention is applied.

Referring to FIG. 14, the partition function h may be defined as in Equation 5.

The difference from the division function of Equation 4 is that two reference samples (that is, R^I_(x1,y1) and R^I_(x3,y3)) may also be selected in the reference sample line located inside. Theta may be composed of {R^O_(x0,y0), R^O_(x2,y2), R^I_(x1,y1), R^I_(x3,y3), t_1, t_2}.

FIG. 14 illustrates the case where R^O_(-1,2) 1402 and R^O_(2,-1) 1401 are selected as two samples in the outer reference sample line and R^I_(2N,0) 1403 and R^I_(0,N) 1404 are selected as two samples in the inner reference sample line.

Except that two reference samples in the reference sample line located inside of the two reference sample lines are determined as split parameters as described above, the learning step and the test step may be performed in the same manner as the method described with reference to FIG. 13, and the class may be determined in the encoding / decoding step.

FIG. 15 is a diagram illustrating a method of determining a division function, according to an embodiment to which the present invention is applied.

For example, the division function h may be defined as in Equation 6.

Two reference sample lines may be used as an input of the division function h, and reference sample lines may be configured in the same manner as in the case of FIG. 12.

Here, theta may be composed of {θ, t}. In addition, θ = tan^(-1)(k_m), and k_m may be configured as ΔR^y_(i,j) / ΔR^x_(i,j).

Referring to FIG. 15, ΔR^x_(i,j) represents the difference between the sample value at the position of each reference sample in the two reference sample lines and the sample value of the reference sample adjacent to its right in the x-axis direction, and ΔR^y_(i,j) represents the difference between the sample value at the position of each reference sample in the two reference sample lines and the sample value of the reference sample adjacent below it in the y-axis direction.

In the learning phase, the value of k_m at which the L2 norm (or gradient magnitude) of ΔR^x_(i,j) and ΔR^y_(i,j) is maximized is determined, and the value of θ can be determined from it. The L2 norm of Δx and Δy is defined as in Equation 7.

If a random forest consists of a plurality of random trees, the encoder / decoder may determine the class determined most frequently among the classes determined in each random tree as the class of the random forest, and derive the intra prediction mode of the current block.

FIGS. 16 and 17 are diagrams illustrating a method of determining a division function in an embodiment to which the present invention is applied.

For example, the division function h may be defined as in Equation 8.

Here, theta may be composed of {θ, t}. D may be calculated by Equation 9 below.

Here, D_0 to D_(6N+1) are defined as the differences in sample value between reference samples neighboring each other in the direction of the angle θ.

Referring to FIG. 16, it is assumed that the angle θ is zero. In this case, D_0 to D_(6N+1) may be calculated as the differences in sample value between reference samples adjacent in the horizontal direction. For example, D_0 may be calculated as the difference between the sample values of the reference samples R^O_(-1,-1) 1601 and R^O_(0,-1) 1602.

Referring to FIG. 17, it is assumed that the angle θ is π/2. In this case, D_0 to D_(6N+1) may be calculated as the differences in sample value between reference samples adjacent in the vertical direction. For example, D_0 may be calculated as the difference between the sample values of the reference samples R^O_(-1,-1) 1701 and R^O_(-1,0) 1702.

Embodiment 2

In this embodiment, a method for efficiently encoding an intra prediction mode in a random forest encoder is proposed.

Specifically, this embodiment proposes a method of determining (or predicting) an intra prediction mode by using, as input values of a partition function in a random tree constituting a random forest, not only the reference samples neighboring the current block but also the sample values in the prediction block of the current block.

Here, the prediction block may correspond to a PU using HEVC as an example, but the present invention is not limited thereto. That is, the prediction block in this embodiment may mean an array or block of samples predicted in the same prediction mode, and may also be referred to as a coding block, a coding unit, a transform block, a transform unit, or the like.

The method proposed in this embodiment may be used for fast-mode decision of an intra prediction mode in an encoder. Since the sample of the predictive block is used, the method proposed in this embodiment can be applied only to the encoder. Therefore, the present embodiment will be described with reference to the encoder.

For example, in HEVC, a fast-mode decision of an intra prediction mode typically first selects a mode set including some of the total of 35 intra prediction modes (RMD: Rough Mode Decision), and then selects the best mode among the selected modes through rate-distortion optimization (FMD: Fine Mode Decision).

In the present embodiment, the following method may be used using a random forest.

The encoder may add one or more intra prediction modes estimated by the random forest method to the mode set in the RMD process, or may replace the existing intra prediction modes selected into the mode set in the RMD process with them. In this case, among the modes selected by the random forest method, the encoder may select intra prediction modes in descending order of probability, using the probability value of each mode (that is, the value stored for each class).

Instead of the existing RMD process, the encoder may configure a mode set by selecting one or more intra prediction modes in descending order according to probability values among intra prediction modes estimated through the random forest method. In addition, the encoder may select an optimal intra prediction mode by applying the FMD method in the selected mode set.
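The two ways of building the candidate mode set described above may be sketched as follows. The sketch assumes that the random forest yields a mapping from intra prediction mode number to probability, and that ties in probability are broken by the smaller mode number. The function names and the tie rule are hypothetical.

```python
def top_modes(mode_probs, k):
    """Return the k modes with the highest probability, in descending order."""
    ranked = sorted(mode_probs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [mode for mode, _ in ranked[:k]]


def build_mode_set(rmd_modes, mode_probs, k, replace=False):
    """Combine RMD candidates with the k most probable random-forest modes.

    replace=False: append the random-forest modes to the RMD mode set.
    replace=True:  use only the random-forest modes instead of the RMD set.
    """
    rf_modes = top_modes(mode_probs, k)
    if replace:
        return rf_modes
    merged = list(rmd_modes)
    for mode in rf_modes:
        if mode not in merged:  # avoid evaluating the same mode twice in FMD
            merged.append(mode)
    return merged
```

The resulting mode set would then be passed to the FMD stage, which selects the final mode by rate-distortion optimization.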

The partitioning function in the random tree used for fast-mode decision can be designed by various methods.

Hereinafter, a method of designing a division function will be described with reference to the examples of FIGS. 18 to 22. However, the present invention is not limited only to the examples described below.

FIG. 18 is a diagram illustrating a method of determining a partition function as an embodiment to which the present invention is applied.

Referring to FIG. 18, the partition function h may be defined as in Equation 10.

The reference samples neighboring the N × N sized current block (for example, a PU in HEVC) may consist of a total of 2 × N samples adjacent to the left and bottom-left of the current block, a total of 2 × N samples adjacent to the top and top-right of the current block, and one sample adjacent to the top-left of the current block.

Referring to Equation 10, R_(x0,y0) represents the reference sample located at coordinates (x0, y0) among the samples neighboring the current block, and P_(x1,y1) represents the prediction sample located at coordinates (x1, y1). Theta may be composed of {L^R, L^P, t}. L^R represents R_(x0,y0) and L^P represents P_(x1,y1).

In other words, the partition function of Equation 10 may be composed of one reference sample among samples neighboring the current block, one prediction sample in the prediction block of the current block, and a parameter of the threshold t.

FIG. 18 exemplifies a case in which R_(0,1) 1801 is selected as one reference sample and P_(2,1) 1802 is selected as one prediction sample.

In the learning phase, the splitting parameter that maximizes the information gain can be determined by randomly selecting, N times, one reference sample R_(x0,y0) among the reference samples neighboring the current block and one prediction sample P_(x1,y1) in the prediction block.

The encoder may reach the leaf node in the same manner as the test procedure, and determine the class having the maximum probability among the classes stored in the leaf node.

If a random forest is composed of a plurality of random trees, the encoder may determine the class determined most frequently among the classes determined in each random tree as the intra prediction mode of the current block.

Another example of a split function in a random tree for fast-mode decision will be described with reference to the following drawings.

FIG. 19 is a diagram illustrating a method of determining a partition function, according to an embodiment to which the present invention is applied.

Referring to FIG. 19, the partition function h may be defined as in Equation 11.

Referring to Equation 11, R_(x0,y0) and R_(x2,y2) refer to the reference samples at positions (x0, y0) and (x2, y2) among the reference samples neighboring the current block, and P_(x1,y1) refers to the prediction sample at position (x1, y1) in the prediction block. Theta may be composed of {L_1^R, L_2^R, L^P, t_1, t_2}. L_1^R represents R_(x0,y0), L_2^R represents R_(x2,y2), and L^P represents P_(x1,y1). In addition, the reference samples neighboring the current block may be configured as in FIG. 18.

FIG. 19 illustrates the case where R_(2,0) 1901 and R_(N+1,0) 1902 are selected as two reference samples among the reference samples neighboring the current block and P_(N,1) is selected as one prediction sample in the prediction block.

When the difference between R_(x0,y0) and P_(x1,y1) is greater than the first threshold t_1, or the difference between R_(x2,y2) and P_(x1,y1) is greater than the second threshold t_2, a value of 0 may be output and the search moves to the right child node. On the other hand, if both difference values are smaller than their respective thresholds, a value of 1 may be output and the search moves to the left child node.

In the learning phase, the splitting parameter that maximizes the information gain can be determined by randomly selecting, N times, two reference samples (R_(x0,y0) and R_(x2,y2)) and one prediction sample (P_(x1,y1)).

As described above, the encoder may reach the leaf node in the same manner as the test procedure, and may determine a class having the maximum probability among the classes stored in the leaf node.

FIG. 20 is a diagram illustrating a method of determining a partition function as an embodiment to which the present invention is applied.

Referring to FIG. 20, the partition function h may be defined as in Equation 12.

Here, as in FIG. 18, reference samples neighboring to the current block may be configured.

The difference from the division function of Equation 11 is that two prediction samples P_(x1,y1) and P_(x3,y3) may be selected in the prediction block, in addition to the two reference samples. Theta may be composed of {L_1^R, L_2^R, L_1^P, L_2^P, t_1, t_2}. L_1^R represents R_(x0,y0), L_2^R represents R_(x2,y2), L_1^P represents P_(x1,y1), and L_2^P represents P_(x3,y3).

FIG. 20 illustrates the case where R_(1,0) 2001 and R_(0,1) 2002 are selected as two reference samples and P_(N,1) 2003 and P_(N,N) 2004 are selected as two samples in the prediction block.

A learning step and a test step may be performed in the same manner as the method described above with reference to FIG. 19 except that two prediction samples are determined as split parameters in the prediction block, and a class may be determined in the encoding step.

FIG. 21 is a diagram illustrating a method of determining a partition function as an embodiment to which the present invention is applied.

Referring to FIG. 21, the partitioning function h may be defined as in Equation 13.

Referring to Equation 13, theta may be composed of {θ, t}, and θ may be calculated as θ = arctan(Δy / Δx).

Referring to FIG. 21, Δx represents the difference between the sample value of the current prediction sample 2101 in the prediction block and the sample value of the prediction sample 2102 adjacent to its right in the x-axis direction, and Δy represents the difference between the sample value of the current prediction sample 2101 and the sample value of the prediction sample 2103 adjacent below it in the y-axis direction.

In the learning phase, the θ value at which the L2 norm (or gradient magnitude) of Δx and Δy at the current prediction sample is maximized may be determined. The L2 norm of Δx and Δy may be calculated by Equation 7 described above.
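The gradient-based angle described above may be sketched as follows. The sketch assumes that the prediction block is given as a list of rows, that Δx and Δy are taken as neighbor minus current sample, and that the L2 norm of Equation 7 is sqrt(Δx² + Δy²). It uses math.atan2 in place of arctan(Δy / Δx) so that Δx = 0 does not cause a division by zero; the two agree when Δx > 0. The function name and these conventions are assumptions for illustration.

```python
import math


def dominant_gradient_angle(pred):
    """Return (theta, magnitude) at the sample with the largest gradient.

    pred: prediction block as a list of rows of sample values.
    Only samples that have both a right and a lower neighbor are visited.
    """
    best_theta, best_mag = 0.0, -1.0
    for y in range(len(pred) - 1):
        for x in range(len(pred[0]) - 1):
            dx = pred[y][x + 1] - pred[y][x]  # difference with right neighbor
            dy = pred[y + 1][x] - pred[y][x]  # difference with lower neighbor
            mag = math.hypot(dx, dy)          # L2 norm of (dx, dy)
            if mag > best_mag:
                best_theta, best_mag = math.atan2(dy, dx), mag
    return best_theta, best_mag
```

For a block whose sample values change only from one row to the next, Δx is zero everywhere and the returned angle is π/2.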

In this way, the characteristics of the prediction block can be reflected in the splitting parameter by using the amount of change in the pixel value with adjacent pixels.

FIG. 22 is a diagram illustrating a method of determining a partition function as an embodiment to which the present invention is applied.

Referring to FIG. 22, the partition function h may be defined as in Equation 14.

Referring to Equation 14, theta may be composed of {θ, t}. D may be calculated by Equation 15.

Here, D_0 to D_(N(N-1)) are defined as the differences in sample value between prediction samples neighboring each other in the direction of the angle θ.

Referring to FIG. 22A, it is assumed that the angle θ is zero. In this case, D_0 to D_(N(N-1)) may be calculated as the differences in sample value between prediction samples adjacent in the horizontal direction. For example, D_0 may be calculated as the difference between the sample values of the prediction samples P_0 2201 and P_1 2202.

Referring to FIG. 22B, it is assumed that the angle θ is π/2. In this case, D_0 to D_(N(N-1)) may be calculated as the differences in sample value between prediction samples adjacent in the vertical direction. For example, D_0 may be calculated as the difference between the sample values of the prediction samples P_0 2203 and P_4 2204.
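The two cases above (θ = 0 and θ = π/2) may be sketched as follows. The sketch assumes that the prediction block is a list of rows and that D is the sum of the absolute differences between neighboring prediction samples. Whether Equation 15 uses absolute or signed differences is not recoverable from the text, so that choice, like the function name, is an assumption.

```python
def directional_difference(pred, direction):
    """Sum of absolute differences between prediction samples that are
    neighbors along `direction` ("horizontal" for angle 0, "vertical" for pi/2).
    """
    rows, cols = len(pred), len(pred[0])
    if direction == "horizontal":
        diffs = [abs(pred[y][x] - pred[y][x + 1])
                 for y in range(rows) for x in range(cols - 1)]
    elif direction == "vertical":
        diffs = [abs(pred[y][x] - pred[y + 1][x])
                 for y in range(rows - 1) for x in range(cols)]
    else:
        raise ValueError("direction must be 'horizontal' or 'vertical'")
    return sum(diffs)
```

A small D along a direction indicates that the prediction samples change little along that direction, which is the property a split on D with threshold t can exploit.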

Embodiment 3

In the method proposed in the first embodiment, one intra prediction mode may be mapped to each class determined through the random forest. In this case, there is an advantage that the intra prediction mode can be decoded without signaling the intra prediction mode.

In contrast, the present embodiment proposes a method of improving the accuracy of prediction mode determination by mapping one or more intra prediction modes for each class.

By considering the intra prediction mode mapped to the class as a set of a plurality of prediction modes instead of one specific prediction mode, the intra prediction mode can be estimated accurately, thereby improving coding efficiency. In addition, by determining a class to which a plurality of prediction modes are mapped and signaling an index indicating the intra prediction mode of the current block within the determined class, the number of bits used to express the intra prediction mode may be reduced.

Referring to FIG. 23, it is assumed that the intra prediction mode of the current block is the same as in the existing HEVC (see FIG. 6 above). For example, three, four or five adjacent intra prediction modes may be classified and mapped to each class.

In this case, the encoder / decoder may determine a class using a random forest, and the encoder may signal index information indicating an intra prediction mode of the current block to the decoder within the determined class. The decoder may determine the intra prediction mode of the current block by using the index information received from the encoder within the determined class.
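The index signaling described above may be sketched as follows. The sketch assumes that the index is coded with a fixed-length code and that a class is an ordered list of mode numbers known to both the encoder and the decoder. The actual binarization is not specified in the text, and the function names are hypothetical.

```python
def index_bits(class_size):
    """Bits needed for a fixed-length index into a class of `class_size` modes."""
    return (class_size - 1).bit_length()


def derive_mode(class_modes, index):
    """Decoder side: pick the mode indicated by the signaled index."""
    return class_modes[index]
```

Under this assumption, a fixed-length index over all 35 modes would take 6 bits, whereas an index within a class of four modes takes 2 bits.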

The method of constructing a class to which a plurality of intra prediction modes correspond (or maps) may be performed by various methods. The following illustrates the methods that can be used to construct a class. Each method may be used independently or a plurality of methods may be used in combination.

Classes can be organized in one or a combination of the following ways: The class identified in the example below is represented by index i.

- As shown in FIG. 23, a method of configuring a set of prediction modes having similar directions (or a set of adjacent prediction modes) as one class

For example, the prediction modes m_i-n to m_i+n may be configured as one class by selecting a representative mode m_i. Here, m represents a prediction mode number (for example, any of modes 0 to 34 in HEVC), and i represents the index of the class.

- A method of configuring the set of statistically frequently selected modes and the set of modes that are not frequently selected as one class each

- A method of constructing classes by dividing MPM modes and non-MPM modes

- A method of constructing classes by dividing angular prediction modes and non-angular prediction modes

- A method of including a prediction mode that has a high probability of being determined as the prediction mode of the current block in a plurality of classes

For example, classes may be configured such that at least one of the prediction modes that are frequently selected in an image (planar mode, vertical mode, horizontal mode, DC mode, etc.) is included in a plurality of classes. Alternatively, classes may be configured such that at least one of the MPM modes is included in a plurality of classes.
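The first construction method in the list above (a representative mode m_i together with the modes m_i-n to m_i+n) may be sketched as follows. The sketch assumes the HEVC angular mode range 2 to 34 and simply drops mode numbers that fall outside it. The clipping rule and the function name are assumptions.

```python
def class_around(representative, n, lowest=2, highest=34):
    """Modes representative-n .. representative+n, limited to the valid range."""
    return [mode
            for mode in range(representative - n, representative + n + 1)
            if lowest <= mode <= highest]
```

For example, with representative mode 26 (vertical in HEVC) and n = 2, the class consists of modes 24 to 28.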

The number of prediction modes included (or mapped) in each class may be the same or may be different for each class. The class may be determined using the same random forest, or different independent forests may be applied for each class.

A method of designing a partition function and a partition parameter in a random tree constituting a random forest, a method of determining a class in a random tree or a random forest, and the like may be applied in the same manner as described in the first embodiment.

Embodiment 4

In this embodiment, a method of encoding / decoding an intra prediction mode hierarchically based on a random forest is proposed.

Specifically, this embodiment proposes a method of decoding an intra prediction mode by hierarchically dividing the intra prediction modes and determining a class in each layer using one or more random forests.

Each random forest used hierarchically in this embodiment may be composed of a random tree of the same type or may be composed of a random tree of different types.

The type of random forest used in the current layer may be determined according to the class determined through the random forest in the previous layer.

The encoder / decoder may search for leaf nodes in the first decision tree (or first random forest) to determine an intra prediction mode group consisting of a plurality of intra prediction modes that can be applied to the current block. The encoder / decoder may then search for leaf nodes in a second decision tree (or a second random forest) determined according to the intra prediction mode group, and select a class to which one or more intra prediction modes are mapped within the intra prediction mode group. Specific examples will be described with reference to the drawings below.

Referring to FIG. 24, first, a prediction mode may be classified as a directional (angular) prediction mode or a non-directional (non-angular) prediction mode using random forest 1 (RF 1) 2401. That is, random forest 1 2401 corresponds to a set of random trees that determine a class divided into a directional mode and a non-directional mode.

If the intra prediction mode of the current block is determined as the directional mode by random forest 1 2401, the encoder / decoder may finally encode / decode the intra prediction mode of the current block using random forest 2 (RF 2) 2402.

If the intra prediction mode of the current block is determined as the non-directional mode by random forest 1 2401, the encoder / decoder may finally encode / decode the intra prediction mode of the current block using random forest 3 (RF 3) 2403.
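The two-layer decision described with reference to FIG. 24 may be sketched as follows. The sketch assumes that each random forest is available as a callable that takes the input samples and returns a class, and that the second-layer forests are stored in a mapping keyed by the group chosen in the first layer. The function name and this interface are assumptions for illustration.

```python
def derive_mode_hierarchically(features, first_forest, second_forests):
    """Run the first-layer forest, then the forest selected by its result.

    first_forest:   callable returning a mode group, e.g. "directional".
    second_forests: dict mapping each mode group to the forest for that group.
    Returns (group, class chosen inside the group).
    """
    group = first_forest(features)
    return group, second_forests[group](features)
```

In terms of FIG. 24, first_forest plays the role of RF 1, and the mapping holds RF 2 for the directional group and RF 3 for the non-directional group.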

In this case, the class of each random forest may be classified using the method of Embodiment 1 or Embodiment 3 described above.

In addition to the method illustrated in FIG. 24, various decoding methods may be used to hierarchically divide the decoding step of the intra prediction mode, and the intra prediction mode may be encoded / decoded using different random forests in each layer.

Embodiment 5

In this embodiment, a method of configuring an MPM mode using a random forest is proposed.

If it is assumed that the MPM mode set is composed of N encoding modes, the MPM mode set may be configured in the following manner using a random forest.

As a result of the random forest, probabilities for each intra prediction mode may be obtained at each leaf node of the random tree. In this case, the top N intra prediction modes having the highest probabilities may be determined as the MPM mode set.
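The selection of the N most probable modes may be sketched as follows. The sketch assumes that the leaf node reached in each random tree provides a mapping from intra prediction mode to probability, that these distributions are combined by summation across trees, and that ties are broken by the smaller mode number. The combination rule, the tie rule, and the function name are assumptions.

```python
def mpm_from_forest(leaf_distributions, n):
    """Return the n modes with the highest summed probability.

    leaf_distributions: one {mode: probability} dict per random tree,
    taken from the leaf node reached in that tree.
    """
    totals = {}
    for dist in leaf_distributions:
        for mode, prob in dist.items():
            totals[mode] = totals.get(mode, 0.0) + prob
    return sorted(totals, key=lambda mode: (-totals[mode], mode))[:n]
```

For example, with three trees whose leaf nodes favor modes 26 and 10, the three-mode MPM set starts with 26 and 10, followed by the next most probable mode.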

If the probability for each intra prediction mode is obtained by the random forest, an MPM mode set of N modes may be determined by combining the top (N-m) intra prediction modes having the highest probability values with m other prediction modes (for example, m prediction modes determined by a predefined method).

A method of obtaining a probability value for each intra prediction mode at each leaf node of a random tree through a random forest will be described with reference to the following drawings.

Referring to FIG. 25, a probability value of a class corresponding to an intra prediction mode may be stored for each leaf node of each random tree. Here, the probability value for each class may be accumulated using test sample data.

A probability value for each intra prediction mode may be obtained at each leaf node, and an MPM mode set may be configured by selecting intra prediction modes having high probabilities, or by adding the selected modes to all or part of an existing MPM mode set.

Embodiments 1 to 5 described above may be applied independently, or some or all of the embodiments 1 to 5 may be applied in combination.

The encoder / decoder searches for a leaf node by sequentially performing, starting from the root node, an operation of selecting a child node using the split function stored in each node of a decision tree having a binary tree structure (S2601).

As described above, the encoder / decoder may search for leaf nodes by selecting child nodes from the root node to the leaf node of the decision tree in which the splitting parameter is determined through the learning procedure in the same manner as described above with reference to FIG. 11.
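The leaf node search may be sketched as follows. The sketch assumes a node structure in which internal nodes hold a split function returning 1 (left child) or 0 (right child) and leaf nodes hold the class probabilities accumulated during learning. The Node type and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Node:
    split: Optional[Callable[[dict], int]] = None   # None marks a leaf node
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    class_probs: Optional[Dict[int, float]] = None  # set on leaf nodes only


def find_leaf(root, reference_samples):
    """Walk from the root to a leaf, choosing a child at every internal node."""
    node = root
    while node.split is not None:
        # Output 1 selects the left child, output 0 the right child.
        node = node.left if node.split(reference_samples) == 1 else node.right
    return node


def most_probable_class(leaf):
    """Class with the highest stored probability in the reached leaf node."""
    return max(leaf.class_probs, key=leaf.class_probs.get)
```

Because the same tree and the same reference samples are available to both the encoder and the decoder, both reach the same leaf node and therefore the same class.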

In addition, as described above, the splitting function is composed of split parameters, and may take as an input a reference sample neighboring the current block and output a value corresponding to a left child node or a right child node.

In addition, as described above, the split parameter may be learned at each node of the decision tree (or random tree) so as to maximize the reduction in uncertainty, i.e., the information gain, calculated when the node is split into a left child node and a right child node.

As described above with reference to FIG. 10 and Equation 2, the partition parameter may be determined using two reference samples among reference samples neighboring the current block.

Multiple reference sample lines can be referenced to generate predictive samples of the current block. In this case, as described above with reference to FIGS. 12 to 14 and Equations 3 to 5, the splitting parameter may be determined using a plurality of reference samples of two reference sample lines adjacent to the current block among the multiple reference sample lines.

In addition, when referring to multiple reference sample lines to generate a prediction sample of the current block, the splitting parameter may be determined using a plurality of reference samples among the multiple reference sample lines.

In addition, as described above with reference to FIG. 15 and Equation 6, the division parameter may be determined using the difference between the sample value of a specific reference sample of the multiple reference sample lines and the sample value of the reference sample adjacent to the right of the specific reference sample in the horizontal direction, and the difference between the sample value of the specific reference sample and the sample value of the reference sample adjacent below the specific reference sample in the vertical direction.

In addition, as described above with reference to FIGS. 16 and 17 and Equation 8, the division parameter may be determined using a sum of the difference values between the sample value of each reference sample of the multiple reference sample lines and the sample values of the reference samples adjacent in a specific angular direction.

The encoder / decoder selects, in the leaf node, a class to which one or more intra prediction modes are mapped (S2602).

As described above, the probability information of the class may be stored in each leaf node of the decision tree, and the class output through searching in the decision tree may be determined as the most probable class among the classes stored in the leaf node.

In addition, the random forest may consist of one or more decision trees (or random trees). When a plurality of classes are selected by searching each decision tree of a random forest composed of a plurality of decision trees, the encoder / decoder may determine the class most selected among the plurality of classes as the class of the random forest.

As described above, the encoder / decoder divides (or searches) from the root node of each random tree constituting the random forest to a leaf node, stores the class information (or intra prediction mode) of the reached leaf node, and may use it in the vote for the final class determination (or determination of the intra prediction mode of the current block).

If a random forest consists of a plurality of random trees, the class (or intra prediction mode) determined in each random tree is used for voting, and the class (or intra prediction mode) with the most votes in the random forest may be determined as the optimal class (or the intra prediction mode of the current block).

In addition, one or more intra prediction modes may be mapped to each class to improve accuracy of prediction mode determination. In other words, when a plurality of intra prediction modes are mapped to a class, the encoder / decoder may decode index information indicating the intra prediction mode of the current block in the class.

That is, by determining a class to which a plurality of prediction modes are mapped and signaling an index indicating the intra prediction mode of the current block within the determined class, the number of bits used to represent the intra prediction mode can be reduced.

The encoder / decoder derives the intra prediction mode of the current block by using the class (S2603).

As described above, when one intra prediction mode is mapped for each class, the encoder / decoder may derive the intra prediction mode of the current block by using the determined class.

On the other hand, when a plurality of intra prediction modes are mapped to a class, the encoder / decoder may decode index information indicating the intra prediction mode of the current block in the class, and derive the intra prediction mode of the current block within the selected class using the index information.

When a plurality of classes are selected by searching each decision tree of a random forest composed of a plurality of decision trees, the encoder / decoder may determine the class selected most frequently among the plurality of classes as the class of the random forest and derive the intra prediction mode of the current block using that class.

When the intra prediction modes are divided into one or more layers, the encoder / decoder may derive the intra prediction mode of the current block hierarchically by selecting a class from a leaf node of a decision tree in the random forest determined for each layer.

In other words, the encoder / decoder may search for a leaf node in a first decision tree to determine an intra prediction mode group consisting of a plurality of intra prediction modes that can be applied to the current block, and may search for a leaf node in a second decision tree determined according to the intra prediction mode group to select a class to which one or more intra prediction modes are mapped within the intra prediction mode group.

In this case, each random forest or decision tree used hierarchically may be composed of a decision tree of the same type, or may be composed of a decision tree of different types.

The type of random forest or decision tree used in the current layer may be determined according to the class determined through the random forest in the previous layer.

The encoder / decoder generates prediction samples of the current block based on the intra prediction mode (S2604).

As described above with reference to FIGS. 5 and 6, the encoder / decoder may derive an intra prediction mode of the current block and configure reference samples to be used for prediction using the samples neighboring the current block. If some of the samples neighboring the current block have not yet been decoded or are not available, the encoder / decoder may configure the reference samples to be used for prediction by substituting the unavailable samples with available samples. The encoder / decoder may perform filtering of the reference samples based on the intra prediction mode. The encoder / decoder may generate a prediction sample for the current block based on the intra prediction mode and the reference samples.

In FIG. 27, the intra prediction unit is illustrated as one block for convenience of description, but the intra prediction unit may be implemented as a configuration included in the encoder and / or the decoder.

Referring to FIG. 27, the intra prediction unit implements the functions, processes, and / or methods proposed in FIGS. 5 to 26. In detail, the intra prediction unit may include a leaf node search unit 2701, a class selector 2702, a prediction mode derivation unit 2703, and a prediction sample generator 2704.

The leaf node search unit 2701 searches for a leaf node in a decision tree having a binary tree structure by sequentially selecting a child node, starting from the root node, using the split function stored in each node.

As described above, the leaf node search unit 2701 may search for the leaf node by selecting child nodes from the root node down to a leaf node of the decision tree whose split parameters have been determined through the learning procedure, in the same manner as described above with reference to FIG. 11.
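The root-to-leaf search can be sketched as below. The `Node` layout is hypothetical, and the split function shown (the difference of two reference samples compared against a threshold) is one assumed form, chosen because the split parameter may be determined from two reference samples; the learned split functions may differ.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Split parameters of an internal node (hypothetical layout).
    idx_a: int = 0
    idx_b: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Class stored in a leaf node.
    class_id: Optional[int] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def split(node: Node, ref: Sequence[int]) -> int:
    """Return 0 for the left child or 1 for the right child."""
    return 0 if ref[node.idx_a] - ref[node.idx_b] < node.threshold else 1


def find_leaf(root: Node, ref: Sequence[int]) -> Node:
    """Walk from the root, applying each node's split function, until a leaf."""
    node = root
    while not node.is_leaf():
        node = node.left if split(node, ref) == 0 else node.right
    return node


# A toy depth-2 tree with made-up parameters.
toy_tree = Node(
    idx_a=0, idx_b=3, threshold=0,
    left=Node(class_id=0),
    right=Node(idx_a=1, idx_b=2, threshold=10,
               left=Node(class_id=1), right=Node(class_id=2)),
)
```

With this toy tree, `find_leaf(toy_tree, [40, 35, 30, 10]).class_id` evaluates to `1`: the root sends the input right (40 - 10 is not below 0), and the second node sends it left (35 - 30 is below 10).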

The class selector 2702 selects a class to which one or more intra prediction modes are mapped in the leaf node.

In addition, the random forest may consist of one or more decision trees (or random trees). When a plurality of classes are selected by searching each decision tree of a random forest composed of a plurality of decision trees, the class selector 2702 may determine the class selected most frequently among the plurality of classes as the class of the random forest.

As described above, the class selector 2702 may traverse each random tree constituting the random forest from its root node to a leaf node and store the class information (or intra prediction mode) of the leaf node reached. The stored information may be used in voting for the final class decision (or for determining the intra prediction mode of the current block).
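The voting step can be sketched as a simple majority count. Here each tree is modeled as a callable that returns the class of the leaf it reaches; the function name `forest_class` and the tie-breaking rule (lowest class id wins) are assumptions, since the text does not say how ties are resolved.

```python
from collections import Counter
from typing import Callable, Sequence

# A tree is modeled as a callable from reference samples to a class id (assumption).
Tree = Callable[[Sequence[int]], int]


def forest_class(trees: Sequence[Tree], ref_samples: Sequence[int]) -> int:
    """Return the class chosen by the largest number of trees."""
    votes = Counter(tree(ref_samples) for tree in trees)
    best = max(votes.values())
    # Tie-break on the lowest class id (assumed rule, not from the source).
    return min(cls for cls, count in votes.items() if count == best)
```

For instance, three trees voting for classes 2, 5, and 2 give class 2 as the class of the forest.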

In addition, one or more intra prediction modes may be mapped to each class to improve accuracy of prediction mode determination. In other words, when a plurality of intra prediction modes are mapped to a class, the class selector 2702 may decode index information indicating the intra prediction mode of the current block in the class.

The prediction mode derivation unit 2703 derives the intra prediction mode of the current block by using the class.

As described above, when one intra prediction mode is mapped for each class, the prediction mode derivation unit 2703 may derive the intra prediction mode of the current block by using the determined class.

On the other hand, when a plurality of intra prediction modes are mapped to a class, the prediction mode derivation unit 2703 may decode index information indicating the intra prediction mode of the current block within the class and use the index information to derive the intra prediction mode of the current block within the selected class.

When a plurality of classes are selected by searching each decision tree of a random forest composed of a plurality of decision trees, the prediction mode derivation unit 2703 may determine the most frequently selected class among the plurality of classes as the class of the random forest and use that class to derive the intra prediction mode of the current block.

When the intra prediction modes of the current block are divided into one or more layers, the prediction mode derivation unit 2703 may hierarchically derive the intra prediction mode of the current block by selecting, for each layer, a class from a leaf node of a decision tree in the random forest determined for that layer.

In other words, the prediction mode derivation unit 2703 may search for a leaf node in a first decision tree to determine an intra prediction mode group including a plurality of intra prediction modes that can be applied to the current block, and may then search for a leaf node in a second decision tree, determined according to the intra prediction mode group, to select a class to which one or more intra prediction modes in the intra prediction mode group are mapped.
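The two cases (one mode per class, or several modes per class plus a decoded index) can be sketched as follows. The names `derive_intra_mode` and `read_index` are hypothetical; `read_index` stands in for whatever bitstream parsing routine decodes the index, and the mode numbers are placeholders.

```python
from typing import Callable, Dict, List


def derive_intra_mode(
    class_id: int,
    class_to_modes: Dict[int, List[int]],
    read_index: Callable[[int], int],
) -> int:
    """Derive the intra prediction mode from the selected class.

    read_index(n) is assumed to decode an index in [0, n) from the bitstream.
    It is only called when the class maps to more than one mode.
    """
    modes = class_to_modes[class_id]
    if len(modes) == 1:
        # One mode per class: no index needs to be signaled.
        return modes[0]
    return modes[read_index(len(modes))]
```

Under this sketch, the index costs bits only for classes that hold several modes, so a class with a single mode determines the mode without any additional signaling.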

The prediction sample generator 2704 generates a prediction sample of the current block based on the intra prediction mode.

As described above with reference to FIGS. 5 and 6, the prediction sample generator 2704 may derive the intra prediction mode of the current block and configure the reference samples to be used for prediction from the samples neighboring the current block. If some of the samples neighboring the current block have not yet been decoded or are otherwise unavailable, the prediction sample generator 2704 may configure the reference samples by substituting available samples for the unavailable ones. The prediction sample generator 2704 may filter the reference samples based on the intra prediction mode, and may then generate a prediction sample for the current block based on the intra prediction mode and the reference samples.

The embodiments described above combine the components and features of the present invention in a predetermined form. Each component or feature is to be considered optional unless stated otherwise. Each component or feature may be embodied in a form that is not combined with other components or features. It is also possible to combine some of the components and / or features to form an embodiment of the invention. The order of the operations described in the embodiments of the present invention may be changed. Some components or features of one embodiment may be included in another embodiment or may be replaced with corresponding components or features of another embodiment. It is obvious that claims that do not have an explicit citation relationship may be combined to form an embodiment, or may be included as new claims by amendment after filing.

Embodiments according to the present invention may be implemented by various means, for example, hardware, firmware, software, or a combination thereof. In the case of a hardware implementation, an embodiment of the present invention may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and the like.

In the case of implementation by firmware or software, an embodiment of the present invention may be implemented in the form of a module, procedure, function, etc. that performs the functions or operations described above. The software code may be stored in memory and driven by the processor. The memory may be located inside or outside the processor, and may exchange data with the processor by various known means.

It will be apparent to those skilled in the art that the present invention may be embodied in other specific forms without departing from the essential features of the present invention. Accordingly, the above detailed description should not be construed as limiting in all aspects and should be considered as illustrative. The scope of the invention should be determined by reasonable interpretation of the appended claims, and all changes within the equivalent scope of the invention are included in the scope of the invention.

As mentioned above, preferred embodiments of the present invention are disclosed for purposes of illustration, and those skilled in the art may improve, change, replace, or add to them in various other embodiments within the spirit and technical scope of the present invention disclosed in the appended claims.

Claims

A method of decoding an image based on an intra prediction mode, the method comprising:

Searching for a leaf node in a decision tree having a binary tree structure by sequentially selecting a child node, starting from a root node, using a split function stored in each node;

Selecting a class to which the leaf node and at least one intra prediction mode are mapped;

Deriving an intra prediction mode of a current block using the class; and

Generating a prediction sample of the current block based on the intra prediction mode,

Wherein the split function comprises a split parameter and outputs a value corresponding to a left child node or a right child node, taking as input a reference sample neighboring the current block.
The method of claim 1,

Wherein the split parameter is learned so as to maximize the reduction in uncertainty calculated when a node of each decision tree is split into a left child node or a right child node.
The method of claim 1,

Further comprising, when a plurality of classes are selected by searching for a leaf node of each decision tree in a random forest composed of a plurality of decision trees, determining a class selected from among the plurality of classes as the class of the random forest,

Wherein the intra prediction mode of the current block is derived using the class of the random forest.
The method of claim 1,

Wherein the split parameter is determined using two reference samples among the reference samples neighboring the current block.
The method of claim 1,

Wherein, when multiple reference sample lines are referred to in order to generate the prediction sample of the current block,

The split parameter is determined using a plurality of reference samples among the multiple reference sample lines.
The method of claim 1,

Wherein, when multiple reference sample lines are referred to in order to generate the prediction sample of the current block,

The split parameter is determined using a plurality of reference samples of two reference sample lines adjacent to the current block among the multiple reference sample lines.
The method of claim 1,

Wherein, when multiple reference sample lines are referred to in order to generate the prediction sample of the current block,

The split parameter is determined using a difference value between a sample value of a specific reference sample of the multiple reference sample lines and a sample value of a reference sample adjacent to the right side of the specific reference sample in a horizontal direction, and a difference value between the sample value of the specific reference sample and a sample value of a reference sample adjacent to a lower end of the specific reference sample in a vertical direction.
The method of claim 1,

Wherein, when multiple reference sample lines are referred to in order to generate the prediction sample of the current block,

The split parameter is determined using a sum of differences between a sample value of each reference sample of the multiple reference sample lines and a sample value of a reference sample adjacent to it in a specific angular direction.
The method of claim 1,

Further comprising, when a plurality of intra prediction modes are mapped to the class, decoding index information indicating the intra prediction mode of the current block within the class,

Wherein the intra prediction mode of the current block is derived using the class and the index information.
The method of claim 1,

Wherein selecting the class to which the at least one intra prediction mode is mapped comprises:

Searching for a leaf node in a first decision tree to determine an intra prediction mode group consisting of a plurality of intra prediction modes that can be applied to the current block; and

Searching for a leaf node in a second decision tree determined according to the intra prediction mode group, and selecting a class to which one or more intra prediction modes in the intra prediction mode group are mapped.
An apparatus for decoding an image based on an intra prediction mode, the apparatus comprising:

A leaf node search unit configured to search for a leaf node in a decision tree having a binary tree structure by sequentially selecting a child node, starting from a root node, using a split function stored in each node;

A class selector configured to select a class to which the leaf node and at least one intra prediction mode are mapped;

A prediction mode derivation unit configured to derive an intra prediction mode of a current block by using the class; and

A prediction sample generator configured to generate a prediction sample of the current block based on the intra prediction mode,

Wherein the split function comprises a split parameter and outputs a value corresponding to a left child node or a right child node, taking as input a reference sample neighboring the current block.
A method of encoding an image based on an intra prediction mode, the method comprising:

Searching for a leaf node in a decision tree having a binary tree structure by sequentially selecting a child node, starting from a root node, using a split function stored in each node;

Selecting a class to which the leaf node and at least one intra prediction mode are mapped; and

Encoding an intra prediction mode of a current block using the class,

Wherein the split function comprises a split parameter and outputs a value corresponding to a left child node or a right child node, taking as input a reference sample neighboring the current block.
The method of claim 12,

Wherein the split parameter is learned so as to maximize the reduction in uncertainty calculated when each node of the decision tree is split into a left child node or a right child node.
The method of claim 12,

Wherein the split parameter is determined using one reference sample among the reference samples neighboring the current block and one prediction sample among the prediction samples of the current block.
The method of claim 12,

Wherein the split parameter is determined using at least one reference sample among the reference samples neighboring the current block and at least one prediction sample among the prediction samples of the current block.
The method of claim 12,

Wherein the split parameter is determined using a difference value between a sample value of a specific prediction sample among the prediction samples of the current block and a sample value of a prediction sample adjacent to the right side of the specific prediction sample in a horizontal direction, and a difference value between the sample value of the specific prediction sample and a sample value of a prediction sample adjacent to a lower end of the specific prediction sample in a vertical direction.
The method of claim 12,

Wherein the split parameter is determined using a sum of differences between a sample value of each prediction sample of the current block and a sample value of a prediction sample adjacent to it in a specific angular direction.