CN110476425B - Prediction method and device based on block form - Google Patents

Prediction method and device based on block form

Info

Publication number
CN110476425B
CN110476425B CN201880020454.1A
Authority
CN
China
Prior art keywords
block
target block
prediction
partition
blocks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201880020454.1A
Other languages
Chinese (zh)
Other versions
CN110476425A (en)
Inventor
李镇浩
姜晶媛
高玄硕
林成昶
全东山
李河贤
赵承炫
金晖容
崔海哲
权大赫
金在坤
白雅兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronics and Telecommunications Research Institute ETRI
Industry Academic Cooperation Foundation of Hanbat National University
University Industry Cooperation Foundation of Korea Aerospace University
Original Assignee
Electronics and Telecommunications Research Institute ETRI
Industry Academic Cooperation Foundation of Hanbat National University
University Industry Cooperation Foundation of Korea Aerospace University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electronics and Telecommunications Research Institute ETRI, Industry Academic Cooperation Foundation of Hanbat National University, University Industry Cooperation Foundation of Korea Aerospace University filed Critical Electronics and Telecommunications Research Institute ETRI
Priority to CN202311484265.9A priority Critical patent/CN117255198A/en
Priority to CN202311482423.7A priority patent/CN117255197A/en
Priority to CN202311481738.XA priority patent/CN117255196A/en
Priority claimed from PCT/KR2018/003392 external-priority patent/WO2018174617A1/en
Publication of CN110476425A publication Critical patent/CN110476425A/en
Application granted granted Critical
Publication of CN110476425B publication Critical patent/CN110476425B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/105Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/11Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/129Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Abstract

The present application relates to a decoding method, a decoding apparatus, an encoding method, and an encoding apparatus for video. When video is encoded and decoded, a plurality of partition blocks are generated by dividing a target block to be predicted. A prediction mode is derived for at least some of the plurality of partition blocks, and prediction may be performed on the plurality of partition blocks based on the derived prediction mode. In performing prediction for a partition block, information associated with the target block may be used, and information associated with other partition blocks that have been predicted before the partition block may also be used.

Description

Prediction method and device based on block form
Technical Field
The following embodiments relate generally to a video decoding method and apparatus and a video encoding method and apparatus, and more particularly, to a method and apparatus for performing prediction based on a shape of a block in encoding and decoding video.
The present application claims the benefit of Korean Patent Application No. 10-2017-0036257, filed on March 22, 2017, and Korean Patent Application No. 10-2017-0155097, filed on November 20, 2017, which are incorporated herein by reference in their entirety.
Background
With the continued development of the information and communication industry, broadcast services supporting High Definition (HD) resolution have spread throughout the world. As a result, a large number of users have become accustomed to high-resolution, high-definition images and/or videos.
In order to meet the demand of users for high definition, a large number of institutions have accelerated the development of next-generation imaging devices. In addition to the growing interest of users in High Definition TV (HDTV) and Full High Definition (FHD) TV, interest in Ultra High Definition (UHD) TV, whose resolution is more than four times that of FHD TV, has also increased. With this increasing interest, image encoding/decoding techniques for images having higher resolution and higher definition are continuously required.
The image encoding/decoding apparatus and method may use an inter prediction technique, an intra prediction technique, an entropy encoding technique, or the like in order to perform encoding/decoding on high resolution and high definition images. The inter prediction technique may be a technique for predicting values of pixels included in a current picture using a temporally preceding picture and/or a temporally following picture. The intra prediction technique may be a technique for predicting a value of a pixel included in a current picture using information about the pixel in the current picture. Entropy coding techniques may be techniques for assigning short codewords to frequently occurring symbols and assigning long codewords to rarely occurring symbols.
Various prediction methods have been developed to improve the efficiency and accuracy of intra-and/or inter-prediction. For example, a block may be divided for efficient prediction, and prediction may be performed for each block generated by the division. The prediction efficiency may vary greatly depending on whether the block is divided or not.
Disclosure of Invention
Technical problem
Embodiments are directed to an encoding apparatus and method, and a decoding apparatus and method, which divide a block based on the size and/or shape of the block and derive a prediction mode for each partition block generated by the division.
Embodiments are directed to an encoding apparatus and method and a decoding apparatus and method that perform prediction on each partition block according to a derived prediction mode.
Solution scheme
According to an aspect, there is provided an encoding method including: generating a plurality of partition blocks by dividing a target block; deriving a prediction mode for at least some of the plurality of partition blocks; and performing prediction on the plurality of partition blocks based on the derived prediction mode.
According to another aspect, there is provided a decoding method including: generating a plurality of partition blocks by dividing a target block; deriving a prediction mode for at least some of the plurality of partition blocks; and performing prediction on the plurality of partition blocks based on the derived prediction mode.
Whether to divide the target block may be determined based on information related to the target block.
It may be determined whether to divide the target block and what type of division is to be used based on the block division indicator.
The target block may be partitioned based on the size of the target block.
The target block may be partitioned based on its shape.
The prediction mode may be derived for a specific partition block among the plurality of partition blocks.
The specific partition block may be a partition block located at a specific position among the plurality of partition blocks.
The prediction mode derived for the specific partition block may be used for the remaining partition blocks, other than the specific partition block, among the plurality of partition blocks.
A prediction mode determined by combining the prediction mode derived for the specific partition block with an additional prediction mode may be used for the remaining partition blocks, other than the specific partition block, among the plurality of partition blocks.
A Most Probable Mode (MPM) list may be used for the derivation of the prediction modes.
The MPM list may include a plurality of MPM lists.
The MPM candidate patterns in the plurality of MPM lists may not overlap each other.
The MPM list may be configured for a particular unit.
The particular unit may be a target block.
The MPM list for the plurality of partition blocks may be configured based on one or more reference blocks for the target block.
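Purely as an illustration, the Python sketch below shows one way an MPM list could be populated from the intra prediction modes of neighboring reference blocks; the list length of three, the default modes, and the function name are assumptions made for this example rather than details taken from this disclosure.

```python
# Hypothetical sketch of building a Most Probable Mode (MPM) list from the
# intra prediction modes of a left and an above reference block.
# Mode numbering (0 = planar, 1 = DC, 2.. = angular) and the list length
# of 3 are assumptions for illustration only.

PLANAR, DC = 0, 1

def build_mpm_list(left_mode, above_mode):
    """Return a list of 3 non-duplicated candidate intra prediction modes."""
    mpm = []
    for mode in (left_mode, above_mode, PLANAR, DC, 2):
        if mode is not None and mode not in mpm:
            mpm.append(mode)
        if len(mpm) == 3:
            break
    return mpm

# Example: the left block uses angular mode 18, the above block uses DC.
print(build_mpm_list(18, DC))   # -> [18, 1, 0]
```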
A prediction mode derived for a first block among the plurality of blocks may be used to predict a second block among the plurality of blocks.
The reconstructed pixels of the first block may be used as reference samples for predicting the second block.
The reference samples used to predict the plurality of partition blocks may be reconstructed pixels adjacent to the target block.
The prediction mode may be derived for a lowest block or a rightmost block among the plurality of partitions.
A reconstructed pixel adjacent to the top of the target block may be used as a reference pixel for predicting the lowest block.
The prediction may be performed on the plurality of partition blocks in a predefined order.
The predefined order may be an order from the lowest block to the uppermost block, an order from the rightmost block to the leftmost block, an order in which the blocks ranging from the uppermost block to the second block from the bottom are selected first and then selected sequentially, or an order in which the blocks ranging from the rightmost block to the second block from the right are selected first and then selected sequentially.
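The partition-and-predict flow described above can be pictured with a small sketch. In the following Python fragment, the array-based block representation, the two-way horizontal split, and the toy prediction rule are assumptions made only for this illustration; it simply splits a target block into horizontal partition blocks and visits them from the lowest block to the uppermost block.

```python
# Illustrative sketch: divide a target block into horizontal partition blocks
# and visit them in a "lowest block first, then upward" order.
# The numpy-based block representation and predict_partition() placeholder
# are assumptions made only for this example.
import numpy as np

def split_horizontally(target_block, num_partitions):
    """Split an HxW target block into num_partitions horizontal partition blocks."""
    return np.vsplit(target_block, num_partitions)

def predict_partition(partition_block, reference_row):
    """Toy vertical prediction: repeat the reference row over the partition block."""
    return np.tile(reference_row, (partition_block.shape[0], 1))

target = np.arange(8 * 4).reshape(8, 4)          # an 8x4 target block
top_reference = np.array([10, 20, 30, 40])       # reconstructed pixels above the target block

partitions = split_horizontally(target, 2)       # two 4x4 partition blocks
# Predict from the lowest partition block to the uppermost one.
for part in reversed(partitions):
    prediction = predict_partition(part, top_reference)
    print(prediction.shape)                       # (4, 4) for each partition block
```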
According to another aspect, there is provided a decoding method including: deriving a prediction mode; generating a plurality of sub-blocks by dividing a target block; and performing prediction on the plurality of partitions based on the derived prediction modes.
Advantageous effects
Provided are an encoding apparatus and method, and a decoding apparatus and method, for dividing a block based on the size and/or shape of the block and deriving a prediction mode for each partition block generated by the division.
Provided are an encoding apparatus and method and a decoding apparatus and method for performing prediction on each partition block according to a derived prediction mode.
Drawings
Fig. 1 is a block diagram showing a configuration of an embodiment of an encoding apparatus to which the present disclosure is applied;
fig. 2 is a block diagram showing a configuration of an embodiment of a decoding apparatus to which the present disclosure is applied;
fig. 3 is a diagram schematically showing a partition structure of an image when the image is encoded and decoded;
fig. 4 is a diagram showing a form of a Prediction Unit (PU) that a Coding Unit (CU) can include;
fig. 5 is a diagram illustrating a form of a Transform Unit (TU) that can be included in a CU;
fig. 6 is a diagram for explaining an embodiment of an intra prediction process;
Fig. 7 is a diagram for explaining the positions of reference samples used in an intra prediction process;
fig. 8 is a diagram for explaining an embodiment of an inter prediction process;
FIG. 9 illustrates spatial candidates according to an embodiment;
fig. 10 illustrates an order of adding motion information of spatial candidates to a merge list according to an embodiment;
FIG. 11 illustrates a transform and quantization process according to an example;
fig. 12 is a configuration diagram of an encoding apparatus according to an embodiment;
fig. 13 is a configuration diagram of a decoding apparatus according to an embodiment;
FIG. 14 is a flow chart of a prediction method according to an embodiment;
FIG. 15 is a flow chart of a block partitioning method according to an embodiment;
FIG. 16 illustrates an 8×4 target block according to an example;
FIG. 17 illustrates a 4×4 partition block according to an example;
FIG. 18 illustrates a 4×16 target block according to an example;
FIG. 19 illustrates 8×4 partition blocks according to an example;
FIG. 20 illustrates a 4×4 partition block according to an example;
FIG. 21 is a flow chart of a method for deriving a prediction mode for a partition according to an example;
FIG. 22 illustrates prediction of a bisected block according to an example;
FIG. 23 illustrates prediction of a partition using a reconstructed block of the partition according to an example;
FIG. 24 illustrates prediction of a block of segments using external reference pixels for the block of segments, according to an example;
FIG. 25 illustrates prediction of four partition blocks according to an example;
FIG. 26 illustrates prediction of a first block after prediction is performed on a fourth block, according to an example;
FIG. 27 illustrates prediction of a second block according to an example;
FIG. 28 illustrates prediction of a third block according to an example;
FIG. 29 is a flow chart of a prediction method according to an embodiment;
FIG. 30 illustrates derivation of a prediction mode for a target block according to an example;
fig. 31 is a flowchart illustrating a target block prediction method and a bit stream generation method according to an embodiment;
fig. 32 is a flowchart illustrating a target block prediction method using a bitstream according to an embodiment;
FIG. 33 illustrates partitioning of upper layer blocks according to an example;
FIG. 34 illustrates partitioning of target blocks according to an example;
fig. 35 is a signal flowchart illustrating an image encoding and decoding method according to an embodiment.
Best mode for carrying out the invention
The present invention is susceptible to various modifications and alternative embodiments, and specific embodiments thereof are described in detail below with reference to the accompanying drawings. It should be understood, however, that the examples are not intended to limit the invention to the particular forms disclosed, but to include all changes, equivalents, or modifications that are within the spirit and scope of the invention.
The following exemplary embodiments will be described in detail with reference to the accompanying drawings showing specific embodiments. These embodiments are described so that those of ordinary skill in the art to which the present disclosure pertains will be readily able to practice them. It should be noted that the various embodiments are different from each other, but need not be mutually exclusive. For example, the specific shapes, structures, and characteristics described herein may be implemented as other embodiments without departing from the spirit and scope of the various embodiments associated with one embodiment. Further, it is to be understood that the location or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the embodiments. Accordingly, the following detailed description is not intended to limit the scope of the disclosure, and the scope of the exemplary embodiments is defined only by the appended claims and their equivalents (as long as they are properly described).
In the drawings, like reference numerals are used to designate the same or similar functions in all respects. The shapes, sizes, etc. of components in the drawings may be exaggerated to make the description clear.
Terms such as "first" and "second" may be used to describe various components, but the components are not limited by the terms. The terms are used only to distinguish one component from another. For example, a first component may be referred to as a second component without departing from the scope of the present description. Similarly, the second component may be referred to as a first component. The term "and/or" may include a combination of a plurality of related descriptive items or any of a plurality of related descriptive items.
It will be understood that when an element is referred to as being "connected" or "coupled" to another element, the two elements can be directly connected or coupled to each other or intervening elements may be present between the two elements. It will be understood that when components are referred to as being "directly connected or coupled," there are no intervening components between the two components.
Furthermore, the components described in the embodiments are shown separately to represent different characteristic functions, but this does not mean that each component is formed of one separate hardware or software. That is, for convenience of description, a plurality of components are individually arranged and included. For example, at least two of the plurality of components may be integrated into a single component. Instead, one component may be divided into a plurality of components. Embodiments in which multiple components are integrated or embodiments in which some components are separated are included in the scope of the present specification as long as they do not depart from the essence of the present specification.
Furthermore, it should be noted that, in the exemplary embodiment, the expression that describes a component "including" a specific component means that another component may be included within the scope of the practice or technical spirit of the exemplary embodiment, but does not exclude the presence of components other than the specific component.
The terminology used in the description presented herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. In this specification, it should be understood that terms such as "comprises" or "comprising" are only intended to indicate the presence of features, numbers, steps, operations, components, parts, or combinations thereof, but are not intended to exclude the possibility that one or more other features, numbers, steps, operations, components, parts, or combinations thereof will be present or added.
The embodiments will be described in detail below with reference to the drawings so that those skilled in the art to which the embodiments pertain can easily practice the embodiments. In the following description of the embodiments, a detailed description of known functions or configurations that are considered to obscure the gist of the present specification will be omitted. In addition, the same reference numerals are used to designate the same components throughout the drawings, and repeated descriptions of the same components will be omitted.
Hereinafter, "image" may represent a single picture that forms part of a video, or may represent the video itself. For example, "encoding and/or decoding an image" may mean "encoding and/or decoding a video" and may also mean "encoding and/or decoding any one of a plurality of images constituting a video".
Hereinafter, the terms "video" and "moving picture" may be used to have the same meaning and may be used interchangeably with each other.
Hereinafter, the target image may be an encoding target image that is a target to be encoded and/or a decoding target image that is a target to be decoded. Further, the target image may be an input image input to the encoding apparatus or an input image input to the decoding apparatus.
Hereinafter, the terms "image", "picture", "frame" and "screen" may be used to have the same meaning and may be used interchangeably with each other.
Hereinafter, the target block may be an encoding target block (i.e., a target to be encoded) and/or a decoding target block (i.e., a target to be decoded). Furthermore, the target block may be a current block, i.e., a target that is currently to be encoded and/or decoded. Here, the terms "target block" and "current block" may be used to have the same meaning and may be used interchangeably with each other.
Hereinafter, the terms "block" and "unit" may be used to have the same meaning and may be used interchangeably with each other. Alternatively, "block" may represent a particular unit.
Hereinafter, the terms "region" and "section" are used interchangeably with each other.
Hereinafter, the specific signal may be a signal indicating a specific block. For example, the original signal may be a signal indicating a target block. The prediction signal may be a signal indicating a prediction block. The residual signal may be a signal indicating a residual block.
In the following embodiments, specific information, data, flags, elements, and attributes may have their respective values. The value "0" corresponding to each of the information, data, flags, elements, and attributes may indicate a logical false or a first predefined value. In other words, the values "0", false, logical false and first predefined value may be used interchangeably with each other. The value "1" corresponding to each of the information, data, flags, elements, and attributes may indicate a logical true or a second predefined value. In other words, the values "1", true, logical true and second predefined values may be used interchangeably with each other.
When a variable such as i or j is used to indicate a row, column, or index, the value i may be an integer of 0 or an integer greater than 0, or may be an integer of 1 or an integer greater than 1. In other words, in an embodiment, each of the rows, columns, and indexes may be counted starting from 0, or may be counted starting from 1.
Next, terms to be used in the embodiments will be described.
An encoder: the encoder represents means for performing encoding.
A decoder: the decoder represents means for performing decoding.
A unit: "unit" may mean a unit of image encoding and decoding. The terms "unit" and "block" may be used with the same meaning and are used interchangeably with each other.
"Unit" may be an array of M×N samples. M and N may each be positive integers. The term "unit" may generally refer to a two-dimensional (2D) array of samples.
During the encoding and decoding of an image, a "unit" may be a region created by partitioning an image. A single image may be partitioned into multiple units. Alternatively, one image may be partitioned into sub-portions, and a unit may represent each partitioned sub-portion when encoding or decoding is performed on the partitioned sub-portion.
In the encoding and decoding of the image, a predefined process may be performed for each unit according to the type of unit.
Unit types can be classified into macroblock units, coding Units (CUs), prediction Units (PUs), residual units, transform Units (TUs), etc., according to functions. Alternatively, units may represent blocks, macro blocks, coding Tree Units (CTUs), coding tree blocks, coding units, coding blocks, prediction units, prediction blocks, residual units, residual blocks, transform units, transform blocks, etc., according to functions.
The term "unit" may denote information that includes a luminance (luma) component block, the chrominance (chroma) component blocks corresponding to the luma component block, and the syntax elements for the respective blocks; in this sense, the term "unit" may be used to distinguish a unit from a block.
The size and shape of units may be implemented in various ways. Further, a unit may have any of a variety of sizes and shapes. In particular, the shape of a unit may include not only a square, but also other geometric shapes (such as a rectangle, trapezoid, triangle, and pentagon) that can be represented in two dimensions (2D).
Furthermore, the unit information may comprise one or more of the type of unit (indicating coding unit, prediction unit, residual unit or transform unit), the size of the unit, the depth of the unit, the order of encoding and decoding of the unit, etc.
One unit may be partitioned into sub-units, each sub-unit having a smaller size than the size of the associated unit.
Unit depth: The unit depth may represent the degree to which a unit is partitioned. Further, the unit depth may indicate the level at which the corresponding unit exists when the unit is represented in a tree structure.
Unit partition information may include a unit depth indicating the depth of the unit. The unit depth may indicate the number of times the unit is partitioned and/or the degree to which the unit is partitioned.
In a tree structure, the depth of the root node may be considered to be the minimum, and the depth of a leaf node may be considered to be the maximum.
A single unit may be hierarchically partitioned into a plurality of sub-units having tree-structure-based depth information. In other words, a unit and a sub-unit generated by partitioning the unit may correspond to a node and a child node of the node, respectively. Each partitioned sub-unit may have a unit depth. Since the unit depth indicates the number of times the unit is partitioned and/or the degree to which the unit is partitioned, the partition information of a sub-unit may include information about the size of the sub-unit.
In a tree structure, the top node may correspond to the initial node before partitioning. The top node may be referred to as the "root node". Further, the root node may have a minimum depth value. Here, the depth of the top node may be the level "0".
A node of depth level "1" may represent a unit generated when the initial unit is partitioned once. A node of depth level "2" may represent a unit generated when the initial unit is partitioned twice.
Leaf nodes of depth level "n" may represent units that are generated when an initial unit is partitioned n times.
The leaf node may be a bottom node, which cannot be partitioned further. The depth of the leaf node may be the maximum level. For example, the predefined value for the maximum level may be 3.
Sample: The sample may be the basic unit constituting a block. Depending on the bit depth (Bd), a sample may be represented by a value from 0 to 2^Bd - 1.
The samples may be pixels or pixel values.
In the following, the terms "pixel" and "sample" may be used with the same meaning and are used interchangeably with each other.
Coding Tree Unit (CTU): The CTU may be composed of a single luma component (Y) coding tree block and two chroma component (Cb, Cr) coding tree blocks associated with the luma component coding tree block. Further, the CTU may represent information including the above-described blocks and the syntax elements for each block.
Each Coding Tree Unit (CTU) may be partitioned using one or more partitioning methods, such as quadtrees and binary trees, to configure sub-units, such as coding units, prediction units, and transform units.
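As a rough illustration of quadtree partitioning of a CTU into sub-units (binary-tree splitting is omitted), the following Python sketch lists the resulting sub-unit positions, sizes, and depths; the CTU size of 64, the maximum depth of 3, and the split rule are assumptions for this example only.

```python
# Illustrative quadtree partitioning of a CTU. A unit of size s at depth d is
# split into four (s/2 x s/2) sub-units at depth d+1 until max_depth is reached.
# The CTU size, maximum depth, and split decision are assumptions for this sketch.

def quadtree_partition(x, y, size, depth, max_depth, split_decision):
    """Return (x, y, size, depth) leaf units of a quadtree partitioning."""
    if depth < max_depth and split_decision(size, depth):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += quadtree_partition(x + dx, y + dy, half,
                                             depth + 1, max_depth, split_decision)
        return leaves
    return [(x, y, size, depth)]

# Split every unit larger than 16x16; stop at depth 3.
leaves = quadtree_partition(0, 0, 64, 0, 3, lambda size, depth: size > 16)
print(len(leaves), leaves[0])   # 16 leaf units, the first being (0, 0, 16, 2)
```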
"CTU" may be used as a term designating a block of pixels as a processing unit in image decoding and encoding processes (as in the case of partitioning an input image).
Coding Tree Block (CTB): "CTB" may be used as a term designating any one of a Y coding tree block, a Cb coding tree block, and a Cr coding tree block.
Adjacent blocks: the neighboring block represents a block adjacent to the target block. The blocks adjacent to the target block may represent blocks whose boundaries are in contact with the target block or blocks located within a predetermined distance from the target block. The neighboring block may represent a block adjacent to the vertex of the target block. Here, a block adjacent to a vertex of the target block may represent a block vertically adjacent to an adjacent block horizontally adjacent to the target block or a block horizontally adjacent to an adjacent block vertically adjacent to the target block. The neighboring blocks may be reconstructed neighboring blocks.
Prediction unit: the prediction unit may be a basic unit for prediction such as inter prediction, intra prediction, inter compensation, intra compensation, and motion compensation.
A single prediction unit may be divided into a plurality of partitions or sub-prediction units having smaller sizes. The plurality of partitions may also be basic units in performing prediction or compensation. The partition generated by dividing the prediction unit may also be the prediction unit.
Prediction unit partitioning: the prediction unit partition may be a shape into which the prediction unit is divided.
Reconstructed neighboring units: The reconstructed neighboring units may be units that have already been decoded and reconstructed around the target unit.
The reconstructed neighboring units may be units spatially or temporally adjacent to the target unit.
The reconstructed spatial neighboring unit may be a unit included in the current picture that has been reconstructed by encoding and/or decoding.
The reconstructed temporal neighboring units may be units comprised in the reference image that have been reconstructed by encoding and/or decoding. The position of the reconstructed temporal neighboring unit in the reference image may be the same as the position of the target unit in the current picture or may correspond to the position of the target unit in the current picture.
Parameter set: the parameter set may be header information in the structure of the bitstream. For example, the parameter sets may include a sequence parameter set, a picture parameter set, an adaptive parameter set, and the like.
Rate-distortion optimization: the encoding device may use rate distortion optimization to provide high encoding efficiency by utilizing a combination of: the size of the Coding Unit (CU), the prediction mode, the size of the Prediction Unit (PU), the motion information, and the size of the Transform Unit (TU).
The rate-distortion optimization scheme may calculate the rate-distortion costs for each combination to select the optimal combination from among the combinations. The rate distortion cost may be calculated using equation 1 below. In general, the combination that minimizes the rate-distortion cost may be selected as the optimal combination under the rate-distortion optimization scheme.
[ equation 1]
D+λ*R
D may represent distortion. D may be the average of the squares of the differences (i.e. the mean square error) between the original transform coefficients and the reconstructed transform coefficients in the transform unit.
R may represent a code rate, which may represent a bit rate using the relevant context information.
λ represents the Lagrangian multiplier. R may include not only coding parameter information, such as a prediction mode, motion information, and a coding block flag, but also the bits generated due to the coding of transform coefficients.
The encoding device may perform processes such as inter-prediction and/or intra-prediction, transformation, quantization, entropy coding, dequantization (dequantization) and inverse transformation in order to calculate accurate D and R. These processes can greatly increase the complexity of the encoding device.
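The use of Equation 1 can be illustrated with a small numeric sketch; the candidate combinations and the value of the Lagrangian multiplier below are invented purely for this example, and only the cost formula D + λ*R comes from the text above.

```python
# Illustrative rate-distortion optimization: pick the candidate combination
# with the smallest cost D + lambda * R (Equation 1). The candidate list and
# the value of the Lagrangian multiplier are assumptions for this sketch.

candidates = [
    # (description,                distortion D (MSE), rate R (bits))
    ("64x64 CU, intra planar",     125.0,              320),
    ("four 32x32 CUs, intra DC",   80.0,               510),
    ("64x64 CU, inter merge",      95.0,               260),
]
lam = 0.35   # example Lagrangian multiplier

def rd_cost(distortion, rate, lam):
    return distortion + lam * rate

best = min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))
for name, d, r in candidates:
    print(f"{name}: cost = {rd_cost(d, r, lam):.1f}")
print("selected:", best[0])
```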
Bitstream: The bitstream may represent a stream of bits including encoded image information.
Parameter set: The parameter set may be header information in the structure of the bitstream.
The parameter set may include at least one of a Video Parameter Set (VPS), a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), and an Adaptive Parameter Set (APS). Further, the parameter set may include information about the stripe header and information about the parallel block header.
Analysis: parsing may be a decision on the value of a syntax element made by performing entropy decoding on a bitstream. Alternatively, the term "parsing" may refer to such entropy decoding itself.
The symbols: the symbol may be at least one of a syntax element, an encoding parameter, and a transform coefficient of the encoding target unit and/or the decoding target unit. Furthermore, the symbol may be a target of entropy encoding or a result of entropy decoding.
Reference picture: the reference picture may be an image that is referenced by a unit in order to perform inter prediction or motion compensation. Alternatively, the reference picture may be an image including a reference unit that is referenced by the target unit in order to perform inter prediction or motion compensation.
Hereinafter, the terms "reference picture" and "reference image" may be used to have the same meaning and may be used interchangeably with each other.
Reference picture list: the reference picture list may be a list including one or more reference pictures used for inter prediction or motion compensation.
Types of reference picture lists may include merged list (LC), list 0 (L0), list 1 (L1), list 2 (L2), list 3 (L3), etc.
One or more reference picture lists may be used for inter prediction.
Inter prediction indicator: the inter prediction indicator may indicate an inter prediction direction of the target unit. Inter prediction may be one of unidirectional prediction and bidirectional prediction. Alternatively, the inter prediction indicator may represent the number of reference pictures used to generate the prediction unit of the target unit. Alternatively, the inter prediction indicator may represent the number of prediction blocks used for inter prediction or motion compensation of the target unit.
Reference picture index: the reference picture index may be an index indicating a specific reference picture in the reference picture list.
Motion Vector (MV): the motion vector may be a 2D vector for inter prediction or motion compensation. The motion vector may represent an offset between the encoding/decoding target image and the reference image.
For example, an MV may be expressed as (mv_x, mv_y), where mv_x may indicate the horizontal component and mv_y may indicate the vertical component.
Search range: The search range may be a 2D region in which the search for an MV is performed during inter prediction. For example, the size of the search range may be M×N. M and N may each be positive integers.
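A motion vector search over a search range can be pictured with the sketch below; the image contents, the block size, the search range, and the SAD matching criterion are assumptions made only for this illustration.

```python
# Illustrative full search over a search range: find the 2D offset
# (mv_x, mv_y) in the reference image that best matches the target block.
# Image contents, block size, and the SAD criterion are assumptions for this sketch.
import numpy as np

def full_search(target_block, reference, center_x, center_y, search_range=4):
    h, w = target_block.shape
    best_mv, best_sad = (0, 0), float("inf")
    for mv_y in range(-search_range, search_range + 1):
        for mv_x in range(-search_range, search_range + 1):
            x, y = center_x + mv_x, center_y + mv_y
            if x < 0 or y < 0 or y + h > reference.shape[0] or x + w > reference.shape[1]:
                continue
            sad = np.abs(reference[y:y + h, x:x + w] - target_block).sum()
            if sad < best_sad:
                best_mv, best_sad = (mv_x, mv_y), int(sad)
    return best_mv, best_sad

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, (32, 32)).astype(np.int32)
# Pretend the target block sits at (x=10, y=12) in the current picture, while its
# best match in the reference picture sits at (x=12, y=10): the MV should be (+2, -2).
target = reference[10:14, 12:16].copy()
print(full_search(target, reference, center_x=10, center_y=12))   # -> ((2, -2), 0)
```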
Motion vector candidates: the motion vector candidate may be a block as a prediction candidate when the motion vector is predicted or a motion vector of a block as a prediction candidate.
The motion vector candidates may be included in a motion vector candidate list.
Motion vector candidate list: the motion vector candidate list may be a list configured using one or more motion vector candidates.
Motion vector candidate index: the motion vector candidate index may be an indicator for indicating motion vector candidates in the motion vector candidate list. Alternatively, the motion vector candidate index may be an index of a motion vector predictor.
Motion information: The motion information may be information including not only a motion vector, a reference picture index, and an inter prediction indicator, but also at least one of a reference picture list, a reference picture, a motion vector candidate, a motion vector candidate index, a merge candidate, and a merge index.
Merging candidate list: the merge candidate list may be a list configured using merge candidates.
Combining candidates: the merge candidates may be spatial merge candidates, temporal merge candidates, combined bi-predictive merge candidates, zero merge candidates, etc. The merge candidates may include motion information such as an inter prediction indicator, a reference picture index for each list, and a motion vector.
Merging index: the merge index may be an indicator for indicating a merge candidate in the merge candidate list.
The merge index may indicate a reconstructed unit used to derive the merge candidate, among the reconstructed units spatially adjacent to the target unit and the reconstructed units temporally adjacent to the target unit.
The merge index may indicate at least one of a plurality of pieces of motion information of the merge candidate.
Transform unit: The transform unit may be a basic unit of residual signal encoding and/or residual signal decoding, such as transform, inverse transform, quantization, dequantization, transform coefficient encoding, and transform coefficient decoding. A single transform unit may be divided into multiple transform units having smaller sizes.
Scaling: scaling may represent the process of multiplying a factor by the transform coefficient level.
As a result of scaling the transform coefficient levels, transform coefficients may be generated. Scaling may also be referred to as "dequantizing".
Quantization Parameter (QP): the quantization parameter may be a value for generating a transform coefficient level for the transform coefficient in quantization. Alternatively, the quantization parameter may also be a value for generating the transform coefficient by scaling the transform coefficient level in dequantization. Alternatively, the quantization parameter may be a value mapped to a quantization step size.
Delta quantization parameter: The delta quantization parameter may be the difference between the quantization parameter of the encoding/decoding target unit and a predicted quantization parameter.
Scanning: scanning may represent a method of sequentially arranging coefficients in units, blocks or matrices. For example, a method for arranging a 2D array in the form of a one-dimensional (1D) array may be referred to as "scanning". Alternatively, a method for arranging the 1D array in the form of a 2D array may also be referred to as "scanning" or "inverse scanning".
Transform coefficients: the transform coefficients may be coefficient values generated when the encoding device performs the transform. Alternatively, the transform coefficient may be a coefficient value generated when the decoding apparatus performs at least one of entropy decoding and dequantization.
A quantized level generated by applying quantization to a transform coefficient or a residual signal, or a quantized transform coefficient level, may also be included in the meaning of the term "transform coefficient".
Quantized grade: the level of quantization may be a value generated when the encoding apparatus performs quantization on the transform coefficient or the residual signal. Alternatively, the level of quantization may be a value that is a target of dequantization when the decoding apparatus performs dequantization.
Quantized transform coefficient levels as a result of the transform and quantization may also be included in the meaning of quantized levels.
Non-zero transform coefficients: the non-zero transform coefficients may be transform coefficients having values other than 0, or may be transform coefficient levels having values other than 0. Alternatively, the non-zero transform coefficient may be a transform coefficient whose magnitude is not 0, or may be a transform coefficient level whose magnitude is not 0.
Quantization matrix: the quantization matrix may be a matrix used in a quantization or dequantization process to improve subjective image quality or objective image quality of an image. The quantization matrix may also be referred to as a "scaling list".
Quantization matrix coefficients: the quantization matrix coefficient may be each element in the quantization matrix. The quantized matrix coefficients may also be referred to as "matrix coefficients".
Default matrix: the default matrix may be a quantization matrix predefined by the encoding device and decoding device.
Non-default matrix: the non-default matrix may be a quantization matrix that is not predefined by the encoding device and decoding device. The non-default matrix may be signaled by the encoding device to the decoding device.
Fig. 1 is a block diagram showing a configuration of an embodiment of an encoding apparatus to which the present disclosure is applied.
The encoding apparatus 100 may be an encoder, a video encoding apparatus, or an image encoding apparatus. The video may include one or more images (pictures). The encoding apparatus 100 may sequentially encode one or more images of the video.
Referring to fig. 1, the encoding apparatus 100 includes an inter prediction unit 110, an intra prediction unit 120, a switcher 115, a subtractor 125, a transform unit 130, a quantization unit 140, an entropy encoding unit 150, a dequantization (inverse quantization) unit 160, an inverse transform unit 170, an adder 175, a filtering unit 180, and a reference picture buffer 190.
The encoding apparatus 100 may perform encoding on the target image using intra mode and/or inter mode.
Further, the encoding apparatus 100 may generate a bitstream including information about encoding by encoding a target image, and may output the generated bitstream. The generated bit stream may be stored in a computer readable storage medium and may be streamed over a wireless/wired transmission medium.
When the intra mode is used as the prediction mode, the switcher 115 can switch to the intra mode. When the inter mode is used as the prediction mode, the switcher 115 may switch to the inter mode.
The encoding apparatus 100 may generate a prediction block of the target block. Further, after the prediction block has been generated, the encoding apparatus 100 may encode a residual between the target block and the prediction block.
When the prediction mode is an intra mode, the intra prediction unit 120 may use pixels of a neighboring block previously encoded/decoded around the target block as reference samples. Intra-prediction unit 120 may perform spatial prediction on the target block using the reference samples, and may generate prediction samples for the target block via spatial prediction.
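Purely as an illustration of spatial prediction from reference samples, the sketch below forms a DC prediction block from reconstructed pixels above and to the left of a target block; the 4×4 block size, the sample values, and the DC averaging rule are assumptions for this example and are not a statement of the prediction method claimed here.

```python
# Illustrative intra (spatial) prediction: build a DC prediction block for a
# 4x4 target block from reference samples of previously encoded/decoded
# neighboring blocks. Sizes and sample values are assumptions for this sketch.
import numpy as np

def dc_prediction(above_samples, left_samples, size=4):
    """Fill the prediction block with the rounded mean of the neighboring reference samples."""
    dc = int(round(np.concatenate([above_samples, left_samples]).mean()))
    return np.full((size, size), dc, dtype=np.int32)

above = np.array([52, 54, 55, 57])   # reconstructed pixels above the target block
left = np.array([50, 51, 53, 56])    # reconstructed pixels to the left of the target block
print(dc_prediction(above, left))    # 4x4 block filled with the rounded mean
```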
The inter prediction unit 110 may include a motion prediction unit and a motion compensation unit.
When the prediction mode is an inter mode, the motion prediction unit may search for a region in the reference image that best matches the target block in the motion prediction process, and may derive a motion vector for the target block and the found region based on the found region.
The reference picture may be stored in a reference picture buffer 190. More specifically, when encoding and/or decoding of the reference picture has been processed, the reference picture may be stored in the reference picture buffer 190.
The motion compensation unit may generate a prediction block for the target block by performing motion compensation using the motion vector. Here, the motion vector may be a two-dimensional (2D) vector for inter prediction. Further, the motion vector may represent an offset between the target image and the reference image.
When the motion vector has a non-integer value, the motion prediction unit and the motion compensation unit may generate the prediction block by applying an interpolation filter to a partial region of the reference image. In order to perform inter prediction or motion compensation, which of a skip mode, a merge mode, an Advanced Motion Vector Prediction (AMVP) mode, and a current picture reference mode is used as the method for predicting and compensating for the motion of the PUs included in a CU may be determined on a CU basis, and inter prediction or motion compensation may be performed according to the determined mode.
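The fetch-at-an-offset behavior of motion compensation can be sketched as follows; handling only integer motion vectors (the interpolation filter mentioned above is omitted) and the array layout are simplifying assumptions made for this example.

```python
# Illustrative motion compensation with an integer motion vector: the prediction
# block is copied from the reference picture at the position of the target block
# shifted by (mv_x, mv_y). Fractional MVs and interpolation are omitted here.
import numpy as np

def motion_compensate(reference, block_x, block_y, mv_x, mv_y, width, height):
    x, y = block_x + mv_x, block_y + mv_y
    return reference[y:y + height, x:x + width]

reference = np.arange(16 * 16).reshape(16, 16)
prediction = motion_compensate(reference, block_x=4, block_y=4, mv_x=2, mv_y=-1,
                               width=4, height=4)
print(prediction)
```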
The subtractor 125 may generate a residual block, where the residual block is the difference between the target block and the prediction block. The residual block may also be referred to as a "residual signal".
The residual signal may be the difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming or quantizing a difference between the original signal and the predicted signal or a signal generated by transforming and quantizing the difference. The residual block may be a residual signal for a block unit.
The transform unit 130 may generate transform coefficients by transforming the residual block, and may output the generated transform coefficients. Here, the transform coefficient may be a coefficient value generated by transforming the residual block.
When the transform skip mode is used, the transform unit 130 may omit an operation of transforming the residual block.
By quantizing the transform coefficients, quantized transform coefficient levels or quantized levels may be generated. Hereinafter, in the embodiments, each of the quantized transform coefficient level and the quantized level may also be referred to as a "transform coefficient".
The quantization unit 140 may generate quantized transform coefficient levels or quantized levels by quantizing the transform coefficients according to quantization parameters. The quantization unit 140 may output the generated quantized transform coefficient level or quantized level. In this case, the quantization unit 140 may quantize the transform coefficient using a quantization matrix.
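A simplified view of quantizing transform coefficients and scaling them back (dequantization) is sketched below; the QP-to-step mapping, the coefficient values, and the omission of a quantization matrix are assumptions made only for this illustration.

```python
# Illustrative scalar quantization of transform coefficients. A quantized level is
# obtained by dividing a coefficient by a quantization step; dequantization scales
# the level back. The QP-to-step mapping and coefficients are assumptions here.
import numpy as np

def qp_to_step(qp):
    """Toy mapping in which the quantization step roughly doubles every 6 QP."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coefficients, qp):
    return np.round(coefficients / qp_to_step(qp)).astype(np.int32)

def dequantize(levels, qp):
    return levels * qp_to_step(qp)

coeffs = np.array([120.0, -33.0, 7.0, 0.0])
qp = 22
levels = quantize(coeffs, qp)
print(levels)                   # quantized levels sent to entropy coding
print(dequantize(levels, qp))   # reconstructed (scaled) transform coefficients
```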
The entropy encoding unit 150 may generate a bitstream by performing entropy encoding, according to a probability distribution, on the values calculated by the quantization unit 140 and/or the coding parameter values calculated in the encoding process. The entropy encoding unit 150 may output the generated bitstream.
The entropy encoding unit 150 may also perform entropy encoding on information about pixels of an image and information required to decode the image. For example, information required for decoding an image may include syntax elements and the like.
The encoding parameters may be information required for encoding and/or decoding. The encoding parameters may include information encoded by the encoding device 100 and transmitted from the encoding device 100 to the decoding device, and may also include information derived during encoding or decoding. For example, the information transmitted to the decoding device may include syntax elements.
For example, the coding parameters may include values or statistical information such as prediction modes, motion vectors, reference picture indexes, coded block patterns, presence or absence of residual signals, transform coefficients, quantized transform coefficients, quantization parameters, block sizes, and block partition information. The prediction mode may be an intra prediction mode or an inter prediction mode.
The residual signal may represent a difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming a difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming and quantizing a difference between the original signal and the predicted signal.
When entropy coding is applied, fewer bits may be allocated to more frequently occurring symbols and more bits may be allocated to less frequently occurring symbols. Since the symbol is represented by this allocation, the size of the bit string for the target symbol to be encoded can be reduced. Accordingly, the compression performance of video coding can be improved by entropy coding.
Further, for entropy encoding, the entropy encoding unit 150 may use an encoding method such as exponential-Golomb coding, Context-Adaptive Variable Length Coding (CAVLC), or Context-Adaptive Binary Arithmetic Coding (CABAC). For example, the entropy encoding unit 150 may perform entropy encoding using a Variable Length Coding (VLC) table. For example, the entropy encoding unit 150 may derive a binarization method for the target symbol. Furthermore, the entropy encoding unit 150 may derive a probability model for the target symbol/binary bit. The entropy encoding unit 150 may perform arithmetic encoding using the derived binarization method, probability model, and context model.
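As an illustration of variable-length coding in which frequent (small) symbols receive short codewords, a zero-order exponential-Golomb encoder is sketched below; it is an example only and does not reproduce the CAVLC or CABAC schemes named above.

```python
# Illustrative zero-order exponential-Golomb coding of non-negative integers:
# smaller (assumed more frequent) symbols get shorter codewords, in line with
# the idea of assigning short codewords to frequently occurring symbols.

def exp_golomb_encode(value):
    """Return the 0th-order exp-Golomb codeword of a non-negative integer as a bit string."""
    code = bin(value + 1)[2:]          # binary representation of value + 1
    return "0" * (len(code) - 1) + code

for symbol in range(6):
    print(symbol, exp_golomb_encode(symbol))
# 0 -> "1", 1 -> "010", 2 -> "011", 3 -> "00100", 4 -> "00101", 5 -> "00110"
```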
The entropy encoding unit 150 may transform coefficients in the form of 2D blocks into the form of 1D vectors through a transform coefficient scanning method in order to encode the transform coefficient levels.
The coding parameters may include not only information (or flags or indexes) such as syntax elements encoded by the encoding device and signaled by the encoding device to the decoding device, but also information derived during the encoding or decoding process. Furthermore, the coding parameters may include information required to encode or decode the image. For example, the coding parameters may include at least one of or a combination of the following: the size of the unit/block, the depth of the unit/block, the partition information of the unit/block, the partition structure of the unit/block, information indicating whether the unit/block is partitioned in a quadtree structure, information indicating whether the unit/block is partitioned in a binary tree structure, the partition direction (horizontal direction or vertical direction) of the binary tree structure, the partition form (symmetric partition or asymmetric partition) of the binary tree structure, the prediction scheme (intra prediction or inter prediction), the intra prediction mode/direction, the reference sample filtering method, the prediction block boundary filtering method, the filter tap for filtering, the filter coefficient for filtering, the inter prediction mode, the motion information, the motion vector, the reference picture index, the inter prediction direction, the inter prediction indicator, the reference picture list, the reference picture, motion vector predictors, motion vector prediction candidates, motion vector candidate lists, information indicating whether a merge mode is used, merge candidates, merge candidate lists, information about whether a skip mode is used, the type of interpolation filter, taps of the interpolation filter, filter coefficients of the interpolation filter, the size of a motion vector, the accuracy of motion vector representation, the type of transform, the size of transform, information indicating whether a first transform is used, information indicating whether an additional (second) transform is used, a first transform index, a second transform index, information indicating whether a residual signal is present, a coding block pattern, a coding block flag, quantization parameters, quantization matrices, information about an in-loop filter, information indicating whether an in-loop filter is applied, the coefficients of the in-loop filter, the taps of the in-loop filter, the shape/form of the in-loop filter, information indicating whether the deblocking filter is applied, the coefficients of the deblocking filter, the taps of the deblocking filter, the deblocking filter strength, the shape/form of the deblocking filter, information indicating whether the adaptive sample point offset is applied, the values of the adaptive sample point offset, the class of the adaptive sample point offset, the type of the adaptive sample point offset, information indicating whether the adaptive loop filter is applied, the coefficients of the adaptive loop filter, the taps of the adaptive loop filter, the shape/form of the adaptive loop filter, a binarization/inverse binarization method, a context model decision method, a context model update method, information indicating whether a normal mode is performed, information indicating whether a bypass mode is performed, a context binary bit, a bypass binary bit, a transform coefficient level scanning method, an image display/output order, stripe identification information, a stripe type, stripe partition information, parallel block identification information, a parallel block type, parallel block partition information, a picture type, a bit depth, information on a luminance signal, and information on a chrominance signal.
Here, signaling a flag or an index may mean that the encoding apparatus 100 includes, in the bitstream, an entropy-encoded flag or an entropy-encoded index generated by performing entropy encoding on the flag or the index, and may mean that the decoding apparatus 200 acquires the flag or the index by performing entropy decoding on the entropy-encoded flag or the entropy-encoded index extracted from the bitstream.
Since the encoding apparatus 100 performs encoding via inter prediction, the encoded target image may be used as a reference image for another image to be subsequently processed. Accordingly, the encoding apparatus 100 may reconstruct or decode the encoded target image and store the reconstructed or decoded image as a reference image in the reference picture buffer 190. For decoding, dequantization and inverse transformation of the encoded target image may be performed.
The quantized levels may be dequantized by the dequantization unit 160 and may be inverse-transformed by the inverse transform unit 170. The dequantized and/or inverse-transformed coefficients may be added to the prediction block by the adder 175, and a reconstructed block may thereby be generated. Here, the dequantized and/or inverse-transformed coefficients may denote coefficients on which one or more of dequantization and inverse transformation have been performed, and may also denote a reconstructed residual block.
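As an illustration of this reconstruction path, the following is a minimal Python sketch; the scalar quantization step, the inverse_transform callback, and all names are assumptions made only for illustration and do not describe the actual implementation.

```python
import numpy as np

def reconstruct_block(quantized_levels, prediction_block, qstep, inverse_transform):
    # Dequantization: scale the quantized levels back by the quantization step
    # (a real codec would use per-frequency scaling and quantization matrices).
    dequantized = np.asarray(quantized_levels, dtype=np.int64) * qstep
    # The inverse transform yields the reconstructed residual block.
    residual = inverse_transform(dequantized)
    # Adder: prediction block + reconstructed residual = reconstructed block.
    return np.asarray(prediction_block, dtype=np.int64) + residual
```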
The reconstructed block may be filtered by a filtering unit 180. The filtering unit 180 may apply one or more of a deblocking filter, a Sample Adaptive Offset (SAO) filter, and an Adaptive Loop Filter (ALF) to the reconstructed block or the reconstructed picture. The filtering unit 180 may also be referred to as a "loop filter".
The deblocking filter may remove block distortion that occurs at the boundaries between blocks. Whether to apply the deblocking filter to the target block may be determined based on the pixels included in a few columns or rows of the block. When the deblocking filter is applied to the target block, the applied filter may differ depending on the required deblocking filtering strength. In other words, among the different filters, the filter determined in consideration of the deblocking filtering strength may be applied to the target block.
The SAO filter may add a suitable offset to pixel values so as to compensate for coding errors. The SAO filter may correct, on a per-pixel basis, the image to which deblocking has been applied, using an offset that reflects the difference between the original image and the deblocked image. A method of dividing the pixels included in the image into a certain number of regions, determining the region to which an offset is to be applied among the divided regions, and applying the offset to the determined region may be used, and a method of applying an offset in consideration of edge information of each pixel may also be used.
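The per-pixel offset correction can be pictured with a band-offset-style sketch such as the following; the 32-band classification, the 8-bit sample range, and the function name are assumptions made only for illustration and do not reproduce the exact SAO process of any standard.

```python
import numpy as np

def apply_band_offsets(deblocked, band_offsets, num_bands=32, max_val=255):
    # Classify each deblocked pixel into a band by its value and add the offset
    # chosen for that band; the offsets would be derived from the difference
    # between the original image and the deblocked image.
    pixels = np.asarray(deblocked, dtype=np.int32)
    band = (pixels * num_bands) // (max_val + 1)
    corrected = pixels + np.asarray(band_offsets, dtype=np.int32)[band]
    return np.clip(corrected, 0, max_val)
```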
The ALF may perform filtering based on a value obtained by comparing the reconstructed image with the original image. After pixels included in an image have been divided into a predetermined number of groups, filters to be applied to the groups may be determined, and filtering may be performed differently for the respective groups. Information regarding whether to apply the adaptive loop filter may be signaled for each CU. The shape and filter coefficients of the ALF to be applied to each block may be different for each block.
The reconstructed block or the reconstructed image filtered by the filtering unit 180 may be stored in a reference picture buffer 190. The reconstructed block filtered by the filtering unit 180 may be a part of a reference picture. In other words, the reference picture may be a reconstructed picture composed of the reconstructed blocks filtered by the filtering unit 180. The stored reference pictures may then be used for inter prediction.
Fig. 2 is a block diagram showing a configuration of an embodiment of a decoding apparatus to which the present disclosure is applied.
The decoding apparatus 200 may be a decoder, a video decoding apparatus, or an image decoding apparatus.
Referring to fig. 2, the decoding apparatus 200 may include an entropy decoding unit 210, a dequantization (inverse quantization) unit 220, an inverse transform unit 230, an intra prediction unit 240, an inter prediction unit 250, an adder 255, a filtering unit 260, and a reference picture buffer 270.
The decoding apparatus 200 may receive the bitstream output from the encoding apparatus 100. The decoding apparatus 200 may receive a bitstream stored in a computer-readable storage medium, and may receive a bitstream transmitted through a wired/wireless transmission medium.
The decoding apparatus 200 may perform decoding on the bit stream in an intra mode and/or an inter mode. Further, the decoding apparatus 200 may generate a reconstructed image or a decoded image via decoding, and may output the reconstructed image or the decoded image.
For example, an operation of switching to an intra mode or an inter mode based on a prediction mode for decoding may be performed by a switch. When the prediction mode for decoding is an intra mode, the switch may be operated to switch to the intra mode. When the prediction mode for decoding is an inter mode, the switch may be operated to switch to the inter mode.
The decoding apparatus 200 may acquire a reconstructed residual block by decoding an input bitstream, and may generate a prediction block. When the reconstructed residual block and the prediction block are acquired, the decoding apparatus 200 may generate a reconstructed block as a target to be decoded by adding the reconstructed residual block and the prediction block.
The entropy decoding unit 210 may generate symbols by performing entropy decoding on the bitstream based on the probability distribution of the bitstream. The generated symbols may include symbols in the form of quantized levels. Here, the entropy decoding method may be similar to the entropy encoding method described above. That is, the entropy decoding method may be the inverse of the entropy encoding method described above.
The quantized coefficients may be dequantized by dequantization unit 220. The dequantization unit 220 may generate dequantized coefficients by performing dequantization on the quantized coefficients. Further, the inversely quantized coefficients may be inversely transformed by the inverse transformation unit 230. The inverse transform unit 230 may generate a reconstructed residual block by performing inverse transform on the inversely quantized coefficients. As a result of performing dequantization and inverse transformation on the quantized coefficients, a reconstructed residual block may be generated. Here, when generating the reconstructed residual block, the dequantization unit 220 may apply a quantization matrix to quantized coefficients.
When the intra mode is used, the intra prediction unit 240 may generate a prediction block by performing spatial prediction using pixel values of previously decoded neighboring blocks around the target block.
The inter prediction unit 250 may include a motion compensation unit. Alternatively, the inter prediction unit 250 may be designated as a "motion compensation unit".
When the inter mode is used, the motion compensation unit 250 may generate a prediction block by performing motion compensation using the motion vector and the reference image stored in the reference picture buffer 270.
The motion compensation unit may apply an interpolation filter to a partial region of the reference image when the motion vector has a non-integer value, and may generate the prediction block using the reference image to which the interpolation filter has been applied. In order to perform motion compensation, the motion compensation unit may determine, on a CU basis, which of a skip mode, a merge mode, an Advanced Motion Vector Prediction (AMVP) mode, and a current picture reference mode corresponds to the motion compensation method for the PU included in the CU, and may perform motion compensation according to the determined mode.
The reconstructed residual block and the prediction block may be added to each other by an adder 255. The adder 255 may generate a reconstructed block by adding the reconstructed residual block and the prediction block.
The reconstructed block may be filtered by a filtering unit 260. The filtering unit 260 may apply at least one of a deblocking filter, an SAO filter, and an ALF to the reconstructed block or the reconstructed picture.
The reconstructed block filtered by the filtering unit 260 may be stored in a reference picture buffer 270. The reconstructed block filtered by the filtering unit 260 may be part of a reference picture. In other words, the reference image may be an image composed of the reconstructed block filtered by the filtering unit 260. The stored reference pictures may then be used for inter prediction.
Fig. 3 is a diagram schematically showing a partition structure of an image when the image is encoded and decoded.
Fig. 3 may schematically illustrate an example in which a single unit is partitioned into a plurality of sub-units.
In order to partition an image efficiently, a Coding Unit (CU) may be used in encoding and decoding. The term "unit" may be used to collectively designate 1) a block including image samples and 2) a syntax element. For example, "partition of a unit" may represent "partition of a block corresponding to the unit".
A CU may be used as a basic unit for image encoding/decoding. A CU may be used as a unit to which one mode selected from an intra mode and an inter mode is applied in image encoding/decoding. In other words, in image encoding/decoding, it may be determined which one of the intra mode and the inter mode is to be applied to each CU.
Furthermore, a CU may be a basic unit in prediction, transformation, quantization, inverse transformation, dequantization, and encoding/decoding of transform coefficients.
Referring to fig. 3, an image 300 may be sequentially partitioned into units corresponding to a Largest Coding Unit (LCU), and a partition structure of the image 300 may be determined according to the LCU. Here, the LCU may be used to have the same meaning as a Coding Tree Unit (CTU).
Partitioning a unit may mean partitioning a block corresponding to the unit. The block partition information may include depth information regarding the depth of the unit. The depth information may indicate the number of times a unit is partitioned and/or the degree to which the unit is partitioned. A single unit may be hierarchically partitioned into sub-units, with the sub-units having depth information based on a tree structure. Each partitioned subunit may have depth information. The depth information may be information indicating the size of the CU. The depth information may be stored for each CU. Each CU may have depth information.
The partition structure may represent a distribution of Coding Units (CUs) in LCU 310 for efficiently encoding an image. Such a distribution may be determined according to whether a single CU is to be partitioned into multiple CUs. The number of CUs generated by partitioning may be a positive integer of 2 or more, including 2, 3, 4, 8, 16, and the like. Depending on the number of CUs generated by performing partitioning, the horizontal and vertical sizes of each CU generated by performing partitioning may be smaller than those of the CU before being partitioned.
Each partitioned CU may be recursively partitioned into four CUs in the same manner. Via recursive partitioning, at least one of the horizontal size and the vertical size of each partitioned CU may be reduced compared to at least one of the horizontal size and the vertical size of the CU before being partitioned.
Partitioning of the CU may be performed recursively until a predefined depth or a predefined size. For example, the depth of the LCU may be 0 and the depth of the Smallest Coding Unit (SCU) may be a predefined maximum depth. Here, as described above, the LCU may be a CU having a maximum coding unit size, and the SCU may be a CU having a minimum coding unit size.
Partitioning may begin at LCU 310 and the depth of a CU may increase by 1 each time the horizontal and/or vertical size of the CU is reduced by partitioning.
For example, for each depth, a CU that is not partitioned may have a size of 2N×2N. Further, in the case where a CU is partitioned, a CU having a size of 2N×2N may be partitioned into four CUs, each having a size of N×N. The value of N may be halved each time the depth increases by 1.
Referring to fig. 3, an LCU having a depth of 0 may be a 64×64 block of pixels. 0 may be the minimum depth. An SCU having a depth of 3 may be an 8×8 block of pixels. 3 may be the maximum depth. Here, the CU corresponding to the 64×64 block, that is, the LCU, may be represented by a depth of 0. A CU corresponding to a 32×32 block may be represented by a depth of 1. A CU corresponding to a 16×16 block may be represented by a depth of 2. The CU corresponding to the 8×8 block, that is, the SCU, may be represented by a depth of 3.
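The depth-to-size relationship above can be summarized with a small sketch; the helper name is hypothetical, and the 64×64 LCU size is simply the example used here.

```python
def cu_size_at_depth(lcu_size=64, depth=0):
    # Each increase of the depth by 1 halves the horizontal and vertical size.
    return lcu_size >> depth

# Depth 0: 64x64 (LCU), depth 1: 32x32, depth 2: 16x16, depth 3: 8x8 (SCU).
sizes = {depth: cu_size_at_depth(64, depth) for depth in range(4)}
```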
The information on whether the corresponding CU is partitioned may be represented by partition information of the CU. The partition information may be 1-bit information. All CUs except SCU may include partition information. For example, the value of partition information of a CU that is not partitioned may be 0. The value of partition information of a partitioned CU may be 1.
For example, when a single CU is partitioned into four CUs, the horizontal and vertical sizes of each of the four CUs generated by performing the partitioning may be half of the horizontal and vertical sizes of the CUs before the partitioning. When a CU having a size of 32×32 is partitioned into four CUs, the size of each of the partitioned four CUs may be 16×16. When a single CU is partitioned into four CUs, the CUs may be considered to have been partitioned in a quadtree structure.
For example, when a single CU is partitioned into two CUs, the horizontal size or the vertical size of each of the two CUs generated by performing the partitioning may be half of the horizontal size or the vertical size of the CU before being partitioned. When a CU having a size of 32×32 is vertically partitioned into two CUs, the size of each of the partitioned two CUs may be 16×32. When a single CU is partitioned into two CUs, the CUs may be considered to have been partitioned in a binary tree structure.
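The size rules for quadtree and binary-tree partitioning described above may be sketched as follows; the function and the string labels for the split types are illustrative assumptions rather than terms defined in the disclosure.

```python
def split_sizes(width, height, split):
    # Quadtree split: four CUs, each with half the width and half the height.
    if split == "quad":
        return [(width // 2, height // 2)] * 4
    # Binary split: two CUs, halving either the width or the height.
    if split == "binary_vertical":
        return [(width // 2, height)] * 2
    if split == "binary_horizontal":
        return [(width, height // 2)] * 2
    raise ValueError(split)

split_sizes(32, 32, "quad")              # four 16x16 CUs
split_sizes(32, 32, "binary_vertical")   # two 16x32 CUs
```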
In addition to the quadtree partition, a binary tree partition may be applied to LCU 310 of fig. 3.
Fig. 4 is a diagram showing a form of a Prediction Unit (PU) that an encoding unit (CU) can include.
Among the CUs partitioned from the LCU, a CU that is no longer partitioned into CUs may be partitioned into one or more Prediction Units (PUs). Such division may also be referred to as a "partition".
A PU may be a base unit for prediction. The PU may be encoded and decoded in any one of a skip mode, an inter mode, and an intra mode. The PU may be partitioned into various shapes according to various modes. For example, the target block described above with reference to fig. 1 and the target block described above with reference to fig. 2 may both be PUs.
In the skip mode, no partition may be present in the CU. In the skip mode, a 2N×2N mode 410 may be supported without partitioning, wherein the size of the PU and the size of the CU are the same as each other in the 2N×2N mode.
In the inter mode, there may be 8 types of partition shapes in a CU. For example, in the inter mode, a 2N×2N mode 410, a 2N×N mode 415, an N×2N mode 420, an N×N mode 425, a 2N×nU mode 430, a 2N×nD mode 435, an nL×2N mode 440, and an nR×2N mode 445 may be supported.
In the intra mode, a 2N×2N mode 410, an N×N mode 425, a 2N×N mode, and an N×2N mode may be supported.
In the 2N×2N mode 410, PUs of size 2N×2N may be encoded. A PU of size 2N×2N may represent a PU of the same size as the CU. For example, a PU of size 2N×2N may have a size of 64×64, 32×32, 16×16, or 8×8.
In the N×N mode 425, a PU of size N×N may be encoded.
For example, in intra prediction, four partitioned PUs may be encoded when the PU size is 8 x 8. The size of each partitioned PU may be 4 x 4.
When encoding a PU in intra mode, the PU may be encoded using any of a plurality of intra prediction modes. For example, HEVC techniques may provide 35 intra-prediction modes, and a PU may be encoded in any of the 35 intra-prediction modes.
Which of the 2N×2N mode 410 and the N×N mode 425 is to be used to encode the PU may be determined based on the rate-distortion cost.
The encoding apparatus 100 may perform an encoding operation on a PU having a size of 2N×2N. Here, the encoding operation may be an operation of encoding the PU in each of a plurality of intra prediction modes that can be used by the encoding apparatus 100. Through the encoding operation, the best intra prediction mode for a PU of size 2N×2N may be derived. The optimal intra prediction mode may be an intra prediction mode that exhibits the minimum rate-distortion cost when encoding a PU having a size of 2N×2N, among the plurality of intra prediction modes that can be used by the encoding apparatus 100.
Further, the encoding apparatus 100 may sequentially perform encoding operations on the respective PUs obtained by performing N×N partitioning. Here, the encoding operation may be an operation of encoding the PU in each of a plurality of intra prediction modes that can be used by the encoding apparatus 100. Through the encoding operation, the best intra prediction mode for a PU of size N×N may be derived. The optimal intra prediction mode may be an intra prediction mode that exhibits the minimum rate-distortion cost when encoding a PU of size N×N, among the plurality of intra prediction modes that can be used by the encoding apparatus 100.
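The selection between the 2N×2N and N×N partitions by rate-distortion cost might be organized roughly as in the sketch below; rd_cost is a hypothetical callback returning the cost of encoding the PU(s) of a given partition with a given intra prediction mode, and the partition labels and integer mode identifiers are illustrative assumptions.

```python
def choose_intra_partition(rd_cost, intra_modes, partitions=("2Nx2N", "NxN")):
    # For each candidate partition, find the intra prediction mode with the
    # smallest rate-distortion cost, then keep the partition whose best mode
    # gives the overall minimum cost.
    best = None
    for partition in partitions:
        cost, mode = min((rd_cost(partition, m), m) for m in intra_modes)
        if best is None or cost < best[0]:
            best = (cost, partition, mode)
    return best  # (minimum cost, selected partition, best intra prediction mode)
```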
Fig. 5 is a diagram illustrating a form of a Transform Unit (TU) that can be included in a CU.
A Transform Unit (TU) may be a basic unit in a CU used for processes such as transform, quantization, inverse transform, dequantization, entropy encoding, and entropy decoding. TUs may have a square or rectangular shape.
Among CUs partitioned from LCUs, a CU that is no longer partitioned into CUs may be partitioned into one or more TUs. Here, the partition structure of the TUs may be a quadtree structure. For example, as shown in FIG. 5, a single CU 510 may be partitioned one or more times according to a quadtree structure. With such partitioning, a single CU 510 may be composed of TUs having various sizes.
In the encoding apparatus 100, a Coding Tree Unit (CTU) having a size of 64×64 may be partitioned into a plurality of smaller CUs in a recursive quadtree structure. A single CU may be partitioned into four CUs having the same size. Each CU may be recursively partitioned and may have a quadtree structure.
A CU may have a given depth. When a CU is partitioned, a CU generated by performing the partitioning may have a depth increased by 1 from the depth of the partitioned CU.
For example, the depth of a CU may have a value ranging from 0 to 3. The size of a CU may range from 64×64 to 8×8, depending on the depth of the CU.
By recursively partitioning the CUs, the best partitioning method that yields the smallest rate-distortion cost can be selected.
Fig. 6 is a diagram for explaining an embodiment of an intra prediction process.
Arrows extending radially from the center of the graph in fig. 6 indicate the prediction direction of the intra prediction mode. Further, numbers appearing near the arrow may represent examples of mode values assigned to intra prediction modes or prediction directions of intra prediction modes.
Intra-coding and/or decoding may be performed using reference samples of blocks adjacent to the target block. The neighboring block may be a neighboring reconstructed block. For example, intra-frame encoding and/or decoding may be performed using values of reference samples included in each neighboring reconstructed block or encoding parameters of the neighboring reconstructed blocks.
The encoding apparatus 100 and/or the decoding apparatus 200 may generate a prediction block by performing intra prediction on a target block based on information about samples in the target image. When intra prediction is performed, the encoding apparatus 100 and/or the decoding apparatus 200 may perform directional prediction and/or non-directional prediction based on at least one reconstructed reference sample.
The prediction block may be a block generated as a result of performing intra prediction. The prediction block may correspond to at least one of a CU, PU, and TU.
The unit of the prediction block may have a size corresponding to at least one of the CU, PU, and TU. The prediction block may have a square shape with a size of 2N×2N or N×N. The size N×N may include 4×4, 8×8, 16×16, 32×32, 64×64, etc.
Alternatively, the prediction block may be a rectangular block of size M×N (such as 2×8, 4×8, 2×16, 4×16, 8×16, etc.).
Intra prediction may be performed in consideration of the intra prediction mode for the target block. The number of intra prediction modes that the target block may have may be a predefined fixed value, or may be a value determined differently according to the properties of the prediction block. For example, the properties of the prediction block may include the size of the prediction block, the type of the prediction block, and the like.
For example, the number of intra prediction modes may be fixed to 35 regardless of the size of the prediction block. Alternatively, the number of intra prediction modes may be, for example, 3, 5, 9, 17, 34, 35, or 36.
The intra prediction mode may be a non-directional mode or a directional mode. For example, as shown in fig. 6, the intra prediction modes may include two non-directional modes and 33 directional modes.
The non-directional modes may include a DC mode and a planar mode. For example, the value of the DC mode may be 1. The value of the planar mode may be 0.
The orientation mode may be a mode having a specific direction or a specific angle. Among the plurality of intra prediction modes, the remaining modes other than the DC mode and the plane mode may be directional modes.
Each of the intra prediction modes may be represented by at least one of a mode number, a mode value, and a mode angle. The number of intra prediction modes may be M. The value of M may be 1 or greater. In other words, the number of intra prediction modes may be M, which includes the number of non-directional modes and the number of directional modes.
The number of intra prediction modes may be fixed to M regardless of the size of the block. Alternatively, the number of intra prediction modes may be different according to the size of the block and/or the type of color component. For example, the number of prediction modes may be different depending on whether the color component is a luminance signal or a chrominance signal. For example, the larger the size of a block, the larger the number of intra prediction modes. Alternatively, the number of intra prediction modes corresponding to the luminance component block may be greater than the number of intra prediction modes corresponding to the chrominance component block.
For example, in a vertical mode with a mode value of 26, prediction may be performed in the vertical direction based on the pixel value of the reference sample.
Even in the directional mode other than the above-described modes, the encoding apparatus 100 and the decoding apparatus 200 can perform intra prediction on the target unit using the reference samples according to the angle corresponding to the directional mode.
The intra prediction mode located at the right side with respect to the vertical mode may be referred to as a "vertical-right mode". The intra prediction mode located below the horizontal mode may be referred to as a "horizontal-below mode". For example, in fig. 6, an intra prediction mode in which the mode value is one of 27, 28, 29, 30, 31, 32, 33, and 34 may be the vertical-right mode 613. The intra prediction mode whose mode value is one of 2, 3, 4, 5, 6, 7, 8, and 9 may be the horizontal-down mode 616.
The number of intra prediction modes and the mode values of the respective intra prediction modes described above are merely exemplary. The number of intra prediction modes described above and the mode values of the respective intra prediction modes may be defined differently according to embodiments, implementations, and/or requirements.
In order to perform intra prediction on the target block, a step of checking whether the samples included in a reconstructed neighboring block can be used as reference samples of the target block may be performed. When a sample that cannot be used as a reference sample of the target block is present among the samples of the neighboring block, a value generated via interpolation and/or copying that uses at least one sample value among the samples included in the reconstructed neighboring block may replace the sample value of the sample that cannot be used as a reference sample. When a value generated via copying and/or interpolation replaces the sample value of the existing sample, that sample may then be used as a reference sample of the target block.
In intra prediction, a filter may be applied to at least one of the reference sample and the prediction sample based on at least one of an intra prediction mode and a size of the target block.
When the intra prediction mode is the planar mode, the prediction block of the target block may be generated using, depending on the position of the prediction target sample in the prediction block, a weighted sum of an upper reference sample of the target block, a left reference sample of the target block, an upper right reference sample of the target block, and a lower left reference sample of the target block.
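One common formulation of such a position-dependent weighted sum is the HEVC-style planar interpolation sketched below; this formulation is only an illustrative assumption and may differ from the exact weighting used in the present disclosure.

```python
import numpy as np

def planar_prediction(top, left, top_right, bottom_left):
    # Position-dependent weighted sum of the upper, left, upper-right and
    # lower-left reference samples (HEVC-style planar interpolation).
    n = len(top)
    shift = int(np.log2(n)) + 1
    pred = np.zeros((n, n), dtype=np.int32)
    for y in range(n):
        for x in range(n):
            horizontal = (n - 1 - x) * left[y] + (x + 1) * top_right
            vertical = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            pred[y, x] = (horizontal + vertical + n) >> shift
    return pred
```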
When the intra prediction mode is the DC mode, the average of the reference samples above the target block and the reference samples to the left of the target block may be used to generate the prediction block of the target block.
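A minimal sketch of DC prediction under the same assumptions (integer samples, square block) might look like this; the rounding choice is illustrative.

```python
import numpy as np

def dc_prediction(top, left, size):
    # The DC value is the rounded average of the reference samples above the
    # target block and the reference samples to its left.
    dc = (int(np.sum(top[:size])) + int(np.sum(left[:size])) + size) // (2 * size)
    return np.full((size, size), dc, dtype=np.int32)
```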
When the intra prediction mode is a directional mode, a prediction block may be generated using an upper reference sample, a left reference sample, an upper right reference sample, and/or a lower left reference sample of the target block.
In order to generate the above-described prediction samples, real-number-based interpolation may be performed.
The intra prediction mode of the target block may be predicted from the intra prediction modes of neighboring blocks adjacent to the target block, and the information used for the prediction may be entropy-encoded/decoded.
For example, when intra prediction modes of a target block and a neighboring block are identical to each other, a predefined flag may be used to signal that the intra prediction modes of the target block and the neighboring block are identical.
For example, an indicator indicating which of the intra prediction modes of the plurality of neighboring blocks is identical to the intra prediction mode of the target block may be signaled.
When intra prediction modes of a target block and neighboring blocks are different from each other, intra prediction mode information of the target block may be entropy encoded/decoded based on the intra prediction modes of the neighboring blocks.
Fig. 7 is a diagram for explaining the positions of reference samples used in an intra prediction process.
Fig. 7 shows the positions of reference samples for intra prediction of a target block. Referring to fig. 7, reconstructed reference samples for intra-prediction of a target block may include a lower left reference sample 731, a left reference sample 733, an upper left reference sample 735, an upper reference sample 737, and an upper right reference sample 739.
For example, the left reference sample 733 may represent a reconstructed reference pixel adjacent to the left side of the target block. The upper reference sample 737 may represent reconstructed reference pixels adjacent to the top of the target block. The upper left corner reference sample point 735 may represent a reconstructed reference pixel located at the upper left corner of the target block. The lower left reference sample point 731 may represent a reference sample point located below a left sample point line composed of the left reference sample point 733 among sample points located on the same line as the left sample point line. The upper right reference sample point 739 may represent a reference sample point located right of an upper sample point line composed of the upper reference sample points 737 among sample points located on the same line as the upper sample point line.
When the size of the target block is n×n, the numbers of the lower left reference sample 731, the left reference sample 733, the upper reference sample 737, and the upper right reference sample 739 may all be N.
By performing intra prediction on a target block, a prediction block may be generated. The process of generating the prediction block may include determining values of pixels in the prediction block. The target block and the prediction block may be the same size.
The reference samples used for intra prediction of the target block may vary according to the intra prediction mode of the target block. The direction of the intra prediction mode may represent the dependency between the reference samples and the pixels of the prediction block. For example, the value of a specified reference sample may be used as the value of one or more specified pixels in the prediction block. In this case, the specified reference sample and the one or more specified pixels in the prediction block may be the sample and pixels located on a straight line along the direction of the intra prediction mode. In other words, the value of the specified reference sample may be copied as the value of a pixel located in a direction opposite to the direction of the intra prediction mode. Alternatively, the value of a pixel in the prediction block may be the value of a reference sample located in the direction of the intra prediction mode with respect to the position of the pixel.
In one example, when the intra prediction mode of the target block is a vertical mode with a mode value of 26, the upper reference sample 737 may be used for intra prediction. When the intra prediction mode is a vertical mode, the value of a pixel in the prediction block may be the value of a reference sample vertically above the position of the pixel. Thus, the upper reference sample 737 adjacent to the top of the target block may be used for intra prediction. Furthermore, the values of the pixels in a row of the prediction block may be the same as the values of the pixels of the upper reference sample 737.
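A sketch of the vertical-mode copy described above, assuming NumPy arrays of integer samples; the function name is illustrative.

```python
import numpy as np

def vertical_prediction(top, size):
    # Every pixel copies the value of the upper reference sample directly above
    # it, so each row of the prediction block equals the row of upper reference samples.
    return np.tile(np.asarray(top[:size], dtype=np.int32), (size, 1))
```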
In one example, when the mode value of the intra prediction mode of the current block is 18, at least some of the left reference samples 733, the upper left reference samples 735, and at least some of the upper reference samples 737 may be used for intra prediction. When the mode value of the intra prediction mode is 18, the value of a pixel in the prediction block may be the value of a reference sample point diagonally located at the upper left corner of the pixel.
The number of reference samples used to determine the pixel value of one pixel in the prediction block may be 1 or 2 or more.
As described above, the pixel values of the pixels in the prediction block may be determined according to the positions of the pixels and the positions of the reference samples indicated by the direction of the intra prediction mode. When the position of the pixel and the position of the reference sample point indicated by the direction of the intra prediction mode are integer positions, the value of one reference sample point indicated by the integer position may be used to determine the pixel value of the pixel in the prediction block.
When the position of the pixel and the position of the reference sample indicated by the direction of the intra prediction mode are not integer positions, an interpolated reference sample based on two reference samples closest to the position of the reference sample may be generated. The values of the interpolated reference samples may be used to determine pixel values for pixels in the prediction block. In other words, when the position of a pixel in the prediction block and the position of a reference sample indicated by the direction of the intra prediction mode indicate the position between two reference samples, an interpolation value based on the values of the two samples may be generated.
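The two-sample interpolation might be sketched as below, assuming a 1/32-sample position granularity; the granularity and the rounding are illustrative assumptions rather than the exact rule of the disclosure.

```python
def interpolated_reference(ref, pos_32nds):
    # pos_32nds is the projected position in 1/32-sample units. When the
    # fractional part is non-zero, the value is interpolated from the two
    # nearest reference samples; otherwise the integer-position sample is used.
    idx, frac = pos_32nds >> 5, pos_32nds & 31
    if frac == 0:
        return ref[idx]
    return ((32 - frac) * ref[idx] + frac * ref[idx + 1] + 16) >> 5
```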
The prediction block generated via prediction may be different from the original target block. In other words, there may be a prediction error, which is a difference between the target block and the prediction block, and there may also be a prediction error between the pixels of the target block and the pixels of the prediction block.
Hereinafter, the terms "difference", "error" and "residual" may be used to have the same meaning and may be used interchangeably with each other.
For example, in the case of directional intra prediction, the longer the distance between the pixels of the prediction block and the reference sample, the greater the prediction error that may occur. Such prediction errors may result in discontinuities between the generated prediction block and neighboring blocks.
In order to reduce prediction errors, a filtering operation for the prediction block may be used. The filtering operation may be configured to adaptively apply the filter to regions of the prediction block that are considered to have large prediction errors. For example, the region considered to have a large prediction error may be a boundary of a prediction block. In addition, regions considered to have a large prediction error in a prediction block may differ according to an intra prediction mode, and characteristics of a filter may also differ according to an intra prediction mode.
Fig. 8 is a diagram for explaining an embodiment of an inter prediction process.
The rectangle shown in fig. 8 may represent an image (or screen). Further, in fig. 8, an arrow may indicate a prediction direction. That is, each image may be encoded and/or decoded according to a prediction direction.
The image may be classified into an intra picture (I picture), a unidirectional predicted picture or a predicted coded picture (P picture), and a bidirectional predicted picture or a bidirectional predicted coded picture (B picture) according to the type of encoding. Each picture may be encoded according to the type of encoding of each picture.
When the target image that is the target to be encoded is an I picture, the target image may be encoded using data contained in the image itself without inter prediction with reference to other images. For example, an I picture may be encoded via intra prediction only.
When the target image is a P picture, the target image may be encoded via inter prediction using a reference picture existing in one direction. Here, the one direction may be a forward direction or a backward direction.
When the target image is a B picture, the image may be encoded via inter prediction using reference pictures existing in both directions, or may be encoded via inter prediction using reference pictures existing in one of a forward direction and a backward direction. Here, the two directions may be a forward direction and a backward direction.
P-pictures and B-pictures encoded and/or decoded using reference pictures may be considered as pictures using inter-prediction.
Hereinafter, inter prediction in inter mode according to an embodiment will be described in detail.
Inter prediction may be performed using motion information.
In the inter mode, the encoding apparatus 100 may perform inter prediction and/or motion compensation on the target block. The decoding apparatus 200 may perform inter prediction and/or motion compensation on the target block corresponding to the inter prediction and/or motion compensation performed by the encoding apparatus 100.
The motion information of the target block may be derived separately by the encoding apparatus 100 and the decoding apparatus 200 during inter prediction. The motion information may be derived using the motion information of the reconstructed neighboring block, the motion information of the co-located block (col block), and/or the motion information of the block adjacent to the col block. The col block may be a block in a previously reconstructed co-located picture (col picture). The location of the col block in the col picture may correspond to the location of the target block in the target image. The col picture may be any one of one or more reference pictures included in the reference picture list.
For example, the encoding apparatus 100 or the decoding apparatus 200 may perform prediction and/or motion compensation by using motion information of spatial candidates and/or temporal candidates as motion information of a target block. The target block may represent a PU and/or a PU partition.
The spatial candidates may be reconstructed blocks spatially adjacent to the target block.
The temporal candidates may be reconstructed blocks corresponding to the target block in a previously reconstructed co-located picture (col picture).
In the inter prediction, the encoding apparatus 100 and the decoding apparatus 200 may improve encoding efficiency and decoding efficiency by using motion information of spatial candidates and/or temporal candidates. The motion information of the spatial candidate may be referred to as "spatial motion information". The motion information of the temporal candidates may be referred to as "temporal motion information".
Next, the motion information of the spatial candidate may be motion information of a PU including the spatial candidate. The motion information of the temporal candidate may be motion information of a PU including the temporal candidate. The motion information of the candidate block may be motion information of a PU including the candidate block.
Inter prediction may be performed using a reference picture.
The reference picture may be at least one of a picture preceding the target picture and a picture following the target picture. The reference picture may be an image for prediction of the target block.
In inter prediction, a region in a reference picture may be specified using a reference picture index (or refIdx) indicating the reference picture, a motion vector to be described later, or the like. Here, the region specified in the reference picture may indicate a reference block.
Inter prediction may select a reference picture, and may also select a reference block corresponding to the target block from the reference picture. Furthermore, inter prediction may use the selected reference block to generate a prediction block for the target block.
The motion information may be derived by each of the encoding apparatus 100 and the decoding apparatus 200 during inter prediction.
The spatial candidates may be blocks that 1) are present in the target picture, 2) have previously been reconstructed via encoding and/or decoding, and 3) are adjacent to the target block or located at a corner of the target block. Here, the "block located at a corner of the target block" may be a block vertically adjacent to a neighboring block that is horizontally adjacent to the target block, or a block horizontally adjacent to a neighboring block that is vertically adjacent to the target block. Further, "a block located at a corner of the target block" may have the same meaning as "a block adjacent to a corner of the target block". The meaning of "a block located at a corner of the target block" may be included in the meaning of "a block adjacent to the target block".
For example, the spatial candidate may be a reconstructed block located to the left of the target block, a reconstructed block located above the target block, a reconstructed block located in the lower left corner of the target block, a reconstructed block located in the upper right corner of the target block, or a reconstructed block located in the upper left corner of the target block.
Each of the encoding apparatus 100 and the decoding apparatus 200 may identify a block existing in a location in the col picture spatially corresponding to the target block. The position of the target block in the target picture and the position of the identified block in the col picture may correspond to each other.
Each of the encoding apparatus 100 and the decoding apparatus 200 may determine col blocks existing at predefined relevant locations for the identified blocks as time candidates. The predefined relevant locations may be locations that exist inside and/or outside the identified block.
For example, the col blocks may include a first col block and a second col block. When the coordinates of the identified block are (xP, yP) and the size of the identified block is expressed by (nPSW, nPSH), the first col block may be a block located at the coordinates (xp+npsw, yp+npsh). The second col block may be a block located at coordinates (xp+ (nPSW > > 1), yp+ (nPSH > > 1)). The second col block may be selectively used when the first col block is not available.
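The two candidate col-block positions given above can be expressed directly as below; the variable names follow the coordinate and size notation used in the paragraph, and the helper function itself is hypothetical.

```python
def collocated_block_positions(xP, yP, nPSW, nPSH):
    # First col block: just below and to the right of the identified block.
    first = (xP + nPSW, yP + nPSH)
    # Second col block: at the center of the identified block, used when the
    # first col block is not available.
    second = (xP + (nPSW >> 1), yP + (nPSH >> 1))
    return first, second
```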
The motion vector of the target block may be determined based on the motion vector of the col block. Each of the encoding apparatus 100 and the decoding apparatus 200 may scale the motion vector of the col block. The scaled motion vector of the col block may be used as the motion vector of the target block. Furthermore, the motion vector of the motion information of the temporal candidate stored in the list may be a scaled motion vector.
The ratio of the motion vector of the target block to the motion vector of the col block may be the same as the ratio of the first distance to the second distance. The first distance may be a distance between a reference picture and a target picture of the target block. The second distance may be a distance between the reference picture and a col picture of the col block.
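A floating-point sketch of the distance-ratio scaling described above; real codecs typically use fixed-point arithmetic with clipping, a detail omitted here, and the function name is illustrative.

```python
def scale_col_motion_vector(mv_col, first_distance, second_distance):
    # first_distance: distance between the target picture and the reference
    # picture of the target block; second_distance: distance between the col
    # picture and the reference picture of the col block.
    scale = first_distance / second_distance
    return (round(mv_col[0] * scale), round(mv_col[1] * scale))
```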
The scheme for deriving motion information may vary according to the inter prediction mode of the target block. For example, as an inter prediction mode applied to inter prediction, there may be an Advanced Motion Vector Predictor (AMVP) mode, a merge mode, a skip mode, a current picture reference mode, and the like. The merge mode may also be referred to as a "motion merge mode". The respective modes will be described in detail below.
1) AMVP mode
When the AMVP mode is used, the encoding apparatus 100 may search for similar blocks in a neighboring area of the target block. The encoding apparatus 100 may acquire a prediction block by performing prediction on a target block using motion information of the found similar block. The encoding apparatus 100 may encode a residual block, which is a difference between the target block and the prediction block.
1-1) creating a list of predicted motion vector candidates
When AMVP mode is used as the prediction mode, each of the encoding apparatus 100 and the decoding apparatus 200 may create a list of prediction motion vector candidates using a motion vector of a spatial candidate, a motion vector of a temporal candidate, and a zero vector. The predicted motion vector candidate list may include one or more predicted motion vector candidates. At least one of a motion vector of a spatial candidate, a motion vector of a temporal candidate, and a zero vector may be determined and used as a predicted motion vector candidate.
Hereinafter, the terms "predicted motion vector (candidate)" and "motion vector (candidate)" may be used to have the same meaning and may be used interchangeably with each other.
The spatial motion candidates may comprise reconstructed spatial neighboring blocks. In other words, the motion vectors of the reconstructed neighboring blocks may be referred to as "spatial prediction motion vector candidates".
The temporal motion candidates may include col blocks and blocks adjacent to the col blocks. In other words, the motion vector of the col block or the motion vector of the block adjacent to the col block may be referred to as a "temporal prediction motion vector candidate".
The zero vector may be a (0, 0) motion vector.
The predicted motion vector candidates may be motion vector predictors for predicting motion vectors. Further, in the encoding apparatus 100, each predicted motion vector candidate may be an initial search position for a motion vector.
1-2) searching for motion vectors using a list of predicted motion vector candidates
The encoding apparatus 100 may determine a motion vector to be used for encoding the target block within the search range using the list of predicted motion vector candidates. Further, the encoding apparatus 100 may determine a predicted motion vector candidate to be used as a predicted motion vector of the target block among the predicted motion vector candidates existing in the predicted motion vector candidate list.
The motion vector to be used for encoding the target block may be a motion vector that may be encoded at a minimum cost.
Further, the encoding apparatus 100 may determine whether to encode the target block using the AMVP mode.
1-3) transmission of inter prediction information
The encoding apparatus 100 may generate a bitstream including inter prediction information required for inter prediction. The decoding apparatus 200 may perform inter prediction on the target block using inter prediction information of the bitstream.
The inter prediction information may include 1) mode information indicating whether AMVP is used, 2) a prediction motion vector index, 3) a Motion Vector Difference (MVD), 4) a reference direction, and 5) a reference picture index.
Furthermore, the inter prediction information may include a residual signal.
When the mode information indicates that AMVP mode is used, the decoding apparatus 200 may acquire a prediction motion vector index, MVD, reference direction, and reference picture index from the bitstream through entropy decoding.
The prediction motion vector index may indicate a prediction motion vector candidate to be used for predicting the target block among prediction motion vector candidates included in the prediction motion vector candidate list.
1-4) inter prediction in AMVP mode using inter prediction information
The decoding apparatus 200 may derive a predicted motion vector candidate using the predicted motion vector candidate list, and may determine motion information of the target block based on the derived predicted motion vector candidate.
The decoding apparatus 200 may determine a motion vector candidate for the target block among the predicted motion vector candidates included in the predicted motion vector candidate list using the predicted motion vector index. The decoding apparatus 200 may select the predicted motion vector candidate indicated by the predicted motion vector index from among the predicted motion vector candidates included in the predicted motion vector candidate list as the predicted motion vector of the target block.
The motion vector that is actually to be used for inter prediction of the target block may not match the predicted motion vector. In order to indicate the difference between the motion vector that will actually be used for inter prediction of the target block and the predicted motion vector, MVD may be used. The encoding apparatus 100 may derive a prediction motion vector similar to a motion vector that will be actually used for inter prediction of a target block in order to use an MVD as small as possible.
The MVD may be the difference between the motion vector of the target block and the predicted motion vector. The encoding apparatus 100 may calculate the MVD and may entropy-encode the MVD.
The MVD may be transmitted from the encoding apparatus 100 to the decoding apparatus 200 through a bitstream. The decoding apparatus 200 may decode the received MVD. The decoding apparatus 200 may derive a motion vector of the target block by summing the decoded MVD and the predicted motion vector. In other words, the motion vector of the target block derived by the decoding apparatus 200 may be the sum of the entropy-decoded MVD and the motion vector candidates.
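The encoder-side and decoder-side relationships among the motion vector, the predicted motion vector, and the MVD reduce to component-wise differences and sums, as sketched below; the function names are illustrative.

```python
def derive_mvd(motion_vector, predicted_mv):
    # Encoder side: MVD = motion vector of the target block - predicted motion vector.
    return (motion_vector[0] - predicted_mv[0], motion_vector[1] - predicted_mv[1])

def derive_motion_vector(predicted_mv, mvd):
    # Decoder side: motion vector = decoded MVD + selected predicted motion vector.
    return (predicted_mv[0] + mvd[0], predicted_mv[1] + mvd[1])
```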
The reference direction may indicate a list of reference pictures to be used for predicting the target block. For example, the reference direction may indicate one of the reference picture list L0 and the reference picture list L1.
The reference direction only indicates a reference picture list to be used for prediction of the target block, and does not mean that the direction of the reference picture is limited to a forward direction or a backward direction. In other words, each of the reference picture list L0 and the reference picture list L1 may include pictures in a forward direction and/or a backward direction.
The reference direction being unidirectional may mean that a single reference picture list is used. The reference direction being bidirectional may mean that two reference picture lists are used. In other words, the reference direction may indicate one of the following: the case where only the reference picture list L0 is used, the case where only the reference picture list L1 is used, and the case where two reference picture lists are used.
The reference picture index may indicate a reference picture to be used for predicting the target block among reference pictures in the reference picture list. The reference picture index may be entropy encoded by the encoding device 100. The entropy encoded reference picture index may be signaled by the encoding device 100 to the decoding device 200 through a bitstream.
When two reference picture lists are used to predict a target block, a single reference picture index and a single motion vector may be used for each of the reference picture lists. Further, when two reference picture lists are used to predict a target block, two prediction blocks may be designated for the target block. For example, an average or weighted sum of two prediction blocks for a target block may be used to generate a (final) prediction block for the target block.
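A sketch of combining the two prediction blocks into the (final) prediction block, assuming NumPy arrays; the default weights give the simple average, and a weighted sum is obtained by changing w0 and w1. The function name and signature are illustrative assumptions.

```python
import numpy as np

def combine_bi_prediction(pred_l0, pred_l1, w0=0.5, w1=0.5):
    # The final prediction block for bi-prediction may be the average
    # (w0 = w1 = 0.5) or a weighted sum of the two prediction blocks.
    combined = (w0 * np.asarray(pred_l0, dtype=np.float64) +
                w1 * np.asarray(pred_l1, dtype=np.float64))
    return np.rint(combined).astype(np.int32)
```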
The motion vector of the target block may be derived using the prediction motion vector index, the MVD, the reference direction, and the reference picture index.
The decoding apparatus 200 may generate a prediction block for the target block based on the derived motion vector and the reference picture index. For example, the prediction block may be a reference block indicated by a derived motion vector in a reference picture indicated by a reference picture index.
Since the prediction motion vector index and the MVD are encoded and the motion vector of the target block is not itself encoded, the number of bits transmitted from the encoding apparatus 100 to the decoding apparatus 200 may be reduced and the encoding efficiency may be improved.
The motion information of the reconstructed neighboring blocks may be used for the target block. In a specific inter prediction mode, the encoding apparatus 100 may not separately encode the actual motion information of the target block. Instead of encoding the motion information of the target block, additional information that enables the motion information of the target block to be derived using the motion information of the reconstructed neighboring blocks may be encoded. Since this additional information is encoded, the number of bits transmitted to the decoding apparatus 200 may be reduced, and encoding efficiency may be improved.
For example, as an inter prediction mode in which motion information of a target block is not directly encoded, a skip mode and/or a merge mode may exist. Here, each of the encoding apparatus 100 and the decoding apparatus 200 may use an indicator and/or index indicating a unit whose motion information is to be used as the motion information of the target unit among the reconstructed neighboring units.
2) Merge mode
As a scheme for deriving motion information of a target block, there is merging. The term "merge" may mean merging the motion of multiple blocks. "merge" may mean that motion information of one block is also applied to other blocks. In other words, the merge mode may be a mode in which motion information of a target block is derived from motion information of neighboring blocks.
When the merge mode is used, the encoding apparatus 100 may predict motion information of the target block using motion information of spatial candidates and/or motion information of temporal candidates. The spatial candidates may include reconstructed spatially neighboring blocks that are spatially adjacent to the target block. Spatially adjacent blocks may include a left-side adjacent block and an upper-side adjacent block. The temporal candidates may include col blocks. The terms "spatial candidate" and "spatial merge candidate" may be used to have the same meaning and may be used interchangeably with each other. The terms "temporal candidates" and "temporal merging candidates" may be used to have the same meaning and may be used interchangeably with each other.
The encoding apparatus 100 may acquire a prediction block via prediction. The encoding apparatus 100 may encode a residual block, which is a difference between the target block and the prediction block.
2-1) creation of merge candidate list
When the merge mode is used, each of the encoding apparatus 100 and the decoding apparatus 200 may create a merge candidate list using motion information of spatial candidates and/or motion information of temporal candidates. The motion information may include 1) a motion vector, 2) a reference picture index, and 3) a reference direction. The reference direction may be unidirectional or bidirectional.
The merge candidate list may include merge candidates. The merge candidate may be motion information. In other words, the merge candidate list may be a list storing a plurality of pieces of motion information.
The merge candidate may be motion information of a plurality of temporal candidates and/or spatial candidates. Further, the merge candidate list may include new merge candidates generated by combining merge candidates already existing in the merge candidate list. In other words, the merge candidate list may include new motion information generated by merging pieces of motion information previously existing in the merge candidate list.
Further, the merge candidate list may include motion information of a zero vector. The zero vector may also be referred to as a "zero merge candidate".
In other words, the pieces of motion information in the merge candidate list may be at least one of the following information: 1) motion information of a spatial candidate, 2) motion information of a temporal candidate, 3) motion information generated by combining pieces of motion information previously existing in a merge candidate list, and 4) a zero vector.
The motion information may include 1) a motion vector, 2) a reference picture index, and 3) a reference direction. The reference direction may also be referred to as an "inter prediction indicator". The reference direction may be unidirectional or bidirectional. The unidirectional reference direction may indicate either L0 prediction or L1 prediction.
The merge candidate list may be created before prediction in the merge mode is performed.
The number of merging candidates in the merging candidate list may be defined in advance. Each of the encoding apparatus 100 and the decoding apparatus 200 may add the merge candidates to the merge candidate list according to a predefined scheme or a predefined priority such that the merge candidate list has a predefined number of merge candidates. The merge candidate list of the encoding device 100 and the merge candidate list of the decoding device 200 may be made identical to each other using a predefined scheme and a predefined priority.
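A rough sketch of building a merge candidate list with a predefined number of candidates; the maximum of five candidates, the priority order (spatial first, then temporal), and the zero-candidate padding are assumptions made only for illustration, and combined candidates are omitted.

```python
def build_merge_candidate_list(spatial_candidates, temporal_candidates, max_candidates=5):
    # Candidates are added in a predefined priority, duplicates are skipped,
    # and zero-vector candidates pad the list so that the encoder and decoder
    # end up with identical lists of the predefined length.
    candidates = []
    for cand in list(spatial_candidates) + list(temporal_candidates):
        if cand is not None and cand not in candidates:
            candidates.append(cand)
        if len(candidates) == max_candidates:
            return candidates
    while len(candidates) < max_candidates:
        candidates.append({"mv": (0, 0), "ref_idx": 0, "direction": "L0"})
    return candidates
```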
Merging may be applied on a CU basis or on a PU basis. When merging is performed on a CU basis or on a PU basis, the encoding apparatus 100 may transmit a bitstream including predefined information to the decoding apparatus 200. For example, the predefined information may include 1) information indicating whether to perform merging for each block partition, and 2) information about the block with which merging is to be performed, among the blocks that are spatial candidates and/or temporal candidates for the target block.
2-2) searching for motion vectors using merge candidate lists
The encoding apparatus 100 may determine a merge candidate to be used for encoding the target block. For example, the encoding apparatus 100 may perform prediction on the target block using the merge candidates in the merge candidate list, and may generate a residual block for the merge candidates. The encoding apparatus 100 may encode the target block using a merge candidate that generates the minimum cost in encoding of the prediction and residual blocks.
Further, the encoding apparatus 100 may determine whether to encode the target block using the merge mode.
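A minimal sketch of the selection described above, assuming that a cost function evaluating the cost of the prediction and of encoding the residual block is available; the cost function itself is a placeholder, not something specified by the embodiment.

```python
def choose_merge_candidate(target_block, merge_list, cost_fn):
    """Return the merge index of the candidate that yields the minimum cost.

    cost_fn(target_block, candidate) is assumed to return the cost of predicting
    the target block with the candidate and encoding the resulting residual block."""
    costs = [cost_fn(target_block, cand) for cand in merge_list]
    best_index = min(range(len(costs)), key=costs.__getitem__)
    return best_index, costs[best_index]
```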
2-3) transmission of inter prediction information
The encoding apparatus 100 may generate a bitstream including inter prediction information required for inter prediction. The encoding apparatus 100 may generate entropy-encoded inter prediction information by performing entropy encoding on the inter prediction information, and may transmit a bitstream including the entropy-encoded inter prediction information to the decoding apparatus 200. The entropy encoded inter prediction information may be signaled by the encoding apparatus 100 to the decoding apparatus 200 through a bitstream.
The decoding apparatus 200 may perform inter prediction on the target block using inter prediction information of the bitstream.
The inter prediction information may include 1) mode information indicating whether a merge mode is used and 2) a merge index.
Furthermore, the inter prediction information may include a residual signal.
The decoding apparatus 200 may acquire the merge index from the bitstream only when the mode information indicates that the merge mode is used.
The mode information may be a merge flag. The unit of mode information may be a block. The information about the block may include mode information, and the mode information may indicate whether a merge mode is applied to the block.
The merge index may indicate a merge candidate to be used for predicting the target block among the merge candidates included in the merge candidate list. Alternatively, the merge index may indicate a block to be merged with the target block among neighboring blocks spatially or temporally adjacent to the target block.
2-4) inter prediction in merge mode using inter prediction information
The decoding apparatus 200 may perform prediction on the target block using the merge candidates indicated by the merge index among the merge candidates included in the merge candidate list.
The motion vector of the target block may be specified by the motion vector of the merge candidate indicated by the merge index, the reference picture index, and the reference direction.
3) Skip mode
The skip mode may be a mode in which motion information of a spatial candidate or motion information of a temporal candidate is applied to a target block without change. In addition, the skip mode may be a mode in which a residual signal is not used. In other words, when the skip mode is used, the reconstructed block may be a prediction block.
The difference between the merge mode and the skip mode is whether to transmit or use a residual signal. That is, the skip mode may be similar to the merge mode except that the residual signal is not transmitted or used.
When the skip mode is used, the encoding apparatus 100 may transmit information on a block whose motion information is to be used as motion information of a target block among blocks that are spatial candidates or temporal candidates to the decoding apparatus 200 through a bitstream. The encoding apparatus 100 may generate entropy-encoded information by performing entropy encoding on the information, and may signal the entropy-encoded information to the decoding apparatus 200 through a bitstream.
Further, when the skip mode is used, the encoding apparatus 100 may not transmit other syntax information (such as an MVD) to the decoding apparatus 200. For example, when the skip mode is used, the encoding apparatus 100 may not signal syntax elements related to at least one of an MVD, a coded block flag, and a transform coefficient level to the decoding apparatus 200.
3-1) creation of merge candidate list
The skip mode may also use a merge candidate list. In other words, the merge candidate list may be used in both the merge mode and the skip mode. In this regard, the merge candidate list may also be referred to as a "skip candidate list" or a "merge/skip candidate list".
Alternatively, the skip mode may use an additional candidate list different from the candidate list of the merge mode. In this case, in the following description, the merge candidate list and the merge candidate may be replaced with a skip candidate list and a skip candidate, respectively.
The merge candidate list may be created before prediction in the skip mode is performed.
3-2) searching for motion vectors using merge candidate list
The encoding apparatus 100 may determine a merge candidate to be used for encoding the target block. For example, the encoding apparatus 100 may perform prediction on the target block using the merge candidates in the merge candidate list. The encoding apparatus 100 may encode the target block using the merge candidate that generates the minimum cost in prediction.
Further, the encoding apparatus 100 may determine whether to encode the target block using the skip mode.
3-3) transmission of inter prediction information
The encoding apparatus 100 may generate a bitstream including inter prediction information required for inter prediction. The decoding apparatus 200 may perform inter prediction on the target block using inter prediction information of the bitstream.
The inter prediction information may include 1) mode information indicating whether a skip mode is used and 2) a skip index.
The skip index may be the same as the merge index described above.
When the skip mode is used, the target block may be encoded without using a residual signal. The inter prediction information may not include a residual signal. Alternatively, the bitstream may not include a residual signal.
The decoding apparatus 200 may acquire the skip index from the bitstream only when the mode information indicates that the skip mode is used. As described above, the merge index and the skip index may be identical to each other. The decoding apparatus 200 may acquire the skip index from the bitstream only when the mode information indicates that the merge mode or the skip mode is used.
The skip index may indicate a merge candidate to be used for predicting the target block among the merge candidates included in the merge candidate list.
3-4) inter prediction in skip mode using inter prediction information
The decoding apparatus 200 may perform prediction on the target block using the merge candidate indicated by the skip index among the merge candidates included in the merge candidate list.
The motion vector of the target block may be specified by the motion vector of the merge candidate indicated by the skip index, the reference picture index, and the reference direction.
4) Current picture reference mode
The current picture reference mode may denote a prediction mode that uses a previously reconstructed region in the current picture to which the target block belongs.
A vector may be defined that specifies the previously reconstructed region. The reference picture index of the target block may be used to determine whether the target block has been encoded in the current picture reference mode.
A flag or index indicating whether the target block is a block encoded in the current picture reference mode may be signaled by the encoding apparatus 100 to the decoding apparatus 200. Alternatively, it may be inferred from the reference picture index of the target block whether the target block is a block encoded in the current picture reference mode.
When a target block is encoded in a current picture reference mode, a current picture may be added to a fixed position or an arbitrary position in a reference picture list for the target block.
For example, the fixed position may be a position where the reference picture index is 0 or a last position.
When a current picture is added to an arbitrary position in the reference picture list, an additional reference picture index indicating such an arbitrary position may be signaled by the encoding apparatus 100 to the decoding apparatus 200.
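The following sketch illustrates adding the current picture to a reference picture list at a fixed position (the position where the reference picture index is 0, or the last position) or at an arbitrary signaled position; the argument names and the way the position is expressed are assumptions made for illustration.

```python
def add_current_picture(ref_pic_list, current_picture, position="last", signaled_index=None):
    """Insert the current picture into the reference picture list.

    position: "first" (reference picture index 0), "last", or "arbitrary".
    When "arbitrary" is used, signaled_index is assumed to be an additional
    reference picture index signaled from the encoder to the decoder."""
    if position == "first":
        ref_pic_list.insert(0, current_picture)
    elif position == "last":
        ref_pic_list.append(current_picture)
    else:  # "arbitrary"
        ref_pic_list.insert(signaled_index, current_picture)
    return ref_pic_list
```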
In the AMVP mode, the merge mode, and the skip mode described above, the motion information to be used for predicting the target block among the pieces of motion information in the list may be specified using the index of the list.
To improve the encoding efficiency, the encoding apparatus 100 may signal only an index of an element that generates the minimum cost in inter prediction of the target block among elements in the list. The encoding apparatus 100 may encode the index and may signal the encoded index.
Therefore, the encoding apparatus 100 and the decoding apparatus 200 must be able to derive the above-described lists (i.e., the predicted motion vector candidate list and the merge candidate list) based on the same data using the same scheme. Here, the same data may include a reconstructed picture and a reconstructed block. Furthermore, in order to specify an element using an index, the order of the elements in the list must be fixed.
Fig. 9 illustrates spatial candidates according to an embodiment.
In fig. 9, the positions of the spatial candidates are shown.
The large block at the center of the drawing may represent the target block. The five small blocks may represent the spatial candidates.
The coordinates of the target block may be (xP, yP), and the size of the target block may be expressed by (nPSW, nPSH).
Spatial candidate A0 may be a block adjacent to the lower-left corner of the target block. A0 may be a block that occupies the pixel located at coordinates (xP-1, yP+nPSH+1).
Spatial candidate A1 may be a block adjacent to the left side of the target block. A1 may be the lowest block among the blocks adjacent to the left side of the target block. Alternatively, A1 may be a block adjacent to the top of A0. A1 may be a block that occupies the pixel located at coordinates (xP-1, yP+nPSH).
Spatial candidate B0 may be a block adjacent to the upper-right corner of the target block. B0 may be a block that occupies the pixel located at coordinates (xP+nPSW+1, yP-1).
Spatial candidate B1 may be a block adjacent to the top of the target block. B1 may be the rightmost block among the blocks adjacent to the top of the target block. Alternatively, B1 may be a block adjacent to the left of B0. B1 may be a block that occupies the pixel located at coordinates (xP+nPSW, yP-1).
Spatial candidate B2 may be a block adjacent to the upper-left corner of the target block. B2 may be a block that occupies the pixel located at coordinates (xP-1, yP-1).
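Using the coordinates given above, the pixel positions occupied by the spatial candidates may be computed as in the following sketch; the dictionary layout of the result is only an illustrative assumption.

```python
def spatial_candidate_positions(xP, yP, nPSW, nPSH):
    """Return the coordinates of the pixels occupied by spatial candidates A0, A1, B0, B1, and B2."""
    return {
        "A0": (xP - 1, yP + nPSH + 1),   # adjacent to the lower-left corner of the target block
        "A1": (xP - 1, yP + nPSH),       # lowest block adjacent to the left side
        "B0": (xP + nPSW + 1, yP - 1),   # adjacent to the upper-right corner
        "B1": (xP + nPSW, yP - 1),       # rightmost block adjacent to the top
        "B2": (xP - 1, yP - 1),          # adjacent to the upper-left corner
    }
```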
Determining availability of spatial and temporal candidates
In order to include motion information of a spatial candidate or motion information of a temporal candidate in a list, it is necessary to determine whether the motion information of the spatial candidate or the motion information of the temporal candidate is available.
Hereinafter, the candidate block may include a spatial candidate and a temporal candidate.
For example, the operation of determining whether motion information of a spatial candidate or motion information of a temporal candidate is available may be performed by sequentially applying the following steps 1) to 4).
Step 1) when a PU including a candidate block is located outside the boundary of a picture, the availability of the candidate block may be set to "false". The expression "availability is set to false" may have the same meaning as "set to unavailable".
Step 2) when the PU including the candidate block is located outside the boundary of the stripe, the availability of the candidate block may be set to "false". When the target block and the candidate block are located in different stripes, the availability of the candidate block may be set to "false".
Step 3) when the PU including the candidate block is located outside the boundary of the parallel block, the availability of the candidate block may be set to "false". When the target block and the candidate block are located in different parallel blocks, the availability of the candidate block may be set to "false".
Step 4) when the prediction mode of the PU including the candidate block is an intra prediction mode, the availability of the candidate block may be set to "false". When the PU including the candidate block does not use inter prediction, the availability of the candidate block may be set to "false".
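A sketch of the sequential availability check in steps 1) to 4); the attribute names on the candidate and target blocks are hypothetical and serve only to express the conditions.

```python
def is_candidate_available(candidate, target):
    """Apply steps 1) to 4): the candidate is unavailable when its PU lies outside
    the picture boundary, belongs to a different stripe or parallel block than the
    target block, or does not use inter prediction."""
    if candidate.outside_picture_boundary:          # step 1
        return False
    if candidate.slice_id != target.slice_id:       # step 2 (different stripes)
        return False
    if candidate.tile_id != target.tile_id:         # step 3 (different parallel blocks)
        return False
    if candidate.prediction_mode == "intra":        # step 4 (not inter-predicted)
        return False
    return True
```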
Fig. 10 illustrates a sequence of adding motion information of a spatial candidate to a merge list according to an embodiment.
As shown in fig. 10, when pieces of motion information of spatial candidates are added to the merge list, the order of A1, B1, B0, A0, and B2 may be used. That is, pieces of motion information of available spatial candidates are added to the merge list in the order of A1, B1, B0, A0, and B2.
Method for deriving a merge list in merge mode and skip mode
As described above, the maximum number of merge candidates in the merge list may be set. "N" may be used to indicate the set maximum number. The set number may be transmitted from the encoding apparatus 100 to the decoding apparatus 200. The stripe header may include N. In other words, the maximum number of merge candidates in the merge list for the target block of a stripe may be set by the stripe header. For example, the value of N may be basically 5.
The pieces of motion information (i.e., merge candidates) may be added to the merge list in the order of the following steps 1) to 4).
Step 1) Among the spatial candidates, available spatial candidates may be added to the merge list. The pieces of motion information of the available spatial candidates may be added to the merge list in the order shown in fig. 10. Here, when the motion information of an available spatial candidate overlaps with other motion information already present in the merge list, the motion information of that spatial candidate may not be added to the merge list. The operation of checking whether corresponding motion information overlaps with other motion information present in the list may be referred to as "overlap checking".
The maximum number of motion information added may be N.
Step 2) When the number of pieces of motion information in the merge list is less than N and a temporal candidate is available, the motion information of the temporal candidate may be added to the merge list. Here, when the motion information of the available temporal candidate overlaps with other motion information already present in the merge list, the motion information of the temporal candidate may not be added to the merge list.
Step 3) When the number of pieces of motion information in the merge list is less than N and the type of the target stripe is "B", combined motion information generated by combined bi-prediction may be added to the merge list.
The target stripe may be a stripe that includes a target block.
The combined motion information may be a combination of L0 motion information and L1 motion information. The L0 motion information may be motion information referring to only the reference picture list L0. The L1 motion information may be motion information referring to only the reference picture list L1.
In the merge list, there may be one or more pieces of L0 motion information. Further, in the merge list, there may be one or more pieces of L1 motion information.
The combined motion information may include one or more pieces of combined motion information. When the combined motion information is generated, L0 motion information and L1 motion information to be used for the step of generating the combined motion information among the one or more pieces of L0 motion information and the one or more pieces of L1 motion information may be predefined. One or more pieces of combined motion information may be generated in a predefined order via bi-prediction using a combination of a pair of different motion information in a merged list. One piece of motion information of the pair of different motion information may be L0 motion information, and the other piece of motion information of the pair of different motion information may be L1 motion information.
For example, the combined motion information added with the highest priority may be a combination of the L0 motion information having merge index 0 and the L1 motion information having merge index 1. When the motion information having merge index 0 is not L0 motion information or when the motion information having merge index 1 is not L1 motion information, the combined motion information may be neither generated nor added. Next, the combined motion information added with the next priority may be a combination of the L0 motion information having merge index 1 and the L1 motion information having merge index 0. The detailed combinations that follow may conform to other combinations used in the field of video encoding/decoding.
Here, when the combined motion information overlaps with other motion information already existing in the merge list, the combined motion information may not be added to the merge list.
Step 4) When the number of pieces of motion information in the merge list is less than N, zero vector motion information may be added to the merge list.
The zero vector motion information may be motion information in which the motion vector is a zero vector.
The number of zero vector motion information may be one or more. The reference picture indexes of one or more pieces of zero vector motion information may be different from each other. For example, the value of the reference picture index of the first zero vector motion information may be 0. The value of the reference picture index of the second zero vector motion information may be 1.
The number of zero vector motion information pieces may be the same as the number of reference pictures in the reference picture list.
The reference direction of the zero vector motion information may be bidirectional. Both motion vectors may be zero vectors. The number of pieces of zero vector motion information may be the smaller of the number of reference pictures in the reference picture list L0 and the number of reference pictures in the reference picture list L1. Alternatively, when the number of reference pictures in the reference picture list L0 and the number of reference pictures in the reference picture list L1 differ from each other, a unidirectional reference direction may be used for a reference picture index that is applicable to only a single reference picture list.
The encoding apparatus 100 and/or the decoding apparatus 200 may sequentially add zero vector motion information to the merge list while changing the reference picture index.
When the zero vector motion information overlaps with other motion information already present in the merge list, the zero vector motion information may not be added to the merge list.
The order of steps 1) to 4) described above is merely exemplary and may be changed. Furthermore, some of the above steps may be omitted according to predefined conditions.
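Putting steps 1) to 4) together, the derivation of the merge list may be sketched as follows. Here each candidate is represented as a dict with optional 'L0' and 'L1' entries of the form (motion vector, reference picture index); this representation, the pair order used for combined bi-prediction, and the helper structure are assumptions made only to keep the sketch short.

```python
def derive_merge_list(spatial_cands, temporal_cands, num_ref_pics, n, slice_type):
    """spatial_cands is assumed to already be ordered A1, B1, B0, A0, B2, with None
    for unavailable candidates; temporal_cands likewise."""
    merge_list = []

    def try_add(cand):
        # overlap checking: skip motion information already present in the list
        if cand is not None and cand not in merge_list and len(merge_list) < n:
            merge_list.append(cand)

    # Step 1: motion information of available spatial candidates
    for cand in spatial_cands:
        try_add(cand)

    # Step 2: motion information of the available temporal candidate(s)
    for cand in temporal_cands:
        try_add(cand)

    # Step 3: combined bi-prediction for B stripes, pairing L0 and L1 information
    if slice_type == "B":
        pairs = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1)]  # exemplary order
        for i, j in pairs:
            if i < len(merge_list) and j < len(merge_list):
                l0, l1 = merge_list[i].get("L0"), merge_list[j].get("L1")
                if l0 is not None and l1 is not None:
                    try_add({"L0": l0, "L1": l1})

    # Step 4: zero vector motion information, changing the reference picture index
    for ref_idx in range(num_ref_pics):
        try_add({"L0": ((0, 0), ref_idx), "L1": ((0, 0), ref_idx)})

    return merge_list
```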
Method for deriving a list of predicted motion vector candidates in AMVP mode
The maximum number of predicted motion vector candidates in the predicted motion vector candidate list may be predefined. The predefined maximum number may be indicated with N. For example, the predefined maximum number may be 2.
A plurality of pieces of motion information (i.e., predicted motion vector candidates) may be added to the predicted motion vector candidate list in the order of the following steps 1) to 3).
Step 1) Available spatial candidates among the spatial candidates may be added to the predicted motion vector candidate list. The spatial candidates may include a first spatial candidate and a second spatial candidate.
The first spatial candidate may be one of A0, A1, scaled A0, and scaled A1. The second spatial candidate may be one of B0, B1, B2, scaled B0, scaled B1, and scaled B2.
The plurality of pieces of motion information of the available spatial candidates may be added to the predicted motion vector candidate list in the order of the first spatial candidate and the second spatial candidate. In this case, when the motion information of the available spatial candidate overlaps with other motion information already existing in the predicted motion vector candidate list, the motion information of the available spatial candidate may not be added to the predicted motion vector candidate list. In other words, when the value of N is 2, if the motion information of the second spatial candidate is the same as the motion information of the first spatial candidate, the motion information of the second spatial candidate may not be added to the predicted motion vector candidate list.
The maximum number of motion information added may be N.
Step 2) When the number of pieces of motion information in the predicted motion vector candidate list is less than N and a temporal candidate is available, the motion information of the temporal candidate may be added to the predicted motion vector candidate list. In this case, when the motion information of the available temporal candidate overlaps with other motion information already present in the predicted motion vector candidate list, the motion information of the available temporal candidate may not be added to the predicted motion vector candidate list.
Step 3) When the number of pieces of motion information in the predicted motion vector candidate list is less than N, zero vector motion information may be added to the predicted motion vector candidate list.
The zero vector motion information may include one or more pieces of zero vector motion information. The reference picture indexes of the one or more pieces of zero vector motion information may be different from each other.
The encoding apparatus 100 and/or the decoding apparatus 200 may sequentially add a plurality of pieces of zero vector motion information to the predicted motion vector candidate list while changing the reference picture index.
When the zero vector motion information overlaps with other motion information already present in the predicted motion vector candidate list, the zero vector motion information may not be added to the predicted motion vector candidate list.
The description of zero vector motion information made above in connection with the merge list is also applicable to the zero vector motion information here. A repeated description thereof will be omitted.
The order of steps 1) to 3) described above is merely exemplary and may be changed. Furthermore, some of the steps may be omitted according to predefined conditions.
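A corresponding sketch for the predicted motion vector candidate list, following steps 1) to 3) with the predefined maximum number N (for example, 2). Each candidate is represented as a (motion vector, reference picture index) pair; this representation is an assumption.

```python
def derive_mv_predictor_list(first_spatial, second_spatial, temporal, num_ref_pics, n=2):
    """first_spatial, second_spatial, and temporal are (motion vector, reference
    picture index) pairs, or None when the corresponding candidate is unavailable."""
    mvp_list = []

    def try_add(cand):
        # skip unavailable candidates and motion information already in the list
        if cand is not None and cand not in mvp_list and len(mvp_list) < n:
            mvp_list.append(cand)

    # Step 1: first and second spatial candidates
    try_add(first_spatial)
    try_add(second_spatial)

    # Step 2: temporal candidate, used only while the list is still short
    try_add(temporal)

    # Step 3: zero vector motion information, changing the reference picture index
    for ref_idx in range(max(num_ref_pics, 1)):
        try_add(((0, 0), ref_idx))
    while len(mvp_list) < n:             # pad in case duplicates were skipped
        mvp_list.append(((0, 0), 0))

    return mvp_list
```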
Fig. 11 illustrates a transform and quantization process according to an example.
As shown in fig. 11, the level of quantization may be generated by performing a transform and/or quantization process on the residual signal.
The residual signal may be generated as a difference between the original block and the predicted block. Here, the prediction block may be a block generated via intra prediction or inter prediction.
The transform may include at least one of a primary transform and a secondary transform. Transform coefficients may be generated by performing the primary transform on the residual signal, and secondary transform coefficients may be generated by performing the secondary transform on the transform coefficients.
The primary transform may be performed using at least one of a plurality of predefined transform methods. For example, the plurality of predefined transform methods may include the Discrete Cosine Transform (DCT), the Discrete Sine Transform (DST), the Karhunen-Loève Transform (KLT), and the like.
The secondary transform may be performed on transform coefficients generated by performing the primary transform.
The transform method applied to the primary transform and/or the secondary transform may be determined based on at least one of the encoding parameters for the target block and/or a neighboring block. Alternatively, transform information indicating the transform method may be signaled by the encoding apparatus 100 to the decoding apparatus 200.
The level of quantization may be generated by performing quantization on a result generated by performing the primary transform and/or the secondary transform or performing quantization on the residual signal.
The quantized level may be scanned based on at least one of an upper right diagonal scan, a vertical scan, and a horizontal scan according to at least one of an intra prediction mode, a block size, and a block form.
For example, coefficients of a block may be changed to a 1D vector form by scanning the coefficients using an upper right diagonal scan. Alternatively, a vertical scan that scans the 2D block format coefficients in the column direction or a horizontal scan that scans the 2D block format coefficients in the row direction may be used instead of the upper right diagonal scan, depending on the size of the intra block and/or the intra prediction mode.
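The three scan orders mentioned above may be illustrated as follows; the up-right diagonal scan shown here is a simplified rendering of the idea (anti-diagonals traversed from bottom-left to top-right) rather than the exact scan of any particular standard.

```python
def up_right_diagonal_scan(block):
    """Convert a 2D block of quantized levels into a 1D vector along anti-diagonals,
    moving from bottom-left to top-right within each diagonal."""
    h, w = len(block), len(block[0])
    out = []
    for d in range(h + w - 1):
        for r in range(min(d, h - 1), -1, -1):   # rows from bottom to top
            c = d - r
            if c < w:
                out.append(block[r][c])
    return out

def vertical_scan(block):
    """Scan the 2D block coefficients in the column direction."""
    return [block[r][c] for c in range(len(block[0])) for r in range(len(block))]

def horizontal_scan(block):
    """Scan the 2D block coefficients in the row direction."""
    return [value for row in block for value in row]
```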
The scanned quantized levels may be entropy encoded, and the bitstream may include the entropy encoded quantized levels.
The decoding apparatus 200 may generate the level of quantization via entropy decoding the bitstream. The quantized levels may be arranged in the form of 2D blocks via inverse scanning. Here, as a method of inverse scanning, at least one of upper right diagonal scanning, vertical scanning, and horizontal scanning may be performed.
Dequantization may be performed on the quantized level. A secondary inverse transform may be performed on the result of the dequantization, depending on whether the secondary inverse transform is to be performed. Further, a primary inverse transform may be performed on the result of the secondary inverse transform, depending on whether the primary inverse transform is to be performed. The reconstructed residual signal may be generated by performing the primary inverse transform on the result generated by performing the secondary inverse transform.
Fig. 12 is a configuration diagram of an encoding apparatus according to an embodiment.
The encoding apparatus 1200 may correspond to the encoding apparatus 100 described above.
The encoding apparatus 1200 may include a processing unit 1210, a memory 1230, a User Interface (UI) input device 1250, a UI output device 1260, and storage 1240, which communicate with each other through a bus 1290. The encoding apparatus 1200 may also include a communication unit 1220 connected to a network 1299.
The processing unit 1210 may be a Central Processing Unit (CPU) or a semiconductor device for executing processing instructions stored in the memory 1230 or the storage 1240. The processing unit 1210 may be at least one hardware processor.
The processing unit 1210 may generate and process signals, data, or information input to the encoding apparatus 1200, output from the encoding apparatus 1200, or used in the encoding apparatus 1200, and may perform checking, comparison, determination, or the like related to the signals, data, or information. In other words, in an embodiment, the generation and processing of data or information, as well as the checking, comparing, and determining related to the data or information, may be performed by the processing unit 1210.
The processing unit 1210 may include an inter prediction unit 110, an intra prediction unit 120, a switcher 115, a subtractor 125, a transform unit 130, a quantization unit 140, an entropy coding unit 150, a dequantization unit 160, an inverse transform unit 170, an adder 175, a filtering unit 180, and a reference picture buffer 190.
At least some of the inter prediction unit 110, the intra prediction unit 120, the switcher 115, the subtractor 125, the transform unit 130, the quantization unit 140, the entropy encoding unit 150, the dequantization unit 160, the inverse transform unit 170, the adder 175, the filtering unit 180, and the reference picture buffer 190 may be program modules and may communicate with external devices or systems. The program modules may be included in the encoding device 1200 in the form of an operating system, application program modules, or other program modules.
The program modules may be physically stored in various types of well known storage devices. Furthermore, at least some of the program modules may also be stored in a remote storage device capable of communicating with the encoding apparatus 1200.
Program modules may include, but are not limited to, routines, subroutines, programs, objects, components, and data structures for performing functions or operations according to embodiments or for implementing abstract data types according to embodiments.
The program modules may be implemented using instructions or code executed by at least one processor of the encoding apparatus 1200.
The processing unit 1210 may run instructions or code in the inter prediction unit 110, the intra prediction unit 120, the switcher 115, the subtractor 125, the transform unit 130, the quantization unit 140, the entropy encoding unit 150, the dequantization unit 160, the inverse transform unit 170, the adder 175, the filtering unit 180, and the reference picture buffer 190.
The storage unit may represent the memory 1230 and/or the storage 1240. Each of the memory 1230 and the storage 1240 may be any of various types of volatile or non-volatile storage media. For example, the memory 1230 may include at least one of a Read Only Memory (ROM) 1231 and a Random Access Memory (RAM) 1232.
The storage unit may store data or information for operation of the encoding apparatus 1200. In an embodiment, data or information of the encoding apparatus 1200 may be stored in a storage unit.
For example, the storage unit may store pictures, blocks, lists, motion information, inter prediction information, bitstreams, and the like.
The encoding apparatus 1200 may be implemented in a computer system including a computer readable storage medium.
The storage medium may store at least one module required for the operation of the encoding apparatus 1200. The memory 1230 may store at least one module and may be configured to cause the at least one module to be executed by the processing unit 1210.
The functions related to the communication of data or information of the encoding apparatus 1200 may be performed by the communication unit 1220.
For example, the communication unit 1220 may transmit a bitstream to a decoding apparatus 1300 to be described later.
Fig. 13 is a configuration diagram of a decoding apparatus according to an embodiment.
The decoding apparatus 1300 may correspond to the decoding apparatus 200 described above.
Decoding apparatus 1300 may include a processing unit 1310, a memory 1330, a User Interface (UI) input device 1350, a UI output device 1360, and a storage 1340 in communication with each other through a bus 1390. The decoding apparatus 1300 may further include a communication unit 1320 connected to the network 1399.
The processing unit 1310 may be a Central Processing Unit (CPU) or a semiconductor device configured to execute processing instructions stored in the memory 1330 or the storage 1340. The processing unit 1310 may be at least one hardware processor.
The processing unit 1310 may generate and process signals, data, or information input to the decoding apparatus 1300, output from the decoding apparatus 1300, or used in the decoding apparatus 1300, and may perform checking, comparing, determining, etc. related to the signals, data, or information. In other words, in an embodiment, the generation and processing of data or information, as well as the checking, comparing, and determining related to the data or information, may be performed by the processing unit 1310.
The processing unit 1310 may include an entropy decoding unit 210, a dequantization unit 220, an inverse transformation unit 230, an intra prediction unit 240, an inter prediction unit 250, an adder 255, a filtering unit 260, and a reference picture buffer 270.
At least some of the entropy decoding unit 210, the dequantization unit 220, the inverse transformation unit 230, the intra prediction unit 240, the inter prediction unit 250, the adder 255, the filtering unit 260, and the reference picture buffer 270 of the decoding apparatus 1300 may be program modules, and may communicate with external devices or systems. The program modules may be included in the decoding device 1300 in the form of an operating system, application program modules, or other program modules.
Program modules may be physically stored in various types of well known storage devices. Furthermore, at least some of the program modules may also be stored in a remote memory storage device capable of communicating with the decoding apparatus 1300.
Program modules may include, but are not limited to, routines, subroutines, programs, objects, components, and data structures for performing functions or operations according to embodiments or for implementing abstract data types according to embodiments.
The program modules may be implemented using instructions or code executed by at least one processor of decoding device 1300.
The processing unit 1310 may run instructions or codes in the entropy decoding unit 210, the dequantization unit 220, the inverse transformation unit 230, the intra prediction unit 240, the inter prediction unit 250, the adder 255, the filtering unit 260, and the reference picture buffer 270.
The storage unit may represent the memory 1330 and/or the storage 1340. Each of the memory 1330 and the storage 1340 may be any of various types of volatile or non-volatile storage media. For example, the memory 1330 may include at least one of ROM 1331 and RAM 1332.
The storage unit may store data or information for operation of the decoding apparatus 1300. In an embodiment, data or information of the decoding apparatus 1300 may be stored in a storage unit.
For example, the storage unit may store pictures, blocks, lists, motion information, inter prediction information, bitstreams, and the like.
Decoding device 1300 may be implemented in a computer system comprising a computer-readable storage medium.
The storage medium may store at least one module required for the operation of the decoding apparatus 1300. The memory 1330 may store at least one module and may be configured to cause the at least one module to be executed by the processing unit 1310.
The functions related to the communication of data or information of the decoding apparatus 1300 may be performed by the communication unit 1320.
For example, the communication unit 1320 may receive a bitstream from the encoding apparatus 1200.
Fig. 14 is a flowchart of a prediction method according to an embodiment.
The prediction method may be performed by the encoding apparatus 1200 and/or the decoding apparatus 1300.
For example, the encoding apparatus 1200 may perform the prediction method according to the embodiment to compare the efficiency of a plurality of prediction schemes for a target block and/or a plurality of partitions, and may also perform the prediction method according to the present embodiment to generate a reconstructed block of the target block.
In an embodiment, the target block may be at least one of CTU, CU, PU, TU, a block having a specific size, and a block having a size falling within a predefined range.
For example, the decoding apparatus 1300 may perform the prediction method according to the embodiment to generate a reconstructed block of the target block.
Hereinafter, the processing unit may correspond to the processing unit 1210 of the encoding apparatus 1200 and/or the processing unit 1310 of the decoding apparatus 1300.
At step 1410, the processing unit may generate a plurality of tiles by dividing the target block.
The processing unit may generate the plurality of tiles by dividing the target block using coding parameters associated with the target block.
In an embodiment, the processing unit may generate the plurality of tiles by dividing the target block based on one or more of a size of the target block and a shape of the target block.
For example, the target block may include a plurality of tiles. The plurality of tiles may also be referred to as "a plurality of sub-tiles".
An example of step 1410 is described in detail below with reference to fig. 15.
In an embodiment, it may be determined whether to perform step 1410, i.e., whether to generate a plurality of tiles by dividing the target block, based on information related to the target block. The processing unit may determine whether to apply partitioning to the target block based on the information related to the target block.
In an embodiment, the information related to the target block may include at least one of coding parameters of the target block, picture related information of a target picture including the target block, information about a slice including the target block, a Quantization Parameter (QP) of the target block, a Coded Block Flag (CBF) of the target block, a size of the target block, a depth of the target block, a shape of the target block, an entropy coding scheme of the target block, partition information of a reference block for the target block, a temporal layer level of the target block, and a block partition indicator (flag).
The reference block may include one or more of a block spatially adjacent to the target block and a block temporally adjacent to the target block.
1) In an embodiment, the processing unit may determine whether to apply partitioning to the target block according to picture-related information of the target picture. For example, a Picture Parameter Set (PPS) of the target picture may include information indicating whether blocks in the target picture are to be divided. Through the PPS, information indicating whether blocks in the target picture are to be divided may be encoded and/or decoded. Alternatively, through the PPS, a picture for which blocks are to be divided, or a picture for which blocks are not to be divided, may be identified.
For example, when a non-square target block is included in a picture set such that blocks in the picture are to be divided, the processing unit may divide the non-square target block into square blocks.
2) In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on information about a specific picture. For example, the specific picture may be a picture preceding the target picture.
For example, the processing unit may determine whether to apply the partitioning to the target block according to whether the partitioning has been applied to a block in a picture preceding the target picture.
For example, when a non-square block in a picture preceding the target picture is divided into square blocks, the processing unit may divide the non-square block in the target picture into square blocks.
3) In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on the information about the stripe. The stripe may include a target block. Alternatively, the stripe may include a reference block.
For example, the processing unit may determine whether to apply partitioning to the target block according to the type of stripe. The stripe types may include I-stripe, B-stripe, and P-stripe.
For example, when a non-square target block is included in the I-slice, the processing unit may divide the non-square target block of the target picture into square blocks.
For example, when a non-square block is included in a P-stripe or a B-stripe, the processing unit may divide the non-square block into square blocks.
4) In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on the information about the additional stripes.
For example, the additional stripes may be stripes before or after the corresponding stripe that includes the target block. The additional stripe may be a stripe including a reference block for the target block.
For example, the processing unit may determine whether to apply partitioning to the target block based on the type of additional stripe. The types of additional stripes may include I-stripes, B-stripes, and P-stripes.
For example, when the additional slice is an I slice, the processing unit may divide a non-square target block of the target picture into square blocks.
For example, when the additional stripes are P-stripes or B-stripes, the processing unit may divide the non-square target block into square blocks.
5) In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on quantization parameters of the target block.
For example, when the quantization parameter of the non-square target block falls within a certain range, the processing unit may divide the non-square target block into square blocks.
6) In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on a Coded Block Flag (CBF) of the target block.
For example, when the value of CBF of the non-square target block is equal to or corresponds to a specific value, the processing unit may divide the non-square target block into square blocks.
7) In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on the size of the target block.
For example, the processing unit may divide the non-square target block into square blocks when the size of the non-square target block is 1) equal to a specific size or 2) falls within a specific range.
For example, the processing unit may divide the non-square target block into square blocks when the sum of the width and the height of the non-square target block is 1) equal to a specific value, 2) equal to or greater than a specific value, 3) less than or equal to a specific value, or 4) falls within a specific range. For example, the specific value may be 16.
8) In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on the depth of the target block.
For example, the processing unit may divide the non-square target block into square blocks when the depth of the non-square target block is 1) equal to a specific depth or 2) falls within a specific range.
9) In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on the shape of the target block.
For example, the processing unit may divide the non-square target block into square blocks when the ratio of the width to the height of the non-square target block 1) is equal to a specific value or 2) falls within a specific range.
10 In an embodiment, the processing unit may determine whether to apply partitioning to the target block based on the block partitioning indicator (flag).
The block division indicator may be an indicator indicating whether the target block is to be divided. Further, the block partition indicator may indicate a type of partitioning the target block.
The type of division may include the direction of the division. The direction of division may be a vertical direction or a horizontal direction.
The type of division may include the number of tiles generated by performing the division.
In an embodiment, the indicator may include information explicitly signaled from the encoding device 1200 to the decoding device 1300 through the bitstream. In an embodiment, the indicator may comprise a block partition indicator.
When the block division indicator is used, the decoding apparatus 1300 may directly determine whether to divide the target block and what type of division to use based on the block division indicator provided from the encoding apparatus 1200.
The block partition indicator may be optional. When the block partition indicator is not used, the processing unit may determine whether to divide the target block and what type of division is to be used based on conditions that use the information related to the target block. Thus, whether to divide the target block may be determined without signaling additional information.
For example, when the block division indicator indicates that the target block is to be divided, the processing unit may divide the non-square target block into square blocks.
The block partition indicator may be encoded and/or decoded for at least one of a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), a slice header, a parallel block header, a Coding Tree Unit (CTU), a Coding Unit (CU), a Prediction Unit (PU), and a Transform Unit (TU). In other words, the unit used to provide the block partition indicator may be at least one of the SPS, the PPS, the slice header, the parallel block header, the CTU, the CU, the PU, and the TU. The block partition indicator provided for a specific unit may be commonly applied to one or more target blocks included in the specific unit.
11 In an embodiment, the processing unit may determine whether to apply the partitioning to the target block based on the partitioning information of the reference block.
The reference blocks may be spatially adjacent blocks and/or temporally adjacent blocks.
For example, the partition information may be at least one of quadtree partition information, binary tree partition information, and quadtree plus binary tree (QTBT) information.
For example, when the reference block division information indicates that the target block is to be divided, the processing unit may divide the non-square target block into square blocks.
12 In an embodiment, the processing unit may determine whether to apply the partitioning to the target block based on the temporal layer level of the target block.
For example, the processing unit may divide the non-square target block into square blocks when the temporal layer level 1) of the non-square target block is equal to a specific value or 2) falls within a specific range.
13 In addition, in an embodiment, the information related to the target block may further include the information described above that is used to encode and/or decode the target block.
In embodiments 1) to 13) described above, specific values, specific ranges, and/or specific units may be set by the encoding apparatus 1200 or the decoding apparatus 1300. When a specific value, a specific range, and a specific unit are set by the encoding apparatus 1200, the set specific value, the set specific range, and/or the set specific unit may be signaled from the encoding apparatus 1200 to the decoding apparatus 1300 through a bitstream.
Optionally, specific values, specific ranges, and/or specific units may be derived from the additional encoding parameters. When the encoding parameters are shared between the encoding apparatus 1200 and the decoding apparatus 1300 through a bitstream, or when the encoding parameters can be equally derived by the encoding apparatus 1200 and the decoding apparatus 1300 using a predefined derivation scheme, the specific values, specific ranges, and/or specific units may not be signaled from the encoding apparatus 1200 to the decoding apparatus 1300.
In the embodiments 1) to 13) described above, the operation of determining whether to divide a target block based on the criterion for the shape of the target block is merely an example. In embodiments 1) to 13) described above, the operation of determining whether to divide the target block may be combined with other criteria described in the embodiments (such as the size of the target block).
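As a hedged illustration of how a few of the criteria in items 1) to 13) above may be combined, the following sketch decides whether a non-square target block is to be divided into square blocks; the argument names, the precedence given to the block partition indicator, and the threshold of 16 are assumptions, not values fixed by the embodiment.

```python
def should_partition(width, height, block_partition_flag=None, size_sum_threshold=16):
    """Return True when the target block is to be divided into square blocks.

    When the block partition indicator is signaled, it is assumed to take
    precedence; otherwise the decision is derived from the shape and size of the
    target block without signaling additional information."""
    if width == height:
        return False                    # a square target block is not divided here
    if block_partition_flag is not None:
        return block_partition_flag     # explicit indicator from the bitstream
    # example derived condition: divide a small non-square block
    return (width + height) <= size_sum_threshold
```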
In step 1420, the processing unit may derive a prediction mode for at least some of the plurality of partitions.
In an embodiment, the prediction may be intra prediction or inter prediction.
An example of step 1420 will be described in detail below with reference to fig. 21.
In step 1430, the processing unit may perform prediction on the plurality of partition blocks based on the derived prediction mode.
In an embodiment, the processing unit may perform prediction on at least some of the plurality of partition blocks using the derived prediction mode. The processing unit may perform prediction on the remaining blocks among the plurality of partitions using a prediction mode generated based on the derived prediction mode.
Prediction performed on the split blocks will be described below with reference to fig. 22, 23, 24, 25, 26, 27, and 28.
Fig. 15 is a flowchart of a block division method according to an embodiment.
The block division method according to the present embodiment may correspond to step 1410 described above. Step 1410 may include at least one of steps 1510 and 1520.
At step 1410, the processing unit may generate a plurality of tiles by dividing the target block based on one or more of a size of the target block and a shape of the target block.
The size of the target block may represent the width and/or height of the target block.
The shape of the target block may indicate whether the target block has a square shape. The shape of the target block may indicate whether the target block has a square shape or a non-square shape. The shape of the target block may be a ratio of the width to the height of the target block.
The processing unit may generate the plurality of partition blocks by dividing the target block using at least one of the division method at step 1510 and the division method at step 1520.
At step 1510, the processing unit may generate a plurality of tiles by dividing the target block based on the width or height of the target block.
In an embodiment, the processing unit may divide the target block when the width and the height of the target block are different from each other.
In an embodiment, the processing unit may divide the larger one of the width and the height of the target block at least once.
In an embodiment, the processing unit may divide the target block such that the width and the height of the block are identical to each other. Alternatively, the width and height of the tiles generated by the division may be equal to or greater than the smaller one of the width and height of the target block.
Examples of the target block and the division of the target block based on the size of the target block will be described later with reference to fig. 16, 17, 18, 19, and 20.
In an embodiment, the processing unit may divide the target block when the size of the target block is smaller than the specific size and the height and width of the target block are different from each other.
In an embodiment, the processing unit may divide the target block when a sum of the width and the height of the target block is smaller than a specific value and the width and the height of the target block are different from each other.
In an embodiment, the processing unit may divide the target block when the size of the target block falls within a specific range and the width and the height of the target block are different from each other.
At step 1520, the processing unit may generate a plurality of tiles by dividing the target block based on the shape of the target block.
When the target block has a square shape, the processing unit may not divide the target block.
When the target block has a non-square shape, the processing unit may divide the target block into square shapes. The operation of dividing into square shapes will be described later with reference to fig. 16, 17, 18, and 20.
As described above at steps 1510 and 1520, the processing unit may determine whether to divide the target block using only the size and shape of the target block, and may not use information directly indicating whether to divide the target block. Accordingly, information indicating whether to divide a block may not be signaled from the encoding apparatus 1200 to the decoding apparatus 1300, and whether to divide a block may be derived based on the size and/or shape of the target block.
Fig. 16 shows an 8 x 4 target block according to an example.
In fig. 17, the division of the target block will be explained.
Fig. 17 shows 4×4 tiling according to an example.
The size of each of the first and second sub-blocks may be 4×4.
As shown in fig. 17, when the width of the target block is greater than the height of the target block, the width of the target block shown in fig. 16 may be vertically divided, and thus two partition blocks may be derived.
Fig. 18 shows a 4 x 16 target block according to an example.
In fig. 19 and 20, the division of the target block will be explained.
Fig. 19 shows 8×4 tiling according to an example.
The size of each of the first and second sub-blocks may be 8×4.
Fig. 20 shows 4 x 4 tiling according to an example.
The size of each of the first, second, third and fourth sub-blocks may be 4×4.
As shown in fig. 19 and 20, when the height of the target block is greater than the width of the target block, the height of the target block shown in fig. 18 is horizontally divided, and thus two or four partitions can be derived.
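As illustrated in figs. 16 to 20, a non-square target block may be divided into square partition blocks whose side equals the smaller of the width and the height. The following sketch expresses that division; the return format (top-left coordinates and size of each partition block) is an assumption.

```python
def split_into_squares(x, y, width, height):
    """Divide a non-square block into square partition blocks of side min(width, height).

    An 8x4 block yields two 4x4 partition blocks (figs. 16 and 17), and a 4x16 block
    yields four 4x4 partition blocks (figs. 18 and 20)."""
    side = min(width, height)
    partitions = []
    if width > height:                  # divide the width vertically
        for dx in range(0, width, side):
            partitions.append((x + dx, y, side, side))
    else:                               # divide the height horizontally
        for dy in range(0, height, side):
            partitions.append((x, y + dy, side, side))
    return partitions
```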
Fig. 21 is a flow chart of a method for deriving a prediction mode for a partition according to an example.
The prediction mode derivation method according to the embodiment may correspond to step 1420 described above. Step 1420 may include at least one of steps 2110, 2120, and 2130.
For a plurality of partitions generated by dividing a target block, 1) respective prediction modes may be derived for the plurality of partitions, 2) prediction modes may be derived for specific partitions among the plurality of partitions, and 3) a common prediction mode may be derived for all of the plurality of partitions.
At least one of steps 2110, 2120 and 2130 may be performed according to the goal of the derived prediction mode.
At step 2110, the processing unit may derive respective prediction modes for the plurality of partition blocks.
The processing unit may derive the respective prediction modes for the plurality of partition blocks using the prediction mode derivation method described in the embodiments above.
In step 2120, the processing unit may derive a prediction mode for a particular partition among the plurality of partition blocks.
The specific partition block may be a block located at a specific position among a plurality of partition blocks.
For example, the specific partition block may be one or more of an uppermost block, a lowermost block, a leftmost block, a rightmost block, an nth block from the top, an nth block from the bottom, an nth block from the left, and an nth block from the right among the plurality of partition blocks. Here, n may be an integer equal to or greater than 1 and less than or equal to the number of tiles.
In an embodiment, the processing unit may derive the prediction mode for a specific partition among the plurality of partition blocks using the prediction mode derivation method described in the foregoing embodiment.
In an embodiment, the derived prediction mode may be used for the remaining blocks other than the specific block among the plurality of blocks. The processing unit may use the derived prediction mode for the remaining blocks except for the specific block among the plurality of partition blocks.
In an embodiment, a combination of the derived prediction mode and the additional prediction mode may be used for the remaining blocks other than the specific block among the plurality of blocks. The processing unit may use a prediction mode determined by combining the derived prediction mode and the additional prediction mode for the remaining blocks except for the specific partition among the plurality of partition blocks.
For example, the additional prediction modes may be determined using coding parameters associated with each of the remaining blocks. The processing unit may determine the additional prediction mode using coding parameters related to the remaining blocks, and may determine the prediction mode for the remaining blocks using a combination of the above prediction mode and the additional prediction mode derived for the particular block.
For example, the combination of prediction modes may be a prediction mode whose direction lies between the directions of the plurality of prediction modes. Alternatively, the combination of prediction modes may be a prediction mode selected from among the prediction modes according to a specific priority. The prediction mode determined using the combination of prediction modes may be different from each of the prediction modes used for the combination.
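As a sketch of how such a combination could be formed, the function below either lets the derived mode win by priority or takes a mode whose direction lies between two angular modes. The mode numbering (0 = planar, 1 = DC, larger values angular) and the averaging rule are assumptions chosen only for illustration.

```python
def combine_modes(derived_mode, additional_mode, use_priority=False):
    """Combine a derived prediction mode with an additional prediction mode.

    With use_priority=True the derived mode simply wins (an assumed priority).
    Otherwise, if both modes are angular (numbered 2 and above here), the
    combined mode is the angular mode halfway between them, which generally
    differs from both input modes.
    """
    if use_priority:
        return derived_mode
    if derived_mode >= 2 and additional_mode >= 2:
        return (derived_mode + additional_mode) // 2
    return derived_mode
```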
In step 2130, the processing unit may derive a common prediction mode for all of the plurality of partitions. In other words, a single common prediction mode for multiple partitions may be derived.
For example, the processing unit may derive a common prediction mode for all of the plurality of partitions using common encoding parameters for all of the plurality of partitions.
Deriving prediction modes using Most Probable Mode (MPM)
In the derivation of the prediction modes for the above blocks, the processing unit may use the Most Probable Mode (MPM).
To use MPM, the processing unit may configure an MPM list.
The MPM list may include one or more MPM candidate modes. The number of the one or more MPM candidate modes may be N. N may be a positive integer.
In an embodiment, the processing unit may set the value of N according to the size and/or shape of the target block. Alternatively, the processing unit may set the value of N according to the size, shape, and/or number of the partition blocks.
Each of the one or more MPM candidate modes may be one of predefined intra-prediction modes.
The processing unit may configure one or more MPM candidate modes in the MPM list based on the one or more prediction modes for the one or more reference blocks of the target block. The reference block may be a block located at a predefined position or may be a block adjacent to the target block. For example, the one or more reference blocks may be a block adjacent to the top of the target block and a block adjacent to the left side of the target block.
The one or more MPM candidate modes may be one or more prediction modes decided based on the prediction modes of the reference blocks. The processing unit may determine one or more prediction modes specified with reference to the prediction modes of the one or more reference blocks as the one or more MPM candidate modes. In other words, the one or more MPM candidate modes may be prediction modes having a high probability of being the prediction mode of the target block. Such probabilities may be obtained through experimentation or the like. For example, it is known that, due to the local correlation between a reference block and the target block, the probability that the prediction mode of the reference block will be used as the prediction mode of the target block is high. Accordingly, the prediction mode of the reference block may be included in the one or more MPM candidate modes.
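A simple construction of such a list from the modes of the left and above reference blocks is sketched below. The candidate order and the filler modes (planar, DC, and assumed vertical/horizontal angular modes 26 and 10) follow a common HEVC-style convention and are illustrative assumptions, not the exact rule of this embodiment.

```python
PLANAR, DC = 0, 1
VERTICAL, HORIZONTAL = 26, 10   # assumed angular mode numbers

def build_mpm_list(left_mode, above_mode, n=3):
    """Configure an MPM list of up to n distinct candidate modes.

    left_mode / above_mode are the prediction modes of the reference blocks
    adjacent to the left and the top of the target block (None if unavailable).
    """
    mpm_list = []
    for mode in (left_mode, above_mode, PLANAR, DC, VERTICAL, HORIZONTAL):
        if mode is not None and mode not in mpm_list:
            mpm_list.append(mode)
        if len(mpm_list) == n:
            break
    return mpm_list
```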
In an embodiment, the number of MPM lists may be one or more, and may be plural. For example, the number of MPM lists may be M. M may be a positive integer. The processing unit may use different methods to configure the respective plurality of MPM lists.
For example, the processing unit may configure the first MPM list, the second MPM list, and the third MPM list.
The MPM candidate modes in the one or more MPM lists may be different from each other. Alternatively, the MPM candidate modes in the one or more MPM lists may not overlap each other. For example, when a specific intra prediction mode is included in one MPM list, the plurality of MPM lists may be configured such that the specific intra prediction mode is not included in the other MPM lists.
The MPM list indicator may be used to specify an MPM list including a prediction mode used to encode and/or decode the target block among the one or more MPM lists. In other words, an MPM list indicated by the MPM list indicator among the one or more MPM lists may be specified, and the processing unit may use any one of the one or more MPM candidate modes included in the specified MPM list for predicting the target block.
The MPM list indicator may be signaled from the encoding apparatus 1200 to the decoding apparatus 1300 through a bitstream.
When the MPM list indicator is used, the decoding apparatus 1300 may directly determine the MPM list including the MPM candidate modes to be used for predicting the target block among the one or more MPM lists based on the MPM list indicator provided from the encoding apparatus 1200.
In an embodiment, the MPM use indicator may indicate whether the prediction mode is to be decided using the MPM list.
The MPM use indicator may indicate whether a prediction mode of the target block exists among one or more MPM candidate modes in the configured MPM list.
When the MPM use indicator indicates that the prediction mode of the target block exists among the one or more MPM candidate modes, the processing unit may determine the prediction mode of the target block among the one or more MPM candidate modes using the index indicator.
The index indicator may indicate an MPM candidate mode to be used for predicting the target block among one or more MPM candidate modes in the MPM list. The processing unit may determine an MPM candidate mode indicated by the index indicator among one or more MPM candidate modes in the MPM list as a prediction mode of the target block. The index indicator may also be referred to as an "MPM index".
When the MPM list is indicated with the MPM list indicator among the one or more MPM lists, the index indicator may be used to indicate which MPM candidate mode among the one or more MPM candidate modes in the MPM list indicated by the MPM list indicator is to be used for predicting the target block. In other words, the prediction mode of the target block may be specified by the MPM list indicator and the index indicator.
When the MPM use indicator indicates that the prediction mode of the target block does not exist among the one or more MPM candidate modes in the MPM list, the processing unit may determine the prediction mode of the target block using the prediction mode indicator indicating the prediction mode of the target block. The prediction mode indicator may indicate a prediction mode of the target block.
The prediction mode indicator may indicate one of the prediction modes not included in the MPM list (or the one or more MPM lists). In other words, one or more prediction modes not included in the MPM list or the one or more MPM lists are configured in a predefined order in the form of a prediction mode list, and the prediction mode indicator may indicate one of the one or more prediction modes in the prediction mode list.
One or more prediction modes in the prediction mode list may be ordered in an ascending or descending order. Here, the ranking criterion may be the number of each prediction mode.
When there are a plurality of MPM lists, a separate MPM usage indicator may be used for each of the plurality of MPM lists. Alternatively, when there are a plurality of MPM lists, there may be an MPM usage indicator for some of the plurality of MPM lists.
For example, the n-th MPM use indicator for the n-th MPM list may indicate whether a prediction mode of the target block exists in the n-th MPM list.
First, the processing unit may determine whether a prediction mode of the target block exists in the first MPM list using the first MPM use indicator. If it is determined that the prediction mode of the target block exists in the first MPM list, the processing unit may derive the MPM candidate mode indicated by the index indicator in the first MPM list as the prediction mode of the target block. If it is determined that the prediction mode of the target block does not exist in the first MPM list, the processing unit may determine whether the prediction mode of the target block exists in the second MPM list using the second MPM use indicator.
The processing unit may determine whether the prediction mode of the target block exists in the n-th MPM list using the n-th MPM use indicator. If it is determined that the prediction mode of the target block exists in the n-th MPM list, the processing unit may determine an MPM candidate mode indicating the prediction mode of the target block in the n-th MPM list using the index indicator. If it is determined that the prediction mode of the target block does not exist in the n-th MPM list, the processing unit may determine whether the prediction mode of the target block exists in the (n+1)-th MPM list using the subsequent (n+1)-th MPM use indicator.
When one MPM usage indicator indicates that a prediction mode of a target block exists in a corresponding MPM list, the MPM usage indicator following the MPM usage indicator may not be signaled.
The MPM use indicator, the index indicator, and/or the prediction mode indicator may be signaled from the encoding apparatus 1200 to the decoding apparatus 1300 through a bitstream.
When the MPM use indicator, the index indicator, and/or the prediction mode indicator are used, the decoding apparatus 1300 may directly determine which MPM candidate mode or which prediction mode is to be used for predicting the target block among 1) MPM candidate modes included in one or more MPM lists and 2) one or more prediction modes not included in the one or more MPM lists based on the MPM use indicator, the index indicator, and/or the prediction mode indicator provided from the encoding apparatus 1200.
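Putting these indicators together, a decoder-side derivation of the prediction mode could look like the sketch below. The parsing order (per-list MPM use indicators, then an MPM index, then a prediction mode indicator over the remaining modes in ascending order) follows the description above; the function name, arguments, and the total number of modes are assumptions.

```python
def derive_mode_with_mpm(mpm_lists, mpm_use_flags, mpm_index,
                         prediction_mode_indicator, num_modes=67):
    """Derive the prediction mode of the target block from the MPM signaling.

    mpm_lists:      one or more MPM lists, each a list of candidate modes.
    mpm_use_flags:  one MPM use indicator per list; the first list whose
                    indicator is set contains the mode of the target block.
    mpm_index:      index indicator selecting a candidate in that list.
    prediction_mode_indicator: index into the modes not present in any list,
                    ordered ascending by mode number.
    """
    for mpm_list, used in zip(mpm_lists, mpm_use_flags):
        if used:
            return mpm_list[mpm_index]
    listed = {mode for mpm_list in mpm_lists for mode in mpm_list}
    remaining = [mode for mode in range(num_modes) if mode not in listed]
    return remaining[prediction_mode_indicator]
```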
Each MPM list may be configured for a particular unit.
In an embodiment, the particular unit may be a block or target block having a specified size.
When a specific unit is divided, the processing unit may use the configured MPM list for predicting a plurality of partitions generated by the division.
In an embodiment, the processing unit may configure the MPM list for the target block when the size of the target block is equal to or corresponds to the specified size. When the target block is divided into a plurality of partitions, the processing unit may derive a prediction mode of each of the plurality of partitions using an MPM list configured for the target block.
For example, when the size of the target block is 8×8 and the partition blocks are four 4×4 blocks, the MPM list may be configured for the 8×8 block, and the configured MPM list may be used for each of the four 4×4 partition blocks.
In an embodiment, when the MPM list is configured, the processing unit may configure the MPM list for each partition block included in the block having the specified size based on the block having the specified size. In other words, the MPM list generated for the block having the specified size may be commonly used for the partition blocks.
For example, when the size of the target block is a specified size, the MPM list for each partition in the target block may be configured using the prediction modes for one or more reference blocks of the target block (rather than the partition).
For example, when the size of the target block is 8×8 and the partition blocks are four 4×4 blocks, the processing unit may configure the MPM list for each of the four partition blocks based on the one or more reference blocks for the target block. In this case, since the prediction modes of the reference blocks for the target block have already been obtained, the processing unit may configure the MPM lists for the four partition blocks in parallel.
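Because the reference blocks of the target block are reconstructed before any of its partition blocks, the MPM list of every partition block can be configured independently, as the short sketch below illustrates. The builder function is passed in as a parameter (for example the assumed build_mpm_list() above), and the names are illustrative.

```python
def build_partition_mpm_lists(build_mpm_list, left_mode, above_mode, num_partitions):
    """Configure one MPM list per partition block using only the prediction
    modes of the target block's reference blocks, so the lists can be built
    in parallel (shown here simply as a loop)."""
    return [build_mpm_list(left_mode, above_mode) for _ in range(num_partitions)]
```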
Fig. 22 illustrates prediction for a target block divided into two partition blocks according to an example.
In fig. 22, the first block may be a specific block among the partitioned blocks. For example, the first block may be a block among partitioned blocks, on which prediction is first performed.
When predicting the first block, the processing unit may derive a prediction mode for the first block.
As shown in fig. 22, when predicting the first block, the processing unit may use reference samples adjacent to the first block. In other words, the reference samples may be pixels adjacent to the first block. The reference samples may be pixels in a reconstructed block adjacent to the first block.
Fig. 23 illustrates prediction of a partition block using the reconstructed block of another partition block according to an example.
In fig. 23, the second block may be a specific block among the partitioned blocks. For example, the second block may be 1) a block in which prediction is performed second, 2) a block in which prediction is performed last, 3) a block in which prediction is performed next after prediction of the first block, 4) a block in which prediction is performed after prediction of the first block, or 5) a block in which prediction is performed after prediction of at least one block.
As described above, when performing prediction on the second block, the processing unit may use a prediction mode derived for the first block.
As shown in fig. 23, when predicting the second block, the processing unit may use reference samples adjacent to the second block.
The reference samples may be pixels in a reconstructed block adjacent to the second block. The reference samples may comprise reconstructed pixels in a reconstructed block of the first block.
Alternatively, the reference samples may comprise reconstructed pixels present in the reconstructed block of an additional partition block predicted prior to the second block. In other words, when prediction is performed on the second block, an additional partition block, which was predicted before the second block, among the plurality of partition blocks may be used.
Fig. 24 illustrates prediction of a block using external reference pixels for the block, according to an example.
The processing unit may use pixels outside the plurality of partition blocks as reference samples in the prediction of the partition blocks. In other words, when performing prediction on the partition blocks, the processing unit may exclude pixels inside the plurality of partition blocks from the reference samples. A pixel excluded from the reference samples may be replaced with 1) the nearest pixel located in the same direction as the direction of the excluded pixel, or 2) a pixel adjacent to the target block and located in the same direction as the direction of the excluded pixel.
In an embodiment, in predicting the plurality of partition blocks, the reference samples used for prediction may be reconstructed pixels adjacent to the target block (rather than the respective partition blocks).
For example, as shown in fig. 24, in the prediction of the second block, the processing unit may exclude pixels in the reconstructed block of the first block from the reference samples, and may instead use reconstructed pixels adjacent to the target block as the reference samples.
For example, when prediction is performed on each of the plurality of partition blocks generated by dividing the target block, the processing unit may use reconstructed pixels adjacent to the target block as reference samples. By deciding on these reference samples, the values of all reference samples to be used for predicting the plurality of partition blocks can be set before the partition blocks are predicted. Thus, the processing unit may set the values of all reference samples to be used for predicting the plurality of partition blocks in advance, and may then perform prediction on the plurality of partition blocks in parallel.
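The sketch below gathers the reference samples once from the reconstructed pixels adjacent to the target block, after which each partition block can be predicted from the same set. The picture layout (a 2-D array indexed as picture[y][x]), the sample extent (one row above and one column to the left, extended to twice the block size), and the omission of boundary padding are assumptions for illustration.

```python
def collect_target_block_references(picture, x0, y0, width, height):
    """Reference samples adjacent to the target block (not to each partition):
    the row above the block and the column to its left.  Availability checks
    and padding of missing neighbors are omitted in this sketch."""
    top = [picture[y0 - 1][x] for x in range(x0 - 1, x0 + 2 * width)]
    left = [picture[y][x0 - 1] for y in range(y0, y0 + 2 * height)]
    return top, left
```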
Fig. 25 shows prediction of four partition blocks according to an example.
As shown in fig. 25, the first, second, third, and fourth blocks may be generated by dividing the target block.
As described above, the processing unit may derive a prediction mode for a particular partition among a plurality of partition blocks.
In fig. 25, as an example, a prediction mode may be derived for the fourth block, which is the lowest block. The derived prediction modes may be used for the remaining blocks, i.e., the first block, the second block, and the third block.
The processing unit may first perform prediction on a particular partition among the plurality of partition blocks from which a prediction mode is derived. Next, the processing unit may perform prediction on the remaining blocks except for the specific block among the plurality of blocks using the derived prediction mode.
In prediction of a particular partition from which a prediction mode is derived, the processing unit may use reconstructed pixels adjacent to the particular partition and/or the target block as reference samples.
According to fig. 25, reconstructed pixels adjacent to the top of the fourth block may not be present when predicting the fourth block. Thus, in predicting the fourth block, the processing unit may use reconstructed pixels adjacent to the top of the target block as reference pixels.
The processing unit may perform prediction on the plurality of partition blocks in a predefined order. The predefined order may be different from the order used for normal blocks that are not generated by division. For example, the predefined order may be 1) the order from the lowermost block to the uppermost block, or 2) the order from the rightmost block to the leftmost block. Alternatively, the predefined order may be 3) an order in which the lowermost block is selected first and the blocks ranging from the uppermost block to the second block from the bottom are thereafter selected sequentially, or 4) an order in which the rightmost block is selected first and the blocks ranging from the leftmost block to the second block from the right are thereafter selected sequentially.
Alternatively, the predefined order may be arbitrarily set by the encoding apparatus 1200 and/or the decoding apparatus 1300. When the predefined order is set by the encoding apparatus 1200, the set predefined order may be signaled from the encoding apparatus 1200 to the decoding apparatus 1300.
The prediction order indicator may indicate an order in which the plurality of partitions are predicted. The encoding apparatus 1200 may set the value of the prediction order indicator. The prediction order indicator may be signaled from the encoding device 1200 to the decoding device 1300 via a bitstream.
Alternatively, the predefined order may be derived by the encoding device 1200 and/or the decoding device 1300 separately based on the same predefined scheme. The processing unit may use coding parameters or the like associated with the target block to derive the predefined order.
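For the four-partition example that follows, one such predefined order (lowermost block first, then the remaining blocks from top to bottom) can be expressed as in the sketch below; representing the order as a list of partition indices is an assumption for illustration.

```python
def prediction_order(num_partitions, bottom_first=True):
    """Return partition-block indices in prediction order.

    Index 0 is the uppermost (or leftmost) partition block.  With
    bottom_first=True the lowermost block is predicted first and the
    remaining blocks follow from the top downward; otherwise the
    normal top-to-bottom order is used.
    """
    if not bottom_first:
        return list(range(num_partitions))
    return [num_partitions - 1] + list(range(num_partitions - 1))

# prediction_order(4) -> [3, 0, 1, 2]: fourth block, then first, second, third.
```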
Fig. 26 illustrates prediction of the first block after prediction is performed on the fourth block according to an example.
As described above, a predefined order may be used to perform prediction on multiple partition blocks.
The predefined order shown in fig. 26 may be specified such that prediction is first performed on the lowermost block among the plurality of partition blocks, and prediction is then performed on the blocks ranging from the uppermost block to the second block from the bottom, in order from top to bottom. In the predefined order according to the embodiment, the term "lowermost (bottom)" may be replaced with the term "rightmost" and the term "uppermost (top)" may be replaced with the term "leftmost".
Since prediction is performed on the plurality of partition blocks in a predefined order, as shown in fig. 26, additional reference samples may be used as compared to a configuration in which prediction is performed on the plurality of partition blocks in an existing order. Thus, intra prediction in additional directions using additional available reference samples may be used as compared to conventional intra prediction.
When prediction is performed on each of the plurality of partition blocks, the processing unit may use, as reference samples, pixels existing in the reconstructed block of a partition block that was predicted before the corresponding partition block.
For example, as shown in fig. 26, the processing unit may use pixels existing in the reconstructed block of the previously predicted fourth block as reference samples when performing prediction on the first block.
According to the use of such a predefined order and the use of pixels present in the reconstructed block of the previously predicted partition, reference samples may be provided in more directions than if the prediction is performed on the partition only in the normal order. For example, as shown in fig. 26, a reference sample located below the first block may be provided to the first block.
In an embodiment, in the prediction of the partition, the processing unit may perform intra prediction using a reference sample adjacent to the bottom of the partition and intra prediction using a reference sample adjacent to the right side of the partition.
For example, reference samples adjacent to the bottom of a partition block may be copied to prediction samples located at upper, upper-left, and/or upper-right positions of the partition block. Reference samples adjacent to the right side of a partition block may be copied to prediction samples located at left, upper-left, and/or lower-left positions of the partition block.
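As a minimal illustration of using the bottom reference samples, the sketch below fills a prediction block by copying the reference sample below each column upward through that column (a purely vertical, bottom-to-top prediction). It covers only one of the directions mentioned above, and the array conventions are assumed.

```python
def predict_from_bottom_references(bottom_ref, width, height):
    """Fill a height x width prediction block: each column is a copy of the
    reference sample adjacent to the bottom of that column."""
    return [[bottom_ref[x] for x in range(width)] for _ in range(height)]
```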
Fig. 27 illustrates prediction of a second block according to an example.
The prediction of the second block may be performed after the prediction of the fourth block and the prediction of the first block. Thus, as described above, the reference samples used for predicting the second block may include pixels in the reconstructed block of the fourth block and pixels in the reconstructed block of the first block.
In other words, when prediction is performed on a particular block of the plurality of blocks, the processing unit may use pixels in reconstructed blocks of other blocks as reference pixels. Here, the other partition may be a block in which prediction has been performed before prediction is performed on a specific partition among the plurality of partition blocks.
Alternatively, when prediction is performed on the third block earlier than the second block, the reference samples shown in fig. 27 may also be used to predict the third block.
Fig. 28 shows prediction of a third block according to an example.
The prediction of the third block may be performed last after the prediction of the fourth block, the prediction of the first block, and the prediction of the second block.
Fig. 28 shows available reference samples in the prediction of the third block.
In an embodiment, the type of reference sample to be used for predicting a particular partition block among the partition blocks may be selected from among a plurality of reference sample types.
For example, in the prediction of the third block, the processing unit may use one of the reference samples shown in fig. 25, the reference samples shown in fig. 26, the reference samples shown in fig. 27, and the reference samples shown in fig. 28.
When performing prediction on a particular partition, the processing unit may use a reference sample corresponding to one of a plurality of reference sample types.
The plurality of reference sample types may include a first reference sample type, a second reference sample type, and a third reference sample type.
The reference samples of the first reference sample type may be reconstructed pixels adjacent to the target block. In other words, the reference samples of the first reference sample type may be the reference samples shown in fig. 25.
The reference samples of the second reference sample type may be the reference samples of the first reference sample type and pixels in the reconstructed block of a partition block for which prediction has been previously performed. In other words, the reference samples of the second reference sample type may be the reference samples shown in fig. 26 or 27.
In an embodiment, pixels in the reconstructed block of a partition block, for which prediction has been previously performed, may be used only in directions that are not covered by reconstructed pixels adjacent to the target block. In other words, reconstructed pixels adjacent to the target block may be used in preference to pixels in the reconstructed block of the partition block (e.g., the reference samples shown in fig. 26).
Optionally, in an embodiment, pixels in the reconstructed block of a partition block for which prediction has been previously performed may replace at least some of the reconstructed pixels adjacent to the target block (e.g., the reference samples shown in fig. 27).
In other words, pixels in the reconstructed block of the partition block may be used in preference to reconstructed pixels adjacent to the target block.
That is, since pixels in the reconstructed block of a partition block for which prediction has been previously performed are closer to the specific partition block than reconstructed samples adjacent to the target block, those pixels (rather than the reconstructed pixels adjacent to the target block) may be used to predict the specific partition block, and the reconstructed pixels adjacent to the target block may be used only in directions that are not covered by pixels in the reconstructed block of the previously predicted partition block.
The reference samples of the third reference sample type may be reconstructed pixels adjacent to the particular partition block. In other words, the reference sample of the third reference sample type may be the reference sample shown in fig. 28.
In an embodiment, the processing unit may use information related to the target block or the partition block to decide the reference samples to be used for predicting the partition block.
In an embodiment, the processing unit may determine the reference samples to be used for predicting the partition block based on the reference sample indicator.
The reference sample indicator may be an indicator indicating a reference sample to be used for predicting the block. The reference sample indicator may indicate a reference sample type to be used for predicting the block among a plurality of reference sample types.
The processing unit may set the value of the reference sample indicator.
The reference sample indicator may be signaled from the encoding device 1200 to the decoding device 1300 via a bitstream. Alternatively, to set the reference sample indicator, at least in part, coding parameters associated with the target block or the partition block may be used.
When the reference sample indicator is used, the decoding apparatus 1300 may directly decide the reference samples to be used for prediction of the partition block using the reference sample indicator provided from the encoding apparatus 1200.
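A decoder-side selection among the three reference sample types could be sketched as follows. The indicator values 0, 1, and 2, the representation of sample sets as dictionaries keyed by pixel position, and the rule that previously reconstructed partition pixels take precedence for type 1 are illustrative assumptions.

```python
def select_reference_samples(ref_sample_indicator, target_block_refs,
                             prev_partition_pixels, partition_refs):
    """Choose the reference samples for predicting a partition block.

    0: reconstructed pixels adjacent to the target block.
    1: type-0 samples, with pixels of previously reconstructed partition
       blocks replacing them where such (closer) pixels are available.
    2: reconstructed pixels adjacent to the partition block itself.
    All arguments are dicts mapping (x, y) positions to sample values.
    """
    if ref_sample_indicator == 0:
        return dict(target_block_refs)
    if ref_sample_indicator == 1:
        merged = dict(target_block_refs)
        merged.update(prev_partition_pixels)
        return merged
    return dict(partition_refs)
```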
Filtering reference samples
The processing unit may determine whether to perform filtering on the reference samples, and may perform the filtering on the reference samples before prediction is performed.
In an embodiment, the processing unit may determine whether to perform filtering on the reference samples based on the size and/or shape of the target block.
In an embodiment, the processing unit may determine whether to perform filtering on the reference samples based on the size and/or shape of each partition block.
In an embodiment, the processing unit may determine whether to perform filtering on the reference samples according to whether a reconstructed block adjacent to the target block is used as a reference block for the partition blocks.
In an embodiment, the processing unit may determine whether to perform filtering on the reference samples based on whether prediction is performed on the partition blocks in parallel.
Alternatively, in an embodiment, the processing unit may determine whether to perform filtering on the reference samples according to whether the specific functions, operations, and processes described in the embodiment are performed.
Alternatively, in an embodiment, the processing unit may determine whether to perform filtering on the reference samples based on coding parameters associated with the target block or coding parameters associated with each partition block.
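The conditions listed above could be combined into a filtering decision such as the sketch below; the particular size threshold and the rule of skipping the filter when the partition blocks are predicted in parallel from the target block's reference samples are assumptions chosen only for illustration.

```python
def should_filter_reference_samples(block_width, block_height,
                                    uses_target_block_refs,
                                    parallel_prediction):
    """Decide whether to filter the reference samples before prediction."""
    if parallel_prediction and uses_target_block_refs:
        # assumed rule: keep the shared reference samples unfiltered
        return False
    # assumed rule: filter only for sufficiently large blocks
    return block_width * block_height >= 64
```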
Fig. 29 is a flowchart of a prediction method according to an embodiment.
In the embodiment described above with reference to fig. 14, the description was made under the assumption that a plurality of partition blocks are generated by dividing the target block in step 1410 and that prediction modes are derived for at least some of the plurality of partition blocks in step 1420.
In the present embodiment, a plurality of partitions may be generated by dividing a target block after a prediction mode has been derived.
In step 2910, the processing unit may derive a prediction mode.
For example, the derived prediction mode may be a prediction mode of the target block. The processing unit may derive the prediction mode based on the scheme described above for deriving the prediction mode of the target block.
For example, when a target block is divided, the derived prediction mode may be a prediction mode for predicting a plurality of partitions generated by dividing the target block. In other words, the derived prediction mode may be a prediction mode used when the target block is divided.
In an embodiment, the derived prediction modes may include a plurality of prediction modes.
For example, a plurality of derived prediction modes may be used to predict a plurality of partitions generated by dividing a target block.
The description related to the derivation of the prediction mode for the partition in the above-described embodiment may also be applied to the derivation of the prediction mode in the present embodiment. For example, MPM may be used for deriving the prediction mode. Repeated descriptions will be omitted here.
In step 2920, the processing unit may generate a plurality of partition blocks by dividing the target block.
The description relating to the partitioning of the target block described above with reference to step 1410 may also be applied to step 2920. Repeated descriptions will be omitted here.
In step 2930, the processing unit may perform prediction on at least some of the plurality of partition blocks using the derived prediction mode.
The description regarding prediction of at least some of the plurality of partition blocks described above with reference to step 1430 and the like may also be applied to step 2930. However, the description made with reference to steps 1420 and 1430 assumed that the prediction mode is derived for a specific partition block among the plurality of partition blocks, and that the prediction mode derived for the specific partition block, or a prediction mode decided based on the derived prediction mode, is used for the remaining blocks except for the specific partition block. For the present embodiment, that description is to be understood as follows: the prediction mode is derived in step 2910, and the prediction mode derived in step 2910, or a prediction mode determined based on the derived prediction mode, is used for the plurality of partition blocks. Repeated descriptions will be omitted here.
Fig. 30 illustrates the derivation of a prediction mode for a target block according to an example.
As described above with reference to the embodiment of fig. 29, at step 2910, a prediction mode of the target block may be derived. When the prediction mode of the target block is derived, prediction of the first block and the second block may be performed using the derived prediction mode in a similar manner as described above with reference to fig. 22, 23, and 24.
Optionally, at step 2910, a plurality of prediction modes may be derived for the target block. The derived plurality of prediction modes may be used for prediction of the respective partition.
The processing unit may determine, according to a scheme in which coding parameters related to the target block or the like are used, which of the derived plurality of prediction modes is to be used for predicting each partition block.
Fig. 31 is a flowchart illustrating a target block prediction method and a bit stream generation method according to an embodiment.
The target block prediction method and the bit stream generation method according to the present embodiment may be performed by the encoding apparatus 1200. This embodiment may be part of a target block encoding method or a video encoding method.
In step 3110, processing unit 1210 may divide the block and derive a prediction mode.
Step 3110 may correspond to steps 1410 and 1420 described above with reference to fig. 14. Step 3110 may correspond to steps 2910 and 2920 described above with reference to fig. 29.
In step 3120, the processing unit 1210 may perform prediction using the derived prediction mode.
Step 3120 may correspond to step 1430 described above with reference to fig. 14. Step 3120 may correspond to step 2930 described above with reference to fig. 29.
In step 3130, the processing unit 1210 may generate prediction information. The prediction information may be generated at least in part at step 3110 or 3120.
The prediction information may be information for the aforementioned block division and prediction mode derivation. For example, the prediction information may include the indicators described above.
In step 3140, the processing unit 1210 may generate a bitstream.
The bitstream may include information about the encoded target block. For example, the information about the encoded target block may include transform and quantized coefficients of the target block and/or partition block, and encoding parameters of the target block and/or partition block. The bitstream may include prediction information.
The processing unit 1210 may perform entropy encoding on the prediction information, and may generate a bitstream including the entropy encoded prediction information.
The processing unit 1210 may store the generated bit stream in the storage 1240. Alternatively, the communication unit 1220 may transmit the bitstream to the decoding apparatus 1300.
Fig. 32 is a flowchart illustrating a target block prediction method using a bitstream according to an embodiment.
The target block prediction method using a bitstream according to the present embodiment may be performed by the decoding apparatus 1300. This embodiment may be part of a target block decoding method or a video decoding method.
At step 3210, the communication unit 1320 may obtain a bitstream. The communication unit 1320 may receive a bitstream from the encoding apparatus 1200.
The bitstream may include information about the encoded target block. For example, the information about the encoded target block may include transform and quantized coefficients of the target block and/or partition block and encoding parameters of the target block and/or partition block. The bitstream may include prediction information.
The processing unit 1310 may store the acquired bitstream in the storage 1340.
At step 3220, the processing unit 1310 may obtain prediction information from the bitstream.
The processing unit 1310 may obtain prediction information by performing entropy decoding on entropy-encoded prediction information of the bitstream.
At step 3230, the processing unit 1310 may divide the block and derive a prediction mode using the prediction information.
Step 3230 may correspond to steps 1410 and 1420 described above with reference to fig. 14. Step 3230 may correspond to steps 2910 and 2920 described above with reference to fig. 29.
At step 3240, the processing unit 1310 may perform prediction using the derived prediction mode.
Step 3240 may correspond to step 1430 described above with reference to fig. 14. Step 3240 may correspond to step 2930 described above with reference to fig. 29.
Block partitioning of a block using a partition indicator
In the above-described embodiments, the target blocks are described as being partitioned based on the size and/or shape of the target blocks.
In block partitioning, a partition indicator may be used. The partition indicator of the block may indicate whether two or more sub-blocks are to be generated by dividing the block, and whether each of the generated sub-blocks is to be used as a unit of encoding and decoding when the block is encoded and decoded.
The description related to block division and block prediction in the foregoing embodiments may also be applied to the following embodiments.
In an embodiment, the block partition indicator may be a binary tree partition indicator that indicates whether the block is to be partitioned in a binary tree form. For example, the name of the binary tree partition indicator may be "binarytree_flag" or "bt_split_flag".
Alternatively, the block partition indicator may be a quadtree indicator indicating whether the block is to be partitioned in the form of a quadtree.
In an embodiment, among the values of the partition indicator, a first predefined value may indicate that the block is not to be partitioned, and a second predefined value may indicate that the block is to be partitioned.
When the block division indicator has a first predefined value, the processing unit may not divide the block.
In an embodiment, when the block division indicator exists and the division indicator has a first predefined value, the block may not be divided even though the block has a shape and form to which the division is to be applied.
In an embodiment, when the block division indicator has a second predefined value, the processing unit may generate a partition block by dividing the block, and may perform encoding and/or decoding of the partition block. Further, when the block division indicator has a second predefined value, the processing unit may generate a partition block by dividing the block, and may re-divide a specific partition block according to a form and/or shape into which the partition block is divided.
In an embodiment, the partition indicator of the target block may indicate whether the target block is to be partitioned with respect to the target block. Further, the partition indicator of the upper layer block of the target block may indicate whether the upper layer block is to be partitioned. When the partition indicator of the upper layer block indicates that the upper layer block is to be divided into a plurality of blocks, the processing unit may divide the upper layer block into a plurality of blocks including the target block. That is, the target block described above in the embodiment may also be regarded as a block generated by division via a division indicator or the like.
Fig. 33 illustrates partitioning of upper layer blocks according to an example.
For example, when the division indicator of the upper layer block has the second predefined value and the division indicator of the target block has the first predefined value, the upper layer block may be divided into a plurality of blocks including the target block, each of which may be a target or unit of a designated process in encoding and/or decoding.
Fig. 34 illustrates partitioning of target blocks according to an example.
For example, when the division indicator of the upper layer block has a second predefined value and the target block generated by dividing the upper layer block has a size and/or shape to which division is to be applied, the target block may be re-divided into a plurality of sub-blocks. Each of the plurality of tiles may be a target or unit of a specified process in encoding and/or decoding.
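The interaction between the partition indicator of the upper-layer block, the indicator of the target block, and the shape condition can be summarized by the recursive sketch below. The tuple representation of a block, the dictionary of signaled indicators, the non-square shape condition, the binary splitting, and the power-of-two dimensions are assumptions for illustration.

```python
def split_into_units(block, split_flags):
    """Return the leaf blocks that become units of encoding/decoding.

    block: (x, y, width, height); dimensions are assumed powers of two.
    split_flags: maps a block to its partition indicator (True = split).
    A block without a signaled indicator is split only if it is non-square
    (the assumed shape to which division is applied); a signaled False
    keeps the block undivided even when that shape condition holds.
    """
    x, y, w, h = block
    flag = split_flags.get(block)
    split = flag if flag is not None else (w != h)
    if not split or (w == 1 and h == 1):
        return [block]
    if w >= h:
        halves = [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    else:
        halves = [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    return [leaf for half in halves for leaf in split_into_units(half, split_flags)]
```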
Block partitioning for block transforms and the like
In the foregoing embodiments, description has been made under the assumption that each of the partitions is a predicted unit. The partition in an embodiment may be a unit of additional processing in encoding and/or decoding other than prediction.
In the following embodiments, an embodiment will be described in which a target block or partition block is used as a unit of prediction, transformation, quantization, inverse transformation, and processing of inverse quantization (dequantization) in encoding and/or decoding of the block.
Fig. 35 is a signal flowchart illustrating an image encoding and decoding method according to an embodiment.
In step 3510, the processing unit 1210 of the encoding apparatus 1200 may perform prediction related to the target block.
In step 3520, the processing unit 1210 of the encoding apparatus 1200 may perform a transform related to the target block based on the prediction.
In step 3530, the processing unit 1210 of the encoding apparatus 1200 may generate a bitstream including a result of the transformation.
In step 3540, the communication unit 1220 of the encoding apparatus 1200 may transmit the bitstream to the communication unit 1320 of the decoding apparatus 1300.
In step 3550, the processing unit 1310 of the decoding apparatus 1300 may extract the result of the transformation.
In step 3560, the processing unit 1310 of the decoding apparatus 1300 may perform inverse transform related to the target block.
In step 3570, the processing unit 1310 of the decoding apparatus 1300 may perform prediction related to the target block.
The detailed functions of steps 3510, 3520, 3530, 3540, 3550, 3560, and 3570 and the prediction, transformation, and inverse transformation of steps 3510, 3520, 3560, and 3570 will be described in detail below.
In the case where each block is a unit of transform and inverse transform
In step 3510, the processing unit 1210 may generate a residual block of the target block by performing prediction on the target block.
At step 3520, processing unit 1210 may perform a transform based on the partition blocks. In other words, the processing unit 1210 may generate residual partition blocks by dividing the residual block, and may generate coefficients of each residual partition block by performing a transform on the respective residual partition blocks.
In the following, a transform may be understood to include quantization in addition to the transform itself. Likewise, the inverse transform may be understood to include dequantization, or to be performed after dequantization is performed. Furthermore, coefficients may be understood as transformed and quantized coefficients.
The result of the transformation at step 3530 may include the coefficients of the plurality of partition blocks.
At step 3560, the processing unit 1310 may perform inverse transform on a block-by-block basis using the coefficients of each block. In other words, the processing unit 1310 may generate a reconstructed residual partition block by performing inverse transform on coefficients of the partition block.
The reconstructed residual partition blocks of the plurality of partition blocks may constitute the reconstructed residual block of the target block. Alternatively, the reconstructed residual block of the target block may comprise the plurality of reconstructed residual partition blocks.
In step 3570, the processing unit 1310 may generate a prediction block of the target block by performing prediction on the target block, and may generate a reconstructed block by adding the prediction block to the reconstructed residual block. The reconstructed block may comprise reconstructed samples.
In an embodiment, the TU partition indicator may indicate whether a transform and an inverse transform are to be performed based on the partition block. For example, the name of the TU partition indicator may be "tu_split_flag".
The TU partition indicator may be signaled through a bitstream.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value, the target block may be divided into a plurality of partition blocks in the transforming and inverse transforming step, and whether transformation and inverse transformation are to be performed on each partition block may be signaled through the TU partition indicator.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value and the partition indicator of the target block has the first predefined value, the target block may be divided into a plurality of partition blocks in the transforming and inverse transforming step, and whether transformation and inverse transformation are to be performed on each partition block may be signaled through the TU partition indicator.
In an embodiment, when the shape and/or form of the target block is the shape and/or form to which division is to be applied, the target block may be divided into a plurality of partitions in the transforming and inverse transforming steps, and whether transformation and inverse transformation are to be performed on each partition block may be signaled through the TU division indicator.
Here, the shape and/or form to be applied with the division may be a shape and/or form in which the division of the target block is described as being performed in other foregoing embodiments, for example, a non-square shape. Optionally, the shape and/or form of the partition to be applied may represent a state and/or condition that includes the partition of the target block described as being performed in other previous embodiments. Hereinafter, this is the same as described above.
In an embodiment, when the division indicator of the upper layer block has the second predefined value, the target block may be divided into a plurality of square sub-blocks, and the transformation and inverse transformation may be performed on each of the sub-blocks at the transformation and inverse transformation steps.
In an embodiment, when the division indicator of the upper layer block has the second predefined value and the division indicator of the target block has the first predefined value, the target block may be divided into a plurality of square sub-blocks, and the transformation and inverse transformation may be performed on each of the sub-blocks at the transforming and inverse transforming steps.
In an embodiment, when the shape and/or form of the target block is the shape and/or form to be applied with the division, the target block may be divided into a plurality of sub-blocks, and the transformation and inverse transformation may be performed on each of the sub-blocks at the transforming and inverse transforming steps.
In the case where each block is a unit of transform, inverse transform and prediction
At step 3510, processing unit 1210 may perform prediction based on the partition blocks. In other words, the processing unit 1210 may generate the residual partition block of each partition block by performing prediction on the partition block.
In an embodiment, the prediction may be intra prediction.
At step 3520, processing unit 1210 may perform a transform based on the partition blocks. In other words, the processing unit 1210 may generate coefficients of each residual partition block by performing a transform on the residual partition blocks.
The result of the transformation at step 3530 may include the coefficients of the plurality of partition blocks.
At step 3560, processing unit 1310 may perform an inverse transform based on the partition blocks. In other words, the processing unit 1310 may generate a reconstructed residual partition block by performing inverse transform on coefficients of the residual partition block.
At step 3570, the processing unit 1310 may perform prediction based on the partition blocks. In other words, the processing unit 1310 may generate a partition prediction block by performing prediction on a partition block, and may generate a reconstructed partition block by adding the partition prediction block to the reconstructed residual partition block.
The reconstructed partition blocks of the plurality of partitions may constitute a reconstructed block of the target block. Alternatively, the reconstructed block of the target block may comprise a reconstructed partition block. The reconstructed block may comprise reconstructed samples.
In an embodiment, the PU partition indicator may indicate whether prediction, transform, and inverse transform are to be performed based on the partition block. For example, the name of the PU partition indicator may be "intra_pu_split_flag".
The PU partition indicator may be signaled by a bitstream.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value, the target block may be divided into a plurality of partition blocks in the predicting step, and whether prediction, transformation, and inverse transformation are to be performed on each partition block may be signaled through the PU partition indicator.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value and the partition indicator of the target block has the first predefined value, the target block may be divided into a plurality of partition blocks in the predicting step, and whether prediction, transformation, and inverse transformation are to be performed on each partition block may be signaled through the PU partition indicator.
In an embodiment, when the shape and/or form of the target block is the shape and/or form of the partition to be applied, the target block may be divided into a plurality of partition blocks in the prediction step, and whether prediction, transformation, and inverse transformation are to be performed on each partition block may be signaled through the PU partition indicator.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value, the target block may be divided into a plurality of partition blocks at the predicting step, and the prediction, the transformation, and the inverse transformation may be performed on each partition block.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value and the partition indicator of the target block has the first predefined value, the target block may be divided into a plurality of square partition blocks in the predicting step, and the prediction, the transformation, and the inverse transformation may be performed on each partition block.
In an embodiment, when the shape and/or form of the target block is the shape and/or form to be applied with the division, the target block may be divided into a plurality of partition blocks in the predicting step, and the prediction, the transformation, and the inverse transformation may be performed on each partition block.
In case that each partition block is a unit of prediction and the target block is a unit of transform and inverse transform
At step 3510, processing unit 1210 may perform prediction based on the partition blocks. In other words, the processing unit 1210 may generate the residual partition block for each partition block by performing prediction on the partition block.
The residual partition blocks of the plurality of partition blocks may constitute the residual block of the target block.
In step 3520, the processing unit 1210 may perform a transform based on the target block. For example, the processing unit 1210 may configure the residual block of the target block using the residual partition blocks of the plurality of partition blocks. Alternatively, the residual block of the target block may include the plurality of residual partition blocks.
The processing unit 1210 may generate coefficients of the target block by performing a transform on a residual block of the target block.
The result of the transformation at step 3530 may include coefficients of the target block.
At step 3560, the processing unit 1310 may perform an inverse transform based on the target block using the coefficients of the target block. In other words, the processing unit 1310 may generate the reconstructed residual block by inversely transforming the coefficients of the target block.
The reconstructed residual block may be composed of a plurality of reconstructed residual partition blocks. Alternatively, the processing unit 1310 may generate a plurality of reconstructed residual partition blocks by dividing the reconstructed residual blocks. Alternatively, the reconstructed residual block may comprise a plurality of reconstructed residual partition blocks.
At step 3570, the processing unit 1310 may perform prediction based on the partition blocks.
In other words, the processing unit 1310 may generate a partition prediction block for each partition block by performing prediction on the partition block, and may generate a reconstructed partition block by adding the partition prediction block to the reconstructed residual partition block.
When prediction is performed on a partition, different prediction modes may be applied to multiple partitions.
A plurality of reconstructed partition blocks for a plurality of partition blocks may constitute a reconstructed block of the target block. Alternatively, the reconstructed block of the target block may comprise a plurality of reconstructed partitioned blocks.
In an embodiment, the TU merge PU partition indicator may indicate whether prediction is to be performed based on the partition block and whether the inverse transform is to be performed based on the target block. For example, the name of the TU merge PU partition indicator may be "tu_merge_pu_split_flag".
The TU merge PU partition indicator may be signaled by a bitstream.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value, it may be signaled through the TU merge PU partition indicator whether the transform and inverse transform are to be performed on the target block and whether prediction is to be performed on each partition block.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value and the partition indicator of the target block has the first predefined value, it may be signaled through the TU merge PU partition indicator whether the transform and inverse transform are to be performed on the target block and whether prediction is to be performed on each partition block.
In an embodiment, when the shape and/or form of the target block is the shape and/or form to which partitioning is to be applied, it may be signaled whether transformation and inverse transformation are to be performed on the target block and whether prediction is to be performed on each partition block through a TU merge PU partition indicator.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value, prediction may be performed on each partition block, and transformation may be performed on the target block after prediction is performed on the partition block. Further, when the partition indicator of the upper layer block has a second predefined value, an inverse transform may be performed on the target block, prediction may be performed on the respective partition blocks after the inverse transform is performed on the target block, and samples for reconstruction of the target block may be generated.
In an embodiment, when the partition indicator of the upper layer block has the second predefined value and the partition indicator of the target block has the first predefined value, prediction may be performed on the respective partition blocks, and transformation may be performed on the target block after prediction is performed on the partition blocks. Further, when the partition indicator of the upper layer block has the second predefined value and the partition indicator of the target block has the first predefined value, the inverse transform may be performed on the target block, prediction may be performed on the respective partition blocks after the inverse transform is performed on the target block, and samples for reconstruction of the target block may be generated.
In an embodiment, when the shape and/or form of the target block is the shape and/or form of the partition to be applied, prediction may be performed on each partition block, and transformation may be performed on the target block after prediction is performed on the partition block. Further, when the shape and/or form of the target block is the shape and/or form to which the division is to be applied, the inverse transform may be performed on the target block, the prediction may be performed on each partition block after the inverse transform is performed on the target block, and samples for reconstruction of the target block may be generated.
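The three configurations described in this section differ only in which blocks act as the units of prediction and of the inverse transform. The decoder-side sketch below makes this explicit for a target block divided horizontally into equal partition blocks. The configuration names, the callback signatures predict(y0, rows) and inverse_transform(y0, rows), and the use of NumPy arrays are assumptions for illustration, not the exact implementation of this embodiment.

```python
import numpy as np

def reconstruct_target_block(cfg, predict, inverse_transform,
                             num_partitions, height, width):
    """Reconstruct a height x width target block under one of three configurations.

    "tu_split":          prediction on the target block, inverse transform per partition block.
    "pu_split":          prediction and inverse transform per partition block.
    "tu_merge_pu_split": prediction per partition block, inverse transform on the target block.
    predict(y0, rows) / inverse_transform(y0, rows) are assumed callbacks that
    return a rows x width array of prediction samples / reconstructed residual
    samples for the rows [y0, y0 + rows) of the target block.
    """
    rows = height // num_partitions
    out = np.zeros((height, width))
    if cfg == "pu_split":
        for i in range(num_partitions):
            y0 = i * rows
            out[y0:y0 + rows] = predict(y0, rows) + inverse_transform(y0, rows)
    elif cfg == "tu_merge_pu_split":
        residual = inverse_transform(0, height)      # one inverse transform for the whole block
        for i in range(num_partitions):
            y0 = i * rows
            out[y0:y0 + rows] = predict(y0, rows) + residual[y0:y0 + rows]
    else:                                            # "tu_split"
        prediction = predict(0, height)              # one prediction for the whole block
        for i in range(num_partitions):
            y0 = i * rows
            out[y0:y0 + rows] = prediction[y0:y0 + rows] + inverse_transform(y0, rows)
    return out
```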
In the above-described embodiments, although the method has been described based on a flowchart as a series of steps or units, the present disclosure is not limited to the order of the steps, and some steps may be performed in a different order from the order of the steps that have been described or simultaneously with other steps. Furthermore, those skilled in the art will appreciate that: the steps shown in the flowcharts are not exclusive and may include other steps as well, or one or more steps in the flowcharts may be deleted without departing from the scope of the present disclosure.
The embodiments according to the present disclosure described above may be implemented as programs capable of being executed by various computer devices, and may be recorded on computer-readable storage media. The computer-readable storage media may include program instructions, data files, and data structures, alone or in combination. The program instructions recorded on the storage media may be specially designed or configured for the present disclosure, or may be known or available to those having ordinary skill in the computer software arts. Examples of computer storage media include hardware devices specially configured to store and execute program instructions, such as magnetic media (such as hard disks, floppy disks, and magnetic tape), optical media (such as compact disc (CD)-ROMs and digital versatile discs (DVDs)), magneto-optical media, ROM, RAM, and flash memory. Examples of program instructions include both machine code, such as that produced by a compiler, and high-level language code that can be executed by a computer using an interpreter. The hardware devices may be configured to operate as one or more software modules to perform the operations of the present disclosure, and vice versa.
Although the present disclosure has been described above with reference to specific details, such as detailed components and a limited number of embodiments and drawings, these details are provided merely to aid overall understanding of the present disclosure. The present disclosure is not limited to these embodiments, and those skilled in the art will be able to practice various changes and modifications in light of the above description.
Therefore, it should be understood that the spirit of the present disclosure is not limited to the above-described embodiments, and that the appended claims, together with their equivalents and modifications thereto, fall within the scope of the present disclosure.

Claims (19)

1. A video decoding method, comprising:
determining whether to perform partitioning on a target block;
generating partition blocks by partitioning the target block in a case where it is determined to perform the partitioning; and
performing decoding on the partition blocks,
wherein the partitioning is determined based on a block partition indicator in a case where the block partition indicator is signaled via a bitstream, the block partition indicator indicating a type of the partitioning for the target block among a plurality of types of the partitioning,
in a case where the block partition indicator is not signaled via the bitstream, the type of the partitioning is determined based on a condition using information related to the target block,
the type of the partitioning indicates one of a plurality of partitioning types, wherein the plurality of partitioning types includes a horizontal partitioning of the target block and a vertical partitioning of the target block,
a plurality of partition blocks are generated by the partitioning,
a list is configured for the target block using intra prediction modes of a plurality of neighboring blocks,
prediction information for the plurality of partition blocks is determined using a plurality of candidates in the list,
the prediction information indicates an intra prediction mode that is commonly used for prediction of the plurality of partition blocks, and
the plurality of neighboring blocks includes a block adjacent to a left side of the target block and a block adjacent to an upper side of the target block.
2. The video decoding method of claim 1, wherein:
whether to perform the partitioning is determined based on the shape of the target block.
3. The video decoding method of claim 1, wherein:
whether to perform the partitioning is determined based on the depth of the target block.
4. The video decoding method of claim 1, wherein:
the shape of the partition is determined based on the horizontal length of the target block and the vertical length of the target block.
5. The video decoding method of claim 1, wherein:
the partition is determined based on the block partition indicator.
6. The video decoding method of claim 5, wherein:
the block partition indicator indicates a type of the partition.
7. The video decoding method of claim 5, wherein:
in a case where the block partition indicator is not signaled via the bitstream, the partition is determined based on a condition using information related to the target block.
8. The video decoding method of claim 1, wherein:
prediction is performed on multiple regions of the target block using multiple candidates in a list configured for the target block.
9. The video decoding method of claim 1, wherein:
the target block is a coding tree unit.
10. The video decoding method of claim 1, wherein:
the target block is a coding unit.
11. The video decoding method of claim 1, wherein:
the target block is a transform unit.
12. The video decoding method of claim 1, wherein:
the decoding includes intra prediction for the partition blocks.
13. The video decoding method of claim 1, wherein:
the decoding includes inter prediction for the partition blocks.
14. A video encoding method, comprising:
determining whether to perform partitioning on a target block;
generating prediction information of a plurality of partition blocks by partitioning the target block in a case where it is determined to perform the partitioning; and
performing encoding on the plurality of partition blocks to generate encoding information of the target block,
wherein,
in a case where a block partition indicator is included in a bitstream, the block partition indicator indicates a type of the partitioning for the target block among a plurality of types of the partitioning,
for decoding the target block using the encoding information and the block partition indicator, a list is configured for the target block using intra prediction modes of a plurality of neighboring blocks,
for decoding the target block using the encoding information and the block partition indicator, prediction information for the plurality of partition blocks is determined using a plurality of candidates in the list,
the prediction information indicates an intra prediction mode that is commonly used for prediction of the plurality of partition blocks,
the plurality of neighboring blocks includes a block adjacent to a left side of the target block and a block adjacent to an upper side of the target block, and
the plurality of partition blocks correspond to a plurality of portions defined by partitioning the target block.
15. The video encoding method of claim 14, wherein:
the shape of the partition is determined based on the horizontal length of the target block and the vertical length of the target block,
each of the plurality of partition blocks has a predetermined size, and
prediction information of one of the plurality of partition blocks is determined based on a position of the partition block in the target block.
16. The video encoding method of claim 14, further comprising: generating the block partition indicator indicating the partitioning, wherein
the plurality of candidates in the list are derived from prediction information of the plurality of neighboring blocks neighboring the target block,
the prediction information of one of the plurality of partition blocks is determined based on the list, and
none of the plurality of neighboring blocks is adjacent to the partition block.
17. A non-transitory computer-readable medium storing a bitstream comprising computer-executable code that, when executed by a processor of a video decoding device, causes the processor to perform the steps of:
determining whether to perform partitioning on a target block based on a block partition indicator from the computer-executable code;
generating partition blocks by partitioning the target block in a case where it is determined to perform the partitioning; and
performing decoding on the partition blocks,
wherein the partitioning is determined based on the block partition indicator in a case where the block partition indicator is signaled via a bitstream, the block partition indicator indicating a type of the partitioning for the target block among a plurality of types of the partitioning,
in a case where the block partition indicator is not signaled via the bitstream, the type of the partitioning is determined based on a condition using information related to the target block,
the type of the partitioning indicates one of a plurality of partitioning types, wherein the plurality of partitioning types includes a horizontal partitioning of the target block and a vertical partitioning of the target block,
a plurality of partition blocks are generated by the partitioning,
a list is configured for the target block using intra prediction modes of a plurality of neighboring blocks,
prediction information for the plurality of partition blocks is determined using a plurality of candidates in the list,
the prediction information indicates an intra prediction mode that is commonly used for prediction of the plurality of partition blocks, and
the plurality of neighboring blocks includes a block adjacent to a left side of the target block and a block adjacent to an upper side of the target block.
18. The non-transitory computer-readable medium of claim 17, wherein:
the shape of the partition is determined based on the horizontal length of the target block and the vertical length of the target block.
19. The non-transitory computer-readable medium of claim 17, wherein:
the bitstream further includes the block partition indicator indicating the partitioning.
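For orientation only, the following Python sketch loosely mirrors the decoding flow recited in claims 1 and 17: the partitioning type is taken from the block partition indicator when it is signaled, and is otherwise derived from a condition on the target block; a candidate list is configured from the intra prediction modes of the left and above neighboring blocks; and a single intra prediction mode taken from that list is used in common by all partition blocks. The function names and the fallback condition (splitting across the longer side) are assumptions made for illustration and are not asserted to be the condition or the list-construction rule defined by the claims.

from enum import Enum

class PartitionType(Enum):
    HORIZONTAL = 0   # target block split into upper and lower partition blocks
    VERTICAL = 1     # target block split into left and right partition blocks

def derive_partition_type(indicator, width, height):
    # Use the signaled block partition indicator when present; otherwise fall back to a
    # condition on the target block (assumed here: split across the longer side).
    if indicator is not None:
        return PartitionType(indicator)
    return PartitionType.VERTICAL if width >= height else PartitionType.HORIZONTAL

def configure_candidate_list(left_mode, above_mode, default_mode=0):
    # Candidate list configured from the intra prediction modes of the block adjacent to the
    # left side and the block adjacent to the upper side of the target block.
    candidates = []
    for mode in (left_mode, above_mode, default_mode):
        if mode is not None and mode not in candidates:
            candidates.append(mode)
    return candidates

def decode_partitioned_block(indicator, width, height, left_mode, above_mode, candidate_index):
    partition_type = derive_partition_type(indicator, width, height)
    candidates = configure_candidate_list(left_mode, above_mode)
    shared_mode = candidates[candidate_index]   # one intra mode, used in common by all partition blocks
    if partition_type == PartitionType.HORIZONTAL:
        sizes = [(width, height // 2), (width, height // 2)]
    else:
        sizes = [(width // 2, height), (width // 2, height)]
    return [{"size": s, "intra_mode": shared_mode} for s in sizes]

# Example: no indicator signaled for a 16x8 target block, so vertical partitioning is assumed.
print(decode_partitioned_block(None, 16, 8, left_mode=26, above_mode=10, candidate_index=0))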
CN201880020454.1A 2017-03-22 2018-03-22 Prediction method and device based on block form Active CN110476425B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202311484265.9A CN117255198A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form
CN202311482423.7A CN117255197A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form
CN202311481738.XA CN117255196A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
KR20170036257 2017-03-22
KR10-2017-0036257 2017-03-22
KR10-2017-0155097 2017-11-20
KR20170155097 2017-11-20
PCT/KR2018/003392 WO2018174617A1 (en) 2017-03-22 2018-03-22 Block form-based prediction method and device

Related Child Applications (3)

Application Number Title Priority Date Filing Date
CN202311482423.7A Division CN117255197A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form
CN202311481738.XA Division CN117255196A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form
CN202311484265.9A Division CN117255198A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form

Publications (2)

Publication Number Publication Date
CN110476425A CN110476425A (en) 2019-11-19
CN110476425B true CN110476425B (en) 2023-11-28

Family

ID=63863951

Family Applications (4)

Application Number Title Priority Date Filing Date
CN202311482423.7A Pending CN117255197A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form
CN201880020454.1A Active CN110476425B (en) 2017-03-22 2018-03-22 Prediction method and device based on block form
CN202311484265.9A Pending CN117255198A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form
CN202311481738.XA Pending CN117255196A (en) 2017-03-22 2018-03-22 Prediction method and device based on block form

Country Status (2)

Country Link
KR (3) KR102310730B1 (en)
CN (4) CN117255197A (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11695926B2 (en) * 2018-09-21 2023-07-04 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image, and recording medium for storing bitstream
CN111083489A (en) 2018-10-22 2020-04-28 北京字节跳动网络技术有限公司 Multiple iteration motion vector refinement
KR102628361B1 (en) * 2018-11-12 2024-01-23 베이징 바이트댄스 네트워크 테크놀로지 컴퍼니, 리미티드 Bandwidth control method for inter-prediction
WO2020103852A1 (en) 2018-11-20 2020-05-28 Beijing Bytedance Network Technology Co., Ltd. Difference calculation based on patial position
CN111327899A (en) * 2018-12-16 2020-06-23 华为技术有限公司 Video decoder and corresponding method
US11533506B2 (en) * 2019-02-08 2022-12-20 Tencent America LLC Method and apparatus for video coding
WO2020177755A1 (en) 2019-03-06 2020-09-10 Beijing Bytedance Network Technology Co., Ltd. Usage of converted uni-prediction candidate
EP4024870A4 (en) * 2019-09-21 2022-11-02 LG Electronics Inc. Transform-based image coding method and device therefor
US20240007623A1 (en) * 2020-11-24 2024-01-04 Hyundai Motor Company Block splitting structure for efficient prediction and transform, and method and appartus for video encoding and decoding using the same
WO2023075120A1 (en) * 2021-10-25 2023-05-04 현대자동차주식회사 Video coding method and apparatus using various block partitioning structures
WO2023085600A1 (en) * 2021-11-10 2023-05-19 현대자동차주식회사 Method and device for video encoding using implicit arbitrary block division and predictions arising therefrom

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012096095A1 (en) * 2011-01-12 2012-07-19 株式会社エヌ・ティ・ティ・ドコモ Image predict coding method, image predict coding device, image predict coding program, image predict decoding method, image predict decoding device, and image predict decoding program
CN103250416A (en) * 2010-12-06 2013-08-14 Sk电信有限公司 Method and device for encoding/decoding image by inter prediction using random block
KR20140008503A (en) * 2012-07-10 2014-01-21 한국전자통신연구원 Method and apparatus for image encoding/decoding
CN103765892A (en) * 2011-06-28 2014-04-30 三星电子株式会社 Method and apparatus for coding video and method and apparatus for decoding video, accompanied with intra prediction
AU2015202844A1 (en) * 2011-01-12 2015-06-11 Ntt Docomo, Inc. Image predict coding method, image predict coding device, image predict coding program, image predict decoding method, image predict decoding device, and image predict decoding program
CN104918045A (en) * 2011-11-23 2015-09-16 数码士控股有限公司 Method and encoding/decoding of video using common merging candidate set of asymmetric partitions
KR20160143588A (en) * 2015-06-05 2016-12-14 인텔렉추얼디스커버리 주식회사 Method and apparartus for encoding/decoding for intra prediction mode

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102127401B1 (en) * 2010-01-12 2020-06-26 엘지전자 주식회사 Processing method and device for video signals
KR101503269B1 (en) * 2010-04-05 2015-03-17 삼성전자주식회사 Method and apparatus for determining intra prediction mode of image coding unit, and method and apparatus for determining intra predion mode of image decoding unit
PT3849194T (en) * 2010-09-27 2022-09-13 Lg Electronics Inc Method for partitioning block and decoding device
CN107197301B (en) * 2011-09-09 2020-03-03 Lg 电子株式会社 Inter-frame prediction method and device
US10171819B2 (en) * 2015-08-03 2019-01-01 Arris Enterprises Llc Intra prediction mode selection in video coding

Also Published As

Publication number Publication date
KR102310730B1 (en) 2021-10-12
CN117255197A (en) 2023-12-19
KR102504643B1 (en) 2023-03-02
KR20180107762A (en) 2018-10-02
CN117255196A (en) 2023-12-19
CN117255198A (en) 2023-12-19
KR20230035277A (en) 2023-03-13
KR20210122761A (en) 2021-10-12
CN110476425A (en) 2019-11-19

Similar Documents

Publication Publication Date Title
CN110463201B (en) Prediction method and apparatus using reference block
US11792424B2 (en) Method and device using inter prediction information
US11956426B2 (en) Method and apparatus for deriving intra prediction mode
US11917148B2 (en) Block form-based prediction method and device
CN110476425B (en) Prediction method and device based on block form
CN109314785B (en) Method and apparatus for deriving motion prediction information
CN116320498A (en) Method and apparatus for filtering
US11212553B2 (en) Bidirectional intra prediction method and apparatus
CN111699682A (en) Method and apparatus for encoding and decoding using selective information sharing between channels
US20230421804A1 (en) Method and device using inter prediction information
US20220321890A1 (en) Method, apparatus, and recording medium for encoding/decoding image by using geometric partitioning
KR20190107581A (en) Method and apparatus for derivation of intra prediction mode
CN111684801A (en) Bidirectional intra prediction method and apparatus
CN111919448A (en) Method and apparatus for image encoding and image decoding using temporal motion information
US20220295059A1 (en) Method, apparatus, and recording medium for encoding/decoding image by using partitioning
US20240137511A1 (en) Block form-based prediction method and device
KR102658040B1 (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
KR20240052924A (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
CN116546211A (en) Video encoding method, video encoding device, computer equipment and storage medium
CN115733977A9 (en) Method and apparatus for encoding and decoding video by using prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant