WO2020231220A1

WO2020231220A1 - Method and apparatus for parallel encoding and decoding of moving picture data

Info

Publication number: WO2020231220A1
Application number: PCT/KR2020/006424
Authority: WO
Inventors: 심동규; 박시내; 변주형; 박승욱; 임화평
Original assignee: 현대자동차주식회사; 기아자동차주식회사; 광운대학교 산학협력단
Priority date: 2019-05-15
Filing date: 2020-05-15
Publication date: 2020-11-19

Abstract

Disclosed is a method and an apparatus for parallel encoding and decoding of moving picture data. A method of decoding video data comprises the steps of: decoding, from a bitstream, a syntax element indicating that a picture can be decoded using wavefront parallel processing; and decoding encoded data of the picture. The step of decoding encoded data of the picture comprises the steps of: for a first coding block of a current CTU row encoded in a palette mode, predicting a palette table for the first coding block by using palette data from a first CTU of a previous CTU row; and decoding the first coding block in the palette mode by using the palette table predicted for the first coding block.

Description

Method and apparatus for parallel encoding and decoding of moving picture data

The present invention relates to encoding and decoding of moving picture data, and more particularly, to a method and apparatus for performing encoding or decoding of moving picture data in parallel.

Since moving picture data is much larger than audio data or still image data, storing or transmitting it without compression requires substantial hardware resources, including memory.

Therefore, moving picture data is generally compressed by an encoder before being stored or transmitted, and a decoder receives the compressed moving picture data, decompresses it, and plays it back. Such video compression technologies include H.264/AVC and High Efficiency Video Coding (HEVC), which improves coding efficiency by about 40% compared to H.264/AVC.

However, the size, resolution, and frame rate of pictures are gradually increasing, and the amount of data to be encoded is increasing accordingly. A new compression technique offering higher coding efficiency and better picture quality than existing compression techniques is therefore required.

The present disclosure proposes a method and apparatus for parallel processing of encoding or decoding of moving picture data. In particular, techniques for supporting improved wavefront parallel processing that minimizes a decrease in coding efficiency while having a low latency time are proposed.

According to an aspect of the present disclosure, a method of encoding video data includes encoding, in a bitstream, a syntax element indicating that a picture can be decoded using wavefront parallel processing, and encoding data of the picture so that it can be decoded using the wavefront parallel processing. The encoding of the data of the picture includes, for a first coding block of a current CTU row encoded in a palette mode, predicting a palette table for the first coding block using palette data from a first CTU of a previous CTU row, and encoding the first coding block in the palette mode using the palette table predicted for the first coding block.

According to another aspect of the present disclosure, a method of decoding video data includes decoding, from a bitstream, a syntax element indicating that a picture can be decoded using wavefront parallel processing, and decoding coded data of the picture. The decoding of the coded data of the picture includes, for a first coding block of a current CTU row encoded in a palette mode, predicting a palette table for the first coding block using palette data from a first CTU of a previous CTU row, and decoding the first coding block in the palette mode using the palette table predicted for the first coding block.

According to another aspect of the present disclosure, an apparatus for encoding video data includes a memory and at least one processor. The at least one processor is configured to encode a syntax element indicating that a picture can be decoded using wavefront parallel processing, and to encode data of the picture so that it can be decoded using the wavefront parallel processing. As part of encoding the data of the picture, the at least one processor predicts, for a first coding block of a current CTU row to be encoded in a palette mode, a palette table for the first coding block using palette data from a first CTU of a previous CTU row, and encodes the first coding block in the palette mode using the palette table predicted for the first coding block.

According to another aspect of the present disclosure, an apparatus for decoding video data includes a memory and at least one processor. The at least one processor is configured to decode, from a bitstream, a syntax element indicating that a picture can be decoded using wavefront parallel processing, and to decode coded data of the picture. As part of decoding the coded data of the picture, the at least one processor predicts, for a first coding block of a current CTU row coded in a palette mode, a palette table for the first coding block using palette data from a first CTU of a previous CTU row, and decodes the first coding block in the palette mode using the palette table predicted for the first coding block.
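As a rough illustration of the palette prediction recited in the above aspects, the following Python sketch models only how palette data could be carried from one CTU row to the next. The function name `row_start_predictors`, the list-based palette entries, and the rule for merging new entries into the predictor are assumptions made for illustration and are not taken from the present disclosure.

```python
def row_start_predictors(ctu_palettes):
    """Return the palette predictor available at the start of each CTU row.

    ctu_palettes[r][c] is a list of palette entries newly used by CTU (r, c);
    this is a hypothetical stand-in for decoded palette data.
    """
    saved = []              # snapshot taken after the first CTU of the previous row
    start_predictors = []
    for row in ctu_palettes:
        predictor = list(saved)          # inherit from the first CTU of the row above
        start_predictors.append(list(predictor))
        for col, new_entries in enumerate(row):
            # Assumed update rule: new entries first, then surviving old entries.
            predictor = new_entries + [e for e in predictor if e not in new_entries]
            if col == 0:
                saved = list(predictor)  # state after the first CTU of this row
    return start_predictors
```

In this sketch, a row never depends on palette data produced beyond the first CTU of the row above, so it could begin as soon as that CTU is finished.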

FIG. 1 is an exemplary block diagram of an image encoding apparatus capable of implementing the techniques of the present disclosure.

FIG. 2 is a diagram for explaining a method of dividing a block using a QTBTTT structure.

FIG. 3A is a diagram illustrating a plurality of intra prediction modes.

FIG. 3B is a diagram illustrating a plurality of intra prediction modes including wide-angle intra prediction modes.

FIG. 4 is an exemplary block diagram of an image decoding apparatus capable of implementing the techniques of the present disclosure.

FIG. 5 is a conceptual diagram illustrating a wavefront parallel encoding and decoding scheme of a 1-CTU (4-VPDU) delay structure according to an aspect of the present disclosure.

FIG. 6 is a conceptual diagram illustrating a wavefront parallel encoding and decoding scheme of a 1.5-CTU (6-VPDU) delay structure according to an aspect of the present disclosure.

FIG. 7 is a diagram for explaining a constraint added to an intra prediction mode or an intra block copy mode when a current block is larger than the size of a VPDU in a 1.5-CTU (6-VPDU) delay structure.

FIG. 8 illustrates a picture divided into a plurality of subgroups.

FIG. 9 is a flowchart illustrating a method of initializing CABAC context information of a first CTU of a subgroup in a picture by a video decoder according to an aspect of the present disclosure.

FIG. 10 is a conceptual diagram illustrating an example of initializing a palette for coding video data according to an aspect of the present disclosure.

FIG. 11 is a diagram for explaining initialization of a palette table when 2-CTU delay WPP is activated according to an aspect of the present disclosure.

FIG. 12 is a diagram for explaining initialization of a palette table when 1-CTU delay WPP is activated according to an aspect of the present disclosure.

FIG. 13 is a diagram illustrating scanning sequences for coding a palette index map according to an aspect of the present disclosure.

FIG. 14 is a flowchart illustrating a method by which a decoder determines a palette index for a current pixel, according to an aspect of the present disclosure.

FIG. 15 is a conceptual diagram illustrating a method of coding a palette index map according to an aspect of the present disclosure.

FIG. 16 is a flowchart illustrating a method of decoding video data according to an aspect of the present disclosure.

Hereinafter, some embodiments of the present invention will be described in detail with reference to exemplary drawings. In assigning reference numerals to the elements of each drawing, it should be noted that the same elements are given the same numerals wherever possible, even when they appear in different drawings. In addition, in describing the present invention, a detailed description of a related known configuration or function will be omitted if it is determined that it may obscure the subject matter of the present invention.

FIG. 1 is an exemplary block diagram of an image encoding apparatus capable of implementing the techniques of the present disclosure. Hereinafter, an image encoding apparatus and sub-elements of the apparatus will be described with reference to FIG. 1.

The image encoding apparatus may include a picture dividing unit 110, a prediction unit 120, a subtractor 130, a transform unit 140, a quantization unit 145, a rearrangement unit 150, an entropy encoding unit 155, an inverse quantization unit 160, an inverse transform unit 165, an adder 170, a filter unit 180, and a memory 190.

Each component of the image encoding apparatus may be implemented by hardware or software, or by a combination of hardware and software. In addition, functions of each component may be implemented as software, and a microprocessor may be implemented to execute a function of software corresponding to each component.

One image (video) is composed of a plurality of pictures. Each picture is divided into a plurality of regions, and encoding is performed for each region. For example, one picture is divided into one or more tiles and/or slices. Here, one or more tiles may be defined as a tile group. Each tile and/or slice is divided into one or more Coding Tree Units (CTUs), and each CTU is divided into one or more Coding Units (CUs) by a tree structure. Information applied to each CU is encoded as syntax of the CU, and information commonly applied to the CUs included in one CTU is encoded as syntax of the CTU. In addition, information commonly applied to all blocks in one slice is encoded as syntax of the slice header, and information applied to all blocks constituting one picture is encoded in a picture parameter set (PPS) or a picture header. Further, information commonly referred to by a plurality of pictures is encoded in a sequence parameter set (SPS), and information commonly referred to by one or more SPSs is encoded in a video parameter set (VPS). Also, information commonly applied to one tile or tile group may be encoded as syntax of a tile or tile group header.

The picture dividing unit 110 determines the size of a coding tree unit (CTU). Information on the size of the CTU (CTU size) is encoded as the syntax of the SPS or PPS and transmitted to the video decoding apparatus.

After dividing each picture constituting an image into a plurality of Coding Tree Units (CTUs) having a predetermined size, the picture dividing unit 110 recursively splits each CTU using a tree structure. A leaf node in the tree structure becomes a coding unit (CU), which is the basic unit of coding.

The tree structure may be a quadtree (QT) in which an upper node (or parent node) is divided into four lower nodes (or child nodes) of the same size, a binary tree (BT) in which an upper node is divided into two lower nodes, a ternary tree (TT) in which an upper node is divided into three lower nodes at a 1:2:1 ratio, or a structure in which two or more of the QT, BT, and TT structures are mixed. For example, a QTBT (QuadTree plus BinaryTree) structure or a QTBTTT (QuadTree plus BinaryTree TernaryTree) structure may be used. Here, BT and TT may be collectively referred to as a Multiple-Type Tree (MTT).
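The geometry of the three split types can be sketched as follows. This is a minimal Python illustration; the function name `split_block` and the convention that a "vertical" split divides the width are assumptions made here, not definitions from the present disclosure.

```python
def split_block(width, height, split_type, direction=None):
    """Return the (width, height) of each child produced by one split."""
    if split_type == "QT":
        # Four children of the same size.
        return [(width // 2, height // 2)] * 4
    if split_type == "BT":
        # Two children of the same size.
        if direction == "vertical":
            return [(width // 2, height)] * 2
        return [(width, height // 2)] * 2
    if split_type == "TT":
        # Three children at a 1:2:1 ratio.
        if direction == "vertical":
            quarter = width // 4
            return [(quarter, height), (2 * quarter, height), (quarter, height)]
        quarter = height // 4
        return [(width, quarter), (width, 2 * quarter), (width, quarter)]
    raise ValueError("unknown split type: " + str(split_type))
```

For example, a vertical TT split of a 32x16 block yields children of 8x16, 16x16, and 8x16.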

FIG. 2 shows a QTBTTT split tree structure. As shown in FIG. 2, the CTU may first be divided in a QT structure. The quadtree division may be repeated until the size of a splitting block reaches the minimum block size (MinQTSize) of a leaf node allowed in QT. A first flag (QT_split_flag) indicating whether each node of the QT structure is divided into four nodes of a lower layer is encoded by the entropy encoder 155 and signaled to the image decoding apparatus. If a leaf node of the QT is not larger than the maximum block size (MaxBTSize) of a root node allowed in BT, it may be further divided in one or more of a BT structure or a TT structure. In the BT structure and/or the TT structure, a plurality of division directions may exist. For example, there may be two directions: one in which the block of a node is divided horizontally and one in which it is divided vertically. As shown in FIG. 2, when MTT splitting starts, a second flag (mtt_split_flag) indicating whether a node is split and, if it is split, a flag indicating the split direction (vertical or horizontal) and/or a flag indicating the split type (binary or ternary) are encoded by the entropy encoder 155 and signaled to the image decoding apparatus. Alternatively, before the first flag (QT_split_flag) indicating whether each node is divided into four nodes of a lower layer is encoded, a CU split flag (split_cu_flag) indicating whether the node is split may be encoded. When the value of the CU split flag (split_cu_flag) indicates that the node is not split, the block of the node becomes a leaf node in the split tree structure and serves as a coding unit (CU), which is the basic unit of encoding. When the value of the CU split flag (split_cu_flag) indicates that the node is split, the image encoding apparatus starts encoding from the first flag in the manner described above.

When QTBT is used as another example of the tree structure, there may be two split types: one that splits the block of a node horizontally into two blocks of the same size (i.e., symmetric horizontal splitting) and one that splits it vertically (i.e., symmetric vertical splitting). A split flag indicating whether each node of the BT structure is divided into blocks of a lower layer and split type information indicating the type of division are encoded by the entropy encoder 155 and transmitted to the image decoding apparatus. Meanwhile, there may additionally be a type that divides the block of a node into two asymmetric blocks. The asymmetric form may include dividing the block of a node into two rectangular blocks with a size ratio of 1:3, or dividing the block of a node in a diagonal direction.

A CU can have various sizes depending on the QTBT or QTBTTT split from the CTU. Hereinafter, a block corresponding to a CU to be encoded or decoded (i.e., a leaf node of the QTBTTT) is referred to as a 'current block'. With the adoption of QTBTTT splitting, the shape of the current block may be rectangular as well as square.
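The size limits on QT and MTT splitting described above can be illustrated with a small Python sketch. The function name `allowed_splits` and its parameters are hypothetical, and any minimum size for BT or TT children is deliberately left out because it is not stated in this passage.

```python
def allowed_splits(width, height, in_mtt, min_qt_size, max_bt_size):
    """List the split kinds a node may use under the stated size limits."""
    allowed = []
    # QT splitting applies to square QT nodes and stops once MinQTSize is reached.
    if not in_mtt and width == height and width > min_qt_size:
        allowed.append("QT")
    # BT/TT splitting is available only for nodes no larger than MaxBTSize.
    if max(width, height) <= max_bt_size:
        allowed += ["BT", "TT"]
    return allowed
```

With MinQTSize 16 and MaxBTSize 64, a 128x128 CTU may only be QT split, while a 64x64 QT node may be split by QT, BT, or TT.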

The prediction unit 120 predicts the current block and generates a prediction block. The prediction unit 120 includes an intra prediction unit 122 and an inter prediction unit 124.

In general, each of the current blocks in a picture can be predictively coded. Prediction of the current block can generally be performed using an intra prediction technique (using data from the picture containing the current block) or an inter prediction technique (using data from a picture coded before the picture containing the current block). Inter prediction includes both unidirectional prediction and bidirectional prediction.

The intra prediction unit 122 predicts pixels in the current block by using pixels (reference pixels) located around the current block in the current picture including the current block. There are a plurality of intra prediction modes according to the prediction direction. For example, as shown in FIG. 3A, the plurality of intra prediction modes may include two non-directional modes including a planar mode and a DC mode, and 65 directional modes. Depending on each prediction mode, the surrounding pixels to be used and the equation are defined differently.

For efficient directional prediction of a rectangular current block, the directional modes shown by dotted arrows in FIG. 3B (intra prediction modes 67 to 80 and -1 to -14) may additionally be used. These may be referred to as "wide-angle intra prediction modes". The arrows in FIG. 3B indicate the reference samples used for prediction, not the prediction direction; the prediction direction is opposite to the direction indicated by each arrow. In the wide-angle intra prediction modes, when the current block is rectangular, a specific directional mode is predicted in the opposite direction without additional bit transmission. In this case, the wide-angle intra prediction modes available for the current block may be determined based on the ratio of the width to the height of the rectangular current block. For example, wide-angle intra prediction modes with an angle less than 45 degrees (intra prediction modes 67 to 80) are available when the current block is a rectangle whose height is less than its width, and wide-angle intra prediction modes with an angle greater than -135 degrees (intra prediction modes -1 to -14) are available when the current block is a rectangle whose height is greater than its width.

The intra prediction unit 122 may determine an intra prediction mode to be used to encode the current block. In some examples, the intra prediction unit 122 may encode the current block using several intra prediction modes and select an appropriate intra prediction mode from the tested modes. For example, the intra prediction unit 122 may calculate rate-distortion values using rate-distortion analysis for the several tested intra prediction modes, and may select the intra prediction mode having the best rate-distortion characteristics among the tested modes.
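The shape-dependent availability of the wide-angle modes can be sketched in Python as below. The function name `wide_angle_candidates` is hypothetical. The sketch returns the whole candidate range for each shape, whereas the text states that the subset actually available depends on the width-to-height ratio; that mapping is not given here and is therefore not modeled.

```python
def wide_angle_candidates(width, height):
    """Return the wide-angle intra mode numbers that are candidates for a block."""
    if width > height:
        # Height less than width: modes 67 to 80.
        return list(range(67, 81))
    if height > width:
        # Height greater than width: modes -1 to -14.
        return list(range(-1, -15, -1))
    # Square blocks use no wide-angle modes.
    return []
```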
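A rate-distortion based selection of this kind is commonly expressed with a Lagrangian cost J = D + lambda * R. The Python sketch below assumes that formulation; the cost function, the function name `select_mode`, and the sample numbers are illustrative assumptions, since the text does not specify how the rate-distortion values are combined.

```python
def select_mode(candidates, lam):
    """Pick the mode with the lowest cost D + lam * R.

    candidates maps a mode name to (distortion, rate_in_bits).
    """
    def cost(mode):
        distortion, rate = candidates[mode]
        return distortion + lam * rate
    return min(candidates, key=cost)


tested = {"planar": (100, 10), "dc": (90, 30), "angular_18": (80, 60)}
best_at_high_lambda = select_mode(tested, 1.0)   # rate weighs heavily
best_at_low_lambda = select_mode(tested, 0.1)    # distortion dominates
```

With these made-up numbers, a large multiplier favors the cheap "planar" mode and a small multiplier favors the more accurate "angular_18" mode.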

The intra prediction unit 122 selects one intra prediction mode from among a plurality of intra prediction modes, and predicts the current block using a neighboring pixel (reference pixel) determined according to the selected intra prediction mode and an equation. Information on the selected intra prediction mode is encoded by the entropy encoder 155 and transmitted to the image decoding apparatus.

The inter prediction unit 124 generates a prediction block for the current block through a motion compensation process. The inter prediction unit 124 searches a reference picture that was encoded and decoded before the current picture for the block most similar to the current block, and generates a prediction block for the current block using the found block. Then, a motion vector corresponding to the displacement between the current block in the current picture and the prediction block in the reference picture is generated. In general, motion estimation is performed on the luma component, and a motion vector calculated based on the luma component is used for both the luma component and the chroma component. Motion information, including information on the reference picture used to predict the current block and information on the motion vector, is encoded by the entropy encoder 155 and transmitted to the image decoding apparatus.

The prediction unit 120 may further use an intra block copy (IBC) mode. In the IBC mode, the prediction unit 120 searches for a prediction block in the same frame or picture as the current block, as in the intra prediction mode, but it can usually search a wider area that is not limited to the neighboring rows and columns of pixels. In the IBC mode, the prediction unit 120 may determine a block vector (also referred to as a motion vector) to identify a prediction block in the same frame or picture as the current block. The block vector includes an x-component and a y-component, where the x-component identifies the horizontal displacement between the current block being predicted and the prediction block, and the y-component identifies the vertical displacement between the current block being predicted and the prediction block. The determined block vector is signaled in the bitstream so that the image decoding apparatus can identify the same prediction block selected by the image encoding apparatus.
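The search for the most similar block can be illustrated with a full search that minimizes the sum of absolute differences (SAD). This Python sketch is only one possible approach; the SAD criterion, the exhaustive search, and the function names are assumptions, as the text does not fix the similarity measure or search strategy.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between a current block and a reference block."""
    return sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def motion_search(cur, ref, bx, by, size, search_range):
    """Return ((dx, dy), cost) of the best match within +/- search_range samples."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if not (0 <= rx <= width - size and 0 <= ry <= height - size):
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    cost, dx, dy = best
    return (dx, dy), cost
```

The returned pair (dx, dy) plays the role of the motion vector, that is, the displacement from the current block position to the matching block in the reference picture.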
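How a block vector identifies the prediction block within the same picture can be sketched as follows. The function name `ibc_predict` is hypothetical, and the sketch assumes that the referenced area has already been reconstructed, a condition that a real codec would have to check.

```python
def ibc_predict(picture, x, y, width, height, block_vector):
    """Copy the prediction block that the block vector points to in the same picture."""
    bvx, bvy = block_vector          # horizontal and vertical displacement
    rx, ry = x + bvx, y + bvy        # top-left corner of the prediction block
    if rx < 0 or ry < 0 or ry + height > len(picture) or rx + width > len(picture[0]):
        raise ValueError("block vector points outside the picture")
    return [row[rx:rx + width] for row in picture[ry:ry + height]]
```

Because the decoder receives the same block vector in the bitstream, calling the same routine on its own reconstructed samples yields the same prediction block.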

The image encoding apparatus may perform palette-based encoding on the current block and decode the encoded current block by using a palette-based coding technique to be described later. To this end, the image encoding apparatus may further include a palette-based encoding unit as, for example, a module of the prediction unit 120.

The subtractor 130 generates a residual block by subtracting the prediction block generated by the intra prediction unit 122 or the inter prediction unit 124 from the current block.

The transform unit 140 divides the residual block into one or more transform blocks, applies the transform to one or more transform blocks, and transforms residual values of the transform blocks from the pixel domain to the frequency domain. In the frequency domain, transformed blocks are referred to as coefficient blocks comprising one or more transform coefficient values. A 2D transformation kernel may be used for transformation, and a 1D transformation kernel may be used for each of the horizontal and vertical directions. The transform kernel may be based on discrete cosine transform (DCT), discrete sine transform (DST), or the like.
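The use of a 1D kernel in each of the horizontal and vertical directions can be illustrated with a floating-point, orthonormal DCT-II. This Python sketch is an assumption-laden illustration: practical codecs use integer approximations of such kernels, and the function names are chosen here for clarity only.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(values[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2D transform: 1D DCT on each row, then on each column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

For a flat 4x4 residual block in which every sample equals 10, all of the energy lands in the DC coefficient (value 40) and the remaining coefficients are zero up to rounding error.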

The transform unit 140 may transform the residual signals in the residual block by using the entire size of the residual block as the transform unit. In addition, the transform unit 140 may divide the residual block into two sub-blocks in a horizontal or vertical direction and perform transformation on only one of the two sub-blocks, as described later with reference to FIGS. 5A to 5D. Accordingly, the size of the transform block may be different from the size of the residual block (and thus the size of the prediction block). Non-zero residual sample values may not exist or may be very sparse in the subblock on which transformation is not performed. The residual samples of the subblock on which transformation is not performed are not signaled, and may be regarded as "0" by the image decoding apparatus. There can be several partition types depending on the partitioning direction and the partitioning ratio. The transform unit 140 may provide the entropy encoding unit 155 with information on the coding mode (or transform mode) of the residual block (e.g., information indicating whether the residual block or a residual subblock is transformed, information indicating the partition type selected to divide the residual block into subblocks, and information identifying the subblock on which transformation is performed). The entropy encoder 155 may encode the information on the coding mode (or transform mode) of the residual block.

The quantization unit 145 quantizes the transform coefficients output from the transform unit 140 and outputs the quantized transform coefficients to the entropy encoding unit 155. For certain blocks or frames, the quantization unit 145 may directly quantize the associated residual block without transformation.

The rearrangement unit 150 may rearrange the quantized coefficient values. The rearrangement unit 150 may change a two-dimensional coefficient array into a one-dimensional coefficient sequence through coefficient scanning. For example, the rearrangement unit 150 may scan from the DC coefficient to coefficients in the high-frequency region using a zig-zag scan or a diagonal scan to output a one-dimensional coefficient sequence. Depending on the size of the transform unit and the intra prediction mode, a vertical scan that scans the two-dimensional coefficient array in the column direction or a horizontal scan that scans it in the row direction may be used instead of the zig-zag scan. That is, the scan method to be used may be selected from among the zig-zag scan, diagonal scan, vertical scan, and horizontal scan according to the size of the transform unit and the intra prediction mode.

The entropy encoding unit 155 generates a bitstream by encoding the one-dimensional sequence of quantized transform coefficients output from the rearrangement unit 150 using various encoding methods such as Context-based Adaptive Binary Arithmetic Coding (CABAC) and Exponential Golomb coding.

In addition, the entropy encoder 155 encodes information related to block division, such as the CTU size, the CU split flag, the QT split flag, the MTT split type, and the MTT split direction, so that the image decoding apparatus can divide blocks in the same way as the image encoding apparatus. The entropy encoder 155 also encodes information on a prediction type indicating whether the current block is encoded by intra prediction or inter prediction, and, according to the prediction type, encodes intra prediction information (i.e., information on the intra prediction mode) or inter prediction information (information on the reference picture and the motion vector).
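The conversion between a two-dimensional coefficient array and a one-dimensional sequence can be sketched in Python as below. The traversal used for the zig-zag order follows the pattern familiar from still-image coding and is an assumption here; the exact scan orders of the described apparatus are not given in this passage, and all function names are illustrative.

```python
def zigzag_order(n):
    """(row, col) visiting order of a zig-zag scan over an n x n block."""
    order = []
    for s in range(2 * n - 1):
        # Positions on anti-diagonal s, listed with the row decreasing.
        diag = [(s - c, c) for c in range(n) if 0 <= s - c < n]
        order += diag if s % 2 == 0 else diag[::-1]
    return order


def horizontal_order(n):
    """Row-by-row visiting order."""
    return [(r, c) for r in range(n) for c in range(n)]


def vertical_order(n):
    """Column-by-column visiting order."""
    return [(r, c) for c in range(n) for r in range(n)]


def scan(block, order):
    """Flatten a 2D coefficient array into a 1D sequence."""
    return [block[r][c] for r, c in order]


def unscan(sequence, order, n):
    """Inverse of scan: rebuild the 2D array from the 1D sequence."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(sequence, order):
        block[r][c] = value
    return block
```

Applying `unscan` with the same order restores the original array, which is the operation a decoder needs in order to undo the scan.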
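Of the coding methods named above, Exponential Golomb coding is simple enough to show in full. The Python sketch below implements the order-0 code for unsigned integers as bit strings; the function names and the string representation are choices made for illustration, and CABAC is not covered.

```python
def exp_golomb_encode(value):
    """Order-0 Exp-Golomb codeword for a non-negative integer, as a bit string."""
    code = bin(value + 1)[2:]               # binary form of value + 1
    return "0" * (len(code) - 1) + code     # prefix of leading zeros, then the value


def exp_golomb_decode(bits):
    """Decode one order-0 Exp-Golomb codeword from the start of a bit string."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2) - 1
```

Small values receive short codewords: 0 maps to "1", 1 to "010", 2 to "011", and 3 to "00100".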

The inverse quantization unit 160 inverse quantizes the quantized transform coefficients output from the quantization unit 145 to generate transform coefficients. The inverse transform unit 165 converts transform coefficients output from the inverse quantization unit 160 from the frequency domain to the spatial domain to restore the residual block.
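The relationship between quantization and inverse quantization can be illustrated with a uniform scalar quantizer. This Python sketch assumes a single step size and rounding to the nearest level; the actual quantizer design, including how the step is derived from a quantization parameter, is not described in this passage.

```python
def quantize(coeffs, step):
    """Map each transform coefficient to an integer level (round half away from zero)."""
    def level(c):
        magnitude = int(abs(c) / step + 0.5)
        return magnitude if c >= 0 else -magnitude
    return [[level(c) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Recover approximate transform coefficients from the quantized levels."""
    return [[lv * step for lv in row] for row in levels]
```

With a step of 4, the coefficients [[40.0, -7.0], [3.0, 0.4]] become the levels [[10, -2], [1, 0]] and are restored as [[40, -8], [4, 0]], which shows the loss introduced by quantization.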

The adder 170 restores the current block by adding the restored residual block and the prediction block generated by the prediction unit 120. The pixels in the reconstructed current block are used as reference pixels when intra-predicting the next block.

The filter unit 180 filters the reconstructed pixels to reduce blocking artifacts, ringing artifacts, blurring artifacts, and the like that occur due to block-based prediction and transformation/quantization. The filter unit 180 may include a deblocking filter 182 and a sample adaptive offset (SAO) filter 184.

The deblocking filter 182 filters the boundaries between reconstructed blocks to remove blocking artifacts caused by block-based encoding/decoding, and the SAO filter 184 performs additional filtering on the deblocking-filtered image. The SAO filter 184 is a filter used to compensate for the difference between a reconstructed pixel and the original pixel caused by lossy coding.

The reconstructed block filtered through the deblocking filter 182 and the SAO filter 184 is stored in the memory 190. When all blocks in one picture are reconstructed, the reconstructed picture may be used as a reference picture for inter prediction of a block in a picture to be encoded later. Meanwhile, when deblocking is performed, deblocking filtering is not applied to the palette-coded block on one side of the block boundary.

FIG. 4 is an exemplary functional block diagram of an image decoding apparatus capable of implementing the techniques of the present disclosure. Hereinafter, an image decoding apparatus and sub-components of the apparatus will be described with reference to FIG. 4.

The image decoding apparatus may include an entropy decoding unit 410, a rearrangement unit 415, an inverse quantization unit 420, an inverse transform unit 430, a prediction unit 440, an adder 450, a filter unit 460, and a memory 470.

Like the image encoding apparatus of FIG. 1, each component of the image decoding apparatus may be implemented as hardware or software, or may be implemented as a combination of hardware and software. In addition, functions of each component may be implemented as software, and a microprocessor may be implemented to execute a function of software corresponding to each component.

The entropy decoding unit 410 decodes the bitstream generated by the image encoding apparatus, extracts information related to block division to determine the current block to be decoded, and extracts the prediction information and the information on the residual signal needed to restore the current block.

The entropy decoding unit 410 determines the size of the CTU by extracting information on the CTU size from a sequence parameter set (SPS) or a picture parameter set (PPS), and divides the picture into CTUs of the determined size. Then, the CTU is determined as the uppermost layer of the tree structure, that is, the root node, and the CTU is divided using the tree structure by extracting partition information for the CTU.
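Dividing a picture into CTUs of the signaled size can be sketched as follows. The function name `ctu_grid` is hypothetical, and the handling of CTUs that extend past the right or bottom picture edge (clipping them to the picture) is an assumption for illustration.

```python
def ctu_grid(pic_width, pic_height, ctu_size):
    """Return (x, y, width, height) of every CTU in raster-scan order."""
    cols = -(-pic_width // ctu_size)     # ceiling division
    rows = -(-pic_height // ctu_size)
    grid = []
    for r in range(rows):
        for c in range(cols):
            x, y = c * ctu_size, r * ctu_size
            # CTUs on the right and bottom edges are clipped to the picture.
            grid.append((x, y, min(ctu_size, pic_width - x), min(ctu_size, pic_height - y)))
    return grid
```

For a 1920x1080 picture and a CTU size of 128, this gives 15 columns and 9 rows of CTUs, with the bottom row only 56 samples high.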

For example, when the CTU is split using the QTBTTT structure, a first flag (QT_split_flag) related to QT splitting is first extracted, and each node is split into four nodes of a lower layer. Then, for a node corresponding to a leaf node of the QT, a second flag (MTT_split_flag) related to MTT splitting and information on the split direction (vertical/horizontal) and/or split type (binary/ternary) are extracted, and the leaf node is split in the MTT structure. In this way, each node below a leaf node of the QT is recursively divided in a BT or TT structure.

As another example, when the CTU is split using the QTBTTT structure, a CU split flag (split_cu_flag) indicating whether the CU is split may be extracted first, and if the corresponding block is split, the first flag (QT_split_flag) may then be extracted. In the splitting process, each node may undergo zero or more recursive QT splits followed by zero or more recursive MTT splits. For example, MTT splitting may occur immediately in the CTU, or conversely, only multiple QT splits may occur.
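The recursive QT-then-MTT parsing described above can be sketched in Python under simplifying assumptions. The flags are taken from an iterator of already entropy-decoded values; the order "split flag, direction flag, type flag", the parameter `min_mtt_side`, and the function name `parse_node` are hypothetical, and the optional split_cu_flag and the MaxBTSize check are omitted.

```python
def parse_node(bits, x, y, w, h, min_qt_size=16, min_mtt_side=16, in_mtt=False):
    """Return the leaf CUs (x, y, w, h) of one node, reading flags from bits."""
    # QT stage: a QT split flag is read only while QT splitting is still allowed.
    if not in_mtt and w > min_qt_size and next(bits):
        hw, hh = w // 2, h // 2
        leaves = []
        for dy in (0, hh):
            for dx in (0, hw):
                leaves += parse_node(bits, x + dx, y + dy, hw, hh,
                                     min_qt_size, min_mtt_side)
        return leaves
    # MTT stage: split flag, then direction (1 = vertical), then type (1 = binary).
    if min(w, h) >= min_mtt_side and next(bits):
        vertical = next(bits)
        parts = [1, 1] if next(bits) else [1, 2, 1]
        total = sum(parts)
        leaves, offset = [], 0
        for p in parts:
            if vertical:
                cw = w * p // total
                leaves += parse_node(bits, x + offset, y, cw, h,
                                     min_qt_size, min_mtt_side, True)
                offset += cw
            else:
                ch = h * p // total
                leaves += parse_node(bits, x, y + offset, w, ch,
                                     min_qt_size, min_mtt_side, True)
                offset += ch
        return leaves
    return [(x, y, w, h)]
```

Once a node enters the MTT stage, its descendants never return to QT splitting, which matches the description of QT splits followed by MTT splits.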

As another example, when the CTU is divided using the QTBT structure, each node is divided into four nodes of a lower layer by extracting the first flag (QT_split_flag) related to the division of the QT. In addition, a split flag indicating whether or not the node corresponding to the leaf node of the QT is further split into BT and split direction information are extracted.

Meanwhile, when determining the current block to be decoded through the division of the tree structure, the entropy decoder 410 extracts information on a prediction type indicating whether the current block is intra prediction or inter prediction. When the prediction type information indicates intra prediction, the entropy decoder 410 extracts a syntax element for intra prediction information (intra prediction mode) of the current block. When the prediction type information indicates inter prediction, the entropy decoder 410 extracts a syntax element for the inter prediction information, that is, information indicating a motion vector and a reference picture referenced by the motion vector.

On the other hand, the entropy decoder 410 includes information on the coding mode of the residual block (e.g., information on whether the residual block is encoded or only the subblocks of the residual block are encoded, and is selected to divide the residual block into subblocks). Information indicating the partition type, information identifying the encoded residual subblock, quantization parameters, etc.) are extracted from the bitstream. In addition, the entropy decoder 410 extracts information on quantized transform coefficients of the current block as information on the residual signal.

The rearrangement unit 415, in the reverse order of the coefficient scanning order performed by the image encoding apparatus, reconverts the sequence of one-dimensional quantized transform coefficients entropy-decoded by the entropy decoder 410 into a two-dimensional coefficient array (ie, Block).

The inverse quantization unit 420 inverse quantizes the quantized transform coefficients, and the inverse transform unit 430 inversely transforms the inverse quantized transform coefficients from the frequency domain to the spatial domain based on information on the coding mode of the residual block By reconstructing the signals, a reconstructed residual block for the current block is generated.

When the information on the coding mode of the residual block indicates that the residual block of the current block is coded in the image encoding apparatus, the inverse transform unit 430 determines the size of the current block (and thus, to be reconstructed) with respect to the inverse quantized transformation coefficients. A reconstructed residual block for the current block is generated by performing inverse transformation using the residual block size) as a transformation unit.

In addition, when the information on the coding mode of the residual block indicates that only one subblock of the residual block is coded in the image encoding apparatus, the inverse transform unit 430 generates a reconstructed residual block for the current block by performing inverse transformation on the inverse quantized transform coefficients using the size of the transformed subblock as a transform unit to restore the residual signals for the transformed subblock, and by filling the residual signals for the untransformed subblocks with a value of "0".

The prediction unit 440 may include an intra prediction unit 442 and an inter prediction unit 444. The intra prediction unit 442 is activated when the prediction type of the current block is intra prediction, and the inter prediction unit 444 is activated when the prediction type of the current block is inter prediction.

The intra prediction unit 442 determines the intra prediction mode of the current block among a plurality of intra prediction modes from the syntax element for the intra prediction mode extracted by the entropy decoder 410, and predicts the current block using reference pixels around the current block according to the intra prediction mode.

The inter prediction unit 444 determines a motion vector of the current block and a reference picture referenced by the motion vector using the syntax element for the inter prediction information extracted by the entropy decoder 410, and predicts the current block using the motion vector and the reference picture.

The prediction unit 440 may further use an intra block copy (IBC) mode. The prediction unit 440 may use a block vector decoded from the bitstream by the entropy decoder 410 in order to identify the same prediction block as the one selected by the image encoding apparatus.

The image decoding apparatus may reconstruct the current block by performing palette-based decoding on the current block using a palette-based coding technique to be described later. The image decoding apparatus may further include a palette-based decoding unit as, for example, a module of the prediction unit 440.

The adder 450 reconstructs the current block by adding the residual block output from the inverse transform unit 430 and the prediction block output from the inter prediction unit 444 or the intra prediction unit 442. The pixels in the reconstructed current block are used as reference pixels for intra prediction of a block to be decoded later.

The filter unit 460 may include a deblocking filter 462 and an SAO filter 464. The deblocking filter 462 performs deblocking filtering on the boundary between reconstructed blocks in order to remove blocking artifacts caused by decoding in units of blocks. The SAO filter 464 performs additional filtering on the reconstructed block after deblocking filtering in order to compensate for the difference between the reconstructed pixel and the original pixel caused by lossy coding. The reconstructed block filtered through the deblocking filter 462 and the SAO filter 464 is stored in the memory 470. When all blocks in one picture are reconstructed, the reconstructed picture is used as a reference picture for inter prediction of a block in a picture to be decoded later. When deblocking is performed, deblocking filtering may not be applied to a block decoded in the palette mode on one side of the block boundary.

The following description mainly focuses on a decoding technique, that is, an operation of an image decoding apparatus, and descriptions of encoding techniques are simplified because they are opposite to the comprehensively described decoding technique.

One aspect of the present disclosure relates to improving the parallel coding of blocks of video data.

Various video coding standards, including the High Efficiency Video Coding (HEVC) standard, support parallel processing mechanisms such as Virtual Pipeline Data Units (VPDUs), tiles, and Wavefront Parallel Processing (WPP) so that different blocks in the same picture can be decoded simultaneously.

From the perspective of hardware implementation of the decoder, the decoder may be designed to organize the decoding process into several pipeline stages and process them in parallel, where the data unit input to or output from each pipeline stage is referred to as a Virtual Pipeline Data Unit (VPDU). The size of the VPDU is determined by the largest transform block. In the case of other blocks such as prediction blocks, a given block can be divided into arbitrarily small blocks for processing, but this approach cannot be applied to transform blocks. In the current discussion of VVC standardization, a transform having a size of up to 64×64 is used based on the luma component, and pipelines operating on a 64×64 block size can be used in a hardware decoder.

Tiles partition a picture into a plurality of independently decodable rectangular regions so that a video decoding apparatus can decode a plurality of tiles in parallel.

In WPP, each row of CTUs in a picture is referred to as a "wavefront". Unlike tiles, wavefronts are not independently decodable, but a video decoder can decode several wavefronts in parallel by sequentially delaying the time point at which decoding of each wavefront starts. For example, when the video decoder uses WPP to decode a picture, the video decoder starts decoding the second wavefront, located below the first wavefront, after decoding two consecutive CTUs of the first wavefront. Accordingly, it can be ensured that any information on the first wavefront required for decoding the second wavefront is available at the time of decoding the second wavefront. The time that the video decoder must wait after starting to decode the (N-1)-th CTU row and before starting to decode the N-th CTU row may be referred to as a delay. In the WPP structure of HEVC, each CTU row is processed with a delay of two consecutive CTUs relative to the upper CTU row. To mitigate the potential loss of coding efficiency due to conventional CABAC initialization at the beginning of each CTU row, in WPP, CABAC context information is propagated from the second CTU of the preceding CTU row (i.e., the upper-right CTU) to the first CTU of the current CTU row.
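By way of a non-limiting illustration, the staggered start of wavefronts described above may be sketched as follows. The sketch assumes, purely for illustration, that every CTU takes one time unit to decode and that one thread is assigned to each CTU row; the function name wpp_start_times and its parameters are hypothetical and are not part of the disclosure.

```python
def wpp_start_times(num_rows, num_cols, delay_ctus=2):
    """Earliest decode start time of every CTU under WPP.

    Illustrative assumptions: one time unit per CTU, one thread per CTU row.
    A CTU may start once its left neighbour is finished and, for rows below
    the first, once the CTU located (delay_ctus - 1) columns to its right in
    the row above is finished.
    """
    start = [[0] * num_cols for _ in range(num_rows)]
    for r in range(num_rows):
        for c in range(num_cols):
            left_done = start[r][c - 1] + 1 if c > 0 else 0
            above_done = 0
            if r > 0:
                # Clamp at the picture boundary for the last columns.
                dep_col = min(c + delay_ctus - 1, num_cols - 1)
                above_done = start[r - 1][dep_col] + 1
            start[r][c] = max(left_done, above_done)
    return start


if __name__ == "__main__":
    for row in wpp_start_times(3, 4, delay_ctus=2):
        print(row)
```

Under these assumptions, each additional CTU row starts delay_ctus time units after the row above it, so a smaller delay lets more rows be in flight at the same time.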

While the maximum CTU size in HEVC was 64×64, in the discussion of VVC standardization the maximum CTU size has increased to 128×128, and accordingly the parallel processing capability that the WPP structure with a 2-CTU delay can provide has decreased sharply.

In the WPP mode, as the delay between CTU rows is reduced, the parallel processing performance increases, but the range of pixel references for intra prediction and the search range of the block vector for intra block copy (IBC) are limited, and CABAC context information that has undergone relatively few updates is propagated to the first CTU of the next wavefront, so that coding efficiency decreases. That is, in the WPP structure, there is a trade-off between coding efficiency and parallel processing performance.

According to an aspect of the present disclosure, an improved WPP structure that minimizes deterioration of coding efficiency while having a lower waiting time compared to the 2-CTU delay structure of HEVC is proposed.

As an example, a 1-CTU (4-VPDU) delayed WPP structure may be considered. FIG. 5 is a conceptual diagram illustrating a wavefront parallel encoding and decoding scheme of a 1-CTU (4-VPDU) delay structure according to an aspect of the present disclosure. According to the proposed 1-CTU (4-VPDU) delayed WPP structure, compared to the HEVC framework, the pipeline delay of WPP is reduced from 2 CTUs to 4 VPDUs (1 CTU). The CABAC context information of the block corresponding to the first VPDU of each CTU row can be updated by retrieving the CABAC context information from the bottom-right VPDU in the first CTU of the preceding CTU row that has already been decoded. This 1-CTU (4-VPDU) delay structure has higher parallelism than the 2-CTU delay structure of the HEVC framework, but restrictions with respect to the upper-right block must be added to the pixel references for intra prediction and to the block vectors for intra block copy (or intra line copy).

As another example, a 1.5-CTU (6-VPDU) delayed WPP structure may be considered. FIG. 6 is a conceptual diagram illustrating a wavefront parallel encoding and decoding scheme of a 1.5-CTU (6-VPDU) delay structure according to an aspect of the present disclosure.

According to the proposed 1.5-CTU (6-VPDU) delay structure, the restrictions on the pixel reference for intra prediction and on the block vector for intra block copy are relaxed compared to the 1-CTU (4-VPDU) delay structure. In addition, more efficient encoding and decoding is possible because the CABAC context information of the first CTU of each CTU row is set using CABAC context information that has been updated more than in the 1-CTU (4-VPDU) delay structure.

Referring to FIG. 6, in the 1.5-CTU (6-VPDU) delay structure, the CABAC context information of the block corresponding to the first VPDU of each CTU row can be updated by retrieving the CABAC context information from the top-right VPDU in the second CTU of the preceding CTU row that has already been decoded.

In the example of FIG. 6, a case in which the size of the VPDU = N and the size of the CTU = 2N is illustrated, and in this case, pixel reference for intra prediction in the left, upper left, upper and upper right directions may be allowed. In addition, the use of a block vector for intra block copies (or intra line copies) in those directions may be allowed.

If the size of the current block (CU) is larger than the size of the VPDU, a restriction may be added to the intra prediction mode or the intra block copy mode.

FIG. 7 is a diagram for explaining a constraint added to an intra prediction mode or an intra block copy mode when a current block is larger than the size of a VPDU in a 1.5-CTU (6-VPDU) delay structure. In FIG. 7, O/X indicated in the VPDU indicates the availability of the VPDU when coding the current block.

As shown in FIG. 7(a), when the current CU is larger than the size of the VPDU, reference pixels included in the X-marked VPDUs in the upper-right direction are not available. In this case, by filling the unavailable reference pixels with the value of the rightmost pixel among the available upper reference pixels, the use of an intra prediction mode that refers to pixels in the upper-right direction may be allowed for intra prediction of the current CU.
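By way of a non-limiting illustration, the padding of unavailable upper-right reference pixels described above may be sketched as follows. The function name pad_upper_reference and the representation of the upper reference row as a flat list are assumptions made for this sketch only.

```python
def pad_upper_reference(ref_row, num_available):
    """Pad the upper reference row of a block.

    ref_row: upper and upper-right reference pixels, left to right.
    num_available: number of leading pixels that are available; the rest
    (e.g., pixels lying in X-marked VPDUs) are treated as unavailable.
    Unavailable pixels are replaced by the rightmost available pixel.
    """
    if not 0 < num_available <= len(ref_row):
        raise ValueError("at least one reference pixel must be available")
    fill = ref_row[num_available - 1]
    return list(ref_row[:num_available]) + [fill] * (len(ref_row) - num_available)


if __name__ == "__main__":
    print(pad_upper_reference([10, 20, 30, 0, 0], 3))
```

Because the padded row has the same length as the original one, an angular intra prediction mode that points to the upper right can run unchanged on the padded row.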

When the prediction mode of the current CU is intra block copy (or intra line copy), the use of block vectors pointing to blocks corresponding to the two lower VPDUs in the upper-right CTU is restricted. Accordingly, the video encoder may exclude blocks corresponding to the lower two VPDUs in the upper-right CTU from the motion search area for intra block copy. In addition, when a block vector in the upper-right direction indicating blocks corresponding to the upper two VPDUs in the upper-right CTU is selected, the video encoder may signal a block vector obtained by subtracting an offset of the size of the unavailable region from the selected block vector (or a block vector scaled by 1/2). When the video decoder decodes the block vector in the upper-right direction from the bitstream, the video decoder may restore the original block vector by adding the offset to the decoded block vector.

Similarly, as shown in FIG. 7(b), when the current CU is larger than the size of the VPDU, reference samples included in the X-marked VPDU in the upper-right direction are not available. In this case, by filling the unavailable reference pixels with the rightmost pixel value among the available upper reference pixels, the use of an intra prediction mode that refers to the upper-right pixels may be allowed for intra prediction of the current CU.

In some embodiments, when the size of the CTU is equal to or smaller than the size of the VPDU (i.e., the size of the CTU <= the size of the VPDU), the video encoder and decoder may perform encoding and decoding with a 2-CTU delay rather than a 1.5-CTU delay. Accordingly, the encoder and decoder can initialize the CABAC context information of the first CTU of the current CTU row by using the CABAC context information of the upper-right CTU. In this case, separate signaling, such as a flag indicating that the coding is performed in a WPP structure with a 2-CTU delay, may not be required.

Meanwhile, a video encoder and decoder may typically determine whether to apply WPP in units of sequences. In some embodiments, whether to apply WPP may be determined in units of subgroups of a picture (e.g., subpictures, slices, tiles, CTU groups, etc.). The video encoder may signal a flag indicating whether to apply WPP (e.g., wpp_enable_flag) for each of the aforementioned units, and the video decoder may determine whether to perform WPP for each unit by parsing the flag from the bitstream. In some cases, when the width of a subgroup of a picture to be encoded or decoded is smaller than a predetermined width (e.g., "(width of subgroup / width of CTU) < threshold value"), the video encoder and decoder may not apply WPP to the corresponding subgroup. In this case, encoding and decoding of the WPP flag is omitted, and the video decoder may implicitly disable WPP.
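By way of a non-limiting illustration, the decision on whether WPP applies to a subgroup may be sketched as follows. The function name wpp_enabled, the callable parse_flag standing in for bitstream parsing of wpp_enable_flag, and the concrete threshold are assumptions made for this sketch only.

```python
def wpp_enabled(subgroup_width, ctu_width, threshold, parse_flag):
    """Decide whether WPP is applied to one subgroup.

    parse_flag: hypothetical callable that reads wpp_enable_flag from the
    bitstream and returns 0 or 1. It is only called when the flag is present.
    """
    if subgroup_width / ctu_width < threshold:
        # Narrow subgroup: the flag is not signalled and WPP is
        # implicitly disabled.
        return False
    return bool(parse_flag())


if __name__ == "__main__":
    print(wpp_enabled(256, 128, 4, lambda: 1))   # narrow subgroup, flag skipped
    print(wpp_enabled(1024, 128, 4, lambda: 1))  # wide subgroup, flag parsed
```

Checking the width condition before touching the bitstream keeps the encoder and decoder in step, since neither side codes the flag for a narrow subgroup.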

When a picture is divided into a plurality of subgroups to be encoded and decoded, dependency between subgroups may be controlled at a higher level or at a subgroup level. The possibility of such dependency may be signaled through one or more syntax elements at a higher level, or may be signaled through a flag for each subgroup. For example, a picture may be encoded so that none of its subgroups (e.g., CTU groups, tiles, tile groups, slices, subpictures, etc.) has a dependency, or so that only some of its subgroups have no dependency.

Accordingly, each subgroup in a picture may be decoded independently (or in parallel) with other subgroups, and some subgroups may be decoded depending on information of another subgroup that has already been decoded. In this case, initializing the CABAC context information of the first CTU of the current subgroup by using the CABAC context information of the CTU of another subgroup previously encoded and decoded may provide a gain in encoding efficiency.

FIG. 8 illustrates a picture divided into a plurality of subgroups. In FIG. 8, it is assumed that subgroup A is a subgroup that cannot be independently decoded, and that some of the remaining subgroups can be decoded independently (or in parallel). Whether each subgroup is independently decoded (or sequentially decoded) may be signaled in the bitstream by the encoder through a flag. In order to initialize the CABAC context information of the first CTU of subgroup A, the encoder and decoder may search the preceding subgroups in reverse order of the Z-scan order to determine whether a subgroup that has already been encoded and decoded exists. The encoder and decoder may initialize the context information of the current CTU by using the CABAC context information of the last CTU of the subgroup that has already been coded before the coding of the first CTU of subgroup A. For example, if subgroup B, which is a preceding subgroup adjacent to subgroup A, and subgroup A are encoded so that the two subgroups are decoded sequentially (i.e., if subgroup B is a subgroup that cannot be independently decoded), the decoder may obtain the CABAC context information of the last decoded CTU of subgroup B and use it to initialize the context information of the first CTU of subgroup A.

The video decoder may parse the flag from the bitstream and determine whether the current subgroup is a subgroup that can be independently decoded (S910 to S920).

If the current subgroup is not a subgroup that can be independently decoded ("No" in S920), the video decoder may search the preceding subgroups, in the reverse order of the Z-scan order, for a subgroup that cannot be independently decoded (S940). When a subgroup that cannot be independently decoded is found ("No" in S950), the video decoder may set the CABAC context information of the first CTU (or VPDU) of the current subgroup by using the CABAC context information of the CTU (or VPDU) last decoded in the found subgroup (S960).

If the current subgroup is a subgroup that can be independently decoded ("Yes" in S920), context information of the first CTU (or VPDU) of the current subgroup may be initialized to a preset value (e.g., 0 or 0.5) (S930). In some embodiments, information (e.g., a specific value or table and/or index) for initialization of the CABAC context information of the first CTU (or VPDU) of a subgroup that can be decoded independently (or in parallel) may be signaled in the bitstream.
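By way of a non-limiting illustration, the context initialization flow of steps S920 to S960 may be sketched as follows. The dictionary layout of a subgroup, the function name init_subgroup_context, and the fallback to the preset value when no suitable preceding subgroup is found are assumptions made for this sketch only.

```python
def init_subgroup_context(index, subgroups, default_context):
    """Choose the initial CABAC context for the first CTU of a subgroup.

    subgroups: list in coding (Z-scan) order; each item is a dict with
      'independent'  - True if the subgroup is independently decodable
      'last_context' - context state after its last CTU, or None if unknown
    """
    current = subgroups[index]
    if current["independent"]:                      # "Yes" in S920
        return default_context                      # S930: preset value
    # S940: walk the preceding subgroups in reverse Z-scan order.
    for prev in reversed(subgroups[:index]):
        if not prev["independent"] and prev["last_context"] is not None:
            return prev["last_context"]             # S960: inherit context
    # Assumption of this sketch: fall back to the preset value when no
    # dependent preceding subgroup exists.
    return default_context


if __name__ == "__main__":
    groups = [
        {"independent": True, "last_context": "ctx0"},
        {"independent": False, "last_context": "ctxB"},
        {"independent": False, "last_context": None},
    ]
    print(init_subgroup_context(2, groups, "preset"))
```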

Hereinafter, techniques for palette-based coding of video data proposed in the present disclosure will be described.

A. Initialization and generation of the palette table

In palette-based video coding, a video encoder and decoder derive a palette table (also simply referred to as a "palette") for a block of pixels. Each entry in the palette table contains color component (eg, RGB, YUV, or the like) values or luma component values identified by indexes.

As part of coding a block in palette mode, the video encoder and decoder first determine the palette table to be used for the block. Then, the palette indices for each pixel (or sample) of the block can be coded to indicate which entry in the palette should be used to predict or reconstruct the pixel (sample).

Initializing a palette prediction list (also referred to as a palette predictor) generally refers to the process of generating a palette prediction list for the first block of a group of video blocks (e.g., a subpicture, slice, or tile). The palette prediction list for subsequent blocks is typically generated by updating the previously used palette prediction list. That is, after coding a given block in the palette mode, the encoder and decoder update the palette prediction list using the current palette. Entries used in the current palette are inserted into the new palette prediction list, and entries of the previous palette prediction list that are not used in the current palette may be added at positions following those entries in the new palette prediction list until the maximum allowed size of the palette prediction list is reached. However, in the case of the first block, since a previously used palette prediction list is not available, the palette prediction list for the first block is initialized to 0 in the prior art. Thus, the entries in the palette table for the first block are new entries explicitly signaled by the encoder.
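By way of a non-limiting illustration, the update of the palette prediction list after a block has been coded in the palette mode may be sketched as follows. The function name update_palette_predictor and the representation of palette entries as tuples are assumptions made for this sketch only.

```python
def update_palette_predictor(current_palette, previous_predictor, max_size):
    """Build the new palette prediction list after coding one block.

    Entries of the current palette come first; entries of the previous
    prediction list that the current palette does not use are appended
    after them until max_size is reached.
    """
    new_predictor = list(current_palette)[:max_size]
    for entry in previous_predictor:
        if len(new_predictor) >= max_size:
            break
        if entry not in current_palette:
            new_predictor.append(entry)
    return new_predictor


if __name__ == "__main__":
    palette = [(1, 1, 1), (2, 2, 2)]
    predictor = [(2, 2, 2), (3, 3, 3), (4, 4, 4)]
    print(update_palette_predictor(palette, predictor, max_size=3))
```

Placing the entries of the current palette first keeps the most recently used colors at the front of the list.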

The present disclosure proposes a technique for efficiently generating or initializing a palette table for a block that is first encoded/decoded from a group of video blocks (eg, picture, slice, tile, etc.).

According to an aspect of the present disclosure, a video encoder may signal a default palette table having a plurality of palette colors at a higher level (e.g., a Picture Parameter Set (PPS), Sequence Parameter Set (SPS), Adaptation Parameter Set (APS), slice header, etc.). The default palette table may be used to generate (i.e., initialize) a palette table for palette coding in a lower unit when a previously configured palette prediction list is not available.

The video decoder may determine entries of the palette table for the first block of the lower unit based on the default palette table signaled from the upper unit. The palette table for the first block of the lower unit may be referred to as “initial palette table” or “initial palette”. For example, when generating an initial palette table, a binary flag may be signaled for each entry to indicate which of the entries in the default palette table should be used for initialization of the palette table. A binary flag with a value of "1" may indicate that the related entry is used in the palette, and a binary flag with a value of "0" may indicate that the related entry is not used in the initial palette. The string of binary flags may also be referred to as an index vector. The index vector may be transmitted using run-length coding (of bins of 0 or 1). The video decoder may construct a palette table for palette decoding of the first CU by parsing a default palette table signaled in a higher unit and an index vector signaled in a lower unit.

In the example of FIG. 10, the default palette table signaled in the upper unit has 8 entries. The index vector indicates that the first, third, and eighth entries of the default palette table (i.e., the entries mapped to indexes 0, 2, and 7) are included in the initial palette table of the lower unit, and that the remaining entries (i.e., the entries mapped to indexes 1 and 3 to 6) are not included in the initial palette table.

In some cases, the number of entries reused from the default palette table may be signaled in an upper unit or a lower unit. In addition, the size of the initial palette table to be used in the lower unit (ie, the maximum number of entries) may be signaled. In some cases, an initial palette table of a fixed size may be used, and thus signaling about the size of the initial palette table to be used in a lower unit may not be required.

The palette for coding the current block may also contain one or more new palette entries that are explicitly coded (e.g., separately from the index vector). In the initial palette table illustrated in FIG. 10, the entries (r', g', b') corresponding to indexes 3 and 4 are not palette entries of the upper unit, but new entries explicitly signaled in the lower unit by the encoder. When all entries of the initial palette table are filled by the index vector, coding of syntax elements indicating a new palette entry (i.e., color values) may be skipped. In some cases, a flag indicating the presence or absence of new palette entries may be coded.
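By way of a non-limiting illustration, the construction of an initial palette table from a default palette table, an index vector, and explicitly signaled new entries may be sketched as follows. The function name build_initial_palette, the optional max_size check, and the placeholder color values are assumptions made for this sketch only.

```python
def build_initial_palette(default_palette, index_vector, new_entries=(), max_size=None):
    """Build the initial palette table of a lower unit.

    default_palette: entries signalled in the upper unit.
    index_vector: one binary flag per default entry (1 = reuse, 0 = skip).
    new_entries: entries explicitly signalled in the lower unit; they are
    placed after the reused entries.
    """
    if len(index_vector) != len(default_palette):
        raise ValueError("index vector must have one flag per default entry")
    palette = [e for e, flag in zip(default_palette, index_vector) if flag == 1]
    palette.extend(new_entries)
    if max_size is not None and len(palette) > max_size:
        raise ValueError("initial palette exceeds the signalled size")
    return palette


if __name__ == "__main__":
    # Mirrors the layout of FIG. 10 with placeholder colors: 8 default
    # entries, entries 0, 2 and 7 reused, two new entries at indexes 3 and 4.
    default = [("c%d" % i,) * 3 for i in range(8)]
    vector = [1, 0, 1, 0, 0, 0, 0, 1]
    new = [("n0",) * 3, ("n1",) * 3]
    for idx, entry in enumerate(build_initial_palette(default, vector, new)):
        print(idx, entry)
```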

In the case of slices using a dual tree, in which CU partitioning differs between luma and chroma components, a palette for each color component (e.g., Y palette, Cb palette, Cr palette) or two palettes (e.g., Y palette, Cb/Cr palette) may be used. In the case of a single tree, a single palette including all color component (Y, Cb, Cr) values in each entry may be used. In the case of monochrome, a single palette may be used.

B. Initialization of the palette table when WPP is enabled

When Wavefront Parallel Processing (WPP) is enabled, the palette table may need to be initialized at the first CTU (or VPDU) of each CTU row for parallel processing. In this case, the palette prediction list for the first CTU (or VPDU) of the current CTU row may be initialized using the palette data of an already decoded CTU or VPDU located above the current CTU row.

As an example, as shown in FIG. 11, when 2-CTU delayed WPP is used, the palette prediction list of the already decoded upper-right CTU of the current CTU may be obtained from the previous CTU row and used to initialize the palette prediction list for constructing the palette table of the first CTU of the current CTU row. As another example, as shown in FIG. 12, when 4-VPDU delayed WPP (i.e., 1-CTU delayed WPP) is used, the palette prediction list of the VPDUs that have already been decoded in the previous CTU row (i.e., in the upper CTU of the current CTU) may be used to initialize the palette prediction list for constructing the palette table of the first CTU in the current CTU row.
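By way of a non-limiting illustration, the selection of the source CTU for initializing the palette prediction list at the start of a CTU row may be sketched as follows. The storage layout predictors_by_row, the function name init_row_palette_predictor, the empty list used for the first CTU row, and the clamping for pictures that are one CTU wide are assumptions made for this sketch only.

```python
def init_row_palette_predictor(predictors_by_row, row, delay_ctus=2):
    """Initial palette prediction list for the first CTU of a CTU row.

    predictors_by_row[r][c]: prediction list saved after decoding CTU c of
    CTU row r (hypothetical storage layout).
    delay_ctus=2 -> inherit from the upper-right CTU (column 1);
    delay_ctus=1 -> inherit from the upper CTU (column 0).
    """
    if row == 0:
        # Assumption of this sketch: no CTU row above, start empty.
        return []
    previous_row = predictors_by_row[row - 1]
    # Clamp for pictures that are narrower than the delay.
    source_col = min(delay_ctus - 1, len(previous_row) - 1)
    return list(previous_row[source_col])


if __name__ == "__main__":
    saved = [[["red"], ["red", "blue"], ["green"]]]
    print(init_row_palette_predictor(saved, 1, delay_ctus=2))
    print(init_row_palette_predictor(saved, 1, delay_ctus=1))
```

Copying the inherited list lets the current CTU row update its own prediction list without modifying the state saved for the row above.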

In some embodiments, the palette prediction list of the already decoded CTU of the upper CTU row may be used as the palette prediction list of the first CTU of the current CTU row. In this case, the palette table of the first CTU of the current CTU row may be configured from the palette prediction list through signaling of an index vector and signaling of additional color component values, similar to the method illustrated in FIG. 10. In some other embodiments, the palette table of the already decoded upper CTU (in the case of 1-CTU delayed WPP) or upper-right CTU (in the case of 2-CTU delayed WPP) may be used as the palette table of the first CTU of the current CTU row. Samples that do not have color values expressed in the palette for coding blocks

On the other hand, the encoder and decoder may be configured to code and/or determine a flag (which may be referred to as a block-level escape flag) indicating whether any sample of a block is coded in an escape mode described below. For example, a flag value of 0 may indicate that no samples of the block are coded using the escape mode. That is, values of all samples of a block may be determined based on a color value included in a palette for coding a block. A flag value of 1 may indicate that at least one sample of the block is coded using the escape mode. In other words, the value of at least one sample is coded as an escape sample.

In some examples, a CU-level escape flag indicating whether the current CU has an escape sample may be signaled in the bitstream. The presence of an escape sample in the CU may affect the number of palette indices for the CU. For example, the palette of the CU generated from the palette prediction list may have N entry indices, so that an entry index for a sample can be selected from {0, 1, ..., N-1}. If the CU-level escape flag indicates that there is an escape sample in the current block, the encoder and decoder may add one index (not associated with any palette entry) to the palette for the current block, so that the possible index values in the current block are {0, 1, ..., N-1, N}. Here, an index equal to N (also referred to as an escape index) indicates that the associated sample is an escape sample. An index less than N indicates that the associated sample is represented by the color from the palette associated with that index.
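By way of a non-limiting illustration, the interpretation of a palette index in the presence of the escape index may be sketched as follows. The function name reconstruct_sample and the callable read_escape_value, which stands in for however the value of an escape sample is obtained, are assumptions made for this sketch only.

```python
def reconstruct_sample(index, palette, escape_present, read_escape_value):
    """Map one palette index to a sample value.

    palette: list of N entries, addressed by indices 0..N-1.
    escape_present: value of the CU-level escape flag.
    read_escape_value: hypothetical callable returning the value of an
    escape sample; it is only called for the escape index N.
    """
    n = len(palette)
    if escape_present and index == n:
        return read_escape_value()      # escape index: not a palette color
    if 0 <= index < n:
        return palette[index]           # regular palette entry
    raise ValueError("palette index out of range")


if __name__ == "__main__":
    pal = [(0, 0, 0), (255, 255, 255)]
    print(reconstruct_sample(1, pal, True, lambda: (12, 34, 56)))
    print(reconstruct_sample(2, pal, True, lambda: (12, 34, 56)))
```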

C. Scanning order of palette indices

The two-dimensional block of palette indices for each pixel (sample) in the CU is referred to as a palette index map. The video encoder may convert a 2D block of palette indices into a 1D array by scanning the palette indices using a scanning order. Similarly, the video decoder may reconstruct a block of palette indices using a scanning order. The previous sample refers to a sample that precedes the sample currently being coded in the scanning order.

In some embodiments, in order to scan the palette indices of a given CU, the horizontal traverse scanning order illustrated in FIG. 13(a) and the vertical traverse scanning order illustrated in FIG. 13(b) may be selectively used. In another embodiment, a horizontal scanning order and a vertical scanning order may be selectively used. The encoder may signal a flag indicating the selected scanning order for a given CU. In still other embodiments, the diagonal scanning order illustrated in FIG. 13(c) or the zigzag scanning order illustrated in FIG. 13(d) may be used to scan the palette indices of a given CU.
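By way of a non-limiting illustration, the horizontal and vertical traverse scanning orders may be generated as follows. The sketch assumes that traverse scanning reverses direction on every other line (a snake-like pattern), as in conventional palette coding; the function name traverse_scan is hypothetical.

```python
def traverse_scan(width, height, horizontal=True):
    """Return the (x, y) positions of a block in traverse scanning order.

    Horizontal traverse: rows top to bottom, alternating left-to-right and
    right-to-left. Vertical traverse: columns left to right, alternating
    top-to-bottom and bottom-to-top.
    """
    order = []
    if horizontal:
        for y in range(height):
            xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
            order.extend((x, y) for x in xs)
    else:
        for x in range(width):
            ys = range(height) if x % 2 == 0 else range(height - 1, -1, -1)
            order.extend((x, y) for y in ys)
    return order


if __name__ == "__main__":
    print(traverse_scan(3, 2, horizontal=True))
    print(traverse_scan(2, 3, horizontal=False))
```

With this pattern, consecutive positions in the one-dimensional array are always spatial neighbours, which tends to lengthen runs of identical indices.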

D. Coding of palette indices

Each sample in a block coded with a palette-based coding mode may be coded using one of two index coding modes as disclosed below.

COPY_ABOVE mode: In this mode, the palette index for the current sample is copied from the same position sample from the previous line (top row or left column) in the scanning order in the block.

INDEX mode: In this mode, the palette index is either explicitly signaled in the bitstream by the encoder using a syntax element expressed, for example, as a truncated binary code, or inferred by the decoder.

The INDEX mode includes a first INDEX mode in which the palette index of a previous sample position preceding in the scan order is copied, that is, inferred by the decoder, and a second INDEX mode in which the palette index is explicitly signaled.

In order to efficiently code the palette index of the current sample, the encoder and decoder may use, as CABAC context information, the index coding mode of the previous sample of the current sample and/or of the same-positioned sample (i.e., the upper sample or the left sample) in the previous line.

In the palette index coding scheme proposed in the present disclosure, for each sample position in a block, one or two flags for determining a mode are parsed. For each sample position, a first flag having a value of 0 or 1 is parsed, and a second flag having a value of 0 or 1 is parsed depending at least in part on the value of the first flag. One of the COPY_ABOVE mode, the first INDEX mode, and the second INDEX mode is determined according to a value derived from the one or more flags parsed for each sample position. The palette index for the corresponding sample position is signaled by the encoder and parsed by the decoder only when the determined mode is the second INDEX mode. That is, in the present disclosure, a block map in which index coding modes are allocated according to one or two flags for each sample position in a block is generated, and the palette index for each sample position is determined according to the block map.

In some embodiments in which the scanning orders illustrated in FIGS. 13(a) and 13(b) may be selectively used, for each sample in the current block, a first flag (e.g., run_copy_flag) is coded that indicates whether the current sample has the same index coding mode as the previous sample, that is, whether the current sample and the previous sample are both in the COPY_ABOVE mode, or are both in the INDEX mode and have the same index. When the first flag has a value of 0 and the previous sample is in the INDEX mode, a second flag (e.g., copy_above_palette_indices_flag) indicating whether the index coding mode of the current sample is INDEX or COPY_ABOVE may be additionally coded. In addition, a variable Copy_Above_Flag representing the index coding mode of the sample is introduced.

Table 1 shows how the palette index of the related sample is determined according to the values of the syntax element run_copy_flag and the variable Copy_Above_Flag.

FIG. 14 is a flowchart illustrating a method for a decoder to determine a palette index for a current sample according to an aspect of the present disclosure.

Referring to FIG. 14, the decoder may parse the first flag (run_copy_flag), which indicates whether the current sample has the same index coding mode as the previous sample, that is, whether the current sample and the previous sample are both COPY_ABOVE, or are both INDEX and have the same index (S1611).

If the first flag run_copy_flag has a value of 1 ("Yes" in S1412), the decoder sets Copy_Above_Flag of the current sample to the same value as Copy_Above_Flag of the previous sample (S1414). That is, if Copy_Above_Flag of the previous sample is 0, Copy_Above_Flag of the current sample is set to 0, and thus, referring to Table 1, the palette index of the current sample is copied from the previous sample. If Copy_Above_Flag of the previous sample is 1, Copy_Above_Flag of the current sample is set to 1, and thus, referring to Table 1, the palette index of the current sample is copied from the sample at the same position in the previous line (upper row or left column). In other words, the palette index of the current sample is copied from the sample at the same position in the upper row for the horizontal traverse scanning of FIG. 13(a), and from the sample at the same position in the left column for the vertical traverse scanning of FIG. 13(b).

If the first flag run_copy_flag has a value of 0 ("No" in S1412), the decoder determines whether Copy_Above_Flag of the previous sample has a value of 1 (S1416). If Copy_Above_Flag of the previous sample has a value of 1 ("Yes" in S1416), the decoder sets Copy_Above_Flag of the current sample to a value of 0 (S1418). Therefore, since run_copy_flag = 0 and Copy_Above_Flag = 0 for the current sample, referring to Table 1, the palette index of the current sample is explicitly signaled in the bitstream. The decoder parses a syntax element (palette_idx_idc) representing the palette index of the current sample from the bitstream (S1420). If Copy_Above_Flag of the previous sample has a value of 0 ("No" in S1416), the decoder further parses the second flag (copy_above_palette_indices_flag) (S1422).

If copy_above_palette_indices_flag = 1 ("Yes" in S1424), the decoder sets Copy_Above_Flag of the current sample to a value of 1 (S1426). Therefore, since run_copy_flag = 0 and Copy_Above_Flag = 1 for the current sample, referring to Table 1, the palette index of the current sample is copied from the sample at the same position in the previous line (top row or left column) in the scan order.

If copy_above_palette_indices_flag = 0 ("No" in S1424), the decoder sets Copy_Above_Flag of the current sample to a value of 0 (S1428). Therefore, since run_copy_flag = 0 and Copy_Above_Flag = 0 for the current sample, referring to Table 1, the palette index of the current sample is explicitly signaled in the bitstream. The decoder parses a syntax element (palette_idx_idc) indicating the palette index of the current sample from the bitstream (S1430).

Here, for samples located in the first row of FIG. 13(a) and the first column of FIG. 13(b), there is no previous scan line, so the second flag (copy_above_palette_indices_flag) is not signaled and is inferred to have a value of 0. That is, the index coding mode of samples located in the first row of FIG. 13(a) and the first column of FIG. 13(b) is regarded as the INDEX mode by default.
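By way of a non-limiting illustration, the parsing order of FIG. 14 may be sketched in Python as follows for the horizontal transverse scanning order. The function names, the representation of the bitstream as an iterator of already entropy-decoded values, and the handling of the very first sample of the block (its run_copy_flag is assumed not to be parsed and is treated as 0) are assumptions made only for this sketch. The sketch also reads palette_idx_idc immediately where it is needed, which is a simplification.

```python
from typing import Iterator, List, Tuple


def traverse_scan(width: int, height: int) -> Iterator[Tuple[int, int]]:
    """Yield (row, col) positions in a horizontal snake (transverse) order."""
    for r in range(height):
        cols = range(width) if r % 2 == 0 else range(width - 1, -1, -1)
        for c in cols:
            yield r, c


def decode_index_map(symbols: Iterator[int], width: int, height: int) -> List[List[int]]:
    """Rebuild a palette index map from a stream of parsed syntax element values."""
    idx = [[0] * width for _ in range(height)]
    prev_pos = None        # position of the previous sample in scan order
    prev_copy_above = 0    # Copy_Above_Flag of the previous sample
    for r, c in traverse_scan(width, height):
        # Assumption of this sketch: the first sample has no predecessor,
        # so its run_copy_flag is not parsed and is treated as 0.
        run_copy = next(symbols) if prev_pos is not None else 0
        if run_copy == 1:
            copy_above = prev_copy_above      # S1414
        elif prev_copy_above == 1:
            copy_above = 0                    # S1418
        elif r == 0:
            copy_above = 0                    # no previous line: flag inferred as 0
        else:
            copy_above = next(symbols)        # S1422: copy_above_palette_indices_flag
        if copy_above == 1:
            idx[r][c] = idx[r - 1][c]         # copy from the previous line
        elif run_copy == 1:
            pr, pc = prev_pos
            idx[r][c] = idx[pr][pc]           # copy from the previous sample
        else:
            idx[r][c] = next(symbols)         # S1420 / S1430: palette_idx_idc
        prev_pos, prev_copy_above = (r, c), copy_above
    return idx
```

For a 4x2 block, `decode_index_map(iter([0, 1, 0, 1, 1, 0, 1, 1, 1, 1]), 4, 2)` consumes one explicit index, then alternates flags and indices as described above, and the second row is recovered entirely by copying from the first row.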

It should be understood that the encoder can also code the palette index for each sample of the block in substantially the same order as illustrated in FIG. 14.

The encoder and decoder may perform the above-described palette index coding by dividing the one-dimensional array of palette indices into sample groups of a predefined size (e.g., 16 samples). When palette index coding for one sample group is finished, palette index coding for the next sample group may start. In addition, in palette index coding for one sample group, the syntax element (palette_idx_idc) for the samples that need it may be coded after coding of the first flag (run_copy_flag) and the second flag (copy_above_palette_indices_flag) is completed.
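As a non-limiting illustration of the group-wise ordering described above, the following Python sketch arranges per-sample syntax elements so that, within each sample group, all flags precede all explicit indices. The function name, the dictionary representation of a sample, and the choice to interleave the first and second flags sample by sample within the first pass are assumptions of this sketch.

```python
from typing import Dict, List, Optional, Tuple

Sample = Dict[str, Optional[int]]


def order_syntax_elements(samples: List[Sample], group_size: int = 16) -> List[Tuple[str, int]]:
    """Return (name, value) pairs in the order they would be written per sample group."""
    ordered: List[Tuple[str, int]] = []
    for start in range(0, len(samples), group_size):
        group = samples[start:start + group_size]
        # Pass 1: the first flag and, where present, the second flag.
        for s in group:
            ordered.append(("run_copy_flag", s["run_copy_flag"]))
            second = s.get("copy_above_palette_indices_flag")
            if second is not None:
                ordered.append(("copy_above_palette_indices_flag", second))
        # Pass 2: explicit palette indices for the samples that need one.
        for s in group:
            index = s.get("palette_idx_idc")
            if index is not None:
                ordered.append(("palette_idx_idc", index))
    return ordered
```

A small `group_size` makes the two passes easy to see; the default of 16 follows the example size given in the text.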

FIG. 15 is a conceptual diagram illustrating a method of coding a palette index map according to an aspect of the present disclosure. In FIG. 15, a palette index map 1510 for an 8×8 coding block is illustrated, and it is assumed that the horizontal transverse scanning order is used to scan the palette indices. The palette table has two entries associated with index 0 and 1, and index 3 as an escape index for the escape sample.

In FIG. 15, values of the first flag (run_copy_flag) and the second flag (copy_above_palette_indices_flag) signaled for the palette indices in the second row 1512 of the palette index map 1510 are shown. The samples indicated in bold in the second row 1512 are samples for which a syntax element (palette_idx_idc) explicitly expressing the related palette index is coded.

As described above, the INDEX mode is used for all samples of the first row 1511 in the given palette index map, and all of those samples have a palette index of 0. Since the INDEX mode is used for all samples in the first row 1511, the variable Copy_Above_Flag for the last sample (rightmost sample) in the first row 1511 has a value of 0. For the first sample in the scanning order (rightmost sample) in the second row 1512, the last sample (rightmost sample) of the first row 1511 is both the previous sample and the sample at the same position in the previous line. In the illustrated palette index map 1510, the palette index of the first sample in the scan order in the second row 1512 is the same as that of the previous sample and of the sample at the same position in the previous line. Therefore, the encoder may select the index coding mode to be used for coding the palette index of the first sample in the second row 1512 from the INDEX mode and the COPY_ABOVE mode, and this selection may be based on R/D testing.

If the encoder encodes the palette index of the first sample in the scan order in the second row 1512 in the COPY_ABOVE mode, a run_copy_flag with a value of 0 is signaled for the first sample in the second row 1512, and a copy_above_palette_indices_flag with a value of 1 is additionally signaled. The decoder parses run_copy_flag, and since run_copy_flag = 0 and Copy_Above_Flag of the previous sample (that is, the last sample of the first row 1511) has a value of 0, copy_above_palette_indices_flag is additionally parsed. Since copy_above_palette_indices_flag = 1, the decoder sets Copy_Above_Flag for the current sample to a value of 1. Then, since run_copy_flag = 0 and Copy_Above_Flag = 1 for the current sample, the decoder can determine (infer) the index coding mode of the current sample to be the COPY_ABOVE mode. That is, the palette index of the first sample in the second row 1512 is copied from the sample at the same position in the first row, which is the previous line.

In the second row 1512, the palette index of the second sample in the scan order is the same as that of the previous sample and of the sample at the same position in the previous line. Accordingly, the encoder may select an index coding mode to be used for coding the palette index of the second sample in the second row 1512 from the INDEX mode and the COPY_ABOVE mode. This choice can be based on R/D testing. If the COPY_ABOVE mode is selected, the second sample and the previous sample (first sample) in the scan order in the second row 1512 are both in the COPY_ABOVE mode. Accordingly, the encoder signals a run_copy_flag with a value of 1 for the second sample in the second row 1512. The decoder parses the run_copy_flag for the current sample (i.e., the second sample in the second row 1512), and since run_copy_flag = 1, sets Copy_Above_Flag of the current sample to the same value (i.e., 1) as that of the previous sample. Therefore, since run_copy_flag = 1 and Copy_Above_Flag = 1 for the current sample, the decoder can determine (infer) the index coding mode of the current sample (that is, the second sample of the second row 1512) to be the COPY_ABOVE mode.

In the second row 1512, the palette index of the third sample in the scan order is different from the palette index of the previous sample, and is also different from the palette index of the upper sample. Accordingly, the encoder selects the INDEX mode as the index coding mode of the third sample. Since the index coding modes of the third sample and the second sample are different, a run_copy_flag with a value of 0 is signaled, and since Copy_Above_Flag of the second sample, which is the previous sample, is 1, Copy_Above_Flag of the third sample is set to a value of 0. Since run_copy_flag = 0 and Copy_Above_Flag = 0 for the third sample, the encoder additionally signals a syntax element (palette_idx_idc) specifying the palette index value of the third sample.

The remaining samples in the second row are processed in a similar manner, and detailed descriptions are omitted.

In some other embodiments in which the scanning order illustrated in FIG. 13(c) or 13(d) may be used, a flag (e.g., index_pred_flag) may be coded that indicates whether the palette index of the current sample is predicted (copied) from the palette index of the left or upper sample, that is, whether the palette index of the current sample is the same as the palette index of the left or upper sample. index_pred_flag=1 may indicate that the palette index of the current sample is predicted (copied) from the palette index of the left or upper sample, and index_pred_flag=0 may indicate that the prediction (copy) is not performed.

When index_pred_flag=1, a flag (left_or_above_flag) indicating whether the palette index of the current sample is the same as the palette index of the left sample or the palette index of the upper sample may be additionally coded. left_or_above_flag=0 may indicate that it is the same as the palette index of the left sample, and left_or_above_flag=1 may indicate that it is the same as the palette index of the upper sample. In FIGS. 13(c) and 13(d), index_pred_flag=0 can be inferred for the upper-left sample of the current block, left_or_above_flag for samples in the left column excluding the upper-left sample may be inferred as 1, and left_or_above_flag for samples in the top row may be inferred as 0.
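The inference rules for index_pred_flag and left_or_above_flag may be illustrated, without limitation, by the following Python sketch. A plain raster visiting order is assumed here because the sketch only needs the left and upper neighbors to be available; the actual orders of FIGS. 13(c) and 13(d) may differ. The function name and the iterator of already-parsed values are likewise assumptions of this sketch.

```python
from typing import Iterator, List


def decode_with_neighbor_prediction(symbols: Iterator[int], width: int, height: int) -> List[List[int]]:
    """Rebuild a palette index map using index_pred_flag / left_or_above_flag."""
    idx = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Upper-left sample: index_pred_flag is inferred as 0.
            index_pred = 0 if (r == 0 and c == 0) else next(symbols)
            if index_pred == 0:
                idx[r][c] = next(symbols)      # explicitly signaled palette index
                continue
            if r == 0:
                left_or_above = 0              # top row: only a left neighbor exists
            elif c == 0:
                left_or_above = 1              # left column: only an upper neighbor exists
            else:
                left_or_above = next(symbols)  # parsed left_or_above_flag
            idx[r][c] = idx[r - 1][c] if left_or_above == 1 else idx[r][c - 1]
    return idx
```

In this sketch the second flag is read from the stream only for interior samples, which is where both neighbors are available and the choice is ambiguous.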

For samples whose associated palette index is not predicted from the palette index of the left or upper sample (i.e., index_pred_flag = 0), the value of the associated palette index is explicitly signaled in the bitstream by the encoder using, e.g., a truncated binary code.
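For reference, a truncated binary code for an alphabet of n symbols may be sketched in Python as follows. This is the generic textbook construction; the function name and the choice of n (for example, the number of valid palette indices of the block) are assumptions of this sketch and are not taken from the disclosure.

```python
def truncated_binary(x: int, n: int) -> str:
    """Return the truncated binary codeword of x for an alphabet {0, ..., n-1}."""
    if not 0 <= x < n:
        raise ValueError("x must satisfy 0 <= x < n")
    if n == 1:
        return ""                      # a single symbol needs no bits
    k = n.bit_length() - 1             # floor(log2(n))
    u = (1 << (k + 1)) - n             # number of short (k-bit) codewords
    if x < u:
        return format(x, f"0{k}b")     # short codeword
    return format(x + u, f"0{k + 1}b")  # long codeword, offset by u
```

With n = 5, the first three symbols receive 2-bit codewords and the remaining two receive 3-bit codewords, which is shorter on average than a fixed 3-bit code.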

As described above, to indicate that a particular sample of the block is coded as an escape sample (e.g., a sample that does not have a color value represented in the palette for coding the block), the encoder and decoder may code data representing the last index of the palette increased by 1 (i.e., the escape index). For example, if the index for a sample is the same as the escape index (e.g., the last index in the above-mentioned palette increased by 1), the decoder may infer that the sample is to be decoded as an escape sample.

When the index map is determined, the encoder and decoder may restore the current block by determining color components corresponding to the palette index of each sample by referring to the palette table for the current block.
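The table lookup described above may be sketched in Python as follows, by way of illustration only. The function name, the representation of color components as tuples, and the dictionary carrying separately decoded escape sample values are assumptions of this sketch; the escape index is taken to be the palette size, i.e., the last palette index increased by 1.

```python
from typing import Dict, List, Optional, Tuple

Color = Tuple[int, int, int]


def reconstruct_block(
    index_map: List[List[int]],
    palette: List[Color],
    escape_values: Optional[Dict[Tuple[int, int], Color]] = None,
) -> List[List[Color]]:
    """Map each palette index to its color; escape samples use separately decoded values."""
    escape_values = escape_values or {}
    escape_idx = len(palette)  # last palette index increased by 1
    block: List[List[Color]] = []
    for r, row in enumerate(index_map):
        out_row: List[Color] = []
        for c, idx in enumerate(row):
            if idx == escape_idx:
                out_row.append(escape_values[(r, c)])  # value not present in the palette
            else:
                out_row.append(palette[idx])
        block.append(out_row)
    return block
```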

E. Predicting the sample value of an escape sample

For escape samples whose sample value is not included as a palette entry in the palette, typically, the quantized sample value can be explicitly signaled in the bitstream for all color components by the encoder.

According to the present disclosure, for an escape sample whose sample value is not included as a palette entry in the palette, the sample value may be predicted from already decoded neighboring blocks in a non-directional mode (DC, Planar, etc.) or a directional mode, similarly to an intra-predicted sample.

For example, for an escape sample, the encoder may determine whether to explicitly signal the quantized sample value or to predict the sample value from a neighboring block by calculating the RD-cost of each option. In addition, the encoder may signal a 1-bit flag indicating whether the quantized sample value for the escape sample is explicitly signaled in the bitstream. The decoder parses the 1-bit flag and thereby determines whether to decode the quantized sample value for the escape sample from the bitstream or to predict the sample value of the escape sample from the already decoded neighboring blocks (in a non-directional or directional mode).

As another example, an encoder and a decoder may be configured to always predict an escape sample from an already decoded neighboring block. In this case, signaling of the aforementioned 1-bit flag is not required.

The encoder may signal a syntax element indicating the mode number of the intra prediction mode selected for the escape sample. When one preset intra prediction mode is used, signaling of the syntax element indicating the mode number may not be required.
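As a non-limiting illustration, the decoder-side choice between an explicitly signaled value and a predicted value may be sketched in Python as follows, using DC prediction as the single preset mode. The function names, the rounding rule, the reference sample lists, and the scalar dequantization by a step size `qstep` are all assumptions made for this sketch and are not specified by the disclosure.

```python
from typing import Optional, Sequence


def dc_predict(top_refs: Sequence[int], left_refs: Sequence[int]) -> int:
    """Rounded mean of already reconstructed reference samples (DC-style prediction)."""
    refs = list(top_refs) + list(left_refs)
    if not refs:
        raise ValueError("at least one reference sample is required")
    return (sum(refs) + len(refs) // 2) // len(refs)


def escape_sample_value(
    explicit_flag: int,
    quantized_value: Optional[int],
    qstep: int,
    top_refs: Sequence[int],
    left_refs: Sequence[int],
) -> int:
    """Return one color component of an escape sample."""
    if explicit_flag == 1:
        # Explicitly signaled: hypothetical scalar dequantization.
        return quantized_value * qstep
    # Otherwise predict from the neighboring reconstructed samples.
    return dc_predict(top_refs, left_refs)
```

If the encoder and decoder are configured to always predict escape samples, the `explicit_flag` branch would simply never be taken and the flag would not be parsed.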

After coding the current CU, the palette prediction list is updated using the palette table for the current CU. Entries used in the current palette are inserted into the new palette prediction list first, and then entries from the previous palette prediction list that are not used in the current palette are added to the new palette prediction list until the maximum allowed size of the palette prediction list is reached.
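The update rule described above may be sketched in Python as follows. The function name and the list-of-tuples representation of palette entries are assumptions of this sketch.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]


def update_palette_predictor(
    current_palette: List[Color],
    previous_predictor: List[Color],
    max_size: int,
) -> List[Color]:
    """Build the new palette prediction list after a CU has been coded."""
    # Entries of the current palette come first.
    new_predictor = list(current_palette)[:max_size]
    # Then append unused entries of the previous list until the size limit is reached.
    for entry in previous_predictor:
        if len(new_predictor) >= max_size:
            break
        if entry not in current_palette:
            new_predictor.append(entry)
    return new_predictor
```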

FIG. 16 is a flowchart illustrating a method for a decoder to decode video data according to an aspect of the present disclosure.

In step S1610, the decoder decodes a syntax element indicating that the picture can be decoded using wavefront parallel processing (WPP) from the bitstream. The syntax element may be signaled at a sequence parameter set (SPS) level.

In step S1620, the decoder decodes the coded data of the picture. The decoder may use wavefront parallel processing to decode the coded data of the picture. For example, the decoder may decode multiple CTU rows in parallel in a manner that starts decoding of the first CTU of each CTU row after the first CTU of the previous CTU row is decoded. Wavefront parallel processing may be performed in units of slices or in units of tiles. Further, even if a picture is coded so that it can be decoded using wavefront parallel processing, the decoder does not necessarily have to decode a plurality of CTU rows of the picture in parallel. Thus, the decoder may not decode multiple CTU rows in parallel. Even in such a case, the decoding of the first CTU of each CTU row can be started after the first CTU of the previous CTU row is decoded.
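The resulting wavefront may be illustrated by the following Python sketch, which computes the earliest time step at which each CTU could start. The sketch assumes that every CTU takes one time unit, that one thread is available per CTU row, and that the only dependencies are the two stated above (in-row order, and the start of a row waiting for the first CTU of the previous row); a practical decoder has to respect further dependencies. The function name is hypothetical.

```python
from typing import List


def wavefront_schedule(num_rows: int, num_cols: int) -> List[List[int]]:
    """Earliest start step of each CTU under the row-start dependency."""
    start = [[0] * num_cols for _ in range(num_rows)]
    for r in range(num_rows):
        for c in range(num_cols):
            if c > 0:
                # CTUs within a row are decoded one after another.
                start[r][c] = start[r][c - 1] + 1
            elif r > 0:
                # The first CTU of a row waits for the first CTU of the previous row.
                start[r][0] = start[r - 1][0] + 1
    return start
```

Under these assumptions each row starts one step after the row above it, so a picture of R rows and C columns finishes in R + C - 1 steps instead of R × C.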

As part of decoding the coded data of the picture (S1620), the decoder may predict, for the first coding block of the current CTU row to be decoded in the palette mode, a palette table for the first coding block by using the palette data from the first CTU of the previous CTU row (S1621). In addition, the decoder may decode the first coding block in the palette mode by using the predicted palette table for the first coding block (S1622).

As part of predicting the palette table for the first coding block (S1621), the decoder may determine whether to reuse one or more entries of the palette data from the first CTU of the previous CTU row in the palette table for the first coding block. Also, the decoder may determine new entries to be added to the palette table for the first coding block.
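By way of a non-limiting illustration, step S1621 may be sketched in Python as follows. The per-entry reuse flags, the maximum palette size, the dictionary that stores the palette data saved after the first CTU of each row, and all function names are assumptions made for this sketch.

```python
from typing import Dict, List, Sequence, Tuple

Color = Tuple[int, int, int]


def predictor_for_row_start(row: int, saved_after_first_ctu: Dict[int, List[Color]]) -> List[Color]:
    """Palette data inherited by the first coding block of a CTU row."""
    if row == 0:
        return []  # assumption: nothing to inherit in the first CTU row
    return list(saved_after_first_ctu[row - 1])


def predict_palette_table(
    predictor: Sequence[Color],
    reuse_flags: Sequence[int],
    new_entries: Sequence[Color],
    max_palette_size: int,
) -> List[Color]:
    """Reuse flagged predictor entries, then append newly signaled entries."""
    table = [entry for entry, flag in zip(predictor, reuse_flags) if flag == 1]
    table.extend(new_entries)
    return table[:max_palette_size]
```

Because the inherited data depends only on the first CTU of the previous row, the first coding block of a row can be processed as soon as that CTU is finished, which matches the wavefront start condition.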

As part of decoding the first coding block of the current CTU row in the palette mode (S1622), the decoder may decode, from the bitstream, an escape flag indicating whether one or more escape samples exist in the first coding block. When the escape flag indicates that at least one escape sample is present in the first coding block, the decoder may add an additional index to the predicted palette table for the first coding block. The decoder may decode at least one syntax element from the bitstream for each sample of the first coding block in order to reconstruct the palette index map for the first coding block. Furthermore, the decoder may identify one or more escape samples having the additional index based on the reconstructed palette index map, and decode a syntax element representing quantized color component values for the identified escape samples from the bitstream.
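The handling of the additional index may be sketched in Python as follows, by way of illustration only. The function names are hypothetical, and the additional index is assumed to be the palette size, i.e., the last palette index increased by 1.

```python
from typing import List, Tuple


def num_valid_indices(palette_size: int, escape_flag: int) -> int:
    """Number of index values usable in the index map of the block."""
    # One additional index is reserved when escape samples are present.
    return palette_size + 1 if escape_flag == 1 else palette_size


def find_escape_samples(index_map: List[List[int]], palette_size: int) -> List[Tuple[int, int]]:
    """Positions whose index equals the additional (escape) index."""
    escape_idx = palette_size
    return [
        (r, c)
        for r, row in enumerate(index_map)
        for c, idx in enumerate(row)
        if idx == escape_idx
    ]
```

The positions returned by `find_escape_samples` are the samples for which quantized color component values would subsequently be decoded from the bitstream.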

The at least one syntax element decoded to reconstruct the palette index map of the coding block includes a first flag (e.g., run_copy_flag) indicating whether the related sample is in the same index coding mode as the previous sample preceding it in the scanning order. The first flag may be decoded for each sample of the coding block. The at least one syntax element further includes a second flag (e.g., copy_above_palette_indices_flag) indicating whether the palette index of the related sample is copied from the sample at the same position in the previous line in the scanning order. The second flag may be decoded when the first flag indicates that the related sample is not in the same index coding mode as the previous sample and the index coding mode of the previous sample is the INDEX mode. Further, the second flag may be omitted for samples located in the first row of the coding block for the horizontal transverse scanning order and for samples located in the first column of the coding block for the vertical transverse scanning order. The at least one syntax element further includes a syntax element (e.g., palette_idx_idc) explicitly expressing the palette index of the related sample. The syntax element explicitly expressing the palette index may be decoded when the first flag indicates that the related sample is not in the same index coding mode as the previous sample and the index coding mode of the related sample is not the COPY_ABOVE mode.

It should be understood that the encoder may also perform encoding of video data in substantially the same manner as in the order illustrated in the flowchart described above. For example, the encoder may encode a syntax element indicating that a picture of video data can be encoded and decoded using wavefront parallel processing, and may encode data of the picture to enable decoding using wavefront parallel processing. As part of encoding the data of the picture, the encoder may predict, for the first coding block of the current CTU row encoded in the palette mode, a palette table for the first coding block by using the palette data from the first CTU of the previous CTU row, and may encode the first coding block in the palette mode by using the predicted palette table for the first coding block.

In the above description, it should be understood that the exemplary embodiments may be implemented in many different ways. The functions or methods described in one or more examples may be implemented in hardware, software, firmware, or any combination thereof. It should be understood that the functional components described herein are labeled "...unit" to particularly emphasize their implementation independence.

Meanwhile, various functions or methods described in the present disclosure may be implemented as instructions stored in a non-transitory recording medium that can be read and executed by one or more processors. The non-transitory recording medium includes, for example, all kinds of recording devices in which data is stored in a form readable by a computer system. For example, the non-transitory recording medium includes a storage medium such as an erasable programmable read only memory (EPROM), a flash drive, an optical drive, a magnetic hard drive, and a solid state drive (SSD).

The above description is merely illustrative of the technical idea of the present embodiment, and those of ordinary skill in the technical field to which the present embodiment belongs will be able to make various modifications and variations without departing from the essential characteristics of the present embodiment. Accordingly, the present exemplary embodiments are not intended to limit the technical idea of the present exemplary embodiment, but are illustrative, and the scope of the technical idea of the present exemplary embodiment is not limited by these exemplary embodiments. The scope of protection of this embodiment should be interpreted by the following claims, and all technical ideas within the scope equivalent thereto should be construed as being included in the scope of the present embodiment.

CROSS-REFERENCE TO RELATED APPLICATION

This patent application claims priority to Patent Application No. 10-2019-0056975 filed in Korea on May 15, 2019, Patent Application No. 10-2019-0120806 filed in Korea on September 30, 2019, and Patent Application No. 10-2020-0058318 filed in Korea on May 15, 2020, all of which are incorporated by reference in their entirety in this specification.

Claims

A method of decoding video data, the method comprising:

decoding, from a bitstream, a syntax element indicating that a picture can be decoded using wavefront parallel processing; and

decoding coded data of the picture,

wherein the decoding of the coded data of the picture comprises:

predicting, for a first coding block of a current CTU row encoded in a palette mode, a palette table for the first coding block using palette data from a first CTU of a previous CTU row; and

decoding the first coding block in the palette mode using the palette table predicted for the first coding block.
The method of claim 1,

wherein the decoding of the coded data of the picture further comprises

decoding a plurality of CTU rows in parallel in a manner that starts decoding of the first CTU of each CTU row after the first CTU of the previous CTU row is decoded.
The method of claim 1,

wherein the predicting of the palette table comprises:

determining whether to reuse one or more entries of the palette data from the first CTU of the previous CTU row in the palette table for the first coding block; and

determining new entries to be added to the palette table for the first coding block.
The method of claim 1,

wherein the decoding of the first coding block of the current CTU row in the palette mode comprises:

decoding, from the bitstream, an escape flag indicating whether one or more escape samples are present in the first coding block;

adding an additional index to the predicted palette table for the first coding block when the escape flag indicates that at least one escape sample exists in the first coding block;

decoding at least one syntax element for each sample of the first coding block from the bitstream to reconstruct a palette index map for the first coding block; and

identifying one or more escape samples having the additional index based on the reconstructed palette index map, and decoding, from the bitstream, a syntax element representing quantized color component values for the identified escape samples.
The method of claim 4,

wherein the at least one syntax element comprises:

a first flag indicating whether a related sample is in the same index coding mode as a preceding sample in a scanning order, the first flag being decoded for each sample of the first coding block;

a second flag indicating whether the palette index of the related sample is copied from the sample at the same position in the previous line in the scanning order, the second flag being decoded when the first flag indicates that the related sample is not in the same index coding mode as the preceding sample and the index coding mode of the preceding sample is the INDEX mode; and

a syntax element explicitly expressing the palette index of the related sample, the syntax element being decoded when the first flag indicates that the related sample is not in the same index coding mode as the preceding sample and the index coding mode of the related sample is not the COPY_ABOVE mode.
The method of claim 4,

wherein the decoding of the first coding block of the current CTU row in the palette mode further comprises

predicting a palette table for a subsequent coding block of the current CTU row using palette data from the palette table for the first coding block of the current CTU row.
The method of claim 1,

wherein the syntax element indicating that the picture can be decoded using wavefront parallel processing is signaled at a sequence parameter set (SPS) level.
An apparatus for decoding video data, comprising:

a memory; and

one or more processors,

wherein the one or more processors are configured to perform:

decoding, from a bitstream, a syntax element indicating that a picture can be decoded using wavefront parallel processing; and

decoding coded data of the picture,

wherein the decoding of the coded data of the picture comprises:

predicting, for a first coding block of a current CTU row encoded in a palette mode, a palette table for the first coding block using palette data from a first CTU of a previous CTU row; and

decoding the first coding block in the palette mode using the palette table predicted for the first coding block.
The apparatus of claim 8,

wherein the decoding of the coded data of the picture further comprises

decoding a plurality of CTU rows in parallel in a manner that starts decoding of the first CTU of each CTU row after the first CTU of the previous CTU row is decoded.
The apparatus of claim 8,

wherein the predicting of the palette table comprises:

determining whether to reuse one or more entries of the palette data from the first CTU of the previous CTU row in the palette table for the first coding block; and

determining new entries to be added to the palette table for the first coding block.
The apparatus of claim 8,

wherein the decoding of the first coding block of the current CTU row in the palette mode comprises:

decoding, from the bitstream, an escape flag indicating whether one or more escape samples are present in the first coding block;

adding an additional index to the predicted palette table for the first coding block when the escape flag indicates that at least one escape sample exists in the first coding block;

decoding at least one syntax element for each sample of the first coding block from the bitstream to reconstruct a palette index map for the first coding block; and

identifying one or more escape samples having the additional index based on the reconstructed palette index map, and decoding, from the bitstream, a syntax element representing quantized color component values for the identified escape samples.
The apparatus of claim 11,

wherein the at least one syntax element comprises:

a first flag indicating whether a related sample is in the same index coding mode as a preceding sample in a scanning order, the first flag being decoded for each sample of the first coding block;

a second flag indicating whether the palette index of the related sample is copied from the sample at the same position in the previous line in the scanning order, the second flag being decoded when the first flag indicates that the related sample is not in the same index coding mode as the preceding sample and the index coding mode of the preceding sample is the INDEX mode; and

a syntax element explicitly expressing the palette index of the related sample, the syntax element being decoded when the first flag indicates that the related sample is not in the same index coding mode as the preceding sample and the index coding mode of the related sample is not the COPY_ABOVE mode.
The apparatus of claim 11,

wherein the decoding of the first coding block of the current CTU row in the palette mode further comprises

predicting a palette table for a subsequent coding block of the current CTU row using palette data from the palette table for the first coding block of the current CTU row.
The apparatus of claim 8,

wherein the syntax element indicating that the picture can be decoded using wavefront parallel processing is signaled at a sequence parameter set (SPS) level.
A method of encoding video data, the method comprising:

encoding a syntax element indicating that a picture can be encoded and decoded using wavefront parallel processing; and

encoding data of the picture to enable decoding using the wavefront parallel processing,

wherein the encoding of the data of the picture comprises:

predicting, for a first coding block of a current CTU row encoded in a palette mode, a palette table for the first coding block using palette data from a first CTU of a previous CTU row; and

encoding the first coding block in the palette mode using the palette table predicted for the first coding block.