CN117979012A - Image decoding and encoding method and storage medium - Google Patents

Image decoding and encoding method and storage medium

Publication number
CN117979012A
Authority
CN
China
Prior art keywords
flag, transform, transform coefficient, abs, residual
Legal status
Pending
Application number
CN202410124725.5A
Other languages
Chinese (zh)
Inventor
柳先美
崔情娥
金昇焕
崔璋元
许镇
Current Assignee
LG Electronics Inc
Original Assignee
LG Electronics Inc
Application filed by LG Electronics Inc filed Critical LG Electronics Inc
Publication of CN117979012A

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 — using adaptive coding
    • H04N19/102 — adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 — Quantisation
    • H04N19/60 — using transform coding
    • H04N19/134 — adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 — Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/132 — Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/169 — adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 — the unit being an image region, e.g. an object
    • H04N19/176 — the region being a block, e.g. a macroblock
    • H04N19/18 — the unit being a set of transform coefficients
    • H04N19/184 — the unit being bits, e.g. of the compressed video stream
    • H04N19/1887 — the unit being a variable length codeword
    • H04N19/70 — characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/12 — Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/159 — Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction

Abstract

Image decoding and encoding methods and storage media are disclosed. The image decoding method according to the present document includes the steps of: receiving a bitstream including residual information; deriving a quantized transform coefficient of the current block based on residual information included in the bitstream; deriving residual samples of the current block based on the quantized transform coefficients; and generating a reconstructed picture based on residual samples of the current block, wherein residual information may be derived via different syntax elements depending on whether a transform has been applied to the current block.

Description

Image decoding and encoding method and storage medium
This application is a divisional of the patent application with original application number 201980065600.7 (International application number: PCT/KR 2019/01343; filing date: 10/4/2019; title: Method and device for coding transform coefficients).
Technical Field
The present disclosure relates generally to image coding techniques and, more particularly, to methods and apparatus for coding transform coefficients.
Background
Recently, demand for high-resolution, high-quality images/video, such as 4K or 8K Ultra High Definition (UHD) images/video, has been increasing in various fields. As image/video data attains higher resolution and higher quality, the amount of information or the number of bits to be transmitted increases relative to conventional image data. Therefore, when image data is transmitted using a medium such as a conventional wired/wireless broadband line, or image/video data is stored using an existing storage medium, the transmission cost and the storage cost increase.
In addition, interest in and demand for immersive media such as Virtual Reality (VR) and Augmented Reality (AR) content or holograms are increasing, and broadcasting of images/videos having image characteristics different from those of real images, such as game images, is also increasing.
Therefore, highly efficient image/video compression techniques are needed to effectively compress, transmit, store, and reproduce information of high-resolution, high-quality images/videos having the various characteristics described above.
Disclosure of Invention
Technical purpose
The present disclosure provides methods and apparatus for improving image coding efficiency.
The present disclosure also provides methods and apparatus for improving the efficiency of residual coding.
The present disclosure also provides methods and apparatus for improving the efficiency of residual coding depending on whether transform skip is applied.
Technical solution
According to an embodiment of the present disclosure, there is provided an image decoding method performed by a decoding apparatus, the method including: receiving a bitstream including residual information; deriving a quantized transform coefficient of the current block based on residual information included in the bitstream; deriving residual samples of the current block based on the quantized transform coefficients; and generating a reconstructed picture based on residual samples of the current block, wherein residual information may be derived by different syntax elements depending on whether a transform is applied to the current block.
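The decoding steps listed above can be outlined in code. The sketch below is a minimal, hypothetical illustration of steps 2–4 (deriving quantized coefficients, deriving residual samples, and reconstruction); the function names and the scalar dequantizer are assumptions for illustration, not the patent's actual design, and the example assumes transform skip so that dequantized coefficients serve directly as residual samples.

```python
# Hypothetical sketch of the decoding steps; names are illustrative only.

def dequantize(quantized_coeffs, step_size):
    """Scale quantized transform coefficients back to residual values.

    This sketch assumes transform skip, so the dequantized coefficients
    are used directly as residual samples (no inverse transform).
    """
    return [q * step_size for q in quantized_coeffs]

def reconstruct(pred_samples, residual_samples):
    """Reconstructed sample = prediction sample + residual sample."""
    return [p + r for p, r in zip(pred_samples, residual_samples)]

# Toy example: residual information already parsed into quantized coefficients.
quantized = [3, -1, 0, 2]                                     # from bitstream
residual = dequantize(quantized, step_size=2)                 # [6, -2, 0, 4]
reconstructed = reconstruct([128, 130, 127, 125], residual)   # [134, 128, 127, 129]
print(residual)
print(reconstructed)
```

In a full decoder, an inverse transform would sit between dequantization and reconstruction when a transform has been applied to the block.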
The residual information includes: a first transform coefficient level flag regarding whether a transform coefficient level of the quantized transform coefficient is greater than a first threshold; and a second transform coefficient level flag regarding whether a transform coefficient level of the quantized transform coefficient is greater than a second threshold, and wherein the second transform coefficient level flag is decoded differently depending on whether a transform is applied to the current block.
The residual information includes a context syntax element encoded based on context, wherein the context syntax element includes: a significant coefficient flag indicating whether the quantized transform coefficient is a non-zero significant coefficient; a parity level flag regarding the parity of the transform coefficient level of the quantized transform coefficient; a first transform coefficient level flag regarding whether the transform coefficient level of the quantized transform coefficient is greater than a first threshold; and a second transform coefficient level flag regarding whether the transform coefficient level of the quantized transform coefficient is greater than a second threshold.
The step of deriving quantized transform coefficients comprises: decoding the first transform coefficient level flag and decoding the parity level flag; and deriving quantized transform coefficients based on the decoded value of the parity level flag and the decoded value of the first transform coefficient level flag, and wherein decoding of the first transform coefficient level flag is performed prior to decoding of the parity level flag.
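As a rough illustration of how such flags can combine into an absolute coefficient level, the sketch below follows the VVC-style first-pass formula (significance + parity + greater-than flags, plus a remainder). The exact flag semantics and thresholds here are an assumption for illustration, not a quotation of this patent's syntax.

```python
def abs_level_from_flags(sig_flag, gt1_flag, par_flag, gt3_flag, abs_remainder):
    """VVC-style partial level reconstruction (illustrative assumption).

    First pass:  AbsLevelPass1 = sig + par + gt1 + 2 * gt3
    Final level: AbsLevel      = AbsLevelPass1 + 2 * abs_remainder
    where gt1 means |level| > 1 and gt3 means |level| > 3.
    """
    abs_level_pass1 = sig_flag + par_flag + gt1_flag + 2 * gt3_flag
    return abs_level_pass1 + 2 * abs_remainder

# A few worked levels:
print(abs_level_from_flags(1, 0, 0, 0, 0))  # 1
print(abs_level_from_flags(1, 1, 1, 0, 0))  # 3
print(abs_level_from_flags(1, 1, 0, 1, 1))  # 6
```

Note how the parity flag encodes the low bit of the level, so the remainder only needs to carry the level in steps of two, which is what makes the flag ordering discussed above matter for parsing.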
According to another embodiment of the present disclosure, there is provided an image encoding method performed by an encoding apparatus, the method including: deriving residual samples of the current block; deriving quantized transform coefficients based on the residual samples of the current block; and encoding residual information including information on the quantized transform coefficients, wherein the residual information may be derived by different syntax elements depending on whether a transform is applied to the current block.
According to still another embodiment of the present disclosure, an image decoding apparatus for performing an image decoding method includes: an entropy decoder which receives a bitstream including residual information and derives quantized transform coefficients of a current block based on the residual information included in the bitstream; an inverse transformer deriving residual samples of the current block based on the quantized transform coefficients; and an adder generating a reconstructed picture based on residual samples of the current block, wherein residual information may be derived by different syntax elements depending on whether a transform is applied to the current block.
According to still another embodiment of the present disclosure, there is provided an encoding apparatus for performing image encoding. The encoding apparatus includes: a subtractor which derives residual samples of the current block; a quantizer that derives quantized transform coefficients based on residual samples of the current block; and an entropy encoder encoding residual information including information on the quantized transform coefficients, wherein the residual information may be derived by different syntax elements according to whether a transform is applied to the current block.
According to still another embodiment of the present disclosure, a digital storage medium may be provided in which image data including encoded image information generated according to an image encoding method performed by an encoding apparatus is stored.
According to still another embodiment of the present disclosure, a digital storage medium may be provided in which image data including encoded image information that causes a decoding apparatus to perform an image decoding method is stored.
Technical effects
According to the embodiments of the present disclosure, the overall image/video compression efficiency may be improved.
According to the embodiment of the disclosure, the efficiency of residual coding can be improved.
According to the present disclosure, the efficiency of transform coefficient coding can be improved.
According to the present disclosure, the efficiency of residual coding can be improved depending on whether transform skip is applied.
Drawings
Fig. 1 schematically shows an example of a video/image encoding system to which the present disclosure may be applied.
Fig. 2 is a diagram schematically illustrating a configuration of a video/image encoding apparatus to which the present disclosure can be applied.
Fig. 3 is a diagram schematically illustrating a configuration of a video/image decoding apparatus to which the present disclosure can be applied.
Fig. 4 is a control flow diagram illustrating a video/image encoding method to which the present disclosure may be applied.
Fig. 5 is a control flow diagram illustrating a video/image decoding method to which the present disclosure may be applied.
Fig. 6 is a block diagram illustrating a CABAC encoding system according to an embodiment.
Fig. 7 is a diagram illustrating an example of transform coefficients in a 4×4 block.
Fig. 8 is a diagram illustrating a residual signal decoder according to an example of the present disclosure.
Fig. 9 is a control flow diagram illustrating a method of decoding a residual signal according to an exemplary embodiment of the present disclosure.
Fig. 10 is a control flow diagram illustrating a method of parsing a context syntax element according to an embodiment of the present disclosure.
Fig. 11 is a control flow diagram illustrating a method of parsing a context syntax element according to another embodiment of the present disclosure.
Fig. 12 is a control flow diagram illustrating a method of parsing a context syntax element according to yet another embodiment of the present disclosure.
Fig. 13 is a control flow diagram illustrating a method of parsing a context syntax element according to yet another embodiment of the present disclosure.
Fig. 14 is a control flow diagram illustrating a method of parsing a context syntax element according to yet another embodiment of the present disclosure.
Fig. 15 is a control flow diagram illustrating a method of parsing a context syntax element according to yet another embodiment of the present disclosure.
Fig. 16 is a diagram exemplarily showing a structure of a content streaming system to which the present disclosure can be applied.
Detailed Description
While the present disclosure may be susceptible to various modifications and alternative embodiments, specific embodiments thereof have been shown by way of example in the drawings and will herein be described in detail. However, this is not intended to limit the disclosure to the specific embodiments disclosed herein. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the technical ideas of the present disclosure. Singular forms may include plural forms unless the context clearly indicates otherwise. Terms such as "comprises" and "comprising" are intended to indicate the presence of features, numbers, steps, operations, elements, components, or combinations thereof used in the following description, and thus should not be understood as precluding the possibility of the existence or addition of one or more different features, numbers, steps, operations, elements, components, or combinations thereof.
Furthermore, the components on the drawings described herein are illustrated separately for convenience in describing characteristic functions that are different from each other, however, it is not intended that the components be implemented by separate hardware or software. For example, any two or more of these components may be combined to form a single component, and any single component may be divided into multiple components. Embodiments in which the components are combined and/or divided will fall within the scope of the patent claims of the present disclosure, so long as they do not depart from the spirit of the present disclosure.
Hereinafter, preferred embodiments of the present disclosure will be described in more detail while referring to the accompanying drawings. In addition, on the drawings, the same reference numerals are used for the same components, and repeated description of the same components will be omitted.
Fig. 1 schematically presents an example of a video/image encoding system to which the present disclosure may be applied.
Referring to fig. 1, a video/image encoding system may include a first device (source device) and a second device (sink device). The source device may transfer the encoded video/image information or data to the sink device in the form of a file or stream transmission via a digital storage medium or network.
The source device may include a video source, an encoding apparatus, and a transmitter. The receiving apparatus may include a receiver, a decoding device, and a renderer. The encoding device may be referred to as a video/image encoding device, and the decoding device may be referred to as a video/image decoding device. The transmitter may be included in the encoding device. The receiver may be included in a decoding device. The renderer may include a display, and the display may be configured as a separate device or external component.
The video source may obtain the video/image through a process of capturing, synthesizing, or generating the video/image. The video source may comprise video/image capturing means and/or video/image generating means. The video/image capturing means may comprise, for example, one or more cameras, video/image files comprising previously captured video/images, etc. The video/image generating means may comprise, for example, computers, tablet computers and smart phones, and may (electronically) generate video/images. For example, virtual video/images may be generated by a computer or the like. In this case, the video/image capturing process may be replaced by a process of generating related data.
The encoding device may encode the input video/image. The encoding device may perform a series of processes such as prediction, transformation, and quantization for compression and encoding efficiency. The encoded data (encoded video/image information) may be output in the form of a bitstream.
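The prediction/quantization part of that chain can be sketched as follows. This is a deliberately simplified, hypothetical illustration: the residual is the original minus the prediction, and a uniform scalar quantizer stands in for the actual quantization design (the transform stage is omitted).

```python
# Illustrative sketch of the encoder-side prediction and quantization steps;
# function names and the scalar quantizer are assumptions, not the patent's design.

def forward_residual(original, prediction):
    """Prediction step: the residual is original minus predicted samples."""
    return [o - p for o, p in zip(original, prediction)]

def quantize(coeffs, step_size):
    """Quantization step: round each coefficient to the nearest multiple of
    the step size (Python's round uses round-half-to-even)."""
    return [round(c / step_size) for c in coeffs]

residual = forward_residual([130, 128, 131, 127], [128, 128, 128, 128])
print(residual)               # [2, 0, 3, -1]
print(quantize(residual, 2))  # [1, 0, 2, 0]
```

Quantization is the lossy step: the decoder can only recover the residual to within the chosen step size, which is the compression/quality trade-off the encoder controls.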
The transmitter may transmit the encoded video/image information or data output in the form of a bitstream to a receiver of the receiving apparatus in the form of file or stream transmission through a digital storage medium or network. The digital storage medium may include various storage media such as USB, SD, CD, DVD, blu-ray, HDD, SSD, etc. The transmitter may include an element for generating a media file through a predetermined file format, and may include an element for transmitting through a broadcast/communication network. The receiver may receive/extract the bit stream and transmit the received/extracted bit stream to the decoding apparatus.
The decoding apparatus may decode the video/image by performing a series of processes such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding apparatus.
The renderer may render the decoded video/images. The rendered video/image may be displayed by a display.
This document relates to video/image coding. For example, the methods/embodiments disclosed in this document may be applied to methods disclosed in the versatile video coding (VVC) standard, the essential video coding (EVC) standard, the AOMedia Video 1 (AV1) standard, the 2nd generation of audio video coding standard (AVS2), or a next-generation video/image coding standard (e.g., H.267, H.268, etc.).
The present document sets forth various embodiments of video/image encoding, and these embodiments may be performed in combination with one another unless otherwise noted.
In this document, video may refer to a series of images over time. A picture generally refers to a unit representing one image in a specific time period, and a slice/tile is a unit constituting a part of a picture in coding. A slice/tile may include one or more Coding Tree Units (CTUs). A picture may consist of one or more slices/tiles. A picture may consist of one or more tile groups. A tile group may include one or more tiles. A brick may represent a rectangular region of CTU rows within a tile in a picture. A tile may be partitioned into multiple bricks, each of which consists of one or more CTU rows within the tile. A tile that is not partitioned into multiple bricks may also be referred to as a brick. A brick scan is a specific sequential ordering of the CTUs partitioning a picture, in which the CTUs are ordered consecutively by CTU raster scan within a brick, bricks within a tile are ordered consecutively by raster scan of the bricks of the tile, and tiles in a picture are ordered consecutively by raster scan of the tiles of the picture. A tile is a rectangular region of CTUs within a particular tile column and a particular tile row in a picture. A tile column is a rectangular region of CTUs having a height equal to the height of the picture and a width specified by syntax elements in the picture parameter set. A tile row is a rectangular region of CTUs having a height specified by syntax elements in the picture parameter set and a width equal to the width of the picture. A tile scan is a specific sequential ordering of the CTUs partitioning a picture, in which the CTUs are ordered consecutively by CTU raster scan within a tile, and tiles in a picture are ordered consecutively by raster scan of the tiles of the picture. A slice includes an integer number of bricks of a picture that may be exclusively contained in a single NAL unit.
A slice may consist either of a number of complete tiles or of only a consecutive sequence of complete bricks of one tile. In this document, tile group may be used interchangeably with slice. For example, in this document, a tile group/tile group header may be referred to as a slice/slice header.
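The raster-scan orderings used in these definitions can be made concrete with a small sketch. The function below is an illustrative assumption (not from the patent): it lists CTU positions of one tile in raster-scan order, i.e. left to right within a CTU row, rows from top to bottom.

```python
# Minimal sketch of CTU raster-scan ordering inside one tile; tile geometry
# (width/height in CTUs) is an assumed input for illustration.

def ctu_raster_scan(tile_width_ctus, tile_height_ctus):
    """Return (x, y) CTU positions of a tile in raster-scan order."""
    return [(x, y)
            for y in range(tile_height_ctus)
            for x in range(tile_width_ctus)]

print(ctu_raster_scan(3, 2))
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

A brick scan or tile scan then concatenates such per-brick/per-tile orderings, themselves visited in raster order across the picture.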
A pixel or pel may mean the smallest unit that constitutes a picture (or image). In addition, "sample" may be used as a term corresponding to a pixel. The samples may generally represent pixels or values of pixels and may represent only pixels/pixel values of a luminance component or only pixels/pixel values of a chrominance component.
The unit may represent a basic unit of image processing. The unit may include at least one of a specific region of the picture and information related to the region. One unit may include one luma block and two chroma (e.g., Cb, Cr) blocks. In some cases, units may be used interchangeably with terms such as blocks or regions. In general, an M×N block may include a set (or array) of samples (or a sample array) or transform coefficients consisting of M columns and N rows.
In this document, the terms "/" and "," should be interpreted as indicating "and/or". For example, the expression "A/B" may mean "A and/or B". In addition, "A, B" may mean "A and/or B". In addition, "A/B/C" may mean "at least one of A, B and/or C". Likewise, "A, B, C" may mean "at least one of A, B and/or C".
In addition, in this document, the term "or" should be interpreted as indicating "and/or". For example, the expression "A or B" may include 1) only A, 2) only B, and/or 3) both A and B. In other words, the term "or" in this document should be interpreted as indicating "additionally or alternatively".
Fig. 2 is a diagram schematically describing a configuration of a video/image encoding apparatus to which the present disclosure can be applied. Hereinafter, a so-called video encoding apparatus may include an image encoding apparatus.
Referring to fig. 2, the encoding apparatus 200 includes an image divider 210, a predictor 220, a residual processor 230, an entropy encoder 240, an adder 250, a filter 260, and a memory 270. The predictor 220 may include an inter predictor 221 and an intra predictor 222. Residual processor 230 may include a transformer 232, a quantizer 233, an inverse quantizer 234, and an inverse transformer 235. The residual processor 230 may also include a subtractor 231. Adder 250 may be referred to as a reconstructor or a reconstructed block generator. According to an embodiment, the image divider 210, the predictor 220, the residual processor 230, the entropy encoder 240, the adder 250, and the filter 260 may be constructed by at least one hardware component (e.g., an encoder chipset or a processor). In addition, the memory 270 may include a Decoded Picture Buffer (DPB), and may be composed of a digital storage medium. The hardware components may also include memory 270 as an internal/external component.
The image divider 210 divides an input image (or picture or frame) input to the encoding apparatus 200 into one or more processing units. For example, the processing unit may be referred to as a Coding Unit (CU). In this case, the coding unit may be recursively partitioned according to a quad-tree binary-tree ternary-tree (QTBTTT) structure starting from a Coding Tree Unit (CTU) or a Largest Coding Unit (LCU). For example, one coding unit may be partitioned into a plurality of coding units of deeper depth based on a quad-tree structure, a binary-tree structure, and/or a ternary-tree structure. In this case, for example, the quad-tree structure may be applied first, and the binary-tree structure and/or the ternary-tree structure may be applied later. Alternatively, the binary-tree structure may be applied first. The coding procedure according to the present document may be performed based on the final coding unit that is no longer partitioned. In this case, the largest coding unit may be used directly as the final coding unit based on coding efficiency according to image characteristics, or, if necessary, the coding unit may be recursively partitioned into coding units of deeper depth and a coding unit having an optimal size may be used as the final coding unit. Here, the coding procedure may include procedures of prediction, transform, and reconstruction, which will be described later. As another example, the processing unit may further include a Prediction Unit (PU) or a Transform Unit (TU). In this case, the prediction unit and the transform unit may be split or partitioned from the aforementioned final coding unit. The prediction unit may be a unit of sample prediction, and the transform unit may be a unit for deriving transform coefficients and/or a unit for deriving a residual signal from transform coefficients.
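The quad-tree part of the recursive partitioning above can be sketched as follows. This is a simplification under stated assumptions: the stop criterion is a fixed minimum size, whereas a real encoder chooses splits by rate-distortion cost, and QTBTTT additionally allows binary and ternary splits.

```python
# Sketch of recursive quad-tree partitioning of a square block into leaf CUs;
# the fixed min_size stop rule is an assumption for illustration only.

def quad_split(x, y, size, min_size):
    """Return leaf CU rectangles as (x, y, size) tuples, obtained by
    recursively quartering a square block until min_size is reached."""
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):           # top row of sub-blocks, then bottom row
        for dx in (0, half):
            leaves.extend(quad_split(x + dx, y + dy, half, min_size))
    return leaves

print(quad_split(0, 0, 16, 8))
# [(0, 0, 8), (8, 0, 8), (0, 8, 8), (8, 8, 8)]
```

Each leaf returned here plays the role of a "final coding unit" on which prediction, transform, and reconstruction would then operate.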
In some cases, the unit may be used interchangeably with terms such as block or area. In a general case, an MxN block may represent a set of samples or transform coefficients consisting of M columns and N rows. A sample may generally represent a pixel or a value of a pixel, may represent only a pixel/pixel value of the luminance component, or may represent only a pixel/pixel value of the chrominance component. The sample may be used as a term corresponding to a pixel or a pel of one picture (or image).
In the encoding apparatus 200, a prediction signal (prediction block, prediction sample array) output from the inter predictor 221 or the intra predictor 222 is subtracted from an input image signal (original block, original sample array) to generate a residual signal (residual block, residual sample array), and the generated residual signal is transmitted to the transformer 232. In this case, as shown, a unit of subtracting a prediction signal (prediction block, prediction sample array) from an input image signal (original block, original sample array) in the encoding apparatus 200 may be referred to as a subtractor 231. The predictor may perform prediction on a block to be processed (hereinafter, referred to as a "current block"), and may generate a prediction block including prediction samples for the current block. The predictor may determine whether to apply intra prediction or inter prediction based on the current block or CU. As described later in the description of each prediction mode, the predictor may generate various information related to prediction, such as prediction mode information, and transmit the generated information to the entropy encoder 240. Information about the prediction may be encoded in the entropy encoder 240 and output in the form of a bitstream.
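The subtraction performed by the subtractor 231 can be sketched in a few lines; the nested-list block representation and the sample values below are illustrative only:

```python
def derive_residual(original, prediction):
    # Residual sample array: original minus prediction, element-wise.
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, prediction)]

# Illustrative 2x2 blocks of luma sample values (hypothetical numbers).
original   = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
residual   = derive_residual(original, prediction)  # [[2, 5], [1, -1]]
```

The residual block, not the original samples, is what proceeds to the transformer 232.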
The intra predictor 222 may predict the current block by referring to samples in the current picture. The reference samples may be located near the current block or may be separated from the current block according to the prediction mode. In intra prediction, the prediction modes may include a plurality of non-directional modes and a plurality of directional modes. The non-directional modes may include, for example, a DC mode and a planar mode. Depending on the degree of detail of the prediction direction, the directional modes may include, for example, 33 directional prediction modes or 65 directional prediction modes. However, this is merely an example, and more or fewer directional prediction modes may be used depending on the setting. The intra predictor 222 may determine a prediction mode applied to the current block by using a prediction mode applied to a neighboring block.
The inter predictor 221 may derive a prediction block for the current block based on a reference block (reference sample array) specified by a motion vector on a reference picture. Here, in order to reduce the amount of motion information transmitted in the inter prediction mode, the motion information may be predicted in units of blocks, sub-blocks, or samples based on the correlation of the motion information between neighboring blocks and the current block. The motion information may include a motion vector and a reference picture index. The motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information. In the case of inter prediction, the neighboring blocks may include a spatial neighboring block present in the current picture and a temporal neighboring block present in the reference picture. The reference picture including the reference block and the reference picture including the temporal neighboring block may be the same or different. The temporal neighboring block may be referred to as a collocated reference block, a collocated CU (colCU), or the like, and the reference picture including the temporal neighboring block may be referred to as a collocated picture (colPic). For example, the inter predictor 221 may construct a motion information candidate list based on the neighboring blocks and generate information indicating which candidate is used to derive the motion vector and/or the reference picture index of the current block. Inter prediction may be performed based on various prediction modes. For example, in the case of the skip mode and the merge mode, the inter predictor 221 may use the motion information of a neighboring block as the motion information of the current block. In the skip mode, unlike the merge mode, the residual signal may not be transmitted.
In the case of a Motion Vector Prediction (MVP) mode, a motion vector of a neighboring block may be used as a motion vector predictor, and the motion vector of the current block may be indicated by signaling a motion vector difference.
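The MVP-mode relationship can be sketched as follows; the vectors and the neighboring-block predictor below are illustrative values, not taken from any actual codec:

```python
def reconstruct_mv(mvp, mvd):
    # Decoder side: motion vector = predictor + signaled difference.
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# Encoder side: only the difference between the true MV and the
# predictor taken from a neighboring block is signaled.
mv_neighbor = (12, -3)   # hypothetical MV of a neighboring block (the predictor)
mv_current  = (14, -1)   # hypothetical true motion of the current block
mvd = (mv_current[0] - mv_neighbor[0],
       mv_current[1] - mv_neighbor[1])  # signaled in the bitstream
```

Signaling a small difference instead of the full vector is what reduces the amount of motion information transmitted.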
The predictor 220 may generate a prediction signal based on various prediction methods described below. For example, the predictor may not only apply intra prediction or inter prediction for prediction of one block, but also apply both intra prediction and inter prediction at the same time. This may be referred to as Combined Inter and Intra Prediction (CIIP). In addition, the predictor may perform prediction of a block based on an Intra Block Copy (IBC) prediction mode or a palette mode. The IBC prediction mode or the palette mode may be used for content image/video coding of a game or the like, for example, Screen Content Coding (SCC). Although IBC basically performs prediction within the current picture, it may be performed in a manner similar to inter prediction in that a reference block is derived within the current picture. That is, IBC may use at least one of the inter prediction techniques described in this document. The palette mode may be regarded as an example of intra coding or intra prediction. When the palette mode is applied, sample values within a picture may be signaled based on information about a palette table and a palette index.
The prediction signal generated by the predictor (including the inter predictor 221 and/or the intra predictor 222) may be used to generate a reconstructed signal or to generate a residual signal. The transformer 232 may generate transform coefficients by applying a transform technique to the residual signal. For example, the transform technique may include at least one of a Discrete Cosine Transform (DCT), a Discrete Sine Transform (DST), a Karhunen-Loeve Transform (KLT), a Graph-Based Transform (GBT), or a Conditionally Non-linear Transform (CNT). Here, GBT means a transform obtained from a graph when relationship information between pixels is represented by the graph. CNT refers to a transform generated based on a prediction signal generated using all previously reconstructed pixels. In addition, the transform process may be applied to square pixel blocks of the same size, or may be applied to blocks of variable size rather than square blocks.
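As an illustration of the first transform named above, the following is a plain-Python orthonormal DCT-II applied separably to a small residual block. This is a textbook sketch, not the scaled integer transform a real codec uses; it shows how a flat residual block compacts all of its energy into the single DC coefficient:

```python
import math

def dct2_1d(v):
    # Orthonormal 1-D DCT-II (textbook form).
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def dct2_2d(block):
    # Separable 2-D transform: transform rows first, then columns.
    rows = [dct2_1d(r) for r in block]
    cols = [dct2_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# A constant (flat) 2x2 residual block compacts into one DC coefficient.
coeffs = dct2_2d([[4, 4], [4, 4]])  # -> [[8.0, 0.0], [0.0, 0.0]]
```

Energy compaction of this kind is why subsequent quantization and entropy coding are efficient: most coefficients are zero or near zero.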
The quantizer 233 may quantize the transform coefficients and transmit them to the entropy encoder 240, and the entropy encoder 240 may encode the quantized signal (information about the quantized transform coefficients) and output a bitstream. The information about the quantized transform coefficients may be referred to as residual information. The quantizer 233 may rearrange the block-type quantized transform coefficients into a one-dimensional vector form based on a coefficient scan order, and generate information about the quantized transform coefficients based on the quantized transform coefficients in the one-dimensional vector form. The entropy encoder 240 may perform various encoding methods such as, for example, exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC). The entropy encoder 240 may also encode, together or separately, information required for video/image reconstruction other than the quantized transform coefficients (e.g., values of syntax elements). The encoded information (e.g., encoded video/image information) may be transmitted or stored in the form of a bitstream in units of Network Abstraction Layer (NAL) units. The video/image information may further include information about various parameter sets, such as an Adaptation Parameter Set (APS), a Picture Parameter Set (PPS), a Sequence Parameter Set (SPS), and a Video Parameter Set (VPS). In addition, the video/image information may further include general constraint information. In this document, information and/or syntax elements transmitted/signaled from the encoding apparatus to the decoding apparatus may be included in the video/image information. The video/image information may be encoded through the above-described encoding process and included in the bitstream.
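The rearrangement of a two-dimensional coefficient block into a one-dimensional vector can be sketched with an up-right diagonal scan as follows. This is a simplified illustration; real codecs typically scan in 4x4 sub-blocks and often in reverse order:

```python
def diagonal_scan_order(w, h):
    # Up-right diagonal scan: visit anti-diagonals d = x + y in order,
    # from bottom-left to top-right within each diagonal.
    order = []
    for d in range(w + h - 1):
        for y in range(h - 1, -1, -1):
            x = d - y
            if 0 <= x < w:
                order.append((x, y))
    return order

def rearrange(block):
    # Flatten a 2-D coefficient block (block[y][x] indexing assumed)
    # into the 1-D vector the entropy coder consumes.
    return [block[y][x]
            for x, y in diagonal_scan_order(len(block[0]), len(block))]
```

For a 2x2 block `[[1, 2], [3, 4]]`, the scan visits (0,0), (0,1), (1,0), (1,1), producing the vector `[1, 3, 2, 4]`.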
The bitstream may be transmitted over a network or may be stored in a digital storage medium. The network may include a broadcast network and/or a communication network, and the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, and SSD. A transmitter (not shown) transmitting the signal output from the entropy encoder 240 and/or a storage unit (not shown) storing the signal may be included as an internal/external element of the encoding apparatus 200. Alternatively, the transmitter may be included in the entropy encoder 240.
The quantized transform coefficients output from the quantizer 233 may be used to generate a prediction signal. For example, the residual signal (residual block or residual samples) may be reconstructed by applying inverse quantization and inverse transform to the quantized transform coefficients through the inverse quantizer 234 and the inverse transformer 235. The adder 250 adds the reconstructed residual signal to the prediction signal output from the inter predictor 221 or the intra predictor 222 to generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array). If there is no residual for the block to be processed, such as when the skip mode is applied, the prediction block may be used as the reconstructed block. The adder 250 may be referred to as a reconstructor or a reconstructed block generator. The generated reconstructed signal may be used for intra prediction of the next block to be processed in the current picture, and may be used for inter prediction of the next picture through filtering as described below.
Furthermore, luma mapping with chroma scaling (LMCS) may be applied during picture encoding and/or reconstruction.
The filter 260 may improve subjective/objective image quality by applying filtering to the reconstructed signal. For example, the filter 260 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and store the modified reconstructed picture in the memory 270, particularly, in the DPB of the memory 270. The various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, and bilateral filter. As discussed later in the description of each filtering method, the filter 260 may generate various information related to filtering and transmit the generated information to the entropy encoder 240. The information about the filtering may be encoded in the entropy encoder 240 and output in the form of a bitstream.
The modified reconstructed picture sent to the memory 270 may be used as a reference picture in the inter predictor 221. When inter prediction is applied by the encoding apparatus, prediction mismatch between the encoding apparatus 200 and the decoding apparatus can be avoided, and encoding efficiency can be improved.
The DPB of the memory 270 may store the modified reconstructed picture for use as a reference picture in the inter predictor 221. The memory 270 may store motion information of blocks from which motion information in the current picture is derived (or encoded) and/or motion information of blocks in a picture that has been reconstructed. The stored motion information may be transmitted to the inter predictor 221 and used as motion information of a spatially neighboring block or motion information of a temporally neighboring block. The memory 270 may store reconstructed samples of the reconstructed block in the current picture and may transmit the reconstructed samples to the intra predictor 222.
Fig. 3 is a schematic diagram illustrating a configuration of a video/image decoding apparatus to which the embodiment of this document can be applied.
Referring to fig. 3, the decoding apparatus 300 may include an entropy decoder 310, a residual processor 320, a predictor 330, an adder 340, a filter 350, and a memory 360. The predictor 330 may include an intra predictor 331 and an inter predictor 332. The residual processor 320 may include an inverse quantizer 321 and an inverse transformer 322. According to an embodiment, the entropy decoder 310, residual processor 320, predictor 330, adder 340, and filter 350 may be constructed by hardware components (e.g., a decoder chipset or processor). In addition, the memory 360 may include a Decoded Picture Buffer (DPB), or may be constructed of a digital storage medium. The hardware components may also include memory 360 as an internal/external component.
When a bitstream including video/image information is input, the decoding apparatus 300 may reconstruct an image corresponding to the process by which the video/image information was processed in the encoding apparatus of fig. 2. For example, the decoding apparatus 300 may derive units/blocks based on information related to block partitioning obtained from the bitstream. The decoding apparatus 300 may perform decoding using a processing unit applied in the encoding apparatus. Thus, the processing unit of decoding may be, for example, a coding unit, and the coding unit may be partitioned from the coding tree unit or the largest coding unit according to a quadtree structure, a binary tree structure, and/or a ternary tree structure. One or more transform units may be derived from the coding unit. The reconstructed image signal decoded and output by the decoding apparatus 300 may be reproduced by a reproducing apparatus.
The decoding apparatus 300 may receive the signal output from the encoding apparatus of fig. 2 in the form of a bitstream and may decode the received signal through the entropy decoder 310. For example, the entropy decoder 310 may parse the bitstream to derive information (e.g., video/image information) required for image reconstruction (or picture reconstruction). The video/image information may further include information on various parameter sets, such as an Adaptation Parameter Set (APS), a Picture Parameter Set (PPS), a Sequence Parameter Set (SPS), and a Video Parameter Set (VPS). In addition, the video/image information may further include general constraint information. The decoding apparatus may further decode the picture based on the information about the parameter sets and/or the general constraint information. The information and/or syntax elements signaled/received, which are described later in this document, may be decoded through the decoding process and obtained from the bitstream. For example, the entropy decoder 310 decodes information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and outputs values of syntax elements required for image reconstruction and quantized values of transform coefficients for the residual. More specifically, the CABAC entropy decoding method may receive a bin corresponding to each syntax element in the bitstream, determine a context model using decoding target syntax element information, decoding information of a decoding target block, or information of a symbol/bin decoded in a previous stage, perform arithmetic decoding on the bin by predicting an occurrence probability of the bin according to the determined context model, and generate a symbol corresponding to the value of each syntax element. In this case, after determining the context model, the CABAC entropy decoding method may update the context model using the information of the decoded symbol/bin for the context model of the next symbol/bin.
The prediction-related information among the information decoded in the entropy decoder 310 may be provided to the predictors (the inter predictor 332 and the intra predictor 331), and the residual values on which entropy decoding has been performed in the entropy decoder 310 (i.e., the quantized transform coefficients) and associated parameter information may be input to the residual processor 320. The residual processor 320 may derive a residual signal (residual block, residual samples, residual sample array). In addition, information on filtering among the information decoded in the entropy decoder 310 may be provided to the filter 350. Further, a receiver (not shown) receiving the signal output from the encoding apparatus may be further configured as an internal/external element of the decoding apparatus 300, or the receiver may be a component of the entropy decoder 310. Further, the decoding apparatus according to the present document may be referred to as a video/image/picture decoding apparatus, and the decoding apparatus may be divided into an information decoder (video/image/picture information decoder) and a sample decoder (video/image/picture sample decoder). The information decoder may include the entropy decoder 310, and the sample decoder may include at least one of the inverse quantizer 321, the inverse transformer 322, the adder 340, the filter 350, the memory 360, the inter predictor 332, and the intra predictor 331.
The inverse quantizer 321 may inverse-quantize the quantized transform coefficients and output transform coefficients. The inverse quantizer 321 may rearrange the quantized transform coefficients in the form of a two-dimensional block. In this case, the rearrangement may be performed based on the coefficient scan order performed in the encoding apparatus. The inverse quantizer 321 may perform inverse quantization on the quantized transform coefficients using a quantization parameter (e.g., quantization step size information), and obtain the transform coefficients.
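The step-size-based scaling performed by the inverse quantizer can be sketched as below. The `qstep` formula (the step size doubling every 6 QP) follows the well-known HEVC/VVC-style design, but as a floating-point simplification, not the exact integer specification:

```python
def qstep(qp):
    # Simplified quantization step size: doubles every 6 QP.
    # (HEVC/VVC-style behavior; the real spec uses integer level-scale
    # tables and bit shifts instead of this float formula.)
    return 2 ** ((qp - 4) / 6)

def dequantize(levels, qp):
    # Scale each quantized level back toward the transform-coefficient
    # magnitude; the rounding loss of forward quantization remains.
    s = qstep(qp)
    return [round(lv * s) for lv in levels]
```

For example, at QP 4 the step size is 1 (levels pass through unchanged), while at QP 10 the step size is 2 and each level is doubled.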
The inverse transformer 322 inversely transforms the transform coefficients to obtain residual signals (residual blocks, residual sample arrays).
The predictor may perform prediction on the current block and generate a prediction block including prediction samples for the current block. The predictor may determine whether to apply intra prediction or inter prediction to the current block based on information about prediction output from the entropy decoder 310, and may determine a specific intra/inter prediction mode.
The predictor may generate a prediction signal based on various prediction methods described below. For example, the predictor may not only apply intra prediction or inter prediction for prediction of one block, but also apply both intra prediction and inter prediction at the same time. This may be referred to as Combined Inter and Intra Prediction (CIIP). In addition, the predictor may predict the block based on an Intra Block Copy (IBC) prediction mode or a palette mode. The IBC prediction mode or the palette mode may be used for content image/video coding of a game or the like, for example, Screen Content Coding (SCC). Although IBC basically performs prediction within the current picture, it may be performed in a manner similar to inter prediction in that a reference block is derived within the current picture. That is, IBC may use at least one of the inter prediction techniques described in this document. The palette mode may be regarded as an example of intra coding or intra prediction. When the palette mode is applied, sample values within a picture may be signaled based on information about a palette table and a palette index.
The intra predictor 331 may predict the current block by referring to samples in the current picture. The reference samples may be located near the current block or may be separated from the current block according to the prediction mode. In intra prediction, the prediction modes may include a plurality of non-directional modes and a plurality of directional modes. The intra predictor 331 may determine a prediction mode applied to the current block by using a prediction mode applied to a neighboring block.
The inter predictor 332 may derive a prediction block for the current block based on a reference block (reference sample array) specified by a motion vector on a reference picture. At this time, in order to reduce the amount of motion information transmitted in the inter prediction mode, the motion information may be predicted in units of blocks, sub-blocks, or samples based on the correlation of the motion information between the neighboring blocks and the current block. The motion information may include a motion vector and a reference picture index. The motion information may also include inter prediction direction (L0 prediction, L1 prediction, bi prediction, etc.) information. In the case of inter prediction, the neighboring blocks may include a spatial neighboring block present in the current picture and a temporal neighboring block present in the reference picture. For example, the inter predictor 332 may configure a motion information candidate list based on neighboring blocks and derive a motion vector and/or a reference picture index of the current block based on the received candidate selection information. Inter prediction may be performed based on various prediction modes, and the information on prediction may include information indicating a mode of inter prediction for the current block.
The adder 340 may generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) by adding the obtained residual signal to a prediction signal (prediction block, prediction sample array) output from a predictor (inter predictor 332 or intra predictor 331). The prediction block may be used as a reconstructed block if there is no residual for the block to be processed, such as in the case of applying a skip mode.
Adder 340 may be referred to as a reconstructor or a reconstructed block generator. The generated reconstructed signal may be used for intra prediction of the next block to be processed in the current picture, may be output by filtering as described below, or may be used for inter prediction of the next picture.
Further, luma mapping with chroma scaling (LMCS) may be applied in the picture decoding process.
The filter 350 may improve subjective/objective image quality by applying filtering to the reconstructed signal. For example, the filter 350 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and may store the modified reconstructed picture in the memory 360, particularly, in the DPB of the memory 360. The various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, and bilateral filter.
The (modified) reconstructed picture stored in the DPB of the memory 360 may be used as a reference picture in the inter predictor 332. The memory 360 may store motion information of a block from which motion information in the current picture is derived (or decoded) and/or motion information of a block in a picture that has been reconstructed. The stored motion information may be transmitted to the inter predictor 332 to be used as motion information of a spatially neighboring block or motion information of a temporally neighboring block. The memory 360 may store reconstructed samples of the reconstructed block in the current picture and transmit the reconstructed samples to the intra predictor 331.
In the present disclosure, the embodiments described for the filter 260, the inter predictor 221, and the intra predictor 222 of the encoding apparatus 200 may be applied identically or correspondingly to the filter 350, the inter predictor 332, and the intra predictor 331 of the decoding apparatus 300, respectively.
As described above, prediction is performed to improve compression efficiency when video encoding is performed. Accordingly, a prediction block including prediction samples for a current block, which is an encoding target block, can be generated. Here, the prediction block includes prediction samples in the spatial domain (or pixel domain). The prediction block may be derived identically in the encoding apparatus and the decoding apparatus, and the encoding apparatus may improve image encoding efficiency by signaling to the decoding apparatus not the original sample values of the original block themselves but information about the residual between the original block and the prediction block (residual information). The decoding apparatus may derive a residual block including residual samples based on the residual information, generate a reconstructed block including reconstructed samples by adding the residual block to the prediction block, and generate a reconstructed picture including the reconstructed block.
The residual information may be generated through the transform process and the quantization process. For example, the encoding apparatus may derive a residual block between the original block and the prediction block, derive transform coefficients by performing a transform process on the residual samples (residual sample array) included in the residual block, derive quantized transform coefficients by performing a quantization process on the transform coefficients, and signal the associated residual information to the decoding apparatus (through the bitstream). Here, the residual information may include information such as value information, position information, a transform technique, a transform kernel, and quantization parameters of the quantized transform coefficients. The decoding apparatus may perform an inverse quantization/inverse transform process based on the residual information and derive residual samples (or a residual block). The decoding apparatus may generate a reconstructed block based on the prediction block and the residual block. The encoding apparatus may also derive a residual block by inverse quantizing/inverse transforming the quantized transform coefficients, for use as a reference for inter prediction of a next picture, and may generate a reconstructed picture based thereon.
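The lossy nature of the quantization round trip described above can be illustrated with a one-dimensional sketch; the step size and coefficient values are arbitrary illustrative numbers:

```python
def quantize(coeffs, step):
    # Forward quantization (encoder side): divide by the step and round.
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    # Inverse quantization (encoder and decoder sides): multiply back.
    return [lv * step for lv in levels]

coeffs = [37, -11, 3, 0]              # hypothetical transform coefficients
levels = quantize(coeffs, 8)          # [5, -1, 0, 0] -> signaled as residual info
recon  = dequantize(levels, 8)        # [40, -8, 0, 0] -> rounding loss remains
```

The decoder can only recover the scaled levels, which is why both sides must use the same inverse-quantized coefficients when forming the reconstructed picture used as a future reference.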
Fig. 4 is a control flow diagram illustrating a video/image encoding method to which the present disclosure may be applied.
S400 may be performed by the inter predictor 221 or the intra predictor 222 of the encoding device, and S410, S420, S430, and S440 may be performed by the subtractor 231, the transformer 232, the quantizer 233, and the entropy encoder 240 of the encoding device, respectively.
The encoding apparatus may derive a prediction sample through prediction for the current block (S400). The encoding device may determine whether to perform inter prediction or intra prediction on the current block, and determine a specific inter prediction mode or a specific intra prediction mode based on the RD cost. Based on the determined mode, the encoding device may derive prediction samples for the current block.
The encoding apparatus may compare the prediction samples of the current block with the original samples and derive residual samples (S410).
The encoding apparatus derives transform coefficients through transform processing of residual samples (S420). By quantizing the derived transform coefficients, quantized transform coefficients are derived (S430).
The encoding apparatus may encode image information including prediction information and residual information and output the encoded image information in the form of a bitstream (S440).
The prediction information may include, as information related to the prediction process, prediction mode information and information about motion information (e.g., when inter prediction is applied). The residual information may be information about the quantized transform coefficients, and may include, for example, the information disclosed in Table 1, which will be described later. The residual information may be entropy encoded.
The output bitstream may be transmitted to a decoding device through a storage medium or a network.
Fig. 5 is a control flow diagram illustrating a video/image decoding method to which the present disclosure may be applied.
S500 may be performed by the inter predictor 332 or the intra predictor 331 of the decoding apparatus. In S500, the process of decoding the prediction information included in the bitstream and deriving the values of the related syntax elements may be performed by the entropy decoder 310 of the decoding apparatus. S510, S520, S530, and S540 may be performed by the entropy decoder 310, the inverse quantizer 321, the inverse transformer 322, and the adder 340 of the decoding apparatus, respectively.
The decoding device may perform operations corresponding to operations already performed in the encoding device. The decoding apparatus may perform inter prediction or intra prediction on the current block based on the received prediction information and derive a prediction sample (S500).
The decoding apparatus may derive quantized transform coefficients of the current block based on the received residual information (S510). The decoding apparatus may derive the quantized transform coefficients from the residual information by entropy decoding.
The decoding apparatus may dequantize the quantized transform coefficients and derive the transform coefficients (S520).
The decoding apparatus may derive residual samples through an inverse transform process for the transform coefficients (S530).
The decoding apparatus may generate reconstructed samples of the current block based on the residual samples and the prediction samples, and generate a reconstructed picture based on the reconstructed samples (S540).
As described above, the loop filtering process may be further applied to the reconstructed picture thereafter.
Fig. 6 is a diagram illustrating a block diagram of a CABAC encoding system according to an embodiment, and shows the block diagram of context-adaptive binary arithmetic coding (CABAC) for encoding a single syntax element.
In the case where the input signal is a syntax element that is not binarized, the CABAC encoding process first converts the input signal into a binary value through binarization. In the case where the input signal is already a binary value, it bypasses binarization and is input directly to the encoding engine. Here, each binary digit 0 or 1 constituting the binary value is referred to as a bin. For example, in the case where the binary string after binarization is "110", each of 1, 1, and 0 is referred to as one bin. The bin(s) for one syntax element may represent the value of the syntax element.
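A few common binarization schemes can be sketched as follows. These are textbook forms (unary, fixed-length, and 0th-order Exp-Golomb), shown for illustration rather than as the exact binarizations of any particular syntax element:

```python
def unary_binarize(v):
    # Unary code: v ones followed by a terminating zero.
    return '1' * v + '0'

def fixed_length_binarize(v, n_bits):
    # Fixed-length code: the value in n_bits binary digits.
    return format(v, '0{}b'.format(n_bits))

def exp_golomb0(v):
    # 0th-order Exp-Golomb: leading zeros, then the binary form of v + 1.
    code = bin(v + 1)[2:]
    return '0' * (len(code) - 1) + code
```

For instance, `unary_binarize(3)` gives `'1110'`, and `exp_golomb0(4)` gives `'00101'`; each character of the resulting string is one bin fed to the encoding engine.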
The binarized bin is input to either a regular encoding engine or a bypass encoding engine.
The regular encoding engine assigns a context model reflecting a probability value to the corresponding bin and encodes the bin according to the assigned context model. After performing encoding on each bin, the regular encoding engine may update the probability model for the bin. The bins encoded in this way are referred to as context-coded bins.
The bypass encoding engine omits the process of estimating the probability of the input bin and the process of updating the probability model that has been applied to the bin after encoding. The bypass coding engine increases the coding speed by coding the bins input thereto while applying a uniform probability distribution instead of assigning contexts. The bin so encoded is referred to as a bypass bin.
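The per-bin probability update of the regular engine can be illustrated with a simple windowed estimator. This is only a conceptual sketch: the actual CABAC engine updates a finite-state, table-driven context model, not a floating-point probability:

```python
def update_probability(p_one, bin_value, window=16):
    """One adaptation step of a context model (illustrative only).

    `p_one` is the current estimate of P(bin == 1); each observed bin
    nudges the estimate by 1/window toward the observation. The window
    size 16 is an arbitrary illustrative choice.
    """
    target = 1.0 if bin_value else 0.0
    return p_one + (target - p_one) / window

p = 0.5                      # start with no preference
for b in [1, 1, 1, 1]:       # a run of ones raises the estimate
    p = update_probability(p, b)
```

A bypass bin, by contrast, skips both the estimation and the update and is coded with a fixed probability of 0.5, trading a little compression for speed.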
Entropy encoding determines whether encoding is performed through the regular encoding engine or through the bypass encoding engine, and switches the encoding path accordingly. Entropy decoding performs the same processes as encoding in reverse order.
Further, in an embodiment, the (quantized) transform coefficients are encoded and/or decoded based on syntax elements such as transform_skip_flag, last_sig_coeff_x_prefix, last_sig_coeff_y_prefix, last_sig_coeff_x_suffix, last_sig_coeff_y_suffix, coded_sub_block_flag, sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, coeff_sign_flag, and mts_idx. Table 1 below shows syntax elements related to residual data encoding according to an example.
TABLE 1
The transform_skip_flag indicates whether the transform is skipped for the associated block. The associated block may be a Coding Block (CB) or a Transform Block (TB). With respect to the transform (and quantization) and residual coding processes, CB and TB may be used interchangeably. For example, as described above, residual samples of a CB may be derived, and (quantized) transform coefficients may be derived by transforming and quantizing the residual samples. Further, through the residual coding process, information (e.g., syntax elements) efficiently indicating the positions, magnitudes, signs, etc. of the (quantized) transform coefficients may be generated and signaled. The quantized transform coefficients may simply be referred to as transform coefficients. In general, when the CB is not larger than the maximum TB, the size of the CB may be the same as that of the TB, and in this case, the target block to be transformed (and quantized) and residual-coded may be referred to as a CB or a TB. When the CB is larger than the maximum TB, the target block to be transformed (and quantized) and residual-coded may be referred to as a TB. Although it is described below by way of example that syntax elements related to residual coding are signaled in units of Transform Blocks (TBs), as described above, TB and Coding Block (CB) may be used interchangeably.
In one embodiment, the (x, y) position information of the last non-zero transform coefficient in the transform block may be encoded based on the syntax elements last_sig_coeff_x_prefix, last_sig_coeff_y_prefix, last_sig_coeff_x_suffix, and last_sig_coeff_y_suffix. More specifically, last_sig_coeff_x_prefix indicates the prefix of the column position of the last significant coefficient in the scan order in the transform block; last_sig_coeff_y_prefix indicates the prefix of the row position of the last significant coefficient in the scan order in the transform block; last_sig_coeff_x_suffix indicates the suffix of the column position of the last significant coefficient in the scan order in the transform block; and last_sig_coeff_y_suffix indicates the suffix of the row position of the last significant coefficient in the scan order in the transform block. Here, a significant coefficient may be a non-zero coefficient. The scan order may be an upper-right diagonal scan order. Alternatively, the scan order may be a horizontal scan order or a vertical scan order. The scan order may be determined based on whether intra/inter prediction is applied to the target block (the CB, or the CB including the TB) and/or based on a specific intra/inter prediction mode.
Next, after dividing the transform block into 4x4 sub-blocks, a one-bit syntax element of coded_sub_block_flag may be used for each 4x4 sub-block to indicate whether non-zero coefficients exist in the current sub-block.
If the value of coded_sub_block_flag is 0, there is no more information to be transmitted, and thus the encoding process for the current sub-block may be terminated. In contrast, if the value of coded_sub_block_flag is 1, the encoding process for sig_coeff_flag may be performed. Since coded_sub_block_flag does not need to be encoded for the sub-block including the last non-zero coefficient, and the sub-block including the DC information of the transform block has a high probability of including a non-zero coefficient, coded_sub_block_flag for these sub-blocks may be assumed to have a value of 1 and is not encoded.
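How an encoder might derive coded_sub_block_flag values can be sketched as follows. This is illustrative only; the function and variable names are hypothetical, and the inference rules for the last-coefficient and DC sub-blocks described above are noted but not modeled.

```python
def coded_sub_block_flags(block):
    """Derive a coded_sub_block_flag per 4x4 sub-block of a 2-D coefficient
    array: 1 if the sub-block contains any non-zero coefficient, else 0.
    (Illustrative; the flags for the sub-block holding the last significant
    coefficient and for the DC sub-block are inferred rather than coded.)"""
    h, w = len(block), len(block[0])
    flags = {}
    for sy in range(0, h, 4):
        for sx in range(0, w, 4):
            flags[(sy // 4, sx // 4)] = int(any(
                block[y][x] != 0
                for y in range(sy, sy + 4)
                for x in range(sx, sx + 4)))
    return flags

# 8x8 block with non-zero values only in the top-left 4x4 region
blk = [[0] * 8 for _ in range(8)]
blk[0][0], blk[1][2] = 10, -3
print(coded_sub_block_flags(blk))
# → {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 0}
```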
If it is determined that a non-zero coefficient exists in the current sub-block because the value of coded_sub_block_flag is 1, sig_coeff_flag having a binary value may be encoded according to the scan order. A 1-bit syntax element sig_coeff_flag may be encoded for each coefficient according to the scan order. If the value of the transform coefficient at the current scan position is not 0, the value of sig_coeff_flag may be 1. Here, in the case of the sub-block including the last non-zero coefficient, since sig_coeff_flag does not need to be encoded for the last non-zero coefficient, the encoding process for sig_coeff_flag may be omitted for it. Level information encoding may be performed only when sig_coeff_flag is 1, and four syntax elements may be used in the level information encoding process. More specifically, each sig_coeff_flag[xC][yC] may indicate whether the level (value) of the corresponding transform coefficient at each transform coefficient position (xC, yC) in the current TB is non-zero. In an embodiment, sig_coeff_flag may correspond to an example of a significant coefficient flag indicating whether a quantized transform coefficient is a non-zero significant coefficient.
The level value remaining after sig_coeff_flag is encoded may be as in Equation 1 below. That is, remAbsLevel, which indicates the level value to be encoded, may be as shown in Equation 1 below. Here, coeff denotes the actual transform coefficient value.
[Equation 1]
remAbsLevel=|coeff|-1
The least significant bit (LSB) value of remAbsLevel in Equation 1 may be encoded through par_level_flag as shown in Equation 2 below. Here, par_level_flag[n] may indicate the parity of the transform coefficient level (value) at scan position n. After par_level_flag is encoded, the transform coefficient level value remAbsLevel to be encoded may be updated as shown in Equation 3 below.
[Equation 2]
par_level_flag=remAbsLevel&1
[Equation 3]
remAbsLevel’=remAbsLevel>>1
rem_abs_gt1_flag may indicate whether remAbsLevel' at the corresponding scan position n is greater than 1, and rem_abs_gt2_flag may indicate whether remAbsLevel' at the corresponding scan position n is greater than 2. Encoding of abs_remainder may be performed only when rem_abs_gt2_flag is 1. The relation between the actual transform coefficient value coeff and each syntax element may be summarized as in Equation 4 below, and Table 2 below shows examples related to Equation 4. In addition, the sign of each coefficient may be encoded using the 1-bit syntax element coeff_sign_flag. |coeff| may indicate the transform coefficient level (value) and may be expressed as AbsLevel for the transform coefficient.
[Equation 4]
|coeff|=sig_coeff_flag+par_level_flag+2*(rem_abs_gt1_flag+rem_abs_gt2_flag+abs_remainder)
In an embodiment, par_level_flag is an example of a parity level flag regarding the parity of the transform coefficient level of a quantized transform coefficient, rem_abs_gt1_flag is an example of a first transform coefficient level flag regarding whether the transform coefficient level is greater than a first threshold, and rem_abs_gt2_flag is an example of a second transform coefficient level flag regarding whether the transform coefficient level is greater than a second threshold.
In addition, in another embodiment, rem_abs_gt2_flag may be referred to as rem_abs_gt3_flag, and in another embodiment, rem_abs_gt1_flag and rem_abs_gt2_flag may be expressed based on abs_level_gtx_flag[n][j]. abs_level_gtx_flag[n][j] may be a flag indicating whether the absolute value of the transform coefficient level (or the transform coefficient level shifted right by 1) at scan position n is greater than (j << 1) + 1. In one example, rem_abs_gt1_flag may perform the same and/or similar function as abs_level_gtx_flag[n][0], and rem_abs_gt2_flag may perform the same and/or similar function as abs_level_gtx_flag[n][1]. That is, abs_level_gtx_flag[n][0] may correspond to an example of the first transform coefficient level flag, and abs_level_gtx_flag[n][1] may correspond to an example of the second transform coefficient level flag. (j << 1) + 1 may be replaced with predetermined thresholds (e.g., a first threshold and a second threshold) according to circumstances.
TABLE 2
|coeff|  sig_coeff_flag  par_level_flag  rem_abs_gt1_flag  rem_abs_gt2_flag  abs_remainder
   0           0
   1           1               0                0
   2           1               1                0
   3           1               0                1                 0
   4           1               1                1                 0
   5           1               0                1                 1                0
   6           1               1                1                 1                0
   7           1               0                1                 1                1
   8           1               1                1                 1                1
   9           1               0                1                 1                2
  10           1               1                1                 1                2
  11           1               0                1                 1                3
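As a cross-check on Equation 4 and Table 2, the decomposition of |coeff| into its syntax elements can be sketched in Python. This is an illustrative sketch consistent with the table, not normative patent text; the function names are hypothetical.

```python
def decompose_level(abs_coeff):
    """Split |coeff| into the residual-coding syntax elements, consistent
    with Equations 1-4 and Table 2. Elements that are not coded for the
    given level are returned as None."""
    sig = int(abs_coeff != 0)              # sig_coeff_flag
    if not sig:
        return (0, None, None, None, None)
    rem_abs = abs_coeff - 1                # Equation 1: remAbsLevel
    par = rem_abs & 1                      # Equation 2: par_level_flag
    rem_abs >>= 1                          # Equation 3: remAbsLevel'
    if rem_abs == 0:
        return (1, par, 0, None, None)     # rem_abs_gt1_flag = 0
    if rem_abs == 1:
        return (1, par, 1, 0, None)        # rem_abs_gt2_flag = 0
    return (1, par, 1, 1, rem_abs - 2)     # abs_remainder is coded

def compose_level(sig, par, gt1, gt2, rem):
    """Equation 4, treating uncoded (None) elements as 0."""
    par, gt1, gt2, rem = (x or 0 for x in (par, gt1, gt2, rem))
    return sig + par + 2 * (gt1 + gt2 + rem)

for v in (3, 5, 11):
    print(v, decompose_level(v))
# → 3 (1, 0, 1, 0, None)
#   5 (1, 0, 1, 1, 0)
#   11 (1, 0, 1, 1, 3)
```

Round-tripping decompose_level through compose_level reproduces every row of Table 2, confirming that Equation 4 is consistent with the flag semantics above.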
Fig. 7 is a diagram illustrating an example of transform coefficients in a 4×4 block.
The 4x4 block of fig. 7 shows an example of quantized coefficients. The block shown in fig. 7 may be a 4x4 transform block, or a 4x4 sub-block of an 8x8, 16x16, 32x32, or 64x64 transform block. The 4x4 block of fig. 7 may be a luminance block or a chrominance block. The encoding results for the anti-diagonally scanned coefficients of fig. 7 are shown, for example, in Table 3. In Table 3, scan_pos indicates the position of a coefficient according to the anti-diagonal scan. scan_pos 15 is the coefficient scanned first in the 4x4 block, i.e., the coefficient at the lower-right corner, while scan_pos 0 is the coefficient scanned last, i.e., the coefficient at the upper-left corner. Further, in one embodiment, scan_pos may be referred to as the scan position. For example, scan_pos 0 may be referred to as scan position 0.
TABLE 3
scan_pos           15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
coefficients        0   0   0   0   1  -1   0   2   0   3  -2  -3   4   6  -7  10
sig_coeff_flag      0   0   0   0   1   1   0   1   0   1   1   1   1   1   1   1
par_level_flag                      0   0       1       0   1   0   1   1   0   1
rem_abs_gt1_flag                    0   0       0       1   0   1   1   1   1   1
rem_abs_gt2_flag                                        0       0   0   1   1   1
abs_remainder                                                           0   1   2
coeff_sign_flag                     0   1       0       0   1   1   0   0   1   0
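Two rows of Table 3 can be reproduced directly from the coefficient scan, which makes the per-position semantics concrete. This is a minimal sketch; the variable names are illustrative.

```python
# Coefficients of the 4x4 block in anti-diagonal scan order (scan_pos 15..0)
coeffs = [0, 0, 0, 0, 1, -1, 0, 2, 0, 3, -2, -3, 4, 6, -7, 10]

# sig_coeff_flag: 1 wherever the coefficient is non-zero
sig_coeff_flag = [int(c != 0) for c in coeffs]
# coeff_sign_flag is coded only for significant coefficients (1 = negative)
coeff_sign_flag = [int(c < 0) for c in coeffs if c != 0]

print(sig_coeff_flag)   # → [0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1]
print(coeff_sign_flag)  # → [0, 1, 0, 0, 1, 1, 0, 0, 1, 0]
```

Both printed lists match the sig_coeff_flag and coeff_sign_flag rows of Table 3.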
Further, as described with reference to Table 1, whether the transform is applied to the corresponding block is first transmitted before the residual signal is encoded. By expressing the correlation between residual signals in the transform domain, the data is compressed and transmitted to the decoding apparatus. If the correlation between the residual signals is insufficient, data compression may not be performed sufficiently. In this case, the transform process, which involves complex computation, may be omitted, and the residual signal in the pixel domain (spatial domain) may be transmitted to the decoding apparatus.
Since the residual signal of the pixel domain, which has not undergone a transform, has characteristics (distribution of residual signals, absolute level of each residual signal, etc.) different from those of the ordinary transform-domain residual signal, a residual signal encoding method for efficiently transmitting such a signal to the decoding apparatus according to an example of the present disclosure is presented below.
Fig. 8 is a diagram illustrating a residual signal decoder according to an example of the present disclosure.
As shown in the drawing, a transform application flag indicating whether a transform is applied to a corresponding transform block and information on an encoded binarized code may be input to the residual signal decoder 800, and a decoded residual signal may be output from the residual signal decoder 800.
The flag as to whether the transform is applied may be expressed as transform_skip_flag, and the encoded binarized code may be input to the residual signal decoder 800 through the binarization process of fig. 6.
The transform skip flag is transmitted in units of transform blocks. In Table 1, the flag as to whether to perform the transform is limited to a specific block size (the condition for parsing the transform skip flag is included only when the size of the transform block is 4x4 or less). However, in the present embodiment, the size of the block used to determine whether to parse the transform skip flag may be configured in various ways. The sizes log2TbWidth and log2TbHeight are compared with the variables wN and hN, and wN and hN may be selected as one of the following.
[Equation 5]
wN={2,3,4,5}
hN={2,3,4,5}
The syntax to which Equation 5 may be applied is as follows.
TABLE 4
As described above, the method of decoding the residual signal may be determined according to the transform skip flag. With the proposed method, the complexity of the entropy decoding process can be reduced, and coding efficiency can be improved, by efficiently processing signals having statistical characteristics different from each other.
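The block-size gating described above can be sketched as a simple predicate. Per Table 1 the flag is parsed only for blocks of 4x4 or smaller (wN = hN = 2), while Equation 5 allows wN, hN ∈ {2, 3, 4, 5}; the function name is hypothetical and the sketch omits any SPS-level enabling condition.

```python
def may_parse_transform_skip_flag(log2_tb_width, log2_tb_height, wN=2, hN=2):
    """Whether transform_skip_flag would be parsed for a transform block,
    assuming the flag is sent only for blocks no larger than 2^wN x 2^hN.
    Defaults wN = hN = 2 reproduce the 4x4-or-less condition of Table 1."""
    return log2_tb_width <= wN and log2_tb_height <= hN

print(may_parse_transform_skip_flag(2, 2))              # 4x4, default → True
print(may_parse_transform_skip_flag(3, 3))              # 8x8, default → False
print(may_parse_transform_skip_flag(3, 3, wN=3, hN=3))  # 8x8, relaxed → True
```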
Based on the description of the above embodiment and Table 1, the present embodiment proposes a method of decoding the residual signal in units of a Transform Block (TB), rather than in units of the 4x4 sub-blocks in which it would otherwise be encoded, when the current decoding target block is a residual of the untransformed pixel domain.
The ordinary residual signal is expressed in the transform domain, and for the transformed residual signal, the closer a coefficient is to the upper-left corner of the block, the more likely it is to be non-zero and the larger its absolute level tends to be. Encoding may be performed by the above-described methods reflecting these characteristics.
However, the residual of the pixel domain, which is not expressed in the transform domain, does not have the above-described characteristics, and whether a non-zero coefficient occurs at a given position is random. In this case, a method that expresses the residual in units of sub-blocks, i.e., in units larger than the pixel unit, and determines elements to be coded such as coded_sub_block_flag may cause the side effect of redundantly transmitting information about the coefficient distribution, that is, it may increase complexity. Therefore, according to the present embodiment, for a transform block to which no transform is applied, encoding and decoding efficiency can be improved by transmitting the residual signal in units of transform blocks instead of in units of sub-blocks.
The following will be summarized with reference to fig. 9. Fig. 9 is a control flow diagram illustrating a method of decoding a residual signal according to an exemplary embodiment of the present disclosure.
First, an entropy decoder or a residual signal decoder of the decoding apparatus parses a transform skip flag (transform_skip_flag) indicating whether the transform process has been performed on the transform block (S900), and may determine whether the residual signal has been transformed based on the parsed information (S910).
As a result of the determination, when the transform skip flag indicates that the residual signal has been transformed, the entropy decoder or the residual signal decoder may decode the transformed block in units of sub-blocks (S920).
In contrast, when the transform skip flag indicates that the residual signal is not transformed, the entropy decoder or the residual signal decoder may decode the transform block in units of transform blocks instead of in units of sub-blocks (S930).
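The dispatch of steps S900-S930 can be sketched as follows; the callable names are hypothetical stand-ins for the two decoding routines.

```python
def decode_residual(transform_skip_flag, decode_sub_block, decode_whole_tb):
    """Dispatch of Fig. 9 (S900-S930): sub-block units when a transform was
    applied, whole-transform-block units when the transform was skipped."""
    if transform_skip_flag == 0:      # residual was transformed (S920)
        return decode_sub_block()
    return decode_whole_tb()          # transform skipped (S930)

# Hypothetical stand-ins for the two decoding routines
print(decode_residual(0, lambda: "per 4x4 sub-block", lambda: "per TB"))
# → per 4x4 sub-block
print(decode_residual(1, lambda: "per 4x4 sub-block", lambda: "per TB"))
# → per TB
```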
Further, according to the technique of decoding a residual based on table 1 and a transform skip flag, the present embodiment proposes a method of determining a context element (i.e., syntax) when a current decoding target block is a residual of an untransformed pixel domain.
In the case of an ordinary transform-domain residual, the residual signal is expressed as a level value for each frequency component, and in the high-frequency region the probability that a value becomes zero or near zero through quantization increases. Thus, in Table 1, a method is used in which subsequent context element parsing can be omitted by first encoding sig_coeff_flag, which is a context element regarding whether the current transform coefficient value is 0.
When sig_coeff_flag is not 0, rem_abs_gt1_flag, par_level_flag, rem_abs_gt2_flag, and the like may be sequentially encoded according to the value of the current transform coefficient. However, in the case where the residual signal of the pixel domain is not subjected to transformation, the absolute level value of the signal has randomness.
The context-encoded syntax element may include at least one of sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag as a syntax element encoded by context-based arithmetic encoding. In addition, in the following, the context coding bin may indicate a context coding bin with respect to at least one of the sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag.
In general, in the case where the value of the residual signal is large, if all the syntax elements (e.g., sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag) are signaled for every coefficient, redundant information is more likely to be transmitted than if the level value were binarized and transmitted as it is. Therefore, the present embodiment proposes a method of improving coding efficiency by omitting some context elements for the residual signal of the pixel domain.
The proposed method may branch based on the transform_skip_flag context element in Table 1; the case not corresponding to the branch may follow the existing method, for example, the context elements of Table 1 (sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, coeff_sign_flag) may be encoded and decoded, or the context elements defined above may be included. That is, when a transform is applied, the context elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and decoded as shown in Table 1.
Furthermore, the residual signal to which no transform is applied may be encoded and decoded through the context elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, abs_remainder, and coeff_sign_flag.
Table 5 shows the context elements according to the present embodiment.
TABLE 5
Fig. 10 is a control flow chart illustrating a process of resolving a context element according to the present embodiment. The context element parsing according to the transform skip flag according to fig. 10 will be described as follows.
First, a transform skip flag (transform_skip_flag) indicating whether a transform process has been performed on a transform block is parsed to determine whether the transform_skip_flag is 1 (S1000).
As a result of the determination, in the case of a residual whose transform_skip_flag is 1 (i.e., the transform has been skipped), the context elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, abs_remainder, and coeff_sign_flag may be encoded, and these context elements may be parsed (S1010).
In this case, the context elements may be sequentially parsed, or the parsing order may be changed.
In contrast, in the case of a residual whose transform_skip_flag is 0 (i.e., the transform has been applied), the context elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded, and these context elements may be parsed (S1020). In this case, the context elements may be parsed sequentially, or the parsing order may be changed.
That is, for a residual to which no transform is applied, rem_abs_gt2_flag is not encoded or decoded, as compared with a residual to which the transform is applied. When the residual value is large, if all the syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag are signaled for every coefficient, redundant information is more likely to be transmitted than if the level value were binarized and transmitted as it is; thus, in the present embodiment, coding efficiency is improved by omitting the context element rem_abs_gt2_flag.
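A minimal sketch of which context elements are coded on each branch of Fig. 10; the list contents are taken from the text above, and the helper name is hypothetical.

```python
# Syntax elements coded per Fig. 10: rem_abs_gt2_flag is dropped when the
# transform is skipped (transform_skip_flag == 1).
COMMON = ["sig_coeff_flag", "par_level_flag", "rem_abs_gt1_flag"]
TAIL = ["abs_remainder", "coeff_sign_flag"]

def residual_syntax_elements(transform_skip_flag):
    if transform_skip_flag:                      # S1010: transform skipped
        return COMMON + TAIL
    return COMMON + ["rem_abs_gt2_flag"] + TAIL  # S1020: transform applied

print(residual_syntax_elements(1))
# → ['sig_coeff_flag', 'par_level_flag', 'rem_abs_gt1_flag', 'abs_remainder', 'coeff_sign_flag']
print(residual_syntax_elements(0))
# → ['sig_coeff_flag', 'par_level_flag', 'rem_abs_gt1_flag', 'rem_abs_gt2_flag', 'abs_remainder', 'coeff_sign_flag']
```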
Further, according to a technique of decoding a residual based on table 1 and a transform skip flag, another embodiment according to the present disclosure proposes a method of determining a context element (i.e., syntax) when a current decoding target block is a residual of an untransformed pixel domain.
In the case of an ordinary transform-domain residual, the residual signal is expressed as a level value for each frequency component, and in the high-frequency region the probability that a value becomes zero or near zero through quantization increases. Thus, in Table 1, a method is used in which subsequent context element parsing can be omitted by first encoding sig_coeff_flag, which is a context element regarding whether the current transform coefficient value is 0.
When sig_coeff_flag is not 0, rem_abs_gt1_flag, par_level_flag, rem_abs_gt2_flag, and the like may be sequentially encoded according to the value of the current transform coefficient. However, in the case where the residual signal of the pixel domain is not subjected to transformation, the absolute level value of the signal has randomness.
The context-encoded syntax element may include at least one of sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag as a syntax element encoded by context-based arithmetic encoding. In addition, in the following, the context coding bin may indicate a context coding bin with respect to at least one of the sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag.
In general, in the case where the value of the residual signal is large, if all the syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag are signaled for every coefficient, redundant information is more likely to be transmitted than if the level value were binarized and transmitted as it is. Therefore, the present embodiment proposes a method of improving coding efficiency by omitting some context elements for the residual signal of the pixel domain.
The proposed method may branch based on the transform_skip_flag context element in Table 1; the case not corresponding to the branch may follow the existing method, for example, the context elements of Table 1 (sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, coeff_sign_flag) may be encoded and decoded, or the context elements defined above may be included. That is, when the transform is applied, the context elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and decoded as shown in Table 1.
Furthermore, the residual signal to which no transform is applied may be encoded and decoded through the context elements sig_coeff_flag, par_level_flag, abs_remainder, and coeff_sign_flag.
Table 6 shows the context elements according to the present embodiment.
TABLE 6
Fig. 11 is a control flow chart illustrating a process of resolving a context element according to the present embodiment. The context element parsing according to the transform skip flag according to fig. 11 will be described as follows.
First, a transform skip flag (transform_skip_flag) indicating whether a transform process has been performed on a transform block is parsed to determine whether the transform_skip_flag is 1 (S1100).
As a result of the determination, in the case of a residual whose transform_skip_flag is 1 (i.e., the transform has been skipped), the context elements sig_coeff_flag, par_level_flag, abs_remainder, and coeff_sign_flag may be encoded, and these context elements may be parsed (S1110). In this case, the context elements may be parsed sequentially, or the parsing order may be changed.
In contrast, in the case of a residual whose transform_skip_flag is 0 (i.e., the transform has been applied), the context elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded, and these context elements may be parsed (S1120). In this case, the context elements may be parsed sequentially, or the parsing order may be changed.
That is, for a residual to which no transform is applied, rem_abs_gt1_flag and rem_abs_gt2_flag are not encoded or decoded, as compared with a residual to which the transform has been applied. When the residual value is large, if all the syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag are signaled for every coefficient, redundant information is more likely to be transmitted than if the level value were binarized and transmitted as it is; thus, in the present embodiment, coding efficiency is improved by omitting the context elements rem_abs_gt1_flag and rem_abs_gt2_flag.
Further, according to a technique of decoding a residual based on table 1 and a transform skip flag, a method of determining a context element when a current decoding target block is a residual of an untransformed pixel domain is proposed according to still another embodiment of the present disclosure.
In the case of an ordinary transform-domain residual, the residual signal is expressed as a level value for each frequency component, and in the high-frequency region the probability that a value becomes zero or near zero through quantization increases. Thus, in Table 1, a method is used in which subsequent context element parsing can be omitted by first encoding sig_coeff_flag, which is a context element regarding whether the current transform coefficient value is 0.
When sig_coeff_flag is not 0, rem_abs_gt1_flag, par_level_flag, rem_abs_gt2_flag, and the like may be sequentially encoded according to the value of the current transform coefficient. However, in the case where the residual signal of the pixel domain is not subjected to transformation, the absolute level value of the signal has randomness.
The context-encoded syntax element may include at least one of sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag as a syntax element encoded by context-based arithmetic encoding. In addition, in the following, the context coding bin may indicate a context coding bin with respect to at least one of the sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag.
In general, in the case where the value of the residual signal is large, if all the syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag are signaled for every coefficient, redundant information is more likely to be transmitted than if the level value were binarized and transmitted as it is. Therefore, the present embodiment proposes a method of improving coding efficiency by omitting some context elements for the residual signal of the pixel domain.
The proposed method may branch based on the transform_skip_flag context element in Table 1; the case not corresponding to the branch may follow the existing method, for example, the context elements of Table 1 (sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, coeff_sign_flag) may be encoded and decoded, or the context elements defined above may be included. That is, when the transform is applied, the context elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and decoded as shown in Table 1.
Furthermore, the residual signal to which no transform is applied may be encoded and decoded through the context elements sig_coeff_flag, abs_remainder, and coeff_sign_flag.
In addition, according to an example, when the number of bins allowed for context-coded syntax elements has been used up, only abs_remainder and coeff_sign_flag may be encoded or decoded/parsed, without further encoding or decoding/parsing of context-coded syntax elements.
Table 7 shows the context elements according to the present embodiment.
TABLE 7
Fig. 12 is a control flow chart illustrating a process of resolving a context element according to the present embodiment. The context element parsing according to the transform skip flag according to fig. 12 will be described as follows.
First, a transform skip flag (transform_skip_flag) indicating whether a transform process has been performed on a transform block is parsed to determine whether the transform_skip_flag is 1 (S1200).
As a result of the determination, in the case of a residual whose transform_skip_flag is 1 (i.e., the transform has been skipped), the context elements sig_coeff_flag, abs_remainder, and coeff_sign_flag may be encoded, and these context elements may be parsed (S1210). In this case, the context elements may be parsed sequentially, or the parsing order may be changed.
In contrast, in the case of a residual whose transform_skip_flag is 0 (i.e., the transform has been applied), the context elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded, and these context elements may be parsed (S1220). In this case, the context elements may be parsed sequentially, or the parsing order may be changed.
That is, for a residual to which no transform is applied, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag are not encoded or decoded, as compared with a residual to which the transform has been applied. When the residual value is large, if all the syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag are signaled for every coefficient, redundant information is more likely to be transmitted than if the level value were binarized and transmitted as it is; thus, in the present embodiment, coding efficiency is improved by omitting the context elements par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag.
Further, according to a technique of decoding a residual based on table 1 and a transform skip flag, a method of determining a context element when a current decoding target block is a residual of an untransformed pixel domain is proposed according to still another embodiment of the present disclosure.
In the case of an ordinary transform-domain residual, the residual signal is expressed as a level value for each frequency component, and in the high-frequency region the probability that a value becomes zero or near zero through quantization increases. Thus, in Table 1, a method is used in which subsequent context element parsing can be omitted by first encoding sig_coeff_flag, which is a context element regarding whether the current transform coefficient value is 0.
When sig_coeff_flag is not 0, rem_abs_gt1_flag, par_level_flag, rem_abs_gt2_flag, and the like may be sequentially encoded according to the value of the current transform coefficient. However, in the case where the residual signal of the pixel domain is not subjected to transformation, the absolute level value of the signal has randomness.
The context-encoded syntax element may include at least one of sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag as a syntax element encoded by context-based arithmetic encoding. In addition, in the following, the context coding bin may indicate a context coding bin with respect to at least one of the sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag.
In general, in the case where the value of the residual signal is large, if all the syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag are signaled for every coefficient, redundant information is more likely to be transmitted than if the level value were binarized and transmitted as it is. Therefore, the present embodiment proposes a method of improving coding efficiency by omitting some context elements for the residual signal of the pixel domain.
The proposed method may branch based on the transform_skip_flag syntax element in Table 1, and the case not taken by the branch may follow the existing method; for example, the syntax elements of Table 1 (sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, coeff_sign_flag) may be encoded and decoded, or the syntax elements defined above may be included. That is, when the transform is applied, the syntax elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and decoded as shown in Table 1.
Furthermore, a residual signal to which no transform is applied may be encoded and decoded using only the syntax elements abs_remainder and coeff_sign_flag.
In addition, according to an example, when the allowed number of bins for context-coded syntax elements has been used up, only abs_remainder and coeff_sign_flag may be encoded or decoded/parsed, without further encoding or decoding/parsing of context-coded syntax elements.
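This bin-budget fallback can be sketched roughly as follows. This is a minimal illustration only; the budget threshold, list constants, and function name are assumptions of this sketch and are not taken from the specification.

```python
# Hypothetical sketch of the context-coded bin budget fallback.
# The flag sets and the fallback to bypass-coded abs_remainder /
# coeff_sign_flag follow the text above; the names and the budget
# threshold are illustrative, not normative.

CONTEXT_CODED = ["sig_coeff_flag", "par_level_flag",
                 "rem_abs_gt1_flag", "rem_abs_gt2_flag"]
BYPASS_CODED = ["abs_remainder", "coeff_sign_flag"]

def syntax_for_next_coeff(bins_left, flags_needed=len(CONTEXT_CODED)):
    """Syntax elements to parse for the next coefficient, given the
    remaining context-coded bin budget."""
    if bins_left >= flags_needed:
        # enough budget: context-coded flags, then bypass-coded elements
        return CONTEXT_CODED + BYPASS_CODED
    # budget exhausted: only bypass-coded elements are parsed
    return BYPASS_CODED
```

For instance, with a spent budget (bins_left = 0) only abs_remainder and coeff_sign_flag would remain to be parsed.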
Table 8 shows the context elements according to the present embodiment.
TABLE 8
Fig. 13 is a control flowchart illustrating a process of parsing syntax elements according to the present embodiment. The syntax element parsing according to the transform skip flag in Fig. 13 is as follows.
First, the transform skip flag (transform_skip_flag), which indicates whether a transform process has been performed on the transform block, is parsed to determine whether transform_skip_flag is 1 (S1300).
As a result of the determination, for residual values whose transform_skip_flag is 1 (i.e., the transform has been skipped), the syntax elements abs_remainder and coeff_sign_flag may be encoded, and abs_remainder and coeff_sign_flag may be parsed (S1310). In this case, the syntax elements may be parsed sequentially, or the parsing order may be changed.
In contrast, for residual values whose transform_skip_flag is 0 (i.e., the transform has been applied), the syntax elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and parsed (S1320). In this case, the syntax elements may be parsed sequentially, or the parsing order may be changed.
That is, compared with residual values to which the transform has been applied, sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag are not encoded or decoded for residual values to which the transform has not been applied. When the residual value is large, expressing it with the full set of syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag is more likely to transmit redundant information than binarizing and transmitting the level value as it is; therefore, in the present embodiment, coding efficiency is improved by omitting the syntax elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag.
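The branch of Fig. 13 can be sketched as follows. This is a minimal illustration; the function name and the list-of-names return convention are assumptions of this sketch, not part of the specification.

```python
# Full syntax element set of Table 1 (transform applied).
FULL_SET = ["sig_coeff_flag", "par_level_flag", "rem_abs_gt1_flag",
            "rem_abs_gt2_flag", "abs_remainder", "coeff_sign_flag"]

def residual_syntax(transform_skip_flag):
    """Syntax elements parsed for a block, per the branch on
    transform_skip_flag described above (S1300/S1310/S1320)."""
    if transform_skip_flag == 1:
        # transform skipped: level flags are omitted for the
        # pixel-domain residual in this embodiment
        return ["abs_remainder", "coeff_sign_flag"]
    # transform applied: full set as in Table 1
    return FULL_SET
```

The same selection would be performed identically on the encoder and decoder sides, since both read transform_skip_flag before the residual data.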
Further, building on the technique of decoding a residual based on Table 1 and the transform skip flag, still another embodiment of the present disclosure proposes a method of determining syntax elements when the current decoding target block is an untransformed pixel-domain residual.
In the case of a general transform-domain residual, the residual signal is represented as a level value for each frequency component, and in the high-frequency region the probability that a level is quantized to zero or a value near zero increases. Thus, Table 1 uses a method in which subsequent syntax element parsing can be omitted by first coding sig_coeff_flag, the syntax element indicating whether the current transform coefficient value is 0.
When sig_coeff_flag is not 0, rem_abs_gt1_flag, par_level_flag, rem_abs_gt2_flag, and so on may be sequentially coded according to the value of the current transform coefficient. However, when the pixel-domain residual signal has not been transformed, the absolute level value of the signal is close to random.
The context-coded syntax elements, i.e., the syntax elements coded by context-based arithmetic coding, may include at least one of sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag. In the following, a context-coded bin may indicate a bin of at least one of sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag.
In general, when the value of the residual signal is large, expressing it with the full set of syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag is more likely to transmit redundant information than binarizing and transmitting the level value as it is. Therefore, the present embodiment proposes a method of improving coding efficiency by omitting some syntax elements for the pixel-domain residual signal.
The proposed method may branch based on the transform_skip_flag syntax element in Table 1, and the case not taken by the branch may follow the existing method; for example, the syntax elements of Table 1 (sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, coeff_sign_flag) may be encoded and decoded, or the syntax elements defined above may be included. That is, when the transform is applied, the syntax elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and decoded as shown in Table 1.
Furthermore, a residual signal to which no transform is applied may be encoded and decoded using the syntax elements sig_coeff_flag, rem_abs_gt1_flag, abs_remainder, and coeff_sign_flag.
Table 9 shows the context elements according to the present embodiment.
TABLE 9
Fig. 14 is a control flowchart illustrating a process of parsing syntax elements according to the present embodiment. The syntax element parsing according to the transform skip flag in Fig. 14 is as follows.
First, the transform skip flag (transform_skip_flag), which indicates whether a transform process has been performed on the transform block, is parsed to determine whether transform_skip_flag is 1 (S1400).
As a result of the determination, for residual values whose transform_skip_flag is 1 (i.e., the transform has been skipped), the syntax elements sig_coeff_flag, rem_abs_gt1_flag, abs_remainder, and coeff_sign_flag may be encoded and parsed (S1410). In this case, the syntax elements may be parsed sequentially, or the parsing order may be changed.
In contrast, for residual values whose transform_skip_flag is 0 (i.e., the transform has been applied), the syntax elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and parsed (S1420). In this case, the syntax elements may be parsed sequentially, or the parsing order may be changed.
That is, compared with residual values to which the transform has been applied, par_level_flag and rem_abs_gt2_flag are not encoded or decoded for residual values to which the transform has not been applied. When the residual value is large, expressing it with the full set of syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag is more likely to transmit redundant information than binarizing and transmitting the level value as it is. Therefore, in the present embodiment, coding efficiency is improved by omitting the syntax elements par_level_flag and rem_abs_gt2_flag.
Further, building on the technique of decoding a residual based on Table 1 and the transform skip flag, still another embodiment of the present disclosure proposes a method of determining syntax elements when the current decoding target block is an untransformed pixel-domain residual.
In the case of a general transform-domain residual, the residual signal is represented as a level value for each frequency component, and in the high-frequency region the probability that a level is quantized to zero or a value near zero increases. Thus, Table 1 uses a method in which subsequent syntax element parsing can be omitted by first coding sig_coeff_flag, the syntax element indicating whether the current transform coefficient value is 0.
When sig_coeff_flag is not 0, rem_abs_gt1_flag, par_level_flag, rem_abs_gt2_flag, and so on may be sequentially coded according to the value of the current transform coefficient. However, when the pixel-domain residual signal has not been transformed, the absolute level value of the signal is close to random.
The context-coded syntax elements, i.e., the syntax elements coded by context-based arithmetic coding, may include at least one of sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag. In the following, a context-coded bin may indicate a bin of at least one of sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and/or rem_abs_gt2_flag.
In general, when the value of the residual signal is large, expressing it with the full set of syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag is more likely to transmit redundant information than binarizing and transmitting the level value as it is. Therefore, the present embodiment proposes a method of improving coding efficiency by omitting some syntax elements for the pixel-domain residual signal.
The proposed method may branch based on the transform_skip_flag syntax element in Table 1, and the case not taken by the branch may follow the existing method; for example, the syntax elements of Table 1 (sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, coeff_sign_flag) may be encoded and decoded, or the syntax elements defined above may be included. That is, when the transform is applied, the syntax elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and decoded as shown in Table 1.
Furthermore, a residual signal to which no transform is applied may be encoded and decoded using the syntax elements sig_coeff_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag.
Table 10 shows the context elements according to the present embodiment.
TABLE 10
Fig. 15 is a control flowchart illustrating a process of parsing syntax elements according to the present embodiment. The syntax element parsing according to the transform skip flag in Fig. 15 is as follows.
First, the transform skip flag (transform_skip_flag), which indicates whether a transform process has been performed on the transform block, is parsed to determine whether transform_skip_flag is 1 (S1500).
As a result of the determination, for residual values whose transform_skip_flag is 1 (i.e., the transform has been skipped), the syntax elements sig_coeff_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and parsed (S1510). In this case, the syntax elements may be parsed sequentially, or the parsing order may be changed.
In contrast, for residual values whose transform_skip_flag is 0 (i.e., the transform has been applied), the syntax elements sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, rem_abs_gt2_flag, abs_remainder, and coeff_sign_flag may be encoded and parsed (S1520). In this case, the syntax elements may be parsed sequentially, or the parsing order may be changed.
That is, compared with residual values to which the transform has been applied, par_level_flag is not encoded or decoded for residual values to which the transform has not been applied. When the residual value is large, expressing it with the full set of syntax elements such as sig_coeff_flag, par_level_flag, rem_abs_gt1_flag, and rem_abs_gt2_flag is more likely to transmit redundant information than binarizing and transmitting the level value as it is. Therefore, in the present embodiment, coding efficiency is improved by omitting the syntax element par_level_flag.
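Taken together, the three transform-skip variants above (Tables 8, 9, and 10) differ only in which level flags they drop relative to the transform-applied path. The following sketch summarizes them; the variant labels and helper name are invented here for illustration and are not part of the specification.

```python
# Syntax element set when the transform is applied (Table 1).
TRANSFORM_APPLIED = ["sig_coeff_flag", "par_level_flag", "rem_abs_gt1_flag",
                     "rem_abs_gt2_flag", "abs_remainder", "coeff_sign_flag"]

# Syntax elements kept when transform_skip_flag == 1, per embodiment.
TRANSFORM_SKIP_VARIANTS = {
    "table_8": ["abs_remainder", "coeff_sign_flag"],
    "table_9": ["sig_coeff_flag", "rem_abs_gt1_flag",
                "abs_remainder", "coeff_sign_flag"],
    "table_10": ["sig_coeff_flag", "rem_abs_gt1_flag", "rem_abs_gt2_flag",
                 "abs_remainder", "coeff_sign_flag"],
}

def omitted_flags(variant):
    """Flags omitted by a variant relative to the transform-applied path."""
    kept = set(TRANSFORM_SKIP_VARIANTS[variant])
    return [f for f in TRANSFORM_APPLIED if f not in kept]
```

For example, the Table 10 variant omits only par_level_flag, matching the description of Fig. 15.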
The syntax elements rem_abs_gt1_flag and rem_abs_gt2_flag may be expressed based on abs_level_gtx_flag [ n ] [ j ] as described above, and may also be expressed as abs_rem_gt1_flag and abs_rem_gt2_flag or abs_rem_gtx_flag.
As described above, according to the embodiments of the present disclosure, different residual coding schemes (i.e., residual syntax) may be applied depending on whether transform skipping is applied to residual coding.
For example, the signaling order of the flag regarding the sign of the transform coefficient (coeff_sign_flag) may be different depending on whether transform skip is applied. The coeff_sign_flag may be signaled after abs_remainder when transform skip is not applied, and may be signaled before rem_abs_gt1_flag when transform skip is applied.
In addition, for example, the parsing of rem_abs_gt1_flag and rem_abs_gt2_flag (i.e., rem_abs_gtx_flag) and the parsing loop for abs_remainder may vary depending on whether transform skip is applied.
Additionally, the context syntax element encoded by context-based arithmetic coding may include: a significant coefficient flag (sig_coeff_flag) indicating whether the quantized transform coefficient is a non-zero significant coefficient; a parity level flag (par_level_flag) regarding the parity of the transform coefficient level of the quantized transform coefficient; a first transform coefficient level flag (rem_abs_gt1_flag) regarding whether the transform coefficient level is greater than a first threshold; and a second transform coefficient level flag (rem_abs_gt2_flag) regarding whether the quantized coefficient level of the quantized transform coefficient is greater than a second threshold. In this case, decoding of the first transform coefficient level flag may be performed before decoding of the parity level flag.
Tables 11 to 13 show context elements according to the above example.
TABLE 11
TABLE 12
TABLE 13
Table 11 shows that residual coding is branched according to the value of transform_skip_flag, that is, different syntax elements are used for the residual. In addition, table 12 shows residual encoding in the case where the value of transform_skip_flag is 0 (i.e., in the case where transform is applied), and table 13 shows residual encoding in the case where the value of transform_skip_flag is 1 (i.e., in the case where transform is not applied).
In tables 12 and 13, the par_level_flag may be expressed as the following equation 6.
[Equation 6]
par_level_flag = coeff & 1
In addition, in Tables 12 and 13, since par_level_flag is parsed, that is, decoded, after abs_level_gtx_flag, rem_abs_gt1_flag may indicate whether the transform coefficient at the corresponding scan position n is greater than 1, and rem_abs_gt2_flag may indicate whether the transform coefficient at the corresponding scan position n is greater than 3. That is, rem_abs_gt2_flag in Table 1 may be expressed as rem_abs_gt3_flag in Tables 12 and 13.
When Equations 2 and 3 are changed as described above, Equation 4 may be changed as follows in the case of Tables 12 and 13.
[Equation 7]
|coeff| = sig_coeff_flag + par_level_flag + rem_abs_gt1_flag + 2*(rem_abs_gt2_flag + abs_remainder)
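The level decomposition implied by Equation 7 (with rem_abs_gt1_flag meaning "greater than 1" and rem_abs_gt2_flag meaning "greater than 3", as stated above) can be checked with a small round-trip sketch. The function names are invented for illustration, and the split rule for the parity flag and remainder is this edit's reading of Equation 7, not normative text.

```python
def split_level(c):
    """Decompose an absolute coefficient level c into the flags of
    Equation 7 (sketch; assumes gt1 means >1 and gt2 means >3)."""
    sig = 1 if c > 0 else 0
    gt1 = 1 if c > 1 else 0           # "greater than 1"
    par = (c & 1) if gt1 else 0       # parity, only meaningful when gt1 == 1
    gt2 = 1 if c > 3 else 0           # "greater than 3"
    rem = (c - 4 - par) // 2 if gt2 else 0
    return sig, par, gt1, gt2, rem

def join_level(sig, par, gt1, gt2, rem):
    """Equation 7: |coeff| = sig + par + gt1 + 2*(gt2 + rem)."""
    return sig + par + gt1 + 2 * (gt2 + rem)
```

For example, c = 7 splits into sig = 1, par = 1, gt1 = 1, gt2 = 1, rem = 1, and join_level recovers 7, so the decomposition is lossless over small levels.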
Fig. 16 shows an example of a content streaming system to which the present document can be applied.
Referring to fig. 16, a content streaming system to which the present document is applied may generally include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
The encoding server compresses content input from a multimedia input device (e.g., a smartphone, a camera, a camcorder, etc.) into digital data to generate a bitstream and transmits it to the streaming server. As another example, when the multimedia input device (e.g., a smartphone, a camera, a camcorder, etc.) directly generates the bitstream, the encoding server may be omitted.
The bitstream may be generated by applying the encoding method or the bitstream generation method of the present document, and the streaming server may temporarily store the bitstream while transmitting or receiving it.
The streaming server transmits multimedia data to the user device through the web server based on a user request, and the web server serves as an intermediary that informs the user of which services are available. When the user requests a desired service, the web server forwards the request to the streaming server, and the streaming server transmits multimedia data to the user. The content streaming system may include a separate control server, in which case the control server controls commands/responses between the devices in the content streaming system.
The streaming server may receive content from the media storage and/or the encoding server. For example, when content is received from the encoding server, the content may be received in real time. In this case, the streaming server may store the bitstream for a predetermined period of time to provide the streaming service smoothly.
For example, the user device may include a mobile phone, a smartphone, a laptop computer, a digital broadcast terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a navigation device, a tablet PC, an ultrabook, a wearable device (e.g., a watch-type terminal (smartwatch), a glasses-type terminal (smart glasses), or a head-mounted display (HMD)), a digital TV, a desktop computer, or digital signage.
Each server in the content streaming system may operate as a distributed server, and in this case, the data received by each server may be processed in a distributed manner.

Claims (10)

1. An image decoding method performed by a decoding apparatus, the image decoding method comprising the steps of:
obtaining prediction information and residual information from the bitstream;
deriving a prediction sample of the current block based on the prediction information;
deriving quantized transform coefficients of the current block based on the residual information;
deriving transform coefficients of the current block by performing inverse quantization on the quantized transform coefficients;
deriving residual samples of the current block based on the transform coefficients; and
generating a reconstructed picture of the current block based on the prediction samples and the residual samples,
wherein the residual information includes a significant coefficient flag associated with whether a quantized transform coefficient is a non-zero significant coefficient and a parity level flag associated with parity of a transform coefficient level of the quantized transform coefficient.
2. The image decoding method of claim 1, wherein the residual information further includes a transform coefficient level flag related to whether a transform coefficient level of the quantized transform coefficient is greater than a predetermined threshold, and
wherein the number of transform coefficient level flags is based on whether a transform is applied to the current block.
3. The image decoding method of claim 2, wherein the transform coefficient level flag includes a first transform coefficient level flag regarding whether a transform coefficient level of the quantized transform coefficient is greater than a first threshold and a second transform coefficient level flag regarding whether the transform coefficient level of the quantized transform coefficient is greater than a second threshold, and
wherein the second transform coefficient level flag is decoded differently based on whether a transform is applied to the current block.
4. The image decoding method of claim 3, wherein the step of deriving quantized transform coefficients of the current block comprises the steps of:
decoding the first transform coefficient level flag and decoding the parity level flag; and
deriving the quantized transform coefficients based on the decoded value of the parity level flag and the decoded value of the first transform coefficient level flag, and
wherein decoding of the first transform coefficient level flag is performed prior to decoding of the parity level flag.
5. An image encoding method performed by an encoding apparatus, the image encoding method comprising the steps of:
deriving a prediction sample of the current block;
deriving residual samples of the current block;
deriving transform coefficients of the current block based on the residual samples;
deriving quantized transform coefficients of the current block by performing quantization on the transform coefficients; and
encoding prediction information related to the prediction samples and residual information related to the quantized transform coefficients to output a bitstream,
wherein the residual information includes a significant coefficient flag associated with whether a quantized transform coefficient is a non-zero significant coefficient and a parity level flag associated with parity of a transform coefficient level of the quantized transform coefficient.
6. The image encoding method of claim 5, wherein the residual information further includes a transform coefficient level flag related to whether a transform coefficient level of the quantized transform coefficient is greater than a predetermined threshold, and
wherein the number of transform coefficient level flags is based on whether a transform is applied to the current block.
7. The image encoding method of claim 6, wherein the transform coefficient level flag includes a first transform coefficient level flag regarding whether a transform coefficient level of the quantized transform coefficient is greater than a first threshold and a second transform coefficient level flag regarding whether the transform coefficient level of the quantized transform coefficient is greater than a second threshold, and
wherein the second transform coefficient level flag is encoded differently based on whether a transform is applied to the current block.
8. The image encoding method of claim 7, wherein the step of deriving quantized transform coefficients of the current block comprises the steps of:
encoding the first transform coefficient level flag and encoding the parity level flag; and
deriving the quantized transform coefficients based on the encoded value of the parity level flag and the encoded value of the first transform coefficient level flag.
9. A non-transitory computer-readable digital storage medium storing a bitstream generated by an image encoding method, the image encoding method comprising the steps of:
deriving a prediction sample of the current block;
deriving residual samples of the current block;
deriving transform coefficients of the current block based on the residual samples;
deriving quantized transform coefficients of the current block by performing quantization on the transform coefficients; and
encoding prediction information related to the prediction samples and residual information related to the quantized transform coefficients to output the bitstream,
wherein the residual information includes a significant coefficient flag associated with whether a quantized transform coefficient is a non-zero significant coefficient and a parity level flag associated with parity of a transform coefficient level of the quantized transform coefficient.
10. The non-transitory computer-readable digital storage medium of claim 9, wherein the residual information further includes a transform coefficient level flag relating to whether a transform coefficient level of the quantized transform coefficient is greater than a predetermined threshold, and
wherein the number of transform coefficient level flags is based on whether a transform is applied to the current block.
CN202410124725.5A 2018-10-05 2019-10-04 Image decoding and encoding method and storage medium Pending CN117979012A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201862741638P 2018-10-05 2018-10-05
US62/741,638 2018-10-05
CN201980065600.7A CN112806017B (en) 2018-10-05 2019-10-04 Method and apparatus for encoding transform coefficients
PCT/KR2019/013043 WO2020071856A1 (en) 2018-10-05 2019-10-04 Method and device for coding transform coefficient

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201980065600.7A Division CN112806017B (en) 2018-10-05 2019-10-04 Method and apparatus for encoding transform coefficients

Publications (1)

Publication Number Publication Date
CN117979012A true CN117979012A (en) 2024-05-03

Family

ID=70055103

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201980065600.7A Active CN112806017B (en) 2018-10-05 2019-10-04 Method and apparatus for encoding transform coefficients
CN202410124725.5A Pending CN117979012A (en) 2018-10-05 2019-10-04 Image decoding and encoding method and storage medium

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201980065600.7A Active CN112806017B (en) 2018-10-05 2019-10-04 Method and apparatus for encoding transform coefficients

Country Status (4)

Country Link
US (3) US11470319B2 (en)
KR (1) KR102533227B1 (en)
CN (2) CN112806017B (en)
WO (1) WO2020071856A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11973983B2 (en) * 2019-10-11 2024-04-30 Qualcomm Incorporated Signaling coding scheme for residual values in transform skip for video coding

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
PT3244612T (en) * 2010-04-13 2018-11-14 Ge Video Compression Llc Coding of significance maps and transform coefficient blocks
US9191670B2 (en) * 2012-01-17 2015-11-17 Qualcomm Incorporated Throughput improvement for CABAC coefficient level coding
CN104380737B (en) 2012-06-22 2018-10-26 夏普株式会社 Arithmetic decoding device
IN2015DN01842A (en) 2012-10-01 2015-05-29 Ericsson Telefon Ab L M
US9451254B2 (en) * 2013-07-19 2016-09-20 Qualcomm Incorporated Disabling intra prediction filtering
US9456210B2 (en) * 2013-10-11 2016-09-27 Blackberry Limited Sign coding for blocks with transform skipped
EP3926955A1 (en) * 2013-12-10 2021-12-22 Canon Kabushiki Kaisha Method and apparatus for encoding or decoding blocks of pixel
US10142642B2 (en) * 2014-06-04 2018-11-27 Qualcomm Incorporated Block adaptive color-space conversion coding
WO2016056977A1 (en) * 2014-10-06 2016-04-14 Telefonaktiebolaget L M Ericsson (Publ) Coding and deriving quantization parameters
WO2017041271A1 (en) * 2015-09-10 2017-03-16 Mediatek Singapore Pte. Ltd. Efficient context modeling for coding a block of data
EP3379832A4 (en) 2015-11-22 2019-04-17 LG Electronics Inc. -1- Method and apparatus for entropy encoding and decoding video signal
US11509890B2 (en) * 2018-07-24 2022-11-22 Hfi Innovation Inc. Methods and apparatus for entropy coding and decoding aspects of video data
US11336918B2 (en) * 2018-09-05 2022-05-17 Qualcomm Incorporated Regular coded bin reduction for coefficient coding
US11516493B2 (en) * 2018-09-11 2022-11-29 Sharp Kabushiki Kaisha Systems and methods for coding transform coefficient level values
KR102631361B1 (en) * 2018-10-11 2024-01-31 엘지전자 주식회사 Conversion coefficient coding method and device
US11451826B2 (en) * 2019-04-15 2022-09-20 Tencent America LLC Lossless coding mode and switchable residual coding

Also Published As

Publication number Publication date
US20230370599A1 (en) 2023-11-16
US11470319B2 (en) 2022-10-11
KR20210041086A (en) 2021-04-14
CN112806017A (en) 2021-05-14
US11750813B2 (en) 2023-09-05
US20220385925A1 (en) 2022-12-01
CN112806017B (en) 2024-03-15
WO2020071856A1 (en) 2020-04-09
US20210400274A1 (en) 2021-12-23
KR20230070528A (en) 2023-05-23
KR102533227B1 (en) 2023-05-17

Similar Documents

Publication Publication Date Title
CN111699694B (en) Image encoding method and apparatus using transform skip flag
KR102459940B1 (en) Residual coding method and device for same
CN111373753B (en) Conversion factor level coding method and device thereof
CN112913246B (en) Method for encoding transform coefficients and apparatus therefor
CN117544790A (en) Transform coefficient coding method and device
CN113508587A (en) Video decoding method using residual information in video coding system and apparatus therefor
AU2020297214B2 (en) Coding of information about transform kernel set
CN114467310A (en) Image decoding method and device for residual data coding in image coding system
CN114402605A (en) Image decoding method using flag of residual encoding method in image encoding system and apparatus therefor
CN114375578A (en) Image decoding method for residual coding and apparatus therefor
US11750813B2 (en) Method and device for coding transform coefficient
CN113812156A (en) Method for coding and decoding video using simplified residual data in video coding system and apparatus therefor
CN113170131A (en) Transform coefficient coding method and apparatus thereof
CN115349258B (en) Image decoding method for residual coding in image coding system and apparatus therefor
KR102661961B1 (en) Method and device for coding transform coefficient
CN115336274B (en) Image decoding method associated with residual coding and apparatus therefor
KR20240064730A (en) Methods for decoding and encoding images and storage media
CN115428460A (en) Image decoding method for residual coding in image coding system and apparatus therefor

Legal Events

Date Code Title Description
PB01 Publication