WO2021235895A1 - Image coding method and apparatus therefor
Image coding method and apparatus therefor
- Publication number
- WO2021235895A1 (application PCT/KR2021/006366)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/31—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/573—Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
Definitions
- This document relates to image coding technology, and more particularly, to a POC-based image coding method and apparatus for a picture in an image coding system.
- Interest in and demand for immersive media such as virtual reality (VR) and artificial reality (AR) content, as well as hologram broadcasting, is on the rise.
- Accordingly, high-efficiency image/video compression technology is required to effectively compress, transmit, store, and reproduce information of high-resolution, high-quality images/videos having the various characteristics described above.
- An object of the present document is to provide a method and an apparatus for increasing image coding efficiency.
- Another technical problem of the present document is to provide a method and apparatus for increasing POC decoding efficiency of a picture.
- Another technical object of the present document is to provide a method and apparatus for increasing inter prediction efficiency by preventing the use of a reference picture that has been removed by a system entity.
- Another technical object of this document is to reduce the occurrence of errors and stabilize the network by limiting the POC difference between the current picture and the reference picture.
- According to an embodiment of the present document, an image decoding method performed by a decoding apparatus is provided.
- The method includes receiving POC information and information on reference pictures from a bitstream; deriving POC values for a current picture and the reference pictures based on the POC information; constructing a reference picture list based on the POC value of the current picture and the POC values of the reference pictures; deriving prediction samples for a current block by performing inter prediction on the current block based on the reference picture list; and generating a reconstructed picture based on the prediction samples, wherein the POC information includes a maximum LSB value of the POC, and the information on the reference pictures includes a non-reference picture flag related to whether a picture is not used as a reference picture.
- The value of the non-reference picture flag of the previous picture used to derive the POC value of the current picture is 0, and the difference between the POC values of the current picture and the previous picture may be less than half of the maximum LSB value of the POC.
- Layer IDs for the current picture and the previous picture may be the same, and a temporal ID derived from identification information of a temporal layer for the previous picture may be 0.
- the previous picture may not be a RASL picture or a RADL picture.
- The POC value of the current picture is derived based on a variable POCMsb and a POC LSB information value for the current picture, and the variable POCMsb may be derived based on a cycle presence flag and a POC MSB cycle value signaled based on the value of the cycle presence flag.
- variable POCMsb for the current picture may be derived based on the variable POCMsb for the previous picture.
- According to another embodiment of the present document, an image encoding method performed by an encoding apparatus includes deriving POC values for a current picture and reference pictures; performing inter prediction on a current block using the reference pictures; and encoding POC information and information on the reference pictures, wherein the POC information includes a maximum LSB value of the POC, and the information on the reference pictures includes a non-reference picture flag related to whether a picture is not used as a reference picture.
- The value of the non-reference picture flag of the previous picture used to derive the POC value of the current picture is 0, and the difference between the POC values of the current picture and the previous picture may be less than half of the maximum LSB value of the POC.
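- For illustration only, the constraint described above can be written as a small conformance check in C; the function and variable names below are hypothetical and are not taken from the VVC specification or the claims.

```c
#include <stdlib.h>

/* Hypothetical conformance check: when the non-reference picture flag of
 * the previous picture used for POC derivation is 0, the POC distance
 * between the current picture and that previous picture must be smaller
 * than half of the maximum POC LSB value. */
int poc_distance_constraint_ok(int pocCurr, int pocPrev,
                               int prevNonRefPicFlag, int maxPicOrderCntLsb)
{
    if (prevNonRefPicFlag != 0)
        return 1; /* the constraint is stated for the flag value 0 */
    return abs(pocCurr - pocPrev) < maxPicOrderCntLsb / 2;
}
```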
- a digital storage medium in which image data including encoded image information and/or a bitstream generated according to an image encoding method performed by an encoding apparatus is stored may be provided.
- According to another embodiment, a digital storage medium in which image data including encoded image information and/or a bitstream causing the decoding apparatus to perform the image decoding method is stored may be provided.
- According to this document, the efficiency of inter prediction can be increased by preventing a reference picture that has been removed by the system from being used.
- the occurrence of errors can be reduced and the network can be stabilized.
- FIG. 1 is a diagram schematically illustrating a configuration of a video/image encoding apparatus to which this document can be applied.
- FIG. 2 is a diagram schematically illustrating a configuration of a video/image decoding apparatus to which this document can be applied.
- FIG. 3 is a diagram illustrating an example of a hierarchical structure for a coded image/video.
- FIG. 4 is a diagram illustrating a temporal layer structure for NAL units in a bitstream supporting temporal scalability.
- FIG. 5 is a diagram for explaining a method of encoding image information performed by an encoding apparatus according to an example of this document.
- FIG. 6 is a diagram for explaining a method of decoding image information performed by a decoding apparatus according to an example of this document.
- FIG. 7 is a diagram for explaining a method of decoding an image according to an example.
- FIG. 8 is a diagram for explaining a method of encoding an image according to an example.
- FIG. 9 schematically shows an example of a video/image coding system to which this document can be applied.
- FIG. 10 exemplarily shows a structure diagram of a content streaming system to which this document is applied.
- each configuration in the drawings described in this document is shown independently for convenience of description regarding different characteristic functions, and does not mean that each configuration is implemented as separate hardware or separate software.
- two or more components among each component may be combined to form one component, or one component may be divided into a plurality of components.
- Embodiments in which each component is integrated and/or separated are also included in the scope of the present document without departing from the essence of this document.
- VVC Versatile Video Coding
- HEVC High Efficiency Video Coding
- EVC essential video coding
- a video may mean a set of a series of images according to the passage of time.
- a picture generally means a unit representing one image in a specific time period, and a slice/tile is a unit constituting a part of a picture in coding.
- a slice/tile may include one or more coding tree units (CTUs).
- CTUs coding tree units
- One picture may consist of one or more slices/tiles.
- One picture may be composed of one or more tile groups.
- One tile group may include one or more tiles.
- a pixel or pel may mean a minimum unit constituting one picture (or image). Also, as a term corresponding to a pixel, a 'sample' may be used.
- the sample may generally represent a pixel or a value of a pixel, may represent only a pixel/pixel value of a luma component, or may represent only a pixel/pixel value of a chroma component.
- the sample may mean a pixel value in the spatial domain, or when the pixel value is transformed into the frequency domain, it may mean a transform coefficient in the frequency domain.
- a unit may represent a basic unit of image processing.
- the unit may include at least one of a specific region of a picture and information related to the region.
- One unit may include one luma block and two chroma (e.g., cb, cr) blocks.
- a unit may be used interchangeably with terms such as a block or an area in some cases.
- an MxN block may include samples (or sample arrays) or a set (or arrays) of transform coefficients including M columns and N rows.
- In this document, “A/B” may mean “A and/or B.” Further, “A, B” may mean “A and/or B.” Further, “A/B/C” may mean “at least one of A, B, and/or C.”
- The term “or” in this document should be interpreted as “and/or.” For example, the expression “A or B” may mean 1) only “A”, 2) only “B”, or 3) “A and B.” In other words, “or” in this document may mean “additionally or alternatively.”
- In this document, “at least one of A and B” may mean “only A”, “only B”, or “both A and B.” Further, the expression “at least one of A or B” or “at least one of A and/or B” may be interpreted the same as “at least one of A and B.”
- Further, “at least one of A, B and C” may mean “only A”, “only B”, “only C”, or “any combination of A, B and C.” Further, “at least one of A, B or C” or “at least one of A, B and/or C” may mean “at least one of A, B and C.”
- parentheses used herein may mean “for example”. Specifically, when “prediction (intra prediction)” is indicated, “intra prediction” may be proposed as an example of “prediction”. In other words, “prediction” in the present specification is not limited to “intra prediction”, and “intra prediction” may be proposed as an example of “prediction”. Also, even when “prediction (ie, intra prediction)” is indicated, “intra prediction” may be proposed as an example of “prediction”.
- a video encoding apparatus may include an image encoding apparatus.
- The encoding apparatus 100 may be configured to include an image partitioner 110, a predictor 120, a residual processor 130, an entropy encoder 140, an adder 150, a filter 160, and a memory 170.
- the prediction unit 120 may include an inter prediction unit 121 and an intra prediction unit 122 .
- the residual processing unit 130 may include a transformer 132 , a quantizer 133 , an inverse quantizer 134 , and an inverse transformer 135 .
- the residual processing unit 130 may further include a subtractor 131 .
- the adder 150 may be referred to as a reconstructor or a reconstructed block generator.
- The above-described image partitioner 110, prediction unit 120, residual processing unit 130, entropy encoding unit 140, adder 150, and filtering unit 160 may be configured by one or more hardware components (for example, an encoder chipset or processor) according to an embodiment.
- the memory 170 may include a decoded picture buffer (DPB), and may be configured by a digital storage medium.
- the hardware component may further include a memory 170 as an internal/external component.
- The image partitioner 110 may divide an input image (or a picture, or a frame) input to the encoding apparatus 100 into one or more processing units.
- the processing unit may be referred to as a coding unit (CU).
- The coding unit may be recursively divided according to a quad-tree binary-tree ternary-tree (QTBTTT) structure from a coding tree unit (CTU) or largest coding unit (LCU).
- QTBTTT quad-tree binary-tree ternary-tree
- CTU coding tree unit
- LCU largest coding unit
- one coding unit may be divided into a plurality of coding units having a lower depth based on a quad tree structure, a binary tree structure, and/or a ternary structure.
- a quad tree structure may be applied first and a binary tree structure and/or a ternary structure may be applied later.
- Alternatively, the binary tree structure may be applied first.
- a coding procedure according to this document may be performed based on the final coding unit that is no longer divided.
- The maximum coding unit may be directly used as the final coding unit based on coding efficiency according to image characteristics, or, if necessary, the coding unit may be recursively divided into coding units of deeper depth, so that a coding unit of an optimal size may be used as the final coding unit.
- The coding procedure may include procedures such as prediction, transform, and reconstruction, which will be described later.
- the processing unit may further include a prediction unit (PU) or a transform unit (TU).
- the prediction unit and the transform unit may be divided or partitioned from the above-described final coding unit, respectively.
- the prediction unit may be a unit of sample prediction
- the transform unit may be a unit for deriving a transform coefficient and/or a unit for deriving a residual signal from the transform coefficient.
- a unit may be used interchangeably with terms such as a block or an area in some cases.
- an MxN block may represent a set of samples or transform coefficients including M columns and N rows.
- a sample may generally represent a pixel or a value of a pixel, may represent only a pixel/pixel value of a luma component, or may represent only a pixel/pixel value of a chroma component.
- A sample may be used as a term corresponding to a pixel or a pel of one picture (or image).
- The encoding apparatus 100 may generate a residual signal (residual block, residual sample array) by subtracting the prediction signal (predicted block, prediction sample array) output from the inter prediction unit 121 or the intra prediction unit 122 from the input image signal (original block, original sample array), and the generated residual signal is transmitted to the transformer 132.
- a unit for subtracting a prediction signal (prediction block, prediction sample array) from an input image signal (original block, original sample array) in the encoding apparatus 100 may be referred to as a subtraction unit 131 .
- the prediction unit may perform prediction on a processing target block (hereinafter, referred to as a current block) and generate a predicted block including prediction samples for the current block.
- the prediction unit may determine whether intra prediction or inter prediction is applied on a current block or CU basis.
- the prediction unit may generate various information about prediction, such as prediction mode information, and transmit it to the entropy encoding unit 140 , as will be described later in the description of each prediction mode.
- the prediction information may be encoded by the entropy encoding unit 140 and output in the form of a bitstream.
- the intra prediction unit 122 may predict the current block with reference to samples in the current picture.
- The referenced samples may be located in the neighborhood of the current block or may be located apart from the current block according to the prediction mode.
- prediction modes may include a plurality of non-directional modes and a plurality of directional modes.
- the non-directional mode may include, for example, a DC mode and a planar mode (Planar mode).
- the directional mode may include, for example, 33 directional prediction modes or 65 directional prediction modes according to the granularity of the prediction direction. However, this is an example, and a higher or lower number of directional prediction modes may be used according to a setting.
- the intra prediction unit 122 may determine the prediction mode applied to the current block by using the prediction mode applied to the neighboring block.
- the inter prediction unit 121 may derive the predicted block for the current block based on the reference block (reference sample array) specified by the motion vector on the reference picture.
- motion information may be predicted in units of blocks, subblocks, or samples based on the correlation between motion information between neighboring blocks and the current block.
- the motion information may include a motion vector and a reference picture index.
- the motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.
- the neighboring blocks may include spatial neighboring blocks existing in the current picture and temporal neighboring blocks present in the reference picture.
- the reference picture including the reference block and the reference picture including the temporal neighboring block may be the same or different.
- the temporal neighboring block may be called a collocated reference block, a collocated CU (colCU), etc.
- a reference picture including the temporally neighboring block may be called a collocated picture (colPic).
- The inter prediction unit 121 may construct a motion information candidate list based on neighboring blocks, and generate information indicating which candidate is used to derive the motion vector and/or reference picture index of the current block. Inter prediction may be performed based on various prediction modes. For example, in skip mode and merge mode, the inter prediction unit 121 may use motion information of a neighboring block as motion information of the current block. In skip mode, unlike merge mode, a residual signal may not be transmitted. In motion vector prediction (MVP) mode, the motion vector of a neighboring block is used as a motion vector predictor, and the motion vector of the current block may be indicated by signaling a motion vector difference.
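- As a rough illustration of the MVP-mode relation just described (the names below are ours, not the patent's), the motion vector is recovered as the predictor plus the signaled difference:

```c
/* Illustrative only: motion vector reconstruction in MVP mode,
 * mv = mvp (taken from a neighboring block) + mvd (signaled). */
typedef struct { int x, y; } MotionVector;

static MotionVector reconstruct_mv(MotionVector mvp, MotionVector mvd)
{
    MotionVector mv = { mvp.x + mvd.x, mvp.y + mvd.y };
    return mv;
}
```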
- the prediction unit 120 may generate a prediction signal based on various prediction methods to be described later.
- the prediction unit may apply intra prediction or inter prediction for prediction of one block, and may simultaneously apply intra prediction and inter prediction. This can be called combined inter and intra prediction (CIIP).
- the prediction unit may be based on an intra block copy (IBC) prediction mode or based on a palette mode for prediction of a block.
- IBC prediction mode or the palette mode may be used for video/video coding of content such as games, for example, screen content coding (SCC).
- SCC screen content coding
- IBC basically performs prediction within the current picture, but may be performed similarly to inter prediction in that a reference block is derived within the current picture. That is, IBC may use at least one of the inter prediction techniques described in this document.
- the palette mode may be viewed as an example of intra coding or intra prediction. When the palette mode is applied, the sample value in the picture may be signaled based on information about the palette table and palette index.
- the prediction signal generated by the prediction unit may be used to generate a reconstructed signal or may be used to generate a residual signal.
- the transform unit 132 may generate transform coefficients by applying a transform technique to the residual signal.
- the transformation method may include at least one of Discrete Cosine Transform (DCT), Discrete Sine Transform (DST), Karhunen-Loeve Transform (KLT), Graph-Based Transform (GBT), or Conditionally Non-linear Transform (CNT).
- DCT Discrete Cosine Transform
- DST Discrete Sine Transform
- KLT Karhunen-Loeve Transform
- GBT Graph-Based Transform
- CNT Conditionally Non-linear Transform
- GBT refers to a transform obtained from a graph when relationship information between pixels is expressed as the graph.
- CNT refers to a transform obtained by generating a prediction signal using all previously reconstructed pixels and performing the transform based on the prediction signal.
- The transform process may be applied to square pixel blocks having the same size, or may be applied to blocks of variable size that are not square.
- The quantization unit 133 quantizes the transform coefficients and transmits them to the entropy encoding unit 140, and the entropy encoding unit 140 may encode the quantized signal (information about the quantized transform coefficients) and output it as a bitstream.
- Information about the quantized transform coefficients may be referred to as residual information.
- The quantization unit 133 may rearrange the quantized transform coefficients in block form into a one-dimensional vector form based on a coefficient scan order, and may generate information about the quantized transform coefficients based on the quantized transform coefficients in the one-dimensional vector form.
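- The reordering step can be pictured with the following C sketch, assuming a precomputed scan table that maps scan position to raster position; the names are illustrative.

```c
/* Illustrative sketch: flatten an MxN block of quantized transform
 * coefficients into a 1-D vector following a coefficient scan order.
 * scan_order[i] gives the raster index of the i-th scanned coefficient. */
static void flatten_by_scan(const int *block, const int *scan_order,
                            int num_coeffs, int *out)
{
    for (int i = 0; i < num_coeffs; i++)
        out[i] = block[scan_order[i]];
}
```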
- the entropy encoding unit 140 may perform various encoding methods such as, for example, exponential Golomb, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC).
- the entropy encoding unit 140 may encode information necessary for video/image reconstruction (eg, values of syntax elements, etc.) other than the quantized transform coefficients together or separately.
- Encoded information (e.g., encoded video/image information) may be transmitted or stored in units of network abstraction layer (NAL) units in the form of a bitstream.
- the video/image information may further include information about various parameter sets, such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
- APS adaptation parameter set
- PPS picture parameter set
- SPS sequence parameter set
- VPS video parameter set
- the video/image information may further include general constraint information.
- information and/or syntax elements transmitted/signaled from the encoding device to the decoding device may be included in video/image information.
- the video/image information may be encoded through the above-described encoding procedure and included in the bitstream.
- the bitstream may be transmitted over a network or may be stored in a digital storage medium.
- the network may include a broadcasting network and/or a communication network
- the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, and SSD.
- A transmitting unit (not shown) that transmits the signal output from the entropy encoding unit 140 and/or a storage unit (not shown) that stores the signal may be configured as internal/external elements of the encoding apparatus 100, or the transmitting unit may be included in the entropy encoding unit 140.
- The quantized transform coefficients output from the quantization unit 133 may be used to generate a prediction signal. For example, the residual signal (residual block or residual samples) may be reconstructed by applying inverse quantization and inverse transform to the quantized transform coefficients through the inverse quantizer 134 and the inverse transformer 135.
- The adder 150 may generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) by adding the reconstructed residual signal to the prediction signal output from the inter prediction unit 121 or the intra prediction unit 122.
- the predicted block may be used as a reconstructed block.
- the adder 150 may be referred to as a restoration unit or a restoration block generator.
- The generated reconstructed signal may be used for intra prediction of the next block to be processed in the current picture, or may be used for inter prediction of the next picture after filtering as described below.
- Meanwhile, luma mapping with chroma scaling (LMCS) may be applied in the picture encoding and/or reconstruction process.
- the filtering unit 160 may improve subjective/objective image quality by applying filtering to the reconstructed signal.
- the filtering unit 160 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and store the modified reconstructed picture in the memory 170 , specifically, the DPB of the memory 170 .
- the various filtering methods may include, for example, deblocking filtering, a sample adaptive offset, an adaptive loop filter, a bilateral filter, and the like.
- the filtering unit 160 may generate various types of filtering-related information and transmit it to the entropy encoding unit 140 , as will be described later in the description of each filtering method.
- the filtering-related information may be encoded by the entropy encoding unit 140 and output in the form of a bitstream.
- the modified reconstructed picture transmitted to the memory 170 may be used as a reference picture in the inter prediction unit 121 .
- When inter prediction is applied through this, the encoding apparatus can avoid a prediction mismatch between the encoding apparatus 100 and the decoding apparatus, and can also improve encoding efficiency.
- The DPB of the memory 170 may store the modified reconstructed picture to be used as a reference picture in the inter prediction unit 121.
- the memory 170 may store motion information of a block in which motion information in the current picture is derived (or encoded) and/or motion information of blocks in an already reconstructed picture.
- the stored motion information may be transmitted to the inter prediction unit 121 to be used as motion information of a spatial neighboring block or motion information of a temporal neighboring block.
- the memory 170 may store reconstructed samples of blocks reconstructed in the current picture, and may transmit the reconstructed samples to the intra prediction unit 122 .
- FIG. 2 is a diagram schematically illustrating a configuration of a video/image decoding apparatus that can be applied to embodiments of the present document.
- The decoding apparatus 200 may be configured to include an entropy decoder 210, a residual processor 220, a predictor 230, an adder 240, a filter 250, and a memory 260.
- the prediction unit 230 may include an inter prediction unit 231 and an intra prediction unit 232 .
- The residual processing unit 220 may include a dequantizer 221 and an inverse transformer 222.
- The entropy decoding unit 210, the residual processing unit 220, the prediction unit 230, the adder 240, and the filtering unit 250 may be configured by one hardware component (e.g., a decoder chipset or a processor) according to an embodiment.
- the memory 260 may include a decoded picture buffer (DPB), and may be configured by a digital storage medium.
- the hardware component may further include a memory 260 as an internal/external component.
- the decoding apparatus 200 may reconstruct an image corresponding to a process in which the video/image information is processed in the encoding apparatus of FIG. 1 .
- the decoding apparatus 200 may derive units/blocks based on block division related information obtained from the bitstream.
- the decoding apparatus 200 may perform decoding using a processing unit applied in the encoding apparatus.
- the processing unit of decoding may be, for example, a coding unit, and the coding unit may be divided according to a quad tree structure, a binary tree structure and/or a ternary tree structure from a coding tree unit or a largest coding unit.
- One or more transform units may be derived from a coding unit.
- the reconstructed image signal decoded and output through the decoding device 200 may be reproduced through the playback device.
- the decoding apparatus 200 may receive a signal output from the encoding apparatus of FIG. 1 in the form of a bitstream, and the received signal may be decoded through the entropy decoding unit 210 .
- the entropy decoding unit 210 may parse the bitstream to derive information (eg, video/image information) required for image restoration (or picture restoration).
- the video/image information may further include information about various parameter sets, such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
- the video/image information may further include general constraint information.
- the decoding apparatus may decode the picture further based on the information on the parameter set and/or the general restriction information.
- Signaled/received information and/or syntax elements described later in this document may be decoded through the decoding procedure and obtained from the bitstream.
- The entropy decoding unit 210 decodes information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and may output values of syntax elements required for image reconstruction and quantized values of transform coefficients related to residuals.
- The CABAC entropy decoding method receives a bin corresponding to each syntax element in the bitstream, determines a context model using the syntax element information to be decoded, decoding information of neighboring and decoding target blocks, or symbol/bin information decoded in a previous step, predicts the probability of occurrence of a bin according to the determined context model, and performs arithmetic decoding of the bin to generate a symbol corresponding to the value of each syntax element.
- After determining the context model, the CABAC entropy decoding method may update the context model using the decoded symbol/bin information for the context model of the next symbol/bin.
- Prediction-related information among the information decoded by the entropy decoding unit 210 is provided to the prediction unit (the inter prediction unit 232 and the intra prediction unit 231), and residual values on which entropy decoding has been performed by the entropy decoding unit 210, that is, quantized transform coefficients and related parameter information, may be input to the residual processing unit 220.
- the residual processing unit 220 may derive a residual signal (residual block, residual samples, residual sample array). Also, information about filtering among the information decoded by the entropy decoding unit 210 may be provided to the filtering unit 250 .
- a receiving unit (not shown) for receiving the signal output from the encoding apparatus may be further configured as an internal/external element of the decoding apparatus 200 , or the receiving unit may be a component of the entropy decoding unit 210 .
- The decoding apparatus may be called a video/image/picture decoding apparatus, and the decoding apparatus may be divided into an information decoder (video/image/picture information decoder) and a sample decoder (video/image/picture sample decoder).
- The information decoder may include the entropy decoding unit 210, and the sample decoder may include the inverse quantization unit 221, the inverse transform unit 222, the adder 240, the filtering unit 250, the memory 260, the inter prediction unit 232, and the intra prediction unit 231.
- the inverse quantizer 221 may inverse quantize the quantized transform coefficients to output transform coefficients.
- the inverse quantizer 221 may rearrange the quantized transform coefficients in a two-dimensional block form. In this case, the rearrangement may be performed based on the coefficient scan order performed by the encoding device.
- the inverse quantizer 221 may perform inverse quantization on the quantized transform coefficients using a quantization parameter (eg, quantization step size information) and obtain transform coefficients.
- a quantization parameter eg, quantization step size information
- the inverse transform unit 222 inversely transforms the transform coefficients to obtain a residual signal (residual block, residual sample array).
- the prediction unit may perform prediction on the current block and generate a predicted block including prediction samples for the current block.
- the prediction unit may determine whether intra prediction or inter prediction is applied to the current block based on the prediction information output from the entropy decoding unit 210 , and may determine a specific intra/inter prediction mode.
- the prediction unit 220 may generate a prediction signal based on various prediction methods to be described later.
- the prediction unit may apply intra prediction or inter prediction for prediction of one block, and may simultaneously apply intra prediction and inter prediction. This can be called combined inter and intra prediction (CIIP).
- the prediction unit may be based on an intra block copy (IBC) prediction mode or based on a palette mode for prediction of a block.
- IBC prediction mode or the palette mode may be used for video/video coding of content such as games, for example, screen content coding (SCC).
- SCC screen content coding
- IBC basically performs prediction within the current picture, but may be performed similarly to inter prediction in that a reference block is derived within the current picture. That is, IBC may use at least one of the inter prediction techniques described in this document.
- the palette mode may be viewed as an example of intra coding or intra prediction. When the palette mode is applied, information about the palette table and the palette index may be included in the video/image information and signaled.
- the intra prediction unit 231 may predict the current block with reference to samples in the current picture.
- The referenced samples may be located in the neighborhood of the current block or may be located apart from the current block according to the prediction mode.
- prediction modes may include a plurality of non-directional modes and a plurality of directional modes.
- the intra prediction unit 231 may determine the prediction mode applied to the current block by using the prediction mode applied to the neighboring block.
- the inter prediction unit 232 may derive the predicted block for the current block based on the reference block (reference sample array) specified by the motion vector on the reference picture.
- motion information may be predicted in units of blocks, subblocks, or samples based on the correlation between motion information between neighboring blocks and the current block.
- the motion information may include a motion vector and a reference picture index.
- the motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.
- the neighboring blocks may include spatial neighboring blocks existing in the current picture and temporal neighboring blocks present in the reference picture.
- the inter prediction unit 232 may construct a motion information candidate list based on neighboring blocks, and derive a motion vector and/or a reference picture index of the current block based on the received candidate selection information. Inter prediction may be performed based on various prediction modes, and the prediction information may include information indicating a mode of inter prediction for the current block.
- The adder 240 may generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) by adding the obtained residual signal to the prediction signal (predicted block, prediction sample array) output from the prediction unit (including the inter prediction unit 232 and/or the intra prediction unit 231).
- the predicted block may be used as a reconstructed block.
- the adder 240 may be referred to as a restoration unit or a restoration block generator.
- The generated reconstructed signal may be used for intra prediction of the next block to be processed in the current picture, may be output through filtering as described below, or may be used for inter prediction of the next picture.
- Meanwhile, luma mapping with chroma scaling (LMCS) may be applied in the picture decoding process.
- the filtering unit 250 may improve subjective/objective image quality by applying filtering to the reconstructed signal.
- the filtering unit 250 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and store the modified reconstructed picture in the memory 260 , specifically, the DPB of the memory 260 .
- the various filtering methods may include, for example, deblocking filtering, a sample adaptive offset, an adaptive loop filter, a bilateral filter, and the like.
- the (modified) reconstructed picture stored in the DPB of the memory 260 may be used as a reference picture in the inter prediction unit 232 .
- the memory 260 may store motion information of a block from which motion information in the current picture is derived (or decoded) and/or motion information of blocks in an already reconstructed picture.
- the stored motion information may be transmitted to the inter prediction unit 232 to be used as motion information of a spatial neighboring block or motion information of a temporal neighboring block.
- the memory 260 may store reconstructed samples of blocks reconstructed in the current picture, and may transmit the reconstructed samples to the intra prediction unit 231 .
- The embodiments described for the filtering unit 160, the inter prediction unit 121, and the intra prediction unit 122 of the encoding apparatus 100 may be applied equally or correspondingly to the filtering unit 250, the inter prediction unit 232, and the intra prediction unit 231 of the decoding apparatus 200, respectively.
- the predicted block includes prediction samples in the spatial domain (or pixel domain).
- The predicted block is derived identically in the encoding apparatus and the decoding apparatus, and the encoding apparatus can increase image coding efficiency by signaling, to the decoding apparatus, information about the residual between the original block and the predicted block (residual information), rather than the original sample values of the original block itself.
- The decoding apparatus may derive a residual block including residual samples based on the residual information, generate a reconstructed block including reconstructed samples by combining the residual block and the predicted block, and generate a reconstructed picture including the reconstructed blocks.
- the residual information may be generated through transformation and quantization procedures.
- The encoding apparatus derives a residual block between the original block and the predicted block, performs a transform procedure on the residual samples (residual sample array) included in the residual block to derive transform coefficients, performs a quantization procedure on the transform coefficients to derive quantized transform coefficients, and may signal the related residual information to the decoding apparatus (through a bitstream).
- the residual information may include information such as value information of quantized transform coefficients, location information, a transform technique, a transform kernel, and a quantization parameter.
- the decoding apparatus may perform an inverse quantization/inverse transformation procedure based on the residual information and derive residual samples (or residual blocks).
- the decoding apparatus may generate a reconstructed picture based on the predicted block and the residual block.
- the encoding apparatus may also inverse quantize/inverse transform the quantized transform coefficients for reference for inter prediction of a later picture to derive a residual block, and generate a reconstructed picture based thereon.
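- The core relation in the preceding paragraphs, reconstruction = prediction + residual, can be sketched as follows; sample-range clipping is omitted and the names are illustrative.

```c
/* Illustrative sketch: a reconstructed block is obtained by adding
 * residual samples to prediction samples (clipping omitted for brevity). */
static void reconstruct_block(const int *pred, const int *resid,
                              int num_samples, int *recon)
{
    for (int i = 0; i < num_samples; i++)
        recon[i] = pred[i] + resid[i];
}
```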
- In the VVC system, there is a signaling mechanism that allows a system-level entity to know whether a picture is not used as a reference picture for any other picture.
- This information allows system-level entities to remove pictures in certain specific situations. That is, the system-level entity may remove a picture marked not to be used as a reference for another picture. For example, when network congestion occurs, a media-aware network router may drop network packets carrying coded bits of pictures that are marked not to be used as references to other pictures.
- Table 1 below shows flag information for the above contents.
- To derive the variable PicOrderCntVal, image information signaled in the high-level syntax is required. Specifically, it is as follows.
- nuh_layer_id is signaled in the NAL unit header, and is an identifier for identifying a layer to which a VCL NAL unit belongs or a layer to which a non-VCL NAL unit is applied.
- The coded image/video is divided into a video coding layer (VCL) that handles the decoding process of the image/video itself, a subsystem that transmits and stores the coded information, and a network abstraction layer (NAL) that exists between the VCL and the subsystem and is responsible for network adaptation functions.
- VCL video coding layer
- NAL network abstraction layer
- In the VCL, VCL data including compressed image data (slice data) may be generated, or a parameter set including information such as a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS), or a supplemental enhancement information (SEI) message additionally necessary for the image decoding process may be generated.
- PPS picture parameter set
- SPS sequence parameter set
- SEI Supplemental Enhancement Information
- a NAL unit may be generated by adding header information (NAL unit header) to a raw byte sequence payload (RBSP) generated in the VCL.
- the RBSP refers to slice data, parameter sets, SEI messages, etc. generated in the VCL.
- the NAL unit header may include NAL unit type information specified according to RBSP data included in the corresponding NAL unit.
- the NAL unit may be divided into a VCL NAL unit and a Non-VCL NAL unit according to the RBSP generated in the VCL.
- a VCL NAL unit may mean a NAL unit including information (slice data) about an image
- the Non-VCL NAL unit is a NAL unit containing information (parameter set or SEI message) necessary for decoding an image.
- VCL NAL unit and Non-VCL NAL unit may be transmitted through a network by attaching header information according to a data standard of a subsystem.
- the NAL unit may be transformed into a data form of a predetermined standard such as H.266/VVC file format, Real-time Transport Protocol (RTP), Transport Stream (TS), and transmitted through various networks.
- RTP Real-time Transport Protocol
- TS Transport Stream
- the NAL unit type may be specified according to the RBSP data structure included in the corresponding NAL unit, and information on the NAL unit type may be stored and signaled in the NAL unit header.
- the NAL unit may be largely classified into a VCL NAL unit type and a Non-VCL NAL unit type according to whether or not the NAL unit includes image information (slice data).
- the VCL NAL unit type may be classified according to properties and types of pictures included in the VCL NAL unit, and the Non-VCL NAL unit type may be classified according to the type of a parameter set.
- NAL unit types have syntax information for the NAL unit type, and the syntax information may be stored and signaled in a NAL unit header.
- the syntax information may be nal_unit_type, and NAL unit types may be specified by a nal_unit_type value.
- vps_independent_layer_flag[i] is flag information signaled in the video parameter set; if its value is 1, it indicates that the layer with index i does not use inter-layer prediction, and if its value is 0, it indicates that the layer with index i may use inter-layer prediction.
- sps_log2_max_pic_order_cnt_lsb_minus4 is a syntax element signaled in the sequence parameter set and indicates the value of the variable MaxPicOrderCntLsb used in the POC decoding process.
- The variable MaxPicOrderCntLsb may be derived as 2^(sps_log2_max_pic_order_cnt_lsb_minus4 + 4).
- a value obtained by adding 1 to sps_poc_msb_cycle_len_minus1 indicates the bit length of the ph_poc_msb_cycle_val syntax element.
- ph_pic_order_cnt_lsb represents the value of the POC of the current picture modulo the variable MaxPicOrderCntLsb, and the length of ph_pic_order_cnt_lsb is sps_log2_max_pic_order_cnt_lsb_minus4 + 4 bits.
- the value of ph_pic_order_cnt_lsb exists in the range of 0 to (MaxPicOrderCntLsb - 1).
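- These two derivations can be summarized in a short C sketch, following the relations stated above (illustrative only):

```c
/* MaxPicOrderCntLsb = 2^(sps_log2_max_pic_order_cnt_lsb_minus4 + 4);
 * ph_pic_order_cnt_lsb is coded with that many bits, so it lies in
 * the range [0, MaxPicOrderCntLsb - 1]. */
static int max_pic_order_cnt_lsb(int sps_log2_max_pic_order_cnt_lsb_minus4)
{
    return 1 << (sps_log2_max_pic_order_cnt_lsb_minus4 + 4);
}
/* e.g., sps_log2_max_pic_order_cnt_lsb_minus4 == 4 gives
 * MaxPicOrderCntLsb == 256 and ph_pic_order_cnt_lsb in [0, 255]. */
```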
- ph_poc_msb_cycle_present_flag is flag information indicating whether a ph_poc_msb_cycle_val syntax element exists in the picture header. If the ph_poc_msb_cycle_present_flag value is 1, it indicates that the ph_poc_msb_cycle_val syntax element is present in the picture header. If the ph_poc_msb_cycle_present_flag value is 0, it indicates that the ph_poc_msb_cycle_val syntax element does not exist in the picture header.
- When vps_independent_layer_flag[GeneralLayerIdx[nuh_layer_id]] is 0 and a picture exists in the current AU in a reference layer of the current layer, the value of ph_poc_msb_cycle_present_flag shall be 0.
- ph_poc_msb_cycle_val indicates a POC MSB cycle value for the current picture.
- the length of the ph_poc_msb_cycle_val syntax element is sps_poc_msb_cycle_len_minus1 +1 bits.
- When vps_independent_layer_flag[GeneralLayerIdx[nuh_layer_id]] is 0 and there is a picture A included in the current AU in a reference layer of the current layer, the variable PicOrderCntVal is derived to be equal to the PicOrderCntVal of picture A, and all VCL NAL units in the current AU must have the same ph_pic_order_cnt_lsb value.
- variable PicOrderCntVal of the current picture may be derived as follows.
- First, the variable prevPicOrderCntLsb and the variable prevPicOrderCntMsb are derived as follows. When the previous picture whose nuh_layer_id is the same as that of the current picture, whose TemporalId is 0, and which is not a random access skipped leading (RASL) picture or a random access decodable leading (RADL) picture is set to prevTid0Pic, the variable prevPicOrderCntLsb is set equal to the ph_pic_order_cnt_lsb of prevTid0Pic, and the variable prevPicOrderCntMsb is set equal to the PicOrderCntMsb of prevTid0Pic.
- TemporalId means a variable derived based on identification information on a temporal layer in a bitstream (or temporal scalable bitstream) supporting temporal scalability.
- a bitstream (or temporal scalable bitstream) supporting temporal scalability includes information on a temporally scaled temporal layer.
- the information on the temporal layer may be identification information of the temporal layer specified according to the temporal scalability of the NAL unit.
- temporal_id syntax information may be used for temporal layer identification information, and the temporal_id syntax information may be stored in a NAL unit header in an encoding device and signaled to a decoding device.
- a temporal layer may be referred to as a sub-layer, a temporal sub-layer, or a temporal scalable layer.
- FIG. 4 is a diagram illustrating a temporal layer structure for NAL units in a bitstream supporting temporal scalability.
- NAL units included in the bitstream have temporal layer identification information (eg, temporal_id).
- for example, a temporal layer composed of NAL units having a temporal_id value of 0 may provide the lowest temporal scalability, and a temporal layer composed of NAL units having a temporal_id value of 2 may provide the highest temporal scalability.
- in FIG. 4, a box marked with I refers to an I picture, and a box marked with B refers to a B picture. Arrows indicate reference relationships, that is, whether a picture refers to another picture.
- NAL units of a temporal layer having a temporal_id value of 0 are reference pictures that can be referenced by NAL units of a temporal layer having a temporal_id value of 0, 1, or 2.
- NAL units of a temporal layer having a temporal_id value of 1 are reference pictures that can be referenced by NAL units of a temporal layer having a temporal_id value of 1 or 2.
- NAL units of a temporal layer having a temporal_id value of 2 may be reference pictures that NAL units of the same temporal layer, that is, a temporal layer having a temporal_id value of 2, can refer to, or may be non-reference pictures that are not referenced by other pictures.
- when NAL units of the temporal layer having a temporal_id value of 2, that is, the highest temporal layer, are non-reference pictures, these NAL units can be extracted (or removed) from the bitstream without affecting other pictures in the decoding process.
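- a minimal sketch of such sub-bitstream extraction by temporal_id, assuming a hypothetical NalUnit container rather than a real bitstream parser:

```python
from dataclasses import dataclass

@dataclass
class NalUnit:
    temporal_id: int   # temporal layer identification information
    payload: bytes

def extract_sub_bitstream(nal_units: list[NalUnit], target_tid: int) -> list[NalUnit]:
    # Drop NAL units above the target temporal layer; NAL units of the highest
    # layer are non-reference pictures here, so decoding of the rest is unaffected.
    return [nu for nu in nal_units if nu.temporal_id <= target_tid]
```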
- variable PicOrderCntMsb for the current picture is derived as follows.
- if ph_poc_msb_cycle_val is present, the variable PicOrderCntMsb becomes the value obtained by multiplying ph_poc_msb_cycle_val by MaxPicOrderCntLsb (ph_poc_msb_cycle_val * MaxPicOrderCntLsb).
- otherwise, the variable PicOrderCntMsb becomes 0 if the current picture is a CVSS picture.
- otherwise, the variable PicOrderCntMsb may be derived based on the following equation.
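- the equation itself is not reproduced in this text; the sketch below reconstructs the well-known POC MSB derivation of the VVC specification, on which this passage is based (the lower-case names are illustrative):

```python
def derive_poc_msb(poc_lsb: int, prev_poc_lsb: int,
                   prev_poc_msb: int, max_poc_lsb: int) -> int:
    # prev_poc_lsb / prev_poc_msb are taken from prevTid0Pic as described above.
    if poc_lsb < prev_poc_lsb and (prev_poc_lsb - poc_lsb) >= max_poc_lsb // 2:
        return prev_poc_msb + max_poc_lsb   # LSB wrapped around upward
    if poc_lsb > prev_poc_lsb and (poc_lsb - prev_poc_lsb) > max_poc_lsb // 2:
        return prev_poc_msb - max_poc_lsb   # LSB wrapped around downward
    return prev_poc_msb
```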
- the variable PicOrderCntVal, which is the POC value for the current picture, is derived as the sum of the previously derived variable PicOrderCntMsb and ph_pic_order_cnt_lsb.
- for all CVSS pictures for which the ph_poc_msb_cycle_val value is not present, PicOrderCntMsb is 0, and thus PicOrderCntVal is equal to ph_pic_order_cnt_lsb.
- the PicOrderCntVal value shall be in the range of -2^31 to 2^31 - 1, inclusive, and two coded pictures with the same nuh_layer_id in one CVS cannot have the same PicOrderCntVal value.
- the inference of PicOrderCntMsb may vary depending on the POC value of the picture designated as prevTid0Pic. Since prevTid0Pic must be the same picture in the encoding and decoding processes, its POC value must also be the same.
- in selecting prevTid0Pic, whether the value of ph_non_ref_pic_flag is 1 is not considered; thus, prevTid0Pic may be a picture that can be removed from the bitstream by a system entity.
- if that picture has been removed by the system, the decoding apparatus may unknowingly use another picture as prevTid0Pic for POC decoding of the current picture. As a result, the decoding apparatus may derive an incorrect POC value.
- a picture selected as prevTid0Pic may be limited to a picture other than a picture having a ph_non_ref_pic_flag of 1.
- ph_non_ref_pic_flag of a picture having a TemporalId of 0 may be limited so that it does not become 1.
- alternatively, when the CLVS has one or more temporal sub-layers, any picture in the base temporal sub-layer (i.e., a picture whose TemporalId is 0) may be restricted so that it cannot have 1 as the value of ph_non_ref_pic_flag.
- alternatively, when the CLVS includes a picture other than an all-intra picture, a picture having a TemporalId of 0 may be restricted so that it cannot have 1 as the value of ph_non_ref_pic_flag.
- alternatively, when the CLVS includes a picture that is not an all-intra picture and there are one or more temporal sub-layers in the CLVS, a picture with a TemporalId of 0 may be limited so that it cannot have 1 as the value of ph_non_ref_pic_flag.
- the absolute POC difference between pictures may be limited not to be greater than half the value of MaxPicOrderCntLsb.
- as another alternative, a non-reference picture having a TemporalId of 0 is not used in the POC decoding process, and additionally, the difference in POC values between two consecutive pictures whose TemporalId is 0 and whose ph_non_ref_pic_flag is 0 may be limited not to be greater than half of MaxPicOrderCntLsb.
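- a minimal sketch of checking this proposed constraint, assuming a hypothetical Pic container and pictures given in decoding order:

```python
from dataclasses import dataclass

@dataclass
class Pic:   # hypothetical container for the fields used below
    poc: int
    temporal_id: int
    ph_non_ref_pic_flag: int

def poc_gap_constraint_ok(pictures: list[Pic], max_poc_lsb: int) -> bool:
    # Consecutive pictures with TemporalId == 0 and ph_non_ref_pic_flag == 0
    # are the candidates for prevTid0Pic; limiting their POC gap keeps the
    # POC MSB inference unambiguous.
    anchors = [p for p in pictures
               if p.temporal_id == 0 and p.ph_non_ref_pic_flag == 0]
    return all(abs(b.poc - a.poc) <= max_poc_lsb // 2
               for a, b in zip(anchors, anchors[1:]))
```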
- FIG. 5 is a diagram for explaining a method of encoding image information performed by an encoding apparatus according to an example of this document, and FIG. 6 is a diagram for explaining a method of decoding image information performed by a decoding apparatus according to an example of this document.
- the encoding apparatus may derive a POC value of a reference picture to configure a reference picture set and may derive a POC value for a current picture (S510).
- the derived POC information for the current picture may be encoded (S520), and image information including the POC information may be encoded (S530).
- the decoding apparatus obtains image information including POC information from the bitstream (S610), and based on the POC information, POC values of reference pictures and the current picture may be derived (S620).
- a reference picture set may be constructed based on the derived POC value (S630), and a reference picture list may be derived based on the reference picture set (S640).
- Inter prediction for the current picture may be performed based on the derived reference picture list (S650).
- Image information such as POC information may be included in high level syntax (HLS).
- the POC information may include POC-related information and syntax elements, and the POC information may include POC information related to a current picture and/or POC information related to reference pictures.
- the POC information may include at least one of ph_non_ref_pic_flag, ph_poc_msb_cycle_present_flag, and/or ph_poc_msb_cycle_val.
- any one of the reference picture set or the reference picture list derived in FIGS. 5 and 6 may be omitted.
- S640 of deriving the reference picture list may be omitted, and inter prediction may be performed based on the reference picture set.
- according to another example, instead of the steps S630 and S640 of deriving a reference picture set and a reference picture list, a reference picture list may be derived directly based on a POC value.
- the POC value of the i-th reference picture may be derived based on a POC difference value indicated by POC information related to the reference picture.
- the POC information may indicate the POC difference between the current picture and the i-th reference picture, or may indicate the POC difference between the i-th reference picture and the (i-1)-th reference picture.
- the reference picture may include a previous reference picture with a smaller POC value than the current picture and/or a subsequent reference picture with a larger POC value than the current picture.
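- a sketch of both signaling variants, with illustrative function names and an assumed sign convention (a positive difference pointing to a picture with a smaller POC):

```python
def ref_pocs_from_current(poc_current: int, poc_diffs: list[int]) -> list[int]:
    # Variant 1: each signaled difference is between the current picture
    # and the i-th reference picture.
    return [poc_current - d for d in poc_diffs]

def ref_pocs_chained(poc_current: int, poc_diffs: list[int]) -> list[int]:
    # Variant 2: the i-th difference is between the i-th and the (i-1)-th
    # reference picture; the chain starts from the current picture.
    pocs, base = [], poc_current
    for d in poc_diffs:
        base -= d
        pocs.append(base)
    return pocs
```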
- Table 2 corresponds to an example in which the above-described example No. 2 (the ph_non_ref_pic_flag value of a picture having a TemporalId of 0 is limited not to be 1) is implemented.
- in Table 2, based on the current VVC specification, the parts added according to the present embodiment are underlined.
- Table 3 corresponds to an example in which the above-described example No. 3 (when the CLVS has one or more temporal sub-layers, no picture in the base temporal sub-layer can have 1 as the value of ph_non_ref_pic_flag) is implemented. In Table 3, based on the current VVC specification, the parts added according to the present embodiment are underlined.
- Table 4 corresponds to an example in which the above-described example No. 4 (when the CLVS includes a picture other than an all-intra picture, a picture having a TemporalId of 0 cannot have 1 as the value of ph_non_ref_pic_flag) is implemented.
- in Table 4, based on the current VVC specification, the parts added according to the present embodiment are underlined.
- Table 5 corresponds to an example in which the above-described example No. 5 (when the CLVS includes a picture that is not an all-intra picture and there are one or more temporal sub-layers in the CLVS, a picture with a TemporalId of 0 cannot have 1 as the value of ph_non_ref_pic_flag) is implemented.
- in Table 5, based on the current VVC specification, the parts added according to the present embodiment are underlined.
- Table 6 corresponds to an example in which the above-described example No. 6 (for two consecutive picture pairs in which TemporalId is 0 and ph_non_ref_pic_flag is 0, the absolute POC difference between the pictures is limited not to be greater than half of MaxPicOrderCntLsb) is implemented.
- in Table 6, based on the current VVC specification, the parts added according to the present embodiment are underlined.
- FIG. 7 is a flowchart illustrating an operation of a video decoding apparatus according to an embodiment of this document.
- Each step disclosed in FIG. 7 is based on some of the contents described above with reference to FIGS. 3 to 6. Accordingly, detailed descriptions overlapping with those described above with reference to FIGS. 2 to 6 will be omitted or simplified.
- the decoding apparatus 200 may receive POC information and information on reference pictures from a bitstream; the POC information may include a maximum LSB value of the POC, and the information on reference pictures may include a non-reference picture flag related to whether a picture is not used as a reference picture (S710).
- the non-reference picture flag may be ph_non_ref_pic_flag shown in Table 1; if the value is 1, it indicates that the picture associated with the picture header is never used as a reference picture, and if the value is 0, it indicates that the picture associated with the picture header may or may not be used as a reference picture. That is, a picture having a non-reference picture flag value of 1 is not used as a reference picture of another picture; in other words, the non-reference picture flag value of a picture used as a reference picture of another picture is 0.
- the received POC information may include syntax elements signaled in a parameter set or a picture header, such as vps_independent_layer_flag, sps_log2_max_pic_order_cnt_lsb_minus4, sps_poc_msb_cycle_len_minus1, ph_pic_order_cnt_lsb, ph_poc_msb_cycle_present_flag, ph_poc_msb_cycle_val, and the like.
- a description of the signaled syntax information is the same as described above.
- the decoding apparatus may derive POC values for the current picture and reference pictures based on the POC information for inter prediction and reference picture list generation for the current picture (S720).
- the variable PicOrderCntVal indicating the POC value of the current picture may be derived as the sum of the variable PicOrderCntMsb (variable POCMsb) indicating the MSB of the POC of the current picture and ph_pic_order_cnt_lsb (POC LSB information) indicating the LSB of the POC of the current picture, signaled in the picture header (PicOrderCntVal = PicOrderCntMsb + ph_pic_order_cnt_lsb).
- when a picture of the reference layer is included in the current AU, the current picture has the same POC value as that picture.
- the variable PicOrderCntVal of the current picture may be derived based on the ph_poc_msb_cycle_present_flag value (cycle presence flag) and the POC MSB cycle value (ph_poc_msb_cycle_val) signaled based on the cycle presence flag value.
- a different derivation process may be applied depending on the cycle presence flag value and whether the current picture is a coded layer video sequence start (CLVSS) picture.
- the first case is when ph_poc_msb_cycle_present_flag is 0 and the current picture is not a CLVSS picture.
- the variable prevPicOrderCntLsb and the variable prevPicOrderCntMsb for the previous picture may be derived, and the variable POCMsb for the current picture may be derived based on the variable POCMsb of the previous picture.
- when a previous picture whose nuh_layer_id is the same as that of the current picture, whose TemporalId is 0, and which is not a random access skipped leading (RASL) picture or a random access decodable leading (RADL) picture is set to prevTid0Pic, the variable prevPicOrderCntLsb is derived to be equal to ph_pic_order_cnt_lsb of prevTid0Pic, and the variable prevPicOrderCntMsb is derived to be equal to PicOrderCntMsb of prevTid0Pic.
- the layer IDs of the current picture and the previous picture are the same, and the temporal ID (TemporalId) derived from identification information of the temporal layer for the previous picture is 0.
- the previous picture for POC derivation of the current picture is not a RASL picture or a RADL picture.
- the variable PicOrderCntVal may be derived as shown in Equation (1).
- the second case is when ph_poc_msb_cycle_present_flag is 0 and the current picture is a CLVSS picture; since the PicOrderCntMsb value is 0, the variable PicOrderCntVal is derived as the ph_pic_order_cnt_lsb value.
- the third case is when the ph_poc_msb_cycle_present_flag value is 1, in which case the variable PicOrderCntMsb is derived as the value obtained by multiplying ph_poc_msb_cycle_val by MaxPicOrderCntLsb (ph_poc_msb_cycle_val * MaxPicOrderCntLsb). Finally, the variable PicOrderCntVal is derived as the sum of the derived variable PicOrderCntMsb and the signaled ph_pic_order_cnt_lsb value.
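- the three cases may be summarized in the following non-normative sketch, reusing derive_poc_msb() from the earlier sketch; is_clvss and the prevTid0Pic values are assumed to be available to the decoder:

```python
def derive_pic_order_cnt_val(ph_poc_msb_cycle_present_flag: int,
                             ph_poc_msb_cycle_val: int,
                             ph_pic_order_cnt_lsb: int,
                             is_clvss: bool,
                             prev_poc_lsb: int, prev_poc_msb: int,
                             max_poc_lsb: int) -> int:
    if ph_poc_msb_cycle_present_flag:          # third case: MSB cycle signaled
        poc_msb = ph_poc_msb_cycle_val * max_poc_lsb
    elif is_clvss:                             # second case: CLVSS picture
        poc_msb = 0
    else:                                      # first case: infer from prevTid0Pic
        poc_msb = derive_poc_msb(ph_pic_order_cnt_lsb, prev_poc_lsb,
                                 prev_poc_msb, max_poc_lsb)
    return poc_msb + ph_pic_order_cnt_lsb      # PicOrderCntVal
```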
- the decoding apparatus may construct a reference picture list based on the POC value of the current picture and the POC values of the reference pictures (S730), and perform inter prediction on the current block to derive prediction samples for the current block (S740).
- the decoding apparatus 200 may decode information on quantized transform coefficients for the current block from the bitstream, and may derive the quantized transform coefficients for the target block based on the information on the quantized transform coefficients for the current block.
- information on the quantized transform coefficients for the target block may be included in a sequence parameter set (SPS) or a slice header, and may include at least one of information on whether a simplified transform (RST) is applied, information on a simplification factor, information on a minimum transform size to which the simplified transform is applied, information on a maximum transform size to which the simplified transform is applied, information on a simplified inverse transform size, and information on a transform index indicating any one of the transform kernel matrices included in a transform set.
- the decoding apparatus 200 may derive transform coefficients by performing inverse quantization on residual information about the current block, that is, quantized transform coefficients, and may arrange the derived transform coefficients in a predetermined scanning order.
- a transform coefficient derived based on the residual information may be an inverse-quantized transform coefficient as described above, or a quantized transform coefficient. That is, the transform coefficient may be any data from which the presence of non-zero data in the current block can be checked, regardless of whether it is quantized.
- the decoding apparatus may derive residual samples by applying an inverse transform to the derived transform coefficients.
- the decoding apparatus may generate a reconstructed picture based on the residual samples and the prediction samples (S750).
- FIG. 8 is a flowchart illustrating an operation of a video encoding apparatus according to an embodiment of the present document.
- Each step disclosed in FIG. 8 is based on some of the contents described above with reference to FIGS. 3 to 6. Accordingly, detailed descriptions overlapping with those described above with reference to FIGS. 1 and 3 to 6 will be omitted or simplified.
- the encoding apparatus 100 may derive POC values for the current picture and reference pictures (S810), and may perform inter prediction on the current block using the derived POC values and the reference pictures (S820).
- the encoding apparatus may encode and output POC information including the maximum LSB value of the POC and information on reference pictures including a non-reference picture flag related to whether a picture is not used as a reference picture.
- here, the value of the non-reference picture flag of the previous picture used to derive the POC value may be 0, and the difference between the POC values of the current picture and the previous picture may be set to be less than half of the maximum LSB value of the POC (S830).
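- a minimal sketch of these encoder-side checks, with illustrative parameter names:

```python
def poc_constraints_ok(poc_current: int, poc_prev: int,
                       prev_non_ref_pic_flag: int, max_poc_lsb: int) -> bool:
    # The previous picture used for POC derivation must be a referenceable
    # picture (flag 0), and the POC gap must stay below half the maximum
    # POC LSB value so the decoder's MSB inference cannot go wrong.
    return (prev_non_ref_pic_flag == 0
            and abs(poc_current - poc_prev) < max_poc_lsb // 2)
```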
- the POC information for the current picture, the method of deriving the POC of the current picture, the restrictions on the previous picture, and the restriction conditions on the POC value of the previous picture are the same as those described for the decoding apparatus with reference to FIG. 7, so a redundant description is omitted.
- the encoding apparatus may derive residual samples for the current block based on the prediction samples, and generate information on the residual through transformation.
- the residual information may include the above-described transformation related information/syntax element.
- the encoding apparatus may encode image/video information including residual information and output the encoded image/video information in the form of a bitstream.
- the encoding apparatus may generate information about the quantized transform coefficients and encode the information about the generated quantized transform coefficients.
- At least one of quantization/inverse quantization and/or transform/inverse transform may be omitted.
- the quantized transform coefficient may be referred to as a transform coefficient.
- the transform coefficients may be called coefficients or residual coefficients, or may still be called transform coefficients for uniformity of expression.
- the above-described method according to this document may be implemented in the form of software, and the encoding apparatus and/or decoding apparatus according to this document may be included in a device that performs image processing, for example, a TV, a computer, a smart phone, a set-top box, or a display device.
- a module may be stored in a memory and executed by a processor.
- the memory may be internal or external to the processor, and may be coupled to the processor by various well-known means.
- the processor may include an application-specific integrated circuit (ASIC), other chipsets, logic circuits, and/or data processing devices.
- memory may include read-only memory (ROM), random access memory (RAM), flash memory, memory cards, storage media, and/or other storage devices. The embodiments described in this document may be implemented and performed on a processor, a microprocessor, a controller, or a chip. For example, the functional units shown in each figure may be implemented and performed on a computer, a processor, a microprocessor, a controller, or a chip.
- the decoding apparatus and the encoding apparatus to which this document is applied may be included in a multimedia broadcasting transceiver, a mobile communication terminal, a home cinema video device, a digital cinema video device, a surveillance camera, a video conversation device, a real-time communication device such as video communication, a mobile streaming device, a storage medium, a camcorder, a video on demand (VoD) service providing device, an over-the-top (OTT) video device, an Internet streaming service providing device, a three-dimensional (3D) video device, a video telephony video device, a medical video device, and the like, and may be used to process a video signal or a data signal.
- the OTT video (Over the top video) device may include a game console, a Blu-ray player, an Internet-connected TV, a home theater system, a smart phone, a tablet PC, a digital video recorder (DVR), and the like.
- the processing method to which this document is applied may be produced in the form of a program executed by a computer, and may be stored in a computer-readable recording medium.
- Multimedia data having a data structure according to this document may also be stored in a computer-readable recording medium.
- the computer-readable recording medium includes all types of storage devices and distributed storage devices in which computer-readable data is stored.
- the computer-readable recording medium includes, for example, a Blu-ray Disc (BD), a Universal Serial Bus (USB), a ROM, a PROM, an EPROM, an EEPROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device.
- the computer-readable recording medium includes a medium implemented in the form of a carrier wave (eg, transmission through the Internet).
- bitstream generated by the encoding method may be stored in a computer-readable recording medium or transmitted through a wired/wireless communication network.
- embodiments of this document may be implemented as a computer program product using program codes, and the program codes may be executed in a computer according to the embodiments of this document.
- the program code may be stored on a carrier readable by a computer.
- FIG. 9 schematically shows an example of a video/image coding system to which this document can be applied.
- a video/image coding system may include a source device and a receive device.
- the source device may transmit encoded video/image information or data in the form of a file or streaming to the receiving device through a digital storage medium or a network.
- the source device may include a video source, an encoding apparatus, and a transmission unit.
- the receiving device may include a receiving unit, a decoding apparatus, and a renderer.
- the encoding apparatus may be referred to as a video/image encoding apparatus, and the decoding apparatus may be referred to as a video/image decoding apparatus.
- the transmitter may be included in the encoding device.
- the receiver may be included in the decoding device.
- the renderer may include a display unit, and the display unit may be configured as a separate device or external component.
- a video source may acquire a video/image through a process of capturing, synthesizing, or generating a video/image.
- a video source may include a video/image capture device and/or a video/image generating device.
- a video/image capture device may include, for example, one or more cameras, a video/image archive containing previously captured video/images, and the like.
- a video/image generating device may include, for example, a computer, tablet, and smart phone, and may (electronically) generate a video/image.
- a virtual video/image may be generated through a computer or the like, and in this case, the video/image capturing process may be replaced by the process of generating the related data.
- the encoding device may encode the input video/image.
- the encoding apparatus may perform a series of procedures such as prediction, transformation, and quantization for compression and coding efficiency.
- the encoded data (encoded video/image information) may be output in the form of a bitstream.
- the transmitting unit may transmit the encoded video/image information or data output in the form of a bitstream to the receiving unit of the receiving device in the form of a file or streaming through a digital storage medium or a network.
- the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, and SSD.
- the transmission unit may include an element for generating a media file through a predetermined file format, and may include an element for transmission through a broadcast/communication network.
- the receiver may receive/extract the bitstream and transmit it to the decoding device.
- the decoding apparatus may decode the video/image by performing a series of procedures such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding apparatus.
- the renderer may render the decoded video/image.
- the rendered video/image may be displayed through the display unit.
- FIG. 10 exemplarily shows a structure diagram of a content streaming system to which this document is applied.
- the content streaming system to which this document is applied may largely include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
- the encoding server compresses content input from multimedia input devices such as a smart phone, a camera, a camcorder, etc. into digital data to generate a bitstream and transmits it to the streaming server.
- when multimedia input devices such as a smartphone, a camera, a camcorder, etc. directly generate a bitstream, the encoding server may be omitted.
- the bitstream may be generated by an encoding method or a bitstream generation method to which this document is applied, and the streaming server may temporarily store the bitstream in the process of transmitting or receiving the bitstream.
- the streaming server transmits multimedia data to the user device based on a user's request through the web server, and the web server serves as a medium informing the user of available services.
- when the user requests a desired service from the web server, the web server transmits the request to the streaming server, and the streaming server transmits multimedia data to the user.
- the content streaming system may include a separate control server.
- the control server serves to control commands/responses between devices in the content streaming system.
- the streaming server may receive content from a media repository and/or an encoding server. For example, when content is received from the encoding server, the content may be received in real time. In this case, in order to provide a smooth streaming service, the streaming server may store the bitstream for a predetermined time.
- examples of the user device include a mobile phone, a smart phone, a laptop computer, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a navigation system, a slate PC, a tablet PC, an ultrabook, a wearable device (e.g., a watch-type terminal (smartwatch), a glass-type terminal (smart glass), or a head mounted display (HMD)), a digital TV, a desktop computer, digital signage, and the like.
- each server in the content streaming system may be operated as a distributed server, and in this case, data received by each server may be processed in a distributed manner.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression Of Band Width Or Redundancy In Fax (AREA)
Abstract
Description
Claims (11)
- An image decoding method performed by a decoding apparatus, the method comprising: receiving POC information and information on reference pictures from a bitstream; deriving POC values for a current picture and reference pictures based on the POC information; constructing a reference picture list based on the POC value of the current picture and the POC values of the reference pictures; deriving prediction samples for a current block by performing inter prediction on the current block based on the reference picture list; and generating a reconstructed picture based on the prediction samples, wherein the POC information includes a maximum LSB value of a POC, the information on the reference pictures includes a non-reference picture flag related to whether a picture is not used as a reference picture, a value of the non-reference picture flag of a previous picture used to derive the POC value of the current picture is 0, and a difference between the POC values of the current picture and the previous picture is smaller than half of the maximum LSB value of the POC.
- The method of claim 1, wherein layer IDs of the current picture and the previous picture are the same, and a temporal ID for the previous picture, derived from identification information of a temporal layer, is 0.
- The method of claim 1, wherein the previous picture is neither a RASL picture nor a RADL picture.
- The method of claim 1, wherein the POC value of the current picture is derived based on a variable POCMsb and a POC LSB information value for the current picture, and the variable POCMsb is derived based on a cycle presence flag regarding whether a POC MSB cycle value is present and a POC MSB cycle value signaled based on the value of the cycle presence flag.
- The method of claim 4, wherein, when the value of the cycle presence flag for the current picture is 0 and the current picture is not a CLVSS picture, the variable POCMsb for the current picture is derived based on the variable POCMsb of the previous picture.
- An image encoding method performed by an image encoding apparatus, the method comprising: deriving POC values for a current picture and reference pictures; performing inter prediction on a current block using the reference pictures; and encoding POC information and information on the reference pictures, wherein the POC information includes a maximum LSB value of a POC, the information on the reference pictures includes a non-reference picture flag related to whether a picture is not used as a reference picture, a value of the non-reference picture flag of a previous picture used to derive the POC value of the current picture is 0, and a difference between the POC values of the current picture and the previous picture is smaller than half of the maximum LSB value of the POC.
- The method of claim 6, wherein layer IDs of the current picture and the previous picture are the same, and a temporal ID identifying a temporal layer for the previous picture is 0.
- The method of claim 6, wherein the previous picture is neither a RASL picture nor a RADL picture.
- The method of claim 7, wherein the POC value of the current picture is derived based on a variable POCMsb and a POC LSB information value for the current picture, and the variable POCMsb is derived based on whether a POC MSB cycle value for the current picture is present and the POC MSB cycle value for the current picture.
- The method of claim 9, wherein, when a POC MSB cycle value for the current picture is not present and the current picture is not a CLVSS picture, the variable POCMsb for the current picture is derived based on the variable POCMsb of the previous picture.
- A computer-readable digital storage medium storing instruction information that causes an image decoding method to be performed, the image decoding method comprising: receiving POC information and information on reference pictures from a bitstream; deriving POC values for a current picture and reference pictures based on the POC information; constructing a reference picture list based on the POC value of the current picture and the POC values of the reference pictures; deriving prediction samples for a current block by performing inter prediction on the current block based on the reference picture list; and generating a reconstructed picture based on the prediction samples, wherein the POC information includes a maximum LSB value of a POC, the information on the reference pictures includes a non-reference picture flag related to whether a picture is not used as a reference picture, a value of the non-reference picture flag of a previous picture used to derive the POC value of the current picture is 0, and a difference between the POC values of the current picture and the previous picture is smaller than half of the maximum LSB value of the POC.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202180060988.9A CN116195247A (zh) | 2020-05-22 | 2021-05-21 | 图像编码方法和用于该图像编码方法的装置 |
US17/926,402 US20230188707A1 (en) | 2020-05-22 | 2021-05-21 | Image coding method and device therefor |
EP21809790.5A EP4156688A4 (en) | 2020-05-22 | 2021-05-21 | IMAGE CODING METHOD AND ASSOCIATED DEVICE |
KR1020227045280A KR20230017819A (ko) | 2020-05-22 | 2021-05-21 | 영상 코딩 방법 및 그 장치 |
JP2022571334A JP2023526535A (ja) | 2020-05-22 | 2021-05-21 | 映像コーディング方法及びその装置 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063028596P | 2020-05-22 | 2020-05-22 | |
US63/028,596 | 2020-05-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021235895A1 true WO2021235895A1 (ko) | 2021-11-25 |
Family
ID=78707654
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2021/006366 WO2021235895A1 (ko) | 2020-05-22 | 2021-05-21 | 영상 코딩 방법 및 그 장치 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230188707A1 (ko) |
EP (1) | EP4156688A4 (ko) |
JP (1) | JP2023526535A (ko) |
KR (1) | KR20230017819A (ko) |
CN (1) | CN116195247A (ko) |
WO (1) | WO2021235895A1 (ko) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10674171B2 (en) * | 2011-09-27 | 2020-06-02 | Telefonaktiebolaget Lm Ericsson (Publ) | Decoders and methods thereof for managing pictures in video decoding process |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150304671A1 (en) * | 2012-07-01 | 2015-10-22 | Sharp Kabushiki Kaisha | Device for signaling a long-term reference picture in a parameter set |
US20180199051A1 (en) * | 2011-11-08 | 2018-07-12 | Nokia Technologies Oy | Reference picture handling |
KR101981712B1 (ko) * | 2012-11-21 | 2019-05-24 | 엘지전자 주식회사 | 영상 디코딩 방법 및 이를 이용하는 장치 |
US20190379903A1 (en) * | 2011-11-07 | 2019-12-12 | Microsoft Technology Licensing, Llc | Signaling of state information for a decoded picture buffer and reference picture lists |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4935746B2 (ja) * | 2008-04-07 | 2012-05-23 | 富士通株式会社 | 動画像符号化装置、動画像復号化装置及びその符号化、復号化方法 |
US10674171B2 (en) * | 2011-09-27 | 2020-06-02 | Telefonaktiebolaget Lm Ericsson (Publ) | Decoders and methods thereof for managing pictures in video decoding process |
JP5869688B2 (ja) * | 2011-10-28 | 2016-02-24 | サムスン エレクトロニクス カンパニー リミテッド | インター予測方法及びその装置、動き補償方法及びその装置 |
KR102219082B1 (ko) * | 2011-11-11 | 2021-02-23 | 엘지전자 주식회사 | 영상 정보 전송 방법 및 장치와 이를 이용한 복호화 방법 및 장치 |
US9432665B2 (en) * | 2011-12-02 | 2016-08-30 | Qualcomm Incorporated | Coding least significant bits of picture order count values identifying long-term reference pictures |
WO2013107939A1 (en) * | 2012-01-20 | 2013-07-25 | Nokia Corporation | Method for video coding and an apparatus, a computer-program product, a system, and a module for the same |
KR101995270B1 (ko) * | 2012-04-25 | 2019-07-03 | 삼성전자주식회사 | 비디오 데이터를 재생하는 방법 및 장치 |
WO2014069920A1 (en) * | 2012-11-01 | 2014-05-08 | Samsung Electronics Co., Ltd. | Recording medium, reproducing device for providing service based on data of recording medium, and method thereof |
US9521393B2 (en) * | 2013-01-07 | 2016-12-13 | Qualcomm Incorporated | Non-nested SEI messages in video coding |
US10212435B2 (en) * | 2013-10-14 | 2019-02-19 | Qualcomm Incorporated | Device and method for scalable coding of video information |
US9942546B2 (en) * | 2013-12-12 | 2018-04-10 | Qualcomm Incorporated | POC value design for multi-layer video coding |
US11228776B1 (en) * | 2020-03-27 | 2022-01-18 | Tencent America LLC | Method for output layer set mode in multilayered video stream |
-
2021
- 2021-05-21 WO PCT/KR2021/006366 patent/WO2021235895A1/ko unknown
- 2021-05-21 US US17/926,402 patent/US20230188707A1/en active Pending
- 2021-05-21 KR KR1020227045280A patent/KR20230017819A/ko unknown
- 2021-05-21 JP JP2022571334A patent/JP2023526535A/ja active Pending
- 2021-05-21 CN CN202180060988.9A patent/CN116195247A/zh active Pending
- 2021-05-21 EP EP21809790.5A patent/EP4156688A4/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190379903A1 (en) * | 2011-11-07 | 2019-12-12 | Microsoft Technology Licensing, Llc | Signaling of state information for a decoded picture buffer and reference picture lists |
US20180199051A1 (en) * | 2011-11-08 | 2018-07-12 | Nokia Technologies Oy | Reference picture handling |
US20150304671A1 (en) * | 2012-07-01 | 2015-10-22 | Sharp Kabushiki Kaisha | Device for signaling a long-term reference picture in a parameter set |
KR101981712B1 (ko) * | 2012-11-21 | 2019-05-24 | 엘지전자 주식회사 | 영상 디코딩 방법 및 이를 이용하는 장치 |
Non-Patent Citations (2)
Title |
---|
CHOI (TENCENT) B; WENGER (STEWE) S; LIU (TENCENT) S: "AHG9: On picture output for non-reference pictures", 130. MPEG MEETING; 20200420 - 20200424; ALPBACH; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, 3 April 2020 (2020-04-03), pages 1 - 2, XP030285956 * |
See also references of EP4156688A4 * |
Also Published As
Publication number | Publication date |
---|---|
KR20230017819A (ko) | 2023-02-06 |
JP2023526535A (ja) | 2023-06-21 |
EP4156688A4 (en) | 2024-05-15 |
EP4156688A1 (en) | 2023-03-29 |
US20230188707A1 (en) | 2023-06-15 |
CN116195247A (zh) | 2023-05-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020197236A1 (ko) | 서브 픽처 핸들링 구조 기반 영상 또는 비디오 코딩 | |
WO2020076066A1 (ko) | 신택스 디자인 방법 및 신택스를 이용하여 코딩을 수행하는 장치 | |
WO2020189960A1 (ko) | 크로마 포맷에 대한 정보를 시그널링 하는 방법 및 장치 | |
WO2021225338A1 (ko) | 영상 디코딩 방법 및 그 장치 | |
WO2021201515A1 (ko) | Hls를 시그널링하는 영상 부호화/복호화 방법, 장치 및 비트스트림을 저장한 컴퓨터 판독 가능한 기록 매체 | |
WO2021096057A1 (ko) | 비디오 또는 영상 코딩 시스템에서의 엔트리 포인트 관련 정보에 기반한 영상 코딩 방법 | |
WO2021145728A1 (ko) | 인루프 필터링 기반 영상 코딩 장치 및 방법 | |
WO2021118295A1 (ko) | 루프 필터링을 제어하기 위한 영상 코딩 장치 및 방법 | |
WO2021118265A1 (ko) | 적응적 루프 필터를 적용하는 비디오 또는 영상 코딩 | |
WO2021133060A1 (ko) | 서브픽처 기반 영상 코딩 장치 및 방법 | |
WO2021118293A1 (ko) | 필터링 기반 영상 코딩 장치 및 방법 | |
WO2021107622A1 (ko) | 영상/비디오 코딩 방법 및 장치 | |
WO2021118261A1 (ko) | 영상 정보를 시그널링하는 방법 및 장치 | |
WO2021235895A1 (ko) | 영상 코딩 방법 및 그 장치 | |
WO2019203533A1 (ko) | 다중 움직임 모델을 고려한 인터 예측 방법 및 그 장치 | |
WO2021137588A1 (ko) | 픽처 헤더를 포함하는 영상 정보를 코딩하는 영상 디코딩 방법 및 그 장치 | |
WO2021107634A1 (ko) | 픽처 분할 정보를 시그널링 하는 방법 및 장치 | |
WO2021172891A1 (ko) | 인루프 필터링 기반 영상 코딩 장치 및 방법 | |
WO2021145726A1 (ko) | 적응적 루프 필터링 기반 영상 코딩 장치 및 방법 | |
WO2021145725A1 (ko) | 필터링 관련 정보 시그널링 기반 영상 코딩 장치 및 방법 | |
WO2021107624A1 (ko) | 픽처의 분할 구조에 기반한 영상/비디오 코딩 방법 및 장치 | |
WO2021201463A1 (ko) | 인루프 필터링 기반 영상 코딩 장치 및 방법 | |
WO2021118263A1 (ko) | 영상 정보를 시그널링하는 방법 및 장치 | |
WO2021137589A1 (ko) | 영상 디코딩 방법 및 그 장치 | |
WO2021137591A1 (ko) | Ols dpb 파라미터 인덱스를 포함하는 영상 정보 기반 영상 디코딩 방법 및 그 장치 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21809790 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022571334 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 20227045280 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2021809790 Country of ref document: EP Effective date: 20221222 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |