WO2021215614A1 - Method and apparatus for decoding an image - Google Patents
- Publication number
- WO2021215614A1 (PCT/KR2020/018464)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- picture
- nal unit
- subpicture
- current
- flag
- Prior art date
Classifications
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a picture, frame or field
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the present invention relates to a subpicture division method for synthesizing with other sequences and a slice division method for bitstream packing.
- This specification presents a subpicture division method for synthesizing with other sequences and a slice division method for bitstream packing.
- an image decoding method performed by an image decoding apparatus includes: obtaining, from a bitstream, NAL unit type information indicating the type of a current network abstraction layer (NAL) unit; and, when the NAL unit type information indicates that the NAL unit type of the current NAL unit is coded data of an image slice, decoding the image slice based on whether a mixed NAL unit type is applied to the current picture.
- the decoding of the image slice may be performed by determining, based on whether the mixed NAL unit type is applied, whether the NAL unit type of the current NAL unit indicates a property of a subpicture for the current image slice.
- an image decoding apparatus for solving the above problem includes a memory and at least one processor, wherein the at least one processor obtains, from a bitstream, NAL unit type information indicating the type of a current network abstraction layer (NAL) unit, and, when the NAL unit type information indicates that the NAL unit type of the current NAL unit is coded data of an image slice, decodes the image slice based on whether a mixed NAL unit type is applied to the current picture. In this case, the decoding of the image slice may be performed by determining, based on whether the mixed NAL unit type is applied, whether the NAL unit type of the current NAL unit indicates a property of a subpicture for the current image slice.
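By way of illustration only (this sketch is not part of the claims; the field layout follows the published VVC NAL unit header syntax, where the 5-bit nal_unit_type sits in the second header byte), obtaining the NAL unit type from a bitstream can look like this:

```python
def parse_nal_unit_header(data: bytes) -> dict:
    """Parse a two-byte VVC-style NAL unit header (illustrative sketch)."""
    b0, b1 = data[0], data[1]
    return {
        "forbidden_zero_bit": (b0 >> 7) & 0x1,
        "nuh_reserved_zero_bit": (b0 >> 6) & 0x1,
        "nuh_layer_id": b0 & 0x3F,              # 6-bit layer identifier
        "nal_unit_type": (b1 >> 3) & 0x1F,      # 5-bit NAL unit type
        "nuh_temporal_id_plus1": b1 & 0x07,     # 3-bit temporal id + 1
    }

# Second header byte 0x09 = 0b00001_001: nal_unit_type 1, temporal id + 1 = 1
hdr = parse_nal_unit_header(bytes([0x00, 0x09]))
```

A decoder would branch on `hdr["nal_unit_type"]` to decide whether the unit carries coded slice data and which picture property it signals.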
- NAL network abstraction layer
- the encoding of the image slice may be performed such that, when the current picture is encoded based on the mixed NAL unit type, the NAL unit type of the current NAL unit indicates a property of a subpicture for the current image slice.
- the transmission method according to an embodiment of the present invention for solving the above problem may transmit a bitstream generated by the image encoding apparatus or the image encoding method of the present disclosure.
- the computer-readable recording medium for solving the above problem may store a bitstream generated by the image encoding method or image encoding apparatus of the present disclosure.
- the present invention provides a method of generating one picture through synthesis with several different sequences.
- a picture in a sequence is divided into a plurality of sub-pictures, and a new picture is generated by synthesizing the divided sub-pictures of other pictures.
- NAL network abstraction layer
- FIG. 1 is a diagram schematically showing the configuration of a video encoding apparatus to which the present invention can be applied.
- FIG. 2 is a diagram illustrating an example of an image encoding method performed by a video encoding apparatus.
- FIG. 3 is a diagram schematically showing the configuration of a video decoding apparatus to which the present invention can be applied.
- FIG. 4 is a diagram illustrating an example of an image decoding method performed by a decoding apparatus.
- FIG. 5 is a diagram illustrating an example of a NAL packet for a slice.
- FIG. 6 is a diagram illustrating an example of a hierarchical GOP structure.
- FIG. 7 is a diagram illustrating an example of a display output order and a decoding order.
- FIG. 8 is a diagram illustrating an example of a leading picture and a normal picture.
- FIG. 9 is a diagram illustrating an example of a RASL picture and a RADL picture.
- FIG. 10 is a diagram illustrating syntax for a slice segment header.
- FIG. 11 is a diagram illustrating an example of a content synthesis process.
- FIG. 12 is a diagram illustrating an example of a subpicture ID and a slice address.
- FIG. 13 is a diagram illustrating an example of a NUT for each subpicture/slice.
- FIG. 14 is a diagram illustrating an embodiment of a syntax of a picture parameter set (PPS).
- PPS picture parameter set
- FIG. 15 is a diagram illustrating an embodiment of a syntax of a slice header.
- FIG. 16 is a diagram showing the syntax of a picture header structure.
- FIG. 17 is a diagram illustrating syntax for obtaining a reference picture list.
- FIG. 18 is a diagram showing an example of content synthesis.
- FIGS. 19 and 20 are flowcharts illustrating a decoding method and an encoding method according to an embodiment of the present invention.
- each component in the drawings described in the present invention is shown independently for the convenience of description regarding different characteristic functions, and does not mean that each component is implemented as separate hardware or separate software.
- two or more components among each component may be combined to form one component, or one component may be divided into a plurality of components.
- Embodiments in which each component is integrated and/or separated are also included in the scope of the present invention without departing from the essence of the present invention.
- the present invention relates to video/image coding.
- the method/embodiment disclosed in the present invention may be applied to methods disclosed in the versatile video coding (VVC) standard, the essential video coding (EVC) standard, the AOMedia Video 1 (AV1) standard, the 2nd generation audio video coding standard (AVS2), or next-generation video/image coding standards (eg, H.267, H.268, etc.).
- an access unit means a set of pictures that belong to different layers and are output from the decoded picture buffer (DPB) at the same time.
- a picture generally means a unit representing one image in a specific time period, and a slice is a unit constituting a part of a picture in coding.
- One picture may consist of a plurality of slices, and if necessary, a picture and a slice may be used interchangeably.
- a pixel or pel may mean a minimum unit constituting one picture (or image). Also, as a term corresponding to a pixel, a 'sample' may be used. A sample may generally represent a pixel or a value of a pixel, may represent only a pixel/pixel value of a luma component, or may represent only a pixel/pixel value of a chroma component.
- a unit represents a basic unit of image processing.
- the unit may include at least one of a specific region of a picture and information related to the region.
- a unit may be used interchangeably with terms such as a block or an area in some cases.
- an MxN block may represent a set of samples or transform coefficients including M columns and N rows.
- FIG. 1 is a diagram schematically illustrating a configuration of a video encoding apparatus to which the present invention can be applied.
- the video encoding apparatus 100 includes a picture division unit 105, a prediction unit 110, a residual processing unit 120, an entropy encoding unit 130, an adder 140, a filter unit 150, and a memory 160.
- the residual processing unit 120 may include a subtraction unit 121 , a transform unit 122 , a quantization unit 123 , a rearrangement unit 124 , an inverse quantization unit 125 , and an inverse transform unit 126 .
- the picture divider 105 may divide the input picture into at least one processing unit.
- the processing unit may be referred to as a coding unit (CU).
- the coding unit may be recursively divided from a coding tree unit according to a quad-tree binary-tree (QTBT) structure.
- QTBT quad-tree binary-tree
- one coding tree unit may be divided into a plurality of nodes having a lower depth based on a quad tree structure and/or a binary tree structure.
- a quad tree structure may be applied first and a binary tree structure may be applied later.
- the binary tree structure may be applied first.
- Coding may be performed on a node that is no longer split, and a coding unit may be determined for a node that is no longer split in this way.
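The recursive division described above can be sketched as follows. This is a minimal illustration of quad-tree splitting only (the names and the `should_split` decision callback are hypothetical, not from the disclosure; the actual QTBT structure also allows binary splits):

```python
def quad_split(x, y, size, min_size, should_split):
    """Recursively split a square region into four quadrants.
    Nodes that are no longer split become the final coding units (leaves)."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quad_split(x + dx, y + dy, half, min_size, should_split)
    return leaves

# Split a 64x64 coding tree unit once at the top level only
leaves = quad_split(0, 0, 64, 8, lambda x, y, s: s == 64)
```

Here the 64x64 unit is divided into four 32x32 coding units, each of which is coded independently.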
- the coding tree unit is the unit from which coding units are divided.
- when the coding tree unit is not split, the coding tree unit itself may be used as a coding unit.
- since the coding unit is determined by division of the coding tree unit, the coding tree unit may be called a largest coding unit (LCU).
- LCU largest coding unit
- the coding procedure according to the present invention may be performed based on the final coding unit that is no longer divided.
- the coding tree unit may be directly used as the final coding unit based on coding efficiency according to image characteristics, or, if necessary, the coding unit may be recursively divided into coding units of lower depth, and a coding unit of the optimal size may be used as the final coding unit.
- the coding procedure may include procedures such as prediction, transformation, and restoration, which will be described later.
- the processing unit may include a coding unit (CU), a prediction unit (PU), or a transform unit (TU).
- a coding unit may be split into coding units of a lower depth along a quad tree structure from the coding tree unit.
- the coding tree unit may be directly used as the final coding unit based on coding efficiency according to image characteristics, or, if necessary, the coding unit may be recursively divided into coding units of lower depth, and a coding unit of the optimal size may be used as the final coding unit.
- the final coding unit means a coding unit that serves as the basis for partitioning into prediction units or transform units.
- a prediction unit is a unit partitioned from a coding unit, and may be a unit of sample prediction. In this case, the prediction unit may be divided into sub-blocks.
- a transform unit may be divided along a quad tree structure from a coding unit, and may be a unit deriving a transform coefficient and/or a unit deriving a residual signal from the transform coefficient.
- the coding unit may be referred to as a coding block (CB)
- the prediction unit may be referred to as a prediction block (PB)
- the transform unit may be referred to as a transform block (TB).
- a prediction block or a prediction unit may mean a specific area in the form of a block within a picture, and may include an array of prediction samples.
- a transform block or transform unit may mean a specific block-shaped region within a picture, and may include transform coefficients or an array of residual samples.
- the prediction unit 110 may perform prediction on a processing target block (hereinafter, referred to as a current block) and generate a predicted block including prediction samples for the current block.
- a unit of prediction performed by the prediction unit 110 may be a coding block, a transform block, or a prediction block.
- the prediction unit 110 may determine whether intra prediction or inter prediction is applied to the current block. For example, the prediction unit 110 may determine whether intra prediction or inter prediction is applied in units of CUs.
- the prediction unit 110 may derive a prediction sample for the current block based on a reference sample outside the current block in a picture to which the current block belongs (hereinafter, referred to as a current picture).
- the prediction unit 110 may (i) derive a prediction sample based on an average or interpolation of neighboring reference samples of the current block, or (ii) derive the prediction sample based on a reference sample existing in a specific (prediction) direction with respect to the prediction sample among the neighboring reference samples of the current block.
- the case of (i) may be called a non-directional mode or a non-angular mode, and the case of (ii) may be called a directional mode or an angular mode.
- a prediction mode may have, for example, 33 directional prediction modes and at least two non-directional modes.
- the non-directional mode may include a DC prediction mode and a planar mode (Planar mode).
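As a minimal sketch of case (i) above (non-directional prediction), the DC mode fills the block with the average of the neighboring reference samples. The function name and integer rounding convention here are illustrative assumptions, not the normative process:

```python
def dc_predict(top, left):
    """DC intra prediction sketch: fill an NxN block with the rounded
    average of the top and left neighboring reference samples."""
    samples = list(top) + list(left)
    dc = (sum(samples) + len(samples) // 2) // len(samples)  # rounded average
    n = len(top)
    return [[dc] * n for _ in range(n)]

pred = dc_predict(top=[100, 102, 104, 106], left=[98, 100, 102, 104])
```

Every prediction sample in the resulting 4x4 block takes the DC value 102.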
- the prediction unit 110 may determine the prediction mode applied to the current block by using the prediction mode applied to the neighboring block.
- the prediction unit 110 may derive a prediction sample for the current block based on a sample specified by a motion vector on a reference picture.
- the prediction unit 110 may derive a prediction sample for the current block by applying any one of a skip mode, a merge mode, and a motion vector prediction (MVP) mode.
- the prediction unit 110 may use motion information of a neighboring block as motion information of the current block.
- in the skip mode, unlike the merge mode, the difference (residual) between the prediction sample and the original sample is not transmitted.
- the motion vector of the current block may be derived by using the motion vector of the neighboring block as a motion vector predictor of the current block.
- a neighboring block may include a spatial neighboring block existing in the current picture and a temporal neighboring block present in a reference picture.
- the reference picture including the temporal neighboring block may be referred to as a collocated picture (colPic).
- Motion information may include a motion vector and a reference picture index.
- Information such as prediction mode information and motion information may be (entropy) encoded and output in the form of a bitstream.
- the highest picture on a reference picture list may be used as a reference picture.
- Reference pictures included in the reference picture list may be sorted based on a picture order count (POC) difference between the current picture and the corresponding reference picture.
- POC picture order count
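The POC-based ordering described above can be sketched simply: candidate reference pictures are sorted by their absolute POC distance from the current picture, so that temporally closer pictures come first. This is an illustrative helper, not the normative reference picture list construction:

```python
def sort_reference_list(current_poc, ref_pocs):
    """Order candidate reference pictures by POC distance to the
    current picture (closest first; ties keep input order)."""
    return sorted(ref_pocs, key=lambda poc: abs(current_poc - poc))

# Current picture has POC 8; candidates at POC 0, 4, 16, 6
refs = sort_reference_list(current_poc=8, ref_pocs=[0, 4, 16, 6])
```

The picture at POC 6 (distance 2) is placed first, making it the lowest reference index and thus the cheapest to signal.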
- the subtraction unit 121 generates a residual sample that is a difference between an original sample and a predicted sample.
- when the skip mode is applied, the residual sample may not be generated as described above.
- the transform unit 122 generates transform coefficients by transforming residual samples in units of transform blocks.
- the transform unit 122 may perform transform according to the size of the corresponding transform block and the prediction mode applied to the coding block or prediction block spatially overlapping the corresponding transform block. For example, if intra prediction is applied to the coding block or the prediction block overlapping the transform block, and the transform block is a 4×4 residual array, the residual sample may be transformed using a discrete sine transform (DST) kernel; in other cases, the residual sample may be transformed using a discrete cosine transform (DCT) kernel.
- DST Discrete Sine Transform
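The kernel selection rule just described reduces to a small decision function. This is a sketch of the rule only (the function name and string labels are illustrative; the actual transform applies the corresponding basis matrices):

```python
def select_transform_kernel(pred_mode: str, block_w: int, block_h: int) -> str:
    """Return the transform kernel per the rule above:
    DST for 4x4 intra-predicted residual blocks, DCT otherwise."""
    if pred_mode == "intra" and block_w == 4 and block_h == 4:
        return "DST"
    return "DCT"
```

For example, a 4x4 intra residual selects DST, while an inter-predicted 4x4 block or any larger block selects DCT.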
- the quantizer 123 may quantize the transform coefficients to generate a quantized transform coefficient.
- the rearrangement unit 124 rearranges the quantized transform coefficients.
- the reordering unit 124 may rearrange the quantized transform coefficients in a block form into a one-dimensional vector form through a coefficient scanning method.
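The coefficient scanning mentioned above can be illustrated with a simple anti-diagonal scan that linearizes a 2D coefficient block into a 1D vector. The scan order here is one plausible example; actual codecs define several scan patterns (diagonal, horizontal, vertical):

```python
def diagonal_scan(block):
    """Scan a square 2D coefficient block into a 1D list along
    anti-diagonals, starting from the top-left (DC) coefficient."""
    n = len(block)
    order = []
    for s in range(2 * n - 1):          # s = x + y indexes each anti-diagonal
        for y in range(n):
            x = s - y
            if 0 <= x < n:
                order.append(block[y][x])
    return order

coeffs = diagonal_scan([[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]])
```

The low-frequency coefficients near the top-left are emitted first, which tends to group the significant values at the start of the vector for entropy coding.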
- although the rearrangement unit 124 has been described as a separate component, the rearrangement unit 124 may be a part of the quantization unit 123.
- the entropy encoding unit 130 may perform entropy encoding on the quantized transform coefficients.
- Entropy encoding may include, for example, an encoding method such as exponential Golomb, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC).
- the entropy encoding unit 130 may encode information necessary for video reconstruction (eg, a value of a syntax element, etc.) other than the quantized transform coefficient together or separately.
- Entropy-encoded information may be transmitted or stored in a network abstraction layer (NAL) unit unit in the form of a bitstream.
- NAL network abstraction layer
- the inverse quantization unit 125 inversely quantizes the values (quantized transform coefficients) quantized by the quantization unit 123, and the inverse transform unit 126 inversely transforms the inversely quantized values to generate a residual sample.
- the adder 140 reconstructs a picture by combining the residual sample and the prediction sample.
- a reconstructed block may be generated by adding the residual sample and the prediction sample in units of blocks.
- although the adder 140 has been described as a separate component, the adder 140 may be a part of the prediction unit 110.
- the adder 140 may be referred to as a restoration unit or a restoration block generator.
- the filter unit 150 may apply a deblocking filter and/or a sample adaptive offset to a reconstructed picture. Artifacts of block boundaries in the reconstructed picture or distortion in the quantization process may be corrected through deblocking filtering and/or sample adaptive offset.
- the sample adaptive offset may be applied in units of samples, and may be applied after the process of deblocking filtering is completed.
- the filter unit 150 may apply an adaptive loop filter (ALF) to the reconstructed picture. ALF may be applied to the reconstructed picture after the deblocking filter and/or sample adaptive offset is applied.
- ALF adaptive loop filter
- the memory 160 may store a reconstructed picture (a decoded picture) or information required for encoding/decoding.
- the reconstructed picture may be a reconstructed picture whose filtering procedure has been completed by the filter unit 150 .
- the stored reconstructed picture may be used as a reference picture for (inter) prediction of another picture.
- the memory 160 may store (reference) pictures used for inter prediction.
- pictures used for inter prediction may be designated by a reference picture set or a reference picture list.
- the image encoding method may include block partitioning, intra/inter prediction, transform, quantization, and entropy encoding.
- the current picture may be divided into a plurality of blocks, a prediction block of the current block may be generated through intra/inter prediction, and the prediction block may be subtracted from the input block of the current block to generate a residual block of the current block.
- a coefficient block, i.e., the transform coefficients of the current block, may be generated by transforming the residual block.
- the transform coefficients may be quantized and entropy encoded and stored in a bitstream.
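The block pipeline summarized above can be sketched end to end. This is a deliberately simplified illustration: the transform is replaced by the identity, and the quantization step follows the common HEVC/VVC-style approximation Qstep ≈ 2^((QP−4)/6); none of the names are from the disclosure:

```python
def encode_block(orig, pred, qp=4):
    """Sketch of the pipeline: subtract the prediction, 'transform'
    (identity here, for illustration), quantize, then reconstruct by
    dequantizing and adding the prediction back."""
    step = 2 ** ((qp - 4) / 6)                      # quantization step size
    residual = [o - p for o, p in zip(orig, pred)]  # prediction subtracted
    levels = [round(r / step) for r in residual]    # quantization
    recon_res = [lv * step for lv in levels]        # dequantization
    recon = [p + r for p, r in zip(pred, recon_res)]
    return levels, recon

levels, recon = encode_block(orig=[10, 12, 14], pred=[9, 12, 15], qp=4)
```

At QP 4 the step size is 1, so the residual [1, 0, -1] is preserved exactly and the reconstruction matches the original block; at higher QP the rounding would introduce quantization distortion.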
- FIG. 3 is a diagram schematically illustrating a configuration of a video decoding apparatus to which the present invention can be applied.
- the video decoding apparatus 300 includes an entropy decoding unit 310 , a residual processing unit 320 , a prediction unit 330 , an adder 340 , a filter unit 350 and a memory 360 .
- the residual processing unit 320 may include a rearrangement unit 321 , an inverse quantization unit 322 , and an inverse transform unit 323 .
- the video decoding apparatus 300 may reconstruct a video corresponding to a process in which the video information is processed by the video encoding apparatus.
- the video decoding apparatus 300 may perform video decoding using a processing unit applied in the video encoding apparatus.
- a processing unit block of video decoding may be, as an example, a coding unit, and may be a coding unit, a prediction unit, or a transform unit, as another example.
- a coding unit may be partitioned from a coding tree unit along a quad tree structure and/or a binary tree structure.
- a prediction unit and a transform unit may be further used depending on the case, in which case a prediction block is a block derived or partitioned from a coding unit, and may be a unit of sample prediction. In this case, the prediction unit may be divided into sub-blocks.
- a transform unit may be divided along a quad tree structure from a coding unit, and may be a unit deriving a transform coefficient or a unit deriving a residual signal from a transform coefficient.
- the entropy decoding unit 310 may parse the bitstream and output information necessary for video or picture restoration. For example, the entropy decoding unit 310 may decode information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and may output values of syntax elements required for video reconstruction and quantized values of transform coefficients related to the residual.
- the CABAC entropy decoding method receives a bin corresponding to each syntax element in the bitstream, determines a context model using the information of the syntax element to be decoded, the decoding information of neighboring and decoding-target blocks, or the symbol/bin information decoded in the previous step, predicts the probability of occurrence of a bin according to the determined context model, and performs arithmetic decoding of the bin to generate a symbol corresponding to the value of each syntax element.
- the CABAC entropy decoding method may update the context model by using the decoded symbol/bin information for the context model of the next symbol/bin after determining the context model.
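The context-model update just described can be illustrated with a simplified, non-normative probability estimator: after each decoded bin, the estimated probability of a '1' moves toward the observed value. Real CABAC uses table-driven state transitions rather than this exponential moving average; the code is only a sketch of the adaptation idea:

```python
def adapt_context(bins, p_init=0.5, alpha=0.05):
    """Illustrative context-model adaptation: update the estimated
    probability of bin value 1 after each observed bin."""
    p = p_init
    history = []
    for b in bins:
        history.append(p)
        p = p + alpha * (b - p)   # move estimate toward the observed bin
    return p, history

# After a run of '1' bins, the model predicts '1' with higher probability
p_final, history = adapt_context([1, 1, 1, 1])
```

A more accurate probability estimate shortens the arithmetic-coded representation of likely symbols, which is the source of CABAC's compression gain.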
- the information about prediction among the information decoded by the entropy decoding unit 310 is provided to the prediction unit 330, and the residual value on which entropy decoding has been performed in the entropy decoding unit 310, that is, the quantized transform coefficients, may be input to the rearrangement unit 321.
- the reordering unit 321 may rearrange the quantized transform coefficients in a two-dimensional block form.
- the reordering unit 321 may perform reordering in response to coefficient scanning performed by the encoding apparatus.
- although the rearrangement unit 321 has been described as a separate component, the rearrangement unit 321 may be a part of the inverse quantization unit 322.
- the inverse quantizer 322 may inverse quantize the quantized transform coefficients based on the (inverse) quantization parameter to output the transform coefficients.
- information for deriving the quantization parameter may be signaled from the encoding device.
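The relation between the signaled quantization parameter and the dequantization above can be sketched with the common HEVC/VVC-style approximation, where the step size doubles every 6 QP values. This is an illustrative simplification of the normative scaling process:

```python
def qstep(qp: int) -> float:
    """Approximate quantization step size: Qstep = 2^((QP - 4) / 6),
    so the step doubles for every increase of 6 in QP."""
    return 2 ** ((qp - 4) / 6)

def dequantize(level: int, qp: int) -> float:
    """Reconstruct a transform coefficient from its quantized level."""
    return level * qstep(qp)
```

For example, QP 4 gives a step size of 1 (lossless rounding granularity), while QP 10 doubles the step to 2, halving the coefficient precision.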
- the inverse transform unit 323 may inverse transform the transform coefficients to derive residual samples.
- the prediction unit 330 may perform prediction on the current block and generate a predicted block including prediction samples for the current block.
- a unit of prediction performed by the prediction unit 330 may be a coding block, a transform block, or a prediction block.
- the prediction unit 330 may determine whether to apply intra prediction or inter prediction based on the information on the prediction.
- a unit for determining which one of intra prediction and inter prediction is applied and a unit for generating a prediction sample may be different.
- units for generating prediction samples in inter prediction and intra prediction may also be different.
- which one of inter prediction and intra prediction is to be applied may be determined in units of CUs.
- in inter prediction, a prediction mode may be determined and a prediction sample may be generated in units of PUs.
- in intra prediction, a prediction mode may be determined in units of PUs and a prediction sample may be generated in units of TUs.
- the prediction unit 330 may derive a prediction sample for the current block based on neighboring reference samples in the current picture.
- the prediction unit 330 may derive a prediction sample for the current block by applying a directional mode or a non-directional mode based on the neighboring reference samples of the current block.
- the prediction mode to be applied to the current block may be determined by using the intra prediction mode of the neighboring block.
- the prediction unit 330 may derive a prediction sample for the current block based on a sample specified on the reference picture by a motion vector on the reference picture.
- the prediction unit 330 may derive a prediction sample for the current block by applying any one of a skip mode, a merge mode, and an MVP mode.
- motion information necessary for inter prediction of the current block provided by the video encoding apparatus, for example, information about a motion vector, a reference picture index, and the like, may be obtained or derived based on the information about the prediction.
- motion information of a neighboring block may be used as motion information of the current block.
- the neighboring block may include a spatial neighboring block and a temporal neighboring block.
- the prediction unit 330 may construct a merge candidate list with motion information of available neighboring blocks, and use information indicated by a merge index on the merge candidate list as a motion vector of the current block.
- the merge index may be signaled from the encoding device.
- the motion information may include a motion vector and a reference picture. When motion information of a temporal neighboring block is used in the skip mode and the merge mode, the highest picture on the reference picture list may be used as the reference picture.
- in the skip mode, the difference (residual) between the prediction sample and the original sample is not transmitted.
- the motion vector of the current block may be derived by using the motion vector of the neighboring block as a motion vector predictor.
- the neighboring block may include a spatial neighboring block and a temporal neighboring block.
- a merge candidate list may be generated using a motion vector of a reconstructed spatial neighboring block and/or a motion vector corresponding to a Col block that is a temporal neighboring block.
- the motion vector of the candidate block selected from the merge candidate list is used as the motion vector of the current block.
- the prediction information may include a merge index indicating a candidate block having an optimal motion vector selected from among candidate blocks included in the merge candidate list.
- the prediction unit 330 may derive the motion vector of the current block by using the merge index.
- a motion vector predictor candidate list is generated using a motion vector of a reconstructed spatial neighboring block and/or a motion vector corresponding to a col block that is a temporal neighboring block.
- a motion vector of a reconstructed spatial neighboring block and/or a motion vector corresponding to a col block that is a temporal neighboring block may be used as a motion vector candidate.
- the prediction information may include a prediction motion vector index indicating an optimal motion vector selected from motion vector candidates included in the list.
- the prediction unit 330 may select a prediction motion vector of the current block from among motion vector candidates included in the motion vector candidate list by using the motion vector index.
- the prediction unit of the encoding apparatus may obtain a motion vector difference (MVD) between the motion vector of the current block and the motion vector predictor, encode it and output it in the form of a bitstream. That is, the MVD may be obtained by subtracting the motion vector predictor from the motion vector of the current block.
- the prediction unit 330 may obtain a motion vector difference included in the prediction-related information, and derive the motion vector of the current block by adding the motion vector difference and the motion vector predictor.
- the prediction unit may also obtain or derive a reference picture index indicating a reference picture from the information about the prediction.
- the adder 340 may reconstruct the current block or the current picture by adding the residual sample and the prediction sample.
- the adder 340 may reconstruct the current picture by adding the residual sample and the prediction sample in units of blocks.
- the prediction sample may be the reconstructed sample.
- although the adder 340 is described as a separate component, the adder 340 may be a part of the predictor 330 . Meanwhile, the adder 340 may be referred to as a reconstruction unit or a reconstruction block generator.
- the filter unit 350 may apply deblocking filtering, a sample adaptive offset, and/or an ALF to the reconstructed picture.
- the sample adaptive offset may be applied in units of samples or may be applied after deblocking filtering.
- ALF may be applied after deblocking filtering and/or sample adaptive offset.
- the memory 360 may store a reconstructed picture (a decoded picture) or information necessary for decoding.
- the reconstructed picture may be a reconstructed picture whose filtering procedure has been completed by the filter unit 350 .
- the memory 360 may store pictures used for inter prediction.
- pictures used for inter prediction may be designated by a reference picture set or a reference picture list.
- the reconstructed picture may be used as a reference picture for other pictures.
- the memory 360 may output the restored pictures according to the output order.
- the image decoding method may include entropy decoding, inverse quantization, inverse transform, and intra/inter prediction processes.
- the reverse process of the encoding method may be performed.
- quantized transform coefficients may be obtained through entropy decoding of the bitstream, and a coefficient block of the current block, i.e., transform coefficients, may be obtained through an inverse quantization process for the quantized transform coefficients.
- a residual block of the current block may be derived through inverse transform on the transform coefficients, and a reconstructed block may be derived by adding the prediction block of the current block, derived through intra/inter prediction, to the residual block of the current block.
- Floor(x) may represent the maximum integer value less than or equal to x.
- Log2(u) may represent the base-2 logarithm of u.
- Ceil(x) may represent the minimum integer value greater than or equal to x. For example, Floor(5.93) represents 5, since the maximum integer value less than or equal to 5.93 is 5.
- x >> y may represent an operator that right-shifts x by y binary digits.
- x << y may represent an operator that left-shifts x by y binary digits.
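The operators defined above can be sketched as follows. This is a minimal illustrative sketch (function names are our own), assuming the usual specification semantics for these operators.

```python
import math

def floor_op(x):
    # Floor(x): maximum integer value less than or equal to x
    return math.floor(x)

def ceil_op(x):
    # Ceil(x): minimum integer value greater than or equal to x
    return math.ceil(x)

def log2_op(u):
    # Log2(u): base-2 logarithm of u
    return math.log2(u)

def shr(x, y):
    # x >> y: right shift of x by y binary digits
    return x >> y

def shl(x, y):
    # x << y: left shift of x by y binary digits
    return x << y

print(floor_op(5.93))  # 5, matching the example above
```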
- the HEVC standard proposes two types of screen division methods.
- Slice: provides a function to encode/decode a single image by dividing it into coding tree units (CTUs) in raster scan order; slice header information exists for each slice.
- Tile: provides a function for encoding/decoding one image by dividing it into a plurality of columns and rows in units of CTUs. The division may be either uniform or individually specified. A separate header for a tile does not exist.
- a slice becomes a bit-stream packing unit. That is, one slice may be generated as one network abstraction layer (NAL) bit stream.
- a NAL packet for a slice is configured in the order of a NAL header, a slice header, and slice data.
- NUT: NAL unit type.
- Table 2 shows NUTs for slices proposed by the HEVC standard according to an embodiment.
- NUTs for an inter slice, in which inter prediction is performed, are numbered 0 to 9, and NUTs for an intra slice, in which intra prediction is performed, are numbered 16 to 21.
- an inter slice means a slice encoded by the inter prediction method, and an intra slice means a slice encoded by the intra prediction method.
- One slice is defined to have one NUT, and a plurality of slices in one picture may all be set to have the same NUT value. For example, if one picture is divided into 4 slices and encoded using an intra prediction method, the NUT values for all 4 slices in the picture may be equally set to “19: IDR_W_RADL”.
- IRAP: intra random access point.
- An intra slice can exist only as an I slice type.
- an inter slice may be divided into a P slice or a B slice according to unidirectional (P: predictive) or bidirectional (B: bi-predictive) prediction.
- Prediction and encoding processes are performed in a group of picture (GOP) unit, and the HEVC standard performs encoding/decoding processes including prediction using a hierarchical GOP structure.
- FIG. 6 shows an example of a hierarchical GOP structure, in which each picture is classified as an I, P, or B picture (slice) according to the prediction method.
- IRAP denotes an intra slice, and B and P denote inter slices.
- LP: leading picture.
- RADL: an LP that can be decoded during random access.
- RASL: an LP that cannot be decoded during random access, so that the restoration process of the corresponding picture must be skipped.
- pictures of the same color are defined as one GOP.
- RASL refers to an inter picture that uses, as a reference picture, a reconstructed picture in another GOP (in addition to its own GOP), or a picture reconstructed using such a reconstructed picture as a reference.
- when a reconstructed picture in another GOP is used as a reference picture (directly or indirectly), the structure is called an open GOP.
- RASL and RADL are set as NUT information for the corresponding inter slice.
- a NUT for an intra slice is divided into different intra slice NUTs according to the NUT of an inter slice preceding and/or following the corresponding intra slice in reproduction order and/or restoration order.
- IDR can be divided into IDR_W_RADL, which has RADL, and IDR_N_LP, which has no LP. That is, an IDR is a type that either has no LP or has only RADL among LPs; an IDR cannot have RASL.
- CRA is a type that can have both RADL and/or RASL among LPs. That is, CRA is a type that can support open GOP.
- reference picture information for the corresponding intra slice is not required.
- the reference picture is used for inter prediction.
- the reference picture information is not for use in the corresponding CRA slice itself, but is information on the reference picture scheduled to be used in an inter slice after the corresponding CRA (in restoration order). This prevents the reference picture from being removed from a decoded picture buffer (DPB).
- FIG. 10 is a diagram illustrating syntax for a slice segment header. As shown in FIG. 10 , if the NUT of the corresponding slice is not IDR, reference picture information may be described in the bitstream. That is, if the NUT of the slice is CRA, reference picture information may be described.
- the present invention provides a subpicture division method for compositing with other sequences and a slice division method for bitstream packing.
- a slice means an encoding/decoding region, and is a data packing unit generating one NAL bitstream. For example, one picture is divided into a plurality of slices, and each slice is generated as one NAL packet through an encoding process.
- a sub picture is a region division for synthesis with other contents.
- FIG. 11 shows an example of composition with other content.
- the white region and the gray region may constitute one subpicture as one slice, and two slices may constitute one subpicture in the black region. That is, one subpicture may include at least one slice.
- a bit-stream extractor and merger (BEAMer) extracts regions from different contents in units of subpictures and synthesizes them.
- the image synthesized in FIG. 11 may be divided into 4 slices and composed of 3 sub pictures.
- One sub picture means a region having the same sub picture ID and/or sub picture index value.
- at least one slice having the same subpicture ID and/or subpicture index value may be referred to as one subpicture region.
- a subpicture ID and/or a subpicture index value are included in the slice header information.
- subpicture index values may be set in raster-scan order. FIG. 12 shows an example in which one picture is composed of six slices (squares) and four subpicture regions (distinguished by color).
- “A”, “B”, “C”, and “D” represent examples of subpicture IDs
- “0” and “1” represent slice addresses in the corresponding subpicture.
- the slice address value is the slice index value in the raster scan order in the corresponding subpicture.
- “B-0” denotes the 0th slice in the B subpicture
- “B-1” denotes the 1st slice in the B subpicture.
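The subpicture-ID/slice-address scheme above can be sketched as follows. This is a hypothetical illustration (the function and input layout are our own, not a standard API), assuming each slice header carries a subpicture ID and a slice address in raster-scan order as in FIG. 12.

```python
def group_slices_by_subpicture(slice_headers):
    """slice_headers: list of (subpic_id, slice_address) pairs in
    raster-scan order, e.g. "B-0" becomes ("B", 0)."""
    regions = {}
    for subpic_id, slice_address in slice_headers:
        # slices sharing a subpicture ID form one subpicture region
        regions.setdefault(subpic_id, []).append(slice_address)
    return regions

# Six slices, four subpicture regions, as in the FIG. 12 example
headers = [("A", 0), ("B", 0), ("B", 1), ("C", 0), ("C", 1), ("D", 0)]
print(group_slices_by_subpicture(headers))
# {'A': [0], 'B': [0, 1], 'C': [0, 1], 'D': [0]}
```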
- NUT values for two or more sub-pictures constituting one image may be different.
- a white subpicture (slice) in one image may be an intra slice
- a gray subpicture (slice) and a black subpicture (slice) may be an inter slice.
- the corresponding function may be referred to as a mixed NAL unit type within a single picture, or simply as a mixed NUT.
- by adding mixed_nalu_type_in_pic_flag, the corresponding function can be enabled or disabled.
- the corresponding flag may be defined in one or more positions of a sequence parameter set (SPS), a picture parameter set (PPS), a picture header (PH), and a slice header (SH).
- NUTs for all subpictures and/or slices in the corresponding picture may have the same value.
- NUTs for all video coding layer (VCL) NAL units for one picture may be set to have the same value.
- VCL means a NAL type for a slice including a slice data value.
- when the NUT (e.g. first NUT) of any one VCL NAL unit (e.g. first NAL unit) of the picture is any one of IDR_W_RADL, IDR_N_LP, or CRA_NUT,
- the NUT of any other VCL NAL unit (e.g. second NAL unit) of the picture may be limited to any one of IDR_W_RADL, IDR_N_LP, CRA_NUT, or TRAIL_NUT.
- more specifically, the second NUT may be limited to one of the first NUT and TRAIL_NUT.
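The constraint above can be sketched as a simple validity check. This is an illustrative sketch (function name and representation are our own), assuming the restriction that, once one VCL NAL unit of a picture has an IRAP NUT, every other VCL NAL unit of that picture carries either the same IRAP NUT or TRAIL_NUT.

```python
IRAP_NUTS = {"IDR_W_RADL", "IDR_N_LP", "CRA_NUT"}

def mixed_nut_constraint_ok(vcl_nuts):
    """vcl_nuts: list of NUT names for all VCL NAL units of one picture."""
    irap = [t for t in vcl_nuts if t in IRAP_NUTS]
    if not irap:
        return True  # constraint only applies when an IRAP NUT is present
    first_nut = irap[0]
    # every unit must be either the (single) IRAP NUT or TRAIL_NUT
    return all(t in (first_nut, "TRAIL_NUT") for t in vcl_nuts)

print(mixed_nut_constraint_ok(["CRA_NUT", "TRAIL_NUT", "TRAIL_NUT"]))  # True
print(mixed_nut_constraint_ok(["CRA_NUT", "IDR_N_LP"]))                # False
```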
- an embodiment in which VCL NAL units of the corresponding picture have at least two different NUT values will be described with reference to FIGS. 12 and 13 .
- two or more sub-pictures may have two or more different NUT values.
- NUT values for all slices included in one subpicture may be limited to be the same.
- the NUT values for the two slices in the B subpicture of FIG. 12 may be set to the same CRA, the NUT values for the two slices in the C subpicture may likewise be set to the same TRAIL, and the A, B, C, and D subpictures may be set to have at least two different NUT values.
- the NUT values for the slices in the A, C, and D subpictures may be set as TRAIL to have a different NUT value from the CRA, which is the NUT of the B subpicture.
- the NUT for the intra slice and the inter slice may be specified as shown in Table 3.
- definitions and functions for RADL, RASL, IDR, CRA, etc. may be set to be the same as in the HEVC standard (Table 1).
- in Table 3, a hybrid NUT type is added.
- the disable value (eg 0) of mixed_nalu_type_in_pic_flag indicates the NUT for the slice in the picture (same as HEVC)
- the enable value (eg 1) of the mixed_nalu_type_in_pic_flag indicates the NUT for the slice in the subpicture.
- the NUT of the current picture may be identified as TRAIL_NUT, and the NUT of other subpictures belonging to the current picture may also be derived as TRAIL_NUT.
- the NUT of the current subpicture may be identified as TRAIL_NUT, and it may be predicted that at least one NUT among the other subpictures belonging to the current picture is not TRAIL_NUT.
- when any one VCL NAL unit (e.g. first NAL unit) belonging to one picture is IDR_W_RADL, IDR_N_LP, or CRA_NUT,
- at least one VCL NAL unit (e.g. second NAL unit) among the other VCL NAL units of the corresponding picture may have, as its NUT (e.g. second NUT), any one value among IDR_W_RADL, IDR_N_LP, CRA_NUT, or TRAIL_NUT other than the first NUT.
- when the VCL NAL unit (e.g. first NAL unit) for the first subpicture belonging to one picture has any one of IDR_W_RADL, IDR_N_LP, and CRA_NUT as its NUT (e.g. first NUT),
- the VCL NAL unit (e.g. second NAL unit) for the second subpicture may have, as its NUT (e.g. second NUT), any one value among IDR_W_RADL, IDR_N_LP, CRA_NUT, or TRAIL_NUT other than the first NUT.
- the NUT value of the VCL NAL unit for two or more subpictures may be configured as follows. The following description is merely an example and is not limited thereto.
- Combination 1 is an embodiment in which at least one subpicture in a picture has an IRAP (IDR or CRA) NUT value, while at least one other subpicture has a non-IRAP (inter slice) NUT value.
- it may be restricted so that RASL and RADL subpictures are not encoded in the bitstream associated with the IDR or CRA subpicture.
- only the TRAIL value may be allowed as the inter-slice NUT value.
- all inter-slice VCL NUTs may be allowed as inter-slice NUT values.
- Combination 2 is an embodiment in which at least one sub-picture in a picture has a non-IRAP (inter-slice) NUT value, while at least one other sub-picture has a different non-IRAP (inter-slice) NUT value.
- at least one subpicture may have a RASL NUT value and at least one other subpicture may have a RADL NUT value.
- the following restrictions may be applied depending on the embodiment.
- here, LP refers to RASL and RADL, and TRAIL refers to a non-LP.
- when the NUT of at least one subpicture is RASL (or RADL), the NUT of the other at least one subpicture cannot be TRAIL; RASL or RADL may be used as the NUT of the at least one other subpicture.
- the leading subpicture of an IRAP subpicture may be constrained to be a RADL or RASL subpicture.
- at least one sub-picture may be RASL (or RADL)
- at least one other sub-picture may be TRAIL.
- all subpictures may have the same inter-slice NUT value.
- all sub-pictures in a picture may have a TRAIL NUT value.
- all sub-pictures in a picture may have a RASL (or RADL) NUT value.
- Combination 3 represents an embodiment in which all subpictures or slices in a picture are configured with IRAP.
- the NUT value for the slice in the first subpicture may be IDR_W_RADL, IDR_N_LP, or CRA_NUT, and the NUT value for the slice in the second subpicture may consist of a value among IDR_W_RADL, IDR_N_LP, and CRA_NUT other than the NUT of the first subpicture.
- a NUT value for a slice within at least one subpicture may be IDR
- a NUT value for a slice within another at least one subpicture may be configured as a CRA.
- all pictures belonging to an IRAP or GDR access unit may be restricted to have the same NUT. That is, when the current access unit is an IRAP access unit configured only with IRAP pictures or the current access unit is a GDR access unit configured only with GDR pictures, pictures belonging to the current access unit may be restricted to have the same NUT. For example, while the NUT value for a slice within at least one subpicture is IDR, it may be restricted so that the NUT value for a slice within another at least one subpicture is not configured as a CRA.
- At least one subpicture in the corresponding picture may be restricted to have a NUT value for non-IRAP (inter slice). For example, it may be restricted so that all subpictures in the corresponding picture cannot have the NUT value for IDR in the encoding and decoding process. Alternatively, it may be restricted so that some subpictures in the corresponding picture have a NUT value for IDR and other subpictures do not have a CRA NUT value.
- when the value of mixed_nalu_type_in_pic_flag indicates that the mixed NUT is applied, one picture may be divided into at least two subpictures. Accordingly, information on the subpictures of the corresponding picture may be signaled through the bitstream.
- mixed_nalu_type_in_pic_flag may indicate whether the current picture is split. For example, when a value of mixed_nalu_type_in_pic_flag indicates that a mixed NUT is applied, it may indicate that the current picture is split.
- FIG. 14 is a diagram illustrating an embodiment of a syntax of a picture parameter set (PPS).
- a flag (e.g. pps_no_pic_partition_flag) may indicate whether picture partitioning is applied. A value of pps_no_pic_partition_flag indicating enable may indicate that picture partitioning is not applied to pictures referring to the current PPS, and a value indicating disable (e.g. 0) may indicate that picture division using a slice or a tile can be applied to pictures referring to the current PPS.
- the value of pps_no_pic_partition_flag may be forced to a value indicating disable (e.g. 0).
- when pps_no_pic_partition_flag indicates that the current picture can be divided, information on the number of subpictures (e.g. pps_num_subpics_minus1) may be obtained from the bitstream.
- pps_num_subpics_minus1 may represent a value obtained by subtracting 1 from the number of subpictures included in the current picture.
- otherwise, the value of pps_num_subpics_minus1 may be derived as 0 without being obtained from the bitstream. According to the determined information on the number of subpictures, encoding information for each subpicture may be signaled as many times as the number of subpictures included in one picture.
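The parsing rule above can be sketched as follows. This is an illustrative sketch (the reader callback is hypothetical, not a standard API): pps_num_subpics_minus1 is read from the bitstream only when partitioning is possible, and is otherwise inferred as 0.

```python
def num_subpics(pps_no_pic_partition_flag, read_ue):
    """read_ue: callback that parses one ue(v) value from the bitstream."""
    if pps_no_pic_partition_flag:
        pps_num_subpics_minus1 = 0  # inferred; not present in the bitstream
    else:
        pps_num_subpics_minus1 = read_ue()  # parsed from the bitstream
    return pps_num_subpics_minus1 + 1  # number of subpictures in the picture

print(num_subpics(1, lambda: 3))  # 1 (no partitioning: single subpicture)
print(num_subpics(0, lambda: 3))  # 4
```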
- a subpicture identifier (e.g. pps_subpic_id) for identifying each subpicture and/or a flag indicating whether the encoding/decoding process of each subpicture is independent (subpic_treated_as_pic_flag[ i ]) may be designated and signaled.
- the hybrid NUT can be applied when one picture consists of two or more sub pictures.
- a flag (subpic_treated_as_pic_flag[ i ]) indicating whether the encoding/decoding process of each subpicture is independent as much as the number (i) of the subpictures included in one picture may be designated and signaled.
- the corresponding subpicture may refer to another subpicture in the picture during the inter prediction process.
- a separate flag may be placed to control whether the in-loop filter process is independent or referenced.
- the corresponding flag (subpic_treated_as_pic_flag) may be defined in one or more positions of SPS, PPS, and PH.
- the corresponding flag may be named sps_subpic_treated_as_pic_flag.
- each subpicture in the picture may have to be encoded/decoded independently.
- when the value of mixed_nalu_type_in_pic_flag is 1, the subpic_treated_as_pic_flag values of all subpictures in the picture may be forced to be set to, or derived as, “1”.
- when the value of mixed_nalu_type_in_pic_flag is 1 and the NUT of the current picture is RASL, the subpic_treated_as_pic_flag for the current picture may be forced to be set to “1”.
- when the value of mixed_nalu_type_in_pic_flag is 1, the NUT of the current picture is RADL, and the NUT of the referenced picture is RASL, the subpic_treated_as_pic_flag for the current picture may be forced to “1”.
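The forcing rules above can be sketched as one predicate. This is an illustrative sketch (function name and encoding are our own assumptions): under the mixed NUT, every subpicture is forced to be treated as a picture, with the RASL and RADL-referencing-RASL cases listed as the specific rules from the text.

```python
def subpic_treated_as_pic_forced(mixed_flag, picture_nut, ref_picture_nut=None):
    """Return True when subpic_treated_as_pic_flag must be '1'."""
    if not mixed_flag:
        return False            # no forcing rule described for this case
    if picture_nut == "RASL_NUT":
        return True             # current picture is RASL
    if picture_nut == "RADL_NUT" and ref_picture_nut == "RASL_NUT":
        return True             # RADL picture referencing a RASL picture
    return True                 # mixed NUT: all subpictures forced to '1'
```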
- the hybrid NUT function may restrict that all subpictures (or slices) in one picture are configured with IRAP.
- the flag (gdr_or_irap_pic_flag) may be defined in one or more positions of SPS, PPS, and PH.
- At least one subpicture in one picture has an IRAP (IDR or CRA) NUT value, while at least one other subpicture has a non-IRAP (inter slice) NUT value.
- an intra slice and an inter slice may exist simultaneously in one picture.
- the DPB is reset. Accordingly, all reconstructed pictures existing in the DPB at the time point are removed.
- RPL: reference picture list information.
- the flag may be defined in one or more positions of SPS, PPS, and PH.
- when the flag is defined in the SPS, the flag may be named sps_idr_rpl_present_flag.
- slice header information may be signaled using the syntax of the slice header of FIG. 15 .
- a first value (e.g. 0) of sps_idr_rpl_present_flag may indicate that the RPL syntax element is not provided by the slice header of a slice whose NUT is IDR_N_LP or IDR_W_RADL.
- a second value (e.g. 1) of sps_idr_rpl_present_flag may indicate that the RPL syntax element may be provided by a slice header of a slice whose NUT is IDR_N_LP or IDR_W_RADL.
- when the value of mixed_nalu_type_in_pic_flag indicates that the mixed NUT is applied, the value of pps_no_pic_partition_flag may be forced to a value indicating disable (e.g. 0). Accordingly, the value of the flag (pps_rpl_info_in_ph_flag) indicating whether RPL information is provided in the picture header may be obtained from the bitstream. If pps_rpl_info_in_ph_flag indicates enable (e.g. 1), RPL information may be obtained from the picture header as shown in FIGS. 16 and 17 .
- RPL information may be obtained irrespective of the type of the corresponding picture.
- if pps_rpl_info_in_ph_flag indicates disable (e.g. 0), RPL information cannot be obtained from the picture header.
- when the value of pps_rpl_info_in_ph_flag is “0”, the slice NUT is IDR_N_LP or IDR_W_RADL, and the value of sps_idr_rpl_present_flag is “0”, the RPL information of the corresponding slice cannot be obtained. That is, since there is no RPL information for the slice, the RPL may be initialized and derived as empty.
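The RPL-availability logic above can be sketched as follows. This is an illustrative sketch (function name and return values are our own): an IDR slice's RPL comes from the picture header when pps_rpl_info_in_ph_flag is set, may come from the slice header when sps_idr_rpl_present_flag is set, and is otherwise initialized empty.

```python
def idr_rpl_source(pps_rpl_info_in_ph_flag, sps_idr_rpl_present_flag):
    """Where the reference picture list of an IDR slice comes from."""
    if pps_rpl_info_in_ph_flag:
        return "picture_header"  # RPL parsed from the PH, regardless of NUT
    if sps_idr_rpl_present_flag:
        return "slice_header"    # RPL syntax may be present in the SH
    return "empty"               # no RPL signalled: derive an empty list

print(idr_rpl_source(0, 0))  # empty
```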
- one picture may be signaled by heterogeneous NAL units.
- since NAL units having different NUTs can be used to signal one picture, a method for determining the type of a picture according to the types of its NAL units is required. Accordingly, during random access (RA), it may be determined whether a corresponding picture can be normally restored and output.
- when each VCL NAL unit corresponding to one picture is a CRA_NUT type NAL unit, the corresponding picture may be determined as a CRA picture. When each VCL NAL unit corresponding to one picture is an IDR_W_RADL or IDR_N_LP type NAL unit, the corresponding picture may be determined as an IDR picture. When each VCL NAL unit corresponding to one picture is an IDR_W_RADL, IDR_N_LP, or CRA_NUT type NAL unit, the corresponding picture may be determined as an IRAP picture.
- when each VCL NAL unit corresponding to one picture is a RADL_NUT type NAL unit, the corresponding picture may be determined as a random access decodable leading (RADL) picture.
- when each VCL NAL unit corresponding to one picture is a TRAIL_NUT type NAL unit, the corresponding picture may be determined as a trailing picture.
- when the type of at least one VCL NAL unit among the VCL NAL units corresponding to one picture is RASL_NUT and the types of all other VCL NAL units are RASL_NUT or RADL_NUT, the corresponding picture may be determined as a random access skipped leading (RASL) picture.
- when at least one subpicture in one picture is RASL and the other at least one subpicture is RADL, the corresponding picture may be determined as a RASL picture; that is, it may be set as a RASL picture during the decoding process.
- when the type of the VCL NAL unit corresponding to a subpicture is RASL_NUT, the subpicture may be determined to be RASL. Accordingly, during RA, both the RASL subpicture and the RADL subpicture may be treated as RASL pictures, and the corresponding picture may not be output.
- the corresponding picture may be set as a RASL picture during the decoding process. Accordingly, during RA, the corresponding picture may be treated as a RASL picture, and the corresponding picture may not be output.
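The picture-type rules above can be sketched as one derivation function. This is an illustrative sketch (function name and "OTHER" fallback are our own): the picture type is derived from the set of VCL NAL unit types of its slices, and in particular a picture mixing RASL_NUT and RADL_NUT units is treated as a RASL picture.

```python
def derive_picture_type(vcl_nuts):
    """vcl_nuts: non-empty list of NUT names for one picture's VCL units."""
    types = set(vcl_nuts)
    if types == {"CRA_NUT"}:
        return "CRA"
    if types and types <= {"IDR_W_RADL", "IDR_N_LP"}:
        return "IDR"
    if types == {"RADL_NUT"}:
        return "RADL"
    if types == {"TRAIL_NUT"}:
        return "TRAIL"
    # at least one RASL unit, all others RASL or RADL -> RASL picture
    if "RASL_NUT" in types and types <= {"RASL_NUT", "RADL_NUT"}:
        return "RASL"
    return "OTHER"

print(derive_picture_type(["RASL_NUT", "RADL_NUT"]))  # RASL
```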
- the occurrence of RA may be determined by the NoOutputBeforeRecoveryFlag value of the (related) IRAP picture associated with the corresponding inter slice (RADL, RASL, or TRAIL).
- the corresponding flag value may be set as follows for IRAP.
- NoOutputBeforeRecoveryFlag is set to “1”
- NoOutputBeforeRecoveryFlag is set to “0”
- the decoding apparatus may receive a signal of occurrence of random access from an external terminal.
- the external terminal may signal the occurrence of random access to the decoding device by setting the value of the random access occurrence information to 1.
- the decoding apparatus may set the value of the flag HandleCraAsClvsStartFlag, which indicates whether random access has occurred, to 1 according to the random access occurrence information received from the external terminal.
- the decoding apparatus may set the value of NoOutputBeforeRecoveryFlag to the same value as HandleCraAsClvsStartFlag.
- accordingly, the decoding apparatus may determine that random access has occurred for the corresponding CRA picture, or may perform decoding by treating the CRA as being located at the beginning of the bitstream.
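The external random-access signalling above can be sketched as follows. This is an illustrative sketch (the function is our own framing of the text): the externally supplied occurrence information sets HandleCraAsClvsStartFlag, and NoOutputBeforeRecoveryFlag is then copied from it.

```python
def on_random_access_signal(random_access_occurred):
    """random_access_occurred: occurrence information from an external means."""
    HandleCraAsClvsStartFlag = 1 if random_access_occurred else 0
    # NoOutputBeforeRecoveryFlag mirrors HandleCraAsClvsStartFlag, so the
    # CRA picture is treated as the start of the stream on random access
    NoOutputBeforeRecoveryFlag = HandleCraAsClvsStartFlag
    return HandleCraAsClvsStartFlag, NoOutputBeforeRecoveryFlag

print(on_random_access_signal(True))   # (1, 1)
print(on_random_access_signal(False))  # (0, 0)
```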
- a process of setting a flag (PictureOutputFlag) for determining whether or not a current picture is output during RA is as follows.
- the PictureOutputFlag for the current picture may be set according to the following order.
- the first value (e.g. 0) of PictureOutputFlag may indicate that the current picture is not output.
- the second value (e.g. 1) of PictureOutputFlag may indicate that the current picture is output.
- pic_output_flag may be obtained at one or more positions of PH and SH.
- FIG. 18 shows an example of the synthesis of three different contents presented in the present invention.
- FIG. 18-(a) shows sequences for three different contents; for convenience, one picture is shown as one packet, but one picture may be divided into multiple slices so that multiple packets exist.
- FIGS. 18-(b) and 18-(c) show the result of the synthesized image for the pictures indicated by the dotted lines in FIG. 18-(a).
- the same color means the same picture/subpicture/slice.
- the P slice and the B slice may have one value among the inter NUTs.
- when synthesizing a plurality of contents through the present invention, the contents can be synthesized quickly and easily without delay by simply matching the hierarchical GOP structure, without necessarily aligning the intra slice (picture) positions equally.
- FIGS. 19 and 20 are flowcharts illustrating a decoding method and an encoding method according to an embodiment of the present invention.
- An image decoding apparatus may include a memory and at least one processor, and the following decoding method may be performed by the operation of the processor.
- the decoding apparatus may obtain NAL unit type information indicating the type of the current network abstraction layer (NAL) unit from the bitstream (S1910).
- the decoding apparatus determines whether a mixed NAL unit type is applied to the current picture, and based thereon, the slice may be decoded (S1920).
- the decoding apparatus may perform decoding of the image slice by determining, based on whether the hybrid NAL unit type is applied, whether the NAL unit type of the current NAL unit indicates the property of a subpicture for the current image slice.
- Whether the hybrid NAL unit type is applied may be identified based on a first flag (e.g. pps_mixed_nalu_types_in_pic_flag) obtained from a picture parameter set.
- the current picture to which the current image slice belongs may be divided into at least two subpictures.
- decoding information for a subpicture may be included in the bitstream based on whether the hybrid NAL unit type is applied.
- a second flag (e.g. pps_no_pic_partition_flag) and a third flag (e.g. pps_rpl_info_in_ph_flag) may be used: the value of the second flag is forced to 0 as the current picture is forced to be divided into at least two subpictures, and the third flag indicates whether the reference picture list information is provided in the picture header.
- the current picture may be decoded based on the first subpicture and the second subpicture having different NAL unit types.
- the NAL unit type of the first subpicture may have any one value among IDR_W_RADL (instantaneous decoding refresh with random access decodable leading), IDR_N_LP (instantaneous decoding refresh, no leading picture), and CRA_NUT (clean random access NAL unit type).
- the available NAL unit types selectable for the second subpicture may include the NAL unit types not selected for the first subpicture from among IDR_W_RADL, IDR_N_LP, and CRA_NUT.
- the available types for the second subpicture may also include TRAIL_NUT (trailing NAL unit type).
- the first subpicture and the second subpicture constituting the current picture may be independently decoded.
- the first subpicture and the second subpicture including the B or P slice may be treated as one picture and decoded.
- the first sub-picture may be decoded without using the second sub-picture as a reference picture.
- a fourth flag (e.g. sps_subpic_treated_as_pic_flag) indicating whether the first subpicture is treated as a picture in the decoding process may be obtained from the bitstream.
- when the fourth flag indicates that the first subpicture is treated as a picture in the decoding process, the first subpicture may be treated as a picture and decoded accordingly.
- the slice type belonging to the current picture must be intra.
- when the fourth flag indicates that the first subpicture is treated as a picture in the decoding process, it may be determined that the decoding process of the first subpicture is independent of other subpictures. For example, if the fourth flag indicates that the first subpicture is decoded independently of other subpictures during the decoding process, the first subpicture may be decoded without using the other subpictures as reference pictures.
- the current picture may be determined as a RASL picture based on whether the second subpicture is a RADL (Random Access Decodable Leading) subpicture.
- when the type of the NAL unit corresponding to the first subpicture is RASL_NUT (random access skipped leading NAL unit type), the first subpicture may be determined as a RASL subpicture.
- A fifth flag may indicate whether reference picture list information for the IDR picture may be present in the slice header.
- The reference picture list information may be obtained from the bitstream for the slice header.
- The fifth flag may be obtained from the bitstream in the sequence parameter set.
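The role of the fifth flag can be sketched as a conditional parse; `read_rpl` below stands in for the actual bitstream read and is an assumed name, not a real parser API.

```python
def maybe_read_rpl(nut: str, idr_rpl_present_flag: bool, read_rpl):
    """Parse reference picture list (RPL) information from a slice header
    only when it can be present: for IDR NAL unit types, the fifth flag
    (obtained from the sequence parameter set) gates whether RPL
    information exists in the slice header at all."""
    if nut in ("IDR_W_RADL", "IDR_N_LP") and not idr_rpl_present_flag:
        return None  # RPL information is absent for this IDR slice
    return read_rpl()
```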
- An image encoding apparatus may include a memory and at least one processor, and an encoding method corresponding to the above-described decoding method may be performed by the operation of the processor. For example, when the current picture is encoded based on the mixed NAL unit type, the encoding apparatus may determine the type of a subpicture into which the picture is divided (S2010). Then, the encoding apparatus may generate a current NAL unit by encoding at least one current image slice constituting the subpicture based on the type of the subpicture (S2020).
- The encoding apparatus may encode the image slice by encoding the NAL unit type of the current NAL unit so that it indicates the property of the subpicture for the current image slice.
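The two encoder steps (S2010, S2020) can be summarized as a loop over subpictures; this is a structural sketch only, with `encode_slice` standing in for the real slice encoder and the dictionary layout being an illustrative assumption.

```python
def encode_picture(subpic_types, subpic_slices, encode_slice):
    """S2010: each subpicture's type has already been determined
    (subpic_types, one entry per subpicture).
    S2020: encode every slice of each subpicture into a NAL unit whose
    NAL unit type reflects the subpicture's property."""
    nal_units = []
    for nut, slices in zip(subpic_types, subpic_slices):
        for sl in slices:
            nal_units.append({"nal_unit_type": nut, "payload": encode_slice(sl)})
    return nal_units
```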
- The present invention can be embodied as computer-readable code on a computer-readable recording medium (including any device having an information processing function).
- the computer-readable recording medium includes all types of recording devices in which data readable by a computer system is stored. Examples of computer-readable recording devices include ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage device.
Claims (16)
- An image decoding method performed by an image decoding apparatus, the method comprising: obtaining, from a bitstream, NAL unit type information indicating the type of a current network abstraction layer (NAL) unit; and, when the NAL unit type information indicates that the NAL unit type of the current NAL unit is coded data for an image slice, decoding the image slice based on whether a mixed NAL unit type is applied to a current picture, wherein the decoding of the image slice is performed by determining, based on whether the mixed NAL unit type is applied, whether the NAL unit type of the current NAL unit indicates a property of a subpicture for the current image slice.
- The image decoding method of claim 1, wherein whether the mixed NAL unit type is applied is identified based on a first flag obtained from a picture parameter set.
- The image decoding method of claim 1, wherein a second flag indicating whether the current picture is not partitioned is obtained from the bitstream; when the second flag indicates that the current picture can be partitioned, a third flag indicating whether reference picture list information is provided in a picture header is obtained from the bitstream; and when the mixed NAL unit type is applied, the second flag is constrained to have a value indicating that the current picture can be partitioned.
- The image decoding method of claim 3, wherein, when the third flag indicates that the reference picture list information is provided in the picture header, the reference picture list information is obtained from the bitstream for the picture header.
- The image decoding method of claim 1, wherein, when the mixed NAL unit type is applied, the current picture to which the current image slice belongs is partitioned into at least two subpictures.
- The image decoding method of claim 5, wherein, when the mixed NAL unit type is applied, the current picture is decoded based on a first subpicture and a second subpicture having different NAL unit types.
- The image decoding method of claim 6, wherein a fourth flag indicating whether the first subpicture is treated as a picture in a decoding process is obtained from the bitstream, and when the fourth flag indicates that the first subpicture is treated as a picture in the decoding process, the first subpicture is treated as a picture and decoded.
- The image decoding method of claim 7, wherein whether the decoding process of the first subpicture is independent is determined based on the fourth flag, and when the fourth flag indicates that the first subpicture is decoded independently of other subpictures in the decoding process, the first subpicture is decoded without referring to the other subpictures.
- The image decoding method of claim 6, wherein, when the NAL unit type of the first subpicture has a value of any one of IDR_W_RADL (Instantaneous Decoding Refresh_With_Random Access Decodable Leading), IDR_N_LP (Instantaneous Decoding Refresh_No reference_Leading Picture), and CRA_NUT (Clean Random Access_NAL Unit Type), the available NAL unit types of the second subpicture include the NAL unit types not selected for the first subpicture from among IDR_W_RADL, IDR_N_LP, and CRA_NUT.
- The image decoding method of claim 6, wherein, when the NAL unit type of the first subpicture has a value of any one of IDR_W_RADL (Instantaneous Decoding Refresh_With_Random Access Decodable Leading), IDR_N_LP (Instantaneous Decoding Refresh_No reference_Leading Picture), and CRA_NUT (Clean Random Access_NAL Unit Type), the available NAL unit types of the second subpicture include TRAIL_NUT (Trail_NAL Unit Type).
- The image decoding method of claim 6, wherein, when the first subpicture is a RASL (Random Access Skipped Leading) subpicture, the current picture is determined to be a RASL picture based on whether the second subpicture is a RADL (Random Access Decodable Leading) subpicture.
- The image decoding method of claim 6, wherein, when the NAL unit type of the first subpicture has a value of either IDR_W_RADL (Instantaneous Decoding Refresh_With_Random Access Decodable Leading) or IDR_N_LP (Instantaneous Decoding Refresh_No reference_Leading Picture), reference picture list information is obtained from the bitstream based on a fifth flag indicating whether reference picture list information for an IDR picture may be present.
- The image decoding method of claim 12, wherein the fifth flag is obtained from a sequence parameter set.
- An image decoding apparatus comprising a memory and at least one processor, wherein the at least one processor is configured to: obtain, from a bitstream, NAL unit type information indicating the type of a current network abstraction layer (NAL) unit; and, when the NAL unit type information indicates that the NAL unit type of the current NAL unit is coded data for an image slice, decode the image slice based on whether a mixed NAL unit type is applied to a current picture, wherein the decoding of the image slice is performed by determining, based on whether the mixed NAL unit type is applied, whether the NAL unit type of the current NAL unit indicates a property of a subpicture for the current image slice.
- An image encoding method performed by an image encoding apparatus, the method comprising: when a current picture is encoded based on a mixed NAL unit type, determining the type of a subpicture into which the picture is divided; and generating a current NAL unit by encoding, based on the type of the subpicture, at least one current image slice constituting the subpicture, wherein the encoding of the image slice is performed by encoding, when the current picture is encoded based on the mixed NAL unit type, the NAL unit type of the current NAL unit to indicate a property of the subpicture for the current image slice.
- A method of transmitting a bitstream generated by the image encoding method of claim 15.
Priority Applications (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
BR112022014902A BR112022014902A2 (pt) | 2020-04-24 | 2020-12-16 | Métodos de decodificação de vídeo, dispositivo de decodificação de vídeo e método de transmissão de fluxo de dados contínuo |
CN202080005988.4A CN113875245A (zh) | 2020-04-24 | 2020-12-16 | 影像的译码方法及装置 |
EP20870472.6A EP4142286A4 (en) | 2020-04-24 | 2020-12-16 | IMAGE DECODING METHOD AND APPARATUS |
MX2022008860A MX2022008860A (es) | 2020-04-24 | 2020-12-16 | Metodo y dispositivo de decodificacion de video. |
JP2021560581A JP7358502B2 (ja) | 2020-04-24 | 2020-12-16 | 映像の復号化方法及び装置 |
US17/314,032 US11265560B2 (en) | 2020-04-24 | 2021-05-06 | Method and device for decoding video |
US17/643,370 US11770543B2 (en) | 2020-04-24 | 2021-12-08 | Method and device for decoding video |
US18/467,662 US20240007655A1 (en) | 2020-04-24 | 2023-09-14 | Method and device for decoding video |
JP2023166084A JP2023169389A (ja) | 2020-04-24 | 2023-09-27 | 映像の復号化方法及び装置 |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2020-0050298 | 2020-04-24 | ||
KR20200050298 | 2020-04-24 | ||
KR1020200153467A KR102267873B1 (ko) | 2020-04-24 | 2020-11-17 | Method and device for decoding video
KR1020200153465A KR102267844B1 (ko) | 2020-04-24 | 2020-11-17 | Method and device for decoding video
KR10-2020-0153465 | 2020-11-17 | ||
KR10-2020-0153467 | 2020-11-17 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/314,032 Continuation US11265560B2 (en) | 2020-04-24 | 2021-05-06 | Method and device for decoding video |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021215614A1 true WO2021215614A1 (ko) | 2021-10-28 |
Family
ID=76600394
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2020/018464 WO2021215614A1 (ko) | 2020-12-16 | Method and device for decoding video |
Country Status (3)
Country | Link |
---|---|
KR (3) | KR102267873B1 (ko) |
TW (1) | TWI782498B (ko) |
WO (1) | WO2021215614A1 (ko) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20150081236A (ko) * | 2014-01-03 | 2015-07-13 | Samsung Electronics Co., Ltd. | Video encoding method and apparatus using efficient parameter delivery, and video decoding method and apparatus |
KR20180097106A (ko) * | 2017-02-22 | 2018-08-30 | SK Telecom Co., Ltd. | Method and apparatus for video decoding |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130114694A1 (en) * | 2011-11-08 | 2013-05-09 | Qualcomm Incorporated | Parameter set groups for coded video data |
-
2020
- 2020-11-17 KR KR1020200153467A patent/KR102267873B1/ko active IP Right Grant
- 2020-11-17 KR KR1020200153465A patent/KR102267844B1/ko active IP Right Grant
- 2020-12-16 WO PCT/KR2020/018464 patent/WO2021215614A1/ko active Application Filing
-
2021
- 2021-04-21 TW TW110114422A patent/TWI782498B/zh active
- 2021-05-13 KR KR1020210061991A patent/KR20210131920A/ko unknown
Non-Patent Citations (3)
Title |
---|
BENJAMIN BROSS , JIANLE CHEN , SHAN LIU , YE-KUI WANG: "Versatile Video Coding (Draft 8)", 17. JVET MEETING; 20200107 - 20200117; BRUSSELS; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-Q2001-vE, 12 March 2020 (2020-03-12), pages 1 - 510, XP030285390 * |
L. CHEN, S.-T. HSIANG, O. CHUBACH, Y.-W. HUANG, S.-M. LEI (MEDIATEK): "AHG9: On signalling the mixed NAL unit type flag", 130. MPEG MEETING; 20200420 - 20200424; ALPBACH; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), 3 April 2020 (2020-04-03), XP030285854 * |
Y.-K. WANG (BYTEDANCE): "AHG9: A summary of proposals on mixed NAL unit types within a coded picture", 130. MPEG MEETING; 20200420 - 20200424; ALPBACH; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), 12 April 2020 (2020-04-12), XP030287526 * |
Also Published As
Publication number | Publication date |
---|---|
TWI782498B (zh) | 2022-11-01 |
TW202141985A (zh) | 2021-11-01 |
KR20210131920A (ko) | 2021-11-03 |
KR102267844B1 (ko) | 2021-06-22 |
KR102267873B1 (ko) | 2021-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018044088A1 (ko) | Method and apparatus for processing video signal | |
WO2017057953A1 (ko) | Method and apparatus for coding residual signal in video coding system | |
WO2016200043A1 (ko) | Method and apparatus for inter prediction based on virtual reference picture in video coding system | |
WO2012023763A2 (ko) | Inter prediction encoding method | |
WO2012148138A2 (ko) | Intra prediction method, and encoder and decoder using same | |
WO2017188565A1 (ko) | Image decoding method and apparatus in image coding system | |
WO2018056702A1 (ko) | Method and apparatus for processing video signal | |
WO2020242145A1 (ko) | Video coding method and apparatus using adaptive parameter set | |
WO2018128222A1 (ko) | Image decoding method and apparatus in image coding system | |
WO2016056754A1 (ko) | Method and apparatus for encoding/decoding 3D video | |
WO2021225338A1 (ko) | Image decoding method and apparatus therefor | |
WO2021201515A1 (ko) | Image encoding/decoding method and apparatus for signaling HLS, and computer-readable recording medium storing bitstream | |
WO2020076066A1 (ko) | Syntax design method and apparatus for performing coding using syntax | |
WO2019143103A1 (ko) | Video coding method and apparatus using various transform techniques | |
WO2014051372A1 (ko) | Image decoding method and apparatus using same | |
WO2021118261A1 (ko) | Method and apparatus for signaling image information | |
WO2021060801A1 (ko) | Image encoding/decoding method and apparatus, and recording medium storing bitstream | |
WO2015057032A1 (ko) | Method and apparatus for encoding/decoding multi-view video | |
WO2019027200A1 (ko) | Method and apparatus for expressing positions of non-zero coefficients | |
WO2020242181A1 (ko) | Intra mode candidate configuration method and image decoding apparatus | |
WO2018128228A1 (ko) | Image decoding method and apparatus in image coding system | |
WO2021215614A1 (ko) | Method and device for decoding video | |
WO2021107634A1 (ko) | Method and apparatus for signaling picture partitioning information | |
WO2021235895A1 (ko) | Image coding method and apparatus therefor | |
WO2021137588A1 (ko) | Image decoding method and apparatus for coding image information including picture header | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2021560581 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20870472 Country of ref document: EP Kind code of ref document: A1 |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112022014902 Country of ref document: BR |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2022120184 Country of ref document: RU |
|
ENP | Entry into the national phase |
Ref document number: 2020870472 Country of ref document: EP Effective date: 20221124 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 112022014902 Country of ref document: BR Kind code of ref document: A2 Effective date: 20220727 |