WO2020175905A1 - Method and apparatus for partitioning an image based on signaled information

Method and apparatus for partitioning an image based on signaled information

Info

Publication number
WO2020175905A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
tile
tiles
picture
prediction
Prior art date
Application number
PCT/KR2020/002730
Other languages
English (en)
Korean (ko)
Inventor
Seethal Paluri
Seunghwan Kim
Original Assignee
LG Electronics Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LG Electronics Inc.
Publication of WO2020175905A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • This disclosure relates to video coding technology, and more specifically, to a picture partitioning method and apparatus based on signaled information.
  • Recently, the demand for high-resolution, high-quality images/videos is increasing in various fields. As image/video data becomes higher in resolution and quality, the amount of information or bits to be transmitted increases compared to existing image/video data. Therefore, when image data is transmitted using a medium such as an existing wired/wireless broadband line or stored using an existing storage medium, transmission cost and storage cost increase.
  • Accordingly, a flexible picture partitioning method that can be applied to efficiently compress and play back images/videos is required.
  • A technical task of this disclosure is to provide a method and apparatus for increasing image coding efficiency.
  • Another technical task of this disclosure is to provide a method and apparatus for signaling partitioning information.
  • Another technical task of this disclosure is to provide a method and apparatus for partitioning pictures based on signaled information.
  • Another technical task of this disclosure is to provide a method and apparatus for partitioning a current picture based on partition information for the current picture.
  • Another technical task of this disclosure is to provide a method and apparatus for partitioning the current picture based on at least one of: flag information on whether the current picture is divided into motion constrained tile sets (MCTSs), information on the number of MCTSs in the current picture, position information of the tile located at the top-left of each MCTS, or position information of the tile located at the bottom-right of each MCTS.
  • MCTS motion constrained tile sets
  • According to an embodiment of this disclosure, an image decoding method performed by a decoding apparatus includes: obtaining, from a bitstream, image information including partition information for a current picture and prediction information for a current block included in the current picture; deriving a partitioning structure of the current picture based on a plurality of tiles, based on the partition information for the current picture; deriving prediction samples for the current block based on the prediction information for the current block, the current block being included in one tile of the plurality of tiles; and reconstructing the current picture based on the prediction samples, wherein the plurality of tiles are grouped into a plurality of tile groups, and at least one tile group among the plurality of tile groups includes tiles arranged in a non-raster scan order.
  • According to another embodiment of this disclosure, a decoding apparatus for performing image decoding includes: an entropy decoder that obtains, from a bitstream, image information including partition information for a current picture and prediction information for a current block included in the current picture, and derives a partitioning structure of the current picture based on a plurality of tiles, based on the partition information for the current picture; a predictor that derives prediction samples for the current block based on the prediction information for the current block, the current block being included in one tile of the plurality of tiles; and an adder that reconstructs the current picture based on the prediction samples, wherein the plurality of tiles are grouped into a plurality of tile groups, and at least one tile group among the plurality of tile groups includes tiles arranged in a non-raster scan order.
  • According to another embodiment of this disclosure, an image encoding method performed by an encoding apparatus includes: dividing a current picture into a plurality of tiles; generating partition information for the current picture based on the plurality of tiles; deriving prediction samples for a current block included in one tile of the plurality of tiles; generating prediction information for the current block based on the prediction samples; and encoding image information including the partition information for the current picture and the prediction information for the current block, wherein the plurality of tiles are grouped into a plurality of tile groups, and at least one tile group among the plurality of tile groups includes tiles arranged in a non-raster scan order.
  • According to another embodiment of this disclosure, an encoding apparatus for performing image encoding is provided. The encoding apparatus includes: an image partitioner that divides a current picture into a plurality of tiles and generates partition information for the current picture based on the plurality of tiles; a predictor that derives prediction samples for a current block included in one tile of the plurality of tiles and generates prediction information for the current block based on the prediction samples; and an entropy encoder that encodes image information including the partition information for the current picture and the prediction information for the current block, wherein the plurality of tiles are grouped into a plurality of tile groups, and at least one tile group among the plurality of tile groups includes tiles arranged in a non-raster scan order.
  • According to another embodiment of this disclosure, a computer-readable digital storage medium storing encoded image information that causes an image decoding method to be performed is provided, the method including: obtaining, from a bitstream, image information including partition information for a current picture and prediction information for a current block included in the current picture; deriving a partitioning structure of the current picture based on a plurality of tiles, based on the partition information for the current picture; deriving prediction samples for the current block based on the prediction information for the current block, the current block being included in one tile of the plurality of tiles; and reconstructing the current picture based on the prediction samples, wherein the plurality of tiles are grouped into a plurality of tile groups, and at least one tile group among the plurality of tile groups includes tiles arranged in a non-raster scan order.
  • FIG. 1 schematically shows an example of a video/image coding system to which this disclosure can be applied.
  • FIG. 2 schematically shows the configuration of a video/image encoding apparatus to which this disclosure can be applied.
  • FIG. 3 schematically shows the configuration of a video/image decoding apparatus to which this disclosure can be applied.
  • FIG. 4 exemplarily shows a hierarchical structure for coded data.
  • FIG. 5 is a diagram showing an example of partitioning a picture.
  • FIG. 6 is a flowchart illustrating a procedure for encoding a picture based on a tile and/or a tile group according to an embodiment.
  • FIG. 7 is a flowchart illustrating a tile and/or tile group-based picture decoding procedure according to an embodiment.
  • FIG. 8 is a diagram showing an example of partitioning a picture into a plurality of tiles.
  • FIG. 9 is a block diagram showing the configuration of an encoding apparatus according to an embodiment.
  • FIG. 10 is a block diagram showing the configuration of a decoding apparatus according to an embodiment.
  • FIG. 11 is a diagram showing an example of tile and tile group units constituting a current picture.
  • FIG. 12 is a diagram schematically showing an example of a signaling structure of tile group information.
  • FIG. 13 is a diagram illustrating an example of a picture in a video conferencing program.
  • FIG. 14 is a diagram showing an example of partitioning a picture into tiles or tile groups in a video conferencing program.
  • FIG. 15 is a diagram illustrating an example of partitioning a picture into tiles or tile groups based on MCTS (Motion Constrained Tile Set).
  • MCTS Motion Constrained Tile Set
  • FIG. 16 is a diagram illustrating an example of dividing a picture based on an ROI area.
  • FIG. 17 is a diagram showing an example of partitioning a picture into a plurality of tiles.
  • FIG. 18 is a diagram illustrating an example of partitioning a picture into a plurality of tiles and tile groups.
  • FIG. 20 is a flow chart showing the operation of the decoding apparatus according to an embodiment.
  • 21 is a block diagram showing a configuration of a decoding apparatus according to an embodiment.
  • 22 is a flow chart showing the operation of the encoding device according to an embodiment.
  • 23 is a block diagram showing the configuration of an encoding device according to an embodiment.
  • 24 shows an example of a content streaming system to which the disclosure of this document can be applied.
  • The configurations in the drawings described in this disclosure are shown independently for convenience of description of different characteristic functions, and this does not mean that each configuration is implemented as separate hardware or separate software. For example, two or more of the configurations may be combined to form a single configuration, or one configuration may be divided into a plurality of configurations.
  • Embodiments in which the configurations are integrated and/or separated are also included within the scope of this disclosure, as long as they do not depart from the essence of this disclosure.
  • In this document, "A or B" may mean "only A", "only B", or "both A and B". In other words, "A or B" in this document may be interpreted as "A and/or B".
  • For example, in this document, "A, B or C" means "only A", "only B", "only C", or "any combination of A, B and C".
  • A slash (/) or a comma used in this document may mean "and/or". For example, "A/B" may mean "A and/or B". Accordingly, "A/B" may mean "only A", "only B", or "both A and B". For example, "A, B, C" may mean "A, B, or C".
  • In this document, parentheses may mean "for example". Specifically, when indicated as "prediction (intra prediction)", "intra prediction" may be proposed as an example of "prediction". In other words, "prediction" in this document is not limited to "intra prediction", and "intra prediction" may be proposed as an example of "prediction". In addition, even when indicated as "prediction (i.e., intra prediction)", "intra prediction" may be proposed as an example of "prediction".
  • In this document, technical features that are individually described within one drawing may be implemented individually or may be implemented at the same time.
  • FIG. 1 schematically shows an example of a video/video coding system to which this disclosure can be applied.
  • A video/image coding system may include a first device (source device) and a second device (receiving device).
  • The source device may transmit encoded video/image information or data to the receiving device via a digital storage medium or a network in the form of a file or streaming.
  • the source device may include a video source, an encoding device, and a transmission unit.
  • the receiving device may include a receiver, a decoding device, and a renderer.
  • the encoding device may be referred to as a video/image encoding device, and the decoding device may be referred to as a video/image decoding device.
  • the transmitter may be included in the encoding device.
  • the receiver may be included in the decoding device.
  • the renderer may include a display unit, and the display unit may be composed of separate devices or external components.
  • A video source may acquire a video/image through a process of capturing, synthesizing, or generating a video/image. The video source may include a video/image capture device and/or a video/image generating device. The video/image capture device may include, for example, one or more cameras, video/image archives containing previously captured videos/images, and the like. The video/image generating device may include, for example, computers, tablets, and smartphones, and may (electronically) generate videos/images. For example, a virtual video/image may be generated through a computer or the like, and in this case, the video/image capture process may be replaced by a process of generating related data.
  • the encoding device can encode the input video/video.
  • the encoding device can perform a series of procedures such as prediction, transformation, and quantization for compression and coding efficiency.
  • The encoded data (encoded video/image information) may be output in the form of a bitstream.
  • The transmitting unit may transmit the encoded video/image information or data output in the form of a bitstream to the receiver of the receiving device via a digital storage medium or a network in the form of a file or streaming.
  • the digital storage medium can include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, etc.
  • the transmission unit may include an element for generating a media file through a predetermined file format and may include an element for transmission through a broadcasting/communication network.
  • The receiving unit may receive/extract the bitstream and transmit it to the decoding apparatus.
  • The decoding apparatus may decode the video/image by performing a series of procedures, such as dequantization, inverse transformation, and prediction, corresponding to the operation of the encoding apparatus.
  • the renderer can render decoded video/video.
  • The rendered video/image may be displayed through the display unit.
  • This document relates to video/image coding. For example, the methods/embodiments disclosed in this document may be applied to methods disclosed in the versatile video coding (VVC) standard, the essential video coding (EVC) standard, the AOMedia Video 1 (AV1) standard, the 2nd generation of audio video coding standard (AVS2), or a next-generation video/image coding standard (e.g., H.267, H.268, etc.).
  • VVC versatile video coding
  • EVC essential video coding
  • AV1 AOMedia Video 1
  • AVS2 2nd generation of audio video coding standard
  • next-generation video/image coding standard (e.g., H.267, H.268, etc.)
  • In this document, a picture generally refers to a unit representing one image in a specific time period, and a slice/tile is a unit constituting a part of a picture in coding.
  • A tile may include one or more CTUs (coding tree units), and a picture may consist of one or more slices/tiles.
  • A tile is a rectangular region of CTUs within a particular tile column and a particular tile row in a picture.
  • A tile column is a rectangular region of CTUs having a height equal to the height of the picture and a width specified by syntax elements in the picture parameter set.
  • A tile row is a rectangular region of CTUs having a height specified by syntax elements in the picture parameter set and a width equal to the width of the picture.
  • A tile scan is a specific sequential ordering of CTUs partitioning a picture: the CTUs are ordered consecutively in a CTU raster scan within a tile, whereas tiles in a picture are ordered consecutively in a raster scan of the tiles of the picture.
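  • As an editor's illustration (not part of the patent text), the following Python sketch lists CTU addresses in tile scan order for a toy picture; the function name and boundary representation are assumptions made for illustration.

```python
# Minimal sketch of a tile scan: CTUs in raster order within each tile,
# tiles in raster order within the picture. Boundaries are in CTU units.
def tile_scan_order(pic_w_ctus, col_bounds, row_bounds):
    order = []
    for r in range(len(row_bounds) - 1):            # tile rows, top to bottom
        for c in range(len(col_bounds) - 1):        # tile columns, left to right
            for y in range(row_bounds[r], row_bounds[r + 1]):       # CTU raster
                for x in range(col_bounds[c], col_bounds[c + 1]):   # within tile
                    order.append(y * pic_w_ctus + x)  # CTU address in picture
    return order

# A 4x2-CTU picture split into two 2x2-CTU tiles:
print(tile_scan_order(4, [0, 2, 4], [0, 2]))  # -> [0, 1, 4, 5, 2, 3, 6, 7]
```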
  • Tile group and slice may be used interchangeably in this document; for example, a tile group/tile group header may also be called a slice/slice header.
  • a picture can be divided into two or more subpictures.
  • a subpicture can be a rectangular region of one or more slices within a picture.
  • A pixel or a pel may mean the smallest unit constituting one picture (or image). In addition, "sample" may be used as a term corresponding to a pixel.
  • a unit can represent a basic unit of image processing.
  • a unit can contain at least one of a specific region of a picture and information related to that region.
  • One unit may include one luma block and two chroma (e.g., cb, cr) blocks.
  • a unit may be used interchangeably with terms such as block or area in some cases.
  • the MxN block may include a set (or array) of samples (or sample array) or transform coefficients consisting of M columns and N rows.
  • FIG. 2 shows the configuration of a video/image encoding apparatus to which this disclosure can be applied.
  • the video encoding device may include an image encoding device.
  • The encoding apparatus 200 may be configured to include an image partitioner 210, a predictor 220, a residual processor 230, an entropy encoder 240, an adder 250, a filter 260, and a memory 270. The predictor 220 may include an inter predictor 221 and an intra predictor 222.
  • The residual processor 230 may include a transformer 232, a quantizer 233, a dequantizer 234, and an inverse transformer 235. The residual processor 230 may further include a subtractor 231.
  • The adder 250 may include a reconstructor or a reconstructed block generator.
  • The image partitioner 210, the predictor 220, the residual processor 230, the entropy encoder 240, the adder 250, and the filter 260 described above may be configured by at least one hardware component (e.g., an encoder chipset or processor) according to an embodiment.
  • the memory 270 may include a decoded picture buffer (DPB), and may be configured by a digital storage medium.
  • DPB decoded picture buffer
  • The hardware component may further include the memory 270 as an internal/external component.
  • The image partitioner 210 may divide an input image (or picture or frame) input to the encoding apparatus 200 into one or more processing units.
  • For example, the processing unit may be referred to as a coding unit (CU). In this case, the coding unit may be recursively divided from a coding tree unit (CTU) or a largest coding unit (LCU) according to a quad-tree binary-tree ternary-tree (QTBTTT) structure. For example, one coding unit may be divided into a plurality of coding units of deeper depth based on a quad tree structure, a binary tree structure, and/or a ternary tree structure. In this case, for example, the quad tree structure may be applied first and the binary tree and/or ternary structure may be applied later, or the binary tree structure may be applied first.
  • The coding procedure according to this disclosure may be performed based on a final coding unit that is no longer divided. In this case, the largest coding unit may be used directly as the final coding unit based on coding efficiency according to image characteristics, or, if necessary, the coding unit may be recursively divided into coding units of deeper depth, so that a coding unit of an optimal size may be used as the final coding unit. A toy sketch of this recursive splitting follows below.
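  • As a toy illustration of the QTBTTT splitting above (an editor's sketch; real VVC partitioning is governed by many syntax constraints not modeled here):

```python
# Toy QTBTTT-style splits applied to (width, height) blocks: quad split into
# four quadrants, binary split into two halves, ternary split into 1:2:1.
def split(w, h, mode):
    if mode == "qt":    # quad-tree split
        return [(w // 2, h // 2)] * 4
    if mode == "bt_v":  # binary split, vertical boundary
        return [(w // 2, h)] * 2
    if mode == "bt_h":  # binary split, horizontal boundary
        return [(w, h // 2)] * 2
    if mode == "tt_v":  # ternary split, vertical 1:2:1
        return [(w // 4, h), (w // 2, h), (w // 4, h)]
    if mode == "tt_h":  # ternary split, horizontal 1:2:1
        return [(w, h // 4), (w, h // 2), (w, h // 4)]
    return [(w, h)]     # no split: this block is a final coding unit

quadrants = split(128, 128, "qt")    # quad tree applied first: four 64x64 CUs
print(split(*quadrants[0], "tt_v"))  # -> [(16, 64), (32, 64), (16, 64)]
```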
  • Here, the processing unit may further include a prediction unit (PU) or a transform unit (TU).
  • PU Prediction Unit
  • TU Transform Unit
  • In this case, the prediction unit and the transform unit may each be divided or partitioned from the final coding unit described above.
  • The prediction unit may be a unit of sample prediction, and the transform unit may be a unit for deriving a transform coefficient and/or a unit for deriving a residual signal from the transform coefficient.
  • an MxN block can represent a set of samples or transform coefficients consisting of M columns and N rows.
  • A sample may generally represent a pixel or a pixel value, may represent only a pixel/pixel value of a luma component, or may represent only a pixel/pixel value of a chroma component. A sample may be used as a term corresponding to one picture (or image) for a pixel or a pel.
  • The encoding apparatus 200 may generate a residual signal (residual block, residual sample array) by subtracting the prediction signal (predicted block, prediction sample array) output from the inter predictor 221 or the intra predictor 222 from the input image signal (original block, original sample array), and the generated residual signal is transmitted to the transformer 232. In this case, as shown, the unit that subtracts the prediction signal (prediction block, prediction sample array) from the input image signal (original block, original sample array) in the encoding apparatus 200 may be called a subtractor 231.
  • The predictor may perform prediction on a block to be processed (hereinafter referred to as the current block) and generate a predicted block including prediction samples for the current block. The predictor may determine whether intra prediction or inter prediction is applied in units of the current block or CU.
  • the prediction unit may generate various types of information related to prediction, such as prediction mode information, as described later in the description of each prediction mode, and transmit it to the entropy encoding unit 240.
  • The information about prediction may be encoded by the entropy encoder 240 and output in the form of a bitstream.
  • The intra predictor 222 may predict the current block by referring to samples in the current picture. The referenced samples may be located in the neighborhood of the current block or may be located apart from it, depending on the prediction mode.
  • prediction modes include a plurality of non-directional modes and a plurality of directional modes.
  • The non-directional modes may include, for example, a DC mode and a planar mode. The directional modes may include, for example, 33 directional prediction modes or 65 directional prediction modes, depending on the degree of detail of the prediction direction. However, this is an example, and more or fewer directional prediction modes may be used depending on the setting.
  • the intra prediction unit 222 may determine a prediction mode to be applied to the current block by using the prediction mode applied to the surrounding block.
  • The inter predictor 221 may derive a predicted block for the current block based on a reference block (reference sample array) specified by a motion vector on a reference picture. In this case, the motion information may be predicted in units of blocks, subblocks, or samples based on the correlation of motion information between a neighboring block and the current block.
  • the motion information may include a motion vector and a reference picture index.
  • In the case of inter prediction, the motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.
  • The neighboring block may include a spatial neighboring block existing in the current picture and a temporal neighboring block existing in the reference picture. The reference picture including the reference block and the reference picture including the temporal neighboring block may be the same or different. The temporal neighboring block may be called a collocated reference block, a collocated CU (colCU), or the like, and the reference picture including the temporal neighboring block may be called a collocated picture (colPic).
  • colPic collocated picture
  • For example, the inter predictor 221 may construct a motion information candidate list based on neighboring blocks and generate information indicating which candidate is used to derive the motion vector and/or reference picture index of the current block. Inter prediction may be performed based on various prediction modes; for example, in the case of the skip mode and the merge mode, the inter predictor 221 may use the motion information of a neighboring block as the motion information of the current block. In the skip mode, unlike the merge mode, the residual signal may not be transmitted. In the case of the motion vector prediction (MVP) mode, the motion vector of a neighboring block is used as a motion vector predictor, and the motion vector of the current block may be indicated by signaling the motion vector difference (a small sketch of the merge and MVP modes follows below).
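  • The following sketch (editor's illustration with hypothetical candidate lists) contrasts the two modes just described: merge reuses a neighbor's motion vector directly, while MVP refines a predictor with a signaled motion vector difference.

```python
# Merge mode: reuse a neighboring block's motion vector selected by an index.
def merge_mv(candidates, merge_idx):
    return candidates[merge_idx]

# MVP mode: motion vector = motion vector predictor + signaled difference.
def mvp_mv(candidates, mvp_idx, mvd):
    mvp = candidates[mvp_idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

neighbors = [(4, -2), (3, 0)]          # hypothetical spatial/temporal candidates
print(merge_mv(neighbors, 0))          # -> (4, -2): neighbor motion as-is
print(mvp_mv(neighbors, 1, (1, -1)))   # -> (4, -1): predictor refined by MVD
```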
  • MVP motion vector prediction
  • the prediction unit 220 may generate a prediction signal based on various prediction methods to be described later.
  • The predictor may apply intra prediction or inter prediction for prediction of one block, and may also apply intra prediction and inter prediction at the same time; this may be called combined inter and intra prediction (CIIP).
  • In addition, the predictor may be based on an intra block copy (IBC) prediction mode or a palette mode for prediction of a block.
  • The IBC prediction mode or the palette mode may be used for content image/video coding such as games, for example, screen content coding (SCC). IBC basically performs prediction within the current picture, but may be performed similarly to inter prediction in that it derives a reference block within the current picture; that is, IBC may use at least one of the inter prediction techniques described in this document.
  • The palette mode may be seen as an example of intra coding or intra prediction. When the palette mode is applied, sample values in the picture may be signaled based on information about the palette table and palette index.
  • the prediction signal generated through the prediction unit may be used to generate a restoration signal or may be used to generate a residual signal.
  • the transform unit 232 may generate transform coefficients by applying a transform method to the residual signal.
  • The transform technique may include at least one of DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), KLT (Karhunen-Loève Transform), GBT (Graph-Based Transform), or CNT (Conditionally Non-linear Transform). Here, GBT refers to a transform obtained from a graph when relationship information between pixels is represented as a graph.
  • CNT refers to a transform obtained based on a prediction signal generated using all previously reconstructed pixels. In addition, the transform process may be applied to square pixel blocks of the same size, or may be applied to blocks of variable size that are not square.
  • The quantizer 233 may quantize the transform coefficients and transmit them to the entropy encoder 240, and the entropy encoder 240 may encode the quantized signal (information about the quantized transform coefficients) and output it as a bitstream.
  • the information on the quantized transformation coefficients may be referred to as residual information.
  • In addition, the quantizer 233 may rearrange the quantized transform coefficients in block form into a one-dimensional vector form based on a coefficient scan order, and may generate information about the quantized transform coefficients based on the quantized transform coefficients in the one-dimensional vector form. A sketch of such a scan-order rearrangement follows below.
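  • A sketch of such a rearrangement (editor's illustration; the actual scan orders are defined by the codec, a simple anti-diagonal scan is used here):

```python
# Flatten a 2D block of quantized coefficients into a 1D vector along
# anti-diagonals, starting from the top-left (DC) coefficient.
def diagonal_scan(block):
    n = len(block)
    out = []
    for s in range(2 * n - 1):       # anti-diagonal index
        for y in range(n):
            x = s - y
            if 0 <= x < n:
                out.append(block[y][x])
    return out

coeffs = [[9, 3, 1, 0],
          [4, 2, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 0, 0]]
print(diagonal_scan(coeffs))
# -> [9, 3, 4, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```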
  • The entropy encoder 240 may perform various encoding methods such as, for example, exponential Golomb, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC). The entropy encoder 240 may encode information necessary for video/image reconstruction (e.g., values of syntax elements) together with or separately from the quantized transform coefficients.
  • The encoded information (e.g., encoded video/image information) may be transmitted or stored in the form of a bitstream in units of network abstraction layer (NAL) units.
  • The video/image information may further include information about various parameter sets, such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS). In addition, the video/image information may further include general constraint information.
  • Information and/or syntax elements signaled/transmitted as described later in this document may be encoded through the above-described encoding procedure and included in the bitstream.
  • The bitstream may be transmitted through a network or stored in a digital storage medium. Here, the network may include a broadcasting network and/or a communication network, and the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, and SSD.
  • A transmitter (not shown) that transmits the signal output from the entropy encoder 240 and/or a storage unit (not shown) that stores it may be configured as internal/external elements of the encoding apparatus 200, or the transmitter may be included in the entropy encoder 240.
  • The quantized transform coefficients output from the quantizer 233 may be used to generate a prediction signal. For example, the residual signal (residual block or residual samples) may be reconstructed by applying dequantization and inverse transformation to the quantized transform coefficients through the dequantizer 234 and the inverse transformer 235.
  • The adder 250 may generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) by adding the reconstructed residual signal to the prediction signal output from the inter predictor 221 or the intra predictor 222. If there is no residual for the block to be processed, such as when the skip mode is applied, the predicted block may be used as a reconstructed block. The adder 250 may be referred to as a reconstructor or a reconstructed block generator.
  • The generated reconstructed signal may be used for intra prediction of the next block to be processed in the current picture, and may also be used for inter prediction of the next picture through filtering as described later.
  • Meanwhile, luma mapping with chroma scaling (LMCS) may be applied during the picture encoding and/or reconstruction process.
  • The filter 260 may improve subjective/objective image quality by applying filtering to the reconstructed signal. For example, the filter 260 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and may store the modified reconstructed picture in the memory 270, specifically in the DPB of the memory 270.
  • the various filtering methods include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, and bilateral filter.
  • The filter 260 may generate various kinds of filtering information and transmit them to the entropy encoder 240, as described later in the description of each filtering method. The filtering information may be encoded by the entropy encoder 240 and output in the form of a bitstream.
  • The modified reconstructed picture transmitted to the memory 270 may be used as a reference picture in the inter predictor 221. Through this, when inter prediction is applied, the encoding apparatus can avoid a prediction mismatch between the encoding apparatus 200 and the decoding apparatus, and can also improve encoding efficiency.
  • The DPB of the memory 270 may store the modified reconstructed picture for use as a reference picture in the inter predictor 221.
  • The memory 270 may store the motion information of a block from which motion information in the current picture is derived (or encoded) and/or the motion information of blocks in an already reconstructed picture. The stored motion information may be transmitted to the inter predictor 221 to be used as motion information of a spatial neighboring block or motion information of a temporal neighboring block.
  • the memory 270 may store restoration samples of the restored blocks in the current picture, and may transmit the restoration samples to the intra prediction unit 222.
  • FIG. 3 shows the configuration of a video/image decoding apparatus to which this disclosure can be applied.
  • The decoding apparatus 300 may be configured to include an entropy decoder 310, a residual processor 320, a predictor 330, an adder 340, a filter 350, and a memory 360.
  • the prediction unit 330 may include an intra prediction unit 331 and an inter prediction unit 332.
  • The residual processor 320 may include a dequantizer 321 and an inverse transformer 322.
  • The entropy decoder 310, the residual processor 320, the predictor 330, the adder 340, and the filter 350 described above may be configured by one hardware component (e.g., a decoder chipset or processor) according to an embodiment.
  • The memory 360 may include a decoded picture buffer (DPB) and may be configured by a digital storage medium. The hardware component may further include the memory 360 as an internal/external component.
  • The decoding apparatus 300 may reconstruct an image in response to the process in which the video/image information was processed in the encoding apparatus of FIG. 2.
  • The decoding apparatus 300 may derive units/blocks based on block division related information obtained from the bitstream.
  • The decoding apparatus 300 may perform decoding using a processing unit applied in the encoding apparatus. Therefore, the processing unit of decoding may be, for example, a coding unit, and the coding unit may be divided from a coding tree unit or a largest coding unit according to a quad tree structure, a binary tree structure, and/or a ternary tree structure. One or more transform units may be derived from the coding unit. In addition, the reconstructed image signal decoded and output through the decoding apparatus 300 may be reproduced through a playback device.
  • The decoding apparatus 300 may receive the signal output from the encoding apparatus of FIG. 2 in the form of a bitstream, and the received signal may be decoded through the entropy decoder 310.
  • For example, the entropy decoder 310 may parse the bitstream and derive information (e.g., video/image information) required for image reconstruction (or picture reconstruction).
  • The video/image information may further include information about various parameter sets, such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
  • the video/video information may further include general constraint information.
  • The decoding apparatus may further decode the picture based on the information about the parameter sets and/or the general constraint information.
  • Signaled/received information and/or syntax elements described later in this document may be decoded through the decoding procedure and obtained from the bitstream.
  • For example, the entropy decoder 310 may decode information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and output values of syntax elements required for image reconstruction and quantized values of transform coefficients for the residual.
  • More specifically, the CABAC entropy decoding method receives a bin corresponding to each syntax element in the bitstream, determines a context model using the decoding target syntax element information, the decoding information of neighboring and decoding target blocks, or the information of the symbol/bin decoded in the previous step, predicts the probability of occurrence of a bin according to the determined context model, and performs arithmetic decoding of the bin to generate a symbol corresponding to the value of each syntax element.
  • the CABAC entropy decoding method can update the context model using information of the decoded symbol/bin for the context model of the next symbol/bin after determining the context model.
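  • The context-model adaptation described above can be caricatured as follows (editor's sketch; the real CABAC uses table-driven state machines, not this floating-point update):

```python
# Each context tracks an estimated probability that the next bin is 1 and
# is updated toward each decoded bin, so frequent symbols become cheap.
class ContextModel:
    def __init__(self, p_one=0.5, rate=0.05):
        self.p_one = p_one  # estimated probability of a 1 bin
        self.rate = rate    # adaptation speed

    def update(self, bin_val):
        self.p_one += self.rate * (bin_val - self.p_one)

ctx = ContextModel()
for b in [1, 1, 0, 1]:           # bins as arithmetic decoding would emit them
    ctx.update(b)
print(round(ctx.p_one, 3))       # -> 0.545: estimate drifts toward 1
```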
  • Among the information decoded by the entropy decoder 310, information about prediction is provided to the predictor (the inter predictor 332 and the intra predictor 331), and the residual values on which entropy decoding has been performed by the entropy decoder 310, that is, the quantized transform coefficients and related parameter information, may be input to the residual processor 320.
  • the residual processing unit 320 may derive a residual signal (residual block, residual samples, and residual sample array).
  • Meanwhile, information about filtering among the information decoded by the entropy decoder 310 may be provided to the filter 350.
  • Meanwhile, a receiver (not shown) that receives the signal output from the encoding apparatus may be further configured as an internal/external element of the decoding apparatus 300, or the receiver may be a component of the entropy decoder 310.
  • Meanwhile, the decoding apparatus according to this document may be called a video/image/picture decoding apparatus, and the decoding apparatus may be divided into an information decoder (video/image/picture information decoder) and a sample decoder (video/image/picture sample decoder). The information decoder may include the entropy decoder 310, and the sample decoder may include at least one of the dequantizer 321, the inverse transformer 322, the adder 340, the filter 350, the memory 360, the inter predictor 332, and the intra predictor 331.
  • the inverse quantization unit 321 may inverse quantize the quantized transformation coefficients to output the transformation coefficients.
  • The dequantizer 321 may rearrange the quantized transform coefficients into a two-dimensional block form. In this case, the rearrangement may be performed based on the coefficient scan order performed in the encoding apparatus. The dequantizer 321 may perform dequantization on the quantized transform coefficients using a quantization parameter (e.g., quantization step size information) and obtain transform coefficients.
  • The inverse transformer 322 inversely transforms the transform coefficients to obtain a residual signal (residual block, residual sample array).
  • The predictor may perform prediction on the current block and generate a predicted block including prediction samples for the current block.
  • The predictor may determine whether intra prediction or inter prediction is applied to the current block based on the information about prediction output from the entropy decoder 310, and may determine a specific intra/inter prediction mode.
  • the prediction unit 330 may generate a prediction signal based on various prediction methods to be described later.
  • The predictor may apply intra prediction or inter prediction for prediction of one block, and may also apply intra prediction and inter prediction at the same time; this may be called combined inter and intra prediction (CIIP).
  • In addition, the predictor may be based on an intra block copy (IBC) prediction mode or a palette mode for prediction of a block.
  • The IBC prediction mode or the palette mode may be used for content image/video coding such as games, for example, screen content coding (SCC). IBC basically performs prediction within the current picture, but may be performed similarly to inter prediction in that it derives a reference block within the current picture; that is, IBC may use at least one of the inter prediction techniques described in this document. The palette mode may be seen as an example of intra coding or intra prediction. When the palette mode is applied, information about the palette table and the palette index may be included in the video/image information and signaled.
  • the intra prediction unit 331 may predict the current block by referring to samples in the current picture.
  • The referenced samples may be located in the neighborhood of the current block or may be located apart from it, depending on the prediction mode.
  • the prediction modes may include a plurality of non-directional modes and a plurality of directional modes in intra prediction.
  • The intra predictor 331 may also determine the prediction mode applied to the current block using the prediction mode applied to a neighboring block.
  • The inter predictor 332 may derive a predicted block for the current block based on a reference block (reference sample array) specified by a motion vector on a reference picture. In this case, the motion information may be predicted in units of blocks, subblocks, or samples based on the correlation of motion information between a neighboring block and the current block.
  • the motion information may include a motion vector and a reference picture index.
  • In the case of inter prediction, the motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.
  • The neighboring block may include a spatial neighboring block existing in the current picture and a temporal neighboring block existing in the reference picture.
  • For example, the inter predictor 332 may construct a motion information candidate list based on neighboring blocks, and derive a motion vector and/or a reference picture index of the current block based on received candidate selection information. Inter prediction may be performed based on various prediction modes, and the information about prediction may include information indicating the inter prediction mode for the current block.
  • The adder 340 may generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) by adding the obtained residual signal to the prediction signal (predicted block, prediction sample array) output from the predictor (including the inter predictor 332 and/or the intra predictor 331). If there is no residual for the block to be processed, such as when the skip mode is applied, the predicted block may be used as a reconstructed block.
  • the addition unit 340 may be referred to as a restoration unit or a restoration block generation unit.
  • The generated reconstructed signal may be used for intra prediction of the next block to be processed in the current picture, may be output through filtering as described later, or may be used for inter prediction of the next picture.
  • Meanwhile, luma mapping with chroma scaling (LMCS) may be applied in the picture decoding process.
  • The filter 350 may improve subjective/objective image quality by applying filtering to the reconstructed signal. For example, the filter 350 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture, and may store the modified reconstructed picture in the memory 360, specifically in the DPB of the memory 360.
  • The various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, and the like.
  • the (modified) restored picture stored in the DPB of the memory 360 can be used as a reference picture in the inter prediction unit 332.
  • The memory 360 may store the motion information of a block from which motion information in the current picture is derived (or decoded) and/or the motion information of blocks in an already reconstructed picture. The stored motion information may be transmitted to the inter predictor 332 to be used as motion information of a spatial neighboring block or motion information of a temporal neighboring block.
  • the memory 360 can store reconstructed samples of the restored blocks in the current picture, and can transfer them to the intra prediction unit 331.
  • In this document, the embodiments described for the filter 260, the inter predictor 221, and the intra predictor 222 of the encoding apparatus 200 may be applied identically or correspondingly to the filter 350, the inter predictor 332, and the intra predictor 331 of the decoding apparatus 300, respectively.
  • In video coding, prediction is performed to increase compression efficiency: through prediction, a predicted block including prediction samples for the current block, which is a block to be coded, can be generated. Here, the predicted block includes prediction samples in the spatial domain (or pixel domain). The predicted block is derived identically in the encoding apparatus and the decoding apparatus, and the encoding apparatus can improve image coding efficiency by signaling to the decoding apparatus information about the residual between the original block and the predicted block (residual information), rather than the original sample values of the original block themselves.
  • The decoding apparatus may derive a residual block including residual samples based on the residual information, combine the residual block and the predicted block to generate a reconstructed block including reconstructed samples, and generate a reconstructed picture including the reconstructed blocks.
  • The encoding apparatus may derive a residual block between the original block and the predicted block, perform a transform procedure on the residual samples (residual sample array) included in the residual block to derive transform coefficients, perform a quantization procedure on the transform coefficients to derive quantized transform coefficients, and signal the related residual information to the decoding apparatus (via a bitstream). The residual information may include information such as value information, position information, transform technique, transform kernel, and quantization parameter of the quantized transform coefficients.
  • The decoding apparatus may perform a dequantization/inverse transform procedure based on the residual information and derive residual samples (or a residual block). The decoding apparatus may generate a reconstructed picture based on the predicted block and the residual block. The encoding apparatus may also dequantize/inverse transform the quantized transform coefficients for reference for inter prediction of a subsequent picture to derive a residual block, and generate a reconstructed picture based thereon.
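  • The residual pipeline on both sides can be sketched end to end as follows (editor's illustration using an orthonormal DCT-II and a single flat quantization step; all block values are hypothetical):

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix.
    m = np.array([[np.cos(np.pi * k * (2 * i + 1) / (2 * n)) for i in range(n)]
                  for k in range(n)])
    m[0] *= np.sqrt(1.0 / n)
    m[1:] *= np.sqrt(2.0 / n)
    return m

D = dct_matrix(4)
orig = np.array([[52, 55, 61, 66],
                 [70, 61, 64, 73],
                 [63, 59, 55, 90],
                 [67, 61, 68, 104]], dtype=float)
pred = np.full((4, 4), 64.0)        # hypothetical predicted block

residual = orig - pred              # encoder: residual of original vs. predicted
coeffs = D @ residual @ D.T         # transform procedure (2D separable DCT)
q = np.round(coeffs / 8.0)          # quantization (this is what gets coded)

recon_res = D.T @ (q * 8.0) @ D     # decoder: dequantize + inverse transform
recon = pred + recon_res            # reconstructed block
print(np.abs(recon - orig).max())   # small distortion from quantization
```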
  • FIG. 4 exemplarily shows a hierarchical structure for coded data.
  • The coded data may be divided into a video coding layer (VCL) that handles the coding process of the video/image itself, and a network abstraction layer (NAL) that exists between the VCL and a sub-system that stores and transmits the coded video/image data.
  • VCL video coding layer
  • NAL network abstraction layer
  • In the VCL, VCL data including compressed image data (slice data) may be generated, or parameter sets corresponding to headers such as sequences and pictures (picture parameter set (PPS), sequence parameter set (SPS), video parameter set (VPS), etc.) or a supplemental enhancement information (SEI) message additionally required for the image decoding process may be generated.
  • PPS picture parameter set
  • SPS sequence parameter set
  • VPS video parameter set
  • SEI Supplemental Enhancement Information
  • The SEI message is separated from the information about the image (slice data). The VCL containing the information about the image consists of the slice data and a slice header. The slice header may be referred to as a tile group header, and the slice data may be referred to as tile group data.
  • In the NAL, a NAL unit may be created by adding header information (NAL unit header) to a raw byte sequence payload (RBSP) generated in the VCL. Here, RBSP refers to the slice data, parameter sets, SEI messages, etc. generated in the VCL.
  • the NAL unit header may include NAL unit type information specified according to RBSP data included in the corresponding NAL unit.
  • The NAL unit, which is the basic unit of the NAL, plays a role of mapping the coded image to the bit string of sub-systems such as a file format, RTP (Real-time Transport Protocol), TS (Transport Stream), etc., according to a predetermined standard.
  • The NAL unit may be divided into a VCL NAL unit and a non-VCL NAL unit according to the RBSP generated in the VCL. The VCL NAL unit may mean a NAL unit that contains information about the image (slice data), and the non-VCL NAL unit may mean a NAL unit that contains the information (parameter set or SEI message) necessary for decoding the image.
  • The above-described VCL NAL unit and non-VCL NAL unit may be transmitted through a network with header information attached according to the data standard of the sub-system. For example, the NAL unit may be transformed into a data format of a predetermined standard, such as an H.266/VVC file format, RTP (Real-time Transport Protocol), or TS (Transport Stream), and transmitted through various networks.
  • As described above, the NAL unit type may be specified according to the RBSP data structure included in the NAL unit, and information about the NAL unit type may be stored in the NAL unit header and signaled.
  • The VCL NAL unit type may be classified according to the properties and types of pictures included in the VCL NAL unit, and the non-VCL NAL unit type may be classified according to the type of parameter set.
  • For example, the NAL unit type may be specified according to the type of parameter set included in the non-VCL NAL unit: an APS (Adaptation Parameter Set) NAL unit is a type for NAL units including an APS, a DPS (Decoding Parameter Set) NAL unit is a type for NAL units including a DPS, a VPS (Video Parameter Set) NAL unit is a type for NAL units including a VPS, an SPS (Sequence Parameter Set) NAL unit is a type for NAL units including an SPS, and a PPS (Picture Parameter Set) NAL unit is a type for NAL units including a PPS.
  • APS Adaptation Parameter Set
  • VPS Video Parameter Set
  • SPS Sequence Parameter Set
  • PPS Picture Parameter Set
  • The above-described NAL unit types have syntax information for the NAL unit type, and the syntax information may be stored in the NAL unit header and signaled. For example, the syntax information may be nal_unit_type, and NAL unit types may be specified by nal_unit_type values (a parsing sketch follows below).
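  • As an illustration, a two-byte NAL unit header can be parsed as below; the field layout shown follows the VVC-style header and is an editor's assumption, not text from this patent:

```python
# Parse a two-byte NAL unit header into its fields, including nal_unit_type.
def parse_nal_header(b0: int, b1: int) -> dict:
    return {
        "forbidden_zero_bit":    (b0 >> 7) & 0x1,
        "nuh_reserved_zero_bit": (b0 >> 6) & 0x1,
        "nuh_layer_id":           b0 & 0x3F,
        "nal_unit_type":         (b1 >> 3) & 0x1F,  # signaled in the header
        "nuh_temporal_id_plus1":  b1 & 0x07,
    }

print(parse_nal_header(0x00, 0x81))  # nal_unit_type=16, temporal id plus1=1
```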
  • one picture can contain a plurality of slices, and one slice can contain a slice header and slice data.
  • For the multiple slices (a set of a slice header and slice data) within one picture, one picture header may be further added.
  • the picture header may include information/parameters commonly applicable to the picture.
  • the slice header may include information/parameters commonly applicable to the slice.
  • The APS (APS syntax) or the PPS (PPS syntax) may include information/parameters commonly applicable to one or more slices or pictures.
  • The SPS (SPS syntax) may include information/parameters commonly applicable to one or more sequences.
  • The VPS (VPS syntax) may include information/parameters commonly applicable to multiple layers.
  • The DPS (DPS syntax) may include information/parameters commonly applicable to the overall video. The DPS may include information/parameters related to the concatenation of coded video sequences (CVSs).
  • In this document, high level syntax (HLS) may include at least one of the APS syntax, the PPS syntax, the SPS syntax, the VPS syntax, the DPS syntax, the picture header syntax, and the slice header syntax.
  • In this document, the image/video information encoded from the encoding apparatus to the decoding apparatus and signaled in the form of a bitstream includes not only intra-picture partitioning information, intra/inter prediction information, residual information, and in-loop filtering information, but also information included in the slice header, information included in the picture header, information included in the APS, information included in the PPS, information included in the SPS, information included in the VPS, and/or information included in the DPS. In addition, the image/video information may further include information of the NAL unit header.
  • FIG. 5 is a diagram showing an example of partitioning a picture.
  • Pictures may be divided into a sequence of coding tree units (CTUs).
  • the CTU can include a coding tree block of luma samples and two coding tree blocks of chroma samples corresponding thereto.
  • The maximum allowable size of the CTU for coding and prediction may be different from the maximum allowable size of the CTU for transform.
  • a tile can correspond to a series of CTUs that cover a rectangular area of a picture, and a picture can be divided into one or more tile rows and one or more tile columns.
  • A slice may consist of an integer number of complete tiles, or an integer number of consecutive complete CTU rows within a tile of a picture.
  • two slice modes including a raster-scan slice mode and a rectangular slice mode can be supported.
  • In the raster-scan slice mode, a slice may contain a sequence of complete tiles in a tile raster scan of a picture.
  • In the rectangular slice mode, a slice may contain either a number of complete tiles that collectively form a rectangular region of the picture, or a number of consecutive complete CTU rows within one tile that collectively form a rectangular region of the picture. Tiles within a rectangular slice are scanned in tile raster scan order within the rectangular region corresponding to that slice.
  • FIG. 5(a) shows an example of dividing a picture into tiles and raster-scan slices; for example, a picture may be divided into 12 tiles and 3 raster-scan slices.
  • FIG. 5(b) shows an example of dividing a picture into tiles and rectangular slices; for example, a picture may be divided into 24 tiles (6 tile columns and 4 tile rows) and 9 rectangular slices.
  • FIG. 5(c) shows an example of dividing a picture into tiles and rectangular slices; for example, a picture may be divided into 4 tiles (2 tile columns and 2 tile rows) and 4 rectangular slices. (A sketch of the raster-scan slice assignment of FIG. 5(a) follows below.)
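  • A sketch of the raster-scan slice assignment of FIG. 5(a) (editor's illustration; the 2/5/5 split of the 12 tiles is a hypothetical example):

```python
# Raster-scan slice mode: each slice takes consecutive complete tiles in the
# picture's tile raster scan.
tiles = list(range(12))         # tile indices in raster scan order
slice_sizes = [2, 5, 5]         # tiles per slice (hypothetical)

slices, pos = [], 0
for size in slice_sizes:
    slices.append(tiles[pos:pos + size])
    pos += size
print(slices)  # -> [[0, 1], [2, 3, 4, 5, 6], [7, 8, 9, 10, 11]]
```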
  • FIG. 6 is a flowchart illustrating a procedure for encoding a picture based on a tile and/or a tile group according to an embodiment.
  • In FIG. 6, the step of deriving tiles/tile groups and generating information about the tile/tile group (S610) may be performed by the image partitioner 210 of the encoding apparatus, and the step of encoding video/image information including the information about the tile/tile group (S620) may be performed by the entropy encoder 240 of the encoding apparatus.
  • The encoding apparatus may perform picture partitioning for encoding an input picture (S600).
  • the picture may include one or more tiles/tile groups.
  • In consideration of the image characteristics and coding efficiency of the picture, the encoding apparatus may partition the picture into various types, and information indicating the partitioning type with the optimal coding efficiency may be generated and signaled to the decoding apparatus.
  • The encoding apparatus may determine a tile/tile group applied to the picture and generate information about the tile/tile group (S610).
  • the information on the tile/tile group may include information indicating the structure of the tile/tile group for the picture.
  • the information on the tile/tile group can be signaled through various parameter sets and/or tile group headers, as described later. A specific example is described below.
  • the encoding apparatus may encode video/image information including the information on the tile/tile group and output it in the form of a bitstream (S620).
  • the bitstream may be delivered to the decoding device through a digital storage medium or a network.
  • the video/image information may include the tile and/or tile group header syntax described in this document.
  • the video/image information may further include prediction information, residual information, and (in-loop) filtering information.
  • the encoding apparatus may restore the current picture, apply in-loop filtering, encode parameters related to the in-loop filtering, and output them in a bitstream format.
  • FIG. 7 is a flowchart illustrating a tile and/or tile group-based picture decoding procedure according to an embodiment.
  • The step of obtaining the information on the tile/tile group from the bitstream (S700) may be performed by the entropy decoding unit of the decoding device, and the step of performing the tile/tile group-based picture decoding (S720) may be performed by a sample decoder of the decoding apparatus.
  • A decoding apparatus can obtain information on the tile/tile group from a received bitstream (S700).
  • the information on the tile/tile group can be obtained through various parameter sets and/or tile group headers as described later. A specific example will be described later.
  • the decoding apparatus may derive a tile/tile group in a current picture based on the information on the tile/tile group (S710).
  • the decoding apparatus may decode the current picture based on the tile/tile group (S720). For example, the decoding apparatus may derive a CTU/CU located in the tile and, based on this, perform inter/intra prediction, residual processing, reconstructed block (picture) generation, and/or in-loop filtering procedures. In this case, for example, the decoding device can manage the context model/information in tile/tile group units. In addition, if a neighboring block or a neighboring sample referenced during inter/intra prediction is located in a tile different from the current tile where the current block is located, the decoding device may treat the neighboring block or the neighboring sample as not available, as illustrated in the sketch below.
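  • The availability rule above can be illustrated with a small sketch. This is a minimal illustration, not the patent's normative process; the tile-grid representation and the helper names are assumptions made for this example (Python):

        # Hypothetical helpers: a tile grid given by cumulative CTB boundaries,
        # e.g. col_bounds = [0, 4, 8] for two tile columns of width 4 CTBs.
        def tile_index(ctb_x, ctb_y, col_bounds, row_bounds):
            tile_col = max(i for i in range(len(col_bounds) - 1) if ctb_x >= col_bounds[i])
            tile_row = max(j for j in range(len(row_bounds) - 1) if ctb_y >= row_bounds[j])
            return tile_col, tile_row

        def neighbor_available(cur, nbr, col_bounds, row_bounds):
            # A neighboring CTB is treated as not available when it lies outside
            # the picture or in a tile different from the current tile.
            if nbr[0] < 0 or nbr[1] < 0:
                return False
            return (tile_index(*cur, col_bounds, row_bounds)
                    == tile_index(*nbr, col_bounds, row_bounds))

        # The left neighbor of the first CTB of the second tile column lies in
        # another tile, hence it is unavailable:
        assert not neighbor_available((4, 0), (3, 0), [0, 4, 8], [0, 2])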
  • FIG. 8 is a diagram showing an example of partitioning a picture into a plurality of tiles.
  • tiles may refer to regions within a picture defined by a set of vertical and/or horizontal boundaries that divide the picture into a plurality of rectangles.
  • FIG. 8 shows an example of splitting one picture (800) into multiple tiles based on a plurality of column boundaries (810) and row boundaries (820).
  • In FIG. 8, 32 maximum coding units (or Coding Tree Units, CTUs) are numbered and shown.
  • each tile may include an integer number of CTUs processed in a raster scan order within each tile.
  • a plurality of tiles within a picture may likewise be processed in raster scan order within the picture.
  • the tiles can be grouped to form tile groups, and tiles within a single tile group can be raster scanned. Dividing a picture into tiles can be defined based on the syntax and semantics of the Picture Parameter Set (PPS).
  • the information about tiles derived from the PPS may be used to check (or read) the following items. First, it is checked whether one tile or more than one tile exists in the picture. If more than one tile is present, it can be checked whether the tiles are uniformly distributed, the dimensions of the tiles can be checked, and it can be checked whether the loop filter is enabled across tile boundaries.
  • the PPS may first signal the syntax element single_tile_in_pic_flag.
  • the single_tile_in_pic_flag may indicate whether only one tile in a picture exists or whether a plurality of tiles in a picture exist.
  • the decoding device can parse information about the number of tile columns and tile rows using the syntax elements num_tile_columns_minus1 and num_tile_rows_minus1.
  • When present, the syntax elements num_tile_columns_minus1 and num_tile_rows_minus1 can specify how the picture is divided into tile columns and tile rows.
  • An additional flag can be parsed to check whether the heights of the tile rows and the widths of the tile columns are uniform in terms of CTBs. If the tiles in the picture are not uniformly spaced, the number of CTBs per tile can be explicitly signaled for each tile column and tile row (i.e., the number of CTBs in each tile column and the number of CTBs in each tile row can be signaled). If the tiles are uniformly spaced, the tiles can have the same width and height, as sketched below.
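  • As an illustration of the uniform case, the integer split below follows an HEVC-style derivation of per-column widths (per-row heights are analogous); this is a sketch under that assumption, not the patent's normative equation (Python):

        def uniform_sizes(num_minus1, pic_size_in_ctbs):
            # Split pic_size_in_ctbs CTBs into num_minus1 + 1 nearly equal parts.
            n = num_minus1 + 1
            return [(i + 1) * pic_size_in_ctbs // n - i * pic_size_in_ctbs // n
                    for i in range(n)]

        # Example: a picture 11 CTBs wide split into 3 tile columns.
        print(uniform_sizes(2, 11))  # [3, 4, 4]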
  • Another flag (for example, the syntax element loop_filter_across_tiles_enabled_flag) can be parsed to determine whether or not a loop filter is enabled across tile boundaries.
  • Table 1 summarizes examples of main information about tiles that can be derived by parsing the PPS.
  • Table 1 can represent the PPS RBSP syntax.
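  • As a rough illustration of the parsing flow summarized above, the sketch below reads the tile-related elements named in this section. The BitReader class, the exact element order, and the explicit per-column/per-row loop bounds are assumptions for illustration, not the normative PPS RBSP syntax (Python):

        class BitReader:
            def __init__(self, bits):  # bits: a string such as "0010111"
                self.bits, self.pos = bits, 0
            def u1(self):              # one-bit flag
                b = self.bits[self.pos] == "1"
                self.pos += 1
                return b
            def ue(self):              # unsigned Exp-Golomb code
                zeros = 0
                while not self.u1():
                    zeros += 1
                suffix = 0
                for _ in range(zeros):
                    suffix = (suffix << 1) | int(self.u1())
                return (1 << zeros) - 1 + suffix

        def parse_pps_tile_info(r):
            info = {"single_tile_in_pic_flag": r.u1()}
            if not info["single_tile_in_pic_flag"]:
                info["num_tile_columns_minus1"] = r.ue()
                info["num_tile_rows_minus1"] = r.ue()
                info["uniform_tile_spacing_flag"] = r.u1()
                if not info["uniform_tile_spacing_flag"]:
                    info["tile_col_width"] = [r.ue() for _ in range(info["num_tile_columns_minus1"])]
                    info["tile_row_height"] = [r.ue() for _ in range(info["num_tile_rows_minus1"])]
                info["loop_filter_across_tiles_enabled_flag"] = r.u1()
            return info

        # 2 tile columns, 1 tile row, uniform spacing, loop filter enabled:
        print(parse_pps_tile_info(BitReader("0" + "010" + "1" + "1" + "1")))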
  • FIG. 9 is a block diagram showing a configuration of an encoding apparatus according to an embodiment
  • FIG. 10 is a block diagram showing a configuration of a decoding apparatus according to an embodiment.
  • the encoding device 900 shown in FIG. 9 includes a partitioning module 910 and an encoding module 920.
  • The partitioning module 910 may perform the same and/or similar operations as the image division unit 210 of the encoding device shown in FIG. 2, and the encoding module 920 may perform the same and/or similar operations as the entropy encoding unit 240 of the encoding apparatus shown in FIG. 2.
  • The input video can be divided in the partitioning module 910 and then encoded in the encoding module 920. After being encoded, the encoded video can be output from the encoding device 900.
  • An example of a block diagram of a decoding apparatus is shown in FIG. 10.
  • The decoding apparatus 1000 shown in FIG. 10 includes a decoding module 1010 and a deblocking filter 1020.
  • The decoding module 1010 can perform the same and/or similar operations as the entropy decoding unit 310 of the decoding apparatus shown in FIG. 3, and the deblocking filter 1020 can perform the same and/or similar operations as the filtering unit 350 of the decoding apparatus shown in FIG. 3.
  • The decoding module 1010 decodes the input received from the encoding device 900 to derive information about tiles, and can determine a processing unit based on the decoded information.
  • the deblocking filter 1020 may apply an in-loop deblocking filter to process the processing unit.
  • In-loop filtering may be applied to remove coding artifacts generated during the partitioning process.
  • The in-loop filtering operation may include an adaptive loop filter (ALF), a deblocking filter (DF), a sample adaptive offset (SAO), and the like. After the in-loop filtering, the decoded picture can be output.
  • FIG. 11 is a diagram showing an example of a tile and a tile group unit constituting the current picture.
  • tiles can be grouped to form tile groups.
  • FIG. 11 shows an example in which one picture is divided into tiles and tile groups.
  • the picture includes 9 tiles and 3 tile groups.
  • Each tile group can be independently coded, and each tile group has a tile group header.
  • Tile groups can have a meaning similar to that of a slice group.
  • a tile group can contain one or more tiles.
  • A tile group header can refer to a PPS, and the PPS can in turn refer to an SPS (Sequence Parameter Set).
  • That is, the tile group header can indicate the PPS referenced by the tile group header, and the PPS can refer to the SPS in sequence.
  • The following information can be determined from the tile group header. First, if more than one tile exists per picture, the tile group address and the number of tiles in the tile group are determined. Next, the tile group type, such as intra/predictive/bi-predictive, can be determined. Next, the least significant bits (LSB) of the picture order count (POC) can be determined. Next, if there is more than one tile in the picture, the offset length and the entry points to the tiles can be determined.
  • Table 4 shows an example of the syntax of the tile group header.
  • the tile group header (tile_group_header) can be replaced by a slice header.
  • Table 5 below shows an example of English semantics for the syntax of the tile group header.
  • The tile group header syntax elements tile_group_pic_parameter_set_id and tile_group_pic_order_cnt_lsb shall be the same in all tile group headers of a coded picture.
  • tile_group_pic_parameter_set_id specifies the value of pps_pic_parameter_set_id for the PPS in use. The value of tile_group_pic_parameter_set_id shall be in the range of 0 to 63, inclusive.
  • The value of TemporalId of the current picture shall be greater than or equal to the value of TemporalId of the PPS that has pps_pic_parameter_set_id equal to tile_group_pic_parameter_set_id.
  • tile_group_address specifies the tile address of the first tile in the tile group, where the tile address is the tile ID as specified by Equation c-7.
  • The length of tile_group_address is Ceil(Log2(NumTilesInPic)) bits.
  • The value of tile_group_address shall be in the range of 0 to NumTilesInPic - 1, inclusive. When tile_group_address is not present, it is inferred to be equal to 0.
  • num_tiles_in_tile_group_minus1 plus 1 specifies the number of tiles in the tile group. The value of num_tiles_in_tile_group_minus1 shall be in the range of 0 to NumTilesInPic - 1, inclusive.
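  • The fixed-length coding above can be illustrated with a one-liner; the max(1, ...) guard is only an assumption for the degenerate single-tile case, in which the element is not sent at all but inferred to be 0 (Python):

        import math

        def tile_group_address_bits(num_tiles_in_pic):
            # tile_group_address is coded with Ceil(Log2(NumTilesInPic)) bits.
            return max(1, math.ceil(math.log2(num_tiles_in_pic)))

        print(tile_group_address_bits(12))  # 4 bits for addresses 0..11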
  • tile_group_type specifies the coding type of the tile group according to Table 6.
  • When nal_unit_type is equal to IRAP_NUT, i.e., the picture is an IRAP picture, tile_group_type shall be equal to 2.
  • tile_group_pic_order_cnt_lsb specifies the picture order count modulo MaxPicOrderCntLsb for the current picture. The length of the tile_group_pic_order_cnt_lsb syntax element is log2_max_pic_order_cnt_lsb_minus4 + 4 bits. The value of tile_group_pic_order_cnt_lsb shall be in the range of 0 to MaxPicOrderCntLsb - 1, inclusive.
  • offset_len_minus1 plus 1 specifies the length, in bits, of the entry_point_offset_minus1[ i ] syntax elements.
  • The value of offset_len_minus1 shall be in the range of 0 to 31, inclusive.
  • entry_point_offset_minus1[ i ] plus 1 specifies the i-th entry point offset in bytes, and is represented by offset_len_minus1 plus 1 bits.
  • The tile group data that follow the tile group header consist of num_tiles_in_tile_group_minus1 + 1 subsets, with subset index values ranging from 0 to num_tiles_in_tile_group_minus1, inclusive.
  • the tile group may include a tile group header and tile group data.
  • Each CTU in the tile group can be mapped to its location and decoded.
  • Table 7 below shows an example of the syntax of tile group data. In Table 7, tile group data can be replaced with slice data.
  • Table 8 below shows an example of English semantics for the syntax of the tile group data.
  • In the semantics of the tile group data, for example, the vertical CTB position can be derived as ctbY = ctbAddrRs / PicWidthInCtbsY, the number of CTUs in a tile can be derived as NumCtusInTile[ tileIdx ] = ColWidth[ i ] * RowHeight[ j ], and a flag tileStartFlag can be set to 1 at the first CTU of each tile and to 0 otherwise.
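  • A sketch of these derivations is given below: the CTB row of a raster-scan address, the per-tile CTU count, and a raster-to-tile-scan mapping. This is a simplified take on an HEVC-style CtbAddrRsToTs derivation, assumed here for illustration rather than taken from the patent's equations (Python):

        def ctb_y(ctb_addr_rs, pic_width_in_ctbs):
            return ctb_addr_rs // pic_width_in_ctbs  # ctbY = ctbAddrRs / PicWidthInCtbsY

        def raster_to_tile_scan(pic_w, col_widths, row_heights):
            # Returns a list mapping ctbAddrRs -> ctbAddrTs.
            col_x = [sum(col_widths[:i]) for i in range(len(col_widths) + 1)]
            row_y = [sum(row_heights[:j]) for j in range(len(row_heights) + 1)]
            order = []
            for j in range(len(row_heights)):        # tiles in raster order...
                for i in range(len(col_widths)):
                    for y in range(row_y[j], row_y[j + 1]):
                        for x in range(col_x[i], col_x[i + 1]):  # ...CTUs inside each tile
                            order.append(y * pic_w + x)
            rs_to_ts = [0] * len(order)
            for ts, rs in enumerate(order):
                rs_to_ts[rs] = ts
            return rs_to_ts

        # NumCtusInTile[tileIdx] = ColWidth[i] * RowHeight[j]:
        col_widths, row_heights = [2, 2], [2]
        print([w * h for h in row_heights for w in col_widths])   # [4, 4]
        print(raster_to_tile_scan(4, col_widths, row_heights))    # [0, 1, 4, 5, 2, 3, 6, 7]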
  • Some implementations running on CPUs require dividing the source picture into tiles and tile groups, where each tile group can be processed in parallel on a separate core.
  • Such parallel processing is useful, for example, for high-resolution real-time encoding of videos.
  • The above parallel processing can reduce the sharing of information between tile groups, thereby relaxing the memory constraint. Tiles can be distributed to different threads while processing in parallel; therefore, parallel architectures can benefit from this partitioning mechanism.
  • the maximum transmission unit (MTU) size matching is reviewed.
  • Coded pictures transmitted through a network are subject to fragmentation when the coded pictures are larger than the MTU size. Conversely, if the coded segments are too small, the IP (Internet Protocol) header overhead can become significant. Packet fragmentation can lead to loss of error resiliency. To mitigate the effects of packet fragmentation, the picture can be divided into tiles and each tile/tile group can be packed as a separate packet, so that the packet may be smaller than the MTU size.
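  • A toy sketch of this MTU matching idea follows; the header budget and payload sizes are made-up numbers for illustration only (Python):

        MTU = 1500
        HEADER_BUDGET = 40  # assumed IP/UDP/RTP overhead, illustrative only

        def packetize(tile_group_payloads, mtu=MTU, header=HEADER_BUDGET):
            # One packet per tile/tile-group payload, each kept below the MTU.
            packets = []
            for payload in tile_group_payloads:
                if header + len(payload) > mtu:
                    raise ValueError("tile group exceeds MTU; re-partition the picture")
                packets.append(payload)
            return packets

        print(len(packetize([b"\x00" * 1200, b"\x00" * 900])))  # 2 packets, no fragmentation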
  • FIG. 13 is a diagram showing an example of a picture in a video conference video program.
  • flexible tiling can be achieved by using a predefined rectangular area.
  • Fig. 13 shows an example of a picture in a video program for video conferencing when a video conference with multiple participants is held.
  • For example, the participants may be Speaker 1, Speaker 2, Speaker 3, and Speaker 4.
  • The area corresponding to each participant in the picture can correspond to each of the preset areas, and each of the preset areas can be coded as a single tile or a group of tiles.
  • The single tile or group of tiles corresponding to each participant may also change.
  • FIG. 14 is a diagram showing an example of partitioning a picture into tiles or tile groups in a video conference video program.
  • an area assigned to a speaker 1 (Speaker 1) participating in a video conference may be coded as a single tile.
  • Similarly, the areas assigned to Speaker 2, Speaker 3, and Speaker 4 can each be coded as a single tile.
  • FIG. 15 is a diagram illustrating an example of partitioning a picture into tiles or tile groups based on MCTS (Motion Constrained Tile Set).
  • a picture can be acquired from 360 degree video data.
  • 360 video can mean video or image content that is captured or played back in all directions (360 degrees) at the same time, as required to provide VR (Virtual Reality).
  • 360 video can refer to a video or image that appears in various types of 3D space depending on the 3D model.
  • a 360 video can be displayed on a spherical surface.
  • A two-dimensional (2D) picture obtained from 360-degree video data can be encoded with at least one spatial resolution.
  • For example, a picture can be encoded with a first resolution and a second resolution, and the first resolution may be higher than the second resolution.
  • a picture can be encoded in two spatial resolutions, each having a size of 1536x1536 and 768x768, but the spatial resolution is not limited thereto and may correspond to various sizes.
  • a 6x4 size tile grid may be used for the bitstreams encoded at each of the two spatial resolutions.
  • a motion constraint tile set (MCTS) for each position of the tiles may be coded and used.
  • each of the MCTSs may include tiles positioned in respective areas set for a picture.
  • An MCTS may contain at least one tile, forming a rectangular set of tiles.
  • a tile can represent a rectangular area composed of coding tree blocks (CTBs) of a two-dimensional picture.
  • A tile can be identified based on a specific tile row and tile column within a picture.
  • When inter prediction is performed on blocks within a specific MCTS in the encoding/decoding process, the blocks within the specific MCTS may be restricted to refer only to the corresponding MCTS of the reference picture for motion estimation/motion compensation.
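  • The constraint can be sketched as a clamp on the motion-compensated reference block, as below. The clamping rule is an illustrative stand-in for the actual encoder-side restriction (Python):

        def clamp_ref_block(ref_x, ref_y, blk_w, blk_h, mcts_rect):
            # mcts_rect = (x0, y0, x1, y1) in samples, inclusive-exclusive:
            # the reference block is kept fully inside the same MCTS region.
            x0, y0, x1, y1 = mcts_rect
            cx = min(max(ref_x, x0), x1 - blk_w)
            cy = min(max(ref_y, y0), y1 - blk_h)
            return cx, cy

        # A motion vector pointing 8 samples left of an MCTS starting at x=256
        # is pulled back to the MCTS boundary:
        print(clamp_ref_block(248, 300, 16, 16, (256, 0, 512, 384)))  # (256, 300)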
  • The 12 first MCTSs 1510 may be extracted from the bitstream encoded at the first resolution of 1536x1536; that is, the first MCTSs 1510 may correspond to a region having the first resolution in the picture.
  • second MCTSs 1520 may correspond to a region having a second resolution in the same picture.
  • the first MCTSs may correspond to the viewport area in the picture.
  • the viewport area may refer to the area that the user is viewing in the 360-degree video.
  • The first MCTSs may correspond to the ROI (Region of Interest) in the picture.
  • the ROI area can refer to the area of interest of users, suggested by the 360 content provider.
  • The MCTSs 1520 can be extracted and merged into a 1920x4708-sized merged picture 1530, and the merged picture 1530 can have four tile groups.
  • tile_addr_val[ i ][ j ] specifies the tile_group_address value of the tile of the i-th tile row and the j-th tile column.
  • The length of tile_addr_val[ i ][ j ] is tile_addr_len_minus1 + 1 bits.
  • tile_addr_val[ i ][ j ] shall not be equal to tile_addr_val[ m ][ n ] when i is not equal to m or j is not equal to n.
  • num_mcts_in_pic_minus1 plus 1 specifies the number of MCTSs in the picture.
  • A syntax element uniform_tile_spacing_flag indicating whether tiles having the same width and height are to be derived by dividing the picture uniformly may be parsed.
  • The uniform_tile_spacing_flag can be used to indicate whether the tiles in the picture are divided in a uniform manner.
  • When the syntax element uniform_tile_spacing_flag is not enabled, the widths of the tile columns and the heights of the tile rows can be parsed, i.e., syntax elements indicating the width of each tile column and the height of each tile row can be parsed.
  • A syntax element indicating whether the tiles in the picture form an MCTS can be parsed. This flag may indicate that the tiles or groups of tiles in the picture may or may not form a motion constrained tile set, and that the use of sample values or variables outside the rectangular tile set is restricted or unrestricted. If the flag is 1, it can be indicated that the picture is divided into MCTSs.
  • The syntax element num_mcts_in_pic_minus1 plus 1 may represent the number of MCTSs. In one embodiment, when the flag is 1, i.e., when the picture is divided into MCTSs, the syntax element num_mcts_in_pic_minus1 can be parsed.
  • For the i-th MCTS, a syntax element (for example, top_left_tile_addr[ i ]) can indicate the tile_group_address value, which is the position of the tile located at the top-left.
  • The syntax element bottom_right_tile_addr[ i ] can indicate the tile_group_address value, which is the location of the tile located at the bottom-right of the i-th MCTS.
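  • Given the two addresses above, the tiles covered by the i-th MCTS can be enumerated as a rectangle over the tile grid. The sketch assumes tile addresses assigned in raster order over num_tile_cols columns (Python):

        def tiles_in_mcts(top_left_addr, bottom_right_addr, num_tile_cols):
            r0, c0 = divmod(top_left_addr, num_tile_cols)
            r1, c1 = divmod(bottom_right_addr, num_tile_cols)
            return [r * num_tile_cols + c
                    for r in range(r0, r1 + 1)
                    for c in range(c0, c1 + 1)]

        # In a 6x4 tile grid, an MCTS from tile 8 (row 1, col 2) to tile 15
        # (row 2, col 3) covers a 2x2 rectangle of tiles:
        print(tiles_in_mcts(8, 15, 6))  # [8, 9, 14, 15]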
  • Table 11 shows an example of the tile group data syntax.
  • tile group data can be replaced with slice data.
  • Table 12 below shows English semantics for the tile group data syntax.
  • FIG. 16 is a diagram showing an example of dividing a picture based on ROI regions.
  • For tiling that partitions a picture into a plurality of tiles, flexible tiling based on a region of interest (ROI) can be achieved.
  • In FIG. 16, a picture can be divided into multiple tile groups based on the ROI regions.
  • Table 15 below shows an example of English semantics for the above syntax.
  • A syntax element tile_group_info_in_pps_flag indicating whether the tile group information related to the tiles included in the tile group exists in the PPS or in a tile group header referring to the PPS may be parsed.
  • When tile_group_info_in_pps_flag is 1, it can indicate that the tile group information exists in the PPS; when tile_group_info_in_pps_flag is 0, it can indicate that the tile group information does not exist in the PPS and exists in the tile group header referring to the PPS.
  • The syntax element num_tile_groups_in_pic_minus1 plus 1 may indicate the number of tile groups in the picture referring to the PPS.
  • The syntax element pps_first_tile_id can represent the tile ID of the first tile of the corresponding tile group.
  • The syntax element pps_last_tile_id can represent the tile ID of the last tile of the corresponding tile group.
  • FIG. 17 is a diagram showing an example of partitioning a picture into a plurality of tiles.
  • For tiling that divides a picture into a plurality of tiles, flexible tiling can be achieved by allowing tiles smaller than the size of the coding tree unit (CTU).
  • the tiling structure according to this method can be usefully applied to modern video applications such as video conferencing programs.
  • A picture may be partitioned into a plurality of tiles, and the size of at least one of the plurality of tiles may be smaller than the size of the Coding Tree Unit (CTU). For example, a picture can be partitioned into Tile 1, Tile 2, Tile 3, and Tile 4, among which the sizes of Tile 1, Tile 2, and Tile 4 are smaller than the size of the CTU.
  • The syntax element tile_size_unit_idc may represent the unit size of the tile. For example, if tile_size_unit_idc is 0, 1, 2, ..., the unit of the height and width of the tile can be defined as 4, 8, 16, ... instead of a coding tree block (CTB).
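  • The mapping just described (0, 1, 2, ... to 4, 8, 16, ...) is a power-of-two scale, i.e., unit = 4 << idc; binding the unit to samples rather than CTBs is an assumption of this sketch (Python):

        def tile_size_unit(tile_size_unit_idc):
            # 0 -> 4, 1 -> 8, 2 -> 16, ... per the semantics above.
            return 4 << tile_size_unit_idc

        print([tile_size_unit(i) for i in range(4)])  # [4, 8, 16, 32]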
  • FIG. 18 shows an example of partitioning a picture into a plurality of tiles and tile groups.
  • a plurality of tiles within a picture can be grouped into a plurality of tile groups, and flexible tiling can be achieved by applying a tile group index to the plurality of tile groups.
  • A tile group may contain tiles arranged in a non-raster scan order.
  • For example, a picture can be partitioned into a plurality of tiles, and the plurality of tiles can be grouped into Tile Group 1, Tile Group 2, and Tile Group 3, where each of Tile Group 1, Tile Group 2, and Tile Group 3 can contain tiles arranged in a non-raster scan order.
  • Table 18 below shows an example of the syntax of the tile group header (tile_group_header).
  • tile group headers can be replaced with slice headers.
  • A syntax element (for example, tile_group_index) that designates an index of each of a plurality of tile groups within a picture may be signaled/parsed. The value of this index for a tile group NAL unit is not the same as the value for another tile group NAL unit in the same picture.
  • tile group headers can be replaced with slice headers.
  • When single_tile_per_tile_group_flag is equal to 1, the value of single_tile_in_tile_group_flag is inferred to be equal to 1.
  • first_tile_id specifies the tile ID of the first tile of the tile group.
  • The length of first_tile_id is Ceil(Log2(NumTilesInPic)) bits.
  • The value of first_tile_id of a tile group shall not be equal to the value of first_tile_id of any other tile group of the same picture.
  • When not present, the value of first_tile_id is inferred to be equal to the tile ID of the first tile of the current picture.
  • last_tile_id specifies the tile ID of the last tile of the tile group.
  • The length of last_tile_id is Ceil(Log2(NumTilesInPic)) bits. When NumTilesInPic is equal to 1 or single_tile_in_tile_group_flag is equal to 1, the value of last_tile_id is inferred to be equal to first_tile_id. When tile_group_info_in_pps_flag is equal to 1, the value of last_tile_id is inferred to be equal to the value of pps_last_tile_id.
  • a syntax element first_tile_id that designates a tile ID of the first tile may be signaled/parsed.
  • The first_tile_id may correspond to the tile ID of the tile located at the top-left of the tile group. In this case, the tile ID of the first tile of a tile group is not the same as the tile ID of the first tile of any other tile group in the same picture.
  • For each of the plurality of tile groups in the picture, the syntax element last_tile_id specifying the tile ID of the last tile can be signaled/parsed.
  • the last_tile_id may correspond to the tile ID of the tile located at the bottom-right of the tile group.
  • When NumTilesInPic is 1 or single_tile_in_tile_group_flag is 1, the value of last_tile_id can be the same as first_tile_id.
  • When tile_group_info_in_pps_flag is 1, the value of last_tile_id can be inferred to be the same as the value of pps_last_tile_id.
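  • Under these semantics, the membership of each tile group can be recovered from its first_tile_id / last_tile_id pair as a rectangle of tile IDs. The raster assignment of tile IDs over num_tile_cols columns is an assumption of this sketch (Python):

        def tile_to_group_map(groups, num_tile_cols):
            # groups: list of (first_tile_id, last_tile_id) pairs, one per tile
            # group; returns a dict tile_id -> tile group index.
            mapping = {}
            for g, (first, last) in enumerate(groups):
                r0, c0 = divmod(first, num_tile_cols)
                r1, c1 = divmod(last, num_tile_cols)
                for r in range(r0, r1 + 1):
                    for c in range(c0, c1 + 1):
                        mapping[r * num_tile_cols + c] = g
            return mapping

        # Two rectangular groups over a 3x2 tile grid (tile IDs 0..5):
        print(tile_to_group_map([(0, 3), (1, 5)], 3))
        # {0: 0, 3: 0, 1: 1, 2: 1, 4: 1, 5: 1}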
  • FIG. 19 shows an example of partitioning a picture into a plurality of tiles and tile groups.
  • Tiles can be secondarily grouped within a tile group of a picture. Accordingly, the size of the tiles can be more effectively controlled, and thus flexible tiling can be achieved.
  • a picture can be first partitioned into three tile groups, and Tile group #2, which is a second tile group, can be additionally partitioned into secondary tile groups.
  • A syntax element num_tile_groups_minus1 can be signaled/parsed.
  • The value of the syntax element num_tile_groups_minus1 plus 1 can represent the number of tile groups in the picture.
  • For each tile group, syntax elements indicating the start tile address and the end tile address of the tile group can be signaled/parsed. The start and end address values of a tile group are not the same as the start and end address values of any other tile group unit in the same picture.
  • In one example, the tile ID of each of a plurality of tiles can be explicitly signaled. For example, a syntax element tile_id_val[ i ] designating the tile ID of the i-th tile in the picture referencing the PPS may be signaled/parsed.
  • Table 27 shows an example of the syntax of the tile group header.
  • The tile group header can be replaced by a slice header.
  • Table 28 below shows an example of English semantics for the syntax of the tile group header.
  • A syntax element tile_group_address that designates the tile ID of the first tile of the tile group in the picture may be signaled/parsed.
  • the value of tile_group_address is not the same as the value of tile_group_address of other tile group NAL units in the same picture.
  • A MANE (Media-Aware Network Element) or a video editor can identify the tile group carried by NAL units, and can remove the corresponding NAL units or provide a sub-bitstream including only the NAL units belonging to a target tile group.
  • For this purpose, the syntax element nuh_tile_group_id may be proposed in the NAL unit header.
  • By parsing and interpreting only the NAL unit header, the network element or video editor can easily identify the tile group carried by the NAL units. In addition, the network element or video editor can remove the corresponding NAL units. Accordingly, a sub-bitstream including the NAL units belonging to the target tile group can be extracted.
  • Table 29 below shows an example of the syntax of the NAL unit header.
  • Table 30 below shows an example of English semantics for the syntax of the NAL unit header.
  • Table 31 below shows an example of the syntax of the tile group header (tile_group_header).
  • a tile group header can be replaced by a slice header.
  • tile_group_id specifying a tile group ID of a tile group in a picture may be signaled/parsed. At this time, the value of tile_group_id is not the same as the value of tile_group_id of another tile group NAL unit in the same picture.
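  • The extraction use case above reduces to filtering NAL units on this ID. The sketch models a NAL unit as a plain (nuh_tile_group_id, payload) pair rather than parsing a real bitstream, and ignores non-VCL units such as parameter sets, which a real extractor would also keep (Python):

        def extract_sub_bitstream(nal_units, target_group_id):
            # nal_units: iterable of (nuh_tile_group_id, payload_bytes) pairs.
            return [payload for group_id, payload in nal_units
                    if group_id == target_group_id]

        nals = [(0, b"group0-vcl"), (1, b"group1-vcl"), (0, b"group0-vcl2")]
        print(extract_sub_bitstream(nals, 0))  # payloads of tile group 0 only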
  • FIG. 20 is a flow chart showing the operation of the decoding apparatus according to an embodiment
  • FIG. 21 is a block diagram showing the configuration of the decoding apparatus according to the embodiment.
  • Each step disclosed in FIG. 20 may be performed by the decoding device 300 disclosed in FIG. 3. More specifically, S2000 and S2010 may be performed by the entropy decoding unit 310 disclosed in FIG. 3.
  • S2020 may be performed by the prediction unit 330 disclosed in FIG. 3
  • S2030 may be performed by the addition unit 340 disclosed in FIG. 3.
  • In addition, operations according to S2000 to S2030 are based on some of the contents described above in FIGS. 1 to 19. Therefore, specific contents overlapping with the contents described above in FIGS. 1 to 19 will be omitted or simplified.
  • As shown in FIG. 21, the decoding apparatus according to an embodiment may include an entropy decoding unit 310, a prediction unit 330, and an addition unit 340. However, in some cases, all of the components shown in FIG. 21 may not be essential components of the decoding device, and the decoding device may be implemented by more or fewer components than the components shown in FIG. 21.
  • In the decoding apparatus, the entropy decoding unit 310, the prediction unit 330, and the addition unit 340 may each be implemented as a separate chip, or at least two or more components may be implemented through a single chip.
  • The decoding apparatus according to an embodiment may obtain, from the bitstream, image information including division information (partition information) for a current picture and prediction information for a current block included in the current picture (S2000).
  • More specifically, the entropy decoding unit 310 of the decoding device can obtain the image information including the division information for the current picture and the prediction information for the current block included in the current picture from the bitstream.
  • The decoding apparatus may derive a partitioning structure of the current picture based on a plurality of tiles, based on the division information for the current picture (S2010). More specifically, the entropy decoding unit 310 of the decoding device can derive the partitioning structure of the current picture based on the plurality of tiles, based on the division information for the current picture. In one example, the plurality of tiles are grouped into a plurality of tile groups, and at least one tile group among the plurality of tile groups may contain tiles arranged in a non-raster scan order.
  • the decoding apparatus may derive predicted samples for the current block based on the prediction information for the current block included in one of the plurality of tiles (S2020). More specifically, the prediction unit 330 of the decoding apparatus may derive prediction samples for the current block based on the prediction information for the current block included in one of the plurality of tiles.
  • The decoding apparatus may restore the current picture based on the predicted samples (S2030). More specifically, the adding unit 340 of the decoding apparatus can restore the current picture based on the predicted samples.
  • The division information for the current picture may include at least one of index information for each of the plurality of tile groups, ID information of the tile located at the top-left of each of the plurality of tile groups, and ID information of the tile located at the bottom-right of each of the plurality of tile groups.
  • The division information for the current picture may include at least one of flag information on whether the ID information of each of the plurality of tiles is explicitly signaled, and the ID information of each of the plurality of tiles.
  • at least one of the flag information and ID information of each of the plurality of tiles may be included in a PPS (Picture Parameter Set) of the image information.
  • The division information for the current picture may include at least one of information on the number of the plurality of tile groups, location information of the CTB (Coding Tree Block) positioned at the top-left for each of the plurality of tile groups, and location information of the CTB located at the bottom-right for each of the plurality of tile groups.
  • information on the number of tile groups, location information of the CTB located at the upper left of each of the plurality of tile groups, and location of the CTB located at the lower right of each of the plurality of tile groups At least one of the information may be included in the PPS (Picture Parameter Set) of the image information.
  • the division information for the current picture may further include ID information of each of the plurality of tile groups.
  • The ID information of each of the plurality of tile groups may be included in the NAL (Network Abstraction Layer) unit header of the image information.
  • According to the above-described embodiments, a picture can be partitioned into a plurality of tiles, and the plurality of tiles can be grouped into tile groups, so that flexible tiling can be achieved.
  • FIG. 22 is a flow chart showing an operation of an encoding device according to an embodiment
  • FIG. 23 is a block diagram showing a configuration of an encoding device according to an embodiment.
  • The encoding apparatus according to FIGS. 22 and 23 can perform operations corresponding to those of the decoding apparatus according to FIGS. 20 and 21. Accordingly, the operations of the encoding apparatus to be described later in FIGS. 22 and 23 can likewise be applied to the decoding apparatus according to FIGS. 20 and 21.
  • Each step disclosed in FIG. 22 may be performed by the encoding apparatus 200 disclosed in FIG. 2. More specifically, S2200 and S2210 may be performed by the image dividing unit 210 disclosed in FIG. 2, S2220 and S2230 may be performed by the prediction unit 220 disclosed in FIG. 2, and S2240 may be performed by the entropy encoding unit 240 disclosed in FIG. 2. In addition, operations according to S2200 to S2240 are based on some of the contents described above in FIGS. 1 to 19. Therefore, detailed contents overlapping with the contents described above in FIGS. 1 to 19 will be omitted or simplified.
  • the encoding apparatus may include an image division unit 210, a prediction unit 220, and an entropy encoding unit 240.
  • However, in some cases, all of the components shown in FIG. 23 may not be essential components of the encoding device, and the encoding device may be implemented by more or fewer components than the components shown in FIG. 23.
  • In the encoding apparatus, the image segmentation unit 210, the prediction unit 220, and the entropy encoding unit 240 may each be implemented as a separate chip, or at least two or more components may be implemented through a single chip.
  • The encoding apparatus according to an embodiment can divide the current picture into a plurality of tiles (S2200).
  • the image dividing unit 210 of the encoding apparatus may divide the current picture into a plurality of tiles.
  • The encoding apparatus may generate division information for the current picture based on the plurality of tiles (S2210). More specifically, the image dividing unit 210 of the encoding apparatus can generate the division information for the current picture based on the plurality of tiles. In one example, the plurality of tiles are grouped into a plurality of tile groups, and at least one tile group among the plurality of tile groups may include tiles arranged in a non-raster scan order.
  • The encoding apparatus according to an embodiment may derive prediction samples for the current block included in one of the plurality of tiles (S2220). More specifically, the prediction unit 220 of the encoding apparatus can derive the prediction samples for the current block included in one of the plurality of tiles.
  • The prediction unit 220 of the encoding device may generate prediction information for the current block based on the prediction samples (S2230).
  • The encoding apparatus may encode image information including the division information for the current picture and the prediction information for the current block (S2240). More specifically, the entropy encoding unit 240 of the encoding apparatus can encode the image information including at least one of the division information for the current picture and the prediction information for the current block.
  • The division information for the current picture may include at least one of index information for each of the plurality of tile groups, ID information of the tile located at the top-left of each of the plurality of tile groups, and ID information of the tile located at the bottom-right of each of the plurality of tile groups.
  • The division information for the current picture may include at least one of flag information on whether the ID information of each of the plurality of tiles is explicitly signaled, and the ID information of each of the plurality of tiles.
  • at least one of the flag information and ID information of each of the plurality of tiles may be included in a PPS (Picture Parameter Set) of the image information.
  • The division information for the current picture may include at least one of information on the number of tile groups, location information of the CTB (Coding Tree Block) positioned at the top-left for each of the plurality of tile groups, and location information of the CTB located at the bottom-right for each of the plurality of tile groups.
  • information on the number of tile groups, location information of the CTB located at the upper left of each of the plurality of tile groups, and location of the CTB located at the lower right of each of the plurality of tile groups At least one of the information may be included in the PPS (Picture Parameter Set) of the image information.
  • The division information for the current picture may further include ID information of each of the plurality of tile groups. Further, the ID information of each of the plurality of tile groups may be included in the NAL (Network Abstraction Layer) unit header of the image information.
  • The above-described method according to this disclosure can be implemented in the form of software, and the encoding device and/or the decoding device according to this disclosure may be included in a device that performs image processing, such as a TV, a computer, a smartphone, a set-top box, or a display device.
  • Modules are stored in memory and can be executed by the processor.
  • The memory can be inside or outside the processor, and can be connected to the processor by various well-known means.
  • Processors may include application-specific integrated circuits (ASICs), other chipsets, logic circuits and/or data processing devices.
  • Memory may include read-only memory (ROM), random access memory (RAM), flash memory, memory cards, storage media, and/or other storage devices.
  • the embodiments described in this disclosure may be implemented and implemented on a processor, microprocessor, controller, or chip.
  • The functional units shown in the respective figures may be implemented and performed on a computer, processor, microprocessor, controller, or chip. In this case, information on instructions or algorithms for implementation can be stored in a digital storage medium.
  • The decoding device and encoding device to which this disclosure is applied may be included in a multimedia broadcasting transmission/reception device, a mobile communication terminal, a home cinema video device, a digital cinema video device, a surveillance camera, a video conversation device, a real-time communication device such as video communication, a mobile streaming device, a storage medium, a camcorder, a video-on-demand (VoD) service providing device, an OTT video (over-the-top video) device, an Internet streaming service providing device, a 3D (three-dimensional) video device, a VR (virtual reality) device, an AR (augmented reality) device, a video phone video device, a transportation terminal (e.g., a vehicle (including self-driving vehicle) terminal, an airplane terminal, a ship terminal, etc.), a medical video device, and the like, and can be used to process video signals or data signals.
  • For example, OTT video (over-the-top video) devices may include game consoles, Blu-ray players, Internet access TVs, home theater systems, smartphones, tablet PCs, and DVRs (Digital Video Recorders).
  • the processing method to which this disclosure is applied may be produced in the form of a program executed by a computer, and may be stored in a computer-readable recording medium.
  • Multimedia data having a data structure according to the present disclosure may also be produced by a computer.
  • the computer-readable recording medium includes all kinds of storage devices and distributed storage devices in which computer-readable data is stored.
  • The computer-readable recording medium may include, for example, a Blu-ray disc (BD), a universal serial bus (USB), a ROM, a PROM, an EPROM, an EEPROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device.
  • The computer-readable recording medium also includes media implemented in the form of carrier waves (e.g., transmission over the Internet).
  • bitstreams generated by the encoding method may be stored in a computer-readable recording medium or transmitted through a wired or wireless communication network.
  • an embodiment of the present disclosure may be implemented as a computer program product using a program code, and the program code may be executed in a computer by an embodiment of the present disclosure.
  • The program code may be stored on a carrier readable by a computer.
  • Figure 24 shows an example of a content streaming system to which the disclosure of this document can be applied.
  • the content streaming system to which this disclosure is applied may largely include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
  • The encoding server serves to generate a bitstream by compressing content input from multimedia input devices such as smartphones, cameras, and camcorders into digital data, and to transmit the bitstream to the streaming server. As another example, when multimedia input devices such as smartphones, cameras, and camcorders directly generate a bitstream, the encoding server may be omitted.
  • the bitstream may be generated by an encoding method or a bitstream generation method to which the present disclosure is applied, and the streaming server may temporarily store the bitstream while transmitting or receiving the bitstream.
  • The streaming server transmits multimedia data to a user device based on a user request through the web server, and the web server serves as a medium that informs the user of which services are available.
  • When the user requests a desired service from the web server, the web server delivers the request to the streaming server, and the streaming server transmits multimedia data to the user.
  • The content streaming system may include a separate control server, and in this case, the control server controls commands/responses between devices in the content streaming system.
  • The streaming server may receive the content from the media storage and/or the encoding server. For example, when receiving the content from the encoding server, the content can be received in real time. In this case, in order to provide a seamless streaming service, the streaming server may store the bitstream for a predetermined time.
  • Examples of the user device include a mobile phone, a smartphone, a laptop computer, a digital broadcasting terminal, a PDA (personal digital assistant), a PMP (portable multimedia player), a navigation device, a slate PC, a tablet PC, an ultrabook, a wearable device (for example, a watch-type terminal (smartwatch), a glass-type terminal (smart glass), or an HMD (head mounted display)), a digital TV, a desktop computer, a digital signage, and the like.
  • Each server in the content streaming system can be operated as a distributed server, and in this case, the data received from each server can be distributed and processed.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to a video decoding method performed by a decoding apparatus, comprising the steps of: acquiring, from a bitstream, image information including partition information for a current picture and prediction information for a current block included in the current picture; deriving, on the basis of the partition information for the current picture, a partitioning structure of the current picture based on a plurality of tiles; deriving prediction samples for the current block on the basis of the prediction information for the current block included in one tile of the plurality of tiles; and restoring the current picture on the basis of the prediction samples.
PCT/KR2020/002730 2019-02-26 2020-02-26 Procédé et appareil de partitionnement d'image sur la base d'informations signalées WO2020175905A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962810942P 2019-02-26 2019-02-26
US62/810,942 2019-02-26

Publications (1)

Publication Number Publication Date
WO2020175905A1 true WO2020175905A1 (fr) 2020-09-03

Family

ID=72238924

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2020/002730 WO2020175905A1 (fr) 2019-02-26 2020-02-26 Procédé et appareil de partitionnement d'image sur la base d'informations signalées

Country Status (1)

Country Link
WO (1) WO2020175905A1 (fr)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150140360A (ko) * 2013-04-08 2015-12-15 마이크로소프트 테크놀로지 라이센싱, 엘엘씨 관심 영역 코딩을 위한 움직임 제약된 타일 세트

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150140360A (ko) * 2013-04-08 2015-12-15 마이크로소프트 테크놀로지 라이센싱, 엘엘씨 관심 영역 코딩을 위한 움직임 제약된 타일 세트

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
COBAN, MUHAMMED: "AHG12: On signalling of tiles", JVET-M0530, JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 13TH MEETING, 4 January 2019 (2019-01-04), Marrakech, XP030198370, Retrieved from the Internet <URL:http://phenix.it-sudparis.eu/jvet/doc_end_user/current_document.php?id=5341> *
DESHPANDE, SACHIN: "AHG12: On Tile Information Signalling", JVET-M0416, JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 13TH MEETING, 4 January 2019 (2019-01-04), Marrakech, XP030200653, Retrieved from the Internet <URL:http://phenix.it-sudparis.eu/jvet/doc_end_user/current_document.php?id=5225> *
HENDRY: "AHG12: On explicit signalling of tile IDs", JVET-M0134-V2, JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 13TH MEETING, 2 January 2019 (2019-01-02), Marrakech, XP030197808, Retrieved from the Internet <URL:http://phenix.it-sudparis.eu/jvet/doc_end_user/current_document.php?id=4939> *
SYCHEV, MAXIM: "AHG12: On tile configuration signalling", JVET-M0137-V1, JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 13TH MEETING, 2 January 2019 (2019-01-02), Marrakech, XP030197812, Retrieved from the Internet <URL:http://phenix.it-sudparis.eu/jvet/doc_end_user/current_document.php?id=4942> *

Similar Documents

Publication Publication Date Title
US20220182681A1 (en) Image or video coding based on sub-picture handling structure
US11575942B2 (en) Syntax design method and apparatus for performing coding by using syntax
US11825080B2 (en) Image decoding method and apparatus therefor
US12052433B2 (en) Image encoding/decoding method and device for signaling information related to sub picture and picture header, and method for transmitting bitstream
JP2024091785A (ja) Nalユニット関連情報に基づく映像又はビデオコーディング
US20240333930A1 (en) Picture partitioning-based coding method and device
JP2024144567A (ja) ピクチャ分割情報をシグナリングする方法及び装置
US20240146920A1 (en) Method for decoding image by using block partitioning in image coding system, and device therefor
US20230308674A1 (en) Method and apparatus for encoding/decoding image on basis of cpi sei message, and recording medium having bitstream stored therein
WO2020175908A1 (fr) Procédé et dispositif de partitionnement d&#39;image en fonction d&#39;informations signalées
US20240205424A1 (en) Image coding method based on information related to tile and information related to slice in video or image coding system
US20230144371A1 (en) Image decoding method and apparatus
US20230016307A1 (en) Method for decoding image on basis of image information including ols dpb parameter index, and apparatus therefor
US20230028326A1 (en) Image coding method based on partial entry point-associated information in video or image coding system
JP2023526535A (ja) 映像コーディング方法及びその装置
KR20230023708A (ko) 영상/비디오 코딩 시스템에서 상위 레벨 신택스를 처리하는 방법 및 장치
WO2020175905A1 (fr) Procédé et appareil de partitionnement d&#39;image sur la base d&#39;informations signalées
WO2020175904A1 (fr) Procédé et appareil de partitionnement d&#39;image sur la base d&#39;informations signalées
US11956450B2 (en) Slice and tile configuration for image/video coding
US20240056591A1 (en) Method for image coding based on signaling of information related to decoder initialization
US20230156228A1 (en) Image/video encoding/decoding method and device
KR20220082082A (ko) 영상 정보를 시그널링하는 방법 및 장치
KR20220083818A (ko) 슬라이스 관련 정보를 시그널링하는 방법 및 장치
KR20220085819A (ko) 영상 디코딩 방법 및 그 장치
CA3162960A1 (fr) Procede et dispositif de signalisation d&#39;informations relatives a une tranche d&#39;un systeme de codage/decodage d&#39;images/de videos

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20762226

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20762226

Country of ref document: EP

Kind code of ref document: A1