CN116193116A - Method and apparatus for encoding and decoding video using picture division information - Google Patents


Publication number: CN116193116A
Authority: CN (China)
Prior art keywords: picture, pictures, information, partition, partitioned
Legal status: Pending
Application number: CN202310212807.0A
Other languages: Chinese (zh)
Inventors: 金燕姬, 石镇旭, 金晖容, 奇明锡, 林成昶, 崔振秀
Current Assignee: Electronics and Telecommunications Research Institute (ETRI)
Original Assignee: Electronics and Telecommunications Research Institute (ETRI)
Application filed by: Electronics and Telecommunications Research Institute (ETRI)
Priority claimed from: PCT/KR2017/003496 (WO2017171438A1)

Classifications

All classifications fall under H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals (H: Electricity; H04: Electric communication technique; H04N: Pictorial communication, e.g. television):

    • H04N19/119 — Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/117 — Filters, e.g. for pre-processing or post-processing
    • H04N19/136 — Incoming video signal characteristics or properties
    • H04N19/172 — Adaptive coding characterised by the coding unit, the unit being an image region, the region being a picture, frame or field
    • H04N19/174 — Adaptive coding characterised by the coding unit, the unit being an image region, the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N19/436 — Implementation details or hardware specially adapted for video compression or decompression, using parallelised computational arrangements
    • H04N19/70 — Syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/114 — Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames

Abstract

A method and apparatus for encoding and decoding video using picture division information are disclosed. Each picture of a plurality of pictures of the video is partitioned into parallel blocks (tiles) or slices based on picture partition information. Each picture is partitioned using one of at least two different methods indicated by the picture partition information, which may indicate two or more picture partition methods. The picture partition method may change periodically or according to a specific rule, and the picture partition information may describe such periodic changes or rules.

Description

Method and apparatus for encoding and decoding video using picture division information
The present application is a divisional application of the invention patent application No. 201780022137.9, entitled "Method and apparatus for encoding and decoding video using picture division information", filed on March 30, 2017.
The present application claims the benefit of Korean patent application No. 10-2016-0038461, filed on March 30, 2016, which is hereby incorporated by reference in its entirety.
Technical Field
The following embodiments relate generally to a video decoding method and apparatus and a video encoding method and apparatus, and more particularly, to a method and apparatus for performing encoding and decoding on a video using picture division information.
Background
With the continued development of the information and communication industry, broadcast services with High-Definition (HD) resolution have spread throughout the world. Through this popularization, a large number of users have become accustomed to high-resolution, high-definition images and/or videos.
To meet users' demands for high definition, many institutions have accelerated the development of next-generation imaging devices. In addition to users' increased interest in High-Definition TV (HDTV) and Full High-Definition (FHD) TV, interest in Ultra-High-Definition (UHD) TV, which has a resolution four or more times that of FHD TV, has also grown. With this increasing interest, image encoding/decoding techniques for images with higher resolution and higher definition are required.
The image encoding/decoding apparatus and method may use an inter prediction technique, an intra prediction technique, an entropy encoding technique, or the like in order to perform encoding/decoding on high resolution and high definition images. The inter prediction technique may be a technique for predicting values of pixels included in a current picture using a temporally preceding picture and/or a temporally following picture. The intra prediction technique may be a technique for predicting a value of a pixel included in a current picture using information about the pixel in the current picture. Entropy coding techniques may be techniques for assigning short codes to more frequently occurring symbols and long codes to less frequently occurring symbols.
In image encoding and decoding processes, prediction may mean generating a prediction signal similar to the original signal. Prediction can be categorized primarily as: prediction that references a spatially reconstructed image, prediction that references a temporally reconstructed image, and prediction that references other symbols. In other words, a temporal reference means that a temporally reconstructed image is referenced, and a spatial reference means that a spatially reconstructed image is referenced.
The current block may be a block that is a target to be currently encoded or decoded. The current block may be referred to as a "target block" or "target unit". In the encoding process, the current block may be referred to as an "encoding target block" or "encoding target unit". In the decoding process, the current block may be referred to as a "decoding target block" or a "decoding target unit".
Inter prediction may be a technique for predicting a current block using a temporal reference and a spatial reference. Intra prediction may be a technique for predicting a current block using only spatial references.
When pictures constituting a video are encoded, each picture may be partitioned into a plurality of portions, and the plurality of portions may be encoded. In this case, in order for the decoder to decode the partitioned picture, information about the partitioned picture may be required.
Disclosure of Invention
Technical problem
Embodiments are directed to providing a method and apparatus for improving encoding efficiency and decoding efficiency through a technique that performs encoding and decoding appropriately based on picture partition information.
Embodiments are also directed to providing a method and apparatus for improving encoding efficiency and decoding efficiency through a technique that determines picture partitions for a plurality of pictures based on a single piece of picture partition information.
Embodiments are directed to a method and apparatus for deriving additional picture partition information from one piece of picture partition information for a bitstream encoded using two or more different pieces of picture partition information.
Embodiments are also directed to providing a method and apparatus that omit transmission or reception of picture partition information for at least some of a plurality of pictures in a video.
Solution scheme
According to an aspect, there is provided a video encoding method, comprising: performing encoding on a plurality of pictures; generating data including picture partition information and a plurality of encoded pictures; wherein each of the plurality of pictures is partitioned using one of at least two different methods corresponding to the picture partition information.
According to another aspect, there is provided a video decoding apparatus including: a control unit configured to acquire picture partition information; and a decoding unit configured to perform decoding on a plurality of pictures, wherein each picture of the plurality of pictures is partitioned using one of at least two different methods based on the picture partition information.
According to another aspect, there is provided a video decoding method, comprising: decoding the picture partition information; decoding is performed on a plurality of pictures based on the picture partition information, wherein each picture of the plurality of pictures is partitioned using one of at least two different methods.
A first picture of the plurality of pictures may be partitioned based on the picture partition information, and a second picture of the plurality of pictures may be partitioned based on additional picture partition information derived from the picture partition information.
The plurality of pictures may be partitioned using a picture partition method that is defined by the picture partition information and changes periodically.
The plurality of pictures may be partitioned using a picture partition method that is defined by the picture partition information and changes according to a rule.
The picture partition information may indicate that the same picture partition method is to be applied to those pictures, among the plurality of pictures, for which the remainder obtained when the picture order count (POC) value of the picture is divided by a first predetermined value is equal to a second predetermined value.
The picture partition information may indicate the number of parallel blocks into which each picture of the plurality of pictures is to be partitioned.
Each picture of the plurality of pictures may be partitioned into a number of parallel blocks determined based on the picture partition information.
Each of the plurality of pictures may be partitioned into a number of slices determined based on the picture partition information.
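As a non-normative illustration of the periodic POC rule and per-picture tile counts described above, the following Python sketch derives each picture's tile ("parallel block") count from a single piece of picture partition information. All field names (period, match_remainder, and the two tile counts) are hypothetical, not actual bitstream syntax.

```python
# Hypothetical sketch: derive each picture's tile ("parallel block") count
# from one piece of picture partition information carrying a periodic rule.
# Field names below are illustrative, not actual bitstream syntax.

def tiles_for_picture(poc, partition_info):
    """Number of tiles for the picture with the given picture order count.

    partition_info fields (all hypothetical):
      'period'          - first predetermined value (the divisor)
      'match_remainder' - second predetermined value
      'tiles_on_match'  - tile count when poc % period == match_remainder
      'tiles_otherwise' - tile count for all other pictures
    """
    if poc % partition_info['period'] == partition_info['match_remainder']:
        return partition_info['tiles_on_match']
    return partition_info['tiles_otherwise']

info = {'period': 4, 'match_remainder': 0,
        'tiles_on_match': 8, 'tiles_otherwise': 2}
counts = [tiles_for_picture(poc, info) for poc in range(8)]
# Pictures whose POC is divisible by 4 use 8 tiles; the rest use 2.
```

Because every picture's partition follows from one shared rule, only this single piece of partition information needs to be signaled, matching the transmission-omission goal stated above.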
The picture partition information may be included in the picture parameter set PPS.
The PPS may include a unified partition indication flag, wherein the unified partition indication flag indicates whether a picture referencing the PPS is partitioned using one of at least two different methods.
The picture partition information may indicate a picture partition method corresponding to a picture at a specific level for the picture.
The level may be a temporal level.
The picture partition information may include reduction indication information for reducing the number of parallel blocks generated from partitioning each picture.
The reduction indication information may be configured to adjust the number of horizontal parallel blocks when the picture horizontal length is greater than the picture vertical length, and to adjust the number of vertical parallel blocks when the picture vertical length is greater than the picture horizontal length.
Here, the picture horizontal length may be the width of the picture, and the picture vertical length may be the height of the picture. The number of horizontal parallel blocks may be the number of parallel blocks arranged along the horizontal (lateral) direction of the picture, and the number of vertical parallel blocks may be the number of parallel blocks arranged along the vertical (longitudinal) direction of the picture.
The picture partition information may include level n reduction indication information for reducing the number of parallel blocks generated from partitioning a picture at level n.
The picture partition information may include reduction indication information for reducing the number of slices generated from partitioning each picture.
The picture partition information may include level n reduction indication information for reducing the number of slices generated from partitioning a picture at level n.
The at least two different methods may be different from each other in the number of slices generated from partitioning each picture.
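The reduction indication rule above can be sketched as follows; the function name and parameters are illustrative assumptions, not actual syntax elements. When the picture is wider than it is tall, the number of horizontal parallel blocks (tile columns) is adjusted; when taller than wide, the number of vertical parallel blocks (tile rows) is adjusted.

```python
# Illustrative sketch of the reduction indication rule (all names are
# hypothetical): the dimension to shrink is chosen from the picture's
# aspect ratio, and the grid never drops below one tile per dimension.

def apply_reduction(width, height, tile_cols, tile_rows, reduction):
    """Shrink the tile grid by `reduction` tiles in one dimension,
    selected from the picture aspect ratio; a square picture is left as-is."""
    if width > height:
        tile_cols = max(1, tile_cols - reduction)
    elif height > width:
        tile_rows = max(1, tile_rows - reduction)
    return tile_cols, tile_rows

# A 1920x1080 picture is wider than tall, so columns shrink: 4x2 -> 3x2.
cols, rows = apply_reduction(1920, 1080, 4, 2, 1)
```

The same sketch applies unchanged to per-level ("level n") reduction: the encoder would simply pass a different `reduction` value for pictures at each temporal level.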
Advantageous effects
A method and apparatus are provided for improving encoding efficiency and decoding efficiency through a technique that performs encoding and decoding appropriately based on picture partition information.
A method and apparatus are provided for improving encoding efficiency and decoding efficiency through a technique that determines picture partitions for a plurality of pictures based on a single piece of picture partition information.
A method and apparatus for deriving additional picture partition information from one piece of picture partition information for a bitstream encoded using two or more different pieces of picture partition information are provided.
A method and apparatus are provided for omitting transmission or reception of picture partition information for at least some of the pictures in a video.
Drawings
Fig. 1 is a block diagram showing the configuration of an embodiment of an encoding apparatus to which the present invention is applied;
fig. 2 is a block diagram showing the configuration of an embodiment of a decoding apparatus to which the present invention is applied;
fig. 3 is a diagram schematically showing a partition structure of an image when the image is encoded and decoded;
fig. 4 is a diagram showing the shape of a Prediction Unit (PU) that a Coding Unit (CU) can include;
fig. 5 is a diagram illustrating the shape of a Transform Unit (TU) that can be included in a CU;
fig. 6 is a diagram for explaining an embodiment of an intra prediction process;
fig. 7 is a diagram for explaining an embodiment of an inter prediction process;
fig. 8 illustrates partitioning a picture using parallel blocks (tiles) according to an embodiment;
fig. 9 shows a reference structure to which GOP level encoding is applied;
fig. 10 shows an encoding order of pictures in a GOP according to an embodiment;
Fig. 11 illustrates parallel encoding of pictures in a GOP according to an embodiment;
fig. 12 illustrates partitioning a picture using slices according to an embodiment;
fig. 13 is a configuration diagram of an encoding apparatus for performing video encoding according to an embodiment;
fig. 14 is a flowchart of an encoding method for performing video encoding according to an embodiment;
fig. 15 is a configuration diagram of a decoding apparatus for performing video decoding according to an embodiment;
fig. 16 is a flowchart of a decoding method for performing video decoding according to an embodiment;
fig. 17 is a configuration diagram of an electronic apparatus implementing an encoding apparatus and/or a decoding apparatus according to an embodiment.
Best mode for carrying out the invention
The following exemplary embodiments will be described in detail with reference to the accompanying drawings showing specific embodiments.
In the drawings, like reference numerals are used to designate the same or similar functions in all respects. The shapes, sizes, etc. of components in the drawings may be exaggerated to make the description clear.
It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Furthermore, it should be noted that, in the exemplary embodiments, the expression for describing "including" a specific component means that another component may be included in the practical scope or technical spirit of the exemplary embodiments, but does not exclude the presence of components other than the specific component.
For convenience of description, individual components are discussed separately. For example, at least two of the components may be integrated into a single component; conversely, one component may be divided into multiple components. Embodiments in which multiple components are integrated, or in which a component is separated, are included in the scope of the present specification as long as they do not depart from its essence.
The embodiments will be described in detail below with reference to the drawings so that those skilled in the art to which the embodiments pertain can easily practice the embodiments. In the following description of the embodiments, a detailed description of known functions or configurations that are considered to obscure the gist of the present specification will be omitted.
Hereinafter, "image" may represent a single picture that forms part of a video, or may represent the video itself. For example, "encoding and/or decoding an image" may mean "encoding and/or decoding a video" and may also mean "encoding and/or decoding any one of a plurality of images constituting a video".
Hereinafter, the terms "video" and "moving picture" may be used to have the same meaning and may be used interchangeably with each other.
Hereinafter, the terms "image", "picture", "frame", and "screen" may be used to have the same meaning and may be used interchangeably with each other.
In the following embodiments, specific information, data, flags, elements, and attributes may have their respective values. A value of 0 corresponding to each of the information, data, flags, elements, and attributes may indicate a logical false or a first predefined value. In other words, the value "0" (logical false) and the first predefined value may be used interchangeably with each other. The value "1" corresponding to each of the information, data, flags, elements, and attributes may indicate a logical true or a second predefined value. In other words, the value "1" (logical true) and the second predefined value may be used interchangeably with each other.
When a variable such as i or j is used to indicate a row, column, or index, the value i may be an integer of 0 or an integer greater than 0, or may be an integer of 1 or an integer greater than 1. In other words, in an embodiment, each of the rows, columns, and indexes may be counted starting from 0, or may be counted starting from 1.
Next, terms to be used in the embodiments will be described.
A unit: "unit" may mean a unit of image encoding and decoding. The meaning of the terms "unit" and "block" may be identical to each other. Furthermore, the terms "unit" and "block" are used interchangeably.
A unit (or block) may be an M×N sample matrix, where M and N are each positive integers. The term "unit" may generally refer to a two-dimensional (2D) array of samples. A sample may be a pixel or a pixel value.
The terms "pixel" and "sample" may be used with the same meaning and are used interchangeably with each other.
During the encoding and decoding of an image, a "unit" may be a region created by partitioning the image. A single image may be partitioned into multiple units. In encoding and decoding an image, a process predefined for each unit may be performed according to the type of the unit. According to function, unit types may be classified into macro units, Coding Units (CUs), Prediction Units (PUs), and Transform Units (TUs). A single unit may be further partitioned into lower-level units having sizes smaller than that of the unit.
The unit partition information may comprise information about the depth of the unit. The depth information may represent the number and/or degree to which the unit is partitioned.
A single unit may be hierarchically partitioned into a plurality of lower-layer units having tree-structure-based depth information. In other words, a unit and a lower-layer unit generated by partitioning that unit correspond to a node and a child node of that node, respectively. Each partitioned lower-layer unit may have depth information. Since the depth information of a unit indicates the number and/or degree of times the unit has been partitioned, the partition information of a lower-layer unit may include information about the size of the lower-layer unit.
In a tree structure, the top node may correspond to the initial unit before partitioning. The top node may be referred to as the "root node" and may have the minimum depth value. Here, the depth of the top node may be level 0.
A node of depth level 1 may represent a unit generated when the initial unit is partitioned once. A node of depth level 2 may represent a unit generated when the initial unit is partitioned twice.
A leaf node of depth level n may represent a unit generated when the initial unit has been partitioned n times.
A leaf node may be a bottom node, which cannot be partitioned further. The depth of a leaf node may be the maximum level. For example, the predefined value of the maximum level may be 3.
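The depth/size relationship described above can be illustrated with a small sketch, assuming a quadtree in which each partitioning step halves the unit's side length and the maximum level is 3 (as in the example):

```python
# Sketch of tree-structured unit partitioning: in a quadtree, a unit at
# depth d has each side halved d times relative to the root unit, and a
# maximum level (here 3, as in the example above) bounds the recursion.

def unit_size_at_depth(root_size, depth, max_depth=3):
    """Side length of a unit at the given depth; depth 0 is the root node."""
    if depth > max_depth:
        raise ValueError("depth exceeds the maximum level")
    return root_size >> depth  # one halving per partitioning step

# A 64x64 root unit partitioned to the maximum level yields 8x8 leaf units.
sizes = [unit_size_at_depth(64, d) for d in range(4)]
```

This also shows why depth information doubles as size information: given the root size, a lower-layer unit's size follows directly from its depth.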
-a Transform Unit (TU): the TUs may be basic units of residual signal encoding and/or residual signal decoding (such as transform, inverse transform, quantization, inverse quantization, transform coefficient encoding, and transform coefficient decoding). A single TU may be partitioned into multiple TUs, wherein each of the multiple TUs has a smaller size.
-a Prediction Unit (PU): a PU may be a basic unit in the execution of prediction or compensation. A PU may be partitioned into multiple partitions by performing the partitioning. The plurality of partitions may also be basic units in the execution of prediction or compensation. The partition generated via partitioning the PU may also be a prediction unit.
-reconstructed neighboring units: the reconstructed neighboring unit may be a unit that has been previously encoded or decoded and reconstructed in the vicinity of the encoding target unit or decoding target unit. The reconstructed neighboring cell may be a cell that is spatially neighboring the target cell or a cell that is temporally neighboring the target cell.
-prediction unit partition: the prediction unit partition may represent the shape of the PU being partitioned.
-parameter set: the parameter set may correspond to information about a header of a structure of the bitstream. For example, the parameter sets may include a sequence parameter set, a picture parameter set, an adaptation parameter set, and the like.
Rate distortion optimization: the encoding device may use rate distortion optimization to provide higher encoding efficiency by utilizing a combination of: the size of a CU, prediction mode, size of a prediction unit, motion information, and size of a TU.
-rate distortion optimization scheme: the scheme may calculate the rate distortion costs for each combination to select the optimal combination from among the combinations. The rate distortion cost may be calculated using equation 1 below. In general, the combination that minimizes the rate-distortion cost may be selected as the optimal combination under the rate-distortion optimization method.
[ equation 1]
D+λ*R
Here, D denotes distortion. D may be the mean of the squared differences (mean squared error) between the original transform coefficients and the reconstructed transform coefficients in the transform block.
R denotes the rate, which may represent a bit rate based on the relevant context information. R may include not only bits generated by coding the transform coefficients but also coding parameter information such as the prediction mode, motion information, and the coded block flag.
λ denotes the Lagrangian multiplier.
The encoding apparatus performs processes such as inter prediction and/or intra prediction, transformation, quantization, entropy coding, inverse quantization, and inverse transformation in order to calculate accurate D and R, but these processes greatly increase the complexity of the encoding apparatus.
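Equation 1 can be illustrated with a minimal sketch: compute J = D + λ·R for each candidate combination and keep the minimum. The candidate names and their D and R values below are invented purely for illustration.

```python
# Minimal sketch of rate-distortion optimization per Equation 1: the
# combination minimizing J = D + lambda * R is selected as optimal.
# Candidate names and their D / R values are invented for illustration.

def rd_cost(distortion, rate, lam):
    return distortion + lam * rate

def select_best(candidates, lam):
    """candidates: list of (name, distortion D, rate R in bits)."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))

candidates = [
    ('intra', 100.0, 40),  # low distortion, but many bits
    ('inter', 120.0, 10),  # slightly higher distortion, far fewer bits
    ('skip',  300.0,  1),  # nearly free, but a poor match
]
best = select_best(candidates, lam=5.0)
# Costs with lambda = 5: intra 300, inter 170, skip 305 -> 'inter' is chosen.
```

Note how λ trades the two terms off: a larger λ penalizes rate more heavily and pushes the decision toward cheaper modes such as skip.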
-a reference picture: the reference picture may be an image used for inter prediction or motion compensation. The reference picture may be a picture including a reference unit that is referenced by the target unit to perform inter prediction or motion compensation. The terms "picture" and "image" may have the same meaning. Accordingly, the terms "picture" and "image" may be used interchangeably with each other.
-reference picture list: the reference picture list may be a list including reference pictures used for inter prediction or motion compensation. The type of reference picture list may be a merged List (LC), list 0 (L0), list 1 (L1), etc.
-Motion Vector (MV): The MV may be a 2D vector used for inter prediction. For example, an MV may be expressed as (mv_x, mv_y), where mv_x indicates the horizontal component and mv_y indicates the vertical component.
The MV may represent an offset between the target picture and the reference picture.
Search range: the search range may be a 2D region in which a search for MVs is performed during inter prediction. For example, the size of the search range may be mxn. M and N may each be positive integers.
Fig. 1 is a block diagram showing the configuration of an embodiment of an encoding apparatus to which the present invention is applied.
The encoding apparatus 100 may be a video encoding apparatus or an image encoding apparatus. The video may include one or more images (pictures). The encoding device 100 may encode one or more images of the video sequentially over time.
Referring to fig. 1, the encoding apparatus 100 includes an inter prediction unit 110, an intra prediction unit 120, a switcher 115, a subtractor 125, a transform unit 130, a quantization unit 140, an entropy encoding unit 150, an inverse quantization unit 160, an inverse transform unit 170, an adder 175, a filtering unit 180, and a reference picture buffer 190.
The encoding apparatus 100 may perform encoding on an input image in an intra mode and/or an inter mode. The input image may be referred to as a "current image" as a target to be currently encoded.
Further, the encoding apparatus 100 may generate a bitstream including information about encoding by encoding an input image, and may output the generated bitstream.
When the intra mode is used, the switcher 115 can switch to the intra mode. When the inter mode is used, the switcher 115 can switch to the inter mode.
The encoding apparatus 100 may generate a prediction block for an input block in an input image. Further, after the prediction block is generated, the encoding apparatus 100 may encode a residual between the input block and the prediction block. The input block may be referred to as a "current block" as a target to be currently encoded.
When the prediction mode is an intra mode, the intra prediction unit 120 may use pixel values of previously encoded neighboring blocks around the current block as reference pixels. Intra-prediction unit 120 may perform spatial prediction on the current block using the reference pixels and generate prediction samples for the current block via spatial prediction.
The inter prediction unit 110 may include a motion prediction unit and a motion compensation unit.
When the prediction mode is an inter mode, the motion prediction unit may search a reference image for a region that best matches the current block in the motion prediction process, and may derive a motion vector for the current block and the found region. The reference picture may be stored in a reference picture buffer 190. More specifically, when encoding and/or decoding of a reference picture is processed, the reference picture may be stored in the reference picture buffer 190.
The motion compensation unit may generate the prediction block by performing motion compensation using the motion vector. Here, the motion vector may be a two-dimensional (2D) vector for inter prediction. Further, the motion vector may represent an offset between the current image and the reference image.
The subtractor 125 may generate a residual block, where the residual block is a residual between the input block and the prediction block. The residual block is also referred to as a "residual signal".
The transform unit 130 may generate transform coefficients by transforming the residual block, and may output the generated transform coefficients. Here, the transform coefficient may be a coefficient value generated by transforming the residual block. When the transform skip mode is used, the transform unit 130 may omit an operation of transforming the residual block.
By quantizing the transform coefficients, quantized transform coefficient levels may be generated. Here, in an embodiment, the quantized transform coefficient level may also be referred to as a "transform coefficient".
The quantization unit 140 may generate quantized transform coefficient levels by quantizing the transform coefficients according to quantization parameters. The quantization unit 140 may output quantized transform coefficient levels. In this case, the quantization unit 140 may quantize the transform coefficient using a quantization matrix.
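A minimal sketch of the scalar quantization described above is given below. The step-size formula Qstep = 2^((QP − 4) / 6) mirrors the HEVC convention in which the quantization step doubles every 6 QP; the rounding offsets and quantization-matrix scaling of a real codec are omitted, so this is an approximation, not the normative process.

```python
# Hedged sketch of quantization by quantization parameter (QP): the
# quantization step roughly doubles every 6 QP (HEVC-style convention).

def quantize(coeff, qp):
    qstep = 2.0 ** ((qp - 4) / 6.0)
    return int(round(coeff / qstep))   # quantized transform coefficient level

def dequantize(level, qp):
    qstep = 2.0 ** ((qp - 4) / 6.0)
    return level * qstep               # reconstructed (scaled) coefficient
```

For example, at QP 28 the step is 2^4 = 16, so a transform coefficient of 100 maps to level 6 and reconstructs to 96; the difference is the quantization error.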
The entropy encoding unit 150 may generate a bitstream by performing entropy encoding based on probability distribution based on the values calculated by the quantization unit 140 and/or the encoding parameter values calculated in the encoding process. The entropy encoding unit 150 may output the generated bitstream.
In addition to pixel information of an image, the entropy encoding unit 150 may perform entropy encoding on information required to decode the image. For example, information required for decoding an image may include syntax elements and the like.
The encoding parameters may be information required for encoding and/or decoding. The encoding parameters may include information encoded by the encoding device and transmitted to the decoding device, and may also include information derived during the encoding or decoding process. For example, the information transmitted to the decoding device may include syntax elements.
For example, the encoding device may include values or statistical information such as prediction modes, motion vectors, reference picture indexes, encoded block patterns, presence or absence of residual signals, transform coefficients, quantized transform coefficients, quantization parameters, block sizes, and block partition information. The prediction mode may be an intra prediction mode or an inter prediction mode.
The residual signal may represent a difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming a difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming and quantizing a difference between the original signal and the predicted signal. The residual block may be a block-based residual signal.
When entropy coding is applied, fewer bits may be allocated to more frequently occurring symbols and more bits may be allocated to less frequently occurring symbols. Since the symbol is represented by this allocation, the size of the bit string for the target symbol to be encoded can be reduced. Accordingly, the compression performance of video coding can be improved by entropy coding.
Further, for entropy encoding, an encoding method such as exponential-Golomb coding, Context-Adaptive Variable Length Coding (CAVLC), or Context-Adaptive Binary Arithmetic Coding (CABAC) may be used. For example, the entropy encoding unit 150 may perform entropy encoding using a Variable Length Coding (VLC) table. For example, the entropy encoding unit 150 may derive a binarization method for the target symbol. Furthermore, the entropy encoding unit 150 may derive a probability model for the target symbol/bin. The entropy encoding unit 150 may perform entropy encoding using the derived binarization method or probability model.
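Of the methods named above, the order-0 exponential-Golomb code is simple enough to sketch, and it illustrates the bit-allocation principle of the preceding paragraph: smaller (more frequent) symbol values receive shorter codewords.

```python
# Minimal order-0 exponential-Golomb encoder for unsigned values:
# 0 -> "1", 1 -> "010", 2 -> "011", 3 -> "00100", ...
# The codeword is (len-1) leading zeros followed by the binary form of value+1.

def exp_golomb(value):
    assert value >= 0
    bits = bin(value + 1)[2:]             # binary representation of value + 1
    return "0" * (len(bits) - 1) + bits   # prefix zeros, then the binary part

codeword = exp_golomb(3)   # "00100"
```

The prefix of zeros tells the decoder how many bits follow, so the code is uniquely decodable without symbol boundaries being signaled separately.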
Since the encoding apparatus 100 performs encoding via inter prediction, the encoded current image may be used as a reference image for another image to be subsequently processed. Accordingly, the encoding apparatus 100 may decode the encoded current image and store the decoded image as a reference image. For decoding, inverse quantization and inverse transformation of the encoded current image may be performed.
The quantized coefficients may be inverse quantized by the inverse quantization unit 160 and may be inverse transformed by the inverse transformation unit 170. The coefficients that have been inverse quantized and inverse transformed may be added to the prediction block by adder 175. The inverse quantized and inverse transformed coefficients and the prediction block are added, and then a reconstructed block may be generated.
The reconstructed block may be filtered by a filtering unit 180. The filtering unit 180 may apply one or more of a deblocking filter, a Sample Adaptive Offset (SAO) filter, and an Adaptive Loop Filter (ALF) to the reconstructed block or the reconstructed picture. The filtering unit 180 may also be referred to as an "adaptive in-loop filter".
The deblocking filter may remove block distortion that occurs at the boundaries of the blocks. The SAO filter may add the appropriate offset value to the pixel value to compensate for the coding error. The ALF may perform filtering based on a comparison result between the reconstructed block and the original block. The reconstructed block that has been filtered by the filtering unit 180 may be stored in a reference picture buffer 190.
Fig. 2 is a block diagram showing the configuration of an embodiment of a decoding apparatus to which the present invention is applied.
The decoding apparatus 200 may be a video decoding apparatus or an image decoding apparatus.
Referring to fig. 2, the decoding apparatus 200 may include an entropy decoding unit 210, an inverse quantization unit 220, an inverse transformation unit 230, an intra prediction unit 240, an inter prediction unit 250, an adder 255, a filtering unit 260, and a reference picture buffer 270.
The decoding apparatus 200 may receive the bit stream output from the encoding apparatus 100. The decoding apparatus 200 may perform decoding on the bit stream in an intra mode and/or an inter mode. Further, the decoding apparatus 200 may generate a reconstructed image via decoding, and may output the reconstructed image.
For example, an operation of switching to an intra mode or an inter mode based on a prediction mode for decoding may be performed by a switch. When the prediction mode for decoding is an intra mode, the switch may be operated to switch to the intra mode. When the prediction mode for decoding is an inter mode, the switch may be operated to switch to the inter mode.
The decoding apparatus 200 may acquire a reconstructed residual block from an input bitstream and may generate a prediction block. When the reconstructed residual block and the prediction block are acquired, the decoding apparatus 200 may generate the reconstructed block by adding the reconstructed residual block and the prediction block.
The entropy decoding unit 210 may generate symbols by performing entropy decoding on the bitstream based on a probability distribution. The generated symbols may include symbols in the form of quantized coefficient levels. Here, the entropy decoding method may be similar to the entropy encoding method described above. That is, the entropy decoding method may be the inverse of the entropy encoding method described above.
The quantized coefficients may be dequantized by dequantization unit 220. Further, the inversely quantized coefficients may be inversely transformed by the inverse transformation unit 230. As a result of the inverse quantization and inverse transformation of the quantized coefficients, a reconstructed residual block may be generated. Here, the inverse quantization unit 220 may apply a quantization matrix to the quantized coefficients.
When the intra mode is used, the intra prediction unit 240 may generate a prediction block by performing spatial prediction using pixel values of previously decoded neighboring blocks around the current block.
The inter prediction unit 250 may include a motion compensation unit. When the inter mode is used, the motion compensation unit may generate a prediction block by performing motion compensation using a motion vector and a reference image. The reference image may be stored in the reference picture buffer 270.
The reconstructed residual block and the prediction block may be added to each other by an adder 255. The adder 255 may generate a reconstructed block by adding the reconstructed residual block and the prediction block.
The reconstructed block may be filtered by the filtering unit 260. The filtering unit 260 may apply one or more of a deblocking filter, an SAO filter, and an ALF to the reconstructed block or the reconstructed picture. The filtering unit 260 may output the reconstructed image (picture). The reconstructed image may be stored in the reference picture buffer 270 and may then be used for inter prediction.
Fig. 3 is a diagram schematically showing an image partition structure when an image is encoded and decoded.
In order to partition an image efficiently, a Coding Unit (CU) may be used in encoding and decoding. The term "unit" may be used to collectively designate 1) a block including image samples and 2) a syntax element. For example, "partition of a unit" may represent "partition of a block corresponding to the unit".
Referring to fig. 3, the image 300 is sequentially partitioned into units corresponding to the Largest Coding Unit (LCU), and the partition structure of the image 300 may be determined according to the LCUs. Here, the term "LCU" may be used with the same meaning as "Coding Tree Unit (CTU)".
The partition structure may represent a distribution of Coding Units (CUs) in LCU 310 for efficiently encoding an image. Such a distribution may be determined according to whether a single CU is to be partitioned into four CUs. The horizontal and vertical sizes of each CU resulting from partitioning may be half the horizontal and vertical sizes of the CU before being partitioned. Each partitioned CU may be recursively partitioned into four CUs, and in the same manner, the horizontal and vertical sizes of the four CUs are halved.
Here, partitioning of CUs may be performed recursively until a predefined depth. The depth information may be information indicating the size of the CU. Depth information may be stored for each CU. For example, the depth of the LCU may be 0 and the depth of the Smallest Coding Unit (SCU) may be a predefined maximum depth. Here, as described above, the LCU may be a CU having a maximum coding unit size, and the SCU may be a CU having a minimum coding unit size.
Partitioning begins at the LCU 310, and the depth of a CU may be increased by 1 each time the horizontal and vertical sizes of the CU are halved by partitioning. For each depth, a CU that is not partitioned may have a size of 2N×2N. In the case where a CU is partitioned, the CU having the size of 2N×2N may be partitioned into four CUs, each having a size of N×N. The size N is halved each time the depth increases by 1.
Referring to fig. 3, an LCU of depth 0 may have 64×64 pixels. 0 may be the minimum depth. An SCU of depth 3 may have 8 x 8 pixels. 3 may be the maximum depth. Here, a CU having 64×64 pixels as an LCU may be represented by a depth of 0. A CU with 32 x 32 pixels may be represented by a depth of 1. A CU with 16 x 16 pixels may be represented by a depth of 2. A CU with 8×8 pixels as SCU may be represented by depth 3.
Further, information on whether the corresponding CU is partitioned may be represented by partition information of the CU. The partition information may be 1-bit information. All CUs except SCU may include partition information. For example, when a CU is not partitioned, the value of partition information of the CU may be 0. When a CU is partitioned, the value of partition information of the CU may be 1.
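The recursive partitioning and the 1-bit partition information described above can be sketched as a depth-first parse of split flags. This is an illustrative sketch, not decoder source code; the flag source below is a plain Python iterator standing in for entropy-decoded syntax.

```python
# Every CU except the SCU carries 1-bit partition information; a value of 1
# splits the CU into four CUs of half the horizontal and vertical size, and
# the depth of each resulting CU is one greater than that of its parent.

def parse_cu(flags, size, min_size=8, depth=0):
    """Return a nested structure describing the CU tree; leaves are (size, depth)."""
    if size == min_size:                 # SCU: no partition information is present
        return (size, depth)
    if next(flags) == 0:                 # partition information 0: not partitioned
        return (size, depth)
    # partition information 1: partition into four CUs of half the size
    return [parse_cu(flags, size // 2, min_size, depth + 1) for _ in range(4)]

# A 64x64 LCU split once; the fourth 32x32 CU splits again into four 16x16 CUs.
tree = parse_cu(iter([1, 0, 0, 0, 1, 0, 0, 0, 0]), 64)
```

Note that the size of a CU at depth d is simply the LCU size shifted right by d, which matches the 64×64 (depth 0) down to 8×8 (depth 3) example above.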
Fig. 4 is a diagram illustrating the shape of a Prediction Unit (PU) that an encoding unit (CU) can include.
Among CUs partitioned from the LCU, a CU that is no longer partitioned may be divided into one or more Prediction Units (PUs). Such division may also be referred to as "partitioning".
A PU may be a base unit for prediction. The PU may be encoded and decoded in any one of a skip mode, an inter mode, and an intra mode. The PU may be partitioned into various shapes according to various modes.
In the skip mode, no partition may be present in the CU. In the skip mode, the 2N×2N mode 410 may be supported without partitioning, wherein the size of the PU and the size of the CU are equal to each other.
In the inter mode, there may be eight types of partition shapes in the CU. For example, in the inter mode, the 2N×2N mode 410, 2N×N mode 415, N×2N mode 420, N×N mode 425, 2N×nU mode 430, 2N×nD mode 435, nL×2N mode 440, and nR×2N mode 445 may be supported.
In the intra mode, the 2N×2N mode 410 and the N×N mode 425 may be supported.
In the 2N×2N mode 410, a PU having a size of 2N×2N may be encoded. The PU having the size of 2N×2N may represent a PU having the same size as that of the CU. For example, the PU having the size of 2N×2N may have a size of 64×64, 32×32, 16×16, or 8×8.
In the N×N mode 425, a PU having a size of N×N may be encoded.
For example, in intra prediction, when the size of the PU is 8×8, four partitioned PUs may be encoded. The size of each partitioned PU may be 4×4.
When encoding a PU in intra mode, the PU may be encoded using any of a plurality of intra prediction modes. For example, HEVC techniques may provide 35 intra-prediction modes, and a PU may be encoded in any of the 35 intra-prediction modes.
Which of the 2N×2N mode 410 and the N×N mode 425 is to be used to encode the PU may be determined based on rate-distortion cost.
The encoding apparatus 100 may perform an encoding operation on a PU having a size of 2N×2N. Here, the encoding operation may be an operation of encoding the PU in each of the multiple intra prediction modes that can be used by the encoding apparatus 100. Through the encoding operation, the optimal intra prediction mode for the PU having the size of 2N×2N may be derived. The optimal intra prediction mode may be the intra prediction mode exhibiting the minimum rate-distortion cost for encoding the PU having the size of 2N×2N, among the multiple intra prediction modes that can be used by the encoding apparatus 100.
Further, the encoding apparatus 100 may sequentially perform the encoding operation on the respective PUs obtained by N×N partitioning. Here, the encoding operation may be an operation of encoding the PU in each of the multiple intra prediction modes that can be used by the encoding apparatus 100. Through the encoding operation, the optimal intra prediction mode for the PU having the size of N×N may be derived. The optimal intra prediction mode may be the intra prediction mode exhibiting the minimum rate-distortion cost for encoding the PU having the size of N×N, among the multiple intra prediction modes that can be used by the encoding apparatus 100.
The encoding apparatus 100 may determine which of the PU having the size of 2N×2N and the PUs having the size of N×N is to be encoded, based on a comparison between the rate-distortion cost of the PU having the size of 2N×2N and the rate-distortion cost of the PUs having the size of N×N.
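The rate-distortion comparison above can be sketched with the usual Lagrangian cost J = D + λ·R. The candidate names and the (distortion, rate) numbers below are purely illustrative; a real encoder obtains them from the full encode of each candidate, as described in the text.

```python
# Hedged sketch of a rate-distortion mode decision: for each candidate
# (e.g., 2Nx2N vs NxN with their respective best intra modes), compute
# J = D + lambda * R and keep the candidate with the minimum cost.

def rd_cost(distortion, rate, lam):
    return distortion + lam * rate

def choose_partition(candidates, lam):
    """candidates: dict mapping partition name -> (distortion, rate_in_bits)."""
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))

# Illustrative numbers: NxN reduces distortion but spends more bits.
best = choose_partition({"2Nx2N": (1200.0, 96), "NxN": (900.0, 210)}, lam=4.0)
```

With λ = 4 the 2N×2N candidate wins (1200 + 384 = 1584 < 900 + 840 = 1740); a smaller λ, i.e., a higher target quality, would tip the decision toward N×N.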
Fig. 5 is a diagram illustrating the shape of a Transform Unit (TU) that can be included in a CU.
A Transform Unit (TU) may be a basic unit in a CU used for processes such as transform, quantization, inverse transform, inverse quantization, entropy encoding, and entropy decoding. TUs may have a square or rectangular shape.
Among CUs partitioned from LCUs, a CU that is no longer partitioned into CUs may be partitioned into one or more TUs. Here, the partition structure of the TUs may be a quadtree structure. For example, as shown in FIG. 5, a single CU 510 may be partitioned one or more times according to a quadtree structure. With such partitioning, a single CU 510 may be composed of TUs having various sizes.
In the encoding apparatus 100, a Coding Tree Unit (CTU) having a size of 64×64 may be partitioned into a plurality of smaller CUs by a recursive quadtree structure. A single CU may be partitioned into four CUs having the same size. Each CU may be recursively partitioned and may have a quadtree structure.
A CU may have a given depth. When a CU is partitioned, the depth of the CU generated by the partition may be increased by 1 from the depth of the partitioned CU.
For example, the depth of a CU may have a value ranging from 0 to 3. The size of a CU may range from a 64 x 64 size to an 8 x 8 size depending on the depth of the CU.
By recursive partitioning of the CU, the best partitioning method can be selected where the minimum rate distortion cost occurs.
Fig. 6 is a diagram for explaining an embodiment of an intra prediction process.
The arrow extending radially from the center of the graph in fig. 6 may represent the prediction direction of the intra prediction mode. Further, numbers shown near the arrow may represent examples of mode values assigned to intra prediction modes or prediction directions of intra prediction modes.
Intra-coding and/or decoding may be performed using reference samples of units adjacent to the target unit. The neighboring unit may be a neighboring reconstruction unit. For example, intra-frame encoding and/or decoding may be performed using values of reference samples included in each neighboring reconstruction unit or encoding parameters of neighboring reconstruction units.
The encoding apparatus 100 and/or the decoding apparatus 200 may generate a prediction block for the target unit by performing intra prediction based on information about samples in the current picture. When intra prediction is performed, the encoding apparatus 100 and/or the decoding apparatus 200 may perform directional prediction and/or non-directional prediction based on at least one reconstructed reference sample.
The prediction block may represent a block generated as a result of performing intra prediction. The prediction block may correspond to at least one of a CU, PU, and TU.
The unit of the prediction block may have a size corresponding to at least one of a CU, a PU, and a TU. The prediction block may have a square shape with a size of 2N×2N or N×N. The size N×N may include 4×4, 8×8, 16×16, 32×32, 64×64, etc.
Alternatively, the prediction block may be a square block of size 2×2, 4×4, 16×16, 32×32, 64×64, etc. or a rectangular block of size 2×8, 4×8, 2×16, 4×16, 8×16, etc.
Intra prediction may be performed according to an intra prediction mode for the target unit. The number of intra prediction modes that the target unit may have may be a predefined fixed value, and may be a value that is differently determined according to the properties of the prediction block. For example, the attribute of the prediction block may include the size of the prediction block, the type of the prediction block, and the like.
For example, the number of intra prediction modes may be fixed to 35 regardless of the size of the prediction unit. Alternatively, the number of intra prediction modes may be, for example, 3, 5, 9, 17, 34, 35, or 36.
As shown in fig. 6, the intra prediction modes may include two non-directional modes and 33 directional modes. The two non-directional modes may include a DC mode and a planar mode.
For example, in a vertical mode with a mode value of 26, prediction may be performed in the vertical direction based on the pixel value of the reference sample. For example, in a horizontal mode with a mode value of 10, prediction may be performed in the horizontal direction based on the pixel value of the reference sample.
Even in the directional mode other than the above-described modes, the encoding apparatus 100 and the decoding apparatus 200 can perform intra prediction on the target unit using the reference samples according to the angle corresponding to the directional mode.
The intra prediction mode located at the right side with respect to the vertical mode may be referred to as a "vertical-right mode". The intra prediction mode located below the horizontal mode may be referred to as a "horizontal-below mode". For example, in fig. 6, an intra prediction mode in which the mode value is one of 27, 28, 29, 30, 31, 32, 33, and 34 may be the vertical-right mode 613. The intra prediction mode whose mode value is one of 2, 3, 4, 5, 6, 7, 8, and 9 may be the horizontal-down mode 616.
The non-directional modes may include a DC mode and a planar mode. For example, the mode value of the DC mode may be 1. The mode value of the planar mode may be 0.
The directional modes may include angular modes. Among the multiple intra prediction modes, the modes other than the DC mode and the planar mode may be directional modes.
In the DC mode, a prediction block may be generated based on an average of pixel values of a plurality of reference samples. For example, the pixel values of the prediction block may be determined based on an average of the pixel values of the plurality of reference samples.
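The three simplest modes discussed above can be sketched for a small N×N block as follows. This is an illustrative sketch only: real codecs add reference-sample substitution, filtering, and boundary smoothing, all of which are omitted here, and the mode names are labels, not normative syntax.

```python
# Vertical (e.g., mode value 26) copies the above-reference row down,
# horizontal (e.g., mode value 10) copies the left-reference column across,
# and DC (mode value 1) fills the block with the average of all references.

def intra_predict(above, left, mode):
    n = len(above)                   # block is n x n; len(left) == n as well
    if mode == "vertical":
        return [list(above) for _ in range(n)]
    if mode == "horizontal":
        return [[left[r]] * n for r in range(n)]
    if mode == "dc":
        dc = (sum(above) + sum(left)) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError("mode not covered in this sketch")

pred = intra_predict([100, 102, 104, 106], [100, 90, 80, 70], "dc")
```

For the angular modes, the same idea applies, except that the reference sample used for each prediction pixel is selected (and interpolated) along the angle of the mode, as stated above.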
The number of intra prediction modes and the mode values of the respective intra prediction modes described above are merely exemplary. The number of intra prediction modes described above and the mode values of the respective intra prediction modes may be defined differently according to embodiments, implementations, and/or requirements.
The number of intra prediction modes may be different according to the type of color component. For example, the number of prediction modes may be different depending on whether the color component is a luminance (luma) signal or a chrominance (chroma) signal.
Fig. 7 is a diagram for explaining an embodiment of an inter prediction process.
The rectangles shown in fig. 7 may represent images (or pictures). Further, in fig. 7, the arrows may indicate prediction directions. That is, each image may be encoded and/or decoded according to its prediction direction.
Pictures (or images) may be classified into an intra picture (I picture), a unidirectionally predicted picture or predictive coded picture (P picture), and a bidirectionally predicted picture or bidirectionally predictive coded picture (B picture), according to the type of coding. Each picture may be encoded according to its coding type.
When the image that is the target to be encoded is an I picture, the image itself may be encoded without inter prediction. When the image that is the target to be encoded is a P picture, the image may be encoded via inter prediction using only reference pictures in the forward direction. When the image that is the target to be encoded is a B picture, the image may be encoded via inter prediction using reference pictures in both the forward direction and the reverse direction, or may be encoded via inter prediction using reference pictures in one of the forward direction and the reverse direction.
P-pictures and B-pictures encoded and/or decoded using reference pictures may be considered as pictures using inter-prediction.
Hereinafter, inter prediction in inter mode according to an embodiment will be described in detail.
In the inter mode, the encoding apparatus 100 and the decoding apparatus 200 may perform prediction and/or motion compensation on the encoding target unit and the decoding target unit. For example, the encoding apparatus 100 or the decoding apparatus 200 may perform prediction and/or motion compensation by using motion information of neighboring reconstructed blocks as motion information of an encoding target unit or a decoding target unit. Here, the encoding target unit or the decoding target unit may represent a prediction unit and/or a prediction unit partition.
Inter prediction may be performed using reference pictures and motion information. Furthermore, inter prediction may use the skip mode described above.
The reference picture may be at least one of the pictures preceding or following the current picture. In inter prediction, prediction may be performed on a block in the current picture based on the reference picture. Here, the reference picture may represent an image used to predict the block.
Here, the region in the reference picture may be specified by using a reference picture index refIdx indicating the reference picture and a motion vector, which will be described later.
Inter prediction may select a reference picture and a reference block corresponding to a current block in the reference picture, and may generate a prediction block for the current block using the selected reference block. The current block may be a block that is a target to be currently encoded or decoded among blocks in the current picture.
The motion information may be derived by each of the encoding apparatus 100 and the decoding apparatus 200 during inter prediction. Furthermore, the derived motion information may be used to perform inter prediction.
Here, the encoding apparatus 100 and the decoding apparatus 200 may improve encoding efficiency and/or decoding efficiency by using motion information of neighboring reconstructed blocks and/or motion information of co-located blocks (col blocks). The col block may be a block corresponding to the current block in a co-located picture (col picture) that has been previously reconstructed.
The neighboring reconstructed block may be a block existing in the current picture and may be a block that has been previously reconstructed via encoding and/or decoding. The neighboring reconstructed block may be a block adjacent to the current block and/or a block located at an outer corner of the current block. Here, the "block located at the outer corner of the current block" may represent a block vertically adjacent to a neighboring block that is horizontally adjacent to the current block, or a block horizontally adjacent to a neighboring block that is vertically adjacent to the current block.
For example, the neighboring reconstruction unit (block) may be a unit located at the left side of the target unit, a unit located above the target unit, a unit located at the lower left corner of the target unit, a unit located at the upper right corner of the target unit, or a unit located at the upper left corner of the target unit.
Each of the encoding apparatus 100 and the decoding apparatus 200 may determine a block existing at a position spatially corresponding to the current block in the col picture, and may determine a predefined relative position based on the determined block. The predefined relative position may be an inner and/or outer position of the block that is present at a position spatially corresponding to the current block. Furthermore, each of the encoding device 100 and the decoding device 200 may derive a col block based on the predefined relative position that has been determined. Here, the col picture may be any one picture of one or more reference pictures included in the reference picture list.
The block in the reference picture may exist at a position in the reconstructed reference picture that spatially corresponds to the position of the current block. In other words, the position of the current block in the current picture and the position of the block in the reference picture may correspond to each other. Hereinafter, motion information of a block included in a reference picture may be referred to as "temporal motion information".
The method for deriving motion information may vary according to the prediction mode of the current block. For example, as a prediction mode applied to inter prediction, there may be an Advanced Motion Vector Predictor (AMVP) mode, a merge mode, or the like.
For example, when AMVP mode is used as the prediction mode, each of the encoding apparatus 100 and the decoding apparatus 200 may generate a prediction motion vector candidate list using motion vectors of neighboring reconstructed blocks and/or motion vectors of col blocks. The motion vectors of neighboring reconstructed blocks and/or the motion vectors of col blocks may be used as prediction motion vector candidates.
The bitstream generated by the encoding apparatus 100 may include a prediction motion vector index. The prediction motion vector index may represent a best prediction motion vector selected from among prediction motion vector candidates included in the prediction motion vector candidate list. The prediction motion vector index may be transmitted from the encoding apparatus 100 to the decoding apparatus 200 through a bitstream.
The decoding apparatus 200 may select the predicted motion vector of the current block from among the predicted motion vector candidates included in the predicted motion vector candidate list using the predicted motion vector index.
The encoding apparatus 100 may calculate a Motion Vector Difference (MVD) between a motion vector of the current block and a predicted motion vector, and may encode the MVD. The bitstream may include an encoded MVD. The MVD may be transmitted from the encoding apparatus 100 to the decoding apparatus 200 through a bitstream. Here, the decoding apparatus 200 may decode the received MVD. The decoding apparatus 200 may derive a motion vector of the current block using the sum of the decoded MVD and the predicted motion vector.
The bitstream may include a reference picture index or the like for indicating a reference picture. The reference picture index may be transmitted from the encoding apparatus 100 to the decoding apparatus 200 through a bitstream. The decoding apparatus 200 may predict a motion vector of the current block using motion information of neighboring blocks, and may derive the motion vector of the current block using a difference (MVD) between the predicted motion vector and the motion vector. The decoding apparatus 200 may generate a prediction block for the current block based on the derived motion vector and the reference picture index information.
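The AMVP reconstruction described above (motion vector = predicted motion vector + MVD) can be sketched as follows. The candidate values are illustrative; in a real decoder the candidate list is built from neighboring reconstructed blocks and the col block, and the index and MVD are parsed from the bitstream.

```python
# Decoder-side sketch of AMVP: select the predicted motion vector (PMV) from
# the candidate list using the signaled index, then add the decoded MVD.

def reconstruct_mv(candidates, pmv_index, mvd):
    pmv = candidates[pmv_index]
    return (pmv[0] + mvd[0], pmv[1] + mvd[1])

# Hypothetical candidate list: MVs of a neighboring block and of the col block.
candidates = [(4, -2), (0, 0)]
mv = reconstruct_mv(candidates, pmv_index=0, mvd=(1, 3))   # (5, 1)
```

Because both encoder and decoder build the same candidate list, only the small index and the (typically small) MVD need to be transmitted, rather than the full motion vector.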
Since the motion information of the neighboring reconstructed block may be used to encode and decode the target unit, the encoding apparatus 100 may not separately encode the motion information of the target unit in a specific inter prediction mode. If the motion information of the target unit is not encoded, the number of bits transmitted to the decoding apparatus 200 may be reduced, and encoding efficiency may be improved. For example, there may be a skip mode and/or a merge mode that is an inter prediction mode that does not encode motion information of a target unit. Here, each of the encoding apparatus 100 and the decoding apparatus 200 may use an identifier and/or index indicating one neighboring reconstruction block of the plurality of neighboring reconstruction blocks, wherein motion information of the one neighboring reconstruction block is to be used as motion information of the target unit.
Merging is another example of a method of deriving motion information. The term "merge" may refer to merging the motion of multiple blocks, that is, applying the motion information of one block to other blocks as well. When merging is applied, each of the encoding apparatus 100 and the decoding apparatus 200 may generate a merge candidate list using motion information of neighboring reconstructed blocks and/or motion information of col blocks. The motion information may include at least one of: 1) a motion vector, 2) an index of a reference picture, and 3) a prediction direction. The prediction direction may be unidirectional or bidirectional.
Here, merging may be applied on a CU basis or a PU basis. When merging is performed on a CU basis or a PU basis, the encoding apparatus 100 may transmit predefined information to the decoding apparatus 200 through a bitstream. The bitstream may include predefined information. The predefined information may include: 1) Information on whether to perform merging for each block partition, and 2) information on a neighboring block to be used for performing merging among a plurality of neighboring blocks neighboring to the current block. For example, neighboring blocks of the current block may include a left neighboring block of the current block, an upper neighboring block of the current block, a temporally neighboring block of the current block, and the like.
The merge candidate list may represent a list in which pieces of motion information are stored. Further, a merge candidate list may be generated before performing the merge. The motion information stored in the merge candidate list may be 1) motion information of a neighboring block adjacent to the current block and 2) motion information of a co-located block corresponding to the current block in the reference picture. Further, the motion information stored in the merge candidate list may be new motion information generated by combining pieces of motion information previously existing in the merge candidate list.
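The candidate-list assembly just described can be sketched as follows. The function name, the candidate tuple layout `(mv, ref_idx, direction)`, and the candidate cap are illustrative assumptions; real codecs apply more elaborate pruning and also synthesize combined bi-predictive candidates, which this sketch omits.

```python
def build_merge_candidate_list(spatial_candidates, col_candidates, max_candidates=5):
    """Collect motion information from spatial neighbors and the co-located
    block, dropping unavailable entries and duplicates, up to a fixed cap."""
    merge_list = []
    for cand in spatial_candidates + col_candidates:
        if cand is not None and cand not in merge_list:
            merge_list.append(cand)
    return merge_list[:max_candidates]

# Each candidate: (motion vector, reference picture index, prediction direction)
spatial = [((1, 0), 0, "L0"), ((1, 0), 0, "L0"), ((2, 1), 1, "BI")]
col = [((0, 0), 0, "L0")]
candidates = build_merge_candidate_list(spatial, col)
```

The encoder then signals only the index of the chosen entry in this list, which both sides build identically.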
The skip mode may be a mode in which information on neighboring blocks is applied to the current block without change. The skip mode may be one of a plurality of modes for inter prediction. When the skip mode is used, the encoding apparatus 100 may transmit only information on a block whose motion information is to be used as motion information of the current block to the decoding apparatus 200 through a bitstream. The encoding apparatus 100 may not transmit other information to the decoding apparatus 200. For example, the other information may be syntax information. The syntax information may include Motion Vector Difference (MVD) information.
Partitioning a picture using picture partition information
When pictures constituting a video are encoded, each picture may be partitioned into a plurality of portions, and the plurality of portions may be encoded separately. In this case, in order for the decoding apparatus to decode the partitioned picture, information about the partition of the picture may be required.
The encoding device may transmit picture partition information indicating a partition of a picture to the decoding device. The decoding apparatus may decode the picture using the picture partition information.
The header information of the picture may include picture partition information. Alternatively, the picture partition information may be included in header information of the picture. The picture header information may be information applied to each of the one or more pictures.
In one or more consecutive pictures, picture partition information indicating how each picture is partitioned may be changed if the partition of the picture is changed. When the picture partition information has changed while processing a plurality of pictures, the encoding apparatus may transmit new picture partition information to the decoding apparatus according to the change.
For example, a Picture Parameter Set (PPS) may include picture partition information, and the encoding device may transmit the PPS to the decoding device. The PPS may include a PPS ID as an Identifier (ID) of the PPS. The encoding apparatus may inform the decoding apparatus of which PPS is used for the picture through the PPS ID. The picture may be partitioned based on picture partition information of the PPS.
In the encoding of video, the picture partition information for the pictures constituting the video may change frequently and repeatedly. If the encoding apparatus must transmit new picture partition information to the decoding apparatus every time the picture partition information changes, encoding efficiency and decoding efficiency may be lowered. Therefore, if the encoding, transmission, and decoding of picture partition information can be omitted even when the picture partition information applied to each picture changes, encoding efficiency and decoding efficiency can be improved.
In the following embodiments, a method is described for deriving additional picture partition information from a single piece of picture partition information, for a bitstream of video encoded using two or more pieces of picture partition information.

Since the additional picture partition information is derived from the single signaled piece of picture partition information, at least two different picture partition methods can be provided while only one piece of picture partition information is transmitted.
Fig. 8 illustrates partitioning a picture using parallel blocks according to an embodiment.
In fig. 8, the picture is indicated by a solid line, and the parallel block is indicated by a broken line. A picture may be partitioned into multiple parallel blocks.
Each parallel block may be one of the entities that serves as a partition unit of the picture. The parallel block may be a partition unit of a picture. Alternatively, the parallel block may be a unit of picture partition coding.
Information about the parallel blocks may be signaled by a Picture Parameter Set (PPS). The PPS may contain information about parallel blocks of a picture or information required to partition a picture into multiple parallel blocks.
Table 1 below shows an example of the structure of pic_parameter_set_rbsp. The picture partition information may be pic_parameter_set_rbsp or may include pic_parameter_set_rbsp.
TABLE 1
[Table 1 appears in the original as an image; it shows the syntax structure of pic_parameter_set_rbsp, including the elements tiles_enabled_flag, num_tile_columns_minus1, num_tile_rows_minus1, uniform_spacing_flag, column_width_minus1[i], and row_height_minus1[i] described below.]
"pic_parameter_set_rbsp" may include the following elements.
-tiles_enabled_flag: the "tiles_enabled_flag" may be a parallel block presence indication flag indicating whether one or more parallel blocks exist in a picture referencing PPS.
For example, a tiles_enabled_flag value of "0" may indicate that there are no parallel blocks in the picture referencing the PPS. A tiles_enabled_flag value of "1" may indicate that there are one or more parallel blocks in the picture referencing the PPS.
The values of the parallel block presence indication flag tiles_enabled_flag of all activated PPS in a single Coded Video Sequence (CVS) may be identical to each other.
-num_tile_columns_minus1: "num_tile_columns_minus1" may be information on the number of parallel-block columns, corresponding to the number of parallel blocks arranged in the horizontal direction of the partitioned picture. For example, the value of "num_tile_columns_minus1 + 1" may represent the number of horizontally arranged parallel blocks in the partitioned picture. Alternatively, the value of "num_tile_columns_minus1 + 1" may represent the number of parallel blocks in a row.

-num_tile_rows_minus1: "num_tile_rows_minus1" may be information on the number of parallel-block rows, corresponding to the number of parallel blocks arranged in the vertical direction of the partitioned picture. For example, the value of "num_tile_rows_minus1 + 1" may represent the number of vertically arranged parallel blocks in the partitioned picture. Alternatively, the value of "num_tile_rows_minus1 + 1" may represent the number of parallel blocks in a column.

-uniform_spacing_flag: The "uniform_spacing_flag" may be an equal-division indication flag indicating whether the picture is equally divided into parallel blocks in the horizontal and vertical directions. That is, the uniform_spacing_flag may be a flag indicating whether the sizes of the parallel blocks in the picture are identical to each other. For example, a uniform_spacing_flag value of "0" may indicate that the picture is not equally partitioned in the horizontal and/or vertical directions, and a value of "1" may indicate that the picture is equally partitioned in the horizontal and vertical directions. When the uniform_spacing_flag value is "0", elements that define the partition in more detail, such as column_width_minus1[i] and row_height_minus1[i], which will be described later, may additionally be required in order to partition the picture.
Column_width_minus1[ i ]: "column_width_minus1[ i ]" may be parallel block width information corresponding to the width of the parallel block in the i-th column. Here, i may be an integer equal to or greater than 0 and less than the number n of columns of parallel blocks. For example, "column_width_minus1[ i ] +1" may represent the width of the parallel blocks in column i+1. The width may be represented by a predetermined unit. For example, the unit of width may be a Coding Tree Block (CTB).
Row_height_minus1[ i ]: "row_height_minus1[ i ]" may be parallel block height information corresponding to the height of the parallel blocks in the i-th row. Here, i may be an integer equal to or greater than 0 and less than the number n of rows of the parallel block. For example, "row_height_minus1[ i ] +1" may represent the height of parallel blocks in row i+1. The height may be represented by a predetermined unit. For example, the unit of height may be a Coding Tree Block (CTB).
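When uniform_spacing_flag is "1", the per-column widths and per-row heights are not signaled but computed. A minimal sketch of the nearly-equal split, using the integer-division rule HEVC applies for uniform tile spacing (the function name is illustrative):

```python
def uniform_tile_sizes(pic_size_in_ctbs, num_tiles):
    """Split one picture dimension (in CTBs) into num_tiles spans that differ
    in size by at most one CTB, as with uniform_spacing_flag == 1."""
    return [
        (i + 1) * pic_size_in_ctbs // num_tiles - i * pic_size_in_ctbs // num_tiles
        for i in range(num_tiles)
    ]

# e.g. a picture 10 CTBs wide split into 3 tile columns -> widths 3, 3, 4
column_widths = uniform_tile_sizes(10, 3)
```

The same helper applies to rows; only when the flag is "0" do column_width_minus1[i] and row_height_minus1[i] carry explicit sizes.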
In an example, the picture partition information may be included in the PPS and may be transmitted as part of the PPS when the PPS is transmitted. The decoding apparatus may obtain picture partition information required for partitioning a picture by referring to PPS of the picture.
In order to signal picture partition information different from information that has been previously transmitted, the encoding apparatus may transmit a new PPS to the decoding apparatus, wherein the new PPS includes the new picture partition information and the new PPS ID. Subsequently, the encoding device may send the slice header containing the PPS ID to the decoding device.
Proposed method for signaling parallel-block-based picture partition information that changes according to a specific rule
As described above, in a series of pictures, pieces of picture partition information applied to the pictures can be changed. A new PPS may need to be retransmitted each time the picture partition information changes.
In a series of pictures, pieces of picture partition information applied to the pictures may be changed according to a specific rule. For example, the picture partition information may be periodically changed according to the number of pictures.
When the pieces of picture partition information change according to such a specific rule, transmission of the picture partition information may be omitted by exploiting the rule. For example, the decoding apparatus may derive the picture partition information of one picture from the picture partition information of another picture that has been previously transmitted.
Typically, it may not be necessary to change pieces of picture partition information for each picture, and pieces of picture partition information may be repeated at a fixed period and according to a specific rule.
For example, picture partitioning may be performed in accordance with a parallel coding strategy. In order to perform parallel encoding on pictures, the encoding apparatus may partition each picture into parallel blocks. The decoding apparatus may use information on the parallel encoding strategy to obtain a rule corresponding to the periodic change of the picture partition information.
For example, when parallel blocks are used as the picture partition tool, a periodicity change rule related to a method for partitioning a single picture into a plurality of parallel blocks may be derived based on information of a parallel coding strategy of a coding apparatus.
Fig. 9 illustrates a reference structure to which group of pictures (GOP) level encoding is applied according to an embodiment.
In fig. 9, a picture constituting a GOP and a reference relationship between pictures are shown.
When a sequence of pictures is encoded, a GOP may be applied. Random access may be made to video encoded by GOP.
In fig. 9, the size of the GOP is shown as 8. For example, a single GOP may be a group of 8 pictures.
In fig. 9, each picture is shown as a rectangle. The "I", "B", or "b" in each picture may represent the type of the picture. The horizontal position of a picture may represent the temporal order of the pictures. The vertical position of a picture may represent the level of the picture. Here, the "level" may be a temporal level. For example, the GOP level of each picture may correspond to the temporal level of the picture. Alternatively, the GOP level of a picture may be the same as the temporal level of the picture.
The GOP level for each picture may be determined by a Picture Order Count (POC) value of the picture. The GOP level of a picture can be determined by the remainder obtained when the POC value of the picture is divided by the size of the GOP. In other words, when the POC value of a picture is a multiple of 8 (8 k), the GOP level of the picture may be 0. Here, k may be an integer of 0 or more. When the POC value of a picture is (8k+4), the GOP level of the picture may be 1. When the POC value of a picture is (8k+2) or (8k+6), the GOP level of the picture may be 2. When the POC value of a picture is (8k+1), (8k+3), (8k+5) or (8k+7), the GOP level of the picture may be 3.
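The POC-to-level mapping above can be written directly as a small helper; this is a sketch of the size-8 hierarchical GOP of fig. 9 (the function name is an illustrative assumption):

```python
def gop_level(poc, gop_size=8):
    """GOP (temporal) level from the POC remainder for a size-8 GOP:
    remainder 0 -> level 0, 4 -> level 1, 2 or 6 -> level 2, odd -> level 3."""
    r = poc % gop_size
    if r == 0:
        return 0
    if r == 4:
        return 1
    if r in (2, 6):
        return 2
    return 3
```

So POC 12 (= 8k + 4) is at level 1, and every odd POC is at level 3, matching the 0/1/2/3 assignment in the text.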
In fig. 9, pictures are divided by GOP levels ranging from GOP level 0 to GOP level 3. Arrows between pictures may represent reference relationships between pictures. For example, an arrow from a first I picture to a second b picture may indicate that the first I picture is referenced by the second b picture.
Fig. 10 shows the coding order of pictures in a GOP according to an embodiment.
In fig. 10, a sequence of pictures, an Instantaneous Decoder Refresh (IDR) period in the sequence, and a GOP are shown. Further, the coding order of pictures in a GOP is shown.
In fig. 10, the uncolored picture may be a picture at GOP level 0 or 1. The lightly colored picture may be a picture at GOP level 2. The deeply colored picture may be a picture at GOP level 3.
As shown in the drawing, the coding order of pictures in a GOP may be determined such that the type (and level) of each picture takes precedence over the temporal order of the pictures.
Fig. 11 illustrates parallel encoding of pictures in a GOP according to an embodiment.
In an embodiment, for pictures at GOP levels (such as the pictures shown in fig. 9), the encoding device may encode the pictures using a combination of picture-level parallelization and parallel block-level parallelization.
Picture-level parallelization may mean that pictures that do not reference each other, and can therefore be encoded independently of each other, are encoded in parallel.
Parallel block level parallelization may be parallelization related to partitioning a picture. Parallel block level parallelization may refer to a single picture being partitioned into multiple parallel blocks, and the multiple parallel blocks being encoded in parallel.
Both picture-level parallelization and parallel block-level parallelization can be applied to the parallelization of pictures at the same time. Alternatively, picture level parallelization may be combined with parallel block level parallelization.
For this parallelization, as shown in fig. 9, the GOP may be designed such that, among the pictures in the GOP, the pictures at the same GOP level (other than the picture at GOP level 0) do not reference each other. That is, in fig. 9, the B pictures at GOP level 2 may not reference each other, and the b pictures at GOP level 3 may not reference each other.
Under this design, a scheme may be devised that enables the remaining pictures other than the picture at GOP level 0 among the pictures in the GOP to be encoded in parallel. Since two pictures at GOP level 2 do not refer to each other, two pictures at GOP level 2 can be encoded in parallel. Further, since four pictures at GOP level 3 do not refer to each other, four pictures at GOP level 3 can be encoded in parallel.
Under such an encoding scheme, the number and shape of partitions of a picture may be differently allocated according to GOP levels of the picture. The number of partitions per picture may indicate the number of parallel blocks or stripes into which the picture is partitioned. The shape of the partitions of the picture may represent the size and/or location of the individual parallel blocks or stripes.
In other words, the number and shape of partitions of a picture may be determined based on GOP levels of the picture. Each picture may be partitioned into a certain number of portions according to GOP levels of the picture.
The GOP level of a picture and the partition of the picture may have a specific relationship. Pictures at the same GOP level may have the same picture partition information.
For example, when parallelization such as that shown in fig. 11 is designed, if the picture at GOP level 0 and the picture at GOP level 1 are each partitioned into 4N parts, each picture at GOP level 2 may be partitioned into 2N parts, and each picture at GOP level 3 may be partitioned into N parts. Here, N may be an integer of 1 or more. Under this design, the total number of threads used for the parts encoded in parallel can be kept fixed when picture-level parallelization and parallel-block-level parallelization are used simultaneously. That is, when there are additional pictures that can be encoded or decoded in parallel with a specific picture, picture-level parallelization may be applied first, and the degree of parallel-block-level parallelization within each picture may be set roughly in inverse proportion to the degree of picture-level parallelization.
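The 4N/2N/N allocation keeps the total thread budget constant: pictures-in-parallel times parts-per-picture is 4N at every level. A sketch with illustrative names (the dictionaries simply encode the fig. 11 design, they are not signaled syntax):

```python
def num_partitions(level, n=1):
    """Parts (parallel blocks) per picture by GOP level:
    4N for levels 0 and 1, 2N for level 2, N for level 3."""
    return {0: 4 * n, 1: 4 * n, 2: 2 * n, 3: n}[level]

# Pictures that can be encoded in parallel at each level under the fig. 11 design
pictures_in_parallel = {0: 1, 1: 1, 2: 2, 3: 4}
```

For any level, `num_partitions(level, n) * pictures_in_parallel[level] == 4 * n`, which is the fixed thread count the text refers to.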
In this embodiment, a method is proposed in which picture partition information that changes periodically or according to a specific rule is not transmitted through several PPSs; instead, the changed picture partition information of the other pictures is derived from the picture partition information included in a single PPS. Alternatively, one piece of picture partition information may indicate a plurality of picture partition shapes, with each picture partitioned into a different shape according to the plurality of picture partition shapes.
For example, the picture partition information may indicate the number of parallel processed pictures at each of the particular GOP levels. The number of partitions per picture can be obtained using the picture partition information.
The description of GOP levels given in connection with partitioning a picture in the above-described embodiment can also be applied to a temporal identifier (temporal ID) or a temporal level. In other words, in the embodiments, "GOP level" may be replaced by "temporal level" or "temporal identifier".
The temporal identifier may indicate a level in the hierarchical temporal prediction structure.
The temporal identifier may be included in a Network Abstraction Layer (NAL) unit header.
Fig. 12 illustrates partitioning a picture using slices according to an embodiment.

In fig. 12, the picture is indicated by a solid line, the slices are indicated by thick dotted lines, and the Coding Tree Units (CTUs) are indicated by thin dotted lines. As shown in the drawing, a picture may be partitioned into a plurality of slices. A slice may consist of one or more consecutive CTUs.

A slice may be one of the entities used as a partition unit of a picture. The slice may be a partition unit of a picture. Alternatively, the slice may be a unit of picture partition coding.

Information about a slice may be signaled through the slice segment header. The slice segment header may contain information about the slice.
When a slice is a unit of picture partition coding, the picture partition information may define a start address of each of one or more slices.
The unit of the start address of each slice may be the CTU. The picture partition information may define the starting CTU address of each of the one or more slices. The partition shape of a picture may be defined by the start addresses of the slices.
Table 2 below shows an example of the structure of the slice_segment_header. The picture partition information may be or may include slice_segment_header.
TABLE 2
[Table 2 appears in the original as an image; it shows the syntax structure of slice_segment_header, including the elements first_slice_segment_in_pic_flag, dependent_slice_segment_flag, and slice_segment_address described below.]
The "slice_segment_header" may include the following elements.
First_slice_segment_in_pic_flag: the "first_slice_segment_in_pic_flag" may be a first slice indication flag indicating whether the slice indicated by the slice_segment_header is the first slice in the picture.
For example, a first slice segment in pic flag value of "0" may indicate that the corresponding slice is not the first slice in the picture. The first slice segment in pic flag value of "1" may indicate that the corresponding slice is the first slice in the picture.
-dependent_slice_segment_flag: The "dependent_slice_segment_flag" may be a dependent slice segment indication flag indicating whether the slice indicated by the slice_segment_header is a dependent slice.

For example, a dependent_slice_segment_flag value of "0" may indicate that the corresponding slice is not a dependent slice. A dependent_slice_segment_flag value of "1" may indicate that the corresponding slice is a dependent slice.

For example, the substream slices used for Wavefront Parallel Processing (WPP) may be dependent slices. For each dependent slice there may be a corresponding independent slice. When the slice indicated by the slice_segment_header is a dependent slice, at least one element of the slice_segment_header may be absent; in other words, the values of those elements are not defined in the slice_segment_header. For an element whose value is not defined in the dependent slice, the value of the corresponding element of the associated independent slice may be used. In other words, the value of a specific element that is absent from the slice_segment_header of a dependent slice may be equal to the value of that element in the slice_segment_header of the independent slice corresponding to the dependent slice. For example, a dependent slice may inherit the values of the elements of its corresponding independent slice and may redefine the values of at least some of those elements.
-slice_segment_address: the "slice_segment_address" may be start address information indicating a start address of a slice indicated by the slice_segment_header. The unit of the start address information may be CTB.
Methods for partitioning a picture into one or more slices may include methods 1) through 3) below.
Method 1): the first method may be a method for partitioning a picture by a maximum size of a bitstream that one slice can include.
Method 2): the second method may be a method for partitioning a picture by the maximum number of CTUs that one slice can include.
Method 3): the third method may be a method for partitioning a picture by the maximum number of parallel blocks that one stripe can include.
When the encoding apparatus intends to perform parallel encoding on a stripe basis, a second method and a third method among the three methods may be generally used.
In the case of the first method, the size of the bitstream is known after encoding has been completed, and thus it may be difficult to define slices to be processed in parallel before encoding starts. Accordingly, the picture partition method capable of slice-based parallel encoding may be a second method using the maximum number of units of CTUs and a third method using the maximum number of units of parallel blocks.
When the second method and the third method are used, the partition size of the picture may be predefined before the picture is encoded in parallel. Further, from the defined size, a slice_segment_address may be calculated. When the encoding apparatus uses slices as units of parallel encoding, there is generally a tendency that the slice_segment_address is not changed for each picture but is repeated at a fixed period and/or according to a specific rule.
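For the second method (a maximum CTU count per slice), the start addresses can be computed ahead of encoding as described. A minimal sketch with an illustrative function name:

```python
def slice_start_addresses(pic_size_in_ctus, max_ctus_per_slice):
    """Starting CTU address of each slice when a picture is partitioned by a
    maximum number of CTUs per slice (method 2 above)."""
    return list(range(0, pic_size_in_ctus, max_ctus_per_slice))

# e.g. a 100-CTU picture with at most 30 CTUs per slice -> 4 slices
addresses = slice_start_addresses(100, 30)
```

The first entry is always address 0, which is the slice for which first_slice_segment_in_pic_flag would be "1"; because the addresses depend only on the picture size and the CTU cap, they repeat unchanged across pictures of the same size, which is exactly the regularity the proposed signaling exploits.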
Thus, in an embodiment, a method for signaling picture partition information through parameters commonly applied to a plurality of pictures, instead of signaling picture partition information for each slice, may be used.
Fig. 13 is a configuration diagram of an encoding apparatus for performing video encoding according to an embodiment.
The encoding apparatus 1300 may include a control unit 1310, an encoding unit 1320, and a communication unit 1330.
The control unit 1310 may perform control for encoding video.
The encoding unit 1320 may perform encoding on the video.
The encoding unit 1320 may include the inter prediction unit 110, the intra prediction unit 120, the switcher 115, the subtractor 125, the transform unit 130, the quantization unit 140, the entropy encoding unit 150, the inverse quantization unit 160, the inverse transform unit 170, the adder 175, the filtering unit 180, and the reference picture buffer 190, which have been described above with reference to fig. 1.
The communication unit 1330 may transmit the data of the encoded video to another device.
The detailed functions and operations of the control unit 1310, the encoding unit 1320, and the communication unit 1330 will be described in more detail below.
Fig. 14 is a flowchart of an encoding method for performing video encoding according to an embodiment.
In step 1410, the control unit 1310 may generate picture partition information regarding a plurality of pictures in the video. The picture partition information may indicate a picture partition method for each of a plurality of pictures in the video.
For example, the picture partition information may indicate which method is to be used to partition each picture of the plurality of pictures. The picture partition information may be applied to a plurality of pictures. Further, when a plurality of pictures are partitioned based on picture partition information, methods for partitioning the plurality of pictures may be different from each other. The partitioning method may indicate the number of portions resulting from the partitioning operation, the shape of the portions, the size of the portions, the width of the portions, the height of the portions, and/or the length of the portions.
For example, the picture partition information may indicate at least two different methods for partitioning a picture. At least two different methods for partitioning a picture may be specified by picture partition information. Further, the picture partition information may indicate which of at least two different methods is to be used for partitioning each of the plurality of pictures.
For example, the plurality of pictures may be pictures in a single GOP or pictures constituting a single GOP.
In step 1420, the control unit 1310 may partition each of the plurality of pictures using one of at least two different methods. At least two different methods correspond to the picture partition information. In other words, the picture partition information may specify at least two different methods for partitioning a plurality of pictures.
Here, "different methods" may mean that the number, shape, or size of the parts resulting from the partitioning operation are different from each other. Here, the portions may be parallel blocks or stripes.
For example, the control unit 1310 may determine which one of at least two different methods is to be used for partitioning each of a plurality of pictures based on the picture partition information. The control unit 1310 may generate a portion of a picture by partitioning the picture.
In step 1430, the encoding unit 1320 may perform encoding on a plurality of pictures that are partitioned based on the picture partition information. The encoding unit 1320 may perform encoding on each picture partitioned using one of at least two different methods.
Portions of each picture may be encoded separately. The encoding unit 1320 may perform encoding on a plurality of portions generated from partitioning a picture in parallel.
At step 1440, the encoding unit 1320 may generate data including both picture partition information and a plurality of encoded pictures. The data may be a bit stream.
In step 1450, the communication unit 1330 may transmit the generated data to a decoding apparatus.
The picture partition information and the portions of each picture will be described in more detail with reference to other embodiments. The details of the picture partition information and the portions of each picture described in the other embodiments can also be applied to the present embodiment. A repetitive description thereof will be omitted.
Fig. 15 is a configuration diagram of a decoding apparatus for performing video decoding according to an embodiment.
The decoding apparatus 1500 may include a control unit 1510, a decoding unit 1520, and a communication unit 1530.
The control unit 1510 may perform control for video decoding. For example, the control unit 1510 may acquire picture partition information from data or a bitstream. Alternatively, the control unit 1510 may decode picture partition information in the data or bitstream. Further, the control unit 1510 may control the decoding unit 1520 to decode the video based on the picture partition information.
The decoding unit 1520 may perform decoding on the video.
The decoding unit 1520 may include the entropy decoding unit 210, the inverse quantization unit 220, the inverse transformation unit 230, the intra prediction unit 240, the inter prediction unit 250, the adder 255, the filtering unit 260, and the reference picture buffer 270, which have been described above with reference to fig. 2.
The communication unit 1530 may receive data of the encoded video from another device.
The detailed functions and operations of the control unit 1510, the decoding unit 1520, and the communication unit 1530 will be described in more detail below.
Fig. 16 is a flowchart of a decoding method for performing video decoding according to an embodiment.
In step 1610, the communication unit 1530 may receive data of the encoded video from the encoding apparatus 1300. The data may be a bit stream.
In step 1620, the control unit 1510 may acquire the picture partition information from the data. The control unit 1510 may decode the picture partition information in the data, and may acquire the picture partition information via decoding.
The picture partition information may indicate a picture partition method for each of a plurality of pictures in the video.
For example, the picture partition information may indicate which method is to be used to partition each picture of the plurality of pictures. Further, when a plurality of pictures are partitioned based on picture partition information, methods for partitioning the plurality of pictures may be different from each other.
The partitioning method may indicate the number of portions resulting from the partitioning operation, the shape of the portions, the size of the portions, the width of the portions, the height of the portions, and/or the length of the portions.
For example, the picture partition information may indicate at least two different methods for partitioning a picture. At least two different methods for partitioning a picture may be specified by picture partition information. Further, the picture partition information may indicate which of at least two different methods is to be used to partition each of the plurality of pictures based on the characteristics or properties of the pictures.
For example, the attribute of a picture may be GOP level, time identifier, or time level of a picture.
For example, the plurality of pictures may be pictures in a single GOP or pictures constituting a single GOP.
In step 1630, the control unit 1510 may partition each of the plurality of pictures using one of at least two different methods based on the picture partition information. The control unit 1510 may determine which of at least two different methods is to be used for partitioning each of a plurality of pictures based on the picture partition information. The control unit 1510 may generate a portion of each picture by partitioning the picture.
The portion resulting from the partitioning operation may be a parallel block (tile) or a slice.
For example, the control unit 1510 may partition a first picture among a plurality of pictures based on picture partition information. The control unit 1510 may partition the first picture according to a first picture partition method indicated by the picture partition information. The control unit 1510 may partition a second picture of the plurality of pictures based on additional picture partition information derived from the picture partition information. The first picture and the second picture may be different pictures. For example, the GOP level of the first picture and the GOP level of the second picture may be different from each other. For example, at least some of the one or more elements in the picture partition information may be used to derive additional picture partition information from the picture partition information.
Alternatively, the control unit 1510 may partition the second picture according to a second picture partition method derived from the picture partition information. At least some of the one or more elements in the picture partition information may indicate a first picture partition method. At least other ones of the one or more elements of picture partition information may be used to derive a second picture partition method from the picture partition information or the first picture partition method.
The picture partition information may define a periodically changing picture partition method. The control unit 1510 may partition a plurality of pictures using a periodically changing picture partition method defined by picture partition information. In other words, the specific picture division method can be repeatedly applied to a series of pictures. When the specific picture division method is applied to a specific number of pictures, the specific picture division method may be repeatedly applied to a subsequent specific number of pictures.
The picture division information may define a picture division method changed according to a rule. The control unit 1510 may partition a plurality of pictures using a picture partition method that is changed according to a rule and defined by picture partition information. That is, the picture division method specified according to the rule can be applied to a series of pictures.
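A periodically changing or rule-based picture division method can be sketched as follows. This is a minimal illustration only; the method list, the period, and the indexing by picture order are assumptions for the example, not syntax defined by this embodiment.

```python
# Hypothetical sketch: a fixed cycle of partition methods is applied
# repeatedly to a series of pictures, as described above.
def select_partition_method(picture_index, methods):
    """Return the partition method for a picture, cycling through the
    methods defined by the picture partition information."""
    return methods[picture_index % len(methods)]

# Example: two alternating methods (tile grid sizes are illustrative).
methods = [{"tile_cols": 4, "tile_rows": 2},
           {"tile_cols": 2, "tile_rows": 1}]
```

With this rule, pictures 0, 2, 4, … would be partitioned with the first method and pictures 1, 3, 5, … with the second, so the same method repeats after every `len(methods)` pictures.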
In step 1640, the decoding unit 1520 may perform decoding on the plurality of pictures partitioned based on the picture partition information. The decoding unit 1520 may perform decoding on each picture partitioned using one of at least two different methods.
Portions of each picture may be decoded separately. The decoding unit 1520 may perform decoding on a plurality of portions generated from the partition operation of each picture in parallel.
At step 1650, the decoding unit 1520 may generate video including a plurality of decoded pictures.
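The overall flow of steps 1610 to 1650 can be sketched as follows. All function and field names here are hypothetical placeholders; the actual partitioning and decoding operations are those described in the embodiments.

```python
# Hypothetical sketch of the decoding flow of Fig. 16: obtain picture
# partition information, partition each picture by the method indicated
# for it (here keyed by GOP level), decode the parts, and assemble the
# decoded video. All names and data layouts are illustrative.
def decode_video(encoded_pictures, picture_partition_info):
    decoded = []
    for pic in encoded_pictures:
        # Step 1630: choose one of the (at least two) methods based on a
        # property of the picture, e.g. its GOP level.
        method = picture_partition_info[pic["gop_level"]]
        parts = partition(pic, method)                    # split into parallel blocks
        decoded.append([decode_part(p) for p in parts])   # step 1640 (may run in parallel)
    return decoded                                        # step 1650: decoded video

def partition(pic, method):
    # Placeholder: split the picture's data into method["num_tiles"] parts.
    n = method["num_tiles"]
    data = pic["data"]
    return [data[i::n] for i in range(n)]

def decode_part(part):
    return part  # placeholder for entropy decoding, prediction, etc.
```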
As described above, the picture partition information may be defined by the PPS or by at least some elements in the PPS.
In an embodiment, the PPS may include picture partition information. That is, the PPS may include elements related to picture partition information and elements not related to picture partition information. The picture partition information may correspond to at least some elements in the PPS.
Alternatively, in an embodiment, the picture partition information may include PPS. That is, the picture partition information may be defined by PPS and other information.
In an embodiment, the picture partition information for multiple pictures may be defined by a single PPS instead of several PPS. In other words, picture partition information defined by a single PPS may be used to partition a plurality of pictures in at least two different shapes.
In an embodiment, the picture partition information for a single picture may also be used to partition other pictures that are partitioned using a picture partition method different from the picture partition method of the picture. The picture partition information may include information required to derive other picture partition methods in addition to information required to partition a picture in the PPS.
In this case, it can be understood that one piece of picture partition information indicates a multiple picture partition method applied to multiple pictures. For example, at least some elements in the picture partition information may define a first picture partition method. The first picture partition method may be applied to a first picture of the plurality of pictures. At least other elements in the picture partition information may be used to derive a second picture partition method from the first picture partition method. The derived second picture partition method may be applied to a second picture of the plurality of pictures. The picture division information may contain information for defining a picture division method to be applied and a picture to which the picture division method is to be applied. That is, the picture division information may contain information for specifying a picture division method corresponding to each of the plurality of pictures.
Alternatively, in an embodiment, a single PPS may include multiple pieces of picture partition information. The pieces of picture partition information may be used to partition a plurality of pictures. In other words, according to an embodiment, PPS for a single picture may include not only picture partition information for partitioning a corresponding picture but also picture partition information for partitioning other pictures.
In this case, it can be understood that the pieces of picture partition information respectively indicate a plurality of different picture partition methods, and can be transferred from the encoding apparatus to the decoding apparatus through a single PPS. For example, at least some elements in the PPS may define picture partition information. The defined picture partition information may be applied to a first picture of the plurality of pictures. At least other elements in the PPS may be used to derive other picture partition information from the defined picture partition information. The derived picture partition information may be applied to a second picture of the plurality of pictures. The PPS may include information for defining picture partition information to be applied and pictures to which the picture partition information is to be applied. In other words, the PPS may include information for specifying picture partition information corresponding to each of the plurality of pictures.
Picture partition information for partitioning a picture into parallel blocks
As described above, the portion of the picture resulting from the partitioning operation may be parallel blocks. A picture may be partitioned into multiple parallel blocks.
PPS may define parameters that are applied to a particular picture. At least some of these parameters may be picture partition information and may be used to determine a picture partition method.
In an embodiment, picture partition information included in a single PPS may be applied to a plurality of pictures. Here, the plurality of pictures may be partitioned using one of at least two different methods. That is, in order to define at least two different picture partition methods, a single PPS may be used instead of several PPS.
Even if two pictures are partitioned using different picture partition methods, PPS is not signaled for each picture, and the changed picture partition method can be derived from a single PPS or single picture partition information. For example, the PPS may include picture partition information to be applied to a single picture, and picture partition information to be applied to other pictures may be derived from the PPS. Alternatively, for example, the PPS may include picture partition information to be applied to a single picture, and a picture partition method to be applied to a plurality of pictures may be defined based on the picture partition information.
For example, PPS may define the number of pictures to be processed in parallel for each GOP level. Once the number of pictures to be processed in parallel for each GOP level is defined, a picture partition method for pictures at a specific GOP level can be determined. Alternatively, once the number of pictures to be processed in parallel for each GOP level is defined, the number of parallel blocks into which a picture at a particular GOP level is to be partitioned may be determined.
For example, the PPS may define the number of pictures to be processed in parallel for each temporal identifier. Once the number of pictures to be processed in parallel for each time identifier is defined, picture partition information for a picture having a specific time identifier can be determined. Alternatively, once the number of pictures to be processed in parallel for each time identifier is defined, the number of parallel blocks into which a picture having a specific time identifier is to be partitioned may be determined.
The decoding device may extract the GOP size via the configuration of the reference pictures and may derive the GOP level from the GOP size. Alternatively, the decoding device may derive the GOP level from the temporal level. The GOP level and temporal level may be used to partition each picture, which will be described later.
Embodiments for partitioning pictures into parallel blocks according to GOP level
Table 3 below shows an example of a structure of pic_parameter_set_rbsp indicating PPS for signaling picture partition information. The picture partition information may be pic_parameter_set_rbsp or may include pic_parameter_set_rbsp. The picture may be partitioned into multiple parallel blocks by pic_parameter_set_rbsp.
TABLE 3
(The syntax table of pic_parameter_set_rbsp appears as an image in the original publication; the elements it defines are described below.)
pic_parameter_set_rbsp may include the following elements.
- parallel_frame_by_gop_level_enable_flag: The "parallel_frame_by_gop_level_enable_flag" may be a GOP level parallel processing flag indicating whether a picture referencing the PPS is encoded or decoded in parallel with other pictures at the same GOP level.
For example, a parallel_frame_by_gop_level_enable_flag value of "0" may indicate that a picture referencing the PPS is not encoded or decoded in parallel with other pictures at the same GOP level. A parallel_frame_by_gop_level_enable_flag value of "1" may indicate that a picture referencing the PPS is encoded or decoded in parallel with other pictures at the same GOP level.
When a picture is processed in parallel with other pictures, the necessity of partitioning that single picture into a plurality of parts and processing the parts in parallel may be considered to be reduced. Thus, it can be considered that there is a correlation between parallel processing of a plurality of pictures and parallel processing of a plurality of portions of a single picture.
The picture partition information may include information about the number of pictures to be processed in parallel at GOP level n (i.e., parallel processing picture number information). The parallel processing picture number information at a specific GOP level n may correspond to the number of pictures at GOP level n to which parallel processing is applicable. Here, n may be an integer of 2 or more. The parallel processing picture number information may include the following elements: num_frame_in_parallel_gop_level3_minus1 and num_frame_in_parallel_gop_level2_minus1.
- num_frame_in_parallel_gop_level3_minus1: "num_frame_in_parallel_gop_level3_minus1" may be parallel processing picture number information at GOP level 3. The parallel processing picture number information at GOP level 3 may correspond to the number of pictures that can be encoded or decoded in parallel at GOP level 3.
For example, the value of "num_frame_in_parallel_gop_level3_minus1 + 1" may represent the number of pictures at GOP level 3 that can be encoded or decoded in parallel.
- num_frame_in_parallel_gop_level2_minus1: "num_frame_in_parallel_gop_level2_minus1" may be parallel processing picture number information at GOP level 2. The parallel processing picture number information at GOP level 2 may correspond to the number of pictures that can be encoded or decoded in parallel at GOP level 2.
For example, the value of "num_frame_in_parallel_gop_level2_minus1 + 1" may represent the number of pictures at GOP level 2 that can be encoded or decoded in parallel.
By signaling using the picture partition information of pic_parameter_set_rbsp described above, a plurality of encoded pictures can be decoded using the following procedure.
For example, assuming that the value of "parallel_frame_by_gop_level_enable_flag" in the PPS of the current picture is "1" and the GOP level of the current picture is 2, num_tile_columns_minus1 and num_tile_rows_minus1 to be applied to the current picture may be redefined by the following equations 2 and 3:
[ equation 2]
new_num_tile_columns=(num_tile_columns_minus1+1)/(num_frame_in_parallel_gop_level2_minus1+1)
[ equation 3]
new_num_tile_rows=(num_tile_rows_minus1+1)/(num_frame_in_parallel_gop_level2_minus1+1)
Here, "new_num_tile_columns" may represent the number of parallel blocks arranged in the lateral direction of the partitioned picture (i.e., the number of columns of parallel blocks). "new_num_tile_rows" may represent the number of parallel blocks arranged in the longitudinal direction of the partitioned picture (i.e., the number of rows of parallel blocks). The current picture may be partitioned into new_num_tile_columns × new_num_tile_rows parallel blocks.
For example, assuming that the value of "parallel_frame_by_gop_level_enable_flag" in the PPS of the current picture is "1" and the GOP level of the current picture is 3, num_tile_columns_minus1 and/or num_tile_rows_minus1 to be applied to the current picture may be redefined by the following equations 4 and 5:
[ equation 4]
new_num_tile_columns=(num_tile_columns_minus1+1)/(num_frame_in_parallel_gop_level3_minus1+1)
[ equation 5]
new_num_tile_rows=(num_tile_rows_minus1+1)/(num_frame_in_parallel_gop_level3_minus1+1)
The redefinition above may be applied to new_num_tile_columns or new_num_tile_rows or to both new_num_tile_columns and new_num_tile_rows.
According to the above equations 2 to 5, the larger the value of num_frame_in_parallel_gop_level2_minus1, etc., the smaller the value of new_num_tile_columns. That is, when the value of num_frame_in_parallel_gop_level2_minus1 or num_frame_in_parallel_gop_level3_minus1 becomes large, the number of parallel blocks generated from the partitioning operation can be reduced. Thus, num_frame_in_parallel_gop_level2_minus1 and num_frame_in_parallel_gop_level3_minus1 may be reduction instruction information for reducing the number of parallel blocks generated from partitioning a picture. When the number of pictures that are encoded or decoded in parallel at the same GOP level becomes large, each picture can be partitioned into a smaller number of parallel blocks.
The picture partition information may include reduction instruction information for reducing the number of parallel blocks generated from partitioning each picture. Further, the reduction instruction information may indicate a degree to which the number of parallel blocks generated from partitioning a picture is reduced according to encoding or decoding of parallel processing.
The picture partition information may include GOP level n reduction instruction information for reducing the number of parallel blocks generated from partitioning a picture at GOP level n. Here, n may be an integer of 2 or more. For example, num_frame_in_parallel_gop_level2_minus1 may be the GOP level 2 reduction instruction information. Further, num_frame_in_parallel_gop_level3_minus1 may be the GOP level 3 reduction instruction information.
For example, when the value of "parallel_frame_by_gop_level_enable_flag" in the PPS of the current picture is "0", the current picture may be partitioned into S parallel blocks using the values of num_tile_columns_minus1 and/or num_tile_rows_minus1 in the PPS of the current picture.
For example, S may be calculated using equation 6 below:
[ equation 6]
S=(num_tile_columns_minus1+1)×(num_tile_rows_minus1+1)
As described above with reference to equations 2 to 6, the picture partition information may contain GOP level n reduction instruction information for pictures at GOP level n. When the number of columns of parallel blocks generated from partitioning a picture at GOP level 0 or 1 is w and the number of columns of parallel blocks generated from partitioning a picture at GOP level n is w/m, the GOP level n reduction instruction information may correspond to m. Alternatively, when the number of rows of parallel blocks generated from partitioning a picture at GOP level 0 or 1 is w and the number of rows of parallel blocks generated from partitioning a picture at GOP level n is w/m, the GOP level n reduction instruction information may correspond to m.
As described above with reference to equations 2 to 6, the picture partition shape applied to partition a picture may be determined based on the GOP level of the picture. Further, as described above with reference to fig. 10, the GOP level of a picture may be determined based on the Picture Order Count (POC) of the picture.
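The derivation in equations 2 to 6 can be sketched as follows. The function name and the use of integer division are assumptions for the example; the element names mirror the syntax elements described above.

```python
# Sketch of the tile-grid derivation of equations 2 to 6. When the GOP
# level parallel processing flag is 0, the base grid from the PPS is used
# (equation 6 gives S = cols * rows tiles). When it is 1, the column and
# row counts are divided by the number of pictures processed in parallel
# at the picture's GOP level (equations 2-5). Integer division is assumed.
def derive_tile_grid(parallel_frame_by_gop_level_enable_flag,
                     gop_level,
                     num_tile_columns_minus1,
                     num_tile_rows_minus1,
                     num_frame_in_parallel_gop_level2_minus1=0,
                     num_frame_in_parallel_gop_level3_minus1=0):
    cols = num_tile_columns_minus1 + 1
    rows = num_tile_rows_minus1 + 1
    if parallel_frame_by_gop_level_enable_flag == 0:
        return cols, rows                 # equation 6: S = cols * rows
    if gop_level == 2:                    # equations 2 and 3
        p = num_frame_in_parallel_gop_level2_minus1 + 1
    elif gop_level == 3:                  # equations 4 and 5
        p = num_frame_in_parallel_gop_level3_minus1 + 1
    else:
        p = 1                             # GOP levels 0 and 1 keep the base grid
    return cols // p, rows // p
```

For example, with a base 4×2 grid and two pictures processed in parallel at GOP level 2, a GOP level 2 picture would be partitioned into a 2×1 grid instead.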
The GOP level of a picture may be determined according to a value of a remainder when the POC value of the picture is divided by a predetermined value. For example, among the plurality of pictures in the GOP, a picture at GOP level 3 may be a picture whose remainder is 1 when the POC value of the picture is divided by 2. For example, among the plurality of pictures in the GOP, a picture at GOP level 2 may be a picture whose remainder is 2 when the POC value of the picture is divided by 4.
For example, as described above, the same picture partition method can be applied to pictures at the same GOP level among the plurality of pictures in a GOP. The picture division information may indicate that the same picture division method is to be applied to those pictures, among the plurality of pictures, for which the remainder obtained when the POC value is divided by a first predetermined value equals a second predetermined value.
The picture partition information may indicate a picture partition method for a picture at a GOP level of a specific value. Further, the picture partition information may define a picture partition method for one or more pictures corresponding to one of two or more GOP levels.
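The remainder rules above can be sketched as a small mapping from POC to GOP level. The rules for levels 3 and 2 follow the examples in the text; the level 1 rule and the GOP size of 8 are illustrative extensions of the same pattern, not values given by this embodiment.

```python
# Sketch of deriving a picture's GOP level from its POC via remainder
# rules. Levels 3 and 2 follow the examples above; the level 1 rule and
# the implied GOP size of 8 are assumptions for illustration.
def gop_level_from_poc(poc):
    if poc % 2 == 1:
        return 3   # remainder 1 when divided by 2 -> GOP level 3
    if poc % 4 == 2:
        return 2   # remainder 2 when divided by 4 -> GOP level 2
    if poc % 8 == 4:
        return 1   # illustrative extension of the same pattern
    return 0
```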
Embodiments for partitioning a picture into parallel blocks according to temporal level or the like
Table 4 below shows an example of the structure of pic_parameter_set_rbsp, which indicates PPS for signaling picture partition information. The picture partition information may be pic_parameter_set_rbsp, or may include pic_parameter_set_rbsp. Each picture can be partitioned into multiple parallel blocks by pic_parameter_set_rbsp.
TABLE 4
(The syntax table of pic_parameter_set_rbsp appears as an image in the original publication; the elements it defines are described below.)
"pic_parameter_set_rbsp" may include the following elements.
- drive_num_tile_enable_flag: The "drive_num_tile_enable_flag" may be a uniform partition indication flag indicating whether each picture referencing the PPS is partitioned using one of at least two different methods. Alternatively, "drive_num_tile_enable_flag" may indicate whether the numbers of parallel blocks generated when the pictures referencing the PPS are partitioned into parallel blocks are identical to each other.
For example, a drive_num_tile_enable_flag value of "0" may indicate that the multiple pictures referencing the PPS are partitioned using a single method. Alternatively, a drive_num_tile_enable_flag value of "0" may indicate that, when multiple pictures referencing the PPS are partitioned, they are always partitioned into the same number of parallel blocks.
A drive_num_tile_enable_flag value of "1" may indicate that multiple partition shapes are defined by a single PPS. Alternatively, a drive_num_tile_enable_flag value of "1" may indicate that each picture referencing the PPS is partitioned using one of at least two different methods. Alternatively, a drive_num_tile_enable_flag value of "1" may indicate that the numbers of parallel blocks generated when the pictures referencing the PPS are partitioned are not all the same.
It can be considered that when temporal scalability is applied to video or pictures, the necessity of partitioning a single picture into multiple parts and processing the parts in parallel is associated with a temporal identifier. It is considered that there is a correlation between processing of a picture for providing temporal scalability and partitioning of one picture into a plurality of parts.
The picture partition information may include information about the number of parallel blocks (i.e., parallel block number information) for the temporal identifier n. The parallel block number information for a specific time identifier n may indicate the number of parallel blocks into which a picture at a time level n is partitioned. Here, n may be an integer of 1 or more.
The parallel block number information may include the following elements: num_tile_level1_minus1 and num_tile_level2_minus1. Further, the parallel block number information may include num_tile_levelN_minus1 for one or more values of N.
When drive_num_tile_enable_flag is "1", the picture partition information or the PPS may optionally contain at least one of num_tile_level1_minus1, num_tile_level2_minus1, and num_tile_levelN_minus1.
- num_tile_level1_minus1: "num_tile_level1_minus1" may be level 1 parallel block number information for a picture at level 1. The level may be a temporal level.
The level1 parallel block number information may correspond to the number of parallel blocks generated from partitioning a picture at level 1. The level1 parallel block information may be inversely proportional to the number of parallel blocks generated from partitioning a picture at level 1.
For example, a picture at level 1 may be partitioned into m/(num_tile_level1_minus1 + 1) parallel blocks. The value of m may be (num_tile_columns_minus1 + 1) × (num_tile_rows_minus1 + 1). Therefore, the larger the value of the level 1 parallel block number information, the smaller the number of parallel blocks generated from partitioning the picture at level 1.
- num_tile_level2_minus1: "num_tile_level2_minus1" may be level 2 parallel block number information for a picture at level 2. The level may be a temporal level.
The level2 parallel block number information may correspond to the number of parallel blocks generated from partitioning the picture at level 2. The level2 parallel block information may be inversely proportional to the number of parallel blocks generated from partitioning a picture at level 2.
For example, a picture at level 2 may be partitioned into m/(num_tile_level2_minus1 + 1) parallel blocks. The value of m may be (num_tile_columns_minus1 + 1) × (num_tile_rows_minus1 + 1). Therefore, the larger the value of the level 2 parallel block number information, the smaller the number of parallel blocks generated from partitioning the picture at level 2.
- num_tile_levelN_minus1: "num_tile_levelN_minus1" may be level N parallel block number information for a picture at level N. The level may be a temporal level.
The level N parallel block number information may correspond to the number of parallel blocks generated from partitioning a picture at the level N. The level N parallel block number information may be inversely proportional to the number of parallel blocks generated from partitioning a picture at the level N.
For example, a picture at level N may be partitioned into m/(num_tile_levelN_minus1 + 1) parallel blocks. The value of m may be (num_tile_columns_minus1 + 1) × (num_tile_rows_minus1 + 1). Therefore, the larger the value of the level N parallel block number information, the smaller the number of parallel blocks generated from partitioning the picture at level N.
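The per-level tile count above can be computed directly from the syntax elements. The function name and the use of integer division are assumptions for this sketch.

```python
# Sketch of the tile count for a picture at temporal level N:
# m / (num_tile_levelN_minus1 + 1), where
# m = (num_tile_columns_minus1 + 1) * (num_tile_rows_minus1 + 1).
def num_tiles_at_level(num_tile_columns_minus1, num_tile_rows_minus1,
                       num_tile_levelN_minus1):
    m = (num_tile_columns_minus1 + 1) * (num_tile_rows_minus1 + 1)
    return m // (num_tile_levelN_minus1 + 1)  # integer division assumed
```

For example, with a base 4×2 grid (m = 8) and num_tile_levelN_minus1 = 1, a picture at level N would be partitioned into 4 parallel blocks.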
"num_tile_levelN_minus1" may be reduction instruction information for reducing the number of parallel blocks generated from partitioning a picture.
The picture partition information may include level N reduction instruction information for reducing the number of parallel blocks generated from partitioning a picture at level N. Here, N may be an integer of 2 or more. For example, num_tile_level2_minus1 may be the level 2 reduction instruction information. Further, num_tile_level3_minus1 may be the level 3 reduction instruction information.
By signaling using picture partition information of pic_parameter_set_rbsp as described above, a plurality of coded pictures can be decoded using the following procedure.
As described above, the number of parallel blocks generated from partitioning each picture may be changed according to the level of the picture. The encoding apparatus and the decoding apparatus may partition each picture using the same method.
For example, when the value of drive_num_tile_enable_flag in the PPS of the current picture is "0", the current picture may be partitioned into (num_tile_columns_minus1 + 1) × (num_tile_rows_minus1 + 1) parallel blocks. Hereinafter, the partitioning performed when the value of drive_num_tile_enable_flag is "0" is referred to as "basic partitioning".
For example, when the value of drive_num_tile_enable_flag in the PPS is "1" and the value of num_tile_levelN_minus1 + 1 is P, a picture at level N may be partitioned into (num_tile_columns_minus1 + 1) × (num_tile_rows_minus1 + 1)/P parallel blocks. That is, the number of parallel blocks generated from partitioning a picture at level N may be 1/P times the number of parallel blocks generated from performing basic partitioning. Here, the picture at level N may be partitioned using one of the following methods 1) to 5).
Here, P may be a GOP level of a picture.
The number of horizontal parallel blocks at the N level (the number of N-level horizontal parallel blocks) may represent the number of parallel blocks arranged in the lateral direction of the picture at the level N (i.e., the number of columns of parallel blocks).
The number of vertical parallel blocks at the N level (the number of N-level vertical parallel blocks) may represent the number of parallel blocks arranged in the longitudinal direction of the picture at the level N (i.e., the number of rows of parallel blocks).
The basic number of horizontal parallel blocks may be (num_tile_columns_minus1+1).
The basic number of vertical parallel blocks may be (num_tile_rows_minus1+1).
The picture horizontal length may represent a horizontal length of a picture.
The vertical length of a picture may represent the vertical length of the picture.
Method 1)
The reduction indication information may be used to adjust the number of horizontally parallel blocks resulting from partitioning a picture.
The number of N-level horizontal parallel blocks may be 1/P times the basic number of horizontal parallel blocks, and the number of N-level vertical parallel blocks may be the same as the basic number of vertical parallel blocks.
Method 2)
The reduction indication information may be used to adjust the number of vertical parallel blocks generated from partitioning a picture.
The number of N-level vertical parallel blocks may be 1/P times the basic number of vertical parallel blocks, and the number of N-level horizontal parallel blocks may be the same as the basic number of horizontal parallel blocks.
Method 3)
The reduction indication information may be used to adjust the number of horizontal parallel blocks when the horizontal length of the picture is greater than the vertical length of the picture, and to adjust the number of vertical parallel blocks when the vertical length of the picture is greater than the horizontal length of the picture.
Based on a comparison between the picture horizontal length and the picture vertical length, it may be determined which of the number of N-level horizontal parallel blocks and the number of N-level vertical parallel blocks the factor of 1/P is to be applied to.
For example, when the picture horizontal length is greater than the picture vertical length, the number of N-level horizontal parallel blocks may be 1/P times the basic number of horizontal parallel blocks, and the number of N-level vertical parallel blocks may be the same as the basic number of vertical parallel blocks. When the picture vertical length is greater than the picture horizontal length, the number of N-level vertical parallel blocks may be 1/P times the basic number of vertical parallel blocks, and the number of N-level horizontal parallel blocks may be the same as the basic number of horizontal parallel blocks.
When the picture horizontal length is the same as the picture vertical length, the number of N-level horizontal parallel blocks may be 1/P times the basic number of horizontal parallel blocks, and the number of N-level vertical parallel blocks may be the same as the basic number of vertical parallel blocks. In contrast, when the picture horizontal length is the same as the picture vertical length, the number of N-level vertical parallel blocks may be 1/P times the basic number of vertical parallel blocks, and the number of N-level horizontal parallel blocks may be the same as the basic number of horizontal parallel blocks.
For example, when the picture horizontal length is greater than the picture vertical length, the number of N-level horizontal parallel blocks may be "(num_tile_columns_minus1+1)/P", and the number of N-level vertical parallel blocks may be "(num_tile_rows_minus1+1)". When the picture vertical length is greater than the picture horizontal length, the number of N-level horizontal parallel blocks may be "(num_tile_columns_minus1+1)", and the number of N-level vertical parallel blocks may be "(num_tile_rows_minus1+1)/P".
Method 4)
The reduction indication information may be used to adjust the number of horizontal parallel blocks when the basic number of horizontal parallel blocks is greater than the basic number of vertical parallel blocks, and to adjust the number of vertical parallel blocks when the basic number of vertical parallel blocks is greater than the basic number of horizontal parallel blocks.
Based on a comparison between the basic number of horizontal parallel blocks and the basic number of vertical parallel blocks, one of the number of N-level horizontal parallel blocks and the number of N-level vertical parallel blocks to which a reduction corresponding to 1/P times is to be applied may be determined.
For example, when the basic number of horizontal parallel blocks is greater than the basic number of vertical parallel blocks, the number of N-level horizontal parallel blocks may be 1/P times the basic number of horizontal parallel blocks, and the number of N-level vertical parallel blocks may be the same as the basic number of vertical parallel blocks. When the basic number of vertical parallel blocks is greater than the basic number of horizontal parallel blocks, the number of N-level vertical parallel blocks may be 1/P times the basic number of vertical parallel blocks, and the number of N-level horizontal parallel blocks may be the same as the basic number of horizontal parallel blocks.
When the basic number of horizontal parallel blocks is the same as the basic number of vertical parallel blocks, the number of N-level horizontal parallel blocks may be 1/P times the basic number of horizontal parallel blocks, and the number of N-level vertical parallel blocks may be the same as the basic number of vertical parallel blocks. In contrast, when the basic number of horizontal parallel blocks is the same as the basic number of vertical parallel blocks, the number of N-level vertical parallel blocks may be 1/P times the basic number of vertical parallel blocks, and the number of N-level horizontal parallel blocks may be the same as the basic number of horizontal parallel blocks.
For example, when the basic number of horizontal parallel blocks is greater than the basic number of vertical parallel blocks, the number of N-level horizontal parallel blocks may be "(num_tile_columns_minus1+1)/P", and the number of N-level vertical parallel blocks may be "(num_tile_rows_minus1+1)". When the basic number of vertical parallel blocks is greater than the basic number of horizontal parallel blocks, the number of N-level horizontal parallel blocks may be "(num_tile_columns_minus1+1)", and the number of N-level vertical parallel blocks may be "(num_tile_rows_minus1+1)/P".
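Method 4 can be sketched in the same hedged manner; here the comparison is between the two basic tile counts rather than the picture dimensions. Floor division and the tie-breaking choice are again illustrative assumptions.

```python
def n_level_tile_counts_method4(basic_cols, basic_rows, p):
    """Method 4 sketch: reduce whichever basic tile count is larger by 1/P.

    Floor division and the choice to reduce the horizontal count when the
    two counts are equal are assumptions; the embodiment permits either.
    """
    if basic_cols > basic_rows:
        return basic_cols // p, basic_rows
    if basic_rows > basic_cols:
        return basic_cols, basic_rows // p
    return basic_cols // p, basic_rows  # equal counts: either may be reduced
```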
Method 5)
When "P = Q × R", the number of N-level horizontal parallel blocks may be "the basic number of horizontal parallel blocks/Q", and the number of N-level vertical parallel blocks may be "the basic number of vertical parallel blocks/R".
For example, (P, Q, R) may be one of (P, P, 1), (P, 1, P), (T², T, T), (6, 3, 2), (6, 2, 3), (8, 4, 2), and (8, 2, 4), where P, Q, R, and T may each be an integer of 1 or more.
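Method 5 applies reductions along both dimensions at once. The following is a minimal sketch under the same assumptions (floor division, illustrative function name):

```python
def n_level_tile_counts_method5(basic_cols, basic_rows, q, r):
    """Method 5 sketch: with P = Q * R, the horizontal tile count is reduced
    by 1/Q and the vertical tile count by 1/R.
    Floor division is an assumption; the embodiment does not state rounding."""
    return basic_cols // q, basic_rows // r
```

For example, (P, Q, R) = (6, 3, 2) applied to a 6x4 basic grid would yield a 2x2 N-level grid under this sketch.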
Picture partition information for partitioning a picture into slices
As described above, a portion of a picture resulting from the partitioning operation may be a slice. A picture may be partitioned into a plurality of slices.
In the above-described embodiments, the picture partition information may be signaled by the slice_segment_header. The slice_segment_address of the slice_segment_header may be used to partition a picture.
In the following embodiments, the slice_segment_address may be included in the PPS instead of the slice_segment_header. That is, PPS including slice_segment_address may be used to partition a picture into a plurality of slices.
PPS may define parameters that are applied to a particular picture. Here, at least some of these parameters may be picture partition information and may be used to determine a picture partition method.
In an embodiment, picture partition information included in a single PPS may be applied to a plurality of pictures. Here, the plurality of pictures may be partitioned using one of at least two different methods. In other words, in order to define at least two different picture partition methods, a single PPS may be used instead of using several PPS. Even if two pictures are partitioned using different picture partition methods, PPS is not signaled for each picture and changed picture partition information can be derived based on picture partition information in a single PPS. For example, the PPS may include picture partition information to be applied to a single picture, and picture partition information to be applied to another picture may be derived based on the PPS. Alternatively, for example, the PPS may include picture partition information to be applied to a single picture, and a picture partition method to be applied to a plurality of pictures may be defined based on the picture partition information.
For example, the PPS may define the number of pictures to be processed in parallel for each GOP level. Once the number of pictures to be processed in parallel for each GOP level is defined, a picture partition method for pictures at a particular GOP level can be determined. Alternatively, once the number of pictures to be processed in parallel for each GOP level is defined, the number of slices into which a picture at a particular GOP level is to be partitioned may be determined.
Embodiments for partitioning pictures into slices according to GOP level
Table 5 below shows an example of a structure of pic_parameter_set_rbsp, which indicates the PPS for signaling picture partition information. The picture partition information may be pic_parameter_set_rbsp or may include pic_parameter_set_rbsp. A picture may be partitioned into a plurality of slices by pic_parameter_set_rbsp. The shape of the plurality of slices may be periodically changed.
TABLE 5
(The syntax table is reproduced as an image in the original publication.)
Table 6 below shows an example of the structure of the slice_segment_header when the PPS of Table 5 is used.
TABLE 6
(The syntax table is reproduced as an image in the original publication.)
Referring to table 5, pic_parameter_set_rbsp may include the following elements.
- parallel_slice_enabled_flag: The "parallel_slice_enabled_flag" may be a slice partition information flag. The slice partition information flag may indicate whether the PPS includes slice partition information to be applied to a picture referring to the PPS.
For example, a parallel_slice_enabled_flag value of "1" may indicate that the PPS includes slice partition information to be applied to a picture referring to the PPS. A parallel_slice_enabled_flag value of "0" may indicate that the PPS does not include slice partition information to be applied to a picture referring to the PPS.
For example, a parallel_slice_enabled_flag value of "0" may indicate that slice partition information of a picture referring to the PPS exists in the slice_segment_header. Here, the slice partition information may contain slice_segment_address.
- num_parallel_slice_minus1: "num_parallel_slice_minus1" may be slice number information corresponding to the number of slices in the partitioned picture.
For example, the value of "num_parallel_slice_minus1 + 1" may represent the number of slices in the partitioned picture.
-slice_unit_spacing_flag: the "slice_unit_spacing_flag" may be a uniform interval flag indicating whether the sizes of all the slices are identical to each other.
For example, when the value of slice_unit_spacing_flag is "0", the sizes of all slices are not assumed to be identical to each other, and additional information for determining the size of each slice may be required.
For example, when the value of the slice_unit_spacing_flag is "1", the sizes of all the slices may be identical to each other. In addition, when the value of slice_unit_spacing_flag is "1", the sizes of all slices are identical to each other, and thus slice partition information for a slice can be derived based on the total size of a picture and the number of slices.
- parallel_slice_segment_address_minus1[i]: "parallel_slice_segment_address_minus1" may represent the size of a slice generated from partitioning a picture. For example, "parallel_slice_segment_address_minus1[i] + 1" may indicate the size of the i-th slice. The size unit of the slice may be the CTB. Here, i may be an integer equal to or greater than 0 and less than n, and n may be the number of slices.
- parallel_frame_by_gop_level_enable_flag: The "parallel_frame_by_gop_level_enable_flag" may be a GOP-level parallel processing flag indicating whether a picture referring to the PPS is encoded or decoded in parallel with other pictures at the same GOP level.
For example, a parallel_frame_by_gop_level_enable_flag value of "0" may indicate that a picture referring to the PPS is not encoded or decoded in parallel with other pictures at the same GOP level. A parallel_frame_by_gop_level_enable_flag value of "1" may indicate that a picture referring to the PPS is encoded or decoded in parallel with other pictures at the same GOP level.
When the value of parallel_frame_by_gop_level_enable_flag is "1", the degree to which a picture is partitioned needs to be adjusted according to the picture-level parallelization.
The picture partition information may include information about the number of pictures to be processed in parallel at the GOP level n (i.e., parallel processing picture number information). The parallel processing picture number information at the specific GOP level n may correspond to the number of pictures at the GOP level n to which parallel processing is applicable. Here, n may be an integer of 2 or more.
The parallel processing picture number information may include the following elements: num_frame_in_parallel_gop_level3_minus1 and num_frame_in_parallel_gop_level2_minus1.
- num_frame_in_parallel_gop_level3_minus1: "num_frame_in_parallel_gop_level3_minus1" may be parallel processing picture number information at GOP level 3. The parallel processing picture number information at GOP level 3 may correspond to the number of pictures at GOP level 3 that can be encoded or decoded in parallel.
For example, the value of "num_frame_in_parallel_gop_level3_minus1 + 1" may represent the number of pictures at GOP level 3 that can be encoded or decoded in parallel.
- num_frame_in_parallel_gop_level2_minus1: "num_frame_in_parallel_gop_level2_minus1" may be parallel processing picture number information at GOP level 2. The parallel processing picture number information at GOP level 2 may correspond to the number of pictures at GOP level 2 that can be encoded or decoded in parallel.
For example, the value of "num_frame_in_parallel_gop_level2_minus1 + 1" may represent the number of pictures at GOP level 2 that can be encoded or decoded in parallel.
When picture partition information is signaled using the pic_parameter_set_rbsp described above, a plurality of encoded pictures may be decoded using the following procedure.
For example, when the value of "parallel_slice_enabled_flag" in the PPS of the current picture is "1", the picture may be partitioned into one or more slices. In order to partition a picture into slices, it must be possible to calculate slice_segment_address, which is slice partition information. After the PPS has been received, slice_segment_address may be calculated based on the elements in the PPS.
When the value of "slice_unit_spacing_flag" is "1", the sizes of all slices may be identical to each other. In other words, the size of a unit slice may be calculated from the size of the picture and the number of slices, and the sizes of all slices may be equal to the calculated unit slice size. Further, the slice_segment_address values of all slices may be calculated using the unit slice size. When the value of "slice_unit_spacing_flag" is "1", the code shown in Table 7 below may be used to calculate the unit slice size and the slice_segment_address values of the slices.
TABLE 7
(The code of Table 7 is reproduced as an image in the original publication.)
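Because Table 7 is available only as an image, the following is a hypothetical reconstruction of the computation it describes: a uniform slice size is derived from the picture size (in CTBs) and the slice count, and slice_segment_address values are placed at multiples of that size. The function name and the floor-division rounding are assumptions.

```python
def uniform_slice_segment_addresses(pic_size_in_ctbs, num_parallel_slice_minus1):
    """Derive a uniform unit slice size and the slice_segment_address of
    each slice, assuming all slices are the same size
    (slice_unit_spacing_flag == 1). Floor division is an assumption; a real
    implementation would also have to absorb any remainder into one slice."""
    num_slices = num_parallel_slice_minus1 + 1
    unit_slice_size = pic_size_in_ctbs // num_slices
    return [i * unit_slice_size for i in range(num_slices)]
```

For a 64-CTB picture with num_parallel_slice_minus1 = 3, this sketch yields addresses 0, 16, 32, and 48.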
When the value of "slice_unit_spacing_flag" is "0", slice_segment_address[i] may be parsed in the PPS. That is, when the value of "slice_unit_spacing_flag" is "0", the PPS may include slice_segment_address[i]. Here, i may be an integer equal to or greater than 0 and less than n, and n may be the number of slices.
For example, when the value of "parallel_frame_by_gop_level_enable_flag" in the PPS of the current picture is "1", num_parallel_slice_minus1 and slice_segment_address[i] may be redefined.
When the value of "parallel_frame_by_gop_level_enable_flag" in the PPS of the current picture is "1" and the GOP level of the current picture is 2, the num_parallel_slice_minus1 to be applied to the current picture may be redefined by the following equation 7:
[ equation 7]
new_num_parallel_slice_minus1=(num_parallel_slice_minus1)/(num_frame_in_parallel_gop_level2_minus1+1)
Here, new_num_parallel_slice_minus1 may correspond to the number of slices in the current picture at GOP level 2. For example, the value of "new_num_parallel_slice_minus1+1" may represent the number of slices in the current picture of the partition.
When the value of "parallel_frame_by_gop_level_enable_flag" in the PPS of the current picture is "1" and the GOP level of the current picture is 3, the num_parallel_slice_minus1 to be applied to the current picture may be redefined by the following equation 8:
[ equation 8]
new_num_parallel_slice_minus1=(num_parallel_slice_minus1)/(num_frame_in_parallel_gop_level3_minus1+1)
In this case, new_num_parallel_slice_minus1 may correspond to the number of slices in the current picture at GOP level 3. For example, the value of "new_num_parallel_slice_minus1+1" may represent the number of slices in the current picture of the partition.
According to equations 7 and 8 above, the larger the value of num_frame_in_parallel_gop_level2_minus1 or num_frame_in_parallel_gop_level3_minus1, the smaller the value of new_num_parallel_slice_minus1. In other words, the larger the value of num_frame_in_parallel_gop_level2_minus1 or num_frame_in_parallel_gop_level3_minus1, the smaller the number of slices resulting from the partitioning operation. Thus, num_frame_in_parallel_gop_level2_minus1 and num_frame_in_parallel_gop_level3_minus1 may be reduction indication information for reducing the number of slices to be generated from partitioning a picture. As the number of pictures at the same GOP level that are encoded or decoded in parallel becomes larger, each picture may be partitioned into a smaller number of slices.
The picture partition information may include reduction indication information for reducing the number of slices generated from partitioning each picture. Further, the reduction indication information may indicate the degree to which the number of slices generated from partitioning a picture is reduced according to parallel-processed encoding or decoding. The picture partition information may include GOP level n reduction indication information for reducing the number of slices generated from partitioning a picture at GOP level n. Here, n may be an integer of 2 or more. For example, num_frame_in_parallel_gop_level2_minus1 may be the GOP level 2 reduction indication information. Further, num_frame_in_parallel_gop_level3_minus1 may be the GOP level 3 reduction indication information.
As described above with reference to equations 7 and 8, the picture partition information may include GOP level n reduction indication information for pictures at GOP level n. When the number of slices generated from partitioning a picture at GOP level 0 or 1 is w and the number of slices generated from partitioning a picture at GOP level n is w/m, the GOP level n reduction indication information may correspond to m.
Through the redefinition of equations 7 and 8, the slice_segment_address values of the slices in the current picture may be calculated using the code shown in Table 8 below.
TABLE 8
(The code of Table 8 is reproduced as an image in the original publication.)
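Since Table 8 is available only as an image, the following hypothetical sketch combines equations 7 and 8 with the uniform address computation: the slice count is first redefined according to the current picture's GOP level, and slice_segment_address values are then derived at uniform intervals. Floor division for the equations' divisions is an assumption.

```python
def redefined_slice_addresses(pic_size_in_ctbs, num_parallel_slice_minus1,
                              gop_level,
                              num_frame_in_parallel_gop_level2_minus1,
                              num_frame_in_parallel_gop_level3_minus1):
    """Redefine the slice count for the current picture per its GOP level
    (equations 7 and 8), then derive uniform slice_segment_address values.
    Pictures at other GOP levels keep the signaled slice count."""
    if gop_level == 2:
        num_parallel_slice_minus1 //= num_frame_in_parallel_gop_level2_minus1 + 1
    elif gop_level == 3:
        num_parallel_slice_minus1 //= num_frame_in_parallel_gop_level3_minus1 + 1
    num_slices = num_parallel_slice_minus1 + 1
    unit_slice_size = pic_size_in_ctbs // num_slices
    return [i * unit_slice_size for i in range(num_slices)]
```

For a 64-CTB picture signaled with num_parallel_slice_minus1 = 7, two parallel pictures at GOP level 2 reduce the count so that the picture is partitioned into four slices rather than eight.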
Embodiments for partitioning pictures into slices according to GOP level or temporal level
Table 9 below shows an example of the structure of pic_parameter_set_rbsp, which indicates the PPS for signaling picture partition information. The picture partition information may be pic_parameter_set_rbsp or may include pic_parameter_set_rbsp. A picture may be partitioned into a plurality of slices based on pic_parameter_set_rbsp. The shape of the plurality of slices may be periodically changed.
TABLE 9
(The syntax table is reproduced as an image in the original publication.)
Table 10 below shows an example of the structure of the slice_segment_header when the PPS of Table 9 is used.
TABLE 10
(The syntax table is reproduced as an image in the original publication.)
Referring to table 9, pic_parameter_set_rbsp may include the following elements.
- unified_slice_segment_enabled_flag: The "unified_slice_segment_enabled_flag" may be a slice partition information flag. The slice partition information flag may indicate whether the PPS includes slice partition information to be applied to a picture referring to the PPS.
For example, a unified_slice_segment_enabled_flag value of "1" may indicate that the PPS includes slice partition information to be applied to a picture referring to the PPS. A unified_slice_segment_enabled_flag value of "0" may indicate that the PPS does not include slice partition information to be applied to a picture referring to the PPS.
For example, a unified_slice_segment_enabled_flag value of "0" may indicate that slice partition information of a picture referring to the PPS exists in the slice_segment_header. Here, the slice partition information may contain slice_segment_address.
- num_slice_minus1: "num_slice_minus1" may be slice number information corresponding to the number of slices in the partitioned picture. For example, the value of "num_slice_minus1 + 1" may represent the number of slices in the partitioned picture.
-slice_unit_spacing_flag: the "slice_unit_spacing_flag" may be a uniform interval flag indicating whether the sizes of all the slices are identical to each other.
For example, when the value of slice_unit_spacing_flag is "0", the sizes of all slices are not assumed to be identical to each other, and additional information for determining the size of each slice may be required. For example, when the value of slice_unit_spacing_flag is "1", the sizes of all slices may be identical to each other.
Further, when the value of slice_unit_spacing_flag is "1", the sizes of slices are identical to each other, and thus slice partition information for a slice can be derived based on the total size of a picture and the number of slices.
- unified_slice_segment_address_minus1[i]: "unified_slice_segment_address_minus1" may represent the size of a slice resulting from partitioning a picture.
For example, the value of "unified_slice_segment_address_minus1[i] + 1" may represent the size of the i-th slice. The size unit of the slice may be the CTB. Here, i may be an integer equal to or greater than 0 and less than n, and n may be the number of slices.
- unified_slice_segment_by_gop_level_enable_flag: The "unified_slice_segment_by_gop_level_enable_flag" may be a partition method indication flag indicating whether a picture referring to the PPS is partitioned using one of at least two different methods.
Alternatively, the unified_slice_segment_by_gop_level_enable_flag may indicate whether the number and shape of the slices generated from the partitioning operation are identical to each other when each picture referring to the PPS is partitioned into slices. The shape of a slice may include one or more of the start position of the slice, the length of the slice, and the end position of the slice.
For example, a unified_slice_segment_by_gop_level_enable_flag value of "0" may indicate that a single method is used to partition pictures referring to the PPS. Alternatively, a unified_slice_segment_by_gop_level_enable_flag value of "0" may indicate that the number of slices generated when each picture referring to the PPS is partitioned is always the same, and that the shapes of the slices are always uniform.
For example, a unified_slice_segment_by_gop_level_enable_flag value of "1" may indicate that multiple partition shapes are defined by a single PPS. Alternatively, a unified_slice_segment_by_gop_level_enable_flag value of "1" may indicate that a picture referring to the PPS is partitioned using one of at least two different methods. Partitioning pictures using different methods may mean that the numbers and/or shapes of the slices resulting from partitioning the pictures differ from each other.
For example, a unified_slice_segment_by_gop_level_enable_flag value of "1" may indicate that the number or shape of the slices resulting from partitioning pictures referring to the PPS is not uniform.
Alternatively, the unified_slice_segment_by_gop_level_enable_flag may be a GOP-level parallel processing flag indicating whether a picture referring to the PPS is encoded or decoded in parallel with other pictures at the same GOP level.
For example, a unified_slice_segment_by_gop_level_enable_flag value of "0" may indicate that a picture referring to the PPS is not encoded or decoded in parallel with other pictures at the same GOP level. A unified_slice_segment_by_gop_level_enable_flag value of "1" may indicate that a picture referring to the PPS is encoded or decoded in parallel with other pictures at the same GOP level. When the value of unified_slice_segment_by_gop_level_enable_flag is "1", the degree to which a picture is partitioned needs to be adjusted according to the picture-level parallelization.
The picture partition information may include frame number indication information at GOP level n. The frame number indication information at a specific GOP level n may correspond to the number of pictures at GOP level n that can be processed in parallel. Here, n may be an integer of 2 or more.
The frame number indication information may include the following elements: num_frame_by_gop_level2_minus1 and num_frame_by_gop_level3_minus1. Further, the frame number indication information may include num_frame_by_gop_leveln_minus1 for one or more values of n.
When the value of unified_slice_segment_by_gop_level_enable_flag is "1", the picture partition information or the PPS may selectively include at least one of num_frame_by_gop_level2_minus1, num_frame_by_gop_level3_minus1, and num_frame_by_gop_leveln_minus1.
- num_frame_by_gop_level3_minus1: "num_frame_by_gop_level3_minus1" may be frame number information at GOP level 3. The frame number information at GOP level 3 may correspond to the number of pictures at GOP level 3 that can be encoded or decoded in parallel.
For example, the value of "num_frame_by_gop_level3_minus1 + 1" may represent the number of pictures at GOP level 3 that can be encoded or decoded in parallel.
- num_frame_by_gop_level2_minus1: "num_frame_by_gop_level2_minus1" may be frame number information at GOP level 2. The frame number information at GOP level 2 may correspond to the number of pictures at GOP level 2 that can be encoded or decoded in parallel.
For example, the value of "num_frame_by_gop_level2_minus1 + 1" may represent the number of pictures at GOP level 2 that can be encoded or decoded in parallel.
The above description may also be applied to the time level. That is, in an embodiment, "GOP" may be replaced by "temporal identifier" and "GOP level" may be replaced by "temporal level".
When picture partition information is signaled using the pic_parameter_set_rbsp described above, a plurality of encoded pictures may be decoded using the following procedure.
First, when the value of "unified_slice_segment_enabled_flag" in the PPS of the current picture is "1", the picture may be partitioned into one or more slices.
In addition, when the value of "unified_slice_segment_enabled_flag" in the PPS of the current picture is "1", a picture referring to the PPS may be partitioned using one of at least two different methods.
In order to partition a picture into slices, it must be possible to calculate slice_segment_address, which is slice partition information. The slice_segment_address may be calculated based on the elements of the PPS after the PPS has been received.
When the value of "slice_unit_spacing_flag" is "1", the sizes of all slices may be identical to each other. In other words, the size of a unit slice may be calculated, and the sizes of all slices may be equal to the calculated unit slice size. The slice_segment_address values of all slices may be calculated using the unit slice size. When the value of "slice_unit_spacing_flag" is "1", the unit slice size and the individual slice_segment_address values may be calculated using the code shown in Table 11 below:
TABLE 11
(The code of Table 11 is reproduced as an image in the original publication.)
When the value of "slice_unit_spacing_flag" is "0", unified_slice_segment_address[i] may be parsed in the PPS. In other words, when the value of "slice_unit_spacing_flag" is "0", the PPS may include unified_slice_segment_address[i]. Here, i may be an integer equal to or greater than 0 and less than n, and n may be the number of slices.
For example, when the value of "unified_slice_segment_by_gop_level_enable_flag" in the PPS of the current picture is "1", num_slice_minus1 and unified_slice_segment_address[i] may be redefined.
When the value of "unified_slice_segment_by_gop_level_enable_flag" in the PPS of the current picture is "1" and the GOP level of the current picture is 2, the num_slice_minus1 to be applied to the current picture may be redefined by the following equation 9:
[ equation 9]
num_slice_minus1=(num_slice_minus1)/(num_frame_by_gop_level2_minus1+1)
Here, the redefined num_slice_minus1 may correspond to the number of slices in the current picture at GOP level 2. For example, a value of "num_slice_minus1+1" may represent the number of slices in the current picture of the partition.
When the value of "unified_slice_segment_by_gop_level_enable_flag" in the PPS of the current picture is "1" and the GOP level of the current picture is 3, the num_slice_minus1 to be applied to the current picture may be redefined by the following equation 10:
[ equation 10]
num_slice_minus1=(num_slice_minus1)/(num_frame_by_gop_level3_minus1+1)
Here, the redefined num_slice_minus1 may correspond to the number of slices in the current picture at GOP level 3. For example, a value of "num_slice_minus1+1" may represent the number of slices in the current picture.
According to equations 9 and 10 above, the larger the value of num_frame_by_gop_level2_minus1 or num_frame_by_gop_level3_minus1, the smaller the value of num_slice_minus1. In other words, the larger the value of num_frame_by_gop_level2_minus1 or num_frame_by_gop_level3_minus1, the smaller the number of slices resulting from the partitioning operation. Accordingly, num_frame_by_gop_level2_minus1 and num_frame_by_gop_level3_minus1 may be reduction indication information for reducing the number of slices generated from partitioning a picture. As the number of pictures encoded or decoded in parallel at the same GOP level becomes larger, each picture may be partitioned into a smaller number of slices.
The picture partition information may contain reduction indication information for reducing the number of slices generated from partitioning each picture. Further, the reduction indication information may represent the degree to which the number of slices generated from partitioning a picture is reduced according to parallel-processed encoding or decoding. The picture partition information may include GOP level n reduction indication information for reducing the number of slices generated from partitioning a picture at GOP level n. Here, n may be an integer of 2 or more. For example, num_frame_by_gop_level2_minus1 may be the GOP level 2 reduction indication information. Further, num_frame_by_gop_level3_minus1 may be the GOP level 3 reduction indication information.
As described above with reference to equations 9 and 10, the picture partition information may contain GOP level n reduction indication information for pictures at GOP level n. When the number of slices generated from partitioning a picture at GOP level 0 or 1 is w and the number of slices generated from partitioning a picture at GOP level n is w/m, the GOP level n reduction indication information may correspond to m.
Through the redefinition of equations 9 and 10, the unified_slice_segment_address values of the slices in the current picture may be calculated using the code shown in Table 12 below:
TABLE 12
(The code of Table 12 is reproduced as an image in the original publication.)
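Table 12's code is available only as an image. The following hypothetical sketch generalizes equations 9 and 10 to any GOP (or, equivalently, temporal) level n by mapping level n to its num_frame_by_gop_leveln_minus1 value; the mapping structure, function name, and floor division are illustrative assumptions.

```python
def derive_unified_slice_addresses(pic_size_in_ctbs, num_slice_minus1,
                                   by_gop_level_flag, gop_level,
                                   num_frame_by_gop_level_minus1):
    """Redefine num_slice_minus1 for the current picture's GOP level
    (equations 9 and 10) and derive unified slice_segment_address values
    at uniform intervals.
    num_frame_by_gop_level_minus1 maps level n -> num_frame_by_gop_leveln_minus1."""
    if by_gop_level_flag and gop_level in num_frame_by_gop_level_minus1:
        num_slice_minus1 //= num_frame_by_gop_level_minus1[gop_level] + 1
    num_slices = num_slice_minus1 + 1
    unit_slice_size = pic_size_in_ctbs // num_slices
    return [i * unit_slice_size for i in range(num_slices)]
```

For a 96-CTB picture signaled with num_slice_minus1 = 5, three parallel pictures at GOP level 3 reduce the partition to two slices; with the flag off, the picture keeps all six signaled slices.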
Table 13 below shows an example of the syntax of a PPS for signaling picture partition information when the picture partition method to be applied to a plurality of pictures changes from picture to picture.
TABLE 13
(The syntax table is reproduced as an image in the original publication.)
Table 14 below shows an example of the syntax of a slice header for signaling picture partition information when the picture partition method to be applied to a plurality of pictures changes from picture to picture.
TABLE 14
(The syntax table is reproduced as an image in the original publication.)
Table 15 below shows another example of the syntax of a PPS for signaling picture partition information when the picture partition method to be applied to a plurality of pictures changes from picture to picture.
TABLE 15
(The syntax table is reproduced as an image in the original publication.)
Table 16 below shows another example of the syntax of a PPS for signaling picture partition information when the picture partition method to be applied to a plurality of pictures changes from picture to picture.
TABLE 16
(The syntax table is reproduced as an image in the original publication.)
With the above-described embodiments, picture partition information in a bitstream can be transmitted from the encoding apparatus 1300 to the decoding apparatus 1500.
According to the embodiment, even in the case of partitioning a plurality of pictures using different methods, it may not be necessary to signal picture partition information for each picture or for each partition of each picture.
According to the embodiment, even in the case where a plurality of pictures are partitioned using different methods, it may not be necessary to encode the picture partition information for each picture or for each portion of each picture. Since encoding and signaling are efficiently performed, the size of the encoded bitstream may be reduced, encoding efficiency may be improved, and complexity of implementation of the decoding apparatus 1500 may be reduced.
Fig. 17 is a configuration diagram of an electronic device implementing the encoding apparatus and/or the decoding apparatus.
In an embodiment, at least some of the control unit 1310, the encoding unit 1320, and the communication unit 1330 of the encoding apparatus 1300 may be program modules and may communicate with external devices or systems. Program modules may be included in the encoding device 1300 in the form of an operating system, application program modules, and other program modules.
Further, in an embodiment, at least some of the control unit 1510, the decoding unit 1520, and the communication unit 1530 of the decoding apparatus 1500 may be program modules and may communicate with external devices or systems. Program modules may be included in the decoding apparatus 1500 in the form of an operating system, application program modules, and other program modules.
Program modules may be physically stored in various types of well known storage devices. Furthermore, at least some of the program modules may also be stored in a remote storage device capable of communicating with the encoding apparatus 1300 or a remote storage device capable of communicating with the decoding apparatus 1500.
Program modules may include, but are not limited to, routines, subroutines, programs, objects, components, and data structures for performing functions or operations according to embodiments or for implementing abstract data types according to embodiments.
Program modules may be implemented using instructions or code that are executed by at least one processor of the encoding device 1300 or at least one processor of the decoding device 1500.
The encoding apparatus 1300 and/or the decoding apparatus 1500 may be implemented as an electronic device 1700 as shown in fig. 17. The electronic apparatus 1700 may be a general-purpose computer system that functions as the encoding device 1300 and/or the decoding device 1500.
As shown in Fig. 17, the electronic device 1700 may include at least one processor 1710, memory 1730, User Interface (UI) input devices 1750, UI output devices 1760, and storage 1740, which communicate with each other via a bus 1790. The electronic device 1700 may also include a communication unit 1720 connected to a network 1799. The processor 1710 may be a Central Processing Unit (CPU) or a semiconductor device that executes processing instructions stored in the memory 1730 or the storage 1740. Each of the memory 1730 and the storage 1740 may be any of various types of volatile or nonvolatile storage media. For example, the memory 1730 may include at least one of Read-Only Memory (ROM) 1731 and Random Access Memory (RAM) 1732.
The encoding apparatus 1300 and/or the decoding apparatus 1500 may be implemented in a computer system including a computer-readable storage medium.
The storage medium may store at least one module required for the electronic device 1700 to function as the encoding apparatus 1300 and/or the decoding apparatus 1500. The memory 1730 may store the at least one module, which may be configured to be executed by the at least one processor 1710.
The functions related to the communication of data or information of the encoding apparatus 1300 and/or the decoding apparatus 1500 may be performed by the communication unit 1720. For example, the control unit 1310 and the encoding unit 1320 of the encoding apparatus 1300 may correspond to the processor 1710, and the communication unit 1330 may correspond to the communication unit 1720. For example, the control unit 1510 and the decoding unit 1520 of the decoding apparatus 1500 may correspond to the processor 1710, and the communication unit 1530 may correspond to the communication unit 1720.
In the above-described embodiments, although the methods have been described based on flowcharts as a series of steps or units, the present invention is not limited to the order of the steps, and some steps may be performed in a different order from that described, or simultaneously with other steps. Furthermore, those skilled in the art will appreciate that the steps shown in the flowcharts are not exclusive: other steps may be included, or one or more steps in the flowcharts may be deleted, without departing from the scope of the present invention.
The embodiments according to the present invention described above may be implemented as programs that can be executed by various computer apparatuses, and may be recorded on a computer-readable storage medium. The computer-readable storage medium may include program instructions, data files, and data structures, alone or in combination. The program instructions recorded on the storage medium may be specially designed and configured for the present invention, or may be known to and usable by those having ordinary skill in the computer software arts. Examples of computer-readable storage media include hardware devices specially configured to store and execute program instructions, such as magnetic media (e.g., hard disks, floppy disks, and magnetic tape), optical media (e.g., Compact Disc (CD)-ROMs and Digital Versatile Discs (DVDs)), magneto-optical media (e.g., floptical disks), ROM, RAM, and flash memory. Examples of program instructions include both machine code, such as that produced by a compiler, and high-level language code that can be executed by a computer using an interpreter. The hardware devices may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.
As described above, although the present invention has been described based on specific details, such as detailed components, and a limited number of embodiments and drawings, these details are provided only to facilitate understanding of the present invention; the present invention is not limited to these embodiments, and various changes and modifications may be practiced by those skilled in the art in light of the above description.
It is, therefore, to be understood that the spirit of the present invention is not limited to the above-described embodiments, and that the appended claims, their equivalents, and modifications thereto fall within the scope of the invention.

Claims (26)

1. A method for reconstructing a plurality of pictures, comprising:
acquiring picture partition information;
reconstructing the plurality of pictures based on the picture partition information.
2. The method of claim 1, wherein,
the picture partition information indicates a partition method applied to the plurality of pictures.
3. The method of claim 2, wherein,
the picture partition information indicates the number of tiles into which each picture of the plurality of pictures is to be partitioned.
4. The method of claim 2, wherein,
the picture partition information includes partition method indication information indicating whether the plurality of pictures are partitioned,
in the case where the partition method indication information indicates that the plurality of pictures are partitioned, the plurality of pictures are partitioned according to the partition method.
5. The method of claim 4, wherein,
the partition method indication information is included in parameter sets for the plurality of pictures.
6. The method of claim 1, wherein,
the plurality of pictures are partitioned using at least two different partitioning methods,
each of the plurality of pictures is partitioned using one of the at least two different partitioning methods.
7. The method of claim 6, wherein,
each of the at least two different partitioning methods partitions the picture into a plurality of slices,
each of the plurality of slices is partitioned into a plurality of coding tree blocks (CTBs).
8. A method for generating a bitstream, comprising:
generating picture partition information for a plurality of pictures;
generating a bitstream including the picture partition information.
9. The method of claim 8, wherein,
the picture partition information indicates a partition method applied to the plurality of pictures.
10. The method of claim 9, wherein,
the picture partition information indicates the number of tiles into which each picture of the plurality of pictures is to be partitioned.
11. The method of claim 9, wherein,
the picture partition information includes partition method indication information indicating whether the plurality of pictures are partitioned,
in the case where the partition method indication information indicates that the plurality of pictures are partitioned, the plurality of pictures are partitioned according to the partition method.
12. The method of claim 11, wherein,
the partition method indication information is included in parameter sets for the plurality of pictures.
13. The method of claim 8, wherein,
the plurality of pictures are partitioned using at least two different partitioning methods,
each of the plurality of pictures is partitioned using one of the at least two different partitioning methods.
14. The method of claim 13, wherein,
each of the at least two different partitioning methods partitions the picture into a plurality of slices,
each of the plurality of slices is partitioned into a plurality of coding tree blocks (CTBs).
15. An apparatus for reconstructing a plurality of pictures, comprising:
a control unit for acquiring picture partition information; and
a decoding unit for reconstructing the plurality of pictures based on the picture partition information.
16. An apparatus for generating a bitstream, comprising:
a control unit for generating picture partition information for a plurality of pictures; and
an encoding unit for generating a bitstream including the picture partition information.
17. A computer-readable recording medium storing a bitstream, the bitstream comprising picture partition information, wherein a plurality of pictures are reconstructed based on the picture partition information.
18. A computer readable recording medium including a bitstream decoded by an image decoding method, the image decoding method comprising:
acquiring picture partition information;
reconstructing a plurality of pictures based on the picture partition information.
19. A computer readable recording medium storing a bitstream comprising computer executable code, wherein the computer executable code, when executed by a processor of a video decoding device, causes the processor to perform the steps of:
decoding picture partition information in the computer executable code;
reconstructing a plurality of pictures based on the picture partition information.
20. A computer readable recording medium storing a bitstream comprising computer executable code, wherein the computer executable code, when executed by a processor of a video decoding device, causes the processor to perform the steps of:
reconstructing a plurality of pictures based on picture partition information in the computer executable code.
21. The computer-readable recording medium of claim 20, wherein,
the picture partition information indicates a partition method applied to the plurality of pictures.
22. The computer-readable recording medium of claim 21, wherein,
the picture partition information indicates the number of tiles into which each picture of the plurality of pictures is to be partitioned.
23. The computer-readable recording medium of claim 21, wherein,
the picture partition information includes partition method indication information indicating whether the plurality of pictures are partitioned,
in the case where the partition method indication information indicates that the plurality of pictures are partitioned, the plurality of pictures are partitioned according to the partition method.
24. The computer-readable recording medium of claim 23, wherein,
the partition method indication information is included in parameter sets for the plurality of pictures.
25. The computer-readable recording medium of claim 20, wherein,
the plurality of pictures are partitioned using at least two different partitioning methods,
each of the plurality of pictures is partitioned using one of the at least two different partitioning methods.
26. The computer-readable recording medium of claim 25, wherein,
each of the at least two different partitioning methods partitions the picture into a plurality of slices,
each of the plurality of slices is partitioned into a plurality of coding tree blocks (CTBs).
CN202310212807.0A 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information Pending CN116193116A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR20160038461 2016-03-30
KR10-2016-0038461 2016-03-30
CN201780022137.9A CN109076216B (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
PCT/KR2017/003496 WO2017171438A1 (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201780022137.9A Division CN109076216B (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information

Publications (1)

Publication Number Publication Date
CN116193116A true CN116193116A (en) 2023-05-30

Family

ID=60141232

Family Applications (6)

Application Number Title Priority Date Filing Date
CN202310181621.3A Pending CN116347073A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
CN202310193502.XA Pending CN116170588A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
CN202310212661.XA Pending CN116193115A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
CN202310212807.0A Pending CN116193116A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
CN201780022137.9A Active CN109076216B (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
CN202310181696.1A Pending CN116156163A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information

Family Applications Before (3)

Application Number Title Priority Date Filing Date
CN202310181621.3A Pending CN116347073A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
CN202310193502.XA Pending CN116170588A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
CN202310212661.XA Pending CN116193115A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information

Family Applications After (2)

Application Number Title Priority Date Filing Date
CN201780022137.9A Active CN109076216B (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information
CN202310181696.1A Pending CN116156163A (en) 2016-03-30 2017-03-30 Method and apparatus for encoding and decoding video using picture division information

Country Status (3)

Country Link
US (1) US20190082178A1 (en)
KR (1) KR102397474B1 (en)
CN (6) CN116347073A (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3428887A1 (en) * 2017-07-13 2019-01-16 Thomson Licensing Method and device for encoding a point cloud
KR102551361B1 (en) * 2018-08-24 2023-07-05 삼성전자주식회사 Encoding method and its device, decoding method and its device
WO2020071709A1 (en) * 2018-10-01 2020-04-09 삼성전자 주식회사 Method and device for transmitting video content and method and device for receiving video content
WO2020127110A1 (en) * 2018-12-20 2020-06-25 Telefonaktiebolaget Lm Ericsson (Publ) Signaling segment partitions in a parameter set
CN109714598B (en) * 2019-01-31 2021-05-14 上海国茂数字技术有限公司 Video encoding method, decoding method, processing method and video processing system
WO2020185146A1 (en) * 2019-03-11 2020-09-17 Telefonaktiebolaget Lm Ericsson (Publ) Video coding comprising rectangular tile group signaling
WO2020209479A1 (en) * 2019-04-08 2020-10-15 엘지전자 주식회사 Method and device for picture partitioning based on signaled information
WO2020209478A1 (en) * 2019-04-08 2020-10-15 엘지전자 주식회사 Method and device for partitioning picture into plurality of tiles
WO2020209477A1 (en) * 2019-04-08 2020-10-15 엘지전자 주식회사 Picture partitioning-based coding method and device
US11363307B2 (en) * 2019-08-08 2022-06-14 Hfi Innovation Inc. Video coding with subpictures
WO2021034129A1 (en) 2019-08-20 2021-02-25 주식회사 엑스리스 Method for encoding/decoding image signal and device therefor
WO2021061033A1 (en) * 2019-09-23 2021-04-01 Telefonaktiebolaget Lm Ericsson (Publ) Segment position signalling with subpicture slice position deriving
JP7322290B2 (en) * 2019-10-02 2023-08-07 北京字節跳動網絡技術有限公司 Syntax for Subpicture Signaling in Video Bitstreams
CN116112684A (en) * 2019-10-09 2023-05-12 苹果公司 Method for encoding/decoding image signal and apparatus therefor
MX2022004409A (en) 2019-10-18 2022-05-18 Beijing Bytedance Network Tech Co Ltd Syntax constraints in parameter set signaling of subpictures.
US11785214B2 (en) * 2019-11-14 2023-10-10 Mediatek Singapore Pte. Ltd. Specifying video picture information
MX2022006361A (en) * 2019-11-27 2022-09-07 Lg Electronics Inc Method and apparatus for signaling picture partitioning information.
US20230328266A1 (en) * 2019-11-27 2023-10-12 Lg Electronics Inc. Image decoding method and device therefor
US20230013803A1 (en) * 2019-11-27 2023-01-19 Lg Electronics Inc. Image decoding method and apparatus therefor
CN114902664A (en) * 2019-11-28 2022-08-12 Lg 电子株式会社 Image/video encoding/decoding method and apparatus
CN114930820A (en) * 2019-11-28 2022-08-19 Lg 电子株式会社 Image/video compiling method and device based on picture division structure
KR20220087514A (en) * 2019-11-28 2022-06-24 엘지전자 주식회사 Video/Video Coding Method and Apparatus
MX2022006485A (en) * 2019-11-28 2022-10-10 Lg Electronics Inc Slice and tile configuration for image/video coding.
CN115552899A (en) * 2020-02-21 2022-12-30 抖音视界有限公司 Indication of slices in video pictures
BR112022017122A2 (en) * 2020-02-28 2022-12-27 Huawei Tech Co Ltd DECODER AND CORRESPONDING METHODS FOR SIGNALING IMAGE PARTITIONING INFORMATION FOR SLICES
TWI761166B (en) * 2020-04-01 2022-04-11 聯發科技股份有限公司 Method and apparatus for signaling slice partition information in image and video coding
US11496730B2 (en) 2020-04-03 2022-11-08 Electronics And Telecommunications Research Institute Method, apparatus and storage medium for image encoding/decoding using subpicture
CN112511843B (en) * 2020-11-19 2022-03-04 腾讯科技(深圳)有限公司 Video encoding method, video encoding device, terminal device and storage medium
CN116112683A (en) * 2021-11-10 2023-05-12 腾讯科技(深圳)有限公司 Video compression method, apparatus, computer device and storage medium
US20230164358A1 (en) * 2021-11-23 2023-05-25 Mediatek Inc. Video Encoder With Motion Compensated Temporal Filtering
CN113965753B (en) * 2021-12-20 2022-05-17 康达洲际医疗器械有限公司 Inter-frame image motion estimation method and system based on code rate control

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9124895B2 (en) * 2011-11-04 2015-09-01 Qualcomm Incorporated Video coding with network abstraction layer units that include multiple encoded picture partitions
US10244246B2 (en) * 2012-02-02 2019-03-26 Texas Instruments Incorporated Sub-pictures for pixel rate balancing on multi-core platforms
WO2014003675A1 (en) * 2012-06-29 2014-01-03 Telefonaktiebolaget L M Ericsson (Publ) Transmitting apparatus and method thereof for video processing
MY189418A (en) * 2013-01-04 2022-02-10 Samsung Electronics Co Ltd Method for entropy-encoding slice segment and apparatus therefor, and method for entropy-decoding slice segment and apparatus therefor
CN116708768A (en) * 2013-01-04 2023-09-05 Ge视频压缩有限责任公司 Efficient scalable coding concept
WO2015007348A1 (en) * 2013-07-19 2015-01-22 Huawei Technologies Co., Ltd. Method and apparatus for encoding and decoding a texture block using depth based block partitioning
US10390087B2 (en) * 2014-05-01 2019-08-20 Qualcomm Incorporated Hypothetical reference decoder parameters for partitioning schemes in video coding

Also Published As

Publication number Publication date
CN109076216B (en) 2023-03-31
CN109076216A (en) 2018-12-21
CN116193115A (en) 2023-05-30
US20190082178A1 (en) 2019-03-14
CN116347073A (en) 2023-06-27
KR20170113384A (en) 2017-10-12
CN116156163A (en) 2023-05-23
KR102397474B1 (en) 2022-05-13
CN116170588A (en) 2023-05-26

Similar Documents

Publication Publication Date Title
CN109076216B (en) Method and apparatus for encoding and decoding video using picture division information
CN109314785B (en) Method and apparatus for deriving motion prediction information
CN110463201B (en) Prediction method and apparatus using reference block
CN111149359B (en) Method and apparatus for encoding/decoding image and recording medium storing bit stream
CN109804626B (en) Method and apparatus for encoding and decoding image and recording medium for storing bit stream
CN110476425B (en) Prediction method and device based on block form
CN108605123B (en) Method and apparatus for encoding and decoding video by using prediction
KR20220065740A (en) Method and apparatus for encoding and decoding video using picture partition information
CN115460409A (en) Method and apparatus for encoding and decoding video by using prediction
CN113891094A (en) Method and apparatus for predicting residual signal
CN116546211A (en) Video encoding method, video encoding device, computer equipment and storage medium
CN108605139B (en) Method and apparatus for encoding and decoding video by using prediction
KR20220106724A (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
CN115733977A9 (en) Method and apparatus for encoding and decoding video by using prediction
KR20170043461A (en) Method and apparatus for adaptive encoding and decoding based on image complexity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination