CN118101969A - Method and apparatus for syntax processing for video codec system - Google Patents


Info

Publication number
CN118101969A
Authority
CN
China
Prior art keywords
current
block
motion vector
sub
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410210285.5A
Other languages
Chinese (zh)
Inventor
赖贞延
林芷仪
陈庆晔
庄子德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MediaTek Inc
Original Assignee
MediaTek Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MediaTek Inc filed Critical MediaTek Inc
Publication of CN118101969A publication Critical patent/CN118101969A/en
Pending legal-status Critical Current

Classifications

    • H04N19/53 Multi-resolution motion estimation; Hierarchical motion estimation
    • H04N7/24 Systems for the transmission of television signals using pulse code modulation
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/184 Adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
    • H04N19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/593 Predictive coding involving spatial prediction techniques
    • H04N19/70 Characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/96 Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a method and apparatus for a video coding system using the current picture referencing (CPR) mode. According to one method, when the current reference picture is equal to the current picture, the integer motion vector flag is inferred to be true without signaling or parsing the integer motion vector flag. According to another method, when all motion vector differences of the current block are equal to zero, the integer motion vector flag is inferred to be true without signaling or parsing the integer motion vector flag. According to yet another method, when all reference pictures of the current block are equal to the current picture, the sub-block prediction coding mode is disabled and the current block is encoded or decoded with the sub-block prediction coding mode disabled. Alternatively, the derived motion vectors associated with the sub-blocks of the current block may be converted to integer motion vectors.

Description

Method and apparatus for syntax processing for video codec system
Related references
The present invention claims priority to U.S. provisional patent application No. 62/629,204, filed on February 12, 2018, U.S. provisional patent application No. 62/742,474, filed on October 8, 2018, and U.S. provisional patent application No. 62/747,170, filed on October 18, 2018, the disclosures of which are incorporated herein by reference in their entirety.
Technical Field
The present invention relates to video coding using the current picture referencing (CPR) coding tool. More specifically, the present invention discloses syntax signaling for a coding system that uses the CPR coding tool along with other coding tools, such as adaptive motion vector resolution (AMVR), sub-block based temporal motion vector prediction (sbTMVP), or affine prediction.
Background
The High Efficiency Video Coding (HEVC) standard was developed under the joint video project of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization bodies, in a partnership known as the Joint Collaborative Team on Video Coding (JCT-VC). In HEVC, one slice is partitioned into multiple coding tree units (CTUs). In the main profile, the minimum and maximum sizes of a CTU are specified by syntax elements in the sequence parameter set (SPS). The allowed CTU size can be 8x8, 16x16, 32x32, or 64x64. For each slice, the CTUs within the slice are processed in raster scan order.
The CTU is further partitioned into multiple coding units (CUs) to adapt to various local characteristics. A quadtree, denoted as the coding tree, is used to partition the CTU into multiple CUs. Let the CTU size be MxM, where M is one of the values 64, 32, or 16. The CTU can be a single CU (i.e., no splitting) or can be split into four smaller units of equal size (i.e., M/2 x M/2 each), which correspond to the nodes of the coding tree. If a unit is a leaf node of the coding tree, it becomes a CU. Otherwise, the quadtree splitting process can be iterated until the size of a node reaches the minimum allowed CU size specified in the SPS. This representation results in a recursive structure specified by the coding tree (also referred to as a partition tree structure) 120 in Fig. 1. The partitioning of CTU 110 is shown in Fig. 1, where the solid lines indicate CU boundaries. The decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level. Since the minimum CU size can be 8x8, the minimum granularity for switching between different basic prediction types is 8x8.
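The recursive CTU-to-CU quadtree split described above can be sketched as follows. This is an illustrative sketch only: the split decision (here a simple callback) would in practice come from rate-distortion optimization, and all function and parameter names are this sketch's own, not standard syntax.

```python
# Hypothetical sketch of the recursive quadtree split of a CTU into CUs.
# A node either becomes a leaf CU or splits into four equal quadrants,
# iterating until the minimum allowed CU size is reached.

def quadtree_split(x, y, size, min_cu_size, split_decision):
    """Return the list of leaf CUs as (x, y, size) tuples."""
    if size <= min_cu_size or not split_decision(x, y, size):
        return [(x, y, size)]          # leaf node becomes a CU
    half = size // 2                   # split into four M/2 x M/2 quadrants
    cus = []
    for dy in (0, half):
        for dx in (0, half):
            cus += quadtree_split(x + dx, y + dy, half,
                                  min_cu_size, split_decision)
    return cus

# Example: split a 64x64 CTU once at the top level only.
leaves = quadtree_split(0, 0, 64, 8, lambda x, y, s: s == 64)
print(leaves)  # four 32x32 CUs
```

With the minimum CU size of 8x8 noted above, recursion stops automatically once a node reaches 8x8 even if the decision callback asks for a further split.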
Furthermore, according to HEVC, each CU can be partitioned into one or more prediction units (PUs). Together with the CU, the PU works as a basic representative block for sharing prediction information. Inside each PU, the same prediction process is applied, and the relevant information is transmitted to the decoder on a PU basis. A CU can be split into one, two, or four PUs according to the PU partition type. As shown in Fig. 2, HEVC defines eight shapes for splitting a CU into PUs, including the partition types 2Nx2N, 2NxN, Nx2N, NxN, 2NxnU, 2NxnD, nLx2N, and nRx2N. Unlike the CU, the PU can be split only once according to HEVC. The partitions shown in the second row correspond to asymmetric partitions, where the two partitioned parts have different sizes.
After obtaining the residual block through the prediction process based on the PU partition type, the prediction residual of the CU is partitioned into Transform Units (TUs) according to another quadtree structure similar to the coding tree of the CU as shown in fig. 1. The solid lines represent CU boundaries and the dashed lines represent TU boundaries. A TU is a basic representative block with residual or transform coefficients for applying integer transform (integer transform) and quantization. For each TU, an integer transform of the same size is applied to the TU to obtain residual coefficients. These coefficients are transmitted to a decoder after TU-based quantization.
The terms coding tree block (CTB), coding block (CB), prediction block (PB), and transform block (TB) are defined to specify the 2-D sample array of one color component associated with a CTU, CU, PU, and TU, respectively. Thus, a CTU consists of one luma CTB, two chroma CTBs, and associated syntax elements. A similar relationship is valid for the CU, PU, and TU. The tree partitioning is generally applied simultaneously to both luma and chroma, though exceptions apply when certain minimum sizes are reached for chroma.
Alternatively, in JCTVC-P1005 (D. Flynn, et al., "HEVC Range Extensions Draft 6", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 16th Meeting: San Jose, US, 9–17 January 2014, Document: JCTVC-P1005), a binary tree block partitioning structure is proposed. In the proposed binary tree partitioning structure, a block can be recursively split into two smaller blocks using various binary splitting types, as shown in Fig. 3. The most efficient and simplest ones are the symmetric horizontal and vertical splitting types, shown as the first two splitting types in Fig. 3. For a given block of size MxN, a flag is signaled to indicate whether the given block is split into two smaller blocks. If yes, another syntax element is signaled to indicate which splitting type is used. If the horizontal splitting is used, the given block is split into two blocks of size Mx(N/2). Otherwise, the given block is split into two blocks of size (M/2)xN. The binary tree splitting process can be iterated; since the binary tree has two splitting types (i.e., horizontal and vertical), the minimum allowed block width and block height should both be indicated. A value 0 may indicate a horizontal split and a value 1 may indicate a vertical split.
A binary tree structure can be used to partition an image area into multiple smaller blocks, for example, partitioning a slice into CTUs, a CTU into CUs, a CU into PUs, or a CU into TUs, and so on. A binary tree can be used to partition a CTU into CUs, where the root node of the binary tree is the CTU and the leaf nodes of the binary tree are the CUs. The leaf nodes can be further processed by prediction and transform coding. For simplification, there is no further partitioning from CU to PU or from CU to TU, which means that CU is equal to PU and PU is equal to TU. Therefore, the leaf nodes of the binary tree are the basic units for prediction and transform coding.
The binary tree structure is more flexible than the quadtree structure since more partition shapes can be supported, which is also a source of coding efficiency improvement. However, the encoding complexity also increases in order to select the best partition shape. To balance complexity and coding efficiency, a method combining the quadtree and binary tree structures, known as the quadtree plus binary tree (QTBT) structure, has been disclosed. According to the QTBT structure, a block is first partitioned by a quadtree structure, and the quadtree partitioning can be iterated until the size of the partitioned block reaches the minimum allowed quadtree leaf node size. If the leaf quadtree block is not larger than the maximum allowed binary tree root node size, it can be further partitioned by a binary tree structure, and the binary tree partitioning can be iterated until the size (width or height) of the partitioned block reaches the minimum allowed binary tree leaf node size (width or height) or the binary tree depth reaches the maximum allowed binary tree depth. In the QTBT structure, the minimum allowed quadtree leaf node size, the maximum allowed binary tree root node size, the minimum allowed binary tree node width and height, and the maximum allowed binary tree depth can be indicated in the high-level syntax, such as in the SPS. Fig. 5 shows an example of the partitioning of block 510 and its corresponding QTBT 520. The solid lines indicate quadtree splitting and the dashed lines indicate binary tree splitting. In each splitting node (i.e., non-leaf node) of the binary tree, one flag indicates which splitting type (horizontal or vertical) is used: 0 may indicate horizontal splitting and 1 may indicate vertical splitting.
The QTBT structure described above can be used to partition an image area (e.g., a slice, CTU, or CU) into multiple smaller blocks, such as partitioning a slice into CTUs, a CTU into CUs, a CU into PUs, or a CU into TUs, and so on. For example, QTBT can be used to partition a CTU into CUs, where the root node of the QTBT is the CTU, which is partitioned into multiple CUs by the QTBT structure, and the CUs are then further processed by prediction and transform coding. For simplification, there is no further partitioning from CU to PU or from CU to TU; this means that CU is equal to PU and PU is equal to TU. Therefore, the leaf nodes of the QTBT structure are the basic units for prediction and transform.
An example of the QTBT structure is shown as follows. For a CTU of size 128x128, the minimum allowed quadtree leaf node size is set to 16x16, the maximum allowed binary tree root node size is set to 64x64, the minimum allowed binary tree leaf node width and height are both set to 4, and the maximum allowed binary tree depth is set to 4. First, the CTU is partitioned by a quadtree structure, and a leaf quadtree unit may have a size from 16x16 (i.e., the minimum allowed quadtree leaf node size) to 128x128 (equal to the CTU size, i.e., no splitting). If the leaf quadtree unit is 128x128, it cannot be further split by the binary tree since its size exceeds the maximum allowed binary tree root node size of 64x64. Otherwise, the leaf quadtree unit can be further split by the binary tree. The leaf quadtree unit is also the root binary tree unit, and its binary tree depth is 0. When the binary tree depth reaches 4 (i.e., the maximum allowed binary tree depth as indicated), no further splitting is implied. When the width of the block of the corresponding binary tree node is equal to 4, no vertical splitting is implied; when the height of the block is equal to 4, no horizontal splitting is implied. The leaf nodes of the QTBT are further processed by prediction (intra-picture or inter-picture) and transform coding.
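The size and depth constraints in this example can be sketched as a small helper function. This is a sketch under the common convention that a vertical binary split halves the width and a horizontal binary split halves the height (naming conventions differ across documents); all names (allowed_splits, in_qt_phase, etc.) are this sketch's own, not standard syntax.

```python
# Illustrative check of the QTBT constraints from the example above:
# CTU 128x128, MinQTSize 16, MaxBTSize 64, MinBTSize 4, MaxBTDepth 4.

MIN_QT_SIZE, MAX_BT_SIZE, MIN_BT_SIZE, MAX_BT_DEPTH = 16, 64, 4, 4

def allowed_splits(width, height, bt_depth, in_qt_phase):
    splits = []
    # Quadtree splitting is only possible for square blocks above MinQTSize,
    # and only before any binary split has been made.
    if in_qt_phase and width == height and width > MIN_QT_SIZE:
        splits.append("quad")
    # A block may (continue to) binary-split only if it does not exceed the
    # maximum binary tree root node size and the depth limit is not reached.
    can_bt = (width <= MAX_BT_SIZE and height <= MAX_BT_SIZE
              and bt_depth < MAX_BT_DEPTH)
    if can_bt and width > MIN_BT_SIZE:
        splits.append("bt_vertical")     # halves the width
    if can_bt and height > MIN_BT_SIZE:
        splits.append("bt_horizontal")   # halves the height
    return splits

print(allowed_splits(128, 128, 0, True))   # only a quadtree split is possible
print(allowed_splits(32, 32, 0, True))     # quad or either binary split
print(allowed_splits(8, 4, 2, False))      # height is at minimum already
```

Note how the 128x128 leaf quadtree unit from the example is barred from binary splitting because it exceeds the 64x64 maximum binary tree root node size.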
For I-slices, the QTBT tree structure is typically applied with separate luma/chroma coding. For example, the QTBT tree structure is applied separately to the luma and chroma components for I-slices, and applied simultaneously to both luma and chroma for P- and B-slices (except when certain minimum sizes are reached for chroma). In other words, in an I-slice, the luma CTB has its QTBT-structured block partitioning, and the two chroma CTBs have another QTBT-structured block partitioning. In another example, the two chroma CTBs may also each have their own QTBT-structured block partitioning.
For block-based coding, it is always necessary to partition an image into blocks (e.g., CUs, PUs, and TUs) for the coding purpose. As known in the field, the image may be partitioned into smaller image areas, such as slices, tiles, CTU rows, or CTUs, before the block partitioning is applied. The process of partitioning an image into blocks for the coding purpose is referred to as partitioning the image using a coding unit (CU) structure. The particular partitioning method adopted by HEVC to generate CUs, PUs, and TUs is one example of a CU structure. The QTBT tree structure is another example of a CU structure.
Current image reference
Motion estimation/compensation is a well-known key technology in hybrid video coding, which exploits the pixel correlation between adjacent pictures. In a video sequence, the object movement between neighboring frames is small, and the object movement can be modeled as a two-dimensional translational motion. Accordingly, the patterns corresponding to objects or background in a frame are displaced to form corresponding objects in a subsequent frame, or correlated with other patterns within the current frame. With the estimation of the displacement (e.g., using block matching techniques), the pattern can mostly be reproduced without the need to re-code it. Similarly, block matching and copying have been tried to allow the reference block to be selected from within the same picture. This concept was observed to be inefficient when applied to videos captured by a camera. Part of the reason is that the texture pattern in a spatially neighboring area may be similar to the current coding block but usually exhibits some gradual change over space. It is thus difficult for a block to find an exact match within the same picture of camera-captured video, and the improvement in coding performance is therefore limited.
However, the spatial correlation among pixels within the same picture is different for screen content. For a typical video with text and graphics, there are usually repetitive patterns within the same picture. Hence, intra (picture) block compensation has been observed to be very effective. A new prediction mode, the intra block copy (IBC) mode, also referred to as current picture referencing (CPR), has been introduced for screen content coding to utilize this characteristic. In the CPR mode, a prediction unit (PU) is predicted from a previously reconstructed block within the same picture. Further, a displacement vector (called a block vector, or BV) is used to signal the relative displacement from the position of the current block to that of the reference block. The prediction errors are then coded using transform, quantization, and entropy coding. An example of CPR compensation is illustrated in Fig. 6, where area 610 corresponds to a picture, a slice, or a picture area to be coded, and blocks 620 and 630 correspond to two blocks to be coded. In this example, each block can find a corresponding block (i.e., 622 and 632, respectively) in the previously coded area of the current picture. According to this technique, the reference samples correspond to the reconstructed samples of the current decoded picture prior to the in-loop filter operations, which in HEVC include both the deblocking filter and the sample adaptive offset (SAO) filter.
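The block-copy operation just described can be illustrated with a toy sketch: the predictor for a block is simply the previously reconstructed samples at the position displaced by the block vector within the same picture. The function name, the flat frame layout, and the example values are this sketch's own assumptions, not from any standard.

```python
# Minimal sketch of CPR (intra block copy) prediction: the predictor for a
# w x h block at (x, y) is the block of previously reconstructed samples
# at (x + bv_x, y + bv_y) in the same picture (before loop filtering).

def cpr_predict(recon, stride, x, y, w, h, bv_x, bv_y):
    """recon: flat list of reconstructed luma samples of the current picture."""
    pred = []
    for j in range(h):
        row_off = (y + bv_y + j) * stride + (x + bv_x)
        pred.append(recon[row_off:row_off + w])
    return pred

# 8x2 toy "picture": sample values encode position as 10*row + col.
recon = [10 * r + c for r in range(2) for c in range(8)]
# Predict a 2x2 block at (4, 0) from the block 4 samples to its left.
print(cpr_predict(recon, 8, 4, 0, 2, 2, -4, 0))
```

A real codec additionally enforces that the referenced block lies entirely inside the already-reconstructed area, as discussed in the conformance constraints below.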
In JCTVC-M0350 (Madhukar Budagavi et al., "AHG8: Video coding using Intra motion compensation", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 13th Meeting: Incheon, KR, 18–26 Apr. 2013, Document: JCTVC-M0350), an early version of CPR was disclosed, which was submitted as a candidate technology for HEVC range extensions (RExt) development. In JCTVC-M0350, the CPR compensation was limited to within a small local area, the search was limited to 1-D block vectors, and the block size was limited to 2Nx2N only. Later, a more advanced CPR method was developed during the standardization of HEVC screen content coding (SCC).
In order to signal the block vector (BV) efficiently, the BV is signaled predictively using a BV predictor (BVP), in a manner similar to MV coding. Accordingly, as shown in Fig. 7, the BV difference (BVD) is signaled and the BV can be reconstructed according to BV = BVP + BVD, where reference block 720 is selected as the intra block copy (IntraBC) prediction for the current block 710 (i.e., a CU). One BVP is determined for the current CU. Methods for deriving the motion vector predictor (MVP) are known in the field; a similar derivation can be applied to the BVP derivation.
In JCTVC-N0256 (Pang et al., "Non-RCE3: Intra Motion Compensation with 2-D MVs", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 14th Meeting: Vienna, AT, 25 July–2 Aug. 2013, Document: JCTVC-N0256), 2-D intra MC is further combined with pipeline-friendly approaches:
1. No interpolation filters are used;
2. The MV search area is restricted, with two cases:
a. The search area is the current CTU and the left CTU, or
b. The search area is the current CTU and the rightmost four columns of samples of the left CTU.
Among the methods proposed in JCTVC-N0256, the removal of the interpolation filters and the restriction of the search area to the current CTU and the left CTU were adopted for 2-D intra MC. The other aspects were either rejected or suggested for further study.
Spatial advanced motion vector prediction (AMVP) is disclosed in JCTVC-O0218 ("Evaluation of Palette Mode Coding on HM-12.0+RExt-4.1", 15th Meeting: Geneva, CH, 23 Oct.–1 Nov. 2013, Document: JCTVC-O0218). Fig. 8 shows a number of possible block vector candidates at previously coded neighboring block positions according to JCTVC-O0218. The positions are described in detail in Table 1.
TABLE 1

Position   Description
0          Below-left position, at the lower-left of the bottom-left corner of the current block
1          Left position, at the bottom-left corner of the current block
2          Above-right position, at the upper-right of the top-right corner of the current block
3          Above position, at the top-right corner of the current block
4          Above-left position, at the upper-left of the top-left corner of the current block
5          Left position, at the top-left corner of the current block
6          Above position, at the top-left corner of the current block
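One plausible coordinate reading of the Table 1 neighbor positions can be sketched as follows, with (x, y) the current block's top-left sample and w, h its width and height. The exact sample offsets are this sketch's interpretation of the table, not normative positions.

```python
# Hypothetical coordinates of the seven spatial BV candidate positions of
# Table 1, relative to a current block with top-left sample (x, y).

def bv_candidate_positions(x, y, w, h):
    return {
        0: (x - 1, y + h),      # below-left of the bottom-left corner
        1: (x - 1, y + h - 1),  # left, at the bottom-left corner
        2: (x + w, y - 1),      # above-right of the top-right corner
        3: (x + w - 1, y - 1),  # above, at the top-right corner
        4: (x - 1, y - 1),      # above-left of the top-left corner
        5: (x - 1, y),          # left, at the top-left corner
        6: (x, y - 1),          # above, at the top-left corner
    }

# Candidate positions for an 8x8 block whose top-left sample is (16, 16).
print(bv_candidate_positions(16, 16, 8, 8))
```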
In HEVC, in addition to spatial AMVP prediction, temporal MV prediction is also used for inter-slice motion compensation. As shown in Fig. 9, the temporal predictor is derived from a block (TBR or TCTR) in a co-located picture, where the co-located picture is the first reference picture in reference list 0 or reference list 1. Since the block where a temporal MVP is located may have two MVs, one from reference list 0 and one from reference list 1, the temporal MVP is derived from the MV of reference list 0 or reference list 1 according to the following rules:
1. The MV that crosses the current picture is selected first.
2. If both MVs cross the current picture or neither does, the MV with the same reference list as the current list is selected.
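The two selection rules above can be sketched directly. The flags indicating whether each MV's trajectory crosses the current picture are assumed to be computed elsewhere (from picture order counts); all names here are illustrative, not standard syntax.

```python
# Sketch of the temporal MVP list-selection rules described above.

def select_temporal_mv(mv_l0, mv_l1, crosses_l0, crosses_l1, current_list):
    # Rule 1: prefer the MV whose trajectory crosses the current picture.
    if crosses_l0 != crosses_l1:
        return mv_l0 if crosses_l0 else mv_l1
    # Rule 2: both or neither cross the current picture, so pick the MV
    # from the same reference list as the current list.
    return mv_l0 if current_list == 0 else mv_l1

# Only the list-0 MV crosses the current picture, so it wins.
print(select_temporal_mv((1, 2), (3, 4), True, False, current_list=1))
```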
When CPR is used, only part of the current image can be used as a reference image. Some bitstream conformance constraints are applied to adjust the effective MV values of the reference current picture.
First, one of the following two equations must be true:

BV_x + offsetX + nPbSw + xPbs - xCbs <= 0, and (1)
BV_y + offsetY + nPbSh + yPbs - yCbs <= 0. (2)

Second, the following wavefront parallel processing (WPP) condition must be true:

(xPbs + BV_x + offsetX + nPbSw - 1)/CtbSizeY - xCbs/CtbSizeY <= yCbs/CtbSizeY - (yPbs + BV_y + offsetY + nPbSh - 1)/CtbSizeY (3)

In equations (1) through (3), (BV_x, BV_y) is the luma block vector (i.e., the motion vector for CPR) of the current PU; nPbSw and nPbSh are the width and height of the current PU; (xPbs, yPbs) is the location of the top-left pixel of the current PU relative to the current picture; (xCbs, yCbs) is the location of the top-left pixel of the current CU relative to the current picture; and CtbSizeY is the size of the CTU. Taking into account the chroma sample interpolation for CPR mode, offsetX and offsetY are two adjustment offsets in the two dimensions:

offsetX = ( BVC_x & 0x7 ) ? 2 : 0, (4)
offsetY = ( BVC_y & 0x7 ) ? 2 : 0. (5)

Here, (BVC_x, BVC_y) is the chroma block vector, in 1/8-pel resolution in HEVC.

Third, the reference block for CPR must be within the same tile/slice boundary.
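A sketch of the first two conformance checks, transcribing equations (1) through (5) directly; variable names follow the text, and the function name and example values are this sketch's own. The third (tile/slice) check is omitted since it needs picture-partitioning context not given here.

```python
# Hedged sketch of the CPR bitstream-conformance checks of
# equations (1)-(5). Integer division stands in for the "/" in (3).

def is_valid_cpr_bv(bv_x, bv_y, x_pbs, y_pbs, n_pb_sw, n_pb_sh,
                    x_cbs, y_cbs, ctb_size_y, bvc_x=0, bvc_y=0):
    # Chroma-interpolation adjustment offsets, equations (4) and (5).
    offset_x = 2 if (bvc_x & 0x7) else 0
    offset_y = 2 if (bvc_y & 0x7) else 0
    # At least one of equations (1) and (2) must hold.
    eq1 = bv_x + offset_x + n_pb_sw + x_pbs - x_cbs <= 0
    eq2 = bv_y + offset_y + n_pb_sh + y_pbs - y_cbs <= 0
    if not (eq1 or eq2):
        return False
    # Wavefront-parallel-processing condition, equation (3).
    return ((x_pbs + bv_x + offset_x + n_pb_sw - 1) // ctb_size_y
            - x_cbs // ctb_size_y
            <= y_cbs // ctb_size_y
            - (y_pbs + bv_y + offset_y + n_pb_sh - 1) // ctb_size_y)

# A 16x16 PU at (64, 64) copying from the block directly to its left.
print(is_valid_cpr_bv(-16, 0, 64, 64, 16, 16, 64, 64, 64))
```

A BV pointing rightward or downward into the not-yet-reconstructed area fails both (1) and (2) and is rejected.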
Affine motion compensation
Affine models can be used to describe 2D block rotations, as well as 2D deformations of squares (rectangles) into parallelograms. The model can be described as follows:

x' = a0 + a1 * x + a2 * y,
y' = b0 + b1 * x + b2 * y. (6)

In this model, 6 parameters need to be determined. For each pixel A(x, y) in the area of interest, the motion vector is determined as the difference between the location of the given pixel (A) and the location of its corresponding pixel (A') in the reference block, i.e., MV = A' - A = (a0 + (a1 - 1) * x + a2 * y, b0 + b1 * x + (b2 - 1) * y). Therefore, the motion vector of each pixel is location dependent.
According to this model, the above parameters can be solved if the motion vectors of three different locations are known; this condition is equivalent to the 6 parameters being known. Each location with a known motion vector is referred to as a control point. The 6-parameter affine model thus corresponds to a 3-control-point model.
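The 3-control-point formulation can be sketched as follows: with the motion vectors v0, v1, v2 known at the top-left, top-right, and bottom-left corners of a w x h block, the per-pixel motion vector follows by linear interpolation. This is a standard reformulation consistent with model (6); the function name and example values are this sketch's own.

```python
# Sketch of the 6-parameter (3-control-point) affine motion model:
# per-pixel MV interpolated from the three corner (control-point) MVs.

def affine_mv(v0, v1, v2, w, h, x, y):
    """Motion vector at position (x, y) relative to the block's top-left."""
    mv_x = v0[0] + (v1[0] - v0[0]) * x / w + (v2[0] - v0[0]) * y / h
    mv_y = v0[1] + (v1[1] - v0[1]) * x / w + (v2[1] - v0[1]) * y / h
    return (mv_x, mv_y)

# Pure translation: all control points share one MV, so every pixel does.
print(affine_mv((4, -2), (4, -2), (4, -2), 16, 16, 7, 11))   # (4.0, -2.0)
```

When the three control-point MVs differ, the interpolated MV varies with (x, y), reproducing the location-dependent motion field described above.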
Some exemplary implementations of affine motion compensation are described in the technical literature by Li et al. ("An affine motion compensation framework for high efficiency video coding", 2015 IEEE International Symposium on Circuits and Systems (ISCAS), 24–27 May 2015, pages 525–528) and Huang et al. ("Control-Point Representation and Differential Coding Affine-Motion Compensation", IEEE Transactions on Circuits, Systems and Video Technology (CSVT), Vol. 23, No. 10, pages 1651–1660, Oct. 2013). In the work of Li et al., an affine flag is signaled for 2Nx2N block partitions when the current block is coded in either merge mode or AMVP mode. If the flag is true, the derivation of the motion vector of the current block follows the affine model; if the flag is false, the derivation follows the traditional translational model. When the affine AMVP mode is used, three control points (3 motion vectors) are signaled. At each control-point location, the MV is predictively coded, and the coded MVDs of these control points are then transmitted. In the work of Huang et al., different control-point locations and predictive coding of the MVs at the control points are studied.
A syntax table for an affine motion compensation implementation is shown in Table 2. As indicated by notes (2-1) to (2-3) for the merge mode, a syntax element use_affine_flag is transmitted if at least one merge candidate is affine coded and the partition mode is 2Nx2N (i.e., PartMode == PART_2Nx2N). As indicated by notes (2-4) to (2-6) for the B slice, the syntax element use_affine_flag is transmitted if the current block size is greater than 8x8 (i.e., log2CbSize > 3) and the partition mode is 2Nx2N (i.e., PartMode == PART_2Nx2N). As indicated by notes (2-7) to (2-9), if use_affine_flag indicates that the affine model is used (i.e., use_affine_flag has value 1), information of the other two control points is transmitted for reference list L0; and, as indicated by notes (2-10) to (2-12), information of the other two control points is transmitted for reference list L1.
TABLE 2
Palette coding. In screen content coding, a palette is used to represent a given video block (e.g., a CU). Guo et al. disclose a palette coding method in JCTVC-O0218 ("Evaluation of Palette Mode Coding on HM-12.0+RExt-4.1", 15th Meeting: Geneva, CH, 23 Oct. - 1 Nov. 2013, Document: JCTVC-O0218), which proceeds as follows:
1. Transmission of the palette: the palette size is transmitted first, followed by the palette elements.
2. Transmission of pixel values: the pixels of the CU are encoded in sequential scan order. For each position, a flag is first transmitted to indicate whether the "run mode" or the "copy above mode" is used:
2.1 "Run mode", in which the palette index is first sent, followed by the value "palette_run" (say, M). No further information needs to be transmitted for the current position and the following M positions, as they all have the same palette index as the one transmitted. The palette index (e.g., i) is shared by all three color components, which means that the reconstructed pixel value is (Y, U, V) = (paletteY[i], paletteU[i], paletteV[i]) (assuming that the color space is YUV).
2.2 "Copy above mode", in which a value "copy_run" (say, N) is sent to indicate that, for the next N positions, including the current one, the palette index is equal to the palette index at the same position in the row above.
3. Transmission of the residual: the palette indices transmitted in stage 2 are converted back into pixel values and used as the prediction. The residual information is transmitted using HEVC residual coding and added to the prediction for reconstruction.
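The index-reconstruction part of stage 2 can be sketched as follows (an illustrative sketch, not the normative process; the element encoding as tuples and the function name are assumptions made here):

```python
def decode_palette_indices(elements, width):
    """Rebuild a block's palette indices from run-mode / copy-above
    elements in raster-scan order.

    ('run', i, m)  places palette index i at the current position and
                   the next m positions (m+1 positions in total).
    ('copy', n)    copies the index from the same column of the row
                   above for n positions, including the current one.
    """
    idx = []
    for elem in elements:
        if elem[0] == 'run':                 # "run mode"
            _, pal_idx, run = elem
            idx.extend([pal_idx] * (run + 1))
        else:                                # "copy above mode"
            _, run = elem
            for _ in range(run):
                idx.append(idx[len(idx) - width])
    return idx
```

The resulting indices are then mapped through the palette to pixel values, and the residual of stage 3 is added on top.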
In JCTVC-N0247 (Guo et al., "RCE3: Results of Test 3.1 on Palette Mode for Screen Content Coding", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 14th Meeting: Vienna, AT, 25 July - 2 Aug. 2013, Document: JCTVC-N0247), a palette is constructed for each color component, and the palette may be predicted (shared) from its left neighboring CU to reduce the bit rate. It was later proposed by Qualcomm that each element in the palette be a triplet representing a particular combination of the three color components, and that predictive coding of the palette across CUs be eliminated.
Coding based on the dominant color (or palette). Palette coding is another tool for screen content coding, in which a palette for each color component is created and transmitted. The palette may be predicted from the palette of the left CU. For palette prediction, each entry in the palette may be predicted from the corresponding palette entry in the upper CU or the left CU.
In particular, three line modes, namely a horizontal mode, a vertical mode and a normal mode, are used for coding the pixel lines.
Further, according to JCTVC-O0182, pixels are classified into main color pixels and escape pixels. For the main color pixels, the decoder reconstructs the pixel values from the main color indices (i.e., palette indices) and the palette. For escape pixels, the encoder additionally transmits their pixel values.
In the present invention, the problems of various aspects of CPR encoding with QTBT structures or luminance/chrominance separation encoding are addressed.
Disclosure of Invention
The present invention proposes a method and apparatus of syntax signaling for video encoding and decoding systems in which the current picture referencing (CPR) and adaptive motion vector resolution (AMVR) codec tools are enabled. According to a proposed embodiment of the present invention, a current reference picture of a current block in a current picture is first determined. When the current reference picture is equal to the current picture, the integer motion vector flag is inferred to be true, without the need to issue the integer motion vector flag in the bitstream at the encoder side or to parse the integer motion vector flag of the current block from the bitstream at the decoder side. The integer motion vector flag being true indicates that the current motion vector (MV) is represented in integer precision, and the flag being false indicates that the current MV is represented in fractional precision.
According to an embodiment of the present invention, when the integer motion vector flag is true, an additional indication may further be issued in the bitstream at the encoder side or parsed from the bitstream at the decoder side. The additional indication is used to indicate whether the integer mode or the 4-pixel mode is used. In one embodiment, the integer motion vector flag may be inferred to be true regardless of the motion vector difference (MVD) between the current MV and the MV predictor. In another embodiment, when the current reference picture is not equal to the current picture, the integer motion vector flag is inferred to be false, without the need to issue the integer motion vector flag in the bitstream at the encoder side or to parse the integer motion vector flag of the current block from the bitstream at the decoder side.
According to a second embodiment of the present invention, the motion vector differences of a current block in a current picture are determined. When all motion vector differences of the current block are equal to zero, the integer motion vector flag is inferred to be false, without the need to issue the integer motion vector flag in the bitstream at the encoder side or to parse the integer motion vector flag of the current block from the bitstream at the decoder side. The integer motion vector flag being true indicates that the current motion vector (MV) is represented in integer precision, and the flag being false indicates that the current MV is represented in fractional precision.
According to the second embodiment of the present invention, when all motion vector differences of the current block are equal to zero, the integer motion vector flag may be inferred to be false regardless of whether the selected reference picture associated with the current MV is equal to the current picture. In another embodiment, the integer motion vector flag is inferred to be true if any motion vector difference of the current block is not equal to zero and the selected reference picture associated with the current MV is equal to the current picture. In yet another embodiment, if any motion vector difference of the current block is not equal to zero and the selected reference picture associated with the current MV is not equal to the current picture, the integer motion vector flag is issued in the bitstream at the encoder side or the integer motion vector flag of the current block is parsed from the bitstream at the decoder side. In yet another embodiment, the integer motion vector flag is inferred to be false only if the selected reference picture associated with the current MV is not equal to the current picture.
According to a third embodiment of the present invention, syntax signaling for a video encoding system and a video decoding system is disclosed, wherein a current picture reference (CPR) codec tool and a sub-block prediction codec mode are enabled. All reference pictures of a current block in a current picture are determined. When all the reference pictures of the current block are equal to the current picture, the sub-block prediction codec mode is disabled, and the current block is encoded at the encoder side or decoded at the decoder side with the sub-block prediction codec mode disabled.
According to the third embodiment of the present invention, when all reference pictures are equal to the current picture, a syntax element indicating the sub-block prediction codec mode is not issued in the bitstream at the encoder side, nor is the syntax element of the current block parsed from the bitstream at the decoder side. In another embodiment, the syntax element indicating the sub-block prediction codec mode is inferred to be false when all reference pictures of the current block are equal to the current picture. In yet another embodiment, the syntax element indicating the sub-block prediction codec mode is constrained to indicate that the sub-block prediction codec mode is disabled when all reference pictures of the current block are equal to the current picture. The sub-block prediction codec mode may be associated with an affine prediction codec tool or a sub-block based temporal motion vector prediction (sbTMVP) codec tool.
According to a fourth embodiment of the present invention, syntax transmission of a video coding system and a video decoding system is disclosed, wherein a current picture reference (current picture referencing, abbreviated CPR) codec tool and a sub-block prediction codec mode are enabled. All reference pictures of the current block in the current picture are determined. When all reference pictures of the current block are equal to the current picture: the derived motion vector associated with the sub-block of the current block is converted to an integer motion vector; and, the current motion vector of the current block is encoded at the encoder side or decoded at the decoder side using the integer motion vector as a motion vector predictor.
Drawings
Fig. 1 is an example of block partitioning, illustrating the division of a coding tree unit (CTU) into coding units (CUs) using a quadtree structure.
Fig. 2 illustrates asymmetric motion partitioning (AMP) according to High Efficiency Video Coding (HEVC), wherein AMP defines eight shapes for partitioning a CU into PUs.
FIG. 3 is an example of various binary partition types used to illustrate a binary tree partition structure, where a partition type may be used to recursively partition a block into two smaller blocks.
Fig. 4 is an example showing a block partition and its corresponding binary tree, where in each partition node (i.e., non-leaf node) of the binary tree, a syntax is used to indicate which partition type (horizontal or vertical) is used, where 0 represents horizontal partition and 1 represents vertical partition.
Fig. 5 is an example showing a block segmentation and a quadtree plus binary tree (quadtree plus binary tree, QTBT) structure, where the solid line represents the quadtree segmentation and the dashed line represents the binary tree segmentation.
Fig. 6 is an example showing CPR compensation, wherein the region 610 corresponds to an image, slice or image region to be encoded. Blocks 620 and 630 correspond to two blocks to be encoded.
Fig. 7 is a diagram illustrating an example of predictive Block Vector (BV) encoding in which BV differences (block vector difference, BVD for short) corresponding to the difference between the current BV and BV predictors are signaled.
Fig. 8 illustrates a number of possible block vector candidates at previously coded neighboring block positions used for spatial Advanced Motion Vector Prediction (AMVP).
Fig. 9 shows that the temporal predictor is derived from a block (TBR or TCTR) located in a co-located (co-located) picture, wherein the co-located picture is the first reference picture in reference list 0 or reference list 1.
Fig. 10 shows a flowchart of an exemplary encoding system with current image reference (CPR) and Adaptive Motion Vector Resolution (AMVR) codec tools according to an embodiment of the invention.
Fig. 11 shows a flowchart of an exemplary encoding system with current image reference (CPR) and Adaptive Motion Vector Resolution (AMVR) codec tools according to another embodiment of the present invention.
Fig. 12 shows a flowchart of an exemplary encoding system with a Current Picture Reference (CPR) codec tool and an enabled sub-block predictive codec mode according to another embodiment of the present invention.
Fig. 13 shows a flowchart of an exemplary encoding system with a Current Picture Reference (CPR) codec tool and an enabled sub-block predictive codec mode according to another embodiment of the present invention.
Detailed Description
The following description is of the best mode for carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by the claims.
In video coding based on the original quadtree plus binary tree (quad-tree plus binary tree, QTBT) structure and luma/chroma separation coding, luma and chroma are coded separately for all intra frames (e.g., I slices). In the following, various aspects of luma/chroma separation coding and syntax transmission using CPR mode are disclosed.
CPR with affine motion compensation
If affine motion compensation is enabled and the affine flag is sent before the reference picture index, the reference picture index (ref-idx) for list 0 or list 1, or for both list 0 and list 1, needs to be sent at the encoder side or parsed at the decoder side. However, according to an embodiment of the invention, the current picture is removed from the reference picture list when affine mode is selected, because the current picture cannot serve as the reference picture in affine mode. Accordingly, the codeword length of the reference picture index is reduced and coding efficiency is improved.
CPR with adaptive motion resolution
In video coding systems that support adaptive motion vector resolution (AMVR), a motion vector (MV) or its derivatives (i.e., the motion vector difference (MVD) or motion vector predictor (MVP)) may be represented in various resolutions (i.e., integer or fractional). A flag (i.e., imv-flag) is used to indicate the selection. The integer MV flag (imv-flag) being true indicates that integer MVs are used. In this case, an integer MV mode or a 4-pixel MV mode may be used, and an additional bit is used to indicate which of the two is selected. If the integer MV flag (imv-flag) is true, the MVP needs to be rounded to an integer.
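The MVP rounding step can be sketched as follows (an illustrative sketch only: MVs are assumed to be stored in quarter-pel units and rounding toward zero is assumed; the normative rounding rule of a given codec may differ):

```python
def round_mvp(mvp, four_pel=False):
    """Round an MVP stored in quarter-pel units to integer-pel
    (or 4-pel) precision, as required when imv-flag is true."""
    shift = 4 if four_pel else 2       # 16 or 4 quarter-pel units per step
    def rnd(v):
        m = (abs(v) >> shift) << shift  # magnitude truncated to the grid
        return m if v >= 0 else -m      # rounding toward zero (assumed)
    return tuple(rnd(v) for v in mvp)
```

For example, a predictor of (1.25, -1.75) pel, stored as (5, -7) quarter-pel units, rounds to (1, -1) pel in integer MV mode.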
Furthermore, the present invention discloses 3 different types of AMVR signaling. In the first type of AMVR signaling, the imv-flag and imv-mode are issued, where the imv-flag signaling is independent of the MVD. An example of a syntax design according to the first type of AMVR signaling is as follows:
In the above example, bold characters are used to represent the coded syntax elements. The imv-flag is still issued when MVD = 0. In this case, the MV may be the original MVP (i.e., imv-flag is false) or the rounded MVP (i.e., imv-flag is true).
In the second type of AMVR signaling, the imv-flag, imv-mode, and MVD are issued, where the imv-flag signaling depends on the MVD. An example of a syntax design according to the second type of AMVR signaling is as follows:
In the above case, when MVD = 0, imv-flag is inferred to be 0 and the MV can only be the MVP.
In the third type of AMVR signaling, the imv-flag, imv-mode, and MVD are issued. An example of a syntax design according to the third type of AMVR signaling is as follows:
In the above case, the ref-idx, MVP, and MVD are coded after the imv-flag.
In conventional syntax designs, there may be some redundancy. To improve coding efficiency, various syntax designs related to CPR and AMVR are disclosed. In one embodiment, if adaptive motion vector resolution (AMVR) is enabled for list 0 or list 1, or for both list 0 and list 1, and the AMVR syntax is issued before the reference picture index, then the reference picture index needs to be issued or parsed for list 0 or list 1, or for both list 0 and list 1.
If the reference picture index for list 0 or list 1, or for both list 0 and list 1, is issued before the integer MV flag (imv-flag), and the reference picture for list 0 or list 1, or for both list 0 and list 1, is equal to the current picture, then according to an embodiment of the present invention the integer MV flag imv-flag is inferred to be true. Thus, the integer MV flag imv-flag need not be issued for list 0 or list 1, or for both list 0 and list 1.
When a 4-pixel integer MV mode is employed as one of the integer MV modes, an integer MV index (imv_idx) may be issued according to one embodiment of the present invention. When imv_idx is 0, a fractional MV (e.g., a quarter-pel MV) is used; when imv_idx is 1, an integer MV is used; when imv_idx is 2, a 4-pixel MV is used.
The above-described embodiments may be implemented by modifying existing signaling designs. For example, the first type of syntax design may be modified as follows:
In the above example, the syntax imv-flag is transmitted when the reference picture is not equal to the current picture (i.e., "if (ref != CPR)"). Otherwise (i.e., the "else" case), imv-flag is inferred to be true (i.e., "imv-flag = true"). The above embodiment may also be implemented by modifying the second type of syntax design as follows:
In the above example, when the reference picture is not equal to the current picture (i.e., "if (ref != CPR)"), the syntax imv-flag is transmitted if the MVD is not equal to zero (i.e., "if (MVD != 0)"); otherwise (i.e., the "else" case), imv-flag is inferred to be false (i.e., "imv-flag = false"). When the reference picture is equal to the current picture (i.e., the "else" case of "if (ref != CPR)"), imv-flag is inferred to be true (i.e., "imv-flag = true").
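The inference rule of this modified second-type design can be summarized in a short sketch (hypothetical names; `parse_flag` stands in for reading one flag from the bitstream):

```python
def imv_flag_value(ref_is_current_picture, mvd, parse_flag):
    """imv-flag handling of the modified second-type AMVR design:
    inferred true when the reference picture is the current picture
    (CPR); otherwise parsed only when the MVD is nonzero, and
    inferred false when all MVD components are zero."""
    if ref_is_current_picture:
        return True                     # inferred, not coded
    if any(v != 0 for v in mvd):
        return parse_flag()             # present in the bitstream
    return False                        # inferred, not coded
```

Only the middle branch consumes bits, which is the source of the coding-efficiency gain described above.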
If the reference picture index for list 0 or list 1, or for both list 0 and list 1, is issued before the integer MV flag (imv-flag), and the reference picture for list 0 or list 1, or for both list 0 and list 1, is equal to the current picture, then imv_idx can only be greater than 0, i.e., 1 or 2. In one embodiment, a binary symbol (bin) is issued to indicate whether imv_idx is equal to 1 or 2.
In the above example, the syntaxes "imv-flag" and "imv-mode" are combined into a new syntax "imv_idx". When imv_idx is 0, a fractional MV (e.g., a quarter-pel MV) is used; when imv_idx is 1, an integer MV is used; when imv_idx is 2, a 4-pixel MV is used.
In the above example, if imv-flag is inferred to be true, imv_idx should be 1 or 2; if imv-flag is inferred to be false, imv_idx is 0.
In one embodiment, imv_idx is binarized using truncated binary codes, such as the 1-bit code "0" for imv_idx = 0, the 2-bit code "10" for imv_idx = 1, and the 2-bit code "11" for imv_idx = 2. The imv-flag may be considered the first bin of imv_idx and the imv-mode the second bin of imv_idx. The above "imv-flag" and "imv-mode" syntax signaling may thus be converted to "imv_idx" syntax signaling. For example, the following pseudocode may be used to implement the described embodiments:
When ref equals CPR, imv_idx should be 1 or 2 (i.e., the first bin of imv_idx is inferred to be 1, because imv-flag is inferred to be 1). In another embodiment, if imv-flag is inferred to be 0, imv_idx is inferred to be 0.
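A decoder-side sketch of this binarization follows (illustrative only; `read_bin` stands in for reading one context-coded bin from the bitstream):

```python
def decode_imv_idx(read_bin, ref_is_current_picture):
    """Parse imv_idx binarized as '0' -> 0, '10' -> 1, '11' -> 2.
    The first bin corresponds to imv-flag and the second to imv-mode;
    when the reference is the current picture (CPR), the first bin is
    not coded and is inferred to be 1."""
    first = 1 if ref_is_current_picture else read_bin()
    if first == 0:
        return 0                 # fractional (quarter-pel) MV
    return 1 + read_bin()        # 1: integer MV, 2: 4-pixel MV
```

Under CPR, only one bin is ever read, reflecting the constraint that imv_idx can only be 1 or 2 in that case.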
In other embodiments, if the reference picture index is issued before the integer MV flag, the reference picture is equal to the current picture, and the MVDs in list 0 or list 1, or in both list 0 and list 1, are equal to zero, then the integer MV flag for list 0 or list 1, or for both list 0 and list 1, is inferred to be false. Therefore, the integer MV flag need not be issued at the encoder side or parsed at the decoder side. In other words, if the integer MV flag in list 0 or list 1, or in both list 0 and list 1, is false and the reference picture in list 0 or list 1, or in both list 0 and list 1, is equal to the current picture, the MVD for that target reference picture is equal to zero. In this disclosure, the phrase "issue or parse a syntax element" may be used for convenience; it should be understood as shorthand for "issue the syntax element at the encoder side or parse the syntax element at the decoder side".
The above-described embodiments may be implemented by modifying the second type of syntax design as follows:
In another embodiment, the integer MV flag is inferred to be false only when the MVD in list 0 or list 1, or in both list 0 and list 1, is zero and the selected reference picture is not equal to the current picture. This embodiment may be implemented by modifying the second type of syntax design as follows:
Another exemplary signaling design for this embodiment is as follows:
In yet another embodiment, the integer MV flag is inferred to be true when the selected reference picture is equal to the current picture, regardless of the MVD. The described embodiment may be implemented by modifying the second type of syntax design as follows:
CPR with adaptive motion resolution and affine motion compensation
If the affine flag for list 0 or list 1, or for both list 0 and list 1, is issued at the encoder side or parsed at the decoder side before the integer MV flag and the reference picture index, then both the integer MV flag and the reference picture index need to be issued or parsed for list 0 or list 1, or for both list 0 and list 1. However, if affine mode is used (e.g., the affine flag equals 1), the current picture may be removed from the reference picture list. Accordingly, the codeword length of the reference picture index can be reduced.
If the integer MV flag is issued or parsed before the affine flag and the reference picture index, the affine flag and the reference picture index need to be issued or parsed. Similarly, if a fractional MV mode is used (e.g., integer MVs are disabled), the current picture may be removed from the reference picture list. Accordingly, the length of the codeword of the reference picture index can be reduced.
If the reference picture index for list 0 or list 1, or for both list 0 and list 1, is issued or parsed before the affine flag and/or the integer MV flag, and the reference picture is equal to the current picture, the affine flag is inferred to be false. Thus, there is no need to issue or parse the affine flag for list 0 or list 1, or for both list 0 and list 1. Likewise, the integer MV flag for list 0 or list 1, or for both list 0 and list 1, is inferred to be true (or imv_idx is equal to 1 or 2 according to an embodiment of the present invention). However, in other embodiments, under the above conditions, if the MVD for list 0 or list 1, or for both list 0 and list 1, is equal to zero, the integer MV flag is inferred to be false.
CPR with sub-block mode
A sub-block mode (e.g., sbTMVP (sub-block based temporal motion vector prediction) (or also referred to as alternative temporal motion vector prediction (ALTERNATIVE TEMPORAL MOTION VECTOR PREDICTION, short ATMVP) or sub-block temporal merging mode/candidate) or affine prediction) may be used to improve coding efficiency. For these types of sub-block patterns, they may be collected to be shared as a candidate list, referred to as a sub-block pattern candidate list. In skip mode coding, merge mode coding, or AMVP mode coding (i.e., inter mode coding), a flag may be issued to indicate whether to use the sub-block mode. If the sub-block mode is used, a candidate index is issued or inferred to select one of the sub-block candidates. The sub-block candidates may include sub-block temporal merging candidates, affine candidates and/or planar MV mode candidates. In one embodiment, if a CPR mode (which may be implicitly indicated or explicitly indicated using a flag or any other syntax element) is used or selected, and there are no other inter reference pictures (e.g., all reference pictures are current pictures, meaning that the current picture is the only reference picture for the current block), the sub-block mode is disabled. In some embodiments, if a flag indicating CPR mode is selected, it is inferred that the current image is the only reference image of the current block. In the syntax design, the sub-block mode syntax is not sent (e.g., the sub-block mode flag is inferred to be false) or the sub-block mode syntax is constrained to disable the sub-block mode (e.g., the sub-block mode flag is constrained to be false, as a bitstream conformance requirement, the sub-block mode flag is false). The sub-block mode is limited to be applied in the skip mode and the merge mode. 
In another embodiment, when sub-block mode is used in CPR (e.g., CPR mode is used and no other inter reference picture or selected reference picture is the current picture), the derived motion vector for each sub-block is also rounded to an integer MV. The above proposed method may be implemented in an encoder and/or decoder. For example, the proposed method may be implemented in an inter prediction module of an encoder and/or an inter prediction module of a decoder.
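The two CPR/sub-block interactions described above can be sketched together (an illustrative sketch with hypothetical names; MVs are assumed to be stored in quarter-pel units, rounding toward zero is assumed, and the rounding branch assumes the selected reference is the current picture):

```python
def cpr_subblock_mvs(ref_list, current_pic, sub_mvs):
    """Sketch of the CPR/sub-block rules: sub-block mode is disabled
    (None returned) when the current picture is the only reference
    picture of the current block; otherwise each derived sub-block MV
    is rounded to an integer MV when used with CPR."""
    if all(r == current_pic for r in ref_list):
        return None                       # sub-block mode disabled
    def rnd(v):
        m = (abs(v) >> 2) << 2            # quarter-pel -> integer-pel grid
        return m if v >= 0 else -m        # rounding toward zero (assumed)
    return [(rnd(x), rnd(y)) for x, y in sub_mvs]
```

In an encoder or decoder this logic would sit in the inter prediction module, as noted above.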
New conditions for dual tree coding allowing use of intra copy (intraBC) mode
In the HEVC SCC extension, if intraBC mode is enabled for an I-slice, the I-slice is encoded as an inter slice. The on/off state of intraBC mode may be indicated by checking the reference frame list: if the current frame is inserted into the reference frame list, intraBC mode is enabled.
Furthermore, in the BMS2.1 reference software, a dual tree is enabled for I-slices, with separate coding unit partitions applied to the luma and chroma signals. To better integrate intraBC mode and dual-tree coding, dual-tree coding in inter slices (e.g., P-slices or B-slices) is allowed if only one reference frame is put into the reference list and that reference frame is the current frame.
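The condition above can be sketched as a simple predicate (illustrative only; the function name and slice-type representation are assumptions made here):

```python
def dual_tree_allowed(slice_type, ref_list, current_frame):
    """Condition sketch for allowing dual-tree (separate luma/chroma
    partitioning): always for I-slices; for inter slices (P/B) only
    when the current frame is the one and only entry in the
    reference frame list."""
    if slice_type == 'I':
        return True
    return len(ref_list) == 1 and ref_list[0] == current_frame
```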
In the present invention, CPR, affine prediction, AMVR, ATMVP and intraBC are techniques for video coding. These techniques are also referred to as codec tools in this disclosure.
The above disclosed invention may be incorporated in various forms in various video encoding or decoding systems. For example, the invention may be implemented using hardware-based methods such as application-specific integrated circuits (ICs), field programmable gate arrays (FPGAs), digital signal processors (DSPs), central processing units (CPUs), and the like. The present invention may also be implemented using software code or firmware code executed on a computer, laptop, or mobile device such as a smartphone. Furthermore, the software code or firmware code may be executed on a hybrid platform such as a CPU with a dedicated processor (e.g., a video encoding engine or co-processor).
Fig. 10 shows a flow chart of an exemplary codec system with current image reference (current picture referencing, abbreviated CPR) and adaptive motion vector resolution (adaptive motion vector resolution, abbreviated AMVR) codec tools according to an embodiment of the invention. The steps shown in the flowcharts, as well as other subsequent flowcharts in this disclosure, may be implemented as program code executable on one or more processors (e.g., one or more CPUs) on the encoder side and/or decoder side. The steps shown in the flowcharts may also be implemented based on hardware, such as one or more electronic devices or processors arranged to perform the steps in the flowcharts. According to the method, in step 1010, a current reference picture for a current block in a current picture is determined. In step 1020, when the current reference picture is equal to the current picture, the integer motion vector flag is inferred to be true, no integer motion vector flag needs to be sent in the bitstream at the encoder side, or no integer motion vector flag needs to be parsed from the bitstream at the decoder side for the current block, where the integer motion vector flag is true to indicate that the current Motion Vector (MV) is represented in an integer and the integer motion vector flag is false to indicate that the current Motion Vector (MV) is represented in a fraction.
Fig. 11 shows a flowchart of an exemplary codec system with current picture reference (CPR) and adaptive motion vector resolution (AMVR) codec tools according to another embodiment of the present invention. According to the method, in step 1110, a current reference picture for a current block in a current picture is determined. In step 1120, when all motion vector differences of the current block are equal to zero, the integer motion vector flag is inferred to be false, and no integer motion vector flag needs to be sent in the bitstream at the encoder side or parsed from the bitstream for the current block at the decoder side, where the integer motion vector flag being true indicates that the current motion vector (MV) is represented in integer precision and the flag being false indicates that the current MV is represented in fractional precision.
Fig. 12 shows a flow chart of an exemplary codec system with a Current Picture Reference (CPR) codec tool and a sub-block predictive codec mode according to yet another embodiment of the present invention. According to the method, at least one reference picture for a current block in a current picture is determined in step 1210. In step 1220, when the current picture is the only reference picture for the current block: the sub-block predictive codec mode is disabled; and encodes the current motion vector of the current block at the encoder side or decodes the current motion vector at the decoder side by disabling the subblock predictive coding mode.
Fig. 13 shows a flow chart of an exemplary codec system with a Current Picture Reference (CPR) codec and a sub-block predictive codec mode according to yet another embodiment of the present invention. According to the method, at least one reference picture for a current block in a current picture is determined in step 1310. In step 1320, when the current picture is the only reference picture for the current block: the derived motion vectors associated with the sub-blocks in the current block are converted to integer motion vectors; and, the current motion vector of the current block is encoded at the encoder side or decoded at the decoder side using the integer motion vector as a motion vector predictor.
The flowcharts shown are intended to illustrate examples of video coding according to the present invention. One of ordinary skill in the art may modify each step, rearrange the steps, split a step, or combine steps to practice the invention without departing from the spirit of the invention. In this disclosure, specific syntax and semantics have been used to illustrate examples of implementing embodiments of the invention. Those of ordinary skill in the art may practice the invention by substituting equivalent syntax and semantics for those described without departing from the spirit of the invention.
The previous description is presented to enable any person skilled in the art to make or use the invention in the context of a particular application and its requirements. Various modifications to the described embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments. Thus, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. In the previous detailed description, numerous specific details were set forth in order to provide a thorough understanding of the present invention. However, those of ordinary skill in the art will appreciate that the present invention may be practiced without these specific details.
Embodiments of the invention as described above may be implemented in various hardware, software code, or a combination of both. For example, an embodiment of the invention may be one or more circuits integrated into a video compression chip, or program code integrated into video compression software, to perform the processes described herein. An embodiment of the invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processes described herein. The invention may also involve a number of functions performed by a computer processor, a digital signal processor, a microprocessor, or a Field Programmable Gate Array (FPGA). These processors may be configured to perform particular tasks according to the invention by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles, and languages of software code, and other ways of configuring code to perform tasks consistent with the invention, will not depart from the spirit and scope of the invention.
The present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
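As a non-normative illustration of the signaling behavior claimed below, the following decoder-side sketch shows a sub-block prediction flag that is not parsed from the bitstream, and is inferred to be false, when the current picture is the only reference picture of the current block. The function and parameter names are hypothetical and are not drawn from any standard's syntax tables.

```python
from typing import Callable


def parse_subblock_flag(read_bit: Callable[[], int],
                        current_picture_is_only_reference: bool) -> bool:
    """Return the sub-block prediction mode flag for the current block.

    When the current picture is the only reference picture, the flag is
    absent from the bitstream and inferred to be false (mode disabled);
    otherwise it is parsed normally.
    """
    if current_picture_is_only_reference:
        return False
    return bool(read_bit())
```

Because the encoder is constrained in the same way, the decoder never consumes a bit for this flag in the CPR-only case, keeping the bitstreams of the two sides aligned.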

Claims (8)

1. A method for syntax processing in a video encoding system and a video decoding system, wherein a current picture reference codec tool and a sub-block prediction codec mode are enabled, the method comprising:
determining at least one reference picture for a current block in a current picture; and
when the current picture is the only reference picture for the current block:
disabling the sub-block prediction codec mode; and
encoding the current block at an encoder side or decoding the current block at a decoder side with the sub-block prediction codec mode disabled.
2. The method for syntax processing in a video encoding system and a video decoding system according to claim 1, wherein when the current picture is the only reference picture for the current block, a syntax element used to indicate the sub-block prediction codec mode need not be signaled in a bitstream at the encoder side or parsed from the bitstream at the decoder side for the current block.
3. The method for syntax processing in a video encoding system and a video decoding system according to claim 1, wherein a syntax element used to indicate the sub-block prediction codec mode is inferred to be false when the current picture is the only reference picture for the current block.
4. The method for syntax processing in a video encoding system and a video decoding system according to claim 1, wherein when the current picture is the only reference picture for the current block, a syntax element used to indicate the sub-block prediction codec mode is constrained to indicate that the sub-block prediction codec mode is disabled.
5. The method for syntax processing in a video encoding system and a video decoding system according to claim 1, wherein the sub-block prediction codec mode is associated with an affine prediction codec tool or a sub-block based temporal motion vector prediction codec tool.
6. An apparatus for syntax processing in a video encoding system and a video decoding system, wherein a current picture reference codec tool and a sub-block prediction codec mode are enabled, the apparatus comprising one or more electronic circuits or one or more processors configured to:
determine at least one reference picture for a current block in a current picture; and
when the current picture is the only reference picture for the current block:
disable the sub-block prediction codec mode; and
encode the current block at an encoder side or decode the current block at a decoder side with the sub-block prediction codec mode disabled.
7. A method for syntax processing in a video encoding system and a video decoding system, wherein a current picture reference codec tool and a sub-block prediction codec mode are enabled, the method comprising:
determining at least one reference picture for a current block in a current picture; and
when the current picture is the only reference picture for the current block:
converting a derived motion vector associated with a sub-block of the current block into an integer motion vector; and
encoding a current motion vector of the current block at an encoder side or decoding the current motion vector of the current block at a decoder side using the integer motion vector as a motion vector predictor.
8. An apparatus for syntax processing in a video encoding system and a video decoding system, wherein a current picture reference codec tool and a sub-block prediction codec mode are enabled, the apparatus comprising one or more electronic circuits or one or more processors configured to:
determine at least one reference picture for a current block in a current picture; and
when the current picture is the only reference picture for the current block:
convert a derived motion vector associated with a sub-block of the current block into an integer motion vector; and
encode a current motion vector of the current block at an encoder side or decode the current motion vector of the current block at a decoder side using the integer motion vector as a motion vector predictor.
CN202410210285.5A 2018-02-12 2019-02-11 Method and apparatus for syntax processing for video codec system Pending CN118101969A (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US201862629204P 2018-02-12 2018-02-12
US62/629,204 2018-02-12
US201862742474P 2018-10-08 2018-10-08
US62/742,474 2018-10-08
US201862747170P 2018-10-18 2018-10-18
US62/747,170 2018-10-18
CN201980012198.6A CN111869216B (en) 2018-02-12 2019-02-11 Method and apparatus for syntax processing for video codec system
PCT/CN2019/074783 WO2019154417A1 (en) 2018-02-12 2019-02-11 Method and apparatus of current picture referencing for video coding using adaptive motion vector resolution and sub-block prediction mode

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201980012198.6A Division CN111869216B (en) 2018-02-12 2019-02-11 Method and apparatus for syntax processing for video codec system

Publications (1)

Publication Number Publication Date
CN118101969A true CN118101969A (en) 2024-05-28

Family

ID=67548807

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201980012198.6A Active CN111869216B (en) 2018-02-12 2019-02-11 Method and apparatus for syntax processing for video codec system
CN202410210285.5A Pending CN118101969A (en) 2018-02-12 2019-02-11 Method and apparatus for syntax processing for video codec system

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201980012198.6A Active CN111869216B (en) 2018-02-12 2019-02-11 Method and apparatus for syntax processing for video codec system

Country Status (8)

Country Link
US (1) US11109056B2 (en)
KR (1) KR102483602B1 (en)
CN (2) CN111869216B (en)
AU (1) AU2019217409B2 (en)
CA (1) CA3090562C (en)
GB (1) GB2585304B (en)
TW (1) TWI692973B (en)
WO (1) WO2019154417A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020169103A1 (en) 2019-02-24 2020-08-27 Beijing Bytedance Network Technology Co., Ltd. Independent coding of palette mode usage indication
US11025948B2 (en) * 2019-02-28 2021-06-01 Tencent America LLC Method and apparatus for motion prediction in video coding
US11343525B2 (en) * 2019-03-19 2022-05-24 Tencent America LLC Method and apparatus for video coding by constraining sub-block motion vectors and determining adjustment values based on constrained sub-block motion vectors
US11109041B2 (en) * 2019-05-16 2021-08-31 Tencent America LLC Method and apparatus for video coding
CN114175662B (en) 2019-07-20 2023-11-24 北京字节跳动网络技术有限公司 Condition dependent codec with palette mode usage indication
CN117221536A (en) 2019-07-23 2023-12-12 北京字节跳动网络技术有限公司 Mode determination for palette mode coding and decoding
WO2021018166A1 (en) 2019-07-29 2021-02-04 Beijing Bytedance Network Technology Co., Ltd. Scanning order improvements for palette mode coding
WO2021052506A1 (en) * 2019-09-22 2021-03-25 Beijing Bytedance Network Technology Co., Ltd. Transform unit based combined inter intra prediction
US11184632B2 (en) 2020-01-20 2021-11-23 Tencent America LLC Method and apparatus for palette based coding mode under local dual tree structure
US11582491B2 (en) * 2020-03-27 2023-02-14 Qualcomm Incorporated Low-frequency non-separable transform processing in video coding

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100585710B1 (en) 2002-08-24 2006-06-02 엘지전자 주식회사 Variable length coding method for moving picture
US20120287999A1 (en) * 2011-05-11 2012-11-15 Microsoft Corporation Syntax element prediction in error correction
KR101444675B1 (en) 2011-07-01 2014-10-01 에스케이 텔레콤주식회사 Method and Apparatus for Encoding and Decoding Video
CN104768015B (en) 2014-01-02 2018-10-26 寰发股份有限公司 Method for video coding and device
US10531116B2 (en) * 2014-01-09 2020-01-07 Qualcomm Incorporated Adaptive motion vector resolution signaling for video coding
KR102413529B1 (en) * 2014-06-19 2022-06-24 마이크로소프트 테크놀로지 라이센싱, 엘엘씨 Unified intra block copy and inter prediction modes
WO2016119104A1 (en) 2015-01-26 2016-08-04 Mediatek Inc. Motion vector regularization
US20160127731A1 (en) * 2014-11-03 2016-05-05 National Chung Cheng University Macroblock skip mode judgement method for encoder
KR20170084251A (en) * 2014-11-20 2017-07-19 에이치에프아이 이노베이션 인크. Method of motion vector and block vector resolution control
US20160337662A1 (en) 2015-05-11 2016-11-17 Qualcomm Incorporated Storage and signaling resolutions of motion vectors
GB2539213A (en) * 2015-06-08 2016-12-14 Canon Kk Schemes for handling an AMVP flag when implementing intra block copy coding mode
EP3449630B1 (en) * 2016-05-28 2024-07-10 Mediatek Inc. Method and apparatus of current picture referencing for video coding
EP3264769A1 (en) * 2016-06-30 2018-01-03 Thomson Licensing Method and apparatus for video coding with automatic motion information refinement
WO2020058886A1 (en) * 2018-09-19 2020-03-26 Beijing Bytedance Network Technology Co., Ltd. Fast algorithms for adaptive motion vector resolution in affine mode
CN113412623A (en) * 2019-01-31 2021-09-17 北京字节跳动网络技术有限公司 Recording context of affine mode adaptive motion vector resolution

Also Published As

Publication number Publication date
TWI692973B (en) 2020-05-01
GB202013536D0 (en) 2020-10-14
GB2585304B (en) 2023-03-08
CN111869216A (en) 2020-10-30
AU2019217409A1 (en) 2020-09-17
TW201935930A (en) 2019-09-01
US11109056B2 (en) 2021-08-31
KR102483602B1 (en) 2022-12-30
KR20200117017A (en) 2020-10-13
CA3090562C (en) 2023-03-14
AU2019217409B2 (en) 2021-04-22
WO2019154417A1 (en) 2019-08-15
CN111869216B (en) 2024-05-28
CA3090562A1 (en) 2019-08-15
GB2585304A (en) 2021-01-06
US20200374545A1 (en) 2020-11-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination