CN115136601A - Geometric segmentation mode - Google Patents

Geometric partitioning mode

Info

Publication number
CN115136601A
Authority
CN
China
Prior art keywords
mode
video
video block
partitioning
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180013097.8A
Other languages
Chinese (zh)
Inventor
邓智玭
张莉
张凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Douyin Vision Co Ltd
ByteDance Inc
Original Assignee
Douyin Vision Co Ltd
ByteDance Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Douyin Vision Co Ltd, ByteDance Inc filed Critical Douyin Vision Co Ltd
Publication of CN115136601A publication Critical patent/CN115136601A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/137Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/14Coding unit complexity, e.g. amount of activity or edge presence estimation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A geometric partitioning mode is described. In a representative aspect, a method of video processing includes determining, for a conversion between a current video block of a video and a bitstream of the video, that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises a full set of partitioning modes; determining one or more partitioning modes for the current video block based on a partitioning mode index included in the bitstream, wherein the partitioning mode index corresponds to different partitioning modes for different video blocks; and performing the conversion based on the one or more partitioning modes.

Description

Geometric partitioning mode
Cross Reference to Related Applications
This application claims the priority and benefit of International Patent Application No. PCT/CN2020/074499, filed on February 7, 2020, in accordance with applicable patent law and/or the rules pursuant to the Paris Convention. The entire disclosure of International Patent Application No. PCT/CN2020/074499 is incorporated by reference as part of the disclosure of this application.
Technical Field
This document relates to video codec techniques, systems, and devices.
Background
Digital video accounts for the largest share of bandwidth usage on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video usage is expected to continue to grow.
Disclosure of Invention
The invention relates to video coding and decoding technology, and in particular to inter prediction and related techniques in video coding. The invention can be applied to existing video coding standards, such as HEVC, as well as to standards to be finalized (Versatile Video Coding). The invention may also be applied to future video coding standards or video codecs.
In one representative aspect, the disclosed technology can be used to provide a method for video processing. The method includes performing a conversion between a current block of video and a bitstream representation of the video, wherein a codec mode of the current block partitions the current block into two or more subregions including at least one non-rectangular or non-square subregion, wherein the bitstream representation includes signaling associated with the codec mode, and wherein the signaling corresponds to a set of parameters having a first set of values for the current block of video and a second set of values for subsequent blocks.
In another representative aspect, the disclosed technology can be used to provide a method for video processing. The method includes performing a conversion between a current block of video and a bitstream representation of the video, wherein a codec mode of the current block partitions the current block into two or more sub-regions including at least one non-rectangular or non-square sub-region, wherein the codec mode is configurable using a plurality of parameter sets, and wherein the bitstream representation includes signaling for a subset of the plurality of parameter sets, and wherein the parameter sets include angles, displacements, and distances associated with the at least one non-rectangular or non-square sub-region.
In yet another representative aspect, the disclosed technology can be used to provide a method for video processing. The method comprises the following steps: for a conversion between a current block of video and a bitstream representation of the video, making a determination regarding enablement of a first codec mode and a second codec mode different from the first codec mode, wherein the first codec mode partitions the current block into two or more sub-regions including at least one non-rectangular or non-square sub-region; and performing the conversion based on the determination.
In yet another representative aspect, the disclosed technology can be used to provide a method for video processing. The method comprises the following steps: for a conversion between a current video block of a video and a bitstream of the video, determining that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises a full set of partitioning modes; determining one or more partitioning modes for the current video block based on a partitioning mode index included in the bitstream, wherein the partitioning mode index corresponds to different partitioning modes for different video blocks; and performing the conversion based on the one or more partitioning modes.
In yet another representative aspect, the disclosed technology can be used to provide a method for video processing. The method comprises the following steps: for a conversion between a current video block of a video and a bitstream of the current video block, determining that the video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises a full set of partitioning modes, and each partitioning mode is associated with a set of parameters comprising at least one of an angle, a distance, and/or a displacement; deriving a subset of the partitioning modes or parameters from the full set of partitioning modes or parameters; and performing the conversion based on the subset of partitioning modes or parameters.
In yet another representative aspect, the disclosed technology can be used to provide a method for video processing. The method comprises the following steps: for a conversion between a video block of a video and a bitstream of the video block, determining an enablement of a geometric partitioning mode and a second codec mode different from the geometric partitioning mode for the video block; and performing the conversion based on the determination.
In yet another representative aspect, the disclosed technology can be used to provide a method for video processing. The method comprises the following steps: for a conversion between a video block of a video and a bitstream of the video block, determining a deblocking process associated with the video block based on whether the video block is coded in a geometric partitioning mode and/or a color format of the video block; and performing the conversion based on the deblocking process.
In yet another representative aspect, a method for storing a bitstream of a video is disclosed. The method comprises the following steps: for a conversion between a current video block of a video and a bitstream of the video, determining that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises a full set of partitioning modes; determining one or more partitioning modes for the current video block based on a partitioning mode index included in the bitstream, wherein the partitioning mode index corresponds to different partitioning modes for different video blocks; generating the bitstream from the video block using the one or more partitioning modes; and storing the bitstream in a non-transitory computer-readable recording medium.
In yet another example aspect, a video encoder apparatus is disclosed that includes a processor configured to implement the above-described method.
In yet another example aspect, a video decoder apparatus is disclosed that includes a processor configured to implement the above-described method.
In yet another example aspect, a computer-readable medium is disclosed. The computer readable medium has code stored thereon. The code, when executed by a processor, causes the processor to implement the above-described method.
These and other aspects are described in this document.
Drawings
FIG. 1 shows an example of Triangular Prediction Mode (TPM).
Fig. 2 shows an example of a geometric codec mode (GEO) partition boundary description.
Fig. 3A illustrates an example of edges supported in GEO.
Fig. 3B shows the geometric relationship between a given pixel location and two edges.
Fig. 4 shows examples of different angles of GEO and their corresponding aspect ratios.
Fig. 5 shows an example of the angular distribution of 64 GEO modes.
FIG. 6 shows an example of GEO/TPM partition boundaries for angleIdx 0 through 31 in a counter-clockwise direction.
FIG. 7 shows another example of GEO/TPM partition boundaries for angleIdx 0 through 31 in a clockwise direction.
Fig. 8 shows a flow diagram of an example method of video processing.
Fig. 9 is a block diagram of an example of a video processing apparatus.
Fig. 10 is a block diagram illustrating an example video codec system.
Fig. 11 is a block diagram illustrating an example encoder.
Fig. 12 is a block diagram illustrating an example decoder.
FIG. 13 is a block diagram of an example video processing system in which the disclosed techniques may be implemented.
Fig. 14 shows a flow diagram of an example method of video processing.
Fig. 15 shows a flow diagram of an example method of video processing.
Fig. 16 shows a flow diagram of an example method of video processing.
Fig. 17 shows a flow diagram of an example method of video processing.
Fig. 18 shows a flow diagram of an example method of video processing.
Detailed Description
Due to the increasing demand for higher-resolution video, video coding methods and techniques are ubiquitous in modern technology. Video codecs typically include electronic circuits or software that compress or decompress digital video, and they are continually being improved to provide higher coding efficiency. A video codec converts uncompressed video into a compressed format, and vice versa. There are complex relationships between video quality, the amount of data used to represent the video (determined by the bit rate), the complexity of the encoding and decoding algorithms, susceptibility to data loss and errors, ease of editing, random access, and end-to-end delay (latency). The compressed format typically conforms to a standard video compression specification, such as the High Efficiency Video Coding (HEVC) standard (also known as H.265 or MPEG-H Part 2), the Versatile Video Coding standard under development, or other current and/or future video coding standards.
Embodiments of the disclosed techniques may be applied to existing video coding standards (e.g., HEVC, H.265) and future standards to improve runtime performance. This document relates specifically to the merge mode in video coding. Section headings are used in this document to enhance readability of the description and do not in any way limit the discussion or the embodiments (and/or implementations) to the corresponding sections.
1. Background of the invention
Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC standards. Since H.262, video coding standards have been based on a hybrid video coding structure in which temporal prediction plus transform coding is employed. To explore future video coding technologies beyond HEVC, VCEG and MPEG jointly founded the Joint Video Exploration Team (JVET) in 2015. Since then, JVET has adopted many new methods and put them into reference software named the Joint Exploration Model (JEM). JVET meetings are held once a quarter, and the goal of the new coding standard is a 50% bitrate reduction compared to HEVC. The new video coding standard was officially named Versatile Video Coding (VVC) at the April 2018 JVET meeting, and the first version of the VVC Test Model (VTM) was released at that time. With the continuing effort toward VVC standardization, new coding techniques have been adopted into the VVC standard at every JVET meeting. The VVC working draft and the VTM test model are updated after each meeting. The VVC project targeted technical completion (FDIS) at the July 2020 meeting.
1.1. Geometric partitioning (GEO) for inter-frame prediction
The following descriptions are taken from JVET-P0884, JVET-P0107, JVET-P0304, JVET-P0264, JVET-Q0079, JVET-Q0059, JVET-Q0077 and JVET-Q0309.
Geometric merge mode (GEO) was proposed at the 15th JVET meeting in Gothenburg as an extension of the existing Triangle Prediction Mode (TPM). At the 16th JVET meeting in Geneva, the simpler GEO design of JVET-P0884 was selected as the CE anchor for further study. At the 17th JVET meeting in Brussels, GEO was adopted into VTM8 to replace the TPM mode of VTM7, and the GEO mode was renamed GPM mode in VVC WD8.
FIG. 1 illustrates the TPM in VTM-6.0 and the additional shapes proposed for the GEO inter block.
The partition boundary of the geometric merge mode is described by an angle φ_i and a distance offset ρ_i, as shown in Fig. 2. The angle φ_i represents a quantized angle between 0 and 360 degrees, and the distance offset ρ_i represents a quantized offset of the maximum distance ρ_max. In addition, partitioning directions that overlap with the binary tree partitions and the TPM partitions are excluded.
In JVET-P0884, GEO is applied to block sizes no smaller than 8×8, and for each block size there are 82 different ways of partitioning, distinguished by 24 angles and 4 edges relative to the center of the CU. As Fig. 3A shows, starting from Edge0, which passes through the CU center, the 4 edges are evenly distributed along the normal-vector direction within the CU. Each GEO partitioning mode (i.e., a pair of an angle index and an edge index) is assigned a pixel-adaptive weight table to blend the samples of the two partitions, where the sample weight values range from 0 to 8 and are determined by the L2 distance from the pixel's center position to the edge. A unit-gain constraint is followed when assigning weight values: when a small weight value is assigned to one GEO partition, a large complementary weight value is assigned to the other partition, summing to 8.
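The complementary, sum-to-8 blending just described can be sketched as follows (a minimal illustration; the function name and the round-to-nearest choice are assumptions, not part of the proposal):

```python
def blend_sample(p0, p1, w0):
    # w0: blending weight in [0, 8] assigned to partition 0;
    # partition 1 implicitly receives the complementary weight 8 - w0,
    # so the two weights always sum to 8 (unit-gain constraint).
    # "+ 4" rounds the 3-bit fixed-point result to nearest.
    return (w0 * p0 + (8 - w0) * p1 + 4) >> 3
```

With w0 = 8 the sample comes entirely from partition 0, with w0 = 0 entirely from partition 1, and intermediate values blend smoothly across the partition boundary.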
The weight value of each pixel is computed in two steps: (a) computing the displacement from the pixel position to a given edge, and (b) mapping the computed displacement to a weight value through a predefined lookup table. The displacement from a pixel position (x, y) to a given edge Edge_i is computed, in practice, by taking the displacement from (x, y) to Edge0 and subtracting from it the distance ρ_i between Edge0 and Edge_i. Fig. 3B shows the geometric relationship between (x, y) and the edges. Specifically, the displacement from (x, y) to Edge_i can be formulated as follows:

distFromEdge_i(x, y) = x·cos φ + y·sin φ − ρ_i        (6)

The value of ρ_i is a function of the maximum length of the normal vector (denoted by ρ_max) and the edge index i, i.e.:

ρ_i = (i / N)·(ρ_max − 1)        (8)

where N is the number of edges supported by GEO and the "1" prevents the last edge Edge_{N−1} from getting too close to the CU corners for some angle indices. Substituting equation (8) into (6), the displacement from each pixel (x, y) to a given Edge_i can be computed. In short, the term x·cos φ + y·sin φ is denoted wIdx(x, y). ρ_i needs to be computed once per CU, and wIdx(x, y) once per sample, where multiplication is involved.
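The per-sample displacement computation above can be sketched in floating point (a minimal illustration; the function names, the degree-based angle argument, and center-relative coordinates are assumptions):

```python
import math

def widx(x, y, phi_deg):
    # Displacement from (x, y) to Edge0 through the CU center:
    # wIdx(x, y) = x*cos(phi) + y*sin(phi), computed once per sample.
    phi = math.radians(phi_deg)
    return x * math.cos(phi) + y * math.sin(phi)

def dist_from_edge(x, y, phi_deg, rho_i):
    # Equation (6): displacement to Edge_i is the displacement to
    # Edge0 minus the distance rho_i between Edge0 and Edge_i.
    return widx(x, y, phi_deg) - rho_i
```

In the actual proposals this floating-point form is replaced by integer arithmetic with lookup tables, as described in the sections that follow.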
1.1.1.JVET-P0884
On top of CE4-1.14 of the 16th JVET meeting in Geneva, JVET-P0884 adds the simplifications of the proposed slope-based GEO version 2 of JVET-P0107, and of JVET-P0304 and JVET-P0264 test 1.
a) In this joint contribution, the GEO angles are defined by slopes (tangents that are powers of 2), as in JVET-P0107 and JVET-P0264. The slopes used in this proposal are (1, 1/2, 1/4, 4, 2). In this case, if the blending mask is computed on the fly, the multiplications can be replaced by shift operations.
b) The ρ calculation is replaced by offsetX and offsetY, as described in JVET-P0304. In this case, only 24 blending masks need to be stored, without computing the blending masks on the fly.
1.1.2.JVET-P0107
Based on this slope-based GEO version 2, the Dis[ ] lookup table is illustrated in Table 1.

TABLE 1: 2-bit Dis[ ] lookup table for slope-based GEO

(The table entries are reproduced as images in the original and are not recoverable here.)
With slope-based GEO version 2, the computational complexity of the GEO blending-mask derivation is reduced to multiplications by shifts of at most 2 bits, plus additions. There are no partitions different from the TPM. Furthermore, the rounding operation of distFromLine is removed so that the blending mask can be stored more easily. This bug fix ensures that the sample weights repeat in a shifted manner in each row or column.
1.1.3.JVET-P0264
In JVET-P0264, the angles in GEO are replaced with angles whose tangents are powers of 2. Since the tangent of each proposed angle is a power of 2, most multiplications can be replaced by shifts. In addition, the weight values for these angles can be implemented by repeated row-by-row or column-by-column shifting; with the proposed angles, only one row or one column needs to be stored per block size and per partitioning mode.
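Because each tangent is a power of 2, a product such as y·tan φ reduces to a bit shift. A hedged sketch (the function name and the log2 parameterization are assumptions):

```python
def mul_by_pow2_tan(y, log2_tan):
    # tan(phi) = 2**log2_tan, with log2_tan in {-2, -1, 0, 1, 2}
    # for the slopes 1/4, 1/2, 1, 2 and 4; the multiplication
    # y * tan(phi) becomes a left or right shift.
    if log2_tan >= 0:
        return y << log2_tan
    return y >> -log2_tan
```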
1.1.4.JVET-P0304
In JVET-P0304 it is proposed to derive the blending weights and the motion-field storage masks for all block sizes and partitioning modes from two sets of predefined masks: one set for blending-weight derivation and the other for motion-field storage masks. There are 16 masks in total in each set. Each mask for each angle is calculated using the same equations as in GEO, with the block width and block height set to 256 and the displacement set to 0. For a block of size W×H with angle φ and distance ρ, the blending weights of the luma samples are directly cropped from the predefined mask, with the offset calculated as follows:

The variables offsetX and offsetY are calculated as follows:

(The offset formulas are given as images in the original and are not recoverable here.)

where g_sampleWeight_L[ ] is the predefined mask of blending weights.
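Since the per-block weights are simply cropped out of a 256×256 predefined mask, the cropping step can be sketched as follows (the function name is an assumption, and the offset computation itself is not reproduced because the original formulas survive only as images):

```python
def crop_weights(mask, w, h, offset_x, offset_y):
    # mask: a predefined 256x256 blending-weight mask (one per angle,
    # computed once with block size 256x256 and displacement 0).
    # A W x H block reads its weights as a window of the mask at
    # (offset_x, offset_y); no per-block weight computation is needed.
    return [row[offset_x:offset_x + w]
            for row in mask[offset_y:offset_y + h]]
```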
1.1.5.JVET-Q0079
A simplified GEO design was proposed in JVET-P0884/JVET-P0885 at the JVET-P meeting and adopted as the common base for the CE4 core experiment. In this common base, the GEO mode is applied to merge blocks whose width and height are both greater than or equal to 8. When a block is coded in GEO mode, an index is signalled to indicate which of the 82 partitioning modes is used to divide the block into two partitions. Each partition performs inter prediction using its own motion vector. After each partition is predicted, the sample values along the partition edge are adjusted using a blending process with weights. The result is the prediction signal for the whole block, and the transform and quantization processes are applied to the whole block as in other prediction modes.
The weights of the luminance samples used in the blending process are calculated as follows:
weightIdx = (((x + offsetX) << 1) + 1) * Dis[displacementX] + (((y + offsetY) << 1) + 1) * Dis[displacementY] - rho
weightIdxAbs = Clip3(0, 26, abs(weightIdx))
sampleWeight = weightIdx <= 0 ? GeoFilter[weightIdxAbs] : 8 - GeoFilter[weightIdxAbs]
Dis[ ] is a lookup table with 24 entries, and the possible output values are 0, 1, 2 and 4. GeoFilter[ ] is a lookup table with 27 entries. The variables rho, offsetX and offsetY are pre-calculated based on:
rho=(Dis[displacementX]<<8)+(Dis[displacementY]<<8)
(The formulas for offsetX, offsetY and the angle are given as images in the original and are not recoverable here.)
where φ and ρ represent the angle and the distance derived from the lookup tables using the signalled index.
The weights of the chroma samples are subsampled from the luma weights. The motion storage masks are derived independently using the same weight-derivation method. More details on this common base can be found in JVET-P0884/JVET-P0885.
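The luma weight derivation quoted above can be transcribed directly (a sketch: the table contents passed in below are illustrative stand-ins, not the normative Dis[ ] and GeoFilter[ ] values):

```python
def clip3(lo, hi, v):
    # Clip3(x, y, z) as used in the working-draft text.
    return max(lo, min(hi, v))

def luma_sample_weight(x, y, offset_x, offset_y, dis_x, dis_y, rho, geo_filter):
    # dis_x / dis_y stand for Dis[displacementX] / Dis[displacementY];
    # geo_filter stands for the 27-entry GeoFilter[] table.
    weight_idx = ((((x + offset_x) << 1) + 1) * dis_x
                  + (((y + offset_y) << 1) + 1) * dis_y - rho)
    weight_idx_abs = clip3(0, 26, abs(weight_idx))
    if weight_idx <= 0:
        return geo_filter[weight_idx_abs]
    return 8 - geo_filter[weight_idx_abs]
```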
1.1.6.JVET-Q0059
At the JVET-Q meeting, the 64-mode geometric inter prediction design (i.e., JVET-Q0059) was adopted.
Since moving objects in natural video sequences are mostly oriented vertically, near-horizontal partitioning modes are used less often. In the proposed 64-mode GEO, the angles {5, 7, 17, 19} are removed. Fig. 5 shows the angle distribution of the 64-mode GEO.
Further, in the 82-mode GEO, the distances with distance index 2 for the horizontal angles {0, 12} and the vertical angles {6, 18} overlap with the ternary-tree partition boundaries. They are also removed in the proposed 64-mode GEO.
The total number of modes of the proposed method is thus 10 × 4 + 10 × 3 − 2 − 4 = 64.
Furthermore, since the GEO partitioning mode is signalled using a truncated binary code, the 64 modes are signalled most efficiently, with a fixed 6-bit truncated binary codeword.
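Truncated binary degenerates to a fixed-length code when the alphabet size is a power of two, which is why 64 modes cost exactly 6 bits. A sketch of the generic encoder (the function name is an assumption):

```python
def truncated_binary(v, n):
    # Truncated binary codeword for symbol v in [0, n).
    k = n.bit_length() - 1      # floor(log2(n))
    u = (1 << (k + 1)) - n      # count of short, k-bit codewords
    if v < u:
        return format(v, '0{}b'.format(k))
    return format(v + u, '0{}b'.format(k + 1))
```

For n = 64 every codeword is exactly 6 bits; for a non-power-of-two alphabet such as n = 5 the codewords are 00, 01, 10, 110, 111.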
1.1.7. JVET-Q0077 and JVET-Q0309
GEO is disabled for blocks larger than 64×64, as well as for 64×8 and 8×64 blocks.
GEO/GPM Angle indexing and Angle size
In the latest VVC Working Draft 8, the GEO mode is renamed Geometric Partitioning Mode (GPM). The GEO/GPM angle index represents the partition boundary that divides a GEO/GPM block into two sub-regions, as shown in Fig. 6. As introduced by JVET-P0264, the tangent of a GEO/GPM angle is a power of 2 depending on the aspect ratio of the block. In JVET-Q2001-vB, the size of the GEO/GPM angle ranges from 0 to 352.87 degrees, and the associated GEO/GPM angle index ranges from 0 to 31, as shown in Fig. 6. The angle between the vertical direction (e.g., the direction overlapping the partition boundary of angleIdx equal to 0) and the specified GEO partition boundary is defined as the angle size of the GEO/GPM mode. The correspondence between the GEO/GPM angle size (angle size) and the GEO/GPM angle index (angleIdx) is given in Table 2.
Table 2: example of a relationship between an Angle index and an Angle dimension (counter-clockwise)
(The table is reproduced as an image in the original and its entries are not recoverable here.)
Specification of GEO/GPM in JVET-Q2001-vB
The following specification text is taken from the working draft provided in JVET-Q2001-vB.
7.3.10.5 Coding unit syntax

(The syntax table is reproduced as images in the original and is not recoverable here.)
7.3.10.7 Merge data syntax

(The syntax table is reproduced as an image in the original and is not recoverable here.)
sps_gpm_enabled_flag specifies whether geometric-partition-based motion compensation can be used for inter prediction. sps_gpm_enabled_flag equal to 0 specifies that the syntax shall be constrained such that geometric-partition-based motion compensation is not used in the CLVS, and that merge_gpm_partition_idx, merge_gpm_idx0 and merge_gpm_idx1 are not present in the coding unit syntax of the CLVS. sps_gpm_enabled_flag equal to 1 specifies that geometric-partition-based motion compensation can be used in the CLVS. When not present, the value of sps_gpm_enabled_flag is inferred to be equal to 0.
max_num_merge_cand_minus_max_num_gpm_cand specifies the maximum number of geometric partitioning merge mode candidates supported in the SPS, subtracted from MaxNumMergeCand.
If sps_gpm_enabled_flag is equal to 1 and MaxNumMergeCand is greater than or equal to 3, the maximum number of geometric partitioning merge mode candidates, MaxNumGpmMergeCand, is derived as follows:
if( sps_gpm_enabled_flag && MaxNumMergeCand >= 3 )
    MaxNumGpmMergeCand = MaxNumMergeCand - max_num_merge_cand_minus_max_num_gpm_cand
else if( sps_gpm_enabled_flag && MaxNumMergeCand == 2 )
    MaxNumGpmMergeCand = 2
else
    MaxNumGpmMergeCand = 0
The value of MaxNumGpmMergeCand shall be in the range of 2 to MaxNumMergeCand, inclusive.
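The derivation above can be transcribed as a small helper (a sketch; the function and argument names are assumptions, with delta standing for max_num_merge_cand_minus_max_num_gpm_cand):

```python
def max_num_gpm_merge_cand(sps_gpm_enabled_flag, max_num_merge_cand, delta):
    # Mirrors the three branches of the working-draft derivation.
    if sps_gpm_enabled_flag and max_num_merge_cand >= 3:
        return max_num_merge_cand - delta
    if sps_gpm_enabled_flag and max_num_merge_cand == 2:
        return 2
    return 0
```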
The variable MergeGpmFlag[x0][y0], which specifies whether geometric-partition-based motion compensation is used to generate the prediction samples of the current coding unit when decoding a B slice, is derived as follows:
- MergeGpmFlag[x0][y0] is set equal to 1 if all of the following conditions are true:
- sps_gpm_enabled_flag is equal to 1.
- slice_type is equal to B.
- general_merge_flag[x0][y0] is equal to 1.
- cbWidth is greater than or equal to 8.
- cbHeight is greater than or equal to 8.
- cbWidth is less than 8 × cbHeight.
- cbHeight is less than 8 × cbWidth.
- regular_merge_flag[x0][y0] is equal to 0.
- merge_subblock_flag[x0][y0] is equal to 0.
- ciip_flag[x0][y0] is equal to 0.
- Otherwise, MergeGpmFlag[x0][y0] is set equal to 0.
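The condition list above is a pure conjunction, which can be sketched as follows (the function and argument names are assumptions):

```python
def merge_gpm_flag(sps_gpm_enabled_flag, slice_type, general_merge_flag,
                   cb_width, cb_height, regular_merge_flag,
                   merge_subblock_flag, ciip_flag):
    # MergeGpmFlag is 1 only if every listed condition holds.
    return int(sps_gpm_enabled_flag == 1
               and slice_type == 'B'
               and general_merge_flag == 1
               and cb_width >= 8
               and cb_height >= 8
               and cb_width < 8 * cb_height
               and cb_height < 8 * cb_width
               and regular_merge_flag == 0
               and merge_subblock_flag == 0
               and ciip_flag == 0)
```

Note that a 64×8 block fails the cbWidth < 8 × cbHeight test, consistent with the JVET-Q0077/JVET-Q0309 restriction above.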
merge_gpm_partition_idx[x0][y0] specifies the partitioning shape of the geometric partitioning merge mode. The array indices x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture.
When merge_gpm_partition_idx[x0][y0] is not present, it is inferred to be equal to 0.
merge_gpm_idx0[x0][y0] specifies the first merging candidate index of the geometric-partition-based motion compensation candidate list, where x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture.
When merge_gpm_idx0[x0][y0] is not present, it is inferred to be equal to 0.
merge_gpm_idx1[x0][y0] specifies the second merging candidate index of the geometric-partition-based motion compensation candidate list, where x0, y0 specify the location (x0, y0) of the top-left luma sample of the considered coding block relative to the top-left luma sample of the picture.
When merge_gpm_idx1[x0][y0] is not present, it is inferred to be equal to 0.
8.5.7 Decoding process for geometric partitioning mode inter blocks
8.5.7.1 Overview
This process is invoked when decoding a coding unit with MergeGpmFlag[xCb][yCb] equal to 1.
The inputs to this process are:
- a luma location (xCb, yCb) specifying the top-left luma sample of the current coding block relative to the top-left luma sample of the current picture,
- a variable cbWidth specifying the width of the current coding block in luma samples,
- a variable cbHeight specifying the height of the current coding block in luma samples,
- the luma motion vectors mvA and mvB in 1/16 fractional-sample accuracy,
- the chroma motion vectors mvCA and mvCB,
- the reference indices refIdxA and refIdxB,
- the prediction list flags predListFlagA and predListFlagB.
The output of this process is:
-an (cbWidth)x(cbHeight) array predSamplesL of luma prediction samples,
-when ChromaArrayType is not equal to 0, an (cbWidth/SubWidthC)x(cbHeight/SubHeightC) array predSamplesCb of chroma prediction samples of the component Cb,
-when ChromaArrayType is not equal to 0, an (cbWidth/SubWidthC)x(cbHeight/SubHeightC) array predSamplesCr of chroma prediction samples of the component Cr.
Let predSamplesLAL and predSamplesLBL be (cbWidth)x(cbHeight) arrays of predicted luma sample values, and, when ChromaArrayType is not equal to 0, let predSamplesLACb, predSamplesLBCb, predSamplesLACr and predSamplesLBCr be (cbWidth/SubWidthC)x(cbHeight/SubHeightC) arrays of predicted chroma sample values.
predSamplesL, predSamplesCb and predSamplesCr are derived by the following ordered steps:
1. for N being each of a and B, the following applies:
-The reference picture, consisting of an ordered two-dimensional array refPicLNL of luma samples and two ordered two-dimensional arrays refPicLNCb and refPicLNCr of chroma samples, is derived by invoking the process specified in clause 8.5.6.2 with X set equal to predListFlagN and refIdxX set equal to refIdxN as inputs.
-The array predSamplesLNL is derived by invoking the fractional sample interpolation process specified in clause 8.5.6.3 with the luma location (xCb, yCb), the luma codec block width sbWidth set equal to cbWidth, the luma codec block height sbHeight set equal to cbHeight, the motion vector offset mvOffset set equal to (0, 0), the motion vector mvLX set equal to mvN, the reference array refPicLXL set equal to refPicLNL, the variable bdofFlag set equal to FALSE, the variable cIdx set equal to 0, RprConstraintsActive[X][refIdxLX], and RefPicScale[predListFlagN][refIdxN] as inputs.
-When ChromaArrayType is not equal to 0, the array predSamplesLNCb is derived by invoking the fractional sample interpolation process specified in clause 8.5.6.3 with the luma location (xCb, yCb), the codec block width sbWidth set equal to cbWidth/SubWidthC, the codec block height sbHeight set equal to cbHeight/SubHeightC, the motion vector offset mvOffset set equal to (0, 0), the motion vector mvLX set equal to mvCN, the reference array refPicLXCb set equal to refPicLNCb, the variable bdofFlag set equal to FALSE, the variable cIdx set equal to 1, RprConstraintsActive[X][refIdxLX], and RefPicScale[predListFlagN][refIdxN] as inputs.
-When ChromaArrayType is not equal to 0, the array predSamplesLNCr is derived by invoking the fractional sample interpolation process specified in clause 8.5.6.3 with the luma location (xCb, yCb), the codec block width sbWidth set equal to cbWidth/SubWidthC, the codec block height sbHeight set equal to cbHeight/SubHeightC, the motion vector offset mvOffset set equal to (0, 0), the motion vector mvLX set equal to mvCN, the reference array refPicLXCr set equal to refPicLNCr, the variable bdofFlag set equal to FALSE, the variable cIdx set equal to 2, RprConstraintsActive[X][refIdxLX], and RefPicScale[predListFlagN][refIdxN] as inputs.
2. The partition angle variable angleIdx and the distance variable distanceIdx of the geometric partitioning mode are set according to the value of merge_gpm_partition_idx[xCb][yCb], as specified in Table 36.
3. The prediction samples predSamplesL[xL][yL] (with xL = 0..cbWidth - 1 and yL = 0..cbHeight - 1) inside the current luma codec block are derived by invoking the weighted sample prediction process for geometric partitioning mode specified in clause 8.5.7.2 with the codec block width nCbW set equal to cbWidth, the codec block height nCbH set equal to cbHeight, the sample arrays predSamplesLAL and predSamplesLBL, and the variables angleIdx, distanceIdx and cIdx equal to 0 as inputs.
4. When ChromaArrayType is not equal to 0, the prediction samples predSamplesCb[xC][yC] (with xC = 0..cbWidth/SubWidthC - 1 and yC = 0..cbHeight/SubHeightC - 1) inside the current chroma component Cb codec block are derived by invoking the weighted sample prediction process for geometric partitioning mode specified in clause 8.5.7.2 with the codec block width nCbW set equal to cbWidth/SubWidthC, the codec block height nCbH set equal to cbHeight/SubHeightC, the sample arrays predSamplesLACb and predSamplesLBCb, and the variables angleIdx, distanceIdx and cIdx equal to 1 as inputs.
5. When ChromaArrayType is not equal to 0, the prediction samples predSamplesCr[xC][yC] (with xC = 0..cbWidth/SubWidthC - 1 and yC = 0..cbHeight/SubHeightC - 1) inside the current chroma component Cr codec block are derived by invoking the weighted sample prediction process for geometric partitioning mode specified in clause 8.5.7.2 with the codec block width nCbW set equal to cbWidth/SubWidthC, the codec block height nCbH set equal to cbHeight/SubHeightC, the sample arrays predSamplesLACr and predSamplesLBCr, and the variables angleIdx, distanceIdx and cIdx equal to 2 as inputs.
6. The motion vector storage procedure of the merge geometric partitioning mode specified in clause 8.5.7.3 is invoked, with the luma codec block position (xCb, yCb), the luma codec block width cbWidth, the luma codec block height cbHeight, the partitioning angle angleIdx and distance distanceIdx, the luma motion vectors mvA and mvB, the reference indices refIdxA and refIdxB, and the prediction list flags predlistflag a and predlistflag b as inputs.
TABLE 36 - Specification of angleIdx and distanceIdx based on merge_gpm_partition_idx
merge_gpm_partition_idx 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
angleIdx 0 0 2 2 2 2 3 3 3 3 4 4 4 4 5 5
distanceIdx 1 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1
merge_gpm_partition_idx 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
angleIdx 5 5 8 8 11 11 11 11 12 12 12 12 13 13 13 13
distanceIdx 2 3 1 3 0 1 2 3 0 1 2 3 0 1 2 3
merge_gpm_partition_idx 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
angleIdx 14 14 14 14 16 16 18 18 18 19 19 19 20 20 20 21
distanceIdx 0 1 2 3 1 3 1 2 3 1 2 3 1 2 3 1
merge_gpm_partition_idx 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
angleIdx 21 21 24 24 27 27 27 28 28 28 29 29 29 30 30 30
distanceIdx 2 3 1 3 1 2 3 1 2 3 1 2 3 1 2 3
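The Table 36 mapping can be expressed as a pair of lookup arrays. The following Python sketch is an illustrative transcription of the table, not normative text; the function name `gpm_split` is our own:

```python
# Table 36 transcribed as two 64-entry arrays, indexed by merge_gpm_partition_idx.
ANGLE_IDX = [
    0, 0, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5,
    5, 5, 8, 8, 11, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13,
    14, 14, 14, 14, 16, 16, 18, 18, 18, 19, 19, 19, 20, 20, 20, 21,
    21, 21, 24, 24, 27, 27, 27, 28, 28, 28, 29, 29, 29, 30, 30, 30,
]
DISTANCE_IDX = [
    1, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1,
    2, 3, 1, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3,
    0, 1, 2, 3, 1, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1,
    2, 3, 1, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3,
]

def gpm_split(merge_gpm_partition_idx: int) -> tuple:
    """Return the (angleIdx, distanceIdx) pair for a merge_gpm_partition_idx."""
    return ANGLE_IDX[merge_gpm_partition_idx], DISTANCE_IDX[merge_gpm_partition_idx]
```

For example, partition index 0 maps to angle 0 with distance 1, and partition index 63 maps to angle 30 with distance 3.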
8.5.7.2 Weighted sample prediction process for geometric partitioning mode
The inputs to this process are:
two variables nCbW and nCbH, specifying the width and height of the current codec block,
two arrays of (nCbW) x (nCbH) predSamplesLA and predSamplesLB,
the variable angleIdx, specifying the angular index of the geometric segmentation,
a variable distanceIdx specifying a distance index for the geometric partitioning,
the variable cIdx, specifying the color component index.
The output of this process is the (nCbW)x(nCbH) array pbSamples of prediction sample values.
The variables nW, nH, shift1, offset1, hwRatio, displacementX, displacementY, partFlip and shiftHor are derived as follows:
nW=(cIdx==0)?nCbW:nCbW*SubWidthC (1030)
nH=(cIdx==0)?nCbH:nCbH*SubHeightC (1031)
shift1=Max(5,17-BitDepth) (1032)
offset1=1<<(shift1-1) (1033)
hwRatio=nH/nW (1034)
displacementX=angleIdx (1035)
displacementY=(angleIdx+8)%32 (1036)
partFlip=(angleIdx>=13&&angleIdx<=27)?0:1 (1037)
shiftHor=(angleIdx%16==8||(angleIdx%16!=0&&hwRatio>0))?0:1 (1038)
the variables offsetX and offsetY are derived as follows:
-if shiftHor is equal to 0, the following applies:
offsetX=(-nW)>>1 (1039)
offsetY=((-nH)>>1)+(angleIdx<16?(distanceIdx*nH)>>3:-((distanceIdx*nH)>>3)) (1040)
else (shiftHor equal to 1), the following applies:
offsetX=((-nW)>>1)+(angleIdx<16?(distanceIdx*nW)>>3:-((distanceIdx*nW)>>3)) (1041)
offsetY=(-nH)>>1 (1042)
The prediction samples pbSamples[x][y] (with x = 0..nCbW - 1 and y = 0..nCbH - 1) are derived as follows:
the variables xL and yL are derived as follows:
xL=(cIdx==0)?x:x*SubWidthC (1043)
yL=(cIdx==0)?y:y*SubHeightC (1044)
-The variable wValue specifying the weight of the prediction sample is derived based on the array disLut specified in Table 37 as follows:
weightIdx=(((xL+offsetX)<<1)+1)*disLut[displacementX]+(((yL+offsetY)<<1)+1)*disLut[displacementY] (1045)
weightIdxL=partFlip?32+weightIdx:32-weightIdx (1046)
wValue=Clip3(0,8,(weightIdxL+4)>>3) (1047)
the predicted sample values are derived as follows:
pbSamples[x][y]=Clip3(0,(1<<BitDepth)-1,(predSamplesLA[x][y]*wValue+predSamplesLB[x][y]*(8-wValue)+offset1)>>shift1) (1048)
TABLE 37 - Specification of the geometric partitioning distance array disLut
idx 0 2 3 4 5 6 8 10 11 12 13 14
disLut[idx] 8 8 8 4 4 2 0 -2 -4 -4 -8 -8
idx 16 18 19 20 21 22 24 26 27 28 29 30
disLut[idx] -8 -8 -8 -4 -4 -2 0 2 4 4 8 8
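To make the derivation of clause 8.5.7.2 concrete, the following Python sketch computes wValue for the luma case (cIdx equal to 0) and applies the blend of equation 1048. This is an illustrative model under our own function names (`geo_weight`, `geo_blend`), not normative text; `DIS_LUT` holds the Table 37 entries for the displacement indices that actually occur.

```python
# disLut from Table 37 (entries for the displacement indices in use).
DIS_LUT = {0: 8, 2: 8, 3: 8, 4: 4, 5: 4, 6: 2, 8: 0, 10: -2, 11: -4,
           12: -4, 13: -8, 14: -8, 16: -8, 18: -8, 19: -8, 20: -4,
           21: -4, 22: -2, 24: 0, 26: 2, 27: 4, 28: 4, 29: 8, 30: 8}

def clip3(lo, hi, v):
    return lo if v < lo else hi if v > hi else v

def geo_weight(x, y, nCbW, nCbH, angleIdx, distanceIdx):
    """wValue (0..8) for sample (x, y) of a luma block, per equations 1030-1047."""
    nW, nH = nCbW, nCbH                        # cIdx == 0, so no chroma scaling
    hwRatio = nH // nW
    displacementX = angleIdx
    displacementY = (angleIdx + 8) % 32
    partFlip = 0 if 13 <= angleIdx <= 27 else 1
    shiftHor = 0 if (angleIdx % 16 == 8
                     or (angleIdx % 16 != 0 and hwRatio > 0)) else 1
    if shiftHor == 0:
        offsetX = (-nW) >> 1
        offsetY = ((-nH) >> 1) + ((distanceIdx * nH) >> 3 if angleIdx < 16
                                  else -((distanceIdx * nH) >> 3))
    else:
        offsetX = ((-nW) >> 1) + ((distanceIdx * nW) >> 3 if angleIdx < 16
                                  else -((distanceIdx * nW) >> 3))
        offsetY = (-nH) >> 1
    weightIdx = ((((x + offsetX) << 1) + 1) * DIS_LUT[displacementX]
                 + (((y + offsetY) << 1) + 1) * DIS_LUT[displacementY])
    weightIdxL = 32 + weightIdx if partFlip else 32 - weightIdx
    return clip3(0, 8, (weightIdxL + 4) >> 3)

def geo_blend(pA, pB, wValue, bit_depth=10):
    """Blend two intermediate-precision prediction samples per equation 1048."""
    shift1 = max(5, 17 - bit_depth)
    offset1 = 1 << (shift1 - 1)
    return clip3(0, (1 << bit_depth) - 1,
                 (pA * wValue + pB * (8 - wValue) + offset1) >> shift1)
```

For an 8x8 block with angleIdx equal to 0 and distanceIdx equal to 1, the weight ramps from 0 at the left edge to 8 at the right edge, as expected for a near-vertical partition boundary.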
8.5.7.3 Motion vector storing process for geometric partitioning mode
This process is invoked when decoding a codec unit for which MergeGpmFlag[xCb][yCb] is equal to 1.
The inputs to this process are:
-a luminance position (xCb, yCb) specifying an upper left sample of the current codec block relative to an upper left luminance sample of the current picture,
a variable cbWidth specifying the width of the current codec block in luminance samples,
a variable cbHeight specifying the height of the current codec block in luma samples,
the variable angleIdx, specifying the angular index of the geometric segmentation,
a variable distanceIdx specifying a distance index for the geometric partitioning,
1/16 fractional sample precision luminance motion vectors mvA and mvB,
reference indices refIdxA and refIdxB,
prediction list flags predlistflag a and predlistflag b.
The variables numSbX and numSbY, specifying the number of 4x4 blocks of the current codec block in the horizontal and vertical direction, are set equal to cbWidth>>2 and cbHeight>>2, respectively.
The variables hwRatio, displacementX, displacementY, partIdx and shiftHor are derived as follows:
hwRatio=cbHeight/cbWidth (1049)
displacementX=angleIdx (1050)
displacementY=(angleIdx+8)%32 (1051)
partIdx=(angleIdx>=13&&angleIdx<=27)?0:1 (1052)
shiftHor=(angleIdx%16==8||(angleIdx%16!=0&&hwRatio>0))?0:1 (1053)
the variables offsetX and offsetY are derived as follows:
-if shiftHor is equal to 0, the following applies:
offsetX=(-cbWidth)>>1 (1054)
offsetY=((-cbHeight)>>1)+(angleIdx<16?(distanceIdx*cbHeight)>>3:-((distanceIdx*cbHeight)>>3)) (1055)
else (shiftHor equals 1), the following applies:
offsetX=((-cbWidth)>>1)+(angleIdx<16?(distanceIdx*cbWidth)>>3:-((distanceIdx*cbWidth)>>3)) (1056)
offsetY=(-cbHeight)>>1 (1057)
For each 4x4 sub-block at sub-block index (xSbIdx, ySbIdx) (with xSbIdx = 0..numSbX - 1 and ySbIdx = 0..numSbY - 1), the following applies:
-The variable motionIdx is calculated based on the array disLut specified in Table 37 as follows:
motionIdx=(((4*xSbIdx+offsetX)<<1)+5)*disLut[displacementX]+(((4*ySbIdx+offsetY)<<1)+5)*disLut[displacementY] (1058)
the variable sType is derived as follows:
sType=abs(motionIdx)<32?2:(motionIdx<=0?(1-partIdx):partIdx) (1059)
depending on the value of sType, the following assignments are made:
if sType is equal to 0, then the following applies:
predFlagL0=(predListFlagA==0)?1:0 (1060)
predFlagL1=(predListFlagA==0)?0:1 (1061)
refIdxL0=(predListFlagA==0)?refIdxA:-1 (1062)
refIdxL1=(predListFlagA==0)?-1:refIdxA (1063)
mvL0[0]=(predListFlagA==0)?mvA[0]:0 (1064)
mvL0[1]=(predListFlagA==0)?mvA[1]:0 (1065)
mvL1[0]=(predListFlagA==0)?0:mvA[0] (1066)
mvL1[1]=(predListFlagA==0)?0:mvA[1] (1067)
-Otherwise, if sType is equal to 1 or (sType is equal to 2 and predListFlagA+predListFlagB is not equal to 1), the following applies:
predFlagL0=(predListFlagB==0)?1:0 (1068)
predFlagL1=(predListFlagB==0)?0:1 (1069)
refIdxL0=(predListFlagB==0)?refIdxB:-1 (1070)
refIdxL1=(predListFlagB==0)?-1:refIdxB (1071)
mvL0[0]=(predListFlagB==0)?mvB[0]:0 (1072)
mvL0[1]=(predListFlagB==0)?mvB[1]:0 (1073)
mvL1[0]=(predListFlagB==0)?0:mvB[0] (1074)
mvL1[1]=(predListFlagB==0)?0:mvB[1] (1075)
-Otherwise (sType is equal to 2 and predListFlagA+predListFlagB is equal to 1), the following applies:
predFlagL0=1 (1076)
predFlagL1=1 (1077)
refIdxL0=(predListFlagA==0)?refIdxA:refIdxB (1078)
refIdxL1=(predListFlagA==0)?refIdxB:refIdxA (1079)
mvL0[0]=(predListFlagA==0)?mvA[0]:mvB[0] (1080)
mvL0[1]=(predListFlagA==0)?mvA[1]:mvB[1] (1081)
mvL1[0]=(predListFlagA==0)?mvB[0]:mvA[0] (1082)
mvL1[1]=(predListFlagA==0)?mvB[1]:mvA[1] (1083)
For x = 0..3 and y = 0..3, the following assignments are made:
MvL0[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=mvL0 (1084)
MvL1[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=mvL1 (1085)
MvDmvrL0[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=mvL0 (1086)
MvDmvrL1[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=mvL1 (1087)
RefIdxL0[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=refIdxL0 (1088)
RefIdxL1[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=refIdxL1 (1089)
PredFlagL0[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=predFlagL0 (1090)
PredFlagL1[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=predFlagL1 (1091)
BcwIdx[(xSbIdx<<2)+x][(ySbIdx<<2)+y]=0 (1092)
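The per-sub-block decision of clause 8.5.7.3 can be sketched as follows (illustrative Python under our own function name, not normative; `DIS_LUT` repeats the Table 37 entries so the snippet is self-contained). sType 0 selects the motion of partition A, 1 the motion of partition B, and 2 marks the blended region where combined storage may apply:

```python
DIS_LUT = {0: 8, 2: 8, 3: 8, 4: 4, 5: 4, 6: 2, 8: 0, 10: -2, 11: -4,
           12: -4, 13: -8, 14: -8, 16: -8, 18: -8, 19: -8, 20: -4,
           21: -4, 22: -2, 24: 0, 26: 2, 27: 4, 28: 4, 29: 8, 30: 8}

def gpm_mv_storage_types(cbWidth, cbHeight, angleIdx, distanceIdx):
    """Return a numSbY x numSbX grid of sType values per equations 1049-1059."""
    numSbX, numSbY = cbWidth >> 2, cbHeight >> 2
    hwRatio = cbHeight // cbWidth
    displacementX = angleIdx
    displacementY = (angleIdx + 8) % 32
    partIdx = 0 if 13 <= angleIdx <= 27 else 1
    shiftHor = 0 if (angleIdx % 16 == 8
                     or (angleIdx % 16 != 0 and hwRatio > 0)) else 1
    if shiftHor == 0:
        offsetX = (-cbWidth) >> 1
        offsetY = ((-cbHeight) >> 1) + ((distanceIdx * cbHeight) >> 3
                                        if angleIdx < 16
                                        else -((distanceIdx * cbHeight) >> 3))
    else:
        offsetX = ((-cbWidth) >> 1) + ((distanceIdx * cbWidth) >> 3
                                       if angleIdx < 16
                                       else -((distanceIdx * cbWidth) >> 3))
        offsetY = (-cbHeight) >> 1
    sType = [[0] * numSbX for _ in range(numSbY)]
    for ySbIdx in range(numSbY):
        for xSbIdx in range(numSbX):
            motionIdx = ((((4 * xSbIdx + offsetX) << 1) + 5) * DIS_LUT[displacementX]
                         + (((4 * ySbIdx + offsetY) << 1) + 5) * DIS_LUT[displacementY])
            sType[ySbIdx][xSbIdx] = (2 if abs(motionIdx) < 32
                                     else (1 - partIdx if motionIdx <= 0 else partIdx))
    return sType
```

For a 16x16 block with angleIdx 0 and distanceIdx 1, each row of sub-blocks runs from the A side through one blended column into the B side.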
2. Disadvantages of existing solutions
There are several potential problems in the current design of GEO, as described below.
(1) The current GEO modes are distributed with symmetric GEO angles and symmetric GEO distances/displacements, which may not be efficient for coding natural video.
(2) The current GEO mode may be applied together with weighted prediction, which may lead to visual artifacts.
3. Embodiments of the disclosed technology
The detailed descriptions below should be considered as examples to explain the general concepts and should not be interpreted in a narrow way. Furthermore, these techniques can be combined in any manner.
The term "GEO" may denote a codec method that divides one block into two or more sub-regions, where at least one sub-region is non-rectangular or cannot be generated by any of the existing partitioning structures (e.g., QT/BT/TT) that divide one block into multiple rectangular sub-regions. In one example, for a GEO codec block, one or more weighting masks are derived for the codec block based on how the sub-regions are partitioned, and the final prediction signal of the codec block is generated by a weighted sum of two or more auxiliary prediction signals associated with the sub-regions. The term "GEO" may indicate the geometric merge mode (GEO), and/or the geometric partitioning mode (GPM), and/or the wedge prediction mode, and/or the triangle prediction mode (TPM).
The term "block" may denote a Codec Block (CB), CU, PU, TU, PB, TB.
1. The interpretation of the signaled GEO mode index may change adaptively from one video unit to another. That is, the same signaled value may be interpreted as a different angle and/or a different distance.
a) In one example, the mapping between the signaled GEO-mode index and its corresponding GEO angle/distance may depend on the block dimension (e.g., the ratio of block width to height).
b) Further, alternatively, the binarization for GEO mode index coding may be a non-fixed-length coding, wherein the number of bins/bits to be coded may differ between two different modes.
c) In one example, an unequal number of GEO modes/angles/displacements/distances may be used in different video units.
i. In one example, how many GEO modes/angles/displacements/distances to use in a video unit may depend on the block dimensions (e.g., width, height, aspect ratio, etc.).
in one example, more GEO angles may be used in block a than in block B, where a and B may have different dimensions (e.g., a may indicate a block with a height greater than a width, B may indicate a block with a height less than or equal to a width).
in one example, the allowed GEO angles of the video units may be asymmetric.
in one example, the allowed GEO angles of the video units may not be rotationally symmetric.
v. in one example, the allowed GEO angles of the video unit may not be bilaterally symmetric.
In one example, the allowed GEO angles of the video unit may not be quarter-symmetric.
d) In one example, a video unit may be a block, a VPDU, a slice/picture/sub-picture/tile/video.
2. How the GEO-mode is represented in the bitstream may depend on the priority of the mode, e.g., the priority is determined by the associated GEO-angle and/or distance.
a) In one example, GEO modes with smaller GEO angle sizes (i.e., GEO angle sizes less than X degrees, such as X = 90 or 45, as described in Table 2 of section 2.1.8) have higher priority than GEO modes with larger GEO angle sizes (i.e., GEO angle sizes greater than X degrees).
b) In one example, GEO modes associated with smaller GEO angle indices (i.e., GEO angle indices less than Y, such as Y = 4 or 8 or 16) have higher priority than GEO modes associated with larger GEO angle indices (i.e., GEO angle indices greater than Y).
c) In one example, GEO modes with higher priority in the above examples may require fewer bins or bits when signaled than GEO modes with lower priority.
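As a purely hypothetical illustration of bullet 2.c) (none of these names come from the document), a truncated-unary binarization assigns modes ranked by priority progressively longer codewords, so higher-priority modes cost fewer bits:

```python
def unary_codeword(rank: int, max_rank: int) -> str:
    """Truncated unary: rank r -> r ones followed by a terminating zero,
    with the zero omitted for the last rank."""
    return "1" * rank + ("" if rank == max_rank else "0")

def geo_mode_codewords(modes_by_priority):
    """Map each GEO mode to a codeword; earlier (higher-priority) modes
    receive shorter codewords."""
    last = len(modes_by_priority) - 1
    return {m: unary_codeword(r, last) for r, m in enumerate(modes_by_priority)}
```

With four modes, the highest-priority mode costs one bin while the lowest costs three.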
3. GEO modes may be classified into two or more categories. An indication of which category the GEO mode belongs to may be signaled before other information related to the GEO mode.
a) For example, the angle index may be signaled first for the GEO codec block in either the clockwise or counterclockwise direction.
i. Definition of counterclockwise GEO angle index signaling:
1. For example, a GEO angle index signaled counterclockwise may mean that a smaller GEO angle index represents a smaller GEO angle size.
a) In one example, as shown in fig. 6 and table 2, where the GEO angle index is signaled counterclockwise, assuming that angleIdx = 0 means the GEO angle size equals 0 degrees, then angleIdx = 8 means the GEO angle size equals 90 degrees, angleIdx = 16 means the GEO angle size equals 180 degrees, and angleIdx = 24 means the GEO angle size equals 270 degrees.
ii. Definition of clockwise GEO angle index signaling:
1. For example, a GEO angle index signaled clockwise may mean that a smaller GEO angle index represents a larger GEO angle size.
a) In one example, in contrast to fig. 6 and table 2, where the GEO angle index is signaled counterclockwise, clockwise GEO angle signaling is illustrated in fig. 7 and table 3: assuming that angleIdx = 0 means the GEO angle size is equal to 0 degrees (equivalent to 360 degrees), angleIdx = 8 may mean the GEO angle size is equal to 270 degrees, angleIdx = 16 may mean the GEO angle size is equal to 180 degrees, and angleIdx = 24 may mean the GEO angle size is equal to 90 degrees.
iii. In one example, the signaling direction may be indicated at the video unit level (such as the SPS/VPS/PPS/picture header/sub-picture/slice header/slice/tile/CTU/VPDU/CU/block level).
1. In one example, a high level flag (above block level) may be signaled to indicate whether the angle associated with the signaled GEO mode is clockwise or counterclockwise.
2. In one example, a block level flag may be signaled to indicate whether the angle associated with the signaled GEO mode is clockwise or counterclockwise.
Table 3: example of a relationship between an Angle index and an Angle size (clockwise)
(Table 3 is reproduced only as images in the original publication; its text is not available here.)
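Since Table 3 is only available as images, the two signaling conventions of bullet 3.a) can be summarized numerically. The sketch below assumes 32 uniformly spaced angle indices (11.25 degrees apart), matching the examples given for indices 0, 8, 16 and 24; the function names are our own:

```python
def ccw_angle_deg(angle_idx: int) -> float:
    """Counterclockwise convention: a smaller index is a smaller angle."""
    return (angle_idx % 32) * 360.0 / 32

def cw_angle_deg(angle_idx: int) -> float:
    """Clockwise convention: a smaller index is a larger angle
    (index 0 still maps to 0, i.e. 360, degrees)."""
    return ((32 - angle_idx) % 32) * 360.0 / 32
```

Under these conventions, index 8 means 90 degrees counterclockwise but 270 degrees clockwise, matching the example in bullet 3.a).ii.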
4. A subset of the GEO modes/angles/displacements/distances may be derived from the entire set of GEO modes/angles/displacements/distances.
a) In one example, only a subset of GEO-modes are used for the block.
b) In one example, GEO modes associated with only a subset of GEO angles/displacements/distances are used for a block.
c) Further, alternatively, an indication of whether the selected mode/angle/displacement/distance is within the subset may be signaled in the bitstream.
d) In one example, whether a subset of GEO modes/angles/displacements/distances or a full set of GEO modes/angles/displacements/distances is used for a video unit may depend on decoding information (e.g., syntax elements and/or block dimensions) of the current video unit or previously decoded video unit(s).
e) In one example, which GEO modes are in the subset may depend on the corresponding GEO angles.
i. In one example, the subset may contain only GEO modes associated with a distance/displacement equal to 0.
ii. In one example, the subset may contain only GEO modes associated with specified GEO angles (e.g., GEO modes associated with a predefined subset of GEO angles, which may be combined with all displacements corresponding to these predefined GEO angles).
f) In one example, which GEO modes are in the subset may depend on whether the LDB (i.e., low-delay B frame) codec configuration is used.
i. In one example, different subsets of GEO modes may be used for the LDB (i.e., low-delay B frame) and RA (i.e., random access) codec configurations.
g) In one example, which GEO modes are in the subset may depend on the reference pictures in the reference picture list of the current picture. For example, the status of the reference pictures in a reference picture list may be classified into two cases. Case 1: all reference pictures precede the current picture in display order. Case 2: at least one reference picture follows the current picture in display order.
i. In one example, different subsets may be used for case 1 and case 2.
h) In one example, which GEO modes are in the subset may depend on how the motion candidates are derived.
i. In one example, different subsets of GEO modes may be used depending on whether the motion candidate is derived from a temporal motion candidate (e.g., TMVP), a spatial motion candidate, or history-based motion vector prediction (HMVP), or on which spatial motion candidate it is derived from (e.g., left or top-right).
i) In one example, the subset of GEO may contain only GEO modes that partition blocks in the same manner as the TPM mode.
i. In one example, the subset of GEO may contain only GEO modes that divide the block by a line connecting the top-left and bottom-right corners of the block or by a line connecting the top-right and bottom-left corners of the block.
j) In one example, the subset of GEO may contain only GEO modes corresponding to diagonal angles with one or more distance/displacement indices.
i. In one example, a diagonal angle may indicate a GEO mode corresponding to a partition boundary that divides the block by a line connecting the top-left and bottom-right corners of the block or by a line connecting the top-right and bottom-left corners of the block.
ii. In one example, the subset of GEO may contain only GEO modes associated with a distance/displacement equal to 0.
1. In one example, the subset of GEO may contain only GEO modes corresponding to any angle associated with a distance/displacement equal to 0.
2. In one example, the subset of GEO may contain only GEO modes corresponding to diagonal angles (i.e., partition boundaries of the block from top-left to bottom-right and/or from top-right to bottom-left) associated with a distance/displacement equal to 0.
iii. In one example, the subset of GEO may contain only GEO modes corresponding to diagonal angles (i.e., partition boundaries of the block from top-left to bottom-right and/or from top-right to bottom-left) associated with all distance/displacement indices corresponding to these GEO angles.
1. For example, for blocks with an aspect ratio (i.e., 2 to the power of (log2(width) - log2(height))) equal to X, the subset of GEO may contain only the GEO modes corresponding to GEO angle sizes equal to arctan(X), and/or pi - arctan(X), and/or pi + arctan(X), and/or 2*pi - arctan(X), and all distance indices corresponding to these GEO angles (e.g., distanceIdx from 0 to 3 as defined in JVET-Q2001-vB).
2. For example, for a block with an aspect ratio equal to 1, the subset of GEO may contain only the GEO modes corresponding to GEO angle sizes equal to 45° and/or 135° and/or 225° and/or 315° (e.g., angle index = 4 and/or 12 and/or 20 and/or 28 as defined in JVET-Q2001-vB), and all distance indices corresponding to these GEO angles (e.g., distance indices from 0 to 3 as defined in JVET-Q2001-vB).
k) In one example, the horizontal and/or vertical angles may be included in the subset of GEO angles.
i. For example, a horizontal angle may mean an angle index corresponding to 90° and/or 270° as described in Table 2 of section 2.1.8 (i.e., a GEO angle index equal to 8 and/or 24 in Table 36 of JVET-Q2001-vB).
ii. For example, a vertical angle may mean an angle index corresponding to 0° and/or 180° as described in Table 2 of section 2.1.8 (i.e., a GEO angle index equal to 0 and/or 16 in Table 36 of JVET-Q2001-vB).
in one example, GEO modes associated with horizontal and/or vertical angles combined with a distance/displacement equal to 0 may be included in a subset of allowed GEO angles.
1. Alternatively, GEO modes associated with horizontal and/or vertical angles combined with a distance/displacement equal to 0 may not be included in the subset of allowed GEO angles.
in one example, GEO modes associated with horizontal and/or vertical angles combined with all distance/displacement indices may be included in a subset of allowed GEO angles.
1. Alternatively, GEO modes associated with horizontal and/or vertical angles combined with all distance/displacement indices may not be included in the subset of allowed GEO angles.
l) A flag may be signaled (such as at the SPS/VPS/PPS/picture header/sub-picture/slice header/slice/tile/CTU/VPDU/CU/block level) to indicate whether a subset of the GEO modes/angles/displacements/distances is used.
i. An indication may further be signaled (such as at the SPS/VPS/PPS/picture header/sub-picture/slice header/slice/tile/CTU/VPDU/CU/block level) to indicate which subset of GEO modes/angles/displacements/distances is used.
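A minimal sketch of the diagonal-angle restriction in bullet 4.j) (hypothetical helper names; the diagonal angle indices 4/12/20/28 assume the square-block case of the aspect-ratio-1 example): given (mode index, angleIdx, distanceIdx) triples, keep only the modes whose angle lies on a block diagonal, retaining all of their distance indices:

```python
DIAGONAL_ANGLES_SQUARE = {4, 12, 20, 28}  # 45/135/225/315 degrees for square blocks

def geo_mode_subset(modes, allowed_angles=frozenset(DIAGONAL_ANGLES_SQUARE)):
    """Filter (mode_idx, angleIdx, distanceIdx) triples to the allowed angles."""
    return [m for m in modes if m[1] in allowed_angles]
```

Passing a different `allowed_angles` set models the other subset definitions in bullet 4 (e.g., only modes with distance 0, or only horizontal/vertical angles).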
5. GEO mode may not co-exist with X (where X is a codec tool other than GEO).
a) In one example, X may indicate weighted prediction.
i. In one example, GEO may be disabled at the video unit level (such as the slice/PPS/SPS/slice/sub-picture/CU/PU/TU level) when weighted prediction is enabled (e.g., at the slice level).
in one example, whether GEO is used with weighted prediction may depend on a weighting factor.
1. In one example, GEO may be disabled if the weighting factor of the weighted prediction is greater than T (e.g., T is a constant value).
b) In one example, X may indicate BCW.
c) In one example, X may indicate PROF.
d) In one example, X may indicate BDOF.
e) In one example, X may indicate DMVR.
f) In one example, X may indicate SBT.
g) In one example, when GEO is enabled, codec tool X may be disabled.
i. In one example, if GEO is enabled, the indication of codec tool X may not be signaled.
h) In another example, GEO may be disabled when codec tool X is enabled.
i. In one example, if codec tool X is enabled, the indication of GEO may not be signaled.
i) In another example, when weighted prediction is enabled (e.g., at the slice level), codec tool X may be disabled.
j) Alternatively, the deblocking process (such as deblocking strength, deblocking edge detection, type of deblocking edge, etc.) may depend on whether GEO coexists with codec tool X.
i. The deblocking process (such as deblocking strength, deblocking edge detection, type of deblocking edge, etc.) may depend on whether GEO is applied.
6. For a codec unit that is coded with GEO, the weighting values generated for a first component (such as the luma component) may be used to derive the weighting values for a second component (such as a Cb or Cr component).
a) The derivation may depend on the color format (such as 4:2:0 or 4:2:2 or 4:4: 4).
b) The weighted values of the second component may be derived by applying an upsampling or downsampling to the weighted values of the first component.
7. The weighting values generated for the components (such as Cb or Cr components) may depend on the color format (such as 4:2:0 or 4:2:2 or 4:4: 4).
a) For example, when the color format is 4:2:2, the GEO angle/displacement/distance associated with the GEO mode may be adjusted to generate a weighted value for a component (such as a Cb or Cr component).
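One possible realization of the component weight derivation in bullets 6 and 7 (an assumption on our part, not a normative rule) derives the chroma weight mask by subsampling the luma mask according to the color format: 2x2 subsampling for 4:2:0, horizontal-only subsampling for 4:2:2, and direct reuse for 4:4:4:

```python
def chroma_weights_from_luma(luma_w, color_format="4:2:0"):
    """Derive a chroma weight mask from a luma mask (list of rows) by
    subsampling according to the chroma format."""
    sub_w = 2 if color_format in ("4:2:0", "4:2:2") else 1
    sub_h = 2 if color_format == "4:2:0" else 1
    return [row[::sub_w] for row in luma_w[::sub_h]]
```

Taking every other luma weight corresponds to evaluating the weight at the even luma positions, which is consistent with the chroma weight derivation in clause 8.5.7.2 where xL and yL are scaled by SubWidthC and SubHeightC.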
The examples described above may be incorporated in the context of the method described below (e.g., method 800), which may be implemented at a video decoder or a video encoder.
Fig. 8 shows a flow diagram of an example method 800 for video processing. The method includes, at operation 810, for a transition between a current block of video and a bitstream representation of the video, making a determination regarding enablement of a first codec mode and a second codec mode different from the first codec mode, wherein the first codec mode partitions the current block into two or more subregions that include at least one non-rectangular or non-square subregion.
The method includes, at operation 820, performing a conversion based on the determination.
In some embodiments, the following technical solutions may be implemented:
A1. a method for video processing, comprising: performing a conversion between a current block of video and a bitstream representation of the video, wherein a codec mode of the current block partitions the current block into two or more sub-regions including at least one non-rectangular or non-square sub-region, wherein the bitstream representation includes signaling associated with the codec mode, and wherein the signaling corresponds to a set of parameters having a first set of values for the current block of video and a second set of values for a subsequent block.
A2. The method of solution a1, wherein the signaling includes an index, and wherein the binarization of the index includes a variable length coding that uses a first number of bits for a first value of the index and a second number of bits for a second value of the index.
A3. The method of solution a1, wherein the signaling includes an index, and wherein the index is based on a height or width of the current block.
A4. The method of solution a1, wherein the set of parameters includes a plurality of angles and a plurality of distances for at least one non-rectangular or non-square subregion.
A5. The method of solution a4, wherein the plurality of angles are asymmetric.
A6. The method of solution a4, wherein the plurality of angles are not rotationally symmetric.
A7. The method of solution a4, wherein the plurality of angles are not bilaterally symmetric.
A8. The method according to solution a1, wherein the location of the signaling in the bitstream representation is based on the priority of the codec mode.
A9. The method according to solution a1, wherein the signaling comprises an indication of the class of the codec mode and other information related to the codec mode.
A10. The method of solution a9, wherein the other information includes an angle, and wherein the indication includes an indication of a clockwise or counterclockwise direction of the angle.
A11. A method for video processing, comprising: performing a conversion between a current block of video and a bitstream representation of the video, wherein a codec mode of the current block partitions the current block into two or more sub-regions including at least one non-rectangular or non-square sub-region, wherein the codec mode is configurable using a plurality of parameter sets, and wherein the bitstream representation includes signaling for a subset of the plurality of parameter sets, and wherein the parameter sets include angles, displacements and distances associated with the at least one non-rectangular or non-square sub-region.
A12. The method of solution a11, wherein the codec mode uses only a subset of the plurality of parameter sets.
A13. The method of solution a11, wherein the selection of a parameter set in the subset of the plurality of parameter sets is based on a syntax element in the bitstream representation or one or more dimensions of the current block.
A14. The method of solution a11, wherein selection of a parameter set in a subset of a plurality of parameter sets is based on an activation of a low delay B (LDB) frame codec tool.
A15. The method of solution a11, wherein the selection of a parameter set in a subset of the plurality of parameter sets is based on a derivation of a motion vector candidate.
A16. The method according to solution a15, wherein the motion vector candidates are derived from Temporal Motion Vector Prediction (TMVP) candidates, spatial motion candidates or history-based motion vector prediction (HMVP) candidates.
A17. The method of solution a11, wherein, in determining that the ratio of the height of the current block to the width of the current block is 1, the subset of the plurality of parameter sets includes a codec mode corresponding to an angle of 45 °, 135 °, 225 °, and/or 315 °.
A18. The method of solution a11, wherein the subset of the plurality of parameter sets includes codec modes corresponding to the current block divided by (a) a line connecting a top-left corner and a bottom-right corner of the current block or (b) a line connecting a top-right corner and a bottom-left corner of the current block.
A19. A method for video processing, comprising: for a conversion between a current block of a video and a bitstream representation of the video, making a determination regarding enablement of a first codec mode and a second codec mode different from the first codec mode, wherein the first codec mode partitions the current block into two or more sub-regions including at least one non-rectangular or non-square sub-region; and performing the conversion based on the determination.
A20. The method of solution A19, wherein the second codec mode comprises weighted prediction, and wherein the first codec mode is disabled at a video unit level and the second codec mode is enabled at a slice level.
A21. The method of solution A20, wherein the video unit level is a slice level, a Picture Parameter Set (PPS) level, a Sequence Parameter Set (SPS) level, a slice level, a sub-picture level, a coding unit (CU) level, a Prediction Unit (PU) level, or a Transform Unit (TU) level.
A22. The method of solution A19, wherein the second codec mode is a bi-prediction with coding unit (CU) weights (BCW) mode, a prediction refinement with optical flow (PROF) mode, a bi-directional optical flow (BDOF) mode, a decoder-side motion vector refinement (DMVR) mode, or a sub-block transform (SBT) mode.
A23. The method of solution A19, wherein the second codec mode includes a deblocking process, and wherein enablement of the second codec mode is based on enablement of the first codec mode.
A24. The method of solution A19, wherein the second codec mode comprises weighted prediction, and wherein weighting values of the weighted prediction of a component are based on a color format of the current block.
A25. The method of solution A24, wherein the component is a Cb component or a Cr component, and wherein the color format is 4:2:0, 4:2:2, or 4:4:4.
A26. The method of any of solutions A1 to A25, wherein the conversion generates the current block from the bitstream representation.
A27. The method of any of solutions A1 to A25, wherein the conversion generates the bitstream representation from the current block.
A28. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any of solutions A1 to A27.
A29. A computer program product stored on a non-transitory computer-readable medium, the computer program product comprising program code for performing the method of any of solutions A1 to A27.
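As a non-normative illustration of the angle/distance parameterization recited in the solutions above, the following Python sketch (a hypothetical helper, not part of any claimed method or the normative derivation) classifies each sample of a block into one of two sub-regions separated by a straight line defined by its normal angle and an offset from the block center:

```python
import math

def gpm_mask(width, height, angle_deg, distance):
    """Illustrative sketch of a geometric partition: split a width x height
    block into two sub-regions along the line whose normal direction is
    angle_deg (in degrees) and whose signed offset from the block center
    is `distance` (in samples)."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    nx = math.cos(math.radians(angle_deg))
    ny = math.sin(math.radians(angle_deg))
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Signed distance of the sample from the partition line.
            d = (x - cx) * nx + (y - cy) * ny - distance
            mask[y][x] = 0 if d < 0 else 1
    return mask
```

With an angle of 0° and a distance of 0, the partition line is vertical through the block center, so the left half of the block falls into sub-region 0 and the right half into sub-region 1.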
Fig. 9 is a block diagram of a video processing apparatus 900. Apparatus 900 may be used to implement one or more of the methods described herein. The apparatus 900 may be embodied in a smartphone, tablet, computer, internet of things (IoT) receiver, or the like. The apparatus 900 may include one or more processors 902, one or more memories 904, and video processing hardware 906. The processor 902 may be configured to implement one or more of the methods described in this document. Memory 904 may be used to store data and code for implementing the methods and techniques described herein. Video processing hardware 906 may be used to implement some of the techniques described in this document in hardware circuitry.
Fig. 10 is a block diagram illustrating an example video codec system 300 that may utilize techniques of the present disclosure.
As shown in fig. 10, the video codec system 300 may include a source device 310 and a destination device 320. Source device 310 generates encoded video data, where source device 310 may be referred to as a video encoding device. The destination device 320 may decode the encoded video data generated by the source device 310, where the destination device 320 may be referred to as a video decoding device.
The source device 310 may include a video source 312, a video encoder 314, and an input/output (I/O) interface 316.
Video source 312 may include sources such as a video capture device, an interface that receives video data from a video content provider, and/or a computer graphics system used to generate video data, or a combination of such sources. The video data may include one or more pictures. The video encoder 314 encodes video data from the video source 312 to generate a bitstream. The bitstream may comprise a sequence of bits forming a codec representation of the video data. The bitstream may include coded pictures and related data. A coded picture is a coded representation of a picture. The related data may include sequence parameter sets, picture parameter sets, and other syntax structures. I/O interface 316 may include a modulator/demodulator (modem) and/or a transmitter. The encoded video data may be transmitted directly to the destination device 320 over the network 330a via the I/O interface 316. The encoded video data may also be stored on a storage medium/server 330b for access by the destination device 320.
The destination device 320 may include an I/O interface 326, a video decoder 324, and a display device 322.
I/O interface 326 may include a receiver and/or a modem. I/O interface 326 may obtain encoded video data from the source device 310 or the storage medium/server 330b. The video decoder 324 may decode the encoded video data. The display device 322 may display the decoded video data to a user. The display device 322 may be integrated with the destination device 320, or may be external to the destination device 320, in which case the destination device 320 is configured to interface with an external display device.
The video encoder 314 and the video decoder 324 may operate in accordance with video compression standards, such as the High Efficiency Video Coding (HEVC) standard, the Versatile Video Coding (VVC) standard, and other current and/or future standards.
Fig. 11 is a block diagram illustrating an example of a video encoder 400, which video encoder 400 may be the video encoder 314 in the system 300 shown in fig. 10.
Video encoder 400 may be configured to perform any or all of the techniques of this disclosure. In the example of fig. 11, video encoder 400 includes a number of functional components. The techniques described in this disclosure may be shared among various components of video encoder 400. In some examples, the processor may be configured to perform any or all of the techniques described in this disclosure.
The functional components of the video encoder 400 may include a partition unit 401, a prediction unit 402 (which may include a mode selection unit 403, a motion estimation unit 404, a motion compensation unit 405, and an intra prediction unit 406), a residual generation unit 407, a transform processing unit 408, a quantization unit 409, an inverse quantization unit 410, an inverse transform unit 411, a reconstruction unit 412, a buffer 413, and an entropy encoding unit 414.
In other examples, video encoder 400 may include more, fewer, or different functional components. In an example, the prediction unit 402 may include an Intra Block Copy (IBC) unit. The IBC unit may perform prediction in IBC mode, where the at least one reference picture is a picture in which the current video block is located.
Furthermore, some components, such as the motion estimation unit 404 and the motion compensation unit 405, may be highly integrated, but are for explanation purposes shown separately in the example of fig. 11.
The partition unit 401 may partition a picture into one or more video blocks. The video encoder 400 and the video decoder 500 may support various video block sizes.
The mode selection unit 403 may select one of the coding modes (e.g., intra or inter) based on the error results, and supply the resulting intra- or inter-coded block to the residual generation unit 407 to generate residual block data, and to the reconstruction unit 412 to reconstruct the coded block for use as part of a reference picture. In some examples, the mode selection unit 403 may select a combined intra and inter prediction (CIIP) mode, in which the prediction is based on an inter prediction signal and an intra prediction signal. In the case of inter prediction, the mode selection unit 403 may also select a resolution (e.g., sub-pixel or integer-pixel precision) for the motion vector of the block.
To perform inter prediction on the current video block, motion estimation unit 404 may generate motion information for the current video block by comparing one or more reference frames from buffer 413 with the current video block. Motion compensation unit 405 may determine a predictive video block for the current video block based on the motion information and decoded samples of pictures from buffer 413 other than the picture associated with the current video block.
The motion estimation unit 404 and the motion compensation unit 405 may perform different operations on the current video block, for example, depending on whether the current video block is in an I slice, a P slice, or a B slice.
In some examples, motion estimation unit 404 may perform uni-directional prediction on the current video block, and motion estimation unit 404 may search list 0 or list 1 reference pictures for a reference video block of the current video block. Motion estimation unit 404 may then generate a reference index indicating a reference picture in list 0 or list 1 that includes the reference video block and a motion vector indicating spatial displacement between the current video block and the reference video block. The motion estimation unit 404 may output the reference index, the prediction direction indicator, and the motion vector as motion information of the current video block. The motion compensation unit 405 may generate a prediction video block of the current block based on a reference video block indicated by motion information of the current video block.
In other examples, motion estimation unit 404 may perform bi-prediction on the current video block, and motion estimation unit 404 may search for a reference video block of the current video block in a reference picture in list 0 and may also search for another reference video block of the current video block in list 1. Motion estimation unit 404 may then generate reference indices that indicate reference pictures in list 0 and list 1 that include reference video blocks and motion vectors that indicate spatial displacements between the reference video blocks and the current video block. Motion estimation unit 404 may output the reference index and the motion vector of the current video block as motion information of the current video block. Motion compensation unit 405 may generate a prediction video block for the current video block based on a reference video block indicated by motion information for the current video block.
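The equal-weight case of the bi-prediction described above can be sketched as follows; this is an illustrative simplification that combines the two reference blocks sample-wise with a rounded average and omits the weighted variants an actual codec supports:

```python
def bi_predict(block_l0, block_l1):
    """Form a bi-prediction block from the list 0 and list 1 reference
    blocks (equal weights assumed; integer rounded average)."""
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(block_l0, block_l1)]
```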
In some examples, the motion estimation unit 404 may output the complete set of motion information for the decoding process of the decoder.
In some examples, motion estimation unit 404 may not output the complete set of motion information for the current video block. Instead, the motion estimation unit 404 may signal the motion information of the current video block with reference to the motion information of another video block. For example, motion estimation unit 404 may determine that the motion information of the current video block is sufficiently similar to the motion information of a neighboring video block.
In one example, motion estimation unit 404 may indicate a value in a syntax structure associated with the current video block that indicates to video decoder 500 that the current video block has the same motion information as another video block.
In another example, motion estimation unit 404 may identify another video block and a Motion Vector Difference (MVD) in a syntax structure associated with the current video block. The motion vector difference indicates a difference between a motion vector of the current video block and a motion vector of the indicated video block. The video decoder 500 may determine a motion vector for the current video block using the indicated motion vector and motion vector difference for the video block.
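The decoder-side rule described in this example can be sketched in a few lines (the names and tuple representation are illustrative, not the normative syntax):

```python
def reconstruct_mv(predictor_mv, mvd):
    """Recover the current block's motion vector: the indicated block's
    motion vector plus the signaled motion vector difference (MVD),
    component-wise."""
    return (predictor_mv[0] + mvd[0], predictor_mv[1] + mvd[1])
```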
As described above, video encoder 400 may predictively signal motion vectors. Two examples of prediction signaling techniques that may be implemented by video encoder 400 include Advanced Motion Vector Prediction (AMVP) and merge mode signaling.
The intra prediction unit 406 may perform intra prediction on the current video block. When intra prediction unit 406 performs intra prediction on the current video block, intra prediction unit 406 may generate prediction data for the current video block based on decoded samples of other video blocks in the same picture. The prediction data for the current video block may include a prediction video block and various syntax elements.
Residual generation unit 407 may generate residual data for the current video block by subtracting (e.g., as indicated by a minus sign) the predicted video block(s) of the current video block from the current video block. The residual data for the current video block may include residual video blocks corresponding to different sample components of samples in the current video block.
In other examples, for example in skip mode, there may be no residual data for the current video block and the residual generation unit 407 may not perform the subtraction operation.
Transform processing unit 408 may generate one or more transform coefficient video blocks for the current video block by applying one or more transforms to a residual video block associated with the current video block.
After transform processing unit 408 generates a transform coefficient video block associated with the current video block, quantization unit 409 may quantize the transform coefficient video block associated with the current video block based on one or more Quantization Parameter (QP) values associated with the current video block.
Inverse quantization unit 410 and inverse transform unit 411 may apply inverse quantization and inverse transform, respectively, to the transform coefficient video blocks to reconstruct residual video blocks from the transform coefficient video blocks. Reconstruction unit 412 may add the reconstructed residual video block to corresponding sample points from one or more prediction video blocks generated by prediction unit 402 to produce a reconstructed video block associated with the current block for storage in buffer 413.
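A minimal scalar-quantization sketch of the forward and inverse quantization steps, assuming a fixed quantization step size rather than the codec's actual QP-to-step-size mapping:

```python
def quantize(coeff, step):
    # Forward quantization at the encoder (simple rounding; real codecs
    # use rounding offsets that depend on the prediction mode).
    return int(round(coeff / step))

def dequantize(level, step):
    # Inverse quantization: scale the level back. The difference between
    # the input coefficient and the dequantized value is the quantization
    # error carried into the reconstructed block.
    return level * step
```

For example, a transform coefficient of 37 with step size 8 is quantized to level 5 and reconstructed as 40.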
After reconstruction unit 412 reconstructs the video blocks, loop filtering operations may be performed to reduce video blockiness in the video blocks.
Entropy encoding unit 414 may receive data from other functional components of video encoder 400. When entropy encoding unit 414 receives the data, entropy encoding unit 414 may perform one or more entropy encoding operations to generate entropy encoded data and output a bitstream that includes the entropy encoded data.
Fig. 12 is a block diagram illustrating an example of a video decoder 500, which video decoder 500 may be the video decoder 324 in the system 300 shown in fig. 10.
Video decoder 500 may be configured to perform any or all of the techniques of this disclosure. In the example of fig. 12, the video decoder 500 includes a number of functional components. The techniques described in this disclosure may be shared among various components of the video decoder 500. In some examples, the processor may be configured to perform any or all of the techniques described in this disclosure.
In the example of fig. 12, the video decoder 500 includes an entropy decoding unit 501, a motion compensation unit 502, an intra prediction unit 503, an inverse quantization unit 504, an inverse transformation unit 505, a reconstruction unit 506, and a buffer 507. In some examples, video decoder 500 may perform a decoding process that is generally the inverse of the encoding process described for video encoder 400 (fig. 11).
The entropy decoding unit 501 may retrieve the encoded bitstream. The encoded bitstream may include entropy-coded video data (e.g., encoded blocks of video data). The entropy decoding unit 501 may decode entropy-coded video data, and the motion compensation unit 502 may determine motion information including a motion vector, a motion vector precision, a reference picture list index, and other motion information from the entropy-decoded video data. The motion compensation unit 502 may determine such information, for example, by performing AMVP and merge modes.
The motion compensation unit 502 may generate a motion compensation block and may perform interpolation based on an interpolation filter. An identifier of the interpolation filter to be used with sub-pixel precision may be included in the syntax element.
The motion compensation unit 502 may calculate the interpolation of sub-integer pixels of the reference block using an interpolation filter as used by the video encoder 400 during encoding of the video block. The motion compensation unit 502 may determine an interpolation filter used by the video encoder 400 according to the received syntax information and generate a prediction block using the interpolation filter.
The motion compensation unit 502 may use some syntax information to determine the size of blocks used to encode frame(s) and/or slice(s) of an encoded video sequence, partition information describing how each macroblock of a picture of the encoded video sequence is partitioned, a mode indicating how each partition is encoded, one or more reference frames (and reference frame lists) of each inter-coded block, and other information used to decode the encoded video sequence.
The intra prediction unit 503 may form a prediction block from spatially adjacent blocks using, for example, an intra prediction mode received in the bitstream. The inverse quantization unit 504 inverse quantizes, i.e., dequantizes, the quantized video block coefficients provided in the bitstream and decoded by the entropy decoding unit 501. The inverse transform unit 505 applies an inverse transform.
The reconstruction unit 506 may add the residual block to the corresponding prediction block generated by the motion compensation unit 502 or the intra prediction unit 503 to form a decoded block. If desired, a deblocking filter may also be applied to filter the decoded blocks to remove blockiness artifacts. The decoded video blocks are then stored in the buffer 507, which provides reference blocks for subsequent motion compensation/intra prediction and also produces decoded video for presentation on a display device.
Fig. 13 is a block diagram illustrating an example video processing system 1300 in which various techniques disclosed herein may be implemented. Various embodiments may include some or all of the components of system 1300. The system 1300 may include an input 1302 for receiving video content. The video content may be received in a raw or uncompressed format, e.g., 8- or 10-bit multi-component pixel values, or may be in a compressed or encoded format. Input 1302 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interfaces include wired interfaces such as Ethernet, Passive Optical Network (PON), etc., and wireless interfaces such as Wi-Fi or cellular interfaces.
The system 1300 may include a codec component 1304 that may implement various codecs or encoding methods described in this document. The codec component 1304 may reduce the average bit rate of the video from the input 1302 to the output of the codec component 1304 to produce a codec representation of the video. Codec techniques are therefore sometimes referred to as video compression or video transcoding techniques. The output of the codec component 1304 may be stored or transmitted via a communication connection as represented by the component 1306. The stored or communicated bitstream (or codec representation) of the video received at the input 1302 may be used by the component 1308 to generate pixel values or displayable video communicated to the display interface 1310. The process of generating user-viewable video from a bitstream representation is sometimes referred to as video decompression. Furthermore, while certain video processing operations are referred to as "codec" operations or tools, it will be understood that codec tools or operations are used at the encoder and that corresponding decoding tools or operations that reverse the codec results will be performed by the decoder.
Examples of a peripheral bus interface or display interface may include a Universal Serial Bus (USB), a High-Definition Multimedia Interface (HDMI), or a DisplayPort, among others. Examples of storage interfaces include SATA (Serial Advanced Technology Attachment), PCI, IDE interfaces, and the like. The techniques described in this document may be embodied in various electronic devices such as mobile phones, laptops, smartphones, or other devices capable of performing digital data processing and/or video display.
Fig. 14 shows a flowchart of an example method for video processing. The method includes: determining (1402), for a conversion between a current video block of a video and a bitstream of the video, that the current video block is to be coded in a geometric partitioning mode, wherein the geometric partitioning mode includes an entire set of partitioning modes; determining (1404) one or more partitioning modes for the current video block based on a partitioning mode index included in the bitstream, wherein the same partitioning mode index may correspond to different partitioning modes for different video blocks; and performing (1406) the conversion based on the one or more partitioning modes.
In some examples, each partitioning mode of the entire set of partitioning modes partitions the current video block into two or more partitions, and the two or more partitions corresponding to at least one partitioning mode are non-square and non-rectangular.
In some examples, each segmentation mode index is associated with a set of parameters including at least one of an angle, a distance, and/or a displacement.
In some examples, the correspondence of the partition mode index and the partition mode depends on a dimension of the video block, wherein the dimension of the video block includes at least one of a height, a width, and a ratio of the width and the height of the video block.
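The dimension-dependent correspondence can be illustrated with a hypothetical lookup; the tables below are invented purely for illustration and are not the normative index-to-mode mapping:

```python
def mode_from_index(mode_idx, width, height):
    """Illustrative dimension-dependent mapping: the same signaled index
    selects a different (angle_degrees, distance_index) pair for wide
    blocks than for tall blocks."""
    wide_table = [(0, 0), (45, 0), (90, 0), (90, 1)]   # hypothetical
    tall_table = [(0, 0), (0, 1), (90, 0), (135, 0)]   # hypothetical
    table = wide_table if width >= height else tall_table
    return table[mode_idx]
```

Here index 1 would denote a 45° partition for a 16x8 block but a distance-shifted 0° partition for an 8x16 block, which is the kind of block-to-block variation the example above describes.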
In some examples, the binarization of the partitioning mode index uses variable length coding, with a first number of bits for a first partitioning mode and a second number of bits for a second partitioning mode.
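A simple instance of such a variable-length binarization is a unary code, in which the mode ranked n-th spends n+1 bits; this is an illustrative sketch, not the codec's actual binarization:

```python
def unary_binarize(mode_rank):
    """Unary binarization: rank 0 -> "0", rank 1 -> "10", rank 2 -> "110",
    so modes with a smaller rank (higher priority) use fewer bits."""
    return "1" * mode_rank + "0"
```

This realizes the property that higher-priority partitioning modes receive shorter codewords.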
In some examples, different numbers of partitioning modes or different numbers of parameters are used in different video blocks.
In some examples, the number of partition modes or parameters used in a video block depends on the dimensions of the video block.
In some examples, the number of angles used in the current video block is greater than the number of angles used in the second video block, wherein the current video block is a video block with a height greater than a width and the second video block is a video block with a height less than a width.
In some examples, the angles are asymmetric.
In some examples, the angles are not rotationally symmetric.
In some examples, the angles are not bilaterally symmetric.
In some examples, the angles are not quarter-symmetric.
In some examples, the video block includes at least one of a codec block, a codec unit, a Virtual Pipeline Data Unit (VPDU), a slice, a picture, a sub-picture, a tile, or a video.
In some examples, the representation of the partitioning pattern in the bitstream depends on a priority of the partitioning pattern.
In some examples, the priority is determined by an angle and/or distance associated with the segmentation pattern.
In some examples, a partitioning mode whose angle is less than a predetermined value has a higher priority than a partitioning mode whose angle is greater than the predetermined value.
In some examples, the predetermined value is 45 or 90.
In some examples, the segmentation patterns associated with angle indices less than a predetermined value have a higher priority than the segmentation patterns associated with angle indices greater than the predetermined value.
In some examples, the predetermined value is 4 or 8 or 16.
In some examples, a partitioning mode with a higher priority requires fewer bits or bins when signaled than a partitioning mode with a lower priority.
In some examples, the segmentation pattern is classified into two or more classes, and an indication of which class the segmentation pattern belongs to is signaled before other information related to the segmentation pattern.
In some examples, the angle index is signaled first for a video block in either a clockwise or counterclockwise direction.
In some examples, when the angle index is signaled in a counterclockwise direction, a smaller angle index represents a smaller angle.
In some examples, if angle index 0 corresponds to an angle of 0 degrees, then angle index 8 corresponds to 90 degrees, angle index 16 corresponds to 180 degrees, and angle index 24 corresponds to 270 degrees.
In some examples, when the angle index is signaled in a clockwise direction, a larger angle index represents a larger angle measured in the clockwise direction.
In some examples, if angle index 0 corresponds to an angle of 0 degrees, then angle index 8 corresponds to 270 degrees, angle index 16 corresponds to 180 degrees, and angle index 24 corresponds to 90 degrees.
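The two index-to-angle conventions in the preceding examples can be expressed as follows, assuming 32 angle indices per full turn (an illustrative sketch, not the normative derivation):

```python
def angle_from_index(angle_idx, clockwise=False, num_idx=32):
    """Map an angle index to degrees. Counterclockwise signaling: the
    angle grows with the index. Clockwise signaling: the same index is
    measured in the opposite direction around the circle."""
    step = 360 / num_idx  # 11.25 degrees per index for num_idx == 32
    degrees = (angle_idx * step) % 360
    return (360 - degrees) % 360 if clockwise else degrees
```

Under the counterclockwise convention index 8 maps to 90 degrees, while under the clockwise convention the same index maps to 270 degrees, matching the two examples above.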
In some examples, the indication is signaled at a video block level.
In some examples, the indication is signaled at least one of an SPS, a VPS, a PPS, a picture header, a sub-picture, a slice header, a slice, a tile, a CTU, a VPDU, a CU, and a block level.
In some examples, a high level flag above the block level is signaled to indicate whether the angle associated with the signaled segmentation mode is clockwise or counterclockwise.
In some examples, the block level flag is signaled to indicate whether an angle associated with the signaled segmentation mode is clockwise or counterclockwise.
Fig. 15 shows a flowchart of an example method for video processing. The method includes: determining (1502), for a conversion between a current video block of a video and a bitstream of the video, that the current video block is to be coded in a geometric partitioning mode, wherein the geometric partitioning mode includes an entire set of partitioning modes, and each partitioning mode is associated with a set of parameters including at least one of an angle, a distance, and/or a displacement; deriving (1504) a subset of partitioning modes or parameters from the entire set of partitioning modes or parameters; and performing (1506) the conversion based on the subset of partitioning modes or parameters.
In some examples, each partitioning mode in the entire set of partitioning modes partitions the current video block into two or more partitions, at least one of which is non-square and non-rectangular.
In some examples, only a subset of the partitioning modes are used for video blocks.
In some examples, only partition modes associated with a subset of parameters are used for video blocks.
In some examples, an indication of whether the selected partitioning mode or parameter is within the subset is also included in the bitstream.
In some examples, whether a subset of the partition modes or parameters or the entire partition mode or parameter set is used for a video block depends on the decoding information of the current video block or one or more previously decoded video blocks.
In some examples, the decoding information includes syntax elements and/or block dimensions of the video block.
In some examples, the selection of a segmentation mode in the subset of segmentation modes depends on the corresponding angle.
In some examples, the subset of segmentation patterns only contains segmentation patterns associated with a distance or displacement equal to 0.
In some examples, the subset of segmentation patterns contains only segmentation patterns associated with the specified angle.
In some examples, the segmentation pattern associated with the specified angles includes segmentation patterns associated with a predefined subset of angles that are combined with all displacements corresponding to the predefined angles.
In some examples, the selection of a partitioning mode in the subset of partitioning modes depends on whether low-delay B (LDB) frame coding is applied.
In some examples, different subsets of the partitioning modes are used for LDB coding and Random Access (RA) coding.
In some examples, the selection of the partition mode in the subset of partition modes depends on a reference picture in a reference picture list of the current picture.
In some examples, different subsets of the segmentation mode are used in a first instance in which all reference pictures precede the current picture in display order and in a second instance in which at least one reference picture follows the current picture in display order.
In some examples, the selection of the segmentation mode among the subset of segmentation modes depends on the derivation of the motion vector candidate.
In some examples, different subsets of the segmentation modes are used when deriving motion vector candidates from Temporal Motion Vector Prediction (TMVP) candidates, spatial motion candidates, or history-based motion vector prediction (HMVP) candidates.
In some examples, the subset of split modes only contains split modes that partition blocks in the same manner as the TPM mode.
In some examples, the subset of partition patterns only includes partition patterns that partition a block by a line connecting the top-left and bottom-right corners of the block or by a line connecting the top-right and bottom-left corners of the block.
In some examples, the subset of partitioning modes only contains partitioning modes corresponding to diagonal angles with one or more distance or displacement indices.
In some examples, a diagonal angle indicates a partitioning mode whose partition boundary splits the block along a line connecting its top-left and bottom-right corners or along a line connecting its top-right and bottom-left corners.
In some examples, the subset of partitioning modes only contains partitioning modes associated with a distance or displacement equal to 0.
In some examples, the subset of partitioning modes only contains partitioning modes corresponding to any angle associated with a distance or displacement equal to 0.
In some examples, the subset of partitioning modes only contains partitioning modes corresponding to diagonal angles associated with a distance or displacement equal to 0.
In some examples, the subset of partitioning modes only contains partitioning modes corresponding to diagonal angles associated with all distance or displacement indices corresponding to those angles.
In some examples, for a block with an aspect ratio equal to X, the subset of partitioning modes only includes partitioning modes corresponding to angles equal to arctan(X) and/or pi - arctan(X) and/or pi + arctan(X) and/or 2*pi - arctan(X), together with all distance indices corresponding to these angles, X being an integer.
In some examples, X = 1.
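The diagonal-angle restriction above can be sketched in a few lines. This is an illustrative sketch only: the function name and the normalization of angles into [0, 2*pi) are assumptions, not part of the disclosure.

```python
import math

def allowed_diagonal_angles(width, height):
    """For a block with aspect ratio X = width/height, keep only the
    angles aligned with the block diagonals: arctan(X), pi - arctan(X),
    pi + arctan(X) and 2*pi - arctan(X), normalized into [0, 2*pi)."""
    x = width / height
    a = math.atan(x)
    return sorted({a % (2 * math.pi),
                   (math.pi - a) % (2 * math.pi),
                   (math.pi + a) % (2 * math.pi),
                   (2 * math.pi - a) % (2 * math.pi)})
```

For a square block (X = 1) this yields the four diagonal angles pi/4, 3*pi/4, 5*pi/4 and 7*pi/4.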
In some examples, the horizontal angles and/or the vertical angles are included in the subset of angles.
In some examples, the horizontal angles indicate the angle indices corresponding to 90° and/or 270°.
In some examples, the vertical angles indicate the angle indices corresponding to 0° and/or 180°.
In some examples, partitioning modes associated with the horizontal and/or vertical angles combined with a distance or displacement equal to 0 are included in the subset of allowed angles.
In some examples, partitioning modes associated with the horizontal and/or vertical angles combined with a distance or displacement equal to 0 are not included in the subset of allowed angles.
In some examples, partitioning modes associated with the horizontal and/or vertical angles combined with all distance or displacement indices are included in the subset of allowed angles.
In some examples, partitioning modes associated with the horizontal and/or vertical angles combined with all distance or displacement indices are not included in the subset of allowed angles.
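The various subset restrictions above (zero-distance modes, diagonal angles, horizontal/vertical angles) can be illustrated as a small filter over a hypothetical mode table. The 32-angle/4-distance parameterization and the specific angle-index assignments below are assumptions chosen for illustration, not values taken from the disclosure.

```python
# Hypothetical GPM mode table: each mode is an (angle_idx, distance_idx)
# pair, with 32 angle indices and 4 distance indices.
ALL_MODES = [(a, d) for a in range(32) for d in range(4)]

DIAGONAL_ANGLES = {4, 12, 20, 28}   # assumed indices of the four diagonals
HORIZONTAL_ANGLES = {8, 24}         # assumed indices for 90° / 270°
VERTICAL_ANGLES = {0, 16}           # assumed indices for 0° / 180°

def mode_subset(zero_distance_only=False, diagonal_only=False,
                include_hv=True):
    """Restrict the full mode set to a subset, as in the variants above:
    zero-distance modes, diagonal-angle modes, and optionally the
    horizontal/vertical angles."""
    subset = []
    for angle, dist in ALL_MODES:
        if diagonal_only and angle not in DIAGONAL_ANGLES:
            # Optionally keep the horizontal/vertical angles as well.
            if not (include_hv and angle in HORIZONTAL_ANGLES | VERTICAL_ANGLES):
                continue
        if zero_distance_only and dist != 0:
            continue
        subset.append((angle, dist))
    return subset
```

A decoder sketched this way would signal which subset is active (e.g., in the SPS or slice header, as described below) and index only into the reduced table.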
In some examples, an indication of whether to use the subset of partitioning modes or parameters is signaled in at least one of the SPS, VPS, PPS, picture header, sub-picture, slice header, slice, tile, CTU, VPDU, CU, and block levels.
In some examples, an indication of which subset of the partitioning modes or parameters to use is further signaled in at least one of the SPS, VPS, PPS, picture header, sub-picture, slice header, slice, tile, CTU, VPDU, CU, and block levels.
FIG. 16 shows a flowchart of an example method for video processing. The method comprises: determining, for a conversion between a video block of a video and a bitstream of the video, whether a geometric partitioning mode and a second codec mode different from the geometric partitioning mode are enabled for the video block (1602); and performing the conversion based on the determination (1604).
In some examples, the geometric partitioning mode includes an entire set of partitioning modes, each partitioning mode of the entire set of partitioning modes partitions the current video block into two or more partitions, and the two or more partitions corresponding to at least one partitioning mode are non-square and non-rectangular.
In some examples, the second codec mode includes weighted prediction.
In some examples, when weighted prediction is enabled, geometric partitioning mode is disabled at the video block level.
In some examples, when weighted prediction is enabled at the slice level, the geometric partitioning mode is disabled in at least one of the slice, PPS, SPS, sub-picture, CU, PU, or TU levels.
In some examples, whether geometric partitioning mode is used with weighted prediction depends on the weighting factor of the weighted prediction.
In some examples, the geometric partitioning mode is disabled if the weighting factor of the weighted prediction is greater than T, where T is a constant value.
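The gating logic described above can be sketched as follows. This is a minimal sketch only: the function shape, the default threshold T, and folding the two variants (disable whenever weighted prediction is on, or disable only above a weighting-factor threshold) into one helper are all assumptions.

```python
def gpm_enabled_for_block(weighted_pred_enabled, weight_factor=None, t=1):
    """Return whether the geometric partitioning mode may be used.

    Variant 1: when no weighting factor is examined, GPM is simply
    disabled whenever weighted prediction is enabled.
    Variant 2: GPM is disabled only when the weighting factor of the
    weighted prediction exceeds a constant threshold T.
    """
    if weighted_pred_enabled and weight_factor is None:
        return False            # variant 1: unconditional disable
    if weighted_pred_enabled and weight_factor > t:
        return False            # variant 2: factor-dependent disable
    return True
```

When GPM is disabled this way, the corresponding mode indication need not be signaled in the bitstream, as the surrounding examples note.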
In some examples, the second codec mode includes bi-prediction with coding unit (CU)-level weights (BCW), a prediction refinement with optical flow (PROF) mode, a bi-directional optical flow (BDOF) mode, a decoder-side motion vector refinement (DMVR) mode, or a sub-block transform (SBT) mode.
In some examples, the second codec mode is disabled when the geometric partitioning mode is enabled.
In some examples, when the geometric partitioning mode is enabled, the indication of the second codec mode is not signaled.
In some examples, the geometric partitioning mode is disabled when the second codec mode is enabled.
In some examples, the indication of the geometric partitioning mode is not signaled when the second codec mode is enabled.
In some examples, the second codec mode is disabled when weighted prediction is enabled.
In some examples, the deblocking process associated with the video block depends on whether both the geometric partition mode and the second codec mode are enabled.
Fig. 17 shows a flowchart of an example method for video processing. The method comprises: determining, for a conversion between a video block of a video and a bitstream of the video, a deblocking process associated with the video block based on whether the video block is coded in a geometric partitioning mode and/or on a color format of the video block (1702); and performing the conversion based on the deblocking process (1704).
In some examples, the geometric partitioning mode includes an entire set of partitioning modes, each partitioning mode of the entire set of partitioning modes partitions the current video block into two or more partitions, and the two or more partitions corresponding to at least one partitioning mode are non-square and non-rectangular.
In some examples, when a video block is coded in a geometric partitioning mode, the weighted-prediction weight values generated for a first component of the video block are used to derive the weight values for a second component of the video block.
In some examples, the first component is a luma component and the second component is a Cb or Cr component.
In some examples, the derivation depends on the color format of the video block.
In some examples, the color format is 4:2:0, 4:2:2, or 4:4:4.
In some examples, the weight values of the second component are derived by applying upsampling or downsampling to the weight values of the first component.
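The chroma-weight derivation above can be sketched as a simple subsampling of the luma weight map. Nearest-neighbour decimation and the function name are assumptions; a real codec might filter the weights rather than drop samples.

```python
def derive_chroma_weights(luma_weights, color_format="4:2:0"):
    """Reuse the luma blending weights for Cb/Cr by subsampling them
    according to the chroma format (weights given as a 2-D list)."""
    h = len(luma_weights)
    w = len(luma_weights[0])
    if color_format == "4:4:4":
        # Chroma has the same resolution as luma: copy the map.
        return [row[:] for row in luma_weights]
    if color_format == "4:2:2":
        # Chroma is subsampled horizontally only.
        return [row[0:w:2] for row in luma_weights]
    if color_format == "4:2:0":
        # Chroma is subsampled in both dimensions.
        return [luma_weights[y][0:w:2] for y in range(0, h, 2)]
    raise ValueError("unsupported chroma format")
```

For 4:2:2 content, the parameters of the partitioning mode may additionally be adjusted before the weights are generated, as noted below.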
In some examples, the weight values used in the weighted prediction of a component are based on the color format of the video block.
In some examples, the component is a Cb component or a Cr component, and the color format is 4:2:0, 4:2:2, or 4:4:4.
In some examples, when the color format is 4:2:2, a parameter associated with the partitioning mode is adjusted to generate the weight values of the component.
In some examples, the geometric partitioning mode includes one or more of a geometric merge mode, a geometric partition mode, a wedge prediction mode, and a triangle prediction mode.
In some examples, the converting includes encoding the video block into a bitstream.
In some examples, the converting includes decoding the video block from a bitstream.
In some examples, the converting includes generating a bitstream from the video block; the method further comprises the following steps: the bitstream is stored in a non-transitory computer-readable recording medium.
Fig. 18 shows a flowchart of an example method for video processing. The method comprises: determining, for a conversion between a current video block of a video and a bitstream of the video, that the current video block is coded in a geometric partitioning mode (1802), wherein the geometric partitioning mode includes an entire set of partitioning modes; determining one or more partitioning modes of the current video block based on partitioning mode indices included in the bitstream (1804), wherein the partitioning mode indices correspond to different partitioning modes from one video block to another video block; generating the bitstream from the video block based on the one or more partitioning modes (1806); and storing the bitstream in a non-transitory computer-readable recording medium (1808).
From the foregoing it will be appreciated that, although specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, various modifications may be made without deviating from the scope of the invention. Accordingly, the presently disclosed technology is not limited, except as by the appended claims.
Embodiments of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible and non-transitory computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine readable storage device, a machine readable storage substrate, a memory device, a composition of matter effecting a machine readable propagated signal, or a combination of one or more of them. The term "data processing unit" or "data processing apparatus" includes all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or groups of computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD ROM and DVD ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various functions described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claim combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Also, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described herein should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims (98)

1. A method for video processing, comprising:
determining, for a conversion between a current video block of a video and a bitstream of the video, that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises an entire set of partitioning modes;
determining one or more partitioning modes of the current video block based on partitioning mode indices included in the bitstream, wherein the partitioning mode indices correspond to different partitioning modes from one video block to another video block; and
performing the conversion based on the one or more partitioning modes.
2. The method of claim 1, wherein each partitioning mode of the entire set of partitioning modes partitions the current video block into two or more partitions, and the two or more partitions corresponding to at least one partitioning mode are non-square and non-rectangular.
3. The method according to claim 1 or 2, wherein each partitioning mode index is associated with a set of parameters comprising at least one of an angle, a distance, and/or a displacement.
4. The method of claim 3, wherein the correspondence of the partition mode index and the partition mode depends on a dimension of the video block, wherein the dimension of the video block comprises at least one of a height, a width, and a ratio of the width and the height of the video block.
5. The method according to any of claims 1-4, wherein the binarization of the partitioning mode index comprises variable-length coding using a first number of bits for a first partitioning mode and a second number of bits for a second partitioning mode.
6. The method of claim 4, wherein a different number of partition modes or a different number of parameters are used in different video blocks.
7. The method of claim 6, wherein the number of partitioning modes or the number of parameters used in a video block depends on the dimension of the video block.
8. The method of claim 7, wherein a number of angles used in a current video block is greater than a number of angles used in a second video block, wherein the current video block is a video block with a height greater than a width and the second video block is a video block with a height less than a width.
9. The method of claim 8, wherein the angle is asymmetric.
10. The method of claim 8, wherein the angle is not rotationally symmetric.
11. The method of claim 8, wherein the angle is not bilaterally symmetric.
12. The method of claim 8, wherein the angle is not quarter-symmetrical.
13. The method of claim 12, wherein the video block comprises at least one of a codec block, a codec unit, a Virtual Pipeline Data Unit (VPDU), a slice, a picture, a sub-picture, a tile, or video.
14. The method of claim 1, wherein the representation of the partitioning pattern in the bitstream is dependent on a priority of the partitioning pattern.
15. The method of claim 14, wherein the priority is determined by an angle and/or distance associated with a segmentation pattern.
16. The method of claim 15, wherein a partitioning mode whose angle is smaller than a predetermined value has a higher priority than a partitioning mode whose angle is larger than the predetermined value.
17. The method of claim 16, wherein the predetermined value is 45 or 90.
18. The method of claim 15, wherein the segmentation patterns associated with angle indices less than a predetermined value have a higher priority than the segmentation patterns associated with angle indices greater than the predetermined value.
19. The method of claim 18, wherein the predetermined value is 4 or 8 or 16.
20. The method according to any of claims 14-19, wherein a partitioning mode with a higher priority requires fewer bits than a partitioning mode with a lower priority when signaled.
21. The method according to any of claims 1-20, wherein the segmentation pattern is classified into two or more classes, and the indication of which class the segmentation pattern belongs to is signaled before further information about the segmentation pattern.
22. The method of claim 21, wherein the angle index is signaled first for a video block in either a clockwise or counterclockwise direction.
23. The method of claim 21, wherein when the angle index is signaled in a counter-clockwise direction, the angle index signaled in the counter-clockwise direction implies a smaller angle index, the smaller angle index representing a smaller angle size.
24. The method of claim 23, wherein if anglexdx-0 means that the angular dimension is equal to 0 degrees, then anglexdx-8 means that the angular dimension is equal to 90 degrees; the angle idx-16 means the angular dimension equals 180 degrees and the angle idx-24 means the angular dimension equals 270 degrees.
25. The method of claim 21, wherein when the angle index is signaled in a clockwise direction, the angle index signaled in a clockwise direction implies a larger angle index, the larger angle index representing a larger angle size.
26. The method of claim 25, wherein, if angleIdx = 0 means that the angle is equal to 0 degrees, then angleIdx = 8 means that the angle is equal to 270 degrees, angleIdx = 16 means that the angle is equal to 180 degrees, and angleIdx = 24 means that the angle is equal to 90 degrees.
27. The method of claim 21, wherein the indication is signaled at a video block level.
28. The method of claim 27, wherein the indication is signaled at least one of an SPS, VPS, PPS, picture header, sub-picture, slice header, slice, tile, CTU, VPDU, CU, and block level.
29. The method of claim 28, wherein a high level flag above a block level is signaled to indicate whether an angle associated with a signaled segmentation mode is clockwise or counterclockwise.
30. The method of claim 28, wherein a block level flag is signaled to indicate whether an angle associated with a signaled segmentation mode is clockwise or counterclockwise.
31. A method for video processing, comprising:
for a transition between a current video block of a video and a bitstream of the current video block, determining that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises an entire set of partitioning modes including a plurality of subsets of partitioning modes, and each partitioning mode is associated with a set of parameters comprising at least one of an angle, a distance, and/or a displacement;
selecting a subset of the segmentation modes or parameters from the entire set of segmentation modes or parameters for the current video block; and
the conversion is performed based on a subset of the segmentation patterns or parameters.
32. The method of claim 31, wherein each partitioning pattern of the entire set of partitioning patterns partitions a current video block into two or more partitions, and the two or more partitions corresponding to at least one partitioning pattern are non-square and non-rectangular.
33. The method of claim 31, wherein only a subset of partition modes are used for video blocks.
34. The method of claim 31, wherein only partition modes associated with a subset of parameters are used for video blocks.
35. The method of claim 31, wherein an indication of whether the selected partitioning mode or parameter is within the subset is further included in the bitstream.
36. The method of claim 31, wherein whether a subset of partitioning modes or a subset of parameters or a set of partitioning modes or parameters is used for a video block depends on decoding information of a current video block or one or more previously decoded video blocks.
37. The method of claim 36, wherein the decoding information comprises syntax elements and/or block dimensions of video blocks.
38. The method of claim 31, wherein the selection of a segmentation mode in the subset of segmentation modes depends on the corresponding angle.
39. The method of claim 37, wherein the subset of segmentation patterns only includes segmentation patterns associated with a distance or displacement equal to 0.
40. The method of claim 38, wherein the subset of segmentation patterns includes only segmentation patterns associated with a specified angle.
41. The method of claim 40, wherein the segmentation pattern associated with the specified angles comprises segmentation patterns associated with a predefined subset of angles combined with all displacements corresponding to the predefined angles.
42. The method of claim 31, wherein the selection of a partitioning mode in the subset of partitioning modes depends on whether low-delay B frame (LDB) coding is applied.
43. The method of claim 42, wherein different subsets of partitioning modes are used for LDB coding and Random Access (RA) coding.
44. The method of claim 31, wherein the selection of the partition mode in the subset of partition modes depends on a reference picture in a reference picture list of a current picture.
45. The method of claim 44, wherein different subsets of the split modes are used in a first case where all reference pictures precede the current picture in display order and in a second case where at least one reference picture succeeds the current picture in display order.
46. The method of claim 31, wherein the selection of a partition mode in the subset of partition modes depends on a derivation of a motion vector candidate.
47. The method of claim 46, wherein different subsets of segmentation modes are used when deriving motion vector candidates from Temporal Motion Vector Prediction (TMVP) candidates, spatial motion candidates, or history-based motion vector prediction (HMVP) candidates.
48. The method of claim 38, wherein the subset of partition modes includes only partition modes that partition blocks in the same manner as the TPM mode.
49. The method of claim 48, wherein the subset of partitioning modes only includes partitioning modes that partition the block by a line connecting its upper-left and lower-right corners or by a line connecting its upper-right and lower-left corners.
50. The method of claim 38, wherein the subset of segmentation patterns only includes segmentation patterns corresponding to diagonal angles with one or more distance or displacement indices.
51. The method of claim 50, wherein the diagonal angle indicates a partitioning pattern corresponding to a partitioning boundary that partitions a block by a line connecting a top-left corner and a bottom-right corner of the block or by a line connecting a top-right corner and a bottom-left corner of the block.
52. The method of claim 50, wherein the subset of segmentation patterns only includes segmentation patterns associated with a distance or displacement equal to 0.
53. The method of claim 52, wherein the subset of segmentation patterns only contains segmentation patterns corresponding to any angle associated with a distance or displacement equal to 0.
54. The method of claim 52, wherein the subset of segmentation patterns only contains segmentation patterns corresponding to diagonal angles associated with a distance or displacement equal to 0.
55. The method of claim 50, wherein the subset of segmentation patterns only includes segmentation patterns corresponding to diagonal angles associated with all distance or displacement indices corresponding to those angles.
56. The method of claim 55, wherein, for a block with an aspect ratio equal to X, the subset of partitioning modes only contains partitioning modes corresponding to angles equal to arctan(X) and/or pi - arctan(X) and/or pi + arctan(X) and/or 2*pi - arctan(X) and all distance indices corresponding to these angles, X being an integer.
57. The method of claim 56, wherein X = 1.
58. The method of claim 57, wherein the horizontal angle and/or the vertical angle are included in a subset of angles.
59. The method of claim 58, wherein the horizontal angle indication corresponds to an angle index of 90 ° and/or 270 °.
60. The method of claim 58, wherein the perpendicular angle indicates an angle index corresponding to 0 ° and/or 180 °.
61. The method of claim 58, wherein segmentation patterns associated with horizontal angles and/or vertical angles combined with a distance/displacement equal to 0 are included in the subset of allowed angles.
62. The method of claim 58, wherein segmentation patterns associated with horizontal angles and/or vertical angles combined with a distance or displacement equal to 0 are not included in the subset of allowed angles.
63. The method of claim 58, wherein segmentation patterns associated with horizontal angles and/or vertical angles combined with all distance or displacement indices are included in a subset of allowed angles.
64. The method of claim 58, wherein segmentation patterns associated with horizontal angles and/or vertical angles combined with all distance or displacement indices are not included in the subset of allowed angles.
65. The method of any of claims 31-64, wherein an indication of whether to use a subset of partition modes or parameters is signaled at least one of an SPS, a VPS, a PPS, a picture header, a sub-picture, a slice header, a slice, a tile, a CTU, a VPDU, a CU, and a block level.
66. The method of claim 65, wherein the indication of which subset of partitioning modes or parameters to use is further signaled at least one of an SPS, a VPS, a PPS, a picture header, a sub-picture, a slice header, a slice, a tile, a CTU, a VPDU, a CU, and a block level.
67. A method for video processing, comprising:
determining, for transitions between video blocks of a video and bitstreams of the video blocks, an enablement of a geometric partitioning mode and a second codec mode different from the geometric partitioning mode for the video blocks; and
performing a conversion based on the determination.
68. The method of claim 67, wherein the geometric partitioning pattern comprises an entire set of partitioning patterns, each partitioning pattern in the entire set of partitioning patterns partitions the current video block into two or more partitions, and the two or more partitions corresponding to at least one partitioning pattern are non-square and non-rectangular.
69. The method of claim 68, wherein the second codec mode comprises weighted prediction.
70. The method of claim 69, wherein the geometric partitioning mode is disabled at a video block level when weighted prediction is enabled.
71. The method of claim 70, wherein, when weighted prediction is enabled at the slice level, the geometric partitioning mode is disabled in at least one of the slice, PPS, SPS, sub-picture, CU, PU, or TU levels.
72. The method of claim 69, wherein whether geometric partitioning mode is used with weighted prediction depends on weighting factors of weighted prediction.
73. The method of claim 72, wherein the geometric partitioning mode is disabled if a weighting factor of the weighted prediction is greater than T, wherein T is a constant value.
74. The method of claim 68, wherein the second codec mode comprises bi-prediction with coding unit (CU)-level weights (BCW), a prediction refinement with optical flow (PROF) mode, a bi-directional optical flow (BDOF) mode, a decoder-side motion vector refinement (DMVR) mode, or a sub-block transform (SBT) mode.
75. The method of claim 68, wherein the second codec mode is disabled when the geometric partitioning mode is enabled.
76. The method of claim 75, wherein the indication of the second codec mode is not signaled when the geometry partitioning mode is enabled.
77. The method of claim 68, wherein the geometric partitioning mode is disabled when the second codec mode is enabled.
78. The method of claim 77, wherein the indication of the geometric partitioning mode is not signaled when the second codec mode is enabled.
79. The method of claim 68, wherein the second codec mode is disabled when weighted prediction is enabled.
80. The method of claim 68, wherein a deblocking process associated with a video block depends upon whether both a geometric partition mode and a second codec mode are enabled.
81. A method for video processing, comprising:
for a transition between a video block of a video and a bitstream of the video block, determining a deblocking process associated with the video block based on whether the current video block is codec in a geometric partitioning mode and/or a color format of the current video block; and
the conversion is performed based on a deblocking process.
82. The method of claim 81, wherein the geometric partitioning pattern comprises an entire set of partitioning patterns, each partitioning pattern in the entire set of partitioning patterns partitions the current video block into two or more partitions, and the two or more partitions corresponding to at least one partitioning pattern are non-square and non-rectangular.
83. The method of claim 82, wherein, when a video block is coded in a geometric partitioning mode, the weighted-prediction weight values generated for a first component of the video block are used to derive the weight values for a second component of the video block.
84. The method of claim 83, wherein the first component is a luma component and the second component is a Cb or Cr component.
85. The method of claim 84, wherein deriving depends on a color format of the video block.
86. The method of claim 85, wherein the color format is 4:2:0, 4:2:2, or 4:4:4.
87. The method of claim 83, wherein the weighted values of the second component are derived by applying upsampling or downsampling to the weighted values of the first component.
88. The method of claim 81, wherein weighting values for weighted prediction of components are based on a color format of a video block.
89. The method of claim 88, wherein a component is a Cb component or a Cr component, and wherein the color format is 4:2:0, 4:2:2, or 4:4: 4.
90. The method of claim 89, wherein, when the color format is 4:2:2, a parameter associated with the partitioning mode is adjusted to generate the weight values for the component.
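Claims 83-90 describe deriving the blending weights of a chroma component from the weights already generated for the luma component, according to the chroma subsampling of the color format. A minimal sketch of that idea follows; this is an illustrative assumption of the claimed derivation (the function name and the simple nearest-sample 2:1 subsampling are not taken from the patent, and the normative derivation may filter rather than pick samples):

```python
def derive_chroma_weights(luma_weights, color_format):
    """Derive per-sample chroma blending weights from luma weights by
    subsampling according to the chroma format (illustrative sketch)."""
    h = len(luma_weights)
    w = len(luma_weights[0])
    if color_format == "4:4:4":
        # Chroma has the same resolution as luma: reuse the weights.
        return [row[:] for row in luma_weights]
    if color_format == "4:2:2":
        # Chroma is subsampled horizontally only: keep every other column.
        return [[row[x] for x in range(0, w, 2)] for row in luma_weights]
    if color_format == "4:2:0":
        # Chroma is subsampled in both directions: keep every other
        # column of every other row.
        return [[luma_weights[y][x] for x in range(0, w, 2)]
                for y in range(0, h, 2)]
    raise ValueError("unsupported color format: " + color_format)
```

Under this sketch, claim 90's "parameter adjustment" for 4:2:2 corresponds to the asymmetric subsampling: the horizontal axis of the weight pattern is halved while the vertical axis is not.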
91. The method of any of claims 1-90, wherein the geometric partitioning mode comprises one or more of a geometric merge mode, a geometric partitioning mode, a wedge prediction mode, and a triangle prediction mode.
92. The method of any of claims 1-91, wherein the conversion comprises encoding the video block into the bitstream.
93. The method of any of claims 1-91, wherein the conversion comprises decoding the video block from the bitstream.
94. The method of any of claims 1-91, wherein the conversion comprises generating the bitstream from the video block; and
the method further comprises:
storing the bitstream in a non-transitory computer-readable recording medium.
95. An apparatus for processing video data comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to:
determine, for a conversion between a current video block of a video and a bitstream of the video, that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises an entire set of partitioning modes;
determine one or more partitioning modes of the current video block based on a partitioning mode index included in the bitstream, wherein the partitioning mode index corresponds to different partitioning modes from one video block to another video block; and
perform the conversion based on the one or more partitioning modes.
96. A non-transitory computer-readable medium storing instructions that cause a processor to:
determine, for a conversion between a current video block of a video and a bitstream of the video, that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises an entire set of partitioning modes;
determine one or more partitioning modes of the current video block based on a partitioning mode index included in the bitstream, wherein the partitioning mode index corresponds to different partitioning modes from one video block to another video block; and
perform the conversion based on the one or more partitioning modes.
97. A non-transitory computer-readable medium storing a bitstream of video generated by a method performed by a video processing apparatus, wherein the method comprises:
for a conversion between a current video block of the video and a bitstream of the video, determining that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises an entire set of partitioning modes;
determining one or more partitioning modes of the current video block based on a partitioning mode index included in the bitstream, wherein the partitioning mode index corresponds to different partitioning modes from one video block to another video block; and
generating the bitstream from the current video block based on the one or more partitioning modes.
98. A method for storing a bitstream of video, comprising:
determining, for a conversion between a current video block of the video and a bitstream of the video, that the current video block is coded in a geometric partitioning mode, wherein the geometric partitioning mode comprises an entire set of partitioning modes;
determining one or more partitioning modes of the current video block based on a partitioning mode index included in the bitstream, wherein the partitioning mode index corresponds to different partitioning modes from one video block to another video block;
generating the bitstream from the current video block based on the one or more partitioning modes; and
storing the bitstream in a non-transitory computer-readable recording medium.
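The independent claims above all turn on mapping a partitioning mode index carried in the bitstream to a concrete non-rectangular split, and blending two predictions with per-sample weights derived from that split. In GPM-style schemes each index selects an angle/offset pair, and each sample's weight follows from its signed distance to the partition line. The sketch below illustrates that mechanism; it is a non-normative assumption (the +/-4-sample ramp, the 0..8 weight range, and the function names are illustrative, not the standard's derivation):

```python
import math

def gpm_sample_weight(x, y, width, height, angle_deg, offset):
    """Blending weight in 0..8 for sample (x, y): samples far on one side
    of the partition line get 8, far on the other side get 0, with a soft
    ramp near the line (illustrative, non-normative)."""
    # Signed distance from the block-centered sample to the partition line.
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    d = ((x - cx) * math.cos(math.radians(angle_deg))
         + (y - cy) * math.sin(math.radians(angle_deg)) - offset)
    # Soft ramp around the line, clipped to the [0, 8] weight range.
    return min(8, max(0, round(4 + d)))

def gpm_weight_mask(width, height, angle_deg, offset):
    """Per-sample weight mask for one (angle, offset) partitioning mode."""
    return [[gpm_sample_weight(x, y, width, height, angle_deg, offset)
             for x in range(width)] for y in range(height)]
```

The two inter predictions P0 and P1 for the partitions would then be blended per sample as `(w * P0 + (8 - w) * P1 + 4) >> 3`, so samples near the partition line take a mix of both predictions while samples far from it take one prediction only.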
CN202180013097.8A 2020-02-07 2021-02-07 Geometric partitioning mode Pending CN115136601A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CNPCT/CN2020/074499 2020-02-07
CN2020074499 2020-02-07
PCT/CN2021/075821 WO2021155865A1 (en) 2020-02-07 2021-02-07 Geometric partitioning mode

Publications (1)

Publication Number Publication Date
CN115136601A 2022-09-30

Family

ID=77200518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180013097.8A Pending CN115136601A (en) Geometric partitioning mode

Country Status (2)

Country Link
CN (1) CN115136601A (en)
WO (1) WO2021155865A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023158765A1 (en) * 2022-02-16 2023-08-24 Beijing Dajia Internet Information Technology Co., Ltd. Methods and devices for geometric partitioning mode split modes reordering with pre-defined modes order

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN106851292A * 2010-07-02 2017-06-13 Humax Co., Ltd. Method for decoding an image using intra prediction
CN102611880B * 2011-01-19 2015-02-04 Huawei Technologies Co., Ltd. Encoding method and device for marking the geometric partitioning mode of an image block

Also Published As

Publication number Publication date
WO2021155865A1 (en) 2021-08-12

Similar Documents

Publication Publication Date Title
US11985323B2 (en) Quantized residual differential pulse code modulation representation of coded video
US11805268B2 (en) Two step cross-component prediction mode
US11700378B2 (en) High level syntax for inter prediction with geometric partitioning
CN113875233A (en) Matrix-based intra prediction using upsampling
US20240121381A1 (en) Intra coded video using quantized residual differential pulse code modulation coding
WO2021129682A1 (en) Improvements on merge mode
WO2021110116A1 (en) Prediction from multiple cross-components
CN115668923A (en) Indication of multiple transform matrices in coded video
WO2021115235A1 (en) Cross-component prediction using multiple components
WO2021104433A1 (en) Simplified inter prediction with geometric partitioning
US20230097850A1 (en) Intra block copy using non-adjacent neighboring blocks
WO2021155865A1 (en) Geometric partitioning mode
WO2021136361A1 (en) Motion vector difference for block with geometric partition
US11778176B2 (en) Intra block copy buffer and palette predictor update
WO2023284694A1 (en) Method, apparatus, and medium for video processing
US20230262226A1 (en) Sample string processing in intra coding
WO2023030504A1 (en) Method, device, and medium for video processing
WO2022262689A1 (en) Method, device, and medium for video processing
CN117581538A (en) Video processing method, apparatus and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination