CN113261281A - Use of interleaved prediction - Google Patents
Use of interleaved prediction
- Publication number
- CN113261281A (application CN202080007786.3A)
- Authority
- CN
- China
- Prior art keywords
- block
- prediction
- sub
- current
- current video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 claims abstract description 432
- 238000012545 processing Methods 0.000 claims abstract description 69
- 238000005192 partition Methods 0.000 claims abstract description 22
- 230000007704 transition Effects 0.000 claims abstract description 9
- 239000013598 vector Substances 0.000 claims description 113
- 238000001914 filtration Methods 0.000 claims description 75
- 238000004590 computer program Methods 0.000 claims description 17
- 230000015654 memory Effects 0.000 claims description 13
- 230000001131 transforming effect Effects 0.000 claims description 5
- 230000008569 process Effects 0.000 description 143
- 230000009977 dual effect Effects 0.000 description 56
- 238000003491 array Methods 0.000 description 23
- 238000005516 engineering process Methods 0.000 description 23
- 238000010586 diagram Methods 0.000 description 20
- 230000002123 temporal effect Effects 0.000 description 18
- 230000002146 bilateral effect Effects 0.000 description 16
- 238000009795 derivation Methods 0.000 description 16
- 230000003287 optical effect Effects 0.000 description 16
- 238000005286 illumination Methods 0.000 description 14
- 230000003044 adaptive effect Effects 0.000 description 9
- 238000012886 linear function Methods 0.000 description 8
- 238000005070 sampling Methods 0.000 description 8
- 230000009466 transformation Effects 0.000 description 6
- 238000013459 approach Methods 0.000 description 5
- 230000004048 modification Effects 0.000 description 5
- 238000012986 modification Methods 0.000 description 5
- 238000013139 quantization Methods 0.000 description 5
- 238000006073 displacement reaction Methods 0.000 description 4
- 230000011664 signaling Effects 0.000 description 4
- 238000004364 calculation method Methods 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 238000000638 solvent extraction Methods 0.000 description 3
- 230000002457 bidirectional effect Effects 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 230000006835 compression Effects 0.000 description 2
- 238000007906 compression Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 230000001788 irregular Effects 0.000 description 2
- 230000002093 peripheral effect Effects 0.000 description 2
- 230000000644 propagated effect Effects 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 238000013515 script Methods 0.000 description 2
- 238000000926 separation method Methods 0.000 description 2
- 238000012935 Averaging Methods 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000004422 calculation algorithm Methods 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 230000006837 decompression Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 230000002441 reversible effect Effects 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/573—Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/577—Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
There is provided a method of video processing, comprising: for a conversion between a current video block of a video and a bitstream representation of the video, subdividing the current video block into partitions according to a plurality of subdivision patterns based on a height (H) or a width (W) of the current video block; and performing the conversion using interleaved prediction of the plurality of partitions.
Description
Cross Reference to Related Applications
Under the applicable patent law and/or rules pursuant to the Paris Convention, this application is made to timely claim the priority to and benefits of International Patent Application No. PCT/CN2019/070058 filed on January 2, 2019, International Patent Application No. PCT/CN2019/071507 filed on January 13, 2019, and International Patent Application No. PCT/CN2019/071576 filed on January 14, 2019. The entire disclosures of the above applications are incorporated by reference as part of the disclosure of this application for all purposes under U.S. law.
Technical Field
This patent document relates to video encoding and decoding techniques, apparatus and systems.
Background
Motion Compensation (MC) is a technique in video processing for predicting frames in video given previous and/or future frames by considering the motion of the camera and/or objects in the video. Motion compensation may be used for encoding of video data for video compression.
Disclosure of Invention
This document discloses methods, systems, and apparatus relating to sub-block based motion prediction in video motion compensation.
In one representative aspect, a method of video processing is disclosed. The method comprises the following steps: deriving one or more motion vectors for a first set of sub-blocks belonging to a first subdivision pattern of a current video block of the video; and performing a conversion between the current video block and a codec representation of the video based on the one or more motion vectors.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: subdividing the video blocks of the first color component to obtain a first set of sub-blocks of the first color component; subdividing corresponding video blocks of the second color component to obtain a second set of sub-blocks of the second color component; deriving one or more motion vectors of the first set of sub-blocks based on the one or more motion vectors of the second set of sub-blocks; and performing a conversion between the video block and a codec representation of the video based on the one or more motion vectors of the first set of sub-blocks and the second set of sub-blocks.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: for a conversion between a current video block of a video and a bitstream representation of the video, subdividing the current video block into partitions according to a plurality of subdivision patterns based on a height (H) or a width (W) of the current video block; and performing the conversion using interleaved prediction of the plurality of partitions.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: determining to apply prediction to a current video block of the video, the prediction comprising subdividing the current video block into sub-blocks according to a subdivision pattern; determining to apply bit shifting to generate a prediction block on a sub-block of a current video block; and performing a conversion between the current video block and a codec representation of the video.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: determining, based on a characteristic of a current video block of a video, whether to use an interleaved prediction tool for a conversion between the current video block and a codec representation of the video; and performing the conversion in accordance with the determination, wherein, upon determining that the characteristic of the current video block does not satisfy a condition, the conversion is performed by disabling use of an affine prediction tool and/or an interleaved prediction tool.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: determining, based on a characteristic of a current video block of a video, whether to use an interleaved prediction tool for a conversion between the current video block and a codec representation of the video; and performing the conversion according to the determination, wherein, upon determining that the characteristic of the current video block satisfies a condition, the conversion is performed using an affine prediction tool and/or an interleaved prediction tool.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: determining that interleaved prediction is to be applied to a current video block of a video; disabling bi-prediction for the current video block based on the determination that interleaved prediction is to be applied; and performing a conversion between the current video block and a codec representation of the video.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: determining refined motion information of at least one sub-block of a current video block for a conversion between the current video block of a video and a codec representation of the video; and performing the conversion using the refined motion information, wherein the refined motion information is generated based on an interleaved prediction tool in which the motion information of partitions of the current video block is generated using a plurality of patterns, and wherein the refined motion information of the current video block is used for subsequent processing or is selectively stored based on whether a condition is satisfied.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: determining whether to apply interleaved prediction to a current video block of a video; determining whether to use a filtering process on the current video block based on the determination of whether to apply interleaved prediction to the current video block; and performing a conversion between the current video block and a codec representation of the video based on the determination of the use of the filtering process.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: determining whether to apply interleaved prediction to a current video block of a video; determining whether to use local illumination compensation or weighted prediction for the current video block based on the determination of the use of interleaved prediction; and performing a conversion between the current video block and a codec representation of the video based on the determination of the use of the local illumination compensation or the weighted prediction.
In another representative aspect, a method of video processing is disclosed. The method comprises the following steps: determining that weighted prediction is to be applied to a current video block of a video or to a sub-block of the current video block; and performing a conversion between the current video block and a codec representation of the video with bi-directional optical flow (BDOF) techniques disabled.
In another representative aspect, an apparatus is disclosed that includes a processor and a non-transitory memory having instructions thereon. The instructions, when executed by a processor, cause the processor to select a set of pixels from a video frame to form a block, subdivide the block into a first set of sub-blocks according to a first pattern, generate a first intermediate prediction block based on the first set of sub-blocks, subdivide the block into a second set of sub-blocks according to a second pattern, wherein at least one sub-block in the second set has a different size than the sub-blocks in the first set, generate a second intermediate prediction block based on the second set of sub-blocks, and determine the prediction block based on the first intermediate prediction block and the second intermediate prediction block.
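As an illustration of the interleaved prediction flow described in the preceding aspect, the following sketch predicts a block once per subdivision pattern and combines the intermediate prediction blocks by weighted averaging. This is only a minimal sketch under assumed interfaces: the names motion_compensate, patterns, and weights are hypothetical placeholders, and the rounding used for the weighted average is an assumed example rather than a normative rule.

```python
import numpy as np

def interleaved_prediction(block_size, patterns, weights, motion_compensate):
    """Minimal sketch of interleaved prediction.

    patterns: list of subdivision patterns, each a list of (x, y, w, h) sub-blocks
              covering the block without overlap.
    weights:  list of integer weight arrays of shape (H, W), one per pattern.
    motion_compensate: hypothetical callable returning the motion compensated
              prediction (an h x w array) of one sub-block.
    """
    H, W = block_size
    acc = np.zeros((H, W), dtype=np.int64)
    total_w = np.zeros((H, W), dtype=np.int64)
    for pattern, weight in zip(patterns, weights):
        inter = np.zeros((H, W), dtype=np.int64)  # intermediate prediction for this pattern
        for (x, y, w, h) in pattern:
            inter[y:y + h, x:x + w] = motion_compensate(x, y, w, h)
        acc += weight * inter        # weighted accumulation of the intermediate predictions
        total_w += weight
    # final prediction block: weighted average with rounding
    return (acc + total_w // 2) // total_w
```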
In yet another representative aspect, a method of video processing includes deriving one or more motion vectors for a first set of sub-blocks of a current video block, wherein each of the first set of sub-blocks has a first subdivision pattern, and reconstructing the current video block based on the one or more motion vectors.
In yet another representative aspect, the various techniques described herein may be implemented as a computer program product stored on a non-transitory computer readable medium. The computer program product comprises program code for performing the methods described herein.
In yet another representative aspect, a video decoder device may implement a method as described herein.
The details of one or more implementations are set forth in the accompanying drawings, the drawings, and the description below. Other features will be apparent from the description and drawings, and from the claims.
Drawings
Fig. 1 is a diagram illustrating an example of sub-block based prediction.
Fig. 2 shows an example of an affine motion field of a block described by two control point motion vectors.
Fig. 3 shows an example of an affine motion vector field for each sub-block of a block.
Fig. 4 shows an example of motion vector prediction of a block 400 in AF _ INTER mode.
Fig. 5A shows an example of a selection order of candidate blocks of a current Codec Unit (CU).
Fig. 5B shows another example of a candidate block of the current CU in the AF _ MERGE mode.
Fig. 6 shows an example of an Alternative Temporal Motion Vector Prediction (ATMVP) motion prediction process for a CU.
Fig. 7 shows an example of one CU with four sub-blocks and neighboring blocks.
FIG. 8 illustrates an exemplary optical flow trace in a bi-directional optical flow (BIO) method.
Fig. 9A shows an example of access locations outside of a block.
Fig. 9B shows a padding area that may be used to avoid additional memory accesses and computations.
Fig. 10 illustrates an example of bilateral matching used in a Frame Rate Up Conversion (FRUC) method.
Fig. 11 illustrates an example of template matching used in the FRUC method.
Fig. 12 shows an example of unilateral Motion Estimation (ME) in the FRUC method.
Fig. 13 illustrates an example of interleaved prediction with two subdivision patterns in accordance with the disclosed technique.
Fig. 14A illustrates an exemplary subdivision pattern of a block into 4x4 sub-blocks in accordance with the disclosed technique.
Fig. 14B illustrates an exemplary subdivision pattern of a block into 8x8 sub-blocks in accordance with the disclosed techniques.
Fig. 14C illustrates an exemplary subdivision pattern of a block into 4x 8 sub-blocks in accordance with the disclosed technique.
Fig. 14D illustrates an exemplary subdivision pattern of a block into 8x 4 sub-blocks in accordance with the disclosed technique.
Fig. 14E illustrates an example subdivision pattern for subdividing a block into non-uniform sub-blocks in accordance with the disclosed technique.
Fig. 14F illustrates another example subdivision pattern for subdividing a block into non-uniform sub-blocks in accordance with the disclosed techniques.
Fig. 14G illustrates yet another example subdivision pattern for subdividing a block into non-uniform sub-blocks in accordance with the disclosed techniques.
Fig. 15A to 15D show exemplary embodiments of partial interleaving prediction.
Fig. 16A to 16C show exemplary embodiments of deriving MVs of one subdivision pattern from another subdivision pattern.
Fig. 17A to 17C illustrate exemplary embodiments of selecting a subdivision pattern based on the size of a current video block.
Fig. 18A and 18B illustrate exemplary embodiments of deriving MVs of sub-blocks in one component within a subdivision pattern from MVs of another component of sub-blocks within another subdivision pattern.
Fig. 19 is an example flow diagram of a method of video processing based on some implementations of the disclosed technology.
Fig. 20 is an example flow diagram of a method of video processing based on some other implementations of the disclosed technology.
Fig. 21A-21D are example flow diagrams of methods of video processing based on some other implementations of the disclosed technology.
Fig. 22A-22D are example flow diagrams of methods of video processing based on some other implementations of the disclosed technology.
Fig. 23 and 24 are block diagrams of examples of hardware platforms for implementing the video processing methods described in the disclosed technology.
Detailed Description
Global motion compensation is one of many variations of motion compensation techniques and can be used to predict the motion of a camera. However, moving objects within a frame are not adequately represented by various implementations of global motion compensation. Local motion estimation (such as block motion compensation) that subdivides a frame into blocks of pixels for motion prediction may be used to account for objects that are moving within the frame.
Sub-block based prediction, which was developed from block motion compensation, was first introduced into video coding standards by High Efficiency Video Coding (HEVC) Annex I (3D-HEVC). Fig. 1 is a diagram illustrating an example of sub-block based prediction. With sub-block based prediction, a block 100, such as a Coding Unit (CU) or a Prediction Unit (PU), is subdivided into several non-overlapping sub-blocks 101. Different sub-blocks may be assigned different motion information, such as reference indices or Motion Vectors (MVs). Motion compensation is then performed separately for each sub-block.
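A minimal sketch of this per-sub-block motion compensation loop is given below; it is illustrative only, and the helpers mv_of (motion information assigned to a sub-block) and mc (motion compensated prediction of a sub-block) are hypothetical placeholders.

```python
def subblock_prediction(block_w, block_h, sub_w, sub_h, mv_of, mc):
    """Perform motion compensation separately for each non-overlapping sub-block."""
    pred = [[0] * block_w for _ in range(block_h)]
    for y in range(0, block_h, sub_h):
        for x in range(0, block_w, sub_w):
            mv, ref_idx = mv_of(x, y)                     # motion info of this sub-block
            patch = mc(x, y, sub_w, sub_h, mv, ref_idx)   # MC prediction of this sub-block
            for dy in range(sub_h):
                for dx in range(sub_w):
                    pred[y + dy][x + dx] = patch[dy][dx]
    return pred
```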
In order to explore future video codec technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by the Video Coding Experts Group (VCEG) and the Moving Picture Experts Group (MPEG) in 2015. Many methods have been adopted by JVET and added into the reference software named Joint Exploration Model (JEM). In JEM, sub-block based prediction is employed in several codec techniques, such as affine prediction, alternative temporal motion vector prediction (ATMVP), spatial-temporal motion vector prediction (STMVP), bi-directional optical flow (BIO), and frame rate up-conversion (FRUC), which are discussed in detail below.
Affine prediction
In HEVC, only a translational motion model is applied for Motion Compensated Prediction (MCP). However, the camera and objects may have many types of motion, such as zoom in/out, rotation, perspective motion, and/or other unusual motion. JEM, on the other hand, applies a simplified affine transform motion compensated prediction. Fig. 2 shows an example of the affine motion field of a block 200 described by two control point motion vectors V0 and V1. The Motion Vector Field (MVF) of block 200 may be described by the following equation:
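As recalled from the JEM description of affine prediction (w denotes the width of the block and (x, y) the position of a sample relative to the top-left sample of the block), the equation referred to here as equation (1) takes the form:

vx = ((v1x − v0x)/w)·x − ((v1y − v0y)/w)·y + v0x
vy = ((v1y − v0y)/w)·x + ((v1x − v0x)/w)·y + v0y        Equation (1)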
as shown in fig. 2, (v)0x,v0y) Is the motion vector of the upper left corner control point, and(v1x,v1y) Is the motion vector of the upper right hand corner control point. To simplify motion compensated prediction, sub-block based affine transform prediction may be applied. The subblock size M × N is derived as follows:
here, MVPre is the motion vector score precision (e.g., 1/16 in JEM). (v)2x,v2y) Is the motion vector of the lower left control point calculated according to equation (1). If desired, M and N can be adjusted downward to be divisors of w and h, respectively.
Fig. 3 shows an example of affine MVF for each sub-block of block 300. To derive the motion vector for each M × N sub-block, the motion vector for the center sample of each sub-block may be calculated according to equation (1) and rounded to motion vector fractional precision (e.g., 1/16 in JEM). A motion compensated interpolation filter may then be applied to generate a prediction for each sub-block using the derived motion vectors. After MCP, the high precision motion vector of each sub-block is rounded and saved to the same precision as the normal motion vector.
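A short sketch of this sub-block MV derivation under the 4-parameter affine model recalled above is given below; it is illustrative only (names, the sub-block size, and the omitted 1/16-pel rounding step are assumptions).

```python
def affine_subblock_mvs(v0, v1, w, h, sub_w=4, sub_h=4):
    """Derive one MV per sub-block by evaluating the 4-parameter affine model
    at each sub-block center. v0 and v1 are the top-left and top-right CPMVs."""
    a = (v1[0] - v0[0]) / w   # d(vx)/dx
    b = (v1[1] - v0[1]) / w   # d(vy)/dx
    mvs = []
    for y in range(0, h, sub_h):
        row = []
        for x in range(0, w, sub_w):
            cx, cy = x + sub_w / 2.0, y + sub_h / 2.0     # sub-block center sample
            mv_x = a * cx - b * cy + v0[0]
            mv_y = b * cx + a * cy + v0[1]
            row.append((mv_x, mv_y))                      # would be rounded to 1/16-pel in JEM
        mvs.append(row)
    return mvs
```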
In JEM, there are two affine motion modes: AF_INTER mode and AF_MERGE mode. For CUs with both width and height larger than 8, AF_INTER mode may be applied. An affine flag at the CU level is signaled in the bitstream to indicate whether AF_INTER mode is used. In AF_INTER mode, a candidate list of motion vector pairs {(v0, v1) | v0 = {vA, vB, vC}, v1 = {vD, vE}} is constructed using the neighboring blocks. Fig. 4 shows an example of Motion Vector Prediction (MVP) of a block 400 in AF_INTER mode. As shown in Fig. 4, v0 is selected from the motion vectors of sub-block A, B, or C. The motion vectors from the neighboring blocks may be scaled according to the reference list. The motion vectors may also be scaled according to a relationship among the reference Picture Order Count (POC) of the neighboring block, the reference POC of the current CU, and the POC of the current CU. The approach to selecting v1 from the neighboring sub-blocks D and E is similar. If the number of candidates in the list is smaller than 2, the list is padded with motion vector pairs formed by duplicating each of the AMVP candidates. When the candidate list is larger than 2, the candidates may first be sorted according to the neighboring motion vectors (e.g., based on the similarity of the two motion vectors in a pair of candidates). In some implementations, the first two candidates are retained. In some embodiments, a rate-distortion (RD) cost check is used to determine which motion vector pair candidate is selected as the Control Point Motion Vector Predictor (CPMVP) of the current CU. An index indicating the position of the CPMVP in the candidate list may be signaled in the bitstream. After the CPMVP of the current affine CU is determined, affine motion estimation is applied and the Control Point Motion Vectors (CPMVs) are found. The difference between the CPMV and the CPMVP is then signaled in the bitstream.
When a CU is coded in AF_MERGE mode, it gets the first block coded in affine mode from the valid neighboring reconstructed blocks. Fig. 5A shows an example of the selection order of candidate blocks of the current CU 500. As shown in Fig. 5A, the selection order may be from left (501), above (502), above-right (503), below-left (504) to above-left (505) of the current CU 500. Fig. 5B shows another example of a candidate block of the current CU 500 in AF_MERGE mode. If the neighboring below-left block 501 is coded in affine mode, as shown in Fig. 5B, the motion vectors v2, v3, and v4 of the top-left, top-right, and bottom-left corners of the CU containing sub-block 501 are derived. The motion vector v0 of the top-left corner of the current CU 500 is calculated based on v2, v3, and v4. The motion vector v1 of the top-right corner of the current CU is calculated accordingly. After the CPMVs v0 and v1 of the current CU are derived, the MVF of the current CU can be generated according to the affine motion model in equation (1). To identify whether the current CU is coded in AF_MERGE mode, an affine flag may be signaled in the bitstream when there is at least one neighboring block coded in affine mode.
Alternative temporal motion vector prediction (ATMVP)
In the ATMVP method, the Temporal Motion Vector Prediction (TMVP) method is modified by retrieving multiple sets of motion information (including motion vectors and reference indices) from a block smaller than the current CU.
Fig. 6 shows an example of the ATMVP motion prediction process for CU 600. The ATMVP method predicts the motion vector of sub-CU 601 within CU 600 in two steps. The first step is to identify a corresponding block 651 in the reference picture 650 with a temporal vector. The reference picture 650 is also referred to as a motion source picture. The second step is to partition the current CU 600 into sub-CUs 601 and obtain the motion vector and reference index of each sub-CU from the block corresponding to each sub-CU.
In the first step, the reference picture 650 and the corresponding block are determined from the motion information of the spatial neighboring blocks of the current CU 600. To avoid the repetitive scanning process of the neighboring blocks, the first Merge candidate in the Merge candidate list of the current CU 600 is used. The first available motion vector and its associated reference index are set to be the temporal vector and the index of the motion source picture. In this way, the corresponding block can be identified more accurately than in TMVP, where the corresponding block (sometimes called a collocated block) is always at the bottom-right or center position relative to the current CU.
In the second step, a corresponding block 651 of the sub-CU is identified by the temporal vector in the motion source picture 650, by adding the temporal vector to the coordinates of the current CU. For each sub-CU, the motion information of its corresponding block (e.g., the smallest motion grid covering the center sample) is used to derive the motion information of the sub-CU. After the motion information of the corresponding N×N block is identified, it is converted into the motion vectors and reference indices of the current sub-CU in the same way as the TMVP of HEVC, wherein motion scaling and other procedures apply. For example, the decoder checks whether the low-delay condition is fulfilled (e.g., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) and possibly uses a motion vector MVx (e.g., the motion vector corresponding to reference picture list X) to predict the motion vector MVy for each sub-CU (e.g., with X being equal to 0 or 1 and Y being equal to 1−X).
Spatial-temporal motion vector prediction (STMVP)
In the STMVP method, the motion vectors of the sub-CUs are derived recursively, following raster scan order. Fig. 7 shows an example of one CU with four sub-blocks and neighboring blocks. Consider an 8×8 CU 700, which includes four 4×4 sub-CUs: A (701), B (702), C (703), and D (704). The neighboring 4×4 blocks in the current frame are labeled a (711), b (712), c (713), and d (714).
The motion derivation of sub-CU A begins by identifying its two spatial neighbors. The first neighbor is the N×N block above sub-CU A 701 (block c 713). If this block c (713) is not available or is intra coded, the other N×N blocks above sub-CU A (701) are checked (from left to right, starting at block c 713). The second neighbor is the block to the left of sub-CU A 701 (block b 712). If block b (712) is not available or is intra coded, the other blocks to the left of sub-CU A 701 are checked (from top to bottom, starting at block b 712). The motion information obtained from the neighboring blocks for each list is scaled to the first reference frame of the given list. Next, the Temporal Motion Vector Prediction (TMVP) of sub-block A 701 is derived by following the same procedure as the TMVP specified in HEVC. The motion information of the collocated block at block D 704 is fetched and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
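The final averaging step described above can be sketched as follows; it is illustrative only and assumes that the spatial and temporal motion vectors have already been fetched and scaled to the first reference frame of the list under consideration.

```python
def stmvp_average(above_mv, left_mv, tmvp_mv):
    """Average all available (already scaled) MVs of one sub-CU for one reference list.
    Each argument is either None (unavailable) or an (mvx, mvy) tuple."""
    candidates = [mv for mv in (above_mv, left_mv, tmvp_mv) if mv is not None]
    if not candidates:
        return None
    avg_x = sum(mv[0] for mv in candidates) / len(candidates)
    avg_y = sum(mv[1] for mv in candidates) / len(candidates)
    return (avg_x, avg_y)   # assigned as the MV of the current sub-CU
```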
Bidirectional optical flow (BIO)
The bi-directional optical flow (BIO) approach is a sample-wise motion refinement over block-wise motion compensation for bi-directional prediction. In some implementations, the sample-level motion refinement does not use signaling.
Let I^(k) be the luma value from reference k (k = 0, 1) after block motion compensation, and let ∂I^(k)/∂x and ∂I^(k)/∂y be the horizontal and vertical components of the gradient of I^(k), respectively. Assuming that the optical flow is valid, the motion vector field (vx, vy) is given by:
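The optical flow constraint referred to here has the standard form (recalled for readability; ∂I^(k)/∂t denotes the temporal derivative):

∂I^(k)/∂t + vx·∂I^(k)/∂x + vy·∂I^(k)/∂y = 0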
combining this optical flow equation with Hermite interpolation for the motion trajectory of each sample point yields a unique third order polynomial that matches the function value I(k)And derivatives at the endsAnd both. The value of the polynomial at t-0 is the BIO prediction:
Fig. 8 illustrates an exemplary optical flow trajectory in the bi-directional optical flow (BIO) method. Here, τ0 and τ1 denote the distances to the reference frames. The distances τ0 and τ1 are calculated based on the POC of Ref0 and Ref1: τ0 = POC(current) − POC(Ref0), τ1 = POC(Ref1) − POC(current). If both predictions come from the same temporal direction (either both from the past or both from the future), the signs are different (e.g., τ0 · τ1 < 0). In this case, BIO is applied only if the predictions are not from the same time instant (e.g., τ0 ≠ τ1), both referenced regions have non-zero motion (e.g., MVx0, MVy0, MVx1, MVy1 not equal to 0), and the block motion vectors are proportional to the temporal distances (e.g., MVx0/MVx1 = MVy0/MVy1 = −τ0/τ1).
The motion vector field (vx, vy) is determined by minimizing the difference Δ between the values at points A and B. Figs. 9A-9B illustrate an example of the intersection of the motion trajectory and the reference frame planes. For Δ, the model uses only the first linear term of a local Taylor expansion:
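As recalled from the JEM description, this first-order term is:

Δ = I^(0) − I^(1) + vx·(τ1·∂I^(1)/∂x + τ0·∂I^(0)/∂x) + vy·(τ1·∂I^(1)/∂y + τ0·∂I^(0)/∂y)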
all values in the above equation depend on the sample position and are denoted as (i ', j'). Assuming that the motion is consistent in the local surrounding area, Δ may be minimized within a (2M +1) × (2M +1) square window Ω centered on the current predicted point (i, j), where M equals 2:
for this optimization problem, JEM uses a simplified approach, first minimizing in the vertical direction and then in the horizontal direction. This results in the following:
where the terms s1, …, s6, defined in equation (9), are sums of gradient and sample-difference products accumulated over the window Ω.
to avoid division by zero or a small value, the regularization parameters r and m may be introduced into equations (7) and (8).
r = 500 · 4^(d−8)        Equation (10)
m = 700 · 4^(d−8)        Equation (11)
Here, d is the bit depth of the video samples.
To keep the memory access for BIO the same as for conventional bi-predictive motion compensation, all prediction and gradient values I^(k), ∂I^(k)/∂x, ∂I^(k)/∂y are computed only for positions inside the current block. Fig. 9A shows an example of access positions outside of block 900. As shown in Fig. 9A, in equation (9), a (2M+1)×(2M+1) square window Ω centered on a currently predicted point on the boundary of the predicted block needs to access positions outside the block. In JEM, the values of I^(k), ∂I^(k)/∂x, ∂I^(k)/∂y outside the block are set equal to the nearest available value inside the block. This may be implemented, for example, as padding area 901, as shown in Fig. 9B.
With BIO, the motion field can be refined for each sample. To reduce the computational complexity, a block-based design of BIO is used in JEM. The motion refinement may be calculated based on 4x4 blocks. In block-based BIO, the values of s_n in equation (9) of all samples in a 4x4 block are aggregated, and the aggregated values of s_n are then used to derive the BIO motion vector offset of the 4x4 block. More specifically, the following formula may be used for block-based BIO derivation:
here, bkRefers to the set of samples belonging to the kth 4x4 block of the predicted block. S in equations (7) and (8)nQuilt ((s)n,bk) > 4) to derive the associated motion vector offset.
In some cases, the MV refinement of BIO may be unreliable due to noise or irregular motion. Therefore, in BIO, the magnitude of the MV refinement is clipped to a threshold. The threshold is determined based on whether all reference pictures of the current picture come from one direction. For example, if all reference pictures of the current picture are from one direction, the value of the threshold is set to 12 × 2^(14−d); otherwise, it is set to 12 × 2^(13−d).
Gradients for BIO may be calculated at the same time as motion compensated interpolation, using operations consistent with the HEVC motion compensation process (e.g., a 2D separable Finite Impulse Response (FIR) filter). In some embodiments, the input to the 2D separable FIR is the same reference frame samples as for the motion compensation process, and the fractional position (fracX, fracY) according to the fractional part of the block motion vector. For the horizontal gradient, the signal is first interpolated vertically using BIOfilterS corresponding to the fractional position fracY with de-scaling shift d−8; the gradient filter BIOfilterG is then applied in the horizontal direction corresponding to the fractional position fracX with de-scaling shift 18−d. For the vertical gradient, the gradient filter BIOfilterG is first applied vertically corresponding to the fractional position fracY with de-scaling shift d−8; the signal displacement is then performed using BIOfilterS in the horizontal direction corresponding to the fractional position fracX with de-scaling shift 18−d. The length of the interpolation filters for gradient calculation (BIOfilterG) and signal displacement (BIOfilterF) may be shorter (e.g., 6-tap) in order to maintain reasonable complexity. Table 1 shows exemplary filters that may be used for gradient calculation at different fractional positions of the block motion vector in BIO. Table 2 shows exemplary interpolation filters that may be used for prediction signal generation in BIO.
Table 1: exemplary Filter for gradient computation in BIO
Fractional pixel position | Gradient interpolation filter (BIOfilterg) |
0 | {8,-39,-3,46,-17,5} |
1/16 | {8,-32,-13,50,-18,5} |
1/8 | {7,-27,-20,54,-19,5} |
3/16 | {6,-21,-29,57,-18,5} |
1/4 | {4,-17,-36,60,-15,4} |
5/16 | {3,-9,-44,61,-15,4} |
3/8 | {1,-4,-48,61,-13,3} |
7/16 | {0,1,-54,60,-9,2} |
1/2 | {-1,4,-57,57,-4,1} |
Table 2: exemplary interpolation Filter for prediction Signal Generation in BIO
Fractional pixel position | Interpolation filter for prediction signal (BIOfilters) |
0 | {0,0,64,0,0,0} |
1/16 | {1,-3,64,4,-2,0} |
1/8 | {1,-6,62,9,-3,1} |
3/16 | {2,-8,60,14,-5,1} |
1/4 | {2,-9,57,19,-7,2} |
5/16 | {3,-10,53,24,-8,2} |
3/8 | {3,-11,50,29,-9,2} |
7/16 | {3,-11,44,35,-10,3} |
1/2 | {3,-10,35,44,-11,3} |
In JEM, when the two predictions are from different reference pictures, the BIO may be applied to all bi-predicted blocks. When Local Illumination Compensation (LIC) is enabled for a CU, the BIO may be disabled.
In some embodiments, OBMC is applied to the block after the normal MC process. To reduce computational complexity, BIO may not be applied during the OBMC process. This means that during the OBMC process, when its own MV is used, BIO is applied in the MC process of the block, and when the MV of the neighboring block is used, BIO is not applied in the MC process.
Frame Rate Up Conversion (FRUC)
When the Merge flag of a CU is true, the FRUC flag may be signaled to the CU. When the FRUC flag is false, the Merge index may be signaled and the normal Merge mode is used. When the FRUC flag is true, an additional FRUC mode flag may be signaled to indicate which method (e.g., bilateral matching or template matching) is to be used to derive motion information for the block.
At the encoder side, the decision whether to use FRUC Merge mode for a CU is based on RD cost selection, as is done for normal Merge candidates. For example, multiple matching modes (e.g., bilateral matching and template matching) are checked for a CU by using RD cost selection. The one leading to the minimal cost is further compared to other CU modes. If the FRUC matching mode is the most efficient one, the FRUC flag is set to true for the CU and the related matching mode is used.
Typically, the motion derivation process in FRUC Merge mode has two steps: CU-level motion search is performed first, followed by sub-CU-level motion refinement. At the CU level, an initial motion vector is derived for the entire CU based on bilateral matching or template matching. First, a list of MV candidates is generated and the candidate pointing to the smallest matching cost is selected as the starting point for further CU-level refinement. A local search based on bilateral matching or template matching is then performed around the starting point. The MV that results in the smallest matching cost is taken as the MV of the entire CU. Subsequently, the motion information is further refined at the sub-CU level using the derived CU motion vector as a starting point.
For example, the following derivation process is performed for W × H CU motion information derivation. In the first stage, the MV of the entire W × H CU is derived. In the second stage, the CU is further subdivided into M × M sub-CUs. The value of M is calculated as in equation (16), where D is a predefined splitting depth, which is set to 3 by default in JEM. The MV of each sub-CU is then derived.
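Equation (16), as recalled from the JEM description (with W and H the CU width and height), is:

M = max{ 4, min{ W, H } / 2^D }        Equation (16)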
Fig. 10 shows an example of bilateral matching used in the Frame Rate Up Conversion (FRUC) method. The bilateral matching is used to derive motion information of the current CU by finding the closest match between two blocks along the motion trajectory of the current CU (1000) in two different reference pictures (1010, 1011). Under the assumption of a continuous motion trajectory, the motion vectors MV0 (1001) and MV1 (1002) pointing to the two reference blocks are proportional to the temporal distances between the current picture and the two reference pictures, e.g., TD0 (1003) and TD1 (1004). In some embodiments, when the current picture 1000 is temporally between the two reference pictures (1010, 1011) and the temporal distances from the current picture to the two reference pictures are the same, the bilateral matching becomes mirror-based bidirectional MV.
Fig. 11 illustrates an example of template matching used in the FRUC method. Template matching may be used to derive motion information for the current CU 1100 by finding the closest match between a template in the current picture (e.g., an upper and/or left neighboring block of the current CU) and a block in the reference picture 1110 (e.g., the same size as the template). In addition to the FRUC Merge mode described above, template matching may also be applied to AMVP mode. In both JEM and HEVC, AMVP has two candidates. Using a template matching method, new candidates can be derived. If the newly derived candidate by template matching is different from the first existing AMVP candidate, it is inserted into the very beginning of the AMVP candidate list and then the list size is set to 2 (e.g., by removing the second existing AMVP candidate). When applied to AMVP mode, only CU level search is applied.
The MV candidate set at the CU level may include the following: (1) the original AMVP candidates if the current CU is in AMVP mode, (2) all Merge candidates, and (3) some MVs in the interpolated MV field (described later) and the top and left neighboring motion vectors.
When bilateral matching is used, each valid MV of a Merge candidate may be used as an input to generate an MV pair under the assumption of bilateral matching. For example, one valid MV of a Merge candidate is (MVa, ref_a) at reference list A. Then the reference picture ref_b of its paired bilateral MV is found in the other reference list B, so that ref_a and ref_b are temporally on different sides of the current picture. If such a ref_b is not available in reference list B, ref_b is determined as a reference that is different from ref_a and whose temporal distance to the current picture is the minimal one in list B. After ref_b is determined, MVb is derived by scaling MVa based on the temporal distances between the current picture and ref_a, ref_b.
In some implementations, four MVs from the interpolated MV field may also be added to the CU level candidate list. More specifically, interpolation MVs at positions (0,0), (W/2,0), (0, H/2), and (W/2, H/2) of the current CU are added. When FRUC is applied to AMVP mode, the original AMVP candidate is also added to the CU-level MV candidate set. In some implementations, at the CU level, 15 MVs for AMVP CUs and 13 MVs for Merge CUs may be added to the candidate list.
The MV candidate set at the sub-CU level includes (1) MVs determined from the CU level search, (2) top, left, top-left, and top-right neighboring MVs, (3) scaled versions of collocated MVs from the reference picture, (4) one or more ATMVP candidates (e.g., up to four), and (5) one or more STMVP candidates (e.g., up to four). The scaled MVs from the reference pictures are derived as follows. Reference pictures in both lists are traversed. The MVs at collocated positions of the sub-CUs in the reference picture are scaled to the reference of the starting CU-level MV. The ATMVP and STMVP candidates may be the first four. At the sub-CU level, one or more MVs (e.g., up to 17) are added to the candidate list.
Generation of interpolated MV fields
Before encoding and decoding a frame, an interpolated motion field is generated for the whole picture based on unilateral ME. The motion field may then be used afterwards as CU-level or sub-CU-level MV candidates.
In some embodiments, the motion field of each reference picture in both reference lists is traversed at the 4x4 block level. Fig. 12 shows an example of unilateral Motion Estimation (ME) 1200 in the FRUC approach. For each 4x4 block, if the motion associated with the block passes through a 4x4 block in the current picture (as shown in Fig. 12) and the block has not been assigned any interpolated motion, the motion of the reference block is scaled to the current picture according to the temporal distances TD0 and TD1 (in the same way as the MV scaling of TMVP in HEVC), and the scaled motion is assigned to the block in the current frame. If no scaled MV is assigned to a 4x4 block, the motion of the block is marked as unavailable in the interpolated motion field.
Interpolation and matching costs
When the motion vector points to a fractional sample position, motion compensated interpolation is needed. To reduce complexity, bilinear interpolation instead of conventional 8-tap HEVC interpolation may be used for both bilateral matching and template matching.
The matching cost is calculated somewhat differently at different steps. When selecting the candidate from the candidate set at the CU level, the matching cost may be the Sum of Absolute Differences (SAD) of bilateral matching or template matching. After the starting MV is determined, the matching cost C of bilateral matching at the sub-CU level search is calculated as follows:
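The cost referred to here, as recalled from the JEM description (MV^s denotes the starting MV), is:

C = SAD + w · ( |MVx − MVx^s| + |MVy − MVy^s| )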
here, w is a weight factor. In some embodiments, w may be empirically set to 4. MV and MVsIndicating the current MV and the starting MV, respectively. SAD may still be used as the matching cost for template matching for sub-CU level searches.
In FRUC mode, the MV is derived by using luma samples only. The derived motion will be used for both luma and chroma for MC inter prediction. After the MV is decided, final MC is performed using an 8-tap interpolation filter for luma and a 4-tap interpolation filter for chroma.
MV refinement is a pattern-based MV search with the criterion of bilateral matching cost or template matching cost. In JEM, two search patterns are supported: an Unrestricted Center-Biased Diamond Search (UCBDS) and an adaptive cross search, for MV refinement at the CU level and the sub-CU level, respectively. For both CU-level and sub-CU-level MV refinement, the MV is directly searched at quarter luma sample MV accuracy, followed by one-eighth luma sample MV refinement. The search range of MV refinement for the CU and sub-CU steps is set equal to 8 luma samples.
In the bilateral matching Merge mode, bi-prediction is applied because the motion information of a CU is derived based on the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. In template matching Merge mode, the encoder may select among unidirectional prediction from list0, unidirectional prediction from list1, or bi-directional prediction for a CU. The selection may be based on the template matching cost, as follows:
If costBi <= factor * min(cost0, cost1),
then bi-prediction is used;
otherwise, if cost0 <= cost1,
then uni-prediction from list0 is used;
otherwise,
uni-prediction from list1 is used;
Here, cost0 is the SAD of the list0 template matching, cost1 is the SAD of the list1 template matching, and costBi is the SAD of the bi-prediction template matching. For example, when the value of factor is equal to 1.25, this means that the selection process is biased towards bi-directional prediction. The inter prediction direction selection may be applied to the CU-level template matching process.
Deblocking process in VVC
8.6.2 deblocking Filter Process
8.6.2.1 overview
The input to this process is the reconstructed picture before deblocking, i.e. the array recPictureL, and when ChromaArrayType does not equal 0, the arrays recPictureCb and recPictureCr.
The output of this process is the deblocked modified reconstructed picture, i.e., the array recPictureL, and when ChromaArrayType is not equal to 0, the arrays recPictureCb and recPictureCr.
Vertical edges in the picture are filtered first. The horizontal edges in the picture are then filtered with the samples modified by the vertical edge filtering process as input. The vertical and horizontal edges in the CTB of each CTU are separately processed based on a codec unit. Vertical edges of codec blocks in a codec unit are filtered starting from the edge on the left-hand side of the codec block and proceeding in its geometric order across the edge towards the right side of the codec block. Horizontal edges of codec blocks in a codec unit are filtered starting from the edge at the top of the codec block and proceeding through the edges in their geometric order towards the bottom of the codec block.
Note-although the filtering process is specified on a picture basis in this specification, it can be implemented on a codec unit basis to have an equivalent result as long as the decoder properly considers the processing dependency order to produce the same output value.
The deblocking filtering process is applied to all codec sub-block edges and transform block edges of a picture, except for the following types of edges:
-an edge at a boundary of the picture,
- edges coinciding with tile boundaries when loop_filter_across_tiles_enabled_flag is equal to 0,
- edges coinciding with the upper or left boundary of a tile group with tile_group_loop_filter_across_tile_groups_enabled_flag equal to 0 or tile_group_deblocking_filter_disabled_flag equal to 1,
- edges within a tile group having tile_group_deblocking_filter_disabled_flag equal to 1,
- edges that do not correspond to the 8x8 sample grid boundaries of the component under consideration,
- edges within chroma samples for which both sides of the edge use inter prediction,
-edges of chroma transform blocks which are not edges of the associated transform unit.
[ Ed. (BB): once the tiles are integrated, the syntax is adapted. ]
The edge type, vertical or horizontal, is represented by the variable edgeType as specified in Table 8-17.
Table 8-17 - Name associated with edgeType
edgeType | Name of edgeType
0 (vertical edge) | EDGE_VER
1 (horizontal edge) | EDGE_HOR
When tile_group_deblocking_filter_disabled_flag of the current tile group is equal to 0, the following applies:
the variable treeType is derived as follows:
- If tile_group_type is equal to I and qtbtt_dual_tree_intra_flag is equal to 1, treeType is set equal to DUAL_TREE_LUMA.
- Otherwise, treeType is set equal to SINGLE_TREE.
- The vertical edges are filtered by invoking the deblocking filter process for one direction as specified in clause 8.6.2.2, with the variable treeType, the reconstructed picture prior to deblocking, i.e. the array recPictureL and, when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE, the arrays recPictureCb and recPictureCr, and the variable edgeType set equal to EDGE_VER as inputs, and the modified reconstructed picture after deblocking, i.e. the array recPictureL and, when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE, the arrays recPictureCb and recPictureCr as outputs.
- The horizontal edges are filtered by invoking the deblocking filter process for one direction as specified in clause 8.6.2.2, with the variable treeType, the modified reconstructed picture after deblocking, i.e. the array recPictureL and, when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE, the arrays recPictureCb and recPictureCr, and the variable edgeType set equal to EDGE_HOR as inputs, and the modified reconstructed picture after deblocking, i.e. the array recPictureL and, when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE, the arrays recPictureCb and recPictureCr as outputs.
When tile_group_type is equal to I and qtbtt_dual_tree_intra_flag is equal to 1, the following applies:
- The variable treeType is set equal to DUAL_TREE_CHROMA.
- The vertical edges are filtered by invoking the deblocking filter process for one direction as specified in clause 8.6.2.2, with the variable treeType, the reconstructed picture prior to deblocking, i.e. the arrays recPictureCb and recPictureCr, and the variable edgeType set equal to EDGE_VER as inputs, and the modified reconstructed picture after deblocking, i.e. the arrays recPictureCb and recPictureCr as outputs.
- The horizontal edges are filtered by invoking the deblocking filter process for one direction as specified in clause 8.6.2.2, with the variable treeType, the modified reconstructed picture after deblocking, i.e. the arrays recPictureCb and recPictureCr, and the variable edgeType set equal to EDGE_HOR as inputs, and the modified reconstructed picture after deblocking, i.e. the arrays recPictureCb and recPictureCr as outputs.
8.6.2.2 deblocking filtering process for one direction
The inputs to this process are:
a variable treeType specifying whether a SINGLE TREE (SINGLE _ TREE) or a DUAL TREE is used to partition the CTU and, when a DUAL TREE is used, whether the LUMA (DUAL _ TREE _ LUMA) or the CHROMA component (DUAL _ TREE _ CHROMA) is currently processed,
-when treeType equals SINGLE _ TREE or DUAL _ TREE _ LUMA, the reconstructed picture before deblocking, i.e. the array recPictures L,
the arrays recPictureCb and recPictureCr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE _ TREE or DUAL _ TREE _ CHROMA,
the variable edgeType, which specifies whether the filtering is for vertical (EDGE _ VER) or horizontal (EDGE _ HOR) EDGEs.
The output of this process is the modified reconstructed picture after deblocking, i.e.:
- when treeType is equal to SINGLE_TREE or DUAL_TREE_LUMA, the array recPictureL,
- when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE or DUAL_TREE_CHROMA, the arrays recPictureCb and recPictureCr.
For each codec unit with codec block width log2CbW, codec block height log2CbH and location of the top-left sample of the codec block ( xCb, yCb ), when edgeType is equal to EDGE_VER and xCb % 8 is equal to 0, or when edgeType is equal to EDGE_HOR and yCb % 8 is equal to 0, the edges are filtered by the following ordered steps:
1. The codec block width nCbW is set equal to 1 << log2CbW and the codec block height nCbH is set equal to 1 << log2CbH.
2. The variable filterEdgeFlag is derived as follows:
-if edgeType is equal to EDGE _ VER and one or more of the following conditions is true, then filterEdgeFlag is set equal to 0:
-the left boundary of the current codec block is the left boundary of the picture.
The left boundary of the current codec block is the left boundary of the slice and loop _ filter _ across _ tiles _ enabled _ flag is equal to 0.
The left boundary of the current codec block is the left boundary of the slice group and tile _ group _ loop _ filter _ across _ tile _ groups _ enabled _ flag is equal to 0.
Otherwise, if edgeType is equal to EDGE _ HOR and one or more of the following conditions is true, the variable filterEdgeFlag is set equal to 0:
the top boundary of the current luma codec block is the top boundary of the picture.
The top boundary of the current codec block is the top boundary of the slice and loop _ filter _ across _ tiles _ enabled _ flag is equal to 0.
The top boundary of the current codec block is the top boundary of the slice group and tile _ group _ loop _ filter _ across _ tile _ groups _ enabled _ flag is equal to 0.
Otherwise, filterEdgeFlag is set equal to 1.
[ Ed. (BB): once the tiles are integrated, the syntax is adapted. ]
3. All elements of the two-dimensional (nCbW) x (ncbh) array edgeFlags are initialized to equal zero.
4. The derivation process for transform block boundaries specified in clause 8.6.2.3 is invoked with the location ( xB0, yB0 ) set equal to ( 0, 0 ), the block width nTbW set equal to nCbW, the block height nTbH set equal to nCbH, the variable treeType, the variable filterEdgeFlag, the array edgeFlags, and the variable edgeType as inputs, and the modified array edgeFlags as output.
5. The derivation process for codec sub-block boundaries specified in clause 8.6.2.4 is invoked with the location ( xCb, yCb ), the codec block width nCbW, the codec block height nCbH, the array edgeFlags, and the variable edgeType as inputs, and the modified array edgeFlags as output.
6. The picture sample array recPicture is derived as follows:
- If treeType is equal to SINGLE_TREE or DUAL_TREE_LUMA, recPicture is set equal to the reconstructed luma picture sample array before deblocking, recPictureL.
- Otherwise (treeType is equal to DUAL_TREE_CHROMA), recPicture is set equal to the reconstructed chroma picture sample array before deblocking, recPictureCb.
7. The derivation process for the boundary filtering strength specified in clause 8.6.2.5 is invoked with the picture sample array recPicture, the luma location ( xCb, yCb ), the codec block width nCbW, the codec block height nCbH, the variable edgeType, and the array edgeFlags as inputs, and the ( nCbW )x( nCbH ) array verBs as output.
8. The edge filtering process is invoked as follows:
- If edgeType is equal to EDGE_VER, the vertical edge filtering process for the codec unit specified in clause 8.6.2.6.1 is invoked with the variable treeType, the reconstructed picture prior to deblocking, i.e. the array recPictureL when treeType is equal to SINGLE_TREE or DUAL_TREE_LUMA, and the arrays recPictureCb and recPictureCr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE or DUAL_TREE_CHROMA, the location ( xCb, yCb ), the codec block width nCbW, the codec block height nCbH, and the array verBs as inputs, and the modified reconstructed picture, i.e. the array recPictureL when treeType is equal to SINGLE_TREE or DUAL_TREE_LUMA, and the arrays recPictureCb and recPictureCr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE or DUAL_TREE_CHROMA, as outputs.
- Otherwise, if edgeType is equal to EDGE_HOR, the horizontal edge filtering process for the codec unit specified in clause 8.6.2.6.2 is invoked with the variable treeType, the modified reconstructed picture, i.e. the array recPictureL when treeType is equal to SINGLE_TREE or DUAL_TREE_LUMA, and the arrays recPictureCb and recPictureCr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE or DUAL_TREE_CHROMA, the location ( xCb, yCb ), the codec block width nCbW, the codec block height nCbH, and the array horBs as inputs, and the modified reconstructed picture, i.e. the array recPictureL when treeType is equal to SINGLE_TREE or DUAL_TREE_LUMA, and the arrays recPictureCb and recPictureCr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE_TREE or DUAL_TREE_CHROMA, as outputs.
8.6.2.3 derivation of transform block boundaries
The inputs to this process are:
-a position (xB0, yB0) specifying the top left sample of the current block relative to the top left sample of the current codec block,
a variable nTbW specifying the width of the current block,
a variable nTbH specifying the height of the current block,
a variable treeType specifying whether a SINGLE TREE (SINGLE _ TREE) or a DUAL TREE is used to partition the CTU and, when a DUAL TREE is used, whether the LUMA (DUAL _ TREE _ LUMA) or the CHROMA component (DUAL _ TREE _ CHROMA) is currently processed,
-a variable filterEdgeFlag,
-a two-dimensional (nCbW) x (nCbH) array edgeFlags,
the variable edgeType, which specifies whether the filtering is for vertical (EDGE _ VER) or horizontal (EDGE _ HOR) EDGEs.
The output of this process is a modified two-dimensional (nCbW) x (ncbh) array edgeFlags.
The maximum transform block size maxTbSize is derived as follows:
maxTbSize=(treeType==DUAL_TREE_CHROMA)?MaxTbSizeY/2:MaxTbSizeY (8 862)
depending on maxTbSize, the following applies:
-if nTbW is greater than maxTbSize or nTbH is greater than maxTbSize, applying the following sequence of steps.
1. The variables newTbW and newTbH are derived as follows:
newTbW=(nTbW>maxTbSize)?(nTbW/2):nTbW (8 863)
newTbH=(nTbH>maxTbSize)?(nTbH/2):nTbH (8 864)
2. The derivation process for transform block boundaries as specified in this clause is invoked with the location ( xB0, yB0 ), the variable nTbW set equal to newTbW, the variable nTbH set equal to newTbH, the variable filterEdgeFlag, the array edgeFlags, and the variable edgeType as inputs, and the output is the modified version of the array edgeFlags.
3. If nTbW is greater than maxTbSize, the derivation process for transform block boundaries as specified in this clause is invoked with the location ( xB0, yB0 ) set equal to ( xB0 + newTbW, yB0 ), the variable nTbW set equal to newTbW, the variable nTbH set equal to newTbH, the variable filterEdgeFlag, the array edgeFlags, and the variable edgeType as inputs, and the output is the modified version of the array edgeFlags.
4. If nTbH is greater than maxTbSize, the derivation process for transform block boundaries as specified in this clause is invoked with the location ( xB0, yB0 ) set equal to ( xB0, yB0 + newTbH ), the variable nTbW set equal to newTbW, the variable nTbH set equal to newTbH, the variable filterEdgeFlag, the array edgeFlags, and the variable edgeType as inputs, and the output is the modified version of the array edgeFlags.
5. If nTbW is greater than maxTbSize and nTbH is greater than maxTbSize, the derivation process for transform block boundaries as specified in this clause is invoked with the location ( xB0, yB0 ) set equal to ( xB0 + newTbW, yB0 + newTbH ), the variable nTbW set equal to newTbW, the variable nTbH set equal to newTbH, the variable filterEdgeFlag, the array edgeFlags, and the variable edgeType as inputs, and the output is the modified version of the array edgeFlags.
-otherwise, applying the following:
If edgeType is equal to EDGE_VER, the value of edgeFlags[ xB0 ][ yB0 + k ] is derived as follows for k = 0..nTbH - 1:
-if xB0 equals 0, edgeFlags [ xB0] [ yB0+ k ] is set equal to filterEdgeFlag.
Else, edgeFlags [ xB0] [ yB0+ k ] is set equal to 1.
Otherwise (edgeType is equal to EDGE_HOR), the value of edgeFlags[ xB0 + k ][ yB0 ] is derived as follows for k = 0..nTbW - 1:
-if yB0 equals 0, edgeFlags [ xB0+ k ] [ yB0] is set equal to filterEdgeFlag.
Else, edgeFlags [ xB0+ k ] [ yB0] is set equal to 1.
8.6.2.4 derivation process of coding and decoding sub-block boundary
The inputs to this process are:
-a position (xCb, yCb) specifying the left top sample of the current codec block relative to the left top sample of the current picture,
a variable nCbW specifying the width of the current codec block,
a variable nCbH specifying the height of the current codec block,
-a two-dimensional (nCbW) x (nCbH) array edgeFlags,
the variable edgeType, which specifies whether the filtering is for vertical (EDGE _ VER) or horizontal (EDGE _ HOR) EDGEs.
The output of this process is a modified two-dimensional (nCbW) x (ncbh) array edgeFlags.
The number of codec sub-blocks numSbX in the horizontal direction and the number of codec sub-blocks numSbY in the vertical direction are derived as follows:
- If CuPredMode[ xCb ][ yCb ] is equal to MODE_INTRA, numSbX and numSbY are both set equal to 1.
- Otherwise, numSbX and numSbY are set equal to NumSbX[ xCb ][ yCb ] and NumSbY[ xCb ][ yCb ], respectively.
Depending on the value of edgeType, the following applies:
If edgeType is equal to EDGE_VER and numSbX is greater than 1, then for i = 1..Min( ( nCbW / 8 ) - 1, numSbX - 1 ), k = 0..nCbH - 1, the following applies:
edgeFlags[i*Max(8,nCbW/numSbX)][k]=1(8 865)
Otherwise, if edgeType is equal to EDGE_HOR and numSbY is greater than 1, for j = 1..Min( ( nCbH / 8 ) - 1, numSbY - 1 ), k = 0..nCbW - 1, the following applies:
edgeFlags[k][j*Max(8,nCbH/numSbY)]=1(8 866)
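Expressed procedurally, equations (8 865) and (8 866) simply mark every internal codec sub-block boundary that lies on the 8-sample deblocking grid. The following C-style sketch is illustrative only and not part of the specification text; the fixed array bound MAX_CB is an assumption made for the example.
#define EDGE_VER 0
#define EDGE_HOR 1
#define MAX_CB   128
static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }
/* Sketch of clause 8.6.2.4: mark internal codec sub-block boundaries. */
void mark_subblock_edges(int nCbW, int nCbH, int numSbX, int numSbY,
                         int edgeType, unsigned char edgeFlags[MAX_CB][MAX_CB])
{
    if (edgeType == EDGE_VER && numSbX > 1) {
        for (int i = 1; i <= imin(nCbW / 8 - 1, numSbX - 1); i++)
            for (int k = 0; k < nCbH; k++)
                edgeFlags[i * imax(8, nCbW / numSbX)][k] = 1;   /* (8 865) */
    } else if (edgeType == EDGE_HOR && numSbY > 1) {
        for (int j = 1; j <= imin(nCbH / 8 - 1, numSbY - 1); j++)
            for (int k = 0; k < nCbW; k++)
                edgeFlags[k][j * imax(8, nCbH / numSbY)] = 1;   /* (8 866) */
    }
}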
8.6.2.5 derivation process of boundary filtering strength
The inputs to this process are:
-an array of picture samples recPicture,
-a position (xCb, yCb) specifying the left top sample of the current codec block relative to the left top sample of the current picture,
a variable nCbW specifying the width of the current codec block,
a variable nCbH specifying the height of the current codec block,
a variable edgeType specifying whether vertical (EDGE _ VER) or horizontal (EDGE _ HOR) EDGEs are filtered,
two-dimensional (nCbW) x (ncbh) arrays edgeFlags.
The output of this process is a two-dimensional (nCbW) x (ncbh) array bS, specifying the boundary filtering strength.
The variables xDi, yDj, xN and yN are derived as follows:
- If edgeType is equal to EDGE_VER, xDi is set equal to ( i << 3 ), yDj is set equal to ( j << 2 ), xN is set equal to Max( 0, ( nCbW / 8 ) - 1 ) and yN is set equal to ( nCbH / 4 ) - 1.
- Otherwise (edgeType is equal to EDGE_HOR), xDi is set equal to ( i << 2 ), yDj is set equal to ( j << 3 ), xN is set equal to ( nCbW / 4 ) - 1 and yN is set equal to Max( 0, ( nCbH / 8 ) - 1 ).
For xDi with i = 0..xN and yDj with j = 0..yN, the following applies:
- If edgeFlags[ xDi ][ yDj ] is equal to 0, the variable bS[ xDi ][ yDj ] is set equal to 0.
-otherwise, applying the following:
sample values p0 and q0 are derived as follows:
- If edgeType is equal to EDGE_VER, p0 is set equal to recPicture[ xCb + xDi - 1 ][ yCb + yDj ] and q0 is set equal to recPicture[ xCb + xDi ][ yCb + yDj ].
- Otherwise (edgeType is equal to EDGE_HOR), p0 is set equal to recPicture[ xCb + xDi ][ yCb + yDj - 1 ] and q0 is set equal to recPicture[ xCb + xDi ][ yCb + yDj ].
The variable bS [ xDi ] [ yDj ] is derived as follows:
- bS[ xDi ][ yDj ] is set equal to 2 if sample p0 or q0 is in a codec block of a codec unit that is coded with intra prediction mode.
Otherwise, bS [ xDi ] [ yDj ] is set equal to 1 if the block edge is also a transform block edge and samples p0 or q0 are in a transform block containing one or more non-zero transform coefficient levels.
-otherwise, bS [ xDi ] [ yDj ] is set equal to 1 if one or more of the following conditions is true:
for the prediction of the coded sub-block containing sample p0, a different reference picture or a different number of motion vectors is used than for the prediction of the coded sub-block containing sample q 0.
Note 1-determining whether the reference pictures for the two coded sub-blocks are the same or different is based only on which pictures are referenced, regardless of whether the prediction is formed using the index of reference picture list0 or the index of reference picture list1, and regardless of whether the index positions within the reference picture lists are different.
Note 2-the number of motion vectors used to predict the codec sub-block with left top sample coverage (xSb, ySb) is equal to PredFlagL0[ xSb ] [ ySb ] + PredFlagL1[ xSb ] [ ySb ].
One motion vector for predicting the codec sub-block containing sample p0 and one motion vector for predicting the codec sub-block containing sample q0, and the absolute difference between the horizontal or vertical components of the used motion vectors is greater than or equal to 4 in units of quarter luminance samples.
-using two motion vectors and two different reference pictures to predict a coded sub-block containing a sample point p0, using two motion vectors of the same two reference pictures to predict a coded sub-block containing a sample point q0, and the absolute difference between the horizontal or vertical components of the two motion vectors used in the prediction for the two coded sub-blocks of the same reference picture is greater than or equal to 4 in units of quarter luma samples.
Two motion vectors for the same reference picture are used for predicting the coded subblock containing a sample point p0, two motion vectors for the same reference picture are used for predicting the coded subblock containing a sample point q0, and the following two conditions are true:
- The absolute difference between the horizontal or vertical components of the list 0 motion vectors used in the prediction of the two codec sub-blocks is greater than or equal to 4 in units of quarter luma samples, or the absolute difference between the horizontal or vertical components of the list 1 motion vectors used in the prediction of the two codec sub-blocks is greater than or equal to 4 in units of quarter luma samples.
The absolute difference between the horizontal or vertical components of the list0 motion vector used in the prediction of the codec sub-block containing sample p0 and the list1 motion vector used in the prediction of the codec sub-block containing sample q0 is greater than or equal to 4 in terms of quarter luma samples, or the absolute difference between the horizontal or vertical components of the list1 motion vector used in the prediction of the codec sub-block containing sample p0 and the list0 motion vector used in the prediction of the codec sub-block containing sample q0 is greater than or equal to 4 in units of quarter luma samples.
-otherwise, setting the variable bS [ xDi ] [ yDj ] equal to 0.
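In other words, the strength derivation is a priority-ordered test: intra coding on either side yields bS = 2, non-zero coefficients across a transform edge yield bS = 1, a significant motion difference yields bS = 1, and otherwise bS = 0. The following C-style sketch condenses that ordering; it is illustrative only, and the boolean inputs (including the name motion_diff_significant) are assumed to have been pre-computed from the conditions listed above.
/* Condensed sketch of the per-position decision in clause 8.6.2.5. */
int boundary_strength(int p_or_q_is_intra,
                      int is_transform_edge_with_nonzero_coeff,
                      int motion_diff_significant)
{
    if (p_or_q_is_intra)
        return 2;   /* sample p0 or q0 belongs to an intra-coded codec unit */
    if (is_transform_edge_with_nonzero_coeff)
        return 1;   /* transform block edge with non-zero coefficient levels */
    if (motion_diff_significant)
        return 1;   /* reference-picture or motion-vector differences of at least 4 quarter samples */
    return 0;       /* otherwise: no filtering */
}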
8.6.2.6 edge filtering process
8.6.2.6.1 vertical edge filtering process
The inputs to this process are:
a variable treeType specifying whether a SINGLE TREE (SINGLE _ TREE) or a DUAL TREE is used to partition the CTU, and when a DUAL TREE is used, whether the LUMA component (DUAL _ TREE _ LUMA) or the CHROMA component (DUAL _ TREE _ CHROMA) is currently processed,
-when treeType equals SINGLE _ TREE or DUAL _ TREE _ LUMA, the reconstructed picture before deblocking, i.e. the array recPictures L,
the arrays recPictureCb and recPictureCr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE _ TREE or DUAL _ TREE _ CHROMA,
-a position (xCb, yCb) specifying the left top sample of the current codec block relative to the left top sample of the current picture,
a variable nCbW specifying the width of the current codec block,
the variable nCbH, specifying the height of the current codec block.
The output of this process is the modified reconstructed picture after deblocking, i.e.:
-when treeType equals SINGLE _ TREE or DUAL _ TREE _ LUMA, the array recPictureL,
-the arrays recPictureCb and recPictureCcr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE _ TREE or DUAL _ TREE _ CHROMA.
When treeType is equal to SINGLE _ TREE or DUAL _ TREE _ LUMA, the filtering process of the edge in the LUMA codec block of the current codec unit consists of the following sequential steps:
1. the variable xN is set equal to Max (0, (nCbW/8) -1) and yN is set equal to (nCbH/4) -1.
2. For xDk equal to k << 3, where k = 0..xN, and yDm equal to m << 2, where m = 0..yN, the following applies:
-when bS [ xDk ] [ yDm ] is greater than 0, applying the following sequence of steps:
a. The decision process for block edges specified in clause 8.6.2.6.3 is invoked with treeType, the picture sample array recPicture set equal to the luma picture sample array recPictureL, the location of the luma codec block ( xCb, yCb ), the luma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_VER, the boundary filtering strength bS[ xDk ][ yDm ], and the bit depth bD set equal to BitDepthY as inputs, and the decisions dE, dEp and dEq, and the variable tC as outputs.
b. The filtering process for block edges specified in clause 8.6.2.6.4 is invoked with the picture sample array recPicture set equal to the luma picture sample array recPictureL, the location of the luma codec block ( xCb, yCb ), the luma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_VER, the decisions dE, dEp and dEq, and the variable tC as inputs, and the modified luma picture sample array recPictureL as output.
When ChromaArrayType is not equal to 0 and treeType is equal to SINGLE _ TREE, the filtering process of the edge in the chroma coding and decoding block of the current coding and decoding unit consists of the following sequence of steps:
1. the variable xN is set equal to Max (0, (nCbW/8) -1) and yN is set equal to Max (0, (nCbH/8) -1).
2. The variable edgeSpacing is set equal to 8/SubWidthC.
3. The variable edgeSections is set equal to yN * ( 2 / SubHeightC ).
4. For xDk equal to k * edgeSpacing, where k = 0..xN, and yDm equal to m << 2, where m = 0..edgeSections, the following applies:
- When bS[ xDk * SubWidthC ][ yDm * SubHeightC ] is equal to 2 and ( ( ( xCb / SubWidthC + xDk ) >> 3 ) << 3 ) is equal to xCb / SubWidthC + xDk, the following ordered steps apply:
a. The filtering process for chroma block edges specified in clause 8.6.2.6.5 is invoked with the chroma picture sample array recPictureCb, the location of the chroma codec block ( xCb / SubWidthC, yCb / SubHeightC ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_VER and the variable cQpPicOffset set equal to pps_cb_qp_offset as inputs, and the modified chroma picture sample array recPictureCb as output.
b. The filtering process for chroma block edges specified in clause 8.6.2.6.5 is invoked with the chroma picture sample array recPictureCr, the location of the chroma codec block ( xCb / SubWidthC, yCb / SubHeightC ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_VER and the variable cQpPicOffset set equal to pps_cr_qp_offset as inputs, and the modified chroma picture sample array recPictureCr as output.
When treeType is equal to DUAL _ TREE _ CHROMA, the filtering process of the edges in the two CHROMA codec blocks of the current codec unit consists of the following sequence of steps:
1. the variable xN is set equal to Max (0, (nCbW/8) -1) and yN is set equal to (nCbH/4) -1.
2. For xDk equal to k << 3, where k = 0..xN, and yDm equal to m << 2, where m = 0..yN, the following applies:
-when bS [ xDk ] [ yDm ] is greater than 0, applying the following sequence of steps:
a. The decision process for block edges specified in clause 8.6.2.6.3 is invoked with treeType, the picture sample array recPicture set equal to the chroma picture sample array recPictureCb, the location of the chroma codec block ( xCb, yCb ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_VER, the boundary filtering strength bS[ xDk ][ yDm ], and the bit depth bD set equal to BitDepthC as inputs, and the decisions dE, dEp and dEq, and the variable tC as outputs.
b. The filtering process for block edges specified in clause 8.6.2.6.4 is invoked with the picture sample array recPicture set equal to the chroma picture sample array recPictureCb, the location of the chroma codec block ( xCb, yCb ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_VER, the decisions dE, dEp and dEq, and the variable tC as inputs, and the modified chroma picture sample array recPictureCb as output.
c. The filtering process for block edges specified in clause 8.6.2.6.4 is invoked with the picture sample array recPicture set equal to the chroma picture sample array recPictureCr, the location of the chroma codec block ( xCb, yCb ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_VER, the decisions dE, dEp and dEq, and the variable tC as inputs, and the modified chroma picture sample array recPictureCr as output.
8.6.2.6.2 horizontal edge filtering process
The inputs to this process are:
a variable treeType specifying whether a SINGLE TREE (SINGLE _ TREE) or a DUAL TREE is used to partition the CTU, and when a DUAL TREE is used, whether the LUMA component (DUAL _ TREE _ LUMA) or the CHROMA component (DUAL _ TREE _ CHROMA) is currently processed,
-when treeType equals SINGLE _ TREE or DUAL _ TREE _ LUMA, the reconstructed picture before deblocking, i.e. the array recPictures L,
the arrays recPictureCb and recPictureCr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE _ TREE or DUAL _ TREE _ CHROMA,
-a position (xCb, yCb) specifying the left top sample of the current codec block relative to the left top sample of the current picture,
a variable nCbW specifying the width of the current codec block,
the variable nCbH, specifying the height of the current codec block.
The output of this process is the modified reconstructed picture after deblocking, i.e.:
-when treeType equals SINGLE _ TREE or DUAL _ TREE _ LUMA, the array recPictureL,
-the arrays recPictureCb and recPictureCcr when ChromaArrayType is not equal to 0 and treeType is equal to SINGLE _ TREE or DUAL _ TREE _ CHROMA.
When treeType is equal to SINGLE _ TREE or DUAL _ TREE _ LUMA, the filtering process for the edge in the LUMA codec block of the current codec unit consists of the following sequence of steps:
1. the variable yN is set equal to Max (0, (nCbH/8) -1) and xN is set equal to (nCbW/4) -1.
2. For yDm equal to m << 3, where m = 0..yN, and xDk equal to k << 2, where k = 0..xN, the following applies:
-when bS [ xDk ] [ yDm ] is greater than 0, applying the following sequence of steps:
a. The decision process for block edges specified in clause 8.6.2.6.3 is invoked with treeType, the picture sample array recPicture set equal to the luma picture sample array recPictureL, the location of the luma codec block ( xCb, yCb ), the luma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_HOR, the boundary filtering strength bS[ xDk ][ yDm ], and the bit depth bD set equal to BitDepthY as inputs, and the decisions dE, dEp and dEq, and the variable tC as outputs.
b. The filtering process for block edges specified in clause 8.6.2.6.4 is invoked with the picture sample array recPicture set equal to the luma picture sample array recPictureL, the location of the luma codec block ( xCb, yCb ), the luma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_HOR, the decisions dE, dEp and dEq, and the variable tC as inputs, and the modified luma picture sample array recPictureL as output.
When ChromaArrayType is not equal to 0 and treeType is equal to SINGLE _ TREE, the filtering process of the edge in the chroma coding and decoding block of the current coding and decoding unit consists of the following sequence of steps:
1. the variable xN is set equal to Max (0, (nCbW/8) -1) and yN is set equal to Max (0, (nCbH/8) -1).
2. The variable edgeSpacing is set equal to 8/SubHeightC.
3. The variable edgeSections is set equal to xN * ( 2 / SubWidthC ).
4. For yDm equal to m * edgeSpacing, where m = 0..yN, and xDk equal to k << 2, where k = 0..edgeSections, the following applies:
- When bS[ xDk * SubWidthC ][ yDm * SubHeightC ] is equal to 2 and ( ( ( yCb / SubHeightC + yDm ) >> 3 ) << 3 ) is equal to yCb / SubHeightC + yDm, the following ordered steps apply:
a. The filtering process for chroma block edges specified in clause 8.6.2.6.5 is invoked with the chroma picture sample array recPictureCb, the location of the chroma codec block ( xCb / SubWidthC, yCb / SubHeightC ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_HOR and the variable cQpPicOffset set equal to pps_cb_qp_offset as inputs, and the modified chroma picture sample array recPictureCb as output.
b. The filtering process for chroma block edges specified in clause 8.6.2.6.5 is invoked with the chroma picture sample array recPictureCr, the location of the chroma codec block ( xCb / SubWidthC, yCb / SubHeightC ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_HOR and the variable cQpPicOffset set equal to pps_cr_qp_offset as inputs, and the modified chroma picture sample array recPictureCr as output.
When treeType is equal to DUAL _ TREE _ CHROMA, the filtering process of the edges in the two CHROMA codec blocks of the current codec unit consists of the following sequence of steps:
1. the variable yN is set equal to Max (0, (nCbH/8) -1) and xN is set equal to (nCbW/4) -1.
2. For yDm equal to m << 3, where m = 0..yN, and xDk equal to k << 2, where k = 0..xN, the following applies:
-when bS [ xDk ] [ yDm ] is greater than 0, applying the following sequence of steps:
a. The decision process for block edges specified in clause 8.6.2.6.3 is invoked with treeType, the picture sample array recPicture set equal to the chroma picture sample array recPictureCb, the location of the chroma codec block ( xCb, yCb ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_HOR, the boundary filtering strength bS[ xDk ][ yDm ], and the bit depth bD set equal to BitDepthC as inputs, and the decisions dE, dEp and dEq, and the variable tC as outputs.
b. The filtering process for block edges specified in clause 8.6.2.6.4 is invoked with the picture sample array recPicture set equal to the chroma picture sample array recPictureCb, the location of the chroma codec block ( xCb, yCb ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_HOR, the decisions dE, dEp and dEq, and the variable tC as inputs, and the modified chroma picture sample array recPictureCb as output.
c. The filtering process for block edges specified in clause 8.6.2.6.4 is invoked with the picture sample array recPicture set equal to the chroma picture sample array recPictureCr, the location of the chroma codec block ( xCb, yCb ), the chroma location of the block ( xDk, yDm ), the variable edgeType set equal to EDGE_HOR, the decisions dE, dEp and dEq, and the variable tC as inputs, and the modified chroma picture sample array recPictureCr as output.
8.6.2.6.3 determination of block edge
The inputs to this process are:
a variable treeType specifying whether a SINGLE TREE (SINGLE _ TREE) or a DUAL TREE is used to partition the CTU, and when a DUAL TREE is used, whether the LUMA component (DUAL _ TREE _ LUMA) or the CHROMA component (DUAL _ TREE _ CHROMA) is currently processed,
-an array of picture samples recPicture,
-a position (xCb, yCb) specifying the left top sample of the current codec block relative to the left top sample of the current picture,
-a position (xBl, yBl) specifying a top left sample of the current block relative to a top left sample of the current codec block,
a variable edgeType specifying whether vertical (EDGE _ VER) or horizontal (EDGE _ HOR) EDGEs are filtered,
a variable bS specifying the boundary filtering strength,
a variable bD specifying the depth of the bit current component.
The output of this process is:
-containing the decision variables dE, dEp and dEq,
-a variable tC.
If edgeType is equal to EDGE_VER, the sample values pi,k and qi,k with i = 0..3 and k = 0 and 3 are derived as follows:
qi,k=recPictureL[xCb+xBl+i][yCb+yBl+k](8 867)
pi,k=recPictureL[xCb+xBl-i-1][yCb+yBl+k](8 868)
Otherwise (edgeType is equal to EDGE_HOR), the sample values pi,k and qi,k with i = 0..3 and k = 0 and 3 are derived as follows:
qi,k=recPicture[xCb+xBl+k][yCb+yBl+i] (8 869)
pi,k=recPicture[xCb+xBl+k][yCb+yBl-i-1] (8 870)
the variable qpOffset is derived as follows:
-if sps _ ladf _ enabled _ flag is equal to 1 and treeType is equal to SINGLE _ TREE or DUAL _ TREE _ LUMA, the following applies:
the variable lumaLevel for reconstructing the luminance level is derived as follows:
lumaLevel=((p0,0+p0,3+q0,0+q0,3)>>2), (8 871)
the variable qpOffset is set equal to sps _ ladf _ lowest _ interval _ qp _ offset and is modified as follows:
for(i=0;i<sps_num_ladf_intervals_minus2+1;i++){
if(lumaLevel>SpsLadfIntervalLowerBound[i+1])
qpOffset=sps_ladf_qp_offset[i] (8 872)
else
break
}
else (treeType equals DUAL _ TREE _ CHROMA), qpOffset is set equal to 0.
Variables QpQ and QpP are derived as follows:
-if treeType equals SINGLE _ TREE or DUAL _ TREE _ LUMA, QpQ and QpP are set equal to QpY values for a codec unit comprising codec blocks containing samples q0,0 and p0,0, respectively.
Else (treeType equals DUAL _ TREE _ CHROMA), QpQ and QpP are set equal to the QpC value of the codec unit comprising the codec blocks containing samples q0,0 and p0,0, respectively.
The variable qP is derived as follows:
qP=((QpQ+QpP+1)>>1)+qpOffset (8 873)
The value of the variable β′ is determined as specified in Table 8-18 based on the quantization parameter Q, which is derived as follows:
Q=Clip3(0,63,qP+(tile_group_beta_offset_div2<<1)) (8 874)
where tile_group_beta_offset_div2 is the value of the syntax element tile_group_beta_offset_div2 of the slice group containing sample q0,0.
The variable β is derived as follows:
β=β′*(1<<(bD-8)) (8 875)
The value of the variable tC′ is determined as specified in Table 8-18 based on the quantization parameter Q, which is derived as follows:
Q=Clip3(0,65,qP+2*(bS-1)+(tile_group_tc_offset_div2<<1)) (8 876)
where tile_group_tc_offset_div2 is the value of the syntax element tile_group_tc_offset_div2 of the slice group containing sample q0,0.
The variable tC is derived as follows:
tC=tC′*(1<<(bD-8)) (8 877)
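The two thresholds therefore scale with both the quantization parameter and the bit depth. The C-style sketch below is illustrative only; the arrays beta_table and tc_table stand for the β′ and tC′ entries of the threshold table referenced above, which is not reproduced in this text, so their contents are assumed to be supplied by the caller.
static int clip3(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }
/* Sketch of the threshold derivation in clause 8.6.2.6.3 (assumes bD >= 8).
   beta_table[0..63] and tc_table[0..65] are assumed to hold the Table 8-18 values. */
void derive_thresholds(int qP, int bS, int bD,
                       int tile_group_beta_offset_div2, int tile_group_tc_offset_div2,
                       const int beta_table[64], const int tc_table[66],
                       int *beta, int *tC)
{
    int Qb = clip3(0, 63, qP + (tile_group_beta_offset_div2 << 1));              /* (8 874) */
    int Qt = clip3(0, 65, qP + 2 * (bS - 1) + (tile_group_tc_offset_div2 << 1)); /* (8 876) */
    *beta = beta_table[Qb] * (1 << (bD - 8));                                    /* (8 875) */
    *tC   = tc_table[Qt]   * (1 << (bD - 8));                                    /* (8 877) */
}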
depending on the value of edgeType, the following applies:
-if edgeType is equal to EDGE _ VER, applying the following sequence of steps:
1. the variables dpq0, dpq3, dp, dq and d are derived as follows:
dp0=Abs(p2,0-2*p1,0+p0,0) (8 878)
dp3=Abs(p2,3-2*p1,3+p0,3) (8 879)
dq0=Abs(q2,0-2*q1,0+q0,0) (8 880)
dq3=Abs(q2,3-2*q1,3+q0,3) (8 881)
dpq0=dp0+dq0 (8 882)
dpq3=dp3+dq3 (8 883)
dp=dp0+dp3 (8 884)
dq=dq0+dq3 (8 885)
d=dpq0+dpq3 (8 886)
2. the variables dE, dEp, and dEq are set equal to 0.
3. When d is less than β, the following sequence of steps is applied:
a. the variable dpq is set equal to 2 × dpq 0.
b. For the sample position (xCb + xBl, yCb + yBl), the decision process to down-regulate the sample as specified in clause 8.6.2.6.6 uses sample values p0,0, p3,0, q0,0, and q3,0, variables dpq, β, and tC as inputs, and the output is assigned to the decision dSam 0.
c. The variable dpq is set equal to 2 × dpq 3.
d. For the sample position (xCb + xBl, yCb + yBl +3), the sample decision process as specified in clause 8.6.2.6.6 is invoked with the sample values p0,3, p3,3, q0,3, and q3,3, the variables dpq, β, and tC being used as inputs, and the output being assigned to the decision dSam 3.
e. The variable dE is set equal to 1.
f. When dSam0 equals 1 and dSam3 equals 1, the variable dE is set equal to 2.
g. When dp is less than (β + (β > >1)) > >3, the variable dEp is set equal to 1.
h. When dq is less than (β + (β > >1)) > >3, the variable dEq is set equal to 1.
Else (edgeType equals EDGE _ HOR), applying the following sequence of steps:
1. the variables dpq0, dpq3, dp, dq and d are derived as follows:
dp0=Abs(p2,0-2*p1,0+p0,0) (8 887)
dp3=Abs(p2,3-2*p1,3+p0,3) (8 888)
dq0=Abs(q2,0-2*q1,0+q0,0) (8 889)
dq3=Abs(q2,3-2*q1,3+q0,3) (8 890)
dpq0=dp0+dq0 (8 891)
dpq3=dp3+dq3 (8 892)
dp=dp0+dp3 (8 893)
dq=dq0+dq3 (8 894)
d=dpq0+dpq3 (8 895)
2. the variables dE, dEp, and dEq are set equal to 0.
3. When d is less than β, the following sequence of steps is applied:
a. the variable dpq is set equal to 2 × dpq 0.
b. For the sample position (xCb + xBl, yCb + yBl), the decision process to down-regulate the sample as specified in clause 8.6.2.6.6 uses sample values p0,0, p3,0, q0,0 and q3,0, variables dpq, β and tC as inputs, and the output is assigned to the decision dSam 0.
c. The variable dpq is set equal to 2 × dpq 3.
d. For the sample position (xCb + xBl +3, yCb + yBl), the decision process to down-regulate the sample as specified in clause 8.6.2.6.6 uses sample values p0,3, p3,3, q0,3 and q3,3, variables dpq, β and tC as inputs, and the output is assigned to the decision dSam 3.
e. The variable dE is set equal to 1.
f. When dSam0 equals 1 and dSam3 equals 1, the variable dE is set equal to 2.
g. When dp is less than (β + (β > >1)) > >3, the variable dEp is set equal to 1.
h. When dq is less than (β + (β > >1)) > >3, the variable dEq is set equal to 1.
Table 8-18 - Derivation of the threshold variables β′ and tC′ from the input Q
8.6.2.6.4 Block edge Filter Process
The inputs to this process are:
-an array of picture samples recPicture,
-a position (xCb, yCb) specifying the left top sample of the current codec block relative to the left top sample of the current picture,
-a position (xBl, yBl) specifying a top left sample of the current block relative to a top left sample of the current codec block,
a variable edgeType specifying whether vertical (EDGE _ VER) or horizontal (EDGE _ HOR) EDGEs are filtered,
-containing the decision variables dE, dEp and dEq,
-a variable tC.
The output of this process is a modified picture sample array, recPicture.
Depending on the value of edgeType, the following applies:
-if edgeType is equal to EDGE _ VER, applying the following sequence of steps:
1. sample values pi, k and qi, k, i ═ 0..3 and k ═ 0..3 are derived as follows:
qi,k=recPictureL[xCb+xBl+i][yCb+yBl+k](8 896)
pi,k=recPictureL[xCb+xBl-i-1][yCb+yBl+k] (8 897)
2. When dE is not equal to 0, for each sample position ( xCb + xBl, yCb + yBl + k ), k = 0..3, the following ordered steps apply:
a. The filtering process for a sample as specified in clause 8.6.2.6.7 is invoked with the sample values pi,k and qi,k with i = 0..3, the locations ( xPi, yPi ) set equal to ( xCb + xBl - i - 1, yCb + yBl + k ) and ( xQi, yQi ) set equal to ( xCb + xBl + i, yCb + yBl + k ) with i = 0..2, the decision dE, the variables dEp and dEq, and the variable tC as inputs, and the number of filtered samples nDp and nDq from each side of the block boundary and the filtered sample values pi′ and qj′ as outputs.
b. When nDp is greater than 0, the filtered sample values pi′ (where i = 0..nDp - 1) replace the corresponding samples inside the sample array recPicture as follows:
recPicture[xCb+xBl-i-1][yCb+yBl+k]=pi' (8 898)
c. When nDq is greater than 0, the filtered sample values qj′ (where j = 0..nDq - 1) replace the corresponding samples inside the sample array recPicture as follows:
recPicture[xCb+xBl+j][yCb+yBl+k]=qj' (8 899)
else (edgeType equals EDGE _ HOR), applying the following sequence of steps:
1. sample values pi, k and qi, k, where i 0..3 and k 0..3 are derived as follows:
qi,k=recPictureL[xCb+xBl+k][yCb+yBl+i] (8 900)
pi,k=recPictureL[xCb+xBl+k][yCb+yBl-i-1] (8 901)
2. When dE is not equal to 0, for each sample position ( xCb + xBl + k, yCb + yBl ), k = 0..3, the following ordered steps apply:
a. The filtering process for a sample as specified in clause 8.6.2.6.7 is invoked with the sample values pi,k and qi,k with i = 0..3, the locations ( xPi, yPi ) set equal to ( xCb + xBl + k, yCb + yBl - i - 1 ) and ( xQi, yQi ) set equal to ( xCb + xBl + k, yCb + yBl + i ) with i = 0..2, the decision dE, the variables dEp and dEq, and the variable tC as inputs, and the number of filtered samples nDp and nDq from each side of the block boundary and the filtered sample values pi′ and qj′ as outputs.
b. When nDp is greater than 0, the filtered sample values pi′ (where i = 0..nDp - 1) replace the corresponding samples inside the sample array recPicture as follows:
recPicture[xCb+xBl+k][yCb+yBl-i-1]=pi' (8 902)
c. When nDq is greater than 0, the filtered sample values qj′ (where j = 0..nDq - 1) replace the corresponding samples inside the sample array recPicture as follows:
recPicture[xCb+xBl+k][yCb+yBl+j]=qj' (8 903)
8.6.2.6.5 filtering process of chroma block edge
This procedure is only invoked when ChromaArrayType is not equal to 0.
The inputs to this process are:
-an array of samples s' of a chrominance picture,
-a chroma position (xCb, yCb) specifying a top left chroma sampling of the current chroma codec block relative to a top left chroma sampling of the current picture,
-a chroma position (xBl, yBl) specifying a top left sample of the current chroma block relative to a top left sample of the current chroma codec block,
a variable edgeType specifying whether vertical (EDGE _ VER) or horizontal (EDGE _ HOR) EDGEs are filtered,
- a variable cQpPicOffset specifying the picture-level chroma quantization parameter offset.
The output of this process is a modified chroma picture sample array s'.
If edgeType is equal to EDGE_VER, the sample values pi,k and qi,k with i = 0..1 and k = 0..3 are derived as follows:
qi,k=s′[xCb+xBl+i][yCb+yBl+k] (8 904)
pi,k=s′[xCb+xBl-i-1][yCb+yBl+k] (8 905)
Otherwise (edgeType is equal to EDGE_HOR), the sample values pi,k and qi,k with i = 0..1 and k = 0..3 are derived as follows:
qi,k=s′[xCb+xBl+k][yCb+yBl+i] (8 906)
pi,k=s′[xCb+xBl+k][yCb+yBl-i-1] (8 907)
variables QpQ and QpP are set equal to the QpY value of the codec unit that includes the codec blocks containing samples q0,0 and p0,0, respectively.
If ChromaArrayType is equal to 1, the variable QpC is determined as specified in Table 8-15 based on the index qPi, which is derived as follows:
qPi=((QpQ+QpP+1)>>1)+cQpPicOffset (8 908)
otherwise (ChromaArrayType greater than 1), variable QpC is set equal to Min (qPi, 63).
Note - The variable cQpPicOffset provides an adjustment for the value of pps_cb_qp_offset or pps_cr_qp_offset, according to whether the filtered chroma component is the Cb or Cr component. However, to avoid the need to vary the amount of the adjustment within the picture, the filtering process does not include an adjustment for the value of tile_group_cb_qp_offset or tile_group_cr_qp_offset.
The value of the variable tC′ is determined as specified in Table 8-18 based on the chroma quantization parameter Q, which is derived as follows:
Q=Clip3(0,65,QpC+2+(tile_group_tc_offset_div2<<1))(8 909)
where tile _ group _ tc _ offset _ div2 is the value of the syntax element of tile _ group _ tc _ offset _ div2 of the slice group containing samples q0, 0.
The variable tC is derived as follows:
tC=tC′*(1<<(BitDepthC-8)) (8 910)
depending on the value of edgeType, the following applies:
-if edgeType is equal to EDGE _ VER, for each sample position (xCb + xBl, yCb + yBl + k), k 0..3, applying the following sequence of steps:
1. the filtering process of chroma samples as specified in clause 8.6.2.6.8 is invoked with sample values pi, k, qi, k, where i is 0..1, positions (xCb + xBl-1, yCb + yBl + k) and (xCb + xBl, yCb + yBl + k) and variable tC as inputs, and filtered sample values p0 'and q 0' as outputs.
2. The filtered sample values p0 ' and q0 ' replace corresponding samples within the sample array s ' as follows:
s′[xCb+xBl][yCb+yBl+k]=q0′ (8 911)
s′[xCb+xBl-1][yCb+yBl+k]=p0′ (8 912)
else (edgeType equals EDGE _ HOR), for each sample position (xCb + xBl + k, yCb + yBl), k being 0..3, the following sequence of steps is applied:
1. the filtering process of chroma samples as specified in clause 8.6.2.6.8 is invoked with sample values pi, k, qi, k, where i is 0..1, positions (xCb + xBl + k, yCb + yBl-1) and (xCb + xBl + k, yCb + yBl), and variable tC as inputs, and filtered sample values p0 'and q 0' as outputs.
2. The filtered sample values p0 ' and q0 ' replace corresponding samples within the sample array s ' as follows:
s′[xCb+xBl+k][yCb+yBl]=q0′ (8 913)
s′[xCb+xBl+k][yCb+yBl-1]=p0′ (8 914)
8.6.2.6.6 sampling point decision process
The inputs to this process are:
sample values p0, p3, q0 and q3,
the variables dpq, β and tC.
The output of this process is the inclusion decision variable dSam.
The variable dSam is specified as follows:
- If dpq is less than ( β >> 2 ), Abs( p3 - p0 ) + Abs( q0 - q3 ) is less than ( β >> 3 ) and Abs( p0 - q0 ) is less than ( 5 * tC + 1 ) >> 1, dSam is set equal to 1.
Otherwise dSam is set equal to 0.
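The decision condenses to a single boolean expression over the four sample values and the three thresholds, as in the C-style sketch below (a direct transcription of the condition, illustrative only and not normative text):
static int iabs(int v) { return v < 0 ? -v : v; }
/* Sketch of clause 8.6.2.6.6: per-line decision dSam. */
int decide_dSam(int p0, int p3, int q0, int q3, int dpq, int beta, int tC)
{
    return (dpq < (beta >> 2) &&
            iabs(p3 - p0) + iabs(q0 - q3) < (beta >> 3) &&
            iabs(p0 - q0) < ((5 * tC + 1) >> 1)) ? 1 : 0;
}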
8.6.2.6.7 sampling point filtering process
The inputs to this process are:
-sampling point values pi and qi, where i ═ 0..3,
-positions of pi and qi, (xPi, yPi) (xQi, yQi), wherein i is 0..2,
- a variable dE containing a decision,
variables dEp and dEq containing the decision on the filter samples p1 and q1, respectively,
-a variable tC.
The output of this process is:
the number of filtered samples nDp and nDq,
filtered sample point values pi 'and qj', where i 0.. nDp-1 and j 0.. nDq-1.
Depending on the value of dE, the following applies:
if the variable dE is equal to 2, nDp and nDq are both set equal to 3, and the following strong filtering is applied:
p0′=Clip3(p0-2*tC,p0+2*tC,(p2+2*p1+2*p0+2*q0+q1+4)>>3) (8 915)
p1′=Clip3(p1-2*tC,p1+2*tC,(p2+p1+p0+q0+2)>>2)(8 916)
p2′=Clip3(p2-2*tC,p2+2*tC,(2*p3+3*p2+p1+p0+q0+4)>>3) (8 917)
q0′=Clip3(q0-2*tC,q0+2*tC,(p1+2*p0+2*q0+2*q1+q2+4)>>3) (8 918)
q1′=Clip3(q1-2*tC,q1+2*tC,(p0+q0+q1+q2+2)>>2)(8 919)
q2′=Clip3(q2-2*tC,q2+2*tC,(p0+q0+q1+3*q2+2*q3+4)>>3) (8 920)
otherwise, nDp and nDq are both set equal to 0, and the following weak filtering is applied:
-applying the following:
Δ = ( 9 * ( q0 - p0 ) - 3 * ( q1 - p1 ) + 8 ) >> 4 (8 921)
- When Abs( Δ ) is less than tC * 10, the following ordered steps apply:
- The filtered sample values p0′ and q0′ are specified as follows:
Δ = Clip3( -tC, tC, Δ ) (8 922)
p0′ = Clip1Y( p0 + Δ ) (8 923)
q0′ = Clip1Y( q0 - Δ ) (8 924)
- When dEp is equal to 1, the filtered sample value p1′ is specified as follows:
Δp = Clip3( -( tC >> 1 ), tC >> 1, ( ( ( p2 + p0 + 1 ) >> 1 ) - p1 + Δ ) >> 1 ) (8 925)
p1′ = Clip1Y( p1 + Δp ) (8 926)
- When dEq is equal to 1, the filtered sample value q1′ is specified as follows:
Δq = Clip3( -( tC >> 1 ), tC >> 1, ( ( ( q2 + q0 + 1 ) >> 1 ) - q1 - Δ ) >> 1 ) (8 927)
q1′ = Clip1Y( q1 + Δq ) (8 928)
-nDp is set equal to dEp +1 and nDq is set equal to dEq + 1.
nDp is set equal to 0 when nDp is greater than 0 and one or more of the following conditions is true:
-pcm _ loop _ filter _ disabled _ flag equal to 1 and pcm _ flag [ xP0] [ yP0] equal to 1.
Cu _ transquant _ bypass _ flag of the codec unit comprising the codec block containing sample p0 is equal to 1.
nDq is set equal to 0 when nDq is greater than 0 and one or more of the following conditions is true:
-pcm _ loop _ filter _ disabled _ flag equal to 1 and pcm _ flag [ xQ0] [ yQ0] equal to 1.
Cu _ transquant _ bypass _ flag of the codec unit comprising the codec block containing sample q0 is equal to 1.
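Taken together, clause 8.6.2.6.7 either applies a strong filter that rewrites three samples on each side of the edge or a weak filter driven by the offset Δ. The condensed C-style sketch below is illustrative only; the pcm and transquant-bypass overrides are omitted, the caller is assumed to invoke it only when dE is not equal to 0, and the helper names are not part of the specification.
static int clip3f(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }
/* Clip to the luma sample range for bit depth bD (stands in for Clip1Y). */
static int clip1y(int v, int bD) { return clip3f(0, (1 << bD) - 1, v); }
/* Sketch of clause 8.6.2.6.7 for one line of samples across the edge.
   p[0..3] are the p-side samples (p[0] nearest the edge), q[0..3] likewise. */
void filter_luma_line(int p[4], int q[4], int dE, int dEp, int dEq,
                      int tC, int bD, int *nDp, int *nDq)
{
    if (dE == 2) {                                   /* strong filtering, (8 915)-(8 920) */
        int p0 = p[0], p1 = p[1], p2 = p[2], p3 = p[3];
        int q0 = q[0], q1 = q[1], q2 = q[2], q3 = q[3];
        p[0] = clip3f(p0 - 2 * tC, p0 + 2 * tC, (p2 + 2 * p1 + 2 * p0 + 2 * q0 + q1 + 4) >> 3);
        p[1] = clip3f(p1 - 2 * tC, p1 + 2 * tC, (p2 + p1 + p0 + q0 + 2) >> 2);
        p[2] = clip3f(p2 - 2 * tC, p2 + 2 * tC, (2 * p3 + 3 * p2 + p1 + p0 + q0 + 4) >> 3);
        q[0] = clip3f(q0 - 2 * tC, q0 + 2 * tC, (p1 + 2 * p0 + 2 * q0 + 2 * q1 + q2 + 4) >> 3);
        q[1] = clip3f(q1 - 2 * tC, q1 + 2 * tC, (p0 + q0 + q1 + q2 + 2) >> 2);
        q[2] = clip3f(q2 - 2 * tC, q2 + 2 * tC, (p0 + q0 + q1 + 3 * q2 + 2 * q3 + 4) >> 3);
        *nDp = *nDq = 3;
    } else {                                         /* weak filtering, (8 921)-(8 928) */
        int p0 = p[0], p1 = p[1], p2 = p[2];
        int q0 = q[0], q1 = q[1], q2 = q[2];
        int delta = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4;
        *nDp = *nDq = 0;
        if ((delta < 0 ? -delta : delta) < tC * 10) {
            delta = clip3f(-tC, tC, delta);
            p[0] = clip1y(p0 + delta, bD);                                        /* (8 923) */
            q[0] = clip1y(q0 - delta, bD);                                        /* (8 924) */
            if (dEp == 1)
                p[1] = clip1y(p1 + clip3f(-(tC >> 1), tC >> 1,
                              (((p2 + p0 + 1) >> 1) - p1 + delta) >> 1), bD);     /* (8 926) */
            if (dEq == 1)
                q[1] = clip1y(q1 + clip3f(-(tC >> 1), tC >> 1,
                              (((q2 + q0 + 1) >> 1) - q1 - delta) >> 1), bD);     /* (8 928) */
            *nDp = dEp + 1;
            *nDq = dEq + 1;
        }
    }
}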
8.6.2.6.8 filtering process of chroma sampling points
This procedure is only invoked when ChromaArrayType is not equal to 0.
The inputs to this process are:
-chromaticity sample values pi and qi, where i ═ 0..1,
-the colorimetric positions of p0 and q0, (xP0, yP0) and (xQ0, yQ0),
-a variable tC.
The outputs of this process are filtered sample values p0 'and q 0'.
The filtered sample values p0 'and q 0' are derived as follows:
Δ = Clip3( -tC, tC, ( ( ( ( q0 - p0 ) << 2 ) + p1 - q1 + 4 ) >> 3 ) ) (8 929)
p0′ = Clip1C( p0 + Δ ) (8 930)
q0′ = Clip1C( q0 - Δ ) (8 931)
the filtered sample value, p 0', is replaced by a corresponding input sample value, p0, when one or more of the following conditions is true:
-pcm _ loop _ filter _ disabled _ flag is equal to 1 and pcm _ flag [ xP0 SubWidthC ] [ yP0 subheight c ] is equal to 1.
Cu _ transquant _ bypass _ flag of the codec unit comprising the codec block containing sample p0 is equal to 1.
The filtered sample value, q 0', is replaced by a corresponding input sample value, q0, when one or more of the following conditions is true:
-pcm _ loop _ filter _ disabled _ flag is equal to 1 and pcm _ flag [ xQ0 SubWidthC ] [ yQ0 subheight c ] is equal to 1.
Cu _ transquant _ bypass _ flag of the codec unit comprising the codec block containing sample q0 is equal to 1.
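The chroma filter is thus a single offset applied to the two samples adjacent to the edge. A minimal C-style sketch is given below for illustration only; the Clip1C operation is folded into a clip against the chroma sample range, and the pcm and transquant-bypass overrides described above are omitted.
static int cclip3(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }
/* Sketch of clause 8.6.2.6.8 for one chroma sample pair across the edge. */
void filter_chroma_pair(int *p0, int *q0, int p1, int q1, int tC, int bitDepthC)
{
    int maxVal = (1 << bitDepthC) - 1;
    int delta = cclip3(-tC, tC, ((((*q0 - *p0) << 2) + p1 - q1 + 4) >> 3));  /* (8 929) */
    *p0 = cclip3(0, maxVal, *p0 + delta);                                    /* (8 930) */
    *q0 = cclip3(0, maxVal, *q0 - delta);                                    /* (8 931) */
}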
8.6.3 sampling point adaptive offset procedure
8.6.3.1 overview
The input to this process is the reconstructed picture sample array, recPictureL, before the sample adaptive offset and, when ChromaArrayType does not equal 0, the arrays recPictureCb and recPictureCr.
The output of this process is the modified reconstructed picture sample array saoPictureL after the sample adaptive offset and, when ChromaArrayType does not equal 0, the arrays saoPictureCb and saoPictureCr.
After the deblocking filtering process of decoding a picture is completed, the process is performed based on the CTB.
The modified reconstructed picture sample array saoPictureL and, when ChromaArrayType is not equal to 0, the arrays saoPictureCb and saoPictureCr are initially set equal to the sample values in the reconstructed picture sample array recPictureL and, when ChromaArrayType is not equal to 0, the arrays recPictureCb and recPictureCr, respectively.
For each CTU with CTB location (rx, ry), where rx = 0..PicWidthInCtbsY - 1 and ry = 0..PicHeightInCtbsY - 1, the following applies:
- When tile_group_sao_luma_flag of the current slice group is equal to 1, the CTB modification process specified in clause 8.6.3.2 is invoked with recPicture set equal to recPictureL, cIdx set equal to 0, (rx, ry), and both nCtbSw and nCtbSh set equal to CtbSizeY as inputs, and the modified luma picture sample array saoPictureL as output.
- When ChromaArrayType is not equal to 0 and tile_group_sao_chroma_flag of the current slice group is equal to 1, the CTB modification process specified in clause 8.6.3.2 is invoked with recPicture set equal to recPictureCb, cIdx set equal to 1, (rx, ry), nCtbSw set equal to ( 1 << CtbLog2SizeY ) / SubWidthC, and nCtbSh set equal to ( 1 << CtbLog2SizeY ) / SubHeightC as inputs, and the modified chroma picture sample array saoPictureCb as output.
- When ChromaArrayType is not equal to 0 and tile_group_sao_chroma_flag of the current slice group is equal to 1, the CTB modification process specified in clause 8.6.3.2 is invoked with recPicture set equal to recPictureCr, cIdx set equal to 2, (rx, ry), nCtbSw set equal to ( 1 << CtbLog2SizeY ) / SubWidthC, and nCtbSh set equal to ( 1 << CtbLog2SizeY ) / SubHeightC as inputs, and the modified chroma picture sample array saoPictureCr as output.
8.6.3.2CTB modification procedure
The inputs to this process are:
-an array of picture samples recPicture of the color component cIdx,
a variable cIdx specifying a color component index,
a pair of variables (rx, ry) specifying the CTB location,
CTB width nCtbSw and height nCtbSh.
The output of this process is a modified picture sample array saoPicture of the color component cIdx.
The variable bitDepth is derived as follows:
- If cIdx is equal to 0, bitDepth is set equal to BitDepthY.
Else, bitDepth is set equal to BitDepthC.
The position (xCtb, yCtb), which specifies the left top sample point of the current CTB of the color component cIdx relative to the left top sample point component cIdx of the current picture, is derived as follows:
(xCtb,yCtb)=(rx*nCtbSw,ry*nCtbSh) (8 932)
the sample location within the current CTB is derived as follows:
(xSi,ySj)=(xCtb+i,yCtb+j) (8 933)
(xYi,yYj)=(cIdx==0)?(xSi,ySj):(xSi*SubWidthC,ySj*SubHeightC) (8 934)
For all sample locations (xSi, ySj) and (xYi, yYj), where i = 0..nCtbSw - 1 and j = 0..nCtbSh - 1, the following applies, depending on the values of pcm_loop_filter_disabled_flag, pcm_flag[ xYi ][ yYj ] and cu_transquant_bypass_flag of the codec unit containing the codec block covering recPicture[ xSi ][ ySj ]:
saoPicture [ xSi ] [ ySj ] is not modified if one or more of the following conditions is true:
-pcm _ loop _ filter _ disabled _ flag and pcm _ flag [ xYi ] [ yYj ] are both equal to 1.
-cu _ transquant _ bypass _ flag is equal to 1.
-SaoTypeIdx [ cIdx ] [ rx ] [ ry ] equals 0.
[ Ed. (BB): the highlighted portion is modified to allow for future decision of the conversion/quantization bypass. ]
Else, if SaoTypeIdx [ cIdx ] [ rx ] [ ry ] equals 2, apply the following sequence of steps:
1. based on SaoEoClass [ cIdx ] [ rx ] [ ry ], the values of hPos [ k ] and vPos [ k ] for k 0..1 are specified in table 819.
2. The variable edgeIdx is derived as follows:
-the modified sample point positions (xSik ', ySjk') and (xYik ', yYjk') are derived as follows:
(xSik′,ySjk′)=(xSi+hPos[k],ySj+vPos[k]) (8 935)
(xYik′,yYjk′)=(cIdx==0)?(xSik′,ySjk′):(xSik′*SubWidthC,ySjk′*SubHeightC) (8 936)
-edgeIdx is set equal to 0 if one or more of the following conditions is true for all sample positions (xSik ', ySjk') and (xYik ', yyyjk'), k ═ 0.. 1:
samples at position (xSik ', ySjk') are outside the picture boundary.
-the samples at position (xSik ', ySjk') belong to different groups of slices and one of the following two conditions is true:
-MinTbAddrZs [ xYik '> > MinTbLog2SizeY ] [ yYjk' > > MinTbLog2SizeY ] is less than MinTbAddrZs [ xYi > > MinTbLog2SizeY ] [ yYj > > MinTbLog2SizeY ] and tile _ group _ loop _ filter _ access _ tile _ groups _ enabled _ flag in the tile group to which the sample point recPicture [ xSi ] [ ySj ] belongs is equal to 0.
-MinTbAddrZs [ xYi > > MinTbLog2SizeY ] [ yYj > > MinTbLog2SizeY ] is smaller than MinTbAddrZs [ xYik '> > MinTbLog2SizeY ] [ yYjk' > > MinTbLog2SizeY ] and the tile _ group _ loop _ filter _ across _ tile _ groups _ enabled _ flag in the tile group to which the sample point recPicture [ xSik '] [ ySjk' ] belongs is equal to 0.
-loop _ filter _ across _ tiles _ enabled _ flag is equal to 0 and samples at position (xSik ', ySjk') belong to different slices.
[ Ed. (BB): modifying highlighted portions when merging tiles without a slice group ]
Otherwise, edgeIdx is derived as follows:
-applying the following:
edgeIdx=2+Sign(recPicture[xSi][ySj]-recPicture[xSi+hPos[0]][ySj+vPos[0]])+
Sign(recPicture[xSi][ySj]-recPicture[xSi+hPos[1]][ySj+vPos[1]]) (8 937)
-when edgeIdx equals 0,1, or 2, edgeIdx is modified as follows:
edgeIdx=(edgeIdx==2)?0:(edgeIdx+1) (8 938)
3. the modified picture sample array saoPicture [ xSi ] [ ySj ] is derived as follows:
saoPicture[xSi][ySj]=Clip3(0,(1<<bitDepth)-1,recPicture[xSi][ySj]+
SaoOffsetVal[cIdx][rx][ry][edgeIdx]) (8 939)
-else (SaoTypeIdx [ cIdx ] [ rx ] [ ry ] equal to 1), applying the following sequence of steps:
1. the variable bandShift is set equal to bitDepth-5.
2. The variable saoLeftClass is set equal to sao _ band _ position [ cIdx ] [ rx ] [ ry ].
3. The list bandTable is defined to have 32 elements and all elements are initially set equal to 0. Then, four of the elements (indicating the starting position of the band for a definite offset) are modified as follows:
for (k ═ 0; k < 4; k + +)
bandTable[(k+saoLeftClass)&31]=k+1(8 940)
4. The variable bandIdx is set equal to bandTable [ recorpicture [ xSi ] [ ySj ] > > bandShift ].
5. The modified picture sample array saoPicture [ xSi ] [ ySj ] is derived as follows:
saoPicture[xSi][ySj]=Clip3(0,(1<<bitDepth)-1,recPicture[xSi][ySj]+
SaoOffsetVal[cIdx][rx][ry][bandIdx]) (8 941)
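For a single sample, the band-offset branch therefore reduces to a shift, a 32-entry band lookup, an offset addition and a clip. The C-style sketch below is illustrative only; saoOffsetVal stands for the parsed SaoOffsetVal entries for the current CTB and color component, with saoOffsetVal[0] expected to be 0 so that samples outside the four signalled bands are unchanged.
/* Sketch of the SAO band-offset branch of clause 8.6.3.2 for one sample. */
int sao_band_offset_sample(int sample, int bitDepth, int saoLeftClass, const int saoOffsetVal[5])
{
    int bandTable[32] = { 0 };
    int bandShift = bitDepth - 5;
    for (int k = 0; k < 4; k++)                        /* (8 940) */
        bandTable[(k + saoLeftClass) & 31] = k + 1;
    int bandIdx = bandTable[sample >> bandShift];
    int v = sample + saoOffsetVal[bandIdx];            /* (8 941) with explicit clipping below */
    int maxVal = (1 << bitDepth) - 1;
    if (v < 0) v = 0;
    if (v > maxVal) v = maxVal;
    return v;
}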
Table 8-19 - Specification of hPos and vPos according to the sample adaptive offset class
The sub-block-based prediction techniques discussed above may be used to obtain more accurate motion information for each sub-block when the size of the sub-block is small. However, smaller sub-blocks result in higher bandwidth requirements in motion compensation. On the other hand, the motion information derived for smaller sub-blocks may be inaccurate, especially when there is some noise in the block. Therefore, having a fixed sub-block size within a block may be sub-optimal.
This document describes techniques that may be used in various embodiments to address the bandwidth and precision issues introduced by fixed sub-block sizes using non-uniform and/or variable sub-block sizes. These techniques (also called interleaved prediction) use a different way of subdividing blocks so that motion information can be acquired more robustly without increasing bandwidth consumption.
The block is subdivided into sub-blocks in one or more subdivision modes using an interleaved prediction technique. The subdivision pattern indicates the manner in which the block is subdivided into sub-blocks, including the size of the sub-blocks and the location of the sub-blocks. For each subdivision pattern, a corresponding prediction block may be generated by deriving motion information for each sub-block based on the subdivision pattern. Thus, in some embodiments, multiple prediction blocks may be generated by multiple subdivision patterns, even for one prediction direction. In some embodiments, only one subdivision pattern may be applied for each prediction direction.
Fig. 13 illustrates an example of interleaved prediction with two subdivision patterns in accordance with the techniques of this disclosure. The current block 1300 may be subdivided according to multiple patterns. For example, as shown in fig. 13, the current block is subdivided into both pattern 0 (1301) and pattern 1 (1302). Two prediction blocks, P0 (1303) and P1 (1304), are generated. A final prediction block P (1305) for the current block 1300 may be generated by computing a weighted sum of P0 (1303) and P1 (1304).
More generally, given X subdivision patterns, X prediction blocks of the current block, denoted P0, P1, ..., P(X-1), may be generated by sub-block based prediction with the X subdivision patterns. The final prediction of the current block, denoted P, may be generated as the weighted sum
P(x, y) = ( Σi wi(x, y) × Pi(x, y) ) >> N, where the sum runs over i = 0..X - 1.
Here, (x, y) is the coordinate of a pixel in the block and wi(x, y) is the weight value of Pi. By way of example and not limitation, the weights may be expressed as:
Σi wi(x, y) = 1 << N (16)
where N is a non-negative value. Alternatively, the bit shift operation in equation (16) may also be expressed as:
Σi wi(x, y) = 2^N
Having the sum of the weights be a power of 2 allows the weighted sum P to be calculated more efficiently by performing a bit shift operation rather than a floating-point division.
In the following, various implementations are presented as separate chapters and items. The use of different sections and items in this document is merely for ease of understanding, and the scope of the embodiments and techniques described in each section/item is not limited to that section/item.
Use of interleaved prediction for different coding tools
Item 1: note that the interleaved prediction techniques disclosed herein may be applied in one, some, or all of the codec techniques for sub-block based prediction. For example, the interleaved prediction technique may be applied to affine prediction, while other coding techniques based on prediction of sub-blocks (e.g., ATMVP, STMVP, FRUC, or BIO) do not use interleaved prediction. As another example, affine, ATMVP, and STMVP both apply the interleaved prediction techniques disclosed herein.
Definition of subdivision patterns
Item 2: the subdivision patterns may have different shapes, sizes or locations of the sub-blocks. In some embodiments, the subdivision patterns may include irregular sub-block sizes. Fig. 14A-14G show several examples of subdivision patterns for 16 x 16 blocks. In fig. 14A, a block is subdivided into 4x4 sub-blocks according to the disclosed technique. This pattern is also used in JEM. Fig. 14B illustrates an example of subdividing a block into 8x8 sub-blocks in accordance with the techniques of this disclosure. Fig. 14C illustrates an example of subdividing a block into 8x 4 sub-blocks in accordance with the disclosed techniques. Fig. 14D illustrates an example of subdividing a block into 4x 8 sub-blocks in accordance with the techniques of this disclosure. In fig. 14E, a portion of a block is subdivided into 4x4 sub-blocks in accordance with the techniques of this disclosure. The pixels at the block boundaries are subdivided into smaller sub-blocks having a size such as 2 × 4, 4 × 2 or 2 × 2. Some sub-blocks may be combined to form larger sub-blocks. Fig. 14F shows an example of neighboring sub-blocks, such as a 4 × 4 sub-block and a 2 × 4 sub-block, which are merged to form a larger sub-block having a size such as 6 × 4, 4 × 6, or 6 × 6. In fig. 14G, a portion of a block is subdivided into 8x8 sub-blocks. The pixels at the block boundaries are subdivided into smaller sub-blocks having a size such as 8x 4, 4x 8 or 4x 4.
Item 3: the shape and size of the sub-blocks in sub-block based prediction may be determined based on the shape and/or size of the coded block and/or coded block information. For example, in some embodiments, when the current block has a size of M × N, the sub-blocks have a size of 4 × N (or 8 × N, etc.). That is, the sub-blocks have the same height as the current block. In some embodiments, when the current block has a size of M × N, the sub-blocks have a size of M × 4 (or M × 8, etc.). That is, the sub-blocks have the same width as the current block. In some embodiments, when the current block has a size of M × N (where M > N), the sub-blocks have a size of A × B, where A > B (e.g., 8 × 4). Alternatively, the sub-blocks may have a size of B × A (e.g., 4 × 8). In some embodiments, the current block has a size of M × N, and the sub-blocks have a size of A × B when M × N <= T (or Min(M, N) <= T, or Max(M, N) <= T, etc.), and a size of C × D when M × N > T (or Min(M, N) > T, or Max(M, N) > T, etc.), where A <= C and B <= D. For example, if M × N <= 256, the sub-blocks may be 4 × 4 in size. In some implementations, the sub-blocks have a size of 8 × 8.
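As an illustration of the size-dependent rule in Item 3, a minimal sketch follows; the threshold T and the concrete sub-block sizes are example values only and are assumptions made for this sketch.

    # Hypothetical sketch of a size-dependent sub-block size rule: finer
    # sub-blocks for small blocks, coarser sub-blocks for large blocks.
    def subblock_size(width, height, T=256):
        if width * height <= T:
            return (4, 4)   # example size for small blocks
        return (8, 8)       # example size for larger blocks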
Codec processes for enabling/disabling interleaved prediction
Item 4: in some embodiments, whether to apply interleaved prediction may be determined based on the inter prediction direction. For example, in some embodiments, interleaved prediction may be applied to bi-directional prediction, but not to uni-directional prediction. As another example, when multiple hypotheses are applied and there is more than one reference block for a prediction direction, interleaved prediction may be applied to that prediction direction.
Item 5: in some embodiments, how to apply interleaved prediction may also be determined based on the inter prediction direction. In some embodiments, a bi-prediction block with sub-block based prediction is subdivided into sub-blocks in two different subdivision patterns for the two different reference lists. For example, when predicted from reference list 0 (L0), the bi-prediction block is subdivided into 4 × 8 sub-blocks, as shown in fig. 14D. When predicted from reference list 1 (L1), the same block is subdivided into 8 × 4 sub-blocks, as shown in fig. 14C. The final prediction P is calculated as

    P(x, y) = ( w0(x, y) · P0(x, y) + w1(x, y) · P1(x, y) ) >> N

Here, P0 and P1 are the predictions from L0 and L1, respectively, and w0 and w1 are the weight values for L0 and L1, respectively. As shown in equation (16), the weight values may be determined as w0(x, y) + w1(x, y) = 1 << N (where N is a non-negative integer value). Because fewer sub-blocks are used for prediction in each direction (e.g., 4 × 8 sub-blocks as opposed to 4 × 4 sub-blocks), the computation requires less bandwidth than existing sub-block based approaches. By using larger sub-blocks, the prediction results are also less susceptible to noise interference.
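A rough sketch of this bi-prediction variant is given below. The helper predict_with_pattern(), the pattern identifiers, and the equal weights are assumptions introduced only for illustration.

    # Illustrative only: list 0 uses one subdivision pattern (4x8) and list 1
    # uses another (8x4); the two list predictions are then combined with
    # weights satisfying w0 + w1 == 1 << N.
    def bi_predict_interleaved(block, predict_with_pattern, N=1):
        p0 = predict_with_pattern(block, ref_list=0, pattern="4x8")
        p1 = predict_with_pattern(block, ref_list=1, pattern="8x4")
        height, width = len(p0), len(p0[0])
        w0 = w1 = 1  # example weights with w0 + w1 == 1 << N
        return [[(w0 * p0[y][x] + w1 * p1[y][x]) >> N for x in range(width)]
                for y in range(height)]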
In some embodiments, a uni-directional prediction block with sub-block based prediction is subdivided into sub-blocks in two or more different subdivision patterns for the same reference list. For example, the prediction P^L of list L (L = 0 or 1) is calculated as

    P^L(x, y) = ( Σ_{i=0..XL-1} w^L_i(x, y) · P^L_i(x, y) ) >> N

Here, XL is the number of subdivision patterns for list L, P^L_i is the prediction generated with the i-th subdivision pattern, and w^L_i is the weight value of P^L_i. For example, when XL is 2, two subdivision patterns are applied to list L. In the first subdivision pattern, the block is subdivided into 4 × 8 sub-blocks, as shown in fig. 14D. In the second subdivision pattern, the block is subdivided into 8 × 4 sub-blocks, as shown in fig. 14C.
In one embodiment, a bi-prediction block with sub-block based prediction is treated as a combination of two uni-prediction blocks, from L0 and L1, respectively. The prediction from each list can be derived as described in the examples above. The final prediction P can be calculated as

    P(x, y) = a · P^L0(x, y) + b · P^L1(x, y)
Here, the parameters a and b are two additional weights applied to the two uni-prediction blocks. In this specific example, both a and b may be set to 1. Similar to the example above, because fewer sub-blocks are used for prediction in each direction (e.g., 4 × 8 sub-blocks as opposed to 4 × 4 sub-blocks), bandwidth usage is better than or on par with existing sub-block based approaches. At the same time, the prediction results may be improved by using larger sub-blocks.
In some embodiments, a single non-uniform pattern may be used in each uni-directional prediction block. For example, for each list L (e.g., L0 or L1), the block is subdivided according to a different pattern (e.g., as shown in fig. 14E or 14F). Using a smaller number of sub-blocks reduces the bandwidth requirements. The non-uniformity of the sub-blocks also increases the robustness of the prediction results.
In some embodiments, for a multi-hypothesis coded block, for each prediction direction (or reference picture list), there may be more than one prediction block generated by different subdivision patterns. Multiple prediction blocks may be used to generate a final prediction with additional weights applied. For example, the additional weight may be set to 1/M, where M is the total number of generated prediction blocks.
Item 6: in some embodiments, the encoder may determine whether and how to apply interleaved prediction. The encoder may then send information corresponding to the determination to the decoder at the sequence level, picture level, view level, slice level, Codec Tree Unit (CTU) (also referred to as Largest Codec Unit (LCU)) level, CU level, PU level, Transform Unit (TU) level, slice group level, or region level (where a region may include multiple CUs/PUs/TUs/LCUs). The information may be signaled in a Sequence Parameter Set (SPS), a View Parameter Set (VPS), a Picture Parameter Set (PPS), a Slice Header (SH), a picture header, a sequence header, at the slice level or slice group level, or in the first block of a CTU/LCU, CU, PU, TU, or region.
In some implementations, the interleaved prediction is applicable to existing sub-block methods, such as affine prediction, ATMVP, STMVP, FRUC, or BIO. In this case, no additional signaling cost is required. In some implementations, the new sub-block Merge candidates generated by the interleaved prediction may be inserted into a Merge list, e.g., interleaved prediction + ATMVP, interleaved prediction + STMVP, interleaved prediction + FRUC, etc. In some implementations, a flag may be signaled to indicate whether to use interleaved prediction. In one example, if the current block is affine inter-coded, a flag is signaled to indicate whether to use interleaved prediction. In some implementations, if the current block is affine Merge codec and unidirectional prediction is applied, a flag may be signaled to indicate whether to use interleaved prediction. In some implementations, if the current block is affine Merge codec, a flag may be signaled to indicate whether to use interleaved prediction. In some implementations, if the current block is affine Merge codec and unidirectional prediction is applied, then interleaved prediction may always be used. In some implementations, if the current block is affine Merge codec, then interleaved prediction may always be used.
In some implementations, the flag indicating whether to use interleaved prediction can be inherited without being signaled. Some examples include:
(i) in one example, if the current block is affine Merge coded, inheritance may be used.
(ii) In one example, the flag may be inherited from the flag of the neighboring block from which the affine model is inherited.
(iii) In one example, the flag is inherited from a predetermined neighboring block, such as a left or upper neighboring block.
(iv) In one example, the flag may be inherited from neighboring blocks of the affine codec encountered first.
(v) In one example, if no neighboring blocks are affine codec, the flag may be inferred to be zero.
(vi) In one example, the flag may be inherited only when unidirectional prediction is applied to the current block.
(vii) In one example, the flag may be inherited only if the current block and its neighboring block to be inherited are in the same CTU.
(viii) In one example, the flag may be inherited only if the current block and its neighboring block to be inherited are in the same CTU row.
(ix) In one example, when the affine model is derived from the temporal neighboring blocks, the flag may not be inherited from the flags of the neighboring blocks.
(x) In one example, the flags may not be inherited from flags of neighboring blocks that are not located in the same LCU or LCU row or video data processing unit (such as 64 x 64 or 128 x 128).
(xi) In one example, how the flag is signaled and/or derived may depend on the block size and/or codec information of the current block.
In some implementations, if the reference picture is a current picture, then interleaved prediction is not applied. For example, if the reference picture is a current picture, a flag indicating whether to use interleaved prediction is not signaled.
In some embodiments, the subdivision pattern to be used by the current block may be derived based on information from spatial and/or temporal neighboring blocks. For example, rather than relying on the encoder to signal the relevant information, both the encoder and decoder may employ a set of predetermined rules to obtain the subdivision pattern based on temporal adjacency (e.g., the subdivision pattern of the same block previously used) or spatial adjacency (e.g., the subdivision pattern used by neighboring blocks).
Weight values
Item 7: in some embodiments, the weight value w may be fixed. For example, all subdivision patterns may be weighted equally: wi(x, y) = 1.
Item 8: in some embodiments, the weight values may be determined based on the position within the block and the subdivision pattern used. For example, wi(x, y) may be different for different (x, y). In some embodiments, the weight values may also depend on the sub-block prediction based codec technique (e.g., affine or ATMVP) and/or other codec information (e.g., skip or non-skip mode, and/or MV information).
Item 9: in some embodiments, the encoder may determine the weight values and send them to the decoder at the sequence level, picture level, slice level, CTU/LCU level, CU level, PU level, or region level (where a region may include multiple CUs/PUs/TUs/LCUs). The weight values may be signaled in a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), a Slice Header (SH), or the first block of a CTU/LCU, CU, PU, or region. In some embodiments, the weight values may be derived from the weight values of spatially and/or temporally neighboring blocks.
Partial interleaved prediction
Item 10: in some embodiments, partial interleaved prediction may be implemented as follows.
In some embodiments, interleaved prediction is applied to a portion of the current block. The predicted samples at some locations are computed as a weighted sum of two or more sub-block based predictions. The predicted samples at other locations are not used for weighted sum. For example, these prediction samples are copied from the subblock-based prediction with a particular subdivision pattern.
In some embodiments, the current block is predicted by sub-block based predictions P0 and P1 with subdivision pattern D0 and subdivision pattern D1, respectively. The final prediction is calculated as P = w0 × P0 + w1 × P1. At some positions, w0 ≠ 0 and w1 ≠ 0, but at some other positions w0 = 1 and w1 = 0, i.e., interleaved prediction is not applied at those positions.
In some embodiments, interleaved prediction is not applied on the four corner sub-blocks, as shown in fig. 15A.
In some embodiments, interleaved prediction is not applied to the leftmost column of the sub-block and the rightmost column of the sub-block, as shown in fig. 15B.
In some embodiments, interleaved prediction is not applied to the top-most row of sub-blocks and the bottom-most row of sub-blocks, as shown in fig. 15C.
In some embodiments, interleaved prediction is not applied to the top-most row of sub-blocks, the bottom-most row of sub-blocks, the left-most column of sub-blocks, and the right-most column of sub-blocks, as shown in fig. 15D.
In some embodiments, whether and how partial interleaved prediction is applied may depend on the size/shape of the current block.
For example, in some embodiments, if the size of the current block satisfies a certain condition, interleaved prediction is applied to the entire block; otherwise, interleaved prediction is applied to a portion (or portions) of the block. The conditions include, but are not limited to, the following (assuming that the width and height of the current block are W and H, respectively, and T, T1, and T2 are integer values):
W >= T1 and H >= T2;
W <= T1 and H <= T2;
W >= T1 or H >= T2;
W <= T1 or H <= T2;
W+H>=T
W+H<=T
W×H>=T
W×H<=T
in some embodiments, the partially interleaved prediction is applied to portions of the current block that are smaller than the current block. For example, in some embodiments, the portion of the block excludes the sub-blocks as follows. In some embodiments, if W ≧ H, the interleaved prediction does not apply to the leftmost column of the sub-block and the rightmost column of the sub-block as shown in FIG. 15B; otherwise, the interleaved prediction is not applied to the topmost row of sub-blocks and the bottommost row of sub-blocks as shown in fig. 15C.
For example, in some embodiments, if W > H, then interleaved prediction is not applied to the leftmost column of sub-blocks and the rightmost column of sub-blocks as shown in fig. 15B; otherwise, the interleaved prediction is not applied to the topmost row of sub-blocks and the bottommost row of sub-blocks as shown in fig. 15C.
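The shape-dependent exclusion just described might be sketched as follows; the 4 × 4 sub-block granularity and the W >= H rule are taken from the first example above, and everything else is an assumption made for illustration.

    # Hypothetical sketch: per-sub-block mask marking where interleaved
    # prediction is applied.  Boundary columns are excluded when W >= H,
    # boundary rows otherwise.  4x4 sub-blocks are assumed.
    def interleave_mask(W, H, sub=4):
        cols, rows = W // sub, H // sub
        mask = [[True] * cols for _ in range(rows)]
        if W >= H:
            for r in range(rows):
                mask[r][0] = mask[r][cols - 1] = False  # left/right columns
        else:
            for c in range(cols):
                mask[0][c] = mask[rows - 1][c] = False  # top/bottom rows
        return mask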
In some embodiments, whether and how interleaved prediction is applied may be different for different regions in a block. For example, assume that the current block is predicted by sub-block based predictions P0 and P1 with subdivision pattern D0 and subdivision pattern D1, respectively. The final prediction is calculated as P(x, y) = w0 × P0(x, y) + w1 × P1(x, y). If the position (x, y) belongs to a sub-block of size S0 × H0 with subdivision pattern D0 and to a sub-block of size S1 × H1 with subdivision pattern D1, then w0 is set to 1 and w1 is set to 0 (e.g., interleaved prediction is not applied at that position) if one or more of the following conditions are met:
S1<T1;
H1<T2;
S1 < T1 and H1 < T2; or
S1 < T1 or H1 < T2,
Here, T1 and T2 are integers. For example, T1 = T2 = 4.
Examples of techniques integrated with encoder embodiments
Item 11: in some embodiments, interleaved prediction is not applied in the Motion Estimation (ME) process.
For example, interleaved prediction is not applied in the ME process for 6-parameter affine prediction.
For example, if the size of the current block satisfies a certain condition such as the following, the interleaving prediction is not applied in the ME procedure. Here, it is assumed that the width and height of the current block are W and H, respectively, and T, T1, T2 are integer values:
W >= T1 and H >= T2;
W <= T1 and H <= T2;
W >= T1 or H >= T2;
W <= T1 or H <= T2;
W+H>=T
W+H<=T
W×H>=T
W×H<=T
for example, if the current block is divided from the parent block, and the parent block does not select an affine mode at the encoder, the interleaved prediction is omitted in the ME process.
Alternatively, if the current block is divided from the parent block, and the parent block does not select the affine mode at the encoder, the affine mode is not checked at the encoder.
Exemplary embodiments of MV derivation
In the following examples, SatShift(x, n) is defined as

    SatShift(x, n) = (x + offset0) >> n,        if x >= 0
    SatShift(x, n) = -((-x + offset1) >> n),    if x < 0

Shift(x, n) is defined as Shift(x, n) = (x + offset0) >> n. In one example, offset0 and/or offset1 is set to (1 << n) >> 1 or (1 << (n - 1)). In another example, offset0 and/or offset1 is set to 0.
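The two rounding shifts can be written out directly. The small Python sketch below follows the first offset alternative given above (offset = (1 << (n - 1)) when n > 0, and 0 otherwise); this particular offset choice is an assumption made for the example.

    # Sketch of the rounding shifts used in the MV derivations below.
    def Shift(x, n):
        offset0 = (1 << (n - 1)) if n > 0 else 0
        return (x + offset0) >> n

    def SatShift(x, n):
        offset = (1 << (n - 1)) if n > 0 else 0
        if x >= 0:
            return (x + offset) >> n
        return -((-x + offset) >> n)  # shift the magnitude, keep the sign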
Item 12: in some embodiments, the MV of each sub-block in one subdivision pattern may be derived directly from an affine model, such as with equation (1), or it may be derived from the MV of the sub-block within another subdivision pattern.
(a) In one example, the MVs of sub-block B having subdivision pattern 0 may be derived from MVs of some or all of the sub-blocks within subdivision pattern 1 that overlap sub-block B.
(b) Figures 16A-16C illustrate some examples. In fig. 16A, the MV of a particular sub-block within subdivision pattern 1, MV1(x, y), is to be derived. Fig. 16B shows subdivision pattern 0 (solid line) and subdivision pattern 1 (dotted line) in a block, indicating that four sub-blocks within subdivision pattern 0 overlap the particular sub-block within subdivision pattern 1. Fig. 16C shows the four MVs of these four sub-blocks within subdivision pattern 0: MV0(x-2, y-2), MV0(x+2, y-2), MV0(x-2, y+2), and MV0(x+2, y+2). MV1(x, y) will then be derived from MV0(x-2, y-2), MV0(x+2, y-2), MV0(x-2, y+2), and MV0(x+2, y+2).
(c) Suppose that the MV' of one sub-block within subdivision pattern 1 is derived from the MVs of k+1 sub-blocks within subdivision pattern 0, denoted MV0, MV1, MV2, ..., MVk. MV' can be derived as:
(i) MV' = MVn, where n is any value in 0...k.
(ii) MV' = f(MV0, MV1, MV2, ..., MVk), where f is a linear function.
(iii) MV' = f(MV0, MV1, MV2, ..., MVk), where f is a non-linear function.
(iv) MV' = Average(MV0, MV1, MV2, ..., MVk), where Average is the averaging operation.
(v) MV' = Median(MV0, MV1, MV2, ..., MVk), where Median is the operation that gets the median value.
(vi) MV' = Max(MV0, MV1, MV2, ..., MVk), where Max is the operation that gets the maximum value.
(vii) MV' = Min(MV0, MV1, MV2, ..., MVk), where Min is the operation that gets the minimum value.
(viii) MV' = MaxAbs(MV0, MV1, MV2, ..., MVk), where MaxAbs is the operation that gets the value with the largest absolute value.
(ix) MV' = MinAbs(MV0, MV1, MV2, ..., MVk), where MinAbs is the operation that gets the value with the smallest absolute value.
(x) Taking FIG. 16A as an example, MV1(x, y) can be derived as any of the following (see the sketch after this list):
1. MV1(x, y) = SatShift(MV0(x-2, y-2) + MV0(x+2, y-2) + MV0(x-2, y+2) + MV0(x+2, y+2), 2);
2. MV1(x, y) = Shift(MV0(x-2, y-2) + MV0(x+2, y-2) + MV0(x-2, y+2) + MV0(x+2, y+2), 2);
3. MV1(x, y) = SatShift(MV0(x-2, y-2) + MV0(x+2, y-2), 1);
4. MV1(x, y) = Shift(MV0(x-2, y-2) + MV0(x+2, y-2), 1);
5. MV1(x, y) = SatShift(MV0(x-2, y+2) + MV0(x+2, y+2), 1);
6. MV1(x, y) = Shift(MV0(x-2, y+2) + MV0(x+2, y+2), 1);
7. MV1(x, y) = SatShift(MV0(x-2, y-2) + MV0(x+2, y+2), 1);
8. MV1(x, y) = Shift(MV0(x-2, y-2) + MV0(x+2, y+2), 1);
9. MV1(x, y) = SatShift(MV0(x-2, y-2) + MV0(x-2, y+2), 1);
10. MV1(x, y) = Shift(MV0(x-2, y-2) + MV0(x-2, y+2), 1);
11. MV1(x, y) = SatShift(MV0(x+2, y-2) + MV0(x+2, y+2), 1);
12. MV1(x, y) = Shift(MV0(x+2, y-2) + MV0(x+2, y+2), 1);
13. MV1(x, y) = SatShift(MV0(x+2, y-2) + MV0(x-2, y+2), 1);
14. MV1(x, y) = Shift(MV0(x+2, y-2) + MV0(x-2, y+2), 1);
15. MV1(x, y) = MV0(x-2, y-2);
16. MV1(x, y) = MV0(x+2, y-2);
17. MV1(x, y) = MV0(x-2, y+2); or
18. MV1(x, y) = MV0(x+2, y+2).
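As a concrete illustration of option 1 in the list above, the following sketch derives MV1(x, y) as the rounded, sign-preserving average of the four overlapping pattern-0 MVs; representing an MV as a (mvx, mvy) tuple is an assumption made for the example.

    # Illustrative sketch of option 1 above.  MVs are (mvx, mvy) integer
    # tuples; the local sat_shift mirrors the SatShift definition above.
    def sat_shift(v, n):
        offset = (1 << (n - 1)) if n > 0 else 0
        return (v + offset) >> n if v >= 0 else -((-v + offset) >> n)

    def derive_mv1(mv_a, mv_b, mv_c, mv_d):
        # average of the four overlapping pattern-0 MVs
        return tuple(sat_shift(mv_a[c] + mv_b[c] + mv_c[c] + mv_d[c], 2)
                     for c in range(2))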
Item 13: in some embodiments, how the subdivision pattern is selected may depend on the width and height of the current block.
(a) For example, if width > T1 and height > T2 (e.g., T1 = T2 = 4), two subdivision patterns are selected. Fig. 17A shows an example of the two subdivision patterns (a sketch of this selection is given at the end of Item 13).
(b) For example, if height <= T2 (e.g., T2 = 4), two other subdivision patterns are selected. Fig. 17B shows an example of the two subdivision patterns.
(c) For example, if width <= T1 (e.g., T1 = 4), two further subdivision patterns are selected. Fig. 17C shows an example of the two subdivision patterns.
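A compact sketch of this width/height dependent selection follows; T1 = T2 = 4 are the example thresholds above, and the returned string identifiers merely stand in for the pattern pairs of figs. 17A-17C.

    # Hypothetical sketch: pick a pair of subdivision patterns from the
    # block dimensions (thresholds are example values, not normative ones).
    def select_patterns(width, height, T1=4, T2=4):
        if width > T1 and height > T2:
            return ("fig17A_pattern0", "fig17A_pattern1")
        if height <= T2:
            return ("fig17B_pattern0", "fig17B_pattern1")
        return ("fig17C_pattern0", "fig17C_pattern1")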
Item 14: in some embodiments, the MV of each sub-block within one subdivision pattern of one color component C1 may be derived from the MV of the sub-block within another subdivision pattern of another color component C0.
(a) For example, C1 refers to a color component that is codec/decoded after another color component, such as Cb or Cr or U or V or R or B.
(b) For example, C0 refers to a color component that is coded/decoded before another color component, such as Y or G.
(c) In one example, how to derive the MVs of the sub-blocks within one subdivision pattern of one color component from the MVs of the sub-blocks within another subdivision pattern of another color component may depend on the color format, such as 4:2:0, 4:2:2, or 4:4:4.
(d) In one example, after scaling the coordinates down or up according to the color format, the MVs of sub-block B in color component C1 with subdivision pattern C1Pt (t = 0 or 1) may be derived from the MVs of some or all of the sub-blocks of color component C0 within subdivision pattern C0Pr (r = 0 or 1) that overlap sub-block B.
(i) In one example, C0Pr is always equal to C0P0.
(e) Fig. 18A and 18B show two examples. The color format is 4:2:0. The MVs of the sub-blocks in the Cb component are derived from the MVs of the sub-blocks in the Y component.
(i) On the left side of fig. 18A, the MV of a particular Cb sub-block B within subdivision pattern 0, MVCb0(x', y'), is to be derived. The right side of fig. 18A shows the four Y sub-blocks within subdivision pattern 0 that overlap Cb sub-block B after 2:1 down-scaling. Assuming that x = 2x' and y = 2y', the four MVs of these four Y sub-blocks within subdivision pattern 0, MV0(x-2, y-2), MV0(x+2, y-2), MV0(x-2, y+2), and MV0(x+2, y+2), are used to derive MVCb0(x', y').
(ii) On the left side of fig. 18B, the MV of a particular Cb sub-block B within subdivision pattern 1, MVCb1(x', y'), is to be derived. The right side of fig. 18B shows the four Y sub-blocks within subdivision pattern 0 that overlap Cb sub-block B after 2:1 down-scaling. Assuming that x = 2x' and y = 2y', the four MVs of these four Y sub-blocks within subdivision pattern 0, MV0(x-2, y-2), MV0(x+2, y-2), MV0(x-2, y+2), and MV0(x+2, y+2), are used to derive MVCb1(x', y').
(f) Suppose that the MV' of one sub-block of color component C1 is derived from the MVs of k+1 sub-blocks of color component C0, denoted MV0, MV1, MV2, ..., MVk. MV' can be derived as:
(i) MV' = MVn, where n is any value in 0...k.
(ii) MV' = f(MV0, MV1, MV2, ..., MVk), where f is a linear function.
(iii) MV' = f(MV0, MV1, MV2, ..., MVk), where f is a non-linear function.
(iv) MV' = Average(MV0, MV1, MV2, ..., MVk), where Average is the averaging operation.
(v) MV' = Median(MV0, MV1, MV2, ..., MVk), where Median is the operation that gets the median value.
(vi) MV' = Max(MV0, MV1, MV2, ..., MVk), where Max is the operation that gets the maximum value.
(vii) MV' = Min(MV0, MV1, MV2, ..., MVk), where Min is the operation that gets the minimum value.
(viii) MV' = MaxAbs(MV0, MV1, MV2, ..., MVk), where MaxAbs is the operation that gets the value with the largest absolute value.
(ix) MV' = MinAbs(MV0, MV1, MV2, ..., MVk), where MinAbs is the operation that gets the value with the smallest absolute value.
(x) Taking fig. 18A and 18B as an example, MVCbt(x', y'), where t is 0 or 1, can be derived as any of the following (see the sketch after this list):
1. MVCbt(x', y') = SatShift(MV0(x-2, y-2) + MV0(x+2, y-2) + MV0(x-2, y+2) + MV0(x+2, y+2), 2);
2. MVCbt(x', y') = Shift(MV0(x-2, y-2) + MV0(x+2, y-2) + MV0(x-2, y+2) + MV0(x+2, y+2), 2);
3. MVCbt(x', y') = SatShift(MV0(x-2, y-2) + MV0(x+2, y-2), 1);
4. MVCbt(x', y') = Shift(MV0(x-2, y-2) + MV0(x+2, y-2), 1);
5. MVCbt(x', y') = SatShift(MV0(x-2, y+2) + MV0(x+2, y+2), 1);
6. MVCbt(x', y') = Shift(MV0(x-2, y+2) + MV0(x+2, y+2), 1);
7. MVCbt(x', y') = SatShift(MV0(x-2, y-2) + MV0(x+2, y+2), 1);
8. MVCbt(x', y') = Shift(MV0(x-2, y-2) + MV0(x+2, y+2), 1);
9. MVCbt(x', y') = SatShift(MV0(x-2, y-2) + MV0(x-2, y+2), 1);
10. MVCbt(x', y') = Shift(MV0(x-2, y-2) + MV0(x-2, y+2), 1);
11. MVCbt(x', y') = SatShift(MV0(x+2, y-2) + MV0(x+2, y+2), 1);
12. MVCbt(x', y') = Shift(MV0(x+2, y-2) + MV0(x+2, y+2), 1);
13. MVCbt(x', y') = SatShift(MV0(x+2, y-2) + MV0(x-2, y+2), 1);
14. MVCbt(x', y') = Shift(MV0(x+2, y-2) + MV0(x-2, y+2), 1);
15. MVCbt(x', y') = MV0(x-2, y-2);
16. MVCbt(x', y') = MV0(x+2, y-2);
17. MVCbt(x', y') = MV0(x-2, y+2); or
18. MVCbt(x', y') = MV0(x+2, y+2).
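For illustration, the 4:2:0 coordinate scaling plus averaging of option 1 above might look like the following; mv_y0 is an assumed lookup that returns the pattern-0 luma MV of the sub-block covering a given luma position, and the tuple representation of an MV is likewise an assumption.

    # Sketch only: derive the Cb MV at chroma coordinate (x', y') for 4:2:0 by
    # scaling the coordinates up (x = 2x', y = 2y') and averaging the four
    # overlapping pattern-0 luma MVs (option 1 above).
    def derive_cb_mv(x_c, y_c, mv_y0):
        def sat_shift(v, n):
            offset = (1 << (n - 1)) if n > 0 else 0
            return (v + offset) >> n if v >= 0 else -((-v + offset) >> n)
        x, y = 2 * x_c, 2 * y_c
        mvs = [mv_y0(x - 2, y - 2), mv_y0(x + 2, y - 2),
               mv_y0(x - 2, y + 2), mv_y0(x + 2, y + 2)]
        return tuple(sat_shift(sum(mv[c] for mv in mvs), 2) for c in range(2))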
Exemplary embodiments of interleaved prediction for bi-directional prediction
Item 15: in some embodiments, when interleaved prediction is applied on bi-directional prediction, the following methods may be applied to avoid an increase of the internal bit depth due to the different weights:
(a) For list X (X = 0 or 1), PX(x, y) = Shift(W0(x, y) * PX0(x, y) + W1(x, y) * PX1(x, y), SW), where PX(x, y) is the prediction of list X, and PX0(x, y) and PX1(x, y) are the predictions of list X with subdivision pattern 0 and subdivision pattern 1, respectively. W0 and W1 are integers representing the interleaved prediction weight values, and SW represents the precision of the weight values.
(b) The final prediction value is derived as P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB), where Wb0 and Wb1 are integers used in weighted bi-prediction, and SWB is the precision. When there is no weighted bi-prediction, Wb0 = Wb1 = SWB = 1.
(c) In some embodiments, PX0(x, y) and PX1(x, y) may be kept at the precision of the interpolation filtering. For example, they may be stored as 16-bit unsigned integers. The final prediction value is derived as P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB + PB), where PB is the additional precision from the interpolation filtering, e.g., PB = 6. In this case, W0(x, y) * PX0(x, y) or W1(x, y) * PX1(x, y) may exceed 16 bits. It is proposed that PX0(x, y) and PX1(x, y) be first right-shifted to a lower precision to avoid exceeding 16 bits.
(i) For example, for list X (X = 0 or 1), PX(x, y) = Shift(W0(x, y) * PLX0(x, y) + W1(x, y) * PLX1(x, y), SW), where PLX0(x, y) = Shift(PX0(x, y), M) and PLX1(x, y) = Shift(PX1(x, y), M). Then, the final prediction is derived as P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB + PB - M). For example, M is set to 2 or 3 (a sketch of this flow is given at the end of Item 15).
(d) The above method may also be applied to other Bi-Prediction methods with different weighting factors for two reference Prediction blocks, such as Generalized Bi-Prediction (GBi, where the weights may be e.g. 3/8, 5/8), weighted Prediction (where the weights may be large values).
(e) The above method may also be applied to other multi-hypothesis uni-directional prediction or bi-directional prediction methods with different weight factors for different reference prediction blocks.
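A sketch of the fixed-point flow of (c)(i) is given below for a single sample position. The unit weights, SW = SWB = 1, PB = 6, and M = 2 are example values only; the helper names are assumptions made for this illustration.

    # Illustrative fixed-point combination following (c)(i) above.  p00/p01 and
    # p10/p11 are the two interleaved predictions of list 0 and list 1 kept at
    # interpolation precision (PB extra bits); pre-shifting by M keeps the
    # weighted sums within 16 bits.
    def shift(v, n):
        return (v + ((1 << (n - 1)) if n > 0 else 0)) >> n

    def combine_fixed_point(p00, p01, p10, p11,
                            W0=1, W1=1, SW=1, Wb0=1, Wb1=1, SWB=1, PB=6, M=2):
        def per_list(px0, px1):
            pl0, pl1 = shift(px0, M), shift(px1, M)  # drop M bits of precision
            return shift(W0 * pl0 + W1 * pl1, SW)    # interleaved combination
        p0, p1 = per_list(p00, p01), per_list(p10, p11)
        return shift(Wb0 * p0 + Wb1 * p1, SWB + PB - M)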
Exemplary embodiments of Block size dependency
Item 16: whether and/or how the interleaved prediction is applied may depend on the block width W and the height H.
a. In one example, whether and/or how to apply interleaving prediction may depend on the size of the VPDU (video processing data unit, which typically represents the maximum allowed block size processed in a hardware design).
b. In one example, when interleaving prediction is disabled for a particular block size (or block with particular codec information), the original prediction method may be employed.
i. Alternatively, the affine mode may be disabled directly for such blocks.
c. In one example, when W > T1 and H > T2, interleaved prediction may not be used. For example, T1 = T2 = 64;
d. In one example, when W > T1 or H > T2, interleaved prediction may not be used. For example, T1 = T2 = 64;
e. In one example, when W × H > T, interleaved prediction may not be used. For example, T = 64 × 64;
f. In one example, when W < T1 and H < T2, interleaved prediction may not be used. For example, T1 = T2 = 16;
g. In one example, when W < T1 or H < T2, interleaved prediction may not be used. For example, T1 = T2 = 16;
h. In one example, when W × H < T, interleaved prediction may not be used. For example, T = 16 × 16 (a sketch of such size-based gating is given at the end of Item 16);
i. In one example, for a sub-block (e.g., codec unit) that is not located at a block boundary, interleaved affine prediction may be disabled for this sub-block. Alternatively, or in addition, the prediction result of the original affine prediction method can be used directly as the final prediction of this sub-block.
j. In one example, when W > T1 and H > T2, interleaved prediction is used in a different manner. For example, T1 = T2 = 64;
k. In one example, when W > T1 or H > T2, interleaved prediction is used in a different manner. For example, T1 = T2 = 64;
l. In one example, when W × H > T, interleaved prediction is used in a different way. For example, T = 64 × 64;
m. In one example, when W < T1 and H < T2, interleaved prediction is used in a different way. For example, T1 = T2 = 16;
n. In one example, when W < T1 or H < T2, interleaved prediction is used in a different way. For example, T1 = T2 = 16;
o. In one example, when W × H < T, interleaved prediction is used in a different way. For example, T = 16 × 16;
p. In one example, when H > X (e.g., H equals 128, X equals 64), interleaved prediction is not applied to samples belonging to sub-blocks that cross both the upper W × (H/2) partition and the lower W × (H/2) partition of the current block;
q. In one example, when W > X (e.g., W equals 128, X equals 64), interleaved prediction is not applied to samples belonging to sub-blocks that cross both the left (W/2) × H partition and the right (W/2) × H partition of the current block;
r. In one example, when W > X and H > Y (e.g., W = H = 128, X = Y = 64),
i. interleaved prediction is not applied to samples belonging to sub-blocks that cross both the left (W/2) × H partition and the right (W/2) × H partition of the current block;
ii. interleaved prediction is not applied to samples belonging to sub-blocks that cross both the upper W × (H/2) partition and the lower W × (H/2) partition of the current block;
s. In one example, interleaved prediction is enabled only for blocks having a particular set of widths and/or heights.
t. in one example, interleaved prediction is disabled only for blocks having a particular set of widths and/or heights.
u. In one example, interleaved prediction is only used for certain types of pictures/slices/slice groups or other types of video data units.
i. For example, interleaved prediction is used only for P pictures or B pictures;
for example, a flag is signaled in the header of a picture/slice group/slice to indicate whether interleaved prediction can be used.
For example, the flag is signaled only if affine prediction is allowed.
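One possible reading of the size constraints in Item 16 is sketched below as an enable check; the thresholds are the example values mentioned above and are assumptions rather than normative limits.

    # Hypothetical gating of interleaved prediction by block size.
    def interleaved_prediction_allowed(W, H, max_dim=64, min_area=16 * 16):
        if W > max_dim or H > max_dim:   # too large (e.g., exceeds a VPDU)
            return False
        if W * H < min_area:             # too small
            return False
        return True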
Item 17: it is proposed to signal a message indicating whether/how the block width and height dependency of interleaved prediction is applied. The message may be signaled in the SPS/VPS/PPS/slice header/picture header/slice group header/CTU row/CTUs/other types of video processing units.
Item 18: in one example, when interleaved prediction is used, bi-directional prediction is not allowed.
a. For example, when using interleaved prediction, an index indicating whether or not bi-prediction is used is not signaled.
b. Alternatively, the indication of whether bi-prediction is not allowed may be signaled in SPS/VPS/PPS/slice header/picture header/slice group header/CTU row/CTUs.
Item 19: it is proposed to refine the motion information of the sub-blocks also based on motion information derived from two or more patterns.
a. In one example, the refined motion information may be used to predict the following blocks to be coded.
b. In one example, refined motion information may be employed in the filtering process, such as deblocking, SAO, ALF.
c. Whether to store the refined information may be based on the position of the subblock relative to the entire block/CTU row/slice group/picture.
d. Whether to store the refined information may be based on a codec mode of the current block and/or the neighboring block.
e. Whether to store the refined information may be based on the size of the current block.
f. Whether to store the refined information may be based on picture/slice type/reference picture list, etc.
Item 20: it is proposed whether and/or how to apply a deblocking process or other kind of filtering process (such as SAO, adaptive loop filtering) may depend on whether or not to apply interleaved prediction.
a. In one example, deblocking is not performed on an edge between two sub-blocks in one subdivision pattern of the block if that edge lies within a sub-block in another subdivision pattern of the block (a sketch of such an adjustment is given at the end of Item 20).
b. In one example, deblocking is weakened on an edge between two sub-blocks in one subdivision pattern of a block if the edge is within the sub-blocks in another subdivision pattern of the block.
i. In one example, the bS[xDi][yDj] described in the VVC deblocking process is reduced for such edges.
ii. In one example, β described in the VVC deblocking process is reduced for such edges.
iii. In one example, the deblocking of such edges is weakened as described in the VVC deblocking process.
iv. In one example, tC described in the VVC deblocking process is reduced for such edges.
c. In one example, deblocking is enhanced on an edge between two sub-blocks in one subdivision pattern of a block if the edge is within the sub-blocks in another subdivision pattern of the block.
i. In one example, the bS[xDi][yDj] described in the VVC deblocking process is raised for such edges.
ii. In one example, β described in the VVC deblocking process is raised for such edges.
iii. In one example, the deblocking of such edges is enhanced as described in the VVC deblocking process.
iv. In one example, tC described in the VVC deblocking process is raised for such edges.
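A sketch of the kind of boundary-strength adjustment described in Item 20 follows; edge_inside_other_pattern() is an assumed helper that tests whether an edge lies inside a sub-block of the other subdivision pattern, and the specific weakening rule is an assumption made for illustration.

    # Illustrative only: skip or weaken deblocking on an edge between two
    # pattern-0 sub-blocks when that edge falls inside a pattern-1 sub-block.
    def adjust_boundary_strength(bS, edge, edge_inside_other_pattern,
                                 skip_instead_of_weaken=False):
        if edge_inside_other_pattern(edge):
            if skip_instead_of_weaken:
                return 0                  # no deblocking on this edge
            return max(bS - 1, 0)         # weakened boundary strength
        return bS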
Item 21: it is proposed whether and/or how to apply local illumination compensation or weighted prediction to the block/sub-block may depend on whether or not interleaved prediction is applied.
a. In one example, when a block is coded with an interleaved prediction mode, local illumination compensation or weighted prediction is not allowed for the block.
b. Alternatively, further, if interleaved prediction is applied to the block/sub-block, no indication of the enabling of local illumination compensation needs to be signaled.
Item 22: it is proposed that bi-directional optical flow (BIO) can be skipped when weighted prediction is applied to a block or sub-block.
a. In one example, the BIO may be applied to a block with weighted prediction.
b. In one example, the BIO may be applied to blocks with weighted prediction, however, certain conditions should be met.
i. In one example, it is required that at least one parameter should be within a range, or equal to a specific value.
in one example, certain reference picture restrictions may apply.
The above-described embodiments and examples may be implemented in the context of the methods of figs. 19 to 22D, as described next.
Fig. 19 illustrates an exemplary flow diagram of a method 1900 of video processing based on some implementations of the disclosed technology. The method 1900 includes, at 1902, deriving one or more motion vectors belonging to a first set of sub-blocks of a first subdivision pattern of a current video block of the video. The method 1900 includes, at 1904, converting between the current video block and a codec representation of the video based on the one or more motion vectors.
Fig. 20 illustrates an exemplary flow diagram of a method 2000 of video processing based on some implementations of the disclosed technology. The method 2000 includes, at 2002, subdividing video blocks of a first color component to obtain a first set of sub-blocks of the first color component. The method 2000 also includes, at 2004, subdividing the corresponding video blocks of the second color component to obtain a second set of sub-blocks of the second color component. The method 2000 also includes, at 2006, deriving one or more motion vectors of the first set of sub-blocks based on the one or more motion vectors of the second set of sub-blocks. The method 2000 also includes, at 2008, converting between the video block and the codec representation of the video based on the one or more motion vectors of the first set and the second set of sub-blocks.
Fig. 21A illustrates an exemplary flow diagram of a method 2110 of video processing based on some implementations of the disclosed technology. The method 2110 includes, at 2112, for a transition between a current video block of the video and a bitstream representation of the video, subdividing the current video block into partitions according to a plurality of subdivision patterns according to a height (H) or a width (W) of the current video block. The method 2110 further includes, at 2114, performing said transforming using a plurality of partitioned interleaved predictions.
Fig. 21B illustrates an exemplary flow diagram of a method 2120 of video processing in accordance with some implementations of the disclosed technology. Method 2120 includes, at 2122, determining to apply prediction to a current video block of the video, the prediction comprising subdividing the current video block into sub-blocks according to a subdivision pattern. Method 2120 also includes, at 2124, determining to apply a bit shift to generate a prediction block on a sub-block of the current video block. Method 2120 also includes, at 2126, performing a conversion between the current video block and a codec representation of the video.
Fig. 21C illustrates an exemplary flow diagram of a method 2130 of video processing in accordance with some implementations of the disclosed technology. The method 2130 includes, at 2132, determining, based on a characteristic of a current video block of the video, whether to use an interleaved prediction tool for a conversion between the current video block and a codec representation of the video. The method 2130 further includes, at 2134, performing the conversion according to the determination. In some implementations, upon determining that the characteristic of the current video block does not satisfy a condition, the conversion is performed by disabling the use of an affine prediction tool and/or the interleaved prediction tool. In some implementations, upon determining that the characteristic of the current video block satisfies the condition, the conversion is performed by using the affine prediction tool and/or the interleaved prediction tool.
Fig. 21D illustrates an exemplary flow diagram of a method 2140 of video processing based on some implementations of the disclosed technology. The method 2140 includes, at 2142, determining that interleaved prediction is to be applied to a current video block of the video. The method 2140 further includes, at 2144, disabling bi-prediction for the current video block based on the determination that interleaved prediction is to be applied. The method 2140 also includes, at 2146, performing a conversion between the current video block and a codec representation of the video.
Fig. 22A illustrates an exemplary flow diagram of a method 2210 of video processing based on some implementations of the disclosed technology. Method 2210 includes, at 2212, determining refined motion information for a current video block of the video for a transition between the current video block and a codec representation of the video. The method 2210 further includes, at 2214, converting using the refined motion information. In some implementations, the refined motion information is generated based on an interleaved prediction tool, wherein the motion information for the partitioning of the current video block is generated using a plurality of patterns, and the refined motion information for the current video block is used for subsequent processing or is selectively stored based on whether a condition is satisfied.
Fig. 22B illustrates an exemplary flow diagram of a method 2220 of video processing based on some implementations of the disclosed technology. The method 2220 includes, at 2222, determining whether interleaved prediction applies to a current video block of the video. The method 2220 further includes, at 2224, determining the use of a filtering process for the current video block based on the determination of whether interleaved prediction applies to the current video block. The method 2220 also includes, at 2226, performing a conversion between the current video block and a codec representation of the video based on the determined use of the filtering process.
Fig. 22C illustrates an exemplary flow diagram of a method 2230 of video processing based on some implementations of the disclosed technology. Method 2230 includes, at 2232, determining whether an interleaved prediction is to be applied to a current video block of the video. The method 2230 also includes, at 2234, determining whether to apply local illumination compensation or weighted prediction to the current video block based on the determination of the use of the interleaved prediction. The method 2230 also includes, at 2236, transitioning between the current video block and the codec representation of the video based on the determination of the use of the local illumination compensation or the weighted prediction.
Fig. 22D illustrates an exemplary flow diagram of a method 2240 of video processing based on some implementations of the disclosed technology. Method 2240 includes, at 2242, determining whether weighted prediction is applied to a current video block of the video or to a sub-block of the current video block. Method 2240 also includes, at 2244, converting between the current video block and a codec representation of the video by disabling bi-directional optical flow (BDOF) techniques.
In the methods discussed above, partial interleaving may be implemented. Using this scheme, samples in a first subset of prediction samples are calculated as a weighted combination of two or more sub-block based predictions, and samples in a second subset of prediction samples are copied from a sub-block based prediction with a particular subdivision pattern, wherein the first and second subsets are based on the subdivision patterns. The first subset and the second subset together may constitute the entire prediction block, e.g., the block currently being processed. As shown in figs. 15A-15D, in various examples, the second subset excluded from the interleaving may consist of (a) the corner sub-blocks, (b) the top-most row of sub-blocks and the bottom-most row of sub-blocks, or (c) the left-most column of sub-blocks and the right-most column of sub-blocks. The size of the block currently being processed may be used as a condition for deciding whether to exclude certain sub-blocks from the interleaved prediction.
As further described in this document, the encoding process may avoid checking the affine pattern of blocks subdivided from the parent block, where the parent block itself is encoded with a pattern other than the affine pattern.
In some embodiments, a video decoder apparatus may implement a method of video decoding, wherein improved block-based motion prediction as described herein is used for video decoding. The method may include forming a block of video using a set of pixels from a video frame. The block may be subdivided into a first set of sub-blocks according to a first pattern. The first intermediate prediction block may correspond to a first set of sub-blocks. The block may contain a second set of sub-blocks according to a second pattern. At least one sub-block in the second set has a different size than the sub-blocks in the first set. The method may also determine a prediction block based on the first intermediate prediction block and a second intermediate prediction block generated from the second set of sub-blocks. Other features of the method may be similar to the method 1900 described above.
In some embodiments, a decoder-side method of video decoding may use block-based motion prediction for improving video quality by using blocks of a video frame for prediction, where a block corresponds to a set of pixels. A block may be subdivided into a plurality of sub-blocks based on the size of the block or on information from another block that is spatially or temporally adjacent to the block, wherein at least one sub-block of the plurality of sub-blocks has a different size than the other sub-blocks. The decoder may then use a motion vector prediction generated by applying a codec algorithm to the plurality of sub-blocks. Other features of the method are described with respect to fig. 20 and the corresponding description.
Yet another method of video processing includes deriving one or more motion vectors for a first set of sub-blocks of a current video block, wherein each of the first set of sub-blocks has a first subdivision pattern, and reconstructing the current video block based on the one or more motion vectors.
In some embodiments, deriving the one or more motion vectors is based on an affine model.
In some embodiments, deriving the one or more motion vectors is based on motion vectors of one or more of a second set of sub-blocks; each of the second set of sub-blocks has a second subdivision pattern that is different from the first subdivision pattern, and one or more of the second set of sub-blocks overlap with at least one of the first set of sub-blocks. For example, the one or more motion vectors of the first set of sub-blocks contain MV1, the motion vectors of one or more of the second set of sub-blocks contain MV01, MV02, MV03, ..., and MV0K, and K is a positive integer. In an example, MV1 = f(MV01, MV02, MV03, ..., MV0K). In another example, f(·) is a linear function. In yet another example, f(·) is a non-linear function. In yet another example, MV1 = average(MV01, MV02, MV03, ..., MV0K), and average(·) is an averaging operation. In yet another example, MV1 = median(MV01, MV02, MV03, ..., MV0K), and median(·) is an operation to compute the median value. In yet another example, MV1 = min(MV01, MV02, MV03, ..., MV0K), and min(·) is an operation to select the minimum value from a plurality of input values. In yet another example, MV1 = MaxAbs(MV01, MV02, MV03, ..., MV0K), and MaxAbs(·) is an operation to select the value with the maximum absolute value from a plurality of input values.
In some embodiments, the first set of sub-blocks corresponds to a first color component, the deriving of the one or more motion vectors is based on motion vectors of one or more of a second set of sub-blocks, each of the second set of sub-blocks has a second subdivision pattern different from the first subdivision pattern, and the second set of sub-blocks corresponds to a second color component different from the first color component. In an example, the first color component is coded or decoded after a third color component, and wherein the third color component is one of Cr, Cb, U, V, R, or B. In another example, the second color component is coded or decoded before the third color component, and wherein the third color component is Y or G. In yet another example, deriving the one or more motion vectors is further based on a color format of at least one of the second set of sub-blocks. In yet another example, the color format is 4:2:0, 4:2:2, or 4:4:4.
in some embodiments, the first subdivision pattern is based on the height or width of the current video block.
Fig. 23 is a block diagram of the video processing apparatus 2300. Apparatus 2300 may be used to implement one or more methods described herein. Device 2300 may be embodied as a smartphone, tablet computer, internet of things (IoT) receiver, and so forth. The device 2300 may include one or more processors 2302, one or more memories 2304, and video processing hardware 2306. The processor(s) 2302 may be configured to implement one or more of the methods described in this document, including but not limited to the methods shown in figures 19-22D. The memory(s) 2304 may be used to store data and code for implementing the methods and techniques described herein. Video processing hardware 2306 may be used to implement some of the techniques described in this document in hardware circuits.
Fig. 24 is a block diagram illustrating an exemplary video processing system 3100 in which various techniques disclosed herein may be implemented. Various implementations may include some or all of the components of system 3100. System 3100 can include an input 3102 for receiving video content. The video content may be received in a raw or uncompressed format, e.g., 8 or 10 bit multi-component pixel values, or may be in a compressed or encoded format. Input 3102 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interfaces include wired interfaces such as Ethernet, Passive Optical Network (PON), etc., and wireless interfaces such as Wi-Fi or cellular interfaces.
System 3100 can comprise an encoding component 3104 that can implement the various coding or encoding methods described in this document. The encoding component 3104 may reduce the average bit rate of the video from the input 3102 to the output of the encoding component 3104 to produce an encoded representation of the video. Thus, the encoding techniques are sometimes referred to as video compression or video transcoding techniques. The output of the encoding component 3104 may be stored or transmitted via a communication connection, as represented by the component 3106. A stored or communicated bitstream (or encoded) representation of the video received at the input 3102 can be used by the component 3108 to generate pixel values or displayable video that is transmitted to a display interface 3110. The process of generating user-viewable video from the bitstream representation is sometimes referred to as video decompression. Furthermore, while certain video processing operations are referred to as "encoding" operations or tools, it should be understood that the encoding tools or operations are used at the encoder, and the corresponding decoding tools or operations that reverse the results of the encoding will be performed by the decoder.
Examples of a peripheral bus interface or a display interface may include a Universal Serial Bus (USB), a High-Definition Multimedia Interface (HDMI), a DisplayPort, and so on. Examples of storage interfaces include SATA (Serial Advanced Technology Attachment), PCI, IDE interfaces, and the like. The techniques described in this document may be implemented in various electronic devices such as mobile phones, laptops, smartphones, or other devices capable of performing digital data processing and/or video display.
In some embodiments, the video codec method may be implemented using a device implemented on the hardware platform described with respect to fig. 23 or 24.
The various techniques and embodiments may be described using the following clause-based format.
The first set of terms describes certain features and aspects of the disclosed technology listed in the previous section, including, for example, item 1.
1. A method of video processing, comprising: deriving one or more motion vectors of a first set of sub-blocks belonging to a first subdivision pattern of a current video block of the video; and converting between the current video block and a codec representation of the video based on the one or more motion vectors.
2. The method of clause 1, wherein deriving the one or more motion vectors is based on an affine model.
3. The method of clause 1, wherein deriving the one or more motion vectors is based on motion vectors of a second set of sub-blocks, wherein the second set of sub-blocks has a second subdivision pattern that is different from the first subdivision pattern.
4. The method of clause 3, wherein the second set of sub-blocks overlaps the first set of sub-blocks.
5. The method of clause 3, wherein the one or more motion vectors of the first set of sub-blocks comprise MV1, and the motion vectors of the second set of sub-blocks comprise MV01, MV02, MV03, ..., and MV0K, and wherein K is a positive integer.
6. The method of clause 5, wherein MV1 = f(MV01, MV02, MV03, ..., MV0K).
7. The method of clause 6, wherein f(·) is a linear function.
8. The method of clause 6, wherein f(·) is a non-linear function.
9. The method of clause 5, wherein MV1 = average(MV01, MV02, MV03, ..., MV0K), wherein average(·) is an averaging operation.
10. The method of clause 5, wherein MV1 = median(MV01, MV02, MV03, ..., MV0K), wherein median(·) is an operation that computes the median value.
11. The method of clause 5, wherein MV1 = max(MV01, MV02, MV03, ..., MV0K), wherein max(·) is an operation that selects the maximum value from a plurality of input values.
12. The method of clause 5, wherein MV1 = min(MV01, MV02, MV03, ..., MV0K), wherein min(·) is an operation that selects the minimum value from a plurality of input values.
13. The method of clause 5, wherein MV1 = MaxAbs(MV01, MV02, MV03, ..., MV0K), wherein MaxAbs(·) is an operation that selects the value with the maximum absolute value from a plurality of input values.
14. The method of clause 5, wherein MV1 = MinAbs(MV01, MV02, MV03, ..., MV0K), wherein MinAbs(·) is an operation that selects the value with the minimum absolute value from a plurality of input values.
15. The method of any of clauses 1-14, wherein performing the conversion comprises generating a codec representation from a current video block.
16. The method of any of clauses 1-14, wherein performing the conversion comprises generating a current video block from a codec representation.
17. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of one or more of clauses 1-16.
18. A computer program product stored on a non-transitory computer readable medium, the computer program product comprising program code for performing the method of one or more of clauses 1 to 16.
The second set of terms describes certain features and aspects of the disclosed technology listed in the previous section, including, for example, item 14.
The third set of terms describes certain features and aspects of the disclosed technology listed in the previous section, including, for example, items 13, 15, 16, 17, and 18.
1. A method of video processing, comprising:
for a transition between a current video block of video and a bitstream representation of the video, subdividing the current video block into partitions according to a plurality of subdivision patterns according to a height (H) or a width (W) of the current video block; and
the converting is performed using interleaved prediction of the plurality of partitions.
2. The method of clause 1, wherein the current video block is subdivided according to two subdivision patterns if W > T1 and H > T2, where T1 and T2 are integer values.
3. The method of clause 2, wherein T1 = T2 = 4.
4. The method of clause 1, wherein the current video block is subdivided according to two subdivision patterns if H <= T2, where T2 is an integer.
5. The method of clause 4, wherein T2 = 4.
6. The method of clause 1, wherein the current video block is subdivided according to two subdivision patterns if W <= T1, where T1 is an integer.
7. The method of clause 6, wherein T1 = 4.
8. A method of video processing, comprising:
determining to apply prediction to a current video block of a video, the prediction comprising subdividing the current video block into sub-blocks according to a subdivision pattern;
determining to apply bit shifting to generate a prediction block on a sub-block of the current video block; and
a conversion is made between the current video block and a codec representation of the video.
9. The method of clause 8, wherein bi-prediction or uni-prediction is applied to the current video block.
10. The method of clause 8, wherein the one or more motion vectors associated with the current video block have an internal bit depth that depends on the predicted weight values.
11. The method of clause 8, wherein for a reference picture list X (X = 0 or 1) of the current video block, PX(x, y) = Shift(W0(x, y) * PX0(x, y) + W1(x, y) * PX1(x, y), SW), where PX(x, y) is the prediction of list X, PX0(x, y) and PX1(x, y) are the predictions of list X using subdivision pattern 0 and subdivision pattern 1, respectively, W0 and W1 are integers representing weight values of the interleaved prediction, SW represents the precision of the weight values, and Shift(x, n) is defined as Shift(x, n) = (x + offset0) >> n.
12. The method of clause 8, wherein the final prediction value is derived as P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB), wherein Wb0 and Wb1 are integers representing weight values of the bi-prediction, P0(x, y) and P1(x, y) represent the predictions of list 0 and list 1, respectively, SWB is the precision of the weighted bi-prediction, and Shift(x, n) is defined as Shift(x, n) = (x + offset0) >> n.
13. The method of clause 12, wherein Wb0 = Wb1 = SWB = 1.
14. The method of clause 8, wherein interpolation filtering is applied to generate the prediction block, and the final prediction value is derived as P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB + PB), where Wb0 and Wb1 are integers representing weight values of the interleaved prediction, SWB is the precision of the weighted bi-prediction, PB is the additional precision from the interpolation filtering, and Shift(x, n) is defined as Shift(x, n) = (x + offset0) >> n.
15. The method of clause 14, wherein PX0(x, y) and PX1(x, y) are the predictions of reference picture list X using subdivision pattern 0 and subdivision pattern 1, respectively, and wherein PX0(x, y) and PX1(x, y) are right-shifted.
16. The method of clause 8, wherein the bi-prediction uses different weight factors for two reference prediction blocks.
17. The method of clause 11 or 14, wherein PX0(x, y) is modified to PX0(x, y) = Shift(PX0(x, y), M) and/or PX1(x, y) is modified to PX1(x, y) = Shift(PX1(x, y), M), where M is an integer, and P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB + PB - M).
18. A method of video processing, comprising:
determining whether to use an interleaving prediction tool for a conversion between a current block of a video and a codec representation of the video based on a characteristic of the current block; and
performing the conversion in accordance with the determination,
wherein the transforming is performed by disabling use of an affine prediction tool and/or the interleaving prediction tool upon determining that a characteristic of the current video block does not satisfy a condition.
19. A method of video processing, comprising:
determining whether to use an interleaving prediction tool for a conversion between a current block of a video and a codec representation of the video based on a characteristic of the current block; and
performing the conversion in accordance with the determination, an
Wherein the transforming is performed by using an affine prediction tool and/or the interleaved prediction tool upon determining that a characteristic of the current video block satisfies a condition.
20. The method of clause 18 or 19, wherein the characteristic of the current video block comprises at least one of a width or a height of the current video block.
21. The method of clause 18 or 19, further comprising:
determining a size of a Video Processing Data Unit (VPDU), and wherein determining whether to use the interleaved prediction tool is based on the VPDU size.
22. The method of clause 18, wherein a different prediction method than the interleaved prediction is applied to the current video block.
23. The method of clause 18, wherein the width and height of the current video block are W and H, respectively, and T, T1, T2 are integer values, and wherein the interleaved prediction is disabled for a particular condition comprising one of:
i. W > T1 and H > T2,
ii. W > T1 or H > T2,
iii. W × H > T,
iv. W < T1 and H < T2,
v. W < T1 or H < T2, or
vi. W × H < T.
24. The method of clause 18 or 19, further comprising:
determining that a first sub-block of the current video block is not located at a block boundary; and
upon determining that the first sub-block is not located at the block boundary, disabling an interleaved affine technique for the first sub-block.
25. The method of clause 24, wherein the prediction result from the original affine prediction technique is used for the final prediction of the first sub-block.
26. The method of clause 18, wherein the width and height of the current video block are W and H, respectively, and T, T1, T2 are integer values, and wherein the interleaved prediction is used for a particular condition comprising one of:
i. W > T1 and H > T2,
ii. W > T1 or H > T2,
iii. W × H > T,
iv. W < T1 and H < T2,
v. W < T1 or H < T2, or
vi. W × H < T.
27. The method of clause 18, wherein, in the event that the height (H) of the current video block is greater than X, X being an integer, the interleaved prediction is not applied to samples belonging to sub-blocks that cross both the upper W × (H/2) partition and the lower W × (H/2) partition of the current video block.
28. The method of clause 18, wherein the interleaved prediction is not applied to samples belonging to sub-blocks that cross both the left (W/2) × H partition and the right (W/2) × H partition of the current video block, where the width (W) of the current video block is greater than X, X being an integer.
29. The method of clause 27 or 28, wherein X = 64.
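A hedged sketch of how one of the size conditions of clauses 23 and 26 might be evaluated follows. Which of conditions i through vi applies and the value of the threshold T are left open by the clauses, so the area check and the threshold used here are assumptions.

```cpp
// Illustrative gate for interleaved prediction based on block size, in the
// style of clauses 23/26, condition iii (W × H compared against T).
// The threshold value is an assumed example, not taken from the disclosure.
bool interleavedPredictionAllowed(int width, int height) {
    constexpr long long T = 64LL * 64LL;  // assumed area threshold
    const long long area = static_cast<long long>(width) * height;
    return area <= T;  // disable interleaved prediction when W × H > T (clause 23, iii)
}
```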
30. The method of clause 19, wherein the interleaved prediction is allowed for a particular type of video data unit that includes the current video block, the video data unit including a picture, a slice group, or a slice.
31. The method of clause 19, wherein the interleaved prediction is for a P picture or a B picture.
32. The method of clause 18 or 19, wherein the flag indicating whether to use or disable the interleaved prediction tool is signaled in a header of a picture, slice, group of slices, or slice.
33. The method of clause 32, wherein the flag is signaled based on whether affine prediction is allowed for the current video block.
34. The method of clause 18 or 19, wherein a message is signaled in a video processing unit to indicate whether to use a characteristic of the current video block, the video processing unit comprising a Video Parameter Set (VPS), a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), a slice header, a picture header, a slice group header, a slice, a Coding Tree Unit (CTU), or a row of CTUs.
35. A method of video processing, comprising:
determining that an interleaving prediction is to be applied to a current video block of a video;
disabling bi-prediction for the current video block based on determining that interleaved prediction is to be applied; and
a conversion is made between the current video block and a codec representation of the video.
36. The method of clause 35, wherein the index indicating the use of bi-prediction is not signaled.
37. The method of clause 36, wherein bi-prediction is disabled based on an indication signaled in a Video Parameter Set (VPS), a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), a slice header, a picture header, a slice group header, a slice, a Codec Tree Unit (CTU), a CTU row, or multiple CTUs.
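Clauses 35 through 37 describe the mutual exclusion of interleaved prediction and bi-prediction together with its signaling consequence. A small, assumed encoder-side sketch follows; std::optional merely models "not signaled".

```cpp
#include <optional>

// When interleaved prediction is applied (clause 35), bi-prediction is
// disabled and the index indicating its use is not signaled (clause 36).
// Returning std::nullopt stands in for omitting the syntax element.
std::optional<bool> biPredictionIndexToSignal(bool interleavedPredictionApplied,
                                              bool useBiPrediction) {
    if (interleavedPredictionApplied) {
        return std::nullopt;  // index not written to the bitstream
    }
    return useBiPrediction;
}
```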
38. The method of any of clauses 1-37, wherein the converting comprises generating pixel values of the current video block from the bitstream representation.
39. The method of any of clauses 1-37, wherein the converting comprises generating the bitstream representation from pixel values of the current video block.
40. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of one or more of clauses 1-39.
41. A computer program product, stored on a non-transitory computer readable medium, comprising program code for performing the method of one or more of clauses 1-39.
The fourth set of terms describes certain features and aspects of the disclosed technology listed in the previous section, including, for example, items 19, 20, 21, and 22.
1. A method of video processing, comprising:
for a transition between a current video block of video and a codec representation of the video, determining refinement motion information for at least one sub-block of the current video block; and
the conversion is performed using the refined motion information,
wherein the refined motion information is generated based on an interleaved prediction tool in which the segmented motion information of the current video block is generated using a plurality of patterns, and
wherein the refined motion information for the current video block is used for subsequent processing or selective storage based on whether a condition is satisfied.
2. The method of clause 1, wherein the subsequent processing is a conversion of a subsequent block to be converted after the current video block.
3. The method of clause 1, wherein the subsequent processing is a filtering process of the current video block.
4. The method of clause 1, wherein determining whether the condition is satisfied is based on a location of a sub-block, the location related to a block, CTU row, slice group, or picture in the video.
5. The method of clause 1, wherein determining whether the condition is satisfied is based on a codec mode of at least one of the current video block or a neighboring video block to the current video block.
6. The method of clause 1, wherein determining whether the condition is satisfied is based on a size of the current video block.
7. The method of clause 1, wherein determining whether the condition is satisfied is based on at least one of a picture, a slice type, or a reference picture list related to the current video block.
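To make the conditional storage of refined motion information in clauses 1 through 7 concrete, a heavily hedged C++ sketch follows; every field and predicate below is a placeholder for the position, coding-mode, size and picture-type checks the clauses enumerate, not an actual specification.

```cpp
// Placeholder context for the condition checks of clauses 4-7.
struct SubBlockContext {
    bool nearCtuRowBoundary;   // clause 4: sub-block position w.r.t. block/CTU row/slice/picture
    bool affineModeUsed;       // clause 5: codec mode of the current or a neighboring block
    int  blockWidth;           // clause 6: size of the current video block
    int  blockHeight;
    bool isBSliceOrPicture;    // clause 7: picture/slice type or reference picture list
};

// Decide whether the refined motion information produced by interleaved
// prediction is kept for subsequent processing or storage (clause 1).
// The particular combination of checks and the size threshold are assumptions.
bool keepRefinedMotion(const SubBlockContext& ctx) {
    const bool sizeOk = ctx.blockWidth >= 8 && ctx.blockHeight >= 8;  // assumed threshold
    return !ctx.nearCtuRowBoundary && ctx.affineModeUsed && sizeOk && ctx.isBSliceOrPicture;
}
```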
8. A method of video processing, comprising:
determining whether an interleaved prediction is applied to a current video block of a video;
determining to use a filtering process on the current video block based on determining whether the interleaved prediction applies to the current video block; and
converting between the current video block and a codec representation of the video based on the determination of the use of the filtering process.
9. The method of clause 8, wherein the filtering process comprises a deblocking process, Sample Adaptive Offset (SAO) filtering, or adaptive loop filtering.
10. The method of clause 8, further comprising:
determining parameters related to how to apply the filtering process, and wherein the converting is performed based on the parameters of the filtering process.
11. The method of clause 8, wherein the filtering process is not applied on an edge between two sub-blocks in the subdivision pattern of the current video block if the edge is within a sub-block in another subdivision pattern of the current video block.
12. The method of clause 8, wherein the filtering process is applied at a weaker level to an edge between two sub-blocks in the subdivision pattern of the current video block if the edge is within a sub-block in another subdivision pattern of the current video block.
13. The method of clause 9, wherein at least one of the variables bS[xDi][yDj], β, Δ, or tC used in the deblocking process has a smaller value for the edge.
14. The method of clause 8, wherein the filtering process is applied at a stronger level to an edge between two sub-blocks in the subdivision pattern of the current video block if the edge is within a sub-block in another subdivision pattern of the current video block.
15. The method of clause 14, wherein at least one of the variables bS[xDi][yDj], β, Δ, or tC used in the deblocking process has a larger value for the edge.
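Clauses 11 through 15 adjust the in-loop filtering of an edge depending on whether that edge lies inside a sub-block of the other subdivision pattern. The C++ sketch below shows one possible, assumed way to map that rule onto a deblocking boundary strength; the concrete values are illustrative only.

```cpp
// Choose a deblocking boundary strength for an edge between two sub-blocks
// of one subdivision pattern. If the edge falls inside a sub-block of the
// other subdivision pattern, filtering is skipped (clause 11); the commented
// alternatives correspond to the weaker/stronger levels of clauses 12-15.
int deblockBoundaryStrength(bool edgeInsideOtherPatternSubBlock, int defaultBs) {
    if (edgeInsideOtherPatternSubBlock) {
        return 0;  // clause 11: do not filter across such an edge
        // weaker:   return defaultBs > 0 ? defaultBs - 1 : 0;  (clauses 12-13)
        // stronger: return defaultBs + 1;                      (clauses 14-15)
    }
    return defaultBs;
}
```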
16. A method of video processing, comprising:
determining whether an interleaved prediction is applied to a current video block of a video;
based on determining the use of the interleaved prediction, determining whether to use local illumination compensation or weighted prediction for the current video block; and
based on the determination of the use of the local illumination compensation or the weighted prediction, performing a conversion between the current video block and a codec representation of the video.
17. The method of clause 16, further comprising:
determining a parameter related to how to apply the local illumination compensation or the weighted prediction, and wherein the converting is based on the parameter of the local illumination compensation or the weighted prediction.
18. The method of clause 16, wherein upon determining that the interleaved prediction is applied to the current video block, the local illumination compensation or weighted prediction is disabled.
19. The method of clause 16, wherein an indication indicating that the local illumination compensation or the weighted prediction is enabled is not signaled for the current video block or a sub-block of the current video block to which the interleaved prediction is applied.
20. A method of video processing, comprising:
determining whether weighted prediction is applied to a current video block of a video or a sub-block of the current video block; and
performing a conversion between the current video block and a codec representation of the video by disabling a bi-directional optical flow (BDOF) technique.
21. The method of clause 20, wherein the BDOF technique is applied to blocks with weighted prediction when certain conditions are met.
22. The method of clause 21, wherein the parameter of the BDOF technique is within a threshold range or equal to a particular value.
23. The method of clause 22, wherein a particular reference picture restriction is applied to the current video block.
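Clauses 20 through 23 gate the bi-directional optical flow (BDOF) technique on the use of weighted prediction. The sketch below illustrates one assumed parameterization of that gate, where "weights near default" and the reference-picture check stand in for the unspecified threshold range and restriction of clauses 22 and 23.

```cpp
// BDOF is disabled when weighted prediction is applied (clause 20), unless
// the weighted-prediction parameters fall in a threshold range or equal a
// particular value (clause 22) and a reference picture restriction holds
// (clause 23). The concrete checks below are assumptions.
bool bdofEnabled(bool weightedPredictionApplied, int wpWeight, int wpOffset,
                 bool referencePictureRestrictionOk) {
    if (!weightedPredictionApplied) {
        return true;  // no weighted prediction: BDOF is not restricted by these clauses
    }
    const bool weightsNearDefault = (wpWeight == 1) && (wpOffset == 0);  // assumed "particular value"
    return weightsNearDefault && referencePictureRestrictionOk;
}
```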
24. The method of any of clauses 1-23, wherein the converting comprises generating pixel values of the current video block from the bitstream representation.
25. The method of any of clauses 1-23, wherein the converting comprises generating the bitstream representation from pixel values of the current video block.
26. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of one or more of clauses 1-25.
27. A computer program product, stored on a non-transitory computer readable medium, comprising program code for performing the method of one or more of clauses 1-25.
From the foregoing it will be appreciated that specific embodiments of the disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Accordingly, the disclosed technology is not to be restricted except in the spirit of the appended claims.
The disclosed and other embodiments, modules, and functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.
A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language file), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not require such a device. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, such as internal hard disks or removable disks; magneto-optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described, and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.
Claims (41)
1. A method of video processing, comprising:
for a conversion between a current video block of a video and a bitstream representation of the video, subdividing the current video block into partitions according to a plurality of subdivision patterns that are based on a height (H) or a width (W) of the current video block; and
the converting is performed using interleaved prediction of the plurality of partitions.
2. The method of claim 1, wherein the current video block is subdivided according to two subdivision patterns if W > T1 and H > T2, where T1 and T2 are integer values.
3. The method of claim 2, wherein T1 = T2 = 4.
4. The method of claim 1, wherein the current video block is subdivided according to two subdivision patterns if H <= T2, where T2 is an integer.
5. The method of claim 4, wherein T2 = 4.
6. The method of claim 1, wherein the current video block is subdivided according to two subdivision patterns if W <= T1, where T1 is an integer.
7. The method of claim 6, wherein T1 = 4.
8. A method of video processing, comprising:
determining to apply prediction to a current video block of a video, the prediction comprising subdividing the current video block into sub-blocks according to a subdivision pattern;
determining to apply bit shifting to generate a prediction block on a sub-block of the current video block; and
a conversion is made between the current video block and a codec representation of the video.
9. The method of claim 8, wherein bi-prediction or uni-prediction is applied to the current video block.
10. The method of claim 8, wherein one or more motion vectors associated with the current video block have an internal bit depth that depends on the weight values used in the prediction.
11. The method of claim 8, wherein for a reference picture list X of the current video block, PX(x, y) = Shift(W0(x, y) * PX0(x, y) + W1(x, y) * PX1(x, y), SW), wherein PX(x, y) is the prediction for the list X with X being 0 or 1, PX0(x, y) and PX1(x, y) are predictions for the list X using the subdivision pattern 0 and the subdivision pattern 1, respectively, W0 and W1 are integers indicating weight values of the interleaved prediction, SW indicates the precision of the interleaved weight values, and Shift(x, n) is defined as Shift(x, n) = (x + offset0) >> n.
12. The method of claim 8, wherein the final prediction value is derived as P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB), wherein Wb0 and Wb1 are integers representing weight values of the bi-prediction, P0(x, y) and P1(x, y) denote the predictions of list 0 and list 1, respectively, SWB is the precision of the weighted bi-directional prediction, and Shift(x, n) is defined as Shift(x, n) = (x + offset0) >> n.
13. The method of claim 12, wherein Wb0 = Wb1 = SWB = 1.
14. The method of claim 8, wherein interpolation filtering is applied to generate the prediction block, and the final prediction value is derived as P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB + PB), wherein Wb0 and Wb1 are integers representing weight values of the interleaved prediction, SWB is the precision of the weighted bi-directional prediction, PB is the additional precision from the interpolation filtering, and Shift(x, n) is defined as Shift(x, n) = (x + offset0) >> n.
15. The method of claim 14, wherein PX0(x, y) and PX1(x, y) are predictions of the reference picture list X using subdivision pattern 0 and subdivision pattern 1, respectively, and wherein PX0(x, y) and PX1(x, y) are right-shifted.
16. The method of claim 8, wherein the bi-prediction uses different weight factors for two reference prediction blocks.
17. The method of claim 11 or 14, wherein PX0(x, y) is modified to PX0(x, y) = Shift(PX0(x, y), M) and/or PX1(x, y) is modified to PX1(x, y) = Shift(PX1(x, y), M), wherein M is an integer, and P(x, y) = Shift(Wb0(x, y) * P0(x, y) + Wb1(x, y) * P1(x, y), SWB + PB - M).
18. A method of video processing, comprising:
determining whether to use an interleaving prediction tool for a conversion between a current block of a video and a codec representation of the video based on a characteristic of the current block; and
performing the conversion in accordance with the determination,
wherein the transforming is performed by disabling use of an affine prediction tool and/or the interleaving prediction tool upon determining that a characteristic of the current video block does not satisfy a condition.
19. A method of video processing, comprising:
determining whether to use an interleaving prediction tool for a conversion between a current block of a video and a codec representation of the video based on a characteristic of the current block; and
performing the conversion in accordance with the determination, an
Wherein the transforming is performed by using an affine prediction tool and/or the interleaved prediction tool upon determining that a characteristic of the current video block satisfies a condition.
20. The method of claim 18 or 19, wherein the characteristic of the current video block comprises at least one of a width or a height of the current video block.
21. The method of claim 18 or 19, further comprising:
determining a size of a Video Processing Data Unit (VPDU), and wherein determining whether to use the interleaved prediction tool is based on the VPDU size.
22. The method of claim 18, wherein a different prediction method than the interleaved prediction is applied to the current video block.
23. The method of claim 18, wherein the width and height of the current video block are W and H, respectively, and T, T1, T2 are integer values, and wherein the interleaved prediction is disabled for certain conditions including one of:
i. W > T1 and H > T2,
ii. W > T1 or H > T2,
iii. W * H > T,
iv. W < T1 and H < T2,
v. W < T1 or H < T2, or
vi. W * H < T.
24. The method of claim 18 or 19, further comprising:
determining that a first sub-block of the current video block is not located at a block boundary; and
upon determining that the first sub-block is not located at the block boundary, disabling an interleaved affine technique for the first sub-block.
25. The method of claim 24, wherein the prediction results from the original affine prediction technique are used for the final prediction of the first sub-block.
26. The method of claim 18, wherein the width and height of the current video block are W and H, respectively, and T, T1, T2 are integer values, and wherein the interleaved prediction is used for a particular condition comprising one of:
i. W > T1 and H > T2,
ii. W > T1 or H > T2,
iii. W * H > T,
iv. W < T1 and H < T2,
v. W < T1 or H < T2, or
vi. W * H < T.
27. The method of claim 18, wherein the interleaved prediction is not applied to samples belonging to sub-blocks that cross both the upper W * (H/2) partition and the lower W * (H/2) partition of the current video block, if the height (H) of the current video block is greater than X, X being an integer.
28. The method of claim 18, wherein the interleaved prediction is not applied to samples belonging to sub-blocks that cross both the left (W/2) * H partition and the right (W/2) * H partition of the current video block, where the width (W) of the current video block is greater than X, X being an integer.
29. The method of claim 27 or 28, wherein X = 64.
30. The method of claim 19, wherein the interleaved prediction is allowed for a particular type of video data unit that includes the current video block, the video data unit including a picture, a slice group, or a slice.
31. The method of claim 19, wherein the interleaved prediction is for a P-picture or a B-picture.
32. The method of claim 18 or 19, wherein a flag indicating whether to use or disable the interleaved prediction tool is signaled in a header of a picture, slice, group of slices, or slice.
33. The method of claim 32, wherein the flag is signaled based on whether affine prediction is allowed for the current video block.
34. The method of claim 18 or 19, wherein a message is signaled in a video processing unit to indicate whether to use a characteristic of the current video block, the video processing unit comprising a Video Parameter Set (VPS), a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), a slice header, a picture header, a slice group header, a slice, a Codec Tree Unit (CTU), or a row of CTUs.
35. A method of video processing, comprising:
determining that an interleaving prediction is to be applied to a current video block of a video;
disabling bi-prediction for the current video block based on determining that interleaved prediction is to be applied; and
a conversion is made between the current video block and a codec representation of the video.
36. The method of claim 35, wherein the index indicating the use of bi-prediction is not signaled.
37. The method of claim 36, wherein bi-prediction is disabled based on an indication signaled in a Video Parameter Set (VPS), a Sequence Parameter Set (SPS), a Picture Parameter Set (PPS), a slice header, a picture header, a slice group header, a slice, a Codec Tree Unit (CTU), a row of CTUs, or multiple CTUs.
38. The method of any of claims 1-37, wherein the converting comprises generating pixel values of the current video block from the bitstream representation.
39. The method of any of claims 1-37, wherein the converting comprises generating the bitstream representation from pixel values of the current video block.
40. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of one or more of claims 1-39.
41. A computer program product stored on a non-transitory computer readable medium, the computer program product comprising program code for performing the method of one or more of claims 1 to 39.
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNPCT/CN2019/070058 | 2019-01-02 | ||
CN2019070058 | 2019-01-02 | ||
CNPCT/CN2019/071507 | 2019-01-13 | ||
CN2019071507 | 2019-01-13 | ||
CN2019071576 | 2019-01-14 | ||
CNPCT/CN2019/071576 | 2019-01-14 | ||
PCT/CN2020/070115 WO2020140949A1 (en) | 2019-01-02 | 2020-01-02 | Usage of interweaved prediction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113261281A true CN113261281A (en) | 2021-08-13 |
CN113261281B CN113261281B (en) | 2024-10-15 |
Family
ID=71406996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202080007786.3A Active CN113261281B (en) | 2019-01-02 | 2020-01-02 | Use of interleaving predictions |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113261281B (en) |
WO (1) | WO2020140949A1 (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101252686A (en) * | 2008-03-20 | 2008-08-27 | 上海交通大学 | Undamaged encoding and decoding method and system based on interweave forecast |
CN102037732A (en) * | 2009-07-06 | 2011-04-27 | 联发科技(新加坡)私人有限公司 | Single pass adaptive interpolation filter |
US20120082224A1 (en) * | 2010-10-01 | 2012-04-05 | Qualcomm Incorporated | Intra smoothing filter for video coding |
US20150350687A1 (en) * | 2014-05-29 | 2015-12-03 | Apple Inc. | In loop chroma deblocking filter |
CN106797476A (en) * | 2014-10-07 | 2017-05-31 | 高通股份有限公司 | Frame in BC and interframe are unified |
US20180270500A1 (en) * | 2017-03-14 | 2018-09-20 | Qualcomm Incorporated | Affine motion information derivation |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6807231B1 (en) * | 1997-09-12 | 2004-10-19 | 8×8, Inc. | Multi-hypothesis motion-compensated video image predictor |
- 2020-01-02 CN CN202080007786.3A patent/CN113261281B/en active Active
- 2020-01-02 WO PCT/CN2020/070115 patent/WO2020140949A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
KAI ZHANG, LI ZHANG, HONGBIN LIU, YUE WANG, PENGWEI ZHAO, DINGKUN HONG: "CE10: Interweaved Prediction for Affine Motion Compensation (Test 10.5.1 and Test 10.5.2)", Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 * |
Also Published As
Publication number | Publication date |
---|---|
WO2020140949A1 (en) | 2020-07-09 |
CN113261281B (en) | 2024-10-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113170099B (en) | Interaction between intra copy mode and inter prediction tools | |
US11070820B2 (en) | Condition dependent inter prediction with geometric partitioning | |
CN110581999B (en) | Chroma decoder side motion vector refinement | |
CN113597760B (en) | Video processing method | |
JP2023164833A (en) | Simplification of combined inter-intra prediction | |
JP2023145563A (en) | Inclination calculation in different motion vector fine adjustment | |
CN110740321B (en) | Motion prediction based on updated motion vectors | |
CN113950838A (en) | Sub-block based intra block copy | |
CN110677674B (en) | Method, apparatus and non-transitory computer-readable medium for video processing | |
CN110876063B (en) | Fast coding method for interleaving prediction | |
CN110876064B (en) | Partially interleaved prediction | |
CN113261281B (en) | Use of interleaving predictions | |
CN113348669B (en) | Interaction between interleaving prediction and other codec tools | |
CN113557720B (en) | Video processing method, apparatus and non-transitory computer readable medium | |
TWI850252B (en) | Partial interweaved prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |