CN113875244A - Linear model mode parameter derivation with multiple rows - Google Patents



Publication number
CN113875244A
Authority
CN
China
Prior art keywords
samples
block
cclm
luminance
chroma
Prior art date
Legal status
Pending
Application number
CN202080038096.4A
Other languages
Chinese (zh)
Inventor
张凯
张莉
刘鸿彬
王悦
Current Assignee
Beijing ByteDance Network Technology Co Ltd
ByteDance Inc
Original Assignee
Beijing ByteDance Network Technology Co Ltd
ByteDance Inc
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd, ByteDance Inc filed Critical Beijing ByteDance Network Technology Co Ltd
Publication of CN113875244A



Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Apparatus, systems, and methods for digital video coding are described, including deriving Linear Model (LM) mode parameters using multiple rows. In a representative aspect, a method for video processing includes: determining, for a conversion between a video and a bitstream representation of the video, multiple rows of luma samples for deriving parameters of a cross-component linear model used to predict samples of a chroma block; and performing the conversion based on the predicted samples of the chroma block, wherein at least one of the multiple rows is not adjacent to the luma block collocated with the chroma block.

Description

Linear model mode parameter derivation with multiple rows
Cross Reference to Related Applications
The present application claims the priority and benefit of International Patent Application No. PCT/CN2019/088005, filed on May 22, 2019, in accordance with applicable patent laws and/or rules under the Paris Convention. The entire disclosure of the foregoing application is incorporated by reference as part of the disclosure of the present application for all purposes permitted by law.
Technical Field
This patent document relates to video encoding and decoding techniques, devices and systems.
Background
Despite advances in video compression, digital video still accounts for the largest share of bandwidth usage on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video grows, it is expected that the bandwidth demand for digital video usage will continue to increase.
Disclosure of Invention
Devices, systems, and methods related to digital video coding and decoding are described, and in particular, derivation of Linear Model (LM) mode parameters using multiple rows in video coding is described. The described methods may be applied to both existing video coding standards (e.g., High Efficiency Video Coding (HEVC)) and future video coding standards (e.g., Versatile Video Coding (VVC)) or codecs.
In one representative aspect, the disclosed technology can be used to provide a method for video processing. The example method includes: determining, for a conversion between a video and a bitstream representation of the video, multiple rows of luma samples for deriving parameters of a cross-component linear model used to predict samples of a chroma block; and performing the conversion based on the predicted samples of the chroma block, wherein at least one of the multiple rows is not adjacent to the luma block collocated with the chroma block.
In another representative aspect, the above-described methods are embodied in the form of processor-executable code and stored in a computer-readable program medium.
In yet another representative aspect, an apparatus configured or operable to perform the above-described method is disclosed. The apparatus may include a processor programmed to implement the method.
In yet another representative aspect, a video decoder device may implement a method as described herein.
The above and other aspects and features of the disclosed technology are described in more detail in the accompanying drawings, the description and the claims.
Drawings
Fig. 1 shows an example of the locations of samples used to derive weights for a linear model for cross-component prediction.
Fig. 2 shows an example of classifying neighboring samples into two groups.
Fig. 3A shows an example of chroma samples and their corresponding luma samples.
Fig. 3B shows an example of the luma downsampling filtering for the Cross-Component Linear Model (CCLM) in the Joint Exploration Model (JEM).
Fig. 4 shows an exemplary arrangement of four luminance samples corresponding to a single chrominance sample.
Fig. 5A and 5B show examples of a 4×4 chroma block with its neighboring samples and the corresponding luma samples.
Fig. 6A-6J show examples of CCLM without downsampling filtering of the luma samples.
Fig. 7A-7D show examples of CCLM requiring only the neighboring luma samples that are used in normal intra prediction.
Fig. 8A and 8B show examples of Coding Units (CUs) at the boundary of a Coding Tree Unit (CTU).
Fig. 9 shows an example of 67 intra prediction modes.
Fig. 10A and 10B illustrate examples of reference samples of a wide-angle intra prediction mode of a non-square block.
FIG. 11 shows an example of a discontinuity when wide-angle intra prediction is used.
Fig. 12A-12D illustrate examples of sampling points used by a Position-Dependent Intra Prediction Combination (PDPC) method.
Fig. 13A and 13B show examples of down-sampled luminance sample positions inside and outside the current block.
Fig. 14 shows an example of combining different luminance down-sampling methods together.
Fig. 15A-15H show examples of neighboring samples used in CCLM.
Fig. 16A and 16B show examples of how luminance samples to the left of the current block are downsampled to derive parameters in the CCLM.
Fig. 17A and 17B show examples of Linear Model (LM) prediction with a single row of adjacent luminance samples.
Fig. 18A-18C illustrate examples of using different numbers of neighboring luma samples based on the height and width of a current block.
FIG. 19 illustrates a flow diagram of yet another example method for cross-component prediction in accordance with the disclosed technology.
FIG. 20 is a block diagram of an example of a hardware platform for implementing the visual media decoding or visual media encoding techniques described in this document.
FIG. 21 is a block diagram of an example video processing system in which the disclosed techniques may be implemented.
Detailed Description
Due to the increasing demand for higher-resolution video, video coding methods and techniques are ubiquitous in modern technology. Video codecs typically include electronic circuits or software that compress or decompress digital video, and are continually being improved to provide higher coding efficiency. A video codec converts uncompressed video into a compressed format, and vice versa. There are complex relationships between video quality, the amount of data used to represent the video (determined by the bit rate), the complexity of the encoding and decoding algorithms, susceptibility to data loss and errors, ease of editing, random access, and end-to-end delay (latency). The compressed format usually conforms to a standard video compression specification, e.g., the High Efficiency Video Coding (HEVC) standard (also known as H.265 or MPEG-H Part 2), the Versatile Video Coding (VVC) standard that is pending finalization, or other current and/or future video coding standards.
Embodiments of the disclosed techniques may be applied to existing video codec standards (e.g., HEVC, h.265) and future standards to improve runtime performance. Section headings are used in this document to improve readability of the description, and the discussion or embodiments (and/or implementations) are not limited in any way to only the individual sections.
1 Cross-component prediction embodiment
Cross-component prediction is a form of chroma-to-luma prediction method that has a well-balanced trade-off between complexity and compression efficiency improvement.
1.1 examples of Cross-component Linear models (CCLM)
In some embodiments, to reduce cross-component redundancy, a cross-component linear model (CCLM) prediction mode (also referred to as LM) is used in JEM for which chroma samples are predicted based on reconstructed luma samples of the same CU by using the following linear model:
pred_C(i,j) = α · rec_L′(i,j) + β (1)
Here, pred_C(i,j) represents the predicted chroma samples in the CU; rec_L′(i,j) denotes the down-sampled reconstructed luma samples of the same CU for the 4:2:0 and 4:2:2 color formats, and the reconstructed luma samples of the same CU for the 4:4:4 color format. The CCLM parameters α and β are derived by minimizing the regression error between the neighboring reconstructed luma and chroma samples around the current block, as follows:
α = ( N·Σ( L(n)·C(n) ) − Σ L(n) · Σ C(n) ) / ( N·Σ( L(n)·L(n) ) − Σ L(n) · Σ L(n) ) (2)

and

β = ( Σ C(n) − α · Σ L(n) ) / N (3)
Here, L(n) denotes the down-sampled (for the 4:2:0 and 4:2:2 color formats) or original (for the 4:4:4 color format) top and left neighboring reconstructed luma samples, C(n) denotes the top and left neighboring reconstructed chroma samples, and the value of N is equal to twice the minimum of the width and height of the current chroma coding block.
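For illustration only (not part of the patent text), the least-squares derivation of α and β described above, and the linear prediction it feeds, can be sketched in Python; the function names and sample containers are assumptions:

```python
def derive_cclm_params(L, C):
    """Least-squares derivation of the CCLM parameters (alpha, beta).

    L: down-sampled neighboring reconstructed luma samples L(n)
    C: neighboring reconstructed chroma samples C(n)
    alpha minimizes the regression error between L(n) and C(n);
    beta is the remaining offset.
    """
    N = len(L)
    sum_L = sum(L)
    sum_C = sum(C)
    sum_LC = sum(l * c for l, c in zip(L, C))
    sum_LL = sum(l * l for l in L)
    denom = N * sum_LL - sum_L * sum_L
    alpha = (N * sum_LC - sum_L * sum_C) / denom if denom else 0.0
    beta = (sum_C - alpha * sum_L) / N
    return alpha, beta


def predict_chroma(rec_L, alpha, beta):
    """Linear prediction: pred_C = alpha * rec_L' + beta, per sample."""
    return [alpha * l + beta for l in rec_L]
```

Note that an actual codec performs this derivation with integer arithmetic; the floating-point version here only illustrates the regression.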
In some embodiments, for a codec block having a square shape, the above two equations are applied directly. In other embodiments, for non-square codec blocks, adjacent samples of the longer boundary are first sub-sampled to have the same number of samples as the shorter boundary. Fig. 1 shows the positions of the reconstructed samples to the left and above and the samples of the current block involved in CCLM mode.
In some embodiments, this regression error minimization calculation is performed as part of the decoding process, not just as an encoder search operation, and thus does not use syntax to convey the α and β values.
In some embodiments, the CCLM prediction mode also includes prediction between the two chroma components, e.g., the Cr (red-difference) component is predicted from the Cb (blue-difference) component. Instead of operating on the reconstructed sample signal, the CCLM Cb-to-Cr prediction is applied in the residual domain. It is implemented by adding the weighted reconstructed Cb residual to the original Cr intra prediction to form the final Cr prediction:
pred*_Cr(i,j) = pred_Cr(i,j) + α · resi_Cb′(i,j) (4)

Here, resi_Cb′(i,j) represents the reconstructed Cb residual sample at position (i,j).
In some embodiments, the scaling factor α may be derived in a similar manner as in the CCLM luma-to-chroma prediction. The only difference is the addition of a regression cost relative to a default α value in the error function, so that the derived scaling factor is biased towards the default value of −0.5, as follows:
α = ( N·Σ( Cb(n)·Cr(n) ) − Σ Cb(n) · Σ Cr(n) + λ·(−0.5) ) / ( N·Σ( Cb(n)·Cb(n) ) − Σ Cb(n) · Σ Cb(n) + λ ) (5)

Here, Cb(n) denotes the neighboring reconstructed Cb samples, Cr(n) denotes the neighboring reconstructed Cr samples, and λ is equal to Σ( Cb(n)·Cb(n) ) >> 9.
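A non-normative Python sketch of this biased regression (function name is an assumption): when the Cb neighbors carry no variation, the regularization term dominates and α falls back to the default −0.5.

```python
def derive_cb_to_cr_alpha(cb, cr):
    """Scaling factor for CCLM Cb-to-Cr prediction, with a regression
    cost that biases alpha towards the default value -0.5.

    cb, cr: neighboring reconstructed Cb and Cr samples.
    """
    N = len(cb)
    sum_cb = sum(cb)
    sum_cr = sum(cr)
    sum_cbcr = sum(b * r for b, r in zip(cb, cr))
    sum_cbcb = sum(b * b for b in cb)
    lam = sum_cbcb >> 9                      # lambda = sum(Cb(n)*Cb(n)) >> 9
    num = N * sum_cbcr - sum_cb * sum_cr + lam * (-0.5)
    den = N * sum_cbcb - sum_cb * sum_cb + lam
    return num / den
```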
In some embodiments, the CCLM luma to chroma prediction modes are added as an additional chroma intra prediction mode. At the encoder side, to select the chroma intra prediction mode, an RD cost check of a chroma component is added. When intra prediction modes other than the CCLM luma to chroma prediction modes are used for the chroma components of the CU, the CCLM Cb to Cr prediction is used for the Cr component prediction.
In JEM and VTM-2.0, the total number of training samples for CCLM must be a power of two (i.e., of the form 2^N). Assume the current block size is W×H. If W is not equal to H, the down-sampled luma sample set with more samples is decimated to match the number of samples in the set with fewer samples.
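For illustration only, the decimation of the longer boundary's sample set can be sketched as follows; the uniform-stride pattern and the function name are assumptions, not the normative decimation rule:

```python
def decimate_to_match(longer, shorter):
    """Decimate the larger neighboring-sample set of a non-square W x H
    block so both boundaries contribute the same number of samples
    (keeping the training-set size a power of two).

    A uniform-stride sketch: pick every (len(longer)//len(shorter))-th
    sample from the longer boundary.
    """
    step = len(longer) // len(shorter)
    return longer[::step][:len(shorter)]
```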
1.2 example of a Multi-model CCLM
In JEM, there are two CCLM modes: single model CCLM mode and multi-model CCLM mode (MMLM). As the name implies, the single model CCLM mode uses one linear model for the entire CU to predict chroma samples from luma samples, whereas in MMLM there may be two models.
In MMLM, neighboring luma samples and neighboring chroma samples of a current block are classified into two groups, each of which is used as a training set to derive a linear model (i.e., derive specific α and β for a particular group). Furthermore, samples of the current luminance block are also classified based on the same rules used for classification of neighboring luminance samples.
Fig. 2 shows an example of classifying the neighboring samples into two groups. The threshold is calculated as the average value of the neighboring reconstructed luma samples. Neighboring samples with Rec′_L[x,y] <= Threshold are classified into group 1, while neighboring samples with Rec′_L[x,y] > Threshold are classified into group 2.
Pred_C[x,y] = α1 · Rec′_L[x,y] + β1, if Rec′_L[x,y] <= Threshold
Pred_C[x,y] = α2 · Rec′_L[x,y] + β2, if Rec′_L[x,y] > Threshold (7)
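A non-normative Python sketch of the two-model classification and training described above (threshold at the mean neighboring luma value, one least-squares model per group); all names are illustrative:

```python
def mmlm_classify_and_train(luma, chroma):
    """Multi-model CCLM sketch: split the neighboring (luma, chroma)
    pairs into two groups by the mean luma value (the Threshold) and
    fit one linear model per group.

    Returns (threshold, (alpha1, beta1), (alpha2, beta2)).
    """
    threshold = sum(luma) / len(luma)
    group1 = [(l, c) for l, c in zip(luma, chroma) if l <= threshold]
    group2 = [(l, c) for l, c in zip(luma, chroma) if l > threshold]

    def fit(pairs):
        # Least-squares line through the group's (luma, chroma) pairs.
        L = [p[0] for p in pairs]
        C = [p[1] for p in pairs]
        N = len(L)
        denom = N * sum(l * l for l in L) - sum(L) ** 2
        alpha = ((N * sum(l * c for l, c in pairs) - sum(L) * sum(C)) / denom
                 if denom else 0.0)
        beta = (sum(C) - alpha * sum(L)) / N
        return alpha, beta

    return threshold, fit(group1), fit(group2)
```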
1.3 examples of downsampling filters in CCLM
In some embodiments, to perform cross-component prediction for the 4:2:0 chroma format, where 4 luma samples correspond to 1 chroma sample, the reconstructed luma block needs to be downsampled to match the size of the chroma signal. The default downsampling filter used in CCLM mode is as follows:
Rec′_L[x,y] = ( 2·Rec_L[2x,2y] + 2·Rec_L[2x,2y+1] + Rec_L[2x−1,2y] + Rec_L[2x+1,2y] + Rec_L[2x−1,2y+1] + Rec_L[2x+1,2y+1] + 4 ) >> 3 (6)
Here, the downsampling assumes a "type 0" phase relationship between the positions of the chroma samples and the positions of the luma samples as shown in Fig. 3A, i.e., collocated sampling in the horizontal direction and interstitial sampling in the vertical direction.
The exemplary 6-tap downsampling filter defined in (6) is used as a default filter for both single model and multi-model CCLM modes.
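As a non-normative illustration, the default 6-tap filter of equation (6) can be written in Python; the array layout (indexed [x][y] as in the equation) and the function name are assumptions, and boundary handling is omitted:

```python
def downsample_6tap(rec, x, y):
    """Default CCLM luma downsampling, cf. equation (6): a 6-tap filter
    over two luma columns/rows for 4:2:0 content.

    rec: 2-D reconstructed luma samples, indexed rec[x][y].
    Edge clipping at x == 0 is omitted in this sketch.
    """
    return (2 * rec[2 * x][2 * y] + 2 * rec[2 * x][2 * y + 1]
            + rec[2 * x - 1][2 * y] + rec[2 * x + 1][2 * y]
            + rec[2 * x - 1][2 * y + 1] + rec[2 * x + 1][2 * y + 1]
            + 4) >> 3
```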
In some embodiments, for MMLM mode, the encoder may alternatively select one of four additional luma downsampling filters to apply to the prediction in the CU, and send a filter index to indicate which of these filters to use. As shown in fig. 3B, the four selectable luminance down-sampling filters for the MMLM mode are as follows:
Rec'L[x,y]=(RecL[2x,2y]+RecL[2x+1,2y]+1)>>1 (8)
Rec'L[x,y]=(RecL[2x+1,2y]+RecL[2x+1,2y+1]+1)>>1 (9)
Rec'L[x,y]=(RecL[2x,2y+1]+RecL[2x+1,2y+1]+1)>>1 (10)
Rec'L[x,y]=(RecL[2x,2y]+RecL[2x,2y+1]+RecL[2x+1,2y]+RecL[2x+1,2y+1]+2)>>2 (11)
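For illustration only, the four selectable filters in equations (8) to (11) can be sketched together in Python; the index convention (rec[x][y], matching the equations) and the filter-index mapping are assumptions:

```python
def mmlm_filter(rec, x, y, idx):
    """The four selectable MMLM luma downsampling filters,
    cf. equations (8)-(11). rec is indexed rec[x][y].
    idx selects the filter: 0 -> (8), 1 -> (9), 2 -> (10), 3 -> (11).
    """
    if idx == 0:   # (8): 2-tap average of a horizontal pair
        return (rec[2 * x][2 * y] + rec[2 * x + 1][2 * y] + 1) >> 1
    if idx == 1:   # (9): 2-tap average of a vertical pair
        return (rec[2 * x + 1][2 * y] + rec[2 * x + 1][2 * y + 1] + 1) >> 1
    if idx == 2:   # (10): 2-tap average of the other horizontal pair
        return (rec[2 * x][2 * y + 1] + rec[2 * x + 1][2 * y + 1] + 1) >> 1
    # (11): 4-tap average over the full 2x2 cluster
    return (rec[2 * x][2 * y] + rec[2 * x][2 * y + 1]
            + rec[2 * x + 1][2 * y] + rec[2 * x + 1][2 * y + 1] + 2) >> 2
```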
1.4 exemplary embodiments related to Cross-component prediction
Previously proposed CCLM methods include, but are not limited to:
only neighboring luma samples used in normal intra prediction are needed; and
there is no need to downsample the luminance samples, or downsampling is performed by simple double sample averaging.
The previously proposed examples described below assume a 4:2:0 color format. As shown in Fig. 3A, one chroma (Cb or Cr) sample (represented by a triangle) corresponds to four luma (Y) samples (represented by circles), such as A, B, C and D shown in Fig. 4. Figs. 5A and 5B show examples of a 4×4 chroma block with its neighboring samples and the corresponding luma samples.
Example 1. In one example, it is proposed that CCLM is performed without downsampling filtering of the luma samples.
(a) In one example, the downsampling process of adjacent luminance samples is removed in the CCLM parameter (e.g., α and β) derivation process. Alternatively, the downsampling process is replaced by a subsampling process in which non-consecutive luma samples are utilized.
(b) In one example, the downsampling process for samples in the collocated luma block is removed from the CCLM chroma prediction process. Alternatively, only a subset of the luma samples in the collocated luma block are used to derive the prediction block of chroma samples.
(c) Fig. 6A-6J illustrate examples of an 8×8 luma block corresponding to a 4×4 chroma block.
(d) In one example as shown in fig. 6A, the luma samples at position "C" in fig. 4 are used to correspond to chroma samples. The top-adjacent samples are used in the training process to derive a linear model.
(e) In one example as shown in fig. 6B, the luma samples at position "C" in fig. 4 are used to correspond to chroma samples. The top and top-right neighboring samples are used in the training process to derive the linear model.
(f) In one example as shown in fig. 6C, the luma samples at position "D" in fig. 4 are used to correspond to chroma samples. The top-adjacent samples are used in the training process to derive a linear model.
(g) In one example as shown in fig. 6D, the luma samples at position "D" in fig. 4 are used to correspond to chroma samples. The top and top-right neighboring samples are used in the training process to derive the linear model.
(h) In one example as shown in fig. 6E, the luma samples at position "B" in fig. 4 are used to correspond to the chroma samples. The left neighboring samples are used in the training process to derive a linear model.
(i) In one example as shown in fig. 6F, the luma samples at position "B" in fig. 4 are used to correspond to the chroma samples. The left neighboring samples and the bottom-left neighboring samples are used in the training process to derive a linear model.
(j) In one example as shown in fig. 6G, the luma samples at position "D" in fig. 4 are used to correspond to the chroma samples. The left neighboring samples are used in the training process to derive a linear model.
(k) In one example as shown in fig. 6H, the luma samples at position "D" in fig. 4 are used to correspond to chroma samples. The left neighboring samples and the bottom-left neighboring samples are used in the training process to derive a linear model.
(l) In one example as shown in fig. 6I, the luma samples at position "D" in fig. 4 are used to correspond to the chroma samples. The upper and left neighboring samples are used in the training process to derive a linear model.
(m) in one example as shown in fig. 6J, the luma samples at position "D" in fig. 4 are used to correspond to chroma samples. The linear model is derived during training using the top neighboring samples, the left side neighboring samples, the top right neighboring samples, and the bottom left neighboring samples.
Example 2. In one example, it is proposed that CCLM requires only the neighboring luma samples that are used in the normal intra prediction process, i.e., no other neighboring luma samples are allowed to be used in the CCLM process. In one example, CCLM is implemented with 2-tap filtering of the luma samples. Fig. 7A-7D show examples of an 8×8 luma block corresponding to a 4×4 chroma block.
(a) In one example as shown in fig. 7A, the luminance samples at position "C" and position "D" in fig. 4 are filtered as F (C, D) to correspond to the chrominance samples. The top-adjacent samples are used in the training process to derive a linear model.
(b) In one example as shown in fig. 7B, the luminance samples at position "C" and position "D" in fig. 4 are filtered as F (C, D) to correspond to the chrominance samples. The top and top-right neighboring samples are used in the training process to derive the linear model.
(c) In one example as shown in fig. 7C, the luminance samples at positions "B" and "D" in fig. 4 are filtered as F (B, D) to correspond to the chrominance samples. The left neighboring samples are used in the training process to derive a linear model.
(d) In one example as shown in fig. 7D, the luma samples at position "B" and position "D" in fig. 4 are filtered as F (B, D) to correspond to chroma samples. The left neighboring samples and the bottom-left neighboring samples are used in the training process to derive a linear model.
(e) In one example, F is defined as F(X,Y) = (X + Y) >> 1. Alternatively, F(X,Y) = (X + Y + 1) >> 1.
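The two variants of the 2-tap filter F can be illustrated in one small Python helper (the function name and the boolean switch are assumptions):

```python
def f_avg(x, y, rounding=False):
    """The 2-tap filter of Example 2(e).

    rounding=False: F(X,Y) = (X + Y) >> 1 (truncating average)
    rounding=True:  F(X,Y) = (X + Y + 1) >> 1 (rounded average)
    """
    return (x + y + (1 if rounding else 0)) >> 1
```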
Example 3. In one example, the previously proposed CCLM methods (e.g., Examples 1 and 2) in this section can be applied in a selective manner. That is, different blocks within a region, slice, picture, or sequence may select different kinds of previously proposed CCLM methods.
(a) In one embodiment, the encoder selects one of the previously proposed CCLM methods from a predefined candidate set and signals it to the decoder.
(i) For example, the encoder may select between example 1(a) and example 1 (e). Alternatively, it may be selected between example 1(b) and example 1 (f). Alternatively, it may be selected between example 1(c) and example 1 (g). Alternatively, it may be selected between example 1(d) and example 1 (h). Alternatively, it may be selected between example 2(a) and example 2 (c). Alternatively, it may be selected between example 2(b) and example 2 (d).
(ii) The candidate set to be selected from, and the signaling, may depend on the shape or size of the block. Let W and H denote the width and height of the chroma block, and let T1 and T2 be integers.
(1) In one example, if W <= T1 and H <= T2, there is no candidate, e.g., CCLM is disabled. For example, T1 = T2 = 2.
(2) In one example, if W <= T1 or H <= T2, there is no candidate, e.g., CCLM is disabled. For example, T1 = T2 = 2.
(3) In one example, if W × H <= T1, there is no candidate, e.g., CCLM is disabled. For example, T1 = 4.
(4) In one example, if W <= T1 and H <= T2, there is only one candidate, such as Example 1(i). The CCLM method selection information is not signaled. For example, T1 = T2 = 4.
(5) In one example, if W <= T1 or H <= T2, there is only one candidate, such as Example 1(i). The CCLM method selection information is not signaled. For example, T1 = T2 = 4.
(6) In one example, if W × H <= T1, there is only one candidate, such as Example 1(i). The CCLM method selection information is not signaled. For example, T1 = 16.
(7) In one example, if W > H, there is only one candidate, such as Example 1(a). The CCLM method selection information is not signaled. Alternatively, if W > H (or W > N × H, where N is a positive integer), only candidates (or some candidates) that use the above-neighboring reconstructed samples and/or the top-right neighboring reconstructed samples in deriving the CCLM parameters are included in the candidate set.
(8) In one example, if W < H, there is only one candidate, such as Example 1(e). The CCLM method selection information is not signaled. Alternatively, if W < H (or N × W < H), only candidates (or some candidates) that use the left-neighboring reconstructed samples and/or the bottom-left neighboring reconstructed samples in deriving the CCLM parameters are included in the candidate set.
(b) In one embodiment, both the encoder and decoder select the previously proposed CCLM method based on the same rules. The encoder does not signal it to the decoder. For example, the selection may depend on the shape or size of the block. In one example, example 1(a) is selected if the width is greater than the height, otherwise, example 1(e) is selected.
(c) One or more previously proposed CCLM candidate sets may be signaled in a sequence parameter set/picture parameter set/slice header/CTU/CTB/CTU group.
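For illustration only, the size- and shape-dependent candidate gating of Example 3 can be sketched in Python; the thresholds and the returned labels are illustrative assumptions, combining several of the listed variants:

```python
def cclm_candidate_set(W, H):
    """Sketch of size/shape-dependent CCLM candidate gating.

    Combines variant (1) (disable CCLM for tiny blocks, T1 = T2 = 2),
    variant (4) (single implicit candidate for small blocks,
    T1 = T2 = 4, nothing signaled), and variants (7)/(8)
    (shape-dependent candidates). Labels are placeholders.
    """
    if W <= 2 and H <= 2:
        return []                           # variant (1): CCLM disabled
    if W <= 4 and H <= 4:
        return ["example-1(i)"]             # variant (4): implicit, unsignaled
    if W > H:
        return ["above-based candidates"]   # variant (7)
    if W < H:
        return ["left-based candidates"]    # variant (8)
    return ["above-based candidates", "left-based candidates"]
```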
Example 4. In one example, it is proposed that multiple CCLM methods (e.g., Example 1 and Example 2) can be applied to the same chroma block. That is, one block within a region, slice, picture, or sequence may select different kinds of previously proposed CCLM methods to derive multiple intermediate chroma prediction blocks, and the final chroma prediction block is derived from those intermediate chroma prediction blocks.
(a) Alternatively, a plurality of sets of CCLM parameters (e.g., α and β) may first be derived from a plurality of selected CCLM methods. A final CCLM parameter set may be derived from the multiple sets and used in the chroma prediction block generation process.
(b) The selection of the various CCLM methods can be signaled (implicitly or explicitly) in a similar manner as described in example 3.
(c) The indication of the use of the proposed method may be signaled in sequence parameter set/picture parameter set/slice header/CTU group/CTU/codec block.
Example 5. In one example, whether and how to apply the previously proposed CCLM methods may depend on the location of the current block.
(a) In one example, one or more of the proposed methods are applied to CUs located at the upper boundary of the current CTU, as shown in fig. 8A.
(b) In one example, one or more of the proposed methods are applied to CUs located at the left boundary of the current CTU, as shown in fig. 8B.
(c) In one example, one or more of the proposed methods are applied to both cases.
1.5 Examples of CCLM in VVC
In some embodiments, CCLM as in JEM is employed in VTM-2.0, but MM-CCLM as in JEM is not employed in VTM-2.0.
CCLM in VTM-5.0
In VTM-5.0, in addition to the LM mode, two other CCLM modes (LM-A and LM-L) proposed in JVET-L0338 are adopted. LM-A derives the CCLM parameters using only the neighboring samples above or to the top-right of the current block, while LM-L derives the CCLM parameters using only the neighboring samples to the left or below-left of the current block.
Decoding process of CCLM
In VTM-5.0, the LM derivation process is simplified to the 4-point max-min method proposed in JVET-N0271. The corresponding working draft text is as follows.
Specification of the INTRA_LT_CCLM, INTRA_L_CCLM and INTRA_T_CCLM intra prediction modes
The inputs to this process are:
-an intra prediction mode predModeIntra,
- a sample location ( xTbC, yTbC ) of the top-left sample of the current transform block relative to the top-left sample of the current picture,
a variable nTbW specifying the transform block width,
-a variable nTbH specifying the transform block height,
- chroma neighboring samples p[ x ][ y ], with x = −1, y = 0..2 * nTbH − 1 and x = 0..2 * nTbW − 1, y = −1.

The output of this process is the predicted samples predSamples[ x ][ y ], with x = 0..nTbW − 1, y = 0..nTbH − 1.
The current luminance position (xTbY, yTbY) is derived as follows:
(xTbY,yTbY)=(xTbC<<1,yTbC<<1) (8-156)
The variables availL, availT and availTL are derived as follows:
- The availability derivation process for the left neighboring samples of the block specified in clause 6.4.X [Ed. (BB): neighboring block availability checking process tbd] is invoked with the current chroma location ( xCurr, yCurr ) set equal to ( xTbC, yTbC ) and the neighboring chroma location ( xTbC − 1, yTbC ) as inputs, and the output is assigned to availL.
- The availability derivation process for the above neighboring samples of the block specified in clause 6.4.X [Ed. (BB): neighboring block availability checking process tbd] is invoked with the current chroma location ( xCurr, yCurr ) set equal to ( xTbC, yTbC ) and the neighboring chroma location ( xTbC, yTbC − 1 ) as inputs, and the output is assigned to availT.
- The availability derivation process for the above-left neighboring samples of the block specified in clause 6.4.X [Ed. (BB): neighboring block availability checking process tbd] is invoked with the current chroma location ( xCurr, yCurr ) set equal to ( xTbC, yTbC ) and the neighboring chroma location ( xTbC − 1, yTbC − 1 ) as inputs, and the output is assigned to availTL.
The number of available top-right neighboring chroma samples numTopRight is derived as follows:
- The variable numTopRight is set equal to 0 and availTR is set equal to TRUE.
- When predModeIntra is equal to INTRA_T_CCLM, for x = nTbW..2 * nTbW − 1, the following applies until availTR is equal to FALSE or x is equal to 2 * nTbW − 1:
- The availability derivation process for the block specified in clause 6.4.X [Ed. (BB): neighboring block availability checking process tbd] is invoked with the current chroma location ( xCurr, yCurr ) set equal to ( xTbC, yTbC ) and the neighboring chroma location ( xTbC + x, yTbC − 1 ) as inputs, and the output is assigned to availableTR.
- When availableTR is equal to TRUE, numTopRight is incremented by one.
The number of available below-left neighboring chroma samples numLeftBelow is derived as follows:
the variable numLeftBelow is set equal to 0 and availLB is set equal to TRUE.
- When predModeIntra is equal to INTRA_L_CCLM, for y = nTbH..2 * nTbH − 1, the following applies until availLB is equal to FALSE or y is equal to 2 * nTbH − 1:
- The availability derivation process for the block specified in clause 6.4.X [Ed. (BB): neighboring block availability checking process tbd] is invoked with the current chroma location ( xCurr, yCurr ) set equal to ( xTbC, yTbC ) and the neighboring chroma location ( xTbC − 1, yTbC + y ) as inputs, and the output is assigned to availableLB.
- When availableLB is equal to TRUE, numLeftBelow is incremented by one.
The number of available neighboring chroma samples above and above-right, numSampT, and the number of available neighboring chroma samples to the left and below-left, numSampL, are derived as follows:
If predModeIntra is equal to INTRA_LT_CCLM, the following applies:
numSampT = availT ? nTbW : 0 (8-157)
numSampL = availL ? nTbH : 0 (8-158)
Otherwise, the following applies:
numSampT = (availT && predModeIntra == INTRA_T_CCLM) ? (nTbW + Min(numTopRight, nTbH)) : 0 (8-159)
numSampL = (availL && predModeIntra == INTRA_L_CCLM) ? (nTbH + Min(numLeftBelow, nTbW)) : 0 (8-160)
the variable bCTUboundary is derived as follows:
bCTUboundary = ((yTbC & ((1 << (CtbLog2SizeY - 1)) - 1)) == 0) ? TRUE : FALSE (8-161)
the variables cntN and array pickPosN (where N is replaced by L and T) are derived as follows:
The variable numIs4N is set equal to ((availT && availL && predModeIntra == INTRA_LT_CCLM) ? 0 : 1).
The variable startPosN is set equal to numSampN >> (2 + numIs4N).
The variable pickStepN is set equal to Max(1, numSampN >> (1 + numIs4N)).
- If availN is equal to TRUE and predModeIntra is equal to INTRA_LT_CCLM or INTRA_N_CCLM, the following assignments are made:
- cntN is set equal to Min(numSampN, (1 + numIs4N) << 1).
- pickPosN[pos] is set equal to (startPosN + pos * pickStepN), where pos = 0..cntN-1.
- Otherwise, cntN is set equal to 0.
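The startPosN / pickStepN / pickPosN derivation above can be sketched in a few lines of Python. This is a hedged illustration, not the normative text: names mirror the spec variables, and the availability checks are assumed to have already produced numSampN and numIs4N.

```python
# Sketch of the neighbouring-sample selection: numSampN available samples
# on side N are reduced to cntN evenly spaced picks. numIs4N is 0 for
# INTRA_LT_CCLM with both sides available (2 picks per side) and 1
# otherwise (up to 4 picks from the single available side).
def pick_positions(numSampN, numIs4N):
    startPosN = numSampN >> (2 + numIs4N)
    pickStepN = max(1, numSampN >> (1 + numIs4N))
    cntN = min(numSampN, (1 + numIs4N) << 1)
    return [startPosN + pos * pickStepN for pos in range(cntN)]

print(pick_positions(8, 0))  # LT mode, 8 top samples -> [2, 6]
print(pick_positions(8, 1))  # T-only mode, 8 top samples -> [1, 3, 5, 7]
```

With numSampN = 0 (side unavailable), cntN evaluates to 0 and no positions are selected, matching the "otherwise, cntN is set equal to 0" branch.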
The predicted samples predSamples[x][y], where x = 0..nTbW-1 and y = 0..nTbH-1, are derived as follows:
-if both numSampL and numSampT equal 0, then the following applies:
predSamples[x][y]=1<<(BitDepthC-1) (8-162)
otherwise, the following ordered steps apply:
1. The collocated luma samples pY[x][y] (where x = 0..nTbW*2-1, y = 0..nTbH*2-1) are set equal to the reconstructed luma samples prior to the deblocking filter process at the positions (xTbY + x, yTbY + y).
2. The neighboring luminance samples pY [ x ] [ y ] are derived as follows:
- When numSampL is greater than 0, the neighboring left luma samples pY[x][y] (where x = -1..-3, y = 0..2*numSampL-1) are set equal to the reconstructed luma samples prior to the deblocking filter process at the positions (xTbY + x, yTbY + y).
- When numSampT is greater than 0, the neighboring top luma samples pY[x][y] (where x = 0..2*numSampT-1, y = -1, -2) are set equal to the reconstructed luma samples prior to the deblocking filter process at the positions (xTbY + x, yTbY + y).
- When availTL is equal to TRUE, the neighboring top-left luma samples pY[x][y] (where x = -1, y = -1, -2) are set equal to the reconstructed luma samples prior to the deblocking filter process at the positions (xTbY + x, yTbY + y).
3. The downsampled collocated luminance sample pDsY [ x ] [ y ], where x is 0.. nTbW-1 and y is 0.. nTbH-1, is derived as follows:
if sps _ cclm _ colocated _ chroma _ flag is equal to 1, then the following applies:
- pDsY[x][y] (where x = 1..nTbW-1, y = 1..nTbH-1) is derived as follows:
pDsY[x][y]=(pY[2*x][2*y-1]+
pY[2*x-1][2*y]+4*pY[2*x][2*y]+pY[2*x+1][2*y]+
pY[2*x][2*y+1]+4)>>3 (8-163)
- If availL is equal to TRUE, pDsY[0][y] (where y = 1..nTbH-1) is derived as follows:
pDsY[0][y]=(pY[0][2*y-1]+pY[-1][2*y]+4*pY[0][2*y]+pY[1][2*y]+pY[0][2*y+1]+4)>>3 (8-164)
-otherwise, pDsY [0] [ y ] (where y ═ 1.. nTbH-1) is derived as follows:
pDsY[0][y]=(pY[0][2*y-1]+2*pY[0][2*y]+pY[0][2*y+1]+2)>>2 (8-165)
- If availT is equal to TRUE, pDsY[x][0] (where x = 1..nTbW-1) is derived as follows:
pDsY[x][0]=(pY[2*x][-1]+pY[2*x-1][0]+4*pY[2*x][0]+pY[2*x+1][0]+pY[2*x][1]+4)>>3 (8-166)
-otherwise, pDsY [ x ] [0], wherein x ═ 1.. nTbW-1, derived as follows:
pDsY[x][0]=(pY[2*x-1][0]+2*pY[2*x][0]+pY[2*x+1][0]+2)>>2 (8-167)
- If availL is equal to TRUE and availT is equal to TRUE, pDsY[0][0] is derived as follows:
pDsY[0][0]=(pY[0][-1]+pY[-1][0]+4*pY[0][0]+pY[1][0]+pY[0][1]+4)>>3 (8-168)
- Otherwise, if availL is equal to TRUE and availT is equal to FALSE, pDsY[0][0] is derived as follows:
pDsY[0][0]=(pY[-1][0]+2*pY[0][0]+pY[1][0]+2)>>2 (8-169)
- Otherwise, if availL is equal to FALSE and availT is equal to TRUE, pDsY[0][0] is derived as follows:
pDsY[0][0]=(pY[0][-1]+2*pY[0][0]+pY[0][1]+2)>>2 (8-170)
- Otherwise (availL is equal to FALSE and availT is equal to FALSE), pDsY[0][0] is derived as follows:
pDsY[0][0]=pY[0][0] (8-171)
Otherwise (sps_cclm_colocated_chroma_flag is equal to 0), the following applies:
-pDsY [ x ] [ y ], where x 1.. nTbW-1, y 0.. nTbH-1, derived as follows:
pDsY[x][y]=(pY[2*x-1][2*y]+pY[2*x-1][2*y+1]+2*pY[2*x][2*y]+2*pY[2*x][2*y+1]+pY[2*x+1][2*y]+pY[2*x+1][2*y+1]+4)>>3 (8-172)
- If availL is equal to TRUE, pDsY[0][y] (where y = 0..nTbH-1) is derived as follows:
pDsY[0][y]=(pY[-1][2*y]+pY[-1][2*y+1]+2*pY[0][2*y]+2*pY[0][2*y+1]+pY[1][2*y]+pY[1][2*y+1]+4)>>3 (8-173)
- Otherwise, pDsY[0][y] (where y = 0..nTbH-1) is derived as follows:
pDsY[0][y]=(pY[0][2*y]+pY[0][2*y+1]+1)>>1 (8-174)
4. When numSampL is greater than 0, the selected neighboring left chroma samples pSelC[idx] are set equal to p[-1][pickPosL[idx]] (where idx = 0..cntL-1), and the selected down-sampled neighboring left luma samples pSelDsY[idx] (where idx = 0..cntL-1) are derived as follows:
the variable y is set equal to pickPosL [ idx ].
If sps _ cclm _ colocated _ chroma _ flag is equal to 1, then the following applies:
-if y is greater than 0 or availTL is equal to TRUE, pSelDsY [ idx ] is derived as follows:
pSelDsY[idx]=(pY[-2][2*y-1]+pY[-3][2*y]+4*pY[-2][2*y]+pY[-1][2*y]+pY[-2][2*y+1]+4)>>3 (8-175)
-otherwise:
pSelDsY[idx]=(pY[-3][0]+2*pY[-2][0]+pY[-1][0]+2)>>2 (8-177)
otherwise, the following applies:
pSelDsY[idx]=(pY[-1][2*y]+pY[-1][2*y+1]+2*pY[-2][2*y]+2*pY[-2][2*y+1]+pY[-3][2*y]+pY[-3][2*y+1]+4)>>3 (8-178)
5. When numSampT is greater than 0, the selected neighboring top chroma samples pSelC[idx] are set equal to p[pickPosT[idx - cntL]][-1] (where idx = cntL..cntL + cntT - 1), and the down-sampled neighboring top luma samples pSelDsY[idx] (where idx = 0..cntL + cntT - 1) are specified as follows:
variable x is set equal to pickPosT [ idx-cntL ].
If sps _ cclm _ colocated _ chroma _ flag is equal to 1, then the following applies:
if x is greater than 0, the following applies:
-if bCTUboundary equals FALSE, the following applies:
pSelDsY[idx]=(pY[2*x][-3]+pY[2*x-1][-2]+4*pY[2*x][-2]+pY[2*x+1][-2]+pY[2*x][-1]+4)>>3 (8-179)
- Else (bCTUboundary is equal to TRUE), the following applies:
pSelDsY[idx]=(pY[2*x-1][-1]+2*pY[2*x][-1]+pY[2*x+1][-1]+2)>>2 (8-180)
-otherwise:
if availTL equals TRUE and bCTUboundary equals FALSE, then the following applies:
pSelDsY[idx]=(pY[0][-3]+pY[-1][-2]+4*pY[0][-2]+pY[1][-2]+pY[0][-1]+4)>>3 (8-181)
otherwise, if availTL equals TRUE and bCTUboundary equals TRUE, then the following applies:
pSelDsY[idx]=(pY[-1][-1]+2*pY[0][-1]+pY[1][-1]+2)>>2 (8-182)
otherwise, if availTL is equal to FALSE and bCTUboundary is equal to FALSE, then the following applies:
pSelDsY[idx]=(pY[0][-3]+2*pY[0][-2]+pY[0][-1]+2)>>2 (8-183)
else (availTL equal FALSE, and bCTUboundary equal TRUE), the following applies:
pSelDsY[idx]=pY[0][-1] (8-184)
otherwise, the following applies:
if x is greater than 0, the following applies:
-if bCTUboundary equals FALSE, the following applies:
pSelDsY[idx]=(pY[2*x-1][-2]+pY[2*x-1][-1]+2*pY[2*x][-2]+2*pY[2*x][-1]+pY[2*x+1][-2]+pY[2*x+1][-1]+4)>>3 (8-185)
else (bCTUboundary equals TRUE), the following applies:
pSelDsY[idx]=(pY[2*x-1][-1]+2*pY[2*x][-1]+pY[2*x+1][-1]+2)>>2 (8-186)
-otherwise:
if availTL equals TRUE and bCTUboundary equals FALSE, then the following applies:
pSelDsY[idx]=(pY[-1][-2]+pY[-1][-1]+2*pY[0][-2]+2*pY[0][-1]+pY[1][-2]+pY[1][-1]+4)>>3 (8-187)
otherwise, if availTL equals TRUE and bCTUboundary equals TRUE, then the following applies:
pSelDsY[idx]=(pY[-1][-1]+2*pY[0][-1]+pY[1][-1]+2)>>2 (8-188)
otherwise, if availTL is equal to FALSE and bCTUboundary is equal to FALSE, then the following applies:
pSelDsY[idx]=(pY[0][-2]+pY[0][-1]+1)>>1 (8-189)
else (availTL equal FALSE, and bCTUboundary equal TRUE), the following applies:
pSelDsY[idx]=pY[0][-1] (8-190)
6. When cntT + cntL is not equal to 0, the variables minY, maxY, minC and maxC are derived as follows:
- When cntT + cntL is equal to 2, pSelComp[3] is set equal to pSelComp[0], pSelComp[2] is set equal to pSelComp[1], pSelComp[0] is set equal to pSelComp[1], and pSelComp[1] is set equal to pSelComp[3], with Comp being replaced by DsY and C.
Arrays minGrpIdx and maxGrpIdx are set as follows:
–minGrpIdx[0]=0。
–minGrpIdx[1]=2。
–maxGrpIdx[0]=1。
–maxGrpIdx[1]=3。
- When pSelDsY[minGrpIdx[0]] is greater than pSelDsY[minGrpIdx[1]], minGrpIdx[0] and minGrpIdx[1] are swapped: (minGrpIdx[0], minGrpIdx[1]) = Swap(minGrpIdx[0], minGrpIdx[1]).
- When pSelDsY[maxGrpIdx[0]] is greater than pSelDsY[maxGrpIdx[1]], maxGrpIdx[0] and maxGrpIdx[1] are swapped: (maxGrpIdx[0], maxGrpIdx[1]) = Swap(maxGrpIdx[0], maxGrpIdx[1]).
- When pSelDsY[minGrpIdx[0]] is greater than pSelDsY[maxGrpIdx[1]], the arrays minGrpIdx and maxGrpIdx are swapped: (minGrpIdx, maxGrpIdx) = Swap(minGrpIdx, maxGrpIdx).
- When pSelDsY[minGrpIdx[1]] is greater than pSelDsY[maxGrpIdx[0]], minGrpIdx[1] and maxGrpIdx[0] are swapped: (minGrpIdx[1], maxGrpIdx[0]) = Swap(minGrpIdx[1], maxGrpIdx[0]).
–maxY=(pSelDsY[maxGrpIdx[0]]+pSelDsY[maxGrpIdx[1]]+1)>>1。
–maxC=(pSelC[maxGrpIdx[0]]+pSelC[maxGrpIdx[1]]+1)>>1。
–minY=(pSelDsY[minGrpIdx[0]]+pSelDsY[minGrpIdx[1]]+1)>>1。
–minC=(pSelC[minGrpIdx[0]]+pSelC[minGrpIdx[1]]+1)>>1。
7. The variables a, b and k are derived as follows:
-if numSampL equals 0 and numSampT equals 0, then the following applies:
k=0 (8-208)
a=0 (8-209)
b=1<<(BitDepthC-1) (8-210)
otherwise, the following applies:
diff=maxY–minY (8-211)
if diff is not equal to 0, the following applies:
diffC=maxC-minC (8-212)
x=Floor(Log2(diff)) (8-213)
normDiff=((diff<<4)>>x)&15 (8-214)
x+=(normDiff!=0)?1:0 (8-215)
y=Floor(Log2(Abs(diffC)))+1 (8-216)
a=(diffC*(divSigTable[normDiff]|8)+(1<<(y-1)))>>y (8-217)
k=((3+x-y)<1)?1:3+x-y (8-218)
a=((3+x-y)<1)?Sign(a)*15:a (8-219)
b=minC-((a*minY)>>k) (8-220)
wherein divSigTable [ ] is specified as follows:
divSigTable[]={0,7,6,5,5,4,4,3,3,2,2,1,1,1,1,0} (8-221)
else (diff equals 0), the following applies:
k=0 (8-222)
a=0 (8-223)
b=minC (8-224)
8. The predicted samples predSamples[x][y], where x = 0..nTbW-1 and y = 0..nTbH-1, are derived as follows:
predSamples[x][y]=Clip1C(((pDsY[x][y]*a)>>k)+b) (8-225)
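Steps 6 through 8 above (two-point averaging of the selected samples, the division-free slope derivation, and the final linear prediction) can be sketched end to end in Python. This is a hedged illustration of the procedure, not the normative text: Floor(Log2(·)) is realized with int.bit_length(), and a 10-bit chroma bit depth is assumed for the final Clip1C.

```python
divSigTable = [0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0]

def min_max_groups(pSelDsY, pSelC):
    """Step 6: average the two smallest and the two largest luma samples
    (and their chroma partners) out of the four selected pairs."""
    minG, maxG = [0, 2], [1, 3]
    if pSelDsY[minG[0]] > pSelDsY[minG[1]]: minG = [minG[1], minG[0]]
    if pSelDsY[maxG[0]] > pSelDsY[maxG[1]]: maxG = [maxG[1], maxG[0]]
    if pSelDsY[minG[0]] > pSelDsY[maxG[1]]: minG, maxG = maxG, minG
    if pSelDsY[minG[1]] > pSelDsY[maxG[0]]: minG[1], maxG[0] = maxG[0], minG[1]
    maxY = (pSelDsY[maxG[0]] + pSelDsY[maxG[1]] + 1) >> 1
    maxC = (pSelC[maxG[0]] + pSelC[maxG[1]] + 1) >> 1
    minY = (pSelDsY[minG[0]] + pSelDsY[minG[1]] + 1) >> 1
    minC = (pSelC[minG[0]] + pSelC[minG[1]] + 1) >> 1
    return minY, minC, maxY, maxC

def derive_params(minY, minC, maxY, maxC):
    """Step 7: division-free derivation of slope a, shift k, offset b."""
    diff = maxY - minY
    if diff == 0:
        return 0, 0, minC
    diffC = maxC - minC
    x = diff.bit_length() - 1                 # Floor(Log2(diff))
    normDiff = ((diff << 4) >> x) & 15
    x += 1 if normDiff != 0 else 0
    y = abs(diffC).bit_length() if diffC != 0 else 1  # Floor(Log2(|diffC|))+1
    a = (diffC * (divSigTable[normDiff] | 8) + (1 << (y - 1))) >> y
    if (3 + x - y) < 1:
        k = 1
        a = 15 if a > 0 else (-15 if a < 0 else 0)  # Sign(a) * 15
    else:
        k = 3 + x - y
    b = minC - ((a * minY) >> k)
    return a, k, b

def predict_chroma(pDsY, a, k, b, bit_depth=10):
    """Step 8: predSamples = Clip1C(((pDsY * a) >> k) + b)."""
    return max(0, min((1 << bit_depth) - 1, ((pDsY * a) >> k) + b))
```

For four points lying exactly on the line C = Y/2 (minY = 0, maxY = 64, minC = 0, maxC = 32), the sketch derives a = 4, k = 3, b = 0, i.e. the predicted chroma equals half the down-sampled luma.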
JVET-L0283
This contribution proposes multiple reference line intra prediction (MRLIP), in which directional intra prediction can be generated using neighboring samples from more than one reference line, either adjacent or non-adjacent to the current block.
1.6 examples of Intra prediction in VVC
1.6.1 Intra mode coding and decoding Using 67 Intra prediction modes
To capture the arbitrary edge directions present in natural video, the number of directional intra modes is extended from the 33 used in HEVC to 65. The additional directional modes are depicted in fig. 9 with dashed arrows, while the planar and DC modes remain unchanged. These denser directional intra prediction modes apply to all block sizes and to both luma and chroma intra prediction.
As shown in fig. 9, the conventional angular intra prediction directions are defined as spanning from 45 degrees to -135 degrees in the clockwise direction. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signaled using the original method and remapped to the indices of the wide-angle modes after parsing. The total number of intra prediction modes is unchanged (e.g., 67), and the intra mode coding is unchanged.
In HEVC, each intra-coded block has a square shape, and the length of each of its sides is a power of 2. Therefore, no division operations are needed to generate an intra prediction value in DC mode. In VTM2, blocks may have a rectangular shape, which in general would require a division operation per block. To avoid division operations for DC prediction, only the longer side is used to compute the average for non-square blocks.
1.6.2 examples of Intra mode coding
In some embodiments, to keep the complexity of MPM list generation low, an intra mode coding method with 3 most probable modes (MPMs) is used. The MPM list considers the following three aspects:
-adjacent intra mode;
-derived intra mode; and
-default intra mode.
For the neighboring intra modes (A and B), the two neighboring blocks located to the left and above are considered. An initial MPM list is formed by performing a pruning process on the two neighboring intra modes. If the two neighboring modes differ from each other, one of the default modes (e.g., PLANAR (0), DC (1), ANGULAR50 (50)) is added to the MPM list after a pruning check against the two existing MPMs. When the two neighboring modes are the same, either a default mode or a derived mode is added to the MPM list after the pruning check. The detailed generation process of the three-entry MPM list is as follows:
- If the two neighboring candidate modes are the same (i.e., A == B):
- If A is less than 2, candModeList[3] = {0, 1, 50}.
- Otherwise, candModeList[3] = {A, 2 + ((A + 61) % 64), 2 + ((A - 1) % 64)}.
- Otherwise (A != B):
- If neither A nor B is equal to 0, candModeList[3] = {A, B, 0}.
- Otherwise, if neither A nor B is equal to 1, candModeList[3] = {A, B, 1}.
- Otherwise, candModeList[3] = {A, B, 50}.
An additional pruning process is used to remove duplicate modes so that only unique modes are included in the MPM list. For entropy coding of the 64 non-MPM modes, a 6-bit fixed-length code (FLC) is used.
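The three-entry MPM construction described above can be sketched as follows. This is a hedged illustration assuming the 67-mode numbering (0 = PLANAR, 1 = DC, 50 = vertical), not the normative derivation:

```python
def build_mpm_list(A, B):
    """A, B: intra modes of the left and above neighbouring blocks."""
    if A == B:
        if A < 2:  # both neighbours are non-angular (planar or DC)
            return [0, 1, 50]
        # A plus its two closest angular neighbours, with wrap-around
        return [A, 2 + ((A + 61) % 64), 2 + ((A - 1) % 64)]
    if A != 0 and B != 0:
        return [A, B, 0]
    if A != 1 and B != 1:
        return [A, B, 1]
    return [A, B, 50]

print(build_mpm_list(50, 50))  # -> [50, 49, 51]
print(build_mpm_list(0, 1))    # -> [0, 1, 50]
```

Note how the wrap-around arithmetic keeps the derived neighbours of a repeated angular mode within the angular range 2..66.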
1.6.3 Wide Angle Intra prediction of non-Square blocks
In some embodiments, the conventional angular intra prediction directions are defined as spanning from 45 degrees to -135 degrees in the clockwise direction. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signaled using the original method and remapped to the indices of the wide-angle modes after parsing. The total number of intra prediction modes for a particular block is unchanged, e.g., 67, and the intra mode coding is unchanged.
To support these prediction directions, an upper reference of length 2W +1 and a left reference of length 2H +1 are defined as shown in the examples in fig. 10A and 10B.
In some embodiments, the number of modes replaced by wide-angle direction modes depends on the aspect ratio of the block. The replaced intra prediction modes are shown in Table 1.
Table 1: intra prediction mode replaced by wide angle mode
Condition Substituted intra prediction modes
W/H==2 Modes 2,3,4,5,6,7
W/H>2 Modes 2,3,4,5,6,7,8,9,10,11
W/H==1 None
H/W==1/2 Modes 61,62,63,64,65,66
H/W<1/2 Mode 57,58,59,60,61,62,63,64,65,66
As shown in fig. 11, in the case of wide-angle intra prediction, two vertically adjacent prediction samples may use two non-adjacent reference samples. Therefore, low-pass reference sample filtering and side smoothing are applied to wide-angle prediction to reduce the negative effects of the increased gap Δpα.
1.6.4 example of position dependent intra prediction combining (PDPC)
In VTM2, the result of intra prediction in planar mode is further modified by a position-dependent intra prediction combination (PDPC) method. PDPC is an intra prediction method that combines unfiltered boundary reference samples with HEVC-style intra prediction that uses filtered boundary reference samples. PDPC is applied, without signaling, to the following intra modes: planar, DC, horizontal, vertical, the bottom-left angular mode and its eight adjacent angular modes, and the top-right angular mode and its eight adjacent angular modes.
The predicted sample pred(x, y) is computed as a linear combination of the intra prediction (DC, planar, angular) and the reference samples according to the following equation:
pred(x, y) = (wL × R(-1, y) + wT × R(x, -1) - wTL × R(-1, -1) + (64 - wL - wT + wTL) × pred(x, y) + 32) >> shift
Here, R(x, -1) and R(-1, y) represent the reference samples located above and to the left of the current sample (x, y), respectively, and R(-1, -1) represents the reference sample located at the top-left corner of the current block.
In some embodiments, if PDPC is applied to DC, planar, horizontal and vertical intra modes, no additional boundary filters are needed as is required in the case of HEVC DC mode boundary filters or horizontal/vertical mode edge filters.
FIGS. 12A-12D show the definition of the reference samples (R(x, -1), R(-1, y) and R(-1, -1)) for PDPC applied to various prediction modes. The prediction sample pred(x', y') is located at position (x', y') within the prediction block. The coordinate x of the reference sample R(x, -1) is given by x = x' + y' + 1, and the coordinate y of the reference sample R(-1, y) is similarly given by y = x' + y' + 1.
In some embodiments, the PDPC weights depend on the prediction mode and are shown in table 2, where S ═ shift.
Table 2: examples of PDPC weights according to prediction mode
Prediction mode wT wL wTL
Diagonal angle of the upper right 16>>((y’<<1)>>S) 16>>((x’<<1)>>S) 0
Diagonal from the lower left 16>>((y’<<1)>>S) 16>>((x’<<1)>>S) 0
Adjacent diagonal angle, upper right 32>>((y’<<1)>>S) 0 0
Adjacent diagonal angle, lower left 0 32>>((x’<<1)>>S) 0
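As a hedged illustration of how the equation and Table 2 combine, the following sketch evaluates one PDPC sample for the top-right diagonal mode (where wTL = 0, so the R(-1, -1) term vanishes). The final shift of 6 corresponds to the 64-weight normalization in the equation, and S is the size-dependent shift mentioned above; this is a sketch, not the normative PDPC process.

```python
def pdpc_diag_top_right(pred, x, y, R_top, R_left, S):
    """Blend the intra prediction pred at (x, y) with the unfiltered
    reference samples R_top[x] and R_left[y] using Table 2 weights."""
    wT = 16 >> ((y << 1) >> S)   # decays with distance from the top row
    wL = 16 >> ((x << 1) >> S)   # decays with distance from the left column
    # wTL = 0 for the diagonal modes
    return (wL * R_left[y] + wT * R_top[x] +
            (64 - wL - wT) * pred + 32) >> 6

print(pdpc_diag_top_right(100, 0, 0, [60], [80], 2))  # -> 85
```

At the block corner (x = y = 0) both reference weights are at their maximum of 16, pulling the prediction halfway toward the average of the two boundary samples.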
2 Examples of the disadvantages in existing implementations solved by the described technology
The current CCLM implementation in JEM or VTM has at least the following problems:
The current CCLM design in JEM requires more neighboring luma samples than are used in normal intra prediction. CCLM requires two rows of upper neighboring luma samples and three columns of left neighboring luma samples. MM-CCLM requires four rows of upper neighboring luma samples and four columns of left neighboring luma samples. This is undesirable in hardware designs.
Other related methods use only one row of neighboring luma samples, but they suffer some coding performance loss.
The neighboring chroma samples are used only to derive the LM parameters. When generating a prediction block for a chroma block, only the luma samples and the derived LM parameters are utilized. Therefore, the spatial correlation between the current chroma block and its neighboring chroma blocks is not exploited.
In VTM-5.0, the CCLM (including the LM, LM-L, and LM-T modes) may find only two available neighboring chroma samples (and their corresponding luma samples, which may be downsampled). This is a special case in the 4-point max-min CCLM parameter derivation process, which is undesirable.
3 Exemplary methods of cross-component prediction in video coding
Embodiments of the presently disclosed technology overcome the disadvantages of existing implementations, providing video coding with higher coding efficiency and lower computational complexity. Cross-component prediction based on the disclosed techniques may enhance both existing and future video coding standards, as set forth in the examples described below for various implementations. The examples of the disclosed technology provided below illustrate the general concepts and are not meant to be construed as limiting. In the examples, the various features described may be combined unless explicitly indicated to the contrary.
In the following examples and methods, the term "LM method" includes, but is not limited to, the LM mode in JEM or VTM, the MMLM mode in JEM, the left LM mode that uses only left-side neighboring samples to derive the linear model, the upper LM mode that uses only upper neighboring samples to derive the linear model, and other kinds of methods that use luma reconstruction samples to derive chroma prediction blocks.
Example 1. In one example, how the luma samples are downsampled may depend on whether the luma samples are inside or outside the current block.
(a) The down-sampled luminance samples can be used to derive the LM parameters. Here, the luminance block is a corresponding luminance block of one chrominance block.
(b) The downsampled luma samples may be used to derive other kinds of chroma prediction blocks. Here, the luminance block is a corresponding luminance block of one chrominance block.
(c) For example, luma samples inside the current block are downsampled in the same way as in JEM, while luma samples outside the current block are downsampled in a different way.
Example 2. In one example, how the external luma samples are downsampled may depend on their location.
(a) In one example, the downsampled luma samples may be used to derive the prediction block. Here, the external luminance samples may be neighboring luminance samples or non-neighboring luminance samples with respect to the current luminance block to be coded.
(b) The down-sampled luminance samples can be used to derive the LM parameters. Here, the external luminance samples are those that do not lie in the corresponding luminance block of the current chrominance block.
(c) In one example, luma samples to the left of the current block or above the current block are downsampled in a different manner.
(d) In one example, the luminance samples are down-sampled as shown in fig. 13A and 13B as specified below.
(i) The luma samples inside the current block are downsampled in the same way as in JEM.
(ii) The luma samples outside and above the current block are downsampled to position C or D in fig. 4. Alternatively, the luma samples are downsampled to position C in fig. 4 with a filter. Assuming the upper luma samples adjacent to the current block are denoted a[i], then d[i] = (a[2i-1] + 2*a[2i] + a[2i+1] + 2) >> 2, where d[i] denotes the downsampled luma samples.
(1) If sample a[2i-1] is not available, d[i] = (3*a[2i] + a[2i+1] + 2) >> 2.
(2) If sample a[2i+1] is not available, d[i] = (a[2i-1] + 3*a[2i] + 2) >> 2.
(iii) The luma samples outside and to the left of the current block are downsampled to position B or D in fig. 4. Alternatively, the luma samples are downsampled to the half position between B and D: assuming the left luma samples adjacent to the current block are denoted a[j], then d[j] = (a[2j] + a[2j+1] + 1) >> 1, where d[j] denotes the downsampled luma samples.
(iv) In one example, W luma downsampled samples are generated, where W is the width of the current chroma block, as shown in fig. 13A.
(1) Alternatively, N × W luminance downsampled samples are generated from the upper adjacent luminance samples, where N is an integer such as 2, as shown in fig. 13B.
(2) Alternatively, W + K luminance downsampled samples are generated from the top adjacent luminance samples, where K is a positive integer.
(3) Alternatively, W/N luminance downsampled samples are generated from the top adjacent luminance samples, where N is an integer such as 2.
(v) In one example, H luma downsampled samples are generated from the left-side neighboring luma samples, where H is the height of the current chroma block, as shown in fig. 13A.
(1) Alternatively, N × H luminance downsampled samples are generated, where N is an integer such as 2, as shown in fig. 13B.
(2) Alternatively, H + K luma downsampled samples are generated, where K is a positive integer.
(3) Alternatively, H/N luminance downsampled samples are generated, where N is an integer such as 2.
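The neighbour down-sampling of Example 2(d) can be sketched as follows: the top row uses the [1 2 1]/4 filter of (ii) with its unavailable-sample fallbacks, and the left column uses the 2-tap average of (iii). Modelling availability by array bounds is an assumption of this sketch.

```python
def downsample_top(a, i):
    """a: full-resolution top neighbour luma row; returns d[i]."""
    centre = a[2 * i]
    left = a[2 * i - 1] if 2 * i - 1 >= 0 else None
    right = a[2 * i + 1] if 2 * i + 1 < len(a) else None
    if left is None:   # fallback (1): a[2i-1] unavailable
        return (3 * centre + right + 2) >> 2
    if right is None:  # fallback (2): a[2i+1] unavailable
        return (left + 3 * centre + 2) >> 2
    return (left + 2 * centre + right + 2) >> 2

def downsample_left(a, j):
    """a: full-resolution left neighbour luma column; returns d[j]."""
    return (a[2 * j] + a[2 * j + 1] + 1) >> 1

row = [10, 20, 30, 40]
print(downsample_top(row, 0), downsample_top(row, 1), downsample_left(row, 0))
```

Both filters keep the arithmetic integer-only with rounding offsets, matching the shift-based style of the formulas above.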
Example 3. In one example, which samples are downsampled, and how many samples are downsampled, for the external/internal luma samples may depend on the block size/block shape.
Example 4. In one example, the prediction block generated by the LM method may be further refined before being used as the predictor for the chroma block.
(a) In one example, reconstructed chroma samples (neighboring or non-neighboring chroma samples) may be further used with the prediction block from the LM method.
(i) In one example, a linear function may be applied with neighboring reconstructed chroma samples and a prediction block from the LM method as inputs, and refined prediction samples as outputs.
(ii) In one example, for certain locations, the prediction block from the LM method may be refined, while for the remaining locations, the prediction block from the LM method may be inherited directly without refinement.
(b) In one example, two prediction blocks may be generated (e.g., one from the LM method and the other from the chroma intra prediction block). However, for certain locations, the final prediction block may be generated from two prediction blocks, and for the remaining locations, the final prediction block may be copied directly from the prediction block from the LM.
(i) In one example, the chroma intra prediction mode may be signaled or derived from the luma intra prediction mode.
(ii) In one example, the "specific location" where two prediction blocks may be used jointly includes the upper rows and/or the left columns.
(c) For example, the boundary filtering may be applied to the LM mode, the MMLM mode, the left LM mode, or the upper LM mode regardless of which down-sampling filter is applied.
(d) In one example, the prediction block from the LM method and the function of reconstructing chroma samples above the current block may be used together to refine the prediction block from the LM method.
(i) Suppose the above-neighboring reconstructed chroma samples of the current block are denoted a[-1][j], and the LM-predicted sample at the i-th row and j-th column is a[i][j]. The prediction sample after boundary filtering is then calculated as a function of a[-1][j] and a[i][j]. In one example, the final prediction at position (i, j) is defined as a'[i][j] = (w1*a[i][j] + w2*a[-1][j] + 2^(N-1)) >> N, where w1 + w2 = 2^N.
(ii) Boundary filtering can only be applied when upper neighboring samples are available.
(iii) In one example, boundary filtering is applied only when i <= K, where K is an integer such as 0 or 1. For example, K = 0 and w1 = w2 = 1. In another example, K = 0, w1 = 3, and w2 = 1.
(iv) In one example, w1 and w2 depend on the row index i. For example, for samples a[0][j], K = 1 and w1 = w2 = 1, while for samples a[1][j], w1 = 3 and w2 = 1.
(e) In one example, the prediction block from the LM method and the function of reconstructed chroma samples to the left of the current block may be used together to refine the prediction block from the LM method.
(i) Suppose the left-neighboring reconstructed chroma samples of the current block are denoted a[i][-1], and the LM-predicted sample at the i-th row and j-th column is a[i][j]. The prediction sample after boundary filtering is then calculated as a function of a[i][-1] and a[i][j]. In one example, the final prediction at position (i, j) is defined as a'[i][j] = (w1*a[i][j] + w2*a[i][-1] + 2^(N-1)) >> N, where w1 + w2 = 2^N.
(ii) In one example, boundary filtering can only be applied when the left-hand neighboring samples are available.
(iii) In one example, boundary filtering is applied only when j <= K, where K is an integer such as 0 or 1. For example, K = 0 and w1 = w2 = 1. In another example, K = 0, w1 = 3, and w2 = 1.
(iv) In one example, w1 and w2 depend on the column index j. For example, for samples a[i][0], K = 1 and w1 = w2 = 1, while for samples a[i][1], w1 = 3 and w2 = 1.
(f) In one example, the prediction block from the LM method and the function of reconstructed chroma samples to the left and above the current block may be used together to refine the prediction block from the LM method.
(i) Suppose the above-neighboring reconstructed chroma samples of the current block are denoted a[-1][j], the left-neighboring reconstructed chroma samples are denoted a[i][-1], and the LM-predicted sample at the i-th row and j-th column is a[i][j]. The prediction sample after boundary filtering is then calculated as a'[i][j] = (w1*a[i][j] + w2*a[i][-1] + w3*a[-1][j] + 2^(N-1)) >> N, where w1 + w2 + w3 = 2^N.
(ii) In one example, boundary filtering can only be applied when both left and top neighboring samples are available.
(iii) In one example, the boundary filtering is applied only when i <= K and j <= P. In another example, the boundary filtering is applied only when i <= K or j <= P.
(iv) In one example, the boundary filtering is applied only to a[0][0], with w1 = 2 and w2 = w3 = 1.
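The boundary refinement of Example 4(d) through (f) amounts to a weighted blend with power-of-two weights. A hedged sketch, using the w1 = 3, w2 = 1 (so N = 2) choice listed in the text:

```python
def refine_boundary(lm_pred, neighbour, w1=3, w2=1, N=2):
    """a'[i][j] = (w1*a[i][j] + w2*neighbour + 2^(N-1)) >> N,
    requiring w1 + w2 == 2^N so the weights normalize exactly."""
    assert w1 + w2 == 1 << N
    return (w1 * lm_pred + w2 * neighbour + (1 << (N - 1))) >> N

# blend the first predicted row with the above-neighbour reconstruction
print([refine_boundary(p, n) for p, n in zip([100, 104], [80, 120])])
```

Because the weights sum to a power of two, the blend needs only shifts and adds, consistent with the rest of the LM arithmetic.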
Example 5. In one example, whether and how the LM method is applied may depend on the size or shape of the current block. Let W and H denote the width and height of the current chroma block, respectively, and let T1 and T2 be thresholds.
(a) In one example, the LM mode (or MMLM mode, or left LM mode, or upper LM mode) is not applied when W <= T1 and H <= T2. For example, T1 = T2 = 4.
(b) In one example, the LM mode (or MMLM mode, or left LM mode, or upper LM mode) is not applied when W <= T1 or H <= T2. For example, T1 = T2 = 2.
(c) In one example, the LM mode (or MMLM mode, or left LM mode, or upper LM mode) is not applied when W <= T1 or H <= T2. For example, T1 = T2 = 4.
(d) In one example, the LM mode (or MMLM mode, or left LM mode, or upper LM mode) is not applied when W + H <= T1. For example, T1 = 6.
(e) In one example, the LM mode (or MMLM mode, or left LM mode, or upper LM mode) is not applied when W × H <= T1. For example, T1 = 16.
(f) In one example, the left LM mode is not applied when H <= T1. For example, T1 = 4.
(g) In one example, the upper LM mode is not applied when W <= T1. For example, T1 = 4.
(h) T1 and/or T2 may be predefined or signaled in SPS, sequence header, PPS, picture header, VPS, slice header, CTU, CU, or CTU group.
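The size gates of Example 5 are simple threshold predicates. A hedged sketch of rules (b) and (f) with the example threshold values from the text:

```python
def lm_allowed(W, H, T1=2, T2=2):
    """Rule (b): LM-family modes disabled when W <= T1 or H <= T2."""
    return not (W <= T1 or H <= T2)

def left_lm_allowed(H, T1=4):
    """Rule (f): left LM mode disabled when H <= T1."""
    return H > T1

print(lm_allowed(4, 4), lm_allowed(2, 8), left_lm_allowed(4))
```

In a real codec the thresholds would be fixed or parsed from the SPS/PPS/slice header as noted in (h); the defaults here are only the example values.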
Example 6. In one example, whether and how the proposed single-line LM methods are applied may depend on the location of the current block.
(a) In one example, one or more of the proposed methods are applied to CUs located at the upper boundary of the current CTU, as shown in fig. 8A.
(b) In one example, one or more of the proposed methods are applied to CUs located at the left boundary of the current CTU, as shown in fig. 8B.
(c) In one example, one or more of the proposed methods are applied to both cases.
(d) In one example, one or more of the proposed methods are applied to CUs located at the upper boundary of a region such as a 64 × 64 block.
(e) In one example, one or more of the proposed methods are applied to CUs located at the left boundary of an area such as a 64 × 64 block.
The examples described above may be incorporated in the context of the methods described below (e.g., method 1400, which may be implemented at a video encoder and/or decoder).
Example 7: The upper LM neighboring samples are downsampled by one filtering method (e.g., the filtering method defined in bullet 2.d.ii), making use of those luma samples that are available to the normal intra prediction process (e.g., the 2W samples above and above-right of the current block in VVC), while the left LM neighboring samples are downsampled by a different filtering method (e.g., the filtering method defined in JEM or VTM-2.0).
a. Fig. 14 shows an example of how different down-sampling methods can be combined together.
Example 8: In an alternative to bullet 2.d.ii.1, the luma samples are downsampled to position C in fig. 4 using a filter. Suppose the upper luma samples adjacent to the current block are denoted a[i]. If i > 0, d[i] = (a[2i-1] + 2*a[2i] + a[2i+1] + offset0) >> 2; otherwise (i == 0), d[i] = (3*a[2i] + a[2i+1] + offset1) >> 2.
a. Alternatively, if i == 0, then d[i] = a[2i].
b. Alternatively, if i == 0, then d[i] = (a[2i] + a[2i+1] + offset2) >> 1.
c. In one example, offset0 = offset1 = 2 and offset2 = 1.
Example 9: in some embodiments, the number of downsampled luma samples above or to the left of the current block may depend on the size of the current block. Assuming that the width and height of the current chroma block are denoted as W and H:
a. for example, if W == H, then W luma samples above the current block are downsampled and H luma samples to the left of the current block are downsampled;
b. for example, if W == H, then 2*W luma samples above the current block are downsampled and 2*H luma samples to the left of the current block are downsampled;
c. for example, if W < H, then 2*W luma samples above the current block are downsampled and H luma samples to the left of the current block are downsampled;
d. for example, if W <= H, then 2*W luma samples above the current block are downsampled and H luma samples to the left of the current block are downsampled;
e. for example, if W > H, then 2*W luma samples above the current block are downsampled and H luma samples to the left of the current block are downsampled;
f. for example, if W >= H, then 2*W luma samples above the current block are downsampled and H luma samples to the left of the current block are downsampled;
g. for example, if W < H, then W luma samples above the current block are downsampled and 2*H luma samples to the left of the current block are downsampled;
h. for example, if W <= H, then W luma samples above the current block are downsampled and 2*H luma samples to the left of the current block are downsampled;
i. for example, if W > H, then W luma samples above the current block are downsampled and 2*H luma samples to the left of the current block are downsampled;
j. for example, if W >= H, then W luma samples above the current block are downsampled and 2*H luma samples to the left of the current block are downsampled;
k. for example, 2*W luma samples above the current block are downsampled only if both the above neighboring block and the above-right neighboring block are available; otherwise, W luma samples above the current block are downsampled.
l. for example, 2*H luma samples to the left of the current block are downsampled only if both the left neighboring block and the below-left neighboring block are available; otherwise, H luma samples to the left of the current block are downsampled.
Let W' (W' equal to W or 2W) denote the number of down-sampled luma samples above the current block and H' (H' equal to H or 2H) the number of down-sampled luma samples to the left of the current block. If W' is not equal to H', the down-sampled luma sample set with more samples is decimated to match the number of samples in the down-sampled luma sample set with fewer samples, as defined in JEM or VTM-2.0.
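One way to read the availability-dependent variants of Example 9 (bullets k and l) is sketched below; the function name and boolean flags are illustrative, not from the text:

```python
def num_downsampled_neighbors(W, H, above_avail, above_right_avail,
                              left_avail, below_left_avail):
    """Example 9, bullets k and l: use 2W top samples only when both the
    above and above-right neighboring blocks are available, and 2H left
    samples only when both the left and below-left neighbors are available."""
    num_top = 2 * W if (above_avail and above_right_avail) else W
    num_left = 2 * H if (left_avail and below_left_avail) else H
    return num_top, num_left
```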
Example 10: in some embodiments, the decimation process when W is not equal to H is different from the decimation process in JEM or VTM-2.0.
a. For example, if W > H, then H leftmost top neighboring samples and H left neighboring samples are involved in the training process.
b. For example, if W > H, then H rightmost top neighboring samples and H left neighboring samples are involved in the training process.
c. For example, if W > H and W == H×n, then W top neighboring samples and H left neighboring samples are involved in the training process, and each left neighboring sample appears n times in the training process.
d. For example, if W < H, then W uppermost left-side neighboring samples and W top neighboring samples are involved in the training process.
e. For example, if W < H, then W lowest left-side neighboring samples and W top neighboring samples are involved in the training process.
f. For example, if W < H and H == W×n, then W top neighboring samples and H left neighboring samples are involved in the training process, and each top neighboring sample appears n times in the training process.
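The repetition-based decimation of bullets c and f can be sketched as follows (illustrative only; it assumes the longer side is an integer multiple of the shorter one, as the bullets state):

```python
def training_samples(top, left):
    """Example 10, bullets c and f: when W = n*H (or H = n*W), each sample
    of the shorter side appears n times so both sides contribute equally
    to the training process."""
    W, H = len(top), len(left)
    if W > H and W % H == 0:
        left = [s for s in left for _ in range(W // H)]  # repeat each left sample n times
    elif H > W and H % W == 0:
        top = [s for s in top for _ in range(H // W)]    # repeat each top sample n times
    return top + left
```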
Example 11: a plurality of rows of luminance samples, at least one of which is not adjacent to the current block, are used to derive parameters used in the CCLM mode or some variant of the CCLM mode, such as MM-CCLM or MDLM.
a. Luma samples not used for directional intra-prediction and multi-reference row intra-prediction (MRLIP) cannot be used to derive parameters used in the CCLM mode or some variants of the CCLM mode, such as MM-CCLM or MDLM.
b. In one example, a row of luma samples adjacent to the current block above and a row of luma samples two rows above the current block are used to derive parameters used in the CCLM mode or some variant of the CCLM mode (e.g., MM-CCLM or MDLM).
c. In one example, a row of luminance samples adjacent to the current block on the left side and a row of luminance samples of two rows on the left side of the current block are used to derive parameters used in the CCLM mode or some variant of the CCLM mode (e.g., MM-CCLM or MDLM).
d. Fig. 15A-15H show several examples of sets of neighboring luma samples (shaded samples outside the current block) for deriving parameters used in CCLM mode or some variant of CCLM mode (like MM-CCLM or MDLM).
Example 12: the down-sampled luma samples to the left of the current block may be calculated as
Rec′L[x,y] = ( w1·RecL[2x+1,2y] + w2·RecL[2x+1,2y+1] + w3·RecL[2x-1,2y] + w4·RecL[2x-1,2y+1] + 2^(N-1) ) >> N
where w1 + w2 + w3 + w4 = 2^N. In one example, w1 = w2 = w3 = w4 = 1 and N = 2. Fig. 16A-16B illustrate this example.
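A non-normative sketch of the Example 12 left-column down-sampling (the dict-based access to reconstructed luma is mine; with x = -1, the luma columns involved are -1 and -3):

```python
def downsample_left_sample(rec, y, w=(1, 1, 1, 1), N=2):
    """Example 12: Rec'L[x,y] = (w1*RecL[2x+1,2y] + w2*RecL[2x+1,2y+1]
    + w3*RecL[2x-1,2y] + w4*RecL[2x-1,2y+1] + 2^(N-1)) >> N, at x = -1."""
    w1, w2, w3, w4 = w
    x = -1
    return (w1 * rec[2*x + 1, 2*y] + w2 * rec[2*x + 1, 2*y + 1]
            + w3 * rec[2*x - 1, 2*y] + w4 * rec[2*x - 1, 2*y + 1]
            + (1 << (N - 1))) >> N
```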
Example 13: whether and/or how a CCLM mode (such as LM, LM-A, LM-T) is applied to a block may depend on the number of available neighboring samples and/or the size of the block.
a. In one example, a block may refer to a chroma codec block.
b. In one example, if one or more particular CCLM modes (such as LM, LM-A, LM-T) are not applicable to a block (e.g., according to the number of available neighboring samples and/or the block size), syntax element(s) (such as a flag or mode representation) indicating the particular CCLM mode(s) may not be signaled, and the particular CCLM mode(s) are inferred as not being applied.
c. In one example, if one or more particular CCLM modes (such as LM, LM-A, LM-T) are not applicable to a block (e.g., according to the number of available neighboring samples and/or the block size), syntax element(s) (such as a flag or mode representation) indicating the CCLM mode(s) may be signaled, but in a conforming bitstream the syntax element(s) shall indicate that the CCLM mode(s) are not applied.
d. In one example, if one or more particular CCLM modes (such as LM, LM-A, LM-T) are not applicable to a block (e.g., according to the number of available neighboring samples and/or the block size), syntax element(s) (such as a flag or mode representation) indicating the CCLM mode(s) may be signaled, but the signaling may be ignored by the decoder, and the particular CCLM mode(s) are inferred as not being applied.
e. In one example, the neighboring samples may refer to chroma neighboring samples.
i. Alternatively, the neighboring samples may refer to corresponding luminance neighboring samples that may be downsampled (e.g., according to a color format).
f. In one example, if one or more particular CCLM modes (such as LM, LM-A, LM-T) are not applicable to a block (e.g., according to the number of available neighboring samples and/or the block size), but the particular CCLM mode is signaled, the parameter derivation process for the CCLM is completed in a particular manner.
i. In one example, parameter a is set equal to 0 and parameter b is set equal to a fixed value, such as 1 << (BitDepth - 1).
g. In one example, the LM mode and/or LM-A mode and/or LM-T mode are not applicable when the number of available neighboring samples is less than T, where T is an integer such as 4.
h. In one example, if the width of a block is equal to 2 and the left-side neighboring block is not available, the LM mode is not applicable.
i. In one example, if the height of a block is equal to 2 and an above adjacent block is not available, the LM mode is not applicable.
j. In one example, if the width of a block is equal to 2, the LM-T mode does not apply.
i. Alternatively, if the width of a block is equal to 2 and the upper right neighboring block is not available, the LM-T mode is not applicable.
k. In one example, if the height of a block is equal to 2, the LM-L mode does not apply.
i. Alternatively, if the height of a block is equal to 2 and the lower left adjacent block is not available, the LM-L mode is not applicable.
In one example, if the LM mode does not apply, the LM-L mode does not apply.
m. in one example, if the LM mode does not apply, the LM-T mode does not apply.
n. in one example, the "available neighboring samples" may be those from the existing top and/or left side samples according to the selected reference row.
In one example, the "available neighboring samples" may be those from selected locations according to the selected reference line and the CCLM parameter derivation rule (e.g., pSelComp[]).
The above methods may also be applicable to the Local Illumination Compensation (LIC) process, where LIC may be disabled depending on the number of available neighboring samples.
i. Alternatively, LIC may be enabled depending on the number of available neighboring samples, but with certain linear model parameters (e.g., a = 1, b = 0) regardless of the values of the neighboring samples.
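A hypothetical combination of bullets g, h and i of Example 13, sketched for illustration (the rule set, function name, and T default are assumptions drawn from those bullets, not normative conditions):

```python
def lm_mode_applicable(W, H, num_avail, left_avail, above_avail, T=4):
    """Example 13 sketch: LM is inferred as not applied when fewer than T
    neighboring samples are available (bullet g), when W == 2 with no left
    neighbor (bullet h), or when H == 2 with no above neighbor (bullet i)."""
    if num_avail < T:
        return False
    if W == 2 and not left_avail:
        return False
    if H == 2 and not above_avail:
        return False
    return True
```

When this returns False, the text above gives two fallbacks: do not signal the mode at all, or signal it but derive the degenerate parameters a = 0, b = 1 << (BitDepth - 1).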
FIG. 14 illustrates a flow diagram of an exemplary method for cross-component prediction. The method 1400 includes, at step 1410, receiving a bitstream representation of a current block of video data that includes a luma component and a chroma component.
Additional embodiments of 4-Cross-component prediction
Example 1
In some embodiments, luma samples inside the current block are downsampled in the same manner as in JEM.
In some embodiments, the luma samples outside the current block and above it are down-sampled to position C in fig. 4 using a filter. Suppose the above luma samples adjacent to the current block are denoted as a[i]; then d[i] = (a[2i-1] + 2*a[2i] + a[2i+1] + 2) >> 2, where d[i] represents the down-sampled luma samples. If the sample a[2i-1] is not available, d[i] = (3*a[2i] + a[2i+1] + 2) >> 2.
In some embodiments, the luma samples outside the current block and to its left are down-sampled to the half position between B and D, as shown in fig. 4. Suppose the left luma samples adjacent to the current block are denoted as a[j]; then d[j] = (a[2j] + a[2j+1] + 1) >> 1, where d[j] represents the down-sampled luma samples.
In some embodiments, W luma downsampled samples from the top-adjacent luma samples and H luma downsampled samples from the left-adjacent luma samples are generated, where W and H are the width and height of the current chroma block, as shown in fig. 13A.
In some embodiments, 2W luma downsampled samples from the top-adjacent luma samples and 2H luma downsampled samples from the left-adjacent luma samples are generated, where W and H are the width and height of the current chroma block, as shown in fig. 13B.
To constrain the neighboring luminance samples required for the training process to a single row, a down-sampling filter with fewer taps is applied, as shown in fig. 3A:
-for upper adjacent luminance samples:
Rec'L[x,y] = ( 2×RecL[2x,2y+1] + RecL[2x-1,2y+1] + RecL[2x+1,2y+1] + 2 ) >> 2. (2)
for left-side neighboring luminance samples:
Rec'L[x,y] = ( RecL[2x+1,2y] + RecL[2x+1,2y+1] + 1 ) >> 1. (3)
the luminance samples inside the block are still down-sampled with a six-tap filter.
Two solutions that use different sets of neighboring luma samples are provided in this application. Assume the width and height of a block are denoted as W and H, respectively. In solution #1, shown in fig. 17A, the training process involves W top neighboring samples and H left neighboring samples. In solution #2, shown in fig. 17B, 2W top neighboring samples and 2H left neighboring samples are involved. It should be noted that the extended neighboring samples in solution #2 have already been used by wide-angle intra prediction.
Further, solution #3, shown in figs. 18A to 18C, is provided based on solution #2. If W <= H, 2W top neighboring samples are involved; otherwise, W top neighboring samples are involved. If H <= W, 2H left neighboring samples are involved; otherwise, H left neighboring samples are involved.
As a further study, solutions #1A, #2A and #3A were provided, which applied the same methods of solution #1, solution #2 and solution #3, respectively, but only to top-adjacent spots. In solutions #1A, #2A and #3A, the left-hand neighboring samples are downsampled as in VTM-2.0, i.e., H left-hand neighboring luminance samples are downsampled by the 6-tap filter.
Example #2
Embodiments of the CCLM sample Point number constraint in the working draft (example changes are marked; deletions are shown using double bold brackets, i.e., [ [ a ] ] to indicate "a" is deleted, and additions are shown using double bold brackets, i.e., { { a } } to indicate "a" is added to the specification.)
XXX. Specification of INTRA prediction modes of INTRA _ LT _ CCLM, INTRA _ L _ CCLM and INTRA _ T _ CCLM
The inputs to this process are:
–...
the output of this process is the predicted samples predSamples[ x ][ y ], with x = 0..nTbW - 1, y = 0..nTbH - 1.
The current luminance position (xTbY, yTbY) is derived as follows:
(xTbY,yTbY)=(xTbC<<1,yTbC<<1) (8-156)
the variables avail L, avail T and avail TL are derived as follows:
–...
the number of available top-right neighboring chroma samples numTopRight is derived as follows:
the variable numTopRight is set equal to 0 and availTR is set equal to TRUE.
When predModeIntra is equal to INTRA_T_CCLM, the following applies for x = nTbW..2*nTbW - 1 until availTR is equal to FALSE or x is equal to 2*nTbW - 1:
–…
the number of available below-left neighboring chroma samples numLeftBelow is derived as follows:
–…
the number of available neighboring chroma samples on the top and top-right, numSampT, and the number of available neighboring chroma samples on the left and below-left, numSampL, are derived as follows:
if predModeIntra is equal to INTRA _ LT _ CCLM, then the following applies:
numSampT = availT ? nTbW : 0 (8-157)
numSampL = availL ? nTbH : 0 (8-158)
otherwise, the following applies:
numSampT=(availT&&predModeIntra==INTRA_T_CCLM)?(nTbW+Min(numTopRight,nTbH)):0 (8-159)
numSampL = ( availL && predModeIntra == INTRA_L_CCLM ) ? ( nTbH + Min( numLeftBelow, nTbW ) ) : 0 (8-160)
the variable bCTUboundary is derived as follows:
bCTUboundary = ( yTbC & ( 1 << ( CtbLog2SizeY - 1 ) - 1 ) == 0 ) ? TRUE : FALSE. (8-161)
the variables cntN and array pickPosN, where N is replaced by L and T, are derived as follows:
The variable numIs4N is set equal to ( ( availT && availL && predModeIntra == INTRA_LT_CCLM ) ? 0 : 1 ).
The variable startPosN is set equal to numSampN >> ( 2 + numIs4N ).
The variable pickStepN is set equal to Max( 1, numSampN >> ( 1 + numIs4N ) ).
- If availN is equal to TRUE and predModeIntra is equal to INTRA_LT_CCLM or INTRA_N_CCLM, the following assignments are made:
- cntN is set equal to Min( numSampN, ( 1 + numIs4N ) << 1 ).
- pickPosN[ pos ] is set equal to ( startPosN + pos * pickStepN ), with pos = 0..cntN - 1.
Else, cntN is set equal to 0.
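The cntN/pickPosN derivation quoted above selects up to (1 + numIs4N) << 1 evenly spaced neighboring positions; a non-normative Python sketch (availN and mode_matches stand in for the availability and mode checks in the draft text):

```python
def derive_pick_pos(numSampN, availN, mode_matches, numIs4N):
    """Sketch of the working-draft cntN / pickPosN[] derivation."""
    startPosN = numSampN >> (2 + numIs4N)            # first picked position
    pickStepN = max(1, numSampN >> (1 + numIs4N))    # spacing between picks
    if availN and mode_matches:
        cntN = min(numSampN, (1 + numIs4N) << 1)     # 2 or 4 samples picked
        return cntN, [startPosN + pos * pickStepN for pos in range(cntN)]
    return 0, []
```

For example, with 8 available samples and numIs4N = 0, two positions {2, 6} are picked; with numIs4N = 1, four positions {1, 3, 5, 7}.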
The predicted samples predSamples[ x ][ y ], with x = 0..nTbW - 1, y = 0..nTbH - 1, are derived as follows:
-if both numSampL and numSampT are equal to 0, then the following applies:
predSamples[x][y]=1<<(BitDepthC-1) (8-162)
otherwise, the following ordered steps apply:
1. The luma samples pY[ x ][ y ], with x = 0..nTbW*2 - 1, y = 0..nTbH*2 - 1, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations ( xTbY + x, yTbY + y ).
2. The neighboring luma samples pY[ x ][ y ] are derived as follows:
- When numSampL is greater than 0, the neighboring left luma samples pY[ x ][ y ], with x = -1..-3, y = 0..2*numSampL - 1, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations ( xTbY + x, yTbY + y ).
- When numSampT is greater than 0, the neighboring top luma samples pY[ x ][ y ], with x = 0..2*numSampT - 1, y = -1, -2, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations ( xTbY + x, yTbY + y ).
- When availTL is equal to TRUE, the top-left luma sample pY[ x ][ y ], with x = -1, y = -1, -2, is set equal to the reconstructed luma samples prior to the deblocking filter process at the locations ( xTbY + x, yTbY + y ).
3. The downsampled collocated luminance sample pDsY [ x ] [ y ], where x is 0.. nTbW-1 and y is 0.. nTbH-1, is derived as follows:
4. When numSampL is greater than 0, the selected neighboring left chroma samples pSelC[ idx ] are set equal to p[ -1 ][ pickPosL[ idx ] ], with idx = 0..cntL - 1, and the selected down-sampled neighboring left luma samples pSelDsY[ idx ], with idx = 0..cntL - 1, are derived as follows:
- The variable y is set equal to pickPosL[ idx ].
If sps _ cclm _ colocated _ chroma _ flag is equal to 1, then the following applies:
otherwise, the following applies:
pSelDsY[ idx ] = ( pY[ -1 ][ 2*y ] + pY[ -1 ][ 2*y + 1 ] + 2*pY[ -2 ][ 2*y ] + 2*pY[ -2 ][ 2*y + 1 ] + pY[ -3 ][ 2*y ] + pY[ -3 ][ 2*y + 1 ] + 4 ) >> 3 (8-178)
5. When numSampT is greater than 0, the selected neighboring top chroma samples pSelC[ idx ] are set equal to p[ pickPosT[ idx - cntL ] ][ -1 ], with idx = cntL..cntL + cntT - 1, and the down-sampled neighboring top luma samples pSelDsY[ idx ], with idx = 0..cntL + cntT - 1, are specified as follows:
- The variable x is set equal to pickPosT[ idx - cntL ].
If sps _ cclm _ colocated _ chroma _ flag is equal to 1, then the following applies:
6. When cntT + cntL is not [[ equal to 0 ]] {{ less than Threshold1 }}, the variables minY, maxY, minC and maxC are derived as follows:
- [[ When cntT + cntL is equal to 2, pSelComp[ 3 ] is set equal to pSelComp[ 0 ], pSelComp[ 2 ] is set equal to pSelComp[ 1 ], pSelComp[ 0 ] is set equal to pSelComp[ 1 ], and pSelComp[ 1 ] is set equal to pSelComp[ 3 ], with Comp being replaced by DsY and C. ]]
Arrays minGrpIdx and maxGrpIdx are set as follows:
–minGrpIdx[0]=0。
–minGrpIdx[1]=2。
–maxGrpIdx[0]=1。
–maxGrpIdx[1]=3。
- When pSelDsY[ minGrpIdx[ 0 ] ] is greater than pSelDsY[ minGrpIdx[ 1 ] ], minGrpIdx[ 0 ] and minGrpIdx[ 1 ] are swapped: ( minGrpIdx[ 0 ], minGrpIdx[ 1 ] ) = Swap( minGrpIdx[ 0 ], minGrpIdx[ 1 ] ).
- When pSelDsY[ maxGrpIdx[ 0 ] ] is greater than pSelDsY[ maxGrpIdx[ 1 ] ], maxGrpIdx[ 0 ] and maxGrpIdx[ 1 ] are swapped: ( maxGrpIdx[ 0 ], maxGrpIdx[ 1 ] ) = Swap( maxGrpIdx[ 0 ], maxGrpIdx[ 1 ] ).
- When pSelDsY[ minGrpIdx[ 0 ] ] is greater than pSelDsY[ maxGrpIdx[ 1 ] ], the arrays minGrpIdx and maxGrpIdx are swapped: ( minGrpIdx, maxGrpIdx ) = Swap( minGrpIdx, maxGrpIdx ).
- When pSelDsY[ minGrpIdx[ 1 ] ] is greater than pSelDsY[ maxGrpIdx[ 0 ] ], minGrpIdx[ 1 ] and maxGrpIdx[ 0 ] are swapped: ( minGrpIdx[ 1 ], maxGrpIdx[ 0 ] ) = Swap( minGrpIdx[ 1 ], maxGrpIdx[ 0 ] ).
–maxY=(pSelDsY[maxGrpIdx[0]]+pSelDsY[maxGrpIdx[1]]+1)>>1。
–maxC=(pSelC[maxGrpIdx[0]]+pSelC[maxGrpIdx[1]]+1)>>1。
–minY=(pSelDsY[minGrpIdx[0]]+pSelDsY[minGrpIdx[1]]+1)>>1。
–minC=(pSelC[minGrpIdx[0]]+pSelC[minGrpIdx[1]]+1)>>1。
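The Swap()-based network in step 6 above leaves the two smallest luma values in the min group and the two largest in the max group; for the averaged anchor points it behaves like fully sorting the four selected (luma, chroma) pairs by luma, as in this non-normative sketch:

```python
def min_max_anchors(pSelDsY, pSelC):
    """Average the two smallest and two largest (luma, chroma) pairs to
    obtain (minY, minC) and (maxY, maxC), as in step 6 of the draft text."""
    pairs = sorted(zip(pSelDsY, pSelC))  # sort the 4 pairs by luma value
    minY = (pairs[0][0] + pairs[1][0] + 1) >> 1
    minC = (pairs[0][1] + pairs[1][1] + 1) >> 1
    maxY = (pairs[2][0] + pairs[3][0] + 1) >> 1
    maxC = (pairs[2][1] + pairs[3][1] + 1) >> 1
    return minY, maxY, minC, maxC
```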
7. The variables a, b and k are derived as follows:
If [[ numSampL is equal to 0 and numSampT is equal to 0 ]] {{ cntT + cntL is less than Threshold1 }}, the following applies:
k=0 (8-208)
a=0 (8-209)
b=1<<(BitDepthC-1) (8-210)
otherwise, the following applies:
8. The predicted samples predSamples[ x ][ y ], with x = 0..nTbW - 1, y = 0..nTbH - 1, are derived as follows:
predSamples[ x ][ y ] = Clip1C( ( ( pDsY[ x ][ y ] * a ) >> k ) + b ) (8-225)
In one example, Threshold1 is set to 4.
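Steps 7-8 above reduce to applying the linear model per sample and clipping; a non-normative sketch (the explicit clipping range stands in for Clip1C, and the fallback a = 0, k = 0, b = 1 << (BitDepth - 1) applies when cntT + cntL is below Threshold1):

```python
def predict_chroma(pDsY, a, b, k, bit_depth=10):
    """Step 8: predSamples[x][y] = Clip1C(((pDsY[x][y] * a) >> k) + b)."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(((y * a) >> k) + b, 0), max_val) for y in row]
            for row in pDsY]
```

With the fallback parameters, every predicted chroma sample becomes the mid-range value 1 << (BitDepth - 1).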
5 example methods and technical solutions of the disclosed technology
Fig. 19 shows an example method 1900 for video processing. The method 1900 includes, at operation 1910, determining rows of luma samples for a transition between a video and a bitstream representation of the video for deriving parameters of a cross-component linear model for predicting samples of chroma blocks.
The method 1900 includes, at operation 1920, performing a conversion based on the predicted samples of the chroma block. In some embodiments, at least one of the plurality of rows is not adjacent to a luma block collocated with a chroma block.
In some embodiments, the following technical solutions may be implemented:
A1. A method for video processing, comprising: determining a plurality of rows of luma samples, for a conversion between a video and a bitstream representation of the video, for deriving parameters of a cross-component linear model used for predicting samples of a chroma block; and performing the conversion based on the predicted samples of the chroma block, wherein at least one of the plurality of rows is not adjacent to the luma block collocated with the chroma block. Here, depending on the operation, a block may represent a set of samples such as a codec unit, a transform unit, or a prediction unit.
A2. The method of solution a1, wherein the cross-component linear model comprises a multi-directional linear model.
A3. The method of solution a1 or a2, wherein the plurality of rows of luminance samples exclude rows that satisfy exclusion criteria or include rows that satisfy inclusion criteria.
A4. The method of solution a3, wherein the exclusion criteria include excluding luma samples that are not used for directional intra prediction and multi-reference row intra prediction.
A5. The method of solution a3, wherein the inclusion criteria comprises two rows above the luminance block.
A6. The method of solution a3, wherein the inclusion criteria includes two rows to the left of the luminance block.
A7. The method of solution a3, wherein the inclusion criteria includes a first row immediately above the luminance block, a second row immediately to the left of the luminance block, and a third row two rows away from the second row.
A8. The method of solution a1 or a2, wherein the plurality of rows includes a first set of sample points generated by downsampling a second set of sample points from the plurality of rows.
A9. The method of solution A8, wherein the coordinates (x, y) of each of the first set of samples are calculated as
Rec′L[x,y] = ( w1·RecL[2x+1,2y] + w2·RecL[2x+1,2y+1] + w3·RecL[2x-1,2y] + w4·RecL[2x-1,2y+1] + 2^(N-1) ) >> N,
wherein w1 + w2 + w3 + w4 = 2^N, and wherein N is a positive integer.
A10. The method of solution A9, wherein w1 = w2 = w3 = w4 = 1 and N = 2.
A11. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any of solutions a 1-a 10.
A12. A computer program product stored on a non-transitory computer readable medium, the computer program product comprising program code for performing the method of any of solutions a 1-a 10.
In some embodiments, the following technical solutions may be implemented:
B1. a method of video coding comprising: during a transition between a current block of video and a bitstream representation of the current block, using parameters of a linear model based on a first set of samples generated by down-sampling a second set of samples of a luma component; and performing a conversion between the bitstream representation and the current block based on the parameters of the linear model.
B2. The method of solution B1, wherein the number of downsampled luma samples from the second set above or to the left of the current block depends on the size of the current block.
B3. The method of solution B2, wherein the current block is a chroma block with a width of W samples and a height of H samples, where W and H are integers, and wherein, for W equal to H, W luma samples above the current block and H luma samples to the left of the current block are downsampled.
B4. The method according to solution B1, wherein two different downsampling schemes are used for the case where the current block is square and the current block is rectangular.
B5. The method according to solution B4, wherein the current block has a width W and a height H samples, wherein W and H are integers, and wherein W > H, and wherein H leftmost top adjacent samples and H left adjacent samples are used for the training process.
B6. The method according to solution B4, wherein the current block has a width W and a height H samples, wherein W and H are integers, and wherein W > H, and wherein H rightmost top adjacent samples and H left adjacent samples are involved in the training process.
In the above solution, the conversion may comprise video decoding or decompression, wherein the pixel values of the current block are generated from its bitstream representation. In the above solution, the conversion may comprise a video encoding or compression operation, wherein the bitstream representation is generated from the current block.
6 example implementation of the disclosed technology
Fig. 20 is a block diagram of the video processing apparatus 2000. The apparatus 2000 may be used to implement one or more of the methods described herein. The apparatus 2000 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, or the like. The apparatus 2000 may include one or more processors 2002, one or more memories 2004, and video processing hardware 2006. Processor(s) 2002 may be configured to implement one or more methods described in this document (including, but not limited to, method 1900). The memory (es) 2004 may be used to store data and code for implementing the methods and techniques described herein. Video processing hardware 2006 may be used to implement some of the techniques described in this document in hardware circuits.
In some embodiments, the video codec method may be implemented using an apparatus implemented on a hardware platform as described with reference to fig. 20.
Some embodiments of the disclosed technology include making a decision or determination to enable a video processing tool or mode. In an example, when a video processing tool or mode is enabled, the encoder will use or implement the tool or mode in the processing of the video blocks, but does not necessarily modify the resulting bitstream based on the use of the tool or mode. That is, when the video processing tool or mode is enabled based on the decision or determination, the conversion from the video block to the bitstream representation of the video will use the video processing tool or mode. In another example, when a video processing tool or mode is enabled, the decoder will process the bitstream knowing that the bitstream has been modified based on the video processing tool or mode. That is, the conversion from the bitstream representation of the video to the video blocks will be performed using the video processing tools or modes that are enabled based on the decision or determination.
Some embodiments of the disclosed technology include making a decision or determination to disable a video processing tool or mode. In an example, when a video processing tool or mode is disabled, the encoder will not use that tool or mode in the conversion of video blocks to bitstream representations of video. In another example, when a video processing tool or mode is disabled, the decoder will process the bitstream knowing that the bitstream was not modified using the video processing tool or mode that was enabled based on the decision or determination.
Fig. 21 is a block diagram illustrating an example video processing system 2100 in which various techniques disclosed herein may be implemented. Various embodiments may include some or all of the components of system 2100. The system 2100 may include an input 2102 for receiving video content. The video content may be received in a raw or uncompressed format, e.g., 8 or 10 bit multi-component pixel values, or may be received in a compressed or encoded format. Input 2102 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interfaces include wired interfaces such as ethernet, Passive Optical Network (PON), etc., and wireless interfaces such as Wi-Fi or cellular interfaces.
The system 2100 can include a codec component 2104 that can implement various codec (coding) or encoding methods described in this document. The codec component 2104 can reduce the average bit rate of the video from the input 2102 to the output of the codec component 2104 to produce a codec representation of the video. Codec techniques are therefore sometimes referred to as video compression or video transcoding techniques. The output of the codec component 2104 can be stored or transmitted via a communication connection, as represented by the component 2106. The stored or communicated bitstream (or codec) representation of the video received at the input 2102 can be used by the component 2108 to generate pixel values or displayable video that is sent to the display interface 2110. The process of generating user-viewable video from the bitstream representation is sometimes referred to as video decompression. Further, while certain video processing operations are referred to as "codec" operations or tools, it should be understood that the codec tools or operations are used at the encoder, and the corresponding decoding tools or operations that reverse the codec results will be performed by the decoder.
Examples of a peripheral bus interface or display interface may include a Universal Serial Bus (USB), a High Definition Multimedia Interface (HDMI), or DisplayPort, among others. Examples of storage interfaces include SATA (Serial Advanced Technology Attachment), PCI, IDE interfaces, and the like. The techniques described in this document may be embodied in various electronic devices, such as mobile phones, laptop computers, smartphones, or other devices capable of performing digital data processing and/or video display.
From the foregoing it will be appreciated that specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Accordingly, the presently disclosed technology is not limited except as by the appended claims.
Implementations of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible and non-transitory computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a combination of substances which affect a machine-readable propagated signal, or a combination of one or more of them. The term "data processing unit" or "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only some embodiments and examples are described and other embodiments, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims (12)

1. A method for video processing, comprising:
determining, for a conversion between a video and a bitstream representation of the video, a plurality of rows of luma samples for deriving parameters of a cross-component linear model used for predicting samples of a chroma block; and
performing the converting based on predicted samples of the chroma block,
wherein at least one of the plurality of rows is not adjacent to a luma block collocated with the chroma block.
2. The method of claim 1, wherein the cross-component linear model comprises a multi-directional linear model.
3. The method of claim 1 or 2, wherein the plurality of rows of luma samples excludes rows satisfying exclusion criteria or includes rows satisfying inclusion criteria.
4. The method of claim 3, wherein the exclusion criteria include excluding luma samples that are not used for directional intra prediction and multi-reference row intra prediction.
5. The method of claim 3, wherein the inclusion criteria comprises two rows above the luma block.
6. The method of claim 3, wherein the inclusion criteria comprises two rows to the left of the luma block.
7. The method of claim 3, wherein the inclusion criteria comprises a first row immediately above the luma block, a second row immediately to the left of the luma block, and a third row two rows away from the second row.
8. The method of claim 1 or 2, wherein the plurality of rows comprises a first set of samples generated by down-sampling a second set of samples from the plurality of rows.
9. The method of claim 8, wherein each sample of the first set of samples, located at coordinates (x, y), is calculated as
Rec′_L[x, y] = (w1·Rec_L[2x+1, 2y] + w2·Rec_L[2x+1, 2y+1] + w3·Rec_L[2x−1, 2y] + w4·Rec_L[2x−1, 2y+1] + 2^(N−1)) >> N,
wherein w1 + w2 + w3 + w4 = 2^N, and wherein N is a positive integer.
10. The method of claim 9, wherein w1 = w2 = w3 = w4 = 1 and N = 2.
11. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement a method according to one or more of claims 1-10.
12. A computer program product stored on a non-transitory computer readable medium, the computer program product comprising program code for performing a method according to one or more of claims 1 to 10.
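For illustration only (not part of the claims), the weighted luma down-sampling filter described in claims 9 and 10 can be sketched as follows. The function name and the array layout (a 2D grid indexed as rec_l[col][row]) are assumptions made for this example:

```python
def downsample_luma(rec_l, x, y, w=(1, 1, 1, 1), n=2):
    """Down-sample reconstructed luma samples as in claims 9-10.

    rec_l: 2D grid of reconstructed luma samples, indexed rec_l[col][row].
    (x, y): coordinates of the down-sampled output sample.
    w = (w1, w2, w3, w4): weights that must sum to 2**n.
    """
    w1, w2, w3, w4 = w
    assert w1 + w2 + w3 + w4 == 1 << n, "weights must sum to 2^N"
    acc = (w1 * rec_l[2 * x + 1][2 * y]
           + w2 * rec_l[2 * x + 1][2 * y + 1]
           + w3 * rec_l[2 * x - 1][2 * y]
           + w4 * rec_l[2 * x - 1][2 * y + 1]
           + (1 << (n - 1)))  # rounding offset 2^(N-1)
    return acc >> n           # divide by 2^N with rounding

# Example: a flat 8x8 luma grid of value 100 down-samples to 100.
grid = [[100] * 8 for _ in range(8)]
print(downsample_luma(grid, 1, 1))  # -> 100
```

With w1 = w2 = w3 = w4 = 1 and N = 2 (claim 10), the filter reduces to a rounded average of the four neighbouring luma samples.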
CN202080038096.4A 2019-05-22 2020-05-22 Linear model mode parameter derivation with multiple rows Pending CN113875244A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN2019088005 2019-05-22
CNPCT/CN2019/088005 2019-05-22
PCT/CN2020/091830 WO2020233711A1 (en) 2019-05-22 2020-05-22 Linear model mode parameter derivation with multiple lines

Publications (1)

Publication Number Publication Date
CN113875244A true CN113875244A (en) 2021-12-31

Family

ID=73458372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080038096.4A Pending CN113875244A (en) 2019-05-22 2020-05-22 Linear model mode parameter derivation with multiple rows

Country Status (2)

Country Link
CN (1) CN113875244A (en)
WO (1) WO2020233711A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11575909B2 (en) * 2020-04-07 2023-02-07 Tencent America LLC Method and apparatus for video coding
WO2023239676A1 (en) * 2022-06-06 2023-12-14 Beijing Dajia Internet Information Technology Co., Ltd. Improved cross-component prediction for video coding

Citations (3)

Publication number Priority date Publication date Assignee Title
US20160198190A1 (en) * 2011-05-12 2016-07-07 Texas Instruments Incorporated Luma-based chroma intra-prediction for video coding
WO2017214420A1 (en) * 2016-06-08 2017-12-14 Qualcomm Incorporated Implicit coding of reference line index used in intra prediction
WO2018053293A1 (en) * 2016-09-15 2018-03-22 Qualcomm Incorporated Linear model chroma intra prediction for video coding

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
WO2020053805A1 (en) * 2018-09-12 2020-03-19 Beijing Bytedance Network Technology Co., Ltd. Single-line cross component linear model prediction mode

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US20160198190A1 (en) * 2011-05-12 2016-07-07 Texas Instruments Incorporated Luma-based chroma intra-prediction for video coding
US20170359597A1 (en) * 2011-05-12 2017-12-14 Texas Instruments Incorporated Luma-based chroma intra-prediction for video coding
WO2017214420A1 (en) * 2016-06-08 2017-12-14 Qualcomm Incorporated Implicit coding of reference line index used in intra prediction
WO2018053293A1 (en) * 2016-09-15 2018-03-22 Qualcomm Incorporated Linear model chroma intra prediction for video coding

Non-Patent Citations (1)

Title
Laroche, Guillaume: "Non-CE3: On cross-component linear model simplification", Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 11th Meeting: Ljubljana, SI, 10–18 July 2018, Document JVET-K0204-v1 *

Also Published As

Publication number Publication date
WO2020233711A1 (en) 2020-11-26

Similar Documents

Publication Publication Date Title
CN110896478B (en) Downsampling in cross-component linear modeling
CN110839153B (en) Method and device for processing video data
JP7123268B2 (en) Parameter derivation for intra prediction
JP7422757B2 (en) Location-based intra prediction
WO2020169101A1 (en) Neighbouring sample selection for intra prediction
CN113767631B (en) Conditions in parameter derivation for intra prediction
WO2020143825A1 (en) Size dependent cross-component linear model
CN113632474B (en) Parameter derivation for inter-prediction
WO2020233711A1 (en) Linear model mode parameter derivation with multiple lines

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination