LINEAR MODEL MODE PARAMETER DERIVATION WITH MULTIPLE LINES
CROSS-REFERENCE TO RELATED APPLICATION
Under the applicable patent law and/or rules pursuant to the Paris Convention, this application is made to timely claim the priority to and benefits of International Patent Application No. PCT/CN2019/088005 filed on May 22, 2019. For all purposes under the law, the entire disclosures of the aforementioned applications are incorporated by reference as part of the disclosure of this application.
TECHNICAL FIELD
This patent document relates to video coding techniques, devices and systems.
BACKGROUND
In spite of the advances in video compression, digital video still accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.
SUMMARY
Devices, systems and methods related to digital video coding and decoding, and specifically, the derivation of linear model (LM) mode parameters with multiple lines in video coding are described. The described methods may be applied to both the existing video coding standards (e.g., High Efficiency Video Coding (HEVC) ) and future video coding standards (e.g., Versatile Video Coding (VVC) ) or codecs.
In one representative aspect, the disclosed technology may be used to provide a method for video processing. This example method includes determining, for a conversion between a video and a bitstream representation of the video, multiple lines of luma samples for deriving parameters of a cross-component linear model used for predicting samples of a chroma block, and performing the conversion based on the predicted samples of the chroma block, wherein at least one of the multiple lines is non-adjacent to a luma block collocated to the chroma block.
In another representative aspect, the above-described method is embodied in the form of processor-executable code and stored in a computer-readable program medium.
In yet another representative aspect, a device that is configured or operable to perform the above-described method is disclosed. The device may include a processor that is programmed to implement this method.
In yet another representative aspect, a video decoder apparatus may implement a method as described herein.
The above and other aspects and features of the disclosed technology are described in greater detail in the drawings, the description and the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows an example of locations of samples used for the derivation of the weights of the linear model used for cross-component prediction.
FIG. 2 shows an example of classifying neighboring samples into two groups.
FIG. 3A shows an example of a chroma sample and its corresponding luma samples.
FIG. 3B shows an example of down filtering for the cross-component linear model (CCLM) in the Joint Exploration Model (JEM) .
FIG. 4 shows an exemplary arrangement of the four luma samples corresponding to a single chroma sample.
FIGS. 5A and 5B show an example of samples of a 4×4 chroma block with the neighboring samples, and the corresponding luma samples.
FIGS. 6A-6J show examples of CCLM without luma sample down filtering.
FIGS. 7A-7D show examples of CCLM only requiring neighboring luma samples used in normal intra-prediction.
FIGS. 8A and 8B show examples of a coding unit (CU) at a boundary of a coding tree unit (CTU) .
FIG. 9 shows an example of 67 intra prediction modes.
FIGS. 10A and 10B show examples of reference samples for wide-angle intra prediction modes for non-square blocks.
FIG. 11 shows an example of a discontinuity when using wide-angle intra prediction.
FIGS. 12A-12D show examples of samples used by a position-dependent intra prediction combination (PDPC) method.
FIGS. 13A and 13B show examples of down-sampled luma sample positions inside and outside the current block.
FIG. 14 shows an example of combining different luma down-sampling methods together.
FIGS. 15A-15H show examples of neighboring samples used in CCLM.
FIGS. 16A and 16B show examples of how to downsample luma samples left of the current block to derive the parameters in CCLM.
FIGS. 17A and 17B show examples of linear model (LM) prediction with single-line neighboring luma samples.
FIGS. 18A-18C show examples of using a different number of neighboring luma samples based on the height and width of the current block.
FIG. 19 shows a flowchart of yet another example method for cross-component prediction in accordance with the disclosed technology.
FIG. 20 is a block diagram of an example of a hardware platform for implementing a visual media decoding or a visual media encoding technique described in the present document.
FIG. 21 is a block diagram of an example video processing system in which disclosed techniques may be implemented.
DETAILED DESCRIPTION
Due to the increasing demand for higher resolution video, video coding methods and techniques are ubiquitous in modern technology. Video codecs typically include an electronic circuit or software that compresses or decompresses digital video, and are continually being improved to provide higher coding efficiency. A video codec converts uncompressed video to a compressed format or vice versa. There are complex relationships between the video quality, the amount of data used to represent the video (determined by the bit rate), the complexity of the encoding and decoding algorithms, sensitivity to data losses and errors, ease of editing, random access, and end-to-end delay (latency). The compressed format usually conforms to a standard video compression specification, e.g., the High Efficiency Video Coding (HEVC) standard (also known as H.265 or MPEG-H Part 2), the Versatile Video Coding (VVC) standard to be finalized, or other current and/or future video coding standards.
Embodiments of the disclosed technology may be applied to existing video coding standards (e.g., HEVC, H.265) and future standards to improve runtime performance. Section headings are used in the present document to improve readability of the description and do not in any way limit the discussion or the embodiments (and/or implementations) to the respective sections only.
1 Embodiments of cross-component prediction
Cross-component prediction is a form of the luma-to-chroma prediction approach that has a well-balanced trade-off between complexity and compression efficiency improvement.
1.1 Examples of the cross-component linear model (CCLM)
In some embodiments, to reduce cross-component redundancy, a cross-component linear model (CCLM) prediction mode (also referred to as LM) is used in the JEM, in which the chroma samples are predicted based on the reconstructed luma samples of the same CU by using a linear model as follows:
$$\mathrm{pred}_C(i, j) = \alpha \cdot \mathrm{rec}_L'(i, j) + \beta \tag{1}$$
Here, $\mathrm{pred}_C(i, j)$ represents the predicted chroma samples in a CU, and $\mathrm{rec}_L'(i, j)$ represents the downsampled reconstructed luma samples of the same CU for color formats 4:2:0 or 4:2:2, or the reconstructed luma samples of the same CU for color format 4:4:4. CCLM parameters α and β are derived by minimizing the regression error between the neighboring reconstructed luma and chroma samples around the current block as follows:
Here, L(n) represents the down-sampled (for color formats 4:2:0 or 4:2:2) or original (for color format 4:4:4) top and left neighboring reconstructed luma samples, C(n) represents the top and left neighboring reconstructed chroma samples, and the value of N is equal to twice the minimum of the width and height of the current chroma coding block.
In some embodiments, and for a coding block with a square shape, the above two equations are applied directly. In other embodiments, and for a non-square coding block, the neighboring samples of the longer boundary are first subsampled to have the same number of samples as for the shorter boundary. FIG. 1 shows the location of the left and above reconstructed samples and the sample of the current block involved in the CCLM mode.
In some embodiments, this regression error minimization computation is performed as part of the decoding process, not just as an encoder search operation, so no syntax is used to convey the α and β values.
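As a rough illustration of the parameter derivation described above, the following sketch fits α and β by ordinary least squares over N neighboring luma/chroma pairs and then applies equation (1). The function names, the floating-point arithmetic, and the fallback for a flat luma neighborhood are assumptions made for this example; a practical codec would use integer arithmetic.

```python
def derive_cclm_params(luma, chroma):
    """Least-squares fit of chroma ~ alpha * luma + beta over neighbor pairs.

    luma   -- neighboring reconstructed (down-sampled) luma samples L(n)
    chroma -- neighboring reconstructed chroma samples C(n)
    """
    if not luma or len(luma) != len(chroma):
        raise ValueError("luma and chroma must be non-empty and equally long")
    n = len(luma)
    sum_l = sum(luma)
    sum_c = sum(chroma)
    sum_ll = sum(l * l for l in luma)
    sum_lc = sum(l * c for l, c in zip(luma, chroma))
    denom = n * sum_ll - sum_l * sum_l
    if denom == 0:
        # Assumption: with a flat luma neighborhood, predict the mean chroma.
        return 0.0, sum_c / n
    alpha = (n * sum_lc - sum_l * sum_c) / denom
    beta = (sum_c - alpha * sum_l) / n
    return alpha, beta


def predict_chroma(rec_luma_ds, alpha, beta):
    """Apply pred_C(i, j) = alpha * rec_L'(i, j) + beta to a 2-D block."""
    return [[alpha * v + beta for v in row] for row in rec_luma_ds]


alpha, beta = derive_cclm_params([10, 20, 30, 40], [25, 45, 65, 85])
print(alpha, beta)
```

Because the decoder can repeat the same fit on the same reconstructed neighbors, no syntax is needed to convey α and β.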
In some embodiments, the CCLM prediction mode also includes prediction between the two chroma components, e.g., the Cr (red-difference) component is predicted from the Cb (blue-difference) component. Instead of using the reconstructed sample signal, the CCLM Cb-to-Cr prediction is applied in residual domain. This is implemented by adding a weighted reconstructed Cb residual to the original Cr intra prediction to form the final Cr prediction:
Here, $\mathrm{resi}_{Cb}'(i, j)$ represents the reconstructed Cb residue sample at position (i, j).
In some embodiments, the scaling factor α may be derived in a similar way as in the CCLM luma-to-chroma prediction. The only difference is an addition of a regression cost relative to a default α value in the error function so that the derived scaling factor is biased towards a default value of -0.5 as follows:
Here, Cb (n) represents the neighboring reconstructed Cb samples, Cr (n) represents the neighboring reconstructed Cr samples, and λ is equal to ∑ (Cb (n) ·Cb (n) ) >>9.
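The biased derivation of the Cb-to-Cr scaling factor can be sketched as follows. The closed form used here is an assumption: it is the least-squares solution with an added penalty λ·(α − default)², which matches the description of a regression cost that pulls α toward −0.5, with λ computed as stated in the text. The function names and the floating-point arithmetic are also illustrative.

```python
def derive_cb_to_cr_alpha(cb, cr, default_alpha=-0.5):
    """Regularized least-squares scaling factor, biased toward default_alpha."""
    if not cb or len(cb) != len(cr):
        raise ValueError("cb and cr must be non-empty and equally long")
    n = len(cb)
    sum_cb = sum(cb)
    sum_cr = sum(cr)
    sum_cbcb = sum(v * v for v in cb)
    sum_cbcr = sum(a * b for a, b in zip(cb, cr))
    lam = sum_cbcb >> 9  # lambda = sum(Cb(n) * Cb(n)) >> 9, as in the text
    num = n * sum_cbcr - sum_cb * sum_cr + lam * default_alpha
    den = n * sum_cbcb - sum_cb * sum_cb + lam
    if den == 0:
        # Assumption: fall back to the default when nothing can be fitted.
        return default_alpha
    return num / den


def predict_cr(pred_cr, resi_cb, alpha):
    """Final Cr prediction: intra prediction plus weighted Cb residual."""
    return [[p + alpha * r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred_cr, resi_cb)]


# With constant Cb neighbors the data term vanishes and alpha is the default.
print(derive_cb_to_cr_alpha([512] * 8, [100, 200, 300, 400, 500, 600, 700, 800]))
```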
In some embodiments, the CCLM luma-to-chroma prediction mode is added as one additional chroma intra prediction mode. At the encoder side, one more RD cost check for the chroma components is added for selecting the chroma intra prediction mode. When intra prediction modes other than the CCLM luma-to-chroma prediction mode are used for the chroma components of a CU, CCLM Cb-to-Cr prediction is used for Cr component prediction.
In JEM and VTM-2.0, the total number of training samples in CCLM must be of the form $2^N$. Suppose the current block size is W×H. If W is not equal to H, then the down-sampled luma sample set with more samples is decimated to match the number of samples in the down-sampled luma sample set with fewer samples.
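A minimal sketch of this decimation is given below. It assumes the longer boundary length is an integer multiple of the shorter one (true for power-of-two block sizes) and keeps every step-th sample starting from index 0. The starting phase and the function names are assumptions for illustration only.

```python
def training_set_size(w, h):
    """Total number of training samples: twice the shorter side."""
    return 2 * min(w, h)


def decimate_to_match(longer, target_len):
    """Keep every step-th sample so that target_len samples remain."""
    if target_len <= 0 or len(longer) % target_len != 0:
        raise ValueError("length must be a positive multiple of target_len")
    step = len(longer) // target_len
    return longer[::step]


above = list(range(16))   # 16 neighboring samples above a 16x4 block
left = list(range(4))     # 4 neighboring samples to the left
above_dec = decimate_to_match(above, len(left))
print(above_dec, training_set_size(16, 4))
```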
1.2 Examples of multiple model CCLM
In the JEM, there are two CCLM modes: the single model CCLM mode and the multiple model CCLM mode (MMLM) . As indicated by the name, the single model CCLM mode employs one linear model for predicting the chroma samples from the luma samples for the whole CU, while in MMLM, there can be two models.
In MMLM, neighboring luma samples and neighboring chroma samples of the current block are classified into two groups, and each group is used as a training set to derive a linear model (i.e., a particular α and β are derived for a particular group). Furthermore, the samples of the current luma block are also classified based on the same rule used for the classification of neighboring luma samples.
FIG. 2 shows an example of classifying the neighboring samples into two groups. The threshold is calculated as the average value of the neighboring reconstructed luma samples. A neighboring sample with $\mathrm{Rec}'_L[x, y] \le \mathrm{Threshold}$ is classified into group 1, while a neighboring sample with $\mathrm{Rec}'_L[x, y] > \mathrm{Threshold}$ is classified into group 2.
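The two-group classification can be sketched as follows. The integer (floor) average used for the threshold and the function names are assumptions; the text only states that the threshold is the average of the neighboring reconstructed luma samples. Each returned group would then serve as the training set for its own α and β.

```python
def classify_neighbors(luma, chroma):
    """Split neighbor pairs into two groups around the mean luma value."""
    if not luma or len(luma) != len(chroma):
        raise ValueError("luma and chroma must be non-empty and equally long")
    threshold = sum(luma) // len(luma)  # assumption: floor of the average
    group1 = [(l, c) for l, c in zip(luma, chroma) if l <= threshold]
    group2 = [(l, c) for l, c in zip(luma, chroma) if l > threshold]
    return threshold, group1, group2


def model_index(luma_value, threshold):
    """Select which linear model applies to a luma sample of the current block."""
    return 0 if luma_value <= threshold else 1


threshold, g1, g2 = classify_neighbors([10, 20, 30, 40], [1, 2, 3, 4])
print(threshold, g1, g2)
```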
1.3 Examples of downsampling filters in CCLM
In some embodiments, to perform cross-component prediction for the 4:2:0 chroma format, where four luma samples correspond to one chroma sample, the reconstructed luma block needs to be downsampled to match the size of the chroma signal. The default downsampling filter used in CCLM mode is as follows:
Here, the downsampling assumes the “type 0” phase relationship as shown in FIG. 3A for the positions of the chroma samples relative to the positions of the luma samples, e.g., collocated sampling horizontally and interstitial sampling vertically.
The exemplary 6-tap downsampling filter defined in (6) is used as the default filter for both the single model CCLM mode and the multiple model CCLM mode.
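For illustration, a six-tap filter of this kind can be sketched as below. The tap layout is an assumption, since equation (6) is not reproduced in the text: the sketch uses the commonly cited weights [1 2 1; 1 2 1] / 8, centered on luma column 2x across luma rows 2y and 2y+1. The array layout (`rec_luma[row][column]`), the clamping at the left edge, and the function name are likewise illustrative.

```python
def downsample_6tap(rec_luma, x, y):
    """Down-sampled luma value for chroma position (x, y).

    rec_luma is indexed as rec_luma[row][column].
    """
    top = rec_luma[2 * y]
    bottom = rec_luma[2 * y + 1]
    xc = 2 * x
    xl = max(xc - 1, 0)  # assumption: repeat the first column at the left edge
    xr = xc + 1
    total = (2 * top[xc] + 2 * bottom[xc]
             + top[xl] + top[xr]
             + bottom[xl] + bottom[xr])
    return (total + 4) >> 3


luma = [[0, 8, 16, 24],
        [0, 8, 16, 24]]
print(downsample_6tap(luma, 1, 0))
```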
In some embodiments, and for the MMLM mode, the encoder can alternatively select one of four additional luma downsampling filters to be applied for prediction in a CU, and send a filter index to indicate which of these is used. The four selectable luma downsampling filters for the MMLM mode, as shown in FIG. 3B, are as follows:
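$$\mathrm{Rec}'_L[x, y] = (\mathrm{Rec}_L[2x, 2y] + \mathrm{Rec}_L[2x+1, 2y] + 1) \gg 1 \tag{8}$$
$$\mathrm{Rec}'_L[x, y] = (\mathrm{Rec}_L[2x+1, 2y] + \mathrm{Rec}_L[2x+1, 2y+1] + 1) \gg 1 \tag{9}$$
$$\mathrm{Rec}'_L[x, y] = (\mathrm{Rec}_L[2x, 2y+1] + \mathrm{Rec}_L[2x+1, 2y+1] + 1) \gg 1 \tag{10}$$
$$\mathrm{Rec}'_L[x, y] = (\mathrm{Rec}_L[2x, 2y] + \mathrm{Rec}_L[2x, 2y+1] + \mathrm{Rec}_L[2x+1, 2y] + \mathrm{Rec}_L[2x+1, 2y+1] + 2) \gg 2 \tag{11}$$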
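The four selectable filters in equations (8) to (11) can be written compactly as below. The mapping of a filter index 0..3 to the equations, the array layout (`rec[row][column]`, so Rec_L[x, y] is `rec[y][x]`), and the function name are assumptions made for this sketch.

```python
def mmlm_downsample(rec, x, y, filter_idx):
    """Apply one of the four selectable luma down-sampling filters."""
    a = rec[2 * y][2 * x]          # Rec_L[2x,   2y]
    b = rec[2 * y][2 * x + 1]      # Rec_L[2x+1, 2y]
    c = rec[2 * y + 1][2 * x]      # Rec_L[2x,   2y+1]
    d = rec[2 * y + 1][2 * x + 1]  # Rec_L[2x+1, 2y+1]
    if filter_idx == 0:
        return (a + b + 1) >> 1          # equation (8)
    if filter_idx == 1:
        return (b + d + 1) >> 1          # equation (9)
    if filter_idx == 2:
        return (c + d + 1) >> 1          # equation (10)
    if filter_idx == 3:
        return (a + b + c + d + 2) >> 2  # equation (11)
    raise ValueError("filter_idx must be in 0..3")


block = [[10, 20],
         [30, 40]]
print([mmlm_downsample(block, 0, 0, i) for i in range(4)])
```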
1.4 Exemplary embodiments related to cross-component prediction
Previously proposed CCLM methods include, but are not limited to:
● Only requiring neighboring luma samples used in normal intra-prediction; and
● Not needing to downsample the luma samples, or downsampling is performed by a simple two-sample averaging.
The previously proposed examples described below assume that the color format is 4:2:0. As shown in FIG. 3A, one chroma (Cb or Cr) sample (represented by a triangle) corresponds to four luma (Y) samples (represented by circles): A, B, C and D as shown in FIG. 4. FIGS. 5A and 5B show an example of samples of a 4×4 chroma block with neighboring samples, and the corresponding luma samples.
Example 1. In one example, it is proposed that CCLM is done without down-sampling filtering on luma samples.
(a) In one example, the down-sampling process of neighboring luma samples is removed from the CCLM parameter (e.g., α and β) derivation process. Instead, the down-sampling process is replaced by a sub-sampling process in which non-consecutive luma samples are utilized.
(b) In one example, the down-sampling process of samples in the co-located luma block is removed from the CCLM chroma prediction process. Instead, only a subset of the luma samples in the co-located luma block is used to derive the prediction block of chroma samples.
(c) FIGS. 6A-6J show examples on an 8×8 luma block corresponding to a 4×4 chroma block.
(d) In one example as shown in FIG. 6A, luma sample at position “C” in FIG. 4 is used to correspond to the chroma sample. The above neighboring samples are used in the training process to derive the linear model.
(e) In one example as shown in FIG. 6B, luma sample at position “C” in FIG. 4 is used to correspond to the chroma sample. The above neighboring samples and above-right neighboring samples are used in the training process to derive the linear model.
(f) In one example as shown in FIG. 6C, luma sample at position “D” in FIG. 4 is used to correspond to the chroma sample. The above neighboring samples are used in the training process to derive the linear model.
(g) In one example as shown in FIG. 6D, luma sample at position “D” in FIG. 4 is used to correspond to the chroma sample. The above neighboring samples and above-right neighboring samples are used in the training process to derive the linear model.
(h) In one example as shown in FIG. 6E, luma sample at position “B” in FIG. 4 is used to correspond to the chroma sample. The left neighboring samples are used in the training process to derive the linear model.
(i) In one example as shown in FIG. 6F, luma sample at position “B” in FIG. 4 is used to correspond to the chroma sample. The left neighboring samples and left-bottom neighboring samples are used in the training process to derive the linear model.
(j) In one example as shown in FIG. 6G, luma sample at position “D” in FIG. 4 is used to correspond to the chroma sample. The left neighboring samples are used in the training process to derive the linear model.
(k) In one example as shown in FIG. 6H, luma sample at position “D” in FIG. 4 is used to correspond to the chroma sample. The left neighboring samples and left-bottom neighboring samples are used in the training process to derive the linear model.
(l) In one example as shown in FIG. 6I, luma sample at position “D” in FIG. 4 is used to correspond to the chroma sample. The above neighboring samples and left neighboring samples are used in the training process to derive the linear model.
(m) In one example as shown in FIG. 6J, luma sample at position “D” in FIG. 4 is used to correspond to the chroma sample. The above neighboring samples, left neighboring samples, above-right neighboring samples and left-bottom neighboring samples are used in the training process to derive the linear model.
Example 2. In one example, it is proposed that CCLM only requires neighboring luma samples which are used in the normal intra-prediction process; that is, other neighboring luma samples are not allowed to be used in the CCLM process. In one example, CCLM is done with 2-tap filtering on luma samples. FIGS. 7A-7D show examples on an 8×8 luma block corresponding to a 4×4 chroma block.
(a) In one example as shown in FIG. 7A, luma samples at position “C” and position “D” in FIG. 4 are filtered as F (C, D) to be used to correspond to the chroma sample. The above neighboring samples are used in the training process to derive the linear model.
(b) In one example as shown in FIG. 7B, luma samples at position “C” and position “D” in FIG. 4 are filtered as F (C, D) to be used to correspond to the chroma sample. The above neighboring samples and above-right neighboring samples are used in the training process to derive the linear model.
(c) In one example as shown in FIG. 7C, luma samples at position “B” and position “D” in FIG. 4 are filtered as F (B, D) to be used to correspond to the chroma sample. The left neighboring samples are used in the training process to derive the linear model.
(d) In one example as shown in FIG. 7D, luma samples at position “B” and position “D” in FIG. 4 are filtered as F (B, D) to be used to correspond to the chroma sample. The left neighboring samples and left-bottom neighboring samples are used in the training process to derive the linear model.
(e) In one example, F is defined as F (X, Y) = (X+Y) >>1. Alternatively, F (X, Y) = (X+Y+1) >>1.
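The two variants of F differ only in rounding, as the short sketch below shows. The helper that averages consecutive pairs along one line of luma samples is an illustrative assumption about how F (C, D) or F (B, D) might be applied to a single row or column; the function names are hypothetical.

```python
def f_truncate(x, y):
    """F(X, Y) = (X + Y) >> 1, rounding down."""
    return (x + y) >> 1


def f_round(x, y):
    """F(X, Y) = (X + Y + 1) >> 1, rounding half up."""
    return (x + y + 1) >> 1


def average_pairs(line, rounding=True):
    """Filter consecutive sample pairs of one luma row or column."""
    if len(line) % 2 != 0:
        raise ValueError("line must contain an even number of samples")
    f = f_round if rounding else f_truncate
    return [f(line[i], line[i + 1]) for i in range(0, len(line), 2)]


print(average_pairs([10, 20, 30, 41], rounding=True))
print(average_pairs([10, 20, 30, 41], rounding=False))
```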
Example 3. In one example, the previously proposed CCLM methods (e.g., Examples 1 and 2 in this Section) can be applied in a selective way. That is, different blocks within a region, a slice, a picture or a sequence may choose different kinds of previously proposed CCLM methods.
(a) In one embodiment, the encoder selects one kind of previously proposed CCLM method from a predefined candidate set and signals it to the decoder.
(i) For example, the encoder can select between Example 1 (a) and Example 1 (e) . Alternatively, it can select between Example 1 (b) and Example 1 (f) . Alternatively, it can select between Example 1 (c) and Example 1 (g) . Alternatively, it can select between Example 1 (d) and Example 1 (h) . Alternatively, it can select between Example 2 (a) and Example 2 (c) . Alternatively, it can select between Example 2 (b) and Example 2 (d) .
(ii) The candidate set to be selected from and the signaling may depend on the shape or size of the block. Suppose W and H represent the width and height of the chroma block, T1, and T2 are integers.
(1) In one example, if W<=T1 and H<=T2, there is no candidate, e.g., CCLM is disabled. For example, T1=T2=2.
(2) In one example, if W<=T1 or H<=T2, there is no candidate, e.g., CCLM is disabled. For example, T1=T2=2.
(3) In one example, if W×H<=T1, there is no candidate, e.g., CCLM is disabled. For example, T1= 4.
(4) In one example, if W<=T1 and H<=T2, there is only one candidate such as Example 1 (i) . No CCLM method selection information is signaled. For example, T1=T2=4.
(5) In one example, if W<=T1 or H<=T2, there is only one candidate such as Example 1 (i) . No CCLM method selection information is signaled. For example, T1=T2=4.
(6) In one example, if W×H<=T1, there is only one candidate such as Example 1 (i) . No CCLM method selection information is signaled. For example, T1= 16.
(7) In one example, if W> H, there is only one candidate such as Example 1 (a) . No CCLM method selection information is signaled. Alternatively, if W > H (or W > N *H wherein N is a positive integer) , only candidates (or some candidates) using above or/and above-right neighboring reconstructed samples in deriving CCLM parameters are included in the candidate set.
(8) In one example, if W< H, there is only one candidate such as Example 1 (e) . No CCLM method selection information is signaled. Alternatively, if W < H (or N *W < H) , only candidates (or some candidates) using left or/and left-bottom neighboring reconstructed samples in deriving CCLM parameters are included in the candidate set.
(b) In one embodiment, both the encoder and decoder select a previously proposed CCLM method based on the same rule. The encoder does not signal it to the decoder. For example, the selection may depend on the shape or size of the block. In one example, if the width is larger than the height, Example 1 (a) is selected; otherwise, Example 1 (e) is selected.
(c) One or multiple sets of previously proposed CCLM candidates may be signaled in sequence parameter set/picture parameter set/slice header/CTUs/CTBs/groups of CTUs.
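One way to picture the shape-dependent candidate set is sketched below. The items of Example 3 are stated as separate alternatives; combining items (1), (7) and (8) into a single rule, the default thresholds, the string labels, and the function names are all assumptions made purely for illustration.

```python
def cclm_candidates(w, h, t1=2, t2=2):
    """Return the candidate CCLM methods for a W x H chroma block."""
    if w <= t1 and h <= t2:
        return []                    # no candidate: CCLM is disabled
    if w > h:
        return ["Example 1(a)"]      # single candidate for wide blocks
    if w < h:
        return ["Example 1(e)"]      # single candidate for tall blocks
    return ["Example 1(a)", "Example 1(e)"]


def needs_signaling(candidates):
    """Selection information is only needed when there is a real choice."""
    return len(candidates) > 1


for size in [(2, 2), (8, 4), (4, 8), (4, 4)]:
    cands = cclm_candidates(*size)
    print(size, cands, needs_signaling(cands))
```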
Example 4. In one example, it is proposed that multiple CCLM methods (e.g., Examples 1 and 2) may be applied to the same chroma block. That is, one block within a region/slice/picture/sequence may choose different kinds of previously proposed CCLM methods to derive multiple intermediate chroma prediction blocks and the final chroma prediction block is derived from the multiple intermediate chroma prediction blocks.
(a) Alternatively, multiple sets of CCLM parameters (e.g., α and β) may be firstly derived from multiple selected CCLM methods. One final set of CCLM parameters may be derived from the multiple sets and utilized for chroma prediction block generation process.
(b) The selection of multiple CCLM methods may be signaled (implicitly or explicitly) in a similar way as described in Example 3.
(c) Indication of the usage of the proposed method may be signaled in sequence parameter set/picture parameter set/slice header/groups of CTUs/CTUs/coding blocks.
Example 5. In one example, whether and how to apply the previously proposed CCLM methods may depend on the position of the current block.
(a) In one example, one or more of the proposed methods are applied to CUs located at the top boundary of the current CTU as shown in FIG. 8A.
(b) In one example, one or more of the proposed methods are applied to CUs located at the left boundary of the current CTU as shown in FIG. 8B.
(c) In one example, one or more of the proposed methods are applied in both of the above cases.
1.5 Examples of CCLM in VVC
In some embodiments, CCLM as in JEM is adopted in VTM-2.0, but MM-CCLM in JEM is not adopted in VTM-2.0.
CCLM in VTM-5.0
In VTM-5.0, two additional CCLM modes (LM-A and LM-T) proposed in JVET-L0338 are adopted besides LM mode. LM-A only uses neighbouring samples above or right-above the current block and LM-T only uses neighbouring samples left or left-below the current block, to derive the CCLM parameters.
Decoding process of CCLM
In VTM-5.0, the LM derivation process is simplified to a 4-point max-min method proposed in JVET-N0271. The corresponding working draft is shown below.
Specification of INTRA_LT_CCLM, INTRA_L_CCLM and INTRA_T_CCLM intra prediction mode
Inputs to this process are:
– the intra prediction mode predModeIntra,
– a sample location (xTbC, yTbC) of the top-left sample of the current transform block relative to the top-left sample of the current picture,
– a variable nTbW specifying the transform block width,
– a variable nTbH specifying the transform block height,
– chroma neighbouring samples p [x] [y] , with x = -1, y = 0.. 2 *nTbH -1 and x = 0.. 2 *nTbW -1, y = -1.
Output of this process are predicted samples predSamples [x] [y] , with
x = 0.. nTbW -1, y = 0.. nTbH -1.
The current luma location (xTbY, yTbY) is derived as follows:
(xTbY, yTbY) = (xTbC << 1, yTbC << 1) (8-156)
The variables availL, availT and availTL are derived as follows:
– The availability of left neighbouring samples derivation process for a block as specified in clause 6.4. X [Ed. (BB) : Neighbouring blocks availability checking process tbd] is invoked with the current chroma location (xCurr, yCurr) set equal to (xTbC, yTbC) and the neighbouring chroma location (xTbC-1, yTbC) as inputs, and the output is assigned to availL.
– The availability of top neighbouring samples derivation process for a block as specified in clause 6.4. X [Ed. (BB) : Neighbouring blocks availability checking process tbd] is invoked with the current chroma location (xCurr, yCurr) set equal to (xTbC, yTbC) and the neighbouring chroma location (xTbC, yTbC -1) as inputs, and the output is assigned to availT.
– The availability of top-left neighbouring samples derivation process for a block as specified in clause 6.4. X [Ed. (BB) : Neighbouring blocks availability checking process tbd] is invoked with the current chroma location (xCurr, yCurr) set equal to (xTbC, yTbC) and the neighbouring chroma location (xTbC -1, yTbC -1) as inputs, and the output is assigned to availTL.
– The number of available top-right neighbouring chroma samples numTopRight is derived as follows:
– The variable numTopRight is set equal to 0 and availTR is set equal to TRUE.
– When predModeIntra is equal to INTRA_T_CCLM, the following applies for x = nTbW.. 2 *nTbW -1 until availTR is equal to FALSE or x is equal to 2 *nTbW -1:
– The availability derivation process for a block as specified in clause 6.4. X [Ed. (BB) : Neighbouring blocks availability checking process tbd] is invoked with the current chroma location (xCurr, yCurr) set equal to (xTbC , yTbC) and the neighbouring chroma location (xTbC + x, yTbC -1) as inputs, and the output is assigned to availableTR
– When availableTR is equal to TRUE, numTopRight is incremented by one.
– The number of available left-below neighbouring chroma samples numLeftBelow is derived as follows:
– The variable numLeftBelow is set equal to 0 and availLB is set equal to TRUE.
– When predModeIntra is equal to INTRA_L_CCLM, the following applies for y = nTbH.. 2 *nTbH -1 until availLB is equal to FALSE or y is equal to 2 *nTbH -1:
– The availability derivation process for a block as specified in clause 6.4. X [Ed. (BB) : Neighbouring blocks availability checking process tbd] is invoked with the current chroma location (xCurr, yCurr) set equal to (xTbC , yTbC) and the neighbouring chroma location (xTbC -1, yTbC + y) as inputs, and the output is assigned to availableLB
– When availableLB is equal to TRUE, numLeftBelow is incremented by one.
The number of available neighbouring chroma samples on the top and top-right numTopSamp and the number of available neighbouring chroma samples on the left and left-below nLeftSamp are derived as follows:
– If predModeIntra is equal to INTRA_LT_CCLM, the following applies:
numSampT = availT ? nTbW: 0 (8-157)
numSampL = availL ? nTbH: 0 (8-158)
– Otherwise, the following applies:
numSampT = (availT && predModeIntra = = INTRA_T_CCLM) ? (nTbW +Min (numTopRight, nTbH) ) : 0 (8-159)
numSampL = (availL && predModeIntra = = INTRA_L_CCLM) ? (nTbH + Min (numLeftBelow, nTbW) ) : 0 (8-160)
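Equations (8-157) to (8-160) can be sketched as a single function. The Python names and the use of strings for the mode constants are assumptions for readability. The counts numTopRight and numLeftBelow are taken as inputs, produced by the availability loops described above.

```python
INTRA_LT_CCLM = "INTRA_LT_CCLM"
INTRA_L_CCLM = "INTRA_L_CCLM"
INTRA_T_CCLM = "INTRA_T_CCLM"


def num_neighbor_samples(mode, avail_t, avail_l, n_tb_w, n_tb_h,
                         num_top_right=0, num_left_below=0):
    """Return (numSampT, numSampL) for the given CCLM mode."""
    if mode == INTRA_LT_CCLM:
        num_samp_t = n_tb_w if avail_t else 0              # (8-157)
        num_samp_l = n_tb_h if avail_l else 0              # (8-158)
    else:
        num_samp_t = (n_tb_w + min(num_top_right, n_tb_h)  # (8-159)
                      if (avail_t and mode == INTRA_T_CCLM) else 0)
        num_samp_l = (n_tb_h + min(num_left_below, n_tb_w)  # (8-160)
                      if (avail_l and mode == INTRA_L_CCLM) else 0)
    return num_samp_t, num_samp_l


print(num_neighbor_samples(INTRA_LT_CCLM, True, True, 8, 4))
print(num_neighbor_samples(INTRA_T_CCLM, True, True, 8, 4, num_top_right=8))
print(num_neighbor_samples(INTRA_L_CCLM, True, True, 8, 4, num_left_below=4))
```

Note that the extension beyond the block edge is capped by the other dimension: the top-right extension is limited to nTbH samples and the left-below extension to nTbW samples.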
The variable bCTUboundary is derived as follows:
bCTUboundary = (yTbC & (1 << (CtbLog2SizeY -1) -1) = = 0) ? TRUE : FALSE. (8-161)
The variable cntN and array pickPosN with N being replaced by L and T, are derived as follows:
– The variable numIs4N is set equal to ( (availT && availL &&predModeIntra = = INTRA_LT_CCLM) ? 0: 1) .
– The variable startPosN is set equal to numSampN >> (2 + numIs4N) .
– The variable pickStepN is set equal to Max (1, numSampN >> (1 + numIs4N) ) .
– If availN is equal to TRUE and predModeIntra is equal to INTRA_LT_CCLM or INTRA_N_CCLM, the following assignments are made:
– cntN is set equal to Min (numSampN, (1 + numIs4N) << 1)
– pickPosN [pos] is set equal to (startPosN + pos *pickStepN) , with pos = 0.. cntN –1 .
– Otherwise, cntN is set equal to 0.
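The position selection for one side (N being L or T) can be sketched as follows. The function signature is an assumption: `uses_side` stands for the condition that predModeIntra is INTRA_LT_CCLM or INTRA_N_CCLM, and `lt_both_available` stands for the condition that availT and availL are both TRUE and the mode is INTRA_LT_CCLM.

```python
def pick_positions(num_samp, avail, uses_side, lt_both_available):
    """Return pickPosN as a list; its length is cntN."""
    num_is_4n = 0 if lt_both_available else 1
    start_pos = num_samp >> (2 + num_is_4n)
    pick_step = max(1, num_samp >> (1 + num_is_4n))
    if avail and uses_side:
        cnt = min(num_samp, (1 + num_is_4n) << 1)
        return [start_pos + pos * pick_step for pos in range(cnt)]
    return []  # cntN is set equal to 0


# LT mode with both sides available: two samples are picked per side.
print(pick_positions(8, True, True, True))
# Single-side mode: four samples are picked from that side.
print(pick_positions(8, True, True, False))
```

In both cases at most four neighboring positions are selected in total, which is what the 4-point max-min derivation operates on.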
The prediction samples predSamples [x] [y] with x = 0.. nTbW -1, y = 0.. nTbH -1 are derived as follows:
– If both numSampL and numSampT are equal to 0, the following applies:
predSamples [x] [y] = 1 << (BitDepthC -1) (8-162)
– Otherwise, the following ordered steps apply:
1. The collocated luma samples pY [x] [y] with x = 0.. nTbW *2 -1, y= 0.. nTbH *2 -1 are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations (xTbY + x, yTbY + y) .
2. The neighbouring luma samples pY [x] [y] are derived as follows:
– When numSampL is greater than 0, the neighbouring left luma samples pY [x] [y] with x = -1.. -3, y = 0.. 2 *numSampL -1, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations (xTbY + x , yTbY +y) .
– When numSampT is greater than 0, the neighbouring top luma samples pY [x] [y] with x = 0.. 2 *numSampT -1, y = -1, -2, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations (xTbY+ x, yTbY + y) .
– When availTL is equal to TRUE, the neighbouring top-left luma samples pY [x] [y] with x = -1, y = -1, -2, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations (xTbY+ x, yTbY + y) .
3. The down-sampled collocated luma samples pDsY [x] [y] with x = 0.. nTbW -1, y = 0.. nTbH -1 are derived as follows:
– If sps_cclm_colocated_chroma_flag is equal to 1, the following applies:
– pDsY [x] [y] with x = 1.. nTbW -1, y = 1.. nTbH -1 is derived as follows:
– If availL is equal to TRUE, pDsY [0] [y] with y = 1.. nTbH -1 is derived as follows:
– Otherwise, pDsY [0] [y] with y = 1.. nTbH -1 is derived as follows:
pDsY [0] [y] = (pY [0] [2 *y -1] + 2 *pY [0] [2 *y] + pY [0] [2 *y + 1] + 2) >>2 (8-165)
– If availT is equal to TRUE, pDsY [x] [0] with x = 1.. nTbW -1 is derived as follows:
– Otherwise, pDsY [x] [0] with x = 1.. nTbW -1 is derived as follows:
pDsY [x] [0] = (pY [2 *x -1] [0] + 2 *pY [2 *x] [0] + pY [2 *x + 1] [0] + 2) >>2 (8-167)
– If availL is equal to TRUE and availT is equal to TRUE, pDsY [0] [0] is derived as follows:
– Otherwise if availL is equal to TRUE and availT is equal to FALSE, pDsY [0] [0] is derived as follows:
pDsY [0] [0] = (pY [-1] [0] + 2 *pY [0] [0] + pY [1] [0] + 2) >> 2 (8-169)
– Otherwise if availL is equal to FALSE and availT is equal to TRUE, pDsY [0] [0] is derived as follows:
pDsY [0] [0] = (pY [0] [-1] + 2 *pY [0] [0] + pY [0] [1] + 2) >> 2 (8-170)
– Otherwise (availL is equal to FALSE and availT is equal to FALSE) , pDsY [0] [0] is derived as follows:
pDsY [0] [0] = pY [0] [0] (8-171)
– Otherwise, the following applies:
– pDsY [x] [y] with x = 1.. nTbW -1, y = 0.. nTbH -1 is derived as follows:
– If availL is equal to TRUE, pDsY [0] [y] with y = 0.. nTbH -1 is derived as follows:
– Otherwise, pDsY [0] [y] with y = 0.. nTbH -1 is derived as follows:
pDsY [0] [y] = (pY [0] [2 *y] + pY [0] [2 *y + 1] + 1) >> 1 (8-174)
4. When numSampL is greater than 0, the selected neighbouring left chroma samples pSelC [idx] are set equal to p [-1] [pickPosL [idx] ] with idx = 0.. cntL –1, and the selected down-sampled neighbouring left luma samples pSelDsY [idx] with idx = 0.. cntL -1 are derived as follows:
– The variable y is set equal to pickPosL [idx] .
– If sps_cclm_colocated_chroma_flag is equal to 1, the following applies:
– If y is greater than 0 or availTL is equal to TRUE, pSelDsY [idx] is derived as follows:
– Otherwise:
pSelDsY [idx] = (pY [-3] [0] + 2 *pY [-2] [0] + pY [-1] [0] + 2) >> 2 (8-177)
– Otherwise, the following applies:
5. When numSampT is greater than 0, the selected neighbouring top chroma samples pSelC [idx] are set equal to p [pickPosT [idx –cntL] ] [-1] with idx = cntL.. cntL + cntT –1, and the down-sampled neighbouring top luma samples pSelDsY [idx] with idx = 0.. cntL + cntT -1 are specified as follows:
– The variable x is set equal to pickPosT [idx –cntL] .
– If sps_cclm_colocated_chroma_flag is equal to 1, the following applies:
– If x is greater than 0, the following applies:
– If bCTUboundary is equal to FALSE, the following applies:
– Otherwise (bCTUboundary is equal to TRUE) , the following applies:
– Otherwise:
– If availTL is equal to TRUE and bCTUboundary is equal to FALSE, the following applies:
– Otherwise if availTL is equal to TRUE and bCTUboundary is equal to TRUE, the following applies:
– Otherwise if availTL is equal to FALSE and bCTUboundary is equal to FALSE, the following applies:
pSelDsY [idx] = (pY [0] [-3] + 2 *pY [0] [-2] + pY [0] [-1] + 2) >> 2 (8-183)
– Otherwise (availTL is equal to FALSE and bCTUboundary is equal to TRUE) , the following applies:
pSelDsY [idx] = pY [0] [-1] (8-184)
– Otherwise, the following applies:
– If x is greater than 0, the following applies:
– If bCTUboundary is equal to FALSE, the following applies:
– Otherwise (bCTUboundary is equal to TRUE) , the following applies:
– Otherwise:
– If availTL is equal to TRUE and bCTUboundary is equal to FALSE, the following applies:
– Otherwise if availTL is equal to TRUE and bCTUboundary is equal to TRUE, the following applies:
– Otherwise if availTL is equal to FALSE and bCTUboundary is equal to FALSE, the following applies:
pSelDsY [idx] = (pY [0] [-2] + pY [0] [-1] + 1) >> 1 (8-189)
– Otherwise (availTL is equal to FALSE and bCTUboundary is equal to TRUE) , the following applies:
pSelDsY [idx] = pY [0] [-1] (8-190)
6. When cntT+ cntL is not equal to 0, the variables minY, maxY, minC and maxC are derived as follows:
– When cntT+cntL is equal to 2, set pSelComp [3] equal to pSelComp [0] , pSelComp [2] equal to pSelComp [1] , pSelComp [0] equal to pSelComp [1] , and pSelComp [1] equal to pSelComp [3] , with Comp being replaced by DsY and C.
– The arrays minGrpIdx and maxGrpIdx are set as follows:
– minGrpIdx [0] = 0.
– minGrpIdx [1] = 2.
– maxGrpIdx [0] = 1.
– maxGrpIdx [1] = 3.
– When pSelDsY [minGrpIdx [0] ] is greater than pSelDsY [minGrpIdx [1] ] , minGrpIdx [0] and minGrpIdx [1] are swapped as (minGrpIdx [0] , minGrpIdx [1] ) = Swap (minGrpIdx [0] , minGrpIdx [1] ) .
– When pSelDsY [maxGrpIdx [0] ] is greater than pSelDsY [maxGrpIdx [1] ] , maxGrpIdx [0] and maxGrpIdx [1] are swapped as (maxGrpIdx [0] , maxGrpIdx [1] ) = Swap (maxGrpIdx [0] , maxGrpIdx [1] ) .
– When pSelDsY [minGrpIdx [0] ] is greater than pSelDsY [maxGrpIdx [1] ] , arrays minGrpIdx and maxGrpIdx are swapped as (minGrpIdx, maxGrpIdx) = Swap (minGrpIdx, maxGrpIdx) .
– When pSelDsY [minGrpIdx [1] ] is greater than pSelDsY [maxGrpIdx [0] ] , minGrpIdx [1] and maxGrpIdx [0] are swapped as (minGrpIdx [1] , maxGrpIdx [0] ) = Swap (minGrpIdx [1] , maxGrpIdx [0] ) .
– maxY = (pSelDsY [maxGrpIdx [0] ] + pSelDsY [maxGrpIdx [1] ] + 1) >> 1.
– maxC = (pSelC [maxGrpIdx [0] ] + pSelC [maxGrpIdx [1] ] + 1) >> 1.
– minY = (pSelDsY [minGrpIdx [0] ] + pSelDsY [minGrpIdx [1] ] + 1) >> 1.
– minC = (pSelC [minGrpIdx [0] ] + pSelC [minGrpIdx [1] ] + 1) >> 1.
7. The variables a, b, and k are derived as follows:
– If numSampL is equal to 0, and numSampT is equal to 0, the following applies:
k = 0 (8-208)
a = 0 (8-209)
b = 1 << (BitDepth_C -1) (8-210)
– Otherwise, the following applies:
diff = maxY -minY (8-211)
– If diff is not equal to 0, the following applies:
diffC = maxC -minC (8-212)
x = Floor (Log2 (diff) ) (8-213)
normDiff = ( (diff << 4) >> x) &15 (8-214)
x += (normDiff != 0) ? 1 : 0 (8-215)
y = Floor (Log2 (Abs (diffC) ) ) + 1 (8-216)
a = (diffC * (divSigTable [normDiff] | 8) + 2^{y-1}) >> y (8-217)
k = ( (3 + x -y) < 1) ? 1 : 3 + x -y (8-218)
a = ( (3 + x -y) < 1) ? Sign (a) *15 : a (8-219)
b = minC - ( (a *minY) >> k) (8-220)
where divSigTable [] is specified as follows:
divSigTable [] = {0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0} (8-221)
– Otherwise (diff is equal to 0) , the following applies:
k = 0 (8-222)
a = 0 (8-223)
b = minC (8-224)
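The division-free derivation in step 7 can be checked numerically with the following Python sketch. The function names are hypothetical. Two points are assumptions of this sketch rather than statements of the text: a diffC of 0 is handled by taking y = 0, and the rounding term 2^{y-1} is computed as (1 << y) >> 1 so that it stays an integer in that case.

```python
DIV_SIG_TABLE = [0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0]


def floor_log2(value):
    """Floor(Log2(value)) for a positive integer."""
    return value.bit_length() - 1


def sign(value):
    return (value > 0) - (value < 0)


def derive_cclm_params(min_y, max_y, min_c, max_c):
    """Return (a, b, k) from the averaged min/max luma and chroma values."""
    diff = max_y - min_y
    if diff == 0:
        return 0, min_c, 0
    diff_c = max_c - min_c
    x = floor_log2(diff)
    norm_diff = ((diff << 4) >> x) & 15
    x += 1 if norm_diff != 0 else 0
    # Assumption: y = 0 when diffC is 0, where Log2 would be undefined.
    y = floor_log2(abs(diff_c)) + 1 if diff_c != 0 else 0
    rounding = (1 << y) >> 1
    a = (diff_c * (DIV_SIG_TABLE[norm_diff] | 8) + rounding) >> y
    shift = 3 + x - y
    k = 1 if shift < 1 else shift
    if shift < 1:
        a = sign(a) * 15
    b = min_c - ((a * min_y) >> k)
    return a, b, k
```

With minY = 64, maxY = 192, minC = 100 and maxC = 60, the sketch yields a = -5, k = 4 and b = 120, so the slope a / 2^k = -0.3125 equals the exact ratio -40 / 128. Python's >> on negative integers rounds toward negative infinity, which is the arithmetic-shift behaviour assumed here.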
8. The prediction samples predSamples [x] [y] with x = 0.. nTbW -1, y = 0.. nTbH -1 are derived as follows:
predSamples [x] [y] = Clip1C ( ( (pDsY [x] [y] *a) >> k) + b) (8-225)
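Equation (8-225) can be sketched in Python as below. The names clip1c and predict_chroma, the nested-list layout of pDsY, and the assumption that Clip1C clamps to the range 0 to 2^{BitDepth_C} - 1 are illustrative choices, not taken from the text.

```python
def clip1c(value, bit_depth_c):
    """Clamp to the assumed chroma sample range [0, 2^bit_depth_c - 1]."""
    return max(0, min(value, (1 << bit_depth_c) - 1))


def predict_chroma(p_dsy, a, b, k, bit_depth_c):
    """Apply ((pDsY * a) >> k) + b to every down-sampled luma sample."""
    return [
        [clip1c(((sample * a) >> k) + b, bit_depth_c) for sample in line]
        for line in p_dsy
    ]
```

When a = 0 and k = 0, as in the fallback of step 7, every predicted sample equals b regardless of the luma values.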
JVET-L0283
This contribution proposes Multiple Reference Line Intra Prediction (MRLIP) in which the directional intra-prediction can be generated by using more than one reference line of neighbouring samples adjacent or non-adjacent to the current block.
1.6 Examples of intra prediction in VVC
1.6.1 Intra mode coding with 67 intra prediction modes
To capture the arbitrary edge directions presented in natural video, the number of directional intra modes is extended from 33, as used in HEVC, to 65. The additional directional modes are depicted as dotted arrows in FIG. 9, and the planar and DC modes remain the same. These denser directional intra prediction modes apply for all block sizes and for both luma and chroma intra predictions.
Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction as shown in FIG. 9. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for the non-square blocks. The replaced modes are signaled using the original method and remapped to the indexes of wide angular modes after parsing. The total number of intra prediction modes is unchanged (e.g., 67) , and the intra mode coding is unchanged.
In HEVC, every intra-coded block has a square shape and the length of each of its sides is a power of 2. Thus, no division operations are required to generate an intra-predictor using DC mode. In VTM2, blocks can have a rectangular shape that necessitates the use of a division operation per block in the general case. To avoid division operations for DC prediction, only the longer side is used to compute the average for non-square blocks.
1.6.2 Examples of intra mode coding
In some embodiments, and to keep the complexity of the MPM list generation low, an intra mode coding method with 3 Most Probable Modes (MPMs) is used. The following three aspects are considered in constructing the MPM list:
-- Neighbor intra modes;
-- Derived intra modes; and
-- Default intra modes.
For neighbor intra modes (A and B) , two neighbouring blocks, located to the left and above, are considered. An initial MPM list is formed by performing a pruning process on the two neighboring intra modes. If the two neighboring modes are different from each other, one of the default modes (e.g., PLANAR (0) , DC (1) , ANGULAR50 (50) ) is added to the MPM list after the pruning check with the existing two MPMs. When the two neighboring modes are the same, either the default modes or the derived modes are added to the MPM list after the pruning check. The detailed generation process of the three-entry MPM list is as follows:
If the two neighboring candidate modes are the same (i.e., A == B) ,
If A is less than 2, candModeList [3] = {0, 1, 50} .
Otherwise, candModeList [3] = {A, 2 + ( (A + 61) %64) , 2 + ( (A-1) %64) }
Otherwise,
If neither of A and B is equal to 0, candModeList [3] = {A, B, 0} .
Otherwise, if neither of A and B is equal to 1, candModeList [3] = {A, B, 1} .
Otherwise, candModeList [3] = {A, B, 50} .
An additional pruning process is used to remove duplicated modes so that only unique modes can be included into the MPM list. For entropy coding of the 64 non-MPM modes, a 6-bit Fixed Length Code (FLC) is used.
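The three-entry MPM rule above can be written as a small Python sketch. The function name build_mpm_list and the constant names are hypothetical; the branch conditions and the modular arithmetic are taken from the list above.

```python
PLANAR, DC, ANGULAR50 = 0, 1, 50


def build_mpm_list(mode_a, mode_b):
    """Return the three-entry MPM list for neighbouring modes A and B."""
    if mode_a == mode_b:
        if mode_a < 2:
            # Both neighbours are non-angular: use the default modes.
            return [PLANAR, DC, ANGULAR50]
        # Angular mode: add its two angular neighbours, wrapping within 2..65.
        return [mode_a, 2 + ((mode_a + 61) % 64), 2 + ((mode_a - 1) % 64)]
    if mode_a != PLANAR and mode_b != PLANAR:
        return [mode_a, mode_b, PLANAR]
    if mode_a != DC and mode_b != DC:
        return [mode_a, mode_b, DC]
    return [mode_a, mode_b, ANGULAR50]
```

For A == B == 34 the derived modes are 33 and 35, the two angular neighbours of mode 34; for A == B == 2 the first derived mode wraps around to 65.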
1.6.3 Wide-angle intra prediction for non-square blocks
In some embodiments, conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction. In VTM2, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signaled using the original method and remapped to the indexes of wide angular modes after parsing. The total number of intra prediction modes for a certain block is unchanged, e.g., 67, and the intra mode coding is unchanged.
To support these prediction directions, the top reference with length 2W+1, and the left reference with length 2H+1, are defined as shown in the examples in FIGS. 10A and 10B.
In some embodiments, the mode number of replaced mode in wide-angular direction mode is dependent on the aspect ratio of a block. The replaced intra prediction modes are illustrated in Table 1.
Table 1: Intra prediction modes replaced by wide-angle modes
Condition | Replaced intra prediction modes |
W /H == 2 | Modes 2, 3, 4, 5, 6, 7 |
W /H > 2 | Modes 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 |
W /H == 1 | None |
H /W == 1/2 | Modes 61, 62, 63, 64, 65, 66 |
H /W < 1/2 | Modes 57, 58, 59, 60, 61, 62, 63, 64, 65, 66 |
As shown in FIG. 11, two vertically-adjacent predicted samples may use two non-adjacent reference samples in the case of wide-angle intra prediction. Hence, a low-pass reference sample filter and side smoothing are applied to the wide-angle prediction to reduce the negative effect of the increased gap Δp_α.
1.6.4 Examples of position dependent intra prediction combination (PDPC)
In the VTM2, the results of intra prediction of planar mode are further modified by a position dependent intra prediction combination (PDPC) method. PDPC is an intra prediction method which invokes a combination of the un-filtered boundary reference samples and HEVC style intra prediction with filtered boundary reference samples. PDPC is applied to the following intra modes without signaling: planar, DC, horizontal, vertical, bottom-left angular mode and its eight adjacent angular modes, and top-right angular mode and its eight adjacent angular modes.
The prediction sample pred (x, y) is predicted using an intra prediction mode (DC, planar, angular) and a linear combination of reference samples according to the Equation as follows:
pred (x, y) = (wL × R_{-1, y} + wT × R_{x, -1} – wTL × R_{-1, -1} + (64 – wL – wT + wTL) × pred (x, y) + 32) >> shift
Herein, R_{x, -1} and R_{-1, y} represent the reference samples located at the top and left of the current sample (x, y) , respectively, and R_{-1, -1} represents the reference sample located at the top-left corner of the current block.
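The PDPC combination can be illustrated with a few lines of Python. The function and parameter names are hypothetical, and the default shift of 6 is an assumption that follows from the weights summing to 64 and the rounding offset being 32.

```python
def pdpc_sample(pred, r_left, r_top, r_top_left, w_l, w_t, w_tl, shift=6):
    """Blend an intra-predicted sample with its unfiltered reference samples.

    r_left, r_top and r_top_left stand for R_{-1, y}, R_{x, -1} and R_{-1, -1}.
    """
    weighted = (
        w_l * r_left
        + w_t * r_top
        - w_tl * r_top_left
        + (64 - w_l - w_t + w_tl) * pred
        + 32  # rounding offset
    )
    return weighted >> shift
```

With all weights equal to 0 the sample is returned unchanged, and larger wL or wT pulls the result toward the left or top reference sample.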
In some embodiments, and if PDPC is applied to DC, planar, horizontal, and vertical intra modes, additional boundary filters are not needed, as required in the case of HEVC DC mode boundary filter or horizontal/vertical mode edge filters.
FIGS. 12A-12D illustrate the definition of reference samples (R_{x, -1}, R_{-1, y} and R_{-1, -1}) for PDPC applied over various prediction modes. The prediction sample pred (x’, y’) is located at (x’, y’) within the prediction block. The coordinate x of the reference sample R_{x, -1} is given by: x = x’ + y’ + 1, and the coordinate y of the reference sample R_{-1, y} is similarly given by: y = x’ + y’ + 1.
In some embodiments, the PDPC weights are dependent on prediction modes and are shown in Table 2, where S = shift.
Table 2: Examples of PDPC weights according to prediction modes
Prediction modes | wT | wL | wTL |
Diagonal top-right | 16 >> ( (y’<<1) >> S) | 16 >> ( (x’<<1) >> S) | 0 |
Diagonal bottom-left | 16 >> ( (y’<<1) >> S) | 16 >> ( (x’<<1) >> S) | 0 |
Adjacent diag. top-right | 32 >> ( (y’<<1) >> S) | 0 | 0 |
Adjacent diag. bottom-left | 0 | 32 >> ( (x’<<1) >> S) | 0 |
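Table 2 can be read as a lookup keyed by the prediction-mode class, sketched below in Python. The string labels for the mode classes and the function name pdpc_weights are hypothetical; the shift expressions are copied from the table, with s standing for S.

```python
def pdpc_weights(mode_class, x, y, s):
    """Return (wT, wL, wTL) for the sample at (x', y') = (x, y)."""
    if mode_class in ("diagonal_top_right", "diagonal_bottom_left"):
        return 16 >> ((y << 1) >> s), 16 >> ((x << 1) >> s), 0
    if mode_class == "adjacent_top_right":
        return 32 >> ((y << 1) >> s), 0, 0
    if mode_class == "adjacent_bottom_left":
        return 0, 32 >> ((x << 1) >> s), 0
    raise ValueError("unknown mode class: " + mode_class)
```

The weights decay with distance from the corresponding block edge, so samples far from the top row or left column keep most of the original prediction.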
2 Examples of drawbacks in existing implementations solved by described techniques
The current CCLM implementations in either JEM or VTM exhibit at least the following issues:
● The current CCLM design of JEM requires more neighboring luma samples than are used in normal intra-prediction. CCLM requires two above neighboring rows of luma samples and three left neighboring columns of luma samples. MM-CCLM requires four above neighboring rows of luma samples and four left neighboring columns of luma samples. This is undesirable in hardware design.
● Other related methods use only one line of neighbouring luma samples, but they bring some coding performance loss.
● The neighboring chroma samples are only used to derive LM parameters. When generating the prediction block of a chroma block, only luma samples and derived LM parameters are utilized. Therefore, the spatial correlation between the current chroma block and its neighboring chroma blocks is not utilized.
In VTM-5.0, CCLM (including LM, LM-L, LM-T modes) may find only two available neighbouring chroma samples (and their corresponding luma samples which may be down-sampled) . That is a special case in the 4-point max-min CCLM parameter derivation process, which is not desirable.
3 Exemplary methods for cross-component prediction in video coding
Embodiments of the presently disclosed technology overcome drawbacks of existing implementations, thereby providing video coding with higher coding efficiencies but lower computational complexity. Cross-component prediction based on the disclosed technology, which may enhance both existing and future video coding standards, is elucidated in the following examples described for various implementations. The examples of the disclosed technology provided below explain general concepts, and are not meant to be interpreted as limiting. In an example, unless explicitly indicated to the contrary, the various features described in these examples may be combined.
In the following examples and methods, the term “LM method” includes, but is not limited to, the LM mode in JEM or VTM, and MMLM mode in JEM, left-LM mode which only uses left neighbouring samples to derive the linear model, the above-LM mode which only uses above neighbouring samples to derive the linear model or other kinds of methods which utilize luma reconstruction samples to derive chroma prediction blocks.
Example 1. In one example, how luma samples are down-sampled depends on whether the luma samples are inside the current block or outside the current block.
(a) The down-sampled luma samples may be used to derive LM parameters. Here, the luma block is the corresponding luma block of one chroma block.
(b) The down-sampled luma samples may be used to derive other kinds of chroma prediction blocks. Here, the luma block is the corresponding luma block of one chroma block.
(c) For example, luma samples inside the current block are down-sampled with the same method as that in JEM, but luma samples outside the current block are down-sampled with a different method.
Example 2. In one example, how outside luma samples are down-sampled depends on their positions.
(a) In one example, the down-sampled luma samples may be used to derive prediction blocks. Here, the outside luma samples could be neighboring luma samples or non-adjacent luma samples relative to the current luma block to be coded.
(b) The down-sampled luma samples may be used to derive LM parameters. Here, outside luma samples are those which are not located in the corresponding luma block of the current chroma block.
(c) In one example, luma samples left to the current block or above to the current block are down-sampled in different ways.
(d) In one example, luma samples are down-sampled as specified below as shown in FIGS. 13A and 13B.
(i) Luma samples inside the current block are down-sampled in the same way as in JEM.
(ii) Luma samples outside the current block and above to the current block are down-sampled to position C or D in FIG. 4. Alternatively, Luma samples are down-sampled to position C in FIG. 4 with a filter. Suppose the above luma samples adjacent to the current block are denoted as a [i] , then d [i] = (a [2i-1] +2*a [2i] +a [2i+1] +2) >>2, where d [i] represents the down-sampled luma samples.
(1) If the sample a [2i-1] is unavailable, d [i] = (3*a [2i] +a [2i+1] +2) >>2
(2) If the sample a [2i+1] is unavailable, d [i] = (a [2i-1] +3*a [2i] +2) >>2
(iii) Luma samples outside the current block and left to the current block are down-sampled to position B or D in FIG. 4. Alternatively, Luma samples are down-sampled to the half position between B and D. Suppose the luma samples left adjacent to the current block are denoted as a [j] , then d [j] = (a [2j] +a [2j+1] +1) >>1, where d [j] represents the down-sampled luma samples.
(iv) In one example, W luma down-sampled samples are generated, where W is the width of the current chroma block as shown in FIG. 13A.
(1) Alternatively, N×W luma down-sampled samples are generated from the above adjacent luma samples, where N is an integer such as 2, as shown in FIG. 13B.
(2) Alternatively, W + K luma down-sampled samples are generated from the above adjacent luma samples, wherein K is a positive integer.
(3) Alternatively, W/N luma down-sampled samples are generated from the above adjacent luma samples, where N is an integer such as 2.
(v) In one example, H luma down-sampled samples are generated from the left adjacent luma samples, where H is the height of the current chroma block as shown in FIG. 13A.
(1) Alternatively, N×H luma down-sampled samples are generated, where N is an integer such as 2, as shown in FIG. 13B.
(2) Alternatively, H + K luma down-sampled samples are generated, wherein K is a positive integer.
(3) Alternatively, H/N luma down-sampled samples are generated, where N is an integer such as 2.
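The single-line filters of Example 2 (d) can be sketched in Python as follows. The function names are hypothetical, and treating any index outside the supplied list as an unavailable sample is an assumption of this sketch, as is the plain copy used when both neighbours are missing.

```python
def downsample_above(a, count):
    """[1, 2, 1] filter on the above-adjacent luma row a, producing count samples."""
    out = []
    for i in range(count):
        centre = a[2 * i]
        has_prev = 2 * i - 1 >= 0
        has_next = 2 * i + 1 < len(a)
        if has_prev and has_next:
            d = (a[2 * i - 1] + 2 * centre + a[2 * i + 1] + 2) >> 2
        elif has_next:
            # a[2i-1] unavailable
            d = (3 * centre + a[2 * i + 1] + 2) >> 2
        elif has_prev:
            # a[2i+1] unavailable
            d = (a[2 * i - 1] + 3 * centre + 2) >> 2
        else:
            # Assumption: copy the centre sample when no neighbour exists.
            d = centre
        out.append(d)
    return out


def downsample_left(a, count):
    """Two-tap average on the left-adjacent luma column a."""
    return [(a[2 * j] + a[2 * j + 1] + 1) >> 1 for j in range(count)]
```

Both filters read only the one row or column adjacent to the current block, which is the property that motivates them relative to the multi-line JEM filter.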
Example 3. In one example, how samples are selected and how many samples are down-sampled for outside luma/inside luma samples may depend on the block size/block shape.
Example 4. In one example, the prediction block generated from LM methods may be further refined before being used as the predictor of chroma block.
(a) In one example, the reconstructed chroma samples (neighboring or non-adjacent chroma samples) may be further utilized together with the prediction block from LM methods.
(i) In one example, a linear function may be applied with the neighboring reconstructed chroma samples and the prediction block from LM methods as input and refined prediction samples as output.
(ii) In one example, for certain positions, the prediction block from LM methods may be refined and for the remaining positions, the prediction block from LM methods may be directly inherited without being refined.
(b) In one example, two prediction blocks may be generated (e.g., one from LM methods and the other one from chroma intra prediction block) . However, the final prediction block may be generated from the two prediction blocks for certain positions and directly copied from the prediction block from LM for the remaining positions.
(i) In one example, the chroma intra prediction mode may be signaled or derived from luma intra prediction modes.
(ii) In one example, ‘the certain positions’ where two prediction blocks may be jointly used includes the top several rows and/or left several columns.
(c) For example, boundary filtering can be applied to LM mode, MMLM mode, left-LM mode or above-LM mode, no matter what kind of down-sampling filter is applied.
(d) In one example, a function of prediction block from LM methods and reconstructed chroma samples above current block may be utilized together to refine the prediction block from LM methods.
(i) Suppose the reconstructed chroma samples above adjacent to the current block are denoted as a [-1] [j] , the LM predicted sample at the ith row and jth column is a [i] [j] , then the prediction sample after boundary filtering is calculated as a function of a [-1] [j] and a [i] [j] . In one example, the final prediction for the (i, j) position is defined as a’ [i] [j] = (w1*a [i] [j] +w2*a [-1] [j] + 2^{N-1}) >>N, where w1+w2=2^N.
(ii) The boundary filtering can only be applied when above neighboring samples are available.
(iii) In one example, the boundary filtering is only applied if i<=K. K is an integer such as 0 or 1. For example, K = 0, w1=w2=1. In another example, K=0, w1=3, w2=1.
(iv) In one example, w1 and w2 depend on the row index (i) . For example, K=1, w1=w2=1 for samples a [0] [j] , but w1=3 and w2=1 for samples a [1] [j] .
(e) In one example, a function of prediction block from LM methods and reconstructed chroma samples left to the current block may be utilized together to refine the prediction block from LM methods.
(i) Suppose the reconstructed chroma samples left adjacent to the current block are denoted as a [i] [-1] , the LM predicted sample at the ith row and jth column is a [i] [j] , then the prediction sample after boundary filtering is calculated as a function of a [i] [-1] and a [i] [j] . In one example, the final prediction for the (i, j) position is defined as a’ [i] [j] = (w1*a [i] [j] +w2*a [i] [-1] + 2^{N-1}) >>N, where w1+w2=2^N.
(ii) In one example, the boundary filtering can only be applied when left neighboring samples are available.
(iii) In one example, the boundary filtering is only applied if j<=K. K is an integer such as 0 or 1. For example, K = 0, w1=w2=1. In another example, K=0, w1=3, w2=1.
(iv) In one example, w1 and w2 depend on the column index (j) . For example, K=1, w1=w2=1 for samples a [i] [0] , but w1=3 and w2=1 for samples a [i] [1] .
(f) In one example, a function of prediction block from LM methods and reconstructed chroma samples left to and above the current block may be utilized together to refine the prediction block from LM methods.
(i) Suppose the reconstructed chroma samples above adjacent to the current block are denoted as a [-1] [j] , the reconstructed chroma samples left adjacent to the current block are denoted as a [i] [-1] , and the LM predicted sample at the ith row and jth column is a [i] [j] , then the prediction sample after boundary filtering is calculated as a’ [i] [j] = (w1* a [i] [j] +w2*a [i] [-1] +w3*a [-1] [j] + 2^{N-1}) >>N, where w1+w2+w3=2^N.
(ii) In one example, the boundary filtering can only be applied when both left and above neighboring samples are available.
(iii) In one example, this boundary filtering is only applied when i<=K and j<=P. In another example, it is only applied when i<=K or j<=P.
(iv) In one example, this boundary filtering is only applied to a [0] [0] with w1=2, w2=w3=1.
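The boundary filtering with above neighbours in Example 4 (d) can be sketched in Python as below. The function name filter_top_rows and the row-major list layout pred[i][j] are assumptions for illustration; the weighting and the i <= K restriction follow the text.

```python
def filter_top_rows(pred, above, w1, w2, n, k):
    """Blend rows 0..k of an LM prediction block with the above chroma row.

    pred[i][j] is the LM-predicted sample at row i, column j; above[j] is the
    reconstructed chroma sample a[-1][j].
    """
    if w1 + w2 != 1 << n:
        raise ValueError("w1 + w2 must equal 2^N")
    rounding = 1 << (n - 1)
    out = [row[:] for row in pred]
    for i in range(min(k + 1, len(pred))):
        for j in range(len(pred[i])):
            out[i][j] = (w1 * pred[i][j] + w2 * above[j] + rounding) >> n
    return out
```

With K = 0, w1 = 3, w2 = 1 and N = 2, only the first row is modified and each of its samples moves a quarter of the way toward the reconstructed sample above it. The left-neighbour and combined variants in (e) and (f) follow the same pattern with a [i] [-1] as an additional or alternative input.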
Example 5. In one example, whether and how LM methods are applied may depend on the size or shape of the current block. Assume W and H represent the width and height of the current chroma block, respectively, and T1 and T2 are thresholds.
(a) In one example, LM mode (or MMLM mode, or Left-LM mode, or Above-LM mode) is not applicable when W <=T1 and H<=T2. For example, T1=T2=4.
(b) In one example, LM mode (or MMLM mode, or Left-LM mode, or Above-LM mode) is not applicable when W <=T1 or H<=T2. For example, T1=T2=2.
(c) In one example, LM mode (or MMLM mode, or Left-LM mode, or Above-LM mode) is not applicable when W <=T1 or H<=T2. For example, T1=T2=4.
(d) In one example, LM mode (or MMLM mode, or Left-LM mode, or Above-LM mode) is not applicable when W + H<=T1. For example, T1=6.
(e) In one example, LM mode (or MMLM mode, or Left-LM mode, or Above-LM mode) is not applicable when W × H<=T1. For example, T1=16.
(f) In one example, Left-LM mode is not applicable when H<=T1. For example, T1= 4.
(g) In one example, Above-LM mode is not applicable when W<=T1. For example, T1= 4.
(h) T1 and/or T2 may be pre-defined or signaled in an SPS, a sequence header, a PPS, a picture header, a VPS, a slice header, a CTU, a CU or a group of CTUs.
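The alternative size rules of Example 5 can be compared side by side with a small Python sketch. The function name lm_allowed and the rule labels "and", "or", "sum" and "area" are hypothetical names for items (a), (b)/(c), (d) and (e); the thresholds are passed in by the caller.

```python
def lm_allowed(w, h, rule, t1, t2=None):
    """Return True when the LM mode is applicable under the chosen size rule."""
    if rule == "and":
        return not (w <= t1 and h <= t2)
    if rule == "or":
        return not (w <= t1 or h <= t2)
    if rule == "sum":
        return not (w + h <= t1)
    if rule == "area":
        return not (w * h <= t1)
    raise ValueError("unknown rule: " + rule)
```

For example, with T1 = T2 = 4 a 4×4 chroma block is excluded under the "and" rule while an 8×4 block remains eligible.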
Example 6. In one example, whether and how the proposed single-line LM methods are applied may depend on the position of the current block.
(a) In one example, one or more of the proposed methods is applied on CUs located at the top boundary of the current CTU as shown in FIG. 8A.
(b) In one example, one or more of the proposed methods is applied on CUs located at the left boundary of the current CTU as shown in FIG. 8B.
(c) In one example, one or more of the proposed methods is applied in both the above cases.
(d) In one example, one or more of the proposed methods is applied on CUs located at the top boundary of a region such as a 64×64 block.
(e) In one example, one or more of the proposed methods is applied on CUs located at the left boundary of a region such as a 64×64 block.
The examples described above may be incorporated in the context of the methods described below, e.g., method 1400, which may be implemented at a video encoder and/or decoder.
Example 7: The above LM neighbouring samples are down-sampled by one filtering method (e.g., the one defined in bullet 2. d. ii) , and those luma samples which may be used for the normal intra prediction process (e.g., 2W samples (above and above-right of the current block) in VVC) are utilized, while the left LM neighbouring samples are down-sampled by a different filtering method (e.g., the one defined in JEM or VTM-2.0) .
a. FIG. 14 shows an example of how to combine different down-sampling methods.
Example 8: In an alternative example to bullet 2. d. ii. 1, luma samples are down-sampled to position C in FIG. 4 with a filter. Suppose the above luma samples adjacent to the current block are denoted as a [i] , then d [i] = (a [2i-1] +2*a [2i] +a [2i+1] +offset0) >>2, if i > 0; otherwise (if i == 0) , d [i] = (3*a [2i] +a [2i+1] + offset1) >>2.
a. Alternatively, if i == 0, d [i] = a [2i] .
b. Alternatively, if i == 0, d [i] = (a [2i] +a [2i+1] +offset2) >>1.
c. In one example, offset0 = offset1 = 2; offset2 = 1.
Example 9: In some embodiments, the number of down-sampling luma samples above or left to the current block may depend on the dimension of the current block. Suppose the width and height of the current chroma block are denoted as W and H:
a. For example, W luma samples above the current block are down-sampled and H luma samples left to the current block are down-sampled if W == H;
b. For example, 2*W luma samples above the current block are down-sampled and 2*H luma samples left to the current block are down-sampled if W == H;
c. For example, 2*W luma samples above the current block are down-sampled and H luma samples left to the current block are down-sampled if W < H;
d. For example, 2*W luma samples above the current block are down-sampled and H luma samples left to the current block are down-sampled if W <= H;
e. For example, 2*W luma samples above the current block are down-sampled and H luma samples left to the current block are down-sampled if W > H;
f. For example, 2*W luma samples above the current block are down-sampled and H luma samples left to the current block are down-sampled if W >= H;
g. For example, W luma samples above the current block are down-sampled and 2*H luma samples left to the current block are down-sampled if W < H;
h. For example, W luma samples above the current block are down-sampled and 2*H luma samples left to the current block are down-sampled if W <= H;
i. For example, W luma samples above the current block are down-sampled and 2*H luma samples left to the current block are down-sampled if W > H;
j. For example, W luma samples above the current block are down-sampled and 2*H luma samples left to the current block are down-sampled if W >= H;
k. For example, 2*W luma samples above the current block are down-sampled only if both the above neighbouring blocks and right-above neighbouring blocks are all available. Otherwise, W luma samples above the current block are down-sampled.
l. For example, 2*H luma samples left to the current block are down-sampled only if both the left neighbouring blocks and left-bottom neighbouring blocks are all available. Otherwise, H luma samples left to the current block are down-sampled.
m. Suppose there are W’ (W’ is equal to W or 2*W) down-sampled luma samples above the current block and H’ (H’ is equal to H or 2*H) down-sampled luma samples left to the current block. If W’ is not equal to H’ , then the down-sampled luma sample set with more samples is decimated to match the number of samples in the down-sampled luma sample set with fewer samples, as defined in JEM or VTM-2.0.
Example 10: In some embodiments, the decimation process when W is not equal to H is different from that in JEM or VTM-2.0.
a. For example, if W > H, the H leftmost above neighbouring samples and H left neighbouring samples are involved in the training process.
b. For example, if W > H, the H rightmost above neighbouring samples and H left neighbouring samples are involved in the training process.
c. For example, if W > H, and W=H*n, then W above neighbouring samples and H left neighbouring samples are involved in the training process, and each left neighbouring sample appears n times in the training process.
d. For example, if W < H, the W topmost left neighbouring samples and W above neighbouring samples are involved in the training process.
e. For example, if W < H, the W bottommost left neighbouring samples and W above neighbouring samples are involved in the training process.
f. For example, if W < H, and H=W*n, then W above neighbouring samples and H left neighbouring samples are involved in the training process, and each above neighbouring sample appears n times in the training process.
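The alternative decimation rules of Example 10 can be sketched in Python as follows. The function name and the variant labels "first", "last" and "repeat" are hypothetical; "first" covers items a and d, "last" covers items b and e, and "repeat" covers items c and f. The order in which repeated samples appear is an assumption, since the text only states how many times each sample is used.

```python
def select_training_samples(above, left, variant):
    """Return (above_used, left_used) lists of equal length for LM training."""
    w, h = len(above), len(left)
    if w == h:
        return list(above), list(left)
    longer, shorter = (above, left) if w > h else (left, above)
    m = len(shorter)
    if variant == "first":
        # Leftmost above samples (W > H) or topmost left samples (W < H).
        longer_sel, shorter_sel = list(longer[:m]), list(shorter)
    elif variant == "last":
        # Rightmost above samples (W > H) or bottommost left samples (W < H).
        longer_sel, shorter_sel = list(longer[len(longer) - m:]), list(shorter)
    elif variant == "repeat":
        if len(longer) % m != 0:
            raise ValueError("longer side must be a multiple of the shorter side")
        n = len(longer) // m
        longer_sel = list(longer)
        shorter_sel = [s for s in shorter for _ in range(n)]
    else:
        raise ValueError("unknown variant: " + variant)
    return (longer_sel, shorter_sel) if w > h else (shorter_sel, longer_sel)
```

In every variant both returned lists have the same length, so the above and left neighbours contribute equally to the training process.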
Example 11: Multiple lines of luma samples with at least one being not-adjacent to the current block are used to derive the parameters used in the CCLM mode or some variants of CCLM modes like MM-CCLM or MDLM.
a. Luma samples which are not used in directional intra-prediction and Multiple Reference Line Intra-Prediction (MRLIP) cannot be used to derive the parameters used in the CCLM mode or some variants of CCLM modes like MM-CCLM or MDLM.
b. In one example, a line of luma samples above adjacent to the current block and a line which is two-lines above the current block are used to derive the parameters used in the CCLM mode or some variants of CCLM modes like MM-CCLM or MDLM.
c. In one example, a line of luma samples left adjacent to the current block and a line which is two-lines left to the current block are used to derive the parameters used in the CCLM mode or some variants of CCLM modes like MM-CCLM or MDLM.
d. FIGS. 15A-15H show several examples of the sets of neighbouring luma samples (shaded samples outside the current block) used to derive the parameters used in the CCLM mode or some variants of CCLM modes like MM-CCLM or MDLM.
Example 12: The down-sampled luma samples left to the current block can be calculated as
Rec′_L [x, y] = (w1·Rec_L [2x+1, 2y] + w2·Rec_L [2x+1, 2y+1] + w3·Rec_L [2x-1, 2y] + w4·Rec_L [2x-1, 2y+1] + 2^{N-1}) >> N
where w1+w2+w3+w4 = 2^N. In one example, w1=w2=w3=w4=1, N=2. FIGS. 16A-16B demonstrate the example.
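The four-tap filter of Example 12 reads two luma columns that are two samples apart, skipping the column in between. A minimal Python sketch follows; the function name and the dictionary keyed by (column, row) luma coordinates are assumptions made for illustration.

```python
def downsample_left_two_columns(rec, x, y, weights=(1, 1, 1, 1), n=2):
    """Compute Rec'_L[x, y] from luma columns 2x+1 and 2x-1, rows 2y and 2y+1.

    rec maps (column, row) luma coordinates to reconstructed sample values;
    columns left of the current block have negative indices.
    """
    w1, w2, w3, w4 = weights
    if w1 + w2 + w3 + w4 != 1 << n:
        raise ValueError("weights must sum to 2^N")
    total = (
        w1 * rec[(2 * x + 1, 2 * y)]
        + w2 * rec[(2 * x + 1, 2 * y + 1)]
        + w3 * rec[(2 * x - 1, 2 * y)]
        + w4 * rec[(2 * x - 1, 2 * y + 1)]
        + (1 << (n - 1))
    )
    return total >> n
```

For x = -1 the filter uses luma columns -1 and -3, that is, the column adjacent to the block and the column two positions further left.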
Example 13: Whether and/or how a CCLM mode (such as LM, LM-A, LM-T) is applied for a block may depend on the number of available neighbouring samples and/or dimensions of the block.
a. In one example, the block may refer to a chroma coding block.
b. In one example, if one or multiple specific CCLM modes (such as LM, LM-A, LM-T) is not applicable for a block (e.g., according to the number of available neighbouring samples and/or dimensions) , the syntax element (s) (such as a flag or a mode representation) to indicate the specific CCLM mode (s) may not be signaled, and the specific CCLM mode (s) are inferred to be not applied.
c. In one example, if one or multiple specific CCLM modes (such as LM, LM-A, LM-T) are not applicable for a block (e.g., according to the number of available neighbouring samples and/or dimensions) , the syntax element (s) (such as a flag or a mode representation) to indicate the CCLM mode (s) may be signaled and should indicate that the CCLM mode (s) are not applied in a conforming bitstream.
d. In one example, if one or multiple specific CCLM modes (such as LM, LM-A, LM-T) is not applicable for a block (e.g., according to the number of available neighbouring samples and/or dimensions) , the syntax element (s) (such as a flag or a mode representation) to indicate the CCLM mode (s) may be signaled but the signaling may be ignored by the decoder and the specific CCLM mode (s) are inferred to be not applied.
e. In one example, the neighbouring samples may refer to chroma neighbouring samples.
i. Alternatively, the neighbouring samples may refer to corresponding luma neighbouring samples, which may be down-sampled (e.g., according to the color format) .
f. In one example, if one or multiple specific CCLM modes (such as LM, LM-A, LM-T) is not applicable for a block (e.g., according to the number of available neighbouring samples and/or dimensions) but a specific CCLM mode is signaled, the parameter derivation process for CCLM is done in a specific way.
i. In one example, the parameter a is set equal to 0 and the parameter b is set equal to a fixed number, such as 1<< (BitDepth -1) .
g. In one example, LM mode and/or LM-A mode and/or LM-T mode are not applicable when the number of available neighbouring samples is less than T, where T is an integer such as 4.
h. In one example, LM mode is not applicable if the width of the block is equal to 2 and the left neighbouring block is unavailable.
i. In one example, LM mode is not applicable if the height of the block is equal to 2 and the above neighbouring block is unavailable.
j. In one example, LM-T mode is not applicable if the width of the block is equal to 2.
i. Alternatively, LM-T mode is not applicable if the width of the block is equal to 2 and the right-above neighbouring block is unavailable.
k. In one example, LM-L mode is not applicable if the height of the block is equal to 2.
i. Alternatively, LM-L mode is not applicable if the height of the block is equal to 2 and the left-below neighbouring block is unavailable.
l. In one example, LM-L mode is not applicable if LM mode is not applicable.
m. In one example, LM-T mode is not applicable if LM mode is not applicable.
n. In one example, the ‘available neighbouring samples’ may be those from existing above and/or left samples according to the selected reference lines.
o. In one example, the ‘available neighbouring samples’ may be those from selected positions according to the selected reference lines and the rule of CCLM parameter derivation (e.g., pSelComp [] ) .
p. The above methods may be also applicable to the local illumination compensation (LIC) process wherein according to the number of available neighboring samples, LIC may be disabled.
i. Alternatively, according to the number of available neighboring samples, LIC may be enabled but with specific linear model parameters (e.g., a = 1, b = 0) regardless of the values of the neighboring samples.
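As a non-limiting illustration, the applicability rules of items g to k and the fallback parameters of item f may be sketched as follows. The function names, the string labels for the modes, and the choice to combine items g, h, i, j and k in a single check are assumptions made only for this sketch; the items above are described as examples that may also be used separately.

```python
def cclm_mode_applicable(mode, width, height, left_available,
                         above_available, num_available, t=4):
    """Return True if the given CCLM mode may be applied to a chroma block.

    Hypothetical helper combining items g, h, i, j and k of Example 13.
    mode is one of "LM", "LM-T", "LM-L" (labels chosen for this sketch).
    """
    # Item g: fewer than T available neighbouring samples disables the mode.
    if num_available < t:
        return False
    if mode == "LM":
        # Item h: width 2 with the left neighbouring block unavailable.
        if width == 2 and not left_available:
            return False
        # Item i: height 2 with the above neighbouring block unavailable.
        if height == 2 and not above_available:
            return False
    elif mode == "LM-T" and width == 2:    # item j
        return False
    elif mode == "LM-L" and height == 2:   # item k
        return False
    return True


def fallback_parameters(bit_depth):
    """Item f.i: a = 0 and b = 1 << (BitDepth - 1), returned as (a, b)."""
    return 0, 1 << (bit_depth - 1)
```

With a = 0, every predicted chroma sample equals b, i.e., the mid-level value for the given bit depth, regardless of the luma samples.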
FIG. 14 shows a flowchart of an exemplary method for cross-component prediction. The method 1400 includes, at step 1410, receiving a bitstream representation of a current block of video data comprising a luma component and a chroma component.
4 Additional embodiments for cross-component prediction
Embodiment 1
In some embodiments, luma samples inside the current block are down-sampled in the same way as in JEM.
In some embodiments, luma samples outside the current block and above the current block are down-sampled to position C in FIG. 4 with a filter. Suppose the luma samples above and adjacent to the current block are denoted as a[i]; then d[i] = (a[2i-1] + 2*a[2i] + a[2i+1] + 2) >> 2, where d[i] represents the down-sampled luma samples. If the sample a[2i-1] is unavailable, d[i] = (3*a[2i] + a[2i+1] + 2) >> 2.
In some embodiments, luma samples outside the current block and to the left of the current block are down-sampled to the half position between B and D, as shown in FIG. 4. Suppose the luma samples to the left of and adjacent to the current block are denoted as a[j]; then d[j] = (a[2j] + a[2j+1] + 1) >> 1, where d[j] represents the down-sampled luma samples.
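As a non-limiting illustration, the two filters described above may be sketched as follows. The function names and the use of None to mark an unavailable sample a[-1] are assumptions made only for this sketch.

```python
def downsample_above(a, count, left_sample=None):
    """d[i] = (a[2i-1] + 2*a[2i] + a[2i+1] + 2) >> 2 for i = 0..count-1.

    a holds the luma samples above the block, a[0] being the leftmost.
    left_sample stands for a[-1]; None marks it as unavailable.
    """
    d = []
    for i in range(count):
        prev = a[2 * i - 1] if i > 0 else left_sample
        if prev is None:
            # a[2i-1] is unavailable: its weight is moved onto a[2i].
            d.append((3 * a[2 * i] + a[2 * i + 1] + 2) >> 2)
        else:
            d.append((prev + 2 * a[2 * i] + a[2 * i + 1] + 2) >> 2)
    return d


def downsample_left(a, count):
    """d[j] = (a[2j] + a[2j+1] + 1) >> 1 for j = 0..count-1."""
    return [(a[2 * j] + a[2 * j + 1] + 1) >> 1 for j in range(count)]
```

In both filters the weights sum to a power of two, so the added rounding offset and right shift implement a rounded average.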
In some embodiments, W luma down-sampled samples from the above adjacent luma samples and H luma down-sampled samples from the left adjacent luma samples are generated, where W and H are the width and height of the current chroma block as shown in FIG. 13A.
In some embodiments, 2W luma down-sampled samples from the above adjacent luma samples and 2H luma down-sampled samples from the left adjacent luma samples are generated, where W and H are the width and height of the current chroma block as shown in FIG. 13B.
To constrain the neighbouring luma samples required by the training process to a single line, down-sampling filters with fewer taps are applied as shown in FIG. 3A:
● For neighbouring luma samples above:
Rec'_L[x, y] = (2×Rec_L[2x, 2y+1] + Rec_L[2x-1, 2y+1] + Rec_L[2x+1, 2y+1] + 2) >> 2. (2)
● For neighbouring luma samples left:
Rec'_L[x, y] = (Rec_L[2x+1, 2y] + Rec_L[2x+1, 2y+1] + 1) >> 1. (3)
The luma samples inside the block are still down-sampled with the six-tap filter.
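As a non-limiting illustration, the reduced-tap filters for the neighbouring luma samples may be sketched as follows. The accessor rec(x, y), assumed to return the reconstructed luma sample at luma position (x, y) relative to the top-left sample of the block, and the function names are assumptions made only for this sketch. With y = -1 the filter for the above samples reads only luma row -1, and with x = -1 the filter for the left samples reads only luma column -1, which appears consistent with the single-line constraint stated above.

```python
def downsample_above_neighbour(rec, x, y):
    """Three-tap filter for neighbouring luma samples above the block."""
    return (2 * rec(2 * x, 2 * y + 1)
            + rec(2 * x - 1, 2 * y + 1)
            + rec(2 * x + 1, 2 * y + 1) + 2) >> 2


def downsample_left_neighbour(rec, x, y):
    """Two-tap filter for neighbouring luma samples left of the block."""
    return (rec(2 * x + 1, 2 * y) + rec(2 * x + 1, 2 * y + 1) + 1) >> 1


# Synthetic sample plane used only to exercise the filters.
def example_rec(x, y):
    return 100 + 10 * x + y
```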
Two solutions are provided in this contribution with usage of different sets of neighbouring luma samples. Suppose the width and height of one block are denoted as W and H, respectively. In solution #1, W above neighbouring samples and H left neighbouring samples are involved in the training process, as shown in FIG. 17A. In solution #2, 2W above neighbouring samples and 2H left neighbouring samples are involved, as shown in FIG. 17B. It should be noted that the extended neighbouring samples in solution #2 have already been used by wide-angle intra-prediction.
In addition, solution #3, as shown in FIGS. 18A-18C, is provided based on solution #2. 2W above neighbouring samples are involved if W <= H; otherwise, W above neighbouring samples are involved. Likewise, 2H left neighbouring samples are involved if H <= W; otherwise, H left neighbouring samples are involved.
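As a non-limiting illustration, the number of above and left neighbouring samples involved in the training process under solutions #1, #2 and #3 may be sketched as follows. The function name and the integer encoding of the solution are assumptions made only for this sketch.

```python
def training_sample_counts(w, h, solution):
    """Return (above_count, left_count) for a block of width w and height h."""
    if solution == 1:
        return w, h
    if solution == 2:
        return 2 * w, 2 * h
    if solution == 3:
        # Only the shorter side (or both sides of a square block) is extended.
        above = 2 * w if w <= h else w
        left = 2 * h if h <= w else h
        return above, left
    raise ValueError("solution must be 1, 2 or 3")
```

For example, for an 8x4 block, solution #3 uses 8 above samples and 8 left samples, so both sides contribute equally to the training process.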
As a further investigation, solution #1A, #2A and #3A are provided, which apply the same methods of solution #1, solution #2 and solution #3, respectively, but only on the above neighbouring samples. In solution #1A, #2A and #3A, the left neighbouring samples are down-sampled as in VTM-2.0, i.e., H left neighbouring luma samples are down-sampled by the 6-tap filter.
Embodiment 2
An embodiment on CCLM sample number constraints in the Working Draft (example changes are marked; deletions are shown using double bolded brackets, i.e., [ [a] ] indicates that “a” is being deleted, and additions are shown using double bolded braces, i.e., { {a} } indicates that “a” is being added to the specification.)
XXX. Specification of INTRA_LT_CCLM, INTRA_L_CCLM and INTRA_T_CCLM intra prediction mode
Inputs to this process are:
– ...
Outputs of this process are the predicted samples predSamples [x] [y] , with x = 0.. nTbW -1, y = 0.. nTbH -1.
The current luma location (xTbY, yTbY) is derived as follows:
(xTbY, yTbY) = (xTbC << 1, yTbC << 1) (8-156)
The variables availL, availT and availTL are derived as follows:
– ...
– The number of available top-right neighbouring chroma samples numTopRight is derived as follows:
– The variable numTopRight is set equal to 0 and availTR is set equal to TRUE.
– When predModeIntra is equal to INTRA_T_CCLM, the following applies for x = nTbW.. 2 *nTbW -1 until availTR is equal to FALSE or x is equal to 2 *nTbW -1:
– …
– The number of available left-below neighbouring chroma samples numLeftBelow is derived as follows:
– …
The number of available neighbouring chroma samples on the top and top-right numSampT and the number of available neighbouring chroma samples on the left and left-below numSampL are derived as follows:
– If predModeIntra is equal to INTRA_LT_CCLM, the following applies:
numSampT = availT ? nTbW: 0 (8-157)
numSampL = availL ? nTbH: 0 (8-158)
– Otherwise, the following applies:
numSampT = (availT && predModeIntra = = INTRA_T_CCLM) ? (nTbW + Min (numTopRight, nTbH) ) : 0 (8-159)
numSampL = (availL && predModeIntra = = INTRA_L_CCLM) ? (nTbH + Min (numLeftBelow, nTbW) ) : 0 (8-160)
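As a non-limiting illustration, equations (8-157) to (8-160) may be sketched as follows. The function name, the snake_case variable names and the string labels for the prediction modes are assumptions made only for this sketch.

```python
def num_neighbour_samples(pred_mode, avail_t, avail_l, n_tb_w, n_tb_h,
                          num_top_right, num_left_below):
    """Return (numSampT, numSampL) per equations (8-157) to (8-160)."""
    if pred_mode == "INTRA_LT_CCLM":
        num_samp_t = n_tb_w if avail_t else 0
        num_samp_l = n_tb_h if avail_l else 0
    else:
        # The top (left) side is extended by at most nTbH (nTbW) samples.
        num_samp_t = (n_tb_w + min(num_top_right, n_tb_h)
                      if avail_t and pred_mode == "INTRA_T_CCLM" else 0)
        num_samp_l = (n_tb_h + min(num_left_below, n_tb_w)
                      if avail_l and pred_mode == "INTRA_L_CCLM" else 0)
    return num_samp_t, num_samp_l
```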
The variable bCTUboundary is derived as follows:
bCTUboundary = (yTbC & (1 << (CtbLog2SizeY -1) -1) = = 0) ? TRUE : FALSE. (8-161)
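As a non-limiting illustration, equation (8-161) may be sketched as follows. The sketch assumes that the shift is evaluated before the trailing subtraction, i.e., that the mask is (1 << (CtbLog2SizeY - 1)) - 1; this reading of the operator grouping, as well as the function name, is an assumption made only for this sketch.

```python
def is_ctu_boundary(y_tb_c, ctb_log2_size_y):
    """Return True when the chroma row y_tb_c lies on the top CTU boundary."""
    # Mask of the low-order bits of the vertical chroma position.
    mask = (1 << (ctb_log2_size_y - 1)) - 1
    return (y_tb_c & mask) == 0
```

For example, with CtbLog2SizeY equal to 7 the mask is 63, so chroma rows 0, 64, 128 and so on are treated as boundary rows under this reading.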
The variable cntN and array pickPosN with N being replaced by L and T, are derived as follows:
– The variable numIs4N is set equal to ( (availT && availL && predModeIntra = = INTRA_LT_CCLM) ? 0 : 1) .
– The variable startPosN is set equal to numSampN >> (2 + numIs4N) .
– The variable pickStepN is set equal to Max (1, numSampN >> (1 + numIs4N) ) .
– If availN is equal to TRUE and predModeIntra is equal to INTRA_LT_CCLM or INTRA_N_CCLM, the following assignments are made:
– cntN is set equal to Min (numSampN, (1 + numIs4N) << 1)
– pickPosN [pos] is set equal to (startPosN + pos *pickStepN) , with pos = 0.. cntN –1 .
– Otherwise, cntN is set equal to 0.
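As a non-limiting illustration, the derivation of cntN and pickPosN for one side N may be sketched as follows. The function name, the side argument ("L" or "T") and the string labels for the prediction modes are assumptions made only for this sketch; cntN corresponds to the length of the returned list.

```python
def pick_positions(num_samp_n, avail_n, avail_t, avail_l, pred_mode, side):
    """Return the list pickPosN for side N ("L" or "T"); its length is cntN."""
    num_is_4n = 0 if (avail_t and avail_l
                      and pred_mode == "INTRA_LT_CCLM") else 1
    start_pos = num_samp_n >> (2 + num_is_4n)
    pick_step = max(1, num_samp_n >> (1 + num_is_4n))
    if avail_n and pred_mode in ("INTRA_LT_CCLM", "INTRA_" + side + "_CCLM"):
        cnt = min(num_samp_n, (1 + num_is_4n) << 1)
        return [start_pos + pos * pick_step for pos in range(cnt)]
    return []  # cntN is 0
```

For example, with 8 samples on the top side in INTRA_LT_CCLM mode and both sides available, positions 2 and 6 are picked; with 16 samples in INTRA_T_CCLM mode, positions 2, 6, 10 and 14 are picked. In either case at most four samples are selected in total.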
The prediction samples predSamples [x] [y] with x = 0.. nTbW -1, y = 0.. nTbH -1 are derived as follows:
– If both numSampL and numSampT are equal to 0, the following applies:
predSamples [x] [y] = 1 << (BitDepthC -1) (8-162)
– Otherwise, the following ordered steps apply:
1. The collocated luma samples pY [x] [y] with x = 0.. nTbW *2 -1, y= 0.. nTbH *2 -1 are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations (xTbY + x, yTbY + y) .
2. The neighbouring luma samples pY [x] [y] are derived as follows:
– When numSampL is greater than 0, the neighbouring left luma samples pY [x] [y] with x = -1.. -3, y = 0.. 2 *numSampL -1, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations (xTbY + x , yTbY +y) .
– When numSampT is greater than 0, the neighbouring top luma samples pY [x] [y] with x = 0.. 2 *numSampT -1, y = -1, -2, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations (xTbY+ x, yTbY + y) .
– When availTL is equal to TRUE, the neighbouring top-left luma samples pY [x] [y] with x = -1, y = -1, -2, are set equal to the reconstructed luma samples prior to the deblocking filter process at the locations (xTbY+ x, yTbY + y) .
3. The down-sampled collocated luma samples pDsY [x] [y] with x = 0.. nTbW -1, y = 0.. nTbH -1 are derived as follows:
…
4. When numSampL is greater than 0, the selected neighbouring left chroma samples pSelC [idx] are set equal to p [-1] [pickPosL [idx] ] with idx = 0.. cntL –1, and the selected down-sampled neighbouring left luma samples pSelDsY [idx] with idx = 0.. cntL -1 are derived as follows:
– The variable y is set equal to pickPosL [idx] .
– If sps_cclm_colocated_chroma_flag is equal to 1, the following applies:
…
– Otherwise, the following applies:
5. When numSampT is greater than 0, the selected neighbouring top chroma samples pSelC [idx] are set equal to p [pickPosT [idx –cntL] ] [-1] with idx =cntL.. cntL + cntT –1, and the down-sampled neighbouring top luma samples pSelDsY [idx] with idx = 0.. cntL + cntT -1 are specified as follows:
– The variable x is set equal to pickPosT [idx –cntL] .
– If sps_cclm_colocated_chroma_flag is equal to 1, the following applies:
…
6. When cntT+ cntL is not [ [equal to 0] ] { {smaller than Threshold1} } , the variables minY, maxY, minC and maxC are derived as follows:
– [ [When cntT+cntL is equal to 2, set pSelComp [3] equal to pSelComp [0] , pSelComp [2] equal to pSelComp [1] , pSelComp [0] equal to pSelComp [1] , and pSelComp [1] equal to pSelComp [3] , with Comp being replaced by DsY and C. ] ]
– The arrays minGrpIdx and maxGrpIdx are set as follows:
– minGrpIdx [0] = 0.
– minGrpIdx [1] = 2.
– maxGrpIdx [0] = 1.
– maxGrpIdx [1] = 3.
– When pSelDsY [minGrpIdx [0] ] is greater than pSelDsY [minGrpIdx [1] ] , minGrpIdx [0] and minGrpIdx [1] are swapped as (minGrpIdx [0] , minGrpIdx [1] ) = Swap (minGrpIdx [0] , minGrpIdx [1] ) .
– When pSelDsY [maxGrpIdx [0] ] is greater than pSelDsY [maxGrpIdx [1] ] , maxGrpIdx [0] and maxGrpIdx [1] are swapped as (maxGrpIdx [0] , maxGrpIdx [1] ) = Swap (maxGrpIdx [0] , maxGrpIdx [1] ) .
– When pSelDsY [minGrpIdx [0] ] is greater than pSelDsY [maxGrpIdx [1] ] , arrays minGrpIdx and maxGrpIdx are swapped as (minGrpIdx, maxGrpIdx) = Swap (minGrpIdx, maxGrpIdx) .
– When pSelDsY [minGrpIdx [1] ] is greater than pSelDsY [maxGrpIdx [0] ] , minGrpIdx [1] and maxGrpIdx [0] are swapped as (minGrpIdx [1] , maxGrpIdx [0] ) = Swap (minGrpIdx [1] , maxGrpIdx [0] ) .
– maxY = (pSelDsY [maxGrpIdx [0] ] + pSelDsY [maxGrpIdx [1] ] + 1) >> 1.
– maxC = (pSelC [maxGrpIdx [0] ] + pSelC [maxGrpIdx [1] ] + 1) >> 1.
– minY = (pSelDsY [minGrpIdx [0] ] + pSelDsY [minGrpIdx [1] ] + 1) >> 1.
– minC = (pSelC [minGrpIdx [0] ] + pSelC [minGrpIdx [1] ] + 1) >> 1.
7. The variables a, b, and k are derived as follows:
– If [ [numSampL is equal to 0, and numSampT is equal to 0] ] { {cntT+ cntL is smaller than Threshold1} } , the following applies:
k = 0 (8-208)
a = 0 (8-209)
b = 1 << (BitDepthC -1) (8-210)
– Otherwise, the following applies:
…
8. The prediction samples predSamples [x] [y] with x = 0.. nTbW-1, y = 0.. nTbH-1 are derived as follows:
predSamples [x] [y] = Clip1C ( ( (pDsY [x] [y] *a) >> k) + b) (8-225)
In one example, Threshold1 is set to 4.
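As a non-limiting illustration, ordered steps 6 to 8 above may be sketched as follows. The derivation of a, b and k in the "Otherwise" branch of step 7 is not reproduced in this document, so the sketch takes those values as an input (the hypothetical argument derived_a_b_k). The function names are assumptions, and Clip1C is assumed to clip to the range 0 to (1 << BitDepthC) - 1.

```python
def min_max_averages(sel_ds_y, sel_c):
    """Step 6: return (minY, maxY, minC, maxC) from four selected samples."""
    min_idx, max_idx = [0, 2], [1, 3]
    if sel_ds_y[min_idx[0]] > sel_ds_y[min_idx[1]]:
        min_idx[0], min_idx[1] = min_idx[1], min_idx[0]
    if sel_ds_y[max_idx[0]] > sel_ds_y[max_idx[1]]:
        max_idx[0], max_idx[1] = max_idx[1], max_idx[0]
    if sel_ds_y[min_idx[0]] > sel_ds_y[max_idx[1]]:
        min_idx, max_idx = max_idx, min_idx  # swap the whole arrays
    if sel_ds_y[min_idx[1]] > sel_ds_y[max_idx[0]]:
        min_idx[1], max_idx[0] = max_idx[0], min_idx[1]
    max_y = (sel_ds_y[max_idx[0]] + sel_ds_y[max_idx[1]] + 1) >> 1
    max_c = (sel_c[max_idx[0]] + sel_c[max_idx[1]] + 1) >> 1
    min_y = (sel_ds_y[min_idx[0]] + sel_ds_y[min_idx[1]] + 1) >> 1
    min_c = (sel_c[min_idx[0]] + sel_c[min_idx[1]] + 1) >> 1
    return min_y, max_y, min_c, max_c


def predict_chroma(p_ds_y, cnt_t, cnt_l, derived_a_b_k, bit_depth_c,
                   threshold1=4):
    """Steps 7 and 8: apply the linear model with the Threshold1 fallback."""
    if cnt_t + cnt_l < threshold1:
        # Too few selected samples: equations (8-208) to (8-210).
        a, b, k = 0, 1 << (bit_depth_c - 1), 0
    else:
        a, b, k = derived_a_b_k  # supplied by the caller (not shown here)
    max_val = (1 << bit_depth_c) - 1
    # Equation (8-225) with Clip1C assumed to clip to [0, max_val].
    return [[min(max(((v * a) >> k) + b, 0), max_val) for v in row]
            for row in p_ds_y]
```

With Threshold1 equal to 4, step 6 is reached only when at least four samples have been selected, which may explain why the padding rule for the two-sample case is marked as deleted.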
5 Example methods and technical solutions for the disclosed technology
FIG. 19 shows an example method 1900 for video processing. The method 1900 includes, at operation 1910, determining, for a conversion between a video and a bitstream representation of the video, multiple lines of luma samples for deriving parameters of a cross-component linear model used for predicting samples of a chroma block.
The method 1900 includes, at operation 1920, performing the conversion based on the predicted samples of the chroma block. In some embodiments, at least one of the multiple lines is non-adjacent to a luma block collocated to the chroma block.
In some embodiments, the following technical solutions can be implemented:
A1. A method for video processing, comprising determining, for a conversion between a video and a bitstream representation of the video, multiple lines of luma samples for deriving parameters of a cross-component linear model used for predicting samples of a chroma block; and performing the conversion based on the predicted samples of the chroma block, wherein at least one of the multiple lines is non-adjacent to a luma block collocated to the chroma block. Herein, a block may represent a group of samples according to an operation, such as a coding unit or a transform unit or a prediction unit.
A2. The method of solution A1, wherein the cross component linear model comprises a multi-directional linear model.
A3. The method of solution A1 or A2, wherein the multiple lines of luma samples exclude lines satisfying an exclusion criterion or include lines satisfying an inclusion criterion.
A4. The method of solution A3, wherein the exclusion criterion comprises excluding luma samples not used for directional intra prediction and multiple reference line intra prediction.
A5. The method of solution A3, wherein the inclusion criterion includes two lines above the luma block.
A6. The method of solution A3, wherein the inclusion criterion includes two lines to a left of the luma block.
A7. The method of solution A3, wherein the inclusion criterion includes a first line immediately above the luma block, a second line immediately to a left of the luma block, and a third line that is two lines away from the second line.
A8. The method of solution A1 or A2, wherein the multiple lines comprise a first set of samples that are generated by down-sampling a second set of samples from the multiple lines.
A9. The method of solution A8, wherein coordinates (x, y) of each of the first set of samples are calculated as
Rec′_L[x, y] = (w1·Rec_L[2x+1, 2y] + w2·Rec_L[2x+1, 2y+1] + w3·Rec_L[2x-1, 2y] + w4·Rec_L[2x-1, 2y+1] + 2^(N-1)) >> N
wherein w1+w2+w3+w4 = 2^N, and wherein N is a positive integer.
A10. The method of solution A9, wherein w1 = w2 = w3 = w4 = 1 and N = 2.
A11. An apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to implement the method in any one of solutions A1 to A10.
A12. A computer program product stored on a non-transitory computer readable medium, the computer program product including program code for carrying out the method in any one of solutions A1 to A10.
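As a non-limiting illustration, the weighted down-sampling of solutions A9 and A10 may be sketched as follows. The accessor rec(x, y), assumed to return the reconstructed luma sample at position (x, y), and the function names are assumptions made only for this sketch.

```python
def weighted_downsample(rec, x, y, weights, n):
    """Down-sample four luma samples with weights w1..w4 summing to 2**n."""
    w1, w2, w3, w4 = weights
    if n <= 0 or w1 + w2 + w3 + w4 != 1 << n:
        raise ValueError("weights must sum to 2**n with n a positive integer")
    total = (w1 * rec(2 * x + 1, 2 * y) + w2 * rec(2 * x + 1, 2 * y + 1)
             + w3 * rec(2 * x - 1, 2 * y) + w4 * rec(2 * x - 1, 2 * y + 1))
    # Adding 2**(n-1) before the shift rounds to the nearest integer.
    return (total + (1 << (n - 1))) >> n


# Synthetic sample plane used only to exercise the filter.
def example_plane(x, y):
    return 100 + 10 * x + y
```

With the values of solution A10 (all weights equal to 1 and N equal to 2), the filter reduces to a rounded average of the four samples.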
In some embodiments, the following technical solutions can be implemented:
B1. A method of video coding, comprising using, during a conversion between a current block of video and a bitstream representation of the current block, parameters of a linear model based on a first set of samples that are generated by down-sampling a second set of samples of the luma component; and performing, based on the parameters of the linear model, the conversion between the bitstream representation and the current block.
B2. The method of solution B1, wherein a number of down-sampled luma samples in the second set from above or left of the current block are dependent on a dimension of the current block.
B3. The method of solution B2, wherein the current block is a chroma block having width W and height H samples, where W and H are integers, and where W luma samples above the current block are down-sampled and H luma samples left to the current block are down-sampled for W equal to H.
B4. The method of solution B1, wherein two different downsampling schemes are used for cases when the current block is square and the current block is rectangular.
B5. The method of solution B4, wherein the current block has a width W and a height H samples, where W and H are integers, and wherein W > H, and wherein H leftmost above neighboring samples and H left neighboring samples are used for a training process.
B6. The method of solution B4, wherein the current block has a width W and a height H samples, where W and H are integers, and wherein W > H, and wherein H rightmost above neighboring samples and H left neighboring samples are involved in the training process.
In the above solutions, the conversion may include video decoding or decompression in which pixel values of the current block are generated from its bitstream representation. In the above solutions, the conversion may include video encoding or compression operation in which the bitstream representation is generated from the current block.
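As a non-limiting illustration, the selection of training samples for a non-square block with W > H in solutions B5 and B6 may be sketched as follows. The function name, the list representation of the neighbouring samples (the above list ordered from left to right) and the use_rightmost flag are assumptions made only for this sketch.

```python
def select_training_samples(above, left, use_rightmost=False):
    """Return (above_used, left_used) for W = len(above), H = len(left).

    For W > H, only H above samples are kept: the leftmost ones
    (solution B5) or the rightmost ones (solution B6).
    """
    w, h = len(above), len(left)
    if w > h:
        above = above[w - h:] if use_rightmost else above[:h]
    return list(above), list(left)
```

In both variants the above and left sides contribute the same number of samples to the training process.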
6 Example implementations of the disclosed technology
FIG. 20 is a block diagram of a video processing apparatus 2000. The apparatus 2000 may be used to implement one or more of the methods described herein. The apparatus 2000 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 2000 may include one or more processors 2002, one or more memories 2004 and video processing hardware 2006. The processor (s) 2002 may be configured to implement one or more methods (including, but not limited to, method 1900) described in the present document. The memory (memories) 2004 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing hardware 2006 may be used to implement, in hardware circuitry, some techniques described in the present document.
In some embodiments, the video coding methods may be implemented using an apparatus that is implemented on a hardware platform as described with respect to FIG. 20.
Some embodiments of the disclosed technology include making a decision or determination to enable a video processing tool or mode. In an example, when the video processing tool or mode is enabled, the encoder will use or implement the tool or mode in the processing of a block of video, but may not necessarily modify the resulting bitstream based on the usage of the tool or mode. That is, a conversion from the block of video to the bitstream representation of the video will use the video processing tool or mode when it is enabled based on the decision or determination. In another example, when the video processing tool or mode is enabled, the decoder will process the bitstream with the knowledge that the bitstream has been modified based on the video processing tool or mode. That is, a conversion from the bitstream representation of the video to the block of video will be performed using the video processing tool or mode that was enabled based on the decision or determination.
Some embodiments of the disclosed technology include making a decision or determination to disable a video processing tool or mode. In an example, when the video processing tool or mode is disabled, the encoder will not use the tool or mode in the conversion of the block of video to the bitstream representation of the video. In another example, when the video processing tool or mode is disabled, the decoder will process the bitstream with the knowledge that the bitstream has not been modified using the video processing tool or mode that was disabled based on the decision or determination.
FIG. 21 is a block diagram showing an example video processing system 2100 in which various techniques disclosed herein may be implemented. Various implementations may include some or all of the components of the system 2100. The system 2100 may include input 2102 for receiving video content. The video content may be received in a raw or uncompressed format, e.g., 8 or 10 bit multi-component pixel values, or may be in a compressed or encoded format. The input 2102 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interface include wired interfaces such as Ethernet, passive optical network (PON) , etc. and wireless interfaces such as Wi-Fi or cellular interfaces.
The system 2100 may include a coding component 2104 that may implement the various coding or encoding methods described in the present document. The coding component 2104 may reduce the average bitrate of video from the input 2102 to the output of the coding component 2104 to produce a coded representation of the video. The coding techniques are therefore sometimes called video compression or video transcoding techniques. The output of the coding component 2104 may be either stored, or transmitted via a communication connection, as represented by the component 2106. The stored or communicated bitstream (or coded) representation of the video received at the input 2102 may be used by the component 2108 for generating pixel values or displayable video that is sent to a display interface 2110. The process of generating user-viewable video from the bitstream representation is sometimes called video decompression. Furthermore, while certain video processing operations are referred to as “coding” operations or tools, it will be appreciated that the coding tools or operations are used at an encoder and corresponding decoding tools or operations that reverse the results of the coding will be performed by a decoder.
Examples of a peripheral bus interface or a display interface may include universal serial bus (USB) or high definition multimedia interface (HDMI) or Displayport, and so on. Examples of storage interfaces include SATA (serial advanced technology attachment) , PCI, IDE interface, and the like. The techniques described in the present document may be embodied in various electronic devices such as mobile phones, laptops, smartphones or other devices that are capable of performing digital data processing and/or video display.
From the foregoing, it will be appreciated that specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Accordingly, the presently disclosed technology is not limited except as by the appended claims.
Implementations of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible and non-transitory computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing unit” or “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) . A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) .
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of nonvolatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.