WO2020127956A1 - Piecewise modeling for linear component sample prediction - Google Patents


Info

Publication number
WO2020127956A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
component
samples
block
chroma
Prior art date
Application number
PCT/EP2019/086658
Other languages
French (fr)
Inventor
Christophe Gisquet
Guillaume Laroche
Jonathan Taquet
Patrice Onno
Original Assignee
Canon Kabushiki Kaisha
Canon Europe Limited
Priority date
Filing date
Publication date
Priority claimed from GB1820861.1A external-priority patent/GB2580078A/en
Application filed by Canon Kabushiki Kaisha, Canon Europe Limited filed Critical Canon Kabushiki Kaisha
Publication of WO2020127956A1 publication Critical patent/WO2020127956A1/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Definitions

  • the present invention regards the encoding or decoding of blocks of a given video component, in particular the intra prediction of such component blocks or obtaining the samples of such blocks.
  • the invention finds applications in obtaining blocks of a component, typically blocks of a chroma component, of video data from samples of another component, typically luma samples.
  • Predictive encoding of video data is based on the division of frames into blocks of pixels. For each block of pixels, a predictor block is searched for in available data.
  • the predictor block may be a block in a reference frame different from the current one in INTER coding modes, or generated from neighbouring pixels in the current frame in INTRA coding modes. Different encoding modes are defined according to different ways of determining the predictor block.
  • the result of the encoding is a signalling of the predictor block and a residual block consisting in the difference between the block to be encoded and the predictor block.
  • Various INTRA modes are defined, for example a Direct Current (DC) mode, a planar mode and angular modes. Each of these seeks to predict samples of a block using previously decoded boundary samples from spatially neighbouring blocks.
  • the encoding may be performed for each component forming the pixels of the video data.
  • While pixels may be represented in RGB (for Red-Green-Blue), the YUV representation is preferably used for the encoding to reduce the inter-channel redundancy.
  • a block of pixels may be considered as composed of several, typically three, component blocks.
  • An RGB pixel block is composed of an R component block containing the values of the R component of the pixels of the block, a G component block containing the values of the G component of these pixels, a B component block containing the values of the B component of these pixels.
  • a YUV pixel block is composed of a Y component block (luma), a U component block (chroma) and a V component block (also chroma).
  • Another example is YCbCr, where Cb and Cr are also known as chroma components.
  • Inter-component (also known as cross-component) prediction methods, such as Cross-Component Prediction (CCP), exploit these dependencies between components.
  • the Cross-Component Prediction may apply directly to a block of chroma pixels or may apply to a residual chroma block (meaning the difference between a chroma block and a chroma block predictor).
  • the Linear Model (LM) mode uses a linear model to predict chroma from luma as a chroma intra prediction mode, relying on one or two parameters, slope (a) and offset (b), to be determined.
  • the chroma intra predictor is thus derived from reconstructed luma samples of a current luma block using the linear model with the parameters.
  • The linearity, i.e. parameters a and b, is derived from the reconstructed causal samples, in particular from a neighbouring chroma sample set comprising reconstructed chroma samples neighbouring the current chroma block to predict, and from a neighbouring luma sample set comprising luma samples neighbouring the current luma block.
  • the N neighbours of the above row and the N neighbours of the left column are used to form the neighbouring chroma sample set for derivation.
  • the neighbouring luma sample set is also made of N neighbouring samples just above the corresponding luma block and N neighbouring samples on the left side of the luma block.
  • the luma block corresponding to the NxN chroma block is bigger than NxN.
  • the neighbouring luma sample set is down-sampled to match the chroma resolution.
  • the chroma intra predictor to predict the chroma samples in the current NxN chroma block has to be generated using the linear model with the one or more parameters a and b derived and the reconstructed luma samples of the current luma block that are previously down-sampled to match chroma resolution.
  • the down-sampling of the reconstructed luma samples to chroma resolution makes it possible to retrieve the same number of samples as the chroma samples to form both the luma sample set and the chroma intra predictor. Furthermore, when the number of samples on the borders are not a power of 2, and operations, such as average computation, require divisions, further decimation of these border samples can allow use of a number of border samples that is a power of 2, for which divisions are less costly to implement.
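The parameter derivation described above can be illustrated with a small sketch. The following Python function is a hypothetical floating-point rendering of the classic least-squares LM derivation (a real codec uses fixed-point arithmetic with shifts rather than divisions); it fits slope a and offset b of chroma ≈ a·luma + b over the neighbouring sample pairs:

```python
def derive_lm_parameters(luma_neigh, chroma_neigh):
    """Least-squares fit of chroma ~ a * luma + b over neighbouring pairs.

    Floating-point sketch only; codecs implement this with fixed-point
    arithmetic and shift-based divisions.
    """
    n = len(luma_neigh)
    sum_l = sum(luma_neigh)
    sum_c = sum(chroma_neigh)
    sum_lc = sum(l * c for l, c in zip(luma_neigh, chroma_neigh))
    sum_ll = sum(l * l for l in luma_neigh)
    denom = n * sum_ll - sum_l * sum_l
    if denom == 0:                 # flat luma neighbourhood: fall back to DC
        return 0.0, sum_c / n
    a = (n * sum_lc - sum_l * sum_c) / denom
    b = (sum_c - a * sum_l) / n
    return a, b
```

With these parameters, each down-sampled reconstructed luma sample L of the current block yields the chroma predictor sample a·L + b.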
  • the chroma intra predictor is thus subtracted from the current chroma block to obtain a residual chroma block that is encoded at the encoder.
  • the chroma intra predictor is added to the received residual chroma block in order to retrieve the chroma block, also known as reconstruction of the decoded block. This may also involve clipping for results of the addition going out of the sample range.
  • the residual chroma block is negligible and thus not considered during encoding.
  • the above-mentioned chroma intra predictor is used as the chroma block itself.
  • the above LM mode makes it possible to obtain a sample for a current block of a given component from an associated (i.e. collocated or corresponding) reconstructed sample of a block of another component in the same frame using a linear model with one or more parameters.
  • the sample is obtained using the linear model with the one or more parameters derived and the associated reconstructed samples in the block of the other component.
  • the block of the other component is made of samples down-sampled to match the block resolution of the current component.
  • While the block of the current component is typically a chroma block and the block of the other component a luma block, this may not always be the case.
  • While the examples given here focus on the prediction of a chroma block from a luma block, it should be clear that the described mechanism may apply to any component prediction from another component.
  • JEM: the Joint Exploration Model, reference software of the Joint Video Exploration Team (JVET).
  • VTM: the JEM successor, currently developed by the JVET group for testing and defining the future VVC codec.
  • the sample sets may be made of the two lines (i.e. rows and columns) of samples neighbouring the current luma or chroma block, these lines being parallel and immediately adjacent to each one of the top and/or left boundaries of the current luma or chroma block at chroma resolution.
  • Such an exemplary sample set is described in publication US 9,736,487.
  • the down-sampling schemes used in JEM include a 6-tap filter determining a down-sampled reconstructed luma sample from six reconstructed luma samples but also three 2-tap filters that select either the top right and bottom right samples from among the six reconstructed luma samples, or the bottom and bottom right samples, or the top and top right samples, and a 4-tap filter that selects the top, top right, bottom and bottom right samples of the six reconstructed luma samples.
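As an illustration of such down-sampling, the sketch below applies a 6-tap filter of the kind used in JEM, averaging a 3x2 luma window per chroma position. Boundary handling and the alternative 2-tap/4-tap filters are omitted, and the exact tap weights are an assumption to be checked against the JEM reference software:

```python
def downsample_luma_6tap(rec_l, x, y):
    """Down-sample reconstructed luma to chroma resolution (4:2:0).

    Illustrative 6-tap filter: a weighted average of the 3x2 luma window
    around chroma position (x, y), with rounding, divided by 8 via a shift.
    Assumes (x, y) is not on the left image border.
    """
    lx, ly = 2 * x, 2 * y
    return (2 * rec_l[ly][lx] + 2 * rec_l[ly + 1][lx]
            + rec_l[ly][lx - 1] + rec_l[ly][lx + 1]
            + rec_l[ly + 1][lx - 1] + rec_l[ly + 1][lx + 1] + 4) >> 3
```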
  • Deriving the linear model parameters for the computation of the chroma predictor block samples has shortcomings in that the model is at best approximate, and valid on a limited sample range. In particular, the accuracy of LM in the middle of the range is often relatively poor. Indeed, these parameters are updated for every block based on the borders, which may result in an unstable model. As a consequence, it is desirable to improve the modelling, but without significantly deviating from the simplicity of the linear model and its parameter derivation.
  • the present invention has been devised to address one or more of the foregoing concerns. It concerns an improved method for obtaining a chroma sample for a current chroma block, possibly through chroma intra prediction.
  • a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame comprising: determining M points, where M>3; each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component; determining the parameters of a plurality of linear equations, each equation representing a straight line passing through two adjacent points of said M points, and deriving a continuous linear model defined by the parameters of the said straight lines.
  • a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame comprising: determining three points; each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component; determining the parameters of two linear equations, a first equation representing a straight line passing through said first and second adjacent points, and the second linear equation passing through said second and third adjacent points, said second point representing an average (mid-point, weighted average, scaled sum) of a selection of said sample values; and deriving a continuous linear model defined by the parameters of the said straight lines.
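The piecewise modelling described in the two statements above can be sketched as follows. This is an illustrative Python rendering (floating-point for clarity, whereas a codec would use fixed-point slopes): each pair of adjacent points defines one (a, b) segment, and adjacent segments meet at the shared point, which makes the model continuous:

```python
def piecewise_model(points):
    """Build a continuous piecewise-linear model from M >= 2 points.

    Each point is (second_component_value, first_component_value), e.g.
    (luma, chroma).  Returns one (a, b) slope/offset pair per segment
    between adjacent points, sorted by second-component value.
    """
    points = sorted(points)
    segments = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        a = (y1 - y0) / (x1 - x0) if x1 != x0 else 0.0
        b = y0 - a * x0
        segments.append((a, b))
    return segments

def predict(segments, points, x):
    """Evaluate the piecewise model at second-component value x."""
    points = sorted(points)
    # Pick the segment whose knee point is the first one exceeding x.
    for i, (x0, _) in enumerate(points[1:-1], start=1):
        if x < x0:
            return segments[i - 1][0] * x + segments[i - 1][1]
    return segments[-1][0] * x + segments[-1][1]
```

For three points the middle point is the knee: both segments pass through it, so the prediction is continuous there, unlike the two independent models of MMLM.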
  • the points are determined based on sample pairs in the neighbourhood of the second-component block.
  • the points are determined based on the sample values of the sample pairs in the neighbourhood of the second-component block.
  • the first and Mth points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second-component sample value.
  • said first and Mth points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second-component sample value from a selection of sample pairs.
  • the selection of the sample pairs comprises a selection of sample pairs from one of: a top border, or a left border.
  • the method comprises iteratively determining said first and Mth points when accessing said selection of sample pairs.
  • the or each point from 2 to M-1 is derived from a mid-point of second-component values.
  • the points from 2 to M-1 are distributed corresponding to a fractional mid-point.
  • each point from 2 to M-1 corresponds to a point having a second-component value nearest a value corresponding to said distribution of points between the first and Mth point.
  • each point from 2 to M-1 corresponds to a point having a second-component value which differs from a value corresponding to an even distribution of points between the first and Mth point by less than a threshold amount.
  • each point corresponding to said distribution is determined iteratively.
  • M = 3, and where the second point corresponds to the mid-point.
  • the first and third points correspond to points having the lowest and highest second component values respectively, and the second point is a mid-point.
  • the second point corresponds to a point having a second-component value nearest a mid-point of the second-component sample values.
  • the second point corresponds to a point of which the second-component value differs from a mid-point of the second-component sample values by less than a threshold amount.
  • the point or points corresponding to the mid-point is determined iteratively.
  • the mid-point is an average of a selection of second-component sample values.
  • said second point representing an average of the sample values is calculated using a bitwise shift operation.
  • said bitwise shift is by an amount M, where 2^M is the largest power of two smaller than or equal to the number of available sample pairs.
  • said bitwise shift is by an amount nShift, where nShift depends on the availability of neighbouring sample pairs.
  • nShift is such that 2^nShift is the largest power of two smaller than or equal to the number of available sample pairs, modified if either or both of the neighbouring samples to the left or to the top are unavailable:
  • nShift = Log2( nS ) + ( availL && availT ? 1 : 0 )
  • the average is calculated by
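The exact averaging formula is not reproduced above; the hedged sketch below shows one plausible reading, in which the sample count per available border is a power of two, so that the division reduces to the nShift bitwise shift of the preceding statements:

```python
def neighbour_average(top, left):
    """Average of available neighbouring samples via a bitwise shift.

    Illustrative reading only.  Assumes each available border contributes
    nS samples with nS a power of two (and equal counts when both borders
    are available), so nShift = Log2(nS) + (1 if both borders else 0) and
    the division becomes a rounded right shift.
    """
    avail_t, avail_l = bool(top), bool(left)
    samples = list(top) + list(left)
    n_s = len(top) if avail_t else len(left)
    n_shift = n_s.bit_length() - 1 + (1 if (avail_t and avail_l) else 0)
    return (sum(samples) + (1 << n_shift >> 1)) >> n_shift
```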
  • the method further comprises extending a border and selecting sample pairs from said extended border.
  • extending the border comprises extending the border into a previously coded part of said frame.
  • said selected sample pairs comprise the last 2^N sample pairs from said extended border, where N is used as the bitwise shift when calculating said second point representing said average.
  • the method further comprises applying a scaling factor when calculating said second point representing said average.
  • said scaling factor is a fraction approximately equal to 2/3; preferably said scaling factor is 21/32.
  • the method further comprises applying a rounding parameter when calculating said second point representing said average.
  • the values of the samples having the lowest and highest second-component sample values are ignored from the selection of the second-component sample values when calculating said second point representing said average.
  • said highest and lowest values are ignored when updating a calculated average.
  • the calculated average is updated by:
  • N is a predetermined scaling parameter
  • the calculated average is updated by:
  • N is dependent on the block size
  • N is dependent on the number of samples used to compute the calculated average.
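The update formulas themselves are not reproduced above; as one plausible reading of ignoring the extreme values, the sketch below simply drops one minimum and one maximum sample before averaging (the actual update with the scaling parameter N is implementation specific):

```python
def trimmed_average(samples):
    """Average excluding one lowest and one highest sample value.

    Hypothetical illustration of ignoring the extreme second-component
    values when computing the mid-point average.
    """
    s = sorted(samples)
    if len(s) <= 2:                 # too few samples to trim anything
        return sum(s) // len(s)
    core = s[1:-1]
    return sum(core) // len(core)
```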
  • the selection of sample pairs is from the sample pairs forming a top border of the block, or a left border of the block.
  • the selection of sample pairs also includes a sample from a top-left neighbouring block.
  • the total number of samples in said selection is a power of two.
  • the total number of samples in said selection is greater than the minimum possible number of such samples.
  • the method further comprises the step of determining a range of said second-component values, and determining the parameters of said straight lines in dependence on said range.
  • said dependence is whether said range is greater than a threshold.
  • a parameter derived from a straight line between two different points is used.
  • said parameter comprises the slope.
  • said parameter comprises the ordinate intercept.
  • the threshold is 16 samples.
  • the threshold is 32 samples.
  • the method further comprises, if the number of samples is greater than said threshold, performing any of the methods as described herein.
  • a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame comprising: determining a difference between the second-component values corresponding to the two points having the largest and smallest second-component values; if said difference is lower than a threshold: determining the parameters of a linear equation, representing a straight line passing through said points, and deriving a linear model defined by the parameters of said straight line; if said difference is higher than the threshold: determining at least one further point between said two points having the largest and smallest second-component values; and determining the parameters of two linear equations, the first equation representing a straight line passing through the point having the smallest second-component value and an adjacent point; the second equation representing a straight line passing through the point having the largest second-component value and an adjacent point; and deriving a linear model defined by the parameters of said straight lines.
  • the threshold depends on the bitdepth of the samples.
  • the threshold is between 48 and 96.
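The range-dependent derivation above can be sketched as follows. The knee point chosen here (the average of the sample pairs) and the threshold value of 64 (a mid-value of the 48-96 span mentioned above) are illustrative assumptions:

```python
def derive_model_by_range(pairs, threshold=64):
    """Choose a single line or a two-segment model based on the range.

    pairs: (second_component, first_component) tuples.  If the spread of
    second-component values is below the threshold, a single straight
    line through the extreme points is used; otherwise an intermediate
    knee point (here, the average of the pairs) splits the model in two.
    Sketch only; the knee-point choice and threshold are assumptions.
    """
    lo = min(pairs)
    hi = max(pairs)
    if hi[0] - lo[0] < threshold:
        a = (hi[1] - lo[1]) / (hi[0] - lo[0]) if hi[0] != lo[0] else 0.0
        return [(a, lo[1] - a * lo[0])]
    mid = (sum(p[0] for p in pairs) / len(pairs),
           sum(p[1] for p in pairs) / len(pairs))
    segments = []
    for (x0, y0), (x1, y1) in ((lo, mid), (mid, hi)):
        a = (y1 - y0) / (x1 - x0)
        segments.append((a, y0 - a * x0))
    return segments
```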
  • a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame comprising: determining three points, said three points comprising two points having the largest and smallest second-component values, and a third point between said two points; determining a difference between the point having the smallest second-component value and the third point; determining a difference between the point having the largest second-component value and the third point; if one or both of said differences are lower than a threshold: determining the parameters of a linear equation, representing a straight line passing through said two points having the largest and smallest second-component values, and deriving a linear model defined by the parameters of said straight line; if said difference is higher than the threshold: determining the parameters of two linear equations, the first equation representing a straight line passing through the point having the smallest second-component value and the third point; the second equation representing a straight line passing through the point having the largest second-component value and the third point; and deriving a linear model defined by the parameters of said straight lines.
  • the method further comprises, if said difference is higher than the threshold, performing any of the methods as described herein.
  • a device for encoding images wherein the device comprises a means for deriving a continuous linear model.
  • a device for decoding images wherein the device comprises a means for deriving a continuous linear model.
  • a computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method as described herein, when loaded into and executed by the programmable apparatus.
  • a computer-readable medium storing a program which, when executed by a microprocessor or computer system in a device, causes the device to perform a method as described herein.
  • the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “processor and a memory”, “circuit”, “module” or “system”.
  • the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
  • a tangible carrier medium may comprise a storage medium such as a hard disk drive, a magnetic tape device or a solid state memory device and the like.
  • a transient carrier medium may include a signal such as an electrical signal, an electronic signal, an optical signal, an acoustic signal, a magnetic signal or an electromagnetic signal, e.g. a microwave or RF signal.
  • Figure 1 illustrates a video encoder logical architecture
  • Figure 2 illustrates a video decoder logical architecture corresponding to the video encoder logical architecture illustrated in Figure 1 ;
  • Figure 3 schematically illustrates examples of a YUV sampling scheme for 4:2:0 sampling
  • Figure 4 illustrates, using a flowchart, general steps for generating a block predictor using the LM mode, performed either by an encoder or a decoder;
  • Figures 5A-5B schematically illustrate a chroma block and an associated or collocated luma block, with down-sampling of the luma samples, and neighbouring chroma and luma samples, as known in prior art;
  • Figures 6A-6D illustrate exemplary coding of signalling flags to signal LM modes
  • Figure 7 illustrates points of luma and chroma neighboring samples and a straight line representing the linear model parameters
  • Figure 8 illustrates the main steps of a process of a simplified LM derivation
  • Figure 9 illustrates a combination of the MMLM mode plus parameters derivation through a straight line, including a discontinuity
  • Figure 10 illustrates the concepts of a knee point and a target point to create piecewise (continuous) linear models
  • Figure 11 presents example sets of sample pairs that help define the knee point or the target point
  • Figure 12 illustrates the main steps of a process of a derivation in one embodiment of the invention
  • Figure 13 is a schematic block diagram of a computing device for implementation of one or more embodiments of the invention
  • Figure 14 illustrates the main steps of a derivation process in another embodiment of the invention.
  • Figure 15 illustrates a sample selection to ensure that the number of samples is a power of 2
  • Figure 16 illustrates various solutions to the number of samples not being a power of 2.
  • Figure 1 illustrates a video encoder architecture.
  • an original sequence 101 is divided into blocks of pixels 102, called coding blocks or coding units in HEVC.
  • a coding mode is then assigned to each block.
  • An INTRA coding block is generally predicted from the encoded pixels at its causal boundary by a process called INTRA prediction.
  • the predictor for each pixel of the INTRA coding block thus forms a predictor block.
  • various INTRA modes are proposed: for example, DC mode, a planar mode and angular modes.
  • While Figure 1 is directed to a general description of a video encoder architecture, it is to be noted that a pixel corresponds here to an element of an image that typically consists of several components, for example a red component, a green component, and a blue component.
  • An image sample is an element of an image, which comprises only one component.
  • Temporal prediction first consists in finding in a previous or future frame, called the reference frame 116, a reference area which is the closest to the coding block in a motion estimation step 104. This reference area constitutes the predictor block. Next this coding block is predicted using the predictor block to compute the residue or residual block in a motion compensation step 105.
  • a residue or residual block is computed by subtracting the obtained predictor block from the coding block.
  • a prediction mode is encoded.
  • an index indicating the reference frame used and a motion vector indicating the reference area in the reference frame are encoded.
  • a motion vector is not directly encoded. Indeed, assuming that motion is homogeneous, it is particularly advantageous to encode a motion vector as a difference between this motion vector, and a motion vector (or motion vector predictor) in its surroundings.
  • motion vectors are encoded with respect to a median vector computed from the motion vectors associated with three blocks located above and on the left of the current block. Only a difference, also called residual motion vector, computed between the median vector and the current block motion vector is encoded in the bitstream. This is processed in module “Mv prediction and coding” 117. The value of each encoded vector is stored in the motion vector field 118. The neighbouring motion vectors, used for the prediction, are extracted from the motion vector field 118.
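The median-based prediction described above can be sketched as a component-wise median of the three neighbouring motion vectors, with only the residual written to the bitstream:

```python
def median_mv_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three neighbouring motion vectors.

    Each motion vector is an (x, y) tuple; the predictor takes the
    per-component median of the three candidates.
    """
    def median3(a, b, c):
        return sorted((a, b, c))[1]
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))

def mv_residual(mv_current, mv_pred):
    """Residual motion vector actually encoded in the bitstream."""
    return (mv_current[0] - mv_pred[0], mv_current[1] - mv_pred[1])
```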
  • the HEVC standard uses three different INTER modes: the Inter mode, the Merge mode and the Merge Skip mode, which mainly differ from each other by the signalling of the motion information (i.e. the motion vector and the associated reference frame through its so-called reference frame index) in the bit-stream 110.
  • motion vector and motion information are conflated below.
  • For motion vector prediction, HEVC provides several candidates of motion vector predictor that are evaluated during a rate-distortion competition in order to find the best motion vector predictor or the best motion information for respectively the Inter or the Merge mode. An index corresponding to the best predictor or the best candidate of the motion information is inserted in the bitstream 110. Thanks to this signalling, the decoder can derive the same set of predictors or candidates and use the best one according to the decoded index.
  • the coding mode optimizing a rate-distortion criterion for the coding block currently considered is selected in module 106.
  • a transform, typically a DCT, is then applied to the residual block.
  • a quantization is applied to the obtained coefficients in module 108.
  • the quantized block of coefficients is then entropy coded in module 109 and the result is inserted into the bit-stream 110.
  • the encoder then performs decoding of each of the encoded blocks of the frame for the future motion estimation in modules 111 to 116. These steps allow the encoder and the decoder to have the same reference frames 116.
  • each of the quantized and transformed residual blocks is inverse quantized in module 111 and inverse transformed in module 112 in order to provide the corresponding “reconstructed” residual block in the pixel domain. Due to the loss of the quantization, this “reconstructed” residual block differs from the original residual block obtained at step 106.
  • this “reconstructed” residual block is added to the INTER predictor block 114 or to the INTRA predictor block 113, to obtain a “pre-reconstructed” block (coding block).
  • the “pre-reconstructed” blocks are filtered in module 115 by one or several kinds of post filtering to obtain “reconstructed” blocks (coding blocks).
  • the same post filters are integrated at the encoder (in the decoding loop) and at the decoder to be used in the same way in order to obtain exactly the same reference frames at encoder and decoder ends.
  • the aim of this post filtering is to remove compression artefacts.
  • Figure 2 illustrates a video decoder architecture corresponding to the video encoder architecture illustrated in Figure 1.
  • the video stream 201 is first entropy decoded in a module 202.
  • Each obtained residual block (coding block) is then inverse quantized in a module 203 and inverse transformed in a module 204 to obtain a“reconstructed” residual block. This is similar to the beginning of the decoding loop at the encoder end.
  • an INTRA predictor block is determined 205 based on the INTRA prediction mode specified in the bit-stream 201.
  • the motion information is extracted from the bitstream during the entropy decoding 202.
  • the motion information is composed, for example in HEVC and JVET, of a reference frame index and a motion vector residual.
  • a motion vector predictor is obtained in the same way as done by the encoder (from neighbouring blocks) using already computed motion vectors stored in motion vector field data 211. It is thus added 210 to the extracted motion vector residual block to obtain the motion vector. This motion vector is added to the motion vector field data 211 in order to be used for the prediction of the next decoded motion vectors.
  • the motion vector is also used to locate the reference area in the reference frame 206 which is the INTER predictor block.
  • the “reconstructed” residual block obtained at 204 is added to the INTER predictor block 206 or to the INTRA predictor block 205, to obtain a “pre-reconstructed” block (coding block) in the same way as in the decoding loop of the encoder.
  • this “pre-reconstructed” block is post filtered in module 207 as done at the encoder end (signalling of the post filtering to use may be retrieved from the bitstream 201).
  • A “reconstructed” block (coding block) is thus obtained which forms the decompressed video 209 as the output of the decoder.
  • encoding/decoding process may be applied to monochrome frames.
• most common frames are colour frames generally made of three arrays of colour samples, each array corresponding to a “colour component”, for instance R (red), G (green) and B (blue).
  • a pixel of the image comprises three collocated/corresponding samples, one for each component.
• R, G and B components usually have a high correlation between them. It is thus very common in image and video compression to decorrelate the colour components prior to processing the frames, by converting them into another colour space.
  • the most common format is the YUV (YCbCr) where Y is the luma (or luminance) component, and U (Cb) and V (Cr) are chroma (or chrominance) components.
  • a subsampling scheme is commonly expressed as a three part ratio J:a:b that describes the number of luma and chroma samples in a conceptual 2-pixel-high region.
  • J defines the horizontal sampling reference of the conceptual region (i.e. a width in pixels), usually 4.
  • a defines the number of chroma samples (Cr, Cb) in the first row of J pixels
  • b defines the number of (additional) chroma samples (Cr, Cb) in the second row of J pixels.
  • the number of chroma samples is reduced compared to the number of luma samples.
  • the 4:4:4 YUV or RGB format does not provide subsampling and corresponds to a non-subsampled frame where the luma and chroma frames have the same size W x H.
  • the 4:0:0 YUV or RGB format has only one colour component and thus corresponds to a monochrome frame.
  • Exemplary sampling formats are the following.
  • the 4:2:0 YUV format has half as many chroma samples as luma samples in the first row, and no chroma samples in the second row.
  • the two chroma frames are thus W/2-pixel wide and H/2-pixel height, where the luma frame is W x H.
• the 4:2:2 YUV format has half as many chroma samples in the first row, and half as many chroma samples in the second row, as luma samples.
  • the two chroma frames are thus W/2-pixel wide and H-pixel height, where the luma frame is W x H.
• the 4:1:1 YUV format has 75% fewer chroma samples in the first row, and 75% fewer chroma samples in the second row, than luma samples.
  • the two chroma frames are thus W/4-pixel wide and H-pixel height, where the luma frame is W x H.
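As an aside, the J:a:b ratios above map to chroma plane dimensions in a simple way. The following sketch is not part of the patent; the helper name and the simplification that b is either 0 or equal to a are mine, chosen to cover exactly the three formats listed:

```python
def chroma_plane_size(W, H, J, a, b):
    """Chroma plane dimensions for a W x H luma frame under a J:a:b ratio.
    Simplified: assumes b is either 0 (no chroma in the second row) or a."""
    width = W * a // J              # a chroma samples per J luma columns
    height = H if b > 0 else H // 2
    return width, height

assert chroma_plane_size(1920, 1080, 4, 2, 0) == (960, 540)    # 4:2:0
assert chroma_plane_size(1920, 1080, 4, 2, 2) == (960, 1080)   # 4:2:2
assert chroma_plane_size(1920, 1080, 4, 1, 1) == (480, 1080)   # 4:1:1
```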
  • the positions of the chroma samples in the frames are shifted compared to the luma sample positions.
  • Figure 3 illustrates an exemplary positioning of chroma samples (triangles) with respect to luma samples (circles) for a 4:2:0 YUV frame.
  • the encoding process of Figure 1 may be applied to each colour-component frame of an input frame.
  • CCP Cross-Component Prediction
  • CCP methods can be applied at different stages of the encoding or the decoding process, in particular either at a first prediction stage (to predict a current colour component) or at a second prediction stage (to predict a current residual block of a component).
• CCLM Cross-Component Linear Model prediction
  • Cb and Cr chroma components
  • U and V chroma components
  • CU coding unit
  • PU prediction unit
  • sub-PU or TU transform unit
  • Figure 4 illustrates as an example, using a flowchart, general steps for generating a block predictor using the LM mode, performed either by the encoder (used as reference below) or the decoder.
  • an exemplary first component is chroma while an exemplary second component is luma.
  • the encoder (or the decoder) receives, in step 401 , a neighbouring luma sample set RecL comprising luma samples 503 neighbouring the current luma block, and receives a neighbouring chroma sample set RecC comprising chroma samples 501 neighbouring the current chroma block, denoted 402.
  • the luma samples 504 and 503 are not directly adjacent to luma block 505 as depicted in Figure 5A.
• for the left samples RecL’ (503), only the second left column is needed and not the column directly adjacent to the block.
• the second up line is also considered for the down-sampling of luma samples, as depicted in Figure 5A.
• When a chroma sampling format is used (e.g. 4:2:0, 4:2:2, etc.), the neighbouring luma sample set is down-sampled at step 403 into RecL’ 404 to match chroma resolution (i.e. the sample resolution of the corresponding chroma frame/block).
• RecL’ thus comprises reconstructed luma samples 504 neighbouring the current luma block that are down-sampled. Thanks to the down-sampling, RecL’ and RecC comprise the same number 2N of samples (chroma block 502 being N x N). Yet, particular down-samplings of the luma border exist in the prior art where fewer samples are needed to obtain RecL’. In addition, even if RecL and RecC have the same resolution, RecL’ can be seen as a denoised version of RecL, through the use of a low-pass convolution filter.
  • the neighbouring luma and chroma sample sets are made of the down-sampled top and left neighbouring luma samples and of the top and left neighbouring chroma samples, respectively. More precisely each of the two sample sets is made of the first line immediately adjacent to the left boundary and the first line immediately adjacent to the top boundary of their respective luma or chroma block. Due to down-sampling (4:2:0 in Figure 5A), the single line of neighbouring luma samples RecL’ is obtained from two lines of non down-sampled reconstructed luma samples RecL (left or up).
  • the linear model which is defined by one or two parameters (a slope a and an offset b) is derived from RecL’ (if any, otherwise RecL) and RecC. This is step 405 to obtain the parameters 406.
• the LM parameters a and b are obtained using a least mean square-based method using the following equations:
a = (M·Σ(RecCi·RecL’i) − ΣRecCi·ΣRecL’i) / (M·Σ(RecL’i·RecL’i) − (ΣRecL’i)²)
b = (ΣRecCi − a·ΣRecL’i) / M
  • the value of M used as a weight in this equation may be adjusted to avoid computational overflows at the encoder and decoder.
  • some of the computations may sometimes overflow and thus cause unspecified behaviour (which is strictly prohibited in any cross platform standard).
  • the maximum magnitude possible given inputs RecL’ and RecC values may be evaluated, and M (and in turn the sums above) may be scaled accordingly to ensure that no overflow occurs.
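As a sketch of this least mean square derivation (floating-point for readability; the intermediate names s_l, s_c, s_lc, s_ll are mine, while A1 and A2 follow the text; real implementations use the scaled integer arithmetic discussed above to avoid overflow):

```python
def lms_params(recl, recc):
    """Least mean square fit of RecC ≈ a * RecL' + b over M sample pairs."""
    M = len(recl)
    s_l = sum(recl)
    s_c = sum(recc)
    s_lc = sum(l * c for l, c in zip(recl, recc))
    s_ll = sum(l * l for l in recl)
    A1 = M * s_lc - s_l * s_c       # numerator of the slope
    A2 = M * s_ll - s_l * s_l       # denominator of the slope
    a = A1 / A2 if A2 != 0 else 0.0
    b = (s_c - a * s_l) / M         # offset from the sample means
    return a, b

# A perfectly linear neighbourhood recovers the exact model
assert lms_params([0, 2, 4, 6], [1, 3, 5, 7]) == (1.0, 1.0)
```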
• the derivation of the parameters is usually made from the sample sets RecL’ and RecC shown in Figure 5A.
  • US 9,288,500 proposes three competing sample sets, including a first sample set made of the outer line adjacent to the top boundary and the outer line adjacent to the left boundary, a second sample set made of only the outer line adjacent to the top boundary and a third sample set made of only the outer line adjacent to the left boundary. These three sample sets are shown in Figure 6B for the chroma block only (and thus can be transposed to the luma block).
  • US 9,462,273 extends the second and third sample sets to additional samples extending the outer lines (usually doubling their length).
  • the extended sample sets are shown in Figure 6C for the chroma block only.
  • This document also provides a reduction in the number of LM modes available in order to decrease the signalling costs for signalling the LM mode used in the bitstream. The reduction may be contextual, for instance based on the Intra mode selected for the associated luma block.
  • US 9,736,487 proposes three competing sample sets similar to those of US 9,288,500 but made, each time, of the two lines of outer neighbouring samples parallel and immediately adjacent to the boundaries considered. These sample sets are shown in Figure 6D for the chroma block only.
• Figure 15 illustrates one way to guarantee this: for rectangular shapes, the selected samples are marked with an ‘X’. As all dimensions of a block are powers of 2, and various rectangular shapes are allowed, e.g. 1501 to 1503, it is possible (and thus done) to select the same number of samples on each border.
  • Blocks 1502 and 1503 illustrate the symmetrical nature of the solution, as one out of 2 is selected on the longest border.
• the sampling ratio is therefore 1 out of 2^M, whereby M is such that 2^N × (2^M·2^N) or (2^M·2^N) × 2^N are the sizes of the concerned blocks.
• the number of samples is then 2·2^N, therefore a power of 2.
• Block 1504 uses another coding mode, in VVC called INTRA_T_CCLM, which does not use sampling: the sampling for INTRA_LT_CCLM was due to the use of the LMS method, while the INTRA_T_CCLM and INTRA_L_CCLM modes were introduced with another method computing the model parameters from extremum points.
  • a chroma intra predictor 413 for chroma block 502 may thus be obtained from the reconstructed luma samples 407 of the current luma block represented in 505.
  • a chroma sampling format e.g. 4:2:0, 4:2:2, etc.
• the reconstructed luma samples are down-sampled at step 408 into L’ 409 to match chroma resolution (i.e. the sample resolution of the corresponding chroma frame/block).
  • a 6-tap filter may be used to provide the down- sampled value as a weighted sum of the top left, top, top right, bottom left, bottom and bottom right samples surrounding the down-sampling position.
  • a mere 2-tap filter is used instead of the 6-tap filter.
• output L’ of an exemplary 6-tap filter is obtained as follows:
L’[i,j] = ( 2·f[2i, 2j] + 2·f[2i, 2j+1] + f[2i−1, 2j] + f[2i+1, 2j] + f[2i−1, 2j+1] + f[2i+1, 2j+1] + 4 ) » 3
with (i,j) being the coordinates of the sample within the down-sampled block, f the reconstructed luma samples, and » being the bit-right-shifting operation.
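The 6-tap down-sampling can be sketched as follows (the array layout, indexed [x][y], and the absence of picture-border handling are my simplifications):

```python
def downsample_6tap(f, i, j):
    """Down-sampled luma at (i, j): weighted sum of the six full-resolution
    neighbours around column 2i and rows 2j, 2j+1 (weight 2 on the centre)."""
    return (2 * f[2*i][2*j] + 2 * f[2*i][2*j + 1]
            + f[2*i - 1][2*j] + f[2*i + 1][2*j]
            + f[2*i - 1][2*j + 1] + f[2*i + 1][2*j + 1] + 4) >> 3

# On a flat area the filter is the identity (the weights sum to 8)
flat = [[8] * 4 for _ in range(4)]
assert downsample_6tap(flat, 1, 0) == 8
```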
• L’ and C blocks (the set of chroma samples in chroma block 502) comprise the same number N² of samples (chroma block 502 being N x N).
• each sample of the chroma intra predictor PredC 413 is calculated using the loop 410-411-412 following the formula: PredC[i,j] = a·L’[i,j] + b
  • the computations may be implemented using less complex methods based on look-up tables and shift operations.
• the actual chroma intra predictor derivation 411 may be done as follows: PredC[i,j] = ((A·L’[i,j]) » S) + b
  • S is an integer and A is derived from A1 and A2 (introduced above when computing a and b) using the look-up table mentioned previously. It actually corresponds to a rescaled value of a.
• the operation (x » S) corresponds to the bit-right-shifting operation, equivalent to an integer division of x (with truncation) by 2^S.
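A minimal sketch of this integer derivation (the function name is mine; clipping of the result to the valid sample range is omitted):

```python
def chroma_pred_sample(A, S, b, lum):
    """PredC = ((A * L') >> S) + b: an integer multiply plus shift
    standing in for the floating-point slope a = A / 2**S."""
    return ((A * lum) >> S) + b

# A = 13, S = 3 approximates a = 13/8 = 1.625
assert chroma_pred_sample(13, 3, 2, 10) == 18   # (130 >> 3) + 2
```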
  • the chroma intra predictor 413 is available for subtraction from chroma block 502 (to obtain a chroma residual block) at the encoder end or for addition to a chroma residual block (to obtain a reconstructed chroma block) at the decoder end.
  • the chroma residual block may be insignificant and thus discarded, in which case the obtained chroma intra predictor 413 directly corresponds to predicted chroma samples (forming chroma block 502).
  • JVET Joint Video Exploration Team
• this so-called ‘multiple model’ MMLM mode uses two linear models.
  • the neighbouring reconstructed luma samples from the RecL’ set and the neighbouring chroma samples from the RecC set are classified into two groups, each group being used to derive the parameters a and b of one linear model, thus resulting in two sets of linear model parameters (ai,bi) and (a2,b2).
  • a threshold is calculated as the average value of the neighbouring reconstructed luma samples forming RecL’.
• a neighbouring luma sample with RecL’[i,j] ≤ threshold is classified into group 1, while a neighbouring luma sample with RecL’[i,j] > threshold is classified into group 2.
• the chroma intra predictor (or the predicted chroma samples for current chroma block 602) is obtained according to the following formulas:
PredC[i,j] = a1·L’[i,j] + b1 if L’[i,j] ≤ threshold
PredC[i,j] = a2·L’[i,j] + b2 if L’[i,j] > threshold
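The classification and the two-model prediction can be sketched as follows (a plain-Python illustration; the function names are mine, and the fitting of each group's model is assumed done elsewhere):

```python
def classify(pairs):
    """Split neighbouring (luma, chroma) pairs into two groups around
    the average neighbouring luma value (the threshold)."""
    threshold = sum(l for l, _ in pairs) // len(pairs)
    g1 = [p for p in pairs if p[0] <= threshold]
    g2 = [p for p in pairs if p[0] > threshold]
    return threshold, g1, g2

def mmlm_predict(lum, threshold, m1, m2):
    """Apply (a1, b1) at or below the threshold, (a2, b2) above it."""
    a, b = m1 if lum <= threshold else m2
    return a * lum + b

thr, g1, g2 = classify([(1, 10), (3, 12), (5, 20), (7, 22)])
assert thr == 4 and len(g1) == 2 and len(g2) == 2
assert mmlm_predict(2, thr, (2, 1), (1, 3)) == 5   # model 1: 2*2 + 1
```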
  • the CCLM or MMLM mode has to be signalled in the bitstream 110 or 201.
  • Figure 8 illustrates an exemplary LM mode signalling of JEM.
  • a first binary flag indicates whether the current block is predicted using an LM mode or other intra modes, including so-called DM modes.
• the first MMLM mode (using the 6-tap filter) is signalled with one second binary flag set to 1.
  • This second binary flag is set to 0 for the remaining modes, in which case a third binary flag is set to 1 to signal the CCLM mode and is set to 0 for the remaining MMLM modes.
  • Two additional binary flags are then used to signal one of the four remaining MMLM modes.
  • One mode is signalled for each chroma component.
  • the Cb-to-Cr CCLM mode introduced above is used in DM modes, and applies at residual level. Indeed, a DM mode uses for chroma the intra mode which was used by luma in a predetermined location.
• a coding mode like HEVC uses one single DM mode, co-located with the top-left corner of the CU. Without going into too much detail, and for the sake of clarity, JVET provides several such locations. This mode is then used to determine the prediction method, therefore creating a usual intra prediction for a chroma component which, when subtracted from the reference/original data, yields the aforementioned residual data.
• the prediction for the Cr residual is obtained from the Cb residual (ResidualCb below) by the following formula: PredResidualCr[i,j] = a · ResidualCb[i,j]
  • RecCbi represents the values of neighbouring reconstructed Cb samples
  • RecCri represents the neighbouring reconstructed Cr samples
• the LM modes currently present (or proposed) in VVC suffer from their modelling being inaccurate, while the current (or proposed) MMLM mode suffers from a high complexity (due to the least-squares approach of fitting a linear model).
  • a proposed alternative LM is based on the replacement of the derivation of a single linear model used to compute chroma predictor block samples from luma block samples by determination of the parameters of the linear model based on the equations of straight lines.
  • a straight line is defined by two sample pairs defined based on reconstructed sample pairs in the neighbourhood of the block.
• Figure 7 illustrates the principle of this method by considering here the minimum and the maximum of luma sample values in the set of sample pairs in the neighbourhood of the current block. All the sample pairs are drawn on the figure according to their chroma value and their luma value. Two different points, namely point A and point B, are identified on the figure, each point corresponding to a sample pair. Point A corresponds to the sample pair with the lowest luma value xA from RecL’ and yA its collocated chroma value from RecC. Point B corresponds to the sample pair with the highest luma value xB and yB its collocated chroma value.
  • Figure 8 shows a flow chart of a method to derive the linear model parameters shown in Figure 7.
• This flow chart is a simplified version of Figure 4. The method is based on the neighbouring luma samples RecL’ obtained in step 801 and chroma samples RecC obtained in step 802.
  • the two points A and B (804) corresponding to two sample pairs are determined.
• these two points A and B correspond to the sample pairs with respectively the lowest and highest luma sample values xA and xB, with their corresponding chroma sample values yA and yB.
• in step 805, the parameters of the straight line passing through A and B are derived as a = (yB − yA) / (xB − xA) and b = yA − a·xA. The obtained a, b are the linear model parameters 806 used to generate the chroma predictor.
  • M is the number of pairs of samples RecCi and RecL'i.
  • the derivation step 805 requires only one multiplication, three sums and one division. This large complexity reduction in generating the linear model parameters is a major advantage of the proposed invention.
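Steps 803 to 805 can be sketched as follows (floating-point for clarity; the fixed-point variant with a look-up table is discussed below, and the degenerate-case fallback is my addition):

```python
def straight_line_params(pairs):
    """Derive (a, b) from the sample pairs with the lowest and highest
    luma values: one division, one multiplication, three differences."""
    xa, ya = min(pairs)          # point A: lowest luma value
    xb, yb = max(pairs)          # point B: highest luma value
    if xb == xa:                 # degenerate case: flat luma neighbourhood
        return 0.0, float(ya)
    a = (yb - ya) / (xb - xa)
    b = ya - a * xa
    return a, b

assert straight_line_params([(0, 1), (2, 5), (4, 9)]) == (2.0, 1.0)
```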
  • the search for the minimum and maximum values has a complexity of its own, typically related to sorting algorithm.
  • the operation is not completely serial: N points can be compared to N other points, generating N minimum/maximum. Then N/2 minimum and N/2 maximum points can be compared to the N/2 others, then again N/4 and so on until only the desired numbers of minimum and maximum points remain.
• the search for the minimum and maximum thus results in approximately 2·N − 2 comparisons (N − 1 for each).
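The partially parallel tournament search described above can be sketched as:

```python
def tournament_minmax(vals):
    """Compare N values with N others, then N/2 with N/2, and so on,
    until a single minimum and a single maximum remain."""
    mins, maxs = list(vals), list(vals)
    while len(mins) > 1:
        half = len(mins) // 2
        mins = [min(a, b) for a, b in zip(mins[:half], mins[half:2*half])] + mins[2*half:]
        maxs = [max(a, b) for a, b in zip(maxs[:half], maxs[half:2*half])] + maxs[2*half:]
    return mins[0], maxs[0]

assert tournament_minmax([3, 1, 4, 1, 5, 9, 2, 6]) == (1, 9)
```

Each round halves the candidate list, so the comparisons within a round are independent and can be performed in parallel.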
  • the chroma predictor can be calculated with an integer multiplication and a shift instead of a floating-point multiplication, and a division when computing the slope.
  • This simplification consists in replacing:
• the value of S is forced to be low, as L could otherwise be large and require larger multiplier operations. Indeed, multiplying an 8-bit value by another 8-bit value is much easier to implement than e.g. an 8×16 multiplier. Typical practical values for L are often equivalent to a multiplier of less than 8 bits.
• using a look-up table TAB, the computation of L thus becomes:
  • Figure 9 illustrates a proposed way of combining the MMLM mode with the straight line method described above. Given the threshold value Ymean, two segments are thus defined:
• A is the sample pair having the smallest luma sample value
• C is the one having the largest luma sample value below Ymean
• D is the one having the smallest luma sample value above Ymean
• B is the sample pair having the largest luma sample value
• two models can be determined by computing their parameters, respectively (a1, b1) and (a2, b2), as the slope and intercept of respectively the dashed line and dotted line passing through respectively A and C, or D and B.
• Element 804 circles a discontinuity of the models centered around Ymean.
• this method relies heavily on the selection of points C and D, with minor differences potentially making a large impact on the parameters a1, a2, b1, or b2. Furthermore, this method introduces a discontinuity which may yield incorrect or inaccurate predictions for values immediately on either side of the discontinuity.
• the present invention seeks to improve the situation in terms of coding efficiency and/or computational complexity.
• Figure 10 first illustrates the concept of a ‘target’.
• the goal is to find T, indicated by an arrow to a mid-point between points A and B; this point is termed a ‘knee point’ and is defined by a point which is, or best matches, a particular target.
  • the dashed and dotted line equations relating to a linear model linking adjacent points A -> T and T -> B respectively can be determined, and thus a continuous linear model can be determined from the parameters of the two lines linking the three points A->T->B.
  • Such a method provides an improvement in accuracy by utilizing multiple models, while maintaining the simplicity of the LM, and avoiding the problems arising from a discontinuity. In such a way, efficient coding can be achieved.
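A sketch of the continuous two-segment model through A, T and B (the function names are mine):

```python
def piecewise_params(A, T, B):
    """Two line segments A->T and T->B that share the knee point T,
    making the overall model continuous at T."""
    (xa, ya), (xt, yt), (xb, yb) = A, T, B
    a1 = (yt - ya) / (xt - xa)
    a2 = (yb - yt) / (xb - xt)
    return (a1, ya - a1 * xa), (a2, yt - a2 * xt)

def piecewise_predict(lum, T, seg1, seg2):
    a, b = seg1 if lum <= T[0] else seg2
    return a * lum + b

m1, m2 = piecewise_params((0, 0), (4, 8), (8, 10))
# Both segments agree at the knee: no discontinuity
assert m1[0] * 4 + m1[1] == m2[0] * 4 + m2[1] == 8
```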
  • determining the knee point comprises finding the sample pair which minimizes the distance (or has a distance below a threshold) between its luma sample value and a target luma sample.
  • the target is the average value Ymean calculated over all sample pairs considered.
  • One or the other can be performed while searching for the extremum points (A and B in Figure 10), but the other cannot.
  • determining the target may require additional processing steps and as such it would be preferable to perform it while or before finding the extremum points.
• One first set of embodiments relies on an iterative search of the target point. Referring to Figure 11 and its illustrative samples, several embodiments can be illustrated. Examples for determining a “target” can be defined as:
  • a luma average is first computed using points on each side of a first border (e.g. the top).
  • the number of points is a (small, such as 2/4/8) power of 2, as the average would require a division (as opposed to a bit-shift) if it were not a power of 2.
  • Embodiments include:
  • the processing starts with the top border, then the aforementioned average for the top border is used to find the T point whose luma is nearest to it.
  • the first border is not available (i.e. the extremum and T points are undefined), which can be the case at the frontier of a slice or when constrained intra prediction is used.
• the former splits the image according to image or tile boundaries and another criterion, such as the number of CTUs or the coded bitstream size, while the latter is an error-resilience tool, which forbids intra prediction from using data coming from temporally predicted data (such as sample values), thereby breaking the dependence on possibly corrupted or missing data.
• the average on said border for one of all the CCLM modes is used. Otherwise, the “middle” point between the A and B points as found on the previous border is used.
  • the average or middle point found is not necessarily updated when iterating over sample pairs of a border, while the minimum and maximum point are.
  • the target for the T point is updated when a new minimum or maximum point is found.
• further criteria can be applied to decide when to update said target: if enough sample pairs have been investigated, or if the change in the target is above a threshold, then the values are safe to update. This is particularly true at the start of the iterative search, when the distance between the current minimum and maximum is too small and may cause the target to be set too close to either one, preventing fast updates afterwards.
  • T is the point whose luma and chroma are the averages of the luma and chroma sample values of a set of sample pairs.
  • this set preferably contains a power-of-2 number of pairs for easier computation of the average. In the extreme case, all sample pairs that are currently used are used for the average.
  • a particularly advantageous embodiment takes into account that, if the model has a strong mismatch in the middle, and it is derived from the extremum points, these extremum points are outliers. It is therefore beneficial to remove them from the averages computed. A potential drawback is then that the number of sample pairs used in the computation of averages is no longer a power of 2. A first embodiment therefore adds two additional sample pairs to compensate for the extremum ones to be removed. The locations of these depend on the sample set used. However, there is a risk that these are actually outliers too.
• the number of samples may not be a power of 2 and, as discussed above, this may be detrimental, as the true ‘avg’ value (computed by summing the sample values then dividing by the number of samples) would no longer be able to be calculated solely using shift operations to perform the division.
• Such an occurrence is exemplified in block 1504 of Figure 15.
  • This block is coded according to a CCLM mode employing only the extended top border, known as e.g. INTRA_T_CCLM.
  • This extension works by using the already coded parts of the image.
  • the 8 necessary samples on the top border are easily available. More generally, there are 2N samples for an NxM block, and conversely 2M samples for the same block coded using the INTRA_L_CCLM mode.
  • the block may lie near the right or bottom border of the image and as such the samples needed on the border may not exist.
• CTB/CTU: the largest coded block
• a sub-unit, for instance known as the Video and Data Processing Unit (VDPU)
• the Video and Data Processing Unit is a virtual processing unit that helps restrict the amount of data to cache for implementations, and is currently 64x64 (i.e. a quarter of the largest possible, 128x128, CTB).
  • a block whose 2N samples border would extend past the VDPU boundary is not allowed, and as such may result in ‘clipping’ of the number of available samples.
• the number of samples may not be a power of 2. It should be noted also that, currently in VVC, only the INTRA_LT_CCLM mode uses a sampling ratio for the longest border, as is depicted on blocks 1501 to 1503; therefore, as such, a solution is required.
• Sampling. In this set of embodiments, a power-of-2 number of samples is selected from the available samples. Such ‘sampling’ may be regular (i.e. selecting every Nth available sample), or more adaptive as discussed below.
  • Figure 16 illustrates various solutions to the number of samples not being a power of 2, at least for the case of a single border.
• Block 1600 uses for instance a sampling of 6:4. As this is not an integer ratio, the selected samples, marked by an ‘X’, result from the rounding: 0×1.5 is rounded to 0, 1×1.5 is rounded to 1, 2×1.5 is rounded to 3 and finally 3×1.5 is rounded to 4. This is a simple extension of the sampling ratio to guarantee a power of 2.
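The rounding-based selection for block 1600 can be sketched as follows (the function name is mine; integer truncation reproduces the 0, 1, 3, 4 pattern described above):

```python
def select_positions(n_avail, n_target):
    """Pick n_target (a power of 2) sample positions out of n_avail
    available ones, by scaling the index and truncating."""
    return [k * n_avail // n_target for k in range(n_target)]

# 6 available samples, 4 selected: step 1.5, truncated
assert select_positions(6, 4) == [0, 1, 3, 4]
```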
  • This method is more effective in terms of coding efficiency, as well as implementation complexity, than using the sampling ratio discussed above, and as such may represent a general preferred embodiment for ensuring a power-of-2 number of samples.
• block 1602 shows the case where these M samples are centered (if possible) in the border, and block 1603 where they are located at the ‘end’ of the border (referred to later as Last 2^N).
  • Another criterion could be to select either of patterns illustrated for blocks 1601 to 1603 depending on a property of the samples on the border.
  • Such criterion can be implicit, e.g. based on the quantizer step or coding mode (e.g. INTRA vs INTER, the intra or inter prediction method, the coding of residuals, etc) used for coding the blocks where these samples are located. It can also be explicit, e.g.
• a further embodiment observes that a troublesome number of samples is often of the form 3·2^N (e.g. 3/6/12/24/48). This may happen because the encoder tries to select a block size as large as possible but is required to leave out a part to a much smaller block. As a consequence, many troublesome cases can be handled by applying a scaling factor to the average so that bitwise shifting can be used to calculate the division. For the troublesome numbers of samples discussed above, this factor is around 2/3.
• N is based on the number of samples being the nearest smaller power-of-two number. As observed previously, this should be N for numbers of samples 3·2^N and 2^N. In any case, such an N value can be obtained by various means, such as from a table or bit counting (such as count leading/trailing zero bits processor instructions).
• the variables ‘fact’ and ‘prec’ are 1 and 0 respectively for numbers of samples being 2^N, or selected such that fact/2^prec is close to 2/3 otherwise. Examples of values for fact and prec are therefore 21 and 5, 11 and 4, or 5 and 3. Results for a fact/2^prec of 21/32 are provided below.
• R is a rounding parameter that can be equal to e.g. 0, 2^(N−1) or 2^(N+prec−1), the latter giving the rounding to nearest integer.
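Putting N, fact, prec and R together, the division-free average can be sketched as follows (an assumption of mine: N is derived by bit counting as the exponent of the nearest smaller-or-equal power of two):

```python
def approx_avg(samples, fact=1, prec=0):
    """Division-free 'avg': (sum * fact + R) >> (N + prec), with N the
    nearest smaller-or-equal power-of-two exponent and R the rounding
    term giving round-to-nearest."""
    N = len(samples).bit_length() - 1
    R = 1 << (N + prec - 1) if (N + prec) > 0 else 0
    return (sum(samples) * fact + R) >> (N + prec)

# Exact for a power-of-2 count of samples...
assert approx_avg([3, 5, 7, 9]) == 6
# ...approximate for 12 samples, using fact/2**prec = 21/32
assert approx_avg([100] * 12, fact=21, prec=5) == 98
```

The second result illustrates the inaccuracy the text mentions: the true average is 100, while the shift-based computation yields 98.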
• a further embodiment consists in always using 1 and 0 for respectively fact and prec, discussed above in relation to the ‘scaling’ solution. This solution is referred to as CE3-1.8.1.
• the computed ‘avg’ value is no longer a true average, and introduces some inaccuracies in the derivation of the slope of the line (a) but, surprisingly, this works sufficiently well to provide useful gains in coding efficiency without introducing additional complexity.
• the use of the variable ‘avg’, or of the word ‘average’, in this specification means a value computed by any of the various preceding embodiments, as opposed to a true mathematical average.
  • YUV is a weighted average of the so-called BDRate gains for the Y, U and V components of test video sequences.
• AI is an encoding scenario where all frames encoded are independent images (i.e. using only intra prediction/coding) and RA is an encoding scenario where the forcing to use an intra image happens approximately every second (e.g. every 64 frames at 60Hz), while other images may reference other images (i.e. using inter prediction/coding besides intra prediction/coding).
  • N is a predetermined value, possibly related to the number of sample pairs having been selected to compute the average.
• the value of N may depend on the block size, e.g. if there are 2^P samples used on the border(s), then N is available from a table indexed with P. Part or all of the table may be such that, for a given P, the corresponding value satisfies N ≤ P.
• the average may be rounded by using a value other than 2^(N−1).
• avg = clip(min, max, avg + ((2·avg − min − max) » N)), with clip(a, b, c) doing the expected clipping operation:
- if c < a, returns a;
- if c > b, returns b;
- otherwise, returns c.
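This adjustment can be sketched as follows (N = 2 is used as a default here; note that Python's right shift on negative values is an arithmetic shift, which matches the intended behaviour):

```python
def clip(lo, hi, v):
    """Clamp v into [lo, hi]."""
    return lo if v < lo else hi if v > hi else v

def adjust_avg(avg, mn, mx, N=2):
    """Move avg further from the mid-range (toward the nearer extremum)
    by a fraction of its offset, then clip back into [mn, mx]."""
    return clip(mn, mx, avg + ((2 * avg - mn - mx) >> N))

assert adjust_avg(10, 0, 12) == 12   # 10 + (8 >> 2) = 12
assert adjust_avg(2, 0, 12) == 0     # 2 + (-8 >> 2) = 0
```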
• nShift = Log2( nS ) + ( availL && availT ? 1 : 0 )
  • nS is the number of samples on one border, i.e. the total number of samples can be 2 * nS.
• availL and availT are the availability of the left and top neighbouring samples respectively, used in the derivation process for a block.
• Log2( ) is the result of the base-2 logarithm truncated to an integer (e.g. Log2(6) = 2).
• nShift is greater than Log2(nS) by 1 when both the left and top samples are available.
  • nS is defined as the number of samples inspected on one border, and there are scaling factors xS and yS to convert an index of the inspected sample into its position, i.e. last one on the top border would be xS * (nS-1 ), and last one on the left border would be yS * (nS-1 ).
• Their definitions in the current VVC specifications make them integers, and thus of practical implementation, while the first embodiment may result in non-integers (e.g. 40/32 or 12/8).
• avgC = avgC + ( ( 2·avgC − minC − maxC ) » 2 )
• avgY = avgY + ( ( 2·avgY − minY − maxY ) » 2 )
  • the additional‘knee points’ correspond to fractional mid-points (e.g. the mid-point multiplied by a rational number) which are distributed based on the calculated mid-point (which, as described above, may account for the extremum points being outliers). It should be noted that some of the points calculated may fall outside the range of the maximum and minimum values - in particular if the mid point is calculated ignoring these values.
• A+B ≤ 2^N, so that the true average still has a positive contribution to the above weighted average
  • the final value for avg is thus a weighted average between the minimum, maximum and true average values, allowing various segmentations of the sample value range.
  • This formula can be used to generate additional knees/target points for either the iterative search embodiments or the ones above. For instance, the following knee points are below the middle, but still differ from the first quarter of the range:
  • a knee/target point above the middle can be defined as:
• knee point computations may involve at least two of the average, minimum and maximum points; even the average is merely a convenient value, and could be computed on fewer sample values (e.g. any of A to I in Figure 11).
  • the application of all or parts of the parameters derivation can further be subject to particular conditions.
• it is not required that the proposed MMLM technique be a separate mode. Instead, and as made evident in the previous paragraphs, it can be applied to any CCLM mode.
  • coding efficiency gains and complexity reduction can be achieved by making the use of such modes conditional. For instance, finding and using the T point / the knee can be made conditional on any or several of the following.
  • the two-model (i.e. continuous piecewise linear model) case can be restricted to the chroma modes within a subset of all the cross-component chroma modes (say INTRA_LT_CCLM, or 2 out of 6, if there were to be 6 such modes).
• Another embodiment concerns the block size or number of samples. This embodiment is particularly beneficial if the luma range over the borders is large, which happens more frequently on larger blocks. However, deriving two models can be seen as more complex than one, and this worst-case situation can be restricted to easier blocks complexity-wise. All implementations of either any MMLM mode (Figure 9) or of embodiments of the continuous piecewise straight lines method (Figure 10) have been found to offer significant improvement to coding performance over the simple LM model (Figure 7) when applied to blocks of strictly more than 16 samples (compared to an absolute minimum number of samples in a chroma block of 4).
  • Typical block sizes that can be excluded are thus 2x2, 2x4, 4x2, 4x4, 8x2 and 2x8 if the restriction is for >16 samples.
  • Typical block sizes that can be excluded are 4x8, 8x4, 2x16 and 16x2 if the restriction is for >32 samples.
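As a rough illustration of the sample-count restrictions above (the function name and interface are hypothetical, not part of the described embodiments):

```python
def allow_piecewise_mode(width: int, height: int, min_samples: int = 16) -> bool:
    # The two-model (piecewise) derivation is only considered for chroma
    # blocks with strictly more than min_samples samples.
    return width * height > min_samples
```

With min_samples = 16, blocks such as 4x4 or 2x8 fall back to the single-model derivation, while 8x4 qualifies.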
  • the piecewise modelling, or MMLM can be included as separate modes. These modes can be already known modes with the addition of the knee point search, or modes using e.g. new sets of samples on the border, different luma downsampling filters, or modes using multiple models differing from the piecewise modelling, etc.
  • N modes can be indicated using between 1 and N-1 flags. Reordering of the coding modes, or not including some of the single-model ones, may happen: e.g. codeword '10', previously meaning "mode 1", may instead mean "mode 1 using piecewise modelling", "110" may mean "mode 2 using piecewise modelling", "1110" may mean "mode 3", which can be an MMLM mode, and "1111" may mean "mode 1", in which case "mode 2" cannot be used (nor signalled).
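The example codeword mapping above can be sketched as a toy lookup table (both the mode labels and the table are illustrative, not normative):

```python
# Hypothetical codeword-to-mode table following the example in the text.
CODEWORDS = {
    "10":   "mode 1 using piecewise modelling",
    "110":  "mode 2 using piecewise modelling",
    "1110": "mode 3 (an MMLM mode)",
    "1111": "mode 1",
}

def decode_mode(codeword: str) -> str:
    # Look up a fully parsed codeword; a real decoder would consume
    # bits one by one (unary-style) instead.
    return CODEWORDS[codeword]
```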
  • the use of CABAC-only bits or of unary coding does not preclude the use of another signalling method, as long as the method is adaptive to block size or number of samples.
  • the number of samples on the border depends on the border availability and decimation scheme used to guarantee that a power-of-2 number of samples is used. As such, some larger blocks, which do not have an available border, or use certain decimation schemes would also be excluded.
  • the MMLM mode or continuous piecewise straight lines method is used in preference to a single LM method in dependence on the block size and/or the number of samples.
  • a further class of embodiments concerning the adaptive use of a piecewise model is via a determination of the luma ranges of each segment:
  • the single-line model is more likely valid in the middle of the luma range. For 10-bit content, a good threshold on the distance between maximum and minimum luma values, below which a single line is used, is between 48 and 96.
  • the alpha parameter from the other range is selected, meaning the parameters are (a1,b1) and (a1,b2), or (a2,b1) and (a2,b2).
  • a good condition is that the distance between T and the concerned extremum is below 6, though any smaller value also works.
  • a particular case is when one, or both, luma distances from each extremum to T is not above a threshold; the model can then be skewed. Indeed, this means, first, that the average is very close to one of the extrema, which is then not an outlier. Second, if the luma distances are small, then quantization errors may strongly affect the terms used in computing the model parameters, and thus cause very large relative changes. One such case is a value of 2 for the luma distance: even an error of 1 will cause a value of 1/2 to be used instead of 1/3 or 1/1. Third, this means that the prediction block may use as input sample values that are far outside the luma range observed on the borders, and the latter may inadequately represent the former.
  • midY and midC are respectively the luma and chroma values of a knee point, determined according to any of the previous embodiments for determining a knee point. For instance, they can be the averages computed as previously seen by removing the minimum and maximum for respectively luma (minY and maxY) and chroma (minC and maxC).
  • knee points usable in the following are the Q1 or Q2 points as computed earlier.
  • 'diff' represents the denominator of the respective slope a.
  • the use of 'shift' (which depends on the bitdepth of the samples) essentially restricts the total number of values this difference can take to a constant (irrespective of the bitdepth). In the example above, the number of values is 512, but it could be lowered to 256 or 128 by increasing the value of shift by one or two respectively.
  • Such values of diff can then be stored in a table to eliminate the need to compute them each time.
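One way to realise the fixed number of values described above is to quantize the luma range by a bitdepth-dependent shift; the exact shift derivation below is an assumption consistent with the stated counts:

```python
def quantized_diff(max_luma: int, min_luma: int, bitdepth: int,
                   extra_shift: int = 0) -> int:
    # Shift chosen so a 10-bit range maps onto at most 512 distinct
    # values; extra_shift = 1 or 2 lowers this to 256 or 128.
    shift = max(bitdepth - 9, 0) + extra_shift
    return (max_luma - min_luma) >> shift
```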
  • a value derived from 'diff' is a 'division' value as follows:
  • the shift value of '16' may be termed 'k' and represents a division accuracy; it may be reduced so that each value stored in the table takes up less memory, at the cost of lower division accuracy.
  • the optional 'add' parameter ensures that the value of 'diff' is correctly rounded following the shift operation. This can make subsequent division operations less complex.
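A sketch of such a reciprocal table with accuracy k = 16 and a rounding term (the table size, the helper name, and the exact rounding are assumptions):

```python
K = 16  # division accuracy, the 'k' shift mentioned above

def build_division_table(size: int = 512) -> list:
    # table[d] approximates (1 << K) / d, with "+ d // 2" as the optional
    # rounding 'add' term, so that a / d is replaced by a multiplication
    # and a shift: a // d ~= (a * table[d]) >> K.
    table = [0]  # index 0 unused: never divide by zero
    for d in range(1, size):
        table.append(((1 << K) + d // 2) // d)
    return table
```

A division a / d is then computed as (a * table[d]) >> K, avoiding a hardware divider.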
  • the threshold for the minimal distance between max/min and mid point (8 in the above example) is set to be a significant portion of the range of values of 'diff'. As such, a different threshold may be appropriate depending on the range and/or the variance of the sample values in the range. In particular, if the value of 'shift' is increased (as discussed above), the threshold should be lowered.
  • a particularly advantageous embodiment uses values of 4 for minDiff and 9 for minDist. It should be noted that the abnormal case of avg falling outside the [min, max] range can therefore be handled in a number of ways. Another example is to condition the removal of any outlier on avg being within said range: if this is not the case, then none, or fewer, extrema are removed.
  • 'diff' can be below 0, or above the maximum difference handled by the table.
  • this clipping may take into account either the size of the table(s) involved or obey another constraint on range. For instance, the minimum and maximum values in the formulae above could also be 4 (e.g. if values below are undesirable) and 131 (e.g. if the table size is 128 and denominators below 4 are specifically excluded).
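The clipping just described could look as follows (the bounds are taken from the example above; the helper name is illustrative):

```python
def clip_table_index(diff: int, lo: int = 4, hi: int = 131) -> int:
    # Keep the denominator within the range covered by a 128-entry table
    # whose smallest handled denominator is 4.
    return min(max(diff, lo), hi)
```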
  • the computation of b can depend on the chroma prediction mode:
  • Figure 12 is very similar to Figure 8: steps 1211 to 1217 are new, while steps 1203, 1204 and 1205 are modified versions of 803 and 804. Finally, steps 1201, 1202 and 1206 are identical to 801, 802 and 806 respectively, and therefore are not described again.
  • step 1211 checks whether the block satisfies a criterion. As seen already, that may be the block coding mode, or its number of samples. In the latter case, minimal sizes of 16 or 32 samples offer a good tradeoff. Note that the rationale for this size also depends on the minimal size of the non-MMLM CCLM modes. Indeed, those are simpler, i.e. require fewer operations, so the smallest MMLM block size should incur an equivalent or smaller number of operations than the total for the minimally-sized CCLM blocks covering the equivalent area.
  • an MMLM mode (using any method described herein, including the 'knee point') should apply to blocks of at least 64 samples. A ratio of around 4 times more samples is often adequate, although ratios between 2 times and 8 times are quite likely. If the knee point need not be used, normal CCLM operations (i.e. a single linear equation based on the extreme points) resume on step 1205. It should be understood that some a priori conditions such as the block size can also entirely avoid the determination of the averages, which would not be used when determining a single linear equation.
  • on step 1205, the computation of the slope is performed, per any of the methods described above or available to the person skilled in the art. In particular, this is the straight-line method using the 2 extremum points.
  • This has the benefit that the true knee is used for computing the model parameters, and ensures better continuity between the two adjacent models. Due to rounding, there may still be a small discontinuity as, for the luma value of the knee point, the two models may yield different chroma sample values.
  • the threshold value to select either one of the two adjacent models, usually avgL (or the outlier-removed versions previously illustrated), can be adjusted, e.g. for a given threshold T, by checking which of T-1, T and T+1 minimizes said discontinuity.
  • The normal operation then ends on step 1206, as single model parameters have been found.
  • avgL = (6 * avgL - XA - XB + 2) >> 2
  • avgC = (6 * avgC - YA - YB + 2) >> 2
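The outlier-removed averages above can be transcribed directly as a small helper, with lo/hi standing for the extremum sample values (XA/XB for luma, YA/YB for chroma):

```python
def outlier_removed_average(avg: int, lo: int, hi: int) -> int:
    # avg' = (6*avg - lo - hi + 2) >> 2: nudges the average away from the
    # minimum and maximum values, which serve as the other model points.
    return (6 * avg - lo - hi + 2) >> 2
```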
  • The knee point can then be used on step 1213 to determine whether two models can be used on the current block. This may consist in checking whether T is far enough from A and B, e.g. that the difference between the luma avgL of T and that of any extremum (i.e. XA or XB) is above a threshold. Such a check can be extended to any number of knee points. In another embodiment, this simply checks whether the luma sample values of A and B are far enough apart. If the criterion is not satisfied, the knee point method may be considered useless, or even harmful, for prediction. In this case, normal CCLM operation resumes on step 1205.
  • knee point can be used to compute two models starting on step
  • this step performs the computation of the slopes defined by the three points, using any method of the prior art, e.g. the straight line one. Then, on step
  • an optional operation corrects these slopes. This may involve using one slope instead of the other if the values used in the computation of the latter may have issues. This can be, for instance, checking whether the luma difference, or chroma difference, between T and the extremum corresponding to the computed slope is below a threshold: quantization then had a non-negligible impact on said difference, rendering the computed slope very approximate, if not very wrong. Finally, now that the slopes are known, the intercepts can be computed. This may use any of the luma and chroma sample values of the three points.
  • the criterion to switch between one among the at least two models can be determined, usually as luma thresholds. For two models, this is simply avgL: if the luma value is below (or equal), a1 and b1 are used, otherwise a2 and b2 are. If more knee points and models are determined, then the luma values of said knee points can be the thresholds for selecting a given set of parameters.
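The threshold-based selection between the two models can be sketched as follows (the fixed-point slope representation with shift k is an assumption, consistent with the division-table accuracy discussed earlier):

```python
def piecewise_predict(luma: int, knee_luma: int,
                      a1: int, b1: int, a2: int, b2: int, k: int = 16) -> int:
    # Samples at or below the knee luma use (a1, b1), others (a2, b2);
    # slopes are fixed-point values with k fractional bits.
    a, b = (a1, b1) if luma <= knee_luma else (a2, b2)
    return ((a * luma) >> k) + b
```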
  • Figure 14 shows another embodiment offering a different tradeoff between complexity and coding efficiency from that of Figure 12.
  • while computing averages is not very complex, it is still desirable to limit any such additional computation in the critical path, which concerns smaller blocks. Accordingly, in Figure 14 the computation and use of the average values is explicitly made conditional on e.g. the number of samples in the block (and thus its size). Therefore, compared to Figure 12, complexity is reduced at the expense of a small loss in coding efficiency.
  • step 1403 where, in contrast to step 1203, only the extremum points are determined.
  • The averages avgC and avgL are now computed in step 1420.
  • This step is performed conditionally, i.e. only when the block satisfies the condition(s) (number of samples condition and/or size condition) checked in step 1411.
  • the preferred criterion for the block size in this embodiment is that the block has more than 16 samples.
  • An alternative criterion could be that the block has more than 32 samples.
  • Step 1425, which is newly added compared to Figure 12, computes the slope a and intercept b (solely) based on the extremum points, specifically ignoring the aforementioned averages that have purposely not been computed. At the end 1426 of this path, the parameters for a single linear model have therefore been determined.
  • from step 1413, the processing advances to step 1405 to compute the slope and intercept of a single straight line, similarly to step 1205, by using the averages as determined on step 1412. This path therefore ends on step 1406, where the parameters for a single linear model have been determined.
  • the advantage of these embodiments is a coding efficiency improvement.
  • the present invention is used when predicting a first chroma component sample value from a second chroma component.
  • the present invention is used when predicting a sample value of one component from more than one sample values of more than one component. It is understood that in such a case, the linear model is derived based on two points/sets, each point/set comprising a sample value of the one component, and the more than one sample values of the more than one component.
  • each point/set can be represented as a position in a 3-dimensional space, and the linear model is based on a straight line passing through the two positions in the 3-dimensional space that correspond to the two points/sets of the reconstructed sample values.
  • FIG. 13 is a schematic block diagram of a computing device 1300 for implementation of one or more embodiments of the invention.
  • the computing device 1300 may be a device such as a micro-computer, a workstation or a light portable device.
  • the computing device 1300 comprises a communication bus connected to:
  • a central processing unit 1301, such as a microprocessor, denoted CPU;
  • a random access memory 1302, denoted RAM, for storing the executable code of the method of embodiments of the invention, as well as the registers adapted to record variables and parameters necessary for implementing the method for encoding or decoding at least part of an image according to embodiments of the invention; the memory capacity thereof can be expanded by an optional RAM connected to an expansion port, for example;
  • a read only memory 1303, denoted ROM, for storing computer programs for implementing embodiments of the invention;
  • a network interface 1304 is typically connected to a communication network over which digital data to be processed are transmitted or received.
  • the network interface 1304 can be a single network interface, or composed of a set of different network interfaces (for instance wired and wireless interfaces, or different kinds of wired or wireless interfaces). Data packets are written to the network interface for transmission or are read from the network interface for reception under the control of the software application running in the CPU 1301 ;
  • a user interface 1305 may be used for receiving inputs from a user or to display information to a user;
  • a hard disk 1306, denoted HD;
  • an I/O module 1307, which may be used for receiving/sending data from/to external devices such as a video source or display.
  • the executable code may be stored either in read only memory 1303, on the hard disk 1306 or on a removable digital medium such as for example a disk.
  • the executable code of the programs can be received by means of a communication network, via the network interface 1304, in order to be stored in one of the storage means of the communication device 1300, such as the hard disk 1306, before being executed.
  • the central processing unit 1301 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to embodiments of the invention, which instructions are stored in one of the aforementioned storage means. After powering on, the CPU 1301 is capable of executing instructions from main RAM memory 1302 relating to a software application after those instructions have been loaded from the program ROM 1303 or the hard-disc (HD) 1306 for example. Such a software application, when executed by the CPU 1301 , causes the steps of the method according to the invention to be performed.
  • Any step of the methods according to the invention may be implemented in software by execution of a set of instructions or program by a programmable computing machine, such as a PC (“Personal Computer”), a DSP (“Digital Signal Processor”) or a microcontroller; or else implemented in hardware by a machine or a dedicated component, such as an FPGA (“Field-Programmable Gate Array”) especially for the Minima and Maxima selection, or an ASIC (“Application-Specific Integrated Circuit”).
  • the present invention can also be used in any other prediction/estimation process where a relationship between two or more components’ sample values can be estimated/predicted with a model, wherein the model is an approximate model determined based on at least two sets of related/associated component sample values selected from all available sets of the related/associated component sample values. It is understood that each point corresponding to a sample pair (i.e. a set of associated sample values for different components) may be stored and/or processed in terms of an array.
  • each component’s sample values may be stored in an array so that each sample value of that component is referable/accessible/obtainable by referencing an element of that array, using an index for that sample value for example.
  • an array may be used to store and process the sample pairs such that each sample value of the sample pairs is accessible/obtainable as an element of the array.
  • any result of comparison, determination, assessment, selection, or consideration described above may be indicated in or determinable from data in a bitstream, for example a flag or data indicative of the result, so that the indicated or determined result can be used in the processing instead of actually performing the comparison, determination, assessment, selection, or consideration, for example during a decoding process.


Abstract

The invention relates to a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining three points, each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first component and the second component; determining the parameters of two linear equations, a first equation representing a straight line passing through said first and second adjacent points, and the second linear equation passing through said second and third adjacent points, said second point representing an average of a selection of said sample values; and deriving a continuous linear model defined by the parameters of the said straight lines.

Description

PIECEWISE MODELING
FOR LINEAR COMPONENT SAMPLE PREDICTION
DOMAIN OF THE INVENTION
The present invention regards the encoding or decoding of blocks of a given video component, in particular the intra prediction of such component blocks or obtaining the samples of such blocks. The invention finds applications in obtaining blocks of a component, typically blocks of a chroma component, of video data from samples of another component, typically luma samples.
BACKGROUND OF THE INVENTION
Predictive encoding of video data is based on the division of frames into blocks of pixels. For each block of pixels, a predictor block is searched for in available data. The predictor block may be a block in a reference frame different from the current one in INTER coding modes, or generated from neighbouring pixels in the current frame in INTRA coding modes. Different encoding modes are defined according to different ways of determining the predictor block. The result of the encoding is a signalling of the predictor block and a residual block consisting in the difference between the block to be encoded and the predictor block.
Regarding INTRA coding modes, various modes are usually proposed, such as a Direct Current (DC) mode, a planar mode and angular modes. Each of them seeks to predict samples of a block using previously decoded boundary samples from spatially neighbouring blocks.
The encoding may be performed for each component forming the pixels of the video data. Although RGB (for Red-Green-Blue) representation is well-known, the YUV representation is preferably used for the encoding to reduce the inter-channel redundancy. According to these encoding modes, a block of pixels may be considered as composed of several, typically three, component blocks. An RGB pixel block is composed of an R component block containing the values of the R component of the pixels of the block, a G component block containing the values of the G component of these pixels, a B component block containing the values of the B component of these pixels. Similarly, a YUV pixel block is composed of a Y component block (luma), a U component block (chroma) and a V component block (also chroma). Another example is YCbCr, where Cb and Cr are also known as chroma components. However, inter-component (also known as cross-component) correlation is still observed locally.
To improve compression efficiency, the usage of Cross-Component Prediction (CCP) has been studied in the state of this art. The main application of CCP concerns luma-to-chroma prediction. It means that the luma samples have already been encoded and reconstructed from encoded data (as the decoder does) and that chroma is predicted from luma. However, variants use CCP for chroma-to-chroma prediction or more generally for first-component to second-component prediction (including RGB).
The Cross-Component Prediction may apply directly to a block of chroma pixels or may apply to a residual chroma block (meaning the difference between a chroma block and a chroma block predictor).
The Linear Model (LM) mode uses a linear model to predict chroma from luma as a chroma intra prediction mode, relying on one or two parameters, slope (a) and offset (b), to be determined. The chroma intra predictor is thus derived from reconstructed luma samples of a current luma block using the linear model with the parameters.
The linearity, i.e. parameters a and b, is derived from the reconstructed causal samples, in particular from a neighbouring chroma sample set comprising reconstructed chroma samples neighbouring the current chroma block to predict and from a neighbouring luma sample set comprising luma samples neighbouring the current luma block.
Specifically, for an NxN chroma block, the N neighbours of the above row and the N neighbours of the left column are used to form the neighbouring chroma sample set for derivation. The neighbouring luma sample set is also made of N neighbouring samples just above the corresponding luma block and N neighbouring samples on the left side of the luma block.
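For illustration, the single-model prediction itself amounts to the following (an integer slope is used here for simplicity; actual codecs use a fixed-point slope together with a shift):

```python
def lm_predict(rec_luma_rows, a, b):
    # Each chroma sample is predicted from the collocated reconstructed
    # (down-sampled) luma sample L as a * L + b.
    return [[a * l + b for l in row] for row in rec_luma_rows]
```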
It is known to reduce the size of the video data to encode without significant degradation of visual rendering, by sub-sampling the chroma components. Known subsampling modes are labelled 4:1:1, 4:2:2 and 4:2:0.
In the situation where the video chroma data are subsampled, the luma block corresponding to the NxN chroma block is bigger than NxN. In that case, the neighbouring luma sample set is down-sampled to match the chroma resolution. The chroma intra predictor to predict the chroma samples in the current NxN chroma block has to be generated using the linear model with the one or more parameters a and b derived and the reconstructed luma samples of the current luma block that are previously down-sampled to match chroma resolution. The down-sampling of the reconstructed luma samples to chroma resolution makes it possible to retrieve the same number of samples as the chroma samples to form both the luma sample set and the chroma intra predictor. Furthermore, when the number of samples on the borders is not a power of 2, and operations such as average computation require divisions, further decimation of these border samples can allow use of a number of border samples that is a power of 2, for which divisions are less costly to implement.
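A possible 6-tap down-sampling of reconstructed luma to 4:2:0 chroma resolution is sketched below for interior positions (the 1-2-1 weights over two luma rows are one common choice, assumed here rather than taken from this document):

```python
def downsample_luma_420(luma, x, y):
    # Weighted average of six luma neighbours with rounding;
    # valid only for interior positions (2*x - 1 >= 0 is assumed).
    r0, r1 = luma[2 * y], luma[2 * y + 1]
    i = 2 * x
    return (r0[i - 1] + 2 * r0[i] + r0[i + 1] +
            r1[i - 1] + 2 * r1[i] + r1[i + 1] + 4) >> 3
```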
The chroma intra predictor is thus subtracted from the current chroma block to obtain a residual chroma block that is encoded at the encoder. Conversely, at the decoder, the chroma intra predictor is added to the received residual chroma block in order to retrieve the chroma block, also known as reconstruction of the decoded block. This may also involve clipping for results of the addition going out of the sample range.
Sometimes, the residual chroma block is negligible and thus not considered during encoding. In that case, the above-mentioned chroma intra predictor is used as the chroma block itself. As a consequence, the above LM mode makes it possible to obtain a sample for a current block of a given component from an associated (i.e. collocated or corresponding) reconstructed sample of a block of another component in the same frame using a linear model with one or more parameters. The sample is obtained using the linear model with the one or more parameters derived and the associated reconstructed samples in the block of the other component. If needed, the block of the other component is made of samples down-sampled to match the block resolution of the current component. While the block of the current component is typically a chroma block and the block of the other component a luma block, this may not be the case. For the sake of clarity and simplicity, the examples given here focus on the prediction of a chroma block from a luma block, it should be clear that the described mechanism may apply to any component prediction from another component.
The Joint Exploration Model (JEM) of the Joint Video Exploration Team (JVET) adds six Cross-Component (luma-to-chroma) linear model modes to the conventional intra prediction modes already known. All these modes compete against each other to predict or generate the chroma blocks, the selection being usually made based on a rate-distortion criterion at the encoder end.
In VTM (the JEM successor, currently developed by the JVET group for testing and defining the future VVC codec), there are currently three such modes, differing only in the sets of samples they use for determining the parameters. In all cases, said parameters are found by determining extremum points based on luma, as is later described.
For instance, the sample sets may be made of the two lines (i.e. rows and columns) of samples neighbouring the current luma or chroma block, these lines being parallel and immediately adjacent to each one of the top and/or left boundaries of the current luma or chroma block at chroma resolution. Such exemplary sample set is described in publication US 9,736,487.
Other exemplary sample sets are also disclosed in publications US 9,288,500 and US 9,462,273.
The down-sampling schemes used in JEM include a 6-tap filter determining a down-sampled reconstructed luma sample from six reconstructed luma samples, but also three 2-tap filters that select either the top right and bottom right samples from among the six reconstructed luma samples, or the bottom and bottom right samples, or the top and top right samples, and a 4-tap filter that selects the top, top right, bottom and bottom right samples of the six reconstructed luma samples.
SUMMARY OF THE INVENTION
The derivation of the linear model parameters for the computation of the chroma predictor block samples has shortcomings in that the model is at best approximate, and valid only over a limited sample range. In particular, the accuracy of LM in the middle of the range is often relatively poor. Indeed, these parameters are updated for every block based on its borders, which may result in an unstable model. As a consequence, it is desirable to improve the modelling, but without significantly deviating from the simplicity of the linear model and its parameter derivation.
The present invention has been devised to address one or more of the foregoing concerns. It concerns an improved method for obtaining a chroma sample for a current chroma block, possibly through chroma intra prediction.
According to one aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second- component block in the same frame, the method comprising: determining M points, where M>3; each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component; determining the parameters of a plurality of linear equations, each equation representing a straight line passing through two adjacent points of said M points, and deriving a continuous linear model defined by the parameters of the said straight lines.
According to another aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first- component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining three points; each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component; determining the parameters of two linear equations, a first equation representing a straight line passing through said first and second adjacent points, and the second linear equation passing through said second and third adjacent points, said second point representing an average (mid-point, weighted average, scaled sum) of a selection of said sample values; and deriving a continuous linear model defined by the parameters of the said straight lines.
Optionally, the points are determined based on sample pairs in the neighbourhood of the second-component block. Optionally, the points are determined based on the sample values of the sample pairs in the neighbourhood of the second-component block.
Optionally, the first and Mth points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second- component sample value.
Optionally, said first and Mth points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second- component sample value from a selection of sample pairs.
Optionally, the selection of the sample pairs comprises a selection of sample pairs from one of: a top border, or a left border.
Optionally, the method comprises iteratively determining said first and Mth points when accessing said selection of sample pairs.
Optionally, the or each point from 2 to M-1 is derived from a mid-point of second- component values.
More than 1 knee point
Optionally, M>3 and each point from 2 to M-1 is distributed corresponding to a fractional mid-point. Optionally, each point from 2 to M-1 corresponds to a point having a second-component value nearest a value corresponding to said distribution of points between the first and Mth point.
Optionally, each point from 2 to M-1 corresponds to a point having a second-component value which differs from a value corresponding to an even distribution of points between the first and Mth point by less than a threshold amount.
Optionally, each point corresponding to said distribution is determined iteratively.
1 knee point
Optionally, M=3, and the second point corresponds to the mid-point. For example, the first and third points correspond to points having the lowest and highest second-component values respectively, and the second point is a mid-point.
Optionally, the second point corresponds to a point having a second-component value nearest a mid-point of the second-component sample values.
Optionally, the second point corresponds to a point of which the second-component value differs from a mid-point of the second-component sample values by less than a threshold amount.
Optionally, the point or points corresponding to the mid-point is determined iteratively.
Defining a target
Optionally, the mid-point is an average of a selection of second-component sample values.
Average calculation
Optionally, said second point representing an average of the sample values is calculated using a bitwise shift operation.
Approximation of the average
Optionally, said bitwise shift is by an amount M, where 2^M is the largest power of two smaller than or equal to the number of available sample pairs.
Optionally, said bitwise shift is by an amount nShift where nShift depends on the availability of neighbouring sample pairs.
Optionally, nShift is such that 2^nShift is the largest power of two smaller than or equal to the number of available sample pairs, modified if either or both of the neighbouring samples to the left or to the top are unavailable.
Optionally, nShift = Log2( nS ) + ( availL && availT ? 1 : 0 )
Optionally, the average is calculated by
avg = ( avg + ( 1 << ( nShift - 1 ) ) ) >> nShift
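The two clauses above can be sketched in C as follows. This is an illustrative sketch only, not the claimed implementation; the sample count is assumed to be a power of two greater than or equal to 2:

```c
/* Rounded average of a power-of-two number of neighbouring sample values
 * using only additions and bitwise shifts. "count" is assumed to be a
 * power of two, at least 2. */
int shift_average(const int *samples, int count)
{
    int nShift = 0;                      /* nShift = Log2(count) */
    while ((1 << (nShift + 1)) <= count)
        nShift++;

    int avg = 0;
    for (int i = 0; i < count; i++)
        avg += samples[i];

    /* avg = ( avg + ( 1 << ( nShift - 1 ) ) ) >> nShift */
    return (avg + (1 << (nShift - 1))) >> nShift;
}
```

The added term before the shift rounds to nearest rather than truncating.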
Sampling to obtain selection of samples used to calculate the average
Optionally, the total number of samples comprises sample pairs not from a border of said block, so as to create a power-of-two number of sample pairs. Optionally, the method further comprises extending a border and selecting sample pairs from said extended border.
Optionally, extending the border comprises extending the border into a previously coded part of said frame.
Optionally, said selected sample pairs comprise the last 2^N sample pairs from said extended border, where N is used as the bitwise shift when calculating said second point representing said average.
Scaling the average
Optionally, the method further comprises applying a scaling factor when calculating said second point representing said average.
Optionally, said scaling factor is a fraction approximately equal to 2/3; preferably said scaling factor is 21/32.
Optionally, the method further comprises applying a rounding parameter when calculating said second point representing said average.
Optionally, the values of the samples having the lowest and highest second-component sample values are ignored from the selection of the second-component sample values when calculating said second point representing said average.
Optionally, said highest and lowest values are ignored when updating a calculated average.
Optionally, the calculated average is updated by:
avg += ( 2*avg - min - max ) >> N
where N is a predetermined scaling parameter.
Optionally, the calculated average is updated by:
avg += ( 2*avg - min - max + ( 1 << ( N - 1 ) ) ) >> N
where N is a predetermined scaling parameter; preferably N=2.
Optionally, N is dependent on the block size.
Optionally, N is dependent on the number of samples used to compute the calculated average.
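A minimal sketch of the update above, assuming N is small and that the compiler performs an arithmetic right shift on negative intermediates (true on common compilers):

```c
/* Nudge a computed average away from the extreme (min, max) sample values:
 * avg += ( 2*avg - min - max + ( 1 << ( N - 1 ) ) ) >> N */
int update_average(int avg, int min, int max, int N)
{
    return avg + ((2 * avg - min - max + (1 << (N - 1))) >> N);
}
```

When the average already sits at the midpoint of min and max, the correction term rounds to zero and the average is left unchanged.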
Optionally, the selection of sample pairs is from the sample pairs forming a top border of the block, or a left border of the block. Optionally, the selection of sample pairs also includes a sample from a top-left neighbouring block.
Optionally, the total number of samples in said selection is a power of two.
Optionally, the total number of samples in said selection is greater than the minimum possible number of such samples.
Adaptive MMLM
Optionally, the method further comprises the step of determining a range of said second-component values, and determining the parameters of said straight lines in dependence on said range.
Optionally, said dependence is whether said range is greater than a threshold.
Optionally, if said range between adjacent points is below a threshold, a parameter derived from a straight line between two different points is used.
Optionally, said parameter comprises the slope.
Optionally, said parameter comprises the ordinate intercept.
Conditional MMLM/LM in the middle
According to another aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining M points, each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first component and the second component, the first point corresponding to the point with the lowest second-component value, and the Mth point corresponding to the point with the highest second-component value; determining the parameters of at least one linear equation, each equation representing a straight line passing through two adjacent points of said M points, and deriving a linear model defined by the parameters of the or each straight line; wherein M=2 if the number of samples is less than a threshold, and wherein M>3 (or M>2) if the number of samples is greater than said threshold.
Optionally, the threshold is 16 samples.
Optionally, the threshold is 32 samples. Optionally, the method further comprises, if the number of samples is greater than said threshold, performing any of the methods as described herein.
According to another aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining a difference between the second-component values corresponding to the two points having the largest and smallest second-component values; if said difference is lower than a threshold: determining the parameters of a linear equation, representing a straight line passing through said points, and deriving a linear model defined by the parameters of said straight line; if said difference is higher than the threshold: determining at least one further point between said two points having the largest and smallest second-component values; and determining the parameters of two linear equations, the first equation representing a straight line passing through the point having the smallest second-component value and an adjacent point; the second equation representing a straight line passing through the point having the largest second-component value and an adjacent point; and deriving a linear model defined by the parameters of said straight lines.
Optionally, the threshold depends on the bitdepth of the samples.
Optionally, for a bitdepth of 10, the threshold is between 48 and 96.
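For illustration, deriving the slope and ordinate intercept of the straight line through the two extreme points can be sketched in integer arithmetic as follows. The fixed-point precision kShift, the structure, and the function names are assumptions for the sketch, not taken from the text:

```c
/* Linear model through the points with the smallest and largest
 * second-component (e.g. luma) values: chroma = (a * luma >> shift) + b. */
typedef struct { int a; int b; int shift; } LinModel;

LinModel two_point_model(int lumaMin, int chromaMin, int lumaMax, int chromaMax)
{
    const int kShift = 16;               /* assumed fixed-point precision */
    LinModel m = { 0, chromaMin, kShift };
    int diff = lumaMax - lumaMin;
    if (diff != 0)                       /* flat model when the range is zero */
        m.a = (int)((((long long)(chromaMax - chromaMin)) << kShift) / diff);
    /* ordinate intercept so the line passes through the minimum point */
    m.b = chromaMin - (int)(((long long)m.a * lumaMin) >> kShift);
    return m;
}
```

The 64-bit intermediates guard against overflow for 10-bit sample values.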
According to another aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining three points, said three points comprising two points having the largest and smallest second-component values, and a third point between said two points; determining a difference between the point having the smallest second-component value and the third point; determining a difference between the point having the largest second-component value and the third point; if one or both of said differences are lower than a threshold: determining the parameters of a linear equation, representing a straight line passing through said two points having the largest and smallest second-component values, and deriving a linear model defined by the parameters of said straight line; if said differences are higher than the threshold: determining the parameters of two linear equations, the first equation representing a straight line passing through the point having the smallest second-component value and the third point; and the second equation representing a straight line passing through the point having the largest second-component value and the third point; and deriving a linear model defined by the parameters of said straight lines.
Optionally, the method further comprises, if said difference is higher than the threshold, performing any of the methods as described herein.
Devices
According to another aspect of the present invention there is provided a device for encoding images, wherein the device comprises a means for deriving a continuous linear model.
According to another aspect of the present invention there is provided a device for decoding images, wherein the device comprises a means for deriving a continuous linear model.
According to another aspect of the present invention there is provided a computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method as described herein, when loaded into and executed by the programmable apparatus.
According to another aspect of the present invention there is provided a computer-readable medium storing a program which, when executed by a microprocessor or computer system in a device, causes the device to perform a method as described herein.
According to another aspect of the present invention there is provided a computer program which upon execution causes the method as described herein to be performed.
At least parts of the methods according to the invention may be computer implemented. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “processor and a memory”, "circuit", "module" or "system". Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Since the present invention can be implemented in software, the present invention can be embodied as computer readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible carrier medium may comprise a storage medium such as a hard disk drive, a magnetic tape device or a solid state memory device and the like. A transient carrier medium may include a signal such as an electrical signal, an electronic signal, an optical signal, an acoustic signal, a magnetic signal or an electromagnetic signal, e.g. a microwave or RF signal.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will now be described, by way of example only, and with reference to the following drawings in which:
Figure 1 illustrates a video encoder logical architecture;
Figure 2 illustrates a video decoder logical architecture corresponding to the video encoder logical architecture illustrated in Figure 1 ;
Figure 3 schematically illustrates examples of a YUV sampling scheme for 4:2:0 sampling;
Figure 4 illustrates, using a flowchart, general steps for generating a block predictor using the LM mode, performed either by an encoder or a decoder;
Figures 5A-5B schematically illustrate a chroma block and an associated or collocated luma block, with down-sampling of the luma samples, and neighbouring chroma and luma samples, as known in prior art;
Figures 6A-6D illustrate exemplary coding of signalling flags to signal LM modes;
Figure 7 illustrates points of luma and chroma neighboring samples and a straight line representing the linear model parameters;
Figure 8 illustrates the main steps of a process of a simplified LM derivation;
Figure 9 illustrates a combination of the MMLM mode plus parameters derivation through a straight line, including a discontinuity;
Figure 10 illustrates the concepts of a knee point and a target point to create piecewise (continuous) linear models;
Figure 11 presents example sets of sample pairs that help define the knee point or the target point;
Figure 12 illustrates the main steps of a process of a derivation in one embodiment of the invention;
Figure 13 is a schematic block diagram of a computing device for implementation of one or more embodiments of the invention;
Figure 14 illustrates the main steps of a derivation process in another embodiment of the invention;
Figure 15 illustrates a sample selection to ensure that the number of samples is a power of 2; and
Figure 16 illustrates various solutions to the number of samples not being a power of 2.
DETAILED DESCRIPTION OF EMBODIMENTS
Figure 1 illustrates a video encoder architecture. In the video encoder, an original sequence 101 is divided into blocks of pixels 102, called coding blocks or coding units in HEVC. A coding mode is then assigned to each block. There are two families of coding modes typically used in video coding: the coding modes based on spatial prediction or “INTRA modes” 103, and the coding modes based on temporal prediction or “INTER modes” based on motion estimation 104 and motion compensation 105.
An INTRA coding block is generally predicted from the encoded pixels at its causal boundary by a process called INTRA prediction. The predictor for each pixel of the INTRA coding block thus forms a predictor block. Depending on which pixels are used to predict the INTRA coding block, various INTRA modes are proposed: for example, DC mode, a planar mode and angular modes. While Figure 1 is directed to a general description of a video encoder architecture, it is to be noted that a pixel corresponds here to an element of an image, that typically consists of several components, for example a red component, a green component, and a blue component. An image sample is an element of an image, which comprises only one component.
Temporal prediction first consists in finding in a previous or future frame, called the reference frame 116, a reference area which is the closest to the coding block in a motion estimation step 104. This reference area constitutes the predictor block. Next this coding block is predicted using the predictor block to compute the residue or residual block in a motion compensation step 105.
In both cases, spatial and temporal prediction, a residue or residual block is computed by subtracting the obtained predictor block from the coding block.
In the INTRA prediction, a prediction mode is encoded.
In the temporal prediction, an index indicating the reference frame used and a motion vector indicating the reference area in the reference frame are encoded. However, in order to further reduce the bitrate cost related to motion vector encoding, a motion vector is not directly encoded. Indeed, assuming that motion is homogeneous, it is particularly advantageous to encode a motion vector as a difference between this motion vector and a motion vector (or motion vector predictor) in its surroundings. In the H.264/AVC coding standard for instance, motion vectors are encoded with respect to a median vector computed from the motion vectors associated with three blocks located above and on the left of the current block. Only a difference, also called a residual motion vector, computed between the median vector and the current block motion vector is encoded in the bitstream. This is processed in module “Mv prediction and coding” 117. The value of each encoded vector is stored in the motion vector field 118. The neighbouring motion vectors, used for the prediction, are extracted from the motion vector field 118.
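The median-based prediction described above can be sketched per vector component as follows (an illustrative sketch of the H.264-style median of three neighbouring values, not code from the standard):

```c
/* Component-wise median of three neighbouring motion-vector components. */
int median3(int a, int b, int c)
{
    int lo = a < b ? a : b;
    int hi = a < b ? b : a;
    if (c <= lo) return lo;   /* c is the smallest: median is the lower of a, b */
    if (c >= hi) return hi;   /* c is the largest: median is the higher of a, b */
    return c;                 /* c lies between a and b */
}
```

The predictor is obtained by applying this separately to the horizontal and vertical components of the three neighbouring motion vectors.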
The HEVC standard uses three different INTER modes: the Inter mode, the Merge mode and the Merge Skip mode, which mainly differ from each other by the signalling of the motion information (i.e. the motion vector and the associated reference frame through its so-called reference frame index) in the bit-stream 110. For the sake of simplicity, motion vector and motion information are conflated below. Regarding motion vector prediction, HEVC provides several candidates of motion vector predictor that are evaluated during a rate-distortion competition in order to find the best motion vector predictor or the best motion information for respectively the Inter or the Merge mode. An index corresponding to the best predictors or the best candidate of the motion information is inserted in the bitstream 110. Thanks to this signalling, the decoder can derive the same set of predictors or candidates and uses the best one according to the decoded index.
The design of the derivation of motion vector predictors and candidates contributes to achieving the best coding efficiency without large impact on complexity. Two motion vector derivations are proposed in HEVC: one for Inter mode (known as Advanced Motion Vector Prediction (AMVP)) and one for the Merge modes (known as Merge derivation process).
Next, the coding mode optimizing a rate-distortion criterion for the coding block currently considered is selected in module 106. In order to further reduce the redundancies within the obtained residue data, a transform, typically a DCT, is applied to the residual block in module 107, and a quantization is applied to the obtained coefficients in module 108. The quantized block of coefficients is then entropy coded in module 109 and the result is inserted into the bit-stream 110.
The encoder then performs decoding of each of the encoded blocks of the frame for the future motion estimation in modules 111 to 116. These steps allow the encoder and the decoder to have the same reference frames 116. To reconstruct the coded frame, each of the quantized and transformed residual blocks is inverse quantized in module 111 and inverse transformed in module 112 in order to provide the corresponding “reconstructed” residual block in the pixel domain. Due to the loss introduced by the quantization, this “reconstructed” residual block differs from the original residual block obtained at step 106.
Next, according to the coding mode selected at 106 (INTER or INTRA), this “reconstructed” residual block is added to the INTER predictor block 114 or to the INTRA predictor block 113, to obtain a “pre-reconstructed” block (coding block).
Next, the “pre-reconstructed” blocks are filtered in module 115 by one or several kinds of post filtering to obtain “reconstructed” blocks (coding blocks). The same post filters are integrated at the encoder (in the decoding loop) and at the decoder to be used in the same way in order to obtain exactly the same reference frames at encoder and decoder ends. The aim of this post filtering is to remove compression artefacts.
Figure 2 illustrates a video decoder architecture corresponding to the video encoder architecture illustrated in Figure 1.
The video stream 201 is first entropy decoded in a module 202. Each obtained residual block (coding block) is then inverse quantized in a module 203 and inverse transformed in a module 204 to obtain a “reconstructed” residual block. This is similar to the beginning of the decoding loop at the encoder end.
Next, according to the decoding mode indicated in the bitstream 201 (either INTRA type decoding or an INTER type decoding), a predictor block is built.
In case of INTRA mode, an INTRA predictor block is determined 205 based on the INTRA prediction mode specified in the bit-stream 201.
In case of INTER mode, the motion information is extracted from the bitstream during the entropy decoding 202. The motion information is composed, for example in HEVC and JVET, of a reference frame index and a motion vector residual.
A motion vector predictor is obtained in the same way as done by the encoder (from neighbouring blocks) using already computed motion vectors stored in motion vector field data 211. It is thus added 210 to the extracted motion vector residual to obtain the motion vector. This motion vector is added to the motion vector field data 211 in order to be used for the prediction of the next decoded motion vectors.
The motion vector is also used to locate the reference area in the reference frame 206 which is the INTER predictor block.
Next, the “reconstructed” residual block obtained at 204 is added to the INTER predictor block 206 or to the INTRA predictor block 205, to obtain a “pre-reconstructed” block (coding block) in the same way as in the decoding loop of the encoder. Next, this “pre-reconstructed” block is post filtered in module 207 as done at the encoder end (signalling of the post filtering to use may be retrieved from bitstream 201).
A “reconstructed” block (coding block) is thus obtained, which forms the decompressed video 209 as the output of the decoder.
The above-described encoding/decoding process may be applied to monochrome frames. However, most common frames are colour frames generally made of three arrays of colour samples, each array corresponding to a “colour component”, for instance R (red), G (green) and B (blue). A pixel of the image comprises three collocated/corresponding samples, one for each component.
R, G and B components usually have a high correlation between them. It is thus very common in image and video compression to decorrelate the colour components prior to processing the frames, by converting them into another colour space. The most common format is YUV (YCbCr), where Y is the luma (or luminance) component, and U (Cb) and V (Cr) are the chroma (or chrominance) components.
To reduce the amount of data to process, some colour components of the colour frames may be subsampled, resulting in different sampling ratios for the three colour components. A subsampling scheme is commonly expressed as a three-part ratio J:a:b that describes the number of luma and chroma samples in a conceptual 2-pixel-high region. ‘J’ defines the horizontal sampling reference of the conceptual region (i.e. a width in pixels), usually 4. ‘a’ defines the number of chroma samples (Cr, Cb) in the first row of J pixels, while ‘b’ defines the number of (additional) chroma samples (Cr, Cb) in the second row of J pixels.
With the subsampling schemes, the number of chroma samples is reduced compared to the number of luma samples.
The 4:4:4 YUV or RGB format does not provide subsampling and corresponds to a non-subsampled frame where the luma and chroma frames have the same size W x H.
The 4:0:0 YUV or RGB format has only one colour component and thus corresponds to a monochrome frame.
Exemplary sampling formats are the following. The 4:2:0 YUV format has half as many chroma samples as luma samples in the first row, and no chroma samples in the second row. The two chroma frames are thus W/2 pixels wide and H/2 pixels high, where the luma frame is W x H.
The 4:2:2 YUV format has half as many chroma samples as luma samples in both the first row and the second row. The two chroma frames are thus W/2 pixels wide and H pixels high, where the luma frame is W x H.
The 4:1:1 YUV format has 75% fewer chroma samples than luma samples in both the first row and the second row. The two chroma frames are thus W/4 pixels wide and H pixels high, where the luma frame is W x H.
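The chroma plane sizes implied by these J:a:b schemes (with J = 4) can be sketched as follows; the function name is an assumption for illustration:

```c
/* Chroma plane dimensions for a W x H luma plane under a 4:a:b scheme. */
void chroma_dims(int W, int H, int a, int b, int *cw, int *ch)
{
    *cw = (W * a) / 4;            /* 'a' chroma samples per 4 luma columns */
    *ch = (b != 0) ? H : H / 2;   /* no second-row chroma samples halves the height */
}
```

This reproduces the three cases above: 4:2:0 gives W/2 x H/2, 4:2:2 gives W/2 x H, and 4:1:1 gives W/4 x H.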
When subsampled, the positions of the chroma samples in the frames are shifted compared to the luma sample positions.
Figure 3 illustrates an exemplary positioning of chroma samples (triangles) with respect to luma samples (circles) for a 4:2:0 YUV frame.
The encoding process of Figure 1 may be applied to each colour-component frame of an input frame.
Due to correlations between the colour components (between RGB or remaining correlations between YUV despite the RGB-to-YUV conversion), Cross-Component Prediction (CCP) methods have been developed to exploit these (remaining) correlations in order to improve coding efficiency.
CCP methods can be applied at different stages of the encoding or the decoding process, in particular either at a first prediction stage (to predict a current colour component) or at a second prediction stage (to predict a current residual block of a component).
One known CCP method is the LM mode, also referred to as CCLM (Cross-Component Linear Model prediction). It is used to predict both chroma components Cb and Cr (or U and V) from the luma Y, more specifically from the reconstructed luma (at the encoder end or at the decoder end). One predictor is generated for each component. The method operates at a (chroma and luma) block level, for instance at CTU (coding tree unit) level, CU (coding unit) level, PU (prediction unit) level, sub-PU level or TU (transform unit) level. Figure 4 illustrates, as an example, using a flowchart, general steps for generating a block predictor using the LM mode, performed either by the encoder (used as reference below) or the decoder.
In the description below, an exemplary first component is chroma while an exemplary second component is luma.
Considering a current chroma block 502 (Figure 5A) to encode or decode and its associated or corresponding (i.e. “collocated”) luma block 505 (i.e. of the same CU for instance) in the same frame, the encoder (or the decoder) receives, in step 401, a neighbouring luma sample set RecL comprising luma samples 503 neighbouring the current luma block, and receives a neighbouring chroma sample set RecC comprising chroma samples 501 neighbouring the current chroma block, denoted 402. It is to be noted that for some chroma sampling formats and chroma phases, the luma samples 504 and 503 are not directly adjacent to luma block 505, as depicted in Figure 5A. For example in Figure 5A, to obtain the left row RecL’ (503), only the second left row is needed and not the direct left row. In the same way, for the up line 504, the second up line is also considered for the down-sampling of luma samples, as depicted in Figure 5A.
When a chroma sampling format is used (e.g. 4:2:0, 4:2:2, etc.), the neighbouring luma sample set is down-sampled at step 403 into RecL’ 404 to match chroma resolution (i.e. the sample resolution of the corresponding chroma frame/block). RecL’ thus comprises reconstructed luma samples 504 neighbouring the current luma block that are down-sampled. Thanks to the down-sampling, RecL’ and RecC comprise the same number 2N of samples (chroma block 502 being N x N). Yet, particular down-samplings of the luma border exist in the prior art where fewer samples are needed to obtain RecL’. In addition, even if RecL and RecC have the same resolution, RecL’ can be seen as a denoised version of RecL, through the use of a low-pass convolution filter.
In the example of Figure 5A, the neighbouring luma and chroma sample sets are made of the down-sampled top and left neighbouring luma samples and of the top and left neighbouring chroma samples, respectively. More precisely each of the two sample sets is made of the first line immediately adjacent to the left boundary and the first line immediately adjacent to the top boundary of their respective luma or chroma block. Due to down-sampling (4:2:0 in Figure 5A), the single line of neighbouring luma samples RecL’ is obtained from two lines of non down-sampled reconstructed luma samples RecL (left or up).
US 9,565,428 suggests using sub-sampling which selects a single sample, only for the up line (i.e. adjacent to the top boundary of the luma block) and not for the luma block itself (as described below with reference to step 408). The proposed sub-sampling is illustrated in Figure 6A. The motivation for this approach is to reduce the line buffer of the up line.
The linear model which is defined by one or two parameters (a slope a and an offset b) is derived from RecL’ (if any, otherwise RecL) and RecC. This is step 405 to obtain the parameters 406.
The LM parameters a and b are obtained using a least mean square-based method using the following equations:
a = A1 / A2 with A1 = M.Σ( RecC(i).RecL’(i) ) − Σ RecC(i) . Σ RecL’(i) and A2 = M.Σ( RecL’(i)² ) − ( Σ RecL’(i) )²
b = ( Σ RecC(i) − a.Σ RecL’(i) ) / M
where the sums are taken over the M sample pairs ( RecL’(i), RecC(i) ).
where M is a value which depends on the size of the block considered. In general cases of square blocks as shown in the Figures 5A and 5B, M=2N. However, the LM-based CCP may apply to any block shape where M is for instance the sum of the block height H plus the block width W (for a rectangular block shape).
It is to be noted that the value of M used as a weight in this equation may be adjusted to avoid computational overflows at the encoder and decoder. To be precise, when using arithmetic with 32-bit or 64-bit signed architectures, some of the computations may sometimes overflow and thus cause unspecified behaviour (which is strictly prohibited in any cross-platform standard). To address this situation, the maximum magnitude possible given the input RecL’ and RecC values may be evaluated, and M (and in turn the sums above) may be scaled accordingly to ensure that no overflow occurs. The derivation of the parameters is usually made from the sample sets RecL’ and RecC shown in Figure 5A.
Variations of the sample sets have been proposed.
For instance, US 9,288,500 proposes three competing sample sets, including a first sample set made of the outer line adjacent to the top boundary and the outer line adjacent to the left boundary, a second sample set made of only the outer line adjacent to the top boundary and a third sample set made of only the outer line adjacent to the left boundary. These three sample sets are shown in Figure 6B for the chroma block only (and thus can be transposed to the luma block).
US 9,462,273 extends the second and third sample sets to additional samples extending the outer lines (usually doubling their length). The extended sample sets are shown in Figure 6C for the chroma block only. This document also provides a reduction in the number of LM modes available in order to decrease the signalling costs for signalling the LM mode used in the bitstream. The reduction may be contextual, for instance based on the Intra mode selected for the associated luma block.
US 9,736,487 proposes three competing sample sets similar to those of US 9,288,500 but made, each time, of the two lines of outer neighbouring samples parallel and immediately adjacent to the boundaries considered. These sample sets are shown in Figure 6D for the chroma block only.
US 9,153,040 and the documents of the same patent family propose additional sample sets made of a single line per boundary, with fewer samples per line than the previous sets. In VVC, the cross-component modes are shown in Figure 6C, the latter two referred to as “directional modes”. However, as the blocks can be rectangle-shaped, the number of samples on the border may not be a power of 2. As previously described, this makes the implementation more troublesome. Consequently, one in N samples is used on the bigger border, such that the total number of samples is a power of 2. The first sample set corresponds to the INTRA_LT_CCLM (“left”+“top”) mode, the second to INTRA_L_CCLM (“left”) and the third to INTRA_T_CCLM (“top”). Ensuring that the number of samples is a power of 2 is beneficial, as some computations may require divisions by the number of samples. Figure 15 illustrates one way to guarantee this: for rectangular shapes, the selected samples are marked with an ‘X’. As all dimensions of a block are powers of 2, and various rectangular shapes are allowed, e.g. 1501 to 1503, it is possible (and thus done) to select the same number of samples on each border.
For example, for the 8x2 rectangular block 1501, 2 samples are taken on the top border of size 8, and 2 on its left border. Blocks 1502 and 1503 illustrate the symmetrical nature of the solution, as one out of 2 is selected on the longest border. The sampling ratio is therefore 1 out of 2^M, whereby M is such that 2^N x (2^M.2^N) or (2^M.2^N) x 2^N are the sizes of the concerned blocks. As a result, the number of samples is then 2*2^N, therefore a power of 2.
Block 1504 uses another coding mode, in VVC called INTRA_T_CCLM, which does not use sampling: the sampling for INTRA_LT_CCLM was due to the use of the LMS method, while the INTRA_T_CCLM and INTRA_L_CCLM modes were introduced with another method computing the model parameters from extremum points.
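The power-of-two selection of Figure 15 can be sketched as follows (an illustrative sketch; the function name is an assumption). The idea is to take as many samples from each border as the shorter block dimension, keeping one out of "step" samples on the longer border:

```c
/* For a width x height rectangular block (both powers of two), return the
 * sampling step on the longer border and the number of samples taken on
 * each border; the total, 2 * shorter, is then a power of two. */
int border_sampling_step(int width, int height, int *samplesPerBorder)
{
    int shorter = width < height ? width : height;
    int longer  = width < height ? height : width;
    *samplesPerBorder = shorter;   /* same count on the top and left borders */
    return longer / shorter;       /* one out of this many on the longer border */
}
```

For the 8x2 block 1501 this yields 2 samples per border, i.e. one out of 4 on the top border.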
Back to the process of Figure 4, using the linear model with the one or more derived parameters 406, a chroma intra predictor 413 for chroma block 502 may thus be obtained from the reconstructed luma samples 407 of the current luma block represented in 505. Again, if a chroma sampling format is used (e.g. 4:2:0, 4:2:2, etc.), the reconstructed luma samples are down-sampled at step 408 into L’ 409 to match chroma resolution (i.e. the sample resolution of the corresponding chroma frame/block).
The same down-sampling as for step 403 may be used, or another one for line buffer reasons. For instance, a 6-tap filter may be used to provide the down-sampled value as a weighted sum of the top left, top, top right, bottom left, bottom and bottom right samples surrounding the down-sampling position. When some surrounding samples are missing, a mere 2-tap filter is used instead of the 6-tap filter.
Applied to reconstructed luma samples L, the output L' of an exemplary 6-tap filter is obtained as follows:

L'[i,j] = ( 2 . L[2i, 2j] + 2 . L[2i, 2j+1] + L[2i-1, 2j] + L[2i+1, 2j] + L[2i-1, 2j+1] + L[2i+1, 2j+1] + 4 ) » 3

with (i,j) being the coordinates of the sample within the down-sampled block and » being the bit-right-shifting operation.
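As an illustrative sketch only (the function and variable names are ours, not the specification's), this filter can be written as:

```python
# Illustrative sketch of the 6-tap down-sampling filter above.
# L is a 2D array of reconstructed luma samples, assumed to have valid
# neighbours around position (2i, 2j); the +4 term rounds to nearest.
def downsample_6tap(L, i, j):
    return (2 * L[2 * i][2 * j] + 2 * L[2 * i][2 * j + 1]
            + L[2 * i - 1][2 * j] + L[2 * i + 1][2 * j]
            + L[2 * i - 1][2 * j + 1] + L[2 * i + 1][2 * j + 1]
            + 4) >> 3
```

On a flat area the result equals the common sample value, since the six weights sum to 8 and the +4 term rounds to nearest.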
Thanks to down-sampling step 408, the L' and C blocks (the set of chroma samples in chroma block 502) comprise the same number N² of samples (chroma block 502 being N x N).
Next, each sample of the chroma intra predictor PredC 413 is calculated using the loop 410-411-412 following the formula
PredC [i,j] = a. L'[i,j] + b
with (i,j) being the coordinates of all samples within the chroma and luma blocks.
To avoid divisions and multiplications, the computations may be implemented using less complex methods based on look-up tables and shift operations. For instance, the actual chroma intra predictor derivation 411 may be done as follows:
PredC[i,j] = ( A . L'[i,j] ) » S + b
where S is an integer and A is derived from A1 and A2 (introduced above when computing a and b) using the look-up table mentioned previously. It actually corresponds to a rescaled value of a. The operation (x » S) corresponds to the bit-right-shifting operation, equivalent to an integer division of x (with truncation) by 2^S.
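A minimal sketch of this shift-based computation, with illustrative values (a slope a = 0.75 represented as A = 24 with S = 5; the numbers are ours, not from the specification):

```python
# Shift-based predictor: A / 2^S approximates the slope a.
# Here a = 0.75 would be represented as A = 24 with S = 5 (24/32 = 0.75).
def pred_chroma_int(luma, A, S, b):
    return ((A * luma) >> S) + b
```

For instance, pred_chroma_int(100, 24, 5, 3) gives 78, matching 0.75*100 + 3 exactly in this case.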
When all samples of the down-sampled luma block have been parsed (412), the chroma intra predictor 413 is available for subtraction from chroma block 502 (to obtain a chroma residual block) at the encoder end or for addition to a chroma residual block (to obtain a reconstructed chroma block) at the decoder end.
Note that the chroma residual block may be insignificant and thus discarded, in which case the obtained chroma intra predictor 413 directly corresponds to predicted chroma samples (forming chroma block 502).
Both standardization groups ITU-T VCEG (Q6/16) and ISO/IEC MPEG (JTC 1/SC 29/WG 11), which have defined the HEVC standard, are studying future video coding technologies for the successor of HEVC in a joint collaboration effort known as the Joint Video Exploration Team (JVET). The Joint Exploration Model (JEM) contained HEVC tools and newly added tools selected by this JVET group. In particular, this reference software contains some CCP tools, as described in document JVET-G1001. This model has been superseded by the VTM, currently described in document JVET-L1001-v3.
In addition to the previously described variations around CCLM, another mode is being studied in JVET.
Compared to CCLM, this so-called‘multiple model’ MMLM mode uses two linear models. The neighbouring reconstructed luma samples from the RecL’ set and the neighbouring chroma samples from the RecC set are classified into two groups, each group being used to derive the parameters a and b of one linear model, thus resulting in two sets of linear model parameters (ai,bi) and (a2,b2).
Currently, a threshold is calculated as the average value of the neighbouring reconstructed luma samples forming RecL'. Next, a neighbouring luma sample with RecL'[i,j] ≤ threshold is classified into group 1, while a neighbouring luma sample with RecL'[i,j] > threshold is classified into group 2.
Next, the chroma intra predictor (or the predicted chroma samples for current chroma block 602) is obtained according to the following formulas:
PredC[i,j] = a1 . L'[i,j] + b1, if L'[i,j] ≤ threshold

PredC[i,j] = a2 . L'[i,j] + b2, if L'[i,j] > threshold
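The two-model selection above can be sketched as follows (a hypothetical helper; the handling of equality follows the first formula, and all names are ours):

```python
# Two linear models selected by comparing the down-sampled luma value to
# the classification threshold, as in the MMLM formulas above.
def mmlm_predict(luma, threshold, model1, model2):
    a, b = model1 if luma <= threshold else model2
    return a * luma + b
```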
The CCLM or MMLM mode has to be signalled in the bitstream 110 or 201. Figure 8 illustrates an exemplary LM mode signalling of JEM. A first binary flag indicates whether the current block is predicted using an LM mode or other intra modes, including so-called DM modes. In case of an LM mode, the six possible LM modes need to be signalled. The first MMLM mode (using the 6-tap filter) is signalled with a second binary flag set to 1. This second binary flag is set to 0 for the remaining modes, in which case a third binary flag is set to 1 to signal the CCLM mode and is set to 0 for the remaining MMLM modes. Two additional binary flags are then used to signal one of the four remaining MMLM modes.
One mode is signalled for each chroma component. The Cb-to-Cr CCLM mode introduced above is used in DM modes, and applies at residual level. Indeed, a DM mode uses for chroma the intra mode which was used by luma at a predetermined location. Traditionally, a coding standard like HEVC uses one single DM mode, co-located with the top-left corner of the CU. Without going into too much detail, and for the sake of clarity, JVET provides several such locations. This mode is then used to determine the prediction method, therefore creating a usual intra prediction for a chroma component which, when subtracted from the reference/original data, yields the aforementioned residual data. The prediction for the Cr residual is obtained from the Cb residual (ResidualCb below) by the following formula:
PredCr[i,j] = a . ResidualCb[i,j]
where a is derived in a similar way as in the CCLM luma-to-chroma prediction. The only difference is the addition of a regression cost relative to a default a value in the error function so that the derived scaling factor is biased towards a default value of -0.5 as follows:
a = ( Σ ( RecCb_i . RecCr_i ) + λ . (-0.5) ) / ( Σ ( RecCb_i . RecCb_i ) + λ )
where RecCb_i represents the values of neighbouring reconstructed Cb samples, RecCr_i represents the neighbouring reconstructed Cr samples, and

λ = ( Σ RecCb_i . RecCb_i ) » 9.
The LM modes currently present (or proposed) in VVC suffer from their modelling being inaccurate, while the current (or proposed) MMLM mode suffers from a high complexity (due to the least-squares approach of fitting a linear model).
A proposed alternative LM is based on the replacement of the derivation of a single linear model used to compute chroma predictor block samples from luma block samples by determination of the parameters of the linear model based on the equations of straight lines. A straight line is defined by two sample pairs defined based on reconstructed sample pairs in the neighbourhood of the block.
Figure 7 illustrates the principle of this method by considering here the minimum and the maximum of luma sample values in the set of sample pairs in the neighbourhood of the current block. All the sample pairs are drawn on the figure according to their chroma value and their luma value. Two different points, namely point A and point B, are identified on the figure, each point corresponding to a sample pair. Point A corresponds to the sample pair with the lowest luma value xA from RecL' and yA its collocated chroma value from RecC. Point B corresponds to the sample pair with the highest luma value xB and yB its collocated chroma value.
Figure 8 shows a flow chart of a method to derive the linear model parameters shown in Figure 7. This flow chart is a simplified version of Figure 4. The method is based on the neighbouring luma samples RecL' obtained in step 801 and chroma samples RecC obtained in step 802.
In a step 803, the two points A and B (804) corresponding to two sample pairs are determined. In a first embodiment, these two points A and B correspond to the sample pairs with respectively the lowest and highest luma sample values xA and xB with their corresponding chroma sample values yA and yB.
Then the straight line equation which crosses the points A and B is computed in a step 805 according to the following equations:

a = ( yB - yA ) / ( xB - xA )

b = yA - a . xA
The obtained a,b are the linear model parameters 806 used to generate the chroma predictor.
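Steps 803 to 805 can be sketched as follows (a simplified floating-point illustration; the names are ours, and ties between pairs with equal luma values are broken arbitrarily):

```python
# Derive the straight-line parameters from the two extreme sample pairs.
def derive_two_point_model(pairs):
    """pairs: list of (luma, chroma) sample pairs from the neighbourhood."""
    xA, yA = min(pairs)            # pair with the lowest luma value (point A)
    xB, yB = max(pairs)            # pair with the highest luma value (point B)
    a = (yB - yA) / (xB - xA)      # slope of the line through A and B
    b = yA - a * xA                # intercept
    return a, b
```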
The linear model derivation based on the LMS algorithm used in the prior art has a certain complexity. In this known method, the computation of the a parameter of the model is obtained by the following equation:
a = ( M . Σ ( RecC_i . RecL'_i ) - Σ RecC_i . Σ RecL'_i ) / ( M . Σ ( RecL'_i . RecL'_i ) - ( Σ RecL'_i )² ) = ( B1 - B2 ) / ( B3 - B4 )
The analysis of this equation regarding the computation complexity gives the following results. The computation of B1 requires M+1 multiplications and M sums, M being the number of sample pairs. The computation of B2 requires 1 multiplication and 2M sums. The computation of B3 requires M+1 multiplications and M sums, and the computation of B4 requires one multiplication and 2M sums. The computation of a, corresponding to (B1 - B2) / (B3 - B4), requires two additional sums and one division.
To compute b, one multiplication, 2M+1 sums and one division are needed. As described previously, M is the number of pairs of samples RecC_i and RecL'_i.
The complexity of the LMS derivation of a and b is therefore (2M + 2 + 2) multiplications, (7M + 3) additions and two divisions.
In comparison, the analysis of the method based on the computation of the equation of a straight line using only two points gives the following results. As reported, the derivation step 805 requires only one multiplication, three sums and one division. This large complexity reduction in generating the linear model parameters is a major advantage of the proposed invention.
It should be noted that the search for the minimum and maximum values has a complexity of its own, typically related to sorting algorithms. The operation is not completely serial: N points can be compared to N other points, generating N minima/maxima. Then N/2 minimum and N/2 maximum points can be compared to the N/2 others, then again N/4 and so on until only the desired numbers of minimum and maximum points remain. Typically, the search for the minimum and maximum thus results in approximately 2*N-2 comparisons (N-1 for each).
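The pairwise reduction described above can be sketched as follows (an illustration of the comparison structure, not an implementation from the specification):

```python
# Tree-style search for the minimum and maximum: each round halves the
# number of candidates, giving roughly N-1 comparisons for each bound.
def tree_min_max(values):
    mins, maxs = list(values), list(values)
    while len(mins) > 1:
        # Carry an odd leftover candidate to the next round unchanged.
        odd_min = mins[-1:] if len(mins) % 2 else []
        odd_max = maxs[-1:] if len(maxs) % 2 else []
        mins = [min(a, b) for a, b in zip(mins[::2], mins[1::2])] + odd_min
        maxs = [max(a, b) for a, b in zip(maxs[::2], maxs[1::2])] + odd_max
    return mins[0], maxs[0]
```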
As already described, the chroma predictor can be calculated with an integer multiplication and a shift instead of a floating-point multiplication, and a division when computing the slope. This simplification consists in replacing:

PredC[i,j] = a . L'[i,j] + b

by:

PredC[i,j] = ( L . L'[i,j] ) » S + b
To use only integer multiplication and shift, in one embodiment, the straight line equation is obtained as follows:

S = 10

L = ( ( yB - yA ) « S ) / ( xB - xA )

b = yA - ( ( L . xA ) » S )
Please note that, in the following, b refers to this equation when a is replaced by L and S; otherwise it refers to the traditional equation b = yA - a . xA.
Another advantage of this derivation is that the shift value S always has the same value. This is interesting especially for hardware implementation that can be made simpler in taking advantage of this property.
In yet another embodiment, the value of S is forced to be low, as L could otherwise be large and require larger multiplier operations. Indeed, a multiplication of an 8-bit value by an 8-bit value is much easier to implement than e.g. an 8x16 multiplier. Typical practical values for L are often equivalent to a multiplier of fewer than 8 bits.
A particular implementation is known as fixed point: for every value of D = (xB - xA), possibly quantized (e.g. the results for 2D+0 and 2D+1 are stored as a single one), the value of (1 « S)/D is stored in a table. Preferably, entries are stored only for the positive values, as the sign can easily be retrieved. Using an array TAB, the computation of L thus becomes:

L = sign( xB - xA ) . ( yB - yA ) . TAB[ abs( xB - xA ) / Q ]
Q controls the quantization and thus the number of elements in the table. Using Q=1 thus means no quantization. Also note that the looked-up index can instead be (abs(xB - xA) + R)/Q, typically with R = Q/2, or a variation thereof of the division rounding. Consequently, Q is ideally a power of 2 so that the division by Q = 2^P is equivalent to a right-shift by P.
Finally, some of the values in that table may be unreliable: low values of abs(xB - xA) or abs(yB - yA) often result in very bad estimations of L. Pre-determined, or explicit (such as in the slice header or a parameter set such as the PPS or SPS) values can be used instead. For instance, for all values of D below 4, the array TAB may contain a default value, e.g. (1 « S)/8. For 10-bit content and Q=1, up to 2048 entries in the array are needed. By exploiting the symmetry with sign as shown above, this can be reduced to 1024. Further increasing Q would similarly reduce the size of TAB.
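A sketch of the table-based computation of L, assuming S = 10 and Q = 1, and omitting the default-value handling for small D discussed above (all names are ours):

```python
S, Q = 10, 1
# TAB[d] stores (1 << S) / d for positive d (TAB[0] is a placeholder);
# only positive differences are tabulated, the sign being restored below.
TAB = [0] + [(1 << S) // d for d in range(1, 1 << S)]

def slope_fixed_point(xA, yA, xB, yB):
    d = xB - xA
    sign = -1 if d < 0 else 1
    return sign * (yB - yA) * TAB[abs(d) // Q]
```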
It should be appreciated that the derivation of the parameters of the straight line may be calculated in a number of different methods - with certain methods (for example, using tables or estimations of the denominator of a) being particularly well-suited to hardware implementation. Any such modifications could be combined with the present invention without the need for any structural modification.
Similarly, the selection of the maximum and minimum points (A and B) may vary depending on the implementation and such modifications could be combined with the present invention without the need for any structural modification.
Figure 9 illustrates a proposed way of combining the MMLM mode with the straight line method described above. Given the threshold value Ymean, two segments are thus defined:
- A is the sample pair having the smallest luma sample value, and C is the one having the largest luma sample value below Ymean;
- B is the sample pair having the largest luma sample value, and D is the one having the smallest luma sample value above Ymean.
Given these specific two pairs of points, two models can be determined by computing their parameters, respectively (a1,b1) and (a2,b2), as the slope and intercept of respectively the dashed line and the dotted line passing through respectively A and C, or D and B. Element 804 circles a discontinuity of the models centered around Ymean.
However, the accuracy of this method relies heavily on the selection of points C and D - with minor differences potentially making a large impact on the parameters a1, a2, b1, or b2. Furthermore, this method introduces a discontinuity which may yield incorrect or inaccurate predictions for values immediately on either side of the discontinuity. The present invention seeks to improve the situation in terms of coding efficiency and/or computational complexity.
Figure 10 first illustrates the concept of a 'target'. The goal is to find T, indicated by an arrow to a mid-point between points A and B - this point is termed a 'knee point' and is defined as a point which is, or best matches, a particular target. Once the knee point has been determined, the dashed and dotted line equations relating to a linear model linking adjacent points A -> T and T -> B respectively can be determined, and thus a continuous linear model can be determined from the parameters of the two lines linking the three points A->T->B. Such a method provides an improvement in accuracy by utilizing multiple models, while maintaining the simplicity of the LM, and avoiding the problems arising from a discontinuity. In such a way, efficient coding can be achieved.
In a first embodiment, determining the knee point comprises finding the sample pair which minimizes the distance (or has a distance below a threshold) between its luma sample value and a target luma sample.
In one example, the target is the average value Ymean calculated over all sample pairs considered.
However, one issue with determining such a value as Ymean in order to find T is that the modeling requires an additional stage and buffering:
- First, going through all samples to find Ymean;
- Once this is done, go through them again to find the point with a Y value closest to Ymean (T).
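The two passes can be sketched as follows, with sample pairs represented as (luma, chroma) tuples (the names are ours):

```python
# First pass: compute Ymean over the luma values.
# Second pass: pick the sample pair whose luma value is closest to it.
def find_knee_two_pass(pairs):
    ymean = sum(luma for luma, _ in pairs) / len(pairs)
    return min(pairs, key=lambda p: abs(p[0] - ymean))
```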
One or the other can be performed while searching for the extremum points (A and B in Figure 10), but the other cannot. As a consequence, determining the target may require additional processing steps and as such it would be preferable to perform it while or before finding the extremum points.
One first set of embodiments relies on an iterative search of the target point. Referring to Figure 11 and its illustrative samples, several embodiments can be illustrated. Examples for determining a 'target' can be defined as:
- The average of the luma values of any of the A to I sample pairs;
- When going through the top border 801 to find the extremum points, the average of the luma values of any of the A to D sample pairs as well as I;
- Similarly, for the left border 802, sample pairs E to H.
In a preferred embodiment for iterative searches, a luma average is first computed using points on each side of a first border (e.g. the top). Ideally, the number of points is a (small, such as 2/4/8) power of 2, as the average would require a division (as opposed to a bit-shift) if it were not a power of 2. Embodiments include:
- For the INTRA_LT_CCLM (“left”+”top”) mode;
o For top border, using A and B;
o For left border, using E and F.
- For the INTRA_L_CCLM (“left”) mode, using A and D;
- For the INTRA_T_CCLM (“top”) mode, using E and H.
If the processing starts with the top border, then the aforementioned average for the top border is used to find the T point whose luma is nearest to it.
To process the second border, two scenarios apply. In the first, the first border is not available (i.e. the extremum and T points are undefined), which can be the case at the frontier of a slice or when constrained intra prediction is used. The former splits the image according to image or tile boundaries and another criterion, such as the number of CTUs or the coded bitstream size, while the latter is an error-resilience tool which forbids intra prediction from using data coming from temporally predicted data (such as sample values), thereby breaking the dependence on possibly corrupted or missing data. In such cases, the average on said border for one of the CCLM modes is used. Otherwise, the "middle" point between the A and B points as found on the previous border is used.
In all cases, it should be noted that the average or middle point found is not necessarily updated when iterating over the sample pairs of a border, while the minimum and maximum points are. In another embodiment, the target for the T point is updated when a new minimum or maximum point is found. As this may cause a convergence problem, like getting caught in a local minimum, further criteria can be applied to decide when to update said target: if enough sample pairs have been investigated, or if the change in the target is above a threshold, then this is a strong enough hint that the values are safe to update. This is particularly true at the start of the iterative search, when the distance between the current minimum and maximum is too small and may cause the target to be set too close to either one, preventing fast updates afterwards.
In addition to the previous set of iterative embodiments, one can forego the search for the T point best matching the target. Instead, this point can be defined through a combination of several sample pairs. In a set of embodiments, T is the point whose luma and chroma are the averages of the luma and chroma sample values of a set of sample pairs. As explained when defining the target point, this set preferably contains a power-of-2 number of pairs for easier computation of the average. In the extreme case, all sample pairs that are currently used are used for the average.
However, a particularly advantageous embodiment takes into account that, if the model has a strong mismatch in the middle and it is derived from the extremum points, these extremum points are outliers. It is therefore beneficial to remove them from the averages computed. A potential drawback is then that the number of sample pairs used in the computation of averages is no longer a power of 2. A first embodiment therefore adds two additional sample pairs to compensate for the extremum ones to be removed. The locations of these depend on the sample set used. However, there is a risk that these are actually outliers too. Therefore, in one embodiment, a new average is computed by removing the extremum points, for example by using the following formula:

avg = ( (2^N + 2) * avg - min - max + 2^(N-1) ) » N

or alternatively

avg = avg + ( (2 * avg - min - max + 2^(N-1)) » N )

where the updating of a variable, avg = avg + val, is sometimes represented as avg += val. Typical values for N in these formulae are 2 or 3: N=1 may produce strong changes to the original value of avg, and N=4 too weak ones.
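The second update formula can be sketched as follows (with N = 2; the names are ours):

```python
# avg += (2*avg - min - max + 2^(N-1)) >> N: nudges the average away
# from the extremum points, treating them as outliers.
def update_avg(avg, mn, mx, N=2):
    return avg + ((2 * avg - mn - mx + (1 << (N - 1))) >> N)
```

For avg = 50, min = 10 and max = 70, the average moves from 50 to 55, away from the extremum mid-point of 40; an average already at that mid-point is left unchanged.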
The number of samples may not be a power of 2, and as discussed above, this may be detrimental, as the true 'avg' value (computed by summing the sample values then dividing by the number of samples) would no longer be able to be calculated solely using shift operations to perform the division.
Such an occurrence is exemplified in block 1504 of Figure 15. This block is coded according to a CCLM mode employing only the extended top border, known as e.g. INTRA_T_CCLM. This extension works by using the already coded parts of the image. There are various scenarios where such an extended border of this size may exist. Usually, the 8 necessary samples on the top border are easily available. More generally, there are 2N samples for an NxM block, and conversely 2M samples for the same block coded using the INTRA_L_CCLM mode.
However, restrictions on the number of available samples may apply which may mean selecting a power of 2 may be impractical:
• the block may lie near the right or bottom border of the image and as such the samples needed on the border may not exist.
• the CTB/CTU (largest coded block) border, or a sub-unit, for instance known as the Video and Data Processing Unit (VDPU), a virtual processing unit that helps restrict the amount of data to cache for implementations, and currently 64x64 (i.e. a quarter of the largest possible, 128x128, CTB), may limit the samples available. A block whose 2N-sample border would extend past the VDPU boundary is not allowed, and as such this may result in 'clipping' of the number of available samples.
• If the 'Constrained Intra Prediction' feature is enabled - this is an error-resilience feature where the use of samples coming from 'inter prediction' is disallowed, and as such the choice of samples is restricted. This also combines with the hierarchy of blocks in VVC, called QTBT: a CTB is first split by a quadtree, the leaf nodes of which are further split into binary trees (i.e. horizontal or vertical splits into areas of equal size). Other splitting schemes can be used, such as Ternary Trees or Asymmetrical Binary Trees (e.g. 3N vs N). Thus, in certain cases, some blocks might have at least one dimension which is not a power of 2.
Therefore, for a variety of reasons, it is possible that the number of samples may not be a power of 2. It should be noted also that, currently in VVC, only the INTRA_LT_CCLM mode uses a sampling ratio for the longest border, as is depicted on blocks 1501 to 1503; therefore, as such, a solution is required.
The solutions discussed herein can be placed into three broad categories: sampling, scaling and approximation.
Sampling

In this set of embodiments, a power-of-2 number of samples is selected from the available samples. Such 'sampling' may be regular (i.e. selecting every N available samples), or more adaptive as is discussed below.
Figure 16 illustrates various solutions to the number of samples not being a power of 2, at least for the case of a single border. Block 1600 uses for instance a sampling of 6:4. As this is not an integer ratio, the selected samples, marked by an 'X', result from the rounding: 0 x 1.5 is rounded to 0, 1 x 1.5 is rounded to 1, 2 x 1.5 is rounded to 3 and finally 3 x 1.5 is rounded to 4. This is a simple extension of the sampling ratio to guarantee a power of 2.
An alternative, simpler solution is to determine the nearest integer M that is a power of 2 smaller than or equal to the number of available samples, and to use M contiguous samples at the 'start' (in e.g. coding order) of the border. Here, 'smaller than or equal' indicates that M is not required to be strictly smaller, and can be equal to said number of samples (for example, when the number of available samples is itself a power of 2).
This method is more effective in terms of coding efficiency, as well as implementation complexity, than using the sampling ratio discussed above, and as such may represent a general preferred embodiment for ensuring a power-of-2 number of samples.
To illustrate possible variants of this general solution, block 1602 shows the case where these M samples are centered (if possible) in the border, and block 1603 where they are located at the 'end' of the border (referred to later as Last 2^N). Another criterion could be to select any of the patterns illustrated for blocks 1601 to 1603 depending on a property of the samples on the border. Such a criterion can be implicit, e.g. based on the quantizer step or coding mode (e.g. INTRA vs INTER, the intra or inter prediction method, the coding of residuals, etc.) used for coding the blocks where these samples are located. It can also be explicit, e.g. due to a sub-indication for the current CCLM mode (such as a flag or unary code) or because the CCLM mode directly implies the pattern selected. For instance, three such 'top' modes could exist, namely topleft, topright and topcentered. Depending on the block size, it will be appreciated that a number of further variations of such positions would be possible. This however implies particular book-keeping, which can be detrimental to the caching or memory mechanisms of e.g. a hardware implementation. For instance, 1602 may cause an 'unaligned access', which has various negative consequences, from slowness to not being possible at all.
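The three placements illustrated by blocks 1601 to 1603 can be sketched as follows (the pattern names are ours, not the specification's):

```python
# Select M contiguous sample indices on a border of n_avail samples,
# where M is the largest power of 2 not exceeding n_avail.
def select_border_samples(n_avail, pattern="first"):
    m = 1
    while m * 2 <= n_avail:
        m *= 2
    start = {"first": 0,                       # block 1601
             "centered": (n_avail - m) // 2,   # block 1602
             "last": n_avail - m}[pattern]     # block 1603
    return list(range(start, start + m))
```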
Scaling
A further embodiment observes that a troublesome number of samples is often of the form 3 . 2^N (e.g. 3/6/12/24/48). This may happen because the encoder tries to select a block size as large as possible but is required to leave out a part for a much smaller block. As a consequence, many troublesome cases can be handled by applying a scaling factor to the average so that bitwise shifting can be used to calculate the division. For the troublesome numbers of samples discussed above, this factor is around 2/3.
This can be implemented by computing the 'average' (which is no longer a true mathematical average) in the following manner.

If the number of samples is 3 . 2^M as discussed above, the average would need to be scaled as follows:

avg = sum / ( 3 . 2^M ) = ( ( sum » (M+1) ) . 2 ) / 3

Using fact/2^prec ≈ 2/3, we can rewrite this as:

( ( sum » (M+1) ) . 2 ) / 3 ≈ ( ( sum » (M+1) ) . fact ) » prec ≈ ( sum . fact ) » ( M + 1 + prec )

A rounding parameter R can then be added to bias the rounding of the result as an integer in an advantageous way. Writing N = M + 1, the exponent of the largest power of 2 not exceeding the number of samples, the following general formula is derived:

avg = ( fact * sum + R ) » ( N + prec )
The value of N is the exponent of the nearest power-of-two number smaller than or equal to the number of samples. As observed previously, this is N+1 for 3 . 2^N samples and N for 2^N samples. In any case, such an N value can be obtained by various means, such as from a table or by bit counting (such as count-leading/trailing-zero-bits processor instructions). The variables 'fact' and 'prec' are 1 and 0 respectively when the number of samples is a power of 2, or are otherwise selected such that fact / 2^prec is close to 2/3. Examples of values for fact and prec are therefore 21 and 5, 11 and 4, or 5 and 3. Results for fact/2^prec of 21/32 are provided below. R is a rounding parameter that can be equal to e.g. 0, 2^(N-1) or 2^(N+prec-1), the latter giving rounding to the nearest integer.
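The shift-only 'average' can be sketched as follows (fact = 21, prec = 5, with R chosen for round-to-nearest; the names follow the text, the packaging is ours):

```python
# Approximate total / n using only an integer multiply and a shift,
# for n equal to 2^N or 3 * 2^(N-1), where N is the exponent of the
# largest power of 2 not exceeding n; fact/2^prec = 21/32 ~ 2/3.
def scaled_avg(total, n, fact=21, prec=5):
    N = n.bit_length() - 1
    if n == 1 << N:                # exact power of two: no 2/3 factor
        fact, prec = 1, 0
    R = 1 << (N + prec - 1) if N + prec > 0 else 0
    return (fact * total + R) >> (N + prec)
```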
Approximation
An alternative solution is to not modify the number of samples and to perform an approximation of the division. The typical solution is to use a fixed-point multiply, as has already been presented previously. Another example is document JVET-M0064, "Non-CE3: CCLM table reduction and bit range control", where the precision and multiplier in the fixed-point multiply are more finely controlled. While having the advantage of being shareable with the slope computation, it was surprisingly measured to be less efficient than the above embodiments.
Accounting for a non-power-of-2 number of samples can be performed by ignoring it and computing the average as if there were a power-of-2 number of samples. Therefore, a further embodiment consists in always using 1 and 0 for, respectively, fact and prec discussed above in relation to the 'scaling' solution. This solution is referred to as CE3-1.8.1.
As a consequence, the computed 'avg' value is no longer a true average, and introduces some inaccuracies in the derivation of the slope of the line (a) but, surprisingly, this works sufficiently well to provide useful gains in coding efficiency without introducing additional complexity. As a consequence, the use of the variable 'avg', or of the word 'average', in this specification means a value computed by any of the various preceding embodiments as opposed to a true mathematical average.
The following table provides an indication of the relative performance of a selection of different solutions, compared to just using a single slope computed from the min/max samples (negative values indicating a coding efficiency gain):
(Table of BD-rate results, reproduced as an image in the original publication; not recoverable in this text extraction.)
Where YUV is a weighted average of the so-called BD-rate gains for the Y, U and V components of test video sequences. AI is an encoding scenario where all frames encoded are independent images (i.e. using only intra prediction/coding) and RA is an encoding scenario where the forcing to use an intra image happens approximately every second (e.g. every 64 frames at 60Hz), while other images may reference other images (i.e. using inter prediction/coding besides intra prediction/coding).
As can be seen from the results, the three different categories of solution (which all use an‘average’ which differs from a true mathematical average) provide a minor improvement to coding efficiency - this is surprising given that a less accurate average would be expected to have reduced the coding efficiency.
In the above computations of avg, N is a predetermined value, possibly related to the number of sample pairs having been selected to compute the average. In a more generic fashion, the value of N may depend on the block size, e.g. if there are 2^P samples used on the border(s), then N is available from a table indexed with P. Part or all of the table may be such that, for a given P, the corresponding value is such that N < P. Furthermore, the average may be rounded by using a value other than 2^(N-1). An example is simply foregoing the rounding, for the above embodiment or any following:

avg = avg + ( (2 * avg - min - max) » N )
Finally, the value of avg may go outside of an expected range, whether it is [min; max] in the preferred embodiment, or [0; 2^bitdepth - 1]. In these cases, the average may simply not be updated, or the value clipped to the aforementioned range. In this case, the above equation becomes:

avg = clip( min, max, avg + ( (2 * avg - min - max) » N ) )

with clip(a, b, c) doing the expected clipping operation:
- if c < a, it returns a;
- if c > b, it returns b;
- otherwise, it returns c.
A particularly advantageous embodiment, using the 'approximation' method discussed above, is to calculate a shift value which is dependent on the availability of certain neighbouring samples:

nShift = Log2( nS ) + ( availL && availT ? 1 : 0 )
where:
- nS is the number of samples on one border, i.e. the total number of samples can be 2*nS;
- availL and availT are the availability of the left and top neighbouring samples respectively, used in the derivation process for a block;
- Log2( ) is the result of the base-2 logarithm truncated to an integer (e.g. Log2( 6 ) = 2).
Thus nShift is greater than Log2(nS) by 1 when both the left and top samples are available. The averages can then be calculated as follows:

avgY = ( avgY + ( 1 « ( nShift - 1 ) ) ) » nShift

avgC = ( avgC + ( 1 « ( nShift - 1 ) ) ) » nShift
As such, for a block where the left and top samples are not both available, 2^nShift may differ from the number of samples actually used, and the calculation does not represent a true average. In such a case the number of available samples used to calculate the average is likely not to be a power of two, so adjusting nShift allows a bitwise division to be performed while only having a small impact on the accuracy of the resultant 'average'. nS is defined as the number of samples inspected on one border, and there are scaling factors xS and yS to convert an index of an inspected sample into its position, i.e. the last one on the top border would be xS*(nS-1), and the last one on the left border would be yS*(nS-1). Their definitions in the current VVC specifications make them integers, and thus of practical implementation, while the first embodiment may result in non-integers (e.g. 40/32 or 12/8).
The averages may then be updated to remove the min and max points as discussed above: avgC += ( 2 * avgC - minC - maxC ) >> 2 avgY += ( 2 * avgY - minY - maxY ) >> 2
The values of diff1 and diff2 can then be derived from previously calculated variables without the need for scaling: diff1 = avgY - minY diff2 = maxY - avgY
Multiple 'knee points'
A variation which may improve the accuracy of the prediction (at a cost of increased complexity) consists of increasing the number of points (M) used to generate the continuous model (M=3 in the simplest embodiment). In this way, multiple 'knee points' are generated, which allows for a more accurate model.
In one example, the additional 'knee points' correspond to fractional mid-points (e.g. the mid-point multiplied by a rational number) which are distributed based on the calculated mid-point (which, as described above, may account for the extremum points being outliers). It should be noted that some of the points calculated may fall outside the range of the maximum and minimum values - in particular if the mid-point is calculated ignoring these values.
In such an example, a series of weighted averages can be determined using the following generalized formula: avg = ( (2^N - A - B)*avg + A*min + B*max + 2^(N-1) ) >> N
In some embodiments, A+B < 2^N, so that the true average still makes a positive contribution to the above weighted average. The final value for avg is thus a weighted average of the minimum, maximum and true average values, allowing various segmentations of the sample value range.
This formula can be used to generate additional knees/target points for either the iterative search embodiments or the ones above. For instance, the following knee points are below the middle, but still differ from the first quarter of the range:
Q1 = 2*avg - max
Q1 = (avg + min + 1) >> 1
Conversely, a knee/target point above the middle can be defined as:
Q3 = 2*avg - min
Q3 = (avg + max + 1) >> 1
The above expressions for Q1 and Q3 illustrate other embodiments that do not fit the generalized formula. One can therefore deduce that knee point computations may involve at least two of the average, minimum and maximum points, and that even the average is merely a convenient value which could be computed from fewer sample values (e.g. any of A to I in Figure 11).
From this, it can be understood that other-than-middle knee points can be defined, and that they can be used to segment the luma range into more segments than the two previously illustrated.
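The knee-point formulas above can be sketched in Python as follows (function names are ours; `weighted_knee` implements the generalized weighted average, and `quarter_points` the rounded Q1/Q3 alternatives):

```python
def weighted_knee(avg, mn, mx, a, b, n):
    # Generalized formula from the text:
    # avg = ((2^N - A - B)*avg + A*min + B*max + 2^(N-1)) >> N
    # A + B < 2^N keeps a positive contribution from the true average.
    return (((1 << n) - a - b) * avg + a * mn + b * mx + (1 << (n - 1))) >> n

def quarter_points(avg, mn, mx):
    # Rounded below-middle and above-middle knee points:
    # Q1 = (avg + min + 1) >> 1,  Q3 = (avg + max + 1) >> 1
    q1 = (avg + mn + 1) >> 1
    q3 = (avg + mx + 1) >> 1
    return q1, q3
```

For example, with A=0 and B=2 (N=2) the knee is pulled halfway towards the maximum, giving a segmentation finer than the simple middle split.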
Conditional use of linear models
For all the previous embodiments, the application of all or part of the parameter derivation can further be made subject to particular conditions. Indeed, in the prior art the MMLM technique is a separate mode. Instead, as made evident in the previous paragraphs, it can be applied to any CCLM mode. However, coding efficiency gains and complexity reductions can be achieved by making the use of such modes conditional. For instance, finding and using the point T / the knee can be made conditional on any one or several of the following.
The two-model (i.e. continuous piecewise linear model) case can be restricted to the chroma modes within a subset of all the cross-component chroma modes (say INTRA_LT_CCLM, or 2 out of 6, if there were to be 6 such modes).
Another embodiment concerns the block size or number of samples. This embodiment is particularly beneficial if the luma range over the borders is large, which happens more frequently on larger blocks. However, deriving two models can be seen as more complex than deriving one, and this worst-case situation can be restricted to blocks that are easier complexity-wise. All implementations of either any MMLM mode (Figure 9) or of embodiments of the continuous piecewise straight lines method (Figure 10) have been found to offer significant improvement in coding performance over the simple LM model (Figure 7) when applied to blocks of strictly more than 16 samples (compared to an absolute minimum number of samples in a chroma block of 4). Improved coding performance over the simple LM model (Figure 7) can also be achieved by restricting the use of either any MMLM mode (Figure 9) or of embodiments of the continuous piecewise straight lines method (Figure 10) to blocks of strictly more than 32 samples.
Typical block sizes that can be excluded are thus 2x2, 2x4, 4x2, 4x4, 8x2 and 2x8 if the restriction is for >16 samples.
Typical block sizes that can be excluded are 4x8, 8x4, 2x16 and 16x2 if the restriction is for >32 samples.
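The size restriction above amounts to a simple predicate on the chroma block dimensions; a minimal Python sketch (the function name is ours):

```python
def use_piecewise_model(width, height, min_samples=16):
    # The two-model mode is allowed only on blocks of strictly more than
    # min_samples chroma samples (16 or 32 in the text), which excludes
    # e.g. 4x4 and 8x2 for >16, and additionally 4x8 and 2x16 for >32.
    return width * height > min_samples
```

This makes explicit why the listed sizes (2x2 up to 2x8 for >16, and 4x8, 8x4, 2x16, 16x2 for >32) are excluded: their sample counts equal or fall below the threshold.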
The piecewise modelling, or MMLM, can be included as separate modes. These modes can be already known modes with the addition of the knee point search, or modes using e.g. new sets of samples on the border, different luma downsampling filters, or modes using multiple models differing from the piecewise modelling, etc.
It has been found that certain modes do not provide gains (or create losses) for some block sizes. As a consequence, when the coding mode of a block is signaled, the signaling may also be conditioned by the block size. Such an example would be to use a unary max code adapted to the number of LM modes. This is, for example, a series of one to two Context-based Adaptive Binary Arithmetic Coding (CABAC)-coded flags for 3 modes, where '0' means mode 0, '10' mode 1 and '11' mode 2.
Equivalently, if N modes are available (i.e. because the block is large enough), the N modes can be indicated using between 1 and N-1 flags. Reordering of the coding modes, or not including some of the single-model ones, may happen: e.g. codeword '10', previously meaning "mode 1", may instead mean "mode 1 using piecewise modelling", "110" may mean "mode 2 using piecewise modelling", "1110" may mean "mode 3", which can be an MMLM mode, and "1111" may mean "mode 1", in which case "mode 2" cannot be used (nor signaled). Finally, the use of CABAC-only bits, or of unary coding, does not preclude the use of another signaling method, as long as the method is adaptive to the block size or number of samples. The number of samples on the border, however, depends on the border availability and on the decimation scheme used to guarantee that a power-of-2 number of samples is used. As such, some larger blocks which do not have an available border, or which use certain decimation schemes, would also be excluded.
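The unary max (truncated unary) binarization described above can be sketched as follows (a simple bit-string model; a real codec would CABAC-code each flag rather than emit raw bits):

```python
def unary_max_encode(mode, num_modes):
    # Truncated unary code over num_modes symbols: mode m is coded as
    # m '1' flags followed by a terminating '0', except that the last
    # mode needs no terminator. For 3 modes: '0', '10', '11'.
    bits = '1' * mode
    if mode < num_modes - 1:
        bits += '0'
    return bits
```

This matches the text: with N modes, between 1 and N-1 flags are sent, and the two largest modes both use N-1 flags.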
In general, in the above embodiment, the MMLM mode or continuous piecewise straight lines method is used in preference to a single LM method in dependence on the block size and/or the number of samples.
A further class of embodiments concerning the adaptive use of a piecewise model (i.e. either the MMLM model shown in Figure 9, or the continuous‘knee point’ model shown in Figure 10) is via a determination of the luma ranges of each segment:
- If the luma range between the extremum points is too low, then the single line model is more likely valid in the middle of that range. For 10-bit content, a good threshold for using a single line, applied to the distance between the maximum and minimum luma values, is between 48 and 96.
- If one of the ranges between an extremum and the luma of T is below a threshold, then the alpha parameter from the other range is selected, meaning the parameters are (a1,b1) and (a1,b2), or (a2,b1) and (a2,b2). A good condition is the distance between T and the concerned extremum being below 6, though any value below that also works.
A particular case is when one, or both, of the luma distances from each extremum to T is not above a threshold: the model can then be skewed. Indeed, this means, first, that the average is very close to one of the extrema, which is then not an outlier. Second, if the luma distances are small, then quantization errors may strongly affect the terms used in computing the model parameters, and thus cause very large relative changes. One such case is a value of 2 for the luma distance: even an error of 1 will cause a value of 1/2 to be used instead of 1/3 or 1/1. Third, this means that the prediction block may use as input sample values that are far outside the observed luma range on the borders, and the latter may inadequately represent the former.
In the following, midY and midC are respectively the luma and chroma values of a knee point, determined according to any of the previous embodiments for determining a knee point. For instance, they can be the averages computed as previously seen, by removing the minimum and maximum for respectively luma (minY and maxY) and chroma (minC and maxC). Similarly, knee points usable in the following are the Q1 or Q3 points as computed earlier.
A particularly advantageous embodiment according to this minimal distance follows. Given the following computations:
shift = ( BitDepthC > 8 ) ? BitDepthC - 9 : 0 (8-153)
add = shift ? 1 << ( shift - 1 ) : 0 (8-154)
with BitDepthC the bitdepth of the component to predict, the following terms are computed:
diff1 = ( maxY - midY + add ) >> shift
diff2 = ( midY - minY + add ) >> shift
The condition becomes:
- If diff1 > 8 and diff2 > 8, then apply the two-model approach;
- Otherwise, use the traditional one-model approach:
diff = ( maxY - minY + add ) >> shift
It should be noted that 'diff' represents the denominator of the respective slope a. The use of 'shift' (which is dependent on the bitdepth of the sample) essentially restricts the total number of values this difference can take to a constant (irrespective of the bitdepth). In the example above, the number of values is 512, but it could be lowered to 256 or 128 by increasing the value of shift by one or two respectively. Such values of diff (or values derived from 'diff') can then be stored in a table to eliminate the need to calculate them each time. One example of a value derived from 'diff' is a 'division' value as follows:
div = ( ( maxC - avgC ) * ( Floor( 2^32 / diff ) - Floor( 2^16 / diff ) * 2^16 ) + 2^15 ) >> 16
where the 'Floor' function outputs the greatest integer less than or equal to its input. The shift value of 16 may be termed 'k' and represents a division accuracy; it may be reduced so that each value stored in the table takes up less memory, at the cost of a lower division accuracy. The optional 'add' parameter ensures that the value of 'diff' is correctly rounded by the shift operation. This can make subsequent division operations less complex.
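The computations above can be sketched in Python (the function names `slope_denominator` and `div_term` are ours, and the `div` expression follows the reconstruction given above, which mirrors the VVC-style table-based division):

```python
def slope_denominator(max_y, min_y, bit_depth):
    # Restrict the denominator's dynamic range to 9 bits (512 values)
    # regardless of bit depth, per the shift/add of (8-153)/(8-154).
    shift = bit_depth - 9 if bit_depth > 8 else 0
    add = (1 << (shift - 1)) if shift else 0
    return (max_y - min_y + add) >> shift

def div_term(max_c, avg_c, diff):
    # The two Floor() terms isolate the low 16 bits of Floor(2^32 / diff),
    # so that each table entry fits in 16 bits (the accuracy k of the text).
    frac = (1 << 32) // diff - ((1 << 16) // diff) * (1 << 16)
    return ((max_c - avg_c) * frac + (1 << 15)) >> 16
```

For 10-bit content, shift = 1 and add = 1, so a full-range difference of 1023 maps onto the 512-entry table, as stated in the text.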
The threshold for the minimal distance between the max/min and mid points (8 in the above example) is set to be a significant portion of the range of values of 'diff'. As such, a different threshold may be appropriate depending on the range and/or the variance of the sample values in the range. In particular, if the value of 'shift' is increased (as discussed above), the threshold should be lowered.
It should be noted that if either diff1 or diff2 = 0, the slope of the corresponding line a1 or a2 is forced to be zero, to avoid e.g. a division-by-zero error, the use of undefined elements from a table, or any undefined behavior in a given implementation.
Similarly, if either diff1 or diff2 < 0, the slope of the corresponding line a1 or a2 is forced to be zero, as this situation could only arise if 'midY' (the average) is below min or above max, resulting in negative values for 'diff1' or 'diff2'. This situation does not represent a realistic model, so forcing a to be zero is a simple way of avoiding unrealistic models. An alternative consists in the following checks:
- If diff1 < minDist and diff2 < minDist, use a single linear model;
o This is equivalent to diff1 > minDist-1 or diff2 > minDist-1 not being satisfied.
- Otherwise, determine the two slopes for the piecewise modelling and:
o If diff1 < minDiff, a1 = a2;
o If diff2 < minDiff, a2 = a1;
o Derive b1 and b2 as expected.
A particularly advantageous embodiment uses values of 4 for minDiff and 9 for minDist. It should be noted that the abnormal case of avg falling outside the [min, max] range can therefore be handled in a number of ways. Another example is to condition the removal of any outlier on avg being within said range: if this were not the case, then none, or fewer, of the extrema are removed.
In practice, because of outlier samples, diff1 or diff2 can be below 0, or above the maximum difference handled by the table. In such a case, diff1 or diff2 can be forced to be within said range, e.g.: diff1 = clip(0, 511, diff1) diff2 = clip(0, 511, diff2) with clip the operation previously presented. It should be understood that this clipping may take into account either the size of the table(s) involved or obey another constraint on the range. For instance, the minimum and maximum values in the formulae above could also be 4 (e.g. if values below are undesirable) and 131 (e.g. if the table size is 128 and denominators below 4 are specifically excluded). In addition, as averages (or variations thereof) for luma and chroma have been computed for determining the knee point, even if the two-model variation were not to be used, the computation of b can also be modified. Instead of using: b = yA - ( ( a*xA ) >> S )
One can instead use: b = avgC - ( ( a*avgL ) >> S )
while keeping the alpha computation based on extremum points. In other embodiments, so as to better handle very skewed values of luma or chroma as may happen with e.g. screen content, the computation of b can depend on the chroma prediction mode:
- LM_CHROMA_IDX: b = ymid - ( ( a*xmid ) >> S )
- MDLM_L_IDX: b = yA - ( ( a*xA ) >> S )
- MDLM_T_IDX: b = yB - ( ( a*xB ) >> S )
Moving to Figure 12, the embodiments above can be illustrated in a combined and functional fashion. Figure 12 is very similar to Figure 8: steps 1211 to 1217 are new, while steps 1203, 1204 and 1205 are modified versions of 803 and 804. Finally, steps 1201, 1202 and 1206 are identical to 801, 802 and 806 respectively, and are therefore not described again.
When steps 1201 and 1202 are completed, the two extremum points are found in step 1203, as well as the averages avgL and avgC. To find whether to use a knee point, step 1211 checks whether the block satisfies a criterion. As seen already, that may be the block coding mode, or its number of samples. In the latter case, minimal sizes of 16 or 32 samples offer a good tradeoff. Note that the rationale for this size also depends on the minimal size of the non-MMLM CCLM modes. Indeed, they are simpler, i.e. they require fewer operations, so the smallest MMLM block size should incur an equivalent or smaller number of operations than all the CCLM-minimally-sized blocks equivalent to said smallest MMLM block size. For instance, if the minimum CCLM block size is 16, an MMLM mode (using any method described herein, including the 'knee point') should apply to blocks of at least 64 samples. A ratio of around 4 times more samples is often adequate, although ratios between 2 and 8 times are quite likely. If the knee point need not be used, normal CCLM operations (i.e. a single linear equation based on the extremum points) resume at step 1205. It should be understood that some a priori conditions such as the block size can also entirely avoid the determination of the averages, which would not be used when determining a single linear equation.
Resuming at step 1205, the computation of the slope is performed, per any of the methods described above or available to the person skilled in the art. In particular, this is the straight-line method using the 2 extremum points. The intercept can be computed as usual, or, instead of using the chroma of points A and B, using the averages avgL and avgC as b = avgC - a*avgL. This has the benefit that the true knee is used for computing the model parameters, and ensures better continuity between two adjacent models. Due to rounding, there may still be a small discontinuity as, for the luma value of the knee point, the two models may yield different chroma sample values. In such a case, the threshold value used to select either one of the two adjacent models can be adjusted, e.g. for a given threshold T, usually avgL (or the outlier-removed versions previously illustrated), by checking which of T-1, T and T+1 minimizes said discontinuity. This may also consist in computing the true intercept. Indeed, one may want: b1 + ( ( a1*threshold ) >> shift ) = b2 + ( ( a2*threshold ) >> shift ) and therefore: threshold = ( b2 - b1 ) / ( a1 - a2 )
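The threshold adjustment described above can be sketched in Python (a hypothetical helper; a1, a2 are fixed-point slopes with accuracy `shift`, and the candidate minimizing the chroma gap between the two models is retained):

```python
def refine_threshold(a1, b1, a2, b2, avg_l, shift):
    # Pick among T-1, T and T+1 (T = avgL) the threshold minimizing the
    # chroma discontinuity between the two adjacent linear models.
    def gap(t):
        return abs((b1 + ((a1 * t) >> shift)) - (b2 + ((a2 * t) >> shift)))
    return min((avg_l - 1, avg_l, avg_l + 1), key=gap)
```

This is the discrete counterpart of solving b1 + ((a1*T) >> shift) = b2 + ((a2*T) >> shift) for the true crossing point.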
The normal operation then ends on step 1206, as single model parameters have been found.
Moving back to step 1211, if the knee point can be used, the averages are further considered. In particular, step 1212 performs a linear combination of them and the extremum luma and chroma sample values. This may in particular consist in removing the latter from the averages, e.g. avgL = ( 6*avgL - xA - xB + 2 ) >> 2 and avgC = ( 6*avgC - yA - yB + 2 ) >> 2. As a consequence, at least three points (extrema A and B and at least one knee point T) have been determined in step 1204. As mentioned previously, in case more models are needed, more knee points can be computed, as illustrated earlier with the computations of Q1 and Q3.
These can then be used in step 1213 to determine whether two models can be used on the current block. This may consist in checking whether T is far enough from A and B, e.g. that the difference between the luma avgL of T and that of any extremum (i.e. xA or xB) is above a threshold. Such a check can be extended to any number of knee points. In another embodiment, this is simply whether the luma sample values of A and B are far enough apart. If the criterion is not satisfied, the knee point method may be considered useless, or even harmful, for prediction. In this case, normal CCLM operation resumes at step 1205.
Otherwise, the knee point can be used to compute two models, starting at step 1214. First, this step performs the computation of the slopes defined by the three points, using any method of the prior art, e.g. the straight-line one. Then, on step 1215, an optional operation corrects these slopes. This may involve using one slope instead of the other if the values used in the computation of the former may have issues. This can be, for instance, checking whether the luma difference, or chroma difference, between T and the extremum corresponding to the computed slope is below a threshold: quantization then had a non-negligible impact on said difference, rendering the computed slope very approximate, if not very wrong. Finally, now that the slopes are known, the intercepts can be computed. This may use any of the luma and chroma sample values of the three points. In the preferred embodiment, this uses the averages, as b1 = avgC - a1*avgL and b2 = avgC - a2*avgL. This finalizes the computation of the two models. Finally, on step 1217, the criterion to switch between the at least two models can be determined, usually luma thresholds. For two models, this is simply avgL: if the luma is below (or equal to) it, a1 and b1 are used, otherwise a2 and b2 are. If more knee points and models are determined, then the lumas of said knee points can be the thresholds for selecting a given set of parameters.
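The steps above can be sketched end-to-end in Python (a compact, floating-point sketch for clarity; a real codec would use the fixed-point table division, and the function name is ours):

```python
def derive_two_models(x_a, y_a, x_b, y_b, avg_l, avg_c):
    # Step 1212: remove the extrema A and B from the averages to get knee T,
    # using avg = (6*avg - extremum_sum + 2) >> 2 from the text.
    knee_l = (6 * avg_l - x_a - x_b + 2) >> 2
    knee_c = (6 * avg_c - y_a - y_b + 2) >> 2
    # Step 1214: one slope per segment, both lines passing through the knee.
    a1 = (knee_c - y_a) / (knee_l - x_a) if knee_l != x_a else 0.0
    a2 = (y_b - knee_c) / (x_b - knee_l) if x_b != knee_l else 0.0
    # Step 1216 (preferred embodiment): intercepts from the knee itself,
    # b1 = avgC - a1*avgL and b2 = avgC - a2*avgL, keeping the model continuous.
    b1 = knee_c - a1 * knee_l
    b2 = knee_c - a2 * knee_l
    # Step 1217: luma <= knee_l selects (a1, b1), otherwise (a2, b2).
    return (a1, b1), (a2, b2), knee_l
```

Degenerate segments (zero luma spread) get a zero slope here, matching the division-by-zero safeguard described earlier.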
Figure 14 shows another embodiment offering a different tradeoff between complexity and coding efficiency from that of Figure 12. Although computing averages is not very complex, it is still desirable to limit any such additional computation in the critical path, which concerns the smaller blocks. Accordingly, in Figure 14 the computation and use of the average values is explicitly made conditional on e.g. the number of samples in the block (and thus its size). Therefore, compared to Figure 12, complexity is reduced at the expense of a small loss in coding efficiency.
The following steps from Figure 12 are still present in Figure 14, namely 1201/1401, 1202/1402, 1211/1411, 1212/1412, 1204/1404, 1205/1405, 1206/1406, 1213/1413, 1214/1414, 1215/1415, 1216/1416, and 1217/1417. A first difference occurs in step 1403 where, in contrast to step 1203, only the extremum points are determined. The averages avgC and avgL are now computed in step 1420. This step is performed conditionally, i.e. only when the block satisfies the condition(s) (number-of-samples condition and/or size condition) checked in step 1411. The preferred criterion for the block size in this embodiment is that the block has more than 16 samples. An alternative criterion could be that the block has more than 32 samples.
If the block does not satisfy the size criterion on step 1411 , a regular straight line model is used. Step 1425, which is newly added compared to Figure 12, computes the slope a and intercept b (solely) based on the extremum points, specifically ignoring the aforementioned averages that have purposely not been computed. At the end 1426 of this path, the parameters for a single linear model have therefore been determined.
Later on during the processing, if the check in step 1413 is negative, the processing advances to step 1405 to compute the slope and intercept of a single straight line, similarly to step 1205, by using the averages as determined on step 1412. This path therefore ends on 1406, where the parameters for a single linear model have therefore been determined.
The advantage of these embodiments is a coding efficiency improvement.
The descriptions of these embodiments mention the luma and a chroma component but can easily be adapted to other components, such as both chroma components, or RGB components. According to an embodiment, the present invention is used when predicting a first chroma component sample value from a second chroma component. In another embodiment, the present invention is used when predicting a sample value of one component from more than one sample value of more than one component. It is understood that in such a case, the linear model is derived based on two points/sets, each point/set comprising a sample value of the one component and the more than one sample values of the more than one component. For example, if two components' sample values are used to predict the one component's sample value, each point/set can be represented as a position in a 3-dimensional space, and the linear model is based on a straight line passing through the two positions in the 3-dimensional space that correspond to the two points/sets of the reconstructed sample values.
Figure 13 is a schematic block diagram of a computing device 1300 for implementation of one or more embodiments of the invention. The computing device 1300 may be a device such as a micro-computer, a workstation or a light portable device. The computing device 1300 comprises a communication bus connected to:
- a central processing unit 1301 , such as a microprocessor, denoted CPU;
- a random access memory 1302, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method for encoding or decoding at least part of an image according to embodiments of the invention, the memory capacity thereof can be expanded by an optional RAM connected to an expansion port for example;
- a read only memory 1303, denoted ROM, for storing computer programs for implementing embodiments of the invention;
- a network interface 1304 is typically connected to a communication network over which digital data to be processed are transmitted or received. The network interface 1304 can be a single network interface, or composed of a set of different network interfaces (for instance wired and wireless interfaces, or different kinds of wired or wireless interfaces). Data packets are written to the network interface for transmission or are read from the network interface for reception under the control of the software application running in the CPU 1301 ;
- a user interface 1305 may be used for receiving inputs from a user or to display information to a user;
- a hard disk 1306, denoted HD, may be provided as a mass storage device;
- an I/O module 1307 may be used for receiving/sending data from/to external devices such as a video source or display.
The executable code may be stored either in read only memory 1303, on the hard disk 1306 or on a removable digital medium such as for example a disk. According to a variant, the executable code of the programs can be received by means of a communication network, via the network interface 1304, in order to be stored in one of the storage means of the communication device 1300, such as the hard disk 1306, before being executed.
The central processing unit 1301 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to embodiments of the invention, which instructions are stored in one of the aforementioned storage means. After powering on, the CPU 1301 is capable of executing instructions from main RAM memory 1302 relating to a software application after those instructions have been loaded from the program ROM 1303 or the hard-disc (HD) 1306 for example. Such a software application, when executed by the CPU 1301 , causes the steps of the method according to the invention to be performed.
Any step of the methods according to the invention may be implemented in software by execution of a set of instructions or program by a programmable computing machine, such as a PC (“Personal Computer”), a DSP (“Digital Signal Processor”) or a microcontroller; or else implemented in hardware by a machine or a dedicated component, such as an FPGA (“Field-Programmable Gate Array”) especially for the Minima and Maxima selection, or an ASIC (“Application-Specific Integrated Circuit”).
It is also to be noted that while some examples are based on HEVC for the sake of illustration, the invention is not limited to HEVC. For example, the present invention can also be used in any other prediction/estimation process where a relationship between two or more components’ sample values can be estimated/predicted with a model, wherein the model is an approximate model determined based on at least two sets of related/associated component sample values selected from all available sets of the related/associated component sample values. It is understood that each point corresponding to a sample pair (i.e. a set of associated sample values for different components) may be stored and/or processed in terms of an array. For example, each component’s sample values may be stored in an array so that each sample value of that component is referable/accessible/obtainable by referencing an element of that array, using an index for that sample value for example. Alternatively, an array may be used to store and process each sample pairs that each sample value of the sample pairs accessible/obtainable as an element of the array.
It is also understood that any result of comparison, determination, assessment, selection, or consideration described above, for example a selection made during an encoding process, may be indicated in or determinable from data in a bitstream, for example a flag or data indicative of the result, so that the indicated or determined result can be used in the processing instead of actually performing the comparison, determination, assessment, selection, or consideration, for example during a decoding process.
Although the present invention has been described hereinabove with reference to specific embodiments, the present invention is not limited to the specific embodiments, and modifications will be apparent to a skilled person in the art, which lie within the scope of the present invention.
Many further modifications and variations will suggest themselves to those versed in the art upon making reference to the foregoing illustrative embodiments, which are given by way of example only and which are not intended to limit the scope of the invention, that being determined solely by the appended claims. In particular the different features from different embodiments may be interchanged, where appropriate.
Each of the embodiments of the invention described above can be implemented solely or as a combination of a plurality of the embodiments. Also, features from different embodiments can be combined where necessary or where the combination of elements or features from individual embodiments in a single embodiment is beneficial.
Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
In the claims, the word“comprising” does not exclude other elements or steps, and the indefinite article“a” or“an” does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.

Claims

1. A method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising:
- determining three points;
- each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component;
- determining the parameters of two linear equations, a first equation representing a straight line passing through said first and second adjacent points, and the second linear equation passing through said second and third adjacent points,
- said second point representing an average of a selection of said sample values; and
- deriving a continuous linear model defined by the parameters of the said straight lines.
2. The method of claim 1 , wherein the points are determined based on sample pairs in the neighbourhood of the second-component block.
3. The method of claim 2 wherein said first and third points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second-component sample value from a selection of sample pairs.
4. The method of claim 2 or 3 wherein the selection of the sample pairs comprises a selection of sample pairs from one of: a top border, or a left border.
5. The method of any preceding claim wherein said second point representing an average of the sample values is calculated using a bitwise shift operation.
6. The method of claim 5 wherein said bitwise shift is by an amount M, where M is the largest power of two smaller or equal to the number of available sample pairs.
7. The method of claim 5 wherein said bitwise shift is by an amount nShift where nShift depends on the availability of neighbouring sample pairs.
8. The method of claim 7 wherein nShift is the largest power of two smaller than or equal to the number of available sample pairs, modified if either or both of the neighbouring samples to the left or to the top are unavailable.
9. The method of claim 8 wherein
nShift = Log2( nS ) + ( availL && availT ? 1 : 0 )
10. The method of any of claims 7 to 9 wherein the average is calculated by
avg = ( avg + ( 1 << ( nShift - 1 ) ) ) >> nShift
11. The method of claim 5 wherein the total number of samples comprises sample pairs not from a border with said block so as to create a power of two number of sample pairs.
12. The method of claim 11 comprising extending a border and selecting sample pairs from said extended border.
13. The method of claim 12 wherein extending the border comprises extending the border into a previously coded part of said frame.
14. The method of claim 12 or 13 wherein said selected sample pairs comprise the last 2^N sample pairs from said extended border, where N is used as the bitwise shift when calculating said second point representing said average.
15. The method of claim 5 comprising applying a scaling factor when calculating said second point representing said average.
16. The method of claim 15 wherein said scaling factor is a fraction approximately equal to 2/3.
17. The method of claim 16 wherein said scaling factor is 21/32.
18. The method of any of claims 15 to 17 further comprising applying a rounding parameter when calculating said second point representing said average.
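Claims 16 to 18 describe a fixed-point approximation of a 2/3 scaling factor. The sketch below is illustrative only: the function name and the choice of 16 (i.e. 2^4) as the default rounding parameter are assumptions; the 21/32 factor and the rounding parameter are from claims 17 and 18.

```python
def scaled_average(total, rounding=16):
    """Scale a sample sum by 21/32 (a fixed-point approximation of 2/3,
    claims 16-17), adding a rounding parameter (claim 18) before the
    right shift by 5 that implements the division by 32."""
    return (total * 21 + rounding) >> 5
```

Multiplying by 21 and shifting right by 5 keeps the whole computation in integer arithmetic, consistent with the shift-based averaging of claim 5.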
19. The method of any preceding claim wherein the values of the samples having the lowest and highest second-component sample values are excluded from the selection of the second-component sample values when calculating said second point representing said average.
20. The method of claim 19 wherein the calculated average is updated by: avg += (2*avg - min - max) >> N
or avg += (2*avg - min - max + 2^(N-1)) >> N
where N is a predetermined scaling parameter.
21. The method of claim 20 where N=2.
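The update of claims 20 and 21 corrects an already computed average so that the extreme (minimum and maximum) sample values carry less weight. A minimal sketch of both claimed variants, with hypothetical function and parameter names:

```python
def refined_average(avg, min_val, max_val, n=2, rounded=True):
    """Update an average to discount the lowest and highest sample
    values, per the two formulas of claim 20 (N = 2 per claim 21).
    With rounded=True the 2^(N-1) offset of the second formula is added
    before the right shift."""
    delta = 2 * avg - min_val - max_val
    if rounded:
        delta += 1 << (n - 1)  # rounding offset 2^(N-1)
    return avg + (delta >> n)
```

Since 2*avg - min - max compares the two extremes against twice the mean, the correction is positive when the extremes sit below the mean and negative when they sit above it, nudging the average toward the remaining samples.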
22. A device for encoding images, wherein the device comprises a means for deriving a continuous linear model according to any preceding claim.
23. A device for decoding images, wherein the device comprises a means for deriving a continuous linear model according to any preceding claim.
24. A computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to any preceding claim, when loaded into and executed by the programmable apparatus.
25. A computer-readable medium storing a program which, when executed by a microprocessor or computer system in a device, causes the device to perform a method according to any preceding claim.
26. A computer program which upon execution causes the method of any preceding claim to be performed.
PCT/EP2019/086658 2018-12-20 2019-12-20 Piecewise modeling for linear component sample prediction WO2020127956A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
GB1820861.1 2018-12-20
GB1820861.1A GB2580078A (en) 2018-12-20 2018-12-20 Piecewise modeling for linear component sample prediction
GB1900137.9A GB2581948A (en) 2018-12-20 2019-01-04 Piecewise modeling for linear component sample prediction
GB1900137.9 2019-01-04
GB1903756.3 2019-03-19
GB1903756.3A GB2580192A (en) 2018-12-20 2019-03-19 Piecewise modeling for linear component sample prediction

Publications (1)

Publication Number Publication Date
WO2020127956A1 true WO2020127956A1 (en) 2020-06-25

Family

ID=66381121

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2019/086658 WO2020127956A1 (en) 2018-12-20 2019-12-20 Piecewise modeling for linear component sample prediction

Country Status (2)

Country Link
GB (1) GB2580192A (en)
WO (1) WO2020127956A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023237808A1 (en) * 2022-06-07 2023-12-14 Nokia Technologies Oy A method, an apparatus and a computer program product for encoding and decoding of digital media content
WO2023242466A1 (en) * 2022-06-17 2023-12-21 Nokia Technologies Oy A method, an apparatus and a computer program product for video coding

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130188705A1 (en) * 2012-01-24 2013-07-25 Futurewei Technologies, Inc. Simplification of LM Mode
US9153040B2 (en) 2011-06-03 2015-10-06 Sony Corporation Image processing device and image processing method
US9288500B2 (en) 2011-05-12 2016-03-15 Texas Instruments Incorporated Luma-based chroma intra-prediction for video coding
US9462273B2 (en) 2011-11-04 2016-10-04 Huawei Technologies Co., Ltd. Intra-frame prediction and decoding methods and apparatuses for image signal
US9565428B2 (en) 2011-06-20 2017-02-07 Mediatek Singapore Pte. Ltd. Method and apparatus of chroma intra prediction with reduced line memory
US9736487B2 (en) 2013-03-26 2017-08-15 Mediatek Inc. Method of cross color intra prediction
US20180077426A1 (en) * 2016-09-15 2018-03-15 Qualcomm Incorporated Linear model chroma intra prediction for video coding

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2571312B (en) * 2018-02-23 2020-05-27 Canon Kk New sample sets and new down-sampling schemes for linear component sample prediction


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LAROCHE (CANON) G ET AL: "Non-CE3: On cross-component linear model simplification", no. JVET-K0204, 2 July 2018 (2018-07-02), XP030198720, Retrieved from the Internet <URL:http://phenix.int-evry.fr/jvet/doc_end_user/documents/11_Ljubljana/wg11/JVET-K0204-v1.zip JVET-K0204.docx> [retrieved on 20180702] *


Also Published As

Publication number Publication date
GB2580192A (en) 2020-07-15
GB201903756D0 (en) 2019-05-01

Similar Documents

Publication Publication Date Title
KR102412302B1 (en) Encoding and decoding method, encoding and decoding device, and computer-readable medium
GB2567249A (en) New sample sets and new down-sampling schemes for linear component sample prediction
WO2019162116A1 (en) New sample sets and new down-sampling schemes for linear component sample prediction
WO2019162118A1 (en) Methods and devices for linear component sample prediction using a double classification
WO2019162117A1 (en) Methods and devices for improvement in obtaining linear component sample prediction parameters
WO2020127956A1 (en) Piecewise modeling for linear component sample prediction
GB2580078A (en) Piecewise modeling for linear component sample prediction

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19832944

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19832944

Country of ref document: EP

Kind code of ref document: A1