GB2580078A - Piecewise modeling for linear component sample prediction - Google Patents

Piecewise modeling for linear component sample prediction

Info

Publication number
GB2580078A
Authority
GB
United Kingdom
Prior art keywords
component
point
sample
points
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1820861.1A
Other versions
GB201820861D0 (en)
Inventor
Gisquet Christophe
Laroche Guillaume
Taquet Jonathan
Onno Patrice
Current Assignee
Canon Inc
Original Assignee
Canon Inc
Priority date
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to GB1820861.1A priority Critical patent/GB2580078A/en
Priority to GB1900137.9A priority patent/GB2581948A/en
Publication of GB201820861D0 publication Critical patent/GB201820861D0/en
Priority to GB1903756.3A priority patent/GB2580192A/en
Priority to PCT/EP2019/086658 priority patent/WO2020127956A1/en
Publication of GB2580078A publication Critical patent/GB2580078A/en

Classifications

    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being a colour or a chrominance component
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame is derived. The derivation comprises determining at least three points (e.g. A, T and B), each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first component and the second component. Parameters, α and β, of each of a plurality of linear equations are determined, each equation representing a straight line passing through two adjacent points of said points, and a piecewise linear model defined by the parameters of the said straight lines is derived. Also claimed are methods in which different criteria are used to determine whether more than one linear equation should be used in determining the model. The methods may be used to predict chroma component samples from reconstructed luma component samples in video coding.

Description

PIECEWISE MODELING
FOR LINEAR COMPONENT SAMPLE PREDICTION
DOMAIN OF THE INVENTION
The present invention relates to the encoding or decoding of blocks of a given video component, in particular the intra prediction of such component blocks or obtaining the samples of such blocks. The invention finds applications in obtaining blocks of a component, typically blocks of a chroma component, of video data from samples of another component, typically luma samples.
BACKGROUND OF THE INVENTION
Predictive encoding of video data is based on the division of frames into blocks of pixels. For each block of pixels, a predictor block is searched for in available data. The predictor block may be a block in a reference frame different from the current one in INTER coding modes, or generated from neighbouring pixels in the current frame in INTRA coding modes. Different encoding modes are defined according to different ways of determining the predictor block. The result of the encoding is a signalling of the predictor block and a residual block consisting of the difference between the block to be encoded and the predictor block.
Regarding INTRA coding modes, various modes are usually proposed, such as a Direct Current (DC) mode, a planar mode and angular modes. Each of them seeks to predict samples of a block using previously decoded boundary samples from spatially neighbouring blocks.
The encoding may be performed for each component forming the pixels of the video data. Although RGB (for Red-Green-Blue) representation is well-known, the YUV representation is preferably used for the encoding to reduce the inter-channel redundancy. According to these encoding modes, a block of pixels may be considered as composed of several, typically three, component blocks. An RGB pixel block is composed of an R component block containing the values of the R component of the pixels of the block, a G component block containing the values of the G component of these pixels, a B component block containing the values of the B component of these pixels. Similarly, a YUV pixel block is composed of a Y component block (luma), a U component block (chroma) and a V component block (also chroma). Another example is YCbCr, where Cb and Cr are also known as chroma components. However, inter-component (also known as cross-component) correlation is still observed locally.
To improve compression efficiency, the usage of Cross-Component Prediction (CCP) has been studied in the state of the art. The main application of CCP concerns luma-to-chroma prediction. It means that the luma samples have already been encoded and reconstructed from encoded data (as the decoder does) and that chroma is predicted from luma. However, variants use CCP for chroma-to-chroma prediction or more generally for first-component to second-component prediction (including RGB).
The Cross-Component Prediction may apply directly to a block of chroma pixels or may apply to a residual chroma block (meaning the difference between a chroma block and a chroma block predictor).
The Linear Model (LM) mode uses a linear model to predict chroma from luma as a chroma intra prediction mode, relying on one or two parameters, slope (α) and offset (β), to be determined. The chroma intra predictor is thus derived from reconstructed luma samples of a current luma block using the linear model with the parameters.
The linearity, i.e. parameters α and β, is derived from the reconstructed causal samples, in particular from a neighbouring chroma sample set comprising reconstructed chroma samples neighbouring the current chroma block to predict and from a neighbouring luma sample set comprising luma samples neighbouring the current luma block.
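By way of illustration only, the derivation of the parameters α and β from the two extreme neighbouring sample pairs (the extremum-based derivation referred to later) may be sketched as follows; the function and variable names are hypothetical, and floating-point arithmetic is used for clarity, whereas a codec would use fixed-point integer arithmetic:

```python
def derive_lm_parameters(luma_neighbours, chroma_neighbours):
    """Return (alpha, beta) such that pred_chroma = alpha * rec_luma + beta.

    The parameters are fitted on the straight line through the neighbouring
    sample pairs with the minimum and maximum luma values (a sketch of the
    extremum-based derivation, not a normative description).
    """
    pairs = list(zip(luma_neighbours, chroma_neighbours))
    l_min, c_min = min(pairs, key=lambda p: p[0])
    l_max, c_max = max(pairs, key=lambda p: p[0])
    if l_max == l_min:                  # degenerate case: flat luma border
        return 0.0, c_min
    alpha = (c_max - c_min) / (l_max - l_min)
    beta = c_min - alpha * l_min
    return alpha, beta
```

For instance, with neighbouring luma values [10, 20, 30] and chroma values [15, 25, 35], this yields α = 1.0 and β = 5.0.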
Specifically, for an NxN chroma block, the N neighbours of the above row and the N neighbours of the left column are used to form the neighbouring chroma sample set for derivation.
The neighbouring luma sample set is also made of N neighbouring samples just above the corresponding luma block and N neighbouring samples on the left side of the luma block.
It is known to reduce the size of the video data to encode without significant degradation of visual rendering, by sub-sampling the chroma components. Known subsampling modes are labelled 4:1:1, 4:2:2, 4:2:0.
In the situation where the video chroma data are subsampled, the luma block corresponding to the NxN chroma block is bigger than NxN. In that case, the neighbouring luma sample set is down-sampled to match the chroma resolution.
The chroma intra predictor to predict the chroma samples in the current NxN chroma block has to be generated using the linear model with the one or more parameters α and β derived and the reconstructed luma samples of the current luma block that are previously down-sampled to match chroma resolution. The down-sampling of the reconstructed luma samples to chroma resolution makes it possible to retrieve the same number of samples as the chroma samples to form both the luma sample set and the chroma intra predictor. Furthermore, when the number of samples on the borders is not a power of 2, and operations, such as average computation, require divisions, further decimation of these border samples can allow use of a number of border samples that is a power of 2, for which divisions are less costly to implement.
The chroma intra predictor is thus subtracted from the current chroma block to obtain a residual chroma block that is encoded at the encoder. Conversely, at the decoder, the chroma intra predictor is added to the received residual chroma block in order to retrieve the chroma block, also known as reconstruction of the decoded block. This may also involve clipping for results of the addition going out of the sample range.
Sometimes, the residual chroma block is negligible and thus not considered during encoding. In that case, the above-mentioned chroma intra predictor is used as the chroma block itself. As a consequence, the above LM mode makes it possible to obtain a sample for a current block of a given component from an associated (i.e. collocated or corresponding) reconstructed sample of a block of another component in the same frame using a linear model with one or more parameters. The sample is obtained using the linear model with the one or more parameters derived and the associated reconstructed samples in the block of the other component. If needed, the block of the other component is made of samples down-sampled to match the block resolution of the current component. While the block of the current component is typically a chroma block and the block of the other component a luma block, this may not be the case. For the sake of clarity and simplicity, the examples given here focus on the prediction of a chroma block from a luma block, it should be clear that the described mechanism may apply to any component prediction from another component.
The Joint Exploration Model (JEM) of the Joint Video Exploration Team (JVET) adds six Cross-Component (luma-to-chroma) linear model modes to the conventional intra prediction modes already known. All these modes compete against each other to predict or generate the chroma blocks, the selection being usually made based on a rate-distortion criterion at the encoder end.
In VTM (JEM successor, currently developed by the JVET group for testing and defining the future VVC codec), there are currently three such modes, differing only in the sets of samples they use for determining the parameters. In all cases, said parameters are found by determining extremum points based on luma, as is later described.
For instance, the sample sets may be made of the two lines (i.e. rows and columns) of samples neighbouring the current luma or chroma block, these lines being parallel and immediately adjacent to each one of the top and/or left boundaries of the current luma or chroma block at chroma resolution. Such exemplary sample set is described in publication US 9,736,487.
Other exemplary sample sets are also disclosed in publications US 9,288,500 and US 9,462,273. The down-sampling schemes used in JEM include a 6-tap filter determining a down-sampled reconstructed luma sample from six reconstructed luma samples but also three 2-tap filters that select either the top right and bottom right samples from among the six reconstructed luma samples, or the bottom and bottom right samples, or the top and top right samples, and a 4-tap filter that selects the top, top right, bottom and bottom right samples of the six reconstructed luma samples.
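As an illustrative sketch of the 6-tap filter (assuming 4:2:0 sampling and ignoring border handling; the exact weights and phases of the JEM filter may differ, so the kernel below is an assumption):

```python
def downsample_luma_6tap(luma, x, y):
    """Down-sample one chroma-resolution position (x, y) from a
    full-resolution luma plane (a list of rows), combining six
    reconstructed luma samples with weights [1 2 1; 1 2 1] and rounding.
    Border handling is omitted for brevity."""
    r0, r1 = 2 * y, 2 * y + 1          # the two full-resolution luma rows
    c = 2 * x                          # centre full-resolution luma column
    s = (luma[r0][c - 1] + 2 * luma[r0][c] + luma[r0][c + 1]
         + luma[r1][c - 1] + 2 * luma[r1][c] + luma[r1][c + 1])
    return (s + 4) >> 3                # divide by 8 with rounding
```

On a flat plane the filter returns the input value unchanged, which is the expected behaviour of any normalised down-sampling kernel.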
SUMMARY OF THE INVENTION
The derivation of the linear model parameters for the computation of the chroma predictor block samples has shortcomings in that the model is at best approximate, and valid only over a limited sample range. In particular, the accuracy of LM in the middle of the range is often relatively poor. Indeed, these parameters are updated for every block based on the borders, which may result in an unstable model. As a consequence, it is desirable to improve the modelling, but without significantly deviating from the simplicity of the linear model and its parameter derivation.
The present invention has been devised to address one or more of the foregoing concerns. It concerns an improved method for obtaining a chroma sample for a current chroma block, possibly through chroma intra prediction.
According to one aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining M points, where M≥3; each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component; determining the parameters of a plurality of linear equations, each equation representing a straight line passing through two adjacent points of said M points, and deriving a continuous linear model defined by the parameters of the said straight lines.
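A minimal sketch of such a piecewise derivation follows (Python, hypothetical names; the points are assumed already sorted by second-component value, and fixed-point integer arithmetic is omitted):

```python
def derive_piecewise_model(points):
    """points: list of (second_component, first_component) tuples sorted by
    the second-component value, M >= 3. Returns one (upper_bound, alpha,
    beta) triple per segment between two adjacent points."""
    segments = []
    for (l0, c0), (l1, c1) in zip(points, points[1:]):
        alpha = (c1 - c0) / (l1 - l0) if l1 != l0 else 0.0
        beta = c0 - alpha * l0
        segments.append((l1, alpha, beta))   # segment valid for values <= l1
    return segments

def predict(segments, value):
    """Evaluate the piecewise linear model at a second-component value."""
    for bound, alpha, beta in segments:
        if value <= bound:
            return alpha * value + beta
    _, alpha, beta = segments[-1]            # extrapolate beyond last point
    return alpha * value + beta
```

Because adjacent segments share their end points, the resulting model is continuous at each intermediate ("knee") point.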
Optionally, the points are determined based on sample pairs in the neighbourhood of the second-component block. Optionally, the points are determined based on the sample values of the sample pairs in the neighbourhood of the second-component block.
Optionally, the first and Mth points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second-component sample value.
Optionally, said first and Mth points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second-component sample value from a selection of sample pairs.
Optionally, the selection of the sample pairs comprises a selection of sample pairs from one of: a top border, or a left border.
Optionally, the method comprises iteratively determining said first and Mth points when accessing said selection of sample pairs.
Optionally, the or each point from 2 to M-1 is derived from a mid-point of second-component values.
More than 1 knee point
Optionally, M>3 and each point from 2 to M-1 is distributed corresponding to a fractional mid-point. Optionally, each point from 2 to M-1 corresponds to a point having a second-component value nearest a value corresponding to said distribution of points between the first and Mth point.
Optionally, each point from 2 to M-1 corresponds to a point having a second-component value which differs from a value corresponding to an even distribution of points between the first and Mth point less than a threshold amount.
Optionally, each point corresponding to said distribution is determined iteratively.
1 knee point
Optionally, M=3, and the second point corresponds to the mid-point. For example, the first and third points correspond to points having the lowest and highest second-component values respectively, and the second point is a mid-point.
Optionally, the second point corresponds to a point having a second-component value nearest a mid-point of the second-component sample values.
Optionally, the second point corresponds to a point of which the second-component value differs from a mid-point of the second-component sample values less than a threshold amount.
Optionally, the point or points corresponding to the mid-point is determined iteratively.
Defining a target
Optionally, the mid-point is an average of a selection of second-component sample values.
Optionally, the values of the samples having the lowest and highest second-component sample values are ignored from the selection of the second-component sample values when determining the average.
Optionally, said highest and lowest values are ignored when updating a calculated average.
Optionally, the calculated average is updated by: avg += (2*avg - min - max) >> N, where N is a predetermined scaling parameter. Optionally, the calculated average is updated by: avg += (2*avg - min - max + 2^(N-1)) >> N, where N is a predetermined scaling parameter.
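The rounded variant of the update rule above can be sketched as follows (`avg`, `min` and `max` follow the names in the text, renamed to avoid shadowing Python built-ins; `n` is the scaling parameter):

```python
def update_average(avg, mn, mx, n):
    """Update a calculated average so that the influence of the lowest (mn)
    and highest (mx) sample values is reduced, using the rounded update
    rule: avg += (2*avg - min - max + 2^(n-1)) >> n."""
    return avg + ((2 * avg - mn - mx + (1 << (n - 1))) >> n)
```

For example, with avg = 100, min = 50, max = 130 and n = 3, the correction term is (200 - 50 - 130 + 4) >> 3 = 3, giving an updated average of 103; when min and max are symmetric about the average, the correction is zero.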
Optionally, N is dependent on the block size. Optionally, N is dependent on the number of samples used to compute the calculated average.
Optionally, the selection of sample pairs is from the sample pairs forming a top border of the block, or a left border of the block.
Optionally, the selection of sample pairs also includes a sample from a top-left neighbouring block.
Optionally, the total number of samples in said selection is a power of two.
Optionally, the total number of samples in said selection is greater than the minimum possible number of such samples.
Adaptive MMLM
Optionally, the method further comprises the step of determining a range of said second-component values, and determining the parameters of said straight lines in dependence on said range.
Optionally, said dependence is whether said range is greater than a threshold. Optionally, if said range between adjacent points is below a threshold, a parameter derived from a straight line between two different points is used. Optionally, said parameter comprises the slope.
Optionally, said parameter comprises the ordinate intercept.
Conditional MMLM/LM in the middle
According to another aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining M points, each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component, the first point corresponding to the point with the lowest second-component value, and the Mth point corresponding to the point with the highest second-component value; determining the parameters of at least one linear equation, each equation representing a straight line passing through two adjacent points of said M points, and deriving a linear model defined by the parameters of the or each straight line; wherein M=2 if the number of samples is less than a threshold, and wherein M>2 if the number of samples is greater than said threshold.
Optionally, the threshold is 16 samples.
Optionally, the threshold is 32 samples.
Optionally, the method further comprises, if the number of samples is greater than said threshold, performing any of the methods as described herein.
According to another aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining a difference between the second-component values corresponding to the two points having the largest and smallest second-component values; if said difference is lower than a threshold: determining the parameters of a linear equation, representing a straight line passing through said points, and deriving a linear model defined by the parameters of said straight line; if said difference is higher than the threshold: determining at least one further point between said two points having the largest and smallest second-component values; and determining the parameters of two linear equations, the first equation representing a straight line passing through the point having the smallest second-component value and an adjacent point; the second equation representing a straight line passing through the point having the largest second-component value and an adjacent point; and deriving a linear model defined by the parameters of said straight lines.
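A sketch of this conditional derivation follows (Python, hypothetical names; the choice of the further point is illustrative only, here simply the middle entry of the sorted point list):

```python
def derive_conditional_model(points, threshold):
    """points: (second_component, first_component) tuples sorted by the
    second-component value. If the second-component range is below the
    threshold, a single line through the extreme points is used; otherwise
    a further point splits the model into two segments. Each segment is
    returned as (upper_bound, alpha, beta)."""
    (l_min, c_min), (l_max, c_max) = points[0], points[-1]
    if l_max - l_min < threshold:
        alpha = (c_max - c_min) / max(l_max - l_min, 1)
        return [(l_max, alpha, c_min - alpha * l_min)]
    l_mid, c_mid = points[len(points) // 2]   # hypothetical further point
    a1 = (c_mid - c_min) / (l_mid - l_min)
    a2 = (c_max - c_mid) / (l_max - l_mid)
    return [(l_mid, a1, c_min - a1 * l_min),
            (l_max, a2, c_mid - a2 * l_mid)]
```

With a small range, this degenerates to the ordinary single-line LM derivation; with a large range, the two segments meet at the further point, so the model stays continuous.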
Optionally, the threshold depends on the bitdepth of the samples.
Optionally, for a bitdepth of 10, the threshold is between 48 and 96.
According to another aspect of the present invention there is provided a method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: determining three points, said three points comprising two points having the largest and smallest second-component values, and a third point between said two points; determining a difference between the point having the smallest second-component value and the third point; determining a difference between the point having the largest second-component value and the third point; if one or both of said differences are lower than a threshold: determining the parameters of a linear equation, representing a straight line passing through said two points having the largest and smallest second-component values, and deriving a linear model defined by the parameters of said straight line; if both of said differences are higher than the threshold: determining the parameters of two linear equations, the first equation representing a straight line passing through the point having the smallest second-component value and the third point; and the second equation representing a straight line passing through the point having the largest second-component value and the third point; and deriving a linear model defined by the parameters of said straight lines.
Optionally, the method further comprises, if said difference is higher than the threshold, performing any of the methods as described herein.
Devices
According to another aspect of the present invention there is provided a device for encoding images, wherein the device comprises a means for deriving a continuous linear model.
According to another aspect of the present invention there is provided a device for decoding images, wherein the device comprises a means for deriving a continuous linear model.
According to another aspect of the present invention there is provided a computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method as described herein, when loaded into and executed by the programmable apparatus.
According to another aspect of the present invention there is provided a computer-readable medium storing a program which, when executed by a microprocessor or computer system in a device, causes the device to perform a method as described herein.
According to another aspect of the present invention there is provided a computer program which upon execution causes the method as described herein to be performed.
At least parts of the methods according to the invention may be computer implemented. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "processor and a memory", "circuit", "module" or "system". Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Since the present invention can be implemented in software, the present invention can be embodied as computer readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible carrier medium may comprise a storage medium such as a hard disk drive, a magnetic tape device or a solid state memory device and the like. A transient carrier medium may include a signal such as an electrical signal, an electronic signal, an optical signal, an acoustic signal, a magnetic signal or an electromagnetic signal, e.g. a microwave or RF signal.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will now be described, by way of example only, and with reference to the following drawings in which:
Figure 1 illustrates a video encoder logical architecture;
Figure 2 illustrates a video decoder logical architecture corresponding to the video encoder logical architecture illustrated in Figure 1;
Figure 3 schematically illustrates examples of a YUV sampling scheme for 4:2:0 sampling;
Figure 4 illustrates, using a flowchart, general steps for generating a block predictor using the LM mode, performed either by an encoder or a decoder;
Figures 5A-5B schematically illustrate a chroma block and an associated or collocated luma block, with down-sampling of the luma samples, and neighbouring chroma and luma samples, as known in the prior art;
Figures 6A-6D illustrate exemplary coding of signalling flags to signal LM modes;
Figure 7 illustrates points of luma and chroma neighbouring samples and a straight line representing the linear model parameters;
Figure 8 illustrates the main steps of a process of a simplified LM derivation;
Figure 9 illustrates a combination of the MMLM mode plus parameters derivation through a straight line, including a discontinuity;
Figure 10 illustrates the concepts of a knee point and a target point to create piecewise (continuous) linear models;
Figure 11 presents example sets of sample pairs that help define the knee point or the target point;
Figure 12 illustrates the main steps of a process of a derivation in one embodiment of the invention; and
Figure 13 is a schematic block diagram of a computing device for implementation of one or more embodiments of the invention.
DETAILED DESCRIPTION OF EMBODIMENTS
Figure 1 illustrates a video encoder architecture. In the video encoder, an original sequence 101 is divided into blocks of pixels 102, called coding blocks or coding units in HEVC. A coding mode is then assigned to each block. There are two families of coding modes typically used in video coding: the coding modes based on spatial prediction, or "INTRA modes" 103, and the coding modes based on temporal prediction, or "INTER modes", based on motion estimation 104 and motion compensation 105.
An INTRA coding block is generally predicted from the encoded pixels at its causal boundary by a process called INTRA prediction. The predictor for each pixel of the INTRA coding block thus forms a predictor block. Depending on which pixels are used to predict the INTRA coding block, various INTRA modes are proposed: for example, DC mode, a planar mode and angular modes.
While Figure 1 is directed to a general description of a video encoder architecture, it is to be noted that a pixel corresponds here to an element of an image, that typically consists of several components, for example a red component, a green component, and a blue component. An image sample is an element of an image, which comprises only one component.
Temporal prediction first consists in finding in a previous or future frame, called the reference frame 116, a reference area which is the closest to the coding block in a motion estimation step 104. This reference area constitutes the predictor block. Next this coding block is predicted using the predictor block to compute the residue or residual block in a motion compensation step 105.
In both cases, spatial and temporal prediction, a residue or residual block is computed by subtracting the obtained predictor block from the coding block.
In the INTRA prediction, a prediction mode is encoded.
In the temporal prediction, an index indicating the reference frame used and a motion vector indicating the reference area in the reference frame are encoded.
However, in order to further reduce the bitrate cost related to motion vector encoding, a motion vector is not directly encoded. Indeed, assuming that motion is homogeneous, it is particularly advantageous to encode a motion vector as a difference between this motion vector, and a motion vector (or motion vector predictor) in its surroundings. In the H.264/AVC coding standard for instance, motion vectors are encoded with respect to a median vector computed from the motion vectors associated with three blocks located above and on the left of the current block. Only a difference, also called residual motion vector, computed between the median vector and the current block motion vector is encoded in the bitstream. This is processed in module "Mv prediction and coding" 117. The value of each encoded vector is stored in the motion vector field 118. The neighbouring motion vectors, used for the prediction, are extracted from the motion vector field 118.
The HEVC standard uses three different INTER modes: the Inter mode, the Merge mode and the Merge Skip mode, which mainly differ from each other by the signalling of the motion information (i.e. the motion vector and the associated reference frame through its so-called reference frame index) in the bit-stream 110.
For the sake of simplicity, motion vector and motion information are conflated below.
Regarding motion vector prediction, HEVC provides several candidates of motion vector predictor that are evaluated during a rate-distortion competition in order to find the best motion vector predictor or the best motion information for respectively the Inter or the Merge mode. An index corresponding to the best predictors or the best candidate of the motion information is inserted in the bitstream 110. Thanks to this signalling, the decoder can derive the same set of predictors or candidates and uses the best one according to the decoded index.
The design of the derivation of motion vector predictors and candidates contributes to achieving the best coding efficiency without large impact on complexity. Two motion vector derivations are proposed in HEVC: one for Inter mode (known as Advanced Motion Vector Prediction (AMVP)) and one for the Merge modes (known as Merge derivation process).
Next, the coding mode optimizing a rate-distortion criterion for the coding block currently considered is selected in module 106. In order to further reduce the redundancies within the obtained residue data, a transform, typically a DCT, is applied to the residual block in module 107, and a quantization is applied to the obtained coefficients in module 108. The quantized block of coefficients is then entropy coded in module 109 and the result is inserted into the bit-stream 110.
The encoder then performs decoding of each of the encoded blocks of the frame for the future motion estimation in modules 111 to 116. These steps allow the encoder and the decoder to have the same reference frames 116. To reconstruct the coded frame, each of the quantized and transformed residual blocks is inverse quantized in module 111 and inverse transformed in module 112 in order to provide the corresponding "reconstructed" residual block in the pixel domain. Due to the loss introduced by the quantization, this "reconstructed" residual block differs from the original residual block obtained at step 106.
Next, according to the coding mode selected at 106 (INTER or INTRA), this "reconstructed" residual block is added to the INTER predictor block 114 or to the INTRA predictor block 113, to obtain a "pre-reconstructed" block (coding block).
Next, the "pre-reconstructed" blocks are filtered in module 115 by one or several kinds of post filtering to obtain "reconstructed" blocks (coding blocks). The same post filters are integrated at the encoder (in the decoding loop) and at the decoder to be used in the same way in order to obtain exactly the same reference frames at encoder and decoder ends. The aim of this post filtering is to remove compression artefacts.
Figure 2 illustrates a video decoder architecture corresponding to the video encoder architecture illustrated in Figure 1.
The video stream 201 is first entropy decoded in a module 202. Each obtained residual block (coding block) is then inverse quantized in a module 203 and inverse transformed in a module 204 to obtain a "reconstructed" residual block. This is similar to the beginning of the decoding loop at the encoder end.
Next, according to the decoding mode indicated in the bitstream 201 (either INTRA type decoding or an INTER type decoding), a predictor block is built.
In case of INTRA mode, an INTRA predictor block is determined 205 based on the INTRA prediction mode specified in the bit-stream 201.
In case of INTER mode, the motion information is extracted from the bitstream 201 during the entropy decoding 202. The motion information is composed, for example in HEVC and JVET, of a reference frame index and a motion vector residual.
A motion vector predictor is obtained in the same way as done by the encoder (from neighbouring blocks) using already computed motion vectors stored in motion vector field data 211. It is thus added 210 to the extracted motion vector residual block to obtain the motion vector. This motion vector is added to the motion vector field data 211 in order to be used for the prediction of the next decoded motion vectors.
The motion vector is also used to locate the reference area in the reference frame 206 which is the INTER predictor block.
Next, the "reconstructed" residual block obtained at 204 is added to the INTER predictor block 206 or to the INTRA predictor block 205, to obtain a "pre-reconstructed" block (coding block) in the same way as the decoding loop of the encoder.
Next, this "pre-reconstructed" block is post filtered in module 207 as done at the encoder end (signalling of the post filtering to use may be retrieved from bitstream 201).
A "reconstructed" block (coding block) is thus obtained which forms the decompressed video 209 as the output of the decoder.
The above-described encoding/decoding process may be applied to monochrome frames. However, most common frames are colour frames generally made of three arrays of colour samples, each array corresponding to a "colour component", for instance R (red), G (green) and B (blue). A pixel of the image comprises three collocated/corresponding samples, one for each component.
R, G, B components have usually high correlation between them. It is thus very common in image and video compression to decorrelate the colour components prior to processing the frames, by converting them in another colour space. The most common format is the YUV (YCbCr) where Y is the luma (or luminance) component, and U (Cb) and V (Cr) are chroma (or chrominance) components.
To reduce the amount of data to process, some colour components of the colour frames may be subsampled, resulting in having different sampling ratios for the three colour components. A subsampling scheme is commonly expressed as a three part ratio J:a:b that describes the number of luma and chroma samples in a conceptual 2-pixel-high region. 'J' defines the horizontal sampling reference of the conceptual region (i.e. a width in pixels), usually 4. 'a' defines the number of chroma samples (Cr, Cb) in the first row of J pixels, while 'b' defines the number of (additional) chroma samples (Cr, Cb) in the second row of J pixels.
With the subsampling schemes, the number of chroma samples is reduced compared to the number of luma samples.
The 4:4:4 YUV or RGB format does not provide subsampling and corresponds to a non-subsampled frame where the luma and chroma frames have the same size W x H. The 4:0:0 YUV or RGB format has only one colour component and thus corresponds to a monochrome frame.
Exemplary sampling formats are the following.
The 4:2:0 YUV format has half as many chroma samples as luma samples in the first row, and no chroma samples in the second row. The two chroma frames are thus W/2 pixels wide and H/2 pixels high, where the luma frame is W x H. The 4:2:2 YUV format has half as many chroma samples as luma samples, in both the first row and the second row. The two chroma frames are thus W/2 pixels wide and H pixels high, where the luma frame is W x H. The 4:1:1 YUV format has 75% fewer chroma samples than luma samples, in both the first row and the second row. The two chroma frames are thus W/4 pixels wide and H pixels high, where the luma frame is W x H. When subsampled, the positions of the chroma samples in the frames are shifted compared to the luma sample positions.
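As an illustration, the mapping from a J:a:b scheme to the chroma plane dimensions described above can be sketched as follows (a minimal sketch; the function name and signature are only for illustration):

```python
def chroma_dimensions(width, height, j, a, b):
    """Chroma plane size for a J:a:b subsampling scheme, given a W x H luma plane.

    'a' chroma samples per 'j' luma samples horizontally; the vertical
    resolution is halved when the second row carries no chroma samples (b == 0).
    """
    chroma_w = width * a // j
    chroma_h = height if b > 0 else height // 2
    return chroma_w, chroma_h

# 4:2:0 -> W/2 x H/2, 4:2:2 -> W/2 x H, 4:1:1 -> W/4 x H, 4:4:4 -> W x H
```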
Figure 3 illustrates an exemplary positioning of chroma samples (triangles) with respect to luma samples (circles) for a 4:2:0 YUV frame.
The encoding process of Figure 1 may be applied to each colour-component frame of an input frame.
Due to correlations between the colour components (between RGB or remaining correlations between YUV despite the RGB-to-YUV conversion), Cross-Component Prediction (CCP) methods have been developed to exploit these (remaining) correlations in order to improve coding efficiency.
CCP methods can be applied at different stages of the encoding or the decoding process, in particular either at a first prediction stage (to predict a current colour component) or at a second prediction stage (to predict a current residual block of a component).
One known CCP method is the LM mode, also referred to as CCLM (Cross-Component Linear Model prediction). It is used to predict both chroma components Cb and Cr (or U and V) from the luma Y, more specifically from the reconstructed luma (at the encoder end or at the decoder end). One predictor is generated for each component. The method operates at a (chroma and luma) block level, for instance at CTU (coding tree unit) level, CU (coding unit) level, PU (prediction unit) level, sub-PU level or TU (transform unit) level.
Figure 4 illustrates as an example, using a flowchart, general steps for generating a block predictor using the LM mode, performed either by the encoder (used as reference below) or the decoder.
In the description below, an exemplary first component is chroma while an exemplary second component is luma.
Considering a current chroma block 502 (Figure 5A) to encode or decode and its associated or corresponding (i.e. "collocated") luma block 505 (i.e. of the same CU for instance) in the same frame, the encoder (or the decoder) receives, in step 401, a neighbouring luma sample set RecL comprising luma samples 503 neighbouring the current luma block, and receives a neighbouring chroma sample set RecC comprising chroma samples 501 neighbouring the current chroma block, denoted 402. It is to be noted that for some chroma sampling formats and chroma phases, the luma samples 504 and 503 are not directly adjacent to luma block 505 as depicted in Figure 5A. For example in Figure 5A, to obtain the left row RecL' (503), only the second left row is needed and not the direct left row. In the same way, for the up line 504, the second up line is also considered for the down-sampling of luma samples, as depicted in Figure 5A.
When a chroma sampling format is used (e.g. 4:2:0, 4:2:2, etc.), the neighbouring luma sample set is down-sampled at step 403 into RecL' 404 to match chroma resolution (i.e. the sample resolution of the corresponding chroma frame/block). RecL' thus comprises reconstructed luma samples 504 neighbouring the current luma block that are down-sampled. Thanks to the down-sampling, RecL' and RecC comprise the same number 2N of samples (chroma block 502 being N x N). Yet, particular down-samplings of the luma border exist in the prior art where fewer samples are needed to obtain RecL'. In addition, even if RecL and RecC have the same resolution, RecL' can be seen as a denoised version of RecL, through the use of a low-pass convolution filter.
In the example of Figure 5A, the neighbouring luma and chroma sample sets are made of the down-sampled top and left neighbouring luma samples and of the top and left neighbouring chroma samples, respectively. More precisely each of the two sample sets is made of the first line immediately adjacent to the left boundary and the first line immediately adjacent to the top boundary of their respective luma or chroma block. Due to down-sampling (4:2:0 in Figure 5A), the single line of neighbouring luma samples RecL' is obtained from two lines of non down-sampled reconstructed luma samples RecL (left or up).
US 9,565,428 suggests using sub-sampling which selects a single sample, only for the up line (i.e. adjacent to the top boundary of the luma block) and not for the luma block itself (as described below with reference to step 408). The proposed sub-sampling is illustrated in Figure 6A. The motivation for this approach is to reduce the line buffer of the up line.
The linear model, which is defined by one or two parameters (a slope α and an offset β), is derived from RecL' (if any, otherwise RecL) and RecC. This is step 405, performed to obtain the parameters 406.
The LM parameters α and β are obtained using a least mean square-based method using the following equations:

α = ( M·Σᵢ RecCᵢ·RecL'ᵢ − Σᵢ RecCᵢ · Σᵢ RecL'ᵢ ) / ( M·Σᵢ RecL'ᵢ² − (Σᵢ RecL'ᵢ)² ) = A1 / A2

β = ( Σᵢ RecCᵢ − α·Σᵢ RecL'ᵢ ) / M

where M is a value which depends on the size of the block considered. In general cases of square blocks as shown in Figures 5A and 5B, M = 2N. However, the LM-based CCP may apply to any block shape where M is for instance the sum of the block height H plus the block width W (for a rectangular block shape).
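The least mean square derivation just given can be sketched as follows (floating-point for readability; actual codecs use the integer variants discussed further below):

```python
def lms_parameters(rec_l, rec_c):
    """Least-squares derivation of the linear model C = alpha * L + beta
    from neighbouring reconstructed luma (RecL') and chroma (RecC) samples."""
    m = len(rec_l)
    sum_l = sum(rec_l)
    sum_c = sum(rec_c)
    sum_lc = sum(l * c for l, c in zip(rec_l, rec_c))
    sum_ll = sum(l * l for l in rec_l)
    a1 = m * sum_lc - sum_c * sum_l          # numerator
    a2 = m * sum_ll - sum_l * sum_l          # denominator
    alpha = a1 / a2 if a2 != 0 else 0.0      # flat model when luma is constant
    beta = (sum_c - alpha * sum_l) / m
    return alpha, beta
```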
It is to be noted that the value of M used as a weight in this equation may be adjusted to avoid computational overflows at the encoder and decoder. To be precise, when using arithmetic with 32-bit or 64-bit signed architectures, some of the computations may sometimes overflow and thus cause unspecified behaviour (which is strictly prohibited in any cross platform standard). To face this situation, the maximum magnitude possible given inputs RecL and RecC values may be evaluated, and M (and in turn the sums above) may be scaled accordingly to ensure that no overflow occurs.
The derivation of the parameters is usually made from the sample sets RecL' and RecC shown in Figure 5A.
Variations of the sample sets have been proposed.
For instance, US 9,288,500 proposes three competing sample sets, including a first sample set made of the outer line adjacent to the top boundary and the outer line adjacent to the left boundary, a second sample set made of only the outer line adjacent to the top boundary and a third sample set made of only the outer line adjacent to the left boundary. These three sample sets are shown in Figure 6B for the chroma block only (and thus can be transposed to the luma block).
US 9,462,273 extends the second and third sample sets to additional samples extending the outer lines (usually doubling their length). The extended sample sets are shown in Figure 6C for the chroma block only. This document also provides a reduction in the number of LM modes available in order to decrease the signalling costs for signalling the LM mode used in the bitstream. The reduction may be contextual, for instance based on the Intra mode selected for the associated luma block.
US 9,736,487 proposes three competing sample sets similar to those of US 9,288,500 but made, each time, of the two lines of outer neighbouring samples parallel and immediately adjacent to the boundaries considered. These sample sets are shown in Figure 6D for the chroma block only.
Also, US 9,153,040 and the documents of the same patent family propose additional sample sets made of a single line per boundary, with fewer samples per line than the previous sets. In VVC, the cross-component modes are shown in Fig. 6C, the latter two referred to as "directional modes". However, as the blocks can be rectangle-shaped, the number of samples on the border may not be a power of 2. As previously described, this makes the implementation more troublesome. Consequently, one in N samples is used on the bigger border, such that the total number of samples is a power of 2. The first sample set corresponds to the INTRA_LT_CCLM ("left"+"top") mode, the second to INTRA_L_CCLM ("left") and finally INTRA_T_CCLM ("top").
Back to the process of Figure 4, using the linear model with the one or more derived parameters 406, a chroma intra predictor 413 for chroma block 502 may thus be obtained from the reconstructed luma samples 407 of the current luma block represented in 505. Again if a chroma sampling format is used (e.g. 4:2:0, 4:2:2, etc.), the reconstructed luma samples are down-sampled at step 408 into L' 409 to match chroma resolution (i.e. the sample resolution of the corresponding chroma frame/block).
The same down-sampling as for step 403 may be used, or another one for a line buffer reason. For instance, a 6-tap filter may be used to provide the down-sampled value as a weighted sum of the top left, top, top right, bottom left, bottom and bottom right samples surrounding the down-sampling position. When some surrounding samples are missing, a mere 2-tap filter is used instead of the 6-tap filter.
Applied to reconstructed luma samples L, the output L' of an exemplary 6-tap filter is obtained as follows:

L'[i,j] = ( 2×L[2i,2j] + 2×L[2i,2j+1] + L[2i−1,2j] + L[2i+1,2j] + L[2i−1,2j+1] + L[2i+1,2j+1] + 4 ) >> 3

with (i,j) being the coordinates of the sample within the down-sampled block and >> being the bit-right-shifting operation.
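A direct transcription of this 6-tap filter might look as follows (a sketch assuming a plain 2D list of reconstructed luma samples; boundary clipping for missing neighbours is omitted):

```python
def downsample_6tap(L, i, j):
    """6-tap down-sampling of reconstructed luma: the two centre samples are
    weighted twice, the four diagonal neighbours once, then the sum is
    rounded (+4) and right-shifted by 3 (division by 8)."""
    return (2 * L[2 * i][2 * j] + 2 * L[2 * i][2 * j + 1]
            + L[2 * i - 1][2 * j] + L[2 * i + 1][2 * j]
            + L[2 * i - 1][2 * j + 1] + L[2 * i + 1][2 * j + 1] + 4) >> 3
```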
Thanks to down-sampling step 408, the L' and C blocks (the set of chroma samples in chroma block 502) comprise the same number N² of samples (chroma block 502 being N x N). Next, each sample of the chroma intra predictor PredC 413 is calculated using the loop 410-411-412 following the formula

PredC[i,j] = α·L'[i,j] + β

with (i,j) being the coordinates of all samples within the chroma and luma blocks.
To avoid divisions and multiplications, the computations may be implemented using less complex methods based on look-up tables and shift operations. For instance, the actual chroma intra predictor derivation 411 may be done as follows:

PredC[i,j] = ( A·L'[i,j] ) >> S + β

where S is an integer and A is derived from A1 and A2 (introduced above when computing α and β) using the look-up table mentioned previously. It actually corresponds to a rescaled value of α. The operation (x >> S) corresponds to the bit-right-shifting operation, equivalent to an integer division of x (with truncation) by 2^S.
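A sketch of this shift-based predictor computation over a block (A, S and β are assumed to be already derived; note that for negative products Python's >> floors rather than truncates, whereas an actual codec specifies the rounding exactly):

```python
def chroma_predictor_int(l_prime, A, S, beta):
    """PredC[i][j] = ((A * L'[i][j]) >> S) + beta over a 2D block,
    where A is a rescaled integer slope and >> S replaces the division."""
    return [[((A * v) >> S) + beta for v in row] for row in l_prime]
```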
When all samples of the down-sampled luma block have been parsed (412), the chroma intra predictor 413 is available for subtraction from chroma block 502 (to obtain a chroma residual block) at the encoder end or for addition to a chroma residual block (to obtain a reconstructed chroma block) at the decoder end.
Note that the chroma residual block may be insignificant and thus discarded, in which case the obtained chroma intra predictor 413 directly corresponds to predicted chroma samples (forming chroma block 502).
Both standardization groups ITU-T VCEG (Q6/16) and ISO/IEC MPEG (JTC 1/SC 29/WG 11) which have defined the HEVC standard are studying future video coding technologies for the successor of HEVC in a joint collaboration effort known as the Joint Video Exploration Team (JVET). The Joint Exploration Model (JEM) contained HEVC tools and new added tools selected by this JVET group. In particular, this reference software contains some CCP tools, as described in document JVET-G1001. This model has been superseded by the VTM, currently described in document JVET-L1001-v3.
In addition to the previously described variations around CCLM, another mode is being studied in JVET.
Compared to CCLM, this so-called 'multiple model' MMLM mode uses two linear models. The neighbouring reconstructed luma samples from the RecL' set and the neighbouring chroma samples from the RecC set are classified into two groups, each group being used to derive the parameters α and β of one linear model, thus resulting in two sets of linear model parameters (α1,β1) and (α2,β2).
Currently, a threshold is calculated as the average value of the neighbouring reconstructed luma samples forming RecL'. Next, a neighbouring luma sample with RecL'[i,j] ≤ threshold is classified into group 1, while a neighbouring luma sample with RecL'[i,j] > threshold is classified into group 2.
Next, the chroma intra predictor (or the predicted chroma samples for current chroma block 602) is obtained according to the following formulas:

PredC[i,j] = α1·L'[i,j] + β1, if L'[i,j] ≤ threshold
PredC[i,j] = α2·L'[i,j] + β2, if L'[i,j] > threshold

The CCLM or MMLM mode has to be signalled in the bitstream 110 or 201. Figure 8 illustrates an exemplary LM mode signalling of JEM. A first binary flag indicates whether the current block is predicted using an LM mode or other intra modes, including so-called DM modes. In case of LM mode, the six possible LM modes need to be signalled. The first MMLM mode (using the 6-tap filter) is signalled with one second binary flag set to 1. This second binary flag is set to 0 for the remaining modes, in which case a third binary flag is set to 1 to signal the CCLM mode and is set to 0 for the remaining MMLM modes. Two additional binary flags are then used to signal one of the four remaining MMLM modes.
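The two-model prediction described above can be sketched as follows (a hypothetical helper; the threshold is taken as the integer mean of the neighbouring luma samples, and models are given as (alpha, beta) pairs):

```python
def mmlm_predict(l_prime, neighbours_luma, models):
    """Multi-model LM: neighbouring samples are split around their average
    luma value (the threshold); model 1 applies to values <= threshold,
    model 2 to values above it."""
    threshold = sum(neighbours_luma) // len(neighbours_luma)
    (a1, b1), (a2, b2) = models
    return [[a1 * v + b1 if v <= threshold else a2 * v + b2 for v in row]
            for row in l_prime]
```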
One mode is signalled for each chroma component.
The Cb-to-Cr CCLM mode introduced above is used in DM modes, and applies at residual level. Indeed, a DM mode uses for chroma the intra mode which was used by luma in a predetermined location. Traditionally, a coding mode like HEVC uses one single DM mode, co-located with the top-left corner of the CU. Without going into too many details, and for the sake of clarity, JVET provides several such locations. This mode is then used to determine the prediction method, therefore creating a usual intra prediction for a chroma component which, when subtracted from the reference/original data, yields the aforementioned residual data. The prediction for the Cr residual is obtained from the Cb residual (ResidualCb below) by the following formula:

PredCr[i,j] = α·ResidualCb[i,j]

where α is derived in a similar way as in the CCLM luma-to-chroma prediction. The only difference is the addition of a regression cost relative to a default α value in the error function, so that the derived scaling factor is biased towards a default value of −0.5, as follows:

α = ( M·Σᵢ RecCbᵢ·RecCrᵢ − Σᵢ RecCbᵢ·Σᵢ RecCrᵢ + λ·(−0.5) ) / ( M·Σᵢ RecCbᵢ² − (Σᵢ RecCbᵢ)² + λ )

where RecCbᵢ represents the values of neighbouring reconstructed Cb samples, RecCrᵢ represents the neighbouring reconstructed Cr samples, and λ = ( Σᵢ RecCbᵢ·RecCbᵢ ) >> 9.
The LM modes currently present (or proposed) in VVC suffer from inaccurate modelling, while the current (or proposed) MMLM mode suffers from a high complexity (due to the least-squares approach to fitting a linear model).
A proposed alternative LM replaces the derivation of a single linear model, used to compute chroma predictor block samples from luma block samples, with a determination of the linear model parameters based on the equation of a straight line. The straight line is defined by two points, themselves defined from reconstructed sample pairs in the neighbourhood of the block.
Figure 7 illustrates the principle of this method by considering here the minimum and the maximum of luma sample values in the set of sample pairs in the neighbourhood of the current block. All the sample pairs are drawn on the figure according to their chroma value and their luma value. Two different points, namely point A and point B, are identified on the figure, each point corresponding to a sample pair. Point A corresponds to the sample pair with the lowest luma value xA from RecL' and yA its collocated chroma value from RecC. Point B corresponds to the sample pair with the highest luma value xB and yB its collocated chroma value.
Figure 8 shows a flow chart of a method to derive the linear model parameters shown in Figure 7. This flow chart is a simplified version of Figure 4. The method is based on the neighboring luma samples RecL' obtained in step 801 and chroma samples RecC obtained in step 802.
In a step 803, the two points A and B (804) corresponding to two sample pairs are determined. In a first embodiment, these two points A and B correspond to the sample pairs with respectively the lowest and highest luma sample values xA and xB, with their corresponding chroma sample values yA and yB.
Then the straight line equation which crosses the points A and B is computed in a step 805 according to the following equations:

α = ( yB − yA ) / ( xB − xA )
β = yA − α·xA

The obtained α, β are the linear model parameters 806 used to generate the chroma predictor.
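The two-point derivation can be sketched as follows (floating-point for readability; a guard is added for the degenerate case xB = xA, which the equations above leave undefined):

```python
def straight_line_parameters(point_a, point_b):
    """Slope and offset of the line through A (lowest luma) and B (highest
    luma): alpha = (yB - yA) / (xB - xA), beta = yA - alpha * xA."""
    x_a, y_a = point_a
    x_b, y_b = point_b
    if x_b == x_a:                    # degenerate case: flat model
        return 0.0, float(y_a)
    alpha = (y_b - y_a) / (x_b - x_a)
    beta = y_a - alpha * x_a
    return alpha, beta
```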
The linear model derivation based on the LMS algorithm used in the prior art has a certain complexity. In this known method, the computation of the α parameter of the model is obtained by the following equation:

α = ( M·Σᵢ RecCᵢ·RecL'ᵢ − Σᵢ RecCᵢ·Σᵢ RecL'ᵢ ) / ( M·Σᵢ RecL'ᵢ² − (Σᵢ RecL'ᵢ)² ) = ( B1 − B2 ) / ( B3 − B4 )

The analysis of this equation regarding the computation complexity gives the following results. The computation of B1 requires M+1 multiplications and M sums, M being the number of sample pairs. The computation of B2 requires 1 multiplication and 2M sums. The computation of B3 requires M+1 multiplications and M sums, and the computation of B4 requires one multiplication and 2M sums. The computation of α, corresponding to (B1 − B2)/(B3 − B4), requires two additional sums and one division.
To compute β, one multiplication, 2M+1 sums and one division are required. As described previously, M is the number of pairs of samples RecCᵢ and RecL'ᵢ.
The complexity of the LMS derivation of α and β is therefore (2M + 2 + 2) multiplications, (7M + 3) additions and two divisions.
In comparison, the analysis of the method based on the computation of the equation of a straight line using only two points gives the following results. As reported, the derivation step 805 requires only one multiplication, three sums and one division. This large complexity reduction in generating the linear model parameters is a major advantage of the proposed invention. It should be noted that the search for the minimum and maximum values has a complexity of its own, typically related to sorting algorithms. The operation is not completely serial: N points can be compared to N other points, generating N minimum/maximum candidates. Then N/2 minimum and N/2 maximum points can be compared to the N/2 others, then again N/4 and so on until only the desired numbers of minimum and maximum points remain. Typically, the search for the minimum and maximum thus results in approximately 2*N−2 comparisons (N−1 for each).
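A sequential sketch with the same total comparison count (each sample pair after the first is compared to both the running minimum and the running maximum, i.e. 2(N−1) comparisons; the tree-structured reduction described above parallelizes the same work):

```python
def find_extremum_pairs(pairs):
    """Return the sample pairs (luma, chroma) with minimum and maximum
    luma value, using about 2 * (N - 1) comparisons in a single scan."""
    p_min = p_max = pairs[0]
    for p in pairs[1:]:
        if p[0] < p_min[0]:
            p_min = p
        if p[0] > p_max[0]:
            p_max = p
    return p_min, p_max
```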
As already described, the chroma predictor can be calculated with an integer multiplication and a shift instead of a floating-point multiplication, and a division when computing the slope. This simplification consists in replacing:

PredC(i,j) = α·rec'L(i,j) + β

by:

PredC(i,j) = ( L·rec'L(i,j) ) >> S + β

To use only integer multiplication and shift, in one embodiment, the straight line equation is obtained as follows:

S = 10
L = ( ( yB − yA ) << S ) / ( xB − xA )
β = yA − ( ( L·xA ) >> S )

Please note that β refers to this last equation in the following if α is replaced by L and S; otherwise it refers to the traditional equation β = yA − α·xA.
Another advantage of this derivation is that the shift value S always has the same value. This is interesting especially for hardware implementation that can be made simpler in taking advantage of this property.
In yet another embodiment, the value of S is forced to be low, as L could be large and require larger multiplier operations. Indeed, a multiplication of an 8-bit value by an 8-bit value is much easier to implement than e.g. an 8×16 multiplier. Typical practical values for L are often equivalent to a multiplier of less than 8 bits.
A particular implementation is known as fixed point: for every value of D = (xB − xA), possibly quantized (e.g. the results for 2D+0 and 2D+1 are stored as a single one), the value of (1 << S)/D is stored in a table. Preferably, entries are stored only for the positive values, as the sign can be easily retrieved. Using an array TAB, the computation of L thus becomes:

L = ( yB − yA ) * TAB[ abs(xB − xA)/Q ]    if xB − xA ≥ 0
L = −( yB − yA ) * TAB[ abs(xB − xA)/Q ]   otherwise

Q controls the quantization and thus the number of elements in the table. Using Q=1 thus means no quantization. Also note that the looked-up index can instead be (abs(xB − xA) + R)/Q, typically with R = Q/2, or a variation thereof of the division rounding. Consequently, Q is ideally a power of 2 so that the division by Q = 2^P is equivalent to a right-shift by P. Finally, some of the values in that table may not be equal to (1 << S)/D: low values of abs(xB − xA) or abs(yB − yA) often result in very bad estimations of L. Pre-determined, or explicit (such as in the slice header or a parameter set such as the PPS or SPS) values can be used then. For instance, for all values of D below 4, the array TAB may contain a default value, e.g. (1 << S)/8.
For 10-bit content and Q=1, up to 2048 entries in the array are needed. By exploiting the symmetry with sign as shown above, this can be reduced to 1024. Further increasing Q would similarly reduce the size of TAB.
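A sketch of this fixed-point variant (the values of S, Q and the default entry are illustrative, not mandated by the description above):

```python
S = 10          # fixed shift of the integer slope L
Q = 2           # index quantization step (a power of 2)
MAX_D = 1024    # abs(xB - xA) < 1024 for 10-bit content

# TAB[d // Q] approximates (1 << S) / d for positive d; entry 0 holds a
# default value, since very small luma spreads give unreliable slopes.
TAB = [(1 << S) // 8] + [(1 << S) // (d * Q) for d in range(1, MAX_D // Q)]

def slope_from_table(x_a, y_a, x_b, y_b):
    """L = +/-(yB - yA) * TAB[abs(xB - xA) // Q]; only positive divisors
    are stored in TAB, the sign being restored from the luma difference."""
    d = abs(x_b - x_a)
    l = (y_b - y_a) * TAB[d // Q]
    return l if x_b - x_a >= 0 else -l
```

The predictor is then computed as (L·rec'L >> S) + β as described above.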
It should be appreciated that the parameters of the straight line may be derived in a number of different ways, with certain methods (for example, using tables or estimations of the denominator of α) being particularly well-suited to hardware implementation. Any such modifications could be combined with the present invention without the need for any structural modification.
Similarly, the selection of the maximum and minimum points (A and B) may vary depending on the implementation and such modifications could be combined with the present invention without the need for any structural modification.
Figure 9 illustrates a proposed way of combining the MMLM mode with the straight line method described above. Given the threshold value Y_mean, two segments are thus defined:
- A is the sample pair having the smallest luma sample value, and C is the one having the largest luma sample value below Y_mean;
- B is the sample pair having the largest luma sample value, and D is the one having the smallest luma sample value above Y_mean.
Given these specific two pairs of points, two models can be determined by computing their parameters, respectively (α1,β1) and (α2,β2), as the slope and intercept of respectively the dashed line and the dotted line passing through respectively A and C, or D and B. Element 804 circles a discontinuity of the models centered around Y_mean.
However, the accuracy of this method relies heavily on the selection of points C and D, with minor differences potentially making a large impact on the parameters α1, α2, β1, or β2. Furthermore, this method introduces a discontinuity which may yield incorrect or inaccurate predictions for values immediately on either side of the discontinuity. The present invention seeks to improve the situation in terms of coding efficiency and/or computational complexity.
Figure 10 first illustrates the concept of a 'target'. The goal is to find T, indicated by an arrow to a mid-point between points A and B -this point is termed a 'knee point' and is defined by a point which is, or best matches, a particular target. Once the knee point has been determined, the dashed and dotted line equations relating to a linear model linking adjacent points A -> T and T -> B respectively can be determined, and thus a continuous linear model can be determined from the parameters of the two lines linking the three points A->T->B. Such a method provides an improvement in accuracy by utilizing multiple models, while maintaining the simplicity of the LM, and avoiding the problems arising from a discontinuity. In such a way, efficient coding can be achieved.
In a first embodiment, determining the knee point comprises finding the sample pair which minimizes the distance (or has a distance below a threshold) between its luma sample value and a target luma sample.
In one example, the target is the average value Ymean calculated over all sample pairs considered.
However, one issue with determining such a value as Ymean in order to find T is that the modeling requires an additional stage and buffering:
- First, go through all samples to find Ymean;
- Once this is done, go through them again to find the point with a Y value closest to Ymean (T).
One or the other can be performed while searching for the extremum points (A and B in Figure 10), but the other cannot. As a consequence, determining the target may require additional processing steps and as such it would be preferable to perform it while or before finding the extremum points.
One first set of embodiments relies on an iterative search of the target point.
Referring to Figure 11 and its illustrative samples, several embodiments can be illustrated. Examples for determining a "target" can be defined as:
- The average of any of the luma of the A to I sample pairs;
- When going through the top border 801 to find the extremum points, the average of any of the luma of the A to D sample pairs as well as I;
- Similarly, for the left border 802, sample pairs E to H.
In a preferred embodiment for iterative searches, a luma average is first computed using points on each side of a first border (e.g. the top). Ideally, the number of points is a (small, such as 2/4/8) power of 2, as the average would require a division (as opposed to a bit-shift) if it were not a power of 2. Embodiments include:
- For the INTRA_LT_CCLM ("left"+"top") mode:
o For the top border, using A and B;
o For the left border, using E and F.
- For the INTRA_L_CCLM ("left") mode, using A and D;
- For the INTRA_T_CCLM ("top") mode, using E and H.
If the processing starts with the top border, then the aforementioned average for the top border is used to find the T point whose luma is nearest to it.
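One way to picture the single-pass idea above is the following sketch; it is a simplification under stated assumptions (a fixed target computed from the first power-of-2 sample pairs, then extremum and knee search in the same loop), and `pairs` is a hypothetical list of (luma, chroma) tuples:

```python
def scan_border(pairs, avg_subset_log2=1):
    """Single pass over border sample pairs: find extrema and the pair
    whose luma is nearest to a target computed by bit-shift average."""
    n = 1 << avg_subset_log2             # power-of-2 subset: average by shift
    acc = sum(p[0] for p in pairs[:n])   # assumes len(pairs) >= n
    target = acc >> avg_subset_log2      # luma average used as the target
    lo = hi = knee = pairs[0]
    for p in pairs:
        if p[0] < lo[0]:
            lo = p                       # new minimum-luma pair
        if p[0] > hi[0]:
            hi = p                       # new maximum-luma pair
        if abs(p[0] - target) < abs(knee[0] - target):
            knee = p                     # pair nearest the (fixed) target
    return lo, hi, knee, target
```

The target here is deliberately not updated while iterating, matching the remark below that updating it on every new extremum can cause convergence problems.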
To process the second border, two scenarios apply. The first is when the first border is not available (i.e. the extremum and T points are undefined), which can be the case at the frontier of a slice or when constrained intra prediction is used. The former splits the image according to image or tile boundaries and another criterion, such as the number of CTUs or the coded bitstream size, while the latter is an error-resilience tool which forbids intra prediction from using data coming from temporally predicted data (such as sample values), thereby breaking the dependence on possibly corrupted or missing data. In such cases, the average on said border for one of all the CCLM modes is used. Otherwise, the "middle" point between the A and B points as found on the previous border is used.
In all cases, it should be noted that the average or middle point found is not necessarily updated when iterating over sample pairs of a border, while the minimum and maximum points are. In another embodiment, the target for the T point is updated when a new minimum or maximum point is found. As this may cause a convergence problem, like getting caught in a local minimum, further criteria can be applied to decide when to update said target: if enough sample pairs have been investigated, or if the change in the target is above a threshold, then this is a strong enough hint that the values are safe to update. This is particularly true at the start of the iterative search, when the distance between the current minimum and maximum is too small, which may cause the target to be set too close to either one, preventing fast updates afterwards.
In addition to the previous set of iterative embodiments, one can forego the search for the T point best matching the target. Instead, this point can be defined through a combination of several sample pairs. In a set of embodiments, T is the point whose luma and chroma are the averages of the luma and chroma sample values of a set of sample pairs. As explained when defining the target point, this set preferably contains a power-of-2 number of pairs for easier computation of the average. In the extreme case, all sample pairs that are currently used are used for the average.
However, a particularly advantageous embodiment takes into account that, if the model has a strong mismatch in the middle, and it is derived from the extremum points, these extremum points are outliers. It is therefore beneficial to remove them from the averages computed. A potential drawback is then that the number of sample pairs used in the computation of averages is no longer a power of 2. A first embodiment therefore adds two additional sample pairs to compensate for the extremum ones to be removed. The locations of these depend on the sample set used. However, there is a risk that these are actually outliers too. Therefore, in one embodiment, a new average is computed by removing the extremum points, for example by using the following formula:
avg = ( (2^N + 2)*avg - min - max + 2^(N-1) ) » N
or alternatively
avg += (2*avg - min - max + 2^(N-1)) » N
where the updating of a variable, avg = avg + val, is sometimes represented as avg += val.
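The two formulas above can be sketched in integer arithmetic as follows, assuming 2^N sample pairs were originally averaged (the function names are illustrative only):

```python
def remove_extrema_from_avg(avg, mn, mx, n):
    """Re-weighted average with the min and max pairs removed,
    rounded with 2**(n-1)."""
    return (((1 << n) + 2) * avg - mn - mx + (1 << (n - 1))) >> n

def remove_extrema_incremental(avg, mn, mx, n):
    """Equivalent 'avg += ...' form from the text; relies on Python's
    right shift being a floor division by 2**n."""
    return avg + ((2 * avg - mn - mx + (1 << (n - 1))) >> n)
```

Both forms are equivalent for integer inputs, since shifting out a multiple of 2^N (the 2^N * avg term) leaves the remainder's floor untouched.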
In the two above computations of avg, N is a predetermined value, possibly related to the number of sample pairs having been selected to compute the average. In a more generic fashion, the value of N may depend on the block size, e.g. if there are 2^P samples used on the border(s), then N is available from a table indexed with P. Part or all of the table may be such that, for a given P, the corresponding value is such that N < P. Furthermore, the average may be rounded by using a value other than 2^(N-1). An example is simply foregoing the rounding, for the above embodiment, or any following:
avg = avg + (2*avg - min - max) » N
Multiple 'knee points'
A variation which may improve the accuracy of the prediction (at a cost of increased complexity) consists in increasing the number of points (M) used to generate the continuous model (M=3 in the simplest embodiment). In such a way, multiple 'knee points' are generated, which allows for a more accurate model.
In one example, the additional 'knee points' correspond to fractional mid-points (e.g. the mid-point multiplied by a rational number) which are distributed based on the calculated mid-point (which, as described above, may account for the extremum points being outliers). It should be noted that some of the points calculated may fall outside the range of the maximum and minimum values -in particular if the midpoint is calculated ignoring these values.
In such an example, a series of weighted averages can be determined using the following generalized formula:
avg = ( (2^N - A - B)*avg + A*min + B*max + 2^(N-1) ) » N
In some embodiments, A+B < 2^N so that the true average still has a positive contribution to the above weighted average. The final value for avg is thus a weighted average between the minimum, maximum and true average values, allowing various segmentations of the sample value range.
This formula can be used to generate additional knee/target points for either the iterative search embodiments or the ones above. For instance, the following knee points are below the middle, but still differ from the first quarter of the range:
Q1 = 2*avg - max
Q1 = (avg + min + 1) » 1
Conversely, a knee/target point above the middle can be defined as:
Q3 = 2*avg - min
Q3 = (avg + max + 1) » 1
The above expressions for Q1 and Q3 illustrate other embodiments that do not fit the generalized formula. One can therefore deduce that knee point computations may involve at least two of the average, minimum and maximum points, and that even the average is a convenient value, but could be computed on even fewer sample values (e.g. any of A to I in Figure 11).
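A minimal sketch of the generalized formula and of the Q1/Q3-style targets, under the assumption that all quantities are non-negative integers (names are illustrative):

```python
def weighted_knee(avg, mn, mx, a, b, n):
    """Generalized weighted average: weights a and b shift the knee
    toward the minimum or maximum; rounding term is 2**(n-1)."""
    return (((1 << n) - a - b) * avg + a * mn + b * mx + (1 << (n - 1))) >> n

def knee_below_middle(avg, mn):
    return (avg + mn + 1) >> 1   # midway between average and minimum

def knee_above_middle(avg, mx):
    return (avg + mx + 1) >> 1   # midway between average and maximum
```

With a = b = 0 the formula degenerates to the true average, while nonzero weights carve the luma range into more than two segments.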
From this, it can be understood that other-than-middle knee points can be defined, and that they can be used to segment the luma range into more segments than the two previously illustrated.
Conditional use of linear models
For all previous embodiments, the application of all or parts of the parameters derivation can further be subject to particular conditions. Indeed, the proposed MMLM technique is a separate mode; instead, and as made evident in the previous paragraphs, it can be applied to any CCLM mode. However, coding efficiency gains and complexity reduction can be achieved by making the use of such modes conditional. For instance, finding and using the T point / the knee can be made conditional on any or several of the following.
The two-model (i.e. continuous piecewise linear model) case can be restricted to the chroma modes within a subset of all the cross-component chroma modes (say INTRA_LT_CCLM, or 2 out of 6, if there were to be 6 such modes).
Another embodiment concerns the block size or number of samples. This embodiment is particularly beneficial if the luma range over the borders is large, which happens more frequently on larger blocks. However, deriving two models can be seen as more complex than one, and this worst-case situation can be restricted to blocks that are easier complexity-wise. All implementations of either any MMLM mode (Figure 9) or embodiments of the continuous piecewise straight lines method (Figure 10) have been found to offer significant improvement to coding performance over the simple LM model (Figure 7) when implemented on blocks of strictly more than 16 samples (compared to an absolute minimum number of samples in a chroma block of 4). Improved coding performance over the simple LM model (Figure 7) can also be achieved by restricting the use of either any MMLM mode (Figure 9) or embodiments of the continuous piecewise straight lines method (Figure 10) to blocks of strictly more than 32 samples.
Typical block sizes that can be excluded are thus 2x2, 2x4, 4x2, 4x4, 8x2 and 2x8 if the restriction is for >16 samples.
Typical block sizes that can be excluded are 4x8, 8x4, 2x16 and 16x2 if the restriction is for >32 samples.
The piecewise modelling, or MMLM, can be included as separate modes. These modes can be already known modes with the addition of the knee point search, or modes using e.g. new sets of samples on the border, different luma downsampling filters, or modes using multiple models differing from the piecewise modelling, etc. It has been found that certain modes do not provide gains (or create losses) for some block sizes. As a consequence, when the coding mode of a block is signaled, the signaling may also be conditioned by the block size. Such an example would be to use a unary max code adapted to the number of LM modes. This is for example a series of one to two Context-based Adaptive Binary Arithmetic Coding (CABAC)-coded flags for 3 modes, where '0' means mode 0, '10' mode 1 and '11' mode 2.
Equivalently, if N modes are available (i.e. because the block is large enough), N modes can be indicated using between 1 and N-1 flags. Reordering of the coding modes, or not including some of the single-model ones may happen, e.g. codeword '10' previously meaning "mode 1", may instead mean "mode 1 using piecewise modelling", "110" may mean "mode 2 using piecewise modelling", "1110" may mean "mode 3", which can be a MMLM mode, "1111" may mean "mode 1", and therefore "mode 2" cannot be used (nor signaled) in this case.
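The unary max binarization of the preceding paragraphs can be sketched as follows; CABAC context modelling is out of scope here, and only the codeword construction from the '0'/'10'/'11' example is shown:

```python
def unary_max_encode(mode, n_modes):
    """Binarize mode index 0..n_modes-1 with a unary-maximum code:
    mode m is m ones followed by a zero, except the last mode,
    which needs no terminating zero."""
    if mode < n_modes - 1:
        return '1' * mode + '0'
    return '1' * mode
```

With 3 modes this reproduces the example codewords, and with N modes it uses between 1 and N-1 flags, as stated above.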
The number of samples on the border however depends on the border availability and decimation scheme used to guarantee that a power-of-2 number of samples is used. As such, some larger blocks which do not have an available border, or use certain decimation schemes would also be excluded.
In general, in the above embodiment, the MMLM mode or continuous piecewise straight lines method is used in preference to a single LM method in dependence on the block size and/or the number of samples.
A further class of embodiments concerning the adaptive use of a piecewise model (i.e. either the MMLM model shown in Figure 9, or the continuous 'knee point' model shown in Figure 10) is via a determination of the luma ranges of each segment: - If the luma range between the extremum points is too low, then the single line model is more likely valid in the middle of that range. For 10 bits, a good threshold on the distance between maximum and minimum luma values, for using a single line, is between 48 and 96.
- If one of the ranges between an extremum and the luma of T is below a threshold, then the alpha parameter from the other range is selected, meaning the parameters are (α1, β1) and (α1, β2), or (α2, β1) and (α2, β2). A good condition is the distance between T and the concerned extremum being below 6, though any value below that still works.
A particular case is if one, or both, luma distances from each extremum to T is not above a threshold, then the model can be skewed. Indeed, this means, first, that the average is very close to one of the extrema, which is then not an outlier. Second, if the luma distances are small, then quantization errors may strongly affect the terms used in computing the model parameters, and thus cause very large relative changes. One such case is a value of 2 for the luma distance: even an error of 1 will cause a value of 1/2 to be used instead of 1/3 or 1/1. Third, this means that the prediction block may use as input sample values that are far outside the observed luma range on the borders, and the latter may inadequately represent the former.
A particularly advantageous embodiment according to this minimal distance follows. Given the following computations:
shift = ( BitDepthC > 8 ) ? BitDepthC - 9 : 0 (8-153)
add = shift ? 1 « ( shift - 1 ) : 0 (8-154)
with BitDepthC the bitdepth of the component to predict, the following terms are computed:
diff1 = ( maxY - midY + add ) » shift
diff2 = ( midY - minY + add ) » shift
The condition becomes:
- If diff1 > 8 and diff2 > 8, then apply the two-model approach,
- Otherwise, use the traditional one-model approach:
diff = ( maxY - minY + add ) » shift
It should be noted that 'diff' represents the denominator of the respective slope α. The use of 'shift' (which is dependent on the bitdepth of the sample) essentially restricts the total number of values this difference can take to a constant (irrespective of the bitdepth). In the example above, the number of values is 512, but could be lowered to 256 or 128 by increasing the value of shift by one or two respectively. Such values of diff (or values derived from 'diff') can then be stored in a table to eliminate the requirement of calculation each time. One example of a value derived from 'diff' is a 'division' value as follows:
div = ( ( maxC - avgC ) * ( Floor( 2^32 / diff ) - Floor( 2^16 / diff ) * 2^16 ) + 2^15 ) » 16
where the 'Floor' function outputs the greatest integer less than or equal to the input. The shift value of 16 may be termed 'K' and represents a division accuracy, and may be reduced so that each value stored in the table takes up less memory, at a cost of lower accuracy of division.
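The range-reduction and two-model condition above can be sketched as follows; this is a simplified model of the pseudo-code, not normative text, and the function name is illustrative:

```python
def two_model_condition(max_y, mid_y, min_y, bitdepth):
    """Bitdepth-dependent range reduction: diff1/diff2 always span a
    fixed number of values regardless of bitdepth. Returns True when
    the two-model approach should be applied."""
    shift = bitdepth - 9 if bitdepth > 8 else 0
    add = (1 << (shift - 1)) if shift else 0   # rounding term for the shift
    diff1 = (max_y - mid_y + add) >> shift
    diff2 = (mid_y - min_y + add) >> shift
    return diff1 > 8 and diff2 > 8             # else: single-model fallback
```

For 10-bit content shift is 1, so both differences are halved (with rounding) before being compared against the threshold of 8.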
The optional 'add' parameter ensures that the value of 'diff' is rounded following the shift operation. This can make subsequent division operations less complex.
The threshold for the minimal distance between max/min and mid-point (8 in the above example) is set to be a significant portion of the range of values of 'diff'. As such, a different threshold may be appropriate depending on the range and/or the variance of the sample values in the range. In particular, if the value of 'shift' is increased (as discussed above), the threshold should be lowered.
It should be noted that if either diff1 or diff2 = 0, the slope of the line α1 or α2 is forced to be zero to avoid e.g. a division by zero error, use of undefined elements from a table, or any undefined behavior for a given implementation.
Similarly, if either diff1 or diff2 < 0, the slope of the line α1 or α2 is forced to be zero, as this situation could only arise if 'midY' (the average) is below min or above max, resulting in negative values for 'diff1' or 'diff2'. This situation does not represent a realistic model, so forcing α to be zero represents a simple way of avoiding unrealistic models. An alternative element consists in the following checks:
- If diff1 < minDist and diff2 < minDist, use a single linear model;
o This is equivalent to diff1 > minDist-1 or diff2 > minDist-1 not being satisfied.
- Otherwise, determine the two slopes for the piecewise modelling and:
o If diff1 < minDiff, α1 = α2;
o If diff2 < minDiff, α2 = α1;
o Derive β1 and β2 as expected.
A particularly advantageous embodiment uses values of 4 for minDiff and 9 for minDist. It should be noted that the abnormal case of avg falling outside of the [min, max] range can therefore be handled in a number of ways. Another example is to condition the removal of any outlier on avg being within said range: if this were not the case, then none, or fewer, extrema are removed.
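Putting the minDist/minDiff checks together gives the following sketch, using the example values from the text (minDist = 9, minDiff = 4); slope1 and slope2 stand for slopes obtained by any derivation method, e.g. the straight-line one:

```python
MIN_DIST, MIN_DIFF = 9, 4

def derive_slopes(diff1, diff2, slope1, slope2):
    """Returns None when a single linear model should be used instead,
    otherwise the (possibly corrected) pair of slopes."""
    if diff1 < MIN_DIST and diff2 < MIN_DIST:
        return None                  # both ranges too small: single model
    a1, a2 = slope1, slope2
    if diff1 < MIN_DIFF:
        a1 = a2                      # reuse the more reliable slope
    if diff2 < MIN_DIFF:
        a2 = a1
    return a1, a2
```

The intercepts β1 and β2 would then be derived as usual from the retained slopes.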
In addition, as averages (or variations thereof) for luma and chroma have been computed for determining the knee point, even if the two-model variation were not to be used, the computation of β can also be modified. Instead of using:
β = yA - ((α*xA) » s)
One can instead use:
β = ymid - ((α*xmid) » s)
while keeping the alpha computation based on extremum points. In other embodiments, so as to better handle very skewed values of luma or chroma as may happen with e.g. screen content, the computation of β can depend on the chroma prediction mode:
- LM_CHROMA_IDX: β = ymid - ((α*xmid) » s)
- MDLM_L_IDX: β = yA - ((α*xA) » s)
- MDLM_T_IDX: β = yB - ((α*xB) » s)
Moving to Figure 12, the embodiments above can be illustrated in a combined and functional fashion. Figure 12 is very similar to Figure 8: steps 1211 to 1217 are new, while steps 1203, 1204 and 1205 are modified versions of 803 and 804. Finally, steps 1201, 1202 and 1206 are identical to respectively 801, 802 and 806, and therefore are not described again.
When steps 1201 and 1202 are completed, the two extremum points are found on step 1203, as well as the averages avgL and avgC. To find whether to use a knee point, step 1211 checks whether the block satisfies a criterion. As seen already, that may be the block coding mode, or its number of samples. In the latter case, minimal sizes of 16 or 32 samples offer a good tradeoff. If the knee point need not be used, normal CCLM operations (i.e. a single linear equation based on the extremum points) resume on step 1205. It should be understood that some a priori conditions, such as the block size, can also entirely avoid the determination of the averages, which would not be used when determining a single linear equation.
Resuming on step 1205, the computation of the slope is performed, per any of the methods described above, or available to the person skilled in the art. In particular, this is the straight line method using the 2 extremum points. The intercept can be computed as usual, or, instead of using the chroma of points A and B, from the averages avgL and avgC as β = avgC - α*avgL. This has the benefit that the true knee is used for computing the model parameters, and ensures better continuity between two adjacent models. Due to rounding, there may still be a small discontinuity as, for the luma value of the knee point, the two models may yield different chroma sample values. In such a case, the threshold value to select either one of the two adjacent models can be adjusted, e.g. for a given threshold T, usually avgL (or the outlier-removed versions previously illustrated), by checking which of T-1, T and T+1 minimizes said discontinuity. This may also consist in computing the true intercept. Indeed, one may want:
β1 + ( (α1*threshold) » shift ) = β2 + ( (α2*threshold) » shift )
and therefore:
threshold = (β2 - β1) / (α1 - α2)
The normal operation then ends on step 1206, as single model parameters have been found.
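The 'true intercept' threshold above can be sketched as follows, in floating point for clarity (a codec would keep the fixed-point shift):

```python
def crossing_threshold(a1, b1, a2, b2):
    """Luma value where the lines a1*x + b1 and a2*x + b2 intersect,
    i.e. the solution of b1 + a1*t == b2 + a2*t. Switching models at
    this threshold minimizes the discontinuity between them."""
    return (b2 - b1) / (a1 - a2)
```

At the returned threshold both models yield the same chroma value, which is the continuity condition the adjustment seeks.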
Moving back to step 1211, if the knee point can be used, the averages are further considered. In particular, step 1212 performs a linear combination of them and the extremum luma and chroma sample values. This may in particular consist in removing the extremum values from the averages, e.g. avgL = (6*avgL - xA - xB + 2) » 2 and avgC = (6*avgC - yA - yB + 2) » 2. As a consequence, at least three points (extrema A and B and at least one knee point T) have been determined on step 1204. As mentioned previously, in case more models are needed, more knee points can be computed, as illustrated earlier with the computations of Q1 and Q3.
These can then be used on step 1213 to determine whether two models can be used on the current block. This may consist in checking whether T is far enough from A and B, e.g. that the difference between the luma avgL of T and that of any extremum (i.e. xA or xB) is above a threshold. Such a check can be extended to any number of knee points. In another embodiment, this is simply whether the luma sample values of A and B are far enough apart. If the criterion is not satisfied, the knee point method may be considered as useless, or even harmful, for prediction. In this case, normal CCLM operation resumes on step 1205.
Otherwise, the knee point can be used to compute two models starting on step 1214. First, this step performs the computation of the slopes defined by the three points, using any method of the prior art, e.g. the straight line one. Then, on step 1215, an optional operation corrects these slopes. This may involve using one slope instead of another if the values used in the computation of the former may have issues. This can be for instance checking whether the luma difference, or chroma difference, of T and the extremum corresponding to the computed slope is below a threshold: quantization had a non-negligible impact on said difference, rendering the computed slope very approximate, if not very wrong. Finally, now that the slopes are known, the intercepts can be computed. This may use any of the luma and chroma sample values of the three points. In the preferred embodiment, this uses the averages as β1 = avgC - α1*avgL and β2 = avgC - α2*avgL. This finalizes the computation of the two models. Finally, on step 1217, the criterion to switch between one among the at least two models can be determined, usually luma thresholds. For two models, this is simply avgL: if the luma is below (or equal), α1 and β1 are used, otherwise α2 and β2 are. If more knee points and models are determined, then the luma of said knee points can be the thresholds for which to select a given set of parameters.
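Steps 1214 to 1217 can be summarized by the following sketch (floating point for clarity; names are illustrative, and the optional slope correction of step 1215 is omitted):

```python
def build_two_models(A, T, B, avg_l, avg_c):
    """A, T, B are (luma, chroma) points; avg_l/avg_c are the (possibly
    outlier-removed) luma and chroma averages. Intercepts are taken from
    the averages, b = avgC - a*avgL, as in the preferred embodiment."""
    (xa, ya), (xt, yt), (xb, yb) = A, T, B
    a1 = (yt - ya) / (xt - xa)          # slope of segment A -> T (step 1214)
    a2 = (yb - yt) / (xb - xt)          # slope of segment T -> B
    b1 = avg_c - a1 * avg_l             # intercepts from the averages (step 1216)
    b2 = avg_c - a2 * avg_l
    return (a1, b1), (a2, b2), avg_l    # avg_l acts as the luma threshold (step 1217)

def predict_chroma(luma, models):
    """Select the model by comparing the luma against the threshold."""
    (a1, b1), (a2, b2), thr = models
    return a1 * luma + b1 if luma <= thr else a2 * luma + b2
```

When T itself is the point (avg_l, avg_c), both segments meet exactly at the threshold, giving the continuity described above.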
The advantage of these embodiments is a coding efficiency improvement.
The descriptions of these embodiments mention the luma and a chroma component but can easily be adapted to other components such as both chroma components, or RGB components. According to an embodiment, the present invention is used when predicting a first chroma component sample value from a second chroma component. In another embodiment, the present invention is used when predicting a sample value of one component from more than one sample value of more than one component. It is understood that in such a case, the linear model is derived based on two points/sets, each point/set comprising a sample value of the one component, and the more than one sample values of the more than one component. For example, if two components' sample values are used to predict the one component's sample value, each point/set can be represented as a position in a 3-dimensional space, and the linear model is based on a straight line passing through the two positions in the 3-dimensional space that correspond to the two points/sets of the reconstructed sample values.
Figure 13 is a schematic block diagram of a computing device 1300 for implementation of one or more embodiments of the invention. The computing device 1300 may be a device such as a micro-computer, a workstation or a light portable device. The computing device 1300 comprises a communication bus connected to: - a central processing unit 1301, such as a microprocessor, denoted CPU; - a random access memory 1302, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method for encoding or decoding at least part of an image according to embodiments of the invention, the memory capacity thereof can be expanded by an optional RAM connected to an expansion port for example; - a read only memory 1303, denoted ROM, for storing computer programs for implementing embodiments of the invention, - a network interface 1304 is typically connected to a communication network over which digital data to be processed are transmitted or received. The network interface 1304 can be a single network interface, or composed of a set of different network interfaces (for instance wired and wireless interfaces, or different kinds of wired or wireless interfaces). Data packets are written to the network interface for transmission or are read from the network interface for reception under the control of the software application running in the CPU 1301; - a user interface 1305 may be used for receiving inputs from a user or to display information to a user; -a hard disk 1306 denoted HD may be provided as a mass storage device; - an I/O module 1307 may be used for receiving/sending data from/to external devices such as a video source or display.
The executable code may be stored either in read only memory 1303, on the hard disk 1306 or on a removable digital medium such as for example a disk. According to a variant, the executable code of the programs can be received by means of a communication network, via the network interface 1304, in order to be stored in one of the storage means of the communication device 1300, such as the hard disk 1306, before being executed. The central processing unit 1301 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to embodiments of the invention, which instructions are stored in one of the aforementioned storage means. After powering on, the CPU 1301 is capable of executing instructions from main RAM memory 1302 relating to a software application after those instructions have been loaded from the program ROM 1303 or the hard-disc (HD) 1306 for example. Such a software application, when executed by the CPU 1301, causes the steps of the method according to the invention to be performed.
Any step of the methods according to the invention may be implemented in software by execution of a set of instructions or program by a programmable computing machine, such as a PC ("Personal Computer"), a DSP ("Digital Signal Processor") or a microcontroller; or else implemented in hardware by a machine or a dedicated component, such as an FPGA ("Field-Programmable Gate Array") especially for the Minima and Maxima selection, or an ASIC ("Application-Specific Integrated Circuit").
It is also to be noted that while some examples are based on HEVC for the sake of illustration, the invention is not limited to HEVC. For example, the present invention can also be used in any other prediction/estimation process where a relationship between two or more components' sample values can be estimated/predicted with a model, wherein the model is an approximate model determined based on at least two sets of related/associated component sample values selected from all available sets of the related/associated component sample values.
It is understood that each point corresponding to a sample pair (i.e. a set of associated sample values for different components) may be stored and/or processed in terms of an array. For example, each component's sample values may be stored in an array so that each sample value of that component is referable/accessible/obtainable by referencing an element of that array, using an index for that sample value for example. Alternatively, an array may be used to store and process the sample pairs so that each sample value of the sample pairs is accessible/obtainable as an element of the array.
It is also understood that any result of comparison, determination, assessment, selection, or consideration described above, for example a selection made during an encoding process, may be indicated in or determinable from data in a bitstream, for example a flag or data indicative of the result, so that the indicated or determined result can be used in the processing instead of actually performing the comparison, determination, assessment, selection, or consideration, for example during a decoding process.
Although the present invention has been described hereinabove with reference to specific embodiments, the present invention is not limited to the specific embodiments, and modifications will be apparent to a skilled person in the art, which lie within the scope of the present invention.
Many further modifications and variations will suggest themselves to those versed in the art upon making reference to the foregoing illustrative embodiments, which are given by way of example only and which are not intended to limit the scope of the invention, that being determined solely by the appended claims. In particular the different features from different embodiments may be interchanged, where appropriate.
Each of the embodiments of the invention described above can be implemented solely or as a combination of a plurality of the embodiments. Also, features from different embodiments can be combined where necessary or where the combination of elements or features from individual embodiments in a single embodiment is beneficial.
Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.

Claims (51)

CLAIMS
1. A method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: - determining M points, where M ≥ 3, each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component; - determining the parameters of a plurality of linear equations, each equation representing a straight line passing through two adjacent points of said M points, and deriving a continuous linear model defined by the parameters of the said straight lines.
2. The method of claim 1, wherein the points are determined based on sample pairs in the neighbourhood of the second-component block.
3. The method of claim 2, wherein the points are determined based on the sample values of the sample pairs in the neighbourhood of the second-component block.
4. The method of claim 3, wherein the first and Mth points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second-component sample value.
5. The method of claim 4, wherein said first and Mth points correspond respectively to the sample pairs with the lowest second-component sample value and with the highest second-component sample value from a selection of sample pairs.
6. The method of claim 5, wherein the selection of the sample pairs comprises a selection of sample pairs from one of: a top border, or a left border.
7. The method of claim 5 or 6, comprising iteratively determining said first and Mth points when accessing said selection of sample pairs.
  8. 8. The method of any preceding claim wherein the or each point from 2 to M1 is derived from a mid-point of second-component values.
  9. The method of claim 8, wherein M > 3 and each point from 2 to M-1 is distributed corresponding to a fractional mid-point.
  10. The method of claim 9, wherein each point from 2 to M-1 corresponds to a point having a second-component value nearest a value corresponding to said distribution of points between the first and Mth point.
  11. The method of claim 9, wherein each point from 2 to M-1 corresponds to a point having a second-component value which differs from a value corresponding to an even distribution of points between the first and Mth point by less than a threshold amount.
  12. The method of claim 10 or 11, wherein each point corresponding to said distribution is determined iteratively.
  13. The method of claim 8, wherein M=3 and the second point corresponds to the mid-point.
  14. The method of claim 13, wherein the second point corresponds to a point having a second-component value nearest a mid-point of the second-component sample values.
  15. The method of claim 13, wherein the second point corresponds to a point of which the second-component value differs from a mid-point of the second-component sample values by less than a threshold amount.
  16. The method of claim 14 or 15, wherein the point or points corresponding to the mid-point are determined iteratively.
  17. The method of any of claims 8 to 16, wherein said mid-point is an average of a selection of second-component sample values.
  18. The method of claim 17, wherein the values of the samples having the lowest and highest second-component sample values are ignored from the selection of the second-component sample values when determining the average.
  19. The method of claim 18, wherein said highest and lowest values are ignored when updating a calculated average.
  20. The method of claim 19, wherein the calculated average is updated by: avg += (2*avg - min - max) >> N, where N is a predetermined scaling parameter.
  21. The method of claim 19, wherein the calculated average is updated by: avg += (2*avg - min - max + 2^(N-1)) >> N, where N is a predetermined scaling parameter.
  22. The method of claim 20 or 21, wherein N is dependent on the block size.
  23. The method of any of claims 20 to 22, wherein N is dependent on the number of samples used to compute the calculated average.
  24. The method of any of claims 17 to 23, wherein the selection of sample pairs is from the sample pairs forming a top border of the block.
  25. The method of any of claims 17 to 23, wherein the selection of sample pairs is from the sample pairs forming a left border of the block.
  26. The method of claim 24 or 25, wherein the selection of sample pairs also includes a sample from a top-left neighbouring block.
  27. The method of any of claims 17 to 26, wherein the total number of samples in said selection is a power of two.
  28. The method of claim 27, wherein the total number of samples in said selection is greater than a minimum possible number of such samples.
  29. The method of any preceding claim, further comprising the step of determining a range of said second-component values, and determining the parameters of said straight lines in dependence on said range.
  30. The method of claim 29, wherein said dependence is whether said range is greater than a threshold.
  31. The method of claim 30, wherein if said range between adjacent points is below a threshold, a parameter derived from a straight line between two different points is used.
  32. The method of claim 31, wherein said parameter comprises the slope.
  33. The method of claim 31 or 32, wherein said parameter comprises the ordinate intercept.
  34. A method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: - determining M points, each point being defined by two variables, the first variable corresponding to a second-component sample value, the second variable corresponding to a first-component sample value, based on reconstructed samples of both the first-component and the second-component, the first point corresponding to the point with the lowest second-component value, and the Mth point corresponding to the point with the highest second-component value; - determining the parameters of at least one linear equation, each equation representing a straight line passing through two adjacent points of said M points; and - deriving a linear model defined by the parameters of the or each straight line; wherein M=2 if the number of samples is less than a threshold, and wherein M ≥ 3 if the number of samples is greater than said threshold.
  35. The method of claim 34, wherein the threshold is 16 samples.
  36. The method of claim 34, wherein the threshold is 32 samples.
  37. The method of any of claims 34 to 36, further comprising, if the number of samples is greater than said threshold, performing the method of any of claims 1 to 33.
  38. A method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: - determining a difference between the second-component values corresponding to the two points having the largest and smallest second-component values; - if said difference is lower than a threshold: determining the parameters of a linear equation representing a straight line passing through said points, and deriving a linear model defined by the parameters of said straight line; - if said difference is higher than the threshold: determining at least one further point between said two points having the largest and smallest second-component values; and determining the parameters of two linear equations, the first equation representing a straight line passing through the point having the smallest second-component value and an adjacent point, the second equation representing a straight line passing through the point having the largest second-component value and an adjacent point; and deriving a linear model defined by the parameters of said straight lines.
  39. The method of claim 38, wherein the threshold depends on the bitdepth of the samples.
  40. The method of claim 38 or 39, wherein for a bitdepth of 10, the threshold is between 48 and 96.
  41. A method of deriving a linear model for obtaining a first-component sample for a first-component block from an associated reconstructed second-component sample of a second-component block in the same frame, the method comprising: - determining three points, said three points comprising two points having the largest and smallest second-component values, and a third point between said two points; - determining a difference between the point having the smallest second-component value and the third point; - determining a difference between the point having the largest second-component value and the third point; - if one or both of said differences are lower than a threshold: determining the parameters of a linear equation representing a straight line passing through said two points having the largest and smallest second-component values, and deriving a linear model defined by the parameters of said straight line; - if both of said differences are higher than the threshold: determining the parameters of two linear equations, the first equation representing a straight line passing through the point having the smallest second-component value and the third point, and the second equation representing a straight line passing through the point having the largest second-component value and the third point; and deriving a linear model defined by the parameters of said straight lines.
  42. The method of any of claims 38 to 41, further comprising signalling, in a bitstream, the model to be used.
  43. The method of any of claims 38 to 41, further comprising retrieving a signal, from a bitstream, indicating the model to be used.
  44. The method of claim 42 or 43, wherein said signal is coded using a unary max code.
  45. The method of any of claims 42 to 44, wherein said signal comprises a CABAC encoded flag.
  46. The method of any of claims 38 to 45, further comprising, if said difference is higher than the threshold, performing the method of any of claims 1 to 33.
  47. A device for encoding images, wherein the device comprises a means for deriving a continuous linear model according to any one of claims 1 to 46.
  48. A device for decoding images, wherein the device comprises a means for deriving a continuous linear model according to any one of claims 1 to 46.
  49. A computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to any one of claims 1 to 46, when loaded into and executed by the programmable apparatus.
  50. A computer-readable medium storing a program which, when executed by a microprocessor or computer system in a device, causes the device to perform a method according to any one of claims 1 to 46.
  51. A computer program which upon execution causes the method of any one of claims 1 to 46 to be performed.
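The derivation the claims describe can be sketched as follows for the M=3 case: take the neighbour sample pairs with the lowest and highest second-component (e.g. luma) values as the end points, pick a mid-point near a refined average (using an update of the form in claims 20 and 21), and fit one straight line per pair of adjacent points. This is an illustrative sketch only, not the claimed encoder/decoder procedure: the function names, the floating-point arithmetic, the choice of M=3, and the nearest-sample mid-point selection are all assumptions made for clarity.

```python
def derive_piecewise_model(pairs, N=1):
    """Derive a 3-point continuous piecewise linear model from neighbour pairs.

    Each pair is (second_component, first_component), e.g. (luma, chroma),
    taken from reconstructed samples neighbouring the block.
    """
    lo = min(pairs, key=lambda p: p[0])   # point 1: lowest second-component value
    hi = max(pairs, key=lambda p: p[0])   # point M: highest second-component value
    # Mid-point: average of the second-component values, refined by an update
    # that damps the contribution of the two extremes (cf. claims 20/21).
    avg = sum(p[0] for p in pairs) // len(pairs)
    avg += (2 * avg - lo[0] - hi[0] + (1 << (N - 1))) >> N
    # Second point: the pair whose second-component value is nearest the mid-point.
    mid = min(pairs, key=lambda p: abs(p[0] - avg))
    # One linear equation per pair of adjacent points (two segments here).
    segments = []
    for (x0, y0), (x1, y1) in ((lo, mid), (mid, hi)):
        if x1 == x0:
            a, b = 0.0, float(y0)          # degenerate range: flat segment
        else:
            a = (y1 - y0) / (x1 - x0)      # slope
            b = y0 - a * x0                # ordinate intercept
        segments.append((x1, a, b))        # segment valid up to x = x1
    return segments


def predict(segments, x):
    """Evaluate the piecewise model at second-component value x."""
    for x_end, a, b in segments:
        if x <= x_end:
            return a * x + b
    _, a, b = segments[-1]                 # extrapolate past the last point
    return a * x + b
```

Because adjacent segments share their common point, the resulting model is continuous at the mid-point, which is the property claim 1 captures with "continuous linear model".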
GB1820861.1A 2018-12-20 2018-12-20 Piecewise modeling for linear component sample prediction Withdrawn GB2580078A (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
GB1820861.1A GB2580078A (en) 2018-12-20 2018-12-20 Piecewise modeling for linear component sample prediction
GB1900137.9A GB2581948A (en) 2018-12-20 2019-01-04 Piecewise modeling for linear component sample prediction
GB1903756.3A GB2580192A (en) 2018-12-20 2019-03-19 Piecewise modeling for linear component sample prediction
PCT/EP2019/086658 WO2020127956A1 (en) 2018-12-20 2019-12-20 Piecewise modeling for linear component sample prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1820861.1A GB2580078A (en) 2018-12-20 2018-12-20 Piecewise modeling for linear component sample prediction

Publications (2)

Publication Number Publication Date
GB201820861D0 GB201820861D0 (en) 2019-02-06
GB2580078A true GB2580078A (en) 2020-07-15

Family

ID=65364354

Family Applications (2)

Application Number Title Priority Date Filing Date
GB1820861.1A Withdrawn GB2580078A (en) 2018-12-20 2018-12-20 Piecewise modeling for linear component sample prediction
GB1900137.9A Withdrawn GB2581948A (en) 2018-12-20 2019-01-04 Piecewise modeling for linear component sample prediction

Family Applications After (1)

Application Number Title Priority Date Filing Date
GB1900137.9A Withdrawn GB2581948A (en) 2018-12-20 2019-01-04 Piecewise modeling for linear component sample prediction

Country Status (1)

Country Link
GB (2) GB2580078A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020258053A1 (en) * 2019-06-25 2020-12-30 Oppo广东移动通信有限公司 Image component prediction method and apparatus, and computer storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170064318A1 (en) * 2015-09-02 2017-03-02 Kabushiki Kaisha Toshiba Image compression apparatus, image expansion apparatus, and image transfer system
US20180077426A1 (en) * 2016-09-15 2018-03-15 Qualcomm Incorporated Linear model chroma intra prediction for video coding


Also Published As

Publication number Publication date
GB2581948A (en) 2020-09-09
GB201820861D0 (en) 2019-02-06

Similar Documents

Publication Publication Date Title
US12010319B2 (en) Sample sets and new down-sampling schemes for linear component sample prediction
US20200288135A1 (en) New sample sets and new down-sampling schemes for linear component sample prediction
WO2019162116A1 (en) New sample sets and new down-sampling schemes for linear component sample prediction
WO2019162117A1 (en) Methods and devices for improvement in obtaining linear component sample prediction parameters
WO2019162118A1 (en) Methods and devices for linear component sample prediction using a double classification
WO2020127956A1 (en) Piecewise modeling for linear component sample prediction
GB2580078A (en) Piecewise modeling for linear component sample prediction

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)