CN110999290B - Method and apparatus for intra prediction using cross-component linear model


Info

Publication number
CN110999290B
Authority
CN
China
Prior art keywords
block
reconstructed
luma
chrominance
adjacent
Prior art date
Legal status
Active
Application number
CN201980002859.7A
Other languages
Chinese (zh)
Other versions
CN110999290A
Inventor
Xiang Ma
Yin Zhao
Haitao Yang
Jianle Chen
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN110999290A
Application granted
Publication of CN110999290B

Classifications

    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/105 — Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/167 — Position within a video image, e.g. region of interest [ROI]
    • H04N19/176 — Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/186 — Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/593 — Predictive coding involving spatial prediction techniques

Abstract

A method for encoding a block of video data comprising: down-sampling the reconstructed luminance block to obtain a down-sampled luminance block; determining a maximum luminance value and a minimum luminance value from reconstructed neighboring luminance samples above the reconstructed luminance block or reconstructed neighboring luminance samples to the left of the reconstructed luminance block; deriving first and second chrominance values corresponding to the maximum and minimum luminance values, respectively, from reconstructed neighboring chrominance samples of the target chrominance block based on the locations of the maximum and minimum luminance values; calculating parameters of a cross-component linear model (CCLM) prediction mode based on the maximum and minimum luminance values and the first and second chrominance values; and generating a predicted chroma value of the target chroma block based on the parameter and the downsampled luma block.

Description

Method and apparatus for intra prediction using cross-component linear model
Cross application of related applications
The present invention claims priority from a prior application, U.S. provisional patent application No. 62/698,279, filed on July 15, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to video encoding and decoding, and more particularly, to techniques for intra-prediction encoding using a cross-component linear model.
Background
Digital video has been widely used since the advent of DVD discs. Video is first encoded and then transmitted over a transmission medium; the viewer receives the video and uses a viewing device to decode and display it. Over the years, improvements in video quality, for example through higher resolution, color depth, and frame rate, have resulted in larger data streams, which are nowadays typically transmitted over the internet and mobile communication networks.
However, the higher the video resolution, the more bandwidth is required, because such video carries more information. To reduce bandwidth requirements, video coding standards involving video compression have been introduced. Encoding the video reduces the bandwidth requirements (or the corresponding memory requirements when the video is stored), generally at the cost of some quality.
As quality improves and bandwidth requirements grow, solutions are continually sought that reduce bandwidth requirements while maintaining quality, or that improve quality while maintaining bandwidth requirements. Furthermore, compromises are sometimes acceptable. For example, an increase in bandwidth requirements may be acceptable if the quality improvement is significant.
High Efficiency Video Coding (HEVC) is one example of a video coding standard well known to those skilled in the art. In HEVC, a Coding Unit (CU) is partitioned into Prediction Units (PUs) or Transform Units (TUs). Versatile Video Coding (VVC) is the next-generation standard and the latest joint video project of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG); these two standards organizations cooperate in a partnership known as the Joint Video Exploration Team (JVET). VVC is also known as the ITU-T H.266/Next Generation Video Coding (NGVC) standard. In VVC, the concept of multiple partition types is removed, i.e., the separation of the CU, PU, and TU concepts is removed, except as needed when the CU size is too large for the maximum transform length. VVC also supports more flexible CU partition shapes.
Disclosure of Invention
Methods and apparatus for intra-prediction encoding using cross-component linear models are described. The method and apparatus may be used in existing Video codecs such as High Efficiency Video Coding (HEVC) and any future Video Coding standard.
It is an object of the present invention to provide a method for cross-component prediction of a block of video data. The method may include downsampling a reconstructed luma block to obtain a downsampled luma block, the reconstructed luma block corresponding to a target chroma block. The method may further include determining a maximum luminance value and a minimum luminance value from reconstructed neighboring luminance samples above the reconstructed luminance block and/or reconstructed neighboring luminance samples to the left of the reconstructed luminance block, the reconstructed neighboring luminance samples above the reconstructed luminance block being neighboring to the reconstructed luminance block. The method may further comprise: deriving first and second chrominance values corresponding to the maximum and minimum luminance values, respectively, from reconstructed neighboring chrominance samples of the target chrominance block based on the locations of the maximum and minimum luminance values; calculating parameters of a cross-component linear model (CCLM) prediction mode based on the maximum and minimum luminance values and the first and second chrominance values; and generating a predicted chroma value of the target chroma block based on the parameter of the CCLM prediction mode and the downsampled luma block.
In other aspects of the disclosure, an apparatus for processing a block of video data is provided. The apparatus may include a processor and a memory storing instructions that, when executed by the processor, cause the processor to downsample a reconstructed luma block to obtain a downsampled luma block, the reconstructed luma block corresponding to a target chroma block. The stored instructions, when executed by the processor, further cause the processor to determine a maximum luma value and a minimum luma value from reconstructed neighboring luma samples above the reconstructed luma block and/or reconstructed neighboring luma samples to the left of the reconstructed luma block, wherein the reconstructed neighboring luma samples above the reconstructed luma block are neighboring the reconstructed luma block. The stored instructions, when executed by the processor, further cause the processor to: deriving first and second chrominance values corresponding to the maximum and minimum luminance values, respectively, from reconstructed neighboring chrominance samples of the target chrominance block based on the locations of the maximum and minimum luminance values; calculating parameters of a cross-component linear model (CCLM) prediction mode based on the maximum and minimum luminance values and the first and second chrominance values; and generating a predicted chroma value of the target chroma block based on the parameter of the CCLM prediction mode and the downsampled luma block. The device further comprises: an inter prediction unit; a filter for downsampling the reconstructed block to obtain a filtered block; and a buffer to store the filtered block and to provide the filtered block to the inter prediction unit to generate a predicted block of video data.
In some embodiments, the processor is configured to obtain a position M of the maximum luminance value and a position N of the minimum luminance value in a region, wherein the region comprises the neighboring luminance samples above the reconstructed luminance block. The processor is further configured to obtain the first chrominance value at a first position in adjacent chrominance samples adjacent to the target chrominance block corresponding to the maximum luminance value and the second chrominance value at a second position in adjacent chrominance samples adjacent to the target chrominance block corresponding to the minimum luminance value.
In an embodiment, the region further comprises neighboring luma samples to the left of the reconstructed luma block or neighboring luma samples to the left of the downsampled luma block.
In an embodiment, where M is the M-th position of the adjacent upper row of the reconstructed luma block, the first position is the M/2, (M+1)/2, or (M−1)/2 position of the adjacent upper row of the target chroma block.
In an embodiment, where N is the N-th position of the adjacent upper row of the reconstructed luma block, the second position is the N/2, (N+1)/2, or (N−1)/2 position of the adjacent upper row of the target chroma block.
In an embodiment, where M is the M-th position of the adjacent left column of the reconstructed luma block or the downsampled luma block, the first position is the M/2, (M+1)/2, or (M−1)/2 position of the adjacent left column of the target chroma block.
In an embodiment, where N is the N-th position of the adjacent left column of the reconstructed luma block or the downsampled luma block, the second position is the N/2, (N+1)/2, or (N−1)/2 position of the adjacent left column of the target chroma block.
Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the embodiments.
Drawings
For a better understanding of the present invention, reference is now made to the drawings, in which like elements are referenced by like numerals. These drawings should not be construed as limiting the present invention, but are merely illustrative.
Fig. 1 is a block diagram of an encoding device including a filter according to some embodiments of the present invention.
Fig. 2 is a block diagram of a decoding device including a filter according to some embodiments of the present invention.
Fig. 3 is a schematic diagram of the positions of luma and chroma samples relative to each other.
Fig. 4, which includes fig. 4A and 4B, is a schematic diagram of example derived positions of scaling parameters for scaling down-sampled reconstructed luminance blocks. Fig. 4A shows an example of neighboring reconstructed pixels of a co-located luma block. Fig. 4B shows an example of neighboring reconstructed pixels of a chrominance block.
Fig. 5 is a schematic diagram of one example of a video component sampling format that may be employed in embodiments of the present invention.
Fig. 6 is a schematic diagram of another example of a video component sampling format that may be employed in embodiments of the present invention.
FIG. 7 shows a graph of a threshold correlation curve according to an embodiment of the invention.
Fig. 8 is a diagram of a straight line extending between a minimum luminance value and a maximum luminance value according to an embodiment of the present invention.
Fig. 9, which includes fig. 9A and 9B, is an exemplary system architecture for deriving linear model coefficients based on downsampled luminance blocks in accordance with an embodiment of the present invention. Fig. 9A shows an example original (reconstructed) luma block, and fig. 9B shows an example downsampled luma block.
Fig. 10, comprising fig. 10A and 10B, is an exemplary system architecture for deriving linear model coefficients based on downsampled luminance blocks according to another embodiment of the present invention. Fig. 10A shows an example original (reconstructed) luminance block, and fig. 10B shows an example downsampled luminance block.
Fig. 11, which includes fig. 11A and 11B, is an exemplary system architecture for deriving linear model coefficients based on downsampled chroma blocks according to an embodiment of the present invention. Fig. 11A shows an example original (reconstructed) luma block and fig. 11B shows an example downsampled chroma block.
Fig. 12, which includes fig. 12A-12C, is an example system architecture for deriving linear model coefficients based on downsampled luma and chroma blocks according to an embodiment of the present invention. Fig. 12A shows an example original (reconstructed) luma block, fig. 12B shows an example downsampled luma block, and fig. 12C shows an example downsampled chroma block.
Fig. 13 is a simplified flow diagram of a block prediction method using a cross-component linear model (CCLM) prediction mode according to some embodiments of the present invention.
FIG. 14 is a block diagram of an apparatus that may be used to implement various embodiments of the invention.
Detailed Description
Reference is now made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific aspects in which the invention may be practiced. It is to be understood that other aspects may be utilized and structural or logical changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
It will be appreciated that disclosure made in connection with a described method also holds true for a corresponding apparatus or system for performing the method, and vice versa. For example, if a specific method step is described, the corresponding apparatus may comprise means for performing the described method step, even if such means are not elaborated or illustrated in the figures. Further, it is to be understood that features of the various exemplary aspects described herein may be combined with each other, unless specifically noted otherwise.
The video image, also referred to as a frame, may comprise three arrays of samples, denoted SL, Scb, and Scr. SL is a two-dimensional array of luma samples (also referred to as a block). Scb is a two-dimensional array of Cb chroma samples. Scr is a two-dimensional array of Cr chroma samples. Chrominance samples may also be referred to as "chroma" samples. The image may also be monochromatic and include only an array of luma samples. A video slice (i.e., a video frame or a portion of a video frame) may be divided into video blocks, which may also be referred to as coding tree blocks, each of which may be a block of NxN samples.
The term "block" as used herein refers to any type of block or block of any depth, for example, the term "block" may include, but is not limited to, a root block, a sub-block, a leaf node, and the like. The size of the blocks to be encoded need not be the same. One image may comprise blocks of different sizes and the block grid of different images of the video sequence may also be different. A transform block refers to a rectangular (square or non-square) block of samples to which the same transform is applied. The terms "Coding Unit (CU)" and "Prediction Unit (PU)" refer to a coding unit and a prediction unit that are currently being coded. Each transform unit ("TU") of a Coding Unit (CU) is associated with a luma transform block, a Cb transform block, and a Cr transform block.
Fig. 1 shows an encoding apparatus 100 according to an embodiment, comprising a filter 120 according to an embodiment. The encoding apparatus 100 is configured to encode blocks of a frame of a video signal, the video signal comprising a plurality of frames (also referred to herein as images or pictures), wherein each frame may comprise a plurality of blocks, and each block may comprise a plurality of pixels. In an embodiment, a block may be a macroblock, a coding tree unit, a coding unit, a prediction unit, and/or a prediction block.
In the exemplary embodiment shown in fig. 1, the encoding apparatus 100 is implemented in the form of a hybrid video encoding encoder. Typically, the first frame of the video signal is an intra-coded frame, coded using only intra-prediction. To this end, the embodiment of the encoding apparatus 100 shown in fig. 1 includes an intra prediction unit 154 for intra prediction. An intra-coded frame can be decoded without the information of other frames. Intra prediction unit 154 may perform intra prediction on the block based on the information provided by intra estimation unit 152.
Blocks of a subsequent frame after the first intra-coded frame may be encoded using inter-prediction or intra-prediction, as selected by the mode selection unit 160. To this end, the encoding apparatus 100 may further include an inter prediction unit 144. In general, inter prediction unit 144 may be used to perform motion compensation on a block based on motion estimation provided by inter estimation unit 142.
In an embodiment, the encoding apparatus 100 may further comprise a residual calculation unit 104 for determining the difference between the original block and its prediction, i.e., a residual block defining the prediction error of the intra/inter image prediction. The residual block is transformed by a transform unit 106 (e.g., using a DCT) to generate transform coefficients, which are quantized by a quantization unit 108. The output of quantization unit 108, together with the coding or side information provided, for example, by intra prediction unit 154, inter prediction unit 144, and filter 120, is further encoded by entropy encoding unit 170 to generate coded image data.
Hybrid video encoders typically replicate the decoder processing so that both generate the same prediction values. Thus, in the embodiment shown in FIG. 1, inverse quantization unit 110 and inverse transform unit 112 perform the inverse operations of quantization unit 108 and transform unit 106, respectively, and replicate the decoded approximation of the residual block. The reconstruction unit 114 then adds the decoded residual block data to the prediction result, i.e., the prediction block. The output of the reconstruction unit 114 may then be provided to a (line) buffer 116 for intra prediction and further processed by a filter 120; the final image is stored in a decoded image buffer 130 and is available for inter prediction of subsequent frames, as will be described in detail below.
Fig. 2 shows a decoding apparatus 200 comprising a filter 220 according to an embodiment of the present invention. The decoding apparatus 200 is used to decode a coded video (or image) block of a frame. In the embodiment shown in fig. 2, the decoding device 200 may be implemented as a hybrid decoder. Entropy decoding unit 204 performs entropy decoding of the encoded image data, which may typically contain prediction errors (i.e., residual blocks), motion data, and other side information, which is particularly needed by intra-prediction unit 254, inter-prediction unit 244, and other components of decoding device 200, such as filter 220. In general, the intra prediction unit 254 and the inter prediction unit 244 of the decoding apparatus 200 shown in fig. 2 are selected by the mode selection unit 260 and operate in the same manner as the intra prediction unit 154 and the inter prediction unit 144 of the encoding apparatus 100 shown in fig. 1, so that the encoding apparatus 100 and the decoding apparatus 200 can generate the same prediction values. The reconstruction unit 214 of the decoding device 200 is configured to reconstruct the video blocks based on the filtered prediction blocks and residual blocks provided by the inverse quantization unit 210 and the inverse transform unit 212. As in the encoding apparatus 100, the reconstructed block may be provided to the line buffer 216 for intra prediction, and the filtered block/frame may be provided to the decoded image buffer 230 by the filter 220 for inter prediction.
Techniques for video encoding and compression are described. In particular, this disclosure describes techniques for Linear Model (LM) predictive video coding modes. In the LM predictive video coding mode, a chroma block is predicted from a scaled, downsampled, reconstructed corresponding luma block (i.e., the scaled, downsampled, reconstructed corresponding luma block constitutes a prediction block for predicting the chroma block).
In some exemplary embodiments, the downsampling of the reconstructed corresponding luma block includes filtering, exemplary embodiments of which are described herein. Furthermore, the techniques described herein may also be applicable to the following cases: the luma samples used in the LM prediction mode are located in different slices.
Thus, the techniques described herein relate to a Linear Model (LM) prediction mode for reducing inter-component redundancy in video coding. The techniques described herein may be used in the context of advanced video codecs, such as extensions to the High Efficiency Video Coding (HEVC) standard or next generation video coding standards, such as Joint Exploration Model (JEM) built on top of HEVC.
When performing LM predictive encoding or decoding, a video encoder or video decoder (collectively referred to as a video codec) fetches neighboring luma samples from a video data memory and downsamples them to determine the scaling parameters used to scale the downsampled corresponding luma block. If the filter used to downsample the neighboring luma samples requires neighboring luma samples outside the range of samples stored locally (e.g., in a local memory of the coding circuit), the video encoder or video decoder must retrieve luma samples from external memory, which may waste processing time and memory bandwidth. For example, in some video coding techniques, performing an LM prediction mode operation requires fetching luma sample values from memory, which may require additional processing time and memory bandwidth. Exemplary embodiments are described that reduce the number of sample fetches that incur relatively high processing time and memory bandwidth.
In an exemplary embodiment, the video encoder and video decoder (i.e., video codec) may exclude particular luma samples (e.g., luma samples not stored in local memory or luma samples not yet generated) when extracting neighboring luma samples for performing the downsampling. In this way, extracting samples does not cause the video encoder and video decoder to access non-local (i.e., external) memory. In contrast, in an exemplary embodiment, the video encoder or video decoder only extracts luma samples from local memory, e.g., for LM prediction mode operation.
Examples of video coding standards according to which video encoders and video decoders (collectively referred to as video codecs) operate include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions.
Furthermore, the Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) has developed a new video coding standard, High Efficiency Video Coding (HEVC). The HEVC draft specification, hereinafter referred to as HEVC WD, is available from: http://phenix.int-evry.fr/jct/doc_end_user/documents/14_Vienna/wg11/JCTVC-N1003-v1.zip. The HEVC standard is also published as ITU-T Recommendation H.265 and International Standard ISO/IEC 23008-2, both titled "High efficiency video coding" and both published in October 2014.
The specification of HEVC and its extensions, including the Format Range Extension (RExt), the Scalability Extension of HEVC (SHVC), and the Multi-View HEVC (MV-HEVC) extension, can be obtained from: http://phenix.int-evry.fr/jct/doc_end_user/documents/14_Vienna/wg11/JCTVC-N1003-v1.zip.
Video encoding may be performed based on a color space and color format. For example, color video plays an important role in multimedia systems, which use various color spaces to efficiently represent color. A color space specifies color numerically using multiple components. A popular color space is the RGB color space, in which a color is represented as a combination of three primary color component values (i.e., red, green, and blue). For color video compression, the YCbCr color space is widely used, as described in: A. Ford and A. Roberts, "Colour space conversions," technical report, University of Westminster, London, August 1998.
YCbCr can easily be converted from the RGB color space via a linear transformation, and the redundancy between the components, i.e., the cross-component redundancy, is significantly reduced in the YCbCr color space. One advantage of YCbCr is backward compatibility with black-and-white television, since the Y signal conveys the luminance information. Furthermore, the chrominance bandwidth can be reduced by subsampling the Cb and Cr components in the 4:2:0 chroma sampling format, with significantly smaller subjective impact than subsampling in the RGB color space. Because of these advantages, YCbCr has been the primary color space for video compression. Other color spaces, such as YCoCg, are also used in video compression. In the present invention, Y, Cb, and Cr are used to denote the three color components of the video compression scheme, regardless of which color space is actually used.
In 4:2:0 sampling, each of the two chroma arrays has only half the height and half the width of the luma array. Fig. 3 shows the nominal vertical and horizontal relative positions of the luma and chroma samples in an image.
To reduce cross-component redundancy, a cross-component linear model (CCLM) prediction mode is used in JEM, predicting chroma samples based on reconstructed luma samples of the same CU using the following linear model:
pred_C(i,j) = α · rec'_L(i,j) + β   (1)
where pred_C(i,j) denotes the predicted chroma samples in the CU, and rec'_L(i,j) denotes the downsampled reconstructed luma samples of the same CU. The parameters α and β are derived by minimizing the regression error between the neighboring reconstructed luma and chroma samples around the current block, according to the following equations (2) and (3):
α = ( N·Σ(L(n)·C(n)) − ΣL(n)·ΣC(n) ) / ( N·Σ(L(n)·L(n)) − ΣL(n)·ΣL(n) )   (2)

β = ( ΣC(n) − α·ΣL(n) ) / N   (3)
where L(n) denotes the downsampled top and left neighboring reconstructed luma samples, C(n) denotes the top and left neighboring reconstructed chroma samples, and the value of N is equal to twice the minimum of the width and the height of the current chroma coding block. For a coding block with a square shape, the above equations (2) and (3) are applied directly. For a non-square coding block, the neighboring samples of the longer side are first subsampled to obtain the same number of samples as for the shorter side.
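For illustration, the regression of equations (2) and (3) can be sketched in Python as follows. This is a minimal sketch; the function and argument names are hypothetical and not part of the patent, and non-empty inputs of equal length are assumed:

```python
def derive_cclm_params_lms(luma_nb, chroma_nb):
    """Least-squares derivation of alpha and beta per equations (2) and (3).

    luma_nb   -- downsampled top/left neighboring reconstructed luma samples L(n)
    chroma_nb -- top/left neighboring reconstructed chroma samples C(n)
    """
    n = len(luma_nb)
    sum_l = sum(luma_nb)
    sum_c = sum(chroma_nb)
    sum_lc = sum(l * c for l, c in zip(luma_nb, chroma_nb))
    sum_ll = sum(l * l for l in luma_nb)
    denom = n * sum_ll - sum_l * sum_l              # denominator of equation (2)
    alpha = 0.0 if denom == 0 else (n * sum_lc - sum_l * sum_c) / denom
    beta = (sum_c - alpha * sum_l) / n              # equation (3)
    return alpha, beta
```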
Fig. 4, which includes fig. 4A and 4B, is a schematic diagram illustrating example derivation positions of the scaling parameters used to scale the downsampled reconstructed luma block. Fig. 4 shows an example of the locations of the left and above causal samples and the samples of the current block involved in the CCLM mode. The white squares are samples of the current block, and the shaded circles are reconstructed samples. Fig. 4A shows an example of the neighboring reconstructed pixels of a co-located luma block. Fig. 4B shows an example of the neighboring reconstructed pixels of a chroma block. If the video format is YUV4:2:0, there are one 16x16 luma block and two 8x8 chroma blocks.
This regression error minimization calculation is performed as part of the decoding process, not just as an encoder search operation, and therefore does not use syntax to convey the alpha and beta values.
The CCLM prediction mode also includes prediction between the two chroma components, i.e., the Cr component is predicted from the Cb component. Instead of using the reconstructed sample signal, the CCLM Cb-to-Cr prediction is applied in the residual domain. It is implemented by adding the weighted reconstructed Cb residual to the original Cr intra prediction to form the final Cr prediction value, according to equation (4):
pred_Cr*(i,j) = pred_Cr(i,j) + α · resi_Cb'(i,j)   (4)
the scaling factor a is derived in a similar manner as in CCLM luminance-to-chrominance prediction, with the only difference being that the regression cost associated with the default value of a in the error function is added so that the derived scaling factor is biased towards the default value of-0.5, as follows:
α = ( N·Σ(Cb(n)·Cr(n)) − ΣCb(n)·ΣCr(n) + λ·(−0.5) ) / ( N·Σ(Cb(n)·Cb(n)) − ΣCb(n)·ΣCb(n) + λ )   (5)
where Cb(n) denotes the neighboring reconstructed Cb samples, Cr(n) denotes the neighboring reconstructed Cr samples, and λ is equal to Σ(Cb(n)·Cb(n)) >> 9.
Fig. 5 is a diagram illustrating one example of a video component sampling format that may be employed in embodiments of the present invention.
Fig. 6 is a schematic diagram illustrating another example of a video component sampling format that may be employed in embodiments of the present invention.
The CCLM luma-to-chroma prediction mode is added as an additional chroma intra prediction mode. At the encoder side, one more rate-distortion cost check for the chroma components is added for selecting the chroma intra prediction mode. When an intra prediction mode other than the CCLM luma-to-chroma prediction mode is used for the chroma components of a CU, the CCLM Cb-to-Cr prediction is used for the Cr component prediction.
There are two kinds of CCLM modes in JEM: a single model CCLM mode and a multi-model CCLM mode (MMLM). As the name implies, the single model CCLM mode uses one linear model to predict chroma samples from luma samples for the entire CU, whereas in MMLM there may be two linear models. In MMLM, neighboring luma samples and neighboring chroma samples of a current block are classified into two groups, each of which is used as a training set to derive a linear model (i.e., a specific α and a specific β are derived for a specific group). In addition, the samples of the current luminance block are also classified based on the same classification rule as the neighboring luminance samples.
In an embodiment, neighboring samples whose Rec'_L[x,y] is less than or equal to a threshold are classified into group 1, while neighboring samples whose Rec'_L[x,y] is greater than the threshold are classified into group 2. The threshold is calculated as the average value of the neighboring reconstructed luma samples.
The inter-component prediction value at the position (x, y) can be generated by the following equation (6):
Pred_C[x,y] = α1 · Rec'_L[x,y] + β1, if Rec'_L[x,y] ≤ Threshold
Pred_C[x,y] = α2 · Rec'_L[x,y] + β2, if Rec'_L[x,y] > Threshold   (6)
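As a sketch of the MMLM grouping described above, the following reuses the hypothetical derive_cclm_params_lms helper from the earlier sketch; the zero-model fallback for an empty group is an assumption, not taken from the patent:

```python
def derive_mmlm_params(luma_nb, chroma_nb):
    """MMLM: classify the neighboring samples into two groups by the mean
    luma value (the threshold) and derive one linear model per group,
    as used in equation (6)."""
    threshold = sum(luma_nb) / len(luma_nb)
    group1 = [(l, c) for l, c in zip(luma_nb, chroma_nb) if l <= threshold]
    group2 = [(l, c) for l, c in zip(luma_nb, chroma_nb) if l > threshold]
    models = []
    for group in (group1, group2):
        luma = [l for l, _ in group]
        chroma = [c for _, c in group]
        # each group serves as a training set for its own (alpha, beta)
        models.append(derive_cclm_params_lms(luma, chroma) if luma else (0.0, 0.0))
    return threshold, models  # models[0] applies when Rec'_L[x,y] <= threshold
```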
FIG. 7 shows a graph of a threshold correlation curve according to an embodiment of the invention.
To perform cross-component prediction for the 4:2:0 chroma format, the reconstructed luma block needs to be downsampled to match the size of the chroma signal. The default downsampling filter used in CCLM mode is given by equation (7):
Rec'_L[x,y] = ( 2·Rec_L[2x,2y] + 2·Rec_L[2x,2y+1] + Rec_L[2x−1,2y] + Rec_L[2x+1,2y] + Rec_L[2x−1,2y+1] + Rec_L[2x+1,2y+1] + 4 ) >> 3   (7)
note that this down-sampling assumes a "type 0" phase relationship of chroma sample position relative to luma sample position, i.e., horizontal collocated sampling and vertical gap sampling.
In an embodiment, a 6-tap down-sampling filter is used for both single model CCLM mode and multi-model CCLM (MMLM) mode.
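The 6-tap filter of equation (7) may be sketched as follows. This is a minimal sketch; equation (7) does not specify edge handling, so the caller is assumed to keep all indices inside the luma block:

```python
def downsample_luma_6tap(rec_l, x, y):
    """Equation (7): default 6-tap downsampling filter for the 4:2:0 format.
    rec_l is indexed as rec_l[x][y], mirroring Rec_L[x,y] in the text;
    valid only where 2x-1, 2x+1, and 2y+1 fall inside the luma block."""
    return (2 * rec_l[2 * x][2 * y] + 2 * rec_l[2 * x][2 * y + 1]
            + rec_l[2 * x - 1][2 * y] + rec_l[2 * x + 1][2 * y]
            + rec_l[2 * x - 1][2 * y + 1] + rec_l[2 * x + 1][2 * y + 1]
            + 4) >> 3
```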
For the MMLM mode, the encoder may alternatively select one of four additional luma downsampling filters for prediction in the CU and send a filter index to indicate which of these filters was used. The four optional luminance down-sampling filters for the MMLM mode are as follows:
Rec'_L[x,y] = ( Rec_L[2x,2y] + Rec_L[2x+1,2y] + 1 ) >> 1   (8)
Rec'_L[x,y] = ( Rec_L[2x+1,2y] + Rec_L[2x+1,2y+1] + 1 ) >> 1   (9)
Rec'_L[x,y] = ( Rec_L[2x,2y+1] + Rec_L[2x+1,2y+1] + 1 ) >> 1   (10)
Rec'_L[x,y] = ( Rec_L[2x,2y] + Rec_L[2x,2y+1] + Rec_L[2x+1,2y] + Rec_L[2x+1,2y+1] + 2 ) >> 2   (11)
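The four selectable filters of equations (8)-(11) can be dispatched by the signaled filter index, as in the following sketch; the index-to-equation mapping shown here is an assumption for illustration, not defined by the patent:

```python
def downsample_luma_mmlm(rec_l, x, y, filter_idx):
    """Optional MMLM luma downsampling filters, equations (8)-(11);
    rec_l is indexed as rec_l[x][y]."""
    if filter_idx == 0:   # equation (8)
        return (rec_l[2 * x][2 * y] + rec_l[2 * x + 1][2 * y] + 1) >> 1
    if filter_idx == 1:   # equation (9)
        return (rec_l[2 * x + 1][2 * y] + rec_l[2 * x + 1][2 * y + 1] + 1) >> 1
    if filter_idx == 2:   # equation (10)
        return (rec_l[2 * x][2 * y + 1] + rec_l[2 * x + 1][2 * y + 1] + 1) >> 1
    # equation (11)
    return (rec_l[2 * x][2 * y] + rec_l[2 * x][2 * y + 1]
            + rec_l[2 * x + 1][2 * y] + rec_l[2 * x + 1][2 * y + 1] + 2) >> 2
```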
Cross-component linear model prediction
To derive the linear model coefficients, the neighboring reconstructed luma samples on the top and left are downsampled to obtain a one-to-one correspondence with the neighboring reconstructed chroma samples on the top and left. To obtain the linear model coefficient parameters α and β, embodiments of the invention use only two points (luma and chroma value pairs): A and B, corresponding to the minimum and maximum values within the set of adjacent luma samples, as shown in fig. 8.
Fig. 8 is a diagram illustrating a straight line extending between the minimum luma value A and the maximum luma value B according to some embodiments of the present invention. This straight line is represented by the equation y = αx + β, where the linear model parameters α and β are obtained according to the following equations (12) and (13):
α = ( y_B − y_A ) / ( x_B − x_A )   (12)
β = y_A − α·x_A   (13)
note here that 2 points (luminance and chrominance pairs) are selected from the downsampled luminance neighboring reconstructed sample and the chrominance neighboring reconstructed sample (a, B).
Method for downsampling adjacent samples
In practice, YUV4:2:0 is a common format for video sequences or images. Since the spatial resolution of the luma part is greater than that of the chroma part, the luma part generally needs to be downsampled to the resolution of the chroma part for the LM mode. For example, for the YUV4:2:0 format, the luma part typically needs to be downsampled by a factor of 4 (by 2 in width and by 2 in height). Prediction in the CCIP mode is then performed using the downsampled luma block (whose spatial resolution equals that of the chroma part) and the chroma block.
The same applies to the neighboring samples used to derive the linear model coefficients: the top and left neighboring reconstructed luma samples also need to be downsampled to obtain a one-to-one correspondence with the top and left neighboring reconstructed chroma samples.
Fig. 9, which includes fig. 9A and 9B, shows the neighboring-sample downsampling process, where the left-hand block is the original or reconstructed block (fig. 9A) and the right-hand block is the downsampled block (fig. 9B). Referring to fig. 9A, the two top adjacent rows A1 and A2 of the original (reconstructed) luma block are used for downsampling to obtain the downsampled adjacent row A. A[i] is the i-th sample in the downsampled adjacent row A, A1[i] is the i-th sample in A1, and A2[i] is the i-th sample in A2. In an exemplary embodiment, if a 6-tap downsampling filter is used, the i-th sample is calculated by equation (14):
A[i] = ( 2·A2[2i] + A2[2i−1] + A2[2i+1] + 2·A1[2i] + A1[2i−1] + A1[2i+1] + 4 ) >> 3   (14)
where "> >" represents a shift operation to the right, i.e., > >3 represents a 3-bit shift to the right.
Likewise, the three left adjacent columns L1, L2, and L3 are used for downsampling to obtain the column L. For example, if a 6-tap downsampling filter is used, the samples L[i] are calculated by equation (15):

L[i] = ( 2·L2[2i] + L1[2i] + L3[2i] + 2·L2[2i+1] + L1[2i+1] + L3[2i+1] + 4 ) >> 3   (15)

where L[i] is the i-th sample in column L, L1[i] is the i-th sample in column L1, L2[i] is the i-th sample in column L2, L3[i] is the i-th sample in column L3, and ">>" denotes a right shift operation.
Fig. 9B shows an example downsampled luminance block "luminance'" after the downsampling process described above.
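Equations (14) and (15) may be sketched as follows; the rows A1, A2 and columns L1-L3 are passed as 1-D sample lists, and handling of the out-of-range index 2i−1 at i = 0 is left to the caller (an assumption, since the text does not specify it):

```python
def downsample_top_row(a1, a2, i):
    """Equation (14): 6-tap downsampling of the two top adjacent luma rows
    A1 and A2 into the i-th sample of the downsampled row A."""
    return (2 * a2[2 * i] + a2[2 * i - 1] + a2[2 * i + 1]
            + 2 * a1[2 * i] + a1[2 * i - 1] + a1[2 * i + 1] + 4) >> 3

def downsample_left_col(l1, l2, l3, i):
    """Equation (15): 6-tap downsampling of the three left adjacent luma
    columns L1, L2, and L3 into the i-th sample of the downsampled column L."""
    return (2 * l2[2 * i] + l1[2 * i] + l3[2 * i]
            + 2 * l2[2 * i + 1] + l1[2 * i + 1] + l3[2 * i + 1] + 4) >> 3
```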
For the LM mode, the number of neighboring samples may also be larger than the size of the current block. Fig. 10, which includes fig. 10A and 10B, illustrates the neighboring-sample downsampling process for this case, where the left block is the original block (fig. 10A) and the right block is the downsampled block (fig. 10B). Fig. 10A illustrates an example in which the two top adjacent rows A1 and A2 of the original luma block "luma" are larger than the row size of the current luma block and are used for downsampling to obtain the downsampled adjacent row A shown in fig. 10B. The three left adjacent columns L1 through L3 are used for downsampling to obtain the column L. Fig. 10B shows the downsampled luma block "luma′", where the size M of the top neighboring row A is larger than the width W of the downsampled block "luma′", and the size N of the left downsampled column L is larger than the height H of the downsampled block "luma′".
In some LM processes, 2 rows of neighboring reconstructed luma samples are used to obtain the downsampled top neighboring reconstructed luma samples. Compared with conventional intra mode prediction, such LM processes increase the line buffer size and thus the memory cost. Also, obtaining the downsampled neighboring reconstructed samples requires a downsampling operation with a multi-tap filter, which increases decoder complexity.
Although the compression benefit of the LM mode has been shown in the related art, some areas can still be optimized to achieve low cost at the decoder side. In particular, the present inventors have determined an improved solution, which is described in detail below.
The maximum and minimum luma values can be obtained from the original reconstructed samples in 1 adjacent row and 1 adjacent column, without increasing the line buffer size and without introducing a downsampling operation.
Embodiments of the present invention provide processes and techniques that improve upon the downsampling operations described above in conjunction with figs. 9 and 10. In an embodiment, an adjacent downsampled row above the current block or an adjacent downsampled column to the left of the current block may be used. In another embodiment, an adjacent downsampled row above the current block or the original three adjacent columns of reconstructed luma samples to the left may be used.
In the present invention, an improved solution is provided for obtaining the maximum and minimum luma values and the corresponding chroma values in order to derive the linear model (LM) coefficients. In the proposed scheme, instead of downsampling 2 rows of luma samples, only 1 row of original reconstructed luma samples is used to search for (determine) the maximum and minimum luma values, while for the left neighboring samples, the three left neighboring columns may be used. Further, it is also possible to search for (determine) the maximum and minimum luma values using only 1 column of original reconstructed luma samples.
The first embodiment: samples in the adjacent row above the reconstructed luma sample and samples in the left adjacent column are used.
According to the first embodiment, the neighboring original reconstructed luma samples in the upper adjacent row are A1[i] (0 ≤ i ≤ 2W−1), and the neighboring original reconstructed luma samples in the left adjacent column are L1[i] (0 ≤ i ≤ 2H−1).
The adjacent chroma samples in the upper row are A[i] (0 ≤ i ≤ W−1), and the adjacent chroma samples in the left column are L[i] (0 ≤ i ≤ H−1).
If the maximum or minimum luma value (collectively referred to as the maximum/minimum value) is in the upper adjacent row and the index of its sample position is M, the corresponding chroma sample position is M/2 and the corresponding chroma value is A[M/2].
If the maximum/minimum value is in the left column and the index of its sample position is N, the corresponding chroma sample position is N/2 and the corresponding chroma value is L[N/2].
In one embodiment, if the maximum/minimum value is in the upper row, then in addition to using M/2 to obtain the chroma sample position, (M+1)/2 may be used, in which case the corresponding chroma value is A[(M+1)/2].
In one embodiment, if the maximum/minimum value is in the adjacent upper row, then in addition to using M/2, (M−1)/2 may be used, in which case the corresponding chroma value is A[(M−1)/2].
In one embodiment, if the maximum/minimum value is in the left column, then in addition to using N/2, (N+1)/2 may be used, in which case the corresponding chroma value is L[(N+1)/2].
In one embodiment, if the maximum/minimum value is in the left column, then in addition to using N/2, (N−1)/2 may be used, in which case the corresponding chroma value is L[(N−1)/2].
In one embodiment, to speed up the search operation, the search step s may be greater than 1, which means that only A1[s·j] and L1[s·j] (j = 0, 1, 2, …) are checked when searching for the maximum/minimum values, where s is a positive integer.
As in the case shown in fig. 10, the method is also applicable when the number of neighboring samples is greater than the size of the current block.
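A sketch of the first embodiment's search follows, using the M/2 (integer division) position mapping described above; the (M±1)/2 variants would change only the index expressions. The helper and argument names are hypothetical:

```python
def find_minmax_chroma_pairs(a1, l1, a_chroma, l_chroma, step=1):
    """First embodiment: search one top row (A1, 2W samples) and one left
    column (L1, 2H samples) of original reconstructed luma samples for the
    max/min luma values; a luma position M (or N) maps to chroma position
    M//2 (or N//2). A search step larger than 1 speeds up the search."""
    candidates = [(a1[m], a_chroma[m // 2]) for m in range(0, len(a1), step)]
    candidates += [(l1[n], l_chroma[n // 2]) for n in range(0, len(l1), step)]
    max_luma, chroma_at_max = max(candidates, key=lambda p: p[0])
    min_luma, chroma_at_min = min(candidates, key=lambda p: p[0])
    return (max_luma, chroma_at_max), (min_luma, chroma_at_min)
```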
Second embodiment: using adjacent rows above reconstructed luma samples and three adjacent columns to the left
According to the second embodiment, only 1 row of reconstructed luma samples is used to obtain the maximum/minimum luma values from above, while the three columns on the left side may be used to obtain the maximum/minimum luma values from the left. In other words, according to the second embodiment, the maximum/minimum luma values are obtained from the original reconstructed luma samples in the upper row and the downsampled luma samples in the left column.
Referring to FIG. 12A, the neighboring original reconstructed luma samples in the upper row are A1[i] (0 ≤ i ≤ 2W−1). Referring to FIG. 12B, the downsampled neighboring luma samples in the left column are L4[i] (0 ≤ i ≤ H−1).
Referring to FIG. 12C, the adjacent chroma samples in the upper row are A[i] (0 ≤ i ≤ W−1), and the adjacent chroma samples in the left column are L[i] (0 ≤ i ≤ H−1).
In one embodiment, if the maximum/minimum value is in the upper row and the index of its sample position is M, then the corresponding chroma sample position is M/2 and the corresponding chroma value is A[M/2].
In one embodiment, if the maximum/minimum value is in the left column L4 and the index of its sample position is N, then the corresponding chroma sample position is N and the corresponding chroma value is L[N].
In one embodiment, if the maximum/minimum value is in the upper row, then in addition to using M/2 to obtain the chroma sample position, (M+1)/2 may be used, in which case the corresponding chroma value is A[(M+1)/2].
In one embodiment, if the maximum/minimum value is in the upper row, then in addition to using M/2, (M−1)/2 may be used, in which case the corresponding chroma value is A[(M−1)/2].
In one embodiment, to speed up the search operation, the search step s may be greater than 1, which means that only A1[s·j] (j = 0, 1, 2, …) is checked when searching for the maximum/minimum values.
As in the case shown in fig. 10, the method is also applicable when the number of neighboring samples is greater than the size of the current block.
In the current CCIP or LM method, 2 rows of luma samples are needed to obtain the downsampled top neighboring samples from which the maximum/minimum luma values are obtained, but this increases the line buffer size and the memory cost. Also, the downsampling operation increases decoder complexity.
According to an embodiment of the present invention, maximum/minimum luminance values and corresponding chrominance values will be obtained using only 1 line of adjacent reconstructed luminance samples for the upper line, without a downsampling operation.
In a first embodiment, only 1 line of adjacent reconstructed luma samples and 1 column of adjacent reconstructed luma samples would be used to obtain the maximum/minimum luma values and corresponding chroma values.
In a second embodiment, only 1 row of neighboring reconstructed luma samples would be used to obtain the maximum/minimum luma values, while for the left neighboring samples, the existing approach may be used. In other words, in this method, the maximum/minimum luminance values will be obtained in the original reconstructed luminance samples above the row adjacent to the original reconstructed luminance block and the down-sampled luminance samples in the three columns to the left of the original reconstructed luminance block.
Thus, the above-described embodiments will not add line buffers, keeping memory costs low.
It is noted here that the embodiments described in the present invention for obtaining the maximum/minimum luma values and the corresponding chroma values, in order to derive the linear model coefficients for chroma intra prediction, are embedded as process steps or hardware units in the intra prediction module. Therefore, the process steps or hardware units are present in both the decoder unit and the encoder unit. Also, the process steps or hardware units for obtaining the maximum/minimum luma values and the corresponding chroma values may be the same in the encoder (encoding unit) and the decoder (decoding unit).
For a chroma block, its prediction value is obtained using the LM mode. First, it is necessary to obtain or receive a corresponding downsampled luma sample, then obtain a maximum/minimum luma value and a corresponding chroma value in adjacent samples to calculate linear model coefficients, and then obtain a predicted value of a current chroma block using the derived linear model coefficients and the downsampled luma block.
The present invention relates to obtaining the maximum/minimum luma values and corresponding chroma values for deriving the linear model coefficients. The following description focuses on the scheme for obtaining the maximum/minimum luma values and corresponding chroma values.
Fig. 13 is a simplified flowchart illustrating a method 130 for block prediction of video data using a cross-component linear model (CCLM) prediction mode according to an embodiment of the present invention. The method 130 may include the steps of:
Step 1310: downsample the reconstructed luma block to obtain a downsampled luma block, where the reconstructed luma block corresponds to the target chroma block. Since the spatial resolution of the luma part is typically higher than that of the chroma part, the luma part usually needs to be downsampled to match the resolution of the chroma part.
Step 1320: the maximum luminance value and the minimum luminance value are determined from reconstructed neighboring luminance samples above the reconstructed luminance block and/or from reconstructed neighboring luminance samples to the left of the reconstructed luminance block. The reconstructed neighboring luma samples above the reconstructed luma block are neighboring the reconstructed luma block.
Step 1330: based on the positions of the maximum luminance value and the minimum luminance value, first chrominance values and second chrominance values corresponding to the maximum luminance value and the minimum luminance value, respectively, are derived from reconstructed neighboring chrominance samples of the target chrominance block. The chrominance values corresponding to the maximum luminance value and the minimum luminance value may be determined based on the steps described in the first embodiment above. In some embodiments, the maximum luminance value and the minimum luminance value may be determined using only 1 line of reconstructed luminance samples and 1 column of reconstructed luminance samples, and the corresponding chrominance values may be obtained from the maximum luminance value and the minimum luminance value, respectively.
In an embodiment, deriving the first and second chroma values comprises: obtaining a position M of the maximum luma value and a position N of the minimum luma value in a region, where the region comprises the adjacent luma samples above the reconstructed luma block; and obtaining the first chroma value at a first position, corresponding to the maximum luma value, among the adjacent samples adjacent to the target chroma block, and the second chroma value at a second position, corresponding to the minimum luma value, among the adjacent samples adjacent to the target chroma block. In an embodiment, the region further comprises adjacent luma samples to the left of the reconstructed luma block or adjacent luma samples to the left of the downsampled luma block.
In the case where M is the M-th position of the adjacent upper row of the reconstructed luma block, the first position is the M/2, (M+1)/2, or (M−1)/2 position of the adjacent upper row of the target chroma block. In the case where N is the N-th position of the adjacent upper row of the reconstructed luma block, the second position is the N/2, (N+1)/2, or (N−1)/2 position of the adjacent upper row of the target chroma block.
In the case where M is the M-th position of the adjacent left column of the reconstructed or downsampled luma block, the first position is the M/2, (M+1)/2, or (M−1)/2 position of the adjacent left column of the target chroma block. In the case where N is the N-th position of the adjacent left column of the reconstructed or downsampled luma block, the second position is the N/2, (N+1)/2, or (N−1)/2 position of the adjacent left column of the target chroma block.
In the example of the first embodiment, the neighboring original reconstructed luma samples above the reconstructed luma block are A1[i] (0 ≤ i ≤ 2W−1), and the neighboring original reconstructed luma samples to the left are L1[i] (0 ≤ i ≤ 2H−1) (see FIG. 11A).
The adjacent chroma samples above the chroma block are A[i] (0 ≤ i ≤ W−1), and the adjacent chroma samples to the left are L[i] (0 ≤ i ≤ H−1) (see FIG. 11B). The chroma value may be determined based on the following conditions:
If the maximum luminance value or the minimum luminance value (collectively, the maximum/minimum value) is in the adjacent upper row and the index of its sample position is M, the position of the corresponding chroma sample is M/2 and the corresponding chroma value is A[M/2].
If the maximum/minimum value is in the adjacent left column and the index of its sample position is N, the position of the corresponding chroma sample is N/2 and the corresponding chroma value is L[N/2].
If the maximum/minimum value is in the adjacent upper row, (M+1)/2 may be used instead of M/2 as the position for obtaining the chroma sample, in which case the corresponding chroma value is A[(M+1)/2].
If the maximum/minimum value is in the adjacent upper row, (M-1)/2 may likewise be used, in which case the corresponding chroma value is A[(M-1)/2].
If the maximum/minimum value is in the adjacent left column, (N+1)/2 may be used instead of N/2 as the position for obtaining the chroma sample, in which case the corresponding chroma value is L[(N+1)/2].
If the maximum/minimum value is in the adjacent left column, (N-1)/2 may likewise be used, in which case the corresponding chroma value is L[(N-1)/2].
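The search and lookup just described can be sketched as follows in C++ (a minimal sketch assuming the A1/L1 and A/L arrays of the first embodiment and the M/2, N/2 rounding variant; the struct and function names are illustrative, not taken from any reference software):

#include <cstddef>
#include <vector>

struct MinMaxResult {
    int maxLuma, minLuma;          // extreme neighbouring luma values
    int chromaAtMax, chromaAtMin;  // chroma values at the corresponding positions
};

// Sketch of Step 1330: scan the upper luma row A1 (length 2W) and the left
// luma column L1 (length 2H); fetch the co-located chroma value from A
// (length W) or L (length H) using the M/2, N/2 rounding variant.
// Assumes A1 and A are non-empty.
MinMaxResult findMinMax(const std::vector<int>& A1, const std::vector<int>& L1,
                        const std::vector<int>& A,  const std::vector<int>& L) {
    MinMaxResult r{A1[0], A1[0], A[0], A[0]};
    for (std::size_t m = 0; m < A1.size(); ++m) {   // adjacent upper row
        if (A1[m] > r.maxLuma) { r.maxLuma = A1[m]; r.chromaAtMax = A[m / 2]; }
        if (A1[m] < r.minLuma) { r.minLuma = A1[m]; r.chromaAtMin = A[m / 2]; }
    }
    for (std::size_t n = 0; n < L1.size(); ++n) {   // adjacent left column
        if (L1[n] > r.maxLuma) { r.maxLuma = L1[n]; r.chromaAtMax = L[n / 2]; }
        if (L1[n] < r.minLuma) { r.minLuma = L1[n]; r.chromaAtMin = L[n / 2]; }
    }
    return r;
}

Each neighbouring sample costs two comparisons, which is where the 2N comparison count in Table 2 below comes from.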
In some other examples, only one row of reconstructed luma samples is used to obtain the maximum/minimum luma values, while the three columns on the left may be used to obtain them as well. These examples have been described in detail above in connection with the second embodiment and, for brevity, are not repeated here.
Step 1340: after the maximum and minimum luminance values and the corresponding first and second chrominance values have been obtained, the parameters of the cross-component linear model (CCLM) prediction mode are calculated; that is, the linear model coefficients are derived using equations (2) and (3).
Step 1350: a predicted chroma value of the target chroma block is generated based on the parameters of the CCLM prediction mode and the downsampled luma block. After the linear model coefficients have been derived from the maximum and minimum luminance values and from the corresponding chrominance values taken from the reconstructed upper or left neighboring chrominance samples, the predicted chrominance values may be generated using equation (1).
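A minimal floating-point sketch of Steps 1340 and 1350, assuming that equations (2)/(3) fit a line through the two extreme (luma, chroma) pairs and that equation (1) applies it to the downsampled luma block (the zero-range guard and the function name are assumptions of this sketch; the integer form is shown after Table 2 below):

#include <cstddef>
#include <vector>

// Fit predC(i,j) = alpha * recL'(i,j) + beta through the two extreme points
// and apply it to every sample of the downsampled luma block recL'.
std::vector<int> predictChroma(int maxLuma, int minLuma,
                               int chromaAtMax, int chromaAtMin,
                               const std::vector<int>& recLPrime) {
    double alpha = 0.0;
    if (maxLuma != minLuma)  // guard against a flat luma neighbourhood (assumed)
        alpha = double(chromaAtMax - chromaAtMin) / double(maxLuma - minLuma);
    const double beta = chromaAtMin - alpha * minLuma;  // line through the minimum point

    std::vector<int> pred(recLPrime.size());
    for (std::size_t k = 0; k < recLPrime.size(); ++k)
        pred[k] = int(alpha * recLPrime[k] + beta + 0.5);  // round to nearest
    return pred;
}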
In some examples, the linear model parameters α and β in equations (2) and (3) are obtained with a least mean square (LMS) algorithm. Table 1 below gives the complexity associated with the LMS algorithm, with division counted as being replaced by multiplication.
TABLE 1
Operations       Number of operations in LMS
Multiplication   2N+2+2
Addition         7N+3
Division         2
Here N is the number of chroma samples in the boundary. Note that the division can be implemented using table lookup, multiplication, and shift operations.
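For comparison, the fit that Table 1 prices can be sketched as the standard least-squares solution over the N boundary pairs (the names here are illustrative; the accumulation loop accounts for the per-sample multiplications and additions, and the two divisions are the ones noted above):

#include <vector>

// Least-mean-square fit over N boundary (luma, chroma) pairs:
//   alpha = (N*sum(L*C) - sum(L)*sum(C)) / (N*sum(L*L) - sum(L)^2)
//   beta  = (sum(C) - alpha*sum(L)) / N
void lmsFit(const std::vector<int>& lumaB, const std::vector<int>& chromaB,
            double& alpha, double& beta) {
    const long long N = static_cast<long long>(lumaB.size());
    long long sumL = 0, sumC = 0, sumLL = 0, sumLC = 0;
    for (long long k = 0; k < N; ++k) {  // two multiplications per sample
        sumL  += lumaB[k];
        sumC  += chromaB[k];
        sumLL += static_cast<long long>(lumaB[k]) * lumaB[k];
        sumLC += static_cast<long long>(lumaB[k]) * chromaB[k];
    }
    const long long denom = N * sumLL - sumL * sumL;
    alpha = denom != 0 ? double(N * sumLC - sumL * sumC) / double(denom) : 0.0;
    beta  = (double(sumC) - alpha * double(sumL)) / double(N);
}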
According to an embodiment of the present invention, the LMS algorithm for determining the linear model parameters α and β may be replaced with a straight-line equation as shown in fig. 8. The linear model parameters α and β are obtained by equation (12) and equation (13), respectively. In some embodiments, the division may be implemented by multiplication and right shift operations. To derive the chroma prediction value, the multiplication is implemented using integer operations as shown in the following equation:
predC(i,j) = ((a · recL′(i,j)) >> S) + b, with a = ((Cmax − Cmin) << S) / (Lmax − Lmin) and b = Cmin − ((a · Lmin) >> S),

where recL′(i,j) represents the reconstructed luma block corresponding to the chroma block after downsampling, Cmax/Cmin and Lmax/Lmin are the chrominance and luminance extreme values, and S is the shift.
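A sketch of this integer form, assuming a fixed shift S = 16 (the exact shift value and the rounding behaviour are implementation choices that this text does not fix; the single division for a can be replaced by the table lookup mentioned above, and an arithmetic right shift for negative values is assumed):

#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int S = 16;  // fixed shift; the value 16 is an illustrative assumption

// Integer realisation: a = ((Cmax - Cmin) << S) / (Lmax - Lmin) once per block,
// then predC(i,j) = ((a * recL'(i,j)) >> S) + b with b = Cmin - ((a * Lmin) >> S).
std::vector<int> predictChromaInt(int maxLuma, int minLuma,
                                  int chromaAtMax, int chromaAtMin,
                                  const std::vector<int>& recLPrime) {
    const int diff = maxLuma - minLuma;
    const std::int64_t a = diff != 0
        ? static_cast<std::int64_t>(chromaAtMax - chromaAtMin) * (1LL << S) / diff
        : 0;                                   // the single division of Table 2
    const std::int64_t b = chromaAtMin - ((a * minLuma) >> S);

    std::vector<int> pred(recLPrime.size());
    for (std::size_t k = 0; k < recLPrime.size(); ++k)
        pred[k] = static_cast<int>(((a * recLPrime[k]) >> S) + b);
    return pred;
}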
This embodiment is simpler than the LMS-based embodiment, since the shift S always has the same value.
Table 2 compares the number of operations in the LMS algorithm with the number of operations in the scheme provided by the present invention.
TABLE 2
Operations       Number of operations in LMS    Number of operations according to the invention
Multiplication   2N+2+2                         1
Addition         7N+3                           3
Division         2                              1
Comparison       0                              2N
The min/max-based intra prediction method significantly reduces the number of multiplications and additions compared to the number of operations in the LMS algorithm, while increasing the number of comparisons for determining the minimum luminance value and the maximum luminance value of adjacent samples.
Fig. 14 is a block diagram of an apparatus 1400 that can be used to implement various embodiments; the apparatus 1400 may be the encoding apparatus 100 shown in fig. 1 or the decoding apparatus 200 shown in fig. 2. Furthermore, the apparatus 1400 may host one or more of the described elements. In some embodiments, the apparatus 1400 is equipped with one or more input/output devices, e.g., a speaker, microphone, mouse, touch screen, buttons, keyboard, printer, or display. The apparatus 1400 may include one or more central processing units (CPUs) 1410, a memory 1420, a mass storage 1430, a video adapter 1440, and an I/O interface 1460, connected to a bus. The bus may be one or more of any type of several bus architectures, including a memory bus or memory controller, a peripheral bus, or a video bus.
The CPU 1410 may comprise any type of electronic data processor. The memory 1420 may include or may be any type of system memory, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous DRAM (SDRAM), read-only memory (ROM), or combinations thereof. In one embodiment, the memory 1420 may include ROM for use at power-on, and DRAM for program and data storage for use while executing programs. In some embodiments, the memory 1420 may be non-transitory. The mass storage 1430 may include any type of storage device that stores data, programs, and other information and makes the data, programs, and other information accessible over the bus. The mass storage 1430 may include, for example, one or more of a solid state drive, a hard disk drive, a magnetic disk drive, or an optical disk drive.
The video adapter 1440 and the I/O interface 1460 provide interfaces to couple external input and output devices to the apparatus 1400. For example, the apparatus 1400 may provide an SQL command interface to clients. Examples of input and output devices include a display 1490 coupled to the video adapter 1440 and a mouse/keyboard/printer 1470 coupled to the I/O interface 1460. Other devices may be coupled to the apparatus 1400, and additional or fewer interface cards may be used; for example, a serial interface card (not shown) may be used to provide a serial interface for a printer.
The apparatus 1400 may also include one or more network interfaces 1450, which include wired links, such as Ethernet cables, and/or wireless links for accessing nodes or one or more networks 1480. The network interface 1450 allows the apparatus 1400 to communicate with remote units over the networks 1480. For example, the network interface 1450 may provide communication with a database. In one embodiment, the apparatus 1400 is coupled to a local-area or wide-area network for data processing and communication with remote devices, such as other processing units, the Internet, or remote storage devices.
Although a particular feature or aspect of the invention may have been disclosed with respect to only one of several implementations or embodiments, such a feature or aspect may be combined with one or more other features or aspects of the other implementations or embodiments as may be desired and advantageous for any given or particular application. To the extent that the terms "including," "having," or other variants thereof are used in a particular implementation or claim, such terms are intended to be inclusive in a manner similar to the term "comprising." The terms "exemplary" and "e.g." merely denote an example rather than a best or optimal instance. The terms "coupled" and "connected," together with their derivatives, may have been used; it is to be understood that such terms indicate that two elements may cooperate or interact with each other, whether or not they are in direct physical or electrical contact.
Although specific aspects have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a variety of alternate and/or equivalent implementations may be substituted for the specific aspects shown and described without departing from the scope of the present invention. This application is intended to cover any adaptations or variations of the specific aspects discussed herein.
Although the elements in the following claims are recited in a particular sequence with corresponding labeling, unless the claim recitations otherwise imply a particular sequence for implementing some or all of the elements, the elements are not necessarily limited to being implemented in that particular sequence.
Many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing teachings. Of course, those skilled in the art will readily recognize that there are numerous other applications of the present invention beyond those described herein. While the present invention has been described with reference to one or more particular embodiments, those skilled in the art will recognize that many changes may be made thereto without departing from the scope of the present invention. It is therefore to be understood that within the scope of the appended claims and their equivalents, the invention may be practiced otherwise than as specifically described herein.

Claims (14)

1. A method for cross component prediction of a block of video data, the method comprising:
down-sampling the reconstructed brightness block to obtain a down-sampled brightness block, wherein the reconstructed brightness block corresponds to a target chrominance block;
determining a maximum luma value and a minimum luma value from reconstructed neighboring luma samples above the reconstructed luma block and/or reconstructed neighboring luma samples to the left of the reconstructed luma block, wherein the reconstructed neighboring luma samples above the reconstructed luma block are neighboring the reconstructed luma block;
deriving first and second chrominance values corresponding to the maximum and minimum luminance values from reconstructed neighboring chrominance samples of the target chrominance block based on the locations of the maximum and minimum luminance values;
calculating parameters of a cross-component linear model (CCLM) prediction mode based on the maximum and minimum luma values and the first and second chroma values; and
generating a predicted chroma value for the target chroma block based on the parameter and the downsampled luma block.
2. The method of claim 1, wherein deriving the first and second chrominance values comprises:
obtaining a position M of the maximum luminance value and a position N of the minimum luminance value in a region, wherein the region comprises adjacent luminance samples above the reconstructed luminance block;
obtaining the first chrominance value at a first position in a chrominance sample adjacent to the target chrominance block corresponding to the maximum luminance value and the second chrominance value at a second position in a chrominance sample adjacent to the target chrominance block corresponding to the minimum luminance value.
3. The method of claim 2, wherein the region further comprises an adjacent luma sample to the left of the reconstructed luma block or an adjacent luma sample to the left of the downsampled luma block.
4. The method according to claim 3, wherein in the case that M is the M-th position of the upper row adjacent to the reconstructed luma block, the first position is the M/2 or (M+1)/2 or (M-1)/2 position of the upper row adjacent to the target chroma block.
5. The method according to claim 3, wherein in the case that N is the N-th position of the upper row adjacent to the reconstructed luma block, the second position is the N/2 or (N+1)/2 or (N-1)/2 position of the upper row adjacent to the target chroma block.
6. The method of claim 3, wherein in the case that M is the M-th position of the adjacent left column of the reconstructed luma block or the downsampled luma block, the first position is the M/2 or (M+1)/2 or (M-1)/2 position of the adjacent left column of the target chroma block.
7. The method of claim 3, wherein in the case that N is the N-th position of the adjacent left column of the reconstructed luma block or the downsampled luma block, the second position is the N/2 or (N+1)/2 or (N-1)/2 position of the adjacent left column of the target chroma block.
8. An apparatus for processing a block of video data, comprising:
a processor and a memory storing instructions that, when executed by the processor, cause the processor to:
down-sampling the reconstructed brightness block to obtain a down-sampled brightness block, wherein the reconstructed brightness block corresponds to a target chrominance block;
determining a maximum luma value and a minimum luma value from reconstructed neighboring luma samples above the reconstructed luma block and/or reconstructed neighboring luma samples to the left of the reconstructed luma block, wherein the reconstructed neighboring luma samples above the reconstructed luma block are neighboring the reconstructed luma block;
deriving first and second chrominance values corresponding to the maximum and minimum luminance values from reconstructed neighboring chrominance samples of the target chrominance block based on the locations of the maximum and minimum luminance values;
calculating parameters of a cross-component linear model (CCLM) prediction mode based on the maximum and minimum luminance values and the first and second chrominance values; and
generating a predicted chroma value for the target chroma block based on the parameter and the downsampled luma block.
9. The apparatus of claim 8, wherein the processor is configured to:
obtaining a position M of the maximum luminance value and a position N of the minimum luminance value in a region, wherein the region comprises the neighboring luminance samples above the reconstructed luminance block;
obtaining the first chrominance value at a first position in a chrominance sample adjacent to the target chrominance block corresponding to the maximum luminance value and the second chrominance value at a second position in a chrominance sample adjacent to the target chrominance block corresponding to the minimum luminance value.
10. The apparatus of claim 9, wherein the region further comprises an adjacent luma sample to the left of the reconstructed luma block or an adjacent luma sample to the left of the downsampled luma block.
11. The apparatus of claim 9, wherein in the case that M is the M-th position of the upper row adjacent to the reconstructed luma block, the first position is the M/2 or (M+1)/2 or (M-1)/2 position of the upper row adjacent to the target chroma block.
12. The apparatus of claim 9, wherein in the case that N is the N-th position of the upper row adjacent to the reconstructed luma block, the second position is the N/2 or (N+1)/2 or (N-1)/2 position of the upper row adjacent to the target chroma block.
13. The apparatus of claim 9, wherein in the case that M is the M-th position of the adjacent left column of the reconstructed luma block or the downsampled luma block, the first position is the M/2 or (M+1)/2 or (M-1)/2 position of the adjacent left column of the target chroma block.
14. The apparatus of claim 9, wherein in the case that N is the N-th position of the adjacent left column of the reconstructed luma block or the downsampled luma block, the second position is the N/2 or (N+1)/2 or (N-1)/2 position of the adjacent left column of the target chroma block.