CN113497937A - Image encoding method, image decoding method and related device


Info

Publication number
CN113497937A
Authority
CN
China
Prior art keywords
block
component
current
pixel
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010206251.0A
Other languages
Chinese (zh)
Other versions
CN113497937B (en)
Inventor
Yang Ning (杨宁)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010206251.0A
Priority to PCT/CN2021/076059 (WO2021185008A1)
Priority to PCT/CN2021/081132 (WO2021185257A1)
Publication of CN113497937A
Application granted
Publication of CN113497937B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/124 Quantisation
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91 Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Abstract

The embodiments of the present application disclose an image encoding method, an image decoding method, and a related apparatus. The image encoding method includes the following steps: dividing an image, and determining a luminance component intra prediction mode and a chrominance component intra prediction mode of a current coding block; when the chrominance component intra prediction mode indicates that the chrominance component of the current coding block is derived using the luminance component of the current coding block, determining a reference prediction block of the chrominance component of the current coding block according to the luminance component intra prediction mode; and filtering the reference prediction block of the chrominance component of the current coding block to obtain the prediction block of the chrominance component of the current coding block. In the embodiments of the present application, the reference prediction block of the chrominance component of the current pixel block is filtered in the cross-component prediction mode, which helps to improve the compression efficiency of the pixel block.

Description

Image encoding method, image decoding method and related device
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to an image encoding method, an image decoding method, and a related apparatus.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and so forth.
Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), and in extensions of these standards, to transmit and receive digital video information more efficiently. Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing these video codec techniques.
With the proliferation of internet video, ever higher video compression ratios are required, even as digital video compression technology continues to evolve.
Disclosure of Invention
The embodiments of the present application provide an image encoding method, an image decoding method, and a related apparatus, so that in the cross-component prediction mode, different filters can be used for down-sampling according to the different directionalities of the current coding unit.
In a first aspect, an embodiment of the present application provides an image encoding method, including: dividing an image, and determining a luminance component intra prediction mode and a chrominance component intra prediction mode of a current coding block; when the chrominance component intra prediction mode indicates that the chrominance component of the current coding block is derived using the luminance component of the current coding block, determining a reference prediction block of the chrominance component of the current coding block according to the luminance component intra prediction mode; and filtering the reference prediction block of the chrominance component of the current coding block to obtain the prediction block of the chrominance component of the current coding block.
Compared with the prior art, the solution of the present application filters the reference prediction block of the chrominance component of the current coding block in the cross-component prediction mode, which helps to improve the compression efficiency of the coding block and thereby the coding efficiency.
In a second aspect, an embodiment of the present application provides an image decoding method, including: parsing a code stream, and determining a luminance component intra prediction mode and a chrominance component intra prediction mode of a current decoding block; when the chrominance component intra prediction mode indicates that a prediction value of the chrominance component of the current decoding block is derived using a reconstructed block of the luminance component of the current decoding block, determining a reference prediction block of the chrominance component of the current decoding block according to the luminance component intra prediction mode; and filtering the reference prediction block of the chrominance component of the current decoding block to obtain the prediction block of the chrominance component of the current decoding block.
Compared with the prior art, the solution of the present application filters the reference prediction block of the chrominance component of the current decoding block in the cross-component prediction mode, which helps to improve the compression efficiency of the decoding block and thereby the decoding efficiency.
In a third aspect, an embodiment of the present application provides an image encoding apparatus, including: a partitioning unit configured to divide an image and determine a luminance component intra prediction mode and a chrominance component intra prediction mode of a current coding block; a determining unit configured to determine a reference prediction block of the chrominance component of the current coding block according to the luminance component intra prediction mode when the chrominance component intra prediction mode indicates that the chrominance component of the current coding block is derived using the luminance component of the current coding block; and a filtering unit configured to filter the reference prediction block of the chrominance component of the current coding block to obtain the prediction block of the chrominance component of the current coding block.
In a fourth aspect, an embodiment of the present application provides an image decoding apparatus, including: a parsing unit configured to parse a code stream and determine a luminance component intra prediction mode and a chrominance component intra prediction mode of a current decoding block; a determining unit configured to determine a reference prediction block of the chrominance component of the current decoding block according to the luminance component intra prediction mode when the chrominance component intra prediction mode indicates that a prediction value of the chrominance component of the current decoding block is derived using a reconstructed block of the luminance component of the current decoding block; and a filtering unit configured to filter the reference prediction block of the chrominance component of the current decoding block to obtain the prediction block of the chrominance component of the current decoding block.
In a fifth aspect, an embodiment of the present application provides an encoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the first aspect.
In a sixth aspect, an embodiment of the present application provides a decoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the second aspect.
In a seventh aspect, an embodiment of the present application provides a terminal, including: one or more processors, a memory, and a communication interface, where the memory and the communication interface are coupled to the one or more processors; the terminal communicates with other devices through the communication interface, and the memory stores computer program code comprising instructions which, when executed by the one or more processors, cause the terminal to perform the method according to the first or second aspect.
In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the method of the first or second aspect.
In a ninth aspect, embodiments of the present application provide a computer program product comprising instructions that, when executed on a computer, cause the computer to perform the method of the first or second aspect.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic block diagram of a coding tree unit in an embodiment of the present application;
FIG. 2 is a schematic block diagram of a color format in an embodiment of the present application;
FIG. 3 is a schematic block diagram of a CTU and a coding unit CU in an embodiment of the present application;
FIG. 4 is a schematic block diagram of an associated pixel of an encoding unit in an embodiment of the present application;
FIG. 5 is a block diagram illustrating a luminance component intra prediction mode according to an embodiment of the present disclosure;
FIG. 6 is a schematic block diagram of neighboring pixels used for calculation of coefficients of a linear model in an embodiment of the present application;
FIG. 7 is a schematic block diagram of a down-sampling filter in an embodiment of the present application;
FIG. 8 is a schematic block diagram of the transformation from a luma component reconstructed block to a chroma component prediction block in an embodiment of the present application;
FIG. 9 is a schematic block diagram of a video coding system in an embodiment of the present application;
FIG. 10 is a schematic block diagram of a video encoder in an embodiment of the present application;
FIG. 11 is a schematic block diagram of a video decoder in an embodiment of the present application;
FIG. 12A is a flowchart illustrating an image encoding method according to an embodiment of the present application;
FIG. 12B is a diagram illustrating a horizontal down-sampling process according to an embodiment of the present disclosure;
FIG. 12C is a diagram illustrating a vertical down-sampling process according to an embodiment of the present disclosure;
FIG. 12D is a schematic diagram illustrating a diagonal down-sampling process in an embodiment of the present application;
FIG. 13 is a flowchart illustrating an image decoding method according to an embodiment of the present application;
FIG. 14 is a block diagram of a functional unit of an image encoding apparatus according to an embodiment of the present application;
FIG. 15 is a block diagram showing another functional unit of the image encoding apparatus according to the embodiment of the present application;
FIG. 16 is a block diagram of a functional unit of an image decoding apparatus according to an embodiment of the present application;
FIG. 17 is a block diagram of another functional unit of the image decoding apparatus in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another. For example, a first client may be referred to as a second client, and similarly, a second client may be referred to as a first client, without departing from the scope of the present invention. Both the first client and the second client are clients, but they are not the same client.
First, terminology and related art involved in the embodiments of the present application are described.
For image partitioning, in order to represent video content more flexibly, the High Efficiency Video Coding (HEVC) standard defines the Coding Tree Unit (CTU), Coding Unit (CU), Prediction Unit (PU), and Transform Unit (TU). CTUs, CUs, PUs, and TUs are all image blocks.
Coding tree unit (CTU): an image is composed of a plurality of CTUs. A CTU generally corresponds to a square image area containing the luminance pixels and chrominance pixels of that area (or it may contain only luminance pixels, or only chrominance pixels). As shown in FIG. 1(a), the picture 10 is composed of a plurality of CTUs (including CTU A, CTU B, CTU C, and so on). The coding information corresponding to a CTU includes the luminance values and/or chrominance values of the pixels in the square image area corresponding to the CTU. In addition, the coding information corresponding to a CTU may contain syntax elements indicating how the CTU is divided into at least one coding unit (CU) and the method for decoding each CU to obtain the reconstructed image. The image area corresponding to one CTU may include 64 × 64, 128 × 128, or 256 × 256 pixels. In one example, a CTU of 64 × 64 pixels comprises a rectangular pixel lattice of 64 columns of 64 pixels each, where each pixel comprises a luminance component and/or a chrominance component. A CTU may also correspond to a rectangular image area or an image area of another shape; the image area corresponding to one CTU may have different numbers of pixels in the horizontal and vertical directions, for example 64 × 128 pixels.
Coding unit (CU): usually corresponds to an A × B rectangular area of the image, containing A × B luminance pixels and/or the corresponding chrominance pixels, where A is the width of the rectangle and B its height. A and B may be the same or different, and usually take values that are integer powers of 2, e.g. 128, 64, 32, 16, 8, or 4. Here, the width referred to in the embodiments of the present application is the length along the X-axis (horizontal direction) of the two-dimensional rectangular coordinate system XoY shown in FIG. 1, and the height is the length along the Y-axis (vertical direction). The reconstructed image of a CU may be obtained by adding a predicted image and a residual image: the predicted image is generated by intra prediction or inter prediction and may be composed of one or more prediction blocks (PB), while the residual image is generated by inverse quantization and inverse transform of the transform coefficients and may be composed of one or more transform blocks (TB). Specifically, a CU contains coding information, including information such as the prediction mode and the transform coefficients, and decoding processing such as the corresponding prediction, inverse quantization, and inverse transform is performed on the CU according to this coding information to generate the reconstructed image corresponding to the CU.
Prediction unit (PU): the basic unit of intra prediction and inter prediction. The motion information of an image block is defined to include the inter prediction direction, reference frames, motion vectors, and so on. An image block undergoing encoding is called a current coding block (CCB), and an image block undergoing decoding is called a current decoding block (CDB). For example, when an image block is undergoing prediction processing, the current coding block or current decoding block is a prediction block; when an image block is undergoing residual processing, the current coding block or current decoding block is a transform block. The picture containing the current coding block or current decoding block is called the current frame. In the current frame, image blocks located to the left of or above the current block may already have completed encoding/decoding processing, yielding reconstructed images; they are referred to as reconstructed blocks, and information such as the coding mode and reconstructed pixels of a reconstructed block is available. A frame whose encoding/decoding has been completed before that of the current frame is referred to as a reconstructed frame. When the current frame is a uni-directionally predicted frame (P frame) or a bi-directionally predicted frame (B frame), it has one or two reference frame lists, respectively, referred to as L0 and L1; each list contains at least one reconstructed frame, referred to as a reference frame of the current frame. Reference frames provide reference pixels for inter prediction of the current frame.
Transform unit (TU): processes the residual between the original image block and the predicted image block.
A pixel refers to a point in an image, such as a pixel in a coding unit, a pixel in a luminance component pixel block (a luminance pixel), or a pixel in a chrominance component pixel block (a chrominance pixel).
A sample (also referred to as a pixel value) is the value of a pixel. In the luminance component domain, the pixel value is the luminance (i.e., gray level); in the chrominance component domain, the pixel value is the chrominance value (i.e., color and saturation).
Intra prediction: generating a predicted image of the current block from the spatially neighboring pixels of the current block. One intra prediction mode corresponds to one method of generating a predicted image. The partitioning of an intra prediction unit includes a 2N × 2N partition (as shown by A in FIG. 2) and an N × N partition (as shown by B in FIG. 2); the 2N × 2N partition leaves the image block undivided, while the N × N partition divides the image block into four equal-sized sub-blocks.
Typically, digital video compression techniques operate on video sequences whose color encoding is YCbCr (also referred to as YUV) in a 4:2:0, 4:2:2, or 4:4:4 color format. Y denotes luminance (Luma), i.e., the gray-scale value; Cb denotes the blue chrominance component and Cr the red chrominance component; U and V denote chrominance (Chroma), describing color and saturation. In terms of color format, 4:2:0 means 2 chrominance samples for every 4 luminance samples (YYYYCbCr), 4:2:2 means 4 chrominance samples for every 4 luminance samples (YYYYCbCrCbCr), and 4:4:4 means full chrominance sampling (YYYYCbCrCbCrCbCrCbCr). FIG. 2 shows the component distributions for the different color formats, where the circles are the Y component and the triangles are the UV components.
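For illustration, the following Python sketch (an assumed helper, not part of any codec specification) computes the chroma plane size implied by each color format; for example, an 8 × 8 luma block in 4:2:0 corresponds to a 4 × 4 chroma block, as in the examples later in this description.

```python
# Minimal sketch (assumed helper): chroma plane size for a given luma plane and color format.
def chroma_plane_size(luma_width, luma_height, color_format):
    """Return the (width, height) of each chroma plane."""
    if color_format == "4:2:0":   # 2:1 horizontal and 2:1 vertical subsampling
        return luma_width // 2, luma_height // 2
    if color_format == "4:2:2":   # 2:1 horizontal subsampling only
        return luma_width // 2, luma_height
    if color_format == "4:4:4":   # no subsampling
        return luma_width, luma_height
    raise ValueError("unknown color format")

print(chroma_plane_size(8, 8, "4:2:0"))  # (4, 4): an 8x8 luma block pairs with a 4x4 chroma block
```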
In the digital video encoding process, an encoder reads pixels and encodes the original video sequence in one of these color formats. A typical digital encoder comprises prediction, transform and quantization, inverse transform and inverse quantization, loop filtering, entropy coding, and so on, which are used to remove spatial, temporal, visual, and statistical redundancy. The human eye is more sensitive to changes in the luminance component and reacts less strongly to changes in the chrominance components, so the original video sequence is typically encoded in the YUV 4:2:0 color format. Moreover, in the intra coding part, a digital video encoder applies different prediction processes to the luminance and chrominance components: prediction of the luminance component is finer and more complex, while prediction of the chrominance components is generally simpler. The cross-component prediction (CCP) mode is an existing digital video coding technique that acts across the luminance component and the chrominance components to increase the video compression ratio.
One cross-component prediction implementation is used in intra coding; the method includes determining a linear model for predicting a chroma block using training samples of the luma block, and determining the samples of the chroma block using the samples of the luma block and the linear model. The luma block and the chroma block are the pixel blocks of a coding unit in the luminance component and the chrominance component, respectively. A digital video encoder usually reads the original video sequence frame by frame and divides each image into coding tree units (CTUs), which can be further divided into coding units (CUs) of the same or different sizes; the actual coding is performed on the coding units of the different components. The relationship between coding tree units and coding units is shown in FIG. 3.
Example of cross-component prediction (CCP): in the Versatile Video Coding (VVC) standard, a cross-component linear model (CCLM) is used to reduce redundancy between components. The linear model is trained on the original samples and reconstructed samples of the pixels neighboring the original pixel block of the luminance component of the current coding unit; the sample information of these neighboring pixels includes the original and reconstructed samples of the upper neighboring pixels, the upper-right neighboring pixels, the left neighboring pixels, and the lower-left neighboring pixels of the original pixel block of the luminance component of the current coding unit. FIG. 4 shows an example of the positional relationship between an 8 × 8 original pixel block of the luminance component and its neighboring pixels, and the corresponding 4 × 4 original prediction pixel block of the chrominance component and its neighboring pixels, in the YUV 4:2:0 color format.
In the current coding unit, the prediction samples of the pixels in the chroma component prediction block are obtained by performing linear model calculation and downsampling on reconstructed samples of the pixels in the original pixel block of the luminance component of the current coding unit, wherein the linear model calculation process is represented as follows:
Pred_C(i, j) = α · Rec_L(i, j) + β  (1)

where (i, j) are the coordinates of a pixel; i is the abscissa within the prediction block of the chrominance component of the current coding unit, ranging over [0, width − 1] with step 1, where width is the width of that prediction block and may take the values 4, 8, 16, or 32; j is the ordinate, ranging over [0, height − 1] with step 1, where height is the height of that prediction block and may take the values 4, 8, 16, or 32. Rec_L is a reconstructed sample of a pixel in the original pixel block of the luminance component, Pred_C is a prediction sample of a pixel in the prediction block of the chrominance component, and α and β are the coefficients of the linear model.
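A minimal Python sketch of equation (1), assuming the reconstructed luma samples and the coefficients α and β are already available (the function and variable names are illustrative, not taken from any reference implementation):

```python
# Sketch of equation (1): Pred_C(i, j) = alpha * Rec_L(i, j) + beta.
# rec_l is a 2-D list of reconstructed luma samples; alpha and beta are the
# linear-model coefficients; outputs are clipped to the valid sample range.
def linear_model_predict(rec_l, alpha, beta, bit_depth=8):
    max_val = (1 << bit_depth) - 1
    pred_c = [[0] * len(rec_l[0]) for _ in rec_l]
    for j, row in enumerate(rec_l):
        for i, sample in enumerate(row):
            pred_c[j][i] = min(max(int(alpha * sample + beta), 0), max_val)
    return pred_c
```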
Another example of cross-component prediction is the two-step cross-component prediction mode (TSCPM) in the cross-component technical proposal M4612 newly adopted by the Chinese digital video coding standard (AVS). In the encoding process, as shown in FIG. 5, up to 65 intra prediction modes are evaluated for the intra-coded luminance component, where DC denotes the mean mode, Plane the plane mode, Bilinear the bilinear mode, and Zone an angular zone. The best result is selected according to the rate-distortion (RD) cost, and the intra prediction mode, the corresponding prediction residual, and so on are transmitted. When cross-component prediction is performed for the pixels of the prediction block of the chrominance component, the reconstructed samples of the pixels neighboring the original pixel block of the luminance component of the current coding unit and the reconstructed samples of the pixels neighboring the original prediction pixel block of the chrominance component of the current coding unit are used to calculate the linear model. The neighboring pixels of the original pixel block of the luminance component include the upper and left neighboring pixels of that block; the neighboring pixels of the prediction block of the chrominance component include the upper and left neighboring pixels of the prediction block of the chrominance component of the current coding unit.
When reconstructed samples are selected as reference samples for calculating the coefficients of the linear model, depending on the availability of the reconstructed samples of the neighboring pixels, one may use a combination of the reconstructed samples of two upper neighboring pixels and two left neighboring pixels, the reconstructed samples of four upper neighboring pixels, or the reconstructed samples of four left neighboring pixels.
According to these different choices of reference samples, the prediction modes are as follows. If the reconstructed samples of the upper neighboring pixels and of the left neighboring pixels of the original pixel blocks of the luminance and chrominance components corresponding to the current coding unit (collectively referred to as the original pixel block for convenience) are both available and the reference samples used for the coefficient calculation of the linear model come from both the upper and left neighboring pixels, or if only the reconstructed samples of the upper neighboring pixels are available and the reference samples come only from the upper neighboring pixels, or if only the reconstructed samples of the left neighboring pixels are available and the reference samples come only from the left neighboring pixels, the mode is the TSCPM mode. If the reconstructed samples of both the upper and left neighboring pixels of the original pixel block corresponding to the current coding unit are available but the reference samples used for the coefficient calculation of the linear model are taken only from the upper neighboring pixels, the mode is the TSCPM_T mode. If the reconstructed samples of both the upper and left neighboring pixels are available but the reference samples used for the coefficient calculation of the linear model are taken only from the left neighboring pixels, the mode is the TSCPM_L mode.
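The mode determination described above can be summarized by the following sketch (the mode names follow the text; the function and the availability flags are hypothetical):

```python
# Sketch: which neighboring sides supply reference samples for each TSCPM variant.
def tscpm_reference_sides(mode, top_available, left_available):
    if mode == "TSCPM":      # uses whichever of the two sides is available
        return [side for side, ok in (("top", top_available),
                                      ("left", left_available)) if ok]
    if mode == "TSCPM_T":    # both sides available, only the top is used
        return ["top"] if (top_available and left_available) else []
    if mode == "TSCPM_L":    # both sides available, only the left is used
        return ["left"] if (top_available and left_available) else []
    raise ValueError("unknown mode")
```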
For the reference samples used to calculate the coefficients of the linear model, as shown in FIG. 6: if the reference samples come from the neighboring pixels on both sides of the original pixel block corresponding to the current coding unit, the upper reference samples are the reconstructed sample of the leftmost pixel and the reconstructed sample of the rightmost pixel (at the block width) among the upper neighboring pixels, and the left reference samples are the reconstructed sample of the topmost pixel and the reconstructed sample of the bottommost pixel (at the block height) among the left neighboring pixels. If the reference samples come only from the upper side, the reconstructed samples of the pixels at four consecutive step positions among the upper neighboring pixels are selected, with one quarter of the width of the original pixel block corresponding to the current coding unit as the step. If the reference samples come only from the left side, the reconstructed samples of the pixels at four consecutive step positions among the left neighboring pixels are selected, with one quarter of the height of the original pixel block corresponding to the current coding unit as the step.
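The position selection can be sketched as follows, with coordinates relative to the top-left pixel of the current block, a row index of −1 denoting the upper neighboring row, and a column index of −1 denoting the left neighboring column (an illustrative convention, not from the specification):

```python
# Sketch: positions of the four reference samples used for the linear-model coefficients.
# (x, y) is relative to the top-left pixel of the block; y == -1 is the upper
# neighboring row and x == -1 is the left neighboring column.
def reference_positions(width, height, sides):
    if sides == ("top", "left"):
        # Leftmost and rightmost upper neighbors; topmost and bottommost left neighbors.
        return [(0, -1), (width - 1, -1), (-1, 0), (-1, height - 1)]
    if sides == ("top",):
        step = width // 4    # quarter-width step, four consecutive step positions
        return [(k * step, -1) for k in range(4)]
    if sides == ("left",):
        step = height // 4   # quarter-height step, four consecutive step positions
        return [(-1, k * step) for k in range(4)]
    raise ValueError("unknown side combination")
```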
In the specific AVS3 example above, the linear model of the cross-component technique is calculated by the same formula (1) above, where α and β can be calculated as follows:

α = (Y_Max − Y_Min) / (X_Max − X_Min)  (2)

β = Y_Min − α · X_Min  (3)

where Y_Max is the average of the two largest reconstructed samples among the reconstructed samples of the neighboring pixels of the original pixel block of the chrominance component used to calculate the coefficients of the linear model, and Y_Min is the average of the two smallest such reconstructed samples; X_Max is the average of the two largest reconstructed samples among the reconstructed samples of the neighboring pixels of the original pixel block of the luminance component used to calculate the coefficients of the linear model, and X_Min is the average of the two smallest such reconstructed samples.
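A sketch of equations (2) and (3), following the text's description of Y_Max, Y_Min, X_Max, and X_Min (real codecs typically use fixed-point arithmetic and may pair the luma and chroma samples differently; this floating-point version is illustrative only):

```python
# Sketch of equations (2) and (3): derive alpha and beta from the reference samples.
def linear_model_coefficients(luma_refs, chroma_refs):
    x_sorted = sorted(luma_refs)     # reference samples from the luma neighbors
    y_sorted = sorted(chroma_refs)   # reference samples from the chroma neighbors
    x_min = (x_sorted[0] + x_sorted[1]) / 2.0    # average of the two smallest
    x_max = (x_sorted[-2] + x_sorted[-1]) / 2.0  # average of the two largest
    y_min = (y_sorted[0] + y_sorted[1]) / 2.0
    y_max = (y_sorted[-2] + y_sorted[-1]) / 2.0
    alpha = (y_max - y_min) / (x_max - x_min) if x_max != x_min else 0.0  # equation (2)
    beta = y_min - alpha * x_min                                          # equation (3)
    return alpha, beta
```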
Cross-component prediction is then performed according to the calculated linear model: the luminance component reconstructed block of the current CU is used to generate the corresponding chroma reference prediction block (Chroma Reference Prediction Pixel Block). Specifically, a reference prediction sample of the chrominance component is calculated for each pixel of the current coding unit according to equations (1), (2), and (3); the size of the reference prediction block of the chrominance component is the same as the size of the original pixel block of the luminance component. In a specific example, the input digital video color format is typically YUV 4:2:0, i.e., the chrominance component prediction block is one quarter the size of the original pixel block of the luminance component. To obtain a correspondingly sized chrominance component prediction block, the chrominance component reference prediction block must be down-sampled by half in both the horizontal and vertical directions; the down-sampled chrominance component prediction block is then one quarter of the original pixel block of the corresponding luminance component, satisfying the size requirement of the color format constraint. The filter used to down-sample the chrominance component reference prediction block is a two-tap down-sampling filter with identical coefficients in the left boundary pixel region of the chrominance component reference prediction block, and a six-tap down-sampling filter with two different coefficients in the other pixel regions.
The six-tap down-sampling filter with two different coefficients is shown in equation (4):

P_C(x, y) = (2·P′_C(2x, 2y) + 2·P′_C(2x, 2y+1) + P′_C(2x−1, 2y) + P′_C(2x+1, 2y) + P′_C(2x−1, 2y+1) + P′_C(2x+1, 2y+1) + 4) >> 3  (4)

where x, y are the coordinates of the pixel, P′_C is a sample of the chrominance component reference prediction block of the current pixel, and P_C is the prediction sample of the chrominance component of the current pixel.

The two-tap down-sampling filter with identical coefficients is shown in equation (5):

P_C(x, y) = (P′_C(2x, 2y) + P′_C(2x, 2y+1) + 1) >> 1  (5)

where x, y are the coordinates of the pixel, P′_C is a sample of the chrominance component reference prediction block of the current pixel, and P_C is the prediction sample of the chrominance component of the current pixel.
The down-sampling filters are shown in FIG. 7, where ×1 denotes multiplication by 1 and ×2 denotes multiplication by 2. FIG. 8 shows a schematic diagram of the cross-component technique from the luminance component reconstructed block to the chrominance component prediction block, where the luminance component reconstructed block of the coding unit is 8 × 8, the corresponding chrominance component reference prediction block is 8 × 8, and the filtered chrominance component prediction block is 4 × 4.
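A sketch of this down-sampling step using equations (4) and (5), assuming an 8 × 8 chrominance component reference prediction block as in FIG. 8 (illustrative only; equation (5) is applied in the left boundary column and equation (4) elsewhere):

```python
# Sketch: halve the chroma reference prediction block in both directions.
# ref is a 2-D list of size 2H x 2W; the result is H x W.
def downsample_reference_block(ref):
    h, w = len(ref) // 2, len(ref[0]) // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if x == 0:
                # Left boundary column: two-tap filter of equation (5).
                out[y][x] = (ref[2 * y][0] + ref[2 * y + 1][0] + 1) >> 1
            else:
                # Other positions: six-tap filter of equation (4), coefficients 1 and 2.
                out[y][x] = (2 * ref[2 * y][2 * x] + 2 * ref[2 * y + 1][2 * x]
                             + ref[2 * y][2 * x - 1] + ref[2 * y][2 * x + 1]
                             + ref[2 * y + 1][2 * x - 1] + ref[2 * y + 1][2 * x + 1]
                             + 4) >> 3
    return out
```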
In general, natural video contains feature information in various angular directions. In the intra coding process of an image, the original pixel block of the luminance component can be evaluated under intra prediction modes of multiple angles, and the best intra prediction mode is selected to obtain a better angular prediction effect and improve compression efficiency. In the conventional image coding and decoding technique, the luminance component reconstructed block predicted for the current coding unit is linearly transformed to obtain the chrominance component reference prediction block of the current coding unit, which contains much feature information of the luminance component, such as directional characteristics. This temporary chrominance component prediction block is down-sampled with a single fixed filter; salient feature information of the current video content is not taken into account, all content is uniformly blurred, and this single fixed down-sampling approach weakens the information in the various directions of the chrominance component and reduces video compression efficiency.
To solve the above technical problem, the present application provides the following design idea. In digital video encoding and decoding, the intra prediction processes of the luminance component and the chrominance component differ greatly: the luminance component evaluates up to 62 angular prediction modes and 3 non-angular prediction modes and selects the best intra prediction mode for transmission, while the intra prediction of the chrominance component evaluates at most 6 prediction modes; the intra prediction process of the luminance component is therefore more accurate. Within the same coding unit, the luminance component and the chrominance component are consistent in their features; that is, the main feature information of the image content remains consistent across components and is not changed by a linear transformation. Therefore, under the best mode selected by evaluating 65 intra prediction modes for the luminance component, the reference prediction block of the chrominance component obtained by linearly transforming the reconstructed block of the luminance component should still carry the main feature information of the coding unit, and if the prediction of the chrominance component can preserve the main feature information of the corresponding luminance component under this best mode, the compression efficiency of the coding unit can be improved. The technical solution of the present application therefore takes the best intra angular prediction mode of the luminance component of the current coding unit as guiding information and designs the down-sampling filter of the reference prediction block of the chrominance component accordingly, as sketched below.
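The design idea can be sketched as follows; the mapping from intra prediction mode ranges to filter directions is a hypothetical placeholder, since the actual mapping would follow the codec's angular-mode numbering (cf. FIG. 5 and the horizontal, vertical, and diagonal down-sampling of FIGS. 12B, 12C, and 12D):

```python
# Sketch: steer the chroma down-sampling filter by the luma intra prediction mode.
# The mode ranges below are hypothetical placeholders, not values from any standard.
HORIZONTAL_MODES = range(24, 34)   # assumed band of near-horizontal angular modes
VERTICAL_MODES = range(42, 52)     # assumed band of near-vertical angular modes

def select_downsample_direction(luma_intra_mode):
    if luma_intra_mode in HORIZONTAL_MODES:
        return "horizontal"   # taps laid out along the horizontal direction (cf. FIG. 12B)
    if luma_intra_mode in VERTICAL_MODES:
        return "vertical"     # taps laid out along the vertical direction (cf. FIG. 12C)
    return "diagonal"         # remaining directions (cf. FIG. 12D)
```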
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
FIG. 9 is a block diagram of a video coding system 1 of one example described in an embodiment of the present application. As used herein, the term "video coder" refers generically to both video encoders and video decoders. In this application, the term "video coding" or "coding" may refer generically to video encoding or video decoding. The video encoder 100 and the video decoder 200 of the video coding system 1 are used to perform the cross-component prediction method proposed in the present application.
As shown in fig. 9, video coding system 1 includes a source device 10 and a destination device 20. Source device 10 generates encoded video data. Accordingly, source device 10 may be referred to as a video encoding device. Destination device 20 may decode the encoded video data generated by source device 10. Accordingly, the destination device 20 may be referred to as a video decoding device. Various implementations of source device 10, destination device 20, or both may include one or more processors and memory coupled to the one or more processors. The memory can include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures that can be accessed by a computer, as described herein.
Source device 10 and destination device 20 may comprise a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.
Destination device 20 may receive encoded video data from source device 10 via link 30. Link 30 may comprise one or more media or devices capable of moving encoded video data from source device 10 to destination device 20. In one example, link 30 may comprise one or more communication media that enable source device 10 to transmit encoded video data directly to destination device 20 in real-time. In this example, source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 20. The one or more communication media may include wireless and/or wired communication media such as a Radio Frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the internet). The one or more communication media may include a router, switch, base station, or other apparatus that facilitates communication from source device 10 to destination device 20. In another example, encoded data may be output from output interface 140 to storage device 40.
The image codec techniques of this application may be applied to video codecs to support a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (e.g., via the internet), encoding for video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
The video coding system 1 illustrated in fig. 9 is merely an example, and the techniques of this application may be applied to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between an encoding device and a decoding device. In other examples, the data is retrieved from local storage, streamed over a network, and so forth. A video encoding device may encode and store data to a memory, and/or a video decoding device may retrieve and decode data from a memory. In many examples, the encoding and decoding are performed by devices that do not communicate with each other, but merely encode data to and/or retrieve data from memory and decode data.
In the example of fig. 9, source device 10 includes video source 120, video encoder 100, and output interface 140. In some examples, output interface 140 may include a modulator/demodulator (modem) and/or a transmitter. Video source 120 may comprise a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.
Video encoder 100 may encode video data from video source 120. In some examples, source device 10 transmits the encoded video data directly to destination device 20 via output interface 140. In other examples, encoded video data may also be stored onto storage device 40 for later access by destination device 20 for decoding and/or playback.
In the example of fig. 9, destination device 20 includes input interface 240, video decoder 200, and display device 220. In some examples, input interface 240 includes a receiver and/or a modem. Input interface 240 may receive encoded video data via link 30 and/or from storage device 40. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. In general, display device 220 displays decoded video data. The display device 220 may include a variety of display devices, such as a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or other types of display devices.
Although not shown in fig. 9, in some aspects, video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle encoding of both audio and video in a common data stream or separate data streams. In some examples, the MUX-DEMUX unit may conform to the ITU h.223 multiplexer protocol, or other protocols such as the User Datagram Protocol (UDP), if applicable.
Video encoder 100 and video decoder 200 may each be implemented as any of a variety of circuits such as: one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), discrete logic, hardware, or any combinations thereof. If the present application is implemented in part in software, a device may store instructions for the software in a suitable non-volatile computer-readable storage medium and may execute the instructions in hardware using one or more processors to implement the techniques of the present application. Any of the foregoing, including hardware, software, a combination of hardware and software, etc., may be considered one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a respective device.
Fig. 10 is an exemplary block diagram of a video encoder 100 described in embodiments of the present application. The video encoder 100 is used to output the video to the post-processing entity 41. Post-processing entity 41 represents an example of a video entity, such as a media-aware network element (MANE) or a splicing/editing device, that may process the encoded video data from video encoder 100. In some cases, post-processing entity 41 may be an instance of a network entity. In some video encoding systems, post-processing entity 41 and video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to post-processing entity 41 may be performed by the same device that includes video encoder 100. In some examples, post-processing entity 41 is an instance of storage device 40 of fig. 9.
In the example of fig. 10, the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a memory 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103. The prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109. For image block reconstruction, the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111. Filter unit 106 represents one or more loop filters, such as deblocking filters, adaptive loop filters (ALF), and sample adaptive offset (SAO) filters. Although filter unit 106 is shown in fig. 10 as an in-loop filter, in other implementations, filter unit 106 may be implemented as a post-loop filter. In one example, the video encoder 100 may further include a video data memory and a partitioning unit (not shown).
Video encoder 100 receives video data and stores the video data in a video data memory. The partitioning unit partitions the video data into image blocks, and these image blocks may be further partitioned into smaller blocks, e.g. image block partitions based on a quadtree structure or a binary tree structure. Prediction processing unit 108 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes. Prediction processing unit 108 may provide the resulting intra- or inter-coded block to summer 112 to generate a residual block, and to summer 111 to reconstruct the encoded block for use as a reference picture. An intra predictor 109 within prediction processing unit 108 may perform intra-predictive encoding of the current block relative to one or more neighboring blocks in the same frame or slice as the current block to be encoded, to remove spatial redundancy. Inter predictor 110 within prediction processing unit 108 may perform inter-predictive encoding of the current block relative to one or more prediction blocks in one or more reference pictures, to remove temporal redundancy. The prediction processing unit 108 provides information indicating the selected intra or inter prediction mode of the current image block to the entropy encoder 103, so that the entropy encoder 103 encodes the information indicating the selected prediction mode.
After prediction processing unit 108 generates a prediction block for the current image block via inter-prediction, intra-prediction, video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded. Summer 112 represents one or more components that perform this subtraction operation. The residual video data in the residual block may be included in one or more TUs and applied to transformer 101. The transformer 101 transforms the residual video data into residual transform coefficients using a transform such as a Discrete Cosine Transform (DCT) or a conceptually similar transform. Transformer 101 may convert residual video data from a pixel value domain to a transform domain, e.g., the frequency domain.
The transformer 101 may send the resulting transform coefficients to the quantizer 102. Quantizer 102 quantizes the transform coefficients to further reduce the bit rate. In some examples, quantizer 102 may then perform a scan of a matrix that includes quantized transform coefficients. Alternatively, the entropy encoder 103 may perform a scan.
After quantization, the entropy encoder 103 entropy encodes the quantized transform coefficients. For example, the entropy encoder 103 may perform context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy encoding method or technique. After entropy encoding by the entropy encoder 103, the encoded codestream may be transmitted to the video decoder 200, or archived for later transmission or retrieval by the video decoder 200. The entropy encoder 103 may also entropy encode the syntax elements of the current image block to be encoded.
Inverse quantizer 104 and inverse transformer 105 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block of a reference image. The summer 111 adds the reconstructed residual block to the prediction block produced by the inter predictor 110 or the intra predictor 109 to produce a reconstructed image block. The filter unit 106 may be applied to the reconstructed image block to reduce distortion such as blocking artifacts. This reconstructed image block is then stored in memory 107 as a reference block, which may be used by inter predictor 110 to inter predict a block in a subsequent video frame or image.
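To make the path through summer 112, transformer 101, quantizer 102, inverse quantizer 104, inverse transformer 105 and summer 111 concrete, the following is a minimal sketch in Python; the uniform scalar quantizer, the qstep value and all function names are illustrative assumptions, not part of the encoder described here:

import numpy as np
from scipy.fft import dctn, idctn

def encode_and_reconstruct(orig_block, pred_block, qstep=16.0):
    # Summer 112: residual = original minus prediction.
    residual = orig_block.astype(np.float64) - pred_block
    # Transformer 101: 2-D DCT, pixel domain -> frequency domain.
    coeffs = dctn(residual, norm='ortho')
    # Quantizer 102: uniform scalar quantization (an assumption here;
    # real codecs use mode- and position-dependent quantization).
    levels = np.round(coeffs / qstep)
    # Inverse quantizer 104 and inverse transformer 105.
    recon_residual = idctn(levels * qstep, norm='ortho')
    # Summer 111: reconstructed block, later filtered and stored in
    # memory 107 as reference data.
    recon = np.clip(pred_block + recon_residual, 0.0, 255.0)
    return levels, recon.astype(np.uint8)

# Toy 8x8 example with a flat prediction.
orig = np.full((8, 8), 128, dtype=np.uint8)
orig[:, 4:] = 140
pred = np.full((8, 8), 130.0)
levels, recon = encode_and_reconstruct(orig, pred)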
Specifically, the intra predictor 109 may perform intra prediction on the current coding unit, specifically, perform the image coding method provided in the embodiments of the present application. The intra predictor 109 may also provide information indicating the selected intra prediction mode of the current coding unit to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
Fig. 11 is an exemplary block diagram of a video decoder 200 described in the embodiments of the present application. In the example of fig. 11, the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a memory 207. The prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209. In some examples, video decoder 200 may perform a decoding process that is substantially reciprocal to the encoding process described with respect to video encoder 100 of fig. 10.
In the decoding process, video decoder 200 receives an encoded video bitstream representing an image block and associated syntax elements of an encoded video slice from video encoder 100. Video decoder 200 may receive video data from network entity 42 and, optionally, may store the video data in a video data memory (not shown). The video data memory may store video data, such as an encoded video bitstream, to be decoded by components of video decoder 200. The video data stored in the video data memory may be obtained, for example, from storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium. The video data memory may serve as a coded picture buffer (CPB) for storing encoded video data from the encoded video bitstream.
Network entity 42 may be, for example, a server, a MANE (media-aware network element), a video editor/splicer, or other such device for implementing one or more of the techniques described above. Network entity 42 may or may not include a video encoder, such as video encoder 100. Network entity 42 may implement portions of the techniques described in this application before network entity 42 sends the encoded video bitstream to video decoder 200. In some video decoding systems, network entity 42 and video decoder 200 may be parts of separate devices, while in other cases, the functionality described with respect to network entity 42 may be performed by the same device that includes video decoder 200.
The entropy decoder 203 of the video decoder 200 entropy decodes the code stream to generate quantized coefficients and some syntax elements. The entropy decoder 203 forwards the syntax elements to the prediction processing unit 208. Video decoder 200 may receive syntax elements at the video slice level and/or the picture block level. When a video slice is decoded as an intra-decoded (I) slice, intra predictor 209 of prediction processing unit 208 generates a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture. When a video slice is decoded as an inter-decoded (i.e., B or P) slice, the inter predictor 210 of the prediction processing unit 208 may determine an inter prediction mode for decoding a current image block of the current video slice based on syntax elements received from the entropy decoder 203, decode the current image block (e.g., perform inter prediction) based on the determined inter prediction mode.
The inverse quantizer 204 inversely quantizes, i.e., dequantizes, the quantized transform coefficients provided in the codestream and decoded by the entropy decoder 203. The inverse quantization process may include: using the quantization parameter calculated by the video encoder 100 for each image block in the video slice to determine the degree of quantization that was applied and, likewise, the degree of inverse quantization that should be applied. Inverse transformer 205 applies an inverse transform, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, to the transform coefficients in order to generate a residual block in the pixel domain.
After the inter predictor 210 generates a prediction block for the current image block or a sub-block of the current image block, the video decoder 200 obtains a reconstructed block, i.e., a decoded image block, by summing the residual block from the inverse transformer 205 with the corresponding prediction block generated by the inter predictor 210. Summer 211 represents the component that performs this summation operation. A loop filter (in or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality, if desired. Filter unit 206 may represent one or more loop filters, such as deblocking filters, Adaptive Loop Filters (ALF), and Sample Adaptive Offset (SAO) filters. Although the filter unit 206 is shown in fig. 11 as an in-loop filter, in other implementations, the filter unit 206 may be implemented as a post-loop filter.
It should be understood that other structural variations of the video decoder 200 may be used to decode the encoded video stream. For example, the video decoder 200 may generate an output video stream without processing by the filter unit 206; alternatively, for some image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients, and accordingly processing by the inverse quantizer 204 and the inverse transformer 205 is not needed.
Specifically, the intra predictor 209 may use the image decoding method described in the embodiment of the present application in the generation of the prediction block.
Fig. 12A is a flowchart illustrating an image encoding method in an embodiment of the present application, where the image encoding method can be applied to the source device 10 in the video decoding system 1 shown in fig. 9 or the video encoder 100 shown in fig. 10. The flow shown in fig. 12A is explained by taking the video encoder 100 shown in fig. 10 as the execution subject. As shown in fig. 12A, the cross-component prediction method provided in the embodiment of the present application includes:
Step 110, dividing the image, and determining the luminance component intra-frame prediction mode and the chrominance component intra-frame prediction mode of the current coding block.
The color format of the video to which the image belongs includes, but is not limited to, 4:2:0, 4:2:2, and the like.
For example, when the color format is 4:2:0, as shown in (c) of fig. 2, the pixel ratio of the original pixel block of the luminance component to the original pixel block of the chrominance component of the current coding block is 4:1; taking an 8 × 8 pixel array as an example, the size of the corresponding original pixel block of the luminance component is 8 × 8, and the size of the corresponding original pixel block of the chrominance component is 4 × 4.
For another example, when the color format is 4:2:2, as shown in (b) of fig. 2, the pixel ratio of the original pixel block of the luminance component to the original pixel block of the chrominance component of the current coding unit is 2:1; taking an 8 × 8 pixel array as an example, the size of the corresponding original pixel block of the luminance component is 8 × 8, and the size of the corresponding original pixel block of the chrominance component is 8 × 4.
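A minimal helper makes the size relationship concrete; it assumes the 8 × 8 / 8 × 4 sizes above are given as rows × columns (i.e., 4:2:2 halves the chroma resolution horizontally), and the function name is illustrative:

def chroma_block_size(luma_rows, luma_cols, color_format):
    # 4:2:0 halves the chroma resolution in both dimensions;
    # 4:2:2 halves it in one dimension only.
    if color_format == '4:2:0':
        return luma_rows // 2, luma_cols // 2
    if color_format == '4:2:2':
        return luma_rows, luma_cols // 2
    raise ValueError('unsupported color format')

assert chroma_block_size(8, 8, '4:2:0') == (4, 4)  # 8x8 luma -> 4x4 chroma
assert chroma_block_size(8, 8, '4:2:2') == (8, 4)  # 8x8 luma -> 8x4 chroma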
As shown in fig. 5, at most 65 intra-prediction modes are evaluated for intra coding of the luminance component; in a specific implementation, at most 62 angular prediction modes and 3 non-angular prediction modes are evaluated for the luminance component and the optimal intra-prediction mode is selected for transmission, while at most 6 prediction modes are evaluated for intra prediction of the chrominance component. The luminance component intra-frame prediction mode of the current coding block is the prediction mode with the optimal rate-distortion cost among a plurality of intra-frame prediction modes used for intra-frame prediction of the luminance component of the current coding block.
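The selection by optimal rate-distortion cost can be sketched as follows; the Lagrangian cost J = D + λ·R is the standard formulation we assume here, and all names and numbers are illustrative:

def select_best_mode(modes, distortion_of, bits_of, lam):
    # Keep the intra prediction mode minimizing J = D + lambda * R.
    best_mode, best_cost = None, float('inf')
    for mode in modes:
        cost = distortion_of(mode) + lam * bits_of(mode)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode

# Toy usage: three candidate modes with made-up distortion/bit counts.
d = {'DC': 100.0, 'horizontal': 80.0, 'vertical': 90.0}
r = {'DC': 10, 'horizontal': 14, 'vertical': 12}
best = select_best_mode(d, d.get, r.get, lam=2.0)  # -> 'horizontal'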
Step 120, when the chroma component intra-frame prediction mode indicates that the chroma component of the current coding block is derived using the luminance component of the current coding block, determining a reference prediction block of the chroma component of the current coding block according to the luminance component intra-frame prediction mode.
In a specific implementation, the device may determine that the chroma component intra-prediction mode indicates to derive the chroma component of the current coding block using the luma component of the current coding block when it is determined that the luma component intra-prediction mode is a preset intra-prediction mode. The preset intra-prediction modes are luminance component intra-prediction modes in preset directions, including but not limited to a horizontal direction (e.g., along the X-axis in the two-dimensional rectangular coordinate system XoY shown in fig. 1), a vertical direction (e.g., along the negative Y-axis in the two-dimensional rectangular coordinate system XoY shown in fig. 1), and a diagonal direction (e.g., along the direction at −45 degrees to the X-axis in the two-dimensional rectangular coordinate system XoY shown in fig. 1).
Step 130, filtering the reference prediction block of the chroma component of the current coding block to obtain the prediction block of the chroma component of the current coding block.
In a specific implementation, after the prediction block of the chroma component of the current coding block is determined, the device may further calculate a reconstructed block of the chroma component, and determine a reconstructed image block of the current coding block according to the reconstructed block of the chroma component and the reconstructed block of the luma component.
It can be seen that, in the embodiment of the present application, compared with the prior art, the scheme of the present application filters the reference prediction block of the chrominance component of the current coding block in the cross-component prediction mode, which is beneficial to improving the compression efficiency of the coding block and thereby the coding efficiency.
In one possible example, the filtering the reference prediction block for chroma components of the current coding block comprises: determining a filter according to the luminance component intra prediction mode; filtering a reference prediction block for a chroma component of the current coding block using the filter.
In a specific implementation, the device determines a filter according to an intra prediction direction indicated by the luminance component intra prediction mode.
For example, for a frame of image in a video with more prominent vertical image features (such as vertical stripe images), the optimal intra-frame prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the vertical direction strength of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
For another example, for a frame of image in a video with prominent horizontal image features (such as horizontal stripe images), the optimal intra-frame prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the horizontal direction strength of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
For another example, for a frame of image in a video with more prominent image features in the diagonal direction of the image (such as a diagonal stripe image), the optimal intra-frame prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the direction strength in the diagonal direction of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
As can be seen, in this example, the filter is determined according to the intra-frame prediction mode of the luminance component, and the reference prediction block of the chrominance component is filtered by using the filter, so that the prediction of the chrominance component can keep the main feature information in the optimal mode of the corresponding luminance component, that is, the main feature information of the image content keeps higher consistency between the luminance component and the chrominance component, which is beneficial to improving the compression efficiency of the coding block and achieving a better prediction effect.
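In code form, the direction-dependent filter choice of this example (detailed in the examples below) amounts to a simple dispatch; the string labels are placeholders for the first, second and third filters described next:

FIRST_FILTER, SECOND_FILTER, THIRD_FILTER, DEFAULT_FILTER = (
    'first', 'second', 'third', 'default')

def choose_filter(intra_direction):
    # Direction of the optimal luma intra mode -> chroma down-sampling filter.
    if intra_direction == 'horizontal':
        return FIRST_FILTER
    if intra_direction == 'vertical':
        return SECOND_FILTER
    if intra_direction == 'diagonal':
        return THIRD_FILTER
    return DEFAULT_FILTER  # e.g., the ordinary cross-component filter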
In one possible example, the determining a filter according to the luminance component intra prediction mode includes: when the intra prediction direction indicated by the luminance component intra prediction mode is the horizontal direction, the filter is set as a first filter.
In this possible example, the first filter includes a first two-tap filter for filtering an upper boundary pixel region of the reference prediction block of the chroma component and a first three-tap filter for filtering a non-upper boundary pixel region of the reference prediction block of the chroma component.
In this possible example, the first two-tap filter computes the prediction sample PC(x, y) of the chrominance component of the current pixel from reference prediction samples P′C(·, ·) of the chrominance component, where x, y are the coordinates of the current pixel and max(a, b) denotes taking the larger of a and b; the exact filter equation is given in the source only as an image.

In this possible example, the first three-tap filter likewise computes PC(x, y) from three reference prediction samples P′C(·, ·), with the same notation; its exact filter equation is also given in the source only as an image.
For example, fig. 12B shows the down-sampling process in the horizontal direction in an 8 × 8 pixel array, where ×2 denotes multiplication by 2 and ×3 denotes multiplication by 3. Taking the first and second rows of pixels in the reference prediction block of the chrominance component as an example: first, pixels a, b, i and j are down-sampled using the first two-tap filter to form pixel 1 of the prediction block of the chrominance component; then, moving in the horizontal direction, pixels b, c, d, j, k and l are down-sampled using the first three-tap filter to form pixel 2; next, skipping by a step of 2, pixels d, e, f, l, m and n are down-sampled using the first three-tap filter to form pixel 3; and again skipping by a step of 2, pixels f, g, h, n, o and p are down-sampled using the first three-tap filter to form pixel 4. The remaining rows are processed in the same way.
As can be seen, in this example, for an image in a video with a prominent horizontal image feature (such as a horizontal stripe image), the optimal intra prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the horizontal direction strength of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
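Below is a sketch of this horizontal down-sampling pass in Python. Because the exact equations of the first filter appear in the source only as images, the tap weights used here (2 on the outer column and 3 on the two inner columns, echoing the ×2/×3 annotations of fig. 12B and the 3/3/2 weighting of the third filter quoted later) are an assumption for illustration, not the patented coefficients:

import numpy as np

def horizontal_downsample(ref, out_w, out_h):
    # ref: (2*out_h) x (2*out_w) reference prediction block of the
    # chroma component; returns the out_h x out_w prediction block.
    ref = ref.astype(np.int32)
    out = np.zeros((out_h, out_w), dtype=np.int32)
    for y in range(out_h):
        for x in range(out_w):
            if x == 0:
                # Two-tap boundary case: average of the 2x2 region
                # (pixels a, b, i, j in fig. 12B).
                s = (ref[2*y, 0] + ref[2*y, 1]
                     + ref[2*y+1, 0] + ref[2*y+1, 1])
                out[y, x] = (s + 2) >> 2
            else:
                # Three-tap case over columns 2x-1, 2x, 2x+1 of the
                # current row pair (e.g., pixels b, c, d, j, k, l).
                s = (2 * (ref[2*y, 2*x-1] + ref[2*y+1, 2*x-1])
                     + 3 * (ref[2*y, 2*x] + ref[2*y+1, 2*x])
                     + 3 * (ref[2*y, 2*x+1] + ref[2*y+1, 2*x+1]))
                out[y, x] = (s + 8) >> 4  # weights sum to 16
    return out

Under the same assumption, the second (vertical) filter of the next example would be the transpose of this procedure, operating on column pairs instead of row pairs.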
In one possible example, the determining a filter according to the luminance component intra prediction mode includes: when the intra prediction direction indicated by the luminance component intra prediction mode is the vertical direction, the filter is set as a second filter.
In this possible example, the second filter includes a second two-tap filter for filtering a left boundary pixel region of the reference prediction block of the chroma component and a second three-tap filter for filtering a non-left boundary pixel region of the reference prediction block of the chroma component.
In this possible example, the second two-tap filter computes the prediction sample PC(x, y) of the chrominance component of the current pixel from reference prediction samples P′C(·, ·) of the chrominance component, where x, y are the coordinates of the current pixel and max(a, b) denotes taking the larger of a and b; the exact filter equation is given in the source only as an image.

In this possible example, the second three-tap filter likewise computes PC(x, y) from three reference prediction samples P′C(·, ·), with the same notation; its exact filter equation is also given in the source only as an image.
For example, fig. 12C shows the down-sampling process in the vertical direction in an 8 × 8 pixel array, where ×2 denotes multiplication by 2 and ×3 denotes multiplication by 3. Taking the first and second columns of pixels in the reference prediction block of the chrominance component as an example: first, pixels a, b, c and d are down-sampled using the second two-tap filter to form pixel 1 of the prediction block of the chrominance component; then, moving in the vertical direction, pixels c, d, e, f, g and h are down-sampled using the second three-tap filter to form pixel 2; next, skipping by a step of 2, pixels g, h, i, j, k and l are down-sampled using the second three-tap filter to form pixel 3; and again skipping by a step of 2, pixels k, l, m, n, o and p are down-sampled using the second three-tap filter to form pixel 4. The remaining columns are processed in the same way.
As can be seen, in this example, for an image in a video with a more prominent vertical image feature (such as a vertical stripe image), the optimal intra-frame prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the vertical directional strength of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
In one possible example, the determining a filter according to the luminance component intra prediction mode includes: when the intra prediction direction indicated by the luminance component intra prediction mode is a diagonal direction, the filter is set as a third filter.
In this possible example, the third filter includes a third two-tap filter for filtering a left boundary pixel region and an upper boundary pixel region of the chroma component reference prediction block and a third three-tap filter for filtering a pixel region of the chroma component reference prediction block other than the left boundary pixel region and the upper boundary pixel region.
In this possible example, the third two-tap filter comprises:

PC(x,y)=(P′C(2x,2y)+P′C(2x+1,2y+1)+1)>>1;

wherein x, y are the coordinates of the current pixel, P′C(2x,2y) and P′C(2x+1,2y+1) are reference prediction samples of the chrominance component of the current pixel, and PC(x,y) is the prediction sample of the chrominance component of the current pixel.

In this possible example, the third three-tap filter comprises:

PC(x,y)=(3×P′C(2x,2y)+3×P′C(2x+1,2y+1)+2×P′C(2x-1,2y-1)+4)>>3;

wherein x, y are the coordinates of the current pixel, P′C(2x,2y), P′C(2x+1,2y+1) and P′C(2x-1,2y-1) are reference prediction samples of the chrominance component of the current pixel, and PC(x,y) is the prediction sample of the chrominance component of the current pixel.
For example, fig. 12D shows the down-sampling process in the diagonal direction in an 8 × 8 pixel array, where ×2 denotes multiplication by 2 and ×3 denotes multiplication by 3. Taking the diagonal pixels in the reference prediction block of the chrominance component as an example: first, pixels a, b, c and d are down-sampled using the third two-tap filter to form pixel 1 of the prediction block of the chrominance component; then, along the diagonal direction, pixels d, e, f, g, h, i, j, k and l are down-sampled using the third three-tap filter to form pixel 2; next, skipping by a step of 2, pixels l, m, n, o, p, q, r, s and t are down-sampled using the third three-tap filter to form pixel 3; and again skipping by a step of 2, pixels t, u, v, w, x, y, z, A and B are down-sampled using the third three-tap filter to form pixel 4. The remaining pixels are processed in the same way.
As can be seen, in this example, for an image in a video with more prominent image features in a diagonal direction of the image (such as a diagonal stripe image), the optimal intra prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the direction strength in the diagonal direction of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
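Since the equations of the third filter survive verbatim above, a sketch can transcribe them directly; only the decision of where to apply the two-tap boundary variant (here, the left and upper boundary pixel regions, i.e., x == 0 or y == 0) is our reading of the description:

import numpy as np

def diagonal_downsample(ref, out_w, out_h):
    # ref: (2*out_h) x (2*out_w) reference prediction block of the
    # chroma component; returns the out_h x out_w prediction block.
    ref = ref.astype(np.int32)
    out = np.zeros((out_h, out_w), dtype=np.int32)
    for y in range(out_h):
        for x in range(out_w):
            if x == 0 or y == 0:
                # Third two-tap filter (left/upper boundary):
                # PC(x,y) = (P'C(2x,2y) + P'C(2x+1,2y+1) + 1) >> 1
                out[y, x] = (ref[2*y, 2*x] + ref[2*y+1, 2*x+1] + 1) >> 1
            else:
                # Third three-tap filter (interior):
                # PC(x,y) = (3*P'C(2x,2y) + 3*P'C(2x+1,2y+1)
                #            + 2*P'C(2x-1,2y-1) + 4) >> 3
                out[y, x] = (3 * ref[2*y, 2*x]
                             + 3 * ref[2*y+1, 2*x+1]
                             + 2 * ref[2*y-1, 2*x-1] + 4) >> 3
    return out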
In one possible example, the determining a reference prediction block for a chroma component of the current coding block according to the luma component intra prediction mode comprises: determining a reconstructed block of the luminance component of the current coding block according to the luminance component intra-frame prediction mode; and determining a reference prediction block of the chrominance component of the current coding block according to the reconstructed block of the luminance component of the current coding block.
Wherein a size of the reference prediction block of the chrominance component is the same as a size of the reconstructed block of the luminance component. For example, the reconstructed block of the luminance component and the reference prediction block of the chrominance component in the prediction process shown in fig. 8 are 8 × 8 pixel arrays.
In this possible example, the determining a reference prediction block for a chroma component of the current coding block from a reconstructed block for a luma component of the current coding block comprises: determining a linear model for performing cross-component prediction by using the reconstructed block of the luminance component of the current coding block; and applying the linear model to the reconstructed block of the luminance component to obtain the reference prediction block of the chrominance component of the current coding block.
The linear model may be, for example, the linear model of the foregoing formula (1).
In this possible example, the determining a linear model for cross-component prediction using a reconstructed block of the luma component of the current coding block includes: determining reference pixels for computing the linear model, the reference pixels comprising at least one neighboring pixel of the current coding block; and calculating the linear model from the reference pixels.
Optionally, if the current coding block is a partial image block in the current coding unit, the device may select a linear model adapted to the current coding block from a plurality of linear models, specifically according to the image characteristics. Since the coefficients of the linear model are not yet determined, the linear model needs to be calculated from the reference pixels. In this way, for the chroma component prediction of the current coding block, the device can provide a prediction mechanism that is more refined than prediction at the coding-unit level, realizing more refined image prediction.
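As an illustration of computing a linear model from reference pixels, the sketch below assumes the model of formula (1) has the common cross-component form pred_C = α·rec_L + β and fits α and β by least squares over the reference pixel pairs; the fitting method and all names are assumptions, not the derivation this specification mandates:

import numpy as np

def fit_linear_model(luma_ref, chroma_ref):
    # luma_ref, chroma_ref: reconstructed reference samples of the
    # luma component and the co-located chroma samples (1-D arrays).
    L = np.asarray(luma_ref, dtype=np.float64)
    C = np.asarray(chroma_ref, dtype=np.float64)
    var = np.mean(L * L) - np.mean(L) ** 2
    alpha = 0.0 if var == 0.0 else (
        (np.mean(L * C) - np.mean(L) * np.mean(C)) / var)
    beta = np.mean(C) - alpha * np.mean(L)
    return alpha, beta

def apply_linear_model(luma_recon, alpha, beta, bit_depth=8):
    # Reference prediction block of the chroma component, produced at
    # the resolution of the luma reconstructed block (before filtering).
    pred = alpha * luma_recon.astype(np.float64) + beta
    return np.clip(np.rint(pred), 0, (1 << bit_depth) - 1).astype(np.int32)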
In this possible example, the determining a reference pixel for computing the linear model comprises: and determining a reference pixel for calculating the linear model according to available information of reconstructed samples of adjacent pixels of the current coding block and the chroma component intra-frame prediction mode.
Wherein the intra prediction mode of the chroma component of the current coding block is any one of TSCPM, TSCPM_T and TSCPM_L. The availability information specifically includes both sides being available and only one side being available (e.g., only the left side available or only the upper side available). The details are described below.
If the intra prediction mode of the chroma component of the current coding block is TSCPM and the available information is that the reconstructed samples of the upper side neighboring pixels of the original pixel block corresponding to the current coding block and the reconstructed samples of the left side neighboring pixels of the original pixel block of the current coding block are available, the reference neighboring pixels used for calculating the coefficients of the linear model are 2 of the upper side neighboring pixels and 2 of the left side neighboring pixels of the original pixel block, as shown in (a) of fig. 6.
If the intra prediction mode of the chroma component of the current coding block is TSCPM and the available information is that reconstructed samples of upper neighboring pixels of the original pixel block corresponding to the current coding block are available, the reference neighboring pixels used for calculating the coefficients of the linear model are 4 of the upper neighboring pixels of the original pixel block, as shown in (b) of fig. 6.
If the intra prediction mode of the chroma component of the current coding block is TSCPM and the available information indicates that reconstructed samples of left-side neighboring pixels of the original pixel block corresponding to the current coding block are available, the reference neighboring pixels used for calculating the coefficients of the linear model are 4 of the left-side neighboring pixels of the original pixel block, as shown in (c) of fig. 6.
If the intra prediction mode of the chroma component of the current coding block is TSCPM _ T and the reconstructed samples of the upper side neighboring pixels of the original pixel block corresponding to the current coding block and the reconstructed samples of the left side neighboring pixels of the original pixel block corresponding to the current coding block are available, the reference neighboring pixels used for calculating the coefficients of the linear model are 4 of the upper side neighboring pixels of the original pixel block, as shown in (b) of fig. 6.
If the intra prediction mode of the chroma component of the current coding block is TSCPM _ L, and the reconstructed samples of the upper side neighboring pixels of the original pixel block corresponding to the current coding block and the reconstructed samples of the left side neighboring pixels of the original pixel block corresponding to the current coding block are available, the reference neighboring pixels used for calculating the coefficient of the linear model are 4 of the left side neighboring pixels of the original pixel block, as shown in (c) of fig. 6.
It can be seen that, in the present example, the reference neighboring pixels used for calculating the coefficients of the linear model can be flexibly set according to the availability of reconstructed samples of the neighboring pixels and the intra prediction mode of the chrominance component.
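The selection rules above can be summarized as follows (Python sketch); exactly which 2 or 4 positions are taken along each edge is shown in fig. 6, which is not reproduced here, so the leading-position choice below is an illustrative assumption:

def select_reference_pixels(mode, above_available, left_available,
                            above_pixels, left_pixels):
    # above_pixels / left_pixels: reconstructed neighboring samples of
    # the original pixel block, listed from the block corner outward.
    if mode == 'TSCPM':
        if above_available and left_available:
            return above_pixels[:2] + left_pixels[:2]  # 2 + 2, fig. 6(a)
        if above_available:
            return above_pixels[:4]                    # fig. 6(b)
        if left_available:
            return left_pixels[:4]                     # fig. 6(c)
    if mode == 'TSCPM_T' and above_available:
        return above_pixels[:4]                        # fig. 6(b)
    if mode == 'TSCPM_L' and left_available:
        return left_pixels[:4]                         # fig. 6(c)
    return []  # no usable neighbors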
In this possible example, the determining a reference pixel for computing the linear model comprises: determining the reference pixels for calculating the linear model according to the luminance component intra-frame prediction mode with the optimal rate-distortion cost of a neighboring coding block of the current coding block.

The luminance component intra-frame prediction mode with the optimal rate-distortion cost of the neighboring coding block may be the same as or different from that of the current coding unit.
Fig. 13 is a flowchart illustrating an image decoding method according to an embodiment of the present application, corresponding to the image encoding method illustrated in fig. 12A, where the image decoding method can be applied to the destination device 20 in the video decoding system 1 shown in fig. 9 or the video decoder 200 shown in fig. 11. The flow shown in fig. 13 is described by taking the video decoder 200 shown in fig. 11 as the execution subject. As shown in fig. 13, the cross-component prediction method provided in the embodiment of the present application includes:
Step 210, parsing the code stream, and determining the luminance component intra-frame prediction mode and the chrominance component intra-frame prediction mode of the current decoding block.
The color format of the video of the code stream includes, but is not limited to, 4:2:0, 4:2:2, and the like.
In a specific implementation, a syntax element may be obtained from the code stream through entropy decoding, where the syntax element is used to determine the luminance component intra-frame prediction mode and the chrominance component intra-frame prediction mode for predicting the current decoded block. The luminance component intra-frame prediction mode is the optimal luminance component intra-frame prediction mode among a plurality of intra-frame prediction modes used for intra-frame prediction of the luminance component.
Step 220, when the chroma component intra prediction mode indicates to derive a prediction value of a chroma component of the current decoded block using a reconstructed block of a luma component of the current decoded block, determining a reference prediction block of the chroma component of the current decoded block according to the luma component intra prediction mode.
In a specific implementation, the device may determine that the chroma component intra prediction mode indicates to derive the chroma component of the current decoded block using the luma component of the current decoded block when the luma component intra prediction mode is determined to be a preset intra prediction mode. The preset intra-prediction modes are luminance component intra-prediction modes in preset directions, including but not limited to a horizontal direction (e.g., along the X-axis in the two-dimensional rectangular coordinate system XoY shown in fig. 1), a vertical direction (e.g., along the negative Y-axis in the two-dimensional rectangular coordinate system XoY shown in fig. 1), and a diagonal direction (e.g., along the direction at −45 degrees to the X-axis in the two-dimensional rectangular coordinate system XoY shown in fig. 1).
Step 230, filtering the reference prediction block of the chroma component of the current decoded block to obtain the prediction block of the chroma component of the current decoded block.
In a specific implementation, after the prediction block of the chroma component of the current decoding block is determined, the device may further calculate a reconstructed block of the chroma component, and determine a reconstructed image of the current decoding block according to the reconstructed block of the chroma component and the reconstructed block of the luma component.
It can be seen that, in the embodiment of the present application, compared with the prior art, the scheme of the present application performs filtering on the reference prediction block of the chroma component of the current decoding block in the cross-component prediction mode, which is beneficial to improving the compression efficiency of the decoding block, thereby improving the decoding efficiency.
In one possible example, the filtering of the reference prediction block for chroma components of the currently decoded block comprises: determining a filter according to the luminance component intra prediction mode; filtering a reference prediction block for chroma components of the currently decoded block using the filter.
In a specific implementation, the device determines a filter according to an intra prediction direction indicated by the luminance component intra prediction mode.
As can be seen, in this example, the filter is determined according to the intra-frame prediction mode of the luminance component, and the reference prediction block of the chrominance component is filtered by using the filter, so that the prediction of the chrominance component keeps the main feature information in the optimal mode of the corresponding luminance component; that is, the main feature information of the image content keeps higher consistency between the luminance component and the chrominance component, which is beneficial to improving the decoding efficiency of the decoded block and achieving a better prediction effect.
In one possible example, the determining a filter according to the luminance component intra prediction mode includes: when the intra prediction direction indicated by the luminance component intra prediction mode is the horizontal direction, the filter is set as a first filter.
In this possible example, the first filter includes a first two-tap filter for filtering an upper boundary pixel region of the reference prediction block of the chroma component and a first three-tap filter for filtering a non-upper boundary pixel region of the reference prediction block of the chroma component.
In this possible example, the first two-tap filter computes the prediction sample PC(x, y) of the chrominance component of the current pixel from reference prediction samples P′C(·, ·) of the chrominance component, where x, y are the coordinates of the current pixel and max(a, b) denotes taking the larger of a and b; the exact filter equation is given in the source only as an image.

In this possible example, the first three-tap filter likewise computes PC(x, y) from three reference prediction samples P′C(·, ·), with the same notation; its exact filter equation is also given in the source only as an image.
As can be seen, in this example, for an image in a video with a prominent horizontal image feature (such as a horizontal stripe image), the optimal intra prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the horizontal direction strength of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
In one possible example, the determining a filter according to the luminance component intra prediction mode includes: when the intra prediction direction indicated by the luminance component intra prediction mode is the vertical direction, the filter is set as a second filter.
In this possible example, the second filter includes a second two-tap filter for filtering a left boundary pixel region of the reference prediction block of the chroma component and a second three-tap filter for filtering a non-left boundary pixel region of the reference prediction block of the chroma component.
In this possible example, the second two-tap filter computes the prediction sample PC(x, y) of the chrominance component of the current pixel from reference prediction samples P′C(·, ·) of the chrominance component, where x, y are the coordinates of the current pixel and max(a, b) denotes taking the larger of a and b; the exact filter equation is given in the source only as an image.

In this possible example, the second three-tap filter likewise computes PC(x, y) from three reference prediction samples P′C(·, ·), with the same notation; its exact filter equation is also given in the source only as an image.
As can be seen, in this example, for an image in a video with a more prominent vertical image feature (such as a vertical stripe image), the optimal intra-frame prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the vertical directional strength of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
In one possible example, the determining a filter according to the luminance component intra prediction mode includes: when the intra prediction direction indicated by the luminance component intra prediction mode is a diagonal direction, the filter is set as a third filter.
In this possible example, the third filter includes a third two-tap filter for filtering a left boundary pixel region and an upper boundary pixel region of the chroma component reference prediction block and a third three-tap filter for filtering a pixel region of the chroma component reference prediction block other than the left boundary pixel region and the upper boundary pixel region.
In this possible example, the third two-tap filter comprises:

PC(x,y)=(P′C(2x,2y)+P′C(2x+1,2y+1)+1)>>1;

wherein x, y are the coordinates of the current pixel, P′C(2x,2y) and P′C(2x+1,2y+1) are reference prediction samples of the chrominance component of the current pixel, and PC(x,y) is the prediction sample of the chrominance component of the current pixel.

In this possible example, the third three-tap filter comprises:

PC(x,y)=(3×P′C(2x,2y)+3×P′C(2x+1,2y+1)+2×P′C(2x-1,2y-1)+4)>>3;

wherein x, y are the coordinates of the current pixel, P′C(2x,2y), P′C(2x+1,2y+1) and P′C(2x-1,2y-1) are reference prediction samples of the chrominance component of the current pixel, and PC(x,y) is the prediction sample of the chrominance component of the current pixel.
As can be seen, in this example, for an image in a video with more prominent image features in a diagonal direction of the image (such as a diagonal stripe image), the optimal intra prediction mode of the luminance component is used to guide the down-sampling operation of the chrominance component, so that the direction strength in the diagonal direction of the corresponding chrominance component can be increased, and a better prediction effect can be achieved.
In one possible example, the determining a reference prediction block for a chroma component of the currently decoded block according to the luma component intra prediction mode comprises: determining a reconstructed block of the luma component of the current decoded block according to the luma component intra-prediction mode; and determining a reference prediction block for the chroma component of the current decoded block according to the reconstructed block of the luma component of the current decoded block.
Wherein a size of the reference prediction block of the chrominance component is the same as a size of the reconstructed block of the luminance component. For example, the reconstructed block of the luminance component and the reference prediction block of the chrominance component in the prediction process shown in fig. 8 are 8 × 8 pixel arrays.
In this possible example, said determining a reference prediction block for a chroma component of the current decoded block from a reconstructed block for a luma component of the current decoded block comprises: determining a linear model for cross-component prediction by using the reconstructed block of the luminance component of the current decoding block; and applying the linear model to the reconstructed block of the luminance component to obtain the reference prediction block of the chroma component of the current decoding block.
The linear model may be, for example, the linear model of the foregoing formula (1).
In this possible example, the determining a linear model for cross-component prediction using a reconstructed block of a luma component of the current decoded block comprises: determining reference pixels for computing the linear model, the reference pixels comprising at least one neighboring pixel of the currently decoded block; and calculating the linear model from the reference pixels.
Optionally, if the current decoded block is a partial image block in the current coding unit, the device may select a linear model adapted to the current decoded block from a plurality of linear models, specifically according to the image characteristics. Since the coefficients of the linear model are not yet determined, the linear model needs to be calculated from the reference pixels. In this way, for the chroma component prediction of the current decoded block, the device can provide a prediction mechanism that is more refined than prediction at the coding-unit level, realizing more refined image prediction.
In this possible example, the determining a reference pixel for computing the linear model comprises: determining a reference pixel for calculating the linear model according to available information of reconstructed samples of neighboring pixels of the current decoded block and the chroma component intra prediction mode.
Wherein the intra prediction mode of the chroma component of the current decoded block is any one of TSCPM, TSCPM_T and TSCPM_L. The availability information specifically includes both sides being available and only one side being available (e.g., only the left side available or only the upper side available). The details will be described below.
If the intra prediction mode of the chroma component of the current decoded block is TSCPM and the available information is that the reconstructed samples of the upper side neighboring pixels of the original pixel block to which the current decoded block corresponds and the reconstructed samples of the left side neighboring pixels of the original pixel block of the current decoded block are available, the reference neighboring pixels used to calculate the coefficients of the linear model are 2 of the upper side neighboring pixels and 2 of the left side neighboring pixels of the original pixel block, as shown in fig. 6 (a).
If the intra prediction mode of the chroma component of the current decoded block is TSCPM and the available information is that reconstructed samples of upper neighboring pixels of the original pixel block to which the current decoded block corresponds are available, the reference neighboring pixels used for calculating the coefficients of the linear model are 4 of the upper neighboring pixels of the original pixel block, as shown in (b) of fig. 6.
If the intra prediction mode of the chroma component of the current decoded block is TSCPM and the available information is that reconstructed samples of left side neighboring pixels of the original pixel block corresponding to the current decoded block are available, then the reference neighboring pixels used for calculating the coefficients of the linear model are 4 of the left side neighboring pixels of the original pixel block, as shown in (c) of fig. 6.
If the intra prediction mode of the chroma component of the current decoded block is TSCPM _ T and the reconstructed samples of the upper side neighboring pixels of the original pixel block to which the current decoded block corresponds and the reconstructed samples of the left side neighboring pixels of the original pixel block to which the current decoded block corresponds are available, the reference neighboring pixels used to calculate the coefficients of the linear model are 4 of the upper side neighboring pixels of the original pixel block, as shown in (b) of fig. 6.
If the intra prediction mode of the chroma component of the current decoded block is TSCPM _ L, and the reconstructed samples of the upper side neighboring pixel of the original pixel block corresponding to the current decoded block and the reconstructed samples of the left side neighboring pixel of the original pixel block corresponding to the current decoded block are available, the reference neighboring pixels used for calculating the coefficients of the linear model are 4 of the left side neighboring pixels of the original pixel block, as shown in (c) of fig. 6.
It can be seen that, in the present example, the reference neighboring pixels used for calculating the coefficients of the linear model can be flexibly set according to the availability of reconstructed samples of the neighboring pixels and the intra prediction mode of the chrominance component.
In this possible example, the determining a reference pixel for computing the linear model comprises: determining the reference pixels for calculating the linear model according to the luminance component intra-frame prediction mode with the optimal rate-distortion cost of a neighboring decoded block of the current decoded block.

The luminance component intra-frame prediction mode with the optimal rate-distortion cost of the neighboring decoded block may be the same as or different from that of the current decoding unit.
In measuring performance, digital video codec technology mainly uses the coding bit rate (bitrate) and the peak signal-to-noise ratio (PSNR) for comparison. The test experiments of the scheme of the present application were performed under the Common Test Condition (CTC) of the AVS3 reference software High-Performance Model (HPM), and table 1 summarizes the simulation results under the All Intra (AI) configuration.
TABLE 1
Configuration: All Intra (AI)
Y BD-rate: 0.00% (no loss)    U-BDBR gain: 0.05%    V-BDBR gain: 0.19%
As can be seen from the table, the U and V components obtain a coding gain without any loss of the Y component, with U-BDBR and V-BDBR gains of 0.05% and 0.19%, respectively.
The embodiment of the application provides an image encoding device, which may be a video encoder. Specifically, the image encoding device is configured to perform the steps performed by the video encoder in the above encoding method. The image encoding device provided by the embodiment of the application may comprise modules corresponding to the corresponding steps.
The present embodiment may divide the functional modules of the image encoding apparatus according to the above method, for example, each functional module may be divided according to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 14 shows a schematic diagram of a possible structure of the image encoding apparatus according to the above-described embodiment, in a case where each functional module is divided in correspondence with each function. As shown in fig. 14, the image encoding device 14 includes a dividing unit 140, a determining unit 141, and a filtering unit 142.
A dividing unit 140, configured to divide an image, and determine a luminance component intra-frame prediction mode and a chrominance component intra-frame prediction mode of a current coding block;
a determining unit 141, configured to determine a reference prediction block of a chroma component of the current coding block according to the luma component intra prediction mode when the chroma component intra prediction mode indicates that the chroma component of the current coding block is derived using a luma component of the current coding block;
a filtering unit 142, configured to filter the reference prediction block of the chroma component of the current coding block to obtain a prediction block of the chroma component of the current coding block.
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. Of course, the image encoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules, for example: the image encoding apparatus may further include a storage unit 143. The storage unit 143 may be used to store program codes and data of the image encoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of an image encoding device provided in an embodiment of the present application is shown in fig. 15. In fig. 15, the image encoding device 15 includes: a processing module 150 and a communication module 151. The processing module 150 is used for controlling and managing actions of the image encoding apparatus, for example, performing steps performed by the dividing unit 140, the determining unit 141, the filtering unit 142, and/or other processes for performing the techniques described herein. The communication module 151 is used to support interaction between the image encoding apparatus and other devices. As shown in fig. 15, the image encoding apparatus may further include a storage module 152, and the storage module 152 is used for storing program codes and data of the image encoding apparatus, for example, contents stored in the storage unit 143.
The processing module 150 may be a processor or a controller, for example, a Central Processing Unit (CPU), a general-purpose processor, a Digital Signal Processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination implementing computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 151 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 152 may be a memory.
All relevant contents of each scene related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. The image encoding device 14 and the image encoding device 15 can both perform the image encoding method shown in fig. 12A, and the image encoding device 14 and the image encoding device 15 can be specifically a video image encoding device or other devices with video encoding functions.
The application also provides a video encoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image encoding method of the embodiment of the application.
The embodiment of the application provides an image decoding device, which may be a video decoder. Specifically, the image decoding device is configured to perform the steps performed by the video decoder in the above decoding method. The image decoding device provided by the embodiment of the application may comprise modules corresponding to the corresponding steps.
In the embodiment of the present application, the image decoding apparatus may be divided into functional modules according to the above method, for example, each functional module may be divided according to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 16 is a schematic diagram showing a possible configuration of the image decoding apparatus according to the above embodiment, in a case where each functional module is divided in correspondence with each function. As shown in fig. 16, image decoding apparatus 16 includes parsing section 160, determining section 161, and filtering section 162.
An analysis unit 160, configured to analyze the code stream, and determine a luminance component intra-frame prediction mode and a chrominance component intra-frame prediction mode of the current decoded block;
a determining unit 161 configured to determine a reference prediction block for a chroma component of the currently decoded block according to the luma component intra prediction mode when the chroma component intra prediction mode indicates that a prediction value for the chroma component of the currently decoded block is derived using a reconstructed block for the luma component of the currently decoded block;
a filtering unit 162, configured to filter the reference prediction block of the chroma component of the current decoded block to obtain a prediction block of the chroma component of the current decoded block.
For the relevant details of each step of the above method embodiment, reference may be made to the functional description of the corresponding functional module; they are not repeated here. Of course, the image decoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules. For example, the image decoding apparatus may further include a storage unit 163, which may be used to store the program code and data of the image decoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of the image decoding apparatus provided in the embodiments of the present application is shown in fig. 17. In fig. 17, the image decoding apparatus 17 includes a processing module 170 and a communication module 171. The processing module 170 is used to control and manage the actions of the image decoding apparatus, for example, to perform the steps performed by the parsing unit 160, the determining unit 161, and the filtering unit 162, and/or other processes of the techniques described herein. The communication module 171 is used to support interaction between the image decoding apparatus and other devices. As shown in fig. 17, the image decoding apparatus may further include a storage module 172, which is used to store the program code and data of the image decoding apparatus, for example, the contents stored in the storage unit 163.
The processing module 170 may be a processor or a controller, for example a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination of devices providing computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 171 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 172 may be a memory.
For the relevant details of each scenario involved in the method embodiment, reference may be made to the functional description of the corresponding functional module; they are not repeated here. Both the image decoding device 16 and the image decoding device 17 can execute the image decoding method shown in fig. 13, and they may specifically be video image decoding devices or other devices having a video decoding function.
The present application further provides a video decoder, including a nonvolatile storage medium and a central processing unit; the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to implement the image decoding method of the embodiments of the present application.
The present application further provides a terminal, including one or more processors, a memory, and a communication interface. The memory and the communication interface are coupled to the one or more processors; the memory is used to store computer program code, the computer program code comprising instructions which, when executed by the one or more processors, cause the terminal to perform the image encoding and/or image decoding methods of the embodiments of the present application. The terminal can be a video display device, a smart phone, a portable computer, or another device that can process or play video.
Another embodiment of the present application further provides a computer-readable storage medium including one or more program codes, the one or more programs comprising instructions; when a processor in the decoding device executes the program code, the decoding device performs the image encoding method and the image decoding method of the embodiments of the present application.
In another embodiment of the present application, a computer program product is also provided, the computer program product comprising computer-executable instructions stored in a computer-readable storage medium. At least one processor of the decoding device may read the computer-executable instructions from the computer-readable storage medium, and execution of the computer-executable instructions by the at least one processor causes the decoding device to implement the image encoding method and the image decoding method of the embodiments of the present application.
In the above embodiments, the implementation may be realized, in whole or in part, by software, hardware, firmware, or any combination thereof. When a software program is used for the implementation, the implementation may take the form of a computer program product, in whole or in part. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced, in whole or in part.
The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.
Through the above description of the embodiments, it will be clear to those skilled in the art that the above division of functional modules is merely an example used for convenience and brevity of description. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product; the software product is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (45)

1. An image encoding method, comprising:
dividing the image, and determining a brightness component intra-frame prediction mode and a chroma component intra-frame prediction mode of a current coding block;
when the chroma component intra-prediction mode indicates that a chroma component of the current coding block is derived by using a luma component of the current coding block, determining a reference prediction block of the chroma component of the current coding block according to the luma component intra-prediction mode;
and filtering the reference prediction block of the chrominance component of the current coding block to obtain the prediction block of the chrominance component of the current coding block.
2. The method of claim 1, wherein the filtering the reference prediction block for chroma components of the current coding block comprises:
determining a filter according to the luminance component intra prediction mode;
filtering a reference prediction block for a chroma component of the current coding block using the filter.
3. The method of claim 2, wherein the determining a filter according to the luma component intra prediction mode comprises:
when the intra prediction direction indicated by the luminance component intra prediction mode is the vertical direction, the filter is set as a first filter.
4. The method of claim 3, wherein the first filter comprises a first two-tap filter for filtering an upper boundary pixel region of the reference prediction block of the chroma component and a first three-tap filter for filtering a non-upper boundary pixel region of the reference prediction block of the chroma component.
5. The method of claim 4, wherein the first two-tap filter comprises:
[formulas of the first two-tap filter published as images in the original claims; not reproduced here]
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, the image-rendered terms denote the reference prediction samples of the chroma component of the current pixel, PC denotes the prediction sample of the chroma component of the current pixel, and the image-rendered operator denotes taking the larger of its two operands.
6. The method of claim 5, wherein the first three-tap filter comprises:
[formulas of the first three-tap filter published as images in the original claims; not reproduced here]
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, the image-rendered terms denote the reference prediction samples of the chroma component of the current pixel, PC denotes the prediction sample of the chroma component of the current pixel, and the image-rendered operator denotes taking the larger of its two operands.
7. The method of claim 2, wherein the determining a filter according to the luma component intra prediction mode comprises:
when the intra prediction direction indicated by the luminance component intra prediction mode is the horizontal direction, the filter is set as a second filter.
8. The method of claim 7, wherein the second filter comprises a second two-tap filter for filtering a left boundary pixel region of the reference prediction block for the chroma component and a second three-tap filter for filtering a non-left boundary pixel region of the reference prediction block for the chroma component.
9. The method of claim 8, wherein the second two-tap filter comprises:
[formulas of the second two-tap filter published as images in the original claims; not reproduced here]
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, the image-rendered terms denote the reference prediction samples of the chroma component of the current pixel, PC denotes the prediction sample of the chroma component of the current pixel, and the image-rendered operator denotes taking the larger of its two operands.
10. The method of claim 9, wherein the second three-tap filter comprises:
[formulas of the second three-tap filter published as images in the original claims; not reproduced here]
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, the image-rendered terms denote the reference prediction samples of the chroma component of the current pixel, PC denotes the prediction sample of the chroma component of the current pixel, and the image-rendered operator denotes taking the larger of its two operands.
11. The method of claim 2, wherein the determining a filter according to the luma component intra prediction mode comprises:
when the intra prediction direction indicated by the luminance component intra prediction mode is a diagonal direction, the filter is set as a third filter.
12. The method of claim 11, wherein the third filter comprises a third two-tap filter for filtering a left boundary pixel region and an upper boundary pixel region of the chroma component reference prediction block and a third three-tap filter for filtering a pixel region of the chroma component reference prediction block other than the left boundary pixel region and the upper boundary pixel region.
13. The method of claim 12, wherein the third two-tap filter comprises:
PC(x, y) = (P′C(2x, 2y) + P′C(2x+1, 2y+1) + 1) >> 1;
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, and PC(x, y) denotes the prediction sample of the chroma component of the current pixel.
14. The method of claim 13, wherein the third three-tap filter comprises:
PC(x, y) = (3 × P′C(2x, 2y) + 3 × P′C(2x+1, 2y+1) + 2 × P′C(2x-1, 2y-1) + 4) >> 3;
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, and PC(x, y) denotes the prediction sample of the chroma component of the current pixel.
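Since claims 13 and 14 publish the third filter's formulas in text, the following Python sketch applies them directly: the two-tap form on the left and upper boundary pixel regions, and the three-tap form on the remaining pixels. The 2:1 luma-to-chroma resolution and the row-major array layout (row index y, column index x) are assumptions of this sketch.

import numpy as np

def third_filter(ref_pred):
    """Third filter of claims 12-14: two-tap on the left and upper
    boundary pixel regions, three-tap on the remaining pixels.
    ref_pred is the 2H x 2W reference prediction block (integer
    samples); returns the H x W chroma prediction block."""
    H, W = ref_pred.shape[0] // 2, ref_pred.shape[1] // 2
    pred = np.empty((H, W), dtype=np.int32)
    for y in range(H):
        for x in range(W):
            a = int(ref_pred[2 * y, 2 * x])          # P'C(2x, 2y)
            b = int(ref_pred[2 * y + 1, 2 * x + 1])  # P'C(2x+1, 2y+1)
            if x == 0 or y == 0:
                pred[y, x] = (a + b + 1) >> 1        # claim 13 (two-tap)
            else:
                c = int(ref_pred[2 * y - 1, 2 * x - 1])  # P'C(2x-1, 2y-1)
                pred[y, x] = (3 * a + 3 * b + 2 * c + 4) >> 3  # claim 14
    return pred

ref = np.arange(64, dtype=np.int32).reshape(8, 8)
print(third_filter(ref))  # 4x4 chroma prediction block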
15. The method of any of claims 1-14, wherein said determining a reference prediction block for a chroma component of the current coding block according to the luma component intra prediction mode comprises:
determining a reconstructed block of the luma component of the current coding block according to the luma component intra prediction mode;
and determining a reference prediction block of a chrominance component of the current coding block according to a reconstructed block of a luminance component of the current coding block.
16. The method of claim 15, wherein determining the reference prediction block for the chroma component of the current coding block based on the reconstructed block for the luma component of the current coding block comprises:
determining a linear model for performing cross-component prediction by using the reconstructed block of the luma component of the current coding block;
and performing a calculation on the reconstructed block of the luma component according to the linear model to obtain the reference prediction block of the chroma component of the current coding block.
17. The method of claim 16, wherein determining a linear model for cross-component prediction using a reconstructed block of a luma component of the current coding block comprises:
determining a reference pixel for computing the linear model, the reference pixel comprising at least one neighboring pixel of the currently encoded block;
the linear model is calculated from the reference pixels.
18. The method of claim 17, wherein determining the reference pixel for computing the linear model comprises:
determining the reference pixel for calculating the linear model according to availability information of reconstructed samples of neighboring pixels of the current coding block and the chroma component intra prediction mode.
19. The method of claim 17, wherein determining the reference pixel for computing the linear model comprises:
determining the reference pixel for calculating the linear model according to the rate-distortion-optimal luma component intra prediction mode of a neighboring coding block of the current coding block.
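Claims 16 to 19 state that the linear model is computed from neighboring reference pixels but do not fix a derivation. As one hedged illustration, the following Python sketch fits chroma ≈ α·luma + β by least squares, a common choice in cross-component prediction that is assumed here rather than taken from the claims, and then applies the model to the reconstructed luma block as recited in claim 16.

import numpy as np

def fit_linear_model(luma_ref, chroma_ref):
    """Fit chroma ~ alpha * luma + beta over the reference pixels.
    Least squares is an assumed derivation; the claims only state that
    the model is computed from neighboring reference pixels."""
    luma = np.asarray(luma_ref, dtype=np.float64)
    chroma = np.asarray(chroma_ref, dtype=np.float64)
    var = np.mean(luma * luma) - np.mean(luma) ** 2
    alpha = 0.0 if var == 0 else (
        np.mean(luma * chroma) - np.mean(luma) * np.mean(chroma)) / var
    beta = np.mean(chroma) - alpha * np.mean(luma)
    return alpha, beta

def reference_prediction_block(luma_recon, alpha, beta):
    """Claim 16: compute on the reconstructed luma block according to
    the linear model to obtain the chroma reference prediction block."""
    return alpha * luma_recon + beta

# Neighboring reference pixels obeying chroma = 0.5 * luma + 10 exactly.
luma_neighbors = np.array([20, 40, 60, 80])
chroma_neighbors = 0.5 * luma_neighbors + 10
alpha, beta = fit_linear_model(luma_neighbors, chroma_neighbors)
ref_block = reference_prediction_block(
    np.arange(16, dtype=np.float64).reshape(4, 4), alpha, beta)
print(alpha, beta)  # 0.5 10.0 for this synthetic example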
20. An image decoding method, comprising:
parsing the code stream, and determining a luma component intra prediction mode and a chroma component intra prediction mode of the current decoded block;
determining a reference prediction block for a chroma component of the current decoded block according to the luma component intra prediction mode when the chroma component intra prediction mode indicates that a prediction value for a chroma component of the current decoded block is derived using a reconstructed block for a luma component of the current decoded block;
and filtering the reference prediction block of the chroma component of the current decoding block to obtain the prediction block of the chroma component of the current decoding block.
21. The method of claim 20, wherein said filtering the reference prediction block for the chroma components of the currently decoded block comprises:
determining a filter according to the luminance component intra prediction mode;
filtering a reference prediction block for chroma components of the currently decoded block using the filter.
22. The method of claim 21, wherein determining a filter according to the luma component intra prediction mode comprises:
when the intra prediction direction indicated by the luminance component intra prediction mode is the vertical direction, the filter is set as a first filter.
23. The method of claim 22, wherein the first filter comprises a first two-tap filter for filtering an upper boundary pixel region of the reference prediction block of the chroma component and a first three-tap filter for filtering a non-upper boundary pixel region of the reference prediction block of the chroma component.
24. The method of claim 23, wherein the first two-tap filter comprises:
[formulas of the first two-tap filter published as images in the original claims; not reproduced here]
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, the image-rendered terms denote the reference prediction samples of the chroma component of the current pixel, PC denotes the prediction sample of the chroma component of the current pixel, and the image-rendered operator denotes taking the larger of its two operands.
25. The method of claim 24, wherein the first three-tap filter comprises:
[formulas of the first three-tap filter published as images in the original claims; not reproduced here]
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, the image-rendered terms denote the reference prediction samples of the chroma component of the current pixel, PC denotes the prediction sample of the chroma component of the current pixel, and the image-rendered operator denotes taking the larger of its two operands.
26. The method of claim 21, wherein determining a filter according to the luma component intra prediction mode comprises:
when the intra prediction direction indicated by the luminance component intra prediction mode is the horizontal direction, the filter is set as a second filter.
27. The method of claim 26, wherein the second filter comprises a second two-tap filter for filtering a left boundary pixel region of the reference prediction block for the chroma component and a second three-tap filter for filtering a non-left boundary pixel region of the reference prediction block for the chroma component.
28. The method of claim 27, wherein the second two-tap filter comprises:
[formulas of the second two-tap filter published as images in the original claims; not reproduced here]
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, the image-rendered terms denote the reference prediction samples of the chroma component of the current pixel, PC denotes the prediction sample of the chroma component of the current pixel, and the image-rendered operator denotes taking the larger of its two operands.
29. The method of claim 28, wherein the second three-tap filter comprises:
[formulas of the second three-tap filter published as images in the original claims; not reproduced here]
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, the image-rendered terms denote the reference prediction samples of the chroma component of the current pixel, PC denotes the prediction sample of the chroma component of the current pixel, and the image-rendered operator denotes taking the larger of its two operands.
30. The method of claim 21, wherein the determining a filter according to the luma component intra prediction mode comprises:
when the intra prediction direction indicated by the luminance component intra prediction mode is a diagonal direction, the filter is set as a third filter.
31. The method of claim 30, wherein the third filter comprises a third two-tap filter for filtering a left boundary pixel region and an upper boundary pixel region of the chroma component reference prediction block and a third three-tap filter for filtering a pixel region of the chroma component reference prediction block other than the left boundary pixel region and the upper boundary pixel region.
32. The method of claim 31, wherein the third two-tap filter comprises:
PC(x, y) = (P′C(2x, 2y) + P′C(2x+1, 2y+1) + 1) >> 1;
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, and PC(x, y) denotes the prediction sample of the chroma component of the current pixel.
33. The method of claim 32, wherein the third three-tap filter comprises:
PC(x, y) = (3 × P′C(2x, 2y) + 3 × P′C(2x+1, 2y+1) + 2 × P′C(2x-1, 2y-1) + 4) >> 3;
wherein x and y are the coordinates of the current pixel, P′C denotes a prediction sample of the luma component of the current pixel, and PC(x, y) denotes the prediction sample of the chroma component of the current pixel.
34. The method of any of claims 20-33, wherein said determining a reference prediction block for a chroma component of the currently decoded block according to the luma component intra prediction mode comprises:
determining a reconstructed block of the luma component of the current decoded block according to the luma component intra prediction mode;
determining a reference prediction block for a chroma component of the current decoded block according to a reconstructed block of a luma component of the current decoded block.
35. The method of claim 34, wherein said determining a reference prediction block for a chroma component of the current decoded block from a reconstructed block for a luma component of the current decoded block comprises:
determining a linear model for cross-component prediction by using the reconstructed block of the luma component of the current decoded block;
and performing a calculation on the reconstructed block of the luma component according to the linear model to obtain the reference prediction block of the chroma component of the current decoded block.
36. The method of claim 35, wherein determining a linear model for cross-component prediction using a reconstructed block of a luma component of the currently decoded block comprises:
determining reference pixels for computing the linear model, the reference pixels comprising at least one neighboring pixel of the currently decoded block;
the linear model is calculated from the reference pixels.
37. The method of claim 36, wherein determining the reference pixel for computing the linear model comprises:
determining the reference pixel for calculating the linear model according to availability information of reconstructed samples of neighboring pixels of the current decoded block and the chroma component intra prediction mode.
38. The method of claim 36, wherein determining the reference pixel for computing the linear model comprises:
determining the reference pixel for calculating the linear model according to the rate-distortion-optimal luma component intra prediction mode of a neighboring decoded block of the current decoded block.
39. An image encoding device characterized by comprising:
the dividing unit is used for dividing the image and determining a brightness component intra-frame prediction mode and a chroma component intra-frame prediction mode of the current coding block;
a determining unit configured to determine a reference prediction block for a chroma component of the current coding block according to the luma component intra prediction mode when the chroma component intra prediction mode indicates that the chroma component of the current coding block is derived using a luma component of the current coding block;
and the filtering unit is used for filtering the reference prediction block of the chroma component of the current coding block to obtain the prediction block of the chroma component of the current coding block.
40. An image decoding apparatus, comprising:
the decoding unit is used for decoding the video stream and determining a luminance component intra-frame prediction mode and a chrominance component intra-frame prediction mode of a current decoding block;
a determination unit for determining a reference prediction block for a chroma component of the currently decoded block according to the luma component intra prediction mode when the chroma component intra prediction mode indicates that a prediction value for the chroma component of the currently decoded block is derived using a reconstructed block for the luma component of the currently decoded block;
and the filtering unit is used for filtering the reference prediction block of the chroma component of the current decoding block to obtain the prediction block of the chroma component of the current decoding block.
41. An encoder comprising a nonvolatile storage medium and a central processor, wherein the nonvolatile storage medium stores an executable program, the central processor is coupled to the nonvolatile storage medium, and the encoder performs the image encoding method according to any one of claims 1-19 when the executable program is executed by the central processor.
42. A decoder comprising a nonvolatile storage medium and a central processor, wherein the nonvolatile storage medium stores an executable program, the central processor is coupled to the nonvolatile storage medium, and the decoder performs the image decoding method according to any one of claims 20-38 when the executable program is executed by the central processor.
43. A terminal, characterized in that the terminal comprises: one or more processors, a memory, and a communication interface; the memory and the communication interface are coupled to the one or more processors; the terminal communicates with other devices through the communication interface; the memory is configured to store computer program code, the computer program code comprising instructions,
the instructions, when executed by the one or more processors, cause the terminal to perform the method of any of claims 1-38.
44. A computer program product comprising instructions for causing a terminal to perform the method according to any one of claims 1-38 when the computer program product is run on the terminal.
45. A computer-readable storage medium comprising instructions that, when executed on a terminal, cause the terminal to perform the method of any one of claims 1-38.
CN202010206251.0A 2020-03-20 2020-03-20 Image encoding method, image decoding method and related devices Active CN113497937B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010206251.0A CN113497937B (en) 2020-03-20 2020-03-20 Image encoding method, image decoding method and related devices
PCT/CN2021/076059 WO2021185008A1 (en) 2020-03-20 2021-02-08 Encoding method, decoding method, encoder, decoder, and electronic device
PCT/CN2021/081132 WO2021185257A1 (en) 2020-03-20 2021-03-16 Image coding method, image decoding method and related apparatuses

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010206251.0A CN113497937B (en) 2020-03-20 2020-03-20 Image encoding method, image decoding method and related devices

Publications (2)

Publication Number Publication Date
CN113497937A true CN113497937A (en) 2021-10-12
CN113497937B CN113497937B (en) 2023-09-05

Family

ID=77767996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010206251.0A Active CN113497937B (en) 2020-03-20 2020-03-20 Image encoding method, image decoding method and related devices

Country Status (2)

Country Link
CN (1) CN113497937B (en)
WO (2) WO2021185008A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023138562A1 (en) * 2022-01-19 2023-07-27 杭州海康威视数字技术股份有限公司 Image decoding method, image coding method, and corresponding devices
WO2023197192A1 (en) * 2022-04-12 2023-10-19 Oppo广东移动通信有限公司 Encoding method and apparatus, decoding method and apparatus, encoding device, decoding device, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018221631A1 (en) * 2017-06-02 2018-12-06 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Encoding device, decoding device, encoding method, and decoding method
WO2019054200A1 (en) * 2017-09-15 2019-03-21 ソニー株式会社 Image processing device and method
CN109691102A (en) * 2016-08-31 2019-04-26 高通股份有限公司 Across component filters
WO2019198997A1 (en) * 2018-04-11 2019-10-17 엘지전자 주식회사 Intra-prediction-based image coding method and apparatus thereof
CN110839153A (en) * 2018-08-17 2020-02-25 北京字节跳动网络技术有限公司 Simplified cross-component prediction

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100401789C (en) * 2004-06-11 2008-07-09 上海大学 Quick selection of prediction modes in H.264/AVC frame
US9693070B2 (en) * 2011-06-24 2017-06-27 Texas Instruments Incorporated Luma-based chroma intra-prediction for video coding
US8724711B2 (en) * 2011-07-12 2014-05-13 Intel Corporation Luma-based chroma intra prediction
CN109274969B (en) * 2017-07-17 2020-12-22 华为技术有限公司 Method and apparatus for chroma prediction
EP3785437A4 (en) * 2018-04-24 2022-04-20 HFI Innovation Inc. Method and apparatus for restricted linear model parameter derivation in video coding
KR20230008896A (en) * 2018-07-16 2023-01-16 후아웨이 테크놀러지 컴퍼니 리미티드 Video encoder, video decoder, and corresponding encoding and decoding methods
CN117915101A (en) * 2018-09-05 2024-04-19 华为技术有限公司 Chroma block prediction method and device
CN110557621B (en) * 2019-08-27 2022-06-14 咪咕文化科技有限公司 Parameter acquisition method, pixel point pair selection method and related equipment


Also Published As

Publication number Publication date
WO2021185257A1 (en) 2021-09-23
WO2021185008A1 (en) 2021-09-23
CN113497937B (en) 2023-09-05

Similar Documents

Publication Publication Date Title
JP2019505143A (en) Merging filters for multiple classes of blocks for video coding
JP2020501434A (en) Indication of the use of bilateral filters in video coding
JP7143512B2 (en) Video decoding method and video decoder
WO2020220884A1 (en) Intra-frame prediction method and apparatus for video sequence
CN114501010B (en) Image encoding method, image decoding method and related devices
CN111327904B (en) Image reconstruction method and device
WO2021238540A1 (en) Image encoding method, image decoding method, and related apparatuses
CN113785573A (en) Encoder, decoder and corresponding methods using an adaptive loop filter
CN111385572A (en) Prediction mode determining method and device, coding equipment and decoding equipment
WO2021185257A1 (en) Image coding method, image decoding method and related apparatuses
WO2021244197A1 (en) Image encoding method, image decoding method, and related apparatuses
WO2021164014A1 (en) Video encoding method and device
CN113243106A (en) Apparatus and method for intra prediction of prediction block of video image
CN115988205A (en) Method and apparatus for intra prediction
WO2022022622A1 (en) Image coding method, image decoding method, and related apparatus
CN114071161B (en) Image encoding method, image decoding method and related devices
WO2022037300A1 (en) Encoding method, decoding method, and related devices
CN115714861A (en) Video decoder and corresponding method
CN112055211B (en) Video encoder and QP setting method
CN113965764B (en) Image encoding method, image decoding method and related device
KR20200118890A (en) Position dependent spatial variable transformation for video coding
WO2020259330A1 (en) Non-separable transformation method and device
WO2022117036A1 (en) Quantization parameter decoding method and device
WO2023154359A1 (en) Methods and devices for multi-hypothesis-based prediction
WO2024039803A1 (en) Methods and devices for adaptive loop filter

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant