CN114079791A - Encoding method, decoding method and related device - Google Patents

Encoding method, decoding method and related device

Info

Publication number
CN114079791A
Authority
CN
China
Prior art keywords
block
prediction
ref
filter
pixel
Prior art date
Legal status
Withdrawn
Application number
CN202010851865.4A
Other languages
Chinese (zh)
Inventor
谢志煌
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority applications: CN202010851865.4A (published as CN114079791A), CN202180044003.3A (CN115769573A), PCT/CN2021/105014 (WO2022037300A1), TW110126411A (TW202209882A)
Publication of CN114079791A
Legal status: Withdrawn

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117: Filters, e.g. for pre-processing or post-processing
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146: Data rate or code amount at the encoder output
    • H04N19/147: Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/124: Quantisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of the present application disclose an encoding method, a decoding method, and a related apparatus. The encoding method includes the following steps: partitioning an image to obtain coding information of a current coding block; determining, according to a first identification bit and a second identification bit, whether intra prediction smoothing correction is allowed to be used, and padding the original prediction block to obtain a padded first prediction block; filtering each pixel in the original prediction block with a smoothing correction filter according to the first prediction block to obtain a smoothed second prediction block; and, if the rate-distortion cost of the second prediction block is smaller than that of the original prediction block, setting a third identification bit of the current coding block to a first value. The embodiments of the present application provide an option for smoothing or locally blurring intra prediction: for portions of the image texture that do not need to be overly sharp, this technique makes the predicted pixels smoother and the prediction block closer to the original image, and ultimately improves coding efficiency.

Description

Encoding method, decoding method and related device
Technical Field
The present application relates to the field of encoding and decoding technologies, and in particular, to an encoding method, a decoding method, and a related apparatus.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and so forth.
Digital video devices implement video compression techniques such as those described in the standards defined by Moving Picture Experts Group (MPEG)-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), and in the extensions of these standards, to transmit and receive digital video information more efficiently. Video devices may more efficiently transmit, receive, encode, decode, and/or store digital video information by implementing these video codec techniques.
With the proliferation of Internet video, ever higher requirements are placed on the video compression ratio, even as digital video compression technology continues to evolve.
Disclosure of Invention
The embodiments of the present application provide an encoding method, a decoding method, and a related apparatus, which aim to offer an option for smoothing or locally blurring intra prediction: for portions of the image texture that do not need to be overly sharp, this technique makes the predicted pixels smoother and the prediction block closer to the original image, and ultimately improves coding efficiency.
In a first aspect, an embodiment of the present application provides an encoding method, including:
dividing an image, and acquiring coding information of a current coding block, wherein the coding information comprises a first identification bit and a second identification bit, the first identification bit is used for indicating whether the current coding block allows intra-frame prediction filtering, and the second identification bit is used for indicating whether the current coding block allows intra-frame prediction smooth correction;
determining an original prediction block of a current coding block;
determining whether the intra-frame prediction smooth correction is allowed to be used according to the first identification bit and the second identification bit, and performing filling processing on the original prediction block to obtain a first prediction block after filling processing;
filtering each pixel in the original prediction block by using a smoothing correction filter according to the first prediction block to obtain a second prediction block after smoothing correction, calculating the rate distortion cost of the second prediction block, and comparing the rate distortion cost of the second prediction block with the rate distortion cost of the original prediction block:
if the rate distortion cost of the second prediction block is smaller than the rate distortion cost of the original prediction block, setting a third identification bit of the current coding block as a first numerical value, and transmitting the first numerical value through a code stream, wherein the third identification bit is used for indicating whether the current coding block uses the intra-frame prediction smooth correction, and the first numerical value indicates that the current coding block uses the intra-frame prediction smooth correction.
Compared with the prior art, this scheme offers an option for smoothing or locally blurring intra prediction: for portions of the image texture that do not need to be overly sharp, the technique makes the predicted pixels smoother and the prediction block closer to the original image, and ultimately improves coding efficiency.
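As an illustration of the encoder-side decision described in the first aspect, the following sketch assumes hypothetical helper functions pad_prediction_block, smooth_filter and rd_cost standing in for the padding, smoothing-correction filtering and rate-distortion computation; it is a minimal sketch, not the normative procedure of this application.

    def choose_smoothing_correction(orig_pred, first_flag, second_flag):
        # Smoothing correction is only considered when the first and second
        # identification bits both allow it.
        if not (first_flag and second_flag):
            return orig_pred, 0                        # third identification bit = 0
        padded = pad_prediction_block(orig_pred)       # padded first prediction block (hypothetical helper)
        smoothed = smooth_filter(orig_pred, padded)    # smoothed second prediction block (hypothetical helper)
        # Keep the smoothed block only if it lowers the rate-distortion cost.
        if rd_cost(smoothed) < rd_cost(orig_pred):
            return smoothed, 1                         # third identification bit set to the first value
        return orig_pred, 0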
In a second aspect, an embodiment of the present application provides a decoding method, including:
analyzing the code stream, and acquiring a second identification bit of the current decoding block, wherein the second identification bit is used for indicating whether the current decoding block allows using intra-frame prediction smooth correction or not;
if the second identification bit indicates that the intra-frame prediction smooth correction is allowed to be used, analyzing the code stream to obtain a third identification bit of the current decoding block, wherein the third identification bit is used for indicating whether the current decoding block uses the intra-frame prediction smooth correction or not;
analyzing the code stream, acquiring original residual error information of the current decoding block and an intra-frame prediction mode required to be used, performing inverse transformation and inverse quantization on the original residual error information to obtain time domain residual error information, and acquiring an original prediction block of the current decoding block according to the intra-frame prediction mode required to be used and an adjacent reconstruction block of the current decoding block;
if the third identification bit indicates that the intra-frame prediction smooth correction is used, filling the original prediction block to obtain a first prediction block after filling;
filtering each pixel in the original prediction block by using a smooth correction filter according to the first prediction block to obtain a second prediction block after smooth correction;
and superposing the second prediction block on the time domain residual error information to obtain a reconstructed block of the current decoding block.
Compared with the prior art, this scheme offers an option for smoothing or locally blurring intra prediction: for portions of the image texture that do not need to be overly sharp, the technique makes the predicted pixels smoother and the prediction block closer to the original image, and ultimately improves coding efficiency.
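A corresponding decoder-side sketch is given below. It merely mirrors the steps of the second aspect; all parsing and reconstruction helpers (parse_second_flag, parse_third_flag, parse_residual_and_mode, inverse_transform_quantize, intra_predict, pad_prediction_block, smooth_filter) are hypothetical placeholders, not functions defined by this application.

    def decode_block(bitstream, neighbor_recon):
        second_flag = parse_second_flag(bitstream)                    # smoothing correction allowed?
        third_flag = parse_third_flag(bitstream) if second_flag else 0
        residual, intra_mode = parse_residual_and_mode(bitstream)
        time_domain_residual = inverse_transform_quantize(residual)   # inverse transform and inverse quantization
        pred = intra_predict(intra_mode, neighbor_recon)              # original prediction block
        if third_flag:                                                # smoothing correction is used
            padded = pad_prediction_block(pred)                       # padded first prediction block
            pred = smooth_filter(pred, padded)                        # smoothed second prediction block
        return pred + time_domain_residual                            # element-wise addition gives the reconstructed block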
In a third aspect, an embodiment of the present application provides an encoding apparatus, including:
a dividing unit, configured to divide an image, and determine intra prediction filtering indication information of a current encoding block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra prediction filtering mode is allowed to be used, and the first intra prediction filtering mode is an intra prediction filtering IPF mode;
a determining unit, configured to set a first usage flag of the first intra-prediction filtering mode of the current coding block to be allowed to be used if it is determined that the current coding block needs to use the first intra-prediction filtering mode according to the intra-prediction filtering indication information;
a transmission unit, configured to transmit the intra-prediction filtering indication information, the first intra-prediction filtering mode, and the first usage flag via a code stream;
and a superposition unit, configured to determine, according to the intra-prediction filtering indication information and the first usage flag, that the prediction block of the decoded block is obtained using the first intra-prediction filtering mode.
In a fourth aspect, an embodiment of the present application provides a decoding apparatus, including:
the first analysis unit is used for analyzing the code stream and acquiring a second identification bit of the current decoding block, wherein the second identification bit is used for indicating whether the current decoding block allows intra-frame prediction smooth correction or not;
a second parsing unit, configured to parse the code stream to obtain a third flag of the current decoded block if the second flag indicates that the intra-frame prediction smooth correction is allowed to be used, where the third flag is used to indicate whether the current decoded block uses the intra-frame prediction smooth correction;
a third parsing unit, configured to parse the code stream, obtain original residual information of the current decoded block and an intra-frame prediction mode that needs to be used, perform inverse transformation and inverse quantization on the original residual information to obtain time-domain residual information, and obtain an original prediction block of the current decoded block according to the intra-frame prediction mode that needs to be used and an adjacent reconstructed block of the current decoded block;
a padding unit, configured to perform padding on the original prediction block to obtain a padded first prediction block if the third flag indicates that the intra-frame prediction smoothing correction is used;
a smoothing correction unit, configured to filter, according to the first prediction block, each pixel in the original prediction block by using a smoothing correction filter, so as to obtain a second prediction block after smoothing correction;
and the reconstruction unit is used for superposing the second prediction block on the time domain residual error information to obtain a reconstruction block of the current decoding block.
In a fifth aspect, an embodiment of the present application provides an encoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the first aspect.
In a sixth aspect, an embodiment of the present application provides a decoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the second aspect.
In a seventh aspect, an embodiment of the present application provides a terminal, where the terminal includes: one or more processors, a memory, and a communication interface; the memory and the communication interface are coupled to the one or more processors; the terminal communicates with other devices through the communication interface; and the memory is configured to store computer program code comprising instructions which, when executed by the one or more processors, cause the terminal to perform the method according to the first or second aspect.
In an eighth aspect, the present invention provides a computer-readable storage medium, having stored therein instructions, which, when executed on a computer, cause the computer to perform the method of the first or second aspect.
In a ninth aspect, embodiments of the present application provide a computer program product comprising instructions that, when executed on a computer, cause the computer to perform the method of the first or second aspect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic block diagram of a coding tree unit in an embodiment of the present application;
FIG. 2 is a schematic block diagram of a CTU and a coding block CU in an embodiment of the present application;
FIG. 3 is a schematic block diagram of a color format in an embodiment of the present application;
FIG. 4 is a schematic diagram of an IPF in an embodiment of the present application;
FIG. 5 is a diagram illustrating intra prediction filtering according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a video coding system in an embodiment of the present application;
FIG. 7 is a schematic block diagram of a video encoder in an embodiment of the present application;
FIG. 8 is a schematic block diagram of a video decoder in an embodiment of the present application;
FIG. 9a is a flowchart illustrating an encoding method according to an embodiment of the present application;
FIG. 9b is a schematic diagram illustrating the filling of a prediction block according to an embodiment of the present application;
FIG. 9c is a schematic diagram of a prediction block filtering in the embodiment of the present application;
FIG. 9d is a diagram illustrating another prediction block filtering in the embodiment of the present application;
FIG. 9e is a diagram illustrating another prediction block filtering in the embodiment of the present application;
FIG. 9f is a schematic diagram of another prediction block filtering in the embodiment of the present application;
FIG. 9g is a schematic diagram of another prediction block filtering in the embodiment of the present application;
FIG. 9h is a diagram illustrating another prediction block filtering in the embodiment of the present application;
FIG. 9i is a schematic diagram of another prediction block filtering in the embodiment of the present application;
FIG. 9j is a diagram illustrating another prediction block filtering in the embodiment of the present application;
FIG. 9k is a diagram illustrating another prediction block filtering in the embodiment of the present application;
FIG. 9l is a schematic diagram of another prediction block filtering in the embodiment of the present application;
FIG. 9m is a schematic diagram of another prediction block filtering in the embodiment of the present application;
FIG. 9n is a schematic diagram of another prediction block filtering in the embodiment of the present application;
FIG. 10 is a flowchart illustrating a decoding method according to an embodiment of the present application;
FIG. 11 is a block diagram of functional units of an encoding apparatus according to an embodiment of the present application;
FIG. 12 is a block diagram of another functional unit of the encoding apparatus in the embodiment of the present application;
FIG. 13 is a block diagram of functional units of a decoding apparatus according to an embodiment of the present application;
fig. 14 is a block diagram of another functional unit of the decoding apparatus in the embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another.
First, terms used in the embodiments of the present application will be described.
For the partition of an image, in order to more flexibly represent video contents, a Coding Tree Unit (CTU), a Coding Unit (CU), a Prediction Unit (PU), and a Transform Unit (TU) are defined. The CTU, CU, PU, and TU are all image blocks.
A coding tree unit (CTU): an image is composed of a plurality of CTUs. A CTU generally corresponds to a square image area containing the luminance pixels and chrominance pixels of that area (or it may contain only luminance pixels, or only chrominance pixels). The CTU also contains syntax elements that indicate how the CTU is divided into at least one coding unit (CU) and the method for decoding each coding block to obtain a reconstructed picture.
As shown in fig. 1, the picture 10 is composed of a plurality of CTUs (including CTU a, CTU B, CTU C, etc.). The encoded information corresponding to a CTU includes luminance values and/or chrominance values of pixels in a square image region corresponding to the CTU. Furthermore, the coding information corresponding to a CTU may also contain syntax elements indicating how to divide the CTU into at least one CU and the method of decoding each CU to get the reconstructed picture. The image area corresponding to one CTU may include 64 × 64, 128 × 128, or 256 × 256 pixels. In one example, a CTU of 64 × 64 pixels comprises a rectangular pixel lattice of 64 columns of 64 pixels each, each pixel comprising a luminance component and/or a chrominance component. The CTUs may also correspond to rectangular image regions or image regions with other shapes, and an image region corresponding to one CTU may also be an image region in which the number of pixels in the horizontal direction is different from the number of pixels in the vertical direction, for example, including 64 × 128 pixels.
The coding block (CU): as shown in fig. 2, a CTU may be further divided into coding blocks (CUs). A CU generally corresponds to an A × B rectangular region of the image and contains A × B luminance pixels and/or their corresponding chrominance pixels, where A is the width of the rectangle and B is its height; A and B may be equal or different, and their values are generally integer powers of 2, such as 128, 64, 32, 16, 8, or 4. The width referred to in the embodiments of the present application is the length along the X-axis direction (horizontal direction) in the two-dimensional rectangular coordinate system XoY shown in fig. 1, and the height is the length along the Y-axis direction (vertical direction) in the same coordinate system. The reconstructed image of a CU may be obtained by adding a predicted image and a residual image: the predicted image is generated by intra prediction or inter prediction and may be composed of one or more prediction blocks (PBs), and the residual image is generated by inverse quantization and inverse transform processing of transform coefficients and may be composed of one or more transform blocks (TBs). Specifically, a CU contains coding information, including information such as the prediction mode and transform coefficients, and decoding processing such as the corresponding prediction, inverse quantization, and inverse transform is performed on the CU according to this coding information to generate the reconstructed image corresponding to the CU. The relationship between a coding tree unit and its coding blocks is shown in fig. 2.
The prediction unit PU is a basic unit of intra prediction and inter prediction. Defining motion information of an image block to include an inter-frame prediction direction, a reference frame, a motion vector, and the like, wherein the image block undergoing encoding processing is called a Current Coding Block (CCB), the image block undergoing decoding processing is called a Current Decoding Block (CDB), and for example, when one image block is undergoing prediction processing, the current coding block or the current decoding block is a prediction block; when an image block is being residual processed, the currently encoded block or the currently decoded block is a transform block. The picture in which the current coding block or the current decoding block is located is called the current frame. In the current frame, image blocks located on the left side or the upper side of the current block (based on the coordinate system in fig. 1, the left side refers to the negative direction of the X axis, and the upper side refers to the positive direction of the Y axis) may be located inside the current frame and have completed encoding/decoding processing, resulting in reconstructed images, which are called reconstructed blocks; information such as the coding mode of the reconstructed block, the reconstructed pixels, etc. is available (available). A frame in which the encoding/decoding process has been completed before the encoding/decoding of the current frame is referred to as a reconstructed frame. When the current frame is a uni-directionally predicted frame (P frame) or a bi-directionally predicted frame (B frame), it has one or two reference frame lists, respectively, referred to as L0 and L1, each of which contains at least one reconstructed frame, referred to as the reference frame of the current frame. The reference frame provides reference pixels for inter-frame prediction of the current frame.
A transform unit (TU) is used to process the residual between the original image block and the predicted image block.
A pixel refers to a point in the image, for example a pixel in a coding block, a pixel in a luminance component pixel block (also called a luminance pixel), or a pixel in a chrominance component pixel block (also called a chrominance pixel).
A sample (also referred to as a pixel value or sample value) is the value of a pixel: in the luminance component domain the pixel value is the luminance (i.e., the gray-scale value), and in the chrominance component domain the pixel value is the chrominance value (i.e., color and saturation). Depending on the processing stage, a sample of a pixel may specifically be an original sample, a predicted sample, or a reconstructed sample.
Description of directions: the horizontal direction is, for example, the X-axis direction in the two-dimensional rectangular coordinate system XoY shown in fig. 1, and the vertical direction is, for example, the negative Y-axis direction in the same coordinate system.
Intra prediction: a prediction image of the current block is generated from the spatially neighboring pixels of the current block, and an intra prediction mode corresponds to one method of generating the prediction image. The division of the intra prediction unit includes a 2N × 2N division mode and an N × N division mode: in the 2N × 2N mode the image block is not divided, while the N × N mode divides the image block into four equal-sized sub-image blocks.
In general, digital video compression techniques operate on video sequences whose color coding method is YCbCr (also referred to as YUV) in a 4:2:0, 4:2:2, or 4:4:4 color format. Y denotes brightness (Luma), i.e., the gray-scale value, Cb denotes the blue chrominance component, Cr denotes the red chrominance component, and U and V denote chrominance (Chroma), which describes color and saturation. In terms of color format, 4:2:0 means 4 luminance components and 2 chrominance components per 4 pixels (YYYYCbCr), 4:2:2 means 4 luminance components and 4 chrominance components per 4 pixels (YYYYCbCrCbCr), and 4:4:4 means a full-resolution representation (YYYYCbCrCbCrCbCrCbCr). Fig. 3 shows the component distributions in the different color formats, where circles are Y components and triangles are UV components.
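As a small illustration of the sampling ratios above, the sketch below computes the chroma plane dimensions implied by each color format; the function name and interface are illustrative assumptions only.

    def chroma_plane_size(luma_width, luma_height, color_format):
        # Chroma plane dimensions implied by the YUV sampling formats above.
        if color_format == "4:2:0":
            return luma_width // 2, luma_height // 2    # subsampled horizontally and vertically
        if color_format == "4:2:2":
            return luma_width // 2, luma_height         # subsampled horizontally only
        if color_format == "4:4:4":
            return luma_width, luma_height              # full chroma resolution
        raise ValueError("unknown color format")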
The intra-frame prediction part in the digital video coding and decoding mainly refers to the image information of adjacent blocks of a current frame to predict a current coding unit block, calculates residual errors of a prediction block and an original image block to obtain residual error information, and transmits the residual error information to a decoding end through the processes of transformation, quantization and the like. And after receiving and analyzing the code stream, the decoding end obtains residual information through steps of inverse transformation, inverse quantization and the like, and a reconstructed image block is obtained after the residual information is superposed on a predicted image block obtained by prediction of the decoding end. In the process, intra-frame prediction usually predicts a current coding block by means of respective angle mode and non-angle mode to obtain a prediction block, screens out the optimal prediction mode of a current coding unit according to rate distortion information obtained by calculation of the prediction block and an original block, and then transmits the prediction mode to a decoding end through a code stream. And the decoding end analyzes the prediction mode, predicts to obtain a predicted image of the current decoding block and superposes residual pixels transmitted by the code stream to obtain a reconstructed image.
The development of ultra-high-definition digital video places higher requirements on intra prediction, and coding efficiency cannot be improved merely by adding angular prediction modes and extending to wide angles. Current intra angular prediction does not use all reference pixels, and the correlation between some pixels and the current coding unit is easily ignored. Intra prediction filtering (IPF) improves pixel prediction accuracy through point-by-point filtering, which effectively strengthens the spatial correlation and improves the accuracy of intra prediction.
Fig. 4 shows an embodiment of a prediction mode to which intra prediction filtering is applied. URB denotes the boundary pixels of the left neighboring block adjacent to the current coding unit, MRB denotes the boundary pixels of the upper neighboring block adjacent to the current coding unit, and filter direction denotes the filtering direction. For a prediction mode direction running from top right to bottom left, the prediction values of the current coding unit are generated mainly from the reference pixels of the neighboring block in the MRB row above; that is, the prediction pixels of the current coding unit do not refer to the reconstructed pixels of the left neighboring block. However, the current coding unit and the left reconstructed block are spatially adjacent, and referring only to the upper MRB pixels while ignoring the left URB pixels easily loses spatial correlation, which results in a poor prediction effect.
The intra-frame prediction filtering can be applied to all prediction modes of intra-frame prediction, and is a filtering method for improving the intra-frame prediction precision. The intra-frame prediction filtering is mainly realized by the following processes:
a) determining a current prediction mode of the coding unit, the current prediction mode comprising: a horizontal angle prediction mode, a vertical angle prediction mode and a non-angle prediction mode;
b) filtering the input pixels by adopting different filters according to different types of prediction modes;
c) and filtering the input pixel by adopting different filter coefficients according to different distances from the current pixel to the reference pixel.
The input pixel is a prediction pixel obtained in each prediction mode, and the output pixel is a final prediction pixel after intra-frame prediction filtering.
In an embodiment, the enable flag ipf_enable_flag is a binary variable: a value of '1' indicates that intra prediction filtering may be used, and a value of '0' indicates that intra prediction filtering should not be used.
In one embodiment, the usage flag ipf_flag is a binary variable: a value of '1' indicates that intra prediction filtering should be used, and a value of '0' indicates that intra prediction filtering should not be used. If ipf_flag is not present in the code stream, its value defaults to 0.
The syntax for the ipf_flag element is specified in a table that appears only as an image in the original publication and is not reproduced here.
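Since the syntax table itself is not available, the sketch below only illustrates the conditional parsing behaviour stated above, under the assumption that ipf_flag is read only when intra prediction filtering is allowed and defaults to 0 when absent; parse_bit is a hypothetical bitstream helper, not a function defined by the standard text.

    def parse_ipf_flag(bitstream, ipf_enable_flag):
        # ipf_flag is assumed to be present only when ipf_enable_flag allows intra
        # prediction filtering; when absent from the code stream it defaults to 0.
        if ipf_enable_flag == 1:
            return parse_bit(bitstream)    # hypothetical one-bit read
        return 0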
in one embodiment, prediction modes 0, 1 and 2 are classified as non-angular prediction modes, and the prediction pixels are filtered using a first three-tap filter;
classifying the prediction modes 3 to 18 and 34 to 50 into vertical angle prediction modes, and filtering the prediction pixels by using a first two-tap filter;
the prediction modes 19 to 32 and 51 to 65 are classified into horizontal-class angle prediction modes, and the prediction pixels are filtered using the second two-tap filter.
In one embodiment, the first three-tap filter has the following filtering formula:
P′(x,y)=f(x)·P(-1,y)+f(y)·P(x,-1)+(1-f(x)-f(y))·P(x,y)
the first two-tap filter, the filter formula is as follows:
P′(x,y)=f(x)·P(-1,y)+(1-f(x))·P(x,y)
the second two-tap filter has the following filtering formula:
P′(x,y)=f(y)·P(x,-1)+(1-f(y))·P(x,y)
in the above equation, P' (x, y) is the final prediction value of the pixel at the (x, y) position of the current chroma prediction block, f (x) and f (y) are the horizontal filter coefficient of the reconstructed pixel of the reference left-side neighboring block and the vertical filter coefficient of the reconstructed pixel of the reference upper-side neighboring block, respectively, P (-1, y) and P (x, -1) are the reconstructed pixel at the left side of the y row and the reconstructed pixel at the upper side of the x column, respectively, and P (x, y) is the original prediction pixel value in the current chroma component prediction block. Wherein, the values of x and y do not exceed the width and height value range of the current coding unit block.
The values of the horizontal filter coefficient and the vertical filter coefficient are related to the size of the current coding unit block and the distance from the prediction pixel in the current prediction block to the left reconstruction pixel and the upper reconstruction pixel. The values of the horizontal filter coefficient and the vertical filter coefficient are also related to the size of the current coding block, and are divided into different filter coefficient groups according to the size of the current coding unit block.
Table 1 shows the filter coefficients of the intra prediction filtering in one embodiment.
TABLE 1
(The coefficient values are given as an image in the original publication and are not reproduced here.)
Fig. 5 is a schematic diagram illustrating the three filtering cases of intra prediction filtering: filtering the prediction values in the current coding unit with reference to the upper reference pixels only; filtering the prediction values in the current coding unit with reference to the left reference pixels only; and filtering the prediction values in the current coding unit block with reference to both the upper and left reference pixels.
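The three filters above can be summarized in a short sketch. The coefficients fx and fy correspond to f(x) and f(y) from Table 1 (chosen according to block size and the distance from the current pixel to the reference pixels); the function signature and the mode-class strings are illustrative assumptions.

    def ipf_filter_pixel(cur, left_recon, top_recon, fx, fy, mode_class):
        # cur:        original prediction pixel P(x, y)
        # left_recon: reconstructed pixel P(-1, y) of the left neighboring block
        # top_recon:  reconstructed pixel P(x, -1) of the upper neighboring block
        # fx, fy:     horizontal and vertical filter coefficients
        if mode_class == "non_angular":            # modes 0, 1, 2: first three-tap filter
            return fx * left_recon + fy * top_recon + (1 - fx - fy) * cur
        if mode_class == "vertical":               # modes 3-18 and 34-50: first two-tap filter
            return fx * left_recon + (1 - fx) * cur
        return fy * top_recon + (1 - fy) * cur     # modes 19-32 and 51-65: second two-tap filter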
FIG. 6 is a block diagram of a video coding system 1 of one example described in an embodiment of the present application. In embodiments of the present application, the term "video coder" includes: a video encoder and a video decoder. In embodiments of the present application, the terms "video coding" or "coding" may generally refer to video encoding or video decoding. The video encoder 100 and the video decoder 200 of the video coding system 1 are used to implement the encoding method and the decoding method proposed by the present application.
As shown in fig. 6, video coding system 1 includes a source device 10 and a destination device 20. Source device 10 generates encoded video data. Accordingly, source device 10 may be referred to as a video encoding device. Destination device 20 may decode the encoded video data generated by source device 10. Accordingly, the destination device 20 may be referred to as a video decoding device. Various implementations of source device 10 and destination device 20 may include one or more processors and memory coupled to the one or more processors. The memory can include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures that can be accessed by a computer.
Source device 10 and destination device 20 may comprise a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.
Destination device 20 may receive encoded video data from source device 10 via link 30. Link 30 may comprise one or more media or devices capable of moving encoded video data from source device 10 to destination device 20. In one example, link 30 may comprise one or more communication media that enable source device 10 to transmit encoded video data directly to destination device 20 in real-time. In this example, source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 20. The one or more communication media may include wireless and/or wired communication media such as a Radio Frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the internet). The one or more communication media may include a router, switch, base station, or other apparatus that facilitates communication from source device 10 to destination device 20. In another example, encoded data may be output from output interface 140 to storage device 40.
The image codec techniques of embodiments of the present application may be applied to video codecs to support a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (e.g., via the internet), encoding for video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
The video coding system 1 illustrated in fig. 6 is merely an example, and the techniques of embodiments of the present application may be applied to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device. In other examples, the data is retrieved from local storage, streamed over a network, and so forth. A video encoding device may encode and store data to a memory, and/or a video decoding device may retrieve and decode data from a memory. In many examples, the encoding and decoding are performed by devices that do not communicate with each other, but merely encode data to and/or retrieve data from memory and decode data.
In the example of fig. 6, source device 10 includes video source 120, video encoder 100, and output interface 140. In some examples, output interface 140 may include a modulator/demodulator (modem) and/or a transmitter. Video source 120 may comprise a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.
Video encoder 100 may encode video data from video source 120. In some examples, source device 10 transmits the encoded video data directly to destination device 20 via output interface 140. In other examples, encoded video data may also be stored onto storage device 40 for later access by destination device 20 for decoding and/or playback.
In the example of fig. 6, destination device 20 includes input interface 240, video decoder 200, and display device 220. In some examples, input interface 240 includes a receiver and/or a modem. Input interface 240 may receive encoded video data via link 30 and/or from storage device 40. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. In general, display device 220 displays decoded video data. The display device 220 may include a variety of display devices, such as a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or other types of display devices.
Although not shown in fig. 6, in some aspects, video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle encoding of both audio and video in a common data stream or separate data streams.
Video encoder 100 and video decoder 200 may each be implemented as any of a variety of circuits such as: one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), discrete logic, hardware, or any combinations thereof. If the present application is implemented in part in software, a device may store instructions for the software in a suitable non-volatile computer-readable storage medium and may execute the instructions in hardware using one or more processors to implement the techniques of the present application. Any of the foregoing, including hardware, software, a combination of hardware and software, etc., may be considered one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a respective device.
Fig. 7 is an exemplary block diagram of a video encoder 100 as described in an embodiment of the present application. The video encoder 100 is used to output the video to the post-processing entity 41. Post-processing entity 41 represents an example of a video entity, such as a media-aware network element (MANE) or a splicing/editing device, that may process the encoded video data from video encoder 100. In some cases, post-processing entity 41 may be an instance of a network entity. In some video encoding systems, post-processing entity 41 and video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to post-processing entity 41 may be performed by the same device that includes video encoder 100. In some examples, post-processing entity 41 is an instance of storage device 40 of fig. 6.
In the example of fig. 7, the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a memory 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103. The prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109. For image block reconstruction, the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111. Filter unit 106 represents one or more loop filters, such as deblocking filters, Adaptive Loop Filters (ALF), and Sample Adaptive Offset (SAO) filters. In one example, the video encoder 100 may further include a video data memory, a partitioning unit (not shown).
Video encoder 100 receives video data and stores the video data in a video data memory. The partitioning unit partitions the video data into image blocks, which may be further partitioned into smaller blocks, e.g. based on a quadtree structure or a binary tree structure. Prediction processing unit 108 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes. Prediction processing unit 108 may provide the resulting intra-predicted, inter-predicted prediction blocks to summer 112 to generate a residual block, and to summer 111 to reconstruct the encoded block (also referred to as a reconstructed block) used as a reference picture. The intra predictor 109 is used to perform intra-predictive coding of the current image block to remove spatial redundancy, and the inter predictor 110 is used to perform inter-predictive coding of the current image block to remove temporal redundancy. The prediction processing unit 108 supplies a syntax element indicating the current image block (information of the selected intra or inter prediction mode) to the entropy encoder 103, the entropy encoder 103 encodes to indicate the selected prediction mode, the filter unit 106 supplies filter control data to the entropy encoder 103, and the entropy encoder 103 encodes to indicate the selected filtering manner.
After prediction processing unit 108 generates a prediction block for the current image block via inter/intra prediction, video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded. Summer 112 represents one or more components that perform this subtraction operation. The residual video data in the residual block may be included in one or more TUs and applied to transformer 101. The transformer 101 transforms the residual video data into residual transform coefficients using a transform such as a Discrete Cosine Transform (DCT) or a conceptually similar transform. Transformer 101 may convert residual video data from a pixel value domain to a transform domain, e.g., the frequency domain.
The transformer 101 may send the resulting transform coefficients to the quantizer 102. Quantizer 102 quantizes the transform coefficients to further reduce the bit rate. In some examples, quantizer 102 may then perform a scan of a matrix that includes quantized transform coefficients. Alternatively, the entropy encoder 103 may perform a scan.
After quantization, the entropy encoder 103 entropy encodes the quantized transform coefficients. For example, the entropy encoder 103 may perform Context Adaptive Variable Length Coding (CAVLC), Context Adaptive Binary Arithmetic Coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), Probability Interval Partition Entropy (PIPE) coding, or another entropy encoding method or technique. After entropy encoding by the entropy encoder 103, the encoded codestream may be transmitted to the video decoder 200, or archived for later transmission or retrieved by the video decoder 200. The entropy encoder 103 may also entropy encode syntax elements of the current image block to be encoded.
Inverse quantizer 104 and inverse transformer 105 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain.
The summer 111 adds the reconstructed residual block to the prediction block produced by the inter predictor 110 or the intra predictor 109 to produce a reconstructed image block.
The filter unit 106 may be applied to the reconstructed image block to reduce distortions such as blocking artifacts. This reconstructed image block is then stored in memory 107 as a reference block, which may be used by inter predictor 110 as a reference block to inter-predict a block in a subsequent video frame or image.
The video encoder 100 divides the input video into a number of coding tree units, each of which is in turn divided into a number of rectangular or square coding blocks. When the current coding block is coded in an intra prediction mode, a number of prediction modes are traversed and evaluated for the luminance component of the current coding block and the optimal prediction mode is selected according to the rate-distortion cost; likewise, a number of prediction modes are traversed and evaluated for the chrominance component and the optimal prediction mode is selected according to the rate-distortion cost. The residual between the original video block and the prediction block is then calculated; one path of the residual forms the output code stream through transform, quantization, entropy coding, and the like, while the other path forms reconstructed samples through inverse transform, inverse quantization, loop filtering, and the like, which serve as reference information for subsequent video compression.
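The rate-distortion selection mentioned above can be sketched as follows. The cost model (sum of squared errors plus lambda times an estimated bit count) and the helpers intra_predict and estimate_bits are illustrative assumptions, not the cost function actually used by the encoder.

    def select_intra_mode(orig_block, neighbors, candidate_modes, lam):
        # orig_block and the predictions are treated as flat sequences of pixel values.
        best_mode, best_cost = None, float("inf")
        for mode in candidate_modes:
            pred = intra_predict(mode, neighbors)               # hypothetical predictor
            distortion = sum((o - p) ** 2                       # SSE distortion
                             for o, p in zip(orig_block, pred))
            cost = distortion + lam * estimate_bits(mode)       # assumed RD cost model
            if cost < best_cost:
                best_mode, best_cost = mode, cost
        return best_mode, best_cost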
In an embodiment, the intra-prediction filtering is implemented in the video encoder 100 as follows.
The input digital video information is divided into a plurality of coding tree units at a coding end, each coding tree unit is divided into a plurality of rectangular or square coding units, and each coding unit carries out intra-frame prediction process to calculate a prediction block.
In the current coding unit,
① if the allowed identification bit is '1', all of the following steps a1) to f1) are performed;
② if the allowed identification bit is '0', only steps a1), b1), f1) and g1) are performed.
Step a1) traversing all prediction modes, calculating prediction pixels in each intra-frame prediction mode, and calculating rate-distortion cost according to original pixels;
step b1) selecting the optimal prediction mode of the current coding unit according to the principle of minimum rate distortion cost of all the prediction modes, and recording the information of the optimal prediction mode and the rate distortion cost information corresponding to the optimal prediction mode;
step c1) traversing all intra-frame prediction modes again, starting intra-frame prediction filtering in the process, firstly calculating prediction pixels under each intra-frame prediction mode to obtain a prediction block of the current coding unit;
step d1) performing intra-frame prediction filtering on the prediction block of the current coding unit, selecting a corresponding filter according to the current prediction mode, selecting a corresponding filter coefficient group according to the size of the current coding unit, and searching table 1 according to the specific correspondence;
step e1) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by intra-frame prediction filtering and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
If the allowed identification bit is '0', the prediction mode index recorded in b1) is transmitted to the decoder through the code stream.
If the allowed identification bit is '1', the minimum cost value recorded in b1) is compared with the minimum cost value recorded in e1):
if the rate-distortion cost in b1) is lower, the prediction mode index recorded in b1) is encoded as the optimal prediction mode of the current coding unit and transmitted to the decoder through the code stream, and the usage flag of the current coding unit is set to false, indicating that intra prediction filtering is not used; this flag is also transmitted to the decoder through the code stream;
if the rate-distortion cost in e1) is lower, the prediction mode index recorded in e1) is encoded as the optimal prediction mode of the current coding unit and transmitted to the decoder through the code stream, and the usage flag of the current coding unit is set to true, indicating that intra prediction filtering is used; this flag is also transmitted to the decoder through the code stream.
Step f1) The prediction value is superimposed with the residual information obtained after operations such as transform and quantization, and the reconstructed block of the current coding unit is obtained and used as reference information for subsequent coding units.
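The two-pass decision in steps a1) to f1) above can be sketched as follows; search_best_mode is a hypothetical helper that returns the best prediction mode and its minimum rate-distortion cost with intra prediction filtering switched on or off.

    def encode_unit_ipf_decision(unit, candidate_modes, ipf_enable_flag):
        # Steps a1)-b1): best mode with intra prediction filtering disabled.
        mode_off, cost_off = search_best_mode(unit, candidate_modes, use_ipf=False)
        if ipf_enable_flag == 0:
            return mode_off, 0                     # filtering not allowed, usage flag stays 0
        # Steps c1)-e1): best mode with intra prediction filtering enabled.
        mode_on, cost_on = search_best_mode(unit, candidate_modes, use_ipf=True)
        # Compare the two minimum costs and set the IPF usage flag accordingly.
        if cost_on < cost_off:
            return mode_on, 1                      # ipf usage flag = 1: filtering used
        return mode_off, 0                         # ipf usage flag = 0: filtering not used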
The intra predictor 109 may also provide information indicating the selected intra prediction mode of the current encoding block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
Fig. 8 is an exemplary block diagram of a video decoder 200 described in embodiments of the present application. In the example of fig. 8, the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a memory 207. The prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209. In some examples, video decoder 200 may perform a decoding process that is substantially reciprocal to the encoding process described with respect to video encoder 100 from fig. 7.
In the decoding process, video decoder 200 receives an encoded video bitstream representing an image block and associated syntax elements of an encoded video slice from video encoder 100. Video decoder 200 may receive video data from network entity 42 and, optionally, may store the video data in a video data memory (not shown). The video data memory may store video data, such as an encoded video bitstream, to be decoded by components of video decoder 200. The video data stored in the video data memory may be obtained, for example, from storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium. The video data memory may serve as a coded picture buffer (CPB) for storing encoded video data from the encoded video bitstream.
Network entity 42 may be, for example, a server, a MANE, a video editor/splicer, or other such device for implementing one or more of the techniques described above. Network entity 42 may or may not include a video encoder, such as video encoder 100. Network entity 42 may implement portions of the techniques described in this application before network entity 42 sends the encoded video bitstream to video decoder 200. In some video decoding systems, network entity 42 and video decoder 200 may be part of separate devices, while in other cases, the functionality described with respect to network entity 42 may be performed by the same device that includes video decoder 200.
The entropy decoder 203 of the video decoder 200 entropy decodes the code stream to generate quantized coefficients and some syntax elements. The entropy decoder 203 forwards the syntax elements to the prediction processing unit 208. Video decoder 200 may receive syntax elements at the video slice level and/or the picture block level. The intra predictor 209 is used to generate a prediction block for the current image block based on the signaled intra prediction mode and previously decoded image data. The inter predictor 210 is configured to determine an inter prediction mode for decoding a current image block based on syntax elements received from the entropy decoder 203, and decode (e.g., perform inter prediction) the current image block based on the determined inter prediction mode.
The inverse quantizer 204 inversely quantizes, i.e., dequantizes, the quantized transform coefficients provided in the codestream and decoded by the entropy decoder 203. The inverse quantization process may include: the quantization parameter calculated by the video encoder 100 for each image block is used to determine the degree of quantization that should be applied and likewise the degree of inverse quantization that should be applied. Inverse transformer 205 applies an inverse transform, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, to the transform coefficients in order to generate a block of residues in the pixel domain.
After the inter predictor 210 generates a prediction block for the current image block or a sub-block of the current image block, the video decoder 200 obtains a reconstructed block, i.e., a decoded image block, by summing the residual block from the inverse transformer 205 with the corresponding prediction block generated by the inter predictor 210. Summer 211 represents the component that performs this summation operation. A loop filter (in or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality, if desired. Filter unit 206 may represent one or more loop filters, such as deblocking filters, Adaptive Loop Filters (ALF), and Sample Adaptive Offset (SAO) filters. Although the filter unit 206 is shown in fig. 8 as an in-loop filter, in other implementations, the filter unit 206 may be implemented as a post-loop filter.
The decoding method specifically executed by the video decoder 200 includes: the input code stream is parsed, inverse transformed and inverse quantized, and the prediction mode index of the current coding block is obtained. If the prediction mode index of the chroma component of the current coding block indicates the enhanced two-step cross-component prediction mode, reconstructed samples are selected only from the upper or left neighbouring pixels of the current coding block according to an index value to calculate a linear model, a reference prediction block of the chroma component of the current coding block is calculated according to the linear model and down-sampled, and prediction correction based on the correlation of boundary neighbouring pixels in the orthogonal direction is performed on the down-sampled prediction block to obtain the final prediction block of the chroma component. One path of the reconstructed signal then serves as reference information for subsequent video decoding, and the other path is post-filtered and output as the video signal.
In one embodiment, the intra-prediction filtering is implemented at the video decoder 200 as follows.
The decoding end acquires and parses the code stream to obtain the digital video sequence information, and parses the IPF allowed flag of the current video sequence, the coding mode of the current decoding unit (here an intra-frame prediction coding mode), and the IPF use flag of the current decoding unit.
In the current decoding unit,
if the allowed identification bit is '1', the following steps a2) to e2) are carried out;
if the allowed identification bit is '0', only the steps of a2), b2) and e2) are carried out:
step a2) acquiring the code stream information, parsing the residual information of the current decoding unit, and obtaining the time-domain residual information through the inverse transformation and inverse quantization processes;
step b2) parsing the code stream to obtain the prediction mode index of the current decoding unit, and calculating the prediction block of the current decoding unit according to the neighbouring reconstructed block and the prediction mode index;
step c2) parsing the IPF use flag; if the IPF use flag is '0', performing no additional operation on the current prediction block; if the IPF use flag is '1', executing step d2);
step d2) selecting the corresponding filter according to the prediction mode classification information of the current decoding unit and the corresponding filter coefficient group according to the size of the current decoding unit, and then filtering each pixel in the prediction block to obtain the final prediction block;
step e2) superimposing the restored residual information on the prediction block to obtain the reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing;
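A compact sketch of the control flow of steps a2) to e2) is given below. The parsing, prediction and filtering operations are passed in as callables because their internals are specified elsewhere in this document; the helper names are illustrative only, not decoder APIs.

def decode_intra_unit(parse_residual, parse_mode, predict, parse_ipf_flag, ipf_filter,
                      ipf_allowed_flag):
    residual = parse_residual()            # step a2): residual via inverse transform/quantization
    mode = parse_mode()                    # step b2): prediction mode index from the code stream
    pred = predict(mode)                   # step b2): prediction from neighbouring reconstruction
    if ipf_allowed_flag == 1:              # steps c2)/d2) only run when IPF is allowed
        if parse_ipf_flag() == 1:
            pred = ipf_filter(pred, mode)  # step d2): mode- and size-dependent filtering
    # step e2): superimpose the restored residual to obtain the reconstructed block
    return [[p + r for p, r in zip(p_row, r_row)] for p_row, r_row in zip(pred, residual)]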
it should be understood that other structural variations of the video decoder 200 may be used to decode the encoded video stream. For example, the video decoder 200 may generate an output video stream without processing by the filter unit 206; alternatively, for some image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients and accordingly does not need to be processed by the inverse quantizer 204 and the inverse transformer 205.
Intra-frame prediction filtering can effectively improve the coding efficiency of intra-frame prediction and greatly strengthens the use of spatial correlation in intra-frame prediction. However, it only uses a single reference pixel row or column and ignores the influence of some pixels on the prediction value. When the intra-frame prediction process requires some smoothing, neither intra-frame prediction filtering nor the intra-frame prediction modes solve this problem well: pixel-by-pixel filtering with the reference pixels can improve the correlation between the prediction block and the reference block, but it cannot solve the smoothing problem inside the prediction block.
A prediction block calculated with a single prediction mode usually performs well in images with clear texture: the residual becomes smaller and coding efficiency improves. In an image block with blurred texture, however, an overly sharp prediction may enlarge the residual, resulting in poor prediction and reduced coding efficiency.
For image blocks that need smoothing, the embodiments of the present application provide an intra-frame prediction filtering based on smoothing, which can directly filter the prediction block obtained according to an intra-frame prediction mode.
Fig. 9a is a flowchart illustrating an encoding method in an embodiment of the present application, where the encoding method can be applied to the source device 10 in the video decoding system 1 shown in fig. 6 or the video encoder 100 shown in fig. 7. The flow shown in fig. 9a is described taking the video encoder 100 shown in fig. 7 as the execution body as an example. As shown in fig. 9a, an encoding method provided in an embodiment of the present application includes:
step 110, dividing the image, and obtaining the coding information of the current coding block, where the coding information includes a first flag bit and a second flag bit, the first flag bit is used to indicate whether the current coding block allows using intra-prediction filtering, and the second flag bit is used to indicate whether the current coding block allows using intra-prediction smoothing correction.
The first flag bit may also be referred to as an intra prediction filtering allowed flag ipf_enable_flag, and the second flag bit may also be referred to as an intra prediction smoothing correction allowed flag ips_enable_flag.
Step 120, the original prediction block of the current coding block is determined.
In specific implementation, an intra-frame prediction mode required by a current coding block is determined, and the intra-frame prediction mode is used for predicting the current coding block to obtain an original prediction block.
Step 130, determining that the intra-frame prediction smooth correction is allowed to be used according to the first flag bit and the second flag bit, and performing padding processing on the original prediction block to obtain a padded first prediction block.
The values of the first flag and the second flag may be 1 or 0 (or true or false, etc., without being limited), and intra prediction smoothing correction being allowed may correspond to both the first flag and the second flag having the value 1.
In a specific implementation, as shown in the padding diagram of fig. 9b (in the figure, pred. pixel denotes a pixel in the original prediction block, referred to simply as a prediction pixel, and recon. pixel denotes a pixel at the corresponding position in the neighbouring reconstructed block, referred to simply as a reconstructed pixel), the original prediction block is padded to obtain the padded first prediction block. The padding is as follows: the nearest left column outside the original prediction block is filled with reconstructed pixels, and 2 pixels are padded below the lowest pixel of the left nearest reconstructed column; the nearest upper row outside the original prediction block is filled with reconstructed pixels, and 2 pixels are padded to the right of the rightmost pixel of the upper nearest reconstructed row; the reconstructed reference pixel at the upper-left corner of the original prediction block is padded by 1 pixel upward and 1 pixel leftward, respectively; the nearest right column and bottom row outside the original prediction block are padded with the 1 column and 1 row of prediction pixels closest to the boundary. A sketch of this padding is given below.
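The following is a minimal sketch of the padding of fig. 9b. It assumes the original prediction block pred is given as an H x W array of rows, the reconstructed left column, upper row and top-left corner are given separately, and the unspecified bottom-right padded corner is taken from the nearest prediction pixel; the returned accessor covers exactly the coordinate range the smoothing filters below can reference.

def build_padded_sampler(pred, left, top, corner):
    """pred: H x W prediction block; left/top: reconstructed neighbours; corner: top-left reference."""
    h, w = len(pred), len(pred[0])
    s = {}
    for y in range(h):
        for x in range(w):
            s[(x, y)] = pred[y][x]              # interior: original prediction pixels
        s[(-1, y)] = left[y]                    # nearest left column: reconstructed pixels
        s[(w, y)] = pred[y][w - 1]              # right padding: nearest prediction column
    for x in range(w):
        s[(x, -1)] = top[x]                     # nearest upper row: reconstructed pixels
        s[(x, h)] = pred[h - 1][x]              # bottom padding: nearest prediction row
    s[(-1, h)] = s[(-1, h + 1)] = left[h - 1]   # extend 2 pixels below the lowest left pixel
    s[(w, -1)] = s[(w + 1, -1)] = top[w - 1]    # extend 2 pixels right of the rightmost upper pixel
    s[(-1, -1)] = corner                        # reconstructed top-left reference pixel
    s[(-1, -2)] = s[(-2, -1)] = corner          # padded 1 pixel upward and 1 pixel leftward
    s[(w, h)] = pred[h - 1][w - 1]              # bottom-right padded corner (assumption: nearest prediction pixel)
    return lambda x, y: s[(x, y)]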
Step 140, filtering each pixel in the original prediction block by using a smoothing correction filter according to the first prediction block to obtain a second prediction block after smoothing correction.
In this possible example, the smoothing correction filter is configured to filter a first reference pixel on an upper side boundary of the first prediction block, a second reference pixel on a left side boundary, and a currently processed prediction pixel in the original prediction block, where the first reference pixel includes two reference pixels whose difference between an abscissa and an abscissa of a center pixel of the currently processed prediction pixel is 2, the second reference pixel includes two reference pixels whose difference between an ordinate and an ordinate of the center pixel of the currently processed prediction pixel is 2, and the upper side boundary and the left side boundary are filling regions of the first prediction block relative to the original prediction block.
In this possible example, the smoothing correction filter includes a first thirteen-tap filter;
the first thirteen-tap filter is:
P′(x, y) = c1·Ref(-1, y-2) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c1·Ref(x-2, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c1·Ref(x+2, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + c1·Ref(-1, y+2)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
The intra-frame prediction smoothing correction adopts a simplified Gaussian convolution kernel to filter the prediction block, and the filter has 13 filter coefficients in total, as follows:
(Figure: the 13 filter coefficients arranged as a convolution kernel.)
the sum of the filter coefficients is 256, i.e. the calculated prediction value needs to be shifted to the right by 8 bits.
As shown in the prediction block filtering diagram of fig. 9c (Rec in figs. 9c to 9n indicates reconstructed pixels), the prediction blocks of all prediction modes can be filtered with the 13-tap filter. The first reference pixels on the upper boundary include two reference pixels whose abscissa differs by 2 from the abscissa of the center pixel of the currently processed prediction pixel, and the second reference pixels on the left boundary include two reference pixels whose ordinate differs by 2 from the ordinate of the center pixel of the currently processed prediction pixel.
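A sketch of the per-pixel 13-tap filtering is given below, using a padded accessor such as the one sketched after fig. 9b and the example coefficients (c1, c2, c3, c4) = (7, 20, 26, 44); the +128 rounding offset before the 8-bit right shift is an implementation assumption, the text above only states the shift.

def smooth_correct_13tap(sample, w, h, c1=7, c2=20, c3=26, c4=44):
    """sample(x, y) returns a prediction pixel inside the block and a padded/reference value outside."""
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = (c1 * (sample(-1, y - 2) + sample(x - 2, -1)              # four boundary reference taps
                         + sample(x + 2, -1) + sample(-1, y + 2))
                   + c2 * (sample(x - 1, y - 1) + sample(x + 1, y - 1)      # four diagonal prediction taps
                           + sample(x - 1, y + 1) + sample(x + 1, y + 1))
                   + c3 * (sample(x, y - 1) + sample(x - 1, y)              # four adjacent prediction taps
                           + sample(x + 1, y) + sample(x, y + 1))
                   + c4 * sample(x, y))                                     # centre prediction pixel
            out[y][x] = (acc + 128) >> 8                                    # coefficients sum to 256
    return out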
In this possible example, the smoothing correction filter is determined by the intra prediction mode used by the current coding block;
if the intra-frame prediction mode used by the current coding block is a non-angle type prediction mode, filtering by adopting a first filter;
if the intra-frame prediction mode used by the current coding block is a horizontal angle prediction mode, filtering by adopting a second filter;
and if the intra-frame prediction mode used by the current coding block is a vertical angle prediction mode, filtering by adopting a third filter.
As can be seen, in the present example, flexible setting of the smoothing correction filter according to the intra prediction mode is supported.
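The mode-class-based selection can be pictured as below. How concrete intra prediction modes are grouped into the three classes is not detailed in this section and is left as an input; the reference taps each filter retains follow the filter formulas given next (left-column references for the horizontal class, upper-row references for the vertical class).

def select_smoothing_filter(mode_class):
    # mode_class is assumed to be one of "non_angular", "horizontal", "vertical"
    table = {
        "non_angular": {"name": "first filter (13-tap)", "top_refs": True, "left_refs": True},
        "horizontal":  {"name": "second filter (11-tap)", "top_refs": False, "left_refs": True},
        "vertical":    {"name": "third filter (11-tap)", "top_refs": True, "left_refs": False},
    }
    return table[mode_class]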
In this possible example, the smoothing correction filter is configured to filter a first reference pixel on an upper side boundary of the first prediction block and/or a second reference pixel on a left side boundary of the first prediction block and a currently processed prediction pixel in the original prediction block, where the first reference pixel includes at least one reference pixel whose difference between an abscissa and an abscissa of a center pixel of the currently processed prediction pixel is less than or equal to 2, the second reference pixel includes at least one reference pixel whose difference between an ordinate and an ordinate of the center pixel of the currently processed prediction pixel is less than or equal to 2, and the upper side boundary and the left side boundary are filling regions of the first prediction block relative to the original prediction block.
In this possible example, the first filter comprises a first thirteen-tap filter;
the first thirteen-tap filter is:
P′(x, y) = c1·Ref(-1, y-2) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c1·Ref(x-2, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c1·Ref(x+2, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + c1·Ref(-1, y+2)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9c, the non-angle-class prediction mode and the prediction block may be filtered by using a 13-tap filter. The first reference pixel point on the upper side boundary comprises two reference pixel points of which the difference value between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is 2, and the second reference pixel point on the left side boundary comprises two reference pixel points of which the difference value between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is 2.
In this possible example, the second filter comprises a first eleven-tap filter;
the first eleven-tap filter is:
P′(x, y) = 2c1·Ref(-1, y-2) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + 2c1·Ref(-1, y+2)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9d, the horizontal angular prediction mode and the prediction block may be filtered by using an 11-tap filter. The first reference pixel point on the upper side boundary comprises two reference pixel points of which the difference value between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is 2.
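With the example coefficient values given later in this section, (c1, c2, c3, c4) = (7, 20, 26, 44), doubling the weight of the two retained boundary reference taps keeps the coefficient sum of the 11-tap variant at 256, so the same 8-bit right shift still normalizes the result:

c1, c2, c3, c4 = 7, 20, 26, 44
sum_13tap = 4 * c1 + 4 * c2 + 4 * c3 + c4        # first thirteen-tap filter
sum_11tap = 2 * (2 * c1) + 4 * c2 + 4 * c3 + c4  # first eleven-tap filter, boundary taps weighted 2*c1
assert sum_13tap == sum_11tap == 256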
In this possible example, the third filter comprises a second eleven-tap filter;
the second eleven-tap filter is:
P′(x, y) = c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + 2c1·Ref(x-2, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + 2c1·Ref(x+2, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9e, the vertical-class prediction mode and the prediction block may be filtered by using an 11-tap filter. And the second reference pixel point on the left boundary comprises two reference pixel points of which the difference value between the vertical coordinate and the vertical coordinate of the central pixel point of the currently processed prediction pixel point is 2.
In this possible example, the first filter comprises a second thirteen-tap filter;
the second thirteen-tap filter is:
P′(x, y) = c1·Ref(-1, y-1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c1·Ref(x-1, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c1·Ref(x+1, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + c1·Ref(-1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9f, the non-angle-class prediction mode and the prediction block may be filtered by using a 13-tap filter. The first reference pixel point on the upper side boundary comprises two reference pixel points of which the difference value between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is 1, and the second reference pixel point on the left side boundary comprises two reference pixel points of which the difference value between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is 1.
In this possible example, the second filter comprises a third eleven-tap filter;
the third eleven-tap filter is:
P′(x, y) = 2c1·Ref(-1, y-1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + 2c1·Ref(-1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9g, the horizontal-class prediction mode and the prediction block may be filtered using an 11-tap filter. The first reference pixel point on the upper side boundary comprises two reference pixel points of which the difference value between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is 1.
In this possible example, the third filter comprises a fourth eleven-tap filter;
the fourth eleven-tap filter is:
P′(x, y) = c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + 2c1·Ref(x-1, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + 2c1·Ref(x+1, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9h, the vertical-class prediction mode and the prediction block may be filtered by using an 11-tap filter. And the second reference pixel point on the left boundary comprises two reference pixel points of which the difference value between the vertical coordinate and the vertical coordinate of the central pixel point of the currently processed prediction pixel point is 1.
In this possible example, the first filter comprises a fifth eleven-tap filter;
the fifth eleven-tap filter is:
P′(x, y) = c1·Ref(-1, y) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c1·Ref(x, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9i, the non-angle-class prediction mode and the prediction block may be filtered by using an 11-tap filter. The first reference pixel point on the upper side boundary comprises a single reference pixel point of which the difference value between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is 0, and the second reference pixel point on the left side boundary comprises a single reference pixel point of which the difference value between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is 0.
In this possible example, the second filter comprises a first ten-tap filter;
the first ten-tap filter is:
P′(x, y) = 2c1·Ref(-1, y) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9j, the horizontal-class prediction mode and the prediction block may be filtered by using a 10-tap filter. The first reference pixel point on the upper side boundary includes a single reference pixel point whose difference between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is 0.
In this possible example, the third filter comprises a second ten-tap filter;
the second ten-tap filter is:
P′(x, y) = c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + 2c1·Ref(x, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9k, the prediction block of a vertical-class prediction mode may be filtered by using a 10-tap filter. The second reference pixel point of the left boundary comprises a single reference pixel point of which the difference value between the vertical coordinate and the vertical coordinate of the central pixel point of the currently processed prediction pixel point is 0.
In this possible example, the first filter comprises a first fifteen-tap filter;
the first fifteen-tap filter is:
P′(x, y) = c1·Ref(-1, y-1) + c1·Ref(-1, y) + c1·Ref(-1, y+1) + c1·Ref(x-1, -1) + c1·Ref(x, -1) + c1·Ref(x+1, -1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9l, the non-angle-class prediction mode and the prediction block may be filtered by using a 15-tap filter. The first reference pixel points on the upper side boundary comprise three reference pixel points of which the difference value between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, and the second reference pixel points on the left side boundary comprise three reference pixel points of which the difference value between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is less than or equal to 2.
In this possible example, the second filter comprises a first twelve-tap filter;
the first twelve-tap filter is:
P′(x, y) = 2c1·Ref(-1, y-1) + 2c1·Ref(-1, y) + 2c1·Ref(-1, y+1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9m, the horizontal-class prediction mode and the prediction block may be filtered by using a 12-tap filter. The first reference pixel points on the upper side boundary comprise three reference pixel points, wherein the difference value between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is less than or equal to 2.
In this possible example, the third filter comprises a second twelve-tap filter;
the second twelve-tap filter is:
P′(x, y) = c1·Ref(x-1, -1) + c1·Ref(x, -1) + c1·Ref(x+1, -1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
As shown in the prediction block filtering diagram of fig. 9n, the vertical-class prediction mode and the prediction block may be filtered by using a 12-tap filter. And the second reference pixel point on the left boundary comprises three reference pixel points of which the difference value between the vertical coordinate and the vertical coordinate of the central pixel point of the currently processed prediction pixel point is less than or equal to 2.
In this possible example, after filtering each pixel in the original prediction block according to the first prediction block by using a smoothing correction filter to obtain a second prediction block after smoothing correction, the method further includes:
calculating a rate-distortion cost of the second prediction block;
if the rate distortion cost of the second prediction block is smaller than the rate distortion cost of the original prediction block, setting a third identification bit of the current coding block to be a first value, wherein the third identification bit is used for indicating whether the current coding block uses the intra-frame prediction smooth correction or not; transmitting the first value through a code stream, wherein the first value represents that the current coding block uses the intra-frame prediction smooth correction;
and if the rate distortion cost of the second prediction block is not less than the rate distortion cost of the original prediction block, setting the third identification bit of the current coding block as a second numerical value, and transmitting the second numerical value through a code stream, wherein the second numerical value is used for indicating that the current coding block does not use the intra-frame prediction smooth correction.
The value of the third flag may be 1 or 0 (or true or false, etc., without being limited uniquely), the first value may be 1, and the second value may be 0.
In this possible example, c1 is 7, c2 is 20, c3 is 26, and c4 is 44; alternatively,
c1 is determined according to the size of the current coding block and a reference distance, wherein the reference distance is a horizontal distance or a vertical distance between a reference pixel and a central point prediction pixel, the reference pixel is a first reference pixel point on the upper side boundary of the first prediction block or a second reference pixel point on the left side boundary of the first prediction block, the first reference pixel point comprises two reference pixel points of which the difference between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, the second reference pixel point comprises two reference pixel points of which the difference between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, and the central point prediction pixel is the central pixel point of the currently processed prediction pixel point.
It can be seen that the embodiment of the present application provides an additional choice for intra-frame prediction in cases that require smoothing or local blurring: for regions where the image texture should not be overly sharpened, this technique makes the prediction pixels smoother and the prediction block closer to the original image, which ultimately improves coding efficiency.
The above coding method is explained below with reference to two examples.
Example 1, the specific implementation of intra prediction at the encoding end is as follows:
step 1, an encoder acquires encoding information including an intra-frame prediction filtering allowable identification bit, an intra-frame prediction smooth correction allowable identification bit of the method described in the embodiment of the application, and the like, after acquiring image information, an image is divided into a plurality of CTUs, the CTUs are further divided into a plurality of CUs, each independent CU performs intra-frame prediction, and the width and height of the current CU are not more than 64 pixels.
Step 2, determining that the current encoder allows smooth correction, that is, the flag bit of the intra-frame prediction smooth correction is 1, obtaining a prediction block of the current CU after intra-frame prediction of each CU, and performing intra-frame prediction smooth correction processing on pixels in the prediction block, specifically comprising the following steps:
and 3, filling the current CU, filling the 1 line of the most adjacent left side outside the current CU by using the reconstructed pixels, and filling 2 pixels downwards by using the lowermost pixel point of the most adjacent reconstructed pixel on the left side.
The most adjacent upside 1 line outside the current CU fills with the reconstruction pixels, and fills 2 pixels to the right by using the rightmost pixel point of the most adjacent reconstruction pixels on the upside.
And respectively filling 1 pixel point upwards and leftwards into the reconstructed reference pixel point at the upper left corner of the current CU. The right and lower nearest neighbor columns and rows outside the current CU are filled with 1 row and 1 column of predicted pixels closest to the boundary.
Step 4, filtering the prediction block pixel by pixel with the smoothing correction filter, wherein the two outermost taps of the filter in the horizontal direction correspond to the upper reference pixels at the same horizontal positions, the two outermost taps of the filter in the vertical direction correspond to the left reference pixels at the same vertical positions, the remaining inner taps of the filter correspond to the prediction pixels at the corresponding positions in the current prediction block, and the filtering yields the final prediction value at the center position of the filter;
step 5, obtaining the prediction block after smooth correction, and calculating the rate distortion cost of the prediction block;
if the rate-distortion cost value of the current prediction block is the minimum, it is determined that the current CU uses the smoothing correction, the intra-frame prediction smoothing correction use flag of the current CU is set to 1, and the flag is transmitted to the decoding end through the code stream;
if the rate-distortion cost value of the current prediction block is not the minimum, the current CU does not use the smoothing correction, the intra-frame prediction smoothing correction use flag of the current CU is set to 0, and the flag is transmitted to the decoding end through the code stream.
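A hedged sketch of the rate-distortion decision in step 5 follows; rd_cost is a hypothetical stand-in for the encoder's actual cost measure (for example distortion plus lambda times the estimated bits), which this document does not specify.

def decide_ips_flag(original_block, pred_plain, pred_smoothed, rd_cost):
    cost_plain = rd_cost(original_block, pred_plain)
    cost_smoothed = rd_cost(original_block, pred_smoothed)
    if cost_smoothed < cost_plain:
        return 1, pred_smoothed   # ips use flag = 1: smoothing correction is used and signalled
    return 0, pred_plain          # ips use flag = 0: keep the unfiltered prediction block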
Example 2, the intra prediction part at the encoding end is implemented as follows:
step 1, an encoder acquires encoding information including an intra-frame prediction filtering allowable identification bit, an intra-frame prediction smooth correction allowable identification bit of the method described in the embodiment of the application, and the like, divides an image into a plurality of CTUs after acquiring image information, further divides the image into a plurality of CUs, performs intra-frame prediction on each independent CU, and the width and height of the current CU are not more than 64 pixels.
Step 2, determining that the current encoder allows smooth correction, that is, the intra-frame prediction smooth correction allowed flag bit is 1, obtaining a prediction block of the current CU after intra-frame prediction of each CU, and performing smooth filtering correction processing on pixels in the prediction block, specifically, the following steps:
and 3, filling the current CU, filling the 1 line of the most adjacent left side outside the current CU by using the reconstructed pixels, and filling 2 pixels downwards by using the lowermost pixel point of the most adjacent reconstructed pixel on the left side.
The most adjacent upside 1 line outside the current CU fills with the reconstruction pixels, and fills 2 pixels to the right by using the rightmost pixel point of the most adjacent reconstruction pixels on the upside.
And respectively filling 1 pixel point upwards and leftwards into the reconstructed reference pixel point at the upper left corner of the current CU. The right and lower nearest neighbor columns and rows outside the current CU are filled with 1 row and 1 column of predicted pixels closest to the boundary;
and 4, filtering the pixels in the prediction block by using a smooth correction filter, and selecting a corresponding filter according to the current prediction mode.
If the current prediction mode is a non-angle type prediction mode, filtering by adopting a first filter to obtain a final prediction value of a position pixel at the center point of the filter;
if the current prediction mode is a horizontal angle prediction mode, filtering by adopting a second filter to obtain a final prediction value of a pixel at the center point of the filter;
and if the current prediction mode is the vertical angle prediction mode, filtering by adopting a third filter to obtain a final predicted value of the pixel at the central point of the filter.
And 5, obtaining the prediction block after smoothing correction and filtering, and calculating the rate distortion cost of the prediction block.
If the rate-distortion cost value of the current prediction block is the minimum, it is determined that the current CU uses the smoothing correction, the intra-frame prediction smoothing correction use flag of the current CU is set to 1, and the flag is transmitted to the decoding end through the code stream.
If the rate-distortion cost value of the current prediction block is not the minimum, the current CU does not use the smoothing correction, the intra-frame prediction smoothing correction use flag of the current CU is set to 0, and the flag is transmitted to the decoding end through the code stream.
Corresponding to the encoding method described in fig. 9a, fig. 10 is a flowchart illustrating a decoding method in an embodiment of the present application, which can be applied to the destination device 20 in the video decoding system 1 shown in fig. 6 or the video decoder 200 shown in fig. 8. The flow shown in fig. 10 is described taking the video decoder 200 shown in fig. 8 as the execution body as an example. As shown in fig. 10, the decoding method provided in the embodiment of the present application includes:
step 210, parsing the code stream, and obtaining a second flag of the current decoding block, where the second flag is used to indicate whether the current decoding block allows intra-frame prediction smoothing correction.
Step 220, if the second flag bit indicates that the intra prediction smooth correction is allowed to be used, analyzing the code stream to obtain a third flag bit of the current decoding block, where the third flag bit is used to indicate whether the current decoding block uses the intra prediction smooth correction.
Step 230, parsing the code stream, obtaining original residual information of the current decoding block and an intra-frame prediction mode required to be used, performing inverse transformation and inverse quantization on the original residual information to obtain time domain residual information, and obtaining an original prediction block of the current decoding block according to the intra-frame prediction mode required to be used and an adjacent reconstruction block of the current decoding block.
It should be noted that the timing sequence of the step 220 and the step 230 is not limited, and may be executed in parallel, or the step 220 may be executed first and then the step 230 is executed, or the step 230 may be executed first and then the step 220 is executed.
Step 240, if the third flag indicates that the intra-frame prediction smoothing correction is used, performing padding processing on the original prediction block to obtain a first prediction block after the padding processing.
And 250, filtering each pixel in the original prediction block by using a smooth correction filter according to the first prediction block to obtain a second prediction block after smooth correction.
Step 260, superimposing the second prediction block on the time domain residual information to obtain a reconstructed block of the current decoding block.
In this possible example, the width and height of the current decoded block are no more than 64 pixels.
In this possible example, the method further comprises: and if the second flag bit indicates that the intra-frame prediction smooth correction is not allowed to be used, or if the third flag bit indicates that the intra-frame prediction smooth correction is not used, overlapping the original prediction block with the time domain residual error information to obtain a reconstructed block of the current decoding block.
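The flag-driven reconstruction of steps 210 to 260 can be summarised as below; smooth_fn stands for the padding plus smoothing correction filtering described in steps 240 and 250, and no clipping is shown because the text above does not mention it.

def reconstruct_current_block(pred_orig, residual, ips_allowed, ips_used, smooth_fn):
    # Apply the smoothing correction only when both the allowed flag and the use flag are set.
    pred = smooth_fn(pred_orig) if (ips_allowed and ips_used) else pred_orig
    h, w = len(pred), len(pred[0])
    return [[pred[y][x] + residual[y][x] for x in range(w)] for y in range(h)]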
In this possible example, the smoothing correction filter is configured to filter a first reference pixel on an upper side boundary of the first prediction block, a second reference pixel on a left side boundary, and a currently processed prediction pixel in the original prediction block, where the first reference pixel includes two reference pixels whose difference between an abscissa and an abscissa of a center pixel of the currently processed prediction pixel is 2, the second reference pixel includes two reference pixels whose difference between an ordinate and an ordinate of the center pixel of the currently processed prediction pixel is 2, and the upper side boundary and the left side boundary are filling regions of the first prediction block relative to the original prediction block.
In this possible example, the smoothing correction filter includes a first thirteen-tap filter;
the first thirteen-tap filter is:
P′(x, y) = c1·Ref(-1, y-2) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c1·Ref(x-2, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c1·Ref(x+2, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + c1·Ref(-1, y+2)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the smoothing correction filter is determined by the intra prediction mode used by the current coding block;
if the intra-frame prediction mode used by the current coding block is a non-angle type prediction mode, filtering by adopting a first filter;
if the intra-frame prediction mode used by the current coding block is a horizontal angle prediction mode, filtering by adopting a second filter;
and if the intra-frame prediction mode used by the current coding block is a vertical angle prediction mode, filtering by adopting a third filter.
In this possible example, the first filter comprises a first thirteen-tap filter;
the first thirteen-tap filter is:
P′(x, y) = c1·Ref(-1, y-2) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c1·Ref(x-2, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c1·Ref(x+2, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + c1·Ref(-1, y+2)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the second filter comprises a first eleven-tap filter;
the first eleven-tap filter is:
P′(x, y) = 2c1·Ref(-1, y-2) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + 2c1·Ref(-1, y+2)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the third filter comprises a second eleven-tap filter;
the second eleven-tap filter is:
P′(x, y) = c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + 2c1·Ref(x-2, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + 2c1·Ref(x+2, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the first filter comprises a second thirteen-tap filter;
the second thirteen-tap filter is:
P′(x, y) = c1·Ref(-1, y-1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c1·Ref(x-1, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c1·Ref(x+1, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + c1·Ref(-1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the second filter comprises a third eleven-tap filter;
the third eleven-tap filter is:
P′(x, y) = 2c1·Ref(-1, y-1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1) + 2c1·Ref(-1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the third filter comprises a fourth eleven-tap filter;
the fourth eleven-tap filter is:
P′(x, y) = c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + 2c1·Ref(x-1, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + 2c1·Ref(x+1, -1) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the first filter comprises a fifth eleven-tap filter;
the fifth eleven-tap filter is:
P′(x, y) = c1·Ref(-1, y) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c1·Ref(x, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the second filter comprises a first ten-tap filter;
the first ten-tap filter is:
P′(x, y) = 2c1·Ref(-1, y) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the third filter comprises a second ten-tap filter;
the second ten-tap filter is:
P′(x, y) = c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + 2c1·Ref(x, -1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the first filter comprises a first fifteen-tap filter;
the first fifteen-tap filter is:
P′(x, y) = c1·Ref(-1, y-1) + c1·Ref(-1, y) + c1·Ref(-1, y+1) + c1·Ref(x-1, -1) + c1·Ref(x, -1) + c1·Ref(x+1, -1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the second filter comprises a first twelve-tap filter;
the first twelve-tap filter is:
P′(x, y) = 2c1·Ref(-1, y-1) + 2c1·Ref(-1, y) + 2c1·Ref(-1, y+1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, the third filter comprises a second twelve-tap filter;
the second twelve-tap filter is:
P′(x, y) = c1·Ref(x-1, -1) + c1·Ref(x, -1) + c1·Ref(x+1, -1) + c2·P(x-1, y-1) + c3·P(x, y-1) + c2·P(x+1, y-1) + c3·P(x-1, y) + c4·P(x, y) + c3·P(x+1, y) + c2·P(x-1, y+1) + c3·P(x, y+1) + c2·P(x+1, y+1)
wherein P′(x, y) is the final prediction value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original prediction value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of the pixel (m, n).
In this possible example, c1 is 7, c2 is 20, c3 is 26, and c4 is 44; alternatively,
c1 is determined according to the size of the current coding block and a reference distance, wherein the reference distance is a horizontal distance or a vertical distance between a reference pixel and a central point prediction pixel, the reference pixel is a first reference pixel point on the upper side boundary of the first prediction block or a second reference pixel point on the left side boundary of the first prediction block, the first reference pixel point comprises two reference pixel points of which the difference between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, the second reference pixel point comprises two reference pixel points of which the difference between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, and the central point prediction pixel is the central pixel point of the currently processed prediction pixel point.
It can be seen that the embodiment of the present application provides an additional choice for intra-frame prediction in cases that require smoothing or local blurring: for regions where the image texture should not be overly sharpened, this technique makes the prediction pixels smoother and the prediction block closer to the original image, which ultimately improves coding efficiency.
The above decoding method is explained below with reference to two examples.
Example 1, the specific flow of intra prediction at the decoding end is as follows:
step 1, a decoder acquires a code stream, analyzes the code stream to obtain an intra-frame prediction smooth correction allowable identification bit of a current video sequence, analyzes the code stream and performs inverse transformation and inverse quantization on obtained residual information.
Step 2, determining that the identification bit allowed by the intra-frame prediction smooth correction in the current code stream is 1, and then the intra-frame prediction decoding process is as follows:
step 3, acquiring and decoding a code stream to obtain residual error information, and obtaining time domain residual error information through processes of inverse transformation, inverse quantization and the like;
step 4, analyzing the code stream to obtain a prediction mode of the current decoding unit, and calculating according to the prediction mode of the current decoding unit and the adjacent reconstruction block to obtain a prediction block;
step 5, analyzing and acquiring the identification bits used for intra-frame prediction smooth correction of the current decoding unit;
if the identification bit used for the intra-frame prediction smooth correction is '0', no additional operation is performed on the current prediction block, and all the remaining steps are skipped;
if the identification bit ips_flag used for smooth correction is '1' and the width and height of the current prediction block are not more than 64 pixels, continuing to execute the subsequent operation steps 6 to 8;
and 6, filling the current decoding unit, filling the 1 line of the most adjacent left side outside the current decoding unit with the reconstructed pixels, and filling 2 pixels downwards by using the lowermost pixel point of the most adjacent reconstructed pixel on the left side.
The most adjacent upper side 1 line outside the current decoding unit is filled with the reconstruction pixels, and the rightmost pixel point of the most adjacent reconstruction pixels on the upper side is used for filling 2 pixels to the right. And the upper left corner of the current decoding unit is rebuilt with reference pixel points and is respectively filled with 1 pixel point upwards and leftwards.
The right and lower most adjacent columns and rows outside the current decoding unit are filled with 1 row and 1 column of predicted pixels closest to the boundary;
and 7, filtering the pixels by pixels in the prediction block by using a smooth correction filter, wherein two taps on the outermost side of the filter in the horizontal direction correspond to upper reference pixel points at the horizontal position, two taps on the outermost side of the filter in the vertical direction correspond to left reference pixel points at the vertical position, taps inside the rest of the filter correspond to prediction pixels at corresponding positions in the current prediction block, and filtering to obtain a final prediction value of the central point position of the filter.
Step 8, superimposing the restored residual information on the prediction block to obtain the reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing.
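Step 8 superimposes the restored residual on the filtered prediction block. A small sketch with clipping to the sample range is shown below; the 10-bit depth and the clipping itself are implementation assumptions, as the step only states the superposition and post-processing.

def superimpose_residual(pred, residual, bit_depth=10):
    lo, hi = 0, (1 << bit_depth) - 1
    return [[min(hi, max(lo, p + r)) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(pred, residual)]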
Example 2, the specific flow of intra prediction at the decoding end is as follows:
step 1, the decoder acquires the code stream, parses it to obtain the intra-frame prediction smooth correction allowed identification bit of the current video sequence, and performs inverse transformation and inverse quantization on the residual information obtained from the code stream.
Step 2, it is determined that the intra-frame prediction smooth correction allowed identification bit in the current code stream is 1; the intra-frame prediction decoding process is then as follows:
step 3, acquiring and decoding a code stream to obtain residual error information, and obtaining time domain residual error information through processes of inverse transformation, inverse quantization and the like;
step 4, analyzing the code stream to obtain a prediction mode of the current decoding unit, and calculating according to the prediction mode of the current decoding unit and the adjacent reconstruction block to obtain a prediction block;
step 5, parsing and acquiring the smooth correction use identification bit ips_flag of the current decoding unit and acquiring the prediction mode of the current prediction block,
if the identification bit used for the intra-frame prediction smooth correction is '0', no additional operation is performed on the current prediction block, and all the remaining steps are skipped;
if the identification bit for intra-frame prediction smooth correction is '1' and the width and height of the current prediction block are not more than 64 pixels, continuing to execute the subsequent operation steps 6 to 8;
and 6, padding the current decoding unit: the nearest column (1 pixel wide) immediately to the left of the current decoding unit is filled with reconstructed pixels, and the bottom-most pixel of that left reconstructed column is repeated downward to fill 2 additional pixels.
The nearest row (1 pixel high) immediately above the current decoding unit is filled with reconstructed pixels, and the right-most pixel of that upper reconstructed row is repeated rightward to fill 2 additional pixels.
The reconstructed reference pixel at the upper-left corner of the current decoding unit is filled 1 pixel upward and 1 pixel leftward, respectively. The nearest column to the right of and the nearest row below the current decoding unit are filled with the column and row of prediction pixels closest to those boundaries;
step 7, filtering each pixel in the prediction block with a smoothing correction filter selected according to the prediction mode (an illustrative dispatch sketch is given after step 8):
if the current prediction mode is a non-angle type prediction mode, filtering by adopting a first filter;
if the current prediction mode is a horizontal angle prediction mode, filtering by adopting a second filter;
and if the current prediction mode is the vertical angle prediction mode, filtering by adopting a third filter.
And 8, superimposing the prediction block on the restored residual information to obtain the reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing.
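A minimal C++ sketch of the dispatch in step 7 of Example 2 follows. The mode indices and thresholds used for the classification are illustrative assumptions only; the text states just that non-angle, horizontal-angle and vertical-angle prediction modes select the first, second and third filter, respectively.

  enum class SmoothingFilterKind { First, Second, Third };

  // Hypothetical classification of an intra prediction mode index into the three
  // classes used in step 7; the concrete mode numbering is an assumption.
  SmoothingFilterKind SelectSmoothingFilter(int intraMode) {
    const bool nonAngular = (intraMode <= 2);                    // e.g. DC / Planar / Bilinear (assumed)
    const bool horizontal = (intraMode > 2 && intraMode < 18);   // assumed horizontal-angle range
    if (nonAngular) return SmoothingFilterKind::First;           // first filter
    if (horizontal) return SmoothingFilterKind::Second;          // second filter
    return SmoothingFilterKind::Third;                           // third filter (vertical-angle)
  }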
Setting of filter coefficients
All coefficients of the filters used in the above filtering schemes can be fixed coefficients, or different coefficient groups can be selected according to the size of the current prediction block; different filter coefficients can be selected according to the distance from the prediction pixel to the reference pixel, and taps at the same distance can share the same coefficient value.
In the filters used for all the above filtering schemes, the c1 coefficient may use the coefficient values in Table 1.
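As an illustration of such coefficient grouping, the sketch below selects c1 from a hypothetical table indexed by a block-size class and by the distance between the reference pixel and the filter centre. The numeric values and thresholds are placeholders, since Table 1 is not reproduced here.

  // Hypothetical coefficient groups: rows are block-size classes (small / large),
  // columns are reference distances 1 and 2; taps at equal distance share one value.
  int SelectC1(int blockWidth, int blockHeight, int refDistance) {
    static const int kC1[2][2] = { { 9, 7 },     // small blocks: distance 1, distance 2
                                   { 7, 5 } };   // large blocks: distance 1, distance 2
    const int sizeClass = (blockWidth * blockHeight >= 256) ? 1 : 0;
    const int distClass = (refDistance <= 1) ? 0 : 1;
    return kC1[sizeClass][distClass];
  }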
To save register overhead and reduce the hardware design and implementation cost, the method described in the embodiments of the present application can share the same register with the existing standard technology without adding extra registers. In standards with cross-component prediction techniques, the solution is as follows:
Adapted hardware scheme 1: when the luminance component uses the method described in the embodiments of the present application, the chrominance component does not perform the cross-component prediction technique. For example, in the existing standard AVS3, if the luminance component is smoothly corrected using the method described in the embodiment of the present application, cross-component prediction techniques such as TSCPM, TSCPM_T, TSCPM_L, MCPM_L and MCPM_T are disabled when chrominance component prediction is performed; in the existing standard VVC, cross-component prediction technologies such as CCLM, LM_A and LM_L are disabled;
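The gating implied by hardware scheme 1 can be sketched as the check below; the function name and the surrounding encoder structures are assumptions, and only the mutual exclusion between luma smoothing and chroma cross-component prediction reflects the text.

  // Returns whether chroma cross-component prediction tools (TSCPM, TSCPM_T, TSCPM_L,
  // MCPM_L, MCPM_T in AVS3; CCLM, LM_A, LM_L in VVC) may be evaluated for the current
  // unit under hardware scheme 1.
  bool ChromaCrossComponentAllowed(bool lumaUsedSmoothingCorrection) {
    return !lumaUsedSmoothingCorrection;  // the two tools never run together, so one register suffices
  }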
Adapted hardware scheme 2: when the method described in the embodiment of the present application is applied to the luminance component and the cross-component prediction technique is applied to the chrominance component, the method described in the embodiment of the present application is not applied to the reconstructed pixels of the luminance component used for the cross-component prediction, and this does not affect whether the method described in the embodiment of the present application is applied to the luminance component itself.
The method described in the embodiment of the application is suitable for the intra-frame prediction encoding and decoding part and provides an option for intra-frame prediction operations such as smoothing or local blurring: for regions where the image texture should not be overly sharp, the technique makes the prediction pixels smoother and the prediction block closer to the original image, which ultimately improves the coding efficiency.
In the embodiment of the application, tests were performed on HPM8.0, the official reference software platform of AVS, with smooth filtering applied to the intra-frame prediction block; the results under the All Intra and Random Access test conditions are shown in Tables 2 and 3.
TABLE 2 All Intra test results (presented as an image in the original publication)
TABLE 3 Random Access test results (presented as an image in the original publication)
As can be seen from Tables 2 and 3, the embodiments of the present application provide a good performance improvement under both test conditions.
Under the All Intra test condition, the luminance component shows a significant BDBR saving of 0.44%, and the two chrominance components show BDBR savings of 0.46% and 0.50%, respectively, which clearly shows a substantial performance gain and an effective improvement of the encoder's coding efficiency.
Across resolutions, the embodiment of the application yields a larger coding performance improvement on 4K-resolution video, which benefits the development of future ultra-high-definition video by saving more bit rate and bandwidth for ultra-high-resolution video.
In summary, in the intra-frame prediction process, the prediction block calculated from the intra-frame prediction mode is smooth-filtered, which improves the intra-frame prediction accuracy and effectively improves the coding efficiency. Specifically:
Two schemes for smooth-filtering an intra-coded prediction block are proposed. In the first scheme, a 13-tap filter is used for smoothing under any intra-frame prediction mode: the inner 9-tap rectangular part of the filter operates on prediction pixels, and the 4 outer taps operate on reference pixels at the corresponding positions. In the second scheme, the intra-frame prediction modes are divided into 3 classes, and different filters are used to smoothly correct the different classes of prediction modes;
various filter schemes are proposed, and various combinations are also proposed for filter coefficients;
a hardware-implementation-friendly scheme is provided, which shares a register with the prior art, so the implementation cost is extremely low.
In one possible example, the method described in the embodiments of the present application is combined with existing intra prediction filtering.
Specifically, at the encoding end, after step 140, the existing intra-prediction filtering is applied to the second prediction block to obtain a third prediction block.
At the decoding end, after step 250, the existing intra-frame prediction filtering is adopted for the second prediction block to obtain a third prediction block, and the third prediction block is overlapped with the time domain residual error information to obtain a reconstructed block of the current decoding block.
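A sketch of this cascading is shown below; the two stage functions are placeholders supplied by the caller, not functions defined elsewhere in this document, and the Block alias is an assumption.

  #include <cstdint>
  #include <functional>
  #include <vector>

  using Block = std::vector<int16_t>;

  // Applies the smoothing correction (stage1) and then the existing intra
  // prediction filtering (stage2) to produce the third prediction block.
  Block MakeThirdPredictionBlock(const Block& firstPredictionBlock,
                                 const std::function<Block(const Block&)>& stage1,
                                 const std::function<Block(const Block&)>& stage2) {
    Block second = stage1(firstPredictionBlock);  // second prediction block
    return stage2(second);                        // third prediction block
  }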
In one possible example, the luminance component and the chrominance component each use an independent identification bit to indicate whether intra-frame prediction smooth correction is used in the method described in the embodiment of the present application; that is, the third identification bit includes a third identification bit a for the luminance component and a third identification bit b for the chrominance component, which respectively indicate whether intra-frame prediction smooth correction is used.
In particular, at the encoding end,
for the luminance domain, after filtering each pixel in the original prediction block by using a smoothing correction filter according to the first prediction block to obtain a second prediction block after smoothing correction, the method further comprises:
calculating a rate-distortion cost of the second prediction block;
if the rate distortion cost of the second prediction block is smaller than the rate distortion cost of the original prediction block, setting a third identification bit a of the current coding block to be a first value, wherein the third identification bit a is used for indicating whether the brightness component of the current coding block uses the intra-frame prediction smooth correction or not; transmitting the third identification bit a through a code stream, wherein the first value represents that the brightness component of the current coding block is smoothly corrected by using the intra-frame prediction;
and if the rate distortion cost of the second prediction block is not less than the rate distortion cost of the original prediction block, setting the third identification bit a of the current coding block as a second numerical value, and transmitting the second numerical value through a code stream, wherein the second numerical value is used for indicating that the brightness component of the current coding block does not use the intra-frame prediction smoothing correction.
For the chroma domain, after filtering each pixel in the original prediction block by using a smoothing correction filter according to the first prediction block to obtain a second prediction block after smoothing correction, the method further comprises:
calculating a rate-distortion cost of the second prediction block;
if the rate distortion cost of the second prediction block is smaller than the rate distortion cost of the original prediction block, setting a third identification bit b of the current coding block to be a first value, wherein the third identification bit b is used for indicating whether the chroma component of the current coding block uses the intra-frame prediction smooth correction or not; transmitting the third identification bit b through a code stream, wherein the first value represents that the chroma component of the current coding block is smoothly corrected by using the intra-frame prediction;
and if the rate distortion cost of the second prediction block is not less than the rate distortion cost of the original prediction block, setting the third identification bit b of the current coding block as a second numerical value, and transmitting the second numerical value through a code stream, wherein the second numerical value is used for indicating that the chrominance component of the current coding block does not use the intra-frame prediction smoothing correction.
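The per-component decision above can be sketched as follows; the cost type and structure names are assumptions, while the comparison logic (the flag is set to the first value only when smoothing strictly lowers the rate-distortion cost) follows the text.

  struct IpsComponentFlags { bool lumaUsesSmoothing = false; bool chromaUsesSmoothing = false; };

  // Third identification bit a (luma) and b (chroma), decided independently.
  IpsComponentFlags DecideComponentFlags(double lumaCostSmoothed, double lumaCostOriginal,
                                         double chromaCostSmoothed, double chromaCostOriginal) {
    IpsComponentFlags flags;
    flags.lumaUsesSmoothing = lumaCostSmoothed < lumaCostOriginal;       // strict "smaller than"
    flags.chromaUsesSmoothing = chromaCostSmoothed < chromaCostOriginal;
    return flags;
  }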
At the decoding end, the process includes:
analyzing the code stream, and acquiring a second identification bit of the current decoding block, wherein the second identification bit is used for indicating whether the current decoding block allows using intra-frame prediction smooth correction or not;
if the second identification bit indicates that the intra-frame prediction smooth correction is allowed to be used, analyzing the code stream, and obtaining a third identification bit a and a third identification bit b of the current decoding block, wherein the third identification bit a is used for indicating whether the brightness component of the current decoding block uses the intra-frame prediction smooth correction, and the third identification bit b is used for indicating whether the chroma component of the current decoding block uses the intra-frame prediction smooth correction;
analyzing the code stream, acquiring original residual error information of the current decoding block and an intra-frame prediction mode required to be used, performing inverse transformation and inverse quantization on the original residual error information to obtain time domain residual error information, and acquiring an original prediction block of the current decoding block according to the intra-frame prediction mode required to be used and an adjacent reconstruction block of the current decoding block;
if the third flag bit a indicates that the luminance component of the current decoding block uses the intra-frame prediction smooth correction, performing filling processing on the luminance component of the original prediction block to obtain a first luminance component prediction block after the filling processing;
filtering each pixel in the original prediction block by using a smoothing correction filter according to the first brightness component prediction block to obtain a second brightness component prediction block after smoothing correction;
superposing the second brightness component prediction block on the time domain residual error information to obtain a brightness component reconstruction block of the current decoding block;
if the third identification bit b indicates that the chroma component of the current decoding block uses the intra-frame prediction smooth correction, filling the chroma component of the original prediction block to obtain a first chroma component prediction block after filling;
filtering each pixel in the original prediction block by using a smoothing correction filter according to the first chrominance component prediction block to obtain a second chrominance component prediction block after smoothing correction;
and superposing the second chrominance component prediction block on the time domain residual error information to obtain a chrominance component reconstruction block of the current decoding block.
In a possible example, the reference pixels are first filtered (using a three-tap filter or a five-tap filter), and then the smoothing correction filter described in the embodiment of the present application is used to smoothly correct the current image block. Because the reference pixels have already been filtered, the smoothing correction filter subsequently operates on filtered reference pixels, so the smoothing correction effect may be further improved.
In a specific implementation, at the encoding end, after step 130 and before step 140, the method further includes: and filtering the reference pixels in the first prediction block by using a three-tap filter or a five-tap filter to obtain a filtered first prediction block.
At the decoding end, after step 240 and before step 250, the method further comprises: and filtering the reference pixels in the first prediction block by using a three-tap filter or a five-tap filter to obtain a filtered first prediction block.
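A possible three-tap reference smoothing is sketched below with a [1, 2, 1] / 4 kernel; the text only requires a three-tap or five-tap filter, so this particular kernel and the end-sample handling are assumptions.

  #include <cstdint>
  #include <vector>

  // Smooths one reference line (a row or a column) before the smoothing
  // correction filter is applied; end samples are repeated at the borders.
  std::vector<int16_t> PreFilterReferenceLine(const std::vector<int16_t>& ref) {
    std::vector<int16_t> out(ref.size());
    for (size_t i = 0; i < ref.size(); ++i) {
      const int left  = ref[i == 0 ? 0 : i - 1];
      const int right = ref[i + 1 < ref.size() ? i + 1 : i];
      out[i] = static_cast<int16_t>((left + 2 * ref[i] + right + 2) >> 2);
    }
    return out;
  }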
The embodiment of the application provides an encoding device, which may be a video encoder or a video codec. In particular, the encoding device is configured to perform the steps performed by the video encoder in the above encoding method. The encoding device provided by the embodiment of the application may include modules corresponding to the corresponding steps.
In the embodiment of the present application, functional modules of the encoding apparatus may be divided according to the above method example, for example, each functional module may be divided according to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 11 shows a possible structure diagram of the coding apparatus according to the above embodiment, in the case of dividing each functional module according to each function. As shown in fig. 11, the encoding device 11 includes a dividing unit 110, a determining unit 111, a filling unit 112, and a smoothing correction unit 113.
A dividing unit 110, configured to divide an image, and obtain coding information of a current coding block, where the coding information includes a first flag bit and a second flag bit, the first flag bit is used to indicate whether the current coding block allows intra prediction filtering, and the second flag bit is used to indicate whether the current coding block allows intra prediction smooth correction;
a determining unit 111, configured to determine an original prediction block of a current coding block;
a padding unit 112, configured to determine, according to the first flag and the second flag, that the intra-prediction smoothing correction is allowed to be used, and perform padding processing on the original prediction block to obtain a first prediction block after padding processing;
and a smoothing correction unit 113, configured to filter, according to the first prediction block, each pixel in the original prediction block by using a smoothing correction filter, so as to obtain a second prediction block after smoothing correction.
In this possible example, the apparatus further includes a setting unit 114, configured to set a third flag bit of the current coding block to a first value if the rate-distortion cost of the second prediction block is smaller than the rate-distortion cost of the original prediction block, and transmit the first value via a code stream, where the third flag bit is used to indicate whether the current coding block uses the intra-prediction smoothing correction, and the first value indicates that the current coding block uses the intra-prediction smoothing correction;
and if the rate distortion cost of the second prediction block is not less than the rate distortion cost of the original prediction block, setting the third identification bit of the current coding block as a second numerical value, and transmitting the second numerical value through a code stream, wherein the second numerical value is used for indicating that the current coding block does not use the intra-frame prediction smooth correction.
In this possible example, the width and height of the current coding block are not greater than 64 pixels.
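The size restriction above amounts to the following check (a trivial sketch; the function name is illustrative, while the 64-pixel limit is taken from the text).

  // The smoothing correction is only considered for blocks whose width and
  // height are both at most 64 pixels.
  bool SmoothingCorrectionSizeAllowed(int width, int height) {
    return width <= 64 && height <= 64;
  }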
In this possible example, the smoothing correction filter is configured to filter a first reference pixel on an upper side boundary of the first prediction block, a second reference pixel on a left side boundary, and a currently processed prediction pixel in the original prediction block, where the first reference pixel includes two reference pixels whose difference between an abscissa and an abscissa of a center pixel of the currently processed prediction pixel is 2, the second reference pixel includes two reference pixels whose difference between an ordinate and an ordinate of the center pixel of the currently processed prediction pixel is 2, and the upper side boundary and the left side boundary are filling regions of the first prediction block relative to the original prediction block.
In this possible example, the smooth correction filter includes a first thirteen-tap filter;
The first thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+2)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
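With the example coefficient values given later (c1 = 7, c2 = 20, c3 = 26, c4 = 44), the thirteen weights sum to 4·7 + 4·20 + 4·26 + 44 = 256, so an integer implementation can normalize with a single right shift. The small check below is a sketch of that property, not part of the described method.

  #include <cassert>

  int main() {
    const int c1 = 7, c2 = 20, c3 = 26, c4 = 44;
    assert(4 * c1 + 4 * c2 + 4 * c3 + c4 == 256);  // allows normalization by ">> 8" with rounding
    return 0;
  }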
In this possible example, the smoothing correction filter is determined by the intra prediction mode used by the current coding block;
if the intra-frame prediction mode used by the current coding block is a non-angle type prediction mode, filtering by adopting a first filter;
if the intra-frame prediction mode used by the current coding block is a horizontal angle prediction mode, filtering by adopting a second filter;
and if the intra-frame prediction mode used by the current coding block is a vertical angle prediction mode, filtering by adopting a third filter.
In this possible example, the first filter comprises a first thirteen-tap filter;
The first thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+2)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the second filter comprises a first eleven-tap filter;
The first eleven-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+2c1·Ref(-1,y+2)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the third filter comprises a second eleven-tap filter;
The second eleven-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+2c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the first filter comprises a second thirteen-tap filter;
The second thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-1,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+1,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the second filter comprises a third eleven-tap filter;
The third eleven-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+2c1·Ref(-1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the third filter comprises a fourth eleven-tap filter;
The fourth eleven-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x-1,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+2c1·Ref(x+1,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the first filter comprises a fifth eleven-tap filter;
The fifth eleven-tap filter includes:
P′(x,y)=c1·Ref(-1,y)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the second filter comprises a first ten-tap filter;
the first ten-tap filter comprises:
P′(x,y)=2c1·Ref(-1,y)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the third filter comprises a second ten-tap filter;
The second ten-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the first filter comprises a first fifteen-tap filter;
The first fifteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-1)+c1·Ref(-1,y)+c1·Ref(-1,y+1)+c1·Ref(x-1,-1)+c1·Ref(x,-1)+c1·Ref(x+1,-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the second filter comprises a first twelve-tap filter;
The first twelve-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-1)+2c1·Ref(-1,y)+2c1·Ref(-1,y+1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the third filter comprises a second twelve-tap filter;
The second twelve-tap filter includes:
P′(x,y)=c1·Ref(x-1,-1)+c1·Ref(x,-1)+c1·Ref(x+1,-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, c1 is 7, c2 is 20, c3 is 26, and c4 is 44; alternatively,
c1 is determined according to the size of the current coding block and a reference distance, wherein the reference distance is a horizontal distance or a vertical distance between a reference pixel and a central point prediction pixel, the reference pixel is a first reference pixel point on the upper side boundary of the first prediction block or a second reference pixel point on the left side boundary of the first prediction block, the first reference pixel point comprises two reference pixel points of which the difference between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, the second reference pixel point comprises two reference pixel points of which the difference between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, and the central point prediction pixel is the central pixel point of the currently processed prediction pixel point.
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. Of course, the encoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules, for example: the encoding apparatus may further include a storage unit. The memory unit may be used for storing program codes and data of the encoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of an encoding apparatus provided in the embodiment of the present application is shown in fig. 12. In fig. 12, the encoding device 12 includes: a processing module 120 and a communication module 121. The processing module 120 is used for controlling and managing the actions of the encoding apparatus, for example, executing the steps performed by the dividing unit 110, the determining unit 111, the filling unit 112, the smoothing correction unit 113, the setting unit 114, and/or other processes for performing the techniques described herein. The communication module 121 is used to support interaction between the encoding apparatus and other devices. As shown in fig. 12, the encoding apparatus may further include a storage module 122, and the storage module 122 is used for storing program codes and data of the encoding apparatus, for example, contents stored in the storage unit.
The Processing module 120 may be a Processor or a controller, and may be, for example, a Central Processing Unit (CPU), a general-purpose Processor, a Digital Signal Processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor may also be a combination of computing functions, e.g., comprising one or more microprocessors, DSPs, and microprocessors, among others. The communication module 121 may be a transceiver, an RF circuit or a communication interface, etc. The storage module 122 may be a memory.
All relevant contents of each scene related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. The encoding device may perform the encoding method, and the encoding device may specifically be a video encoding device or other equipment with a video encoding function.
The application also provides a video encoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the encoding method of the embodiment of the application.
The embodiment of the application provides a decoding device, and the decoding device can be a video decoder or a video codec. In particular, the decoding device is used for executing the steps executed by the video decoder in the above decoding method. The decoding device provided by the embodiment of the application can comprise modules corresponding to the corresponding steps.
In the embodiment of the present application, the decoding apparatus may be divided into functional modules according to the method example, for example, each functional module may be divided according to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 13 shows a schematic diagram of a possible structure of the decoding apparatus according to the above embodiment, in the case of dividing each functional module according to each function. As shown in fig. 13, the decoding apparatus 13 comprises a first parsing unit 130, a second parsing unit 131, a third parsing unit 132, a filling unit 133, a smooth modification unit 134, and a reconstruction unit 135,
a first parsing unit 130, configured to parse the code stream to obtain a second flag of the current decoded block, where the second flag is used to indicate whether the current decoded block allows intra-prediction smoothing correction;
a second parsing unit 131, configured to parse the code stream to obtain a third flag bit of the current decoded block if the second flag bit indicates that the intra-frame prediction smooth correction is allowed to be used, where the third flag bit is used to indicate whether the current decoded block uses the intra-frame prediction smooth correction;
a third parsing unit 132, configured to parse the code stream, obtain original residual information of the current decoded block and an intra-frame prediction mode that needs to be used, perform inverse transformation and inverse quantization on the original residual information to obtain time-domain residual information, and obtain an original prediction block of the current decoded block according to the intra-frame prediction mode that needs to be used and an adjacent reconstructed block of the current decoded block;
a padding unit 133, configured to perform padding on the original prediction block to obtain a first prediction block after padding, if the third flag indicates that the intra-frame prediction smoothing correction is used;
a smoothing correction unit 134, configured to filter, according to the first prediction block, each pixel in the original prediction block by using a smoothing correction filter, so as to obtain a second prediction block after smoothing correction;
a reconstructing unit 135, configured to superimpose the second prediction block on the time-domain residual information to obtain a reconstructed block of the current decoded block.
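The reconstruction performed by the reconstruction unit 135 can be sketched as the clipped addition below; 8-bit samples and the function signature are assumptions.

  #include <algorithm>
  #include <cstdint>
  #include <vector>

  // Adds the time-domain residual to the second prediction block and clips the
  // result to the valid sample range to form the reconstructed block.
  std::vector<uint8_t> ReconstructBlock(const std::vector<int16_t>& prediction,
                                        const std::vector<int16_t>& residual) {
    std::vector<uint8_t> reconstructed(prediction.size());
    for (size_t i = 0; i < prediction.size(); ++i) {
      reconstructed[i] = static_cast<uint8_t>(std::clamp(prediction[i] + residual[i], 0, 255));
    }
    return reconstructed;
  }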
In this possible example, the width and height of the current decoded block are no more than 64 pixels.
In this possible example, the reconstructing unit 135 is further configured to, if the second flag indicates that the intra-prediction smooth correction is not allowed to be used, or if the third flag indicates that the intra-prediction smooth correction is not used, superimpose the original prediction block on the temporal residual information to obtain a reconstructed block of the current decoded block.
In this possible example, the smoothing correction filter is configured to filter a first reference pixel on an upper side boundary of the first prediction block, a second reference pixel on a left side boundary, and a currently processed prediction pixel in the original prediction block, where the first reference pixel includes two reference pixels whose difference between an abscissa and an abscissa of a center pixel of the currently processed prediction pixel is 2, the second reference pixel includes two reference pixels whose difference between an ordinate and an ordinate of the center pixel of the currently processed prediction pixel is 2, and the upper side boundary and the left side boundary are filling regions of the first prediction block relative to the original prediction block.
In this possible example, the smooth correction filter includes a first thirteen-tap filter;
The first thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+2)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the smoothing correction filter is determined by the intra prediction mode used by the current coding block;
if the intra-frame prediction mode used by the current coding block is a non-angle type prediction mode, filtering by adopting a first filter;
if the intra-frame prediction mode used by the current coding block is a horizontal angle prediction mode, filtering by adopting a second filter;
and if the intra-frame prediction mode used by the current coding block is a vertical angle prediction mode, filtering by adopting a third filter.
In this possible example, the first filter comprises a first thirteen-tap filter;
The first thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+2)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the second filter comprises a first eleven-tap filter;
The first eleven-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+2c1·Ref(-1,y+2)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the third filter comprises a second eleven-tap filter;
The second eleven-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+2c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the first filter comprises a second thirteen-tap filter;
The second thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-1,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+1,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the second filter comprises a third eleven-tap filter;
The third eleven-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+2c1·Ref(-1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the third filter comprises a fourth eleven-tap filter;
The fourth eleven-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x-1,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+2c1·Ref(x+1,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the first filter comprises a fifth eleven-tap filter;
The fifth eleven-tap filter includes:
P′(x,y)=c1·Ref(-1,y)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the second filter comprises a first ten-tap filter;
the first ten-tap filter comprises:
P′(x,y)=2c1·Ref(-1,y)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the third filter comprises a second ten-tap filter;
The second ten-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the first filter comprises a first fifteen-tap filter;
The first fifteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-1)+c1·Ref(-1,y)+c1·Ref(-1,y+1)+c1·Ref(x-1,-1)+c1·Ref(x,-1)+c1·Ref(x+1,-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the second filter comprises a first twelve-tap filter;
The first twelve-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-1)+2c1·Ref(-1,y)+2c1·Ref(-1,y+1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, the third filter comprises a second twelve-tap filter;
The second twelve-tap filter includes:
P′(x,y)=c1·Ref(x-1,-1)+c1·Ref(x,-1)+c1·Ref(x+1,-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P′(x, y) is the final predicted value of pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value of pixel (m, n).
In this possible example, c1 is 7, c2 is 20, c3 is 26, and c4 is 44; alternatively,
c1 is determined according to the size of the current coding block and a reference distance, wherein the reference distance is a horizontal distance or a vertical distance between a reference pixel and a central point prediction pixel, the reference pixel is a first reference pixel point on the upper side boundary of the first prediction block or a second reference pixel point on the left side boundary of the first prediction block, the first reference pixel point comprises two reference pixel points of which the difference between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, the second reference pixel point comprises two reference pixel points of which the difference between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, and the central point prediction pixel is the central pixel point of the currently processed prediction pixel point.
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. Of course, the decoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules, for example: the decoding apparatus may further include a storage unit. The memory unit may be used to store program codes and data of the decoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of a decoding apparatus provided in an embodiment of the present application is shown in fig. 14. In fig. 14, the decoding apparatus includes: a processing module 140 and a communication module 141. The processing module 140 is used for controlling and managing the actions of the decoding apparatus, for example, executing the steps performed by the first parsing unit 130, the second parsing unit 131, the third parsing unit 132, the padding unit 133, the smooth modification unit 134, the reconstruction unit 135, and/or other processes for performing the techniques described herein. The communication module 141 is used to support interaction between the decoding apparatus and other devices. As shown in fig. 14, the decoding apparatus may further include a storage module 142, and the storage module 142 is used for storing program codes and data of the decoding apparatus, for example, contents stored in the storage unit.
The Processing module 140 may be a Processor or a controller, and may be, for example, a Central Processing Unit (CPU), a general-purpose Processor, a Digital Signal Processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor may also be a combination of computing functions, e.g., comprising one or more microprocessors, DSPs, and microprocessors, among others. The communication module 141 may be a transceiver, an RF circuit or a communication interface, etc. The storage module 142 may be a memory.
All relevant contents of each scene related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. The decoding device may perform the decoding method, and the decoding device may specifically be a video decoding device or other equipment with a video decoding function.
The application also provides a video decoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the decoding method of the embodiment of the application.
The present application further provides a terminal, including: one or more processors, memory, a communication interface. The memory, communication interface, and one or more processors; the memory is used for storing computer program code comprising instructions which, when executed by the one or more processors, cause the terminal to perform the image encoding and/or decoding methods of embodiments of the present application. The terminal can be a video display device, a smart phone, a portable computer and other devices which can process video or play video.
Another embodiment of the present application also provides a computer-readable storage medium including one or more program codes, where the one or more programs include instructions, and when a processor in a decoding apparatus executes the program codes, the decoding apparatus executes the encoding method and the decoding method of the embodiments of the present application.
In another embodiment of the present application, there is also provided a computer program product comprising computer executable instructions stored in a computer readable storage medium; the at least one processor of the decoding device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions to enable the terminal to implement the encoding method and the decoding method of the embodiment of the application.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware or any combination thereof. When implemented using a software program, may take the form of a computer program product, either entirely or partially. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the application to occur, in whole or in part.
The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means.
The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for enabling a device (which may be a single-chip microcomputer, a chip, or the like) or a processor (processor) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (29)

1. A method of encoding, comprising:
dividing an image, and acquiring coding information of a current coding block, wherein the coding information comprises a first identification bit and a second identification bit, the first identification bit is used for indicating whether the current coding block allows intra-frame prediction filtering, and the second identification bit is used for indicating whether the current coding block allows intra-frame prediction smooth correction;
determining an original prediction block of a current coding block;
determining, according to the first identification bit and the second identification bit, that the intra-frame prediction smooth correction is allowed to be used, and performing filling processing on the original prediction block to obtain a first prediction block after filling processing;
and filtering each pixel in the original prediction block by using a smooth correction filter according to the first prediction block to obtain a second prediction block after smooth correction.
2. The method of claim 1, wherein after the filtering of each pixel in the original prediction block according to the first prediction block by using the smooth correction filter to obtain the second prediction block after smooth correction, the method further comprises:
calculating a rate-distortion cost of the second prediction block;
if the rate distortion cost of the second prediction block is smaller than the rate distortion cost of the original prediction block, setting a third identification bit of the current coding block to be a first value, wherein the third identification bit is used for indicating whether the current coding block uses the intra-frame prediction smooth correction or not; transmitting the first value through a code stream, wherein the first value represents that the current coding block uses the intra-frame prediction smooth correction;
and if the rate distortion cost of the second prediction block is not less than the rate distortion cost of the original prediction block, setting the third identification bit of the current coding block as a second numerical value, and transmitting the second numerical value through a code stream, wherein the second numerical value is used for indicating that the current coding block does not use the intra-frame prediction smooth correction.
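The following is a minimal illustrative sketch, not part of the claims, of the rate-distortion decision described in claim 2; the helper rd_cost and the concrete values 1 and 0 are assumptions standing in for the claimed first value and second value.

def choose_third_identification_bit(original_block, smoothed_block, rd_cost):
    # Illustrative sketch of the encoder-side decision of claim 2 (assumption:
    # rd_cost is a hypothetical rate-distortion helper supplied by the caller).
    if rd_cost(smoothed_block) < rd_cost(original_block):
        return 1  # first value: smooth correction is used for the current coding block
    return 0      # second value: smooth correction is not used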
3. The method of claim 1 or 2, wherein the width and height of the current coding block are not greater than 64 pixels.
4. The method according to claim 1 or 2, wherein the smooth correction filter is configured to filter a first reference pixel point on the upper side boundary of the first prediction block, a second reference pixel point on the left side boundary of the first prediction block, and a currently processed prediction pixel point in the original prediction block; the first reference pixel point comprises two reference pixel points whose abscissas differ from the abscissa of the currently processed prediction pixel point by 2, the second reference pixel point comprises two reference pixel points whose ordinates differ from the ordinate of the currently processed prediction pixel point by 2, and the upper side boundary and the left side boundary are padding regions of the first prediction block relative to the original prediction block.
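As a hedged illustration of the filling processing referred to in claims 1 and 4, the sketch below assumes that the first prediction block is the original prediction block extended by one row of reconstructed reference samples above it and one column of reconstructed reference samples to its left (coordinate -1); longer reference rows or columns, needed for taps such as Ref(-1, y+2) near the block edges, are omitted for brevity.

def pad_prediction_block(pred, top_ref, left_ref, corner_ref):
    # pred: H x W original prediction block (list of lists).
    # top_ref: reconstructed row above the block, length >= W, giving Ref(x, -1).
    # left_ref: reconstructed column left of the block, length >= H, giving Ref(-1, y).
    # corner_ref: reconstructed top-left corner sample Ref(-1, -1).
    # Returns an (H+1) x (W+1) block where padded[y+1][x+1] == pred[y][x].
    h, w = len(pred), len(pred[0])
    padded = [[corner_ref] + list(top_ref[:w])]
    for y in range(h):
        padded.append([left_ref[y]] + list(pred[y]))
    return padded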
5. The method of claim 4, wherein the smooth correction filter comprises a first thirteen-tap filter;
the first thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+2)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
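A minimal sketch of the first thirteen-tap filter of claim 5 follows; it is illustrative only. The clamping of out-of-block neighbours and the rounded normalisation by the sum of the coefficients are assumptions not stated in the claim (with c1 = 7, c2 = 20, c3 = 26 and c4 = 44 from claim 11, that sum is 256).

def smooth_13tap(pred, ref_left, ref_top, c1=7, c2=20, c3=26, c4=44):
    # pred: H x W original prediction block (list of lists).
    # ref_left(j): reconstructed value Ref(-1, j); ref_top(i): reconstructed value Ref(i, -1).
    h, w = len(pred), len(pred[0])
    norm = 4 * c1 + 4 * c2 + 4 * c3 + c4  # 256 with the default coefficients

    def p(a, b):
        # original prediction value; out-of-block neighbours are clamped to the
        # nearest in-block pixel (an assumption, the claim only names the taps)
        return pred[min(max(b, 0), h - 1)][min(max(a, 0), w - 1)]

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = (c1 * (ref_left(y - 2) + ref_left(y + 2) + ref_top(x - 2) + ref_top(x + 2))
                 + c2 * (p(x - 1, y - 1) + p(x + 1, y - 1) + p(x - 1, y + 1) + p(x + 1, y + 1))
                 + c3 * (p(x, y - 1) + p(x - 1, y) + p(x + 1, y) + p(x, y + 1))
                 + c4 * p(x, y))
            out[y][x] = (s + norm // 2) // norm  # rounded normalisation (assumed)
    return out

The callables ref_left and ref_top could, for instance, be built from the padded block of the previous sketch or read directly from the reconstructed neighbours; how they are obtained is not specified by the claim.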
6. The method of claim 1 or 2, wherein the smoothing correction filter is determined by an intra prediction mode used by the current coding block;
if the intra-frame prediction mode used by the current coding block is a non-angle type prediction mode, filtering by adopting a first filter;
if the intra-frame prediction mode used by the current coding block is a horizontal angle prediction mode, filtering by adopting a second filter;
and if the intra-frame prediction mode used by the current coding block is a vertical angle prediction mode, filtering by adopting a third filter.
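The three-way selection of claim 6 can be sketched as a simple lookup; the mode-category labels and the filter arguments below are placeholders, and only the mapping itself follows the claim.

def select_smoothing_filter(mode_category, first_filter, second_filter, third_filter):
    # Illustrative mapping for claim 6; the category strings are hypothetical labels.
    table = {
        "non_angular": first_filter,          # e.g. DC- or Planar-type modes
        "horizontal_angular": second_filter,
        "vertical_angular": third_filter,
    }
    return table[mode_category]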
7. The method of claim 6, wherein the first filter comprises a first thirteen-tap filter;
the first thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+2)
the second filter comprises a first eleven-tap filter;
the first eleven-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+2c1·Ref(-1,y+2)
the third filter comprises a second eleven-tap filter;
the second eleven-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+2c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
8. The method of claim 6, wherein the first filter comprises a second thirteen-tap filter;
the second thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-1,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+1,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+1)
the second filter comprises a third eleven-tap filter;
the third eleven-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+2c1·Ref(-1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n);
the third filter comprises a fourth eleven-tap filter;
the fourth eleven-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x-1,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+2c1·Ref(x+1,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
9. The method of claim 6, wherein the first filter comprises a fifth eleven-tap filter;
the fifth eleven-tap filter includes:
P′(x,y)=c1·Ref(-1,y)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
the second filter comprises a first ten-tap filter;
the first ten-tap filter comprises:
P′(x,y)=2c1·Ref(-1,y)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
the third filter comprises a second ten-tap filter;
the second ten-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
10. The method of claim 6, wherein the first filter comprises a first fifteen-tap filter;
the first fifteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-1)+c1·Ref(-1,y)+c1·Ref(-1,y+1)+c1·Ref(x-1,-1)+c1·Ref(x,-1)+c1·Ref(x+1,-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
the second filter comprises a first twelve-tap filter;
the first twelve-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-1)+2c1·Ref(-1,y)+2c1·Ref(-1,y+1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
the third filter comprises a second twelve-tap filter;
the second twelve-tap filter includes:
P′(x,y)=c1·Ref(x-1,-1)+c1·Ref(x,-1)+c1·Ref(x+1,-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
11. The method of any one of claims 5 or 7-10, wherein c1 is 7, c2 is 20, c3 is 26, and c4 is 44; or,
c1 is determined according to the size of the current coding block and a reference distance, wherein the reference distance is a horizontal distance or a vertical distance between a reference pixel and a central point prediction pixel, the reference pixel is a first reference pixel point on the upper side boundary of the first prediction block or a second reference pixel point on the left side boundary of the first prediction block, the first reference pixel point comprises two reference pixel points of which the difference between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, the second reference pixel point comprises two reference pixel points of which the difference between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, and the central point prediction pixel is the central pixel point of the currently processed prediction pixel point.
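As an observation rather than a statement from the claims: with the default coefficients of claim 11, the thirteen taps of claim 5 carry a total weight of 4·c1 + 4·c2 + 4·c3 + c4 = 28 + 80 + 104 + 44 = 256, so the filtered value could be normalised with an 8-bit right shift.

# Quick check of the observation above (not part of the claims).
c1, c2, c3, c4 = 7, 20, 26, 44
assert 4 * c1 + 4 * c2 + 4 * c3 + c4 == 256 == 1 << 8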
12. A method of decoding, comprising:
parsing a code stream, and acquiring a second identification bit of a current decoding block, wherein the second identification bit is used for indicating whether the current decoding block allows using intra-frame prediction smooth correction;
if the second identification bit indicates that the intra-frame prediction smooth correction is allowed to be used, parsing the code stream to obtain a third identification bit of the current decoding block, wherein the third identification bit is used for indicating whether the current decoding block uses the intra-frame prediction smooth correction;
parsing the code stream, acquiring original residual error information of the current decoding block and an intra-frame prediction mode required to be used, performing inverse transformation and inverse quantization on the original residual error information to obtain time domain residual error information, and acquiring an original prediction block of the current decoding block according to the intra-frame prediction mode required to be used and an adjacent reconstructed block of the current decoding block;
if the third identification bit indicates that the intra-frame prediction smooth correction is used, filling the original prediction block to obtain a first prediction block after filling;
filtering each pixel in the original prediction block by using a smooth correction filter according to the first prediction block to obtain a second prediction block after smooth correction;
and superposing the second prediction block on the time domain residual error information to obtain a reconstructed block of the current decoding block.
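The decoder-side flow of claim 12 is sketched below at a high level; every helper is passed in as a callable because the claim does not define them, and only the ordering of the steps follows the claim text.

def reconstruct_block(second_bit, third_bit, residual, intra_predict, pad, smooth):
    # residual: time-domain residual after inverse transformation and inverse quantization.
    # intra_predict(): returns the original prediction block (list of lists).
    # pad(pred): returns the padded first prediction block.
    # smooth(padded, pred): returns the second prediction block after smooth correction.
    pred = intra_predict()
    if second_bit and third_bit:  # smooth correction allowed and actually used
        pred = smooth(pad(pred), pred)
    # superpose the prediction block on the residual to obtain the reconstructed block
    return [[p + r for p, r in zip(prow, rrow)] for prow, rrow in zip(pred, residual)]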
13. The method of claim 12, wherein the width and height of the current decoding block are each not greater than 64 pixels.
14. The method according to claim 12 or 13, characterized in that the method further comprises:
and if the second identification bit indicates that the intra-frame prediction smooth correction is not allowed to be used, or if the third identification bit indicates that the intra-frame prediction smooth correction is not used, superposing the original prediction block on the time domain residual error information to obtain a reconstructed block of the current decoding block.
15. The method according to claim 12 or 13, wherein the smoothing correction filter is configured to filter a first reference pixel point on the upper side boundary of the first prediction block, a second reference pixel point on the left side boundary of the first prediction block, and a currently processed prediction pixel point in the original prediction block; the first reference pixel point comprises two reference pixel points whose abscissas differ from the abscissa of the currently processed prediction pixel point by 2, the second reference pixel point comprises two reference pixel points whose ordinates differ from the ordinate of the currently processed prediction pixel point by 2, and the upper side boundary and the left side boundary are padding regions of the first prediction block relative to the original prediction block.
16. The method of claim 15, wherein the smoothing correction filter comprises a first thirteen-tap filter;
the first thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+2)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
17. The method according to claim 12 or 13, wherein the smoothing correction filter is determined by an intra prediction mode used by the current coding block;
if the intra-frame prediction mode used by the current coding block is a non-angle type prediction mode, filtering by adopting a first filter;
if the intra-frame prediction mode used by the current coding block is a horizontal angle prediction mode, filtering by adopting a second filter;
and if the intra-frame prediction mode used by the current coding block is a vertical angle prediction mode, filtering by adopting a third filter.
18. The method of claim 17, wherein the first filter comprises a first thirteen-tap filter;
the first thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+2)
the second filter comprises a first eleven-tap filter;
the first eleven-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-2)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+2c1·Ref(-1,y+2)
the third filter comprises a second eleven-tap filter;
the second eleven-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x-2,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+2c1·Ref(x+2,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
19. The method of claim 17, wherein the first filter comprises a second thirteen-tap filter;
the second thirteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x-1,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c1·Ref(x+1,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+c1·Ref(-1,y+1)
the second filter comprises a third eleven-tap filter;
the third eleven-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)+2c1·Ref(-1,y+1)
the third filter comprises a fourth eleven-tap filter;
the fourth eleven-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x-1,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+2c1·Ref(x+1,-1)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
20. The method of claim 19, wherein the first filter comprises a fifth eleven-tap filter;
the fifth eleven-tap filter includes:
P′(x,y)=c1·Ref(-1,y)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c1·Ref(x,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
the second filter comprises a first ten-tap filter;
the first ten-tap filter comprises:
P′(x,y)=2c1·Ref(-1,y)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
the third filter comprises a second ten-tap filter;
the second ten-tap filter includes:
P′(x,y)=c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+2c1·Ref(x,-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
21. The method of claim 17, wherein the first filter comprises a first fifteen-tap filter;
the first fifteen-tap filter includes:
P′(x,y)=c1·Ref(-1,y-1)+c1·Ref(-1,y)+c1·Ref(-1,y+1)+c1·Ref(x-1,-1)+c1·Ref(x,-1)+c1·Ref(x+1,-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
the second filter comprises a first twelve-tap filter;
the first twelve-tap filter includes:
P′(x,y)=2c1·Ref(-1,y-1)+2c1·Ref(-1,y)+2c1·Ref(-1,y+1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
the third filter comprises a second twelve-tap filter;
the second twelve-tap filter includes:
P′(x,y)=c1·Ref(x-1,-1)+c1·Ref(x,-1)+c1·Ref(x+1,-1)+c2·P(x-1,y-1)+c3·P(x,y-1)+c2·P(x+1,y-1)+c3·P(x-1,y)+c4·P(x,y)+c3·P(x+1,y)+c2·P(x-1,y+1)+c3·P(x,y+1)+c2·P(x+1,y+1)
wherein P'(x, y) is the final predicted value of the pixel (x, y) in the first prediction block, c1, c2, c3 and c4 are filter coefficients, P(a, b) is the original predicted value of the pixel (a, b) in the first prediction block, the value ranges of x and y do not exceed the width and height of the current coding unit block, and Ref(m, n) is the reconstructed value at the pixel (m, n).
22. The method of any one of claims 16 or 18-21, wherein c1 is 7, c2 is 20, c3 is 26, and c4 is 44; or,
c1 is determined according to the size of the current coding block and a reference distance, wherein the reference distance is a horizontal distance or a vertical distance between a reference pixel and a central point prediction pixel, the reference pixel is a first reference pixel point on the upper side boundary of the first prediction block or a second reference pixel point on the left side boundary of the first prediction block, the first reference pixel point comprises two reference pixel points of which the difference between the abscissa and the abscissa of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, the second reference pixel point comprises two reference pixel points of which the difference between the ordinate and the ordinate of the central pixel point of the currently processed prediction pixel point is less than or equal to 2, and the central point prediction pixel is the central pixel point of the currently processed prediction pixel point.
23. An encoding apparatus, comprising:
a dividing unit, configured to divide an image and acquire coding information of a current coding block, wherein the coding information comprises a first identification bit and a second identification bit, the first identification bit is used for indicating whether the current coding block allows intra-frame prediction filtering to be used, and the second identification bit is used for indicating whether the current coding block allows intra-frame prediction smooth correction to be used;
a determining unit, configured to determine an original prediction block of a current coding block;
a filling unit, configured to determine, according to the first identification bit and the second identification bit, that the intra-frame prediction smooth correction is allowed to be used, and perform filling processing on the original prediction block to obtain a first prediction block after the filling processing;
and the smoothing correction unit is used for filtering each pixel in the original prediction block by using a smoothing correction filter according to the first prediction block to obtain a second prediction block after smoothing correction.
24. A decoding apparatus, comprising:
a first parsing unit, configured to parse a code stream and acquire a second identification bit of a current decoding block, wherein the second identification bit is used for indicating whether the current decoding block allows using intra-frame prediction smooth correction;
a second parsing unit, configured to parse the code stream to obtain a third identification bit of the current decoding block if the second identification bit indicates that the intra-frame prediction smooth correction is allowed to be used, where the third identification bit is used to indicate whether the current decoding block uses the intra-frame prediction smooth correction;
a third parsing unit, configured to parse the code stream, obtain original residual information of the current decoded block and an intra-frame prediction mode that needs to be used, perform inverse transformation and inverse quantization on the original residual information to obtain time-domain residual information, and obtain an original prediction block of the current decoded block according to the intra-frame prediction mode that needs to be used and an adjacent reconstructed block of the current decoded block;
a padding unit, configured to perform padding processing on the original prediction block to obtain a padded first prediction block if the third identification bit indicates that the intra-frame prediction smooth correction is used;
a smoothing correction unit, configured to filter, according to the first prediction block, each pixel in the original prediction block by using a smoothing correction filter, so as to obtain a second prediction block after smoothing correction;
and the reconstruction unit is used for superposing the second prediction block on the time domain residual error information to obtain a reconstruction block of the current decoding block.
25. An encoder comprising a non-volatile storage medium and a central processor, wherein the non-volatile storage medium stores an executable program, the central processor is coupled to the non-volatile storage medium, and when the executable program is executed by the central processor, the encoder performs the encoding method according to any one of claims 1 to 11.
26. A decoder comprising a non-volatile storage medium and a central processor, wherein the non-volatile storage medium stores an executable program, the central processor is coupled to the non-volatile storage medium, and when the executable program is executed by the central processor, the decoder performs the decoding method according to any one of claims 12 to 22.
27. A terminal, characterized in that the terminal comprises: one or more processors, a memory, and a communication interface; the memory and the communication interface are coupled to the one or more processors; the terminal communicates with other devices via the communication interface; the memory is configured to store computer program code, the computer program code comprising instructions, wherein
the instructions, when executed by the one or more processors, cause the terminal to perform the method of any of claims 1-11 or any of claims 12-22.
28. A computer program product comprising instructions for causing a terminal to perform the method according to any one of claims 1 to 11 or any one of claims 12 to 22 when the computer program product is run on the terminal.
29. A computer-readable storage medium comprising instructions that, when executed on a terminal, cause the terminal to perform the method of any of claims 1-11 or any of claims 12-22.
CN202010851865.4A 2020-08-21 2020-08-21 Encoding method, decoding method and related device Withdrawn CN114079791A (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202010851865.4A CN114079791A (en) 2020-08-21 2020-08-21 Encoding method, decoding method and related device
CN202180044003.3A CN115769573A (en) 2020-08-21 2021-07-07 Encoding method, decoding method and related device
PCT/CN2021/105014 WO2022037300A1 (en) 2020-08-21 2021-07-07 Encoding method, decoding method, and related devices
TW110126411A TW202209882A (en) 2020-08-21 2021-07-19 Encoding method, decoding method, and related devices

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010851865.4A CN114079791A (en) 2020-08-21 2020-08-21 Encoding method, decoding method and related device

Publications (1)

Publication Number Publication Date
CN114079791A true CN114079791A (en) 2022-02-22

Family

ID=80282729

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202010851865.4A Withdrawn CN114079791A (en) 2020-08-21 2020-08-21 Encoding method, decoding method and related device
CN202180044003.3A Pending CN115769573A (en) 2020-08-21 2021-07-07 Encoding method, decoding method and related device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202180044003.3A Pending CN115769573A (en) 2020-08-21 2021-07-07 Encoding method, decoding method and related device

Country Status (3)

Country Link
CN (2) CN114079791A (en)
TW (1) TW202209882A (en)
WO (1) WO2022037300A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116156180A (en) * 2023-04-19 2023-05-23 北京中星微人工智能芯片技术有限公司 Intra-frame prediction method, image encoding method, image decoding method, and apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101681301B1 (en) * 2010-08-12 2016-12-01 에스케이 텔레콤주식회사 Method and Apparatus for Encoding/Decoding of Video Data Capable of Skipping Filtering Mode
CN107801024B (en) * 2017-11-09 2019-07-12 北京大学深圳研究生院 A kind of boundary filtering method for intra prediction

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116156180A (en) * 2023-04-19 2023-05-23 北京中星微人工智能芯片技术有限公司 Intra-frame prediction method, image encoding method, image decoding method, and apparatus
CN116156180B (en) * 2023-04-19 2023-06-23 北京中星微人工智能芯片技术有限公司 Intra-frame prediction method, image encoding method, image decoding method, and apparatus

Also Published As

Publication number Publication date
TW202209882A (en) 2022-03-01
WO2022037300A1 (en) 2022-02-24
CN115769573A (en) 2023-03-07

Similar Documents

Publication Publication Date Title
JP2019508971A (en) Predicting filter coefficients from fixed filters for video coding
JP2020501434A (en) Indication of the use of bilateral filters in video coding
WO2021238540A1 (en) Image encoding method, image decoding method, and related apparatuses
WO2021185257A1 (en) Image coding method, image decoding method and related apparatuses
CN115834912A (en) Method and apparatus for encoding video
WO2021244197A1 (en) Image encoding method, image decoding method, and related apparatuses
KR20230145064A (en) Intra-block copy scratch framebuffer
JP2024029063A (en) Position-dependent spatial variation transform for video coding
CN114071161B (en) Image encoding method, image decoding method and related devices
WO2022037300A1 (en) Encoding method, decoding method, and related devices
WO2022022622A1 (en) Image coding method, image decoding method, and related apparatus
CN117813816A (en) Method and apparatus for decoder-side intra mode derivation
CN112055211B (en) Video encoder and QP setting method
CN113965764B (en) Image encoding method, image decoding method and related device
RU2772813C1 (en) Video encoder, video decoder, and corresponding methods for encoding and decoding
CN117730531A (en) Method and apparatus for decoder-side intra mode derivation
CN111669583A (en) Image prediction method, device, equipment, system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220222