CN116250240A - Image encoding method, image decoding method and related devices - Google Patents


Info

Publication number
CN116250240A
Authority
CN
China
Prior art keywords
intra
prediction
block
mode
indication information
Prior art date
Legal status
Pending
Application number
CN202180060486.6A
Other languages
Chinese (zh)
Inventor
谢志煌 (Xie Zhihuang)
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Publication of CN116250240A (legal status: pending)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An image encoding method, an image decoding method, and related devices. The image encoding method comprises the following steps: determining intra-frame prediction filtering indication information of a current coding block; if it is determined, according to the intra-frame prediction filtering indication information, that the current coding block needs to use a first intra-frame prediction filtering mode, setting a first use identification bit of the first intra-frame prediction filtering mode of the current coding block to "allowed to be used"; writing the intra-frame prediction filtering indication information, the first intra-frame prediction filtering mode, and the first use identification bit into a code stream; and predicting the current coding block according to the first intra-frame prediction filtering mode to obtain a prediction block, encoding the prediction block, and writing the encoding result into the code stream. The method gives intra prediction the option of smoothing or locally blurring the parts of the image texture that do not need to be sharpened, so that the predicted pixels are smoother, the prediction block is closer to the original image, and the coding efficiency is ultimately improved.

Description

Image encoding method, image decoding method and related devices Technical Field
The present disclosure relates to the field of electronic devices, and in particular, to an image encoding method, an image decoding method, and related devices.
Background
Digital video capabilities can be incorporated into a wide range of devices including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (personal digital assistant, PDAs), laptop or desktop computers, tablet computers, electronic book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and the like.
Digital video devices implement video compression techniques such as those described in the standards defined by the Moving Picture Experts Group (MPEG): MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), as well as extensions of these standards, thereby transmitting and receiving digital video information more efficiently. Video devices may more efficiently transmit, receive, encode, decode, and/or store digital video information by implementing these video codec techniques.
With the proliferation of internet video, the demand for higher video compression ratios continues to grow even as digital video compression technology evolves.
Disclosure of Invention
The embodiments of the present application provide an image encoding method, an image decoding method, and related devices, which give intra prediction the option of smoothing or locally blurring the parts of the image texture that do not need to be sharpened, so that the predicted pixels are smoother, the prediction block is closer to the original image, and the coding efficiency is ultimately improved.
In a first aspect, an embodiment of the present application provides an image encoding method, including:
determining intra-prediction filtering indication information of a current coding block, wherein the intra-prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-prediction filtering mode is allowed to be used or not, the second indication information is used for indicating whether a second intra-prediction filtering mode is allowed to be used or not, and the first intra-prediction filtering mode is an intra-prediction filtering IPF mode;
if it is determined, according to the intra-frame prediction filtering indication information, that the current coding block needs to use the first intra-frame prediction filtering mode, setting a first use identification bit of the first intra-frame prediction filtering mode of the current coding block to a first value;
Writing the intra-frame prediction filtering indication information, the first intra-frame prediction filtering mode and the first use identification bit into a code stream;
and predicting the current coding block according to the first intra-frame prediction filtering mode to obtain a prediction block, encoding the prediction block, and writing the encoding result into a code stream.
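The four steps of the first aspect can be sketched as a small, purely illustrative routine. The function name and bit layout below are hypothetical, not the actual patent or AVS3 syntax:

```python
def encode_block_ipf(block_needs_ipf, first_allowed, second_allowed):
    """Return the bits an encoder would emit for one coding block's
    IPF signalling (hypothetical layout, for illustration only)."""
    bitstream = []
    # Step 1: the intra-prediction filtering indication information,
    # one bit per mode ("allowed to be used" or not).
    bitstream.append(1 if first_allowed else 0)   # first indication information
    bitstream.append(1 if second_allowed else 0)  # second indication information
    # Step 2: the first use identification bit is set to the "first value" (1)
    # only when the indication information permits the first mode and the
    # encoder decided this block needs it.
    use_flag = 1 if (first_allowed and block_needs_ipf) else 0
    # Step 3: write the use flag into the code stream as well.
    bitstream.append(use_flag)
    # Step 4 (prediction + residual coding) is omitted here.
    return bitstream
```

For example, a block that needs the first mode when only the first mode is allowed would signal `[1, 0, 1]` under this hypothetical layout.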
Compared with the prior art, the method gives intra prediction the option of smoothing or locally blurring the parts of the image texture that do not need to be sharpened, so that the predicted pixels are smoother, the prediction blocks are closer to the original image, and the coding efficiency is ultimately improved.
In a second aspect, an embodiment of the present application provides an image decoding method, including:
analyzing a code stream, and determining intra-frame prediction filtering indication information and a first use identification bit of a current decoding block, wherein the intra-frame prediction indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-frame prediction filtering mode is allowed to be used or not, the second indication information is used for indicating whether a second intra-frame prediction filtering mode is allowed to be used or not, the first intra-frame prediction filtering mode is an intra-frame prediction filtering IPF mode, and the first use identification bit is a use identification bit of the first intra-frame prediction filtering mode;
and determining, according to the intra-frame prediction filtering indication information and the first use identification bit, to predict the current decoding block by using the first intra-frame prediction filtering mode, to obtain a prediction block of the current decoding block.
Compared with the prior art, the method gives intra prediction the option of smoothing or locally blurring the parts of the image texture that do not need to be sharpened, so that the predicted pixels are smoother, the prediction blocks are closer to the original image, and the coding efficiency is ultimately improved.
In a third aspect, an embodiment of the present application provides an image encoding apparatus, including:
a determining unit configured to determine intra prediction filtering indication information of a current coding block, the intra prediction filtering indication information including first indication information indicating whether a first intra prediction filtering mode is allowed to be used and second indication information indicating whether a second intra prediction filtering mode is allowed to be used, the first intra prediction filtering mode being an intra prediction filtering IPF mode;
a setting unit, configured to set a first use flag of the first intra-prediction filtering mode of the current coding block to a first value if it is determined that the current coding block needs to use the first intra-prediction filtering mode according to the intra-prediction filtering indication information;
A transmission unit, configured to write the intra-frame prediction filtering indication information, the first intra-frame prediction filtering mode, and the first use identification bit into a code stream;
and a prediction unit, configured to predict the current coding block according to the first intra-frame prediction filtering mode to obtain a prediction block, encode the prediction block, and write the encoding result into the code stream.
In a fourth aspect, an embodiment of the present application provides an image decoding apparatus, including:
a parsing unit, configured to parse a code stream and determine intra-frame prediction filtering indication information and a first use identification bit of a current decoding block, wherein the intra-frame prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra-frame prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra-frame prediction filtering mode is allowed to be used, the first intra-frame prediction filtering mode is an intra-frame prediction filtering (IPF) mode, and the first use identification bit is the use identification bit of the first intra-frame prediction filtering mode;
and a determining unit, configured to determine, according to the intra-frame prediction filtering indication information and the first use identification bit, to predict the current decoding block by using the first intra-frame prediction filtering mode, to obtain a prediction block of the current decoding block.
In a fifth aspect, embodiments of the present application provide an encoder, comprising: a processor and a memory coupled to the processor; the processor is configured to perform the method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a decoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method according to the second aspect.
In a seventh aspect, an embodiment of the present application provides a terminal, including: one or more processors, memory, and communication interfaces; the memory, the communication interface, and the one or more processors are connected; the terminal communicates with other devices via the communication interface, the memory being for storing computer program code comprising instructions which, when executed by the one or more processors, cause the terminal to perform the method of the first or second aspect.
In an eighth aspect, embodiments of the present application provide a computer-readable storage medium having instructions stored therein, which when run on a computer, cause the computer to perform the method of the first or second aspect described above.
In a ninth aspect, embodiments of the present application provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first or second aspect described above.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the invention, and that other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic block diagram of an encoding tree unit in an embodiment of the present application;
FIG. 2 is a schematic block diagram of a CTU and a coding block CU in an embodiment of the present application;
FIG. 3 is a schematic block diagram of a color format in an embodiment of the present application;
FIG. 4 is a schematic diagram of an IPF in an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating intra prediction filtering according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a video decoding system in an embodiment of the present application;
FIG. 7 is a schematic block diagram of a video encoder in an embodiment of the present application;
FIG. 8 is a schematic block diagram of a video decoder in an embodiment of the present application;
FIG. 9 is a flowchart of an image encoding method according to an embodiment of the present application;
FIG. 10 is a flowchart of an image decoding method according to an embodiment of the present application;
FIG. 11A is a schematic diagram of a filling of a prediction block according to an embodiment of the present application;
FIG. 11B is another schematic illustration of filling of a prediction block in an embodiment of the present application;
FIG. 12 is a functional block diagram of an image encoding device according to an embodiment of the present application;
FIG. 13 is a block diagram of other functional units of the image encoding apparatus in an embodiment of the present application;
FIG. 14 is a functional block diagram of an image decoding apparatus in an embodiment of the present application;
FIG. 15 is a block diagram of other functional units of the image decoding apparatus in an embodiment of the present application.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It will be understood that the terms first, second, etc. as used herein may be used to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another element. For example, a first client may be referred to as a second client, and similarly, a second client may be referred to as a first client, without departing from the scope of the invention. Both the first client and the second client are clients, but they are not the same client.
First, terms used in the embodiments of the present application will be described.
For image division, in order to represent video content more flexibly, the high efficiency video coding (HEVC) standard defines the coding tree unit (CTU), coding unit (CU), prediction unit (PU), and transform unit (TU). CTU, CU, PU, and TU are all types of image blocks.
A coding tree unit (CTU): an image is made up of a plurality of CTUs. A CTU generally corresponds to a square image area and contains the luminance pixels and chrominance pixels of that area (or it may contain only luminance pixels, or only chrominance pixels). The CTU also includes syntax elements indicating how the CTU is divided into at least one coding unit (CU), and a method of decoding each coding unit to obtain a reconstructed image. As shown in fig. 1, the image 10 is made up of a plurality of CTUs (including CTU A, CTU B, CTU C, and so on). The coding information corresponding to a certain CTU contains the luminance values and/or chrominance values of the pixels in the square image area corresponding to that CTU. Furthermore, the coding information corresponding to a certain CTU may further contain syntax elements indicating how the CTU is divided into at least one CU, and a method of decoding each CU to obtain a reconstructed image. The image area corresponding to one CTU may include 64×64, 128×128, or 256×256 pixels. In one example, a 64×64-pixel CTU contains a rectangular pixel lattice of 64 columns of 64 pixels each, each pixel containing a luminance component and/or a chrominance component. A CTU may also correspond to a rectangular image area or an image area of another shape; the image area corresponding to one CTU may have a different number of pixels in the horizontal direction than in the vertical direction, for example 64×128 pixels.
A coding unit (CU): as shown in fig. 2, a CTU may be further divided into coding units. A CU typically corresponds to an A×B rectangular area of the image containing A×B luminance pixels and/or their corresponding chrominance pixels, where A is the width of the rectangle and B its height. A and B may be the same or different, and their values are typically integer powers of 2, e.g. 128, 64, 32, 16, 8, or 4. The width in the embodiments of the present application refers to the length along the X-axis direction (horizontal direction) in the two-dimensional rectangular coordinate system XoY shown in fig. 1, and the height refers to the length along the Y-axis direction (vertical direction) in the same coordinate system. The reconstructed image of a CU may be obtained by adding a prediction image and a residual image: the prediction image may be generated by intra prediction or inter prediction and may specifically be composed of one or more prediction blocks (PB), and the residual image may be generated by inverse quantization and inverse transform of transform coefficients and may specifically be composed of one or more transform blocks (TB). Specifically, a CU contains coding information, including the prediction mode, transform coefficients, and other information; decoding processes such as the corresponding prediction, inverse quantization, and inverse transform are performed on the CU according to this coding information to generate the reconstructed image corresponding to the CU. The relation between the coding tree unit and the coding unit is shown in fig. 2.
The prediction unit (PU) is the basic unit for intra prediction and inter prediction. The motion information defining an image block includes the inter prediction direction, reference frame, motion vector, and so on. The image block being encoded is referred to as the current coding block (CCB), and the image block being decoded is referred to as the current decoding block (CDB); for example, when an image block is being predicted, the current coding block or current decoding block is a prediction block, and when residual processing is being performed on an image block, the current coding block or current decoding block is a transform block. The image in which the current coding block or current decoding block is located is referred to as the current frame. In the current frame, image blocks located to the left of or above the current block (the left being the negative X-axis direction and above being the positive Y-axis direction) may lie inside the current frame and have already completed encoding/decoding processing, yielding reconstructed images; these are referred to as reconstructed blocks, and information such as their coding modes and reconstructed pixels is available. A frame whose encoding/decoding has been completed before the current frame is encoded/decoded is referred to as a reconstructed frame. When the current frame is a unidirectionally predicted frame (P frame) or a bidirectionally predicted frame (B frame), it has one or two reference frame lists, referred to as L0 and L1, each containing at least one reconstructed frame, referred to as a reference frame of the current frame. Reference frames provide reference pixels for inter prediction of the current frame.
The transform unit (TU) processes the residual between the original image block and the predicted image block.
Pixels (also referred to as pixel points) are the points in an image, such as the pixels in a coding block, the pixels in a luminance-component pixel block (also referred to as luminance pixels), the pixels in a chrominance-component pixel block (also referred to as chrominance pixels), and so on.
Samples (also called pixel values or sample values) are the values of a pixel point: in the luminance component domain, the pixel value is the luminance (i.e. the gray-scale value); in the chrominance component domain, the pixel value is the chrominance value (i.e. color and saturation). Depending on the processing stage, the samples of one pixel include the original sample, the predicted sample, and the reconstructed sample.
Direction description: the horizontal direction is the direction along the X-axis in the two-dimensional rectangular coordinate system XoY shown in fig. 1, and the vertical direction is the negative Y-axis direction in the same coordinate system.
Intra prediction generates a predicted image of the current block from spatially neighboring pixels of the current block. Each intra prediction mode corresponds to one method of generating the predicted image. The intra prediction unit is divided according to a 2N×2N division mode or an N×N division mode: the 2N×2N mode does not divide the image block, while the N×N mode divides the image block into four equally sized sub-image blocks.
Typically, digital video compression techniques work on video sequences whose color encoding is YCbCr (also referred to as YUV), in color format 4:2:0, 4:2:2, or 4:4:4. Y represents luminance (Luma), i.e. the gray-scale value; Cb represents the blue chrominance component and Cr the red chrominance component; U and V represent chrominance (Chroma), describing color and saturation. In terms of color format, 4:2:0 means 4 luminance components and 2 chrominance components per 4 pixels (YYYYCbCr), 4:2:2 means 4 luminance components and 4 chrominance components per 4 pixels (YYYYCbCrCbCr), and 4:4:4 means full-resolution chrominance (YYYYCbCrCbCrCbCrCbCr). Fig. 3 shows the component distributions in the different color formats, with white circles being the Y component and gray circles the UV components.
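As an illustration of these subsampling ratios, a small helper (hypothetical, not part of any codec API) computes the size of one chroma plane from the luma plane size:

```python
def chroma_plane_size(width, height, color_format):
    """Return (chroma_width, chroma_height) for one chroma component
    given the luma plane dimensions. Illustrative helper only."""
    if color_format == "4:2:0":   # chroma halved horizontally and vertically
        return width // 2, height // 2
    if color_format == "4:2:2":   # chroma halved horizontally only
        return width // 2, height
    if color_format == "4:4:4":   # full-resolution chroma
        return width, height
    raise ValueError("unknown color format: " + color_format)
```

For example, a 1920×1080 frame in 4:2:0 format carries 960×540 Cb and Cr planes.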
The intra-frame prediction part of digital video coding and decoding mainly predicts the current coding unit block by referring to the image information of neighboring blocks within the current frame, computes the residual between the prediction block and the original image block to obtain residual information, and transmits the residual information to the decoding end after transformation, quantization, and related processes. After receiving and parsing the code stream, the decoding end recovers the residual information through steps such as inverse transformation and inverse quantization, and superimposes it on the predicted image block obtained by its own prediction to obtain the reconstructed image block. In this process, intra-frame prediction generally predicts the current coding block by means of angular and non-angular modes to obtain a prediction block, selects the optimal prediction mode of the current coding unit according to rate-distortion information computed from the prediction block and the original block, and writes that prediction mode into the code stream for the decoding end. The decoding end parses the prediction mode, predicts the predicted image of the current decoding block, and superimposes the residual pixels carried in the code stream to obtain the reconstructed image.
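The predict/residual/reconstruct loop described above can be sketched numerically. Transform, quantization, and entropy coding are omitted, so the reconstruction here is exact; with quantization, the decoder would add back an approximate residual instead:

```python
def residual(original, prediction):
    # Encoder side: residual information = original block - prediction block.
    return [o - p for o, p in zip(original, prediction)]

def reconstruct(prediction, res, bit_depth=8):
    # Decoder side: prediction + residual, clipped to the valid sample range.
    hi = (1 << bit_depth) - 1
    return [min(max(p + r, 0), hi) for p, r in zip(prediction, res)]

orig = [120, 130, 140, 150]
pred = [118, 131, 137, 149]
res = residual(orig, pred)             # [2, -1, 3, 1]
assert reconstruct(pred, res) == orig  # lossless when the residual is not quantized
```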
Across successive generations of digital video coding standards, the non-angular modes have remained relatively stable, consisting of a mean mode and a planar mode, while the number of angular modes has grown with each standard. Taking the international H-series standards as an example, H.264/AVC has only 8 angular prediction modes and 1 non-angular prediction mode; H.265/HEVC extends this to 33 angular prediction modes and 2 non-angular prediction modes; and the latest versatile video coding standard, H.266/VVC, adopts 67 prediction modes, retaining 2 non-angular modes and expanding the angular modes from the 33 of H.265 to 65. With more angular modes, intra prediction becomes more accurate, meeting the demands of today's high-definition and ultra-high-definition video. The Chinese digital audio and video coding standard AVS3 likewise continues to expand its angular and non-angular modes, but the development of ultra-high-definition digital video places higher requirements on intra prediction that cannot be met simply by adding angular prediction modes or extending wide angles. The AVS3 standard therefore adopts an intra prediction filter (IPF) technique. Because current intra angular prediction does not use all reference pixels, the correlation between some pixels and the current coding unit is easily ignored; the IPF technique improves pixel prediction accuracy through point-wise filtering and can effectively strengthen the spatial correlation, thereby improving intra prediction accuracy.
The IPF technique is illustrated with a top-right-to-bottom-left prediction mode in AVS3, as shown in fig. 4, in which URB denotes the boundary pixels of the left neighboring block adjacent to the current coding unit, MRB denotes the boundary pixels of the upper neighboring block adjacent to the current coding unit, and filter direction denotes the filtering direction. With the prediction direction running from top right to bottom left, the generated prediction values of the current coding unit mainly use the reference pixels of the upper MRB row of neighboring blocks; that is, the predicted pixels of the current coding unit do not refer to the reconstructed pixels of the left neighboring block. However, the current coding unit and the reconstructed block on its left are spatial neighbors, and referring only to the upper MRB pixels while ignoring the left URB pixels easily loses this spatial correlation, degrading the prediction.
The IPF technique applies to all intra prediction modes and is a filtering method for improving the accuracy of intra prediction. It is mainly realized by the following flow:
a) the IPF technique examines the current prediction mode of the coding unit and classifies it as a horizontal-class angular prediction mode, a vertical-class angular prediction mode, or a non-angular prediction mode;
b) according to the class of prediction mode, the IPF technique applies a different filter to the input pixels;
c) according to the distance from the current pixel to the reference pixels, the IPF technique applies different filter coefficients to the input pixels.
The input pixels of the IPF technique are the predicted pixels obtained in each prediction mode, and the output pixels are the final predicted pixels after IPF.
The IPF technique has an enable identification bit, ipf_enable_flag, a binary variable whose value '1' indicates that intra-prediction filtering may be used and whose value '0' indicates that intra-prediction filtering shall not be used. The IPF technique also has a use identification bit, ipf_flag, a binary variable whose value '1' indicates that intra-prediction filtering shall be used and whose value '0' indicates that it shall not be used; if ipf_flag is not present in the bitstream, it defaults to 0.
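The combined semantics of the two flags, including the stated default of 0 when ipf_flag is absent, can be sketched as follows (the function name is illustrative):

```python
def ipf_applies(ipf_enable_flag, ipf_flag=None):
    """True iff intra-prediction filtering is actually used for a block,
    per the stated flag semantics. Illustrative sketch only."""
    if ipf_enable_flag == 0:
        return False        # IPF is not permitted at all
    if ipf_flag is None:
        ipf_flag = 0        # flag absent from the bitstream: defaults to 0
    return ipf_flag == 1
```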
The syntax element ipf_flag is signaled as follows:
[syntax table reproduced as images PCTCN2021109173-APPB-000001 and PCTCN2021109173-APPB-000002 in the original]
the IPF technique described above classifies prediction modes 0, 1 and 2 as non-angular prediction modes and filters the predicted pixels with a first three-tap filter;
classifies prediction modes 3 to 18 and 34 to 50 as vertical-class angular prediction modes and filters the predicted pixels with a first two-tap filter;
and classifies prediction modes 19 to 32 and 51 to 65 as horizontal-class angular prediction modes and filters the predicted pixels with a second two-tap filter.
The first three-tap filter used by the IPF technique is given by:
P′(x,y)=f(x)·P(-1,y)+f(y)·P(x,-1)+(1-f(x)-f(y))·P(x,y)
The first two-tap filter used by the IPF technique is given by:
P′(x,y)=f(x)·P(-1,y)+(1-f(x))·P(x,y)
The second two-tap filter used by the IPF technique is given by:
P′(x,y)=f(y)·P(x,-1)+(1-f(y))·P(x,y)
in the above equations, P′(x, y) is the final prediction value at position (x, y) of the current chroma prediction block; f(x) and f(y) are the horizontal filter coefficient applied to the reconstructed pixels of the left neighboring block and the vertical filter coefficient applied to the reconstructed pixels of the upper neighboring block; P(-1, y) and P(x, -1) are the left reconstructed pixel in row y and the upper reconstructed pixel in column x, respectively; and P(x, y) is the original predicted pixel value in the current chroma component prediction block. The values of x and y do not exceed the width and height of the current coding unit block.
The horizontal and vertical filter coefficients depend on the distance from the predicted pixel to the left and upper reconstructed pixels within the current prediction block. They also depend on the size of the current coding block, and are grouped into different filter coefficient sets according to the size of the current coding unit block.
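The mode classification and the three filtering formulas above can be sketched as follows. This is illustrative only: the coefficient arrays fx and fy stand in for the distance- and size-dependent coefficients of Table 1, and the function names are not from the original.

```python
import numpy as np

def classify_mode(mode):
    # Modes 0-2: non-angular; 3-18 and 34-50: vertical-class;
    # remaining modes (19-32, 51-65) are treated as horizontal-class in this sketch.
    if mode in (0, 1, 2):
        return "non_angular"
    if 3 <= mode <= 18 or 34 <= mode <= 50:
        return "vertical"
    return "horizontal"

def ipf_filter(pred, left, top, fx, fy, mode):
    """Apply the per-pixel IPF formulas to a predicted block.

    pred: HxW predicted block P(x, y); left: H left reconstructed pixels P(-1, y);
    top: W upper reconstructed pixels P(x, -1); fx / fy: per-column / per-row
    coefficients (illustrative values; the real ones come from Table 1)."""
    h, w = pred.shape
    out = pred.astype(np.float64).copy()
    kind = classify_mode(mode)
    for y in range(h):
        for x in range(w):
            if kind == "non_angular":   # first three-tap filter
                out[y, x] = (fx[x] * left[y] + fy[y] * top[x]
                             + (1 - fx[x] - fy[y]) * pred[y, x])
            elif kind == "vertical":    # first two-tap filter (left reference)
                out[y, x] = fx[x] * left[y] + (1 - fx[x]) * pred[y, x]
            else:                       # second two-tap filter (top reference)
                out[y, x] = fy[y] * top[x] + (1 - fy[y]) * pred[y, x]
    return out
```

Note how vertical-class modes blend in the left reconstructed column while horizontal-class modes blend in the upper reconstructed row, matching the observation in fig. 4 that an angular prediction otherwise ignores one spatial neighbor.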
The filter coefficients of the IPF technique are given in table 1.
TABLE 1 intra chroma prediction filter coefficients
[filter coefficient table reproduced as image PCTCN2021109173-APPB-000003 in the original]
FIG. 5 shows the three filtering cases of intra prediction filtering: filtering the prediction values in the current coding unit with reference to the upper reference pixels only; filtering the prediction values in the current coding unit with reference to the left reference pixels only; and filtering the prediction values in the current coding unit block with reference to both the upper and left reference pixels.
Fig. 6 is a block diagram of video coding system 1 of one example described in an embodiment of the present application. As used herein, the term "video coder" generally refers to both a video encoder and a video decoder. In this application, the term "video coding" or "coding" may refer generally to video encoding or video decoding. The video encoder 100 and the video decoder 200 of the video coding system 1 are used to implement the image coding method proposed in the present application.
As shown in fig. 6, video coding system 1 includes a source device 10 and a destination device 20. Source device 10 generates encoded video data. Thus, source device 10 may be referred to as a video encoding device. Destination device 20 may decode the encoded video data generated by source device 10. Thus, destination device 20 may be referred to as a video decoding device. Various implementations of source device 10, destination device 20, or both may include one or more processors and memory coupled to the one or more processors. The memory may include, but is not limited to RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store the desired program code in the form of instructions or data structures that can be accessed by a computer, as described herein.
The source device 10 and the destination device 20 may include a variety of devices including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, vehicle-mounted computers, or the like.
Destination device 20 may receive encoded video data from source device 10 via link 30. Link 30 may comprise one or more media or devices capable of moving encoded video data from source device 10 to destination device 20. In one example, link 30 may include one or more communication media that enable source device 10 to transmit encoded video data directly to destination device 20 in real-time. In this example, source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 20. The one or more communication media may include wireless and/or wired communication media such as a Radio Frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the internet). The one or more communication media may include routers, switches, base stations, or other equipment that facilitate communication from source device 10 to destination device 20. In another example, encoded data may be output from output interface 140 to storage device 40.
The image codec techniques of this disclosure may be applied to video codecs to support a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the internet), encoding of video data for storage on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, video coding system 1 may be used to support unidirectional or bidirectional video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
The video coding system 1 illustrated in fig. 6 is merely an example, and the techniques of this disclosure may be applicable to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between an encoding device and a decoding device. In other examples, the data is retrieved from local memory, streamed over a network, and so forth. The video encoding device may encode and store data to the memory and/or the video decoding device may retrieve and decode data from the memory. In many examples, encoding and decoding are performed by devices that do not communicate with each other, but instead only encode data to memory and/or retrieve data from memory and decode data.
In the example of fig. 6, source device 10 includes a video source 120, a video encoder 100, and an output interface 140. In some examples, output interface 140 may include a modulator/demodulator (modem) and/or a transmitter. Video source 120 may include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.
Video encoder 100 may encode video data from video source 120. In some examples, source device 10 transmits encoded video data directly to destination device 20 via output interface 140. In other examples, the encoded video data may also be stored onto storage device 40 for later access by destination device 20 for decoding and/or playback.
In the example of fig. 6, destination device 20 includes an input interface 240, a video decoder 200, and a display device 220. In some examples, input interface 240 includes a receiver and/or a modem. Input interface 240 may receive encoded video data via link 30 and/or from storage device 40. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. In general, the display device 220 displays decoded video data. The display device 220 may include a variety of display devices, such as a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or other types of display devices.
Although not shown in fig. 6, in some aspects, video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include an appropriate multiplexer-demultiplexer unit or other hardware and software to handle encoding of both audio and video in a common data stream or separate data streams.
Video encoder 100 and video decoder 200 may each be implemented as any of a variety of circuits, such as: one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the present application is implemented in part in software, the device may store instructions for the software in a suitable non-volatile computer-readable storage medium and execute the instructions in hardware using one or more processors to implement the techniques of this application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in the respective device.
Fig. 7 is an example block diagram of a video encoder 100 described in an embodiment of the present application. The video encoder 100 is arranged to output video to a post-processing entity 41. Post-processing entity 41 represents an example of a video entity, such as a Media Aware Network Element (MANE) or a stitching/editing device, that may process encoded video data from video encoder 100. In some cases, post-processing entity 41 may be an instance of a network entity. In some video coding systems, post-processing entity 41 and video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to post-processing entity 41 may be performed by the same device that includes video encoder 100. In one example, post-processing entity 41 is an example of storage device 40 of FIG. 1.
In the example of fig. 7, the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a memory 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103. The prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109. For image block reconstruction, the video encoder 100 also includes an inverse quantizer 104, an inverse transformer 105, and a summer 111. The filter unit 106 represents one or more loop filters, such as a deblocking filter, an Adaptive Loop Filter (ALF), and a Sample Adaptive Offset (SAO) filter. Although the filter unit 106 is shown in fig. 7 as an in-loop filter, in other implementations, the filter unit 106 may be implemented as a post-loop filter. In one example, the video encoder 100 may further include a video data memory, a segmentation unit (not shown).
The video encoder 100 receives video data and stores the video data in a video data memory. The segmentation unit segments the video data into image blocks and these image blocks may be further segmented into smaller blocks, e.g. image block segmentations based on a quadtree structure or a binary tree structure. Prediction processing unit 108 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes. Prediction processing unit 108 may provide the resulting intra-frame, inter-coded blocks to summer 112 to generate residual blocks, and to summer 111 to reconstruct the encoded blocks used as reference pictures. The intra predictor 109 within the prediction processing unit 108 may perform intra-predictive encoding of the current image block with respect to one or more neighboring encoding blocks in the same frame or slice as the current block to be encoded to remove spatial redundancy. The inter predictor 110 within the prediction processing unit 108 may perform inter predictive encoding of the current image block relative to one or more prediction blocks in one or more reference images to remove temporal redundancy. The prediction processing unit 108 provides information indicating the selected intra or inter prediction mode of the current image block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected inter prediction mode.
After the prediction processing unit 108 generates a prediction block of the current image block via inter prediction/intra prediction, the video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded. Summer 112 represents one or more components that perform this subtraction operation. Residual video data in the residual block may be included in one or more TUs and applied to transformer 101. The transformer 101 transforms the residual video data into residual transform coefficients using a transform such as a Discrete Cosine Transform (DCT) or a conceptually similar transform. The transformer 101 may convert the residual video data from a pixel value domain to a transform domain, such as the frequency domain.
The transformer 101 may send the resulting transform coefficients to the quantizer 102. The quantizer 102 quantizes the transform coefficients to further reduce bit rate. In some examples, quantizer 102 may then perform a scan of a matrix including quantized transform coefficients. Alternatively, the entropy encoder 103 may perform the scanning.
After quantization, the quantized transform coefficients are entropy encoded by the entropy encoder 103. For example, the entropy encoder 103 may perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique. After entropy encoding by entropy encoder 103, the encoded bitstream may be transmitted to video decoder 200, or archived for later transmission or retrieval by video decoder 200. The entropy encoder 103 may also entropy encode the syntax elements of the current image block to be encoded.
The inverse quantizer 104 and the inverse transformer 105 apply inverse quantization and inverse transformation, respectively, to reconstruct the residual block in the pixel domain, e.g. for later use as a reference block of a reference image. The summer 111 adds the reconstructed residual block to the prediction block generated by the inter predictor 110 or the intra predictor 109 to generate a reconstructed image block. The filter unit 106 may filter the reconstructed image block to reduce distortion, such as blocking artifacts. The reconstructed image block is then stored as a reference block in memory 107, which may be used by inter predictor 110 as a reference block for inter predicting blocks in subsequent video frames or images.
The video encoder 100 divides the input video into coding tree units, each of which is in turn divided into coding blocks of rectangular or square shape. When the current coding block is coded in an intra prediction mode, the encoder traverses multiple prediction modes for the luma component of the current coding block and selects the optimal mode by rate-distortion cost, then traverses multiple prediction modes for the chroma component and likewise selects the optimal mode by rate-distortion cost. The residual between the original video block and the prediction block is then computed; one path is transformed, quantized, and entropy coded to form the output bitstream, while the other undergoes inverse transform, inverse quantization, loop filtering, and so on to form reconstructed samples used as reference information for subsequent video compression.
The current IPF technique is implemented in the video encoder 100 as follows.
The input digital video information is divided into a plurality of coding tree units at the coding end, each coding tree unit is divided into a plurality of coding units which are rectangular or square, and each coding unit respectively performs an intra-frame prediction process to calculate a prediction block.
For the current coding unit:
(1) if the IPF allowed flag is '1', perform all of the following steps;
(2) if the IPF allowed flag is '0', perform only steps a1), b1), f1) and g1).
a1 Firstly traversing all prediction modes, calculating the prediction pixel under each intra prediction mode, and calculating the rate distortion cost according to the original pixel;
b1 According to the minimum rate distortion cost principle of all the prediction modes, selecting the optimal prediction mode of the current coding unit, and recording the optimal prediction mode information and the rate distortion cost information corresponding to the optimal prediction mode information;
c1 Traversing all intra-frame prediction modes again, starting an IPF technology in the process, and firstly calculating prediction pixels under each intra-frame prediction mode to obtain a prediction block of a current coding unit;
d1 IPF is carried out on the prediction block of the current coding unit, a filter corresponding to the prediction block is selected according to the current prediction mode, a corresponding filter coefficient group is selected according to the size of the current coding unit, and a specific corresponding relation can be checked in table 1;
e1 Calculating to obtain the rate distortion cost information of each prediction mode according to the final prediction pixel and the original pixel obtained by the IPF technology, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f1 If the IPF permission identification bit is '0', writing the prediction mode index recorded in b 1) into the code stream to a decoding end;
if the IPF grant flag is '1', the minimum cost value recorded in b 1) is compared with the minimum cost value recorded in e 1),
if the rate distortion cost in b 1) is smaller, the prediction mode index code recorded in b 1) is used as the optimal prediction mode of the current coding unit to be written into the code stream to the decoding end, and the identification position of the current coding unit of the IPF is marked with a mark position which is not used, so that the IPF technology is not used, and the code stream is written into the decoding end;
if the rate distortion in e 1) is smaller, the prediction mode index code recorded in e 1) is used as the optimal prediction mode of the current coding unit to be written into the code stream to the decoding end, and the identification position of the current coding unit of the IPF is true by using the mark position, which means that the IPF technology is used and the code stream is also written into the decoding end.
g1) Superpose the prediction value and the residual information recovered after transform, quantization and the corresponding inverse operations to obtain the reconstructed block of the current coding unit, which serves as reference information for subsequent coding units.
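The mode decision in steps a1) through f1) amounts to two rate-distortion searches followed by a comparison. A compact sketch (the cost callables are hypothetical stand-ins for the encoder's RD computation):

```python
def choose_intra_mode(modes, cost_plain, cost_ipf, ipf_allowed):
    """Sketch of the encoder decision in steps a1)-f1).

    cost_plain / cost_ipf: hypothetical callables giving the rate-distortion
    cost of a mode without / with intra prediction filtering.
    Returns (best_mode, ipf_flag)."""
    best_plain = min(modes, key=cost_plain)               # a1), b1)
    if not ipf_allowed:
        return best_plain, 0                              # f1), allowed flag '0'
    best_ipf = min(modes, key=cost_ipf)                   # c1)-e1)
    if cost_plain(best_plain) <= cost_ipf(best_ipf):      # f1) comparison
        return best_plain, 0    # signal mode from b1), IPF use flag '0'
    return best_ipf, 1          # signal mode from e1), IPF use flag '1'
```

Note that the best mode with IPF may differ from the best mode without it, which is why both the mode index and the use flag are decided jointly and written together.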
The intra predictor 109 may also provide information indicating the intra prediction mode selected by the current encoding block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
Fig. 8 is an example block diagram of a video decoder 200 described in an embodiment of the present application. In the example of fig. 8, the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a memory 207. The prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209. In some examples, video decoder 200 may perform a decoding process that is substantially reciprocal to the encoding process described with respect to video encoder 100 from fig. 7.
In the decoding process, video decoder 200 receives an encoded video bitstream representing image blocks and associated syntax elements of an encoded video slice from video encoder 100. The video decoder 200 may receive video data from the network entity 42 and may optionally also store the video data in a video data memory (not shown). The video data memory may store video data, such as an encoded video bitstream, to be decoded by components of the video decoder 200. Video data stored in the video data memory may be obtained, for example, from the storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium. The video data memory may act as a coded picture buffer (CPB) that stores encoded video data from the encoded video bitstream.
The network entity 42 may be, for example, a server, a MANE, a video editor/splicer, or other such device for implementing one or more of the techniques described above. The network entity 42 may or may not include a video encoder, such as video encoder 100. The network entity 42 may implement portions of the techniques described herein before the network entity 42 sends the encoded video bitstream to the video decoder 200. In some video decoding systems, network entity 42 and video decoder 200 may be part of separate devices, while in other cases, the functionality described with respect to network entity 42 may be performed by the same device that includes video decoder 200.
The entropy decoder 203 of the video decoder 200 entropy decodes the bitstream to produce quantized coefficients and some syntax elements. The entropy decoder 203 forwards the syntax elements to the prediction processing unit 208. The video decoder 200 may receive syntax elements at the video slice level and/or the picture block level. When a video slice is decoded as an intra-decoded (I) slice, the intra predictor 209 of the prediction processing unit 208 generates a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from a previously decoded block of the current frame or image. When a video slice is decoded as an inter-decoded (i.e., B or P) slice, the inter predictor 210 of the prediction processing unit 208 may determine an inter-prediction mode for decoding a current image block of the current video slice based on the syntax element received from the entropy decoder 203, and decode (e.g., perform inter-prediction) the current image block based on the determined inter-prediction mode.
The inverse quantizer 204 inverse quantizes, i.e., de-quantizes, the quantized transform coefficients provided in the bitstream and decoded by the entropy decoder 203. The inverse quantization process may include: using the quantization parameter calculated by the video encoder 100 for each image block in the video slice to determine the degree of quantization that was applied and, likewise, the degree of inverse quantization that should be applied. The inverse transformer 205 applies an inverse transform to the transform coefficients, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, in order to produce residual blocks in the pixel domain.
After the inter predictor 210 generates a prediction block for the current image block or a sub-block of the current image block, the video decoder 200 obtains a reconstructed block, i.e., a decoded image block, by summing the residual block from the inverse transformer 205 with the corresponding prediction block generated by the inter predictor 210. Summer 211 represents the component performing this summation operation. Loop filters (in or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality, if desired. The filter unit 206 may represent one or more loop filters, such as a deblocking filter, an Adaptive Loop Filter (ALF), and a Sample Adaptive Offset (SAO) filter. Although the filter unit 206 is shown in fig. 8 as an in-loop filter, in other implementations, the filter unit 206 may be implemented as a post-loop filter.
The image decoding method performed by the video decoder 200 includes parsing the input bitstream and applying inverse transform and inverse quantization to obtain the prediction mode index of the current coding block. If the prediction mode index of the chroma component of the current coding block indicates the enhanced two-step cross-component prediction mode, reconstructed samples are selected only from the upper or left neighboring pixels of the current coding block according to the index value to compute a linear model; an original prediction block of the chroma component is obtained from the linear model and downsampled, and the downsampled prediction block is refined based on the correlation of boundary-adjacent pixels in the orthogonal direction to obtain the final chroma prediction block. One path of the subsequent output serves as reference information for later video decoding, while the other undergoes post-filtering to produce the output video signal.
The IPF technique is now implemented at the video decoder 200 as follows.
The decoder obtains and parses the bitstream to recover the digital video sequence information, parses the IPF allowed flag of the current video sequence, determines that the coding mode of the current decoding unit is an intra prediction coding mode, and parses the IPF use flag of the current decoding unit.
For the current decoding unit:
(1) if the IPF allowed flag is '1', perform all of the following steps;
(2) if the IPF allowed flag is '0', perform only steps a2), b2) and e2):
a2 Obtaining code stream information, analyzing residual information of a current decoding unit, and obtaining time domain residual information through inverse transformation and inverse quantization processes;
b2 Analyzing the code stream and obtaining a prediction mode index of the current decoding unit, and calculating to obtain a prediction block of the current decoding unit according to the adjacent reconstruction block and the prediction mode index;
c2 Analyzing and acquiring the use identification bit of the IPF, and if the use identification bit of the IPF is '0', not performing additional operation on the current prediction block; if the use flag of the IPF is '1', executing d 2);
d2 Selecting a corresponding filter according to the prediction mode classification information of the current decoding unit, selecting a corresponding filter coefficient group according to the size of the current decoding unit, and then filtering each pixel in the prediction block to obtain a final prediction block;
e2 The residual information after the superposition and the restoration of the prediction block is obtained to obtain a reconstruction block of the current decoding unit, and the reconstruction block is output through post-processing;
it should be appreciated that other structural variations of video decoder 200 may be used to decode the encoded video bitstream. For example, the video decoder 200 may generate an output video stream without processing by the filter unit 206; alternatively, for some image blocks or image frames, the quantized coefficients are not decoded by the entropy decoder 203 of the video decoder 200, and accordingly do not need to be processed by the inverse quantizer 204 and the inverse transformer 205.
In intra prediction, the existing IPF technique effectively improves the coding efficiency of intra prediction, greatly strengthens its spatial correlation, and addresses the problem that intra prediction uses only a single row or column of reference pixels while ignoring the influence of other pixels on the predicted value. However, when the intra prediction result itself needs smoothing, neither the IPF technique nor the current intra prediction modes handle this well: pixel-by-pixel filtering can improve the correlation between the prediction block and the reference pixels, but cannot smooth the interior of the prediction block.
A prediction block computed from a single prediction mode usually performs well on images with sharp textures, yielding smaller and fewer residuals and higher coding efficiency. In image blocks with blurred textures, however, an overly sharp prediction can increase and spread the residual, degrading the prediction and reducing coding efficiency.
To address the above problems, for image blocks that require smoothing, the embodiments of the present application propose a smoothing-based IPF technique that directly filters the prediction block obtained from the intra prediction mode.
Fig. 9 is a schematic flow chart of an image encoding method according to an embodiment of the present application, and the image encoding method may be applied to the source device 10 in the video decoding system 1 shown in fig. 6 or the video encoder 100 shown in fig. 7. The flow shown in fig. 9 is described taking the video encoder 100 shown in fig. 7 as an example of an execution subject. As shown in fig. 9, the image encoding method provided in the embodiment of the present application includes:
step 110, determining intra-prediction filtering indication information of a current coding block, wherein the intra-prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-prediction filtering mode is allowed to be used, the second indication information is used for indicating whether a second intra-prediction filtering mode is allowed to be used, and the first intra-prediction filtering mode is an intra-prediction filtering IPF mode;
step 120, if it is determined that the current coding block needs to use the first intra-frame prediction filtering mode according to the intra-frame prediction filtering indication information, a first use flag of the first intra-frame prediction filtering mode of the current coding block is set to a first value, where the first value is used to indicate that the first intra-frame prediction filtering mode is used;
step 130, writing the intra-frame prediction filtering indication information, the first intra-frame prediction filtering mode and the first use identification bit into the code stream;
step 140, predicting the current coding block according to the first intra-frame prediction filtering mode to obtain a prediction block, encoding the prediction block, and writing the encoding result into the code stream.
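The signaling in steps 110 through 130 can be sketched as below. The bit layout and function name are illustrative assumptions; the actual syntax order and entropy coding are defined by the standard.

```python
def write_block_syntax(bits, first_indication, second_indication, uses_first_mode):
    """Sketch of steps 110-130: write the indication information, then the
    use flag of the first intra prediction filtering mode when applicable.

    bits: output list standing in for the bitstream writer (hypothetical)."""
    bits.append(1 if first_indication else 0)    # first mode (IPF) allowed?
    bits.append(1 if second_indication else 0)   # second mode allowed?
    if first_indication:
        # step 120: the first use flag takes the first value ('1') when the
        # block uses the first intra prediction filtering mode
        bits.append(1 if uses_first_mode else 0)
    return bits
```

As in the earlier ipf_flag semantics, a use flag is only written when the corresponding indication information allows the mode, so the decoder can infer a default of 0 otherwise.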
In some embodiments, the method further comprises: superimposing the prediction block of the current coding block and the residual block obtained after inverse transform and inverse quantization to obtain a reconstructed block, which serves as a prediction reference block for the next coding block.
The intra-frame prediction part at the encoding end in technical scheme 1 is implemented as follows:
The encoder obtains coding information, including an intra prediction filtering (IPF) enable flag, an intra prediction smoothing (IPS) enable flag, and so on. After the image information is obtained, the image is divided into several CTUs, which are further divided into several CUs, and intra prediction is performed on each CU independently.
in the course of the intra-frame prediction,
(1) if the IPF enable flag and the IPS enable flag are both '1', perform all of the following steps;
(2) if the IPF enable flag is '1' and the IPS enable flag is '0', perform only a3), b3), c3), d3), e3), i3) and j3);
(3) if the IPF enable flag is '0', the IPS enable flag is '1', and the area of the current CU is at least 64 and at most 4096, perform only a3), b3), f3), g3), h3), i3) and j3);
(4) if the IPF enable flag and the IPS enable flag are both '0', perform only a3), b3), i3) and j3):
a3 Traversing all intra-frame prediction modes by the current coding unit, calculating to obtain a prediction block under each prediction mode, and calculating to obtain rate distortion cost information of the current prediction mode according to the original block;
b3 According to the minimum rate distortion cost principle of all the prediction modes, selecting the optimal prediction mode of the current coding unit, and recording the optimal prediction mode information and the rate distortion cost information corresponding to the optimal prediction mode information;
c3 Performing a second traversal of all intra-frame prediction modes, starting an IPF technique in the process, and firstly calculating a prediction pixel under each intra-frame prediction mode to obtain a prediction block of a current coding unit;
d3 IPF filtering is carried out on the prediction block of the current coding unit, a filter corresponding to the prediction block is selected according to the current prediction mode, a corresponding filter coefficient group is selected according to the size of the current coding unit, and a specific corresponding relation can be checked in table 1;
e3 Calculating to obtain the rate distortion cost information of each prediction mode according to the final prediction pixel and the original pixel obtained by the IPF technology, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f3 Third traversal is carried out on all intra-frame prediction modes, the IPS technology is started in the process, and firstly, prediction pixels in each intra-frame prediction mode are calculated to obtain a prediction block of a current coding unit;
g3 IPS is carried out on the prediction block of the current coding unit twice to obtain a final prediction block;
h3 Calculating the rate distortion cost information of each prediction mode according to the final prediction pixel and the original pixel obtained by the IPS technology, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
i3 If IPF permission identification bit is '0' and IPS permission identification bit is '0', writing the prediction mode index recorded in b 3) into the code stream to the decoding end;
if the IPF enable flag is '1' and the IPS enable flag is '0', the minimum cost value recorded in b 3) is compared with the minimum cost value recorded in e 3),
if the rate distortion cost in b 3) is smaller, writing the prediction mode index code recorded in b 3) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, and writing the IPF current coding unit into a code stream to the decoding end by using a mark position '0', which means that an IPF technology is not used;
If the rate distortion in e 3) is smaller, writing the prediction mode index code recorded in e 3) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, and writing the IPF current coding unit into the code stream to the decoding end by using a mark position '1', which means that an IPF technology is used;
if the IPF enable flag is '0' and the IPS enable flag is '1', the minimum cost value recorded in b 3) is compared with the minimum cost value recorded in h),
if the rate distortion cost in b 3) is smaller, writing the prediction mode index code recorded in b 3) as the optimal prediction mode of the current coding unit into the code stream to the decoding end, and writing the IPS use flag position '0' (namely the second value) of the current coding unit into the code stream to the decoding end without using the technology;
if the rate distortion in h 3) is smaller, writing the prediction mode index code recorded in h 3) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, and writing the IPS use mark position '1' of the current coding unit into the code stream to the decoding end by using the technology;
if the IPF enable flag is '1' and the IPS enable flag is '1', the minimum cost values recorded in b 3), e 3) and h 3) are compared,
If the rate distortion cost in b 3) is smaller, writing the prediction mode index code recorded in b 3) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, and writing the IPS use identification bit and IPF use mark position '0' of the current coding unit into the code stream to the decoding end without using the IPS use identification bit and IPF use mark position '0';
if the rate distortion in e 3) is smaller, writing the prediction mode index code recorded in e 3) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, using an IPF (IPF) of the current coding unit to identify a position '1' and not transmitting an IPS identification bit, indicating that an IPF technology is used instead of an IPS technology, and writing the code stream into the decoding end;
if the rate distortion in h 3) is smaller, writing the prediction mode index code recorded in h 3) as the optimal prediction mode of the current coding unit into the code stream to the decoding end, and writing the IPF using the identification position '0' and the IPS using the identification position '1' of the current coding unit into the code stream without using the IPF technology.
j3 And (3) superposing the predicted block and the residual error after inverse transformation and inverse quantization to obtain a reconstructed coding unit block which is used as a predicted reference block of the next coding unit.
The intra-frame prediction part at the encoding end in technical scheme 2 is implemented as follows:
The encoder obtains coding information, including an intra prediction filtering (IPF) enable flag, an intra prediction smoothing (IPS) enable flag, and so on. After the image information is obtained, the image is divided into several CTUs, which are further divided into several CUs, and intra prediction is performed on each CU independently.
in the course of the intra-frame prediction,
(1) if the IPF enable flag and the IPS enable flag are both '1', perform all of the following steps;
(2) if the IPF enable flag is '1' and the IPS enable flag is '0', perform only a4), b4), c4), d4), e4), i4) and j4);
(3) if the IPF enable flag is '0', the IPS enable flag is '1', and the area of the current CU is at least 64 and at most 4096, perform only a4), b4), f4), g4), h4), i4) and j4);
(4) if the IPF enable flag and the IPS enable flag are both '0', perform only a4), b4), i4) and j4):
a4 Traversing all intra-frame prediction modes by the current coding unit, calculating to obtain a prediction block under each prediction mode, and calculating to obtain rate distortion cost information of the current prediction mode according to the original block;
b4 According to the minimum rate distortion cost principle of all the prediction modes, selecting the optimal prediction mode of the current coding unit, and recording the optimal prediction mode information and the rate distortion cost information corresponding to the optimal prediction mode information;
c4 Performing a second traversal of all intra-frame prediction modes, starting an IPF technique in the process, and firstly calculating a prediction pixel under each intra-frame prediction mode to obtain a prediction block of a current coding unit;
d4 IPF filtering is carried out on the prediction block of the current coding unit, a filter corresponding to the prediction block is selected according to the current prediction mode, a corresponding filter coefficient group is selected according to the size of the current coding unit, and a specific corresponding relation can be checked in table 1;
e4 Calculating to obtain the rate distortion cost information of each prediction mode according to the final prediction pixel and the original pixel obtained by the IPF technology, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f4 Third traversal is carried out on all intra-frame prediction modes, the IPS technology is started in the process, and firstly, prediction pixels in each intra-frame prediction mode are calculated to obtain a prediction block of a current coding unit;
g4 IPS is carried out on the prediction block of the current coding unit once to obtain a final prediction block;
h4 Calculating the rate distortion cost information of each prediction mode according to the final prediction pixel and the original pixel obtained by the IPS technology, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
i4 If IPF permission identification bit is '0' and IPS permission identification bit is '0', writing the prediction mode index recorded in b 4) into the code stream to the decoding end;
if the IPF enable flag is '1' and the IPS enable flag is '0', the minimum cost value recorded in b 4) is compared with the minimum cost value recorded in e),
if the rate distortion cost in b 4) is smaller, writing the prediction mode index code recorded in b 4) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, and writing the IPF current coding unit into a code stream to the decoding end by using a mark position '0', which means that an IPF technology is not used;
if the rate distortion in e 4) is smaller, writing the prediction mode index code recorded in e 4) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, and writing the IPF current coding unit into the code stream to the decoding end by using a mark position '1', which means that an IPF technology is used;
if the IPF enable flag is '0' and the IPS enable flag is '1', the minimum cost value recorded in b 4) is compared with the minimum cost value recorded in h),
If the rate distortion cost in b 4) is smaller, writing the prediction mode index code recorded in b 4) as the optimal prediction mode of the current coding unit into the code stream to the decoding end, and writing the IPS use flag position '0' (second value) of the current coding unit into the code stream to the decoding end without using the technology;
if the rate distortion in h 4) is smaller, writing the prediction mode index code recorded in h 4) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, and writing the IPS use mark position '1' of the current coding unit into the code stream to the decoding end by using the technology;
if the IPF enable flag is '1' and the IPS enable flag is '1', the minimum cost values recorded in b 4), e 4) and h 4) are compared,
if the rate distortion cost in b 4) is smaller, writing the prediction mode index code recorded in b 4) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, and writing the IPS use identification bit and IPF use mark position '0' of the current coding unit into the code stream to the decoding end without using the IPS use identification bit and IPF use mark position '0';
if the rate distortion in e 4) is smaller, writing the prediction mode index code recorded in e 4) as the optimal prediction mode of the current coding unit into a code stream to a decoding end, using an IPF (IPF) of the current coding unit to identify a position '1' and not transmitting an IPS identification bit, indicating that an IPF technology is used instead of an IPS technology, and writing the code stream into the decoding end;
If the rate distortion in h 4) is smaller, writing the prediction mode index code recorded in h 4) as the optimal prediction mode of the current coding unit into the code stream to the decoding end, and writing the IPF using the identification position '0' and the IPS using the identification position '1' of the current coding unit into the code stream without using the IPF technology.
j4 And (3) superposing the predicted block and the residual error after inverse transformation and inverse quantization to obtain a reconstructed coding unit block which is used as a predicted reference block of the next coding unit.
Fig. 10 is a schematic flowchart of an image decoding method according to an embodiment of the present application, which corresponds to the image encoding method shown in fig. 9; the image decoding method may be applied to the destination device 20 in the video decoding system 1 shown in fig. 6 or the video decoder 200 shown in fig. 8. The flow shown in fig. 10 is described taking the video decoder 200 shown in fig. 8 as the execution subject. As shown in fig. 10, the image decoding method provided in the embodiment of the present application includes:
Step 210, parsing a code stream and determining intra-frame prediction filtering indication information and a first use flag of a current decoding block, wherein the intra-frame prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-frame prediction filtering mode is allowed to be used, the second indication information is used for indicating whether a second intra-frame prediction filtering mode is allowed to be used, the first intra-frame prediction filtering mode is an intra-frame prediction filtering (IPF) mode, and the first use flag is the use flag of the first intra-frame prediction filtering mode;
Step 220, determining, according to the intra-frame prediction filtering indication information and the first use flag, to predict the current decoding block using the first intra-frame prediction filtering mode, so as to obtain a prediction block of the current decoding block.
The specific process of intra-frame prediction at the decoding end in the technical scheme 1 is as follows:
The decoder obtains the code stream, parses it to obtain the IPF enable flag and the IPS enable flag of the current video sequence, and performs inverse transform and inverse quantization on the residual information obtained by parsing the code stream.
In the process of intra-prediction decoding,
(1) if the IPF enable flag and the IPS enable flag are both '1', perform all of the following steps;
(2) if the IPF enable flag is '1' and the IPS enable flag is '0', perform only steps a5), b5), c5), d5) and g5);
(3) if the IPF enable flag is '0', the IPS enable flag is '1', and the area of the current CU is at least 64 and at most 4096, perform only steps a5), b5), e5), f5) and g5);
(4) if the IPF enable flag and the IPS enable flag are both '0', perform only steps a5), b5) and g5);
a5 Obtaining code stream and decoding to obtain residual information, and obtaining time domain residual information through processes such as inverse transformation and inverse quantization;
b5 Analyzing the code stream to obtain a prediction mode of the current decoding unit, and calculating to obtain a prediction block according to the prediction mode of the current decoding unit and the adjacent reconstruction block;
c5 Parsing and retrieving the usage identification bits of the IPF,
if the use flag bit of the IPF is '0', performing no additional operation on the current predicted block, and skipping the step d 5);
if the usage flag of the IPF is '1', executing d 5);
d5 Selecting a corresponding filter according to the prediction mode classification information of the current decoding unit, selecting a corresponding filter coefficient group according to the size of the current decoding unit, and then filtering each pixel in the prediction block to obtain the prediction block;
e5 A use identification bit of the IPF is obtained,
if the use identification bit of the IPF is '1', skipping the rest process of the step and skipping the step f 5);
if the use identification bit of the IPF is '0', analyzing and acquiring the use identification bit of the IPS.
If the use identification bit of the IPS is '0', no additional operation is performed on the current prediction block;
if the use flag of the IPS is '1', executing f 5);
f5 Using an IPS device to carry out twice filtering on the input prediction block to obtain a filtered current decoding unit prediction block;
g5 The residual information after the superposition and the restoration of the prediction block is obtained to obtain a reconstruction block of the current decoding unit, and the reconstruction block is output through post-processing;
The specific process of intra-frame prediction at the decoding end in the technical scheme 2 is as follows:
The decoder obtains the code stream, parses it to obtain the IPF enable flag and the IPS enable flag of the current video sequence, and performs inverse transform and inverse quantization on the residual information obtained by parsing the code stream.
In the process of intra-prediction decoding,
(1) if the IPF enable flag and the IPS enable flag are both '1', perform all of the following steps;
(2) if the IPF enable flag is '1' and the IPS enable flag is '0', perform only steps a6), b6), c6), d6) and g6);
(3) if the IPF enable flag is '0', the IPS enable flag is '1', and the area of the current CU is at least 64 and at most 4096, perform only steps a6), b6), e6), f6) and g6);
(4) if the IPF enable flag and the IPS enable flag are both '0', perform only steps a6), b6) and g6);
a6 Obtaining code stream and decoding to obtain residual information, and obtaining time domain residual information through processes such as inverse transformation and inverse quantization;
b6 Analyzing the code stream to obtain a prediction mode of the current decoding unit, and calculating to obtain a prediction block according to the prediction mode of the current decoding unit and the adjacent reconstruction block;
c6 Parsing and retrieving the usage identification bits of the IPF,
If the use flag bit of the IPF is '0', performing no additional operation on the current predicted block, and skipping the step d 6);
if the use flag of the IPF is '1', executing d 6);
d6 Selecting a corresponding filter according to the prediction mode classification information of the current decoding unit, selecting a corresponding filter coefficient group according to the size of the current decoding unit, and then filtering each pixel in the prediction block to obtain the prediction block;
e6 A use identification bit of the IPF is obtained,
if the use identification bit of the IPF is '1', skipping the rest process of the step and skipping the step f 6);
if the use identification bit of the IPF is '0', analyzing and acquiring the use identification bit of the IPS.
If the use identification bit of the IPS is '0', no additional operation is performed on the current prediction block;
if the use flag of the IPS is '1', executing f 6);
f6 Using an IPS device to carry out primary filtering on the input prediction block to obtain a filtered current decoding unit prediction block;
g6 The residual information after the superposition and the restoration of the prediction block is obtained to obtain a reconstruction block of the current decoding unit, and the reconstruction block is output through post-processing;
The above technical schemes apply to the intra prediction part of the encoding and decoding framework. When the IPS technique is used to filter the current coding unit or decoding unit, the current block needs to be padded first, as follows:
a7 If the left and upper reference pixels outside the current prediction block are available, i.e., the left and upper reconstructed pixels are available, the left column and upper column are filled with reconstructed pixels;
b7 If the reference pixel on the left or upper side outside the current prediction block is not available, i.e., the left or upper side has no reconstructed pixel, then the side without reconstructed pixel is filled with the row or column on the side closest to the current prediction block;
c7 Filling a right-most column prediction value of the current prediction block using a right-side neighboring column outside the current prediction block;
d7 Filling the lowest predicted value of the current predicted block by using the adjacent lower side outside the current predicted block;
filling the upper right corner pixel point outside the current prediction block by using the filled rightmost pixel point outside the upper side of the current prediction block, filling the lower right corner pixel point outside the current prediction block by using the filled rightmost pixel point outside the lower side of the current prediction block, and filling the lower left corner pixel point outside the current prediction block by using the filled bottommost pixel point outside the left side of the current prediction block.
Fig. 11A shows a padding diagram of a prediction block, where pred.pixel denotes a pixel of the prediction block and recon.pixel denotes a padded pixel.
Fig. 11B shows another padding diagram of a prediction block, where pred.pixel denotes a pixel of the prediction block and recon.pixel denotes a padded pixel.
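The padding steps a7)-d7) can be sketched as follows for the one-pixel-ring case (as needed by a 3x3 filter). The function and argument names are illustrative, and the upper-left corner fill is an assumption of this sketch (the text specifies only the other three corners).

```python
import numpy as np

def pad_prediction_block(pred, top=None, left=None):
    """Pad an HxW prediction block to (H+2)x(W+2).

    top / left hold reconstructed neighbour pixels when available (step a7);
    when absent, the nearest prediction row/column is replicated (step b7).
    """
    h, w = pred.shape
    out = np.zeros((h + 2, w + 2), dtype=pred.dtype)
    out[1:h + 1, 1:w + 1] = pred
    out[0, 1:w + 1] = top if top is not None else pred[0, :]    # a7)/b7)
    out[1:h + 1, 0] = left if left is not None else pred[:, 0]  # a7)/b7)
    out[1:h + 1, w + 1] = pred[:, -1]  # c7) right ring from rightmost column
    out[h + 1, 1:w + 1] = pred[-1, :]  # d7) bottom ring from bottom row
    out[0, w + 1] = out[0, w]          # upper-right corner from padded top row
    out[h + 1, w + 1] = out[h + 1, w]  # lower-right corner from padded bottom row
    out[h + 1, 0] = out[h, 0]          # lower-left corner from padded left column
    out[0, 0] = out[0, 1]              # upper-left corner (assumption: unspecified)
    return out
```

The right and bottom rings replicate the block's own predicted values because those neighbours have not been reconstructed yet at prediction time.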
The IPS technique filters the prediction block with a simplified Gaussian convolution kernel; the filter has 9 taps and 3 different filter coefficients, as follows:
    0.075  0.124  0.075
    0.124  0.204  0.124
    0.075  0.124  0.075
where the entries are the filter coefficients of the approximate Gaussian kernel.
Each prediction pixel in the prediction block is filtered, and the filtering formula is as follows:
P′(x,y) = c1·P(x-1,y-1) + c2·P(x,y-1) + c1·P(x+1,y-1) + c2·P(x-1,y) + c3·P(x,y) + c2·P(x+1,y) + c1·P(x-1,y+1) + c2·P(x,y+1) + c1·P(x+1,y+1)
In the above equation, P′(x, y) is the final predicted value at position (x, y) of the current coding unit, and c1, c2 and c3 are the filter coefficients; in the approximate Gaussian convolution kernel, c1 is 0.075, c2 is 0.124 and c3 is 0.204. P(x, y), P(x-1, y-1) and the like are the predicted values at positions (x, y) and (x-1, y-1), where x and y do not exceed the width and height of the current coding unit block.
The convolution kernel coefficients used by the IPS technique can be approximated by integers whose sum is a power of 2, which avoids floating-point computation and division on a computer and greatly reduces computational complexity, as shown below:
    5   8   5
    8  12   8
    5   8   5
the sum of the filter coefficients is 64, i.e. the calculated predicted value needs to be shifted 6 bits to the right.
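The integer 9-tap filtering above can be sketched as follows, using the integer coefficients 5, 8 and 12 (sum 64) and a right shift by 6 bits in place of division. The function name and the plain truncating shift (no rounding offset) are assumptions of this sketch.

```python
import numpy as np

def ips_filter_3x3(padded):
    """Filter a padded block (one-pixel ring) and return the HxW result."""
    k = np.array([[5,  8,  5],
                  [8, 12,  8],
                  [5,  8,  5]], dtype=np.int64)  # coefficients sum to 64
    h, w = padded.shape[0] - 2, padded.shape[1] - 2
    out = np.empty((h, w), dtype=np.int64)
    for y in range(h):
        for x in range(w):
            # weighted sum over the 3x3 neighbourhood, then divide by 64
            out[y, x] = int((k * padded[y:y + 3, x:x + 3]).sum()) >> 6
    return out
```

Because the coefficients sum to exactly 64, a constant block passes through unchanged, which is a quick sanity check for the integer approximation.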
In the above technical scheme 2, the IPS technique filters the prediction block with a simplified Gaussian convolution kernel; the filter has 25 taps and 6 different filter coefficients, as follows:
    0.0030  0.0133  0.0219  0.0133  0.0030
    0.0133  0.0596  0.0983  0.0596  0.0133
    0.0219  0.0983  0.1621  0.0983  0.0219
    0.0133  0.0596  0.0983  0.0596  0.0133
    0.0030  0.0133  0.0219  0.0133  0.0030
each prediction pixel in the prediction block is filtered, and the filtering formula is as follows:
P′(x,y) = c1·P(x-2,y-2) + c2·P(x-1,y-2) + c3·P(x,y-2) + c2·P(x+1,y-2) + c1·P(x+2,y-2) + c2·P(x-2,y-1) + c4·P(x-1,y-1) + c5·P(x,y-1) + c4·P(x+1,y-1) + c2·P(x+2,y-1) + c3·P(x-2,y) + c5·P(x-1,y) + c6·P(x,y) + c5·P(x+1,y) + c3·P(x+2,y) + c2·P(x-2,y+1) + c4·P(x-1,y+1) + c5·P(x,y+1) + c4·P(x+1,y+1) + c2·P(x+2,y+1) + c1·P(x-2,y+2) + c2·P(x-1,y+2) + c3·P(x,y+2) + c2·P(x+1,y+2) + c1·P(x+2,y+2)
In the above equation, P′(x, y) is the final predicted value at position (x, y) of the current coding unit, and c1, c2, c3, c4, c5 and c6 are the filter coefficients; in the approximate Gaussian convolution kernel, c1 is 0.0030, c2 is 0.0133, c3 is 0.0219, c4 is 0.0596, c5 is 0.0983 and c6 is 0.1621. P(x, y), P(x-1, y-1) and the like are the predicted values at positions (x, y) and (x-1, y-1), where x and y do not exceed the width and height of the current coding unit block.
The convolution kernel coefficients used by the IPS technique can be approximated by integers whose sum is a power of 2, which avoids floating-point computation and division on a computer and greatly reduces computational complexity, as shown below:
Figure PCTCN2021109173-APPB-000007
the sum of the filter coefficients is 1024, i.e. the calculated predicted value needs to be shifted to the right by 10 bits.
In the above technical scheme 2, the IPS technique may also filter the prediction block with a simplified Gaussian convolution kernel; this filter has 13 taps and 4 different filter coefficients, as follows:
     0    0   13    0    0
     0   18   25   18    0
    13   25   32   25   13
     0   18   25   18    0
     0    0   13    0    0
Each prediction pixel in the prediction block is filtered, and the filtering formula is as follows:
P′(x,y) = c1·P(x,y-2) + c2·P(x-1,y-1) + c3·P(x,y-1) + c2·P(x+1,y-1) + c1·P(x-2,y) + c3·P(x-1,y) + c4·P(x,y) + c3·P(x+1,y) + c1·P(x+2,y) + c2·P(x-1,y+1) + c3·P(x,y+1) + c2·P(x+1,y+1) + c1·P(x,y+2)
In the above equation, P′(x, y) is the final predicted value at position (x, y) of the current coding unit, and c1, c2, c3 and c4 are the filter coefficients; in the approximate Gaussian convolution kernel, c1 is 13, c2 is 18, c3 is 25 and c4 is 32. P(x, y), P(x-1, y-1) and the like are the predicted values at positions (x, y) and (x-1, y-1), where x and y do not exceed the width and height of the current coding unit block.
The sum of the filter coefficients is 256, i.e. the calculated predicted value needs to be shifted 8 bits to the right.
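The 13-tap diamond filtering above can be sketched as follows, placing c1 = 13, c2 = 18, c3 = 25 and c4 = 32 (sum 256) in the diamond positions implied by the filtering formula, with a right shift by 8 bits in place of division by 256. The function name and the truncating shift are assumptions of this sketch.

```python
import numpy as np

def ips_filter_13tap(padded):
    """Filter a padded block (two-pixel ring) and return the HxW result."""
    k = np.array([[ 0,  0, 13,  0,  0],
                  [ 0, 18, 25, 18,  0],
                  [13, 25, 32, 25, 13],
                  [ 0, 18, 25, 18,  0],
                  [ 0,  0, 13,  0,  0]], dtype=np.int64)
    assert k.sum() == 256  # 4*13 + 4*18 + 4*25 + 32
    h, w = padded.shape[0] - 4, padded.shape[1] - 4
    out = np.empty((h, w), dtype=np.int64)
    for y in range(h):
        for x in range(w):
            # weighted sum over the 5x5 window, then divide by 256
            out[y, x] = int((k * padded[y:y + 5, x:x + 5]).sum()) >> 8
    return out
```

The zero entries make this a diamond-shaped 13-tap kernel within a 5x5 window, so it needs a two-pixel padding ring rather than the one-pixel ring of the 3x3 case.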
This technical scheme applies to the intra prediction part of encoding and decoding. It offers intra prediction the option of smoothing or local blurring for the parts of the image whose texture does not need sharpening, making the predicted pixels smoother and the prediction block closer to the original image, which ultimately improves coding efficiency.
Technical scheme 1 was tested on HPM7.0, the official AVS simulation platform, with smoothing filtering applied to intra prediction blocks; the test results under all-intra and random-access conditions are shown in Tables 2 and 3.
TABLE 2 All Intra test results

Class      Y         U         V
4K         -0.61%    -0.67%    -0.84%
1080P      -0.45%    -0.78%    -0.48%
720P       -0.22%    -0.08%    -0.66%
Average    -0.42%    -0.51%    -0.66%
TABLE 3 Random Access test results

Class      Y         U         V
4K         -0.25%    -0.37%    -0.57%
1080P      -0.22%    -0.41%    -0.64%
720P       -0.22%    -0.01%    -0.73%
Average    -0.23%    -0.26%    -0.65%
As can be seen from Tables 2 and 3, the scheme achieves a good performance improvement under both test conditions.
Under the AI (all-intra) test condition, the luma component achieves a 0.42% BDBR saving, and the U and V components achieve 0.51% and 0.66% BDBR savings respectively; the performance gain is evident and effectively improves the coding efficiency of the encoder.
In terms of resolution, the scheme brings a larger coding performance gain on 4K video, which benefits the future development of ultra-high-definition video by saving more bitrate, and thus more bandwidth, for ultra-high-resolution video.
The scheme proposes, in the intra prediction process, smoothing filtering of the prediction block computed by the intra prediction mode, which improves intra prediction accuracy and effectively improves coding efficiency, specifically:
1. smoothing filtering is provided for intra-coded prediction blocks;
2. when the enable flags of both the IPS and IPF techniques are '1', IPF and IPS cannot both be used for the same coding unit;
3. when the encoder decides to use the IPF technique for the current coding unit, the IPS use flag is not transmitted and the decoder does not need to parse it;
when the encoder decides not to use the IPF technique for the current coding unit, the IPS use flag needs to be transmitted and the decoder needs to parse it;
4. the IPS convolution kernel is a simplified 9-tap Gaussian convolution kernel;
5. the floating-point convolution kernel is approximated with integer values: the filter coefficients are rounded to avoid floating-point computation, their sum is a power of 2, and a shift operation replaces the division, saving computational resources and reducing complexity;
6. a 9-tap integer Gaussian kernel is provided with a first filter coefficient of 5, a second of 8 and a third of 12, and the filtered predicted value is right-shifted by 6 bits;
7. 25-tap and 13-tap filters are also proposed.
The present application can be further extended in the following directions.
Extension scheme 1: replace the 9-tap Gaussian convolution kernel in the above technical scheme with a filter convolution kernel having more taps, achieving a stronger smoothing effect.
Extension scheme 2: use independent flag bits for the luminance component and the chrominance component in the above technical scheme to indicate whether IPS is used.
Extension scheme 3: restrict the application range of the above technical scheme so that the IPS technique is not used for units with a small prediction block area, reducing the transmitted flag bits and the computational complexity.
Extension scheme 4: restrict the application range of the above technical scheme by screening the prediction mode of the current coding unit; if the prediction mode is the mean (DC) mode, the IPS technique is not used, reducing the transmitted flag bits and the computational complexity.
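Extension schemes 3 and 4 amount to a gating check performed before the IPS use flag is coded. The sketch below is hypothetical: the mode identifier and area threshold are illustrative assumptions (claim 10 of this application uses an area range of [64, 4096) for IPS, and the mean mode corresponds to DC prediction).

```python
# Hypothetical gating combining extension schemes 3 and 4: skip the IPS
# flag (and the filtering) for small blocks and for the mean (DC) mode.
# DC_MODE and MIN_IPS_AREA are illustrative values, not fixed by the text.

DC_MODE = 0          # assumed identifier for the mean (DC) prediction mode
MIN_IPS_AREA = 64    # assumed minimum block area (cf. the range in claim 10)


def ips_allowed(width, height, pred_mode):
    """Return True if the IPS use flag is signalled for this coding unit."""
    if width * height < MIN_IPS_AREA:   # extension scheme 3: small blocks
        return False
    if pred_mode == DC_MODE:            # extension scheme 4: mean mode
        return False
    return True
```

When `ips_allowed` returns False, neither encoder nor decoder codes the IPS flag for the unit, which is exactly how the transmitted flag bits are saved.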
The embodiment of the application provides an image encoding device, which may be a video encoder. Specifically, the image encoding device is configured to perform the steps performed by the video encoder in the above encoding method. The image encoding device provided by the embodiment of the application may comprise modules corresponding to the corresponding steps.
The embodiment of the present application may divide the image encoding apparatus into functional modules according to the above method examples; for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware or as a software functional module. The division of the modules in the embodiment of the present application is illustrative and is merely a logical function division; other division manners may be used in practice.
Fig. 12 shows a possible schematic structural diagram of the image encoding apparatus involved in the above embodiment in the case where each functional module is divided corresponding to each function. As shown in fig. 12, the image encoding apparatus 12 includes a determining unit 120, a setting unit 121, a transmission unit 122, and a superposition unit 123.
A determining unit 120, configured to determine intra prediction filtering indication information of a current coding block, where the intra prediction filtering indication information includes first indication information for indicating whether a first intra prediction filtering mode is allowed to be used and second indication information for indicating whether a second intra prediction filtering mode is allowed to be used, and the first intra prediction filtering mode is an intra prediction filtering (IPF) mode;
a setting unit 121, configured to set a first use identification bit of the first intra prediction filtering mode of the current coding block to a first value if it is determined according to the intra prediction filtering indication information that the current coding block needs to use the first intra prediction filtering mode;
a transmission unit 122, configured to write the intra prediction filtering indication information, the first intra prediction filtering mode, and the first use identification bit into a code stream;
a superposition unit 123, configured to determine, according to the intra prediction filtering indication information and the first use identification bit, to predict the current coding block using the first intra prediction filtering mode, so as to obtain a prediction block of the current coding block.
For all relevant contents of the steps related to the above method embodiment, reference may be made to the functional description of the corresponding functional module, which is not repeated herein. Of course, the image encoding device provided in the embodiment of the present application includes, but is not limited to, the above modules; for example, the image encoding apparatus may further include a storage unit. The storage unit may be used to store program codes and data of the image encoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of the image encoding device provided in the embodiment of the present application is shown in fig. 13. In fig. 13, the image encoding apparatus 13 includes: a processing module 130 and a communication module 131. The processing module 130 is configured to control and manage actions of the image encoding apparatus, for example, to perform the steps performed by the determining unit 120, the setting unit 121, the transmission unit 122 and the superposition unit 123, and/or other processes of the techniques described herein. The communication module 131 is used to support interaction between the image encoding apparatus and other devices. As shown in fig. 13, the image encoding apparatus may further include a storage module 132, where the storage module 132 is configured to store program codes and data of the image encoding apparatus, for example, the contents stored in the storage unit.
The processing module 130 may be a processor or controller, such as a central processing unit (Central Processing Unit, CPU), a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination implementing computing functions, e.g., a combination comprising one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 131 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 132 may be a memory.
For all relevant contents of the scenarios related to the above method embodiment, reference may be made to the functional description of the corresponding functional module, which is not repeated herein. The image encoding apparatus may perform the image encoding method, and the image encoding apparatus may be a video image encoding apparatus or another device having a video encoding function.
The application also provides a video encoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image encoding method of the embodiment of the application.
The embodiment of the application provides an image decoding device, which may be a video decoder. Specifically, the image decoding apparatus is configured to perform the steps performed by the video decoder in the above decoding method. The image decoding device provided by the embodiment of the application may comprise modules corresponding to the corresponding steps.
The embodiment of the present application may divide the image decoding apparatus into functional modules according to the above method examples; for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware or as a software functional module. The division of the modules in the embodiment of the present application is illustrative and is merely a logical function division; other division manners may be used in practice.
Fig. 14 shows a possible schematic structural diagram of the image decoding apparatus involved in the above embodiment in the case where each functional module is divided corresponding to each function. As shown in fig. 14, the image decoding apparatus 14 includes a parsing unit 140 and a determining unit 141.
A parsing unit 140, configured to parse a code stream and determine intra prediction filtering indication information and a first use identification bit of a current decoding block, where the intra prediction filtering indication information includes first indication information and second indication information, the first indication information is used to indicate whether a first intra prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra prediction filtering mode is allowed to be used, the first intra prediction filtering mode is an intra prediction filtering (IPF) mode, and the first use identification bit is the use identification bit of the first intra prediction filtering mode;
and a determining unit 141, configured to determine, according to the intra prediction filtering indication information and the first use identification bit, to predict the current decoding block using the first intra prediction filtering mode, to obtain a prediction block of the current decoding block.
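The cooperation of the parsing unit and the determining unit can be sketched as the decision below. The flag names and the plain true/false signalling are illustrative assumptions, not syntax element names from any standard.

```python
# Hypothetical decode-side decision: read the two "allowed" indications
# and the per-block use flags, then choose how the prediction block is
# built. Names and bit layout are illustrative only.

def decide_filter_mode(first_allowed, second_allowed, first_use_flag,
                       second_use_flag=0):
    """Return which filtering mode the current decoding block uses:
    'IPF', 'IPS', or None (plain intra prediction, no filtering)."""
    if first_allowed and first_use_flag:
        return "IPF"   # first intra prediction filtering mode
    if second_allowed and second_use_flag:
        return "IPS"   # second intra prediction filtering mode
    return None
```

A decoder would only parse `first_use_flag` when `first_allowed` is set, mirroring the flag-transmission rules in items 1 to 3 of the description above.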
For all relevant contents of the steps related to the above method embodiment, reference may be made to the functional description of the corresponding functional module, which is not repeated herein. Of course, the image decoding apparatus provided in the embodiment of the present application includes, but is not limited to, the above modules; for example, the image decoding apparatus may further include a storage unit. The storage unit may be used to store program codes and data of the image decoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of the image decoding apparatus provided in the embodiment of the present application is shown in fig. 15. In fig. 15, the image decoding apparatus 15 includes: a processing module 150 and a communication module 151. The processing module 150 is configured to control and manage actions of the image decoding apparatus, for example, to perform the steps performed by the parsing unit 140 and the determining unit 141, and/or other processes of the techniques described herein. The communication module 151 is used to support interaction between the image decoding apparatus and other devices. As shown in fig. 15, the image decoding apparatus may further include a storage module 152, where the storage module 152 is configured to store program codes and data of the image decoding apparatus, for example, the contents stored in the storage unit.
The processing module 150 may be a processor or controller, such as a central processing unit (Central Processing Unit, CPU), a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination implementing computing functions, e.g., a combination comprising one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 151 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 152 may be a memory.
For all relevant contents of the scenarios related to the above method embodiment, reference may be made to the functional description of the corresponding functional module, which is not repeated herein. The image decoding apparatus may perform the image decoding method, and the image decoding apparatus may be a video image decoding apparatus or another device having a video decoding function.
The application also provides a video decoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image decoding method of the embodiment of the application.
The application also provides a terminal, which comprises: one or more processors, a memory, and a communication interface. The memory and the communication interface are coupled to the one or more processors; the memory is used to store computer program code comprising instructions that, when executed by the one or more processors, cause the terminal to perform the image encoding method and/or the image decoding method of the embodiments of the present application. The terminal herein may be a video display device, a smart phone, a laptop or another device that can process or play video.
Another embodiment of the present application also provides a computer-readable storage medium including one or more program codes, the one or more program codes including instructions; when a processor in a decoding apparatus executes the program codes, the decoding apparatus performs the image encoding method and/or the image decoding method of the embodiments of the present application.
In another embodiment of the present application, there is also provided a computer program product comprising computer-executable instructions stored in a computer-readable storage medium; at least one processor of the decoding apparatus may read the computer-executable instructions from the computer-readable storage medium, and execution of the computer-executable instructions by the at least one processor causes the terminal to perform the image encoding method and/or the image decoding method of the embodiments of the present application.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using a software program, the embodiments may exist in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part.
The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave).
The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), etc.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional modules is illustrated by way of example; in practical applications, the above functions may be allocated to different functional modules as required, i.e., the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another apparatus, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program codes.
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (30)

  1. An image encoding method, comprising:
    determining intra-prediction filtering indication information of a current coding block, wherein the intra-prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-prediction filtering mode is allowed to be used or not, the second indication information is used for indicating whether a second intra-prediction filtering mode is allowed to be used or not, and the first intra-prediction filtering mode is an intra-prediction filtering IPF mode;
if it is determined according to the intra-frame prediction filtering indication information that the current coding block needs to use the first intra-frame prediction filtering mode, setting a first use identification bit of the first intra-frame prediction filtering mode of the current coding block to a first value;
    writing the intra-frame prediction filtering indication information, the first intra-frame prediction filtering mode and the first use identification bit into a code stream;
    and according to the first intra-frame prediction filtering mode, predicting the current coding block to obtain a prediction block, coding the prediction block, and writing a coding result into a code stream.
  2. The method according to claim 1, wherein the determining, according to the intra-frame prediction filtering indication information, that the current coding block needs to use the first intra-frame prediction filtering mode comprises:
    detecting that the first indication information indicates that the first intra-frame prediction filtering mode is allowed to be used, and the second indication information indicates that the second intra-frame prediction filtering mode is allowed to be used;
    traversing a plurality of intra-frame prediction modes for the current coding block, and respectively calculating a first prediction block of the current coding block in each intra-frame prediction mode;
    calculating, for each intra-frame prediction mode, first rate-distortion cost information of the current coding block in that prediction mode according to the first prediction block and an original block of the current coding block; determining a first intra-frame prediction mode with the minimum first rate-distortion cost information and recording the corresponding first rate-distortion cost information as a first cost value;
    selecting, for each intra-frame prediction mode, a filter corresponding to that intra-frame prediction mode, selecting a filter coefficient set corresponding to the size of the current coding block, and performing IPF filtering on the prediction block of the current coding block according to the filter and the filter coefficient set to obtain a second prediction block of the current coding block in each intra-frame prediction mode; calculating second rate-distortion cost information of each prediction mode according to the second prediction block and the original block of the current coding block; determining a second intra-frame prediction mode with the minimum second rate-distortion cost information and recording the corresponding second rate-distortion cost information as a second cost value;
    performing intra-frame prediction smoothing (IPS) filtering on the prediction block of the current coding block for each intra-frame prediction mode to obtain a third prediction block of the current coding block in each intra-frame prediction mode; calculating third rate-distortion cost information of each prediction mode according to the third prediction block and the original block of the current coding block; determining a third intra-frame prediction mode with the minimum third rate-distortion cost information and recording the corresponding third rate-distortion cost information as a third cost value;
    and if it is detected that the cost value with the smallest value among the first cost value, the second cost value and the third cost value is the second cost value, determining that the current coding block needs to use the first intra-frame prediction filtering mode.
  3. The method of claim 2, wherein writing the intra prediction filtering indication information, the first intra prediction filtering mode, and the first usage identification bit into a bitstream comprises: writing the intra-frame prediction filtering indication information, the second intra-frame prediction mode and the first use identification bit into a code stream;
    the method further comprises: superposing the second prediction block of the current coding block and the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block, which is used as a prediction reference block of the next coding block.
  4. The method according to claim 2, wherein the method further comprises:
    if it is detected that the cost value with the smallest value among the first cost value, the second cost value and the third cost value is the first cost value, determining that the current coding block does not need to use the first intra-frame prediction filtering mode or the second intra-frame prediction filtering mode;
    setting a first use identification bit of the first intra-frame prediction filtering mode and a second use identification bit of the second intra-frame prediction filtering mode of the current coding block to indicate unused;
    writing the intra-frame prediction filtering indication information, the first intra-frame prediction mode, the first use identification bit and the second use identification bit into a code stream;
    and according to the first intra-frame prediction mode, predicting the current coding block to obtain a prediction block, coding the prediction block, and writing a coding result into a code stream.
  5. The method according to claim 2, wherein the method further comprises:
    if it is detected that the cost value with the smallest value among the first cost value, the second cost value and the third cost value is the third cost value, determining that the current coding block needs to use the second intra-frame prediction filtering mode and does not need to use the first intra-frame prediction filtering mode;
    setting a first use identification bit of the first intra-frame prediction filtering mode of the current coding block to a second value, and setting a second use identification bit of the second intra-frame prediction filtering mode to indicate used;
    writing the intra-frame prediction filtering indication information, the third intra-frame prediction mode, the first use identification bit and the second use identification bit into a code stream;
    and according to the third intra-frame prediction mode, predicting the current coding block to obtain a prediction block, coding the prediction block, and writing a coding result into a code stream.
  6. The method according to claim 1, wherein the determining, according to the intra-frame prediction filtering indication information, that the current coding block needs to use the first intra-frame prediction filtering mode comprises:
    detecting that the first indication information indicates that the first intra-frame prediction filtering mode is allowed to be used, and the second indication information indicates that the second intra-frame prediction filtering mode is not allowed to be used;
    traversing a plurality of intra-frame prediction modes for the current coding block, and respectively calculating a first prediction block of the current coding block in each intra-frame prediction mode;
    calculating, for each intra-frame prediction mode, first rate-distortion cost information of the current coding block in that prediction mode according to the first prediction block and an original block of the current coding block; determining a first intra-frame prediction mode with the minimum first rate-distortion cost information and recording the corresponding first rate-distortion cost information as a first cost value;
    selecting, for each intra-frame prediction mode, a filter corresponding to that intra-frame prediction mode, selecting a filter coefficient set corresponding to the size of the current coding block, and performing IPF filtering on the prediction block of the current coding block according to the filter and the filter coefficient set to obtain a second prediction block of the current coding block in each intra-frame prediction mode; calculating second rate-distortion cost information of each prediction mode according to the second prediction block and the original block of the current coding block; determining a second intra-frame prediction mode with the minimum second rate-distortion cost information and recording the corresponding second rate-distortion cost information as a second cost value;
    and if it is detected that the cost value with the smaller value between the first cost value and the second cost value is the second cost value, determining that the current coding block needs to use the first intra-frame prediction filtering mode.
  7. The method of claim 6, wherein writing the intra prediction filtering indication information, the first intra prediction filtering mode, and the first usage identification bit into a bitstream comprises: writing the intra-frame prediction filtering indication information, the first intra-frame prediction mode and the first use identification bit into a code stream;
    the method further comprises: superposing the second prediction block of the current coding block and the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block, which is used as a prediction reference block of the next coding block.
  8. The method of claim 6, wherein the method further comprises:
    if it is detected that the cost value with the smaller value between the first cost value and the second cost value is the first cost value, determining that the current coding block does not need to use the first intra-frame prediction filtering mode;
    setting a first use identification bit of the first intra-frame prediction filtering mode of the current coding block to a second value;
    writing the intra-frame prediction filtering indication information, the first intra-frame prediction mode and the first use identification bit into a code stream;
    and according to the first intra-frame prediction mode, predicting the current coding block to obtain a prediction block, coding the prediction block, and writing a coding result into a code stream.
  9. The method according to claim 1, wherein the method further comprises:
    detecting that the first indication information indicates that the first intra-frame prediction filtering mode is not allowed to be used, and the second indication information indicates that the second intra-frame prediction filtering mode is not allowed to be used;
    traversing a plurality of intra-frame prediction modes for the current coding block, and respectively calculating a first prediction block of the current coding block in each intra-frame prediction mode;
    calculating, for each intra-frame prediction mode, first rate-distortion cost information of the current coding block in that prediction mode according to the first prediction block and an original block of the current coding block; determining a first intra-frame prediction mode with the minimum first rate-distortion cost information and recording the corresponding first rate-distortion cost information as a first cost value;
    writing the intra-frame prediction filtering indication information, the first intra-frame prediction mode, the first use identification bit and the second use identification bit into a code stream;
    and according to the first intra-frame prediction mode, predicting the current coding block to obtain a prediction block, coding the prediction block, and writing a coding result into a code stream.
  10. The method according to claim 1, wherein the method further comprises:
    detecting that the first indication information indicates that the first intra-frame prediction filtering mode is not allowed to be used, the second indication information indicates that the second intra-frame prediction filtering mode is allowed to be used, and an area of the current coding block is greater than or equal to 64 and less than 4096;
    calculating, for each intra-frame prediction mode, first rate-distortion cost information of the current coding block in that prediction mode according to the first prediction block and an original block of the current coding block; determining a first intra-frame prediction mode with the minimum first rate-distortion cost information and recording the corresponding first rate-distortion cost information as a first cost value;
    performing intra-frame prediction smoothing (IPS) filtering on the prediction block of the current coding block for each intra-frame prediction mode to obtain a third prediction block of the current coding block in each intra-frame prediction mode; calculating third rate-distortion cost information of each prediction mode according to the third prediction block and the original block of the current coding block; determining a third intra-frame prediction mode with the minimum third rate-distortion cost information and recording the corresponding third rate-distortion cost information as a third cost value;
    if it is detected that the cost value with the smaller value between the first cost value and the third cost value is the first cost value, determining that the current coding block does not need to use the first intra-frame prediction filtering mode; setting a first use identification bit of the first intra-frame prediction filtering mode of the current coding block to a second value; writing the intra-frame prediction filtering indication information, the first intra-frame prediction mode and the first use identification bit into a code stream; predicting the current coding block according to the first intra-frame prediction mode to obtain a prediction block, coding the prediction block, and writing a coding result into the code stream;
    if it is detected that the cost value with the smaller value between the first cost value and the third cost value is the third cost value, determining that the current coding block needs to use the second intra-frame prediction filtering mode; setting a second use identification bit of the second intra-frame prediction filtering mode of the current coding block to indicate used; writing the intra-frame prediction filtering indication information, the third intra-frame prediction mode and the second use identification bit into a code stream; and superposing the third prediction block of the current coding block and the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block, which is used as a prediction reference block of the next coding block.
  11. The method according to claim 2 or 6, wherein the IPF filtering employs a 9-tap filter, and the filter is a Gaussian convolution kernel.
  12. The method of claim 11, wherein a sum of the filter coefficients of the 9-tap filter is a power of 2.
  13. The method of claim 11, wherein the 9-tap filter has a first filter coefficient of 5, a second filter coefficient of 8, and a third filter coefficient of 12, and wherein the filtered prediction value is shifted 6 bits to the right.
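Claims 11-13 taken together are consistent with a 3x3 (9-tap) Gaussian kernel whose coefficients sum to 64 = 2^6, so the 6-bit right shift of claim 13 renormalizes the filtered value. One plausible arrangement (the kernel layout and the +32 rounding offset are assumptions; the claims only name the three coefficient values and the shift):

```python
# Hypothetical 3x3 layout: 4 corners x 5 + 4 edges x 8 + 1 center x 12
# = 64 = 2**6, matching the 6-bit right shift of claim 13.
KERNEL_9TAP = [
    [5,  8, 5],
    [8, 12, 8],
    [5,  8, 5],
]

def filter_sample(pred, x, y):
    """Filter one interior sample of a prediction block (2D list).
    The +32 offset (half of 64) rounds to nearest; it is an assumption,
    since the claims only specify the shift."""
    acc = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc += KERNEL_9TAP[dy + 1][dx + 1] * pred[y + dy][x + dx]
    return (acc + 32) >> 6
```

On a flat block the filter is value-preserving, as expected of a normalized smoothing kernel.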
  14. The method according to claim 2 or 6, wherein the IPF filtering employs a 25-tap filter or a 13-tap filter.
  15. An image decoding method, comprising:
    parsing a code stream, and determining intra-frame prediction filtering indication information and a first use flag bit of a current decoding block, wherein the intra-frame prediction filtering indication information comprises first indication information and second indication information, the first indication information is used for indicating whether a first intra-frame prediction filtering mode is allowed to be used, the second indication information is used for indicating whether a second intra-frame prediction filtering mode is allowed to be used, the first intra-frame prediction filtering mode is an intra-frame prediction filtering (IPF) mode, and the first use flag bit is a use flag bit of the first intra-frame prediction filtering mode;
    and determining, according to the intra-frame prediction filtering indication information and the first use flag bit, to predict the current decoding block by using the first intra-frame prediction filtering mode, to obtain a prediction block of the current decoding block.
  16. The method according to claim 15, wherein the determining, according to the intra-frame prediction filtering indication information and the first use flag bit, to predict the current decoding block by using the first intra-frame prediction filtering mode, to obtain a prediction block of the current decoding block, comprises:
    detecting that the first indication information indicates that the first intra-frame prediction filtering mode is allowed to be used, and the second indication information indicates that the second intra-frame prediction filtering mode is either allowed or not allowed to be used;
    acquiring the code stream and decoding to obtain a residual block of the current decoding block;
    parsing the code stream to obtain a prediction mode of the current decoding block, and calculating an original prediction block of the current decoding block according to the prediction mode and an adjacent reconstructed block;
    detecting that the first use flag bit indicates that the first intra-frame prediction filtering mode is used;
    and selecting a corresponding filter according to the prediction mode of the current decoding block, selecting a corresponding filter coefficient group according to the size of the current decoding block, and performing IPF filtering on an original prediction block of the current decoding block according to the filter and the filter coefficient group to obtain a prediction block of the decoding block.
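The two-step selection in the last limitation above amounts to a pair of table lookups. A sketch under stated assumptions (the claims only say the filter depends on the prediction mode and the coefficient group on the block size; the mode classes, the size threshold, and every table value below are placeholders, not normative):

```python
# Hypothetical lookup tables; a real codec defines these normatively.
FILTERS = {"non_angular": "filter_a", "angular": "filter_b"}
COEFF_GROUPS = {"small": (5, 8, 12), "large": (3, 7, 24)}

def select_ipf_params(pred_mode, width, height):
    """Return (filter, coefficient group) for a decoding block:
    filter chosen by prediction-mode class, coefficients by area."""
    mode_class = "non_angular" if pred_mode in (0, 1, 2) else "angular"
    size_class = "small" if width * height < 1024 else "large"
    return FILTERS[mode_class], COEFF_GROUPS[size_class]
```

The selected filter and coefficient group are then applied to the original prediction block to produce the final prediction block.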
  17. The method of claim 15, wherein the method further comprises:
    if the first indication information indicates that the first intra-prediction filtering mode is not allowed to be used, and the second indication information indicates that the second intra-prediction filtering mode is allowed to be used, and the area of the current decoding block is greater than or equal to 64 and less than 4096;
    acquiring the code stream and decoding to obtain a residual block of the current decoding block;
    parsing the code stream to obtain a prediction mode of the current decoding block, and calculating an original prediction block of the current decoding block according to the prediction mode and an adjacent reconstructed block;
    detecting that the first use flag indicates that the first intra prediction filtering mode is not used;
    parsing the code stream and acquiring a second use flag bit of the current decoding block, wherein the second use flag bit is an intra-frame prediction smoothing (IPS) use flag bit;
    detecting that the second use flag indicates use of a second intra prediction filtering mode;
    and carrying out intra-frame prediction smoothing (IPS) filtering on the original prediction block of the current decoding block to obtain the prediction block of the current decoding block.
  18. The method of claim 17, wherein the method further comprises:
    detecting that the second use flag indicates that the second intra prediction filtering mode is not used;
    determining the original prediction block of the current decoding block as a prediction block of the current decoding block.
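Claims 16-18 together describe a flag-driven dispatch on the decoder side. A condensed sketch (the argument order, the callable interface for the two filters, and the collapsing of the parsing steps into booleans are all assumptions made for illustration):

```python
def derive_prediction(pred, ipf_allowed, ips_allowed, area,
                      ipf_used, ips_used, apply_ipf, apply_ips):
    """apply_ipf / apply_ips are callables that filter the original
    prediction block; returns the final prediction block."""
    if ipf_allowed and ipf_used:
        return apply_ipf(pred)          # claim 16 path: IPF filtering
    if (not ipf_allowed and ips_allowed and not ipf_used
            and 64 <= area < 4096 and ips_used):
        return apply_ips(pred)          # claim 17 path: IPS filtering
    return pred                         # claim 18 path: no filtering
```

Note the area constraint 64 <= area < 4096 gates only the IPS branch, mirroring claim 17.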
  19. The method according to any one of claims 15-18, further comprising:
    and superposing the residual information on the prediction block to obtain a reconstructed block of the current decoding block.
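Reconstruction is the usual sample-wise superposition of residual and prediction. A sketch (the clip to an 8-bit sample range is an assumption; claim 19 does not state a bit depth):

```python
def reconstruct(pred_block, residual_block, max_val=255):
    """Add the residual to the prediction sample-wise and clip each
    sum to the legal sample range [0, max_val]."""
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(pred_block, residual_block)
    ]
```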
  20. The method of claim 16, wherein the IPF filtering employs a 9-tap filter, and the filter is a Gaussian convolution kernel.
  21. The method of claim 20, wherein a sum of the filter coefficients of the 9-tap filter is a power of 2.
  22. The method of claim 20, wherein the 9-tap filter has a first filter coefficient of 5, a second filter coefficient of 8, and a third filter coefficient of 12, and wherein the filtered prediction value is shifted 6 bits to the right.
  23. The method of claim 16, wherein the IPF filtering employs a 25-tap filter or a 13-tap filter.
  24. An image encoding device, comprising:
    a determining unit, configured to determine intra-frame prediction filtering indication information of a current coding block, the intra-frame prediction filtering indication information comprising first indication information indicating whether a first intra-frame prediction filtering mode is allowed to be used and second indication information indicating whether a second intra-frame prediction filtering mode is allowed to be used, the first intra-frame prediction filtering mode being an intra-frame prediction filtering (IPF) mode;
    a setting unit, configured to set a first use flag bit of the first intra-frame prediction filtering mode of the current coding block to a first value if it is determined, according to the intra-frame prediction filtering indication information, that the current coding block needs to use the first intra-frame prediction filtering mode;
    a transmission unit, configured to write the intra-frame prediction filtering indication information, the first intra-frame prediction filtering mode, and the first use flag bit into a code stream;
    and a superposition unit, configured to determine, according to the intra-frame prediction filtering indication information and the first use flag bit, to predict the current decoding block by using the first intra-frame prediction filtering mode, to obtain a prediction block of the current decoding block.
  25. An image decoding apparatus, comprising:
    a parsing unit, configured to determine intra-frame prediction filtering indication information and a first use flag bit of a current decoding block, wherein the intra-frame prediction filtering indication information comprises first indication information and second indication information, the first indication information is used to indicate whether a first intra-frame prediction filtering mode is allowed to be used, the second indication information is used to indicate whether a second intra-frame prediction filtering mode is allowed to be used, the first intra-frame prediction filtering mode is an intra-frame prediction filtering (IPF) mode, and the first use flag bit is a use flag bit of the first intra-frame prediction filtering mode;
    and a determining unit, configured to determine, according to the intra-frame prediction filtering indication information and the first use flag bit, to predict the current decoding block by using the first intra-frame prediction filtering mode, to obtain a prediction block of the current decoding block.
  26. An encoder, comprising a non-volatile storage medium storing an executable program and a central processing unit connected to the non-volatile storage medium, wherein the encoder performs the image encoding method of any one of claims 1-14 when the central processing unit executes the executable program.
  27. A decoder, comprising a non-volatile storage medium storing an executable program and a central processing unit connected to the non-volatile storage medium, wherein the decoder performs the image decoding method of any one of claims 15-23 when the central processing unit executes the executable program.
  28. A terminal, comprising: one or more processors, a memory, and a communication interface, wherein the memory and the communication interface are connected to the one or more processors; the terminal communicates with other devices via the communication interface; and the memory is configured to store computer program code comprising instructions,
    wherein the terminal performs the image encoding method of any one of claims 1-14 and/or the image decoding method of any one of claims 15-23 when the one or more processors execute the instructions.
  29. A computer program product comprising instructions which, when run on a terminal, cause the terminal to perform the image encoding method of any of claims 1-14 and/or the image decoding method of any of claims 15-23.
  30. A computer readable storage medium comprising instructions which, when run on a terminal, cause the terminal to perform the image encoding method of any one of claims 1-14 and/or the image decoding method of any one of claims 15-23.
CN202180060486.6A 2020-07-29 2021-07-29 Image encoding method, image decoding method and related devices Pending CN116250240A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN202010748924.5
CN202010748924.5A CN114071162A (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related device
PCT/CN2021/109173 WO2022022622A1 (en) 2020-07-29 2021-07-29 Image coding method, image decoding method, and related apparatus

Publications (1)

Publication Number Publication Date
CN116250240A true CN116250240A (en) 2023-06-09

Family

ID=80037645

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202010748924.5A Withdrawn CN114071162A (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related device
CN202180060486.6A Pending CN116250240A (en) 2020-07-29 2021-07-29 Image encoding method, image decoding method and related devices

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202010748924.5A Withdrawn CN114071162A (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related device

Country Status (3)

Country Link
CN (2) CN114071162A (en)
TW (1) TW202209878A (en)
WO (1) WO2022022622A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114567775B (en) * 2022-04-29 2022-09-09 中国科学技术大学 Image dividing method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103200400B (en) * 2012-01-09 2018-03-16 中兴通讯股份有限公司 A kind of decoding method, codec and the electronic equipment of image layer and sliced layer
CN105141948A (en) * 2015-09-22 2015-12-09 天津师范大学 Improved HEVC sample point self-adaption compensation method
CN110650349B (en) * 2018-06-26 2024-02-13 中兴通讯股份有限公司 Image encoding method, decoding method, encoder, decoder and storage medium
CN109889852B (en) * 2019-01-22 2021-11-05 四川大学 HEVC intra-frame coding optimization method based on adjacent values

Also Published As

Publication number Publication date
WO2022022622A1 (en) 2022-02-03
CN114071162A (en) 2022-02-18
TW202209878A (en) 2022-03-01

Similar Documents

Publication Publication Date Title
CN111819852B (en) Method and apparatus for residual symbol prediction in the transform domain
TW201740730A (en) Geometric transforms for filters for video coding
CN115243039B (en) Video image prediction method and device
CN111327904B (en) Image reconstruction method and device
WO2021238540A1 (en) Image encoding method, image decoding method, and related apparatuses
CN111277828B (en) Video encoding and decoding method, video encoder and video decoder
CN118018765A (en) Video decoding method and video decoder
CN113597761A (en) Intra-frame prediction method and device
CN116915986A (en) Inter-frame prediction method and device for video data
CN113497937B (en) Image encoding method, image decoding method and related devices
CN117241014A (en) MPM list construction method, intra-frame prediction mode acquisition method and device of chroma block
CN114071161B (en) Image encoding method, image decoding method and related devices
WO2021244197A1 (en) Image encoding method, image decoding method, and related apparatuses
CN116250240A (en) Image encoding method, image decoding method and related devices
WO2023092256A1 (en) Video encoding method and related apparatus therefor
CN118158440A (en) Image dividing method and device
WO2022037300A1 (en) Encoding method, decoding method, and related devices
CN118101967A (en) Position dependent spatially varying transform for video coding
CN118044184A (en) Method and system for performing combined inter-prediction and intra-prediction
CN112055211B (en) Video encoder and QP setting method
CN111327894B (en) Block division method, video coding and decoding method and video coder and decoder
CN111277840B (en) Transform method, inverse transform method, video encoder and video decoder
CN113965764B (en) Image encoding method, image decoding method and related device
US20240214561A1 (en) Methods and devices for decoder-side intra mode derivation
CN118264812A (en) Image dividing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination