CN114071161B - Image encoding method, image decoding method and related devices


Info

Publication number
CN114071161B
CN114071161B (application CN202010748923.0A)
Authority
CN
China
Prior art keywords
prediction
block
intra
filtering
current
Prior art date
Legal status
Active
Application number
CN202010748923.0A
Other languages
Chinese (zh)
Other versions
CN114071161A
Inventor
谢志煌
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010748923.0A
Priority to TW110123862A
Publication of CN114071161A
Application granted
Publication of CN114071161B

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of the present application disclose an image encoding method, an image decoding method, and related devices. The image decoding method includes: dividing an image and determining an intra-frame prediction mode of a target component of the current coding block, where the target component is the luminance component or the chrominance component; determining a prediction block of the target component of the current coding block according to the intra-frame prediction mode of the target component; performing first filtering on the reference pixels used to correct the prediction block, according to the intra-frame prediction mode of the target component, to obtain filtered reference pixels; and performing second filtering on the prediction block of the target component according to the filtered reference pixels to obtain a corrected prediction block. In the embodiments of the present application, before the prediction block of the current pixel block is corrected using the spatial correlation between neighboring pixel blocks and the current pixel block, the boundary pixels of the neighboring pixel blocks are filtered, which avoids sharpening and improves intra-frame prediction accuracy and coding efficiency.

Description

Image encoding method, image decoding method and related device
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to an image encoding method, an image decoding method, and a related apparatus.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and so forth.
Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), and their extensions, to transmit and receive digital video information more efficiently. Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing these video codec techniques.
With the proliferation of internet video, ever higher video compression ratios are demanded even as digital video compression technology continues to evolve.
Disclosure of Invention
The embodiments of the present application provide an image encoding method, an image decoding method, and related devices, so that before the prediction block of the current pixel block is corrected using the spatial correlation between neighboring pixel blocks and the current pixel block, the boundary pixels of the neighboring pixel blocks are filtered; this avoids sharpening and improves intra-frame prediction accuracy and coding efficiency.
In a first aspect, an embodiment of the present application provides an image encoding method, including:
dividing an image, and determining coding information of a current coding block, wherein the coding information comprises first indication information and second indication information, the first indication information is used for indicating whether intra-frame prediction filtering is allowed, and the second indication information is used for indicating whether intra-frame prediction smoothing filtering is allowed;
determining an optimal prediction mode of the current coding block according to the coding information, setting third indication information according to the optimal prediction mode, and transmitting the optimal prediction mode index and the third indication information through a code stream;
and superposing the prediction block of the current coding block and a residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
Compared with the prior art, the scheme of the application performs smooth filtering on the prediction block obtained by calculating the intra-frame prediction mode, improves intra-frame prediction precision, and effectively improves coding efficiency.
In a second aspect, an embodiment of the present application provides an image decoding method, including:
analyzing a code stream, and determining decoding information of a current decoding block, wherein the decoding information comprises first indication information and third indication information, the first indication information is used for indicating whether intra-frame prediction filtering is allowed or not, and the third indication information is used for indicating whether intra-frame prediction smoothing filtering is used or not;
determining a prediction block of the current decoding block according to the first indication information and the third indication information;
and superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit.
Compared with the prior art, the scheme of the application performs smooth filtering on the prediction block obtained by calculating the intra-frame prediction mode, improves intra-frame prediction precision, and effectively improves coding efficiency.
In a third aspect, an embodiment of the present application provides an image encoding apparatus, including:
the device comprises a dividing unit, a determining unit and a superposition unit, wherein the dividing unit is used for dividing an image and determining coding information of a current coding block, the coding information comprises first indication information and second indication information, the first indication information is used for indicating whether intra-prediction filtering is allowed, and the second indication information is used for indicating whether intra-prediction smoothing filtering is allowed;
the determining unit is used for determining an optimal prediction mode of the current coding block according to the coding information, setting third indication information according to the optimal prediction mode, and transmitting the optimal prediction mode index and the third indication information through a code stream;
and the superposition unit is used for superposing the prediction block of the current coding block and the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
In a fourth aspect, an embodiment of the present application provides an image decoding apparatus, including:
the decoding unit is used for decoding the code stream and determining decoding information of a current decoding block, wherein the decoding information comprises first indication information and third indication information, the first indication information is used for indicating whether intra-frame prediction filtering is allowed or not, and the third indication information is used for indicating whether intra-frame prediction smoothing filtering is used or not;
a determining unit, configured to determine a prediction block of the current decoded block according to the first indication information and the third indication information;
and the superposition unit is used for superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit.
In a fifth aspect, an embodiment of the present application provides an encoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the first aspect.
In a sixth aspect, an embodiment of the present application provides a decoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the second aspect.
In a seventh aspect, an embodiment of the present application provides a terminal, where the terminal includes: one or more processors, a memory, and a communication interface, the memory and the communication interface being connected to the one or more processors; the terminal communicates with other devices through the communication interface, and the memory is used for storing computer program code comprising instructions which, when executed by the one or more processors, cause the terminal to perform the method according to the first or second aspect.
In an eighth aspect, the present invention provides a computer-readable storage medium, having stored therein instructions, which, when executed on a computer, cause the computer to perform the method of the first or second aspect.
In a ninth aspect, embodiments of the present application provide a computer program product comprising instructions that, when executed on a computer, cause the computer to perform the method of the first or second aspect.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention or in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic block diagram of a coding tree unit in an embodiment of the present application;
FIG. 2 is a schematic block diagram of a CTU and a coding block CU in an embodiment of the present application;
FIG. 3 is a schematic block diagram of a color format in an embodiment of the present application;
FIG. 4 is a schematic diagram of an IPF in an embodiment of the present application;
FIG. 5 is a diagram illustrating intra prediction filtering according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a video coding system in an embodiment of the present application;
FIG. 7 is a schematic block diagram of a video encoder in an embodiment of the present application;
FIG. 8 is a schematic block diagram of a video decoder in an embodiment of the present application;
FIG. 9 is a flowchart illustrating an image encoding method according to an embodiment of the present application;
FIG. 10 is a flowchart illustrating an image decoding method according to an embodiment of the present application;
FIG. 11A is a schematic diagram of a first padding of a prediction block in the embodiment of the present application;
FIG. 11B is a second padding diagram of a prediction block in the embodiment of the present application;
FIG. 11C is a third padding diagram of the prediction block in the embodiment of the present application;
FIG. 12 is a block diagram of functional units of an image encoding apparatus according to an embodiment of the present application;
FIG. 13 is a block diagram showing another functional unit of the image encoding apparatus according to the embodiment of the present application;
FIG. 14 is a block diagram of a functional unit of an image decoding apparatus according to an embodiment of the present application;
FIG. 15 is a block diagram of another functional unit of the image decoding apparatus in the embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another. For example, a first client may be referred to as a second client, and similarly, a second client may be referred to as a first client, without departing from the scope of the present invention. Both the first client and the second client are clients, but they are not the same client.
First, terms used in the embodiments of the present application will be described.
For the partition of images, in order to more flexibly represent Video contents, a Coding Tree Unit (CTU), a Coding Unit (CU), a Prediction Unit (PU), and a Transform Unit (TU) are defined in the High Efficiency Video Coding (HEVC) technology. The CTU, CU, PU, and TU are all image blocks.
A coding tree unit CTU, an image being composed of a plurality of CTUs, a CTU generally corresponding to a square image area, containing luminance pixels and chrominance pixels (or may contain only luminance pixels, or may contain only chrominance pixels) in the image area; the CTU also contains syntax elements that indicate how the CTU is divided into at least one Coding Unit (CU) and the method of decoding each coding block to obtain a reconstructed picture. As shown in fig. 1, the picture 10 is composed of a plurality of CTUs (including CTU a, CTU B, CTU C, etc.). The encoded information corresponding to a CTU includes luminance values and/or chrominance values of pixels in a square image region corresponding to the CTU. Furthermore, the coding information corresponding to a CTU may also contain syntax elements indicating how to divide the CTU into at least one CU and the method of decoding each CU to get the reconstructed picture. The image area corresponding to one CTU may include 64 × 64, 128 × 128, or 256 × 256 pixels. In one example, a CTU of 64 × 64 pixels comprises a rectangular pixel lattice of 64 columns of 64 pixels each, each pixel comprising a luminance component and/or a chrominance component. The CTUs may also correspond to rectangular image regions or image regions with other shapes, and an image region corresponding to one CTU may also be an image region in which the number of pixels in the horizontal direction is different from the number of pixels in the vertical direction, for example, including 64 × 128 pixels.
The coding block CU: as shown in fig. 2, a CTU may be further divided into coding blocks (CUs). Each CU generally corresponds to an A × B rectangular region in the image, containing A × B luma pixels and/or their corresponding chroma pixels, where A is the width of the rectangle and B its height; A and B may be the same or different, and each generally takes a value that is an integer power of 2, such as 128, 64, 32, 16, 8, or 4. Here, the width referred to in the embodiments of the present application is the length along the X-axis direction (horizontal direction) of the two-dimensional rectangular coordinate system XoY shown in fig. 1, and the height is the length along the Y-axis direction (vertical direction). The reconstructed image of a CU may be obtained by adding a predicted image and a residual image: the predicted image is generated by intra prediction or inter prediction and may be composed of one or more prediction blocks (PBs), while the residual image is generated by inverse quantization and inverse transform processing of transform coefficients and may be composed of one or more transform blocks (TBs). Specifically, a CU carries coding information including the prediction mode, transform coefficients, and the like, and decoding processes such as prediction, inverse quantization, and inverse transformation are performed on the CU according to this coding information to generate its reconstructed image. The relationship between the coding tree unit and the coding block is shown in fig. 2.
The prediction unit PU is the basic unit of intra prediction and inter prediction. The motion information of an image block is defined to include the inter-frame prediction direction, reference frames, motion vectors, and the like. An image block undergoing encoding is called the current coding block (CCB), and an image block undergoing decoding is called the current decoding block (CDB); for example, when an image block is undergoing prediction processing, the current coding block or current decoding block is a prediction block, and when an image block is undergoing residual processing, it is a transform block. The picture containing the current coding block or current decoding block is called the current frame. In the current frame, image blocks located to the left of or above the current block may already have completed encoding/decoding processing, yielding reconstructed images; these are called reconstructed blocks, and information such as their coding mode and reconstructed pixels is available. A frame whose encoding/decoding was completed before that of the current frame is called a reconstructed frame. When the current frame is a unidirectionally predicted frame (P frame) or a bidirectionally predicted frame (B frame), it has one or two reference frame lists, referred to as L0 and L1 respectively, each containing at least one reconstructed frame, called a reference frame of the current frame. Reference frames provide reference pixels for inter-frame prediction of the current frame.
The transform unit TU is used to process the residual between the original image block and the predicted image block.
Pixels (also called pixel points) refer to the pixels in an image, such as the pixels in a coding block, the pixels in a luminance-component pixel block (also called luma pixels), and the pixels in a chrominance-component pixel block (also called chroma pixels).
Samples (also referred to as pixel values or sample values) are the values of a pixel point: in the luminance component domain, the pixel value is the luminance (i.e., the gray-scale value); in the chrominance component domain, the pixel value is the chrominance value (i.e., color and saturation).
Description of directions: the horizontal direction is the direction along the X-axis of the two-dimensional rectangular coordinate system XoY shown in fig. 1, and the vertical direction is the direction along the negative Y-axis of the same coordinate system.
Intra-frame prediction generates a prediction image of the current block from the spatially neighboring pixels of the current block; each intra prediction mode corresponds to one method of generating the prediction image. The division of the intra-frame prediction unit includes a 2N × 2N division mode and an N × N division mode: in the 2N × 2N mode the image block is not divided, while the N × N mode divides the image block into four equally sized sub-image blocks.
In general, digital video compression techniques operate on video sequences whose color coding method is YCbCr, which may also be referred to as YUV, with a color format such as 4:2:0, 4:2:2, or 4:4:4. Y denotes brightness (Luma), i.e., the gray-scale value; Cb denotes the blue chrominance component and Cr the red chrominance component; U and V denote chrominance (Chroma), describing color and saturation. In the 4:2:0 format, the chrominance components are sampled at half the horizontal and half the vertical resolution of the luminance component, so every 2 × 2 block of luma pixels shares one Cb and one Cr sample; in 4:2:2 the chrominance is halved only horizontally, and in 4:4:4 the chrominance is kept at full resolution.
In a digital video encoding process, an encoder reads the pixels of the original video sequence in a given color format and encodes them. A typical digital encoder includes stages such as prediction, transform and quantization, inverse transform and inverse quantization, loop filtering, and entropy coding, which together remove spatial, temporal, visual, and statistical redundancy. Since human eyes are more sensitive to changes in the luminance component and respond less strongly to changes in the chrominance components, the original video sequence is generally coded in the YUV 4:2:0 color format. Moreover, in the intra-frame coding part, the digital video encoder applies different prediction processes to the luminance and chrominance components: prediction of the luminance component is finer and more complex, while prediction of the chrominance component is generally simpler. The cross-component prediction (CCP) mode is a technique applied between the luminance and chrominance components in existing digital video coding to increase the video compression ratio.
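As a concrete illustration of the sampling relationships just described, the following Python sketch (illustrative only, not part of the patent; the function name and interface are ours) computes the luma and chroma plane dimensions of one frame in each color format:

```python
def plane_dimensions(width, height, color_format):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for a YCbCr frame.

    4:2:0 subsamples chroma by 2 horizontally and vertically,
    4:2:2 subsamples chroma by 2 horizontally only,
    4:4:4 keeps chroma at full resolution.
    """
    if color_format == "4:2:0":
        chroma = (width // 2, height // 2)
    elif color_format == "4:2:2":
        chroma = (width // 2, height)
    elif color_format == "4:4:4":
        chroma = (width, height)
    else:
        raise ValueError("unknown color format")
    return (width, height), chroma

# A 1920x1080 frame in 4:2:0: luma plane 1920x1080, each chroma plane 960x540.
print(plane_dimensions(1920, 1080, "4:2:0"))
```

This is why 4:2:0 is the usual choice: each chroma plane carries only a quarter as many samples as the luma plane, matching the eye's lower sensitivity to chrominance.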
The intra-frame prediction part of digital video coding and decoding predicts the current coding unit block mainly by referring to image information of neighboring blocks within the current frame; the residual between the prediction block and the original image block is computed to obtain residual information, which is transmitted to the decoder through processes such as transformation and quantization. After receiving and parsing the code stream, the decoder recovers the residual information through inverse transformation and inverse quantization, and superimposes it on the prediction image block obtained by its own prediction to obtain the reconstructed image block. In this process, intra-frame prediction usually predicts the current coding block under each angular and non-angular mode to obtain candidate prediction blocks, selects the optimal prediction mode of the current coding unit according to the rate-distortion information computed from the prediction block and the original block, and then transmits that prediction mode to the decoder through the code stream. The decoder parses the prediction mode, predicts the prediction image of the current decoding block, and superimposes the residual pixels carried by the code stream to obtain the reconstructed image.
Across successive generations of digital video coding standards, the non-angular modes have remained relatively stable, consisting of an average (DC) mode and a planar mode, while the number of angular modes has grown continuously. Taking the international H-series digital video coding standards as an example, H.264/AVC has only 8 angular prediction modes and 1 non-angular prediction mode; H.265/HEVC extends this to 33 angular and 2 non-angular prediction modes; and the latest versatile video coding standard, H.266/VVC, adopts 67 prediction modes, retaining 2 non-angular modes and expanding the angular modes from the 33 of H.265 to 65. As the number of angular modes increases, intra-frame prediction becomes more accurate, meeting society's demands for high-definition and ultra-high-definition video. Like the international standards, the Chinese digital audio and video coding standard AVS3 also continues to expand its angular and non-angular modes; however, the development of ultra-high-definition digital video places higher requirements on intra-frame prediction, and coding efficiency cannot be improved merely by adding angular prediction modes and widening angles. AVS3 therefore adopts an intra prediction filtering technique (IPF). Current intra angular prediction does not use all of the reference pixels, so the correlation between some pixels and the current coding unit is easily ignored; IPF improves pixel prediction precision through point-by-point filtering and can effectively strengthen spatial correlation, thereby improving intra prediction accuracy. Taking the AVS3 prediction modes pointing from top-right to bottom-left as an example, as shown in fig. 4, URB denotes the boundary pixels of the left neighboring block adjacent to the current coding unit, MRB denotes the boundary pixels of the upper neighboring block adjacent to the current coding unit, and "filter direction" denotes the filtering direction. For prediction directions from top-right to bottom-left, the prediction values of the current coding unit are generated mainly from the reference pixels in the MRB row of the upper neighboring block; that is, the predicted pixels of the current coding unit do not refer to the reconstructed pixels of the left neighboring block. However, the current coding unit and the reconstructed block on its left are spatially adjacent, and referring only to the upper MRB pixels while ignoring the left URB pixels easily loses spatial correlation, resulting in a poor prediction effect.
The IPF technology is applied to all prediction modes of intra-frame prediction, and is a filtering method for improving intra-frame prediction precision. The IPF technology is mainly realized through the following processes:
a) Judging the current prediction mode of the coding unit by the IPF technology, and dividing the prediction mode into a horizontal angle prediction mode, a vertical angle prediction mode and a non-angle prediction mode;
b) According to different types of prediction modes, the IPF technology adopts different filters to filter input pixels;
c) According to different distances from the current pixel to the reference pixel, the IPF technology adopts different filter coefficients to filter the input pixel;
the input pixel of the IPF technique is a predicted pixel obtained in each prediction mode, and the output pixel is a final predicted pixel after IPF.
The IPF technique has an enable flag, ipf_enable_flag, a binary variable: the value '1' indicates that intra prediction filtering may be used, and '0' indicates that intra prediction filtering should not be used. The IPF technique also uses a flag ipf_flag, a binary variable: the value '1' indicates that intra prediction filtering is to be used, and '0' indicates that it should not be used; if ipf_flag is absent from the code stream, its value defaults to 0.
The syntax element ipf_flag is signaled as follows (the syntax table appears in the source only as images and is not reproduced here).
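The flag semantics above can be summarized in the following hedged Python sketch; `bitstream.read_bit()` is a hypothetical stand-in for the entropy decoder, and since the exact presence condition in the syntax table is not recoverable from the image, conditioning on an intra-coded unit is our assumption:

```python
def parse_ipf_flag(bitstream, ipf_enable_flag, is_intra_unit):
    """Recover ipf_flag for the current unit, per the semantics above:
    the flag can only be present when IPF is allowed at the sequence
    level (assumed here: and the unit is intra coded); when absent
    it defaults to 0."""
    if ipf_enable_flag and is_intra_unit:
        return bitstream.read_bit()  # '1': use intra prediction filtering
    return 0  # ipf_flag absent from the code stream -> defaults to 0
```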
the IPF technique classifies prediction modes 0, 1, and 2 as non-angular prediction modes, and filters the prediction pixels using a first three-tap filter;
classifying the prediction modes 3 to 18 and 34 to 50 into vertical angle prediction modes, and filtering the prediction pixels by using a first two-tap filter;
the prediction modes 19 to 32 and 51 to 65 are classified into the horizontal-class angle prediction modes, and the prediction pixels are filtered using the second two-tap filter.
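This mode grouping can be expressed directly in code; a minimal Python sketch (names ours) following exactly the mode ranges listed above:

```python
def classify_ipf_mode(mode):
    """Map an intra prediction mode index to its IPF filter class,
    using the mode ranges given in the text above."""
    if mode in (0, 1, 2):
        return "non_angular"       # filtered with the first three-tap filter
    if 3 <= mode <= 18 or 34 <= mode <= 50:
        return "vertical_class"    # filtered with the first two-tap filter
    if 19 <= mode <= 32 or 51 <= mode <= 65:
        return "horizontal_class"  # filtered with the second two-tap filter
    raise ValueError(f"mode {mode} is outside the ranges listed in the text")
```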
The first three-tap filter applicable to the IPF technique has the following filtering formula:
P′(x,y)=f(x)·P(-1,y)+f(y)·P(x,-1)+(1-f(x)-f(y))·P(x,y)
the first two-tap filter applicable to the IPF technique has the following filtering formula:
P′(x,y)=f(x)·P(-1,y)+(1-f(x))·P(x,y)
the second two-tap filter suitable for the IPF technique has the following filtering formula:
P′(x,y)=f(y)·P(x,-1)+(1-f(y))·P(x,y)
In the above equations, P′(x, y) is the final prediction value of the pixel at position (x, y) of the current chroma prediction block; f(x) and f(y) are, respectively, the horizontal filter coefficient applied to the reconstructed pixels of the left neighboring block and the vertical filter coefficient applied to the reconstructed pixels of the upper neighboring block; P(-1, y) is the reconstructed pixel to the left of row y, and P(x, -1) is the reconstructed pixel above column x; and P(x, y) is the original predicted pixel value in the current chroma component prediction block. The values of x and y do not exceed the width and height of the current coding unit block.
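The three equations translate directly into code. Below is a minimal Python transcription (floating-point for readability; a real codec would use the standard's fixed-point integer arithmetic, which is omitted here), with the reference pixels supplied as callables:

```python
def ipf_filter_pixel(pred, left_rec, top_rec, x, y, f_x, f_y, mode_class):
    """Apply the IPF equations above to the predicted pixel at (x, y).

    pred(x, y)  : original predicted value P(x, y) of the current block
    left_rec(y) : reconstructed pixel P(-1, y) in the left neighboring column
    top_rec(x)  : reconstructed pixel P(x, -1) in the upper neighboring row
    f_x, f_y    : horizontal / vertical filter coefficients (Table 1)
    """
    if mode_class == "non_angular":       # first three-tap filter
        return f_x * left_rec(y) + f_y * top_rec(x) + (1 - f_x - f_y) * pred(x, y)
    if mode_class == "vertical_class":    # first two-tap filter
        return f_x * left_rec(y) + (1 - f_x) * pred(x, y)
    if mode_class == "horizontal_class":  # second two-tap filter
        return f_y * top_rec(x) + (1 - f_y) * pred(x, y)
    raise ValueError("unknown mode class")
```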
The values of the horizontal and vertical filter coefficients are related to the size of the current coding unit block and to the distance from the predicted pixel in the current prediction block to the left and upper reconstructed pixels; the coefficients are organized into different filter coefficient groups according to the size of the current coding unit block.
Table 1 gives the filter coefficients for the IPF technique.
Table 1 intra chroma prediction filter coefficients
[Table 1 appears in the source only as an image and is not reproduced here.]
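Because Table 1 survives only as an image, its actual coefficient values cannot be reproduced here. The sketch below therefore shows only the lookup structure the text describes (coefficient groups chosen by block size, indexed by the distance from the predicted pixel to the reference pixel); every number and size threshold in it is a placeholder, not a Table 1 entry:

```python
# Placeholder coefficient groups -- NOT the real Table 1 values, which the
# source provides only as an image. Structure: one group per block-size
# class, indexed by pixel-to-reference distance.
IPF_COEFF_GROUPS = {
    "small":  [0.50, 0.25, 0.12, 0.06],                          # hypothetical
    "medium": [0.44, 0.30, 0.20, 0.12, 0.06, 0.03],              # hypothetical
    "large":  [0.40, 0.32, 0.25, 0.18, 0.12, 0.08, 0.04, 0.02],  # hypothetical
}

def ipf_coefficient(block_size, distance):
    """Look up f(x) or f(y) by block-size class and distance; beyond the
    listed distances the filter's influence decays to zero. The size
    thresholds below are assumptions for illustration only."""
    if block_size <= 8:
        group = IPF_COEFF_GROUPS["small"]
    elif block_size <= 32:
        group = IPF_COEFF_GROUPS["medium"]
    else:
        group = IPF_COEFF_GROUPS["large"]
    return group[distance] if distance < len(group) else 0.0
```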
Fig. 5 illustrates the three filtering cases of intra prediction filtering: (a) filtering the prediction values in the current coding unit with reference to the upper reference pixels only; (b) filtering the prediction values in the current coding unit with reference to the left reference pixels only; and (c) filtering the prediction values in the current coding unit block with reference to both the upper and the left reference pixels, where Distance denotes the distance from the pixel currently being processed to the reference pixel.
FIG. 6 is a block diagram of a video coding system 1 of one example described in an embodiment of the present application. As used herein, the term "video coder" generally refers to both video encoders and video decoders. In this application, the term "video coding" or "coding" may generally refer to video encoding or video decoding. The video encoder 100 and the video decoder 200 of the video coding system 1 are used to implement the image encoding method proposed by the present application.
As shown in fig. 6, video coding system 1 includes a source device 10 and a destination device 20. Source device 10 generates encoded video data. Accordingly, source device 10 may be referred to as a video encoding device. Destination device 20 may decode the encoded video data generated by source device 10. Accordingly, the destination device 20 may be referred to as a video decoding device. Various implementations of source device 10, destination device 20, or both may include one or more processors and memory coupled to the one or more processors. The memory can include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures that can be accessed by a computer, as described herein.
Source device 10 and destination device 20 may comprise a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.
Destination device 20 may receive encoded video data from source device 10 via link 30. Link 30 may comprise one or more media or devices capable of moving encoded video data from source device 10 to destination device 20. In one example, link 30 may comprise one or more communication media that enable source device 10 to transmit encoded video data directly to destination device 20 in real-time. In this example, source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 20. The one or more communication media may include wireless and/or wired communication media such as a Radio Frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the internet). The one or more communication media may include a router, switch, base station, or other apparatus that facilitates communication from source device 10 to destination device 20. In another example, encoded data may be output from output interface 140 to storage device 40.
The image codec techniques of this application may be applied to video codecs to support a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (e.g., via the internet), encoding for video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
The video coding system 1 illustrated in fig. 6 is merely an example, and the techniques of this application may be applied to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between an encoding device and a decoding device. In other examples, the data is retrieved from local storage, streamed over a network, and so forth. A video encoding device may encode and store data to a memory, and/or a video decoding device may retrieve and decode data from a memory. In many examples, encoding and decoding are performed by devices that do not communicate with each other, but rather only encode data to and/or retrieve data from memory and decode data.
In the example of fig. 6, source device 10 includes video source 120, video encoder 100, and output interface 140. In some examples, output interface 140 may include a modulator/demodulator (modem) and/or a transmitter. Video source 120 may comprise a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.
Video encoder 100 may encode video data from video source 120. In some examples, source device 10 transmits the encoded video data directly to destination device 20 via output interface 140. In other examples, encoded video data may also be stored onto storage device 40 for later access by destination device 20 for decoding and/or playback.
In the example of fig. 6, destination device 20 includes input interface 240, video decoder 200, and display device 220. In some examples, input interface 240 includes a receiver and/or a modem. Input interface 240 may receive encoded video data via link 30 and/or from storage device 40. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. In general, display device 220 displays decoded video data. The display device 220 may include a variety of display devices, such as a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or other types of display devices.
Although not shown in fig. 6, in some aspects, video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle encoding of both audio and video in a common data stream or separate data streams.
Video encoder 100 and video decoder 200 may each be implemented as any of a variety of circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the present application is implemented in part in software, a device may store instructions for the software in a suitable non-volatile computer-readable storage medium and may execute the instructions in hardware using one or more processors to implement the techniques of the present application. Any of the foregoing, including hardware, software, or a combination of hardware and software, may be considered one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a respective device.
Fig. 7 is an exemplary block diagram of a video encoder 100 described in embodiments of the present application. The video encoder 100 is used to output the video to the post-processing entity 41. Post-processing entity 41 represents an example of a video entity, such as a media-aware network element (MANE) or a splicing/editing device, that may process the encoded video data from video encoder 100. In some cases, post-processing entity 41 may be an instance of a network entity. In some video encoding systems, post-processing entity 41 and video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to post-processing entity 41 may be performed by the same device that includes video encoder 100. In some examples, post-processing entity 41 is an example of storage device 40 of fig. 6.
In the example of fig. 7, the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a memory 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103. The prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109. For image block reconstruction, the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111. Filter unit 106 represents one or more loop filters, such as deblocking filters, adaptive loop filters (ALF), and sample adaptive offset (SAO) filters. Although filter unit 106 is shown in fig. 7 as an in-loop filter, in other implementations, filter unit 106 may be implemented as a post-loop filter. In one example, the video encoder 100 may further include a video data memory and a partitioning unit (not shown).
Video encoder 100 receives video data and stores the video data in a video data memory. The partitioning unit partitions the video data into image blocks and these image blocks may be further partitioned into smaller blocks, e.g. image block partitions based on a quadtree structure or a binary tree structure. Prediction processing unit 108 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes. Prediction processing unit 108 may provide the resulting intra, inter coded block to summer 112 to generate a residual block and to summer 111 to reconstruct the encoded block used as the reference picture. An intra predictor 109 within prediction processing unit 108 may perform intra-predictive encoding of a current block of video relative to one or more neighboring encoded blocks of the current block to be encoded in the same frame or slice to remove spatial redundancy. Inter predictor 110 within prediction processing unit 108 may perform inter-predictive encoding of the current block relative to one or more prediction blocks in one or more reference pictures to remove temporal redundancy. The prediction processing unit 108 provides information indicating the selected intra or inter prediction mode of the current image block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected inter prediction mode.
After prediction processing unit 108 generates a prediction block for the current image block via inter/intra prediction, video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded. Summer 112 represents one or more components that perform this subtraction operation. The residual video data in the residual block may be included in one or more TUs and applied to transformer 101. The transformer 101 transforms the residual video data into residual transform coefficients using a transform such as a Discrete Cosine Transform (DCT) or a conceptually similar transform. Transformer 101 may convert residual video data from a pixel value domain to a transform domain, e.g., the frequency domain.
The transformer 101 may send the resulting transform coefficients to the quantizer 102. Quantizer 102 quantizes the transform coefficients to further reduce the bit rate. In some examples, quantizer 102 may then perform a scan of a matrix that includes quantized transform coefficients. Alternatively, the entropy encoder 103 may perform the scanning.
After quantization, the entropy encoder 103 entropy encodes the quantized transform coefficients. For example, the entropy encoder 103 may perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy encoding method or technique. After entropy encoding by the entropy encoder 103, the encoded code stream may be transmitted to the video decoder 200, or archived for later transmission or retrieval by the video decoder 200. The entropy encoder 103 may also entropy encode syntax elements of the current image block to be encoded.
Inverse quantizer 104 and inverse transformer 105 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block for a reference image. The summer 111 adds the reconstructed residual block to the prediction block generated by the inter predictor 110 or the intra predictor 109 to generate a reconstructed image block. The filter unit 106 may be adapted to reconstruct the image block to reduce distortions, such as block artifacts. The reconstructed image block is then stored in memory 107 as a reference block that may be used by inter predictor 110 as a reference block for inter prediction of blocks in subsequent video frames or images.
The video encoder 100 divides the input video into a number of coding tree units, each of which is in turn divided into a number of rectangular or square coding blocks. When the current coding block is encoded in intra prediction mode, a traversal over multiple prediction modes is computed for the luminance component of the current coding block and the optimal prediction mode is selected according to the rate-distortion cost; a traversal over multiple prediction modes is likewise computed for the chrominance component, with the optimal prediction mode again selected according to the rate-distortion cost. The residual between the original video block and the prediction block is then computed; one path of the residual forms the output code stream through transformation, quantization, entropy coding, and the like, while the other path forms reconstructed samples through inverse transformation, inverse quantization, loop filtering, and the like, to be used as reference information for subsequent video compression.
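The mode-selection loop just described can be sketched as follows; `predict` and `rd_cost` are hypothetical stand-ins for encoder internals rather than an API defined by the patent:

```python
def choose_intra_mode(original_block, candidate_modes, predict, rd_cost):
    """Traverse the candidate intra modes and keep the one with the
    lowest rate-distortion cost, as described in the text above."""
    best_mode, best_cost = None, float("inf")
    for mode in candidate_modes:
        prediction = predict(mode)  # prediction block under this mode
        cost = rd_cost(original_block, prediction, mode)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```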
The IPF technique is implemented in the video encoder 100 as follows (a sketch of the resulting decision process is given after the steps below).
The input digital video information is divided into a plurality of coding tree units at a coding end, each coding tree unit is divided into a plurality of rectangular or square coding units, and each coding unit carries out intra-frame prediction process to calculate a prediction block.
In the current coding unit,
(1) If the IPF enable flag is '1', all of the following steps are performed;
(2) if the IPF enable flag is '0', only steps a1), b1), f1) and g1) are performed.
a1) Intra-frame prediction first traverses all prediction modes, computes the predicted pixels under each intra prediction mode, and computes the rate-distortion cost against the original pixels;
b1) According to the principle of minimizing the rate-distortion cost over all prediction modes, the optimal prediction mode of the current coding unit is selected, and the optimal prediction mode information and its corresponding rate-distortion cost are recorded;
c1) All intra prediction modes are traversed again, this time with the IPF technique enabled; the predicted pixels under each intra prediction mode are first computed to obtain the prediction block of the current coding unit;
d1) IPF is applied to the prediction block of the current coding unit: the filter corresponding to the current prediction mode is selected according to the prediction mode, and the corresponding filter coefficient group is selected according to the size of the current coding unit (the specific correspondence can be looked up in Table 1);
e1) The rate-distortion cost of each prediction mode is computed from the final predicted pixels produced by the IPF technique and the original pixels, and the prediction mode with the minimum rate-distortion cost and its cost value are recorded;
f1) If the IPF enable flag is '0', the prediction mode index recorded in b1) is transmitted to the decoder through the code stream;
if the IPF enable flag is '1', the minimum cost value recorded in b1) is compared with the minimum cost value recorded in e1):
if the rate-distortion cost in b1) is lower, the prediction mode index recorded in b1) is encoded as the optimal prediction mode of the current coding unit and transmitted to the decoder through the code stream, and the ipf_flag of the current coding unit is set to false, indicating that the IPF technique is not used; this flag is also transmitted to the decoder through the code stream;
if the rate-distortion cost in e1) is lower, the prediction mode index recorded in e1) is encoded as the optimal prediction mode of the current coding unit and transmitted to the decoder through the code stream, and the ipf_flag of the current coding unit is set to true, indicating that the IPF technique is used; this flag is also transmitted to the decoder through the code stream.
g1) The predicted values are then superimposed with the residual information obtained after operations such as transformation and quantization, yielding the reconstructed block of the current coding unit, which is used as reference information for subsequent coding units.
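Steps a1) through g1) amount to a two-pass rate-distortion comparison. The hedged Python sketch below reuses choose_intra_mode from the earlier sketch; predict_with_ipf and the bitstream.write_* helpers are hypothetical names for encoder internals, not an API defined by the patent:

```python
def encode_cu_intra(original_block, modes, predict, predict_with_ipf,
                    rd_cost, ipf_enable_flag, bitstream):
    """Two-pass decision of steps a1)-g1): pass 1 without IPF, pass 2
    with IPF applied to each prediction, then signal the cheaper choice."""
    # a1), b1): traverse all modes without IPF and record the best
    mode_plain, cost_plain = choose_intra_mode(
        original_block, modes, predict, rd_cost)

    if not ipf_enable_flag:               # f1), enable flag '0'
        bitstream.write_mode(mode_plain)  # no ipf_flag is written
        return mode_plain, False

    # c1)-e1): traverse all modes again with IPF applied to each prediction
    mode_ipf, cost_ipf = choose_intra_mode(
        original_block, modes, predict_with_ipf, rd_cost)

    # f1): keep the cheaper pass; signal the mode index and ipf_flag
    use_ipf = cost_ipf < cost_plain
    best_mode = mode_ipf if use_ipf else mode_plain
    bitstream.write_mode(best_mode)
    bitstream.write_ipf_flag(use_ipf)
    return best_mode, use_ipf  # g1) reconstruction then proceeds as usual
```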
The intra predictor 109 may also provide information indicating the selected intra prediction mode for the current encoding block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
Fig. 8 is an exemplary block diagram of a video decoder 200 described in the embodiments of the present application. In the example of fig. 8, the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a memory 207. The prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209. In some examples, video decoder 200 may perform a decoding process that is substantially reciprocal to the encoding process described with respect to video encoder 100 from fig. 7.
In the decoding process, video decoder 200 receives an encoded video bitstream representing an image block and associated syntax elements of an encoded video slice from video encoder 100. Video decoder 200 may receive video data from network entity 42 and, optionally, may store the video data in a video data store (not shown). The video data memory may store video data, such as an encoded video bitstream, to be decoded by components of video decoder 200. The video data stored in the video data memory may be obtained, for example, from storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium. The video data memory may serve as a coded picture buffer (CPB) for storing encoded video data from the encoded video bitstream.
Network entity 42 may be, for example, a server, a MANE, a video editor/splicer, or other such device for implementing one or more of the techniques described above. Network entity 42 may or may not include a video encoder, such as video encoder 100. Network entity 42 may implement portions of the techniques described in this application before network entity 42 sends the encoded video bitstream to video decoder 200. In some video decoding systems, network entity 42 and video decoder 200 may be part of separate devices, while in other cases, the functionality described with respect to network entity 42 may be performed by the same device that includes video decoder 200.
The entropy decoder 203 of the video decoder 200 entropy decodes the bitstream to produce quantized coefficients and some syntax elements. The entropy decoder 203 forwards the syntax elements to the prediction processing unit 208. Video decoder 200 may receive syntax elements at the video slice level and/or the picture block level. When a video slice is decoded as an intra-decoded (I) slice, intra predictor 209 of prediction processing unit 208 generates a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture. When a video slice is decoded as an inter-decoded (i.e., B or P) slice, the inter predictor 210 of the prediction processing unit 208 may determine an inter prediction mode for decoding a current image block of the current video slice based on syntax elements received from the entropy decoder 203, decode the current image block (e.g., perform inter prediction) based on the determined inter prediction mode.
The inverse quantizer 204 inversely quantizes, i.e., dequantizes, the quantized transform coefficients provided in the code stream and decoded by the entropy decoder 203. The inverse quantization process may include: the quantization parameter calculated by the video encoder 100 for each image block in the video slice is used to determine the degree of quantization that should be applied and likewise the degree of inverse quantization that should be applied. Inverse transformer 205 applies an inverse transform, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transformation process, to the transform coefficients in order to generate a residual block in the pixel domain.
After the inter predictor 210 generates a prediction block for the current image block or a sub-block of the current image block, the video decoder 200 obtains a reconstructed block, i.e., a decoded image block, by summing the residual block from the inverse transformer 205 with the corresponding prediction block generated by the inter predictor 210. Summer 211 represents the component that performs this summation operation. A loop filter (in or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality, if desired. Filter unit 206 may represent one or more loop filters, such as deblocking filters, adaptive loop filters (ALF), and sample adaptive offset (SAO) filters. Although the filter unit 206 is shown in fig. 8 as an in-loop filter, in other implementations, the filter unit 206 may be implemented as a post-loop filter.
The image decoding method specifically executed by the video decoder 200 includes obtaining the prediction mode index of the current coding block after the input code stream is parsed, inversely transformed, and inversely quantized. If the prediction mode of the chroma component of the current coding block is the enhanced two-step cross-component prediction mode, reconstructed samples are selected only from the upper-side or left-side neighboring pixels of the current coding block according to an index value to derive a linear model; a reference prediction block of the chroma component is computed from the linear model and down-sampled, and prediction correction based on the correlation of boundary-adjacent pixels in the orthogonal direction is applied to the down-sampled prediction block to obtain the final prediction block of the chroma component. One path of the reconstructed signal then serves as reference information for subsequent video decoding, while the other path undergoes post-filtering and is output as the video signal.
The existing IPF technique is implemented at the video decoder 200 as follows.
The decoding end acquires and parses the code stream to obtain the digital video sequence information, from which it parses the IPF allowed flag of the current video sequence, whether the coding mode of the current decoding unit is an intra prediction coding mode, and the IPF usage flag of the current decoding unit.
In the current decoding unit:
(1) if the IPF allowed flag is '1', all of the following steps are performed;
(2) if the IPF allowed flag is '0', only steps a2), b2) and e2) are performed:
a2) Obtain the code stream information, parse the residual information of the current decoding unit, and obtain the time-domain residual information through the inverse transformation and inverse quantization processes;
b2) Parse the code stream to obtain the prediction mode index of the current decoding unit, and calculate the prediction block of the current decoding unit from the neighboring reconstructed blocks and the prediction mode index;
c2) Parse the IPF usage flag; if the IPF usage flag is '0', no additional operation is performed on the current prediction block; if the IPF usage flag is '1', perform d2);
d2) Select the corresponding filter according to the prediction mode classification information of the current decoding unit, select the corresponding filter coefficient group according to the size of the current decoding unit, and then filter each pixel in the prediction block to obtain the final prediction block;
e2) Superpose the restored residual information on the prediction block to obtain a reconstructed block of the current decoding unit, which is output after post-processing.
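To make the control flow of steps a2) to e2) concrete, the following is a minimal sketch in Python, assuming 8-bit samples held in numpy arrays; the parameter smooth stands in for whichever filter and coefficient group step d2) would select, and all names are illustrative rather than normative.

    import numpy as np

    def reconstruct_with_ipf(pred: np.ndarray, residual: np.ndarray,
                             ipf_allowed: bool, ipf_used: bool,
                             smooth) -> np.ndarray:
        """Steps c2)-e2): optionally filter the prediction block, then
        superpose the restored time-domain residual."""
        if ipf_allowed and ipf_used:
            # d2) in the real scheme, the filter is chosen by prediction-mode
            # class and its coefficient group by decoding-unit size
            pred = smooth(pred)
        rec = pred.astype(np.int32) + residual        # e2) superposition
        return np.clip(rec, 0, 255).astype(np.uint8)  # assumes 8-bit video

For example, reconstruct_with_ipf(pred, residual, True, False, smooth=lambda p: p) reproduces the unfiltered path taken when the usage flag is '0'.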
it should be understood that other structural variations of the video decoder 200 may be used to decode the encoded video bitstream. For example, the video decoder 200 may generate an output video stream without processing by the filter unit 206; alternatively, for some image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients and accordingly does not need to be processed by the inverse quantizer 204 and the inverse transformer 205.
In intra prediction, the existing IPF technique effectively improves coding efficiency and greatly strengthens the spatial correlation of intra prediction: it alleviates the problem that the intra prediction process uses only a single reference pixel row or column while ignoring the influence of other pixels on the prediction value. However, when the intra prediction process requires smoothing, neither the IPF technique nor the current intra prediction modes handle this well; pixel-by-pixel filtering based on reference pixels can improve the correlation between the prediction block and the reference block, but cannot solve the smoothing problem inside the prediction block.
A prediction block calculated from a single prediction mode usually performs well in images with clear texture, and the residual becomes smaller and smaller, improving coding efficiency. In image blocks with blurred texture, however, an overly sharp prediction may enlarge the residual, degrading the prediction and reducing coding efficiency.
In view of the above problems, the embodiments of the present application directly filter the prediction block for image blocks that need smoothing; for ease of understanding and distinction, this filtering technique is hereinafter referred to as the intra-prediction smoothing filtering technique.
The following detailed description is made with reference to the accompanying drawings.
Fig. 9 is a flowchart illustrating an image encoding method in an embodiment of the present application, where the image encoding method can be applied to the source device 10 in the video decoding system 1 shown in fig. 6 or the video encoder 100 shown in fig. 7. The flow shown in fig. 9 is described taking the video encoder 100 shown in fig. 7 as the execution subject. As shown in fig. 9, an image encoding method provided in an embodiment of the present application includes:
step 110, dividing the image, and determining coding information of a current coding block, wherein the coding information comprises first indication information and second indication information, the first indication information is used for indicating whether intra-prediction filtering is allowed, and the second indication information is used for indicating whether intra-prediction smoothing filtering is allowed;
step 120, determining an optimal prediction mode of the current coding block according to the coding information, setting third indication information according to the optimal prediction mode, and transmitting the optimal prediction mode index and the third indication information through a code stream;
and step 130, superposing the prediction block of the current coding block and the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
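Purely as an illustration, the three kinds of indication information carried in steps 110 and 120 can be pictured as a small per-block record; the field names below are hypothetical and do not correspond to normative syntax elements.

    from dataclasses import dataclass

    @dataclass
    class CodingInfo:
        intra_filter_allowed: bool        # first indication information
        intra_smooth_allowed: bool        # second indication information
        intra_smooth_used: bool = False   # third indication information,
                                          # set once the optimal mode is chosen
        best_mode_index: int = -1         # optimal prediction mode index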
The filtering technical scheme 1 is specifically implemented in the intra prediction part at the encoding end as follows:
the encoder acquires encoding information including an intra-frame prediction filtering allowable identification bit, an intra-frame prediction smoothing filtering allowable identification bit and the like, divides an image into a plurality of CTUs after acquiring image information, further divides the image into a plurality of CUs, and performs intra-frame prediction on each independent CU. It should be noted that the minimum processing unit may also be a custom coding block, and the CU is only an example and is not limited to the only example.
In the intra prediction process:
(1) if the intra prediction smoothing filter allowed flag is '1' and the area of the current CU is greater than or equal to 64 and less than 2048, all of the following steps are performed;
(2) if the intra prediction smoothing filter allowed flag is '0', only steps a3), b3), f3) and g3) are performed;
a3) The current coding unit traverses all intra prediction modes, calculates the prediction block under each prediction mode, and calculates the rate-distortion cost information of the current prediction mode from the original pixel block;
b3) Select the optimal prediction mode of the current coding unit according to the principle of minimizing the rate-distortion cost over all prediction modes, and record the optimal prediction mode information and its corresponding rate-distortion cost information;
c3) Traverse all intra prediction modes again with the intra prediction smoothing filtering technique enabled: first calculate the prediction pixels under each intra prediction mode to obtain the prediction block of the current coding unit;
d3) Perform intra prediction smoothing filtering on all pixels in the prediction block of the current coding unit to obtain the final prediction block;
e3) Calculate the rate-distortion cost information of each prediction mode from the final prediction pixels obtained by the intra prediction smoothing filtering technique and the original pixels, and record the prediction mode with the minimum rate-distortion cost and its corresponding cost value;
f3) If the intra prediction smoothing filter allowed flag is '0', transmit the prediction mode index recorded in b3) to the decoding end through the code stream;
if the intra prediction smoothing filter allowed flag is '1', compare the minimum cost value recorded in b3) with the minimum cost value recorded in e3):
if the rate-distortion cost in b3) is lower, encode the prediction mode index recorded in b3) as the optimal prediction mode of the current coding unit and transmit it to the decoding end through the code stream; the intra prediction smoothing filter usage flag of the current coding unit is set to '0', indicating that the intra prediction smoothing filtering technique is not used, and is likewise transmitted to the decoding end through the code stream;
if the rate-distortion cost in e3) is lower, encode the prediction mode index recorded in e3) as the optimal prediction mode of the current coding unit and transmit it to the decoding end through the code stream; the intra prediction smoothing filter usage flag of the current coding unit is set to '1', indicating that the intra prediction smoothing filtering technique is used, and is likewise transmitted to the decoding end through the code stream.
g3) Superpose the prediction block with the residual after inverse transformation and inverse quantization to obtain a reconstructed coding unit block, which serves as a prediction reference block for the next coding unit.
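A minimal sketch of the mode decision in steps b3), e3) and f3) follows, assuming the candidate prediction blocks are given as numpy arrays keyed by mode index and using plain SSE as a stand-in for the true rate-distortion cost; all names are illustrative.

    import numpy as np

    def sse(orig: np.ndarray, pred: np.ndarray) -> float:
        """Sum of squared errors, a distortion-only proxy for the RD cost."""
        return float(np.sum((orig.astype(np.int64) - pred.astype(np.int64)) ** 2))

    def decide_mode(orig, plain_preds, smoothed_preds, smooth_allowed):
        """plain_preds / smoothed_preds: dicts {mode_index: prediction block}.
        Returns (optimal mode index, smoothing-filter usage flag)."""
        best_plain = min(plain_preds, key=lambda m: sse(orig, plain_preds[m]))   # b3)
        if not smooth_allowed:
            return best_plain, 0                                                 # f3)
        best_smooth = min(smoothed_preds,
                          key=lambda m: sse(orig, smoothed_preds[m]))            # e3)
        if sse(orig, plain_preds[best_plain]) <= sse(orig, smoothed_preds[best_smooth]):
            return best_plain, 0   # usage flag '0': smoothing not used
        return best_smooth, 1      # usage flag '1': smoothing used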
The filtering technical scheme 2 is specifically implemented in the intra prediction part at the encoding end as follows:
the encoder acquires encoding information including an intra-frame prediction filtering allowable identification bit, an intra-frame prediction smoothing filtering allowable identification bit and the like, divides an image into a plurality of CTUs after acquiring image information, further divides the image into a plurality of CUs, and performs intra-frame prediction on each independent CU.
In the intra prediction process:
(1) if the intra prediction smoothing filter allowed flag is '1' and the area of the current CU is greater than or equal to 64 and less than 2048, all of the following steps are performed;
(2) if the intra prediction smoothing filter allowed flag is '0', only steps a4), b4), f4) and g4) are performed;
a4) The current coding unit traverses all intra prediction modes, calculates the prediction block under each prediction mode, and calculates the rate-distortion cost information of the current prediction mode from the original pixel block;
b4) Select the optimal prediction mode of the current coding unit according to the principle of minimizing the rate-distortion cost over all prediction modes, and record the optimal prediction mode information and its corresponding rate-distortion cost information;
c4) Traverse all intra prediction modes again with the intra prediction smoothing filtering technique enabled: first calculate the prediction pixels under each intra prediction mode to obtain the prediction block of the current coding unit;
d4) Filter the prediction block of the current coding unit twice: the first pass performs intra prediction smoothing filtering on all pixels in the prediction block, and the second pass performs intra prediction smoothing filtering on the left-boundary and upper-boundary pixels of the filtered prediction block, obtaining the final prediction block (a sketch of this two-pass filtering follows the step list);
e4) Calculate the rate-distortion cost information of each prediction mode from the final prediction pixels obtained by the intra prediction smoothing filtering technique and the original pixels, and record the prediction mode with the minimum rate-distortion cost and its corresponding cost value;
f4) If the intra prediction smoothing filter allowed flag is '0', transmit the prediction mode index recorded in b4) to the decoding end through the code stream;
if the intra prediction smoothing filter allowed flag is '1', compare the minimum cost value recorded in b4) with the minimum cost value recorded in e4):
if the rate-distortion cost in b4) is lower, encode the prediction mode index recorded in b4) as the optimal prediction mode of the current coding unit and transmit it to the decoding end through the code stream; the intra prediction smoothing filter usage flag of the current coding unit is set to '0', indicating that the intra prediction smoothing filtering technique is not used, and is likewise transmitted to the decoding end through the code stream;
if the rate-distortion cost in e4) is lower, encode the prediction mode index recorded in e4) as the optimal prediction mode of the current coding unit and transmit it to the decoding end through the code stream; the intra prediction smoothing filter usage flag of the current coding unit is set to '1', indicating that the intra prediction smoothing filtering technique is used, and is likewise transmitted to the decoding end through the code stream.
g4) Superpose the prediction block with the residual after inverse transformation and inverse quantization to obtain a reconstructed coding unit block, which serves as a prediction reference block for the next coding unit.
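The two-pass filtering of step d4) can be sketched as follows, assuming a generic integer smoothing kernel whose coefficients sum to 2**shift. Treating the second pass as "filter again, keep only the first row and column" matches re-filtering the left- and upper-boundary pixels of the already-filtered block; a boundary width of one pixel and the edge-replicated padding are assumptions standing in for the pixel-filling schemes described later.

    import numpy as np

    def conv_smooth(block: np.ndarray, kernel: np.ndarray, shift: int) -> np.ndarray:
        """One smoothing pass: integer convolution normalised by a right shift."""
        r = kernel.shape[0] // 2
        padded = np.pad(block.astype(np.int64), r, mode='edge')
        h, w = block.shape
        out = np.empty((h, w), dtype=np.int64)
        for y in range(h):
            for x in range(w):
                out[y, x] = int(np.sum(padded[y:y + 2*r + 1,
                                              x:x + 2*r + 1] * kernel)) >> shift
        return out

    def two_pass_smooth(pred, kernel, shift):
        once = conv_smooth(pred, kernel, shift)    # first pass: all pixels
        again = conv_smooth(once, kernel, shift)   # second pass, computed on the
        once[0, :] = again[0, :]                   # filtered block; keep only the
        once[:, 0] = again[:, 0]                   # top row and left column
        return once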
Corresponding to the image encoding method described in fig. 9, fig. 10 is a flowchart illustrating an image decoding method in an embodiment of the present application, which can be applied to the destination device 20 in the video decoding system 1 shown in fig. 6 or the video decoder 200 shown in fig. 8. The flow shown in fig. 10 is described taking the video decoder 200 shown in fig. 8 as the execution subject. As shown in fig. 10, an image decoding method provided in an embodiment of the present application includes:
step 210, parsing a code stream, and determining decoding information of a current decoded block, where the decoding information includes first indication information and third indication information, the first indication information is used to indicate whether intra-frame prediction filtering is allowed, and the third indication information is used to indicate whether intra-frame prediction smoothing filtering is used;
step 220, determining a prediction block of the current decoding block according to the first indication information and the third indication information;
and step 230, overlapping the restored residual information with the prediction block to obtain a reconstructed block of the current decoding unit.
The specific flow of intra-frame prediction at the decoding end in filtering technical scheme 1 is as follows:
the decoder obtains a code stream, analyzes the code stream to obtain an intra-frame prediction smooth filtering allowable identification bit of the current video sequence, analyzes the code stream and carries out inverse transformation and inverse quantization on the obtained residual error information.
In the intra prediction decoding process:
(1) if the intra prediction smoothing filter allowed flag is '1' and the area of the current CU is greater than or equal to 64 and less than 2048, all of the following steps are performed;
(2) if the intra prediction smoothing filter allowed flag is '0', only steps a5), b5) and e5) are performed:
a5) Obtain and decode the code stream to obtain the residual information, and obtain the time-domain residual information through the inverse transformation and inverse quantization processes;
b5) Parse the code stream to obtain the prediction mode of the current decoding unit, and calculate the prediction block from the prediction mode of the current decoding unit and the neighboring reconstructed blocks;
c5) Parse the intra prediction smoothing filter usage flag:
if the usage flag of the intra prediction smoothing filtering is '0', no additional operation is performed on the current prediction block;
if the usage flag of the intra prediction smoothing filtering is '1', perform d5);
d5) Filter the input prediction block with the intra prediction smoothing filter to obtain the filtered prediction block of the current decoding unit;
e5) Superpose the restored residual information on the prediction block to obtain a reconstructed block of the current decoding unit, which is output after post-processing.
The specific flow of intra-frame prediction at the decoding end in filtering technical scheme 2 is as follows:
the decoder obtains a code stream, analyzes the code stream to obtain an intra-frame prediction smooth filtering allowable identification bit of the current video sequence, analyzes the code stream and performs inverse transformation and inverse quantization on the obtained residual error information.
In the intra prediction decoding process:
(1) if the intra prediction smoothing filter allowed flag is '1' and the area of the current CU is greater than or equal to 64 and less than 2048, all of the following steps are performed;
(2) if the intra prediction smoothing filter allowed flag is '0', only steps a6), b6) and e6) are performed:
a6) Obtain and decode the code stream to obtain the residual information, and obtain the time-domain residual information through the inverse transformation and inverse quantization processes;
b6) Parse the code stream to obtain the prediction mode of the current decoding unit, and calculate the prediction block from the prediction mode of the current decoding unit and the neighboring reconstructed blocks;
c6) Parse the intra prediction smoothing filter usage flag:
if the usage flag of the intra prediction smoothing filtering is '0', no additional operation is performed on the current prediction block;
if the usage flag of the intra prediction smoothing filtering is '1', perform d6);
d6) Filter the input prediction block twice: the first pass performs intra prediction smoothing filtering on all prediction pixels in the prediction block, and the second pass performs intra prediction smoothing filtering on the left-boundary and upper-boundary pixels of the filtered prediction block, obtaining the filtered prediction block of the current decoding unit;
e6) Superpose the restored residual information on the prediction block to obtain a reconstructed block of the current decoding unit, which is output after post-processing.
When the intra prediction filtering technique filters the current coding unit or decoding unit, the current block needs to be padded first; the following three padding schemes are possible.
Pixel filling scheme 1: use reconstructed pixels for filling where they are available, otherwise use predicted pixels:
a7) If the reference pixels on the left and upper sides outside the current prediction block are available, i.e., there are reconstructed pixels on the left and upper sides, the two columns on the left and the two rows on the upper side outside the prediction block are filled with reconstructed pixels;
b7) If the reference pixels on the left or upper side outside the current prediction block are not available, i.e., there are no reconstructed pixels on that side, the side without reconstructed pixels fills its outer two rows or two columns with the row or column closest to that side inside the current prediction block;
c7) For the two adjacent columns on the right outside the current prediction block, fill the outer two columns with the rightmost column of prediction values inside the current prediction block;
d7) For the two adjacent rows below the current prediction block, fill the outer two rows with the bottom row of prediction values inside the current prediction block;
e7) The pixels in the upper-right corner outside the current prediction block are filled with the rightmost filled pixels on the upper side outside the current prediction block; the pixels in the lower-right corner outside the current prediction block are filled with the rightmost filled pixels on the lower side outside the current prediction block; and the pixels in the lower-left corner outside the current prediction block are filled with the bottom-most filled pixels on the left side outside the current prediction block.
Fig. 11A gives a schematic diagram of a prediction block under padding scheme 1, where pred.pixel denotes the pixels of the prediction block and recon.pixel denotes the filled reconstructed pixels.
Pixel filling scheme 2: all padding uses predicted pixels:
a8) For the two adjacent rows above the current prediction block, fill the outer two rows with the top row of prediction values inside the current prediction block;
b8) For the two adjacent columns on the left outside the current prediction block, fill the outer two columns with the leftmost column of prediction values inside the current prediction block;
c8) For the two adjacent columns on the right outside the current prediction block, fill the outer two columns with the rightmost column of prediction values inside the current prediction block;
d8) For the two adjacent rows below the current prediction block, fill the outer two rows with the bottom row of prediction values inside the current prediction block;
e8) The pixels in the upper-right corner outside the current prediction block are filled with the rightmost filled pixels on the upper side outside the current prediction block; the pixels in the lower-right corner outside the current prediction block are filled with the rightmost filled pixels on the lower side outside the current prediction block; and the pixels in the lower-left corner outside the current prediction block are filled with the bottom-most filled pixels on the left side outside the current prediction block.
Fig. 11B gives a schematic diagram of a prediction block under padding scheme 2, where pred.pixel denotes the pixels of the prediction block and recon.pixel denotes the filled pixels.
Pixel filling scheme 3: where reconstructed pixels are available, fill the two rows or two columns with the closest single row or column of reconstructed pixels; otherwise use predicted pixels:
a9) If the reference pixels on the left and upper sides outside the current prediction block are available, i.e., there are reconstructed pixels on the left and upper sides, the two rows or two columns outside the prediction block that need filling are filled with the first column of adjacent reconstructed pixels on the left and the first row of adjacent reconstructed pixels above the prediction block, respectively;
b9) If the reference pixels on the left or upper side outside the current prediction block are not available, i.e., there are no reconstructed pixels on that side, the side without reconstructed pixels fills its outer two rows or two columns with the row or column closest to that side inside the current prediction block, while any side with reconstructed pixels fills its two rows or two columns with the first row or first column of those reconstructed pixels;
c9) For the two adjacent columns on the right outside the current prediction block, fill the outer two columns with the rightmost column of prediction values inside the current prediction block;
d9) For the two adjacent rows below the current prediction block, fill the outer two rows with the bottom row of prediction values inside the current prediction block;
e9) The pixels in the upper-right corner outside the current prediction block are filled with the rightmost filled pixels on the upper side outside the current prediction block; the pixels in the lower-right corner outside the current prediction block are filled with the rightmost filled pixels on the lower side outside the current prediction block; and the pixels in the lower-left corner outside the current prediction block are filled with the bottom-most filled pixels on the left side outside the current prediction block.
Fig. 11C gives a schematic diagram of a padding scheme 3 prediction block.
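As a sketch, pixel-filling scheme 2 (predicted pixels only) is exactly two-pixel edge replication, and a scheme-1-style fill can be derived from it by overwriting the left and upper borders with reconstructed neighbour pixels where available; the argument shapes below are assumptions made for illustration.

    import numpy as np

    def pad_scheme2(pred: np.ndarray) -> np.ndarray:
        """Scheme 2: fill two rows/columns on every side from the nearest row
        or column of predicted values; corners follow from the filled edges."""
        return np.pad(pred, 2, mode='edge')

    def pad_scheme1(pred, top_rec=None, left_rec=None):
        """Scheme 1 flavour: start from predicted-pixel padding, then use
        reconstructed pixels on the sides where they exist.
        top_rec: 2 x w reconstructed rows above; left_rec: h x 2 columns."""
        out = np.pad(pred, 2, mode='edge')
        h, w = pred.shape
        if top_rec is not None:
            out[0:2, 2:2 + w] = top_rec
        if left_rec is not None:
            out[2:2 + h, 0:2] = left_rec
        return out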
The intra-prediction filtering techniques described above employ a simplified Gaussian convolution kernel to filter the prediction block. Several filter schemes are proposed here, each providing a set of filter coefficients.
Filter scheme 1, 5 x 5 size with 25 taps
The approximate Gaussian convolution kernel is the symmetric 5 x 5 matrix below (reconstructed from the filtering formula that follows):

c1 c2 c3 c2 c1
c2 c4 c5 c4 c2
c3 c5 c6 c5 c3
c2 c4 c5 c4 c2
c1 c2 c3 c2 c1
Each predicted pixel in the prediction block is filtered; the filtering formula is as follows:
P′(x,y) = c1·P(x-2,y-2) + c2·P(x-1,y-2) + c3·P(x,y-2) + c2·P(x+1,y-2) + c1·P(x+2,y-2)
        + c2·P(x-2,y-1) + c4·P(x-1,y-1) + c5·P(x,y-1) + c4·P(x+1,y-1) + c2·P(x+2,y-1)
        + c3·P(x-2,y) + c5·P(x-1,y) + c6·P(x,y) + c5·P(x+1,y) + c3·P(x+2,y)
        + c2·P(x-2,y+1) + c4·P(x-1,y+1) + c5·P(x,y+1) + c4·P(x+1,y+1) + c2·P(x+2,y+1)
        + c1·P(x-2,y+2) + c2·P(x-1,y+2) + c3·P(x,y+2) + c2·P(x+1,y+2) + c1·P(x+2,y+2)
In the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit; c1, c2, c3, c4, c5 and c6 are the approximate Gaussian convolution kernel coefficients, where c1 is 0.0030, c2 is 0.0133, c3 is 0.0219, c4 is 0.0596, c5 is 0.0983 and c6 is 0.1621. P(x,y), P(x-1,y-1) and so on are the prediction values at positions (x,y) and (x-1,y-1) of the current coding unit, where the value ranges of x and y do not exceed the width and height of the current coding unit block.
The convolution kernel coefficients adopted by the intra prediction filtering technique can be approximated by integers whose sum is a power of 2; this avoids floating-point computation and division on a computer and greatly reduces computational complexity, as shown below:
(Integerized 5 x 5 convolution kernel of filter scheme 1; the 25 integer coefficients sum to 1024.)
the sum of the filter coefficients is 1024, i.e. the calculated prediction value needs to be shifted to the right by 10 bits.
Filter scheme 2, 5 x 5 size with 13 taps
Each prediction pixel in the prediction block is filtered. The adopted filter is a diamond filter of size 5 x 5 with 13 taps; to save computing resources and avoid computational complexity, all filter kernel coefficients are integers and their sum is a power of 2. The filter kernel is as follows:
The 13-tap diamond kernel (reconstructed from the filtering formula and coefficient values below) is:

 0  0 13  0  0
 0 18 25 18  0
13 25 32 25 13
 0 18 25 18  0
 0  0 13  0  0
the sum of the filter coefficients is 256, i.e. the calculated prediction value needs to be shifted to the right by 8 bits.
The filter formula is as follows:
P′(x,y) = c1·P(x,y-2)
        + c2·P(x-1,y-1) + c3·P(x,y-1) + c2·P(x+1,y-1)
        + c1·P(x-2,y) + c3·P(x-1,y) + c4·P(x,y) + c3·P(x+1,y) + c1·P(x+2,y)
        + c2·P(x-1,y+1) + c3·P(x,y+1) + c2·P(x+1,y+1)
        + c1·P(x,y+2)
In the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit; c1, c2, c3 and c4 are the coefficients of the approximate Gaussian convolution kernel in the filter, where c1 is 13, c2 is 18, c3 is 25 and c4 is 32. P(x,y), P(x-1,y-1) and so on are the prediction values at positions (x,y) and (x-1,y-1) of the current coding unit, where the value ranges of x and y do not exceed the width and height of the current coding unit block.
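Since filter scheme 2 is fully specified by its coefficient values, it can be written out directly. The following is a minimal sketch assuming the prediction block has already been extended by two rows and two columns per one of the pixel-filling schemes; a plain right shift is used for normalisation, as stated above, though a real implementation might add a rounding offset before the shift.

    import numpy as np

    # 13-tap diamond kernel of filter scheme 2; coefficients sum to 256,
    # so results are normalised by a right shift of 8 bits.
    DIAMOND_13 = np.array([[ 0,  0, 13,  0,  0],
                           [ 0, 18, 25, 18,  0],
                           [13, 25, 32, 25, 13],
                           [ 0, 18, 25, 18,  0],
                           [ 0,  0, 13,  0,  0]], dtype=np.int64)

    def filter_scheme2(padded: np.ndarray) -> np.ndarray:
        """padded: (h+4) x (w+4) extended prediction block; returns h x w."""
        h, w = padded.shape[0] - 4, padded.shape[1] - 4
        out = np.empty((h, w), dtype=np.int64)
        for y in range(h):
            for x in range(w):
                out[y, x] = int(np.sum(padded[y:y+5, x:x+5] * DIAMOND_13)) >> 8
        return out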
Filter scheme 3, 5 x 5 size with 21 taps
Each prediction pixel in the prediction block is filtered. The adopted filter is an octagonal filter of size 5 x 5 with 21 taps; to save computing resources and avoid computational complexity, all filter kernel coefficients are integers and their sum is a power of 2. The filter kernel is as follows:
The 21-tap octagonal kernel (reconstructed from the filtering formula and coefficient values below) is:

  0  24  27  24   0
 24  62  88  62  24
 27  88 124  88  27
 24  62  88  62  24
  0  24  27  24   0
the sum of the filter coefficients is 1024, i.e. the calculated prediction value needs to be shifted to the right by 10 bits.
The filter formula is as follows:
P′(x,y) = c1·P(x-1,y-2) + c2·P(x,y-2) + c1·P(x+1,y-2)
        + c1·P(x-2,y-1) + c3·P(x-1,y-1) + c4·P(x,y-1) + c3·P(x+1,y-1) + c1·P(x+2,y-1)
        + c2·P(x-2,y) + c4·P(x-1,y) + c5·P(x,y) + c4·P(x+1,y) + c2·P(x+2,y)
        + c1·P(x-2,y+1) + c3·P(x-1,y+1) + c4·P(x,y+1) + c3·P(x+1,y+1) + c1·P(x+2,y+1)
        + c1·P(x-1,y+2) + c2·P(x,y+2) + c1·P(x+1,y+2)
In the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit; c1, c2, c3, c4 and c5 are the approximate Gaussian convolution kernel coefficients, where c1 is 24, c2 is 27, c3 is 62, c4 is 88 and c5 is 124. P(x,y), P(x-1,y-1) and so on are the prediction values at positions (x,y) and (x-1,y-1) of the current coding unit, where the value ranges of x and y do not exceed the width and height of the current coding unit block.
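For completeness, the octagonal kernel of filter scheme 3 can be assembled from the formula above; it plugs into the same convolution loop as the scheme-2 sketch, substituting a shift of 10 since its coefficients sum to 1024. The layout below is a reconstruction, not a figure from the original.

    import numpy as np

    OCTAGON_21 = np.array([[ 0, 24, 27, 24,  0],
                           [24, 62, 88, 62, 24],
                           [27, 88, 124, 88, 27],
                           [24, 62, 88, 62, 24],
                           [ 0, 24, 27, 24,  0]], dtype=np.int64)

    assert int(OCTAGON_21.sum()) == 1024  # consistent with the '>> 10' normalisation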
The technique provided by the application applies to the intra prediction encoding and decoding stages. It offers a choice for operations such as smoothing or local blurring of intra prediction: for image regions whose texture should not be too sharp, using this technique makes the prediction pixels smoother and the prediction block closer to the original image, ultimately improving coding efficiency.
The combination of filtering scheme 1, pixel filling scheme 1 and filter scheme 1 was tested on HPM8.0, the official AVS simulation platform, with smoothing filtering applied to the intra prediction block; the test results under the all-intra test condition and the random access condition are shown in Table 2.
TABLE 2 All Intra test results
Class                 Y         U         V
4K                 -0.62%    -0.51%    -0.70%
1080P              -0.41%    -0.05%    -0.17%
720P               -0.20%    -0.97%    -0.26%
Average            -0.41%    -0.51%    -0.38%
As can be seen from Table 2, this scheme achieves a good performance improvement under both test conditions.
Under the AI test condition, the luma component obtains a 0.41% BDBR saving, and the U and V components obtain 0.51% and 0.38% BDBR savings respectively; the performance gain is evident, effectively improving the coding efficiency of the encoder.
In terms of resolution, the scheme brings a larger coding performance improvement for 4K video, which benefits the development of future ultra-high-definition video and saves more bit rate and bandwidth for ultra-high-resolution video.
In this scheme, smoothing filtering is applied during intra prediction to the prediction block calculated by the intra prediction mode, improving intra prediction accuracy and effectively improving coding efficiency. Specifically:
1. Two intra prediction smoothing filtering schemes are provided: the first applies the smoothing filtering technique to all pixels of the intra prediction block; the second uses two filtering passes, the first pass applying the smoothing filter to all pixels in the prediction block and the second pass applying it only to the boundaries;
2. Three schemes for filling pixels outside the smoothing filter boundary are provided: in the first, if the reconstructed pixels of neighboring blocks are available, they fill the outer two rows or two columns; the second fills the outer two rows or two columns with the nearest boundary row or column within the prediction block; in the third, if the reconstructed pixels of the neighboring block are available, the nearest single row or column of reconstructed pixels fills the outer two rows or two columns, and on the side where reconstructed pixels are unavailable the nearest boundary row or column within the prediction block is used;
3. Three filter schemes are proposed, each providing a specific set of filter coefficients: the first is a 25-tap square filter of size 5 x 5; the second is a 13-tap diamond filter of size 5 x 5; the third is a 21-tap octagonal filter of size 5 x 5.
The present application can be further extended from the following directions.
Direction 1: adjust the shape of the filter and the number of taps in the filter schemes, reducing the number of taps to lower the computational burden;
Direction 2: adjust the filter coefficients in the filter schemes, filtering the prediction block with asymmetric coefficients to improve coding efficiency;
Direction 3: filter the prediction blocks of different prediction modes with different filters, improving the filtering effect on different textures through different frequency characteristics and thus improving coding efficiency.
The embodiment of the application provides an image encoding device, which may be a video encoder or a video encoding apparatus. In particular, the image encoding device is configured to perform the steps performed by the video encoder in the above encoding method. The image encoding device provided by the embodiment of the application may include modules corresponding to the respective steps.
The image encoding device according to the embodiment of the present application may be divided into functional modules according to the method examples described above, for example, each functional module may be divided for each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
Fig. 12 is a schematic diagram showing a possible configuration of the image encoding apparatus according to the above embodiment, in a case where each functional module is divided in correspondence with each function. As shown in fig. 12, the image encoding device 12 includes a dividing unit 120, a determining unit 121, and an superimposing unit 122.
A dividing unit 120, configured to divide an image, and determine coding information of a current coding block, where the coding information includes first indication information and second indication information, the first indication information is used to indicate whether intra prediction filtering is allowed, and the second indication information is used to indicate whether intra prediction smoothing filtering is allowed;
a determining unit 121, configured to determine the optimal prediction mode of the current coding block according to the coding information, set third indication information according to the optimal prediction mode, and transmit the optimal prediction mode index and the third indication information via a code stream;
and a superposition unit 122, configured to superpose the prediction block of the current coding block and a residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
All relevant contents of the steps related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. Of course, the image encoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules, for example: the image encoding apparatus may further include a storage unit. The storage unit may be used to store program codes and data of the image encoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of an image encoding device provided in an embodiment of the present application is shown in fig. 13. In fig. 13, the image encoding device 13 includes: a processing module 130 and a communication module 131. The processing module 130 is used for controlling and managing actions of the image encoding apparatus, for example, performing steps performed by the dividing unit 120, the determining unit 121, the superimposing unit 122, and/or other processes for performing the techniques described herein. The communication module 131 is used to support interaction between the image encoding apparatus and other devices. As shown in fig. 13, the image encoding apparatus may further include a storage module 132, and the storage module 132 is used for storing program codes and data of the image encoding apparatus, for example, contents stored in the storage unit.
The Processing module 130 may be a Processor or a controller, and may be, for example, a Central Processing Unit (CPU), a general-purpose Processor, a Digital Signal Processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein. The processor may also be a combination of computing functions, e.g., comprising one or more microprocessors, DSPs, and microprocessors, and the like. The communication module 131 may be a transceiver, an RF circuit or a communication interface, etc. The storage module 132 may be a memory.
All relevant contents of each scene related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. The image coding device can execute the image coding method, and the image coding device can be specifically a video image coding device or other equipment with a video coding function.
The application also provides a video encoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image encoding method of the embodiment of the application.
The embodiment of the application provides an image decoding device, which may be a video decoder or a video decoding apparatus. Specifically, the image decoding apparatus is configured to perform the steps performed by the video decoder in the above decoding method. The image decoding device provided by the embodiment of the application may include modules corresponding to the respective steps.
The image decoding device according to the embodiment of the present application may perform division of function modules according to the method example described above, for example, each function module may be divided according to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
Fig. 14 shows a schematic diagram of a possible structure of the image decoding apparatus according to the above-described embodiment, in a case where each functional module is divided in correspondence with each function. As shown in fig. 14, the image decoding device 14 includes a parsing unit 140, a determination unit 141, and a superimposition unit 142.
A parsing unit 140, configured to parse the code stream, and determine decoding information of the current decoded block, where the decoding information includes first indication information and third indication information, the first indication information is used to indicate whether intra-prediction filtering is allowed, and the third indication information is used to indicate whether intra-prediction smoothing filtering is used;
a determining unit 141, configured to determine a prediction block of the current decoded block according to the first indication information and the third indication information;
and a superposition unit 142, configured to superpose the restored residual information on the prediction block to obtain a reconstructed block of the current decoding unit.
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. Of course, the image decoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules, for example: the image decoding apparatus may further include a storage unit. The storage unit may be used to store program codes and data of the image decoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of an image decoding apparatus provided in an embodiment of the present application is shown in fig. 15. In fig. 15, the image decoding apparatus includes: a processing module 150 and a communication module 151. The processing module 150 is used for controlling and managing actions of the image decoding apparatus, for example, performing steps performed by the parsing unit 140, the determining unit 141, and the superimposing unit 142, and/or other processes for performing the techniques described herein. The communication module 151 is used to support interaction between the image decoding apparatus and other devices. As shown in fig. 15, the image decoding apparatus may further include a storage module 152, and the storage module 152 is used for storing program codes and data of the image decoding apparatus, for example, storing contents stored in the storage unit.
The Processing module 150 may be a Processor or a controller, and may be, for example, a Central Processing Unit (CPU), a general-purpose Processor, a Digital Signal Processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor may also be a combination of computing functions, e.g., comprising one or more microprocessors, DSPs, and microprocessors, among others. The communication module 151 may be a transceiver, an RF circuit or a communication interface, etc. The storage module 152 may be a memory.
All relevant contents of each scene related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. The image decoding device can execute the image decoding method, and the image decoding device can be specifically a video image decoding device or other equipment with a video decoding function.
The application further provides a video decoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image decoding method of the embodiment of the application.
The present application further provides a terminal, including: one or more processors, memory, a communication interface. The memory, communication interface and one or more processors; the memory is used for storing computer program code comprising instructions which, when executed by the one or more processors, cause the terminal to perform the image encoding and/or image decoding methods of embodiments of the present application. The terminal can be a video display device, a smart phone, a portable computer and other devices which can process video or play video.
Another embodiment of the present application also provides a computer-readable storage medium including one or more program codes, where the one or more programs include instructions, and when a processor in a decoding apparatus executes the program codes, the decoding apparatus executes an image encoding method and an image decoding method of the embodiments of the present application.
In another embodiment of the present application, there is also provided a computer program product comprising computer executable instructions stored in a computer readable storage medium; the at least one processor of the decoding device may read the computer executable instructions from the computer readable storage medium, and the execution of the computer executable instructions by the at least one processor causes the terminal to implement the image encoding method and the image decoding method of the embodiments of the present application.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware or any combination thereof. When implemented using a software program, may take the form of a computer program product, either entirely or partially. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the application to occur, in whole or in part.
The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.).
The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid State Disk (SSD)), among others.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing over the prior art, may be embodied in whole or in part in the form of a software product, where the software product is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor (processor) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope disclosed in the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (14)

1. An image encoding method, comprising:
dividing the image, and determining coding information of a current coding block, wherein the coding information comprises first indication information and second indication information, the first indication information is used for indicating whether intra-prediction filtering is allowed or not, and the second indication information is used for indicating whether intra-prediction smoothing filtering is allowed or not;
calculating a prediction block of each prediction mode in all intra-frame prediction modes corresponding to the current coding block, and calculating first rate-distortion cost information of each prediction mode according to an original pixel block; determining a prediction mode with the minimum rate distortion cost value as a first preferred prediction mode of the current coding block according to the first rate distortion cost information;
if the second indication information is used for indicating that the intra-frame prediction smoothing filtering is allowed, calculating a prediction block of each prediction mode in all intra-frame prediction modes corresponding to the current coding block according to the intra-frame prediction smoothing filtering; processing the prediction block according to the intra-frame prediction smoothing filtering to obtain a final prediction block; calculating to obtain second rate distortion cost information of each prediction mode according to the final prediction block and the original pixel block; determining a prediction mode with the minimum rate distortion cost value as a second preferred prediction mode of the current coding block according to the second rate distortion cost information; determining the optimal prediction mode of the current coding block according to the magnitude relation between the rate-distortion cost value corresponding to the first optimal prediction mode and the rate-distortion cost value corresponding to the second optimal prediction mode; setting third indication information according to the optimal prediction mode, and transmitting the optimal prediction mode index and the third indication information through a code stream;
and superposing the prediction block of the current coding block and a residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
2. The method of claim 1, wherein the processing the prediction block according to the intra-prediction smoothing filtering to obtain a final prediction block comprises:
and carrying out intra-frame prediction smoothing filtering on all pixels in the prediction block of the current coding block to obtain the final prediction block.
3. The method of claim 1, wherein the processing the prediction block according to the intra-prediction smoothing filtering to obtain a final prediction block comprises:
and filtering the prediction block of the current coding block twice, wherein all pixels in the prediction block are subjected to intra-frame prediction smooth filtering for the first time, and left boundary and upper boundary pixels in the prediction block after filtering are subjected to intra-frame prediction smooth filtering for the second time to obtain the final prediction block.
4. The method of claim 1,
and if the second indication information is used for indicating that intra-frame prediction smooth filtering is not allowed, determining the first preferred prediction mode as an optimal prediction mode, and transmitting the optimal prediction mode index through a code stream.
5. An image decoding method, comprising:
parsing a code stream and determining decoding information of a current decoded block, wherein the decoding information comprises first indication information and third indication information, the first indication information indicating whether intra-frame prediction filtering is allowed, and the third indication information indicating whether intra-frame prediction smoothing filtering is used;
determining a prediction mode of the current decoded block, and calculating a prediction block from the prediction mode of the current decoded block and adjacent reconstructed blocks;
if the third indication information indicates that intra-frame prediction smoothing filtering is used, processing the prediction block with the intra-frame prediction smoothing filtering to obtain a filtered prediction block of the current decoded block; and superimposing the restored residual information onto the filtered prediction block to obtain a reconstructed block of the current decoded block.
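A matching decoder-side sketch, under the same assumptions as above (smooth is the hypothetical filter from the earlier sketch, and reconstruct_block is an invented name, not an API from any codec): the parsed third indication information selects between the filtered path of claim 5 and the bypass path of claim 8.

import numpy as np

def reconstruct_block(pred: np.ndarray, residual: np.ndarray,
                      use_smoothing: bool, smooth, bit_depth: int = 8) -> np.ndarray:
    # use_smoothing is the parsed third indication information; residual is
    # the restored residual after inverse quantization and inverse transform;
    # smooth is the same (hypothetical) filter the encoder applied.
    if use_smoothing:
        pred = smooth(pred)  # filtered prediction block (claims 5-7)
    # Claim 8 path falls through here with the unfiltered prediction block.
    recon = pred.astype(np.int32) + residual.astype(np.int32)
    out_dtype = np.uint8 if bit_depth <= 8 else np.uint16
    return np.clip(recon, 0, (1 << bit_depth) - 1).astype(out_dtype)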
6. The method of claim 5, wherein processing the prediction block with the intra-frame prediction smoothing filtering to obtain the filtered prediction block of the current decoded block comprises:
applying intra-frame prediction smoothing filtering to the prediction block to obtain the filtered prediction block of the current decoded block.
7. The method of claim 5, wherein processing the prediction block with the intra-frame prediction smoothing filtering to obtain the filtered prediction block of the current decoded block comprises:
filtering the prediction block twice: in the first pass, applying intra-frame prediction smoothing filtering to all prediction pixels in the prediction block; in the second pass, applying intra-frame prediction smoothing filtering to the left-boundary and upper-boundary pixels of the filtered prediction block, to obtain the filtered prediction block of the current decoded block.
8. The method of claim 5, wherein if the third indication information indicates that intra-frame prediction smoothing filtering is not used, the restored residual information is superimposed onto the prediction block to obtain a reconstructed block of the current decoded block.
9. An image encoding device, comprising:
a dividing unit, configured to divide an image and determine coding information of a current coding block, wherein the coding information comprises first indication information and second indication information, the first indication information indicating whether intra-frame prediction filtering is allowed, and the second indication information indicating whether intra-frame prediction smoothing filtering is allowed;
a determining unit, configured to: calculate a prediction block of each of all intra-frame prediction modes corresponding to the current coding block, and calculate first rate-distortion cost information of each prediction mode against an original pixel block; determine, according to the first rate-distortion cost information, the prediction mode with the minimum rate-distortion cost value as a first preferred prediction mode of the current coding block; if the second indication information indicates that intra-frame prediction smoothing filtering is allowed, calculate a prediction block of each of all intra-frame prediction modes corresponding to the current coding block; process each prediction block with the intra-frame prediction smoothing filtering to obtain a final prediction block; calculate second rate-distortion cost information of each prediction mode from the final prediction block and the original pixel block; determine, according to the second rate-distortion cost information, the prediction mode with the minimum rate-distortion cost value as a second preferred prediction mode of the current coding block; determine an optimal prediction mode of the current coding block by comparing the rate-distortion cost value of the first preferred prediction mode with that of the second preferred prediction mode; and set third indication information according to the optimal prediction mode, and transmit the optimal prediction mode index and the third indication information in the code stream; and
a superimposing unit, configured to superimpose the prediction block of the current coding block and the residual block obtained after inverse transform and inverse quantization to obtain a reconstructed block.
10. An image decoding apparatus, comprising:
a decoding unit, configured to parse a code stream and determine decoding information of a current decoded block, wherein the decoding information comprises first indication information and third indication information, the first indication information indicating whether intra-frame prediction filtering is allowed, and the third indication information indicating whether intra-frame prediction smoothing filtering is used;
a determining unit, configured to determine a prediction mode of the current decoded block, and calculate a prediction block from the prediction mode of the current decoded block and adjacent reconstructed blocks; and
a superimposing unit, configured to: if the third indication information indicates that intra-frame prediction smoothing filtering is used, process the prediction block with the intra-frame prediction smoothing filtering to obtain a filtered prediction block of the current decoded block; and superimpose the restored residual information onto the filtered prediction block to obtain a reconstructed block of the current decoded block.
11. An encoder, comprising a non-volatile storage medium and a central processing unit, wherein the non-volatile storage medium stores an executable program, the central processing unit is connected to the non-volatile storage medium, and the encoder performs the method of any one of claims 1-4 when the executable program is executed by the central processing unit.
12. A decoder, comprising a non-volatile storage medium and a central processing unit, wherein the non-volatile storage medium stores an executable program, the central processing unit is connected to the non-volatile storage medium, and the decoder performs the method of any one of claims 5-8 when the executable program is executed by the central processing unit.
13. A terminal, comprising: one or more processors, a memory, and a communication interface, wherein the memory and the communication interface are coupled to the one or more processors; the terminal communicates with other devices via the communication interface; the memory is configured to store computer program code comprising instructions; and
when the one or more processors execute the instructions, the terminal performs the method of any one of claims 1-4 or 5-8.
14. A computer-readable storage medium comprising instructions that, when executed on a terminal, cause the terminal to perform the method of any one of claims 1-4 or 5-8.
CN202010748923.0A 2020-07-29 2020-07-29 Image encoding method, image decoding method and related devices Active CN114071161B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010748923.0A CN114071161B (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related devices
TW110123862A TW202209879A (en) 2020-07-29 2021-06-29 Image encoding method, image decoding method and related devices capable of improving the accuracy of intra-frame prediction and encoding efficiency

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010748923.0A CN114071161B (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related devices

Publications (2)

Publication Number Publication Date
CN114071161A CN114071161A (en) 2022-02-18
CN114071161B true CN114071161B (en) 2023-03-31

Family

ID=80227085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010748923.0A Active CN114071161B (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related devices

Country Status (2)

Country Link
CN (1) CN114071161B (en)
TW (1) TW202209879A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118075458A (en) * 2022-11-23 2024-05-24 华为技术有限公司 Video encoding and decoding method and device
CN116156180B (en) * 2023-04-19 2023-06-23 北京中星微人工智能芯片技术有限公司 Intra-frame prediction method, image encoding method, image decoding method, and apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10142630B2 (en) * 2010-12-10 2018-11-27 Texas Instruments Incorporated Mode adaptive intra prediction smoothing in video coding
US9602839B2 (en) * 2011-06-15 2017-03-21 Futurewei Technologies, Inc. Mode dependent intra smoothing filter table mapping methods for non-square prediction units
WO2020119814A1 (en) * 2018-12-15 2020-06-18 华为技术有限公司 Image reconstruction method and device
CN109889852B (en) * 2019-01-22 2021-11-05 四川大学 HEVC intra-frame coding optimization method based on adjacent values
CN110267041B (en) * 2019-06-28 2021-11-09 Oppo广东移动通信有限公司 Image encoding method, image encoding device, electronic device, and computer-readable storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103581647A (en) * 2013-09-29 2014-02-12 北京航空航天大学 Depth map sequence fractal coding method based on motion vectors of color video
CN104125473A (en) * 2014-07-31 2014-10-29 南京理工大学 3D (three dimensional) video depth image intra-frame predicting mode selecting method and system
WO2019245261A1 (en) * 2018-06-18 2019-12-26 세종대학교 산학협력단 Method and apparatus for encoding/decoding image
WO2020130889A1 (en) * 2018-12-21 2020-06-25 Huawei Technologies Co., Ltd. Method and apparatus of mode- and size-dependent block-level restrictions
CN111294592A (en) * 2020-05-13 2020-06-16 腾讯科技(深圳)有限公司 Video information processing method, multimedia information processing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Hongbin; Fu Changhong; Chen Ruilin; Xiao Yunzhi; Su Weimin. Fast intra-frame coding method for 3D-HEVC depth images. Journal of Image and Graphics, 2016, No. 07, full text. *

Also Published As

Publication number Publication date
CN114071161A (en) 2022-02-18
TW202209879A (en) 2022-03-01

Similar Documents

Publication Publication Date Title
WO2021238540A1 (en) Image encoding method, image decoding method, and related apparatuses
CN116506635B (en) Method and system for performing progressive decoding refresh processing on an image
US12010325B2 (en) Intra block copy scratch frame buffer
CN113497937B (en) Image encoding method, image decoding method and related devices
CN114071161B (en) Image encoding method, image decoding method and related devices
CN114868393A (en) Method of performing surround motion compensation
CN115398906A (en) Method for signalling video coding data
WO2021244197A1 (en) Image encoding method, image decoding method, and related apparatuses
WO2022022622A1 (en) Image coding method, image decoding method, and related apparatus
WO2022037300A1 (en) Encoding method, decoding method, and related devices
CN118101967A (en) Position dependent spatially varying transform for video coding
CN113965764B (en) Image encoding method, image decoding method and related device
US20240214561A1 (en) Methods and devices for decoder-side intra mode derivation
US20240236372A1 (en) Video encoding and decoding method, and device
US20240214580A1 (en) Intra prediction modes signaling
US20240223760A1 (en) Adaptive bilateral filtering for video coding
WO2024039803A1 (en) Methods and devices for adaptive loop filter
WO2023154359A1 (en) Methods and devices for multi-hypothesis-based prediction
JP2024522847A (en) Side-window bilateral filtering for video encoding and decoding
CN114979629A (en) Image block prediction sample determining method and coding and decoding equipment
CN114979628A (en) Image block prediction sample determining method and coding and decoding equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant