WO2024216632A1 - Video encoding and decoding method, apparatus, device, system, and storage medium - Google Patents
- Publication number
- WO2024216632A1 (application PCT/CN2023/089855)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- current block
- pixel
- prediction
- value
- reconstruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
- H04N19/117—Filters, e.g. for pre-processing or post-processing
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/174—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/436—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
- H04N19/82—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
Definitions
- the present application relates to the field of video coding and decoding technology, and in particular to a video coding and decoding method, device, equipment, system, and storage medium.
- Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers, and video players. As video technology develops, the volume of video data keeps growing. To make this data easier to transmit and store, video devices apply video compression.
- prediction can eliminate or reduce the redundancy in the video and improve the compression efficiency.
- the interpolation filter prediction method is used for prediction compression, but the current interpolation filter prediction has the problem of low prediction efficiency, resulting in poor video encoding and decoding performance.
- the embodiments of the present application provide a video encoding and decoding method, apparatus, device, system, and storage medium.
- when interpolation filter prediction is used for prediction, parallel prediction can be performed, thereby improving the prediction efficiency of the interpolation filter prediction mode and improving the encoding and decoding performance.
- the present application provides a video decoding method, applied to a decoder, comprising:
- a transform kernel corresponding to the current block is determined, and a reconstructed block of the current block is determined based on the transform kernel corresponding to the current block and the prediction block.
- an embodiment of the present application provides a video encoding method, applied to an encoder, comprising:
- a transform kernel corresponding to the current block is determined, and based on the transform kernel corresponding to the current block and the prediction block, the current block is encoded to obtain a code stream.
- the present application provides a video decoding device, which is used to execute the method in the first aspect or its respective implementations.
- the device includes a functional unit for executing the method in the first aspect or its respective implementations.
- the present application provides a video encoding device, which is used to execute the method in the second aspect or its respective implementations.
- the device includes a functional unit for executing the method in the second aspect or its respective implementations.
- a video decoder comprising a processor and a memory, wherein the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method in the first aspect or its implementations.
- a video encoder comprising a processor and a memory, wherein the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method in the second aspect or its implementations.
- a video coding and decoding system including a video encoder and a video decoder.
- the video decoder is used to execute the method in the first aspect or its respective implementations.
- the video encoder is used to execute the method in the second aspect or its respective implementations.
- a chip for implementing the method in any one of the first to second aspects or their respective implementations.
- the chip includes: a processor for calling and running a computer program from a memory, so that a device equipped with the chip executes the method in any one of the first to second aspects or their respective implementations.
- a computer-readable storage medium for storing a computer program, wherein the computer program enables a computer to execute the method of any one of the first to second aspects or any of their implementations.
- a computer program product comprising computer program instructions, which enable a computer to execute the method in any one of the first to second aspects or their respective implementations.
- a computer program which, when executed on a computer, enables the computer to execute the method in any one of the first to second aspects or in each of their implementations.
- the present application proposes an interpolation filter prediction method.
- the reference area and the interpolation filter of the current block are first determined, and based on the reference area, the filter coefficient is determined.
- the interpolation filter is used to perform parallel prediction on at least two pixel points in the current block to obtain the prediction block of the current block; the transformation kernel corresponding to the current block is determined, and based on the transformation kernel and the prediction block, the reconstruction value of the current block is determined. That is to say, in the embodiment of the present application, when the interpolation filter is used to perform interpolation filter prediction on the current block, at least two points in the current block are predicted in parallel to improve the prediction speed, thereby improving the encoding and decoding efficiency.
- FIG1 is a schematic block diagram of a video encoding and decoding system according to an embodiment of the present application.
- FIG2 is a schematic block diagram of a video encoder according to an embodiment of the present application.
- FIG3 is a schematic block diagram of a video decoder according to an embodiment of the present application.
- FIG4A is a schematic diagram of intra-frame prediction
- FIG4B is a schematic diagram of intra-frame prediction
- FIGS. 5A to 5I are schematic diagrams of intra-frame prediction
- FIG6 is a schematic diagram of an intra-frame prediction mode
- FIG7 is a schematic diagram of an intra-frame prediction mode
- FIG8 is a schematic diagram of an intra-frame prediction mode
- FIG9 is a schematic diagram of the CCCM principle
- FIG10 is a schematic diagram of a video decoding method flow chart provided by an embodiment of the present application.
- FIG11 is a schematic diagram showing the position of the current block in the current image
- FIG12 is a schematic diagram of the reconstruction area
- FIGS. 13A to 13C are schematic diagrams of several reference areas
- FIG15 is a schematic diagram of shapes of several interpolation filters involved in an embodiment of the present application.
- FIG16 is a schematic diagram of several interpolation filter shapes involved in an embodiment of the present application.
- FIG17 is a schematic diagram of shapes of several interpolation filters involved in an embodiment of the present application.
- FIGS. 18A and 18B are schematic diagrams of shapes of several interpolation filters involved in embodiments of the present application.
- FIG19 is a schematic diagram of shapes of several interpolation filters involved in an embodiment of the present application.
- FIG20A is a schematic diagram of the sliding step size of an interpolation filter
- FIG20B is a schematic diagram of a first reconstruction area
- FIG21 is a schematic diagram of the movement of interpolation filters of different shapes within different types of reference areas
- FIGS. 22A and 22B are schematic diagrams of using an interpolation filter to perform interpolation prediction on a current block
- FIG23 is a schematic diagram of performing interpolation prediction on a current block along a diagonal direction according to an embodiment of the present application.
- FIGS. 24A to 24C are schematic diagrams of several directions of diagonal lines
- FIG25 is a schematic diagram of an intra-frame prediction mode
- FIG26 is a schematic diagram of determining horizontal gradient and vertical gradient
- FIG27 is a gradient amplitude value histogram
- FIG28 is a schematic diagram of a video encoding method flow chart provided in an embodiment of the present application.
- FIG29 is a schematic diagram of a process for determining a prediction mode according to an embodiment of the present application.
- FIG30 is a schematic block diagram of a video decoding device provided by an embodiment of the present application.
- FIG31 is a schematic block diagram of a video encoding device provided by an embodiment of the present application.
- FIG32 is a schematic block diagram of an electronic device provided in an embodiment of the present application.
- FIG33 is a schematic block diagram of a video encoding and decoding system provided in an embodiment of the present application.
- the present application can be applied to the field of image coding and decoding, the field of video coding and decoding, the field of hardware video coding and decoding, the field of dedicated circuit video coding and decoding, the field of real-time video coding and decoding, etc.
- AVC H.264/advanced video coding
- HEVC H.265/high efficiency video coding
- VVC H.266/versatile video coding
- the solution of the present application may also operate in combination with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multi-view video coding (MVC) extensions.
- SVC scalable video coding
- MVC multi-view video coding
- FIG1 is a schematic block diagram of a video encoding and decoding system involved in an embodiment of the present application. It should be noted that FIG1 is only an example, and the video encoding and decoding system of the embodiment of the present application includes but is not limited to that shown in FIG1.
- the video encoding and decoding system 100 includes an encoding device 110 and a decoding device 120.
- the encoding device is used to encode (which can be understood as compression) the video data to generate a code stream, and transmit the code stream to the decoding device.
- the decoding device decodes the code stream generated by the encoding device to obtain decoded video data.
- the encoding device 110 of the embodiment of the present application can be understood as a device with a video encoding function, and the decoding device 120 can be understood as a device with a video decoding function. That is, the encoding device 110 and the decoding device 120 cover a wide range of devices, such as smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, vehicle-mounted computers, etc.
- the encoding device 110 may transmit the encoded video data (e.g., a code stream) to the decoding device 120 via the channel 130.
- the channel 130 may include one or more media and/or devices capable of transmitting the encoded video data from the encoding device 110 to the decoding device 120.
- the channel 130 includes one or more communication media that enable the encoding device 110 to transmit the encoded video data directly to the decoding device 120 in real time.
- the encoding device 110 can modulate the encoded video data according to the communication standard and transmit the modulated video data to the decoding device 120.
- the communication medium includes a wireless communication medium, such as a radio frequency spectrum, and optionally, the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
- the channel 130 includes a storage medium, which can store the video data encoded by the encoding device 110.
- the storage medium includes a variety of locally accessible data storage media, such as optical disks, DVDs, flash memories, etc.
- the decoding device 120 can obtain the encoded video data from the storage medium.
- the channel 130 may include a storage server that can store the video data encoded by the encoding device 110.
- the decoding device 120 can download the stored encoded video data from the storage server.
- the storage server can store the encoded video data and transmit the encoded video data to the decoding device 120, such as a web server (e.g., for a website), a file transfer protocol (FTP) server, etc.
- FTP file transfer protocol
- the encoding device 110 includes a video encoder 112 and an output interface 113.
- the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
- the encoding device 110 may further include a video source 111 in addition to the video encoder 112 and the output interface 113.
- the video source 111 may include at least one of a video acquisition device (e.g., a video camera), a video archive, a video input interface, and a computer graphics system, wherein the video input interface is used to receive video data from a video content provider, and the computer graphics system is used to generate video data.
- the video encoder 112 encodes the video data from the video source 111 to generate a bitstream.
- the video data may include one or more pictures or a sequence of pictures.
- the bitstream carries the encoding information of the picture or the sequence of pictures as a sequence of bits.
- the encoding information may include the encoded picture data and associated data.
- the associated data may include a sequence parameter set (SPS for short), a picture parameter set (PPS for short) and other syntax structures.
- SPS sequence parameter set
- PPS picture parameter set
- the syntax structure refers to a set of zero or more syntax elements arranged in a specified order in the bitstream.
- the video encoder 112 transmits the encoded video data directly to the decoding device 120 via the output interface 113.
- the encoded video data may also be stored in a storage medium or a storage server for subsequent reading by the decoding device 120.
- the decoding device 120 includes an input interface 121 and a video decoder 122.
- the decoding device 120 may include a display device 123 in addition to the input interface 121 and the video decoder 122.
- the input interface 121 includes a receiver and/or a modem.
- the input interface 121 can receive the encoded video data through the channel 130.
- the video decoder 122 is used to decode the encoded video data to obtain decoded video data, and transmit the decoded video data to the display device 123.
- the display device 123 displays the decoded video data.
- the display device 123 may be integrated with the decoding device 120 or external to the decoding device 120.
- the display device 123 may include a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
- FIG1 is only an example, and the technical solution of the embodiment of the present application is not limited to FIG1.
- the technology of the present application can also be applied to unilateral video encoding or unilateral video decoding.
- FIG2 is a schematic block diagram of a video encoder according to an embodiment of the present application. It should be understood that the video encoder 200 can be used to perform lossy compression on an image, or can be used to perform lossless compression on an image.
- the lossless compression can be visually lossless compression or mathematically lossless compression.
- the video encoder 200 can be applied to image data in luminance and chrominance (YCbCr, YUV) format.
- the YUV ratio can be 4:2:0, 4:2:2 or 4:4:4, Y represents brightness (Luma), Cb (U) represents blue chrominance, Cr (V) represents red chrominance, and U and V represent chrominance (Chroma) for describing color and saturation.
- 4:2:0 means that every 4 pixels have 4 luminance components and 2 chrominance components (YYYYCbCr)
- 4:2:2 means that every 4 pixels have 4 luminance components and 4 chrominance components (YYYYCbCrCbCr)
- 4:4:4 means full pixel display (YYYYCbCrCbCrCbCrCbCr).
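- The sampling ratios above can be turned into sample counts per frame. The following is a minimal sketch that only encodes the per-4-pixel counts listed above; the table layout and the function name `frame_sample_count` are illustrative assumptions, not part of the application.

```python
# Samples per group of 4 pixels for each chroma format, taken from the
# YYYYCbCr-style patterns listed above. Names here are illustrative only.
SAMPLES_PER_4_PIXELS = {
    "4:2:0": {"Y": 4, "Cb": 1, "Cr": 1},  # YYYYCbCr
    "4:2:2": {"Y": 4, "Cb": 2, "Cr": 2},  # YYYYCbCrCbCr
    "4:4:4": {"Y": 4, "Cb": 4, "Cr": 4},  # YYYYCbCrCbCrCbCrCbCr
}


def frame_sample_count(width, height, chroma_format):
    """Return the number of Y, Cb and Cr samples in a width x height frame.

    Assumes width * height is a multiple of 4.
    """
    per_group = SAMPLES_PER_4_PIXELS[chroma_format]
    groups = (width * height) // 4
    return {name: count * groups for name, count in per_group.items()}


counts_420 = frame_sample_count(1920, 1080, "4:2:0")
counts_444 = frame_sample_count(1920, 1080, "4:4:4")
```

- For a 1920x1080 frame, 4:2:0 carries a quarter as many samples in each chrominance plane as in the luminance plane, while 4:4:4 carries the same number in all three planes.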
- the video encoder 200 reads video data, and for each frame of the video data, divides the frame into a number of coding tree units (CTUs).
- A CTU may also be referred to as a "tree block", a "largest coding unit" (LCU), or a "coding tree block" (CTB).
- Each CTU may be associated with a pixel block of equal size within the image.
- Each pixel may correspond to a luminance (luminance or luma) sample and two chrominance (chrominance or chroma) samples. Therefore, each CTU may be associated with a luminance sample block and two chrominance sample blocks.
- the size of a CTU is, for example, 128×128, 64×64, 32×32, etc.
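- As a rough illustration of dividing a frame into CTUs, the sketch below counts how many CTUs of a given size are needed to cover a frame. It assumes that frame edges not aligned to the CTU size are covered by partial CTUs; the function name `ctu_grid` is hypothetical.

```python
def ctu_grid(width, height, ctu_size=128):
    """Return (columns, rows, total) of CTUs needed to cover the frame.

    Assumption: a frame dimension that is not a multiple of ctu_size is
    covered by one extra, partially filled CTU along that dimension.
    """
    cols = -(-width // ctu_size)   # integer ceiling division
    rows = -(-height // ctu_size)
    return cols, rows, cols * rows


grid_128 = ctu_grid(1920, 1080, 128)
grid_64 = ctu_grid(1920, 1080, 64)
```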
- a CTU may be further divided into a number of coding units (CUs) for encoding, and a CU may be a rectangular block or a square block.
- A CU can be further divided into prediction units (PUs) and transform units (TUs), which separates coding, prediction, and transform and makes processing more flexible.
- PU prediction unit
- TU transform unit
- CTU is divided into CU in quadtree mode
- CU is divided into TU and PU in quadtree mode.
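- Quadtree division can be sketched as a recursive split of a square block into four equal quadrants. In the sketch below, the split decision is supplied by a caller-provided predicate; the function names, the `min_size` limit, and the example rule are all illustrative assumptions, since a real encoder would decide splits by rate-distortion analysis.

```python
def quadtree_split(x, y, size, should_split, min_size=8):
    """Return the leaf blocks (x, y, size) of a quadtree rooted at (x, y).

    should_split(x, y, size) is a hypothetical decision callback; blocks of
    min_size or smaller are never split.
    """
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):          # top row of quadrants first
        for dx in (0, half):      # left quadrant before right
            leaves.extend(
                quadtree_split(x + dx, y + dy, half, should_split, min_size)
            )
    return leaves


def example_rule(x, y, size):
    # Split the 64x64 root once, then split only its top-left 32x32 child.
    return size == 64 or (size == 32 and x == 0 and y == 0)


cus = quadtree_split(0, 0, 64, example_rule)
```

- With this example rule, a 64×64 CTU yields four 16×16 CUs in its top-left quadrant and three 32×32 CUs elsewhere, which together tile the CTU exactly.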
- the video encoder and video decoder may support various PU sizes. Assuming that the size of a particular CU is 2N×2N, the video encoder and video decoder may support PU sizes of 2N×2N or N×N for intra-frame prediction, and support symmetric PUs of 2N×2N, 2N×N, N×2N, N×N or similar sizes for inter-frame prediction. The video encoder and video decoder may also support asymmetric PUs of 2N×nU, 2N×nD, nL×2N, and nR×2N for inter-frame prediction.
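- The PU shapes named above can be enumerated for a given N. The sketch below assumes the HEVC convention for the asymmetric modes, in which the CU is split at one quarter of its side (for example, 2N×nU gives a 2N×N/2 part above a 2N×3N/2 part); the application itself does not spell out these dimensions, and the function name `pu_partitions` is hypothetical.

```python
def pu_partitions(n, mode):
    """Return the (width, height) of each PU of a 2N x 2N CU for a mode.

    Assumption: asymmetric modes split at a quarter of the CU side, following
    the HEVC asymmetric motion partition convention.
    """
    side = 2 * n
    quarter = n // 2
    table = {
        "2Nx2N": [(side, side)],
        "2NxN": [(side, n), (side, n)],
        "Nx2N": [(n, side), (n, side)],
        "NxN": [(n, n)] * 4,
        "2NxnU": [(side, quarter), (side, side - quarter)],
        "2NxnD": [(side, side - quarter), (side, quarter)],
        "nLx2N": [(quarter, side), (side - quarter, side)],
        "nRx2N": [(side - quarter, side), (quarter, side)],
    }
    return table[mode]


ALL_MODES = ["2Nx2N", "2NxN", "Nx2N", "NxN", "2NxnU", "2NxnD", "nLx2N", "nRx2N"]
amp_upper = pu_partitions(16, "2NxnU")
```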
- the video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filter unit 260, a decoded image buffer 270, and an entropy coding unit 280. It should be noted that the video encoder 200 may include more, fewer, or different functional components.
- the current block may be referred to as a current coding unit (CU) or a current prediction unit (PU), etc.
- a prediction block may also be referred to as a prediction image block or an image prediction block, and a reconstructed image block may also be referred to as a reconstructed block or an image reconstruction block.
- the prediction unit 210 includes an inter-frame prediction unit 211 and an intra-frame prediction unit 212. Since there is a strong correlation between adjacent pixels in a frame of a video, the intra-frame prediction method is used in the video coding and decoding technology to eliminate the spatial redundancy between adjacent pixels. Since there is a strong similarity between adjacent frames in a video, the inter-frame prediction method is used in the video coding and decoding technology to eliminate the temporal redundancy between adjacent frames, thereby improving the coding efficiency.
- the inter-frame prediction unit 211 can be used for inter-frame prediction.
- Inter-frame prediction can include motion estimation and motion compensation. It can refer to the image information of different frames.
- Inter-frame prediction uses motion information to find reference blocks from reference frames, and generates prediction blocks based on the reference blocks to eliminate temporal redundancy.
- the frames used for inter-frame prediction can be P frames and/or B frames. P frames refer to forward prediction frames, and B frames refer to bidirectional prediction frames.
- Motion information includes a reference frame list where the reference frame is located, a reference frame index, and a motion vector.
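- the motion vector can have integer-pixel or sub-pixel precision.
- if the motion vector has sub-pixel precision, the required sub-pixel block is produced by interpolation filtering in the reference frame.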
- the integer pixel or sub-pixel block in the reference frame found according to the motion vector is called a reference block.
- Some technologies will directly use the reference block as a prediction block, and some technologies will generate a prediction block based on the reference block. Reprocessing the reference block to generate a prediction block can also be understood as taking the reference block as a prediction block and then processing the prediction block to generate a new prediction block.
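- For the integer-pixel case, locating a reference block from a motion vector can be sketched as a displaced copy from the reference frame. The sketch below assumes that positions outside the reference frame are clamped to the nearest border sample; the function name `fetch_reference_block` and the frame representation (a list of rows) are illustrative assumptions, and sub-pixel motion vectors are not handled.

```python
def fetch_reference_block(ref_frame, x, y, mv, width, height):
    """Copy the width x height block at (x, y) displaced by integer mv.

    ref_frame is a list of rows of sample values. Coordinates that fall
    outside the frame are clamped to the border (an assumed padding rule).
    """
    mvx, mvy = mv
    rows, cols = len(ref_frame), len(ref_frame[0])
    block = []
    for j in range(height):
        ry = min(max(y + mvy + j, 0), rows - 1)
        row = []
        for i in range(width):
            rx = min(max(x + mvx + i, 0), cols - 1)
            row.append(ref_frame[ry][rx])
        block.append(row)
    return block


# Toy 8x8 reference frame whose sample value encodes its position.
ref = [[r * 10 + c for c in range(8)] for r in range(8)]
ref_block = fetch_reference_block(ref, 2, 2, (1, -1), 2, 2)
```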
- the intra-frame prediction unit 212 refers only to information within the same frame to predict the pixel information of the image block currently being coded, thereby eliminating spatial redundancy.
- the frame used for intra-frame prediction can be an I frame.
- the intra-frame prediction modes used by HEVC are Planar, DC, and 33 angle modes, for a total of 35 prediction modes.
- the intra-frame modes used by VVC are Planar, DC, and 65 angle modes, for a total of 67 prediction modes.
- the residual unit 220 may generate a residual block of the CU based on the pixel blocks of the CU and the prediction blocks of the PUs of the CU. For example, the residual unit 220 may generate a residual block of the CU so that each sample in the residual block has a value equal to the difference between the following two: a sample in the pixel blocks of the CU and a corresponding sample in the prediction blocks of the PUs of the CU.
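- The sample-wise difference described above can be sketched as follows. Blocks are represented as lists of rows, and the function name `residual_block` is an illustrative assumption.

```python
def residual_block(original, prediction):
    """Return original minus prediction, sample by sample."""
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, prediction)
    ]


original = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
residual = residual_block(original, prediction)
```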
- the transform/quantization unit 230 may quantize the transform coefficients.
- the transform/quantization unit 230 may quantize the transform coefficients associated with the TUs of the CU based on a quantization parameter (QP) value associated with the CU.
- QP quantization parameter
- the video encoder 200 may adjust the degree of quantization applied to the transform coefficients associated with the CU by adjusting the QP value associated with the CU.
- the inverse transform/quantization unit 240 may apply inverse quantization and inverse transform to the quantized transform coefficients, respectively, to reconstruct a residual block from the quantized transform coefficients.
- the reconstruction unit 250 may add the samples of the reconstructed residual block to the corresponding samples of one or more prediction blocks generated by the prediction unit 210 to generate a reconstructed image block associated with the TU. By reconstructing the sample blocks of each TU of the CU in this manner, the video encoder 200 may reconstruct the pixel blocks of the CU.
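- The quantization, inverse quantization, and reconstruction steps above can be sketched as a small round trip. The sketch makes several assumptions that are not stated in the application: the transform is omitted (residual samples are quantized directly), the quantization step follows the H.264/HEVC-style relation Qstep = 2^((QP - 4) / 6), and reconstructed samples are clipped to the 8-bit range. All function names are hypothetical.

```python
def qstep(qp):
    # Assumed H.264/HEVC-style mapping: the step doubles every 6 QP values.
    return 2 ** ((qp - 4) / 6)


def quantize(coeffs, qp):
    """Map coefficients to integer levels; a larger QP gives coarser levels."""
    step = qstep(qp)
    return [[int(round(c / step)) for c in row] for row in coeffs]


def dequantize(levels, qp):
    """Scale integer levels back to approximate coefficient values."""
    step = qstep(qp)
    return [[int(round(level * step)) for level in row] for row in levels]


def reconstruct(prediction, residual, max_value=255):
    """Add the reconstructed residual to the prediction and clip each sample."""
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


pred = [[50, 50], [60, 60]]
res = [[21, -9], [3, 0]]
levels = quantize(res, 22)          # QP 22 gives a step of 8
approx_res = dequantize(levels, 22)
recon = reconstruct(pred, approx_res)
```

- The reconstructed residual differs from the original residual because quantization discards information; raising the QP widens that gap and lowers the bit cost, which is the adjustment described for the video encoder 200.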
- the loop filter unit 260 is used to process the inverse transformed and inverse quantized pixels to compensate for distortion information and provide a better reference for subsequent coded pixels. For example, a deblocking filter operation may be performed to reduce the blocking effect of the pixel blocks associated with the CU.
- the loop filter unit 260 includes a deblocking filter unit and a sample adaptive offset/adaptive loop filter (SAO/ALF) unit, wherein the deblocking filter unit is used to remove the block effect, and the SAO/ALF unit is used to remove the ringing effect.
- SAO/ALF sample adaptive offset/adaptive loop filter
- the decoded image buffer 270 may store the reconstructed pixel blocks.
- the inter prediction unit 211 may use the reference image containing the reconstructed pixel blocks to perform inter prediction on PUs of other images.
- the intra prediction unit 212 may use the reconstructed pixel blocks in the decoded image buffer 270 to perform intra prediction on other PUs in the same image as the CU.
- the entropy encoding unit 280 may receive the quantized transform coefficients from the transform/quantization unit 230.
- the entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy-encoded data.
- FIG3 is a schematic block diagram of a video decoder according to an embodiment of the present application.
- the video decoder 300 includes an entropy decoding unit 310, a prediction unit 320, an inverse quantization/transformation unit 330, a reconstruction unit 340, a loop filter unit 350, and a decoded image buffer 360. It should be noted that the video decoder 300 may include more, fewer, or different functional components.
- the video decoder 300 may receive a bitstream.
- the entropy decoding unit 310 may parse the bitstream to extract syntax elements from the bitstream. As part of parsing the bitstream, the entropy decoding unit 310 may parse the syntax elements in the bitstream that have been entropy encoded.
- the prediction unit 320, the inverse quantization/transformation unit 330, the reconstruction unit 340, and the loop filter unit 350 may decode the video data according to the syntax elements extracted from the bitstream, that is, generate decoded video data.
- the prediction unit 320 includes an intra prediction unit 322 and an inter prediction unit 321.
- the intra prediction unit 322 may perform intra prediction to generate a prediction block for the PU.
- the intra prediction unit 322 may use an intra prediction mode to generate a prediction block for the PU based on pixel blocks of spatially neighboring PUs.
- the intra prediction unit 322 may also determine the intra prediction mode of the PU according to one or more syntax elements parsed from the code stream.
- the inter prediction unit 321 may construct a first reference image list (list 0) and a second reference image list (list 1) according to the syntax elements parsed from the code stream.
- the entropy decoding unit 310 may parse the motion information of the PU.
- the inter prediction unit 321 may determine one or more reference blocks of the PU according to the motion information of the PU.
- the inter prediction unit 321 may generate a prediction block of the PU according to one or more reference blocks of the PU.
- the inverse quantization/transform unit 330 may inversely quantize (i.e., dequantize) the transform coefficients associated with the TU.
- the inverse quantization/transform unit 330 may use the QP value associated with the CU of the TU to determine the degree of quantization.
- the inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients in order to generate a residual block associated with the TU.
- the reconstruction unit 340 uses the residual block associated with the TU of the CU and the prediction block of the PU of the CU to reconstruct the pixel block of the CU. For example, the reconstruction unit 340 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the pixel block of the CU to obtain a reconstructed image block.
- the loop filtering unit 350 may perform a deblocking filtering operation to reduce blocking effects of pixel blocks associated with a CU.
- the video decoder 300 may store the reconstructed image of the CU in the decoded image buffer 360.
- the video decoder 300 may use the reconstructed image in the decoded image buffer 360 as a reference image for subsequent prediction, or transmit the reconstructed image to a display device for presentation.
- the basic process of video encoding and decoding is as follows: at the encoding end, a frame of image is divided into blocks, and for the current block, the prediction unit 210 uses intra-frame prediction or inter-frame prediction to generate a prediction block of the current block.
- the residual unit 220 can calculate the residual block based on the prediction block and the original block of the current block, that is, the difference between the original block of the current block and the prediction block; the residual block can also be called residual information.
- through the transform and quantization processes of the transform/quantization unit 230, information to which the human eye is not sensitive can be removed from the residual block to eliminate visual redundancy.
- the residual block before transform and quantization by the transform/quantization unit 230 can be called a time domain residual block, and the time domain residual block after transform and quantization by the transform/quantization unit 230 can be called a frequency residual block or a frequency domain residual block.
- the entropy coding unit 280 receives the quantized transform coefficients output by the transform/quantization unit 230, and can entropy encode the quantized transform coefficients to output a bitstream. For example, the entropy coding unit 280 can eliminate character redundancy according to the target context model and the probability information of the binary bitstream.
- the entropy decoding unit 310 can parse the code stream to obtain the prediction information, quantization coefficient matrix, etc. of the current block.
- the prediction unit 320 uses intra-frame prediction or inter-frame prediction to generate a prediction block of the current block based on the prediction information.
- the inverse quantization/transformation unit 330 uses the quantization coefficient matrix obtained from the code stream to inverse quantize and inverse transform the quantization coefficient matrix to obtain a residual block.
- the reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block.
- the reconstructed blocks constitute a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image based on the image or on the block to obtain a decoded image.
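- As a minimal sketch of the reconstruction step described above, the following Python fragment adds a residual block to a prediction block sample by sample. The function name `reconstruct_block`, the list-of-lists block layout, and the clipping to an assumed 8-bit sample range are illustrative assumptions and are not taken from the application.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add residual samples to prediction samples and clip to the valid range."""
    max_val = (1 << bit_depth) - 1  # assumed sample range: 0..255 for 8-bit video
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


# Toy 2x2 example: the last two samples fall outside the range and are clipped.
pred = [[100, 102], [250, 3]]
resid = [[-4, 5], [10, -7]]
recon = reconstruct_block(pred, resid)
```

- Loop filtering, as performed by the loop filtering unit 350, would then be applied to the image assembled from such reconstructed blocks.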
- the encoding end also requires similar operations as the decoding end to obtain a decoded image.
- the decoded image can also be called a reconstructed image, and the reconstructed image can be used as a reference frame for inter-frame prediction of subsequent frames.
- the decoding end parses the bitstream and, by analyzing the existing information, determines the same block division information and the same prediction, transform, quantization, entropy coding, loop filtering and other mode information or parameter information as determined by the encoding end, so as to ensure that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
- the above is the basic process of the video codec under the block-based hybrid coding framework. With the development of technology, some modules or steps of the framework or process may be optimized. The present application is applicable to the basic process of the video codec under the block-based hybrid coding framework, but is not limited to the framework and process.
- the current block may be a current coding unit (CU) or a current prediction unit (PU), etc.
- an image may be divided into slices, etc., and slices in the same image may be processed in parallel, that is, there is no data dependency between them.
- "Frame" is a commonly used term, and it can generally be understood that a frame is an image. In the present application, a frame may also be replaced by an image or a slice, etc.
- Intra-frame prediction usually predicts the current coding block with the help of various angular modes and non-angular modes to obtain a prediction block.
- the optimal prediction mode of the current coding unit is selected, and then the prediction mode is transmitted to the decoding end via the code stream.
- the decoding end parses the prediction mode, predicts the predicted image of the current decoding block, and superimposes the residual pixels transmitted via the code stream to obtain the reconstructed image.
- the intra-frame prediction method uses the reconstructed pixels that have been coded and decoded around the current block as reference pixels to predict the current block.
- Figure 4A is a schematic diagram of intra-frame prediction.
- the size of the current block is 4x4, and the pixels in the column to the left of the current block and in the row above it are the reference pixels of the current block.
- Intra-frame prediction uses these reference pixels to predict the current block.
- These reference pixels may all be available, that is, all have been coded and decoded. Some may also be unavailable, for example, if the current block is the leftmost of the entire frame, then the reference pixels on the left of the current block are unavailable. Or when encoding and decoding the current block, the lower left part of the current block has not been coded and decoded, so the reference pixels on the lower left are also unavailable.
- when reference pixels are unavailable, they may be filled using available reference pixels or using certain values or methods, or no filling may be performed.
- FIG. 4B is a schematic diagram of intra prediction.
- the multiple reference line intra prediction method can use more reference pixels to improve encoding and decoding efficiency. For example, four reference rows/columns are used as reference pixels of the current block.
- FIGS. 5A to 5I are schematic diagrams of intra-frame prediction.
- intra-frame prediction for 4x4 blocks in H.264 mainly includes 9 modes. Mode 0, shown in FIG. 5A, copies the pixels above the current block into the current block in the vertical direction as the prediction value. Mode 1, shown in FIG. 5B, copies the reference pixels on the left into the current block in the horizontal direction as the prediction value. Mode 2 (DC), shown in FIG. 5C, uses the average value of the 8 points A to D and I to L as the prediction value of all points. Modes 3 to 8, shown in FIGS. 5D to 5I, copy the reference pixels to the corresponding positions of the current block at a certain angle; because some positions of the current block do not correspond exactly to a reference pixel, it may be necessary to use a weighted average of the reference pixels, or sub-pixels interpolated from the reference pixels.
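- The three non-angular 4x4 modes described above can be sketched as follows. This is only an illustration: the function name `predict_4x4` and the argument layout are assumptions, the DC rounding `(sum + 4) >> 3` follows the usual H.264 convention when all eight reference samples are available, and the angular modes 3 to 8 are deliberately left out.

```python
def predict_4x4(mode, above, left):
    """Return a 4x4 prediction block for modes 0 (vertical), 1 (horizontal), 2 (DC).

    above: the 4 reconstructed samples directly above the block (A..D).
    left:  the 4 reconstructed samples directly left of the block (I..L).
    """
    if mode == 0:  # vertical: every row repeats the samples above the block
        return [list(above) for _ in range(4)]
    if mode == 1:  # horizontal: every row repeats the sample to its left
        return [[left[y]] * 4 for y in range(4)]
    if mode == 2:  # DC: rounded mean of the 8 reference samples
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0, 1 and 2 are sketched here")


above = [10, 20, 30, 40]
left = [50, 60, 70, 80]
vertical = predict_4x4(0, above, left)
horizontal = predict_4x4(1, above, left)
dc_block = predict_4x4(2, above, left)
```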
- FIG. 6 is a schematic diagram of the intra-frame prediction mode.
- the intra-frame prediction modes used by HEVC include Planar, DC and 33 angle modes, a total of 35 prediction modes.
- Figure 7 is a schematic diagram of the intra-frame prediction mode.
- the intra-frame modes used by VVC include Planar, DC and 65 angle modes, a total of 67 prediction modes.
- Figure 8 is a schematic diagram of the intra-frame prediction mode. As shown in Figure 8, AVS3 uses DC, Plane, Bilinear, PCM and 62 angle modes, a total of 66 prediction modes.
- the multiple intra prediction filter (MIPF) in AVS3 uses different filters to generate prediction values for different block sizes. For pixels at different positions in the same block, one filter is used to generate prediction values for pixels closer to the reference pixel, and another filter is used to generate prediction values for pixels farther from the reference pixel.
- technologies for filtering predicted pixels such as the intra prediction filter (IPF) in AVS3, can use reference pixels to filter the predicted values.
- the loop filter unit uses adaptive loop filter (ALF) technology.
- the ALF technology is used to filter the reconstructed image to obtain the final decoded image.
- ALF is a filter in the loop filter, which is designed based on the principle of Wiener filter and is a filter to minimize the error between the target sample and the input sample.
- the target sample is the original image and the input is the reconstructed image.
- the filter coefficients must be determined first.
- the filter coefficients of the interpolation filter can be obtained by solving the Wiener-Hopf equation:
- r is the position of a sample within the region under consideration; illustratively, the coordinates of the sample at position r can be expressed as (x, y).
- o[r] is the original pixel value of the sample at position r
- t[r] is the pixel value to be filtered at position r
- t[r] is also called the reconstructed value of the pixel at position r in the reconstructed image.
- {p_0, p_1, ..., p_{N-1}} is the set of relative position offsets, with respect to position r, of the N positions corresponding to position r.
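- The equation itself does not appear in this text. Under the symbol definitions given here, and writing c_0, ..., c_{N-1} for the filter coefficients and R for the set of sample positions r (both symbols are assumptions introduced for illustration), the normal equations of a least-squares filter of this kind usually take the following form, which may differ in notation from formula (1) of the application:

```latex
\sum_{n=0}^{N-1} c_n \left( \sum_{r \in R} t[r+p_n]\, t[r+p_m] \right)
  = \sum_{r \in R} o[r]\, t[r+p_m],
\qquad m = 0, 1, \ldots, N-1
```

- The bracketed sums on the left form the autocorrelation matrix of the samples to be filtered, and the sums on the right form the cross-correlation between the original samples and the samples to be filtered.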
- the filter coefficients of the filter can be obtained by solving the above Wiener-Hopf equations through Cholesky decomposition of the autocorrelation coefficient matrix.
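- To illustrate how such a system can be solved, the following Python sketch builds an autocorrelation matrix and a cross-correlation vector for a one-dimensional signal and solves for the coefficients by Cholesky decomposition followed by forward and backward substitution. The one-dimensional layout, the function names, and the variables `acorr` and `xcorr` are assumptions made for illustration; an actual codec would work on two-dimensional sample positions and typically in fixed-point arithmetic.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(a, b):
    """Solve a * c = b using the Cholesky factor of a."""
    n = len(b)
    low = cholesky(a)
    y = [0.0] * n
    for i in range(n):  # forward substitution: low * y = b
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    c = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: low^T * c = y
        c[i] = (y[i] - sum(low[k][i] * c[k] for k in range(i + 1, n))) / low[i][i]
    return c


def wiener_coefficients(t, o, offsets, positions):
    """Build the autocorrelation matrix and cross-correlation vector, then solve."""
    n = len(offsets)
    acorr = [[sum(t[r + offsets[i]] * t[r + offsets[j]] for r in positions)
              for j in range(n)] for i in range(n)]
    xcorr = [sum(o[r] * t[r + offsets[j]] for r in positions) for j in range(n)]
    return solve_cholesky(acorr, xcorr)


# Toy data: o is produced from t by a known 3-tap filter, so the solver
# should recover those taps from the correlation statistics alone.
t = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0, 9.0]
true_taps = [0.25, 0.5, 0.25]
offsets = [-1, 0, 1]
positions = range(1, len(t) - 1)
o = [0.0] * len(t)
for r in positions:
    o[r] = sum(c * t[r + p] for c, p in zip(true_taps, offsets))
coeffs = wiener_coefficients(t, o, offsets, positions)
```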
- the sample to be filtered is filtered by the following formula (2) to obtain the filtered sample:
- t[r]′ is the pixel value after filtering at position r
- p_n is the relative position offset between the n-th of the N positions corresponding to position r and position r itself
- t[r+p_n] represents the pixel value to be filtered at position r+p_n.
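- Formula (2) does not appear in this text. With the symbols defined here, and writing c_n for the n-th filter coefficient (a symbol assumed for illustration), a linear filter of this kind is normally expressed as the following weighted sum, which should be read as a plausible reconstruction and not as the application's exact formula:

```latex
t[r]' = \sum_{n=0}^{N-1} c_n \, t[r+p_n]
```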
- The convolutional cross-component model (CCCM) is a process of predicting chrominance pixels from reconstructed luminance component pixels. Its advantage is that the coefficients of the CCCM filter can be derived at the decoder from the reconstructed pixels, thus eliminating the cost of carrying the filter coefficients in the bitstream as ALF does. As shown in Figure 9, the coefficients of CCCM are calculated from the reconstructed pixels around the current chrominance block to be predicted and the reconstructed pixels around the luminance block at the corresponding position of the chrominance block.
- an interpolation filtering prediction mode determines the filter coefficient of the interpolation filter through the reconstructed area around the current block. Based on the filter coefficient, the interpolation filter is used to perform interpolation filtering prediction on each point in the current block to obtain the predicted value of each point in the current block, and then obtain the predicted block of the current block.
- in the related art, when an interpolation filter is used to perform interpolation filtering prediction on the pixels in the current block, the pixels are predicted one by one. That is, the interpolation filtering prediction of the next pixel is performed only after the interpolation filtering prediction of the previous pixel is completed, and the prediction value of the previous pixel is used when predicting the next pixel. It can be seen that the related art predicts the current block point by point, and only one point can be predicted at a time, which makes the prediction efficiency low and thereby affects the overall encoding and decoding performance of the video.
- the embodiment of the present application performs parallel prediction on the pixels in the current block when predicting the current block using the interpolation filter prediction mode, thereby improving the prediction efficiency and enhancing the encoding and decoding performance of the video.
- the video decoding method provided in the embodiment of the present application is introduced by taking the decoding end as an example.
- FIG. 10 is a schematic flow chart of a video decoding method provided by an embodiment of the present application, and the embodiment of the present application is applied to the video decoders shown in FIG. 1 and FIG. 3. As shown in FIG. 10, the method of the embodiment of the present application includes:
- S101 Determine a reference area and an interpolation filter of a current block, and determine a filter coefficient of the interpolation filter based on the reference area.
- the decoding end decodes the bitstream to obtain the quantization coefficients of the current block, dequantizes the quantization coefficients to obtain the transformation coefficients of the current block, and de-transforms the transformation coefficients to obtain the residual value of the current block.
- the prediction mode of the current block is determined, and the prediction value of the current block is determined based on the prediction mode. Based on the prediction value and the residual value of the current block, the reconstructed value of the current block is obtained.
- the current block is also referred to as the block to be predicted.
- the decoding end first determines the prediction mode of the current block.
- the decoding end determines the prediction mode of the current block in at least the following ways:
- Method 1: The encoder determines the prediction mode of the current block. For example, from the candidate prediction modes composed of the traditional prediction modes shown in Figure 6 or Figure 7 and the interpolation filter prediction mode, the candidate prediction mode with the lowest cost is determined as the prediction mode of the current block. Then, the encoder adds the indication information of the prediction mode of the current block to the bitstream. In this way, the decoder obtains the indication information of the prediction mode of the current block by decoding the bitstream, determines the prediction mode of the current block based on the indication information, and then uses that prediction mode to predict the current block to obtain the prediction value of the current block.
- for example, if the encoding end determines that the prediction mode of the current block is a traditional prediction mode, the index of the prediction mode of the current block is used as the indication information of the prediction mode and written into the bitstream.
- the decoding end obtains the index of the prediction mode by decoding the bitstream, and then determines the prediction mode of the current block from the traditional prediction mode shown in FIG. 6 or FIG. 7 based on the index.
- Method 2: The encoder constructs a candidate list of intra-frame prediction modes, and selects the intra-frame prediction mode of the current block from the candidate list of intra-frame prediction modes. It should be noted that the candidate list of intra-frame prediction modes includes the interpolation filter prediction mode. Then, the encoder writes the sequence number (or index number) of the intra-frame prediction mode of the current block in the candidate list of intra-frame prediction modes into the bitstream.
- the decoder determines the sequence number of the intra-frame prediction mode of the current block in the candidate list of intra-frame prediction modes by decoding the bitstream, and constructs the candidate list of intra-frame prediction modes based on the same method as the encoder (it should be noted that the constructed candidate list of intra-frame prediction modes includes the interpolation filter prediction mode), and then determines the intra-frame prediction mode of the current block from the constructed candidate list of intra-frame prediction modes based on the sequence number of the intra-frame prediction mode of the current block in the candidate list of intra-frame prediction modes. Finally, the determined intra-frame prediction mode of the current block is used to predict the current block to obtain the prediction value of the current block.
- Method 3: The encoder constructs a candidate list of intra-frame prediction modes, which includes an interpolation filter prediction mode. Then, the intra-frame prediction mode of the current block is selected from the candidate list of intra-frame prediction modes. For example, the cost of each candidate prediction mode in the candidate list of intra-frame prediction modes on the template of the current block is determined, and then the intra-frame prediction mode of the current block is determined based on the cost.
- the decoder constructs a candidate list of intra-frame prediction modes in the same manner as the encoder, and the constructed candidate list of intra-frame prediction modes also includes an interpolation filter prediction mode.
- the cost of each candidate prediction mode in the candidate list of intra-frame prediction modes on the template of the current block is determined, and then the intra-frame prediction mode of the current block is determined based on the cost. Finally, the determined intra-frame prediction mode of the current block is used to predict the current block to obtain the prediction value of the current block.
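- The template-based selection used in Method 3 can be sketched as below. The application does not specify the cost measure or the candidate predictors in this passage, so the sum of absolute differences (SAD), the function names, and the two toy predictors are all assumptions chosen only to show that the encoder and the decoder can reach the same decision without any mode index being transmitted.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_mode_by_template(candidates, template_recon, template_ref):
    """Pick the candidate whose prediction of the template has the lowest cost.

    candidates: list of (mode_name, predictor) pairs; each predictor maps the
        reference samples of the template to predicted template samples.
    template_recon: reconstructed samples of the template (known to both sides).
    template_ref: reference samples from which the template is predicted.
    """
    best_mode, best_cost = None, None
    for mode_name, predictor in candidates:
        cost = sad(predictor(template_ref), template_recon)
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode_name, cost
    return best_mode, best_cost


# Two hypothetical candidate predictors standing in for real prediction modes.
candidates = [
    ("dc", lambda ref: [sum(ref) // len(ref)] * len(ref)),
    ("copy", lambda ref: list(ref)),
]
template_ref = [10, 20, 30, 40]
template_recon = [11, 19, 31, 40]
chosen = select_mode_by_template(candidates, template_recon, template_ref)
```

- Because both sides evaluate the same candidates on the same reconstructed template, running this selection at the decoder reproduces the encoder's choice.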
- Method 4: The encoder and decoder use the interpolation filter prediction mode by default to predict the current block.
- the decoding end can also determine whether the current block adopts the interpolation filter prediction mode through the following Method 5.
- Method 5: The decoder decodes the bitstream to obtain third information indicating whether the current block is predicted using the interpolation filter prediction mode. If the decoder determines, based on the third information, that the current block is predicted using the interpolation filter prediction mode, the reference area and interpolation filter of the current block are determined.
- the encoding end writes the third information into the code stream, so that the decoding end obtains the third information by decoding the code stream, and then determines whether the current block adopts the interpolation filter prediction mode for prediction based on the third information. If the third information indicates that the current block adopts the interpolation filter prediction mode for prediction, the decoding end uses the interpolation filter prediction mode to predict the current block to obtain the prediction block of the current block.
- if the third information indicates that the current block does not adopt the interpolation filter prediction mode for prediction, the decoding end skips the step of predicting the current block using the interpolation filter prediction mode, further determines the prediction mode of the current block, and predicts the current block using the determined prediction mode to obtain the prediction block of the current block.
- the embodiment of the present application does not limit the specific form of the third information, which may be any indication information that can indicate whether the current block is predicted using the interpolation filter prediction mode.
- the use conditions of the interpolation filter prediction mode are limited, based on which, before determining the reference area and interpolation filter of the current block, it is determined whether the current image block is allowed to be predicted using the interpolation filter prediction mode.
- the embodiment of the present application does not limit the specific method of determining whether the current image block is allowed to use the interpolation filter prediction mode for prediction, that is, does not limit the specific use conditions of the interpolation filter prediction mode.
- in order to improve the prediction accuracy of the interpolation filter prediction mode, the interpolation filter prediction mode is used for blocks that meet the requirements and is not used for blocks that do not meet the requirements.
- before decoding the code stream to obtain the third information, the decoding end also needs to determine whether the position of the current block in the current image meets the preset position requirement, and whether the size of the current block meets the preset block size requirement. If it is determined that the position of the current block in the current image meets the preset position requirement, and the size of the current block meets the preset block size requirement, the code stream is decoded to obtain the third information.
- the embodiment of the present application does not impose any restrictions on the preset position requirement and the preset block size requirement, which are determined based on actual needs.
- for example, assuming that the position of the upper left corner of the current image is (0,0) and the position of the upper left corner of the current block is (x, y), the preset position requirement is that the x value of the current block is greater than or equal to a first preset value XX, and the y value of the current block is greater than or equal to a second preset value YY.
- the embodiment of the present application does not limit the specific values of the first preset value and the second preset value.
- the first preset value and the second preset value are the same.
- the first preset value and the second preset value are both 13, that is, when the distance from the upper edge line of the current block to the upper edge line of the current image is greater than or equal to 13 pixel rows, and the distance from the left edge line of the current block to the left edge line of the current image is greater than or equal to 13 pixel columns, it indicates that the position of the current block in the current image meets the preset position requirements.
- the preset block size requirement is that the width W of the current block is less than or equal to the third preset value A, and the height H of the current block is less than or equal to the fourth preset value B.
- the embodiment of the present application does not limit the specific values of the third preset value and the fourth preset value.
- the third preset value and the fourth preset value are the same.
- the third preset value and the fourth preset value are both 32, that is, when the width and height of the current block are both less than or equal to 32, it indicates that the current block meets the preset block size requirement.
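- The position and size conditions above amount to a simple predicate. In the sketch below, the function name `eip_block_allowed` and the parameter names are hypothetical; the default thresholds of 13 and 32 are the example values mentioned in the text and are not mandated by the application.

```python
def eip_block_allowed(x, y, width, height,
                      min_x=13, min_y=13, max_w=32, max_h=32):
    """Check the preset position and block size requirements for one block.

    (x, y) is the top-left corner of the block relative to the top-left
    corner (0, 0) of the image; width and height are the block dimensions.
    """
    position_ok = x >= min_x and y >= min_y          # far enough from the left/top edges
    size_ok = width <= max_w and height <= max_h     # block not larger than the limit
    return position_ok and size_ok


inside = eip_block_allowed(16, 16, 16, 16)        # meets both requirements
near_edge = eip_block_allowed(8, 16, 16, 16)      # too close to the left edge
too_wide = eip_block_allowed(16, 16, 64, 16)      # wider than the size limit
boundary = eip_block_allowed(13, 13, 32, 32)      # thresholds are inclusive
```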
- before determining whether the current block is predicted using the interpolation filter prediction mode, the decoding end first determines whether the position of the current block in the current image meets the preset position requirement, and whether the size of the current block meets the preset block size requirement. If the position of the current block in the current image meets the preset position requirement, and the size of the current block meets the preset block size requirement, the code stream is decoded to obtain the third information, and based on the third information, it is determined whether the current block is predicted using the interpolation filter prediction mode.
- the decoding end decodes the code stream to obtain the third information.
- the first preset value, the second preset value, the third preset value and the fourth preset value are default values.
- the first preset value, the second preset value, the third preset value and the fourth preset value are values decoded from the bit stream by the decoding end.
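- if the position of the current block in the current image does not meet the preset position requirement, or the size of the current block does not meet the preset block size requirement, the current block is not predicted using the interpolation filtering prediction mode.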
- before determining whether the position of the current block in the current image meets the preset position requirement and whether the size of the current block meets the preset block size requirement, the method also includes: decoding the code stream to obtain second information, where the second information is used to indicate whether the current sequence is allowed to be predicted using the interpolation filtering prediction mode; if the second information indicates that the current sequence is allowed to be predicted using the interpolation filtering prediction mode, then determining whether the position of the current block in the current image meets the preset position requirement and whether the size of the current block meets the preset block size requirement.
- a high-level syntax element, such as second information at the sequence level, indicates whether the current sequence is allowed to be predicted using the interpolation filter prediction mode. If the second information indicates that the current sequence is allowed to be predicted using the interpolation filter prediction mode, the decoding end determines whether the position of the current block in the current image meets the preset position requirement, and whether the size of the current block meets the preset block size requirement. When both requirements are met, the third information is decoded to determine whether the current block is predicted using the interpolation filter prediction mode.
- if the second information indicates that the current sequence is not allowed to be predicted using the interpolation filtering prediction mode, the decoding end skips the above steps of determining whether the position of the current block in the current image meets the preset position requirement and whether the size of the current block meets the preset block size requirement, and skips the step of decoding the third information.
- the embodiment of the present application does not limit the specific form of the second information, which may be any indication information that can indicate whether the current sequence is allowed to be predicted using the interpolation filtering prediction mode.
- the second information is carried in a sequence parameter set (SPS).
- sps_eip_enabled_flag represents the second information
- the embodiments of the present application may further include a general constraints information (GCI) flag to indicate whether the interpolation filter prediction technology is used.
- gci_no_eip_constraint_flag is used to indicate whether the current video enables the interpolation filter prediction technology.
- as shown in Table 2, gci_no_eip_constraint_flag is carried in the general constraints information general_constraints_info().
- if gci_no_eip_constraint_flag is equal to 1, the current video does not enable the interpolation filter prediction technology; that is, the sequence-level flag of the interpolation filter intra-frame prediction technology is constrained to be 0 for all images, which means that no sequence in the current video is allowed to use the interpolation filter intra-frame prediction technology.
- if gci_no_eip_constraint_flag is equal to 0, the current video may enable the interpolation filter prediction technology; that is, the sequence-level flag of the interpolation filter intra-frame prediction technology is not constrained to be 0 for all images.
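- The relationship between the GCI flag and the sequence-level flag can be expressed as a small conformance check. The function name `sps_flag_conforms` is hypothetical, and the sketch assumes the usual reading of such constraint flags: a value of 1 forces sps_eip_enabled_flag to 0, while a value of 0 imposes no constraint.

```python
def sps_flag_conforms(gci_no_eip_constraint_flag, sps_eip_enabled_flag):
    """Return True if the sequence-level flag is consistent with the GCI flag."""
    if gci_no_eip_constraint_flag == 1:
        # The constraint is active: the tool must be disabled at sequence level.
        return sps_eip_enabled_flag == 0
    # No constraint: the sequence-level flag may be 0 or 1.
    return True


constrained_ok = sps_flag_conforms(1, 0)
constrained_bad = sps_flag_conforms(1, 1)
unconstrained = sps_flag_conforms(0, 1)
```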
- cbWidth and cbHeight are the width and height of the current block
- SIZE_A can be understood as the third preset value mentioned above
- SIZE_B can be understood as the fourth preset value
- XX can be understood as the first preset value
- YY can be understood as the second preset value
- x0, y0 represent the coordinate difference between the upper left corner of the current block and the upper left corner of the current image.
- if the sequence-level second information sps_eip_enabled_flag is equal to 1, that is, the current sequence allows the use of the interpolation filter prediction mode, then it is determined whether the position of the current block in the current image meets the preset position requirement, and whether the size of the current block meets the preset block size requirement. If both requirements are met, the third information intra_eip_flag is decoded, and based on the decoded third information intra_eip_flag, it is determined whether the current block is predicted using the interpolation filter prediction mode.
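- The conditional parsing of intra_eip_flag described above can be sketched as follows. The `BitReader` class is a hypothetical stand-in that simply returns pre-decoded flag values (a real decoder would use its entropy decoding engine), the default thresholds are the example values 13 and 32 from the text, and inferring the flag as 0 when it is absent is an assumption.

```python
class BitReader:
    """Hypothetical reader that hands out already-decoded flag values in order."""

    def __init__(self, flags):
        self._flags = list(flags)
        self.position = 0  # number of flags consumed so far

    def read_flag(self):
        flag = self._flags[self.position]
        self.position += 1
        return flag


def parse_intra_eip_flag(reader, sps_eip_enabled_flag, x0, y0, cb_width, cb_height,
                         xx=13, yy=13, size_a=32, size_b=32):
    """Return intra_eip_flag, reading it only when every gating condition holds."""
    if (sps_eip_enabled_flag == 1
            and x0 >= xx and y0 >= yy
            and cb_width <= size_a and cb_height <= size_b):
        return reader.read_flag()
    return 0  # flag not present in the bitstream: treated as "mode not used"


reader_a = BitReader([1])
flag_a = parse_intra_eip_flag(reader_a, 1, 16, 16, 16, 16)   # all conditions met

reader_b = BitReader([1])
flag_b = parse_intra_eip_flag(reader_b, 0, 16, 16, 16, 16)   # disabled at sequence level

reader_c = BitReader([1])
flag_c = parse_intra_eip_flag(reader_c, 1, 4, 16, 16, 16)    # position requirement not met
```

- Skipping the read when a condition fails is what saves bits: the encoder never writes the flag for blocks that cannot use the mode, so the decoder must apply exactly the same conditions.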
- whether the current block adopts the interpolation filter prediction mode can be determined by high-level syntax, such as GCI, sequence level, frame level, slice level, block level, etc. It can also be determined by the size of the current block and the position of the current block.
- when the interpolation filter prediction mode is used for some smaller blocks, the computational cost and computational complexity increase. This is because the computational complexity of the interpolation filter prediction mode in this application is relatively high; if it is also used for small blocks, the number of times the interpolation filter prediction mode is used in decoding the entire image increases, thereby increasing the computational cost and computational complexity of the image. Based on this, in an embodiment of the present application, the interpolation filter prediction mode is allowed to be used for slightly larger blocks. For example, if the size of the current block is greater than or equal to the preset size, the interpolation filter prediction mode is allowed to be used.
- if the size of the current block is smaller than the preset size, the interpolation filter prediction mode is not allowed to be used for the current block.
- the embodiment of the present application does not limit the specific value of the preset size.
- the size of the current block being greater than or equal to the preset size may mean that the number of pixels of the current block is greater than or equal to a preset number, or that at least one of the length and width of the current block is greater than or equal to a preset value, or that the ratio of the length to the width of the current block is greater than or equal to a preset ratio, etc.
- if the current block is in the first row of the current CTU, it is determined that the current block is not allowed to use the interpolation filter prediction mode. That is, if the current block is predicted using the interpolation filter prediction mode, the current block is not in the first row of the current CTU.
- determining whether the current block is allowed to use the interpolation filter prediction mode is also related to the type of the current image. For example, for intra-frame prediction images (i.e., images that use intra-frame prediction when predicting), it is stipulated that the interpolation filter prediction mode can be used for prediction, and for inter-frame prediction images (i.e., images that use inter-frame prediction when predicting), the interpolation filter prediction mode is not allowed to be used for prediction. Based on this, if the current image where the current block is located is an intra-frame prediction image, it is determined that the current block is allowed to use the interpolation filter prediction mode for prediction. If the current image is not an intra-frame prediction image (for example, an inter-frame prediction image), it is determined that the current block is not allowed to use the interpolation filter prediction mode for prediction.
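- The block-level conditions described above (image type, position within the CTU, and block size) can be sketched as a single check. This is only an illustration: the `Block` type, the `ctu_size` parameter, the test `y0 % ctu_size == 0` for "first row of the current CTU", and the threshold `min_samples` are assumptions, since the embodiment does not fix the preset size or the exact position test.

```python
from dataclasses import dataclass


@dataclass
class Block:
    x0: int      # horizontal offset of the block's top-left corner in the image
    y0: int      # vertical offset of the block's top-left corner in the image
    width: int
    height: int


def eip_allowed_for_block(block, ctu_size, is_intra_picture, min_samples=64):
    """Return True if the block may use the interpolation filter prediction mode.

    min_samples is a hypothetical preset size (number of pixels).
    """
    # Only intra-frame prediction images may use the mode.
    if not is_intra_picture:
        return False
    # Assumed position test: a block whose top edge coincides with the top
    # edge of its CTU is treated as lying in the first row of that CTU.
    if block.y0 % ctu_size == 0:
        return False
    # Size test expressed as a minimum number of pixels.
    if block.width * block.height < min_samples:
        return False
    return True
```

- Other size criteria named above (a minimum length or width, or a length-to-width ratio) could replace the pixel-count test without changing the structure of the check.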
- a series of complex intra prediction modes are introduced, such as: template-based intra prediction derivation mode (TIMD), decoder-side intra prediction derivation mode (DIMD), template-based multiple reference line intra prediction (TMRL), spatial geometrical partitioning mode (SGPM), and convolutional cross component model (CCCM).
- TIMD template-based intra prediction derivation mode
- DIMD decoder-side intra prediction derivation mode
- TMRL template-based multiple reference line intra prediction
- SGPM spatial geometrical partitioning mode
- CCCM convolutional cross component model
- unified identification information (such as the first information) is used to uniformly indicate the above intra prediction modes based on template matching technology. For example, if the first information indicates that the technology based on template matching is not turned on, it means that the above intra prediction modes based on template matching technology (i.e., TIMD, DIMD, TMRL, SGPM, CCCM and the interpolation filter prediction mode) are not allowed to be used. If the first information indicates that the technology based on template matching is turned on, it means that the above intra prediction modes based on template matching technology are allowed to be used, and then based on other information, the intra prediction mode specifically used by the current block is further determined.
- determining whether the current block is allowed to use the interpolation filter prediction mode includes: decoding the code stream to obtain first information, the first information is used to indicate whether the template matching-based technology is turned on; based on the first information, determining whether the current block is allowed to use the interpolation filter prediction mode. For example, if the first information indicates that the template matching-based technology is not turned on, it is determined that the current block is not allowed to use the interpolation filter prediction mode for prediction. For another example, if the first information indicates that the template matching-based technology is turned on, the decoding end determines whether the current block is predicted using the interpolation filter prediction mode through other information.
- the embodiment of the present application does not limit the specific form of the first information.
- the first information may be GCI, sequence level, frame level, slice level or block level indication information.
- the decoding end decodes the bitstream to obtain the first information. If the first information indicates that the template matching technology is turned on, the decoding end continues to decode the bitstream to obtain the second information (sps_eip_enabled_flag), and then determines whether the current block is allowed to use the interpolation filter prediction mode based on the second information. If the first information indicates that the template matching technology is not turned on, the decoding end directly determines that the current block is not allowed to be predicted using the interpolation filter prediction mode, and skips the step of decoding the second information.
- the conditions under which the decoder determines whether the current block can be predicted using the interpolation filter prediction mode include at least one of the following:
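- The order in which the indication information is parsed can be sketched as follows. This is a simplified illustration, not the actual syntax: the flags are modelled as one flat sequence of already-decoded 0/1 values, whereas in a real bitstream the first and second information may be carried at different levels (GCI, sequence, frame, slice or block). The function name and its parameters are assumptions.

```python
def parse_eip_usage(flags, block_meets_requirements):
    """Return True if the current block is predicted with the interpolation
    filter prediction mode.

    flags: iterator yielding decoded flag values (0 or 1) in parsing order.
    block_meets_requirements: result of the position and size checks.
    """
    # First information: is the template-matching-based technology turned on?
    if not next(flags):
        return False  # the second information is not parsed at all

    # Second information: sps_eip_enabled_flag.
    if not next(flags):
        return False

    # intra_eip_flag is only present when the block-level checks pass.
    if not block_meets_requirements:
        return False

    # Third information: intra_eip_flag.
    return bool(next(flags))
```

- Each early return corresponds to a flag that is skipped rather than decoded, which is where the saving in signalling comes from.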
- high-level syntax includes sequence level, frame level, slice level, block level, etc., refer to the above description for details;
- the interpolation filtering prediction mode is used to predict the current block to obtain a predicted value of the current block.
- the following introduces the process of using the interpolation filter prediction mode at the decoding end to predict the current block.
- after the decoding end determines that the current block is predicted using the interpolation filtering prediction mode, it first determines the reference area and the interpolation filter of the current block.
- the reference area of the current block is part or all of the reconstructed area around the current block.
- the reconstruction area around the current block may include: an upper reconstruction area of the current block, a left reconstruction area of the current block, an upper right reconstruction area of the current block, a lower left reconstruction area of the current block, and an upper left reconstruction area of the current block.
- the block to be predicted in FIG12 is the current block.
- the embodiment of the present application does not limit the specific shape and size of the reference area of the current block.
- the reference area of the current block includes any one of the reconstruction area above the current block, the reconstruction area on the left side of the current block, the reconstruction area on the upper right side of the current block, the reconstruction area on the lower left side of the current block, and the reconstruction area on the upper left side of the current block.
- the reference area of the current block is the reconstruction area above the current block, or the reference area of the current block is the reconstruction area on the left side of the current block.
- the reference area of the current block includes any two reconstruction areas of the upper reconstruction area of the current block, the left reconstruction area of the current block, the upper right reconstruction area of the current block, the lower left reconstruction area of the current block, and the upper left reconstruction area of the current block.
- the reference area of the current block includes the upper reconstruction area of the current block and the left reconstruction area of the current block.
- the reference area of the current block includes the upper reconstruction area of the current block and the lower left reconstruction area of the current block.
- the reference area of the current block includes any three reconstruction areas of the upper reconstruction area of the current block, the left reconstruction area of the current block, the upper right reconstruction area of the current block, the lower left reconstruction area of the current block, and the upper left reconstruction area of the current block.
- the reference area of the current block includes the upper reconstruction area of the current block, the upper right reconstruction area of the current block, and the upper left reconstruction area of the current block.
- the reference area of the current block includes the left reconstruction area of the current block, the upper left reconstruction area of the current block, and the lower left reconstruction area of the current block.
- the reference area of the current block includes any four reconstruction areas of the upper reconstruction area of the current block, the left reconstruction area of the current block, the upper right reconstruction area of the current block, the lower left reconstruction area of the current block, and the upper left reconstruction area of the current block.
- the reference area of the current block includes the upper reconstruction area of the current block, the upper right reconstruction area of the current block, the upper left reconstruction area of the current block, and the left reconstruction area of the current block.
- the reference area of the current block includes the left reconstruction area of the current block, the upper left reconstruction area of the current block, the lower left reconstruction area of the current block, and the upper reconstruction area of the current block.
- the reference area of the current block includes five reconstruction areas: an upper reconstruction area of the current block, a left reconstruction area of the current block, an upper right reconstruction area of the current block, a lower left reconstruction area of the current block, and an upper left reconstruction area of the current block.
- the decoding end determines a reference area of the current block from among the preset P reference areas.
- the specific manners in which the decoding end determines the reference region of the current block from the preset P reference regions include but are not limited to the following:
- the reference area of the current block is a default area.
- the encoder and decoder default that the reference area of the current block includes at least one of the P reference areas: an upper reconstruction area of the current block, a left reconstruction area of the current block, an upper right reconstruction area of the current block, a lower left reconstruction area of the current block, and an upper left reconstruction area of the current block.
- Method 2 The decoding end decodes the code stream to obtain fourth information, which is used to indicate the type of the reference area of the current block; based on the type of the reference area, the reference area of the current block is determined in preset P reference areas, where P is a positive integer greater than 1.
- the encoder determines the reference area of the current block from among the preset P reference areas. For example, the encoder determines the coding costs corresponding to the P reference areas, and determines the reference area with the smallest coding cost as the reference area of the current block. Then, the type of the reference area with the smallest coding cost is indicated to the decoding end through the fourth information. In this way, the decoding end obtains the fourth information by decoding the bitstream, and then determines the reference area of the current block from the preset P reference areas based on the type of the reference area indicated by the fourth information.
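- The encoder-side selection described above amounts to a minimum search over the candidates. A minimal sketch, assuming the coding cost of each candidate has already been computed by some cost measure that this passage does not specify (for example a rate-distortion cost); the function name and the dictionary layout are hypothetical:

```python
def choose_reference_area(costs):
    """Return the eip_ref_type value whose coding cost is smallest.

    costs: dict mapping each candidate eip_ref_type value to its coding cost.
    On a tie, the candidate that appears first in the dict is returned.
    """
    if not costs:
        raise ValueError("at least one candidate reference area is required")
    return min(costs, key=costs.get)
```

- The returned value is what the encoder would write into the bitstream as the fourth information.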
- the embodiment of the present application does not impose any specific limitation on the specific number and shape of the P reference areas.
- the P reference regions include at least one of a first reference region, a second reference region, and a third reference region.
- the first reference area includes the reconstruction area above, upper right, left, lower left and upper left of the current block.
- the second reference area includes the reconstruction area above, upper right and upper left of the current block.
- the third reference area includes the reconstruction area on the left, lower left and upper left of the current block.
- the block to be predicted in FIG. 13A to FIG. 13C is the current block.
- the embodiment of the present application does not limit the specific form of the fourth information, as long as it is any indication information that can indicate the type of the reference area of the current block.
- eip_ref_type is used to represent the fourth information.
- different types of reference areas are indicated by the value of eip_ref_type.
- the P reference areas are the three reference areas shown in Figures 13A to 13C.
- the P reference areas in the embodiment of the present application also include other reference areas in addition to the above three reference areas, and the embodiment of the present application does not limit this.
- the correspondence between the reference areas and the eip_ref_type values shown in Table 4 above can be adaptively adjusted according to the number of reference areas.
- the decoding end may adopt a decoding method of truncated binary code to decode the code stream to obtain the fourth information.
- the decoding end may use an equal probability decoding method or a context model decoding method to decode the codeword of the truncated binary code.
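- The truncated binary code mentioned above can be sketched as follows, using the standard definition of the code for an alphabet of n symbols. The bits are assumed to be already decoded (for example with the equal probability method) and supplied as an iterator of 0/1 values. How the decoded index maps to a concrete reference area type is given by the tables of the application, which are not reproduced in this sketch.

```python
def decode_truncated_binary(bits, n):
    """Decode one symbol in the range [0, n) from a truncated binary code.

    bits: iterator yielding 0 or 1, most significant bit first.
    n: number of symbols in the alphabet (n >= 1).
    """
    k = n.bit_length() - 1        # floor(log2(n))
    u = (1 << (k + 1)) - n        # number of symbols that use the short k-bit codeword

    # Read the first k bits.
    v = 0
    for _ in range(k):
        v = (v << 1) | next(bits)

    # The first u symbols are coded with k bits only.
    if v < u:
        return v

    # The remaining symbols use one extra bit.
    return ((v << 1) | next(bits)) - u
```

- For three symbols the codewords are 0, 10 and 11; for five symbols they are 00, 01, 10, 110 and 111.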
- the decoding end may also use the following method 3 to determine the reference area of the current block.
- Method 3 Based on the shape of the current block, a reference area of the current block is determined from among the preset P reference areas.
- the first type of reference region is used.
- the second type of reference area is used.
- the third type of reference area is used.
- the correspondence between the P reference regions and the shape of the current block is preset.
- the decoding end can determine the reference region of the current block from the P reference regions according to the shape of the current block through the correspondence between the P reference regions and the shape of the current block.
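- A preset correspondence of this kind can be sketched as a simple lookup. The particular mapping used as the default here (square blocks to the first type, wide blocks to the second type, tall blocks to the third type) is purely an assumption for illustration; this passage does not state which shape selects which reference area, and the numeric type values are hypothetical.

```python
def reference_area_for_shape(width, height, correspondence=None):
    """Pick a reference area type from the shape of the current block.

    correspondence: dict with keys "square", "wide" and "tall" mapping to a
    reference area type; the default mapping below is an assumed example.
    """
    if correspondence is None:
        correspondence = {"square": 0, "wide": 1, "tall": 2}  # assumed mapping

    if width > height:
        shape = "wide"
    elif height > width:
        shape = "tall"
    else:
        shape = "square"
    return correspondence[shape]
```

- Because the choice depends only on information the decoding end already has, no additional indication information needs to be decoded in this method.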
- the following describes the process of determining the interpolation filter of the current block at the decoding end.
- the interpolation filters provided in the embodiments of the present application include, but are not limited to: a square interpolation filter, an interpolation filter whose height is greater than its width, and an interpolation filter whose height is smaller than its width.
- the square interpolation filter includes but is not limited to the 4X4 interpolation filter shown in Figure 14A.
- interpolation filters with a height greater than width include but are not limited to the 5X3 interpolation filter shown in Figure 14B, the 6X2 interpolation filter shown in Figure 14D, and the 7X1 interpolation filter shown in Figure 14G.
- interpolation filters whose height is smaller than their width include but are not limited to the 3X5 interpolation filter shown in FIG. 14C , the 2X6 interpolation filter shown in FIG. 14E , and the 1X7 interpolation filter shown in FIG. 14F .
- the dark grey position represents the current position to be predicted
- the light grey position represents the input position of the interpolation filter, that is, the $\{p_0, p_1, \ldots, p_{N-1}\}$ position.
- the decoding end determines the interpolation filter of the current block from among the preset Q interpolation filters.
- the specific manners in which the decoding end determines the interpolation filter of the current block from the preset Q interpolation filters include but are not limited to the following:
- the interpolation filter of the current block is a default interpolation filter.
- the interpolation filter of the current block is set by default to any one of the Q interpolation filters shown in FIG. 14A to FIG. 14G.
- the default interpolation filter is a 4X4 interpolation filter.
- the decoding end decodes the code stream to obtain the fifth information, which is used to indicate the shape of the interpolation filter of the current block; based on the shape of the interpolation filter of the current block, the interpolation filter of the current block is determined from the preset Q interpolation filters, where Q is a positive integer greater than 1.
- the encoder determines the interpolation filter of the current block from among the preset Q interpolation filters. For example, the encoder determines the coding costs corresponding to the Q interpolation filters, and determines the interpolation filter with the smallest coding cost as the interpolation filter of the current block. Then, the shape of the interpolation filter with the smallest coding cost is indicated to the decoder through the fifth information. In this way, the decoder obtains the fifth information by decoding the bitstream, and then determines the interpolation filter of the current block from among the preset Q interpolation filters based on the shape of the interpolation filter indicated by the fifth information.
- the Q interpolation filters include at least one of a first interpolation filter, a second interpolation filter, and a third interpolation filter
- the first interpolation filter is a square interpolation filter
- the second interpolation filter is a rectangular interpolation filter with a width greater than a height
- the third interpolation filter is a rectangular interpolation filter with a height greater than a width.
- the Q interpolation filters include the plurality of interpolation filters in FIGS. 14A to 14H .
- the embodiment of the present application does not limit the specific form of the fifth information, as long as it is any indication information that can indicate the shape of the interpolation filter of the current block.
- eip_filter_type is used to represent the fifth information.
- the value of eip_filter_type is used to indicate interpolation filters of different shapes.
- the Q interpolation filters are the five interpolation filters shown in FIG. 15, and the correspondence between the five interpolation filters and the eip_filter_type values is shown in Table 6:
- the decoding end may adopt a decoding method of truncated binary code to decode the code stream to obtain the fifth information.
- the preset Q interpolation filters include the five interpolation filters shown in FIG. 15
- the corresponding relationship between the truncated binary code, the eip_filter_type value, and the shape of the interpolation filter is shown in Table 7:
- the five interpolation filter shapes shown in Table 7 and the three reconstruction region types shown in Table 5 have a total of 15 interpolation filter and reconstruction region combinations.
- the decoding end may obtain the fifth information eip_filter_type by decoding the bitstream, and then determine the interpolation filter of the current block in the above Table 7 according to the shape of the interpolation filter indicated by the fifth information eip_filter_type. Similarly, the bitstream is decoded to obtain the fourth information, and then the reference area of the current block is determined in the above Table 5 according to the value of the fourth information eip_ref_type.
- the decoder decodes the bitstream and first obtains the second information sps_eip_enabled_flag at the sequence level, which indicates whether the current sequence allows the interpolation filter prediction mode to be used for prediction. Next, it is determined whether the position of the current block in the current image meets the preset position requirement, and whether the size of the current block meets the preset block size requirement. If both requirements are met, the third information intra_eip_flag is decoded, and the third information intra_eip_flag indicates whether the current block is predicted using the interpolation filter prediction mode.
- the bitstream is decoded to obtain the fourth information eip_ref_type and the fifth information eip_filter_type.
- the fourth information eip_ref_type indicates the type of the reference area of the current block, so that the decoder can obtain the reference area of the current block by looking up the table based on the value of the fourth information eip_ref_type.
- the fifth information eip_filter_type indicates the shape of the interpolation filter of the current block, and based on the value of the fifth information eip_filter_type, the interpolation filter of the current block is obtained by looking up a table.
- the correspondence between the truncated binary code, the eip_filter_type value and the shape of the interpolation filter is shown in Table 9:
- the 7 interpolation filter shapes shown in Table 9 and the 3 reconstruction area types shown in Table 5 have a total of 21 interpolation filter and reconstruction area combinations.
- the decoding end can obtain the reference area and interpolation filter of the current block by decoding the syntax shown in Table 8 and by looking up Table 5 and the above Table 9.
- the correspondence between the truncated binary code, the eip_filter_type value, and the shape of the interpolation filter is shown in Table 10:
- the three interpolation filter shapes shown in Table 10 and the three reconstruction region types shown in Table 5 have a total of nine combinations of interpolation filters and reconstruction regions.
- the decoding end can obtain the reference area and interpolation filter of the current block by decoding the syntax shown in Table 8 and by looking up Table 5 and the above Table 10.
- the correspondence between the truncated binary code, the eip_filter_type value, and the shape of the interpolation filter is shown in Table 11:
- the three interpolation filter shapes shown in Table 11 and the three reconstruction region types shown in Table 5 have a total of nine combinations of interpolation filters and reconstruction regions.
- the decoding end can obtain the reference area and interpolation filter of the current block by decoding the syntax shown in Table 8 and by looking up Table 5 and the above Table 11.
- the correspondence between the truncated binary code, the eip_filter_type value, and the shape of the interpolation filter is shown in Table 12:
- the three interpolation filter shapes shown in Table 12 and the three reconstruction region types shown in Table 5 have a total of nine combinations of interpolation filters and reconstruction regions.
- the decoding end can obtain the reference area and interpolation filter of the current block by decoding the syntax shown in Table 8 and by looking up Table 5 and the above Table 12.
- FIG18B increases the number of taps of the filter, for example, expanding it to 2x8 and 8x2 interpolation filters. The 2x8 and 8x2 filters, like the 4x4 filter, use 15 samples as input and produce 1 output, so their complexity is similar. The interpolation filter shown in FIG18B therefore improves the interpolation effect without increasing the complexity.
- the correspondence between the truncated binary code, the eip_filter_type value, and the shape of the interpolation filter is shown in Table 13:
- the three interpolation filter shapes shown in Table 13 and the three reconstruction area types shown in Table 5 have a total of nine interpolation filter and reconstruction area combinations.
- the decoding end can obtain the reference area and interpolation filter of the current block by decoding the syntax shown in Table 8 and looking up Table 5 and the above Table 13.
- the decoding end may also use the following method 3 to determine the interpolation filter of the current block.
- Method 3 Based on the shape of the current block, determine the interpolation filter of the current block from among the preset Q interpolation filters.
- an interpolation filter of the first shape is used.
- the interpolation filter of the second shape is used.
- an interpolation filter of the third shape is used.
- the correspondence between the Q interpolation filters and the shape of the current block is preset.
- the decoding end can determine the interpolation filter of the current block from the Q interpolation filters according to the shape of the current block through the correspondence between the Q interpolation filters and the shape of the current block.
- the following is an introduction to determining the filter coefficients of the interpolation filter based on the reference area.
- the decoding end determines the filter coefficients of the interpolation filter in at least the following ways:
- Method 1 Use the interpolation filter determined above to slide in the reference area of the current block to construct the Wiener-Hopf equation. Then, solve the Wiener-Hopf equation to obtain the filter coefficients of the interpolation filter.
- N positions corresponding to each position of the reference area are determined according to the shape of the interpolation filter. For example, for position r in the reference area, based on the shape of the interpolation filter, N positions corresponding to position r are determined in the reference area, and the pixel reconstruction values of these N positions are the input of the interpolation filter.
- the position differences between these N positions and position r are $\{p_0, p_1, \ldots, p_{N-1}\}$, where each $p_n$ is a two-dimensional offset.
- $\{c_0, c_1, \ldots, c_{N-1}\}$ are the interpolation filter coefficients at the positions $\{p_0, p_1, \ldots, p_{N-1}\}$.
- the interpolation filter is slid in the reference area of the current block to construct the Wiener-Hopf equation, as shown in formula (3):
- $t[r+p_n]$ is the pixel reconstructed value of the pixel at position $r+p_n$ in the reference area
- t[r] is the pixel reconstructed value of the pixel at position r in the reference area
- in the above formula (3), all parameters except the interpolation filter coefficients $\{c_0, c_1, \ldots, c_{N-1}\}$ are known, so the filter coefficients of the interpolation filter of the current block can be determined by solving the above formula (3).
- the decoding end can solve the Wiener-Hopf equation shown in the above formula (3) by Cholesky decomposition of the autocorrelation coefficient matrix to obtain the filter coefficients of the filter.
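- The sliding and solving steps can be sketched as follows. Formula (3) is not reproduced in this sketch; the code assumes the usual least-squares normal equations R c = b, with R[i][j] equal to the sum over positions r of t[r+p_i]·t[r+p_j] and b[i] equal to the sum over r of t[r+p_i]·t[r]. The function name, the (dy, dx) offset convention, the step parameters and the small diagonal regularisation `eps` are assumptions added for illustration.

```python
import math


def solve_filter_coefficients(area, offsets, step_x=1, step_y=1, eps=1e-9):
    """Estimate the filter coefficients c_n from a 2-D reference area.

    area: list of rows of reconstructed sample values t.
    offsets: list of (dy, dx) position differences p_n.
    """
    h, w, n = len(area), len(area[0]), len(offsets)
    R = [[0.0] * n for _ in range(n)]   # autocorrelation matrix
    b = [0.0] * n                       # cross-correlation vector

    # Slide the filter over the area; skip positions whose inputs fall outside.
    for y in range(0, h, step_y):
        for x in range(0, w, step_x):
            if any(not (0 <= y + dy < h and 0 <= x + dx < w) for dy, dx in offsets):
                continue
            inputs = [area[y + dy][x + dx] for dy, dx in offsets]
            target = area[y][x]
            for i in range(n):
                b[i] += inputs[i] * target
                for j in range(n):
                    R[i][j] += inputs[i] * inputs[j]

    # Assumed safeguard: a tiny diagonal term keeps R positive definite.
    for i in range(n):
        R[i][i] += eps

    # Cholesky decomposition R = L * L^T.
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = R[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]

    # Forward substitution: L z = b.
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]

    # Back substitution: L^T c = z.
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (z[i] - sum(L[k][i] * c[k] for k in range(i + 1, n))) / L[i][i]
    return c
```

- A practical decoder would more likely use fixed-point arithmetic; floating point is used here only to keep the sketch short.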
- the embodiment of the present application does not limit the sliding step length of the interpolation filter within the reference area.
- the horizontal sliding step size and the vertical sliding step size of the interpolation filter in the reference area are equal, both being 1 pixel.
- the horizontal sliding step size and the vertical sliding step size of the interpolation filter in the reference area are not equal.
- the horizontal sliding step size is 2 pixels and the vertical sliding step size is 1 pixel.
- the horizontal sliding step size is 1 pixel and the vertical sliding step size is 2 pixels.
- At least one of the horizontal sliding step length and the vertical sliding step length of the interpolation filter in the reference area is greater than the preset step length.
- the horizontal sliding step length is greater than the preset step length.
- the vertical sliding step length is greater than the preset step length.
- both the horizontal sliding step length and the vertical sliding step length are greater than the preset step length.
- the embodiment of the present application does not limit the specific value of the preset step length. For example, it can be 1, 2, 3, etc.
- the decoding end determines the filter coefficient through the following steps S101-A1 to S101-A4:
- the reference area is de-averaged, and the filter coefficient of the interpolation filter is determined based on the de-averaged reference area. Since the sample values become smaller in magnitude after the reference area is de-averaged, determining the filter coefficient based on the de-averaged reference area can improve the efficiency of determining the filter coefficient.
- the decoding end first determines a first reconstruction area, and the first reconstruction area may be any part of the reconstruction area around the current block.
- the decoding end determines the first reconstruction area around the current block in at least the following ways:
- the decoding end determines a reconstruction area around the current block as the first reconstruction area by default.
- the decoding end determines by default that the area consisting of the row above, the column on the left, and the pixel at the upper left corner of the current block is the first reconstruction area.
- Method 2 determining the first reconstruction area based on the shape of the current block.
- a reconstructed pixel region in an upper row and a left column of the current block is determined as the first reconstructed region.
- the shape of the current block is a rectangle whose width is greater than its height
- a row of reconstructed pixel regions above the current block is determined as the first reconstructed region.
- the shape of the current block is a rectangle whose height is greater than its width
- a column of reconstructed pixel regions on the left side of the current block is determined as the first reconstructed region.
- the manner of determining the first reconstruction area includes but is not limited to the above examples.
- after the decoding end determines the first reconstruction area, it determines the pixel average reconstruction value m based on the reconstruction value of the first reconstruction area.
- the embodiment of the present application does not limit the specific method of determining the pixel average reconstruction value m based on the reconstruction value of the first reconstruction area in the above S101-A2.
- Method 1 the above S101-A2 includes: determining the average value of the reconstruction values of the first reconstruction area as the pixel average reconstruction value m.
- the pixel average reconstruction value m can be calculated by the method shown in Table 14:
- the average of the reconstruction values of the row above and/or the column to the left may be determined as the pixel average reconstruction value m.
- the pixel average reconstruction value m may be calculated by the method shown in Table 15:
- shift calculation can be used instead of division to quickly calculate the pixel average reconstruction value m.
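- Replacing the division by a shift can be sketched as follows for the case where the number of averaged samples is a power of two. The rounding offset of half the sample count is an assumption, and the exact procedures of Table 14 and Table 15 are not reproduced in this sketch.

```python
def mean_by_shift(samples):
    """Integer mean of a list of samples whose length is a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("the number of samples must be a power of two")
    shift = n.bit_length() - 1            # log2(n)
    # Adding n >> 1 before shifting rounds to the nearest integer (assumed).
    return (sum(samples) + (n >> 1)) >> shift
```

- For example, four samples 10, 12, 14 and 16 give (52 + 2) >> 2 = 13.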
- the above S101-A2 includes: determining a pixel average reconstruction value based on the shape of the current block and the reconstruction value of the first reconstruction area.
- the average value of the entire first reconstruction area determined above is determined as the pixel average reconstruction value m.
- the first reconstructed area includes an upper reconstructed area and a left reconstructed area of the current block.
- determining the pixel average reconstruction value based on the shape of the current block and the reconstruction value of the first reconstructed area includes: determining the first area from the upper reconstructed area and the left reconstructed area based on the shape of the current block; determining the average reconstruction value of the first area based on the reconstruction value of the first area; and determining the pixel average reconstruction value based on the average reconstruction value of the first area.
- the mean is determined in the same way as the DC prediction mode. Specifically, based on the shape of the current block, the first area is determined from the reconstruction area above and the reconstruction area on the left of the first reconstruction area.
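- if the width of the current block is greater than its height, the upper reconstruction area is determined as the first area.
- if the height of the current block is greater than its width, the left reconstruction area is determined as the first area.
- if the height of the current block is equal to its width, the upper reconstruction area and the left reconstruction area are determined as the first area.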
- the average reconstruction value of the first area is determined, and then based on the average reconstruction value of the first area, the pixel average reconstruction value m is determined, for example, the average reconstruction value of the first area is determined as the pixel average reconstruction value m.
- if the width of the current block is greater than its height, the average reconstruction value of the reconstruction area above the current block is determined as the pixel average reconstruction value m. If the height of the current block is greater than its width, the average reconstruction value of the reconstruction area on the left side of the current block is determined as the pixel average reconstruction value m. If the height of the current block is equal to its width, the average reconstruction value of the reconstruction area above and the reconstruction area on the left side of the current block is determined as the pixel average reconstruction value m.
- the decoding end may calculate the pixel average reconstruction value m by the method shown in Table 16:
- the reconstruction area i.e., the first area
- the average reconstruction value m of the pixel is quickly calculated.
- division is implemented by a simple shift. When the width and height of the current block differ, the left reconstruction area and the upper reconstruction area have different sizes, and computing the average would otherwise require a large number of division operations. Using a shift avoids this problem, which speeds up the calculation of the pixel average reconstruction value m and improves the prediction efficiency of the current block.
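The shape-dependent averaging with a shift can be sketched as follows. This is a minimal illustration, not the content of Table 16: the function name `dc_style_mean`, the rounding offset added before the shift, and the requirement that both side lengths are powers of two are assumptions of this sketch.

```python
def dc_style_mean(top, left):
    """Average the neighbours chosen by block shape, using a shift instead of a division.

    top  -- reconstructed values of the row above the block (length = block width)
    left -- reconstructed values of the column to the left (length = block height)
    Assumes both lengths are powers of two, so the sample count is one as well.
    """
    width, height = len(top), len(left)
    if width > height:
        samples = list(top)                # wide block: use only the upper area
    elif height > width:
        samples = list(left)               # tall block: use only the left area
    else:
        samples = list(top) + list(left)   # square block: use both areas
    count = len(samples)
    assert count & (count - 1) == 0, "sample count must be a power of two"
    shift = count.bit_length() - 1         # log2(count)
    # Adding count >> 1 before shifting rounds to the nearest integer (assumption).
    return (sum(samples) + (count >> 1)) >> shift
```

Because only one side is used for non-square blocks, the sample count stays a power of two and a single right shift replaces the division.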
- the reconstruction values of the pixels in the reference area are de-averaged based on the pixel average reconstruction value.
- the reconstructed value of the pixel is divided by the above pixel average reconstructed value and then rounded to the integer to obtain the pixel value of the pixel in the reference area after the mean value is removed.
- the decoding end subtracts the pixel average reconstruction value from the reconstruction value of the pixel point in the reference area to obtain the de-averaged pixel value of the pixel point in the reference area. For example, for each pixel point in the reference area, the above pixel average reconstruction value is subtracted from the reconstruction value of the pixel point to obtain the de-averaged pixel value of the pixel point in the reference area.
- the embodiment of the present application does not limit the specific manner in which the decoding end de-averages the reconstructed values of the pixels in the reference area based on the pixel average reconstruction value.
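For the subtraction variant of de-averaging, a minimal sketch could look like this. The function name `remove_mean` and the use of NumPy are illustrative choices, not part of the application.

```python
import numpy as np

def remove_mean(reference_area, m):
    """Subtract the pixel average reconstruction value m from every sample."""
    # A signed type is used so that values below the mean do not wrap around.
    return np.asarray(reference_area, dtype=np.int32) - int(m)
```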
- the decoding end de-averages the reconstructed values of the pixels in the reference area, obtains the pixel values of the de-averaged pixels in the reference area, and then executes the above steps S101-A4, uses the pixel values of the de-averaged pixels in the reference area as the input of the interpolation filter, slides the interpolation filter in the reference area, and obtains the filter coefficients of the interpolation filter.
- the interpolation filter of the current block is an interpolation filter of 5 different shapes, and the reference area of the current block is 3 different shapes.
- the interpolation filter of the current block is slid over the de-averaged reference area of the current block to obtain the filter coefficients of the interpolation filter.
- the interpolation filter can slide horizontally row by row or vertically column by column over the de-averaged reference area.
- the block to be predicted in Figure 21 is the current block.
- the N positions corresponding to each position of the reference area are determined. For example, for position r in the reference area, based on the shape of the interpolation filter, the N positions corresponding to position r are determined in the reference area, and the pixel reconstruction values of these N positions are the input of the interpolation filter.
- the phase position differences between these N positions and position r are {p_0, p_1, ..., p_{N-1}}, where each p_n is a two-dimensional offset.
- {c_0, c_1, ..., c_{N-1}} are the interpolation filter coefficients at the positions {p_0, p_1, ..., p_{N-1}}.
- the interpolation filter is slid in the reference area of the current block, and the constructed Wiener-Hopf equation is shown in formula (4):
- t[r+p_n]-m is the de-averaged reconstructed value of the pixel at position r+p_n in the reference area
- t[r]-m is the de-averaged reconstructed value of the pixel at position r in the reference area.
- in the above formula (4), all parameters except the filter coefficients {c_0, c_1, ..., c_{N-1}} are known, so the filter coefficients of the interpolation filter of the current block can be determined by solving formula (4).
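Formula (4) itself is not reproduced in this text. As a hedged reconstruction from the surrounding definitions, and assuming the coefficients are fitted by least squares over all positions r visited while sliding the filter, the Wiener-Hopf (normal) equations would take the following form:

```latex
\sum_{n=0}^{N-1} c_n \sum_{r} \bigl(t[r+p_n]-m\bigr)\bigl(t[r+p_k]-m\bigr)
  = \sum_{r} \bigl(t[r]-m\bigr)\bigl(t[r+p_k]-m\bigr),
\qquad k = 0, 1, \ldots, N-1
```

Under this reading, the double sums on the left form the autocorrelation matrix, the sums on the right form the cross-correlation vector, and the coefficients c_n are the only unknowns.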
- the decoding end can solve the Wiener-Hopf equation shown in the above formula (4) by Cholesky decomposition of the autocorrelation coefficient matrix to obtain the filter coefficients of the filter.
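A floating-point sketch of this step is given below, assuming the equation is the least-squares normal system built from the de-averaged reference area. The function name `solve_filter_coefficients`, the `(dy, dx)` offset convention, and the small `ridge` term that keeps the matrix positive definite are assumptions of this sketch; a real codec would likely use fixed-point arithmetic.

```python
import numpy as np

def solve_filter_coefficients(ref, offsets, m, ridge=1e-6):
    """Fit interpolation filter coefficients on a de-averaged reference area.

    ref     -- 2-D array of reconstructed values (the reference area)
    offsets -- list of (dy, dx) phase offsets p_n of the filter inputs
    m       -- pixel average reconstruction value
    ridge   -- small diagonal term (an assumption) that keeps the matrix
               positive definite so the Cholesky factorisation succeeds
    """
    t = np.asarray(ref, dtype=np.float64) - m
    height, width = t.shape
    n = len(offsets)
    A = np.zeros((n, n))   # autocorrelation matrix
    b = np.zeros(n)        # cross-correlation vector
    for y in range(height):
        for x in range(width):
            pos = [(y + dy, x + dx) for dy, dx in offsets]
            if any(not (0 <= py < height and 0 <= px < width) for py, px in pos):
                continue   # skip positions whose inputs fall outside the area
            inputs = np.array([t[py, px] for py, px in pos])
            A += np.outer(inputs, inputs)
            b += inputs * t[y, x]
    L = np.linalg.cholesky(A + ridge * np.eye(n))   # A + ridge*I = L @ L.T
    z = np.linalg.solve(L, b)                       # solve L z = b
    return np.linalg.solve(L.T, z)                  # solve L.T c = z
```

When the reference area was generated exactly by a filter of the same shape, the fit recovers that filter's coefficients up to the effect of the ridge term.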
- after the decoding end determines the filter coefficients of the interpolation filter based on the above steps, it executes the following step S102.
- S102: Based on the filter coefficients, use an interpolation filter to perform parallel prediction on at least two pixel points in the current block to determine a prediction block of the current block.
- after the decoding end determines the filter coefficients of the interpolation filter based on the above steps, it uses the interpolation filter to perform interpolation filtering prediction on the current block based on the filter coefficients to obtain a prediction block of the current block.
- when an interpolation filter is used to perform interpolation filtering prediction on the current block, the next pixel is predicted after the prediction of a pixel in the current block is completed. For example, as shown in FIG. 22A, the interpolation filter performs interpolation filtering prediction on each pixel in the current block one by one along the horizontal direction. During prediction, after the prediction of the previous pixel in the horizontal direction is completed, the predicted value of the previous pixel is used as an input pixel value of the interpolation filter of the next pixel to predict the next pixel.
- the decoding end uses an interpolation filter with known filter coefficients to perform interpolation prediction on each position in the current block one by one. Specifically, for the r-th point in the current block, the N positions corresponding to the r-th point are first determined according to the shape of the interpolation filter of the current block. For example, as shown in FIG. 22, in the 4×4 interpolation filter, the dark position is the position of the r-th point to be processed, and the 15 light positions are the N positions corresponding to the r-th point. The block to be predicted in FIG. 22 is the current block. Next, the pixel values of the N positions corresponding to the r-th point are determined.
- if the position is located in the reconstructed area around the current block, the reconstruction value of the position is determined as the pixel value of the position. If the position is located in the current block, the predicted value of the position is determined as the pixel value of the position.
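The choice between a reconstructed value and a predicted value for each filter input can be sketched as below. The function name `input_value` and the dictionary-based storage are hypothetical and serve only to illustrate the rule.

```python
def input_value(pos, block_origin, block_size, recon, pred):
    """Return the filter input at pos = (y, x) in picture coordinates.

    recon -- dict mapping (y, x) to reconstructed values around the block
    pred  -- dict mapping (y, x) to already predicted values inside the block
    """
    y, x = pos
    y0, x0 = block_origin
    height, width = block_size
    inside = y0 <= y < y0 + height and x0 <= x < x0 + width
    # Inside the current block only predictions exist; outside it the
    # neighbouring samples have already been reconstructed.
    return pred[pos] if inside else recon[pos]
```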
- the interpolation filter performs interpolation filtering prediction on each pixel point in the current block one by one along the vertical direction.
- the prediction value of the previous pixel point is used as an input pixel value of the interpolation filter of the next pixel point, and is used to predict the next pixel point.
- when the related art uses the interpolation filter to perform interpolation filtering prediction on the current block, only one pixel point can be predicted at a time, which makes the prediction time-consuming and the prediction efficiency low, thereby affecting the decoding efficiency.
- the embodiment of the present application uses an interpolation filter to perform interpolation filtering prediction on the current block, and performs parallel prediction on at least two points in the current block. That is, the decoding end can use the interpolation filter to perform interpolation filtering prediction on at least two pixel points in the current block at the same time.
- the decoding end uses the interpolation filter to perform interpolation filtering prediction on at least two pixels in the current block at the same time, including at least two implementation methods:
- the first implementation method is that for at least two pixels in the current block, these at least two pixels are adjacent pixels.
- the decoding end first determines the interpolation filter input information corresponding to the at least two pixels, inputs the input information into the interpolation filter for interpolation filter prediction, obtains a prediction value, and determines the prediction value of the at least two pixels based on the prediction value. For example, based on the relevant feature information of the at least two pixels, the prediction value is processed to obtain the prediction values corresponding to the at least two pixels. For another example, the prediction value is determined as the prediction value corresponding to the at least two pixels.
- the decoding end does not impose any restriction on the specific manner of determining the interpolation filter input information corresponding to the at least two pixel points.
- the interpolation filter input values shared by the at least two pixels are determined, and the shared input values are used as the interpolation filter input information.
- the at least two pixels include pixel 1 and pixel 2.
- the N input values corresponding to pixel 1 and the N input values corresponding to pixel 2 are determined. The input values common to both sets are then identified and used as the input values of the interpolation filter. It should be noted that if the N input values corresponding to pixel 1 or pixel 2 include an undecoded value, the undecoded value is discarded.
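One way to picture the shared-input selection for pixel 1 and pixel 2 is as a set intersection over input positions. The sketch below is illustrative only: the function name `shared_input_positions`, the `(dy, dx)` offset list, and the `decoded` set are assumptions.

```python
def shared_input_positions(pixel_1, pixel_2, offsets, decoded):
    """Positions that feed both pixels and whose values are already available.

    pixel_1, pixel_2 -- (y, x) positions of the two pixels
    offsets          -- list of (dy, dx) offsets defining the filter shape
    decoded          -- set of (y, x) positions with a known value
    """
    inputs_1 = {(pixel_1[0] + dy, pixel_1[1] + dx) for dy, dx in offsets}
    inputs_2 = {(pixel_2[0] + dy, pixel_2[1] + dx) for dy, dx in offsets}
    # Keep the inputs common to both pixels and discard undecoded ones.
    return sorted((inputs_1 & inputs_2) & decoded)
```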
- the input value corresponding to the pixel point with the most decoded input values among the at least two pixel points is determined as the interpolation filter input information.
- the at least two pixel points include pixel point 1 and pixel point 2.
- the N input values corresponding to pixel point 1 and the N input values corresponding to pixel point 2 are determined. Among them, the N input values corresponding to pixel point 1 have all been decoded, and the N input values corresponding to pixel point 2 include undecoded input values, so the N input values corresponding to pixel point 1 are used as the interpolation filter input information.
- the second method is to use an interpolation filter to perform interpolation filtering on at least two pixels in the current block at the same time.
- the decoder uses an interpolation filter to perform interpolation filtering prediction on pixel 1 in the current block to obtain the predicted value of pixel 1
- the decoder uses an interpolation filter to perform interpolation filtering prediction on pixel 2 in the current block to obtain the predicted value of pixel 2.
- the embodiment of the present application does not restrict the prediction direction when the decoding end uses the interpolation filter to perform parallel prediction on at least two pixels in the current block.
- the decoding end may use an interpolation filter to perform parallel prediction on at least two pixels in the current block along the horizontal direction.
- the decoding end may use an interpolation filter to perform parallel prediction on at least two pixels in the current block along the vertical direction.
- the decoding end may use an interpolation filter to perform parallel prediction on at least two pixels in the current block along a diagonal direction.
- the above S102 includes the following step S102-A:
- the pixel points to be predicted are located at a corner of the area selected by the interpolation filter (e.g., the lower right corner or the upper left corner).
- the N positions corresponding to the selected pixel point do not include other pixel points on the diagonal line.
- the N positions corresponding to each pixel point on the same diagonal line do not include the pixel points on the diagonal line.
- the N positions corresponding to the pixel point a and the N positions corresponding to the pixel point b are determined respectively, wherein the N positions corresponding to the pixel point a and the N positions corresponding to the determined pixel point b do not include the pixel points on the diagonal line.
- when the decoding end uses the interpolation filter to perform interpolation filtering prediction on the current block, it can perform parallel interpolation filtering prediction on the pixel points on the same diagonal line in the current block along the diagonal direction.
- parallel interpolation filtering prediction is performed on pixel point a and pixel point b on the same diagonal line.
- the shape of the interpolation filter shown in FIG. 23 is only an example, and the shape of the interpolation filter in the embodiment of the present application is not limited to this.
- the left, lower-left, upper-left, upper and upper-right areas of the current block have been decoded. Therefore, the starting point of predicting the current block along the diagonal direction can be determined based on the decoded areas and the shape of the interpolation filter.
- the decoding end performs interpolation filtering prediction on the current block along the diagonal direction starting from the upper left corner of the current block.
- the above S102-A includes the following step S102-A1:
- the decoding end uses the interpolation filter to perform parallel interpolation filtering prediction on the pixel points on the same diagonal line of the current block starting from the upper left corner of the current block along the diagonal direction to obtain a predicted block of the current block.
- the pixel point to be predicted is located at the lower right corner of the area selected by the interpolation filter.
- the embodiment of the present application does not limit the specific direction of the diagonal line.
- the diagonal direction includes at least one of the following: a direction from the upper right to the lower left, and a direction from the lower left to the upper right.
- the diagonal direction includes a direction from the upper right to the lower left.
- the diagonal direction of the current block is from the upper right to the lower left.
- the diagonal direction includes a direction from lower left to upper right.
- the diagonal direction of the current block is from lower left to upper right.
- the diagonal direction includes the direction from the upper right to the lower left and the direction from the lower left to the upper right.
- the diagonal direction of the current block includes two directions: the direction from the upper right to the lower left and the direction from the lower left to the upper right.
- when the decoding end performs parallel prediction on the pixel points located on the same diagonal line in the current block, the specific direction of the diagonal line does not constitute a limitation on the technical solution of the embodiment of the present application.
- the decoding end predicts each time using pixels on a diagonal line in the current block as a unit, and the decoding end predicts pixels on each diagonal line in the current block in parallel in the same process.
- the k-th diagonal line of the current block is used as an example for description.
- the above S102-A1 includes the following S102-A11 and S102-A12 steps:
- the k-th diagonal line can be understood as any diagonal line in the current block shown in FIG. 23, and the k-th diagonal line includes M pixels.
- the decoding end uses an interpolation filter based on the filter coefficient to determine the predicted values of the M pixels in parallel. In other words, the decoding end can determine the predicted values of the M pixels on the k-th diagonal line at the same time, which greatly increases the prediction speed.
- the decoding end determines the predicted values of the 3 pixels in parallel.
- the 3 pixels are respectively recorded as pixel 1, pixel 2 and pixel 3.
- the decoding end uses the interpolation filter to perform interpolation filtering prediction on pixel 1 based on the filter coefficient to obtain the predicted value of pixel 1.
- the interpolation filter is used to perform interpolation filtering prediction on pixel 2 to obtain the predicted value of pixel 2.
- the interpolation filter is used to perform interpolation filtering prediction on pixel 3 to obtain the predicted value of pixel 3.
- the decoding end determines the predicted values of the 3 pixels on the kth diagonal in the current block in parallel during one interpolation filtering prediction process, which greatly improves the speed of interpolation filtering prediction.
- the decoding end can determine the predicted values of the pixels on other diagonals in the current block with reference to the method for determining the predicted values of the pixels on the kth diagonal, and then obtain the predicted block of the current block, thereby improving the prediction speed of the current block and the decoding efficiency.
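The diagonal-by-diagonal processing order can be illustrated by grouping block positions by the sum of their row and column indices. In this sketch, which assumes prediction starts at the upper-left corner, the function name `anti_diagonals` is hypothetical; within each group the positions are listed from the upper right to the lower left.

```python
def anti_diagonals(height, width):
    """Group block positions (y, x) by y + x, from the upper-left corner onward."""
    groups = [[] for _ in range(height + width - 1)]
    for y in range(height):
        for x in range(width):
            groups[y + x].append((y, x))
    return groups
```

For a 4×4 block this yields seven diagonals holding 1, 2, 3, 4, 3, 2 and 1 pixels, so the block is covered in seven parallel steps instead of sixteen sequential ones.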
- the embodiment of the present application does not limit the specific manner in which the decoding end uses an interpolation filter to determine the prediction values of M pixels in parallel based on the filter coefficients.
- because the M pixels are points on the k-th diagonal of the current block, they can be understood as adjacent pixels with similar features. Therefore, in order to reduce the computational complexity, the input value of the interpolation filter corresponding to one or several of the M pixels is determined based on the shape of the interpolation filter. Next, based on the input value of the interpolation filter corresponding to the one or several pixels, the input values of the interpolation filter corresponding to the remaining pixels among the M pixels are determined by, for example, average value calculation, weighted calculation or other calculation methods. Finally, based on the filter coefficients and the input value of the interpolation filter corresponding to each of the M pixels, the predicted values of the M pixels are determined in parallel.
- the above S102-A11 uses the interpolation filter to determine the predicted values of M pixels in parallel, including the following steps:
- when the decoding end determines the predicted values of the M pixels on the k-th diagonal line of the current block in parallel, it determines the pixel values of the N positions corresponding to the M pixels in parallel based on the shape of the interpolation filter, wherein the pixel values of the N positions corresponding to each pixel can be understood as the input values of the interpolation filter corresponding to the pixel. Then, the decoding end determines the predicted values of the M pixels in parallel based on the filter coefficients and the pixel values of the N positions corresponding to the M pixels.
- the k-th diagonal line of the current block includes 3 pixels, and these 3 pixels are respectively recorded as pixel 1, pixel 2 and pixel 3.
- the interpolation filter of the current block is a 4×4 interpolation filter.
- the decoder determines the pixel values of the 15 positions corresponding to pixel 1 based on the shape of the interpolation filter, and uses the pixel values of these 15 positions as the input of the interpolation filter, and determines the predicted value of pixel 1 based on the above-determined filter coefficient.
- the decoder determines the pixel values of the 15 positions corresponding to pixel 2 based on the shape of the interpolation filter, and uses the pixel values of these 15 positions as the input of the interpolation filter, and determines the predicted value of pixel 2 based on the above-determined filter coefficient.
- the decoder determines the pixel values of the 15 positions corresponding to pixel 3 based on the shape of the interpolation filter, and uses the pixel values of these 15 positions as the input of the interpolation filter, and determines the predicted value of pixel 3 based on the above-determined filter coefficient. That is to say, in this embodiment, the decoding end determines the prediction values of three pixels in the current block in parallel at the same time, which greatly improves the prediction speed and further improves the decoding efficiency.
- the decoding end directly multiplies the pixel values of the N positions corresponding to the pixel point by the filter coefficient to obtain the predicted value of the pixel point.
- the decoding end obtains the predicted value of each pixel in the M pixels based on the following formula (5):
- t[r_i+p_n] is the pixel value at position r_i+p_n. If position r_i+p_n is in the current block, t[r_i+p_n] is the predicted value of the pixel at that position. If position r_i+p_n is in the reconstructed area around the current block, t[r_i+p_n] is the reconstructed value of the pixel at that position. pred_{r_i} is the predicted value of the pixel at position r_i in the current block.
- the decoding end can determine the prediction value of each pixel point on the same diagonal line in the current block in parallel.
- the filter coefficient is determined by the reference area after the mean value is removed in the above formula (4), so when determining the prediction value of the current block based on the filter coefficient, the influence of the pixel average reconstruction value m needs to be considered.
- the interpolation filter coefficient determined by the above formula (4) is substituted into the above formula (5) to obtain the prediction value of each point in the current block, and then the prediction value of each point is added to the pixel average reconstruction value m to obtain the final prediction value of each point in the current block, thereby obtaining the prediction block of the current block.
- the above S102-A11-a2 includes the following steps:
- the pixel values of the N positions corresponding to the M pixel points are de-averaged in parallel to obtain the pixel values of the N positions corresponding to the M pixel points after de-averaging;
- the predicted values of the M pixels are determined in parallel.
- the decoding end de-averages the pixel values of the N positions corresponding to the M pixels on the k-th diagonal line of the current block based on the pixel average reconstruction value to obtain the de-averaged pixel values of the N positions corresponding to the M pixels. For example, for any pixel among the M pixels, the pixel average reconstruction value is subtracted from the pixel values of the N positions of the pixel to obtain the de-averaged pixel values of the N positions of the pixel.
- the predicted values of the M pixels are determined in parallel.
- the embodiment of the present application does not limit the specific method of determining the predicted values of the M pixel points in parallel based on the filter coefficients and the de-averaged pixel values of the N positions corresponding to the M pixel points.
- the decoder substitutes the de-averaged pixel values of the N positions of the pixel and the filter coefficients into the above formula (5).
- t[r_i+p_n] in formula (5) is then the de-averaged pixel value of the pixel at position r_i+p_n.
- the pixel average reconstruction value m is added to the predicted value to obtain the final predicted value of the rth point.
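The parallel prediction of one diagonal, including the removal and restoration of the mean, can be sketched with a vectorised product. This sketch assumes that formula (5) is the weighted sum of the N inputs with the filter coefficients, as the description of multiplying pixel values by filter coefficients suggests. The function name `predict_block`, the `canvas` layout and the `(dy, dx)` offsets are hypothetical, and the caller is assumed to provide enough reconstructed samples around the block for every offset.

```python
import numpy as np

def predict_block(canvas, origin, size, offsets, coeffs, m):
    """Fill the current block inside canvas, one anti-diagonal at a time.

    canvas  -- 2-D float array holding reconstructed neighbours; the block
               region is overwritten with predictions
    origin  -- (y0, x0) of the block's upper-left pixel inside canvas
    size    -- (height, width) of the block
    offsets -- (dy, dx) input offsets; each must satisfy dy + dx < 0 so that
               no input lies on the diagonal being predicted
    coeffs  -- filter coefficients c_n, one per offset
    m       -- pixel average reconstruction value
    """
    y0, x0 = origin
    height, width = size
    coeffs = np.asarray(coeffs, dtype=np.float64)
    for k in range(height + width - 1):
        # All pixels with y + x == k form the k-th diagonal.
        ys = np.arange(max(0, k - width + 1), min(height, k + 1))
        xs = k - ys
        # inputs[n, i] = de-averaged value at offset n for the i-th pixel
        inputs = np.stack([canvas[y0 + ys + dy, x0 + xs + dx] - m
                           for dy, dx in offsets])
        # One vectorised product predicts the whole diagonal at once,
        # then the mean is added back.
        canvas[y0 + ys, x0 + xs] = coeffs @ inputs + m
    return canvas[y0:y0 + height, x0:x0 + width]
```

With a single coefficient of 1 on the left neighbour, each row of the block simply repeats the reconstructed sample to its left, which gives a quick sanity check of the processing order.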
- the above S102-A11-a22 includes the following steps:
- the predicted values of the M pixel points are determined in parallel.
- the decoding end limits the prediction value of the current block to a range. Specifically, a second reconstruction area is determined, and the maximum reconstruction value max and the minimum reconstruction value min of the pixel points in the second reconstruction area are determined.
- the embodiment of the present application does not limit the specific method of determining the second reconstruction area around the current block.
- the second reconstructed area of the current block is consistent with the reference area of the current block.
- the second reconstruction area of the current block is consistent with the first reconstruction area of the current block.
- the reconstruction area above, on the left, on the upper right, on the upper left, and on the lower left of the current block is determined as the second reconstruction area.
- the reconstruction area of 13 rows above, 13 columns on the left, 13 rows on the upper right, 13 rows and 13 columns on the upper left, and 13 columns on the lower left of the current block is determined as the second reconstruction area.
- the above-mentioned S102-A11-a221 can be executed before the above-mentioned S102-A11-a222, or after the above-mentioned S102-A11-a222, or synchronously with the above-mentioned S102-A11-a222.
- the embodiment of the present application does not limit the specific method in which the decoding end obtains the first predicted values of the M pixel points in parallel based on the de-averaged pixel values of the N positions corresponding to the M pixel points, the filter coefficients and the pixel average reconstruction value.
- the de-averaged pixel values of the N positions of the pixel are multiplied by the filter coefficients to obtain the second predicted value of the pixel; the second predicted value and the pixel average reconstruction value are added to obtain the first predicted value of the pixel.
- the decoding end obtains the first predicted value of the pixel based on the following formula (6):
- pred r is the first predicted value of the rth pixel among the M pixels, is the second predicted value of the rth point.
- the decoding end obtains a predicted value of the pixel point based on the above formula (6)
- the predicted value is processed in a preset manner to obtain a first predicted value of the pixel point.
- after the decoding end determines the first prediction values of the M pixels based on the above steps, it determines the prediction values of the M pixels in parallel based on the first prediction values, the maximum reconstruction value and the minimum reconstruction value.
- the first prediction value of the pixel is determined as the prediction value of the pixel.
- the minimum reconstruction value is determined as the predicted value of the pixel point.
- the maximum reconstruction value is determined as the predicted value of the pixel point.
- the decoding end determines the predicted value of the pixel point by using the following formula (7):
- Clip indicates that the first predicted value of the r-th point among the M pixels is limited to the range between the minimum reconstruction value min and the maximum reconstruction value max.
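The range limitation can be sketched as a plain clamp. Formula (7) is not reproduced in this text, so the function below, with the hypothetical name `clip_prediction`, only illustrates the three cases described above.

```python
def clip_prediction(first_pred, min_recon, max_recon):
    """Limit a first predicted value to the range [min_recon, max_recon]."""
    # Values inside the range pass through unchanged; values below or above
    # it are replaced by the minimum or maximum reconstruction value.
    return max(min_recon, min(max_recon, first_pred))
```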
- the decoding end can refer to the above method to determine the predicted values of the pixels on each diagonal line in the current block in parallel, and then obtain the predicted value of each point of the current block to form a predicted block of the current block.
- the decoding end performs interpolation filtering prediction on the current block, obtains the prediction block of the current block, and then executes the following steps.
- the decoding end decodes the code stream to obtain the quantization coefficient of the current block, then inversely quantizes the quantization coefficient to obtain the transformation coefficient of the current block, and then inversely transforms the transformation coefficient of the current block to obtain the residual block (or residual value) of the current block.
- the prediction mode of the current block is determined, and the current block is predicted using the prediction mode to obtain the prediction block of the current block, and the prediction block and the residual block are added to obtain the reconstructed block of the current block.
- the embodiment of the present application does not limit the specific method in which the decoding end determines the transform kernel corresponding to the current block.
- the encoding end and the decoding end use a default transform kernel as the transform kernel of the current block.
- after the encoder determines the transform kernel of the current block, the encoder writes the indication information of the transform kernel into the bitstream, so that the decoder determines the transform kernel of the current block by decoding the bitstream.
- the decoding end determines the transformation kernel of the current block through the following steps S103-A and S103-A:
- the traditional intra-frame prediction mode corresponding to the prediction block is determined, and then based on the traditional intra-frame prediction mode, the transformation kernel corresponding to the current block is determined.
- the conventional intra prediction modes currently included in VVC are:
- PLANAR mode, whose intra prediction mode index is 0;
- DC mode, whose intra prediction mode index is 1;
- angular modes, whose intra prediction mode indexes are 2 to 66.
- the arrows in the figure point to the directions predicted by the angle modes in VVC, and the prediction mode indexes used in decoding are 2 to 66.
- when the current block is a non-square block, some angular directions are replaced with wide-angle directions, such as -1 to -14 and 67 to 80 in FIG. 25.
- the intra-frame prediction mode corresponding to the above prediction block is a default intra-frame prediction mode. That is, if the current block is predicted using the interpolation filter prediction mode, when the prediction block is obtained, one of the traditional intra-frame prediction modes is determined as the default intra-frame prediction mode corresponding to the prediction block.
- the decoding end determines the intra prediction mode corresponding to the prediction block through the following steps S103-A1 and S103-A2:
- the intra-frame prediction mode corresponding to the prediction block is determined by counting the intra-frame prediction modes corresponding to the angle values of R points in the prediction block.
- the embodiment of the present application does not limit the specific position and number of the R points in the prediction block used to determine the angle value.
- the R points may be one point in the prediction block, or may be multiple points in the prediction block.
- the decoding end determines the angle value of a point in the prediction block (for example, the center point of the prediction block), and based on the angle value of the point, determines the intra-frame prediction mode corresponding to the point, and then determines the intra-frame prediction mode as the intra-frame prediction mode corresponding to the prediction block.
- the decoding end determines the angle values of these multiple points, and based on the angle values of these multiple points, determines the intra-frame prediction mode corresponding to each of these multiple points, and then determines the intra-frame prediction mode with the largest number of identical intra-frame prediction modes among these multiple points as the intra-frame prediction mode corresponding to the prediction block.
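Selecting the most frequent mode among several points amounts to a majority vote. In the sketch below the function name `block_intra_mode` is hypothetical, and the tie-breaking rule (the mode encountered first wins) is an assumption, since the text does not specify one.

```python
from collections import Counter

def block_intra_mode(point_modes):
    """Return the intra prediction mode that occurs most often among the points."""
    # most_common(1) yields [(mode, count)]; ties resolve to the mode seen first.
    return Counter(point_modes).most_common(1)[0][0]
```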
- the selection of the R points is related to the shape and size of the sliding window. For example, each of the R points is the center point of the sliding window when the sliding window slides in the prediction block.
- the method for determining the angle value of each of the R points is the same.
- the method of determining the angle value of the i-th point among the R points is used as an example for explanation.
- the embodiment of the present application does not limit the specific method of determining the angle value of the point.
- the above S103-A1 includes steps S103-A11 and S103-A12:
- for each point among the R points, for example the i-th point, the decoding end first determines the horizontal gradient and the vertical gradient of the i-th point, and then determines the angle value of the i-th point according to the horizontal gradient and the vertical gradient.
- the embodiment of the present application does not limit the specific method of determining the horizontal gradient and the vertical gradient of the i-th point.
- the horizontal gradient value of the i-th point is determined based on the predicted values of points around the i-th point in the prediction block and the change in the predicted value of the i-th point in the horizontal direction.
- the vertical gradient value of the i-th point is determined based on the predicted values of points around the i-th point in the prediction block and the change in the predicted value of the i-th point in the vertical direction.
- the decoding end determines the prediction value of the point in the sliding window centered on the i-th point in the prediction block; based on the prediction value of the point in the sliding window and the horizontal gradient operator, as well as the vertical gradient operator, the horizontal gradient and vertical gradient of the i-th point are obtained.
- a sliding window is first determined, for example, as shown in FIG26, a sliding window of size 3×3 is determined, and the sliding window is slid in the prediction block.
- the horizontal gradient and vertical gradient of the center point of the sliding window are determined.
- the horizontal gradient and vertical gradient of the i-th point are determined.
- the product of the predicted values of the points in the sliding window and the horizontal gradient operator is determined as the horizontal gradient Gx of the i-th point; the product of the predicted values of the points in the sliding window and the vertical gradient operator is determined as the vertical gradient Gy of the i-th point.
- the predicted values of the points in the sliding window are multiplied by the horizontal gradient operator and then subjected to a preset operation with a preset value to obtain the horizontal gradient Gx of the i-th point; the predicted values of the points in the sliding window are multiplied by the vertical gradient operator and then subjected to a preset operation with a preset value to obtain the vertical gradient Gy of the i-th point.
- the embodiment of the present application does not limit the specific values of the horizontal gradient operator and the vertical gradient operator.
- the horizontal gradient operator Mx and the vertical gradient operator My are:
- the angle value of the i-th point can be determined according to the horizontal gradient and the vertical gradient of the i-th point.
- the inverse tangent value of the ratio of the vertical gradient to the horizontal gradient of the i-th point is determined as the angle value of the i-th point. For example, as shown in formula (8): O = atan(Gy / Gx) (8), where:
- Gx is the horizontal gradient of the i-th point
- Gy is the vertical gradient of the i-th point
- O is the angle value of the i-th point
- atan() is the inverse tangent function.
- the decoding end may also use other methods to determine the angle value of the i-th point. For example, the decoding end adjusts the angle value determined by the above formula (8) to obtain the angle value of the i-th point.
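- as an illustration of steps S103-A11 and S103-A12, the following is a minimal sketch rather than the method of the present application. It assumes the 3×3 Sobel kernels as the horizontal and vertical gradient operators, which the text leaves open; the function names gradients_at and angle_value are hypothetical, and the handling of a zero horizontal gradient is likewise an assumption of the sketch.

```python
import math

# Assumed operators: the 3x3 Sobel kernels. The text does not fix Mx and My.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def gradients_at(pred, row, col, op_x=SOBEL_X, op_y=SOBEL_Y):
    """Return (Gx, Gy) for the 3x3 window centred on pred[row][col]."""
    gx = gy = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            value = pred[row + dr][col + dc]
            gx += value * op_x[dr + 1][dc + 1]
            gy += value * op_y[dr + 1][dc + 1]
    return gx, gy


def angle_value(gx, gy):
    """Angle O = atan(Gy / Gx) in radians.

    Assumption of this sketch: a zero horizontal gradient maps to
    +/- pi/2, and to 0 when both gradients are zero.
    """
    if gx == 0:
        if gy == 0:
            return 0.0
        return math.copysign(math.pi / 2, gy)
    return math.atan(gy / gx)
```

For a prediction block whose values increase only from left to right, the sketch gives Gy = 0 and an angle of 0; for values that increase only from top to bottom, it gives Gx = 0 and an angle of pi/2.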
- the decoding end uses the above method for each of the R points to determine the angle value of each of the R points, and then executes the above S103-A2 to determine the intra-frame prediction mode corresponding to the prediction block based on the angle values of the R points.
- the embodiment of the present application does not limit the specific method of determining the intra-frame prediction mode corresponding to the prediction block based on the angle values of R points.
- the decoding end selects, from the angle values of the R points, the angle value that occurs the most times (denoted angle value 1), matches angle value 1 against the prediction angles of the traditional intra-frame prediction modes to obtain the intra-frame prediction mode corresponding to angle value 1, and determines that mode as the intra-frame prediction mode corresponding to the prediction block.
- the above S103-A2 includes the following steps S103-A21 and S103-A22:
- the decoding end determines the intra-frame prediction mode corresponding to each point based on the angle value of each point in the R points. For example, for each point in the R points, the angle value of the point is matched with the prediction angle of the traditional intra-frame prediction mode to obtain the intra-frame prediction mode corresponding to the angle value of the point. In this way, the intra-frame prediction mode corresponding to each point in the R points can be obtained.
- the intra-frame prediction mode corresponding to the prediction block is determined.
- the intra-frame prediction mode with the largest number of repetitions among the intra-frame prediction modes corresponding to the R points is determined as the intra-frame prediction mode corresponding to the prediction block.
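- the matching and counting in steps S103-A21 and S103-A22 can be sketched as follows. This is only an illustration: the table MODE_ANGLES is a hypothetical placeholder and not the normative angle table of any codec, the function names are invented for the example, and angular wrap-around is ignored for brevity.

```python
from collections import Counter

# Hypothetical table: mode index -> prediction angle in degrees.
# Placeholder values for illustration only.
MODE_ANGLES = {18: 0.0, 34: 45.0, 50: 90.0, 66: 135.0}


def match_mode(angle_deg, mode_angles=MODE_ANGLES):
    """Return the mode whose prediction angle is closest to angle_deg."""
    return min(mode_angles, key=lambda m: abs(mode_angles[m] - angle_deg))


def block_mode_by_vote(angles_deg, mode_angles=MODE_ANGLES):
    """Map each point's angle to a mode and return the most frequent mode."""
    modes = [match_mode(a, mode_angles) for a in angles_deg]
    return Counter(modes).most_common(1)[0][0]
```

With the placeholder table, the angle values 2, 44, 47, 88 and 40 degrees map to modes 18, 34, 34, 50 and 34, so the block is assigned mode 34.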
- the above S103-A22 includes the following steps:
- the decoding end determines the gradient amplitude value corresponding to each of the R points based on the horizontal gradient and the vertical gradient of each of the R points determined above.
- the specific manner in which the decoding end determines the gradient amplitude value corresponding to each of the R points is the same.
- the following takes the determination of the gradient amplitude value corresponding to the i-th point among the R points as an example.
- the embodiment of the present application does not limit the specific manner in which the decoding end determines the gradient amplitude value corresponding to the i-th point based on the horizontal gradient and the vertical gradient of the i-th point.
- the decoding end multiplies the horizontal gradient and the vertical gradient of the i-th point to obtain the gradient amplitude value corresponding to the i-th point.
- the decoding end adds the absolute value of the horizontal gradient and the absolute value of the vertical gradient of the i-th point to obtain the gradient amplitude value corresponding to the i-th point, that is, G = |Gx| + |Gy|, where:
- G is the gradient amplitude value corresponding to the i-th point
- Gx is the horizontal gradient of the i-th point
- Gy is the vertical gradient of the i-th point.
- the decoder can determine the gradient amplitude value corresponding to each of the R points based on the above steps. Next, the decoder performs the above S103-A222 to determine the intra-frame prediction mode corresponding to the prediction block based on the intra-frame prediction modes and gradient amplitude values corresponding to the R points.
- the intra-frame prediction mode corresponding to the point with the largest gradient magnitude value among the R points is determined as the intra-frame prediction mode corresponding to the prediction block.
- the gradient amplitude value corresponding to the point is accumulated on the intra-frame prediction mode corresponding to the point to obtain the accumulated gradient amplitude values of the intra-frame prediction modes corresponding to the R points; and the intra-frame prediction mode with the largest accumulated gradient amplitude value among the intra-frame prediction modes corresponding to the R points is determined as the intra-frame prediction mode corresponding to the prediction block.
- the gradient amplitude value corresponding to each of the R points is accumulated on the corresponding intra-frame prediction mode.
- the intra-frame prediction modes corresponding to point 1 and point 2 of the R points are both intra-frame prediction mode 1
- the gradient amplitude values corresponding to point 1 and point 2 are accumulated to the gradient amplitude value corresponding to intra-frame prediction mode 1.
- the gradient amplitude value histogram shown in FIG27 can be obtained.
- the intra-frame prediction mode with the largest accumulated gradient amplitude value in the gradient amplitude value histogram can be determined as the intra-frame prediction mode corresponding to the prediction block.
- the intra-frame prediction mode corresponding to the dark accumulated gradient amplitude value in FIG27 is determined as the intra-frame prediction mode corresponding to the prediction block.
- if the gradient amplitude values corresponding to the R points are all 0, the first intra-frame prediction mode is determined as the intra-frame prediction mode corresponding to the prediction block.
- if the gradient amplitude values corresponding to all of the R points are 0, the horizontal gradient and the vertical gradient of each of the R points are both 0.
- in this case, the preset first intra-frame prediction mode can be determined as the intra-frame prediction mode corresponding to the prediction block.
- the embodiment of the present application does not limit the type of the first intra-frame prediction mode.
- the first intra-frame prediction mode is the PLANAR mode.
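- the accumulation of gradient amplitude values per intra-frame prediction mode, together with the fallback to the first intra-frame prediction mode, can be sketched as follows. The function names and the use of index 0 for the PLANAR mode are assumptions made for the example; the amplitude uses the sum of absolute gradients described above.

```python
PLANAR = 0  # index used for the PLANAR mode in this sketch (assumption)


def amplitude(gx, gy):
    """Gradient amplitude G = |Gx| + |Gy|."""
    return abs(gx) + abs(gy)


def block_mode_by_histogram(point_modes, point_amplitudes, fallback=PLANAR):
    """Accumulate each point's amplitude on its mode and pick the largest bin.

    Returns `fallback` when there are no points or every amplitude is 0.
    """
    histogram = {}
    for mode, amp in zip(point_modes, point_amplitudes):
        histogram[mode] = histogram.get(mode, 0) + amp
    if not histogram or max(histogram.values()) == 0:
        return fallback
    return max(histogram, key=histogram.get)
```

For example, if points with modes 18, 50, 18 and 34 have amplitudes 10, 25, 20 and 5, mode 18 accumulates 30 and is selected even though the single largest amplitude belongs to mode 50.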
- after the decoding end determines the intra-frame prediction mode corresponding to the prediction block based on the above steps, it determines the transform kernel corresponding to the current block based on the intra-frame prediction mode corresponding to the prediction block.
- the embodiment of the present application does not limit the specific manner in which the decoding end determines the transform kernel corresponding to the current block based on the intra-frame prediction mode corresponding to the prediction block.
- the decoding end searches for an image block whose intra-frame prediction mode is the same as the intra-frame prediction mode corresponding to the prediction block in the decoded image blocks around the prediction block based on the intra-frame prediction mode corresponding to the prediction block, and then determines the transform kernel corresponding to the image block as the transform kernel corresponding to the current block.
- the step of determining the transform kernel corresponding to the current block based on the intra prediction mode corresponding to the prediction block in S103-B above includes the following steps:
- the decoder obtains the correspondence between the preset intra prediction mode and the transform core group.
- Table 17 is only a correspondence between an intra-frame prediction mode and a transform core group involved in an embodiment of the present application.
- the correspondence between the intra-frame prediction mode and the transform core group in the embodiment of the present application includes but is not limited to what is shown in Table 17.
- Each transformation core group includes at least one type of transformation core.
- after obtaining the correspondence between the intra prediction mode and the transform core group as shown in Table 17, the decoder looks up, in that correspondence, the transform core group corresponding to the intra prediction mode of the prediction block, and records this transform core group as the first transform core group.
- for example, if the intra prediction mode corresponding to the prediction block is the angular prediction mode in the 64-angle direction, searching the above Table 17 shows that the transform core group corresponding to this angular prediction mode is 4, and the decoder then determines the transform core corresponding to the current block from the at least one type of transform core included in transform core group 4.
- the transform core is determined as the transform core corresponding to the current block.
- the decoder determines the transform core category corresponding to the current block, and then determines the transform core of the transform core category in the first transform core group as the transform core corresponding to the current block.
- the methods for the decoder to determine the transform kernel type corresponding to the current block include but are not limited to the following:
- the transform kernel category corresponding to the current block is a default category, so the decoding end determines the default category as the transform kernel category corresponding to the current block.
- the encoder writes the transform kernel category corresponding to the current block into the bitstream, so that the decoder obtains the transform kernel category corresponding to the current block by decoding the bitstream.
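- the lookup from intra prediction mode to transform core group, followed by the selection of a transform core by category, can be sketched as follows. Table 17 is not reproduced here, so MODE_RANGE_TO_GROUP and KERNEL_GROUPS are hypothetical stand-ins chosen only so that mode 64 maps to group 4 as in the example above; the function names are also invented for the sketch.

```python
# Hypothetical stand-in for the mode-to-group correspondence:
# ((first mode, last mode), group index).
MODE_RANGE_TO_GROUP = [
    ((0, 1), 0),
    ((2, 17), 1),
    ((18, 33), 2),
    ((34, 49), 3),
    ((50, 66), 4),
]

# Hypothetical kernel groups: group index -> {category: kernel name}.
KERNEL_GROUPS = {
    g: {0: f"group{g}_kernel0", 1: f"group{g}_kernel1"} for g in range(5)
}


def find_group(mode):
    """Return the transform core group index for an intra prediction mode."""
    for (first, last), group in MODE_RANGE_TO_GROUP:
        if first <= mode <= last:
            return group
    raise KeyError(f"no transform core group for mode {mode}")


def select_kernel(mode, category=0):
    """Pick the kernel of the given category from the mode's group.

    `category` is either a default value or a value decoded from the bitstream.
    """
    return KERNEL_GROUPS[find_group(mode)][category]
```

Calling select_kernel(64) uses the default category, while select_kernel(64, 1) models the case where the category is signalled in the bitstream.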
- the decoding end uses the interpolation filter prediction mode to determine the prediction block of the current block, and then determines the traditional intra-frame prediction mode corresponding to the prediction block, and determines the transform kernel corresponding to the current block based on the traditional intra-frame prediction mode corresponding to the prediction block. That is to say, the embodiment of the present application uses the traditional intra-frame prediction mode derived from the interpolation filter prediction to select the transform kernel group of the non-separable primary transform (NSPT) and the non-separable secondary transform kernel (LFNST), so that the determined transform kernel is more in line with the characteristics of the current block, and the accuracy of determining the transform kernel is improved.
- the determination accuracy of the reconstruction value can be improved, and the decoding accuracy of the current block can be improved.
- since the embodiment of the present application determines the transform kernel of the current block through the traditional prediction mode corresponding to the prediction block, it is not necessary to indicate the transform kernel separately, which saves codewords and further improves the video encoding and decoding effect.
- the decoding end determines the transform kernel corresponding to the current block, and then inversely transforms the transform coefficients of the current block based on the transform kernel corresponding to the current block to obtain the residual block of the current block, and obtains the reconstructed block of the current block based on the prediction block and the residual block of the current block.
- the decoding end determines the prediction block of the current block and the transform kernel corresponding to the current block based on the above steps. In this way, the decoding end can decode the code stream to obtain the quantization coefficient of the current block, then dequantize the quantization coefficient to obtain the transform coefficient of the current block, and use the transform kernel corresponding to the current block determined above to de-transform the transform coefficient of the current block to obtain the residual block (or residual value) of the current block. Finally, the decoding end adds the prediction block and the residual block of the current block to obtain the reconstructed block of the current block.
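- the reconstruction chain described above (dequantization, inverse transform, addition of the prediction) can be sketched as follows. This is a simplified illustration under stated assumptions: the block is flattened to a vector, quantization is a single scalar step, and the transform kernel is an orthonormal matrix so that its inverse is its transpose. The 4-point Hadamard matrix and all function names are placeholders, not kernels or APIs of the present application.

```python
def dequantize(levels, step):
    """Scale quantized levels back to transform coefficients."""
    return [level * step for level in levels]


def inverse_transform(coeffs, kernel):
    """Apply the transpose of an orthonormal kernel (rows are basis vectors)."""
    n = len(coeffs)
    return [sum(kernel[k][i] * coeffs[k] for k in range(n)) for i in range(n)]


def reconstruct(pred, levels, step, kernel):
    """Reconstructed block = prediction + inverse-transformed residual."""
    residual = inverse_transform(dequantize(levels, step), kernel)
    return [p + round(r) for p, r in zip(pred, residual)]


# Placeholder orthonormal kernel: 4-point Hadamard scaled by 1/2.
HADAMARD4 = [[0.5 * v for v in row] for row in (
    (1, 1, 1, 1),
    (1, -1, 1, -1),
    (1, 1, -1, -1),
    (1, -1, -1, 1),
)]
```

With a quantization step of 2 and levels [1, 1, 1, 1], the residual is [4, 0, 0, 0], so a flat prediction of 100 reconstructs to [104, 100, 100, 100].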
- the current block is a luma block or a chroma block; that is, the interpolation filtering prediction mode provided in the embodiment of the present application can be used to predict both luma blocks and chroma blocks.
- if the prediction mode of the current block is the interpolation filtering prediction mode and the chroma block corresponding to the current block adopts the direct derivation mode (DM), the PLANAR mode or the intra-frame prediction mode corresponding to the above prediction block is determined as the prediction mode of the chroma block.
- when predicting the current block, the video decoding method provided in the embodiment of the present application first determines the reference area and the interpolation filter of the current block, determines the filter coefficients based on the reference area, and, based on the filter coefficients, uses the interpolation filter to predict at least two pixel points in the current block in parallel to obtain the prediction block of the current block. It then determines the transform kernel corresponding to the current block, and determines the reconstruction value of the current block based on the transform kernel and the prediction block. That is, in the embodiment of the present application, when the interpolation filter is used to perform interpolation filtering prediction on the current block, at least two points in the current block are predicted in parallel, which improves the prediction speed and thus the decoding efficiency.
- FIG28 is a flow chart of a prediction method provided by an embodiment of the present application, and the embodiment of the present application is applied to the video encoder shown in FIG1 and FIG2. As shown in FIG28, the method of the embodiment of the present application includes:
- when encoding the current block, the encoder first determines the prediction mode of the current block, and uses the prediction mode to predict the current block to obtain the prediction block (or prediction value) of the current block. The prediction block is subtracted from the current block to obtain the residual block (or residual value) of the current block. The residual block is then transformed to obtain the transform coefficients, the transform coefficients are quantized to obtain the quantization coefficients, and the quantization coefficients are encoded to obtain the bitstream.
- the encoding end first determines the prediction mode of the current block.
- the encoding end determines the prediction mode of the current block in at least the following ways:
- Mode 1: from multiple candidate prediction modes consisting of the traditional prediction modes and the interpolation filter prediction mode shown in FIG6 or FIG7, the encoder determines the candidate prediction mode with the lowest cost as the prediction mode of the current block. Then, the encoder writes indication information of the prediction mode of the current block into the bitstream. In this way, the decoder obtains the indication information by decoding the bitstream, and then determines the prediction mode of the current block based on the indication information.
- Mode 2: the encoder constructs an intra-frame prediction mode candidate list that includes the interpolation filter prediction mode, and selects the intra-frame prediction mode of the current block from the list. Then, the encoder writes the sequence number (or index number) of the selected intra-frame prediction mode in the candidate list into the bitstream.
- Mode 3: the encoder constructs an intra-frame prediction mode candidate list that includes the interpolation filter prediction mode, and then selects the intra-frame prediction mode of the current block from the list. For example, the cost of each candidate prediction mode in the list on the template of the current block is determined, and the intra-frame prediction mode of the current block is then determined based on these costs.
- when the encoder determines the prediction mode of the current block, it first determines multiple candidate prediction modes, and then determines the prediction mode of the current block from these candidate prediction modes, wherein the multiple candidate prediction modes include the interpolation filtering prediction mode.
- the specific method of determining the prediction mode of the current block from the multiple candidate prediction modes may be that the encoder selects one of the multiple candidate prediction modes and determines it as the prediction mode of the current block. For example, the encoder uses each of the candidate prediction modes to predict the current block, determines the cost corresponding to each candidate prediction mode, which can be a rate-distortion optimization (RDO) cost or a sum of absolute transformed differences (SATD) cost, and then determines the candidate prediction mode with the smallest cost as the prediction mode of the current block.
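- the lowest-cost selection can be sketched as follows. For brevity the sketch uses the sum of absolute differences (SAD) as a stand-in cost, which is an assumption; an encoder would normally plug in an RDO or SATD cost. The function names and the mode labels used in the example are hypothetical.

```python
def sad(block, pred):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_block, row_pred in zip(block, pred)
        for a, b in zip(row_block, row_pred)
    )


def choose_mode(block, predictions, cost=sad):
    """Return (best mode, cost per mode).

    `predictions` maps a candidate mode label to the prediction block
    that the mode produced for `block`.
    """
    costs = {mode: cost(block, pred) for mode, pred in predictions.items()}
    return min(costs, key=costs.get), costs
```

For the block [[10, 12], [11, 13]], a flat prediction of 11 costs 4 and a prediction of [[10, 12], [11, 12]] costs 1, so the second candidate is chosen.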
- the encoding end determines the prediction mode of the current block based on the above method. If the prediction mode of the current block is the interpolation filtering prediction mode, the above step S201 is executed.
- the use conditions of the interpolation filter prediction mode are limited. Based on this, before determining the reference area and the interpolation filter of the current block, it is determined whether the current block is allowed to be predicted using the interpolation filter prediction mode.
- the embodiment of the present application does not limit the specific method of determining whether the current image block is allowed to use the interpolation filter prediction mode for prediction, that is, does not limit the specific use conditions of the interpolation filter prediction mode.
- before determining the prediction mode of the current block from multiple candidate prediction modes, the encoder also needs to determine whether the position of the current block in the current image meets the preset position requirement and whether the size of the current block meets the preset block size requirement.
- the embodiment of the present application does not limit the preset position requirement and the preset block size requirement, which are determined according to actual needs.
- for example, assuming that the position of the upper left corner of the current image is (0,0) and the position of the upper left corner of the current block is (x, y), the preset position requirement is that the x value of the current block is greater than or equal to a first preset value XX and the y value of the current block is greater than or equal to a second preset value YY.
- the embodiment of the present application does not limit the specific values of the first preset value and the second preset value.
- the first preset value and the second preset value are the same.
- the first preset value and the second preset value are both 13, that is, when the distance from the upper edge line of the current block to the upper edge line of the current image is greater than or equal to 13 pixel rows, and the distance from the left edge line of the current block to the left edge line of the current image is greater than or equal to 13 pixel columns, it indicates that the position of the current block in the current image meets the preset position requirements.
- the preset block size requirement is that the width W of the current block is less than or equal to the third preset value A, and the height H of the current block is less than or equal to the fourth preset value B.
- the embodiment of the present application does not limit the specific values of the third preset value and the fourth preset value.
- the third preset value and the fourth preset value are the same.
- the third preset value and the fourth preset value are both 32, that is, when the width and height of the current block are both less than or equal to 32, it indicates that the current block meets the preset block size requirement.
- before determining whether the current block is predicted using the interpolation filter prediction mode, the encoding end first determines whether the position of the current block in the current image meets the preset position requirement and whether the size of the current block meets the preset block size requirement. If both requirements are met, the prediction mode of the current block is determined from the above-mentioned multiple candidate prediction modes including the interpolation filter prediction mode.
- the prediction mode of the current block is determined from the above-mentioned multiple candidate prediction modes including the interpolation filter prediction mode.
- the first preset value, the second preset value, the third preset value and the fourth preset value are default values.
- the encoding end determines that the prediction mode of the current block is not an interpolation filtering prediction mode, so that the encoding end determines the prediction mode of the current block from the candidate prediction modes that do not include the interpolation filtering prediction mode.
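- the position and size checks can be sketched as follows, using the example thresholds given above (13 for both position values and 32 for both size values). The function names and the mode label "interp_filter" are hypothetical, and the thresholds are only the example values from the text.

```python
def meets_position_requirement(x, y, min_x=13, min_y=13):
    """(x, y) is the top-left corner of the block; the image origin is (0, 0)."""
    return x >= min_x and y >= min_y


def meets_size_requirement(width, height, max_width=32, max_height=32):
    """The block must not exceed the preset width and height."""
    return width <= max_width and height <= max_height


def candidate_modes(x, y, width, height, traditional_modes):
    """Add the interpolation filter mode only when both requirements hold."""
    modes = list(traditional_modes)
    if meets_position_requirement(x, y) and meets_size_requirement(width, height):
        modes.append("interp_filter")
    return modes
```

A 16×16 block at (16, 16) gets the interpolation filter mode as a candidate, whereas the same block at (8, 16), or a 64×16 block at (16, 16), does not.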
- before determining whether the position of the current block in the current image meets the preset position requirement and whether the size of the current block meets the preset block size requirement, the method at the encoding end also includes: determining whether the current sequence allows prediction using the interpolation filtering prediction mode; and, if so, determining whether the position of the current block in the current image meets the preset position requirement and whether the size of the current block meets the preset block size requirement.
- a high-level syntax element is used to indicate whether the current sequence is allowed to be predicted using the interpolation filter prediction mode. If the current sequence is allowed to be predicted using the interpolation filter prediction mode, the encoding end determines whether the position of the current block in the current image meets the preset position requirement and whether the size of the current block meets the preset block size requirement. When both requirements are met, the encoding end determines the prediction mode of the current block from candidate prediction modes that include the interpolation filter prediction mode.
- if the encoder determines that the current sequence is not allowed to be predicted using the interpolation filtering prediction mode, the encoder skips the above step S201.
- the encoder writes second information into the bitstream, where the second information is used to indicate whether the current sequence is allowed to be predicted using the interpolation filtering prediction mode.
- the embodiment of the present application does not limit the specific form of the second information, which may be any indication information that can indicate whether the current sequence is allowed to be predicted using the interpolation filtering prediction mode.
- the second information is carried in a sequence parameter set (SPS).
- the embodiments of the present application may further include a general constraints information (GCI) flag to indicate whether the interpolation filter prediction technology is used.
- gci_no_eip_constraint_flag is used to indicate whether the current video enables the interpolation filter prediction technology.
- as shown in Table 2, gci_no_eip_constraint_flag is carried in the general constraints information general_constraints_info().
- whether the current block adopts the interpolation filter prediction mode can be determined by high-level syntax, such as GCI, sequence level, frame level, slice level, block level, etc. It can also be determined by the size of the current block and the position of the current block.
- when the interpolation filter prediction mode is used for some smaller blocks, the computational cost and computational complexity increase. This is because the interpolation filter prediction mode in this application has a high computational complexity. If it is also used for small blocks, the number of times the interpolation filter prediction mode is used in decoding the entire image increases, thereby increasing the computational cost and complexity for the image. Based on this, in an embodiment of the present application, the interpolation filter prediction mode is only allowed to be used for slightly larger blocks. For example, if the size of the current block is greater than or equal to the preset size, the interpolation filter prediction mode is allowed to be used.
- if the size of the current block is smaller than the preset size, the interpolation filter prediction mode is not allowed to be used for the current block.
- the embodiment of the present application does not limit the specific value of the preset size.
- the size of the current block being greater than or equal to the preset size can mean that the number of pixels in the current block is greater than or equal to a preset number, or that at least one of the length and width of the current block is greater than or equal to a preset value, or that the ratio of the length to the width of the current block is greater than or equal to a preset ratio, etc.
- if the current block is in the first row of the current CTU, it is determined that the current block is not allowed to use the interpolation filter prediction mode. That is, if the current block is predicted using the interpolation filter prediction mode, the current block is not in the first row of the current CTU.
- determining whether the current block adopts the interpolation filter prediction mode is also related to the type of the current image. For example, for intra-frame prediction images, it is stipulated that the interpolation filter prediction mode can be used for prediction, and for inter-frame prediction images, the interpolation filter prediction mode is not allowed to be used for prediction. Based on this, if the current image where the current block is located is an intra-frame prediction image, it is determined that the current block is allowed to be predicted using the interpolation filter prediction mode. If the current image is not an intra-frame prediction image (for example, an inter-frame prediction image), it is determined that the current block is not allowed to be predicted using the interpolation filter prediction mode.
- a series of complex intra prediction modes are introduced, such as: template-based intra prediction derivation mode (TIMD for short), decoder-side intra prediction derivation mode (DIMD for short), template-based multiple reference line intra prediction (TMRL for short), spatial geometrical partitioning mode (SGPM for short) and convolutional cross component model (CCCM for short).
- the interpolation filter prediction mode can also be classified as an intra prediction mode based on template matching technology.
- unified identification information (such as the first information) is used to uniformly indicate the above-mentioned intra-frame prediction modes based on template matching technology. For example, if the first information indicates that the technology based on template matching is not turned on, it means that the above-mentioned intra-frame prediction modes based on template matching technology (i.e., TIMD, DIMD, TMRL, SGPM, CCCM and the interpolation filter prediction mode) are not allowed to be used.
- intra-frame prediction modes based on template matching technology, i.e., TIMD, DIMD, TMRL, SGPM, CCCM and the interpolation filter prediction mode
- if the first information indicates that the technology based on template matching is turned on, it means that the above-mentioned intra-frame prediction modes based on template matching technology are allowed to be used, and then based on other information, the specific intra-frame prediction mode used by the current block is further determined.
- determining whether the current block is allowed to use the interpolation filter prediction mode includes: determining to obtain first information, the first information is used to indicate whether the technology based on template matching is turned on; based on the first information, determining whether the current block is allowed to use the interpolation filter prediction mode. For example, if the first information indicates that the technology based on template matching is not turned on, it is determined that the current block is not allowed to use the interpolation filter prediction mode for prediction. For another example, if the first information indicates that the technology based on template matching is turned on, it is determined through other information whether the current block uses the interpolation filter prediction mode for prediction.
- the embodiment of the present application does not limit the specific form of the first information.
- the first information may be GCI, sequence level, frame level, slice level or block level indication information.
- the first information is sequence-level indication information
- the first information is determined first. If the first information indicates that the template matching technology is turned on, the second information (sps_eip_enabled_flag) is determined, and then based on the second information, it is determined whether the current block is allowed to use the interpolation filter prediction mode. If the first information indicates that the template matching technology is not turned on, it is directly determined that the current block is not allowed to be predicted using the interpolation filter prediction mode, and the step of determining the second information is skipped.
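The gating described above can be sketched as follows. This is a minimal illustration, not the normative decision process: the `BlockInfo` type, the function name `eip_allowed`, and the `min_size`/`max_size` bounds are hypothetical, and the picture-type and CTU-row checks correspond to optional embodiments described earlier.

```python
from dataclasses import dataclass


@dataclass
class BlockInfo:
    """Hypothetical container for the properties of the current block."""
    width: int
    height: int
    in_first_ctu_row: bool   # block lies in the first row of the current CTU
    picture_is_intra: bool   # current image is an intra-frame prediction image


def eip_allowed(tm_enabled: bool, sps_eip_enabled_flag: bool, blk: BlockInfo,
                min_size: int = 4, max_size: int = 32) -> bool:
    """Return True if the block may use the interpolation filter prediction mode.

    tm_enabled stands for the first information and sps_eip_enabled_flag for
    the second information. The size bounds are assumed values, chosen only
    to illustrate a "preset block size requirement".
    """
    if not tm_enabled:
        # First information off: the second information is not even examined.
        return False
    if not sps_eip_enabled_flag:
        return False
    if not blk.picture_is_intra:
        return False
    if blk.in_first_ctu_row:
        return False
    return (min_size <= blk.width <= max_size
            and min_size <= blk.height <= max_size)
```

Checking the first information before the second mirrors the order in the text, so a disabled template matching technology short-circuits all later checks.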
- condition for the encoder to determine whether the current block can be predicted using the interpolation filter prediction mode includes at least one of the following:
- high-level syntax includes sequence level, frame level, slice level, block level, etc., refer to the above description for details;
- third information is written into the bitstream, where the third information is used to indicate whether the current block is predicted using the interpolation filter prediction mode.
- the embodiment of the present application does not limit the specific form of the third information, which may be any indication information that can indicate whether the current block is predicted using the interpolation filter prediction mode.
- the process of determining the prediction mode of the current block in the embodiment of the present application may include: first, determining whether the current block is predicted using the interpolation filter prediction mode, for example, the second information at the sequence level indicates that the current sequence allows the use of the interpolation filter prediction mode, and when it is determined that the position of the current block in the current image meets the preset position requirement, and when it is determined that the size of the current block meets the preset block size requirement, it is determined that the current block can be predicted using the interpolation filter prediction mode.
- the filter coefficient is obtained, and the current block is predicted based on the filter coefficient to obtain the predicted value of the current block.
- the prediction mode is roughly screened with other intra-frame prediction mode tools, and several prediction modes with a smaller cost are selected for fine screening, and the final intra-frame prediction mode is determined as the prediction mode of the current block. If it is determined that the current block cannot be predicted using the interpolation filter prediction mode, the screening of the interpolation filter prediction mode is skipped.
- R represents the expected bit overhead of coding the intra-frame prediction mode
- λ is the Lagrange multiplier, which is related to the quantization parameter used in the current coding
- D represents the distortion value between the predicted block and the original block in the current prediction mode.
- $D = \min(\mathrm{SAD} \times 2,\ \mathrm{SATD})$ (11)
- SAD: the sum of absolute differences
- SATD: the sum of absolute transformed differences
- After the encoder determines the cost of each candidate prediction mode, it selects several candidate prediction modes from the multiple candidate prediction modes for fine screening.
- the above prediction modes after rough screening will be further transformed, quantized, inversely quantized, inversely transformed, reconstructed, and the rate-distortion cost of each mode combination (prediction mode + transform mode + quantization mode) will be compared to determine the final prediction mode, transform mode and quantized residual value.
- the rate-distortion cost calculation is still D + λR, but here D represents the SSE (the sum of squared errors) between the reconstructed block and the original block, and R represents the bit overhead of encoding the mode identifier, coefficients, etc. of the current block.
- the encoder determines the candidate prediction mode with the lowest cost in the fine screening process as the prediction mode of the current block.
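The two-stage screening can be illustrated with the following sketch, which assumes NumPy, blocks whose sides are powers of two, and an unnormalized Hadamard-based SATD. The function names are hypothetical, and a real encoder would use integer arithmetic and its own SATD normalization.

```python
import numpy as np


def hadamard(n):
    """Unnormalized n x n Hadamard matrix (n must be a power of two)."""
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h


def sad(orig, pred):
    """Sum of absolute differences."""
    return int(np.abs(orig.astype(np.int64) - pred.astype(np.int64)).sum())


def satd(orig, pred):
    """Sum of absolute Hadamard-transformed differences (unnormalized)."""
    diff = orig.astype(np.int64) - pred.astype(np.int64)
    return int(np.abs(hadamard(diff.shape[0]) @ diff @ hadamard(diff.shape[1])).sum())


def rough_cost(orig, pred, bits, lam):
    """Rough-screening cost D + lambda * R with D = min(2 * SAD, SATD)."""
    return min(2 * sad(orig, pred), satd(orig, pred)) + lam * bits


def fine_cost(orig, recon, bits, lam):
    """Fine-screening cost D + lambda * R with D = SSE of the reconstruction."""
    diff = orig.astype(np.int64) - recon.astype(np.int64)
    return int((diff * diff).sum()) + lam * bits


def rough_screen(orig, candidates, lam, keep=2):
    """candidates: list of (name, predicted_block, mode_bits) tuples.

    Returns the names of the `keep` cheapest modes for fine screening.
    """
    ranked = sorted(candidates, key=lambda c: rough_cost(orig, c[1], c[2], lam))
    return [name for name, _, _ in ranked[:keep]]
```

Only the modes returned by `rough_screen` would go through transform, quantization and reconstruction, after which `fine_cost` selects the final mode.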
- step S201 is executed.
- the following introduces the process of using the interpolation filter prediction mode at the encoding end to predict the current block.
- after the encoder determines that the current block is predicted using the interpolation filter prediction mode, it first determines the reference area and the interpolation filter of the current block.
- the reference area of the current block is part or all of the reconstructed area around the current block.
- the reconstruction area around the current block may include: a reconstruction area above the current block, a reconstruction area to the left of the current block, a reconstruction area to the upper right of the current block, a reconstruction area to the lower left of the current block, and a reconstruction area to the upper left of the current block.
- the embodiment of the present application does not limit the specific shape and size of the reference area of the current block.
- the reference area of the current block includes any one of the reconstruction area above the current block, the reconstruction area on the left side of the current block, the reconstruction area on the upper right side of the current block, the reconstruction area on the lower left side of the current block, and the reconstruction area on the upper left side of the current block.
- the reference area of the current block is the reconstruction area above the current block, or the reference area of the current block is the reconstruction area on the left side of the current block.
- the reference area of the current block includes any two reconstruction areas of the upper reconstruction area of the current block, the left reconstruction area of the current block, the upper right reconstruction area of the current block, the lower left reconstruction area of the current block, and the upper left reconstruction area of the current block.
- the reference area of the current block includes the upper reconstruction area of the current block and the left reconstruction area of the current block.
- the reference area of the current block includes the upper reconstruction area of the current block and the lower left reconstruction area of the current block.
- the reference area of the current block includes any three reconstruction areas of the upper reconstruction area of the current block, the left reconstruction area of the current block, the upper right reconstruction area of the current block, the lower left reconstruction area of the current block, and the upper left reconstruction area of the current block.
- the reference area of the current block includes the upper reconstruction area of the current block, the upper right reconstruction area of the current block, and the upper left reconstruction area of the current block.
- the reference area of the current block includes the left reconstruction area of the current block, the upper left reconstruction area of the current block, and the lower left reconstruction area of the current block.
- the reference area of the current block includes any four reconstruction areas of the upper reconstruction area of the current block, the left reconstruction area of the current block, the upper right reconstruction area of the current block, the lower left reconstruction area of the current block, and the upper left reconstruction area of the current block.
- the reference area of the current block includes the upper reconstruction area of the current block, the upper right reconstruction area of the current block, the upper left reconstruction area of the current block, and the left reconstruction area of the current block.
- the reference area of the current block includes the left reconstruction area of the current block, the upper left reconstruction area of the current block, the lower left reconstruction area of the current block, and the upper reconstruction area of the current block.
- the reference area of the current block includes five reconstruction areas: an upper reconstruction area of the current block, a left reconstruction area of the current block, an upper right reconstruction area of the current block, a lower left reconstruction area of the current block, and an upper left reconstruction area of the current block.
- the specific manners in which the encoder determines the reference area of the current block include but are not limited to the following:
- the reference area of the current block is a default area.
- the encoder and the decoder default that the reference area of the current block includes at least one of the reconstruction area above the current block, the reconstruction area on the left side of the current block, the reconstruction area on the upper right of the current block, the reconstruction area on the lower left of the current block, and the reconstruction area on the upper left of the current block.
- Method 2: determining the first costs when predicting the current block based on P reference regions respectively; determining the reference region with the smallest first cost among the P reference regions as the reference region of the current block.
- the encoder predicts the current block based on the P reference areas, determines the first cost corresponding to each reference area, and then determines the reference area with the smallest first cost among the P reference areas as the reference area of the current block.
- the encoder writes fourth information into the bitstream, and the fourth information indicates the type of the reference area of the current block. That is, in Method 2, the encoder also indicates the determined type of the reference area of the current block to the decoder through the fourth information.
- the embodiment of the present application does not impose any specific limitation on the specific number and shape of the P reference areas.
- the P reference regions include at least one of a first reference region, a second reference region, and a third reference region.
- the first reference area includes the reconstruction areas above, upper right, left, lower left and upper left of the current block.
- the second reference area includes the reconstruction areas above, upper right and upper left of the current block.
- the third reference area includes the reconstruction areas on the left, lower left and upper left of the current block.
- the embodiment of the present application does not limit the specific form of the fourth information, as long as it is any indication information that can indicate the type of the reference area of the current block.
- eip_ref_type is used to represent the fourth information.
- different types of reference areas are indicated by the value of eip_ref_type.
- the P reference areas are the three reference areas shown in Figures 13A to 13C.
- the P reference areas in the embodiment of the present application also include other reference areas in addition to the above three reference areas, and the embodiment of the present application does not limit this.
- the correspondence between the reference areas and the eip_ref_type values shown in Table 4 above can be adaptively adjusted according to the number of reference areas.
- the encoding end may use a truncated binary code encoding method to write the fourth information into the code stream.
- the encoding end may use an equal probability encoding method or a context model encoding method to encode the codeword of the truncated binary code.
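For reference, a standard truncated binary code for an alphabet of n symbols can be generated as in the sketch below. The function name is hypothetical, and the codeword assignment in the tables of this application may differ from this textbook construction.

```python
def truncated_binary(value: int, n: int) -> str:
    """Return the truncated binary codeword of `value` for n symbols.

    With k = floor(log2(n)) and u = 2**(k + 1) - n, the first u values use
    k bits and the remaining n - u values use k + 1 bits.
    """
    if n < 1 or not 0 <= value < n:
        raise ValueError("value must satisfy 0 <= value < n")
    if n == 1:
        return ""                      # a single symbol needs no bits
    k = n.bit_length() - 1             # floor(log2(n))
    u = (1 << (k + 1)) - n             # number of short (k-bit) codewords
    if value < u:
        return format(value, f"0{k}b")
    return format(value + u, f"0{k + 1}b")


# Three reference area types, e.g. eip_ref_type in {0, 1, 2}:
codes_for_three = [truncated_binary(v, 3) for v in range(3)]   # ['0', '10', '11']
```

With three symbols the most frequent type can be given the one-bit codeword, which is why this binarization suits a small, non-power-of-two set of reference area types.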
- the encoder may also use the following Method 3 to determine the reference area of the current block.
- Method 3: Based on the shape of the current block, a reference area of the current block is determined from among the preset P reference areas.
- the first type of reference region is used.
- the second type of reference area is used.
- the third type of reference area is used.
- the correspondence between the P reference regions and the shape of the current block is preset.
- the encoder can determine the reference region of the current block from the P reference regions according to the shape of the current block through the correspondence between the P reference regions and the shape of the current block.
- the following describes the process of determining the interpolation filter of the current block at the encoder end.
- the interpolation filters provided in the embodiments of the present application include, but are not limited to: a square interpolation filter, an interpolation filter whose height is greater than its width, and an interpolation filter whose height is smaller than its width.
- the square interpolation filter includes but is not limited to the 4X4 interpolation filter shown in Figure 14A.
- interpolation filters with a height greater than width include but are not limited to the 5X3 interpolation filter shown in Figure 14B, the 6X2 interpolation filter shown in Figure 14D, and the 7X1 interpolation filter shown in Figure 14G.
- interpolation filters whose height is smaller than their width include but are not limited to the 3X5 interpolation filter shown in FIG. 14C , the 2X6 interpolation filter shown in FIG. 14E , and the 1X7 interpolation filter shown in FIG. 14F .
- the dark grey position represents the current position to be predicted
- the light grey position represents the input position of the interpolation filter, that is, the $\{p_0, p_1, \ldots, p_{N-1}\}$ position.
- the specific manners in which the encoder determines the interpolation filter of the current block include but are not limited to the following:
- the interpolation filter of the current block is a default interpolation filter, for example, the encoder and decoder default the interpolation filter of the current block to any one of the interpolation filters in Figures 14A to 14G.
- the default interpolation filter is a 4X4 interpolation filter.
- the encoder determines an interpolation filter for the current block from among the preset Q interpolation filters.
- the encoder randomly selects an interpolation filter from Q interpolation filters as the interpolation filter of the current block.
- the encoder determines the second cost when using Q interpolation filters to predict the current block respectively; and determines the interpolation filter with the smallest second cost among the Q interpolation filters as the interpolation filter of the current block.
- the encoding end writes fifth information into the bitstream, where the fifth information is used to indicate the shape of the interpolation filter of the current block.
- the encoder determines the interpolation filter of the current block from among the preset Q interpolation filters. For example, the encoder determines the second costs corresponding to the Q interpolation filters, and determines the interpolation filter with the smallest second cost as the interpolation filter of the current block. Then, the shape of the interpolation filter with the smallest second cost is indicated to the decoder through the fifth information. In this way, the decoder obtains the fifth information by decoding the bitstream, and then determines the interpolation filter of the current block from among the preset Q interpolation filters based on the shape of the interpolation filter indicated by the fifth information.
- the Q interpolation filters include at least one of a first interpolation filter, a second interpolation filter, and a third interpolation filter
- the first interpolation filter is a square interpolation filter
- the second interpolation filter is a rectangular interpolation filter with a width greater than a height
- the third interpolation filter is a rectangular interpolation filter with a height greater than a width.
- the Q interpolation filters include the plurality of interpolation filters in FIGS. 14A to 14H .
- the embodiment of the present application does not limit the specific form of the fifth information, as long as it is any indication information that can indicate the shape of the interpolation filter of the current block.
- eip_filter_type is used to represent the fifth information.
- the value of eip_filter_type is used to indicate interpolation filters of different shapes.
- the Q interpolation filters are the five interpolation filters shown in FIG. 15
- the correspondence between the five interpolation filters and the eip_filter_type values is as shown in Table 6.
- the encoding end may use a truncated binary code encoding method to encode the fifth information into the bit stream.
- the preset Q interpolation filters include the five interpolation filters shown in FIG. 15
- the correspondence between the truncated binary code, the eip_filter_type value, and the shape of the interpolation filter is as shown in Table 7.
- the five interpolation filter shapes shown in Table 7 and the three reconstruction region types shown in Table 5 have a total of 15 interpolation filter and reconstruction region combinations.
- the correspondence between the truncated binary code, the eip_filter_type value and the shape of the interpolation filter is shown in Table 9.
- the 7 interpolation filter shapes shown in Table 9 and the 3 reconstruction area types shown in Table 5 have a total of 21 interpolation filter and reconstruction area combinations.
- when the embodiment of the present application includes the three interpolation filters shown in FIG. 17, the correspondence between the truncated binary code, the eip_filter_type value and the shape of the interpolation filter is shown in Table 10.
- the three interpolation filter shapes shown in Table 10 and the three reconstruction region types shown in Table 5 have a total of nine combinations of interpolation filters and reconstruction regions.
- the correspondence between the truncated binary code, the eip_filter_type value and the shape of the interpolation filter is as shown in Table 11.
- the three interpolation filter shapes shown in Table 11 and the three reconstruction region types shown in Table 5 have a total of nine combinations of interpolation filters and reconstruction regions.
- the correspondence between the truncated binary code, the eip_filter_type value and the shape of the interpolation filter is shown in Table 12.
- the three interpolation filter shapes shown in Table 12 and the three reconstruction region types shown in Table 5 have a total of nine combinations of interpolation filters and reconstruction regions.
- when the embodiment of the present application includes the three interpolation filters shown in Figure 19, the correspondence between the truncated binary code, the eip_filter_type value and the shape of the interpolation filter is shown in Table 13.
- the three interpolation filter shapes shown in Table 13 and the three reconstruction area types shown in Table 5 have a total of nine interpolation filter and reconstruction area combinations.
- the encoder may also use the following Method 3 to determine the interpolation filter of the current block.
- Method 3: Based on the shape of the current block, determine the interpolation filter of the current block from among the preset Q interpolation filters.
- an interpolation filter of the first shape is used.
- the interpolation filter of the second shape is used.
- an interpolation filter of the third shape is used.
- the correspondence between the Q interpolation filters and the shape of the current block is preset.
- the encoding end can determine the interpolation filter of the current block from the Q interpolation filters according to the shape of the current block through the correspondence between the Q interpolation filters and the shape of the current block.
- after the encoder determines the reference area and the interpolation filter of the current block based on the above steps, it determines the prediction block of the current block based on the reference area and the interpolation filter.
- the following introduces how to determine the filter coefficients of the interpolation filter based on the reference area.
- the manners for determining the filter coefficients of the interpolation filter include at least the following:
- Method 1: Slide the interpolation filter determined above over the reference area of the current block to construct the Wiener-Hopf equation. Then, solve the Wiener-Hopf equation to obtain the filter coefficients of the interpolation filter.
- N positions corresponding to each position of the reference area are determined according to the shape of the interpolation filter. For example, for position r in the reference area, based on the shape of the interpolation filter, N positions corresponding to position r are determined in the reference area, and the pixel reconstruction values of these N positions are the input of the interpolation filter.
- the position differences between these N positions and position r are $\{p_0, p_1, \ldots, p_{N-1}\}$, where each $p_n$ is a two-dimensional offset.
- $\{c_0, c_1, \ldots, c_{N-1}\}$ are the interpolation filter coefficients at the positions $\{p_0, p_1, \ldots, p_{N-1}\}$.
- the interpolation filter is slid in the reference area of the current block to construct the Wiener-Hopf equation, as shown in formula (3).
- the filter coefficient of the interpolation filter of the current block can be determined by solving the above formula (3).
- the encoding end can solve the Wiener-Hopf equation shown in the above formula (3) by Cholesky decomposition of the autocorrelation coefficient matrix to obtain the filter coefficients of the interpolation filter.
- the embodiment of the present application does not limit the sliding step length of the interpolation filter within the reference area.
- the horizontal sliding step size and the vertical sliding step size of the interpolation filter in the reference area are equal, both being 1 pixel.
- the horizontal sliding step size and the vertical sliding step size of the interpolation filter in the reference area are not equal.
- the horizontal sliding step size is 2 pixels and the vertical sliding step size is 1 pixel.
- the horizontal sliding step size is 1 pixel and the vertical sliding step size is 2 pixels.
- At least one of the horizontal sliding step length and the vertical sliding step length of the interpolation filter in the reference area is greater than the preset step length.
- the horizontal sliding step length is greater than the preset step length.
- the vertical sliding step length is greater than the preset step length.
- both the horizontal sliding step length and the vertical sliding step length are greater than the preset step length.
- the embodiment of the present application does not limit the specific value of the preset step length. For example, it can be 1, 2, 3, etc.
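The coefficient derivation of Method 1 can be sketched as follows, assuming NumPy and floating-point arithmetic. The function name, the boolean `mask` describing the reference area, the `(dy, dx)` offset convention and the small regularization term `eps` are assumptions for illustration, and formula (3) itself is not reproduced here. For each position r whose N tap positions r + p_n all lie inside the reference area, the tap samples are accumulated into an autocorrelation matrix and a cross-correlation vector, and the resulting normal equations are solved by Cholesky decomposition.

```python
import numpy as np


def solve_filter_coefficients(rec, mask, offsets, step_x=1, step_y=1, eps=1e-6):
    """Least-squares filter coefficients from a reconstructed reference area.

    rec     : 2-D array of reconstructed samples
    mask    : 2-D boolean array, True where a sample belongs to the reference area
    offsets : list of (dy, dx) tap offsets p_n relative to the target position r
    step_x, step_y : horizontal and vertical sliding step sizes in pixels
    eps     : assumed regularization keeping the matrix positive definite
    """
    n = len(offsets)
    h, w = rec.shape
    autocorr = np.zeros((n, n))
    crosscorr = np.zeros(n)
    for y in range(0, h, step_y):
        for x in range(0, w, step_x):
            if not mask[y, x]:
                continue
            taps = [(y + dy, x + dx) for dy, dx in offsets]
            # Skip positions whose filter support leaves the reference area.
            if any(not (0 <= ty < h and 0 <= tx < w and mask[ty, tx])
                   for ty, tx in taps):
                continue
            v = np.array([rec[ty, tx] for ty, tx in taps], dtype=np.float64)
            autocorr += np.outer(v, v)
            crosscorr += v * float(rec[y, x])
    autocorr += eps * np.eye(n)
    # Cholesky: autocorr = L @ L.T, then two triangular systems.
    lower = np.linalg.cholesky(autocorr)
    z = np.linalg.solve(lower, crosscorr)
    return np.linalg.solve(lower.T, z)
```

A larger sliding step reduces the number of accumulated positions and hence the cost of building the equations, at the price of using fewer training samples.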
- the encoder determines the filter coefficients through the following steps S201-A1 to S201-A4:
- the reference area is de-averaged, and the filter coefficient of the interpolation filter is determined based on the de-averaged reference area. Since the sample values become smaller in magnitude after the reference area is de-averaged, determining the filter coefficient based on the de-averaged reference area can improve the efficiency of determining the filter coefficient.
- the encoding end first determines a first reconstruction area, and the first reconstruction area may be any part of the reconstruction area around the current block.
- the encoding end determines the first reconstruction area around the current block in at least the following ways:
- the encoder determines a reconstruction area around the current block as the first reconstruction area by default.
- the encoder uses, by default, an area consisting of a row above, a column on the left, and a pixel point in the upper left corner of the current block as the first reconstruction area.
- Method 2: determining the first reconstruction area based on the shape of the current block.
- a reconstructed pixel region in an upper row and a left column of the current block is determined as the first reconstructed region.
- the shape of the current block is a rectangle whose width is greater than its height
- a row of reconstructed pixel regions above the current block is determined as the first reconstructed region.
- the shape of the current block is a rectangle whose height is greater than its width
- a column of reconstructed pixel regions on the left side of the current block is determined as the first reconstructed region.
- the manner of determining the first reconstruction area includes but is not limited to the above examples.
- After determining the first reconstruction area, the encoding end determines the pixel average reconstruction value m based on the reconstruction value of the first reconstruction area.
- the embodiment of the present application does not limit the specific method of determining the pixel average reconstruction value m based on the reconstruction value of the first reconstruction area in the above S201-A2.
- Method 1: the above S201-A2 includes: determining the average value of the reconstruction values of the first reconstruction area as the pixel average reconstruction value m.
- the pixel average reconstruction value m can be calculated by the method shown in Table 13.
- the average of the reconstruction values of the row above and/or the column to the left may be determined as the pixel average reconstruction value m.
- the pixel average reconstruction value m may be calculated by the method shown in Table 15.
- shift calculation can be used instead of division to quickly calculate the pixel average reconstruction value m.
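As an illustration of replacing the division by a shift, the sketch below computes a rounded integer mean when the number of samples is a power of two. The function name is hypothetical, and how sample counts that are not powers of two are handled is not specified here.

```python
def mean_by_shift(values):
    """Rounded integer mean of `values` using a right shift instead of division.

    Only valid when len(values) is a power of two; adding half the count
    before shifting rounds to the nearest integer.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("sample count must be a power of two")
    shift = n.bit_length() - 1          # log2(n)
    return (sum(values) + (n >> 1)) >> shift
```

For example, four samples 1, 2, 3, 4 give (10 + 2) >> 2 = 3.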
- the above S201-A2 includes: determining a pixel average reconstruction value based on the shape of the current block and the reconstruction value of the first reconstruction area.
- the average value of the entire first reconstruction area determined above is determined as the pixel average reconstruction value m.
- the first reconstructed area includes an upper reconstructed area and a left reconstructed area of the current block.
- determining the pixel average reconstruction value based on the shape of the current block and the reconstruction value of the first reconstructed area includes: determining the first area from the upper reconstructed area and the left reconstructed area based on the shape of the current block; determining the average reconstruction value of the first area based on the reconstruction value of the first area; and determining the pixel average reconstruction value based on the average reconstruction value of the first area.
- the mean is determined in the same way as the DC prediction mode. Specifically, based on the shape of the current block, the first area is determined from the reconstruction area above and the reconstruction area on the left of the first reconstruction area.
- the upper reconstruction area is determined as the first area.
- the left reconstruction area is determined as the first area.
- the upper reconstruction area and the left reconstruction area are determined as the first area.
- the average reconstruction value of the first area is determined, and then based on the average reconstruction value of the first area, the pixel average reconstruction value m is determined, for example, the average reconstruction value of the first area is determined as the pixel average reconstruction value m.
- the average reconstruction value of the reconstruction area above the current block is determined as the pixel average reconstruction value m. If the shape of the current block is that the height is greater than the width, the average reconstruction value of the reconstruction area on the left side of the current block is determined as the pixel average reconstruction value m. If the shape of the current block is that the height is equal to the width, the average reconstruction value of the reconstruction area above and the reconstruction area on the left side of the current block is determined as the pixel average reconstruction value m.
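The shape-dependent choice above can be sketched as follows, assuming `above` holds the reconstructed samples of the row above the current block (one per column) and `left` holds those of the left column (one per row). The function name and the rounding are illustrative assumptions.

```python
def dc_style_mean(above, left):
    """Pixel average reconstruction value m chosen by block shape.

    len(above) is the block width and len(left) is the block height.
    Wider blocks use the row above, taller blocks use the left column,
    and square blocks use both.
    """
    width, height = len(above), len(left)
    if width > height:
        samples = list(above)
    elif height > width:
        samples = list(left)
    else:
        samples = list(above) + list(left)
    # Integer mean with rounding to nearest.
    return (sum(samples) + len(samples) // 2) // len(samples)
```

Because block sides are typically powers of two, each of the three cases averages a power-of-two number of samples, so the division could be replaced by a shift.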
- after the encoding end determines the pixel average reconstruction value, it removes the average from the reconstruction values of the pixels in the reference area based on the pixel average reconstruction value.
- the reconstructed value of the pixel is divided by the above pixel average reconstructed value and then rounded to an integer to obtain the pixel value of the pixel in the reference area after the mean value is removed.
- the encoder subtracts the pixel average reconstruction value from the reconstruction value of the pixel point in the reference area to obtain the pixel value of the pixel point in the reference area after the average value is removed. For example, for each pixel point in the reference area, the pixel average reconstruction value is subtracted from the reconstruction value of the pixel point to obtain the pixel value of the pixel point in the reference area after the average value is removed.
- the embodiment of the present application does not limit the specific manner in which the encoding end de-averages the reconstructed values of the pixels in the reference area based on the pixel average reconstruction value.
- the encoding end de-averages the reconstructed values of the pixels in the reference area, obtains the pixel values of the de-averaged pixels in the reference area, and then executes the above steps S201-A4, uses the pixel values of the de-averaged pixels in the reference area as the input of the interpolation filter, slides the interpolation filter in the reference area, and obtains the filter coefficients of the interpolation filter.
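The subtraction form of the de-averaging step can be sketched as below, assuming NumPy, a boolean `mask` marking the reference area, and an already computed pixel average reconstruction value `m`. The function name is hypothetical. The result is signed, so it is held in a wider integer type than the reconstructed samples.

```python
import numpy as np


def remove_mean(rec, mask, m):
    """Subtract m from every reference-area sample; other samples are untouched.

    rec  : 2-D array of reconstructed samples (for example uint8 or uint16)
    mask : 2-D boolean array, True where a sample belongs to the reference area
    m    : pixel average reconstruction value
    """
    out = rec.astype(np.int32)   # astype copies, and int32 can hold negatives
    out[mask] -= int(m)
    return out
```

The de-averaged samples returned here would then be the input of the interpolation filter when it is slid over the reference area.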
- the interpolation filter of the current block is an interpolation filter of 5 different shapes and the reference area of the current block is a reference area of 3 different types
- the interpolation filter of the current block is slid on the reference area after the mean value is removed from the current block to obtain the filter coefficient of the interpolation filter.
- the interpolation filter can slide row by row in the horizontal direction or column by column in the vertical direction on the reference area after the mean value is removed.
- the N positions corresponding to each position of the reference area are determined. For example, for position r in the reference area, based on the shape of the interpolation filter, the N positions corresponding to position r are determined in the reference area, and the pixel reconstruction values of these N positions are the input of the interpolation filter.
- the position offsets between these N positions and position r are $\{p_0, p_1, \ldots, p_{N-1}\}$, where each $p_n$ is a two-dimensional representation.
- $\{c_0, c_1, \ldots, c_{N-1}\}$ are the interpolation filter coefficients at the positions $\{p_0, p_1, \ldots, p_{N-1}\}$.
- the interpolation filter is slid in the reference area of the current block, and the constructed Wiener-Hopf equation is shown in formula (4).
- the filter coefficient of the interpolation filter of the current block can be determined by solving the above formula (4).
- the encoding end can solve the Wiener-Hopf equation shown in the above formula (4) by Cholesky decomposition of the autocorrelation coefficient matrix to obtain the filter coefficients of the filter.
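- As an illustration of the Cholesky-based solution mentioned above, the sketch below assumes that formula (4) has the usual normal-equation form R c = b, where R accumulates the autocorrelation of the N filter inputs and b their cross-correlation with the target sample. That form, the helper names and the sample layout are assumptions of this sketch; formula (4) itself is not reproduced here. The code also assumes R is positive definite.

```python
import math


def build_normal_equations(samples):
    """samples: list of (inputs, target) pairs gathered while sliding the filter."""
    n = len(samples[0][0])
    R = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for x, t in samples:
        for i in range(n):
            b[i] += x[i] * t              # cross-correlation with the target
            for j in range(n):
                R[i][j] += x[i] * x[j]    # autocorrelation of the inputs
    return R, b


def cholesky_solve(R, b):
    """Solve R c = b with R = L L^T (R must be symmetric positive definite)."""
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = R[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    y = [0.0] * n                         # forward substitution: L y = b
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    c = [0.0] * n                         # back substitution: L^T c = y
    for i in reversed(range(n)):
        c[i] = (y[i] - sum(L[k][i] * c[k] for k in range(i + 1, n))) / L[i][i]
    return c
```

- In this sketch, feeding in samples generated by a known two-tap filter recovers that filter's coefficients, which is a convenient sanity check before using real reference-area data.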
- After the encoding end determines the filter coefficients of the interpolation filter based on the above steps, it executes the following step S202.
- S202 Based on the filter coefficients, use an interpolation filter to perform parallel prediction on at least two pixel points in the current block to determine a prediction block of the current block.
- After the encoder determines the filter coefficients of the interpolation filter based on the above steps, it uses the interpolation filter to perform interpolation filtering prediction on the current block based on the filter coefficients to obtain a prediction block of the current block.
- When an interpolation filter is used to perform interpolation filtering prediction on the current block, the next pixel is predicted only after the prediction of the previous pixel is completed. For example, as shown in FIG. 22A, the interpolation filter performs interpolation filtering prediction on each pixel in the current block one by one along the horizontal direction. During prediction, after the prediction of the previous pixel in the horizontal direction is completed, the predicted value of the previous pixel is used as an input pixel value of the interpolation filter of the next pixel to predict the next pixel.
- the encoding end uses an interpolation filter with known filter coefficients to perform interpolation prediction on each position in the current block one by one. Specifically, for the r-th point in the current block, first determine the N positions corresponding to the r-th point according to the shape of the interpolation filter of the current block. For example, as shown in FIG. 22, in the 4×4 interpolation filter, the dark position is the position of the r-th point to be processed, and the 15 light positions are the N positions corresponding to the r-th point. The block to be predicted in FIG. 22 is the current block. Next, the pixel values of the N positions corresponding to the r-th point are determined.
- If the position is located in the reconstructed area outside the current block, the reconstruction value of the position is determined as the pixel value of the position. If the position is located in the current block, the predicted value of the position is determined as the pixel value of the position.
- the interpolation filter performs interpolation filtering prediction on each pixel in the current block one by one along the vertical direction.
- the prediction value of the previous pixel is used as an input pixel value of the interpolation filter of the next pixel to predict the next pixel.
- When the related art uses the interpolation filter to perform interpolation filtering prediction on the current block, only one pixel can be predicted at a time, which makes the prediction time-consuming and the prediction efficiency low, thereby affecting the coding efficiency.
- When an embodiment of the present application uses an interpolation filter to perform interpolation filtering prediction on the current block, at least two points in the current block are predicted in parallel. That is, the encoding end can use the interpolation filter to perform interpolation filtering prediction on at least two pixel points in the current block at the same time.
- the encoder uses an interpolation filter to perform interpolation filtering prediction on at least two pixels in the current block at the same time, including at least two implementation methods:
- the first implementation method is that for at least two pixels in the current block, these at least two pixels are adjacent pixels.
- the encoding end first determines the interpolation filter input information corresponding to the at least two pixels, inputs the input information into the interpolation filter for interpolation filter prediction, obtains a prediction value, and determines the prediction value of the at least two pixels based on the prediction value. For example, based on the relevant feature information of the at least two pixels, the prediction value is processed to obtain the prediction values corresponding to the at least two pixels. For another example, the prediction value is determined as the prediction value corresponding to the at least two pixels.
- the embodiment of the present application does not limit the specific manner in which the encoding end determines the interpolation filter input information corresponding to the at least two pixel points.
- the interpolation filter input values shared by the at least two pixels are determined, and the shared input values are used as the interpolation filter input information.
- the at least two pixels include pixel 1 and pixel 2.
- determine the N input values corresponding to pixel 1 and the N input values corresponding to pixel 2. Determine the input values that are common to the N input values corresponding to pixel 1 and the N input values corresponding to pixel 2, and use these common input values as the input values of the interpolation filter. It should be noted that if the N input values corresponding to pixel 1 or pixel 2 include uncoded values, the uncoded values are discarded.
- the input value corresponding to the pixel point with the most encoded input values among the at least two pixel points is determined as the interpolation filter input information.
- the at least two pixel points include pixel point 1 and pixel point 2.
- the N input values corresponding to pixel point 1 and the N input values corresponding to pixel point 2 are determined. Among them, the N input values corresponding to pixel point 1 are all encoded, and the N input values corresponding to pixel point 2 include unencoded input values, so the N input values corresponding to pixel point 1 are used as the interpolation filter input information.
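- The two ways of choosing shared interpolation filter input information described above can be sketched as follows. Positions are modelled as (x, y) tuples and `coded` is a set of already-encoded positions; these representations and the function names are hypothetical.

```python
def common_inputs(positions_1, positions_2, coded):
    """Input positions shared by both pixels, with uncoded positions discarded."""
    shared = set(positions_1) & set(positions_2)
    return sorted(p for p in shared if p in coded)


def most_coded_inputs(candidates, coded):
    """Return the candidate position list that has the most encoded entries."""
    return max(candidates, key=lambda ps: sum(p in coded for p in ps))
```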
- the second method is to use an interpolation filter to perform interpolation filtering on at least two pixels in the current block at the same time. For example, at time t, the encoder uses an interpolation filter to perform interpolation filtering prediction on pixel 1 in the current block to obtain the predicted value of pixel 1, and at the same time, uses an interpolation filter to perform interpolation filtering prediction on pixel 2 in the current block to obtain the predicted value of pixel 2.
- the embodiment of the present application does not restrict the prediction direction when the encoder uses an interpolation filter to perform parallel prediction on at least two pixels in the current block.
- the encoding end may use an interpolation filter to perform parallel prediction on at least two pixels in the current block along the horizontal direction.
- the encoding end may use an interpolation filter to perform parallel prediction on at least two pixels in the current block along the vertical direction.
- the encoding end may use an interpolation filter to perform parallel prediction on at least two pixels in the current block along a diagonal direction.
- the above S202 includes the following step S202-A:
- S202-A Based on the filter coefficients, use an interpolation filter along the diagonal direction to perform parallel interpolation filtering prediction on pixel points on the same diagonal line of the current block to obtain a prediction block of the current block.
- the pixel points to be predicted are located at a corner (e.g., the lower right corner or the upper left corner) of the area selected by the interpolation filter.
- the N positions corresponding to the selected pixel point do not include other pixel points on the diagonal line.
- the N positions corresponding to each pixel point on the same diagonal line do not include the pixel points on the diagonal line.
- the N positions corresponding to the pixel point a and the N positions corresponding to the pixel point b are determined respectively, wherein neither the N positions corresponding to the pixel point a nor the N positions corresponding to the pixel point b include the pixel points on the diagonal line.
- parallel interpolation filtering prediction can be performed on the pixel points on the same diagonal line in the current block along the diagonal direction. For example, parallel interpolation filtering prediction is performed on pixel point a and pixel point b on the same diagonal line.
- the upper side, lower left, upper left, left side and upper right areas of the current block have been encoded, so the starting point of predicting the current block along the diagonal direction can be determined based on the encoded area and the shape of the interpolation filter.
- the encoder performs interpolation filtering prediction on the current block along the diagonal direction starting from the upper left corner of the current block.
- the above S202-A includes the following step S202-A1:
- the encoder uses an interpolation filter to perform parallel interpolation filtering prediction on the pixels on the same diagonal line of the current block starting from the upper left corner of the current block along the diagonal direction to obtain a predicted block of the current block.
- the pixel to be predicted is located at the lower right corner of the area selected by the interpolation filter.
- the embodiment of the present application does not limit the specific direction of the diagonal line.
- the diagonal direction includes at least one of the following: a direction from the upper right to the lower left, and a direction from the lower left to the upper right.
- the diagonal direction includes a direction from the upper right to the lower left.
- the diagonal direction of the current block is from the upper right to the lower left.
- the diagonal direction includes a direction from lower left to upper right.
- the diagonal direction of the current block is from lower left to upper right.
- the diagonal direction includes the direction from the upper right to the lower left and the direction from the lower left to the upper right.
- the diagonal direction of the current block includes two directions: the direction from the upper right to the lower left and the direction from the lower left to the upper right.
- When the encoding end performs parallel prediction on the pixel points located on the same diagonal line in the current block, the specific direction of the diagonal line does not constitute a limitation on the technical solution of the embodiment of the present application.
- In each prediction pass, the encoder predicts the pixels on one diagonal line of the current block as a unit, and the process of parallel prediction is the same for the pixels on every diagonal line in the current block.
- the k-th diagonal line of the current block is used as an example for description.
- the above S202-A1 includes the following S202-A11 and S202-A12 steps:
- the k-th diagonal line can be understood as any diagonal line in the current block shown in FIG. 23, and the k-th diagonal line includes M pixels.
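- To make the notion of the k-th diagonal concrete, the sketch below groups the pixels of a block by diagonal, with diagonal k holding the M pixels whose coordinates satisfy x + y = k and diagonal 0 being the upper-left corner. The zero-based indexing and the function name are assumptions of this sketch.

```python
def diagonals(width, height):
    """Group the pixels (x, y) of a width x height block by diagonal.

    Diagonal k holds every pixel with x + y == k, listed from the upper
    right to the lower left; k starts at 0 for the upper-left corner.
    """
    groups = []
    for k in range(width + height - 1):
        x_hi = min(k, width - 1)
        x_lo = max(0, k - (height - 1))
        groups.append([(x, k - x) for x in range(x_hi, x_lo - 1, -1)])
    return groups
```

- For a 4×4 block this yields 7 diagonals, and diagonal 2 contains M = 3 pixels, which matches the three-pixel example used in this description.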
- the encoder uses an interpolation filter based on the filter coefficient to determine the predicted values of the M pixels in parallel. In other words, the encoder can determine the predicted values of the M pixels on the k-th diagonal line at the same time, which greatly increases the prediction speed.
- the encoding end determines the predicted values of the 3 pixels in parallel.
- the 3 pixels are respectively recorded as pixel 1, pixel 2 and pixel 3.
- the encoding end uses the interpolation filter to perform interpolation filtering prediction on pixel 1 based on the filter coefficient to obtain the predicted value of pixel 1.
- the interpolation filter is used to perform interpolation filtering prediction on pixel 2 to obtain the predicted value of pixel 2.
- the interpolation filter is used to perform interpolation filtering prediction on pixel 3 to obtain the predicted value of pixel 3.
- the encoding end determines the predicted values of the 3 pixels on the kth diagonal in the current block in parallel during one interpolation filtering prediction process, which greatly improves the speed of interpolation filtering prediction.
- the encoding end can determine the predicted values of the pixels on other diagonals in the current block with reference to the method for determining the predicted values of the pixels on the kth diagonal, and then obtain the predicted block of the current block, thereby improving the prediction speed of the current block and improving the coding efficiency.
- the embodiment of the present application does not limit the specific manner in which the encoding end uses an interpolation filter to determine the prediction values of M pixels in parallel based on the filter coefficients.
- Since the M pixels are points on the k-th diagonal of the current block, the M pixels can be understood as adjacent pixels with similar features. Therefore, in order to reduce the computational complexity, the input value of the interpolation filter corresponding to one or several of the M pixels is determined based on the shape of the interpolation filter. Next, based on the input value of the interpolation filter corresponding to the one or several pixels, the input value of the interpolation filter corresponding to the other pixels in the M pixels is determined by, for example, average value calculation, weighted calculation or other calculation methods. Finally, based on the filter coefficients and the input value of the interpolation filter corresponding to each of the M pixels, the predicted values of the M pixels are determined in parallel.
- the above S202-A11 uses the interpolation filter to determine the predicted values of M pixels in parallel, including the following steps:
- When the encoder determines the predicted values of the M pixels on the k-th diagonal line of the current block in parallel, it determines the pixel values of the N positions corresponding to each of the M pixels in parallel based on the shape of the interpolation filter, wherein the pixel values of the N positions corresponding to each pixel can be understood as the input values of the interpolation filter corresponding to the pixel. Then, the encoder determines the predicted values of the M pixels in parallel based on the filter coefficients and the pixel values of the N positions corresponding to the M pixels.
- the k-th diagonal line of the current block includes 3 pixels, and these 3 pixels are respectively recorded as pixel 1, pixel 2, and pixel 3.
- the interpolation filter of the current block is a 4×4 interpolation filter.
- When the encoder determines the predicted values of these 3 pixels in parallel, the encoder determines the pixel values of 15 positions corresponding to pixel 1 based on the shape of the interpolation filter, uses the pixel values of these 15 positions as the interpolation filter input, and determines the predicted value of pixel 1 based on the above-determined filter coefficients.
- the encoder determines the pixel values of 15 positions corresponding to pixel 2 based on the shape of the interpolation filter, and uses the pixel values of these 15 positions as the interpolation filter input, and determines the predicted value of pixel 2 based on the above-determined filter coefficients.
- the encoder determines the pixel values of 15 positions corresponding to pixel 3 based on the shape of the interpolation filter, and uses the pixel values of these 15 positions as the interpolation filter input, and determines the predicted value of pixel 3 based on the above-determined filter coefficients. That is to say, in this embodiment, the encoding end determines the prediction values of three pixels in the current block in parallel at the same time, which greatly improves the prediction speed and further improves the encoding efficiency.
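- The diagonal-by-diagonal prediction described above can be sketched as follows, assuming a 4×4 filter window whose lower-right corner is the pixel being predicted, so that the 15 inputs lie above and to the left. The frame layout (a list of rows holding reconstructed samples, with the block at column `bx`, row `by` and at least three reconstructed rows and columns before it), the function names, and the omission of de-averaging and clipping are simplifications of this sketch. The list comprehension stands in for the M evaluations that could run in parallel.

```python
def window_offsets(size=4):
    """Offsets of the size*size - 1 inputs relative to the predicted pixel,
    which sits at the lower-right corner of the window."""
    return [(-dx, -dy)
            for dy in range(size - 1, -1, -1)
            for dx in range(size - 1, -1, -1)
            if (dx, dy) != (0, 0)]


def predict_block_wavefront(frame, bx, by, bw, bh, coeffs, size=4):
    """Predict a bw x bh block in place, one diagonal (x + y == k) at a time."""
    offsets = window_offsets(size)
    for k in range(bw + bh - 1):
        diag = [(x, k - x) for x in range(max(0, k - bh + 1), min(k, bw - 1) + 1)]
        # Each pixel on this diagonal reads only reconstructed samples or
        # predictions from earlier diagonals, so the evaluations below are
        # independent of one another.
        values = [sum(c * frame[by + y + oy][bx + x + ox]
                      for c, (ox, oy) in zip(coeffs, offsets))
                  for x, y in diag]
        for (x, y), v in zip(diag, values):
            frame[by + y][bx + x] = v
    return [row[bx:bx + bw] for row in frame[by:by + bh]]
```

- With this window shape, none of the 15 input positions of a pixel falls on the pixel's own diagonal, which is what allows all pixels of one diagonal to be evaluated together.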
- the encoding end directly multiplies the pixel values of the N positions corresponding to the pixel point by the filter coefficient to obtain the predicted value of the pixel point.
- the encoding end obtains the predicted value of each pixel among the M pixels based on the above formula (5).
- the encoder can determine the prediction value of each pixel point on the same diagonal line in the current block in parallel.
- the filter coefficient is determined by the reference area after the mean value is removed in the above formula (4), so when determining the prediction value of the current block based on the filter coefficient, the influence of the pixel average reconstruction value m needs to be considered.
- the interpolation filter coefficient determined by the above formula (4) is substituted into the above formula (5) to obtain the prediction value of each point in the current block, and then the prediction value of each point is added to the pixel average reconstruction value m to obtain the final prediction value of each point in the current block, thereby obtaining the prediction block of the current block.
- the above S202-A11-a2 includes the following steps:
- the pixel values of the N positions corresponding to the M pixel points are de-averaged in parallel to obtain the pixel values of the N positions corresponding to the M pixel points after de-averaging;
- the encoding end de-averages the pixel values of the N positions corresponding to the M pixels on the k-th diagonal line of the current block based on the pixel average reconstruction value to obtain the de-averaged pixel values of the N positions corresponding to the M pixels. For example, for any pixel among the M pixels, the pixel average reconstruction value is subtracted from the pixel values of the N positions of the pixel to obtain the de-averaged pixel values of the N positions of the pixel.
- the predicted values of the M pixels are determined in parallel.
- the embodiment of the present application does not limit the specific method of determining the predicted values of the M pixel points in parallel based on the filter coefficients and the de-averaged pixel values of the N positions corresponding to the M pixel points.
- the encoder substitutes the de-averaged pixel values of the N positions of the pixel and the filter coefficients into the above formula (5).
- $t[r_i + p_n]$ in formula (5) is the de-averaged pixel value of the pixel at the position $r_i + p_n$.
- the pixel average reconstruction value m is added to the predicted value to obtain the final predicted value of the rth point.
- the above S202-A11-a22 includes the following steps:
- the prediction values of the M pixel points are determined in parallel.
- the encoder limits the prediction value of the current block to a range. Specifically, a second reconstruction area is determined, and the maximum reconstruction value max and the minimum reconstruction value min of the pixel points in the second reconstruction area are determined.
- the embodiment of the present application does not limit the specific method of determining the second reconstruction area around the current block.
- the second reconstructed area of the current block is consistent with the reference area of the current block.
- the second reconstruction area of the current block is consistent with the first reconstruction area of the current block.
- the reconstruction area above, on the left, on the upper right, on the upper left, and on the lower left of the current block is determined as the second reconstruction area.
- the reconstruction area of 13 rows above, 13 columns on the left, 13 rows on the upper right, 13 rows and 13 columns on the upper left, and 13 columns on the lower left of the current block is determined as the second reconstruction area.
- the above-mentioned S202-A11-a221 can be executed before the above-mentioned S202-A11-a222, or after the above-mentioned S202-A11-a222, or synchronously with the above-mentioned S202-A11-a222.
- the embodiment of the present application does not limit the specific method in which the encoding end obtains the first predicted values of the M pixel points in parallel based on the de-averaged pixel values of the N positions corresponding to the M pixel points, the filter coefficients and the pixel average reconstruction value.
- the de-averaged pixel values of the N positions of the pixel are multiplied by the filter coefficients to obtain the second predicted value of the pixel; the second predicted value and the pixel average reconstruction value are added to obtain the first predicted value of the pixel.
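- A minimal sketch of the two-stage computation above, assuming that "multiplied by the filter coefficient" means a sum of coefficient-times-input products over the N positions. The function name is hypothetical, and formulas (5) and (6) themselves are not reproduced here.

```python
def first_predicted_value(inputs, coeffs, m):
    """Filter the de-averaged inputs, then add the mean m back.

    inputs: pixel values at the N positions (before de-averaging)
    coeffs: the N filter coefficients
    m:      pixel average reconstruction value
    """
    # Second predicted value: weighted sum of the de-averaged inputs.
    second = sum(c * (t - m) for c, t in zip(coeffs, inputs))
    # First predicted value: second predicted value plus the mean.
    return second + m
```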
- the encoding end obtains the first prediction value of the pixel based on the above formula (6).
- After the encoding end obtains a predicted value of the pixel point based on the above formula (6), the encoding end performs a preset process on the predicted value to obtain a first predicted value of the pixel point.
- After the encoder determines the first prediction values of the M pixels based on the above steps, it determines the prediction values of the M pixels in parallel based on the first prediction values, the maximum reconstruction value and the minimum reconstruction value.
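- If the first prediction value of the pixel is greater than or equal to the minimum reconstruction value and less than or equal to the maximum reconstruction value, the first prediction value of the pixel is determined as the prediction value of the pixel.
- If the first prediction value of the pixel is less than the minimum reconstruction value, the minimum reconstruction value is determined as the predicted value of the pixel point.
- If the first prediction value of the pixel is greater than the maximum reconstruction value, the maximum reconstruction value is determined as the predicted value of the pixel point.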
- the encoding end determines the predicted value of the pixel point by using the above formula (7).
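- The range limitation can be sketched as a clamp between the minimum and maximum reconstruction values of the second reconstruction area. The function names are hypothetical and the area is modelled as a list of rows; formula (7) is not reproduced here, so this reflects only the behaviour described in the surrounding text.

```python
def reconstruction_range(area):
    """Minimum and maximum reconstruction values of the second reconstruction area."""
    values = [v for row in area for v in row]
    return min(values), max(values)


def clip_prediction(first_pred, min_rec, max_rec):
    """Limit the first predicted value to the range [min_rec, max_rec]."""
    return max(min_rec, min(first_pred, max_rec))
```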
- the encoding end can refer to the above method to determine the predicted values of the pixels on each diagonal line in the current block in parallel, and then obtain the predicted value of each point in the current block to form a predicted block of the current block.
- the encoder performs interpolation filtering prediction on the current block, obtains the prediction block of the current block, and then performs the following steps.
- the encoder determines the prediction block of the current block based on the above steps. Then, the prediction block of the current block is subtracted from the current block to obtain the residual block of the current block. Then, the residual block of the current block is transformed to obtain the transformation coefficient, the transformation coefficient is quantized to obtain the quantization coefficient, and the quantization coefficient is encoded to obtain the code stream.
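- The first step of this chain, forming the residual block, can be sketched as below. Blocks are modelled as lists of rows and the function name is hypothetical; the transform, quantization and entropy coding stages are not sketched because their details are not given here.

```python
def residual_block(current, prediction):
    """Residual = current block minus prediction block, sample by sample."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]
```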
- the residual value of the current block is transformed to obtain the transformation coefficient
- the embodiment of the present application does not limit the specific method in which the encoder determines the transform kernel corresponding to the current block.
- the encoding end and the decoding end use a default transform kernel as the transform kernel of the current block.
- the encoder determines the transformation kernel of the current block through the following steps S203-A and S203-A:
- the conventional intra prediction modes currently included in VVC are:
- PLANAR mode, whose intra prediction mode index is 0,
- DC mode, whose intra prediction mode index is 1, and
- angular modes, whose intra-frame prediction mode indexes are 2 to 66.
- the arrows in the figure point to the directions predicted by the angle modes in VVC, and the prediction mode indexes used in encoding are 2 to 66.
- If the current block is a non-square block, some angle directions will be replaced with wide angles, such as -1 to -14 and 67 to 80 in FIG. 25.
- the intra-frame prediction mode corresponding to the above prediction block is a default intra-frame prediction mode. That is, if the current block is predicted using the interpolation filter prediction mode, when the prediction block is obtained, one of the traditional intra-frame prediction modes is determined as the default intra-frame prediction mode corresponding to the prediction block.
- the encoder determines the intra prediction mode corresponding to the prediction block through the following steps S203-A1 and S203-A2:
- S203-A2 Determine the intra-frame prediction mode corresponding to the prediction block based on the angle values of the R points.
- the intra-frame prediction mode corresponding to the prediction block is determined by counting the intra-frame prediction modes corresponding to the angle values of R points in the prediction block.
- the embodiment of the present application does not limit the specific position and number of the R points in the prediction block used to determine the angle value.
- the R points may be one point in the prediction block, or may be multiple points in the prediction block.
- the encoding end determines the angle value of a point in the prediction block (for example, the center point of the prediction block), and based on the angle value of the point, determines the intra-frame prediction mode corresponding to the point, and then determines the intra-frame prediction mode as the intra-frame prediction mode corresponding to the prediction block.
- the encoding end determines the angle values of the multiple points, and based on the angle values of the multiple points, determines the intra-frame prediction mode corresponding to each of the multiple points, and then determines the intra-frame prediction mode with the largest number of identical intra-frame prediction modes among the multiple points as the intra-frame prediction mode corresponding to the prediction block.
- the selection of the R points is related to the shape and size of the sliding window. For example, each of the R points is the center point of the sliding window when the sliding window slides in the prediction block.
- the method for determining the angle value of each of the R points is the same.
- the method of determining the angle value of the i-th point among the R points is used as an example for explanation.
- the embodiment of the present application does not limit the specific method of determining the angle value of the point.
- the above S203-A1 includes steps S203-A11 and S203-A12:
- S203-A12. Determine the angle value of the i-th point based on the horizontal gradient and the vertical gradient of the i-th point.
- For each point among the R points, for example, the i-th point, the encoding end first determines the horizontal gradient and vertical gradient of the i-th point, and then determines the angle value of the i-th point based on the horizontal gradient and the vertical gradient.
- the embodiment of the present application does not limit the specific method of determining the horizontal gradient and the vertical gradient of the i-th point.
- the horizontal gradient value of the i-th point is determined based on the predicted values of points around the i-th point in the prediction block and the change in the predicted value of the i-th point in the horizontal direction
- the vertical gradient value of the i-th point is determined based on the predicted values of points around the i-th point in the prediction block and the change in the predicted value of the i-th point in the vertical direction.
- the encoding end determines the prediction value of the point in the sliding window centered on the i-th point in the prediction block; based on the prediction value of the point in the sliding window and the horizontal gradient operator, as well as the vertical gradient operator, the horizontal gradient and vertical gradient of the i-th point are obtained.
- a sliding window is first determined, for example, as shown in FIG. 26, a sliding window of size 3×3 is determined, and the sliding window is slid in the prediction block.
- the horizontal gradient and vertical gradient of the center point of the sliding window are determined.
- the horizontal gradient and vertical gradient of the i-th point are determined.
- the product of the predicted values of the points in the sliding window and the horizontal gradient operator is determined as the horizontal gradient $G_x$ of the i-th point; the product of the predicted values of the points in the sliding window and the vertical gradient operator is determined as the vertical gradient of the i-th point.
- the predicted values of the points in the sliding window are multiplied by the horizontal gradient operator and then subjected to a preset operation with a preset value to obtain the horizontal gradient $G_x$ of the i-th point; the predicted values of the points in the sliding window are multiplied by the vertical gradient operator and then subjected to a preset operation with a preset value to obtain the vertical gradient of the i-th point.
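- As an illustration of the sliding-window gradient computation, the sketch below uses 3×3 Sobel operators. The choice of Sobel is an assumption of this sketch, since the gradient operators are not fixed by this description, and the "product" is interpreted as an element-wise multiply-and-sum over the window. The function name and the list-of-rows layout of the prediction block are hypothetical.

```python
# Sobel operators, chosen here only as an example of gradient operators.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradients(pred, x, y):
    """Horizontal and vertical gradients at (x, y) from a 3x3 window.

    (x, y) must not lie on the border of the prediction block `pred`.
    """
    gx = gy = 0
    for j in range(3):
        for i in range(3):
            v = pred[y - 1 + j][x - 1 + i]
            gx += SOBEL_X[j][i] * v
            gy += SOBEL_Y[j][i] * v
    return gx, gy
```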
- the embodiment of the present application does not limit the specific values of the horizontal gradient operator and the vertical gradient operator.
- the angle value of the i-th point can be determined according to the horizontal gradient and the vertical gradient of the i-th point.
- the inverse tangent value of the ratio of the vertical gradient to the horizontal gradient of the i-th point is determined as the angle value of the i-th point.
- the angle value of the i-th point is determined according to formula (8).
- the encoder can also use other methods to determine the angle value of the i-th point. For example, the encoder adjusts the angle value determined by the above formula (8) to obtain the angle value of the i-th point.
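- A minimal sketch of the arctangent rule above, expressed in degrees. Returning the angle in degrees and the handling of a zero horizontal gradient are assumptions of this sketch; formula (8) is not reproduced here and may differ in units or in how the zero case is treated.

```python
import math


def angle_value(gx, gy):
    """Angle of a point from the ratio of vertical to horizontal gradient."""
    if gx == 0:
        # Assumed convention: a purely vertical gradient maps to 90 degrees,
        # and a flat point (both gradients zero) maps to 0 degrees.
        return 90.0 if gy != 0 else 0.0
    return math.degrees(math.atan(gy / gx))
```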
- the encoding end uses the above method for each of the R points to determine the angle value of each of the R points, and then executes the above S203-A2 to determine the intra-frame prediction mode corresponding to the prediction block based on the angle values of the R points.
- the embodiment of the present application does not limit the specific method of determining the intra-frame prediction mode corresponding to the prediction block based on the angle values of R points.
- the encoder selects, from the angle values of the R points, the angle value that occurs most often (denoted angle value 1), matches angle value 1 with the prediction angles of the traditional intra-frame prediction modes to obtain the intra-frame prediction mode corresponding to angle value 1, and determines the intra-frame prediction mode corresponding to angle value 1 as the intra-frame prediction mode corresponding to the prediction block.
- the above S203-A2 includes the following steps S203-A21 and S203-A22:
- the encoder determines the intra-frame prediction mode corresponding to each point based on the angle value of each point in the R points. For example, for each point in the R points, the angle value of the point is matched with the prediction angle of the traditional intra-frame prediction mode to obtain the intra-frame prediction mode corresponding to the angle value of the point. In this way, the intra-frame prediction mode corresponding to each point in the R points can be obtained.
- the intra-frame prediction mode corresponding to the prediction block is determined.
- the intra-frame prediction mode with the largest number of repetitions among the intra-frame prediction modes corresponding to the R points is determined as the intra-frame prediction mode corresponding to the prediction block.
- the above S203-A22 includes the following steps:
- S203-A222 Determine the intra-frame prediction mode corresponding to the prediction block based on the intra-frame prediction modes and gradient amplitude values corresponding to the R points.
- the encoding end determines the gradient amplitude value corresponding to each of the R points based on the horizontal gradient and the vertical gradient of each of the R points determined above.
- the specific manner in which the encoder determines the gradient amplitude value corresponding to each of the R points is the same.
- the embodiment of the present application does not limit the specific manner in which the encoding end determines the gradient amplitude value corresponding to the i-th point based on the horizontal gradient and vertical gradient of the i-th point.
- the encoder multiplies the horizontal gradient and the vertical gradient of the i-th point to obtain the gradient amplitude value corresponding to the i-th point.
- the encoding end adds the absolute value of the horizontal gradient and the absolute value of the vertical gradient of the i-th point to obtain the gradient amplitude value corresponding to the i-th point.
- the encoder determines the gradient amplitude value corresponding to the i-th point based on the following formula (9).
- the encoder can determine the gradient amplitude value corresponding to each of the R points based on the above steps. Next, the encoder performs the above S203-A222 to determine the intra-frame prediction mode corresponding to the prediction block based on the intra-frame prediction modes and gradient amplitude values corresponding to the R points.
- the intra-frame prediction mode corresponding to the point with the largest gradient magnitude value among the R points is determined as the intra-frame prediction mode corresponding to the prediction block.
- the gradient amplitude value corresponding to the point is accumulated on the intra-frame prediction mode corresponding to the point to obtain the accumulated gradient amplitude values of the intra-frame prediction modes corresponding to the R points; and the intra-frame prediction mode with the largest accumulated gradient amplitude value among the intra-frame prediction modes corresponding to the R points is determined as the intra-frame prediction mode corresponding to the prediction block.
- the gradient amplitude value corresponding to each of the R points is accumulated on the corresponding intra-frame prediction mode.
- for example, if the intra-frame prediction modes corresponding to point 1 and point 2 of the R points are both intra-frame prediction mode 1, the gradient amplitude values corresponding to point 1 and point 2 are both accumulated to the gradient amplitude value corresponding to intra-frame prediction mode 1.
- the gradient amplitude value histogram shown in FIG27 can be obtained.
- the intra-frame prediction mode with the largest accumulated gradient amplitude value in the gradient amplitude value histogram can be determined as the intra-frame prediction mode corresponding to the prediction block.
- the intra-frame prediction mode corresponding to the dark accumulated gradient amplitude value in FIG27 is determined as the intra-frame prediction mode corresponding to the prediction block.
- the first intra-frame prediction mode is determined as the intra-frame prediction mode corresponding to the prediction block.
- if the gradient amplitude values corresponding to all of the R points are 0, the horizontal gradient and the vertical gradient of every one of the R points are 0.
- in this case, the preset first intra-frame prediction mode can be determined as the intra-frame prediction mode corresponding to the prediction block.
- the embodiment of the present application does not limit the type of the first intra-frame prediction mode.
- the first intra-frame prediction mode is the PLANAR mode.
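The histogram-based selection described above can be sketched as follows. The amplitude uses the sum-of-absolute-values option mentioned earlier; the mode index chosen for PLANAR and the names `amplitude` and `select_mode` are assumptions made for illustration only.

```python
from collections import defaultdict

PLANAR = 0  # assumed index of the preset first intra-frame prediction mode


def amplitude(gx, gy):
    """Gradient amplitude as |gx| + |gy| (one of the options described above)."""
    return abs(gx) + abs(gy)


def select_mode(point_modes, point_amplitudes, fallback=PLANAR):
    """Accumulate each point's amplitude on its mode and pick the largest bin.

    Falls back to the preset first mode when every accumulated amplitude is 0.
    """
    histogram = defaultdict(int)
    for mode, amp in zip(point_modes, point_amplitudes):
        histogram[mode] += amp
    if not histogram or all(total == 0 for total in histogram.values()):
        return fallback
    return max(histogram, key=histogram.get)
```

With two points mapped to mode 18 (amplitudes 3 and 4) and one point mapped to mode 50 (amplitude 6), mode 18 wins because its accumulated amplitude of 7 exceeds 6, even though the single strongest point belongs to mode 50.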
- after the encoder determines the intra-frame prediction mode corresponding to the prediction block based on the above steps, it determines the transform kernel corresponding to the current block based on that intra-frame prediction mode.
- the embodiment of the present application does not limit the specific manner in which the encoder determines the transform kernel corresponding to the current block based on the intra-frame prediction mode corresponding to the prediction block.
- the encoding end searches for an image block whose intra-frame prediction mode is the same as the intra-frame prediction mode corresponding to the prediction block in the encoded image blocks around the prediction block based on the intra-frame prediction mode corresponding to the prediction block, and then determines the transform kernel corresponding to the image block as the transform kernel corresponding to the current block.
- determining the transform kernel corresponding to the current block based on the intra prediction mode corresponding to the prediction block in S203-B above includes the following steps:
- the encoder obtains the correspondence between the preset intra prediction mode and the transform core group.
- the correspondence between the intra prediction modes and the transform core groups is shown in Table 17.
- Table 17 is only a correspondence between an intra-frame prediction mode and a transform core group involved in an embodiment of the present application.
- the correspondence between the intra-frame prediction mode and the transform core group in the embodiment of the present application includes but is not limited to that shown in Table 17.
- Each transformation core group includes at least one type of transformation core.
- after obtaining the correspondence between the intra prediction modes and the transform core groups shown in Table 17, the encoder looks up, in that correspondence, the transform core group corresponding to the intra prediction mode of the prediction block, and records that transform core group as the first transform core group.
- for example, if the intra prediction mode corresponding to the prediction block is the angular prediction mode in the 64-angle direction, searching the above Table 17 shows that the transform core group corresponding to the angular prediction mode in the 64-angle direction is transform core group 4.
- the encoder determines the transform core corresponding to the current block from at least one type of transform core included in the transform core group 4.
- the transform core is determined as the transform core corresponding to the current block.
- the encoder determines the transform core category corresponding to the current block, and then determines the transform core of the transform core category in the first transform core group as the transform core corresponding to the current block.
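The two-step lookup described above (mode to transform core group, then category to transform core) can be sketched as follows. The contents of both dictionaries are hypothetical placeholders, since the actual correspondence table is not reproduced in this passage; only the mapping of the 64-angle-direction mode to group 4 follows the example in the text, and treating 64 as that mode's index is itself an assumption.

```python
# Hypothetical stand-in for the preset mode-to-group correspondence.
MODE_TO_GROUP = {0: 0, 1: 0, 18: 2, 50: 3, 64: 4}

# Hypothetical transform cores per group; each group holds at least one core.
GROUP_KERNELS = {
    0: ["g0_k0", "g0_k1"],
    2: ["g2_k0", "g2_k1"],
    3: ["g3_k0", "g3_k1"],
    4: ["g4_k0", "g4_k1"],
}


def select_transform_kernel(intra_mode, category=0):
    """Return (first transform core group, transform core) for a derived mode.

    `category` is the transform core category: either a default value or a
    value signalled in the bitstream.
    """
    group = MODE_TO_GROUP[intra_mode]
    return group, GROUP_KERNELS[group][category]
```

Because the mode is derived from the prediction block itself, only the category (if it is not the default) would need to be signalled.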
- the methods for the encoder to determine the transform kernel type corresponding to the current block include but are not limited to the following:
- the transform kernel category corresponding to the current block is a default category, so the encoder determines the default category as the transform kernel category corresponding to the current block.
- the encoder writes the transform kernel category corresponding to the current block into the bitstream, so that the decoder obtains the transform kernel category corresponding to the current block by decoding the bitstream.
- the encoding end uses the interpolation filter prediction mode to determine the prediction block of the current block, and then determines the traditional intra-frame prediction mode corresponding to the prediction block, and determines the transform kernel corresponding to the current block based on the traditional intra-frame prediction mode corresponding to the prediction block. That is to say, the embodiment of the present application uses the traditional intra-frame prediction mode derived from the interpolation filter prediction to select the transform kernel group of the non-separable primary transform (NSPT) and the non-separable secondary transform kernel (LFNST), so that the determined transform kernel is more in line with the characteristics of the current block, and the accuracy of determining the transform kernel is improved.
- NSPT: non-separable primary transform
- LFNST: non-separable secondary transform kernel
- the determination accuracy of the reconstruction value can be improved, and the encoding accuracy of the current block can be improved.
- because the embodiment of the present application determines the transform kernel of the current block through the traditional prediction mode corresponding to the prediction block, it is not necessary to indicate the transform kernel separately, which saves code words and further improves the video encoding effect.
- the encoding end determines the prediction block of the current block and the transform kernel corresponding to the current block based on the above steps. The encoding end can then obtain the residual block of the current block from the current block and its prediction block, for example by subtracting the prediction block from the current block. Next, the residual block is transformed based on the transform kernel determined above to obtain the transform coefficients of the current block. The transform coefficients are then either encoded directly to obtain a code stream, or quantized to obtain quantization coefficients, which are encoded to obtain a code stream.
- the current block is a luma block or a chroma block; that is, the interpolation filtering prediction mode provided in the embodiment of the present application can be used to predict both luma blocks and chroma blocks.
- if the prediction mode of the current block is the interpolation filtering prediction mode and the chroma block corresponding to the current block adopts the direct derivation mode (DM), the PLANAR mode or the intra-frame prediction mode corresponding to the above prediction block is determined as the prediction mode of the chroma block.
- when predicting the current block, the video encoding method provided in the embodiment of the present application first determines the reference area and the interpolation filter of the current block and determines the filter coefficients based on the reference area; based on the filter coefficients, it uses the interpolation filter to predict at least two pixel points in the current block in parallel to obtain the prediction block of the current block; it then determines the transform kernel corresponding to the current block and, based on the transform kernel and the prediction block, encodes the current block to obtain a code stream. That is to say, when the interpolation filter is used to perform interpolation filtering prediction on the current block, at least two points in the current block are predicted in parallel, which increases the prediction speed and thus improves the coding efficiency.
- Figures 10 to 29 are merely examples of the present application and should not be construed as limitations to the present application.
- the size of the sequence number of each process does not mean the order of execution, and the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
- the term "and/or" is merely a description of the association relationship of associated objects, indicating that three relationships may exist. Specifically, A and/or B can represent: A exists alone, A and B exist at the same time, and B exists alone.
- the character "/" in the present application generally indicates that the associated objects before and after are in an "or" relationship.
- FIG30 is a schematic block diagram of a video decoding device provided in an embodiment of the present application.
- the video decoding device 10 is applied to the above-mentioned video decoder.
- the video decoding device 10 includes:
- a coefficient determination unit 11 configured to determine a reference area and an interpolation filter of a current block, and determine a filter coefficient of the interpolation filter based on the reference area;
- a prediction unit 12 configured to perform parallel prediction on at least two pixels in the current block using the interpolation filter based on the filter coefficient to determine a prediction block of the current block;
- the reconstruction unit 13 is used to determine a transformation kernel corresponding to the current block, and determine a reconstructed block of the current block based on the transformation kernel corresponding to the current block and the prediction block.
- the prediction unit 12 is specifically configured to perform parallel interpolation filtering prediction on pixel points on the same diagonal line of the current block using the interpolation filter along the diagonal direction based on the filter coefficient to obtain a prediction block of the current block.
- the prediction unit 12 is specifically used to perform parallel interpolation filtering prediction on pixel points on the same diagonal line of the current block using the interpolation filter based on the filter coefficient, starting from the upper left corner of the current block and along the diagonal direction, to obtain a predicted block of the current block.
- the diagonal direction includes at least one of the following: a direction from the upper right to the lower left, and a direction from the lower left to the upper right.
- the prediction unit 12 is specifically used to determine the prediction values of the M pixel points on the kth diagonal line of the current block in parallel based on the filter coefficient and using the interpolation filter, where k and M are both positive integers; and obtain the prediction value of the current block based on the prediction values of the pixel points on each diagonal line in the current block.
- the prediction unit 12 is specifically used to determine in parallel the pixel values of the N positions corresponding to the M pixel points based on the shape of the interpolation filter; and to determine in parallel the predicted values of the M pixel points based on the filter coefficients and the pixel values of the N positions corresponding to the M pixel points.
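The diagonal-by-diagonal prediction performed by the prediction unit can be sketched as follows. This is an illustration under assumptions: the three-tap filter shape (left, above, above-left), the one-sample reconstructed border, and the names `diagonals` and `predict_block` are hypothetical. The shape is chosen so that every input lies on an earlier diagonal or on the border, which is what makes the pixels of one diagonal independent of each other.

```python
# Assumed filter shape as (dx, dy) offsets: left, above, above-left.
SHAPE = [(-1, 0), (0, -1), (-1, -1)]


def diagonals(width, height):
    """Yield the pixels (x, y) with x + y == k, from upper right to lower left."""
    for k in range(width + height - 1):
        first_x = min(k, width - 1)
        last_x = max(0, k - height + 1)
        yield [(x, k - x) for x in range(first_x, last_x - 1, -1)]


def predict_block(border, width, height, coeffs, offsets=SHAPE):
    """Predict a block diagonal by diagonal.

    `border` is a (height + 1) x (width + 1) grid whose row 0 and column 0 hold
    reconstructed neighbours; the interior entries are overwritten.
    """
    grid = [row[:] for row in border]
    for diag in diagonals(width, height):
        # The M pixels of this diagonal read only the border or earlier
        # diagonals, so this list could be evaluated in parallel.
        values = [
            sum(c * grid[y + 1 + dy][x + 1 + dx] for c, (dx, dy) in zip(coeffs, offsets))
            for (x, y) in diag
        ]
        for (x, y), value in zip(diag, values):
            grid[y + 1][x + 1] = value
    return [row[1:] for row in grid[1:]]
```

For a 3x2 block the loop visits four diagonals holding 1, 2, 2 and 1 pixels, so the six pixels are produced in four steps instead of six.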
- the coefficient determination unit 11 is specifically used to determine a first reconstruction area around the current block; determine a pixel average reconstruction value based on the reconstruction value of the first reconstruction area; de-average the reconstruction values of the pixels in the reference area based on the pixel average reconstruction value; and use the de-averaged pixel values of the pixels in the reference area as the input of the interpolation filter, sliding the interpolation filter within the reference area to obtain the filter coefficients of the interpolation filter.
- the coefficient determination unit 11 is specifically configured to determine the pixel average reconstruction value based on the shape of the current block and the reconstruction value of the first reconstruction area.
- the coefficient determination unit 11 is specifically used to determine a first area from the upper reconstruction area and the left reconstruction area based on the shape of the current block; determine an average reconstruction value of the first area based on the reconstruction value of the first area; and determine the pixel average reconstruction value based on the average reconstruction value of the first area.
- the coefficient determination unit 11 is specifically used to determine the upper reconstruction area as the first area if the width of the current block is greater than its height; or, if the height of the current block is greater than its width, determine the left reconstruction area as the first area; or, if the height of the current block is equal to its width, determine the upper reconstruction area and the left reconstruction area as the first area.
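The shape-dependent choice of the first area can be sketched as follows, assuming that the third case covers square blocks and that the average is rounded down to an integer. The function name `pixel_average` and the flat sample lists are hypothetical.

```python
def pixel_average(width, height, above, left):
    """Average reconstruction value of the first area chosen by block shape.

    `above` and `left` are flat lists of reconstructed samples from the upper
    and left reconstruction areas.
    """
    if width > height:
        area = above          # wide block: use the upper reconstruction area
    elif height > width:
        area = left           # tall block: use the left reconstruction area
    else:
        area = above + left   # assumed: square block uses both areas
    return sum(area) // len(area)  # assumed integer (floor) average
```

For example, with `above = [10, 20]` and `left = [30, 50]`, an 8x4 block gives 15, a 4x8 block gives 40, and a 4x4 block gives 27.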
- the horizontal sliding step size and the vertical sliding step size of the interpolation filter in the reference area are different.
- At least one of a horizontal sliding step size and a vertical sliding step size of the interpolation filter within the reference area is greater than a preset step size.
- the prediction unit 12 is specifically used to de-average the pixel values of the N positions corresponding to the M pixel points in parallel based on the pixel average reconstruction value to obtain the de-averaged pixel values of the N positions corresponding to the M pixel points; and determine the predicted values of the M pixel points in parallel based on the filter coefficients and the de-averaged pixel values of the N positions corresponding to the M pixel points.
- the prediction unit 12 is specifically used to subtract the pixel average reconstruction value from the pixel values of the N positions of any pixel point among the M pixel points to obtain the pixel values of the N positions of the pixel point after removing the mean.
- the prediction unit 12 is specifically used to determine a second reconstruction area around the current block, and determine the maximum reconstruction value and the minimum reconstruction value of the second reconstruction area; based on the de-averaged pixel values of the N positions corresponding to the M pixels, the filter coefficients and the pixel average reconstruction value, obtain the first prediction values of the M pixels in parallel; and, based on the first prediction values of the M pixels, the maximum reconstruction value and the minimum reconstruction value, determine the prediction values of the M pixels in parallel.
- the prediction unit 12 is specifically used to multiply the pixel value after de-averaging the N positions of the pixel point and the filter coefficient for any pixel point among the M pixels to obtain a second prediction value of the pixel point; and add the second prediction value and the pixel average reconstruction value to obtain a first prediction value of the pixel point.
- the prediction unit 12 is specifically used to, for any pixel point among the M pixels, if the first prediction value of the pixel point is greater than the minimum reconstruction value and less than the maximum reconstruction value, determine the first prediction value as the prediction value of the pixel point; or, if the first prediction value of the pixel point is less than or equal to the minimum reconstruction value, determine the minimum reconstruction value as the prediction value of the pixel point; or, if the first prediction value of the pixel point is greater than or equal to the maximum reconstruction value, determine the maximum reconstruction value as the prediction value of the pixel point.
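The per-pixel steps above (de-average, filter, add the mean back, clamp) can be sketched as follows. The function name `predict_pixel` and its argument layout are assumptions; each call is independent, so the M pixels of a diagonal could be processed in parallel.

```python
def predict_pixel(neighbours, coeffs, mean, min_rec, max_rec):
    """Predict one pixel from the pixel values at its N filter positions.

    neighbours: pixel values at the N positions given by the filter shape
    coeffs:     the N filter coefficients
    mean:       pixel average reconstruction value
    min_rec, max_rec: extreme reconstruction values of the second area
    """
    centred = [value - mean for value in neighbours]      # de-averaging
    second = sum(c * v for c, v in zip(coeffs, centred))  # second prediction value
    first = second + mean                                 # first prediction value
    return min(max(first, min_rec), max_rec)              # clamp to [min_rec, max_rec]
```

The final clamp reproduces the three cases in the text: values inside the range pass through, values at or below the minimum become the minimum, and values at or above the maximum become the maximum.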
- the coefficient determination unit 11 before determining the reference area and interpolation filter of the current block, is also used to determine whether the current block allows the use of the interpolation filter prediction mode; if the current block allows the use of the interpolation filter prediction mode, the reference area and interpolation filter of the current block are determined.
- the coefficient determination unit 11 is specifically configured to determine that the current block is not allowed to use the interpolation filtering prediction mode if the current block is in the first row of the current CTU.
- the coefficient determination unit 11 is specifically configured to determine whether the current block is allowed to use the interpolation filter prediction mode based on the type of the current image.
- the coefficient determination unit 11 is specifically configured to determine that the current block is not allowed to use the interpolation filter prediction mode if the current image is not an intra-frame prediction image.
- the coefficient determination unit 11 is specifically configured to determine that the current block is not allowed to use the interpolation filter prediction mode if the size of the current block is smaller than the preset size.
- the coefficient determination unit 11 is specifically used to decode the code stream to obtain first information, where the first information is used to indicate whether the template matching-based technology is turned on; based on the first information, determine whether the current block is allowed to use the interpolation filtering prediction mode.
- the coefficient determination unit 11 is specifically configured to determine that the current block is not allowed to use the interpolation filtering prediction mode if the first information indicates that the template matching-based technology is not enabled.
- the coefficient determination unit 11 is specifically used to decode the code stream to obtain second information if the first information indicates that the template matching-based technology is turned on, and the second information is used to indicate whether the current sequence is allowed to use the interpolation filtering prediction mode for prediction; based on the second information, determine whether the current block is allowed to use the interpolation filtering prediction mode.
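The permission checks listed above can be combined into one decision, sketched below. The text presents them as separate optional conditions, so chaining all of them, the parameter names, and the `min_size` threshold of 8 are assumptions for illustration only.

```python
def ifp_allowed(in_first_row_of_ctu, is_intra_picture, width, height,
                tm_enabled, sequence_allows_ifp, min_size=8):
    """Decide whether the current block may use the interpolation filtering
    prediction mode.

    tm_enabled:          first information (template matching-based technology on)
    sequence_allows_ifp: second information, parsed only when tm_enabled is True
    min_size:            hypothetical preset size threshold
    """
    if in_first_row_of_ctu:
        return False
    if not is_intra_picture:
        return False
    if width < min_size or height < min_size:
        return False
    if not tm_enabled:
        return False  # the second information is not consulted in this case
    return sequence_allows_ifp
```

Ordering the checks this way means the second information only matters once the first information indicates that the template matching-based technology is turned on.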
- the reconstruction unit 13 is specifically configured to determine an intra-frame prediction mode corresponding to the prediction block; and determine a transform kernel corresponding to the current block based on the intra-frame prediction mode corresponding to the prediction block.
- the reconstruction unit 13 is specifically configured to determine angle values of R points in the prediction block, where R is a positive integer; and determine an intra-frame prediction mode corresponding to the prediction block based on the angle values of the R points.
- the reconstruction unit 13 is specifically used to obtain the correspondence between the intra-frame prediction mode and the transformation core group, wherein one transformation core group includes at least one type of transformation core; in the correspondence, the first transformation core group corresponding to the intra-frame prediction mode of the prediction block is searched; and from the first transformation core group, the transformation core corresponding to the current block is determined.
- the coefficient determination unit 11 is specifically configured to determine a reference area of the current block among preset P reference areas, where P is a positive integer greater than 1.
- the coefficient determination unit 11 is specifically configured to determine the interpolation filter of the current block from among Q preset interpolation filters, where Q is a positive integer greater than 1.
- the device 10 shown in FIG. 30 can perform the decoding method of the decoding end of the embodiment of the present application, and the aforementioned and other operations and/or functions of each unit in the device 10 are respectively for implementing the corresponding processes in each method such as the decoding method of the above-mentioned decoding end, and for the sake of brevity, no further description is given here.
- FIG31 is a schematic block diagram of a video encoding device provided in an embodiment of the present application, and the video encoding device is applied to the above-mentioned encoder.
- the video encoding device 20 may include:
- a coefficient determination unit 21 configured to determine a reference area and an interpolation filter of a current block, and determine a filter coefficient of the interpolation filter based on the reference area;
- a prediction unit 22 configured to perform parallel prediction on at least two pixels in the current block using the interpolation filter based on the filter coefficient to determine a prediction block of the current block;
- the encoding unit 23 is used to determine a transformation kernel corresponding to the current block, and based on the transformation kernel corresponding to the current block and the prediction block, encode the current block to obtain a code stream.
- the prediction unit 22 is specifically configured to perform parallel interpolation filtering prediction on pixel points on the same diagonal line of the current block using the interpolation filter along the diagonal direction based on the filter coefficient to obtain a prediction block of the current block.
- the prediction unit 22 is specifically used to perform parallel interpolation filtering prediction on pixel points on the same diagonal line of the current block using the interpolation filter based on the filter coefficient, starting from the upper left corner of the current block and along the diagonal direction, to obtain a predicted block of the current block.
- the diagonal direction includes at least one of the following: a direction from the upper right to the lower left, and a direction from the lower left to the upper right.
- the prediction unit 22 is specifically used to determine the predicted values of the M pixels on the kth diagonal line of the current block in parallel based on the filter coefficient and using the interpolation filter, where k and M are both positive integers; and obtain the predicted value of the current block based on the predicted values of the pixels on each diagonal line in the current block.
- the prediction unit 22 is specifically used to determine in parallel the pixel values of the N positions corresponding to the M pixel points based on the shape of the interpolation filter; and to determine in parallel the predicted values of the M pixel points based on the filter coefficients and the pixel values of the N positions corresponding to the M pixel points.
- the coefficient determination unit 21 is specifically used to determine a first reconstruction area around the current block; determine a pixel average reconstruction value based on the reconstruction value of the first reconstruction area; de-average the reconstruction values of the pixel points in the reference area based on the pixel average reconstruction value; use the de-averaged pixel values of the pixel points in the reference area as input of the interpolation filter, slide the interpolation filter within the reference area, and obtain the filter coefficients of the interpolation filter.
- the coefficient determination unit 21 is specifically configured to determine the pixel average reconstruction value based on the shape of the current block and the reconstruction value of the first reconstruction area.
- the coefficient determination unit 21 is specifically used to determine a first area from the upper reconstruction area and the left reconstruction area based on the shape of the current block; determine an average reconstruction value of the first area based on the reconstruction value of the first area; and determine the pixel average reconstruction value based on the average reconstruction value of the first area.
- the coefficient determination unit 21 is specifically used to determine the upper reconstruction area as the first area if the width of the current block is greater than its height; or, if the height of the current block is greater than its width, determine the left reconstruction area as the first area; or, if the height of the current block is equal to its width, determine the upper reconstruction area and the left reconstruction area as the first area.
- the horizontal sliding step size and the vertical sliding step size of the interpolation filter in the reference area are different.
- At least one of a horizontal sliding step size and a vertical sliding step size of the interpolation filter within the reference area is greater than a preset step size.
- the prediction unit 22 is specifically used to perform de-averaging on the pixel values of the N positions respectively corresponding to the M pixel points in parallel based on the pixel average reconstruction value to obtain the de-averaged pixel values of the N positions respectively corresponding to the M pixel points; and determine the predicted values of the M pixel points in parallel based on the filter coefficients and the de-averaged pixel values of the N positions respectively corresponding to the M pixel points.
- the prediction unit 22 is specifically used to subtract the pixel average reconstruction value from the pixel values of the N positions of any pixel point among the M pixel points to obtain the pixel values of the N positions of the pixel point after removing the mean.
- the prediction unit 22 is specifically used to determine a second reconstruction area around the current block, and determine a maximum reconstruction value and a minimum reconstruction value of the second reconstruction area; based on the de-averaged pixel values of the N positions corresponding to the M pixels, the filter coefficients and the pixel average reconstruction value, obtain the first prediction values of the M pixels in parallel; and, based on the first prediction values of the M pixels, the maximum reconstruction value and the minimum reconstruction value, determine the prediction values of the M pixels in parallel.
- the prediction unit 22 is specifically used to multiply the pixel value after de-averaging the N positions of the pixel point and the filter coefficient for any pixel point among the M pixels to obtain a second prediction value of the pixel point; and add the second prediction value and the pixel average reconstruction value to obtain a first prediction value of the pixel point.
- the prediction unit 22 is specifically used to, for any pixel point among the M pixels, if the first prediction value of the pixel point is greater than the minimum reconstruction value and less than the maximum reconstruction value, determine the first prediction value as the prediction value of the pixel point; or, if the first prediction value of the pixel point is less than or equal to the minimum reconstruction value, determine the minimum reconstruction value as the prediction value of the pixel point; or, if the first prediction value of the pixel point is greater than or equal to the maximum reconstruction value, determine the maximum reconstruction value as the prediction value of the pixel point.
- the coefficient determination unit 21, before determining the reference area and interpolation filter of the current block is also used to determine whether the current block allows the use of the interpolation filter prediction mode; if the current block allows the use of the interpolation filter prediction mode, the reference area and interpolation filter of the current block are determined.
- the coefficient determination unit 21 is specifically configured to determine that the current block is not allowed to use the interpolation filtering prediction mode if the current block is in the first row of the current CTU.
- the coefficient determination unit 21 is specifically configured to determine whether the current block is allowed to use the interpolation filter prediction mode based on the type of the current image.
- the coefficient determination unit 21 is specifically configured to determine that the current block is not allowed to use the interpolation filter prediction mode if the current image is not an intra-frame prediction image.
- the coefficient determination unit 21 is specifically configured to determine that the current block is not allowed to use the interpolation filter prediction mode.
- the coefficient determination unit 21 is specifically used to determine first information, the first information being used to indicate whether the template matching-based technology is turned on; and, based on the first information, determine whether the current block is allowed to use the interpolation filtering prediction mode.
- the coefficient determination unit 21 is specifically configured to determine that the current block is not allowed to use the interpolation filtering prediction mode if the first information indicates that the template matching-based technology is not enabled.
- the coefficient determination unit 21 is specifically used to determine second information if the first information indicates that the template matching-based technology is turned on, and the second information is used to indicate whether the current sequence is allowed to use the interpolation filtering prediction mode for prediction; based on the second information, determine whether the current block is allowed to use the interpolation filtering prediction mode.
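The permission checks listed above can be sketched as one decision function. The source presents them as separate options, so combining them into a single chain is an assumption made here for illustration. The names `BlockContext` and `ifp_allowed` are hypothetical, and the sketch also assumes that the second information directly decides the outcome once the template matching-based technology is turned on.

```python
from dataclasses import dataclass


@dataclass
class BlockContext:
    in_first_ctu_row: bool      # current block lies in the first row of the current CTU
    is_intra_picture: bool      # current image is an intra-frame prediction image
    template_matching_on: bool  # "first information"
    sequence_allows_ifp: bool   # "second information" (sequence level)


def ifp_allowed(ctx: BlockContext) -> bool:
    """Return True if the interpolation filtering prediction mode may be used."""
    if ctx.in_first_ctu_row:
        return False  # not allowed in the first row of the CTU
    if not ctx.is_intra_picture:
        return False  # not allowed when the image is not an intra-frame prediction image
    if not ctx.template_matching_on:
        return False  # first information: template matching-based technology is off
    # Assumption: the second information alone decides the remaining case.
    return ctx.sequence_allows_ifp
```

Only when this function returns True would the reference area and interpolation filter of the current block be determined.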
- the encoding unit 23 is specifically configured to determine an intra-frame prediction mode corresponding to the prediction block; and determine a transform kernel corresponding to the current block based on the intra-frame prediction mode corresponding to the prediction block.
- the encoding unit 23 is specifically configured to determine angle values of R points in the prediction block, where R is a positive integer; and determine an intra-frame prediction mode corresponding to the prediction block based on the angle values of the R points.
- the encoding unit 23 is specifically used to obtain the correspondence between the intra-frame prediction mode and the transform core group, wherein one transform core group includes at least one type of transform core; in the correspondence, the first transform core group corresponding to the intra-frame prediction mode of the prediction block is searched; and from the first transform core group, the transform core corresponding to the current block is determined.
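The transform core selection described in the three items above can be sketched as follows. Everything concrete in this sketch is an assumption: the mode numbers, the angle assigned to each mode, the contents of `MODE_TO_GROUP`, the majority vote used to turn R angle values into one mode, and the `core_index` argument used to pick a core from the group. The source only states that such a correspondence exists and is searched.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Hypothetical nominal angle (degrees) of each candidate intra-frame prediction mode.
MODE_ANGLES: Dict[int, float] = {18: 0.0, 34: 45.0, 50: 90.0}

# Hypothetical correspondence between intra-frame prediction mode and transform core group.
MODE_TO_GROUP: Dict[int, List[str]] = {
    18: ["DCT2", "DST7"],
    34: ["DCT2"],
    50: ["DST7", "DCT8"],
}


def mode_from_angles(angles: Sequence[float]) -> int:
    """Map the angle values of R points to one mode by majority vote (assumed rule)."""
    votes = Counter(
        min(MODE_ANGLES, key=lambda m: abs(MODE_ANGLES[m] - a)) for a in angles
    )
    return votes.most_common(1)[0][0]


def select_transform_core(angles: Sequence[float], core_index: int = 0) -> str:
    """Look up the first transform core group for the derived mode and pick one core."""
    mode = mode_from_angles(angles)
    group = MODE_TO_GROUP[mode]   # first transform core group
    return group[core_index]      # how the core is chosen within the group is assumed
```

An encoder could, for example, try each core in the group and signal the chosen index, while a decoder would read that index from the code stream; neither behaviour is stated in this passage.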
- the coefficient determination unit 21 is specifically configured to determine a reference area of the current block among preset P reference areas, where P is a positive integer greater than 1.
- the coefficient determination unit 21 is specifically configured to determine the interpolation filter of the current block from among Q preset interpolation filters, where Q is a positive integer greater than 1.
- the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, the details are not repeated here.
- the device 20 shown in Figure 31 may correspond to the entity that performs the encoding method at the encoding end in the embodiment of the present application, and the aforementioned and other operations and/or functions of each unit in the device 20 respectively implement the corresponding processes of that method; for the sake of brevity, they are not repeated here.
- the functional unit can be implemented in hardware form, can be implemented by instructions in software form, and can also be implemented by a combination of hardware and software units.
- the steps of the method embodiment in the embodiment of the present application can be completed by integrated hardware logic circuits in the processor and/or by instructions in software form, and the steps of the method disclosed in the embodiment of the present application can be performed directly by a hardware decoding processor, or by a combination of hardware and software units in the decoding processor.
- the software unit can be located in a storage medium well established in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, a register, etc.
- the storage medium is located in a memory, and the processor reads the information in the memory, and completes the steps in the above method embodiment in conjunction with its hardware.
- Figure 32 is a schematic block diagram of an electronic device provided in an embodiment of the present application.
- the electronic device 30 may be a video encoder or a video decoder as described in the embodiment of the present application, and the electronic device 30 may include:
- a memory 33 and a processor 32, where the memory 33 is used to store the computer program 34 and transmit the program code to the processor 32.
- the processor 32 can call and run the computer program 34 from the memory 33 to implement the method in the embodiment of the present application.
- the processor 32 may be configured to execute the steps in the above method 200 according to the instructions in the computer program 34.
- the processor 32 may include but is not limited to:
- digital signal processor (DSP)
- application-specific integrated circuit (ASIC)
- field programmable gate array (FPGA)
- the memory 33 includes but is not limited to:
- Non-volatile memory can be read-only memory (ROM), programmable ROM (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM) or flash memory.
- the volatile memory can be random access memory (RAM), which is used as an external cache.
- static RAM (SRAM)
- dynamic RAM (DRAM)
- synchronous DRAM (SDRAM)
- double data rate synchronous dynamic random access memory (DDR SDRAM)
- enhanced synchronous dynamic random access memory (ESDRAM)
- synchronous link DRAM (SLDRAM)
- Direct Rambus RAM
- the computer program 34 may be divided into one or more units, which are stored in the memory 33 and executed by the processor 32 to complete the method provided by the present application.
- the one or more units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 34 in the electronic device 30.
- the electronic device 30 may further include:
- the transceiver 33 may be connected to the processor 32 or the memory 33.
- the processor 32 may control the transceiver 33 to communicate with other devices, specifically, to send information or data to other devices, or to receive information or data sent by other devices.
- the transceiver 33 may include a transmitter and a receiver.
- the transceiver 33 may further include an antenna, and the number of antennas may be one or more.
- the bus system includes not only a data bus but also a power bus, a control bus and a status signal bus.
- Figure 33 is a schematic block diagram of a video encoding and decoding system provided in an embodiment of the present application.
- the video encoding and decoding system 40 may include: a video encoder 41 and a video decoder 42, wherein the video encoder 41 is used to execute the video encoding method involved in the embodiment of the present application, and the video decoder 42 is used to execute the video decoding method involved in the embodiment of the present application.
- the present application also provides a computer storage medium on which a computer program is stored, and when the computer program is executed by a computer, the computer can perform the method of the above method embodiment.
- the present application embodiment also provides a computer program product containing instructions, and when the instructions are executed by a computer, the computer can perform the method of the above method embodiment.
- the present application also provides a code stream, which is generated according to the above encoding method.
- the computer program product includes one or more computer instructions.
- the computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
- the computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, wireless, microwave, etc.).
- the computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media.
- the available medium can be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), etc.
- the disclosed systems, devices and methods can be implemented in other ways.
- the device embodiments described above are only schematic.
- the division of the unit is only a logical function division.
- in addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
- each functional unit in each embodiment of the present application may be integrated into a processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Priority Applications (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/089855 WO2024216632A1 (zh) | 2023-04-21 | 2023-04-21 | 视频编解码方法、装置、设备、系统、及存储介质 |
| JP2025561338A JP2026512534A (ja) | 2023-04-21 | 2023-04-21 | ビデオコーデック方法、装置、機器、システムおよび記憶媒体 |
| CN202380097396.3A CN121100525A (zh) | 2023-04-21 | 2023-04-21 | 视频编解码方法、装置、设备、系统、及存储介质 |
| EP23933508.6A EP4701181A1 (en) | 2023-04-21 | 2023-04-21 | Video encoding method and apparatus, video decoding method and apparatus, and device, system and storage medium |
| MX2025012524A MX2025012524A (es) | 2023-04-21 | 2025-10-20 | Metodo y aparato de codificacion de video, metodo y aparato de decodificacion de video, y dispositivo, sistema y medio de almacenamiento |
| US19/363,220 US20260046392A1 (en) | 2023-04-21 | 2025-10-20 | Video encoding method and apparatus, video decoding method and apparatus, and device, system and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/089855 WO2024216632A1 (zh) | 2023-04-21 | 2023-04-21 | 视频编解码方法、装置、设备、系统、及存储介质 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US19/363,220 Continuation US20260046392A1 (en) | 2023-04-21 | 2025-10-20 | Video encoding method and apparatus, video decoding method and apparatus, and device, system and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024216632A1 (zh) | 2024-10-24 |
Family
ID=93151793
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/089855 Ceased WO2024216632A1 (zh) | 2023-04-21 | 2023-04-21 | 视频编解码方法、装置、设备、系统、及存储介质 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20260046392A1 |
| EP (1) | EP4701181A1 |
| JP (1) | JP2026512534A |
| CN (1) | CN121100525A |
| MX (1) | MX2025012524A |
| WO (1) | WO2024216632A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119629334A (zh) * | 2024-12-11 | 2025-03-14 | 北京达佳互联信息技术有限公司 | 视频编码方法及装置 |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1933600A (zh) * | 2006-09-08 | 2007-03-21 | 清华大学 | 用于h.264/avc编码器的运动估计方法 |
| US20170244974A1 (en) * | 2014-11-05 | 2017-08-24 | Samsung Electronics Co., Ltd. | Video encoding method and apparatus, or video decoding method and apparatus that perform intra prediction based on one or more sample values and one or more patterns that are determined with respect to block |
| CN108513137A (zh) * | 2016-02-24 | 2018-09-07 | 联发科技股份有限公司 | 可重构插值滤波器与相关的插值滤波方法 |
| CN110381321A (zh) * | 2019-08-23 | 2019-10-25 | 西安邮电大学 | 一种用于运动补偿的插值计算并行实现方法 |
| CN114501029A (zh) * | 2022-01-12 | 2022-05-13 | 深圳市洲明科技股份有限公司 | 图像编码、图像解码方法、装置、计算机设备和存储介质 |
2023
- 2023-04-21 JP JP2025561338A patent/JP2026512534A/ja active Pending
- 2023-04-21 CN CN202380097396.3A patent/CN121100525A/zh active Pending
- 2023-04-21 EP EP23933508.6A patent/EP4701181A1/en active Pending
- 2023-04-21 WO PCT/CN2023/089855 patent/WO2024216632A1/zh not_active Ceased
2025
- 2025-10-20 US US19/363,220 patent/US20260046392A1/en active Pending
- 2025-10-20 MX MX2025012524A patent/MX2025012524A/es unknown
Also Published As
| Publication number | Publication date |
|---|---|
| MX2025012524A (es) | 2025-11-03 |
| EP4701181A1 (en) | 2026-02-25 |
| JP2026512534A (ja) | 2026-04-16 |
| US20260046392A1 (en) | 2026-02-12 |
| CN121100525A (zh) | 2025-12-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN114913249B (zh) | | 编码、解码方法和相关设备 |
| US20240236372A1 (en) | | Video encoding and decoding method, and device |
| US20240236371A1 (en) | | Video encoding method, video decoding method, device, system, and storage medium |
| WO2024212735A1 (zh) | | 图像滤波方法、装置、设备及存储介质 |
| WO2023122969A1 (zh) | | 帧内预测方法、设备、系统、及存储介质 |
| WO2024007128A1 (zh) | | 视频编解码方法、装置、设备、系统、及存储介质 |
| WO2023122968A1 (zh) | | 帧内预测方法、设备、系统、及存储介质 |
| WO2024216632A1 (zh) | | 视频编解码方法、装置、设备、系统、及存储介质 |
| WO2023184747A1 (zh) | | 视频编解码方法、装置、设备、系统及存储介质 |
| WO2024192733A1 (zh) | | 视频编解码方法、装置、设备、系统、及存储介质 |
| CN117082239A (zh) | | 视频编解码方法与系统、及视频编码器与视频解码器 |
| CN114979628B (zh) | | 图像块预测样本的确定方法及编解码设备 |
| US20250030843A1 (en) | | Video coding method and storage medium |
| CN114979629B (zh) | | 图像块预测样本的确定方法及编解码设备 |
| WO2024183007A1 (zh) | | 视频编解码方法、装置、设备、系统、及存储介质 |
| WO2024260119A1 (zh) | | 基于神经网络的图像滤波方法、装置、设备及存储介质 |
| WO2025007254A1 (zh) | | 视频编解码方法、装置、设备、系统、及存储介质 |
| WO2024243739A1 (zh) | | 视频编解码方法、装置、设备、系统、及存储介质 |
| HK40074375A (en) | | Method for determining image block prediction sample, and encoding and decoding device |
| HK40074376A (en) | | Method for determining image block prediction sample, and encoding and decoding device |
| WO2025118652A1 (zh) | | 视频编解码方法、装置、设备及存储介质 |
| WO2024152254A1 (zh) | | 视频编解码方法、装置、设备、系统、及存储介质 |
| HK40074375B (zh) | | 图像块预测样本的确定方法及编解码设备 |
| WO2023236113A1 (zh) | | 视频编解码方法、装置、设备、系统及存储介质 |
| CN120128731A (zh) | | 视频编解码方法、装置、设备 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23933508; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2025561338; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: MX/A/2025/012524; Country of ref document: MX. Ref document number: 2025561338; Country of ref document: JP |
| | WWP | WIPO information: published in national office | Ref document number: MX/A/2025/012524; Country of ref document: MX |
| | WWE | WIPO information: entry into national phase | Ref document number: 202517114313; Country of ref document: IN |
| | WWE | WIPO information: entry into national phase | Ref document number: 2023933508; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2023933508; Country of ref document: EP; Effective date: 20251121 |
| | WWP | WIPO information: published in national office | Ref document number: 202517114313; Country of ref document: IN |
| | WWP | WIPO information: published in national office | Ref document number: 2023933508; Country of ref document: EP |