WO2023044868A1 - Video coding and decoding method, device, system, and storage medium - Google Patents
Video coding and decoding method, device, system, and storage medium
- Publication number
- WO2023044868A1 (PCT/CN2021/120773)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- transformation
- current block
- prediction mode
- intra
- transform
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 168
- 208000037170 Delayed Emergence from Anesthesia Diseases 0.000 claims abstract description 67
- 230000009466 transformation Effects 0.000 claims description 714
- 238000004590 computer program Methods 0.000 claims description 29
- 238000000844 transformation Methods 0.000 claims description 6
- 230000000694 effects Effects 0.000 abstract description 14
- 230000000875 corresponding effect Effects 0.000 description 172
- 238000013139 quantization Methods 0.000 description 49
- 230000008569 process Effects 0.000 description 33
- 238000010586 diagram Methods 0.000 description 27
- 241000023320 Luma <angiosperm> Species 0.000 description 16
- OSWPMRLSEDHDFF-UHFFFAOYSA-N methyl salicylate Chemical compound COC(=O)C1=CC=CC=C1O OSWPMRLSEDHDFF-UHFFFAOYSA-N 0.000 description 16
- 238000005516 engineering process Methods 0.000 description 15
- 230000006835 compression Effects 0.000 description 13
- 238000007906 compression Methods 0.000 description 13
- 238000001914 filtration Methods 0.000 description 13
- 230000006870 function Effects 0.000 description 10
- 230000033001 locomotion Effects 0.000 description 9
- 238000004891 communication Methods 0.000 description 8
- 238000011426 transformation method Methods 0.000 description 8
- 230000002596 correlated effect Effects 0.000 description 6
- 239000011159 matrix material Substances 0.000 description 6
- 230000001360 synchronised effect Effects 0.000 description 5
- 230000005540 biological transmission Effects 0.000 description 4
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 238000009795 derivation Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 230000003044 adaptive effect Effects 0.000 description 2
- 230000000903 blocking effect Effects 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 238000013500 data storage Methods 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000000926 separation method Methods 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 230000003068 static effect Effects 0.000 description 2
- 230000017105 transposition Effects 0.000 description 2
- 238000003491 array Methods 0.000 description 1
- 230000002457 bidirectional effect Effects 0.000 description 1
- 238000004422 calculation algorithm Methods 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000009977 dual effect Effects 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 239000013307 optical fiber Substances 0.000 description 1
- 238000012958 reprocessing Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 238000000638 solvent extraction Methods 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/18—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- the present application relates to the technical field of video coding and decoding, and in particular to a video coding and decoding method, device, system, and storage medium.
- Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers, or video players, among others.
- video devices implement video compression technology to enable more effective transmission or storage of video data.
- Video is compressed through encoding, and the encoding process includes prediction, transformation, and quantization. For example, a prediction block of the current block is determined through intra-frame prediction and/or inter-frame prediction; the prediction block is subtracted from the current block to obtain a residual block; the residual block is transformed to obtain transform coefficients; the transform coefficients are quantized to obtain quantized coefficients; and the quantized coefficients are encoded to form a code stream.
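The prediction–residual–transform–quantization chain described above can be sketched in a few lines. This is an illustrative toy, not the method of the application: a naive 1-D DCT stands in for the real 2-D block transform, and the quantization rule and all names are placeholders.

```python
import math

def dct_1d(x):
    """Naive type-II DCT of a 1-D residual signal (illustrative)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n)) for k in range(n)]

def encode_block(block, prediction, qstep):
    """Sketch of one block's encode path: residual, transform, quantize."""
    residual = [c - p for c, p in zip(block, prediction)]
    coeffs = dct_1d(residual)
    # Quantization placeholder: divide by the step size, truncate toward zero.
    return [int(c / qstep) for c in coeffs]
```

Note how the transform concentrates the smooth residual `[0, 2, 4, 6]` into a few low-frequency coefficients, which is what makes the subsequent quantization and entropy coding effective.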
- the intra prediction mode of the current block may be a weighted prediction mode
- the transform mode of the current block may be a quadratic transform mode.
- in existing schemes, the transformation kernel is fixed to the one corresponding to a certain intra prediction mode, which makes the selection of the secondary transformation kernel inflexible and the transformation effect poor, leading to unsatisfactory overall codec performance.
- the embodiments of the present application provide a video encoding and decoding method, device, system, and storage medium that select the target transformation kernel corresponding to the current block from among M preset types of secondary transformation kernels, instead of fixedly selecting the transformation kernel corresponding to a certain mode, thereby improving the flexibility of kernel selection, improving the transformation effect, and improving the overall performance of encoding and decoding.
- the present application provides a video decoding method, including:
- Inverse basic transformation is performed on the basic transformation coefficients to obtain a residual block of the current block.
- the embodiment of the present application provides a video encoding method, including:
- selecting the target transformation kernel corresponding to the current block from the preset transformation kernels of M types of secondary transformation, where M is a positive integer greater than 1;
- the present application provides a video encoder, configured to execute the method in the above first aspect or various implementations thereof.
- the encoder includes a functional unit configured to execute the method in the above first aspect or its implementations.
- the present application provides a video decoder, configured to execute the method in the above second aspect or various implementations thereof.
- the decoder includes a functional unit configured to execute the method in the above second aspect or its various implementations.
- a video encoder including a processor and a memory.
- the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so as to execute the method in the above first aspect or its various implementations.
- a sixth aspect provides a video decoder, including a processor and a memory.
- the memory is used to store a computer program
- the processor is used to invoke and run the computer program stored in the memory, so as to execute the method in the above second aspect or its various implementations.
- a video codec system including a video encoder and a video decoder.
- the video encoder is configured to execute the method in the above first aspect or its various implementations
- the video decoder is configured to execute the method in the above second aspect or its various implementations.
- the chip includes: a processor, configured to call and run a computer program from the memory, so that the device installed with the chip executes the method in any one of the above-mentioned first to second aspects or their implementations.
- a computer-readable storage medium for storing a computer program, and the computer program causes a computer to execute any one of the above-mentioned first to second aspects or the method in each implementation manner thereof.
- a computer program product including computer program instructions, the computer program instructions cause a computer to execute any one of the above first to second aspects or the method in each implementation manner.
- a computer program which, when running on a computer, causes the computer to execute any one of the above-mentioned first to second aspects or the method in each implementation manner thereof.
- a code stream is provided, where the code stream is generated by the method in the first aspect above or any implementation thereof.
- the second transformation coefficient of the current block is obtained, where the second transformation coefficient is the transformation coefficient formed by the encoding end through the secondary transformation of the residual block of the current block; if it is determined that the intra prediction mode of the current block is a weighted prediction mode, the target transformation kernel corresponding to the current block is selected from the preset M types of secondary transformation kernels, where M is a positive integer greater than 1; the inverse secondary transformation is performed on the second transformation coefficient using the target transformation kernel to obtain the basic transformation coefficient of the current block; and the inverse basic transformation is performed on the basic transformation coefficient to obtain the residual block of the current block.
- the present application selects the target transformation kernel corresponding to the current block from the preset M types of secondary transformation kernels, instead of fixedly using the transformation kernel corresponding to a certain mode, thereby improving the flexibility of transformation kernel selection, improving the transformation effect, and improving the overall performance of the codec.
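The core idea — choosing one of M candidate secondary-transform kernels according to the intra prediction mode instead of using a single fixed kernel — can be sketched as follows. The 2×2 kernels and the mode-to-set mapping below are made-up placeholders, not the kernels or mapping defined in the application.

```python
# Placeholder kernel sets; real secondary transforms (e.g. LFNST) use
# larger trained non-separable matrices.
M_KERNELS = [
    [[1, 0], [0, 1]],    # set 0: identity placeholder
    [[0, 1], [1, 0]],    # set 1: swap placeholder
]

def select_kernel(intra_mode):
    """Map the intra prediction mode to one of M kernel sets (toy rule)."""
    return M_KERNELS[intra_mode % len(M_KERNELS)]

def inverse_secondary_transform(coeffs, intra_mode):
    """Apply the selected kernel's inverse to the secondary coefficients
    to recover the basic (primary) transform coefficients. For orthonormal
    kernels the inverse is the transpose; these placeholders are chosen so
    that holds."""
    k = select_kernel(intra_mode)
    n = len(k)
    return [sum(k[r][i] * coeffs[r] for r in range(n)) for i in range(n)]
```

The point of the scheme is that `select_kernel` depends on the block's prediction mode, so differently predicted blocks can use differently shaped kernels rather than one fixed choice.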
- FIG. 1 is a schematic block diagram of a video encoding and decoding system involved in an embodiment of the present application
- Fig. 2 is a schematic block diagram of a video encoder involved in an embodiment of the present application
- Fig. 3 is a schematic block diagram of a video decoder involved in an embodiment of the present application.
- Fig. 4 is the schematic diagram of LFNST transformation
- FIG. 5 is a schematic diagram of an intra prediction mode
- FIG. 6 is a schematic diagram of another intra prediction mode
- Figure 7A is a schematic diagram of prediction of DIMD
- Figure 7B is a histogram of magnitude values and prediction modes of DIMD
- Figure 8 is a schematic diagram of the prediction of TIMD
- FIG. 9 is a schematic flowchart of a video decoding method provided in an embodiment of the present application.
- FIG. 10 is a schematic diagram of a video decoding process involved in an embodiment of the present application.
- FIG. 11 is another schematic flowchart of a video decoding method provided by an embodiment of the present application.
- FIG. 12 is a schematic diagram of the decoding process involved in the embodiment of the present application.
- FIG. 13 is a schematic flowchart of a video encoding method provided by an embodiment of the present application.
- FIG. 14 is a schematic diagram of the video encoding process involved in the embodiment of the present application.
- FIG. 15 is a schematic flowchart of a video encoding method provided by an embodiment of the present application.
- FIG. 16 is a schematic diagram of the video encoding process involved in the embodiment of the present application.
- Fig. 17 is a schematic block diagram of a video decoder provided by an embodiment of the present application.
- Fig. 18 is a schematic block diagram of a video encoder provided by an embodiment of the present application.
- Fig. 19 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
- Fig. 20 is a schematic block diagram of a video encoding and decoding system provided by an embodiment of the present application.
- the application can be applied to the field of image codec, video codec, hardware video codec, dedicated circuit video codec, real-time video codec, etc.
- the solutions of the present application may be combined with audio and video coding standards (audio video coding standard, AVS for short), for example, the H.264/advanced video coding (AVC for short) standard, the H.265/high efficiency video coding (HEVC for short) standard, and the H.266/versatile video coding (VVC for short) standard.
- the solutions of the present application may operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multi-view video coding (MVC) extensions.
- FIG. 1 is a schematic block diagram of a video encoding and decoding system involved in an embodiment of the present application. It should be noted that FIG. 1 is only an example, and the video codec system in the embodiment of the present application includes but is not limited to what is shown in FIG. 1 .
- the video codec system 100 includes an encoding device 110 and a decoding device 120 .
- the encoding device is used to encode (can be understood as compression) the video data to generate a code stream, and transmit the code stream to the decoding device.
- the decoding device decodes the code stream generated by the encoding device to obtain decoded video data.
- the encoding device 110 in the embodiment of the present application can be understood as a device having a video encoding function
- the decoding device 120 can be understood as a device having a video decoding function; that is, the embodiments of the present application cover a wide range of devices for the encoding device 110 and the decoding device 120, including smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, vehicle-mounted computers, and the like.
- the encoding device 110 may transmit the encoded video data (such as code stream) to the decoding device 120 via the channel 130 .
- Channel 130 may include one or more media and/or devices capable of transmitting encoded video data from encoding device 110 to decoding device 120 .
- channel 130 includes one or more communication media that enable encoding device 110 to transmit encoded video data directly to decoding device 120 in real-time.
- encoding device 110 may modulate the encoded video data according to a communication standard and transmit the modulated video data to decoding device 120 .
- the communication medium includes a wireless communication medium, such as a radio frequency spectrum.
- the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
- the channel 130 includes a storage medium that can store video data encoded by the encoding device 110 .
- the storage medium includes a variety of local access data storage media, such as optical discs, DVDs, flash memory, and the like.
- the decoding device 120 may acquire encoded video data from the storage medium.
- channel 130 may include a storage server that may store video data encoded by encoding device 110 .
- the decoding device 120 may download the stored encoded video data from the storage server.
- the storage server may store the encoded video data and may transmit the encoded video data to the decoding device 120; examples include a web server (e.g., for a website), a file transfer protocol (FTP) server, and the like.
- the encoding device 110 includes a video encoder 112 and an output interface 113 .
- the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
- the encoding device 110 may include a video source 111 in addition to the video encoder 112 and the output interface 113 .
- the video source 111 may include at least one of a video capture device (for example, a video camera), a video archive, a video input interface, or a computer graphics system, where the video input interface is used to receive video data from a video content provider, and the computer graphics system is used to generate video data.
- the video encoder 112 encodes the video data from the video source 111 to generate a code stream.
- Video data may include one or more pictures or a sequence of pictures.
- the code stream contains the encoding information of an image or image sequence in the form of a bit stream.
- Encoding information may include encoded image data and associated data.
- the associated data may include a sequence parameter set (SPS for short), a picture parameter set (PPS for short) and other syntax structures.
- An SPS may contain parameters that apply to one or more sequences.
- a PPS may contain parameters applied to one or more images.
- the syntax structure refers to a set of zero or more syntax elements arranged in a specified order in the code stream.
- the video encoder 112 directly transmits encoded video data to the decoding device 120 via the output interface 113 .
- the encoded video data can also be stored on a storage medium or a storage server for subsequent reading by the decoding device 120 .
- the decoding device 120 includes an input interface 121 and a video decoder 122 .
- the decoding device 120 may include a display device 123 in addition to the input interface 121 and the video decoder 122 .
- the input interface 121 includes a receiver and/or a modem.
- the input interface 121 can receive encoded video data through the channel 130 .
- the video decoder 122 is used to decode the encoded video data to obtain decoded video data, and transmit the decoded video data to the display device 123 .
- the display device 123 displays the decoded video data.
- the display device 123 may be integrated with the decoding device 120 or external to the decoding device 120 .
- the display device 123 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
- FIG. 1 is only an example, and the technical solutions of the embodiments of the present application are not limited to FIG. 1 .
- the technology of the present application may also be applied to video encoding only or to video decoding only.
- Fig. 2 is a schematic block diagram of a video encoder involved in an embodiment of the present application. It should be understood that the video encoder 200 can be used to perform lossy compression on images, and can also be used to perform lossless compression on images.
- the lossless compression may be visually lossless compression or mathematically lossless compression.
- the video encoder 200 can be applied to image data in luminance-chrominance (YCbCr, YUV) format.
- the YUV ratio can be 4:2:0, 4:2:2 or 4:4:4, where Y denotes luminance (Luma), Cb (U) denotes blue chroma, and Cr (V) denotes red chroma; U and V together are referred to as chroma (Chroma) and describe hue and saturation.
- 4:2:0 means that every 4 pixels have 4 luminance components and 2 chrominance components (YYYYCbCr)
- 4:2:2 means that every 4 pixels have 4 luminance components and 4 chrominance components (YYYYCbCrCbCr)
- 4:4:4 means full pixel display, i.e., every 4 pixels have 4 luminance components and 8 chrominance components (YYYYCbCrCbCrCbCrCbCr).
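Assuming even frame dimensions for the subsampled formats, the per-plane sample counts implied by these ratios can be computed with a small helper (illustrative, not part of the application):

```python
def plane_sizes(width, height, fmt):
    """Return (luma_samples, chroma_samples_per_plane) for one frame."""
    luma = width * height
    if fmt == "4:2:0":      # chroma halved horizontally and vertically
        chroma = (width // 2) * (height // 2)
    elif fmt == "4:2:2":    # chroma halved horizontally only
        chroma = (width // 2) * height
    elif fmt == "4:4:4":    # full-resolution chroma
        chroma = luma
    else:
        raise ValueError("unknown format: " + fmt)
    return luma, chroma
```

For example, a 4:2:0 frame carries only a quarter as many Cb (and Cr) samples as luma samples, which is why this format dominates consumer video.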
- the video encoder 200 reads video data, and divides a frame of image into several coding tree units (coding tree units, CTUs) for each frame of image in the video data.
- a CTU may also be called a "tree block", a "largest coding unit" (LCU for short), or a "coding tree block" (CTB for short).
- Each CTU may be associated with a pixel block of equal size within the image. Each pixel may correspond to one luminance (luma) sample and two chrominance (chrominance or chroma) samples.
- each CTU may be associated with one block of luma samples and two blocks of chroma samples.
- a CTU size is, for example, 128 ⁇ 128, 64 ⁇ 64, 32 ⁇ 32 and so on.
- a CTU can be further divided into several coding units (Coding Unit, CU) for coding, and the CU can be a rectangular block or a square block.
- the CU can be further divided into a prediction unit (PU for short) and a transform unit (TU for short), so that coding, prediction, and transformation are separated, and processing is more flexible.
- a CTU is divided into CUs in a quadtree manner, and a CU is divided into TUs and PUs in a quadtree manner.
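A minimal sketch of the quadtree partitioning mentioned above, splitting a CTU into square CUs. The split criterion here (split until a fixed minimum size) is a placeholder for the real rate-distortion-driven decision an encoder makes:

```python
def quadtree_split(x, y, size, min_size):
    """Recursively split a square region; return leaf CUs as (x, y, size)."""
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):          # top row, then bottom row of sub-blocks
        for dx in (0, half):      # left column, then right column
            leaves += quadtree_split(x + dx, y + dy, half, min_size)
    return leaves
```

Each split replaces one block with four blocks of half the side length, so a 64×64 CTU with a 16×16 minimum yields 16 leaf CUs.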
- the video encoder and video decoder can support various PU sizes. Assuming that the size of a specific CU is 2N ⁇ 2N, video encoders and video decoders may support 2N ⁇ 2N or N ⁇ N PU sizes for intra prediction, and support 2N ⁇ 2N, 2N ⁇ N, N ⁇ 2N, NxN or similarly sized symmetric PUs for inter prediction. The video encoder and video decoder may also support asymmetric PUs of 2NxnU, 2NxnD, nLx2N, and nRx2N for inter prediction.
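The PU shapes listed above for a 2N×2N CU can be enumerated as (width, height) pairs; the asymmetric inter modes split one side in a 1/4 : 3/4 ratio. A sketch (the mode names follow the HEVC convention quoted above):

```python
def pu_partitions(n, mode):
    """Return the PU sizes (width, height) for a 2Nx2N CU under one mode."""
    s = 2 * n
    table = {
        "2Nx2N": [(s, s)],                              # no split
        "NxN":   [(n, n)] * 4,                          # four squares
        "2NxN":  [(s, n)] * 2,                          # horizontal halves
        "Nx2N":  [(n, s)] * 2,                          # vertical halves
        "2NxnU": [(s, s // 4), (s, 3 * s // 4)],        # asymmetric, top small
        "2NxnD": [(s, 3 * s // 4), (s, s // 4)],        # asymmetric, bottom small
        "nLx2N": [(s // 4, s), (3 * s // 4, s)],        # asymmetric, left small
        "nRx2N": [(3 * s // 4, s), (s // 4, s)],        # asymmetric, right small
    }
    return table[mode]
```

Whatever the mode, the PU areas always sum to the CU area, which the test below checks.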
- the video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filtering unit 260, a decoded image cache 270, and an entropy coding unit 280. It should be noted that the video encoder 200 may include more, fewer, or different functional components.
- the current block may be called a current coding unit (CU) or a current prediction unit (PU).
- a predicted block may also be called a predicted image block or an image predicted block, and a reconstructed image block may also be called a reconstructed block or an image reconstructed image block.
- the prediction unit 210 includes an inter prediction unit 211 and an intra estimation unit 212 . Because there is a strong correlation between adjacent pixels within a video frame, intra-frame prediction is used in video coding and decoding technology to eliminate spatial redundancy between adjacent pixels. Because there is strong similarity between adjacent frames in a video, inter-frame prediction is used to eliminate temporal redundancy between adjacent frames, thereby improving coding efficiency.
- the inter-frame prediction unit 211 can be used for inter-frame prediction.
- the inter-frame prediction can refer to image information of different frames.
- the inter-frame prediction uses motion information to find a reference block from the reference frame, and generates a prediction block according to the reference block to eliminate temporal redundancy;
- Frames used for inter-frame prediction may be P frames and/or B frames, P frames refer to forward predictive frames, and B frames refer to bidirectional predictive frames.
- the motion information includes the reference frame list where the reference frame is located, the reference frame index, and the motion vector.
- the motion vector can point to an integer pixel or a sub-pixel position. If the motion vector is a sub-pixel vector, interpolation filtering must be applied in the reference frame to generate the required sub-pixel block.
- the block of whole pixels or sub-pixels found in the reference frame according to the motion vector is called a reference block.
- Some technologies directly use the reference block as the prediction block, while others further process the reference block to generate the prediction block. Further processing based on a reference block can also be understood as taking the reference block as the prediction block and then generating a new prediction block from it.
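When the motion vector has a fractional part, the predictor sample must be interpolated between integer samples, as noted above. A 1-D linear-interpolation sketch; real codecs use longer filter taps (e.g., 8-tap filters), and the names here are illustrative:

```python
def interp_sample(row, mv_frac, base):
    """Interpolate a sample at fractional position base + mv_frac
    (0 <= mv_frac < 1) from a row of integer-position samples."""
    a, b = row[base], row[base + 1]
    return a + (b - a) * mv_frac
```

A half-pixel motion vector (`mv_frac = 0.5`) thus yields the average of the two neighbouring integer samples.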
- the intra-frame estimation unit 212 refers only to information within the same frame to predict the pixel information of the current coded image block, in order to eliminate spatial redundancy.
- a frame used for intra prediction may be an I frame.
- the pixels in the column to the left of and the row above the current block serve as reference pixels of the current block, and intra prediction uses these reference pixels to predict the current block.
- These reference pixels may all be available, that is, already encoded and decoded. Some may be unavailable; for example, if the current block is at the leftmost edge of the frame, the reference pixels to its left are unavailable.
- Likewise, if the lower-left part of the current block has not yet been encoded and decoded, the reference pixels at the lower left are also unavailable.
- For unavailable reference pixels, the available reference pixels, some fixed value, or some derivation method can be used for filling, or no filling is performed.
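The availability-and-filling idea above can be sketched as follows, with `None` marking an unavailable neighbour. The nearest-previous-available fill rule and the default of 128 (mid-gray for 8-bit samples) are simplified placeholders, not the exact standard procedure:

```python
def fill_reference_pixels(refs, default=128):
    """Replace None entries with the nearest previously available sample,
    or with a default value if no sample has been seen yet."""
    out = []
    last = None
    for r in refs:
        if r is None:
            out.append(last if last is not None else default)
        else:
            out.append(r)
            last = r
    return out
```

After filling, the prediction logic can treat the whole reference array as available without per-sample availability checks.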
- the intra prediction modes used by HEVC include the planar mode (Planar), the DC mode, and 33 angular modes, for a total of 35 prediction modes.
- the intra modes used by VVC include Planar, DC, and 65 angular modes, for a total of 67 prediction modes.
- For the luminance component there is also a training-based matrix intra prediction (Matrix based intra prediction, MIP for short) mode, and for the chrominance component there is the CCLM prediction mode.
- with more prediction modes, intra prediction becomes more accurate and better meets the demands of high-definition and ultra-high-definition digital video.
- the residual unit 220 may generate a residual block of the CU based on the pixel blocks of the CU and the prediction blocks of the PUs of the CU. For example, the residual unit 220 may generate the residual block of the CU such that each sample in the residual block has a value equal to the difference between a sample in the pixel block of the CU and the corresponding sample in a prediction block of a PU of the CU.
- Transform/quantization unit 230 may quantize the transform coefficients. Transform/quantization unit 230 may quantize transform coefficients associated with TUs of a CU based on quantization parameter (QP) values associated with the CU. Video encoder 200 may adjust the degree of quantization applied to transform coefficients associated with a CU by adjusting the QP value associated with the CU.
- QP: quantization parameter.
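The role of the QP can be illustrated with a toy scalar quantizer. The step-size rule below (the step roughly doubling every 6 QP values) is only an HEVC/VVC-style approximation, and `qstep`, `quantize`, and `dequantize` are hypothetical helpers, not codec APIs:

```python
def qstep(qp: int) -> float:
    # Illustrative HEVC/VVC-style rule: the quantization step size
    # roughly doubles every 6 QP values.
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coeff: float, qp: int) -> int:
    return round(coeff / qstep(qp))

def dequantize(level: int, qp: int) -> float:
    return level * qstep(qp)

# Raising the QP by 6 doubles the step, i.e. coarser quantization
# and larger reconstruction error.
ratio = qstep(28) / qstep(22)
error = abs(dequantize(quantize(100.0, 22), 22) - 100.0)
```

This is why adjusting the QP value associated with a CU adjusts the degree of quantization: a larger QP means a larger step and more information discarded.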
- Inverse transform/quantization unit 240 may apply inverse quantization and inverse transform to the quantized transform coefficients, respectively, to reconstruct a residual block from the quantized transform coefficients.
- the reconstruction unit 250 may add samples of the reconstructed residual block to corresponding samples of one or more prediction blocks generated by the prediction unit 210 to generate a reconstructed image block associated with the TU. By reconstructing the sample blocks of each TU of the CU in this way, the video encoder 200 can reconstruct the pixel blocks of the CU.
- Loop filtering unit 260 may perform deblocking filtering operations to reduce blocking artifacts of pixel blocks associated with a CU.
- the loop filtering unit 260 includes a deblocking filtering unit and a sample adaptive offset/adaptive loop filtering (SAO/ALF) unit, wherein the deblocking filtering unit is used for deblocking, and the SAO/ALF unit is used to remove ringing effects.
- SAO/ALF: sample adaptive offset/adaptive loop filtering.
- the decoded image buffer 270 may store reconstructed pixel blocks.
- Inter prediction unit 211 may use reference pictures containing reconstructed pixel blocks to perform inter prediction on PUs of other pictures.
- intra estimation unit 212 may use the reconstructed pixel blocks in decoded picture cache 270 to perform intra prediction on other PUs in the same picture as the CU.
- Entropy encoding unit 280 may receive the quantized transform coefficients from transform/quantization unit 230 . Entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy encoded data.
- Fig. 3 is a schematic block diagram of a video decoder involved in an embodiment of the present application.
- the video decoder 300 includes: an entropy decoding unit 310 , a prediction unit 320 , an inverse quantization transformation unit 330 , a reconstruction unit 340 , a loop filter unit 350 and a decoded image buffer 360 . It should be noted that the video decoder 300 may include more, less or different functional components.
- the video decoder 300 can receive code streams.
- the entropy decoding unit 310 may parse the codestream to extract syntax elements from the codestream. As part of parsing the codestream, the entropy decoding unit 310 may parse the entropy-encoded syntax elements in the codestream.
- the prediction unit 320 , the inverse quantization transformation unit 330 , the reconstruction unit 340 and the loop filter unit 350 can decode video data according to the syntax elements extracted from the code stream, that is, generate decoded video data.
- the prediction unit 320 includes an intra estimation unit 321 and an inter prediction unit 322 .
- Intra estimation unit 321 may perform intra prediction to generate a predictive block for a PU.
- Intra estimation unit 321 may use an intra prediction mode to generate a prediction block for a PU based on pixel blocks of spatially neighboring PUs.
- Intra estimation unit 321 may also determine the intra prediction mode of the PU from one or more syntax elements parsed from the codestream.
- the inter prediction unit 322 may construct a first reference picture list (list 0) and a second reference picture list (list 1) according to the syntax elements parsed from the codestream. Furthermore, if the PU is encoded using inter prediction, entropy decoding unit 310 may parse the motion information for the PU. Inter prediction unit 322 may determine one or more reference blocks for the PU according to the motion information of the PU. Inter prediction unit 322 may generate a predictive block for the PU from one or more reference blocks for the PU.
- Inverse quantization transform unit 330 may inverse quantize (ie, dequantize) the transform coefficients associated with a TU. Inverse quantization transform unit 330 may use a QP value associated with a CU of a TU to determine the degree of quantization.
- inverse quantized transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients in order to generate a residual block associated with the TU.
- Reconstruction unit 340 uses the residual blocks associated with the TUs of the CU and the prediction blocks of the PUs of the CU to reconstruct the pixel blocks of the CU. For example, the reconstruction unit 340 may add the samples of the residual block to the corresponding samples of the prediction block to reconstruct the pixel block of the CU to obtain the reconstructed image block.
- Loop filtering unit 350 may perform deblocking filtering operations to reduce blocking artifacts of pixel blocks associated with a CU.
- Video decoder 300 may store the reconstructed picture of the CU in decoded picture cache 360 .
- the video decoder 300 may use the reconstructed picture in the decoded picture buffer 360 as a reference picture for subsequent prediction, or transmit the reconstructed picture to a display device for presentation.
- the basic flow of video encoding and decoding is as follows: at the encoding end, a frame of image is divided into blocks, and for the current block, the prediction unit 210 uses intra-frame prediction or inter-frame prediction to generate the prediction block of the current block .
- the residual unit 220 may calculate a residual block based on the predicted block and the original block of the current block, for example, subtract the predicted block from the original block of the current block to obtain a residual block, which may also be referred to as residual information.
- the residual block can be transformed and quantized by the transformation/quantization unit 230 to remove information that is not sensitive to human eyes, so as to eliminate visual redundancy.
- the residual block before being transformed and quantized by the transform/quantization unit 230 may be called a time domain residual block, and the time domain residual block after being transformed and quantized by the transform/quantization unit 230 may be called a frequency residual block or a frequency-domain residual block.
- the entropy encoding unit 280 receives the quantized transform coefficients output by the transform and quantization unit 230 , may perform entropy encoding on the quantized transform coefficients, and output a code stream.
- the entropy coding unit 280 can eliminate character redundancy according to the target context model and the probability information of the binary code stream.
- the entropy decoding unit 310 can analyze the code stream to obtain the prediction information of the current block, the quantization coefficient matrix, etc., and the prediction unit 320 uses intra prediction or inter prediction for the current block based on the prediction information to generate a prediction block of the current block.
- the inverse quantization transformation unit 330 uses the quantization coefficient matrix obtained from the code stream to perform inverse quantization and inverse transformation on the quantization coefficient matrix to obtain a residual block.
- the reconstruction unit 340 adds the predicted block and the residual block to obtain a reconstructed block.
- the reconstructed blocks form a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image based on the image or based on the block to obtain a decoded image.
- the encoding end also needs similar operations to the decoding end to obtain the decoded image.
- the decoded image may also be referred to as a reconstructed image, and the reconstructed image may serve as a reference frame for inter-frame prediction of subsequent frames.
- the block division information determined by the encoder as well as mode information or parameter information such as prediction, transformation, quantization, entropy coding, and loop filtering, etc., are carried in the code stream when necessary.
- the decoding end parses the code stream and analyzes the available information to determine the same block division information and the same prediction, transformation, quantization, entropy coding, loop filtering and other mode or parameter information as the encoding end, so as to ensure that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
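The per-block flow above (predict, subtract, transform/quantize, reconstruct) can be condensed into a toy numeric sketch. NumPy is used, a simple step-`q` quantizer stands in for the real transform and quantization, and entropy coding is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
orig = rng.integers(0, 256, size=(4, 4)).astype(np.int32)
# Toy "prediction": the original plus small noise, clipped to pixel range.
pred = np.clip(orig + rng.integers(-3, 4, size=(4, 4)), 0, 255).astype(np.int32)

# Encoder side: residual, then toy transform/quantization with step q.
q = 8
residual = orig - pred
levels = np.round(residual / q).astype(np.int32)  # coded levels

# Decoder side: dequantize and add back the same prediction.
recon = pred + levels * q

# Both ends get the identical reconstruction because both derive it
# from the same prediction and the same coded levels; the error is
# bounded by the quantization step.
max_err = np.abs(recon - orig).max()
```

This mirrors why the encoding end "also needs similar operations to the decoding end": the reference used for later prediction must be the reconstructed image, not the original.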
- the current block may be the current coding unit (CU) or the current prediction unit (PU).
- the above is the basic process of the video codec under the block-based hybrid coding framework. With the development of technology, some modules or steps of the framework or process may be optimized. This application is applicable to the block-based hybrid coding framework and the basic process of the video codec, but is not limited to this framework and process.
- the general hybrid coding framework first performs prediction, and the prediction uses the correlation in space or time to obtain an image that is the same as or similar to the current block.
- the hybrid coding framework will subtract the predicted image from the original image of the current block to obtain a residual image, or subtract the predicted block from the current block to obtain a residual block.
- Residual blocks are usually much simpler than the original image, so prediction can significantly improve compression efficiency.
- the residual block is not directly encoded, but usually transformed first.
- the transformation transforms the residual image from the spatial domain to the frequency domain, removing the correlation of the residual image. After the residual image is transformed into the frequency domain, most of the energy is concentrated in the low-frequency region, so most of the transformed non-zero coefficients gather in the upper left corner. Quantization is then used for further compression, and because the human eye is not sensitive to high frequencies, a larger quantization step size can be used in high-frequency areas.
- the trigonometric transforms most commonly used in video compression standards are DCT-II, DCT-VIII, and DST-VII.
- T_i(j) is the transform coefficient
- N is the number of points of the original signal
- i, j = 0, 1, ..., N-1
- ω0 is the compensation coefficient
- since images are 2-dimensional, the computation and memory overhead of a direct two-dimensional transform were unacceptable for the hardware conditions at the time. Therefore, when DCT-II, DCT-VIII or DST-VII is used in the standards, the transform is split into a horizontal direction and a vertical direction and performed as two one-dimensional transforms: for example, the horizontal transform first and then the vertical transform, or the vertical transform first and then the horizontal transform.
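The two-pass separable transform described above can be sketched with an orthonormal DCT-II matrix: transforming the rows and then the columns (or the reverse order) gives the same 2D result. This is a minimal NumPy illustration, not a standard-conformant integer transform:

```python
import numpy as np

def dct2_matrix(n: int) -> np.ndarray:
    # Orthonormal DCT-II basis: entry (i, j) of the transform matrix.
    m = np.array([[np.cos(np.pi * (2 * j + 1) * i / (2 * n)) for j in range(n)]
                  for i in range(n)])
    m[0] *= np.sqrt(1.0 / n)   # compensation factor for the i == 0 row
    m[1:] *= np.sqrt(2.0 / n)
    return m

n = 8
block = np.random.default_rng(1).standard_normal((n, n))
T = dct2_matrix(n)

# Two one-dimensional passes: horizontal (rows), then vertical (columns).
horizontal = block @ T.T
coeffs = T @ horizontal

# The order of the two passes does not matter for a separable transform.
vertical_first = (T @ block) @ T.T
```

Each pass is a 1D transform of length n, which is why the split avoids the cost of a direct (n*n)-point two-dimensional transform.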
- the above transformation method is effective for horizontal and vertical textures, and is very useful for improving compression efficiency.
- the effect of the above transformation method on oblique textures is worse. Therefore, as the demand for compression efficiency continues to increase, handling oblique textures more effectively can further improve compression efficiency.
- a secondary transform is currently used: after the above-mentioned DCT-II, DCT-VIII, DST-VII and other basic transforms (primary transform), a second transform is applied to the frequency-domain signal, converting it from one transform domain to another, before quantization, entropy coding and other operations are performed; the purpose is to further remove statistical redundancy.
- the low frequency non-separable transform (LFNST for short) is a reduced secondary transform.
- LFNST is used after the primary transform and before quantization.
- at the decoding end, the inverse LFNST is used after inverse quantization and before the inverse basic transform.
- LFNST performs a secondary transformation on the low-frequency coefficients in the upper left corner after the basic transformation.
- the base transform concentrates energy to the upper left corner by decorrelating the image.
- the secondary transform then decorrelates the low-frequency coefficients of the base transform.
- 16 coefficients are input to the 4x4 LFNST transformation kernel, and the output is 8 coefficients; 64 coefficients are input to the 8x8 LFNST transformation kernel, and the output is 16 coefficients.
- 8 coefficients are input to the 4x4 inverse LFNST transform kernel, and the output is 16 coefficients; 16 coefficients are input to the 8x8 inverse LFNST transform kernel, and the output is 64 coefficients.
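The coefficient counts above correspond to non-square transform kernels: an 8x16 matrix for the 4x4 LFNST and a 16x64 matrix for the 8x8 LFNST. The sketch below uses random matrices with orthonormal rows as hypothetical stand-ins (the real LFNST kernels are trained integer matrices defined by the standard), so the inverse is simply the transpose:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_semi_orthogonal(rows: int, cols: int) -> np.ndarray:
    # Hypothetical stand-in for a trained LFNST kernel: a (rows x cols)
    # matrix with orthonormal rows, built from a QR decomposition.
    q, _ = np.linalg.qr(rng.standard_normal((cols, rows)))
    return q[:, :rows].T

k4 = random_semi_orthogonal(8, 16)    # 4x4 LFNST: 16 coefficients in, 8 out
k8 = random_semi_orthogonal(16, 64)   # 8x8 LFNST: 64 coefficients in, 16 out

x16 = rng.standard_normal(16)
y8 = k4 @ x16            # forward secondary transform: 16 -> 8
x16_hat = k4.T @ y8      # inverse secondary transform: 8 -> 16
```

Because the forward kernel has fewer rows than columns, forward-then-inverse is a projection: some information is deliberately discarded, which is why LFNST is called a reduced secondary transform.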
- the LFNST includes 4 types of transformation kernels.
- the base images corresponding to these 4 types of transformation kernels are shown in FIG. 6 , and some obvious oblique textures can be seen.
- a class of transform kernels is also referred to as a set of transform kernels.
- Intra prediction uses the reconstructed pixels around the current block as a reference to predict the current block. Since current video is coded from left to right and from top to bottom, the reference pixels available to the current block are usually on the left and top.
- VVC has 67 intra-frame prediction modes: 65 angle prediction modes in addition to mode 0 (Planar) and mode 1 (DC).
- Planar usually handles some gradient textures
- DC, as the name implies, usually handles some flat areas
- blocks with obvious angle textures usually use intra-frame angle prediction.
- the wide-angle prediction mode can also be used for non-square blocks in VVC.
- the wide-angle prediction mode allows prediction angles beyond those available to square blocks.
- modes 2 to 66 are the angles corresponding to the prediction modes of square blocks, while modes -14 to -1 and 67 to 80 represent the extended angles of the wide-angle prediction mode.
- Angle prediction tiles the reference pixels onto the current block as the prediction value according to the specified angle, which means the prediction block has an obvious directional texture, and the residual of the current block after angle prediction also statistically reflects obvious angular characteristics. Therefore, the transform kernel selected by LFNST can be bound to the intra prediction mode: once the intra prediction mode is determined, LFNST can only use the set of transform kernels, or class of transform kernels, corresponding to that intra prediction mode.
- a class of transform kernels is also referred to as a group of transform kernels, and a class of transform kernels includes at least two transform kernels.
- LFNST in VVC has a total of 4 types of transform kernels, and each type of transform kernel includes at least 2 transform kernels.
- The correspondence between intra prediction modes and transform kernel classes is shown in Table 1:
- IntraPredMode represents the intra prediction mode
- Tr.set index represents the index of a type of transformation kernel
- cross-component prediction modes used for chroma intra prediction are 81 to 83, and these modes are not available for luma intra prediction.
- the transform kernel of LFNST can be transposed so that one transform kernel group handles more angles.
- modes 13 to 23 and modes 45 to 55 both correspond to transform kernel class 2, but modes 13 to 23 are clearly close to the horizontal mode, while modes 45 to 55 are clearly close to the vertical mode; the transform and inverse transform for modes 45 to 55 therefore need to be matched by transposition.
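The set selection plus transposition described above can be sketched as follows. This is a hedged sketch of the VVC-style grouping for modes 0..66 (four kernel sets, with angles above the diagonal mode 34 reusing the set of their mirrored mode 68 - mode via transposition); the authoritative mapping is the standard's table, and `vvc_lfnst_set` is a hypothetical helper:

```python
def vvc_lfnst_set(intra_mode: int):
    # Returns (kernel set index, transpose flag) for intra modes 0..66.
    # Modes above the diagonal mode 34 map to their mirrored mode and
    # signal that the block must be transposed before the transform.
    transpose = intra_mode > 34
    m = 68 - intra_mode if transpose else intra_mode
    if m <= 1:
        return 0, False       # Planar / DC
    if m <= 12:
        return 1, transpose   # angles near mode 2 / mode 66
    if m <= 23:
        return 2, transpose   # the class shared by modes 13-23 and 45-55
    return 3, transpose       # near-diagonal angles
```

For example, mode 13 (near-horizontal) and mode 55 (near-vertical) land on the same set, with mode 55 flagged for transposition.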
- VVC's LFNST has four types of transform kernels.
- the number of transform kernel classes can be extended, for example to 35 classes of transform kernels, each with 3 transform kernels.
- Planar mode corresponds to the 0th type of transformation kernel
- DC mode corresponds to the 1st type of transformation kernel
- angle 2 and angle 66 correspond to the second type of transformation kernel
- angle 3 and angle 65 correspond to the third type of transformation kernel
- ..., and angle 33 and angle 35 correspond to the 34th type of transformation kernel
- the angle 34 corresponds to the 35th type of transformation kernel.
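The extended mapping enumerated above can be expressed compactly: Planar and DC get their own classes, each pair of mirrored angles (m, 68 - m) shares one class via transposition, and the diagonal mode 34 stands alone. The class indexing below is an illustrative sketch, not the codec's exact numbering:

```python
def extended_kernel_class(intra_mode: int):
    # Returns (class index, transpose flag) for intra modes 0..66.
    if intra_mode <= 1:
        return intra_mode, False      # Planar -> class 0, DC -> class 1
    if intra_mode > 34:
        return 68 - intra_mode, True  # mirrored angle, handled by transposition
    return intra_mode, False

# Mirrored angle pairs share a class; 67 modes collapse to 35 classes.
pairs_share = (extended_kernel_class(2)[0] == extended_kernel_class(66)[0]
               and extended_kernel_class(33)[0] == extended_kernel_class(35)[0])
num_classes = len({extended_kernel_class(m)[0] for m in range(67)})
```

The transposition trick is what keeps the class count at 35 rather than 67: only one kernel class is trained per mirrored angle pair.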
- ECM is reference software for further improving the performance of VVC tools and combinations of tools. It is based on VTM-10.0 and integrates the tools and technologies adopted in the EE (exploration experiments).
- VTM: VVC's reference software test platform.
- in terms of intra-frame prediction, ECM is similar to VTM (VVC's reference software test platform): there are traditional intra-frame prediction, residual transformation and other processes.
- the difference from VVC is that, in the intra-frame prediction process, two techniques for deriving the intra prediction mode are adopted, namely Decoder-side Intra Mode Derivation (DIMD) and Template-based Intra Mode Derivation (TIMD).
- DIMD: Decoder-side Intra Mode Derivation.
- TIMD: Template-based Intra Mode Derivation.
- the DIMD and TIMD technologies can derive the intra prediction mode at the decoding end, so that coding the index of the intra prediction mode is omitted, saving codewords.
- DIMD uses the reconstructed pixel values around the current block as a template, and scans each 3x3 area on the template with the Sobel operator to calculate the gradients in the horizontal and vertical directions, obtaining the gradients Dx and Dy.
- the amplitude values of the same angle mode are accumulated to obtain the histogram of the amplitude value and angle mode as shown in FIG. 7B .
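The gradient analysis and histogram accumulation above can be sketched as follows. The mapping from the gradient angle atan2(Dy, Dx) to the 65 angle modes is coarser here than in ECM, and `dimd_histogram` is a hypothetical helper:

```python
import numpy as np

def dimd_histogram(template: np.ndarray, num_modes: int = 67) -> np.ndarray:
    # Sobel gradients on each 3x3 window of the reconstructed template;
    # each window votes its gradient amplitude into an angle-mode bin.
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    sobel_y = sobel_x.T
    hist = np.zeros(num_modes)
    h, w = template.shape
    for r in range(h - 2):
        for c in range(w - 2):
            win = template[r:r + 3, c:c + 3]
            dx = float((win * sobel_x).sum())
            dy = float((win * sobel_y).sum())
            if dx == 0 and dy == 0:
                continue                          # flat window: no vote
            angle = np.arctan2(dy, dx)            # in [-pi, pi]
            mode = 2 + int((angle + np.pi) / (2 * np.pi) * 64) % 65
            hist[mode] += abs(dx) + abs(dy)       # amplitude vote
    return hist

template = np.tile(np.arange(8.0), (8, 1))  # pure horizontal intensity ramp
hist = dimd_histogram(template)
```

For this ramp template every window votes for the same bin, so one mode dominates the histogram, which is exactly the situation in which DIMD picks a confident angle mode.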
- the final prediction result of DIMD can be obtained by weighting the two angle modes with the highest and second highest amplitude values and the predicted values of the planar mode in Fig. 7B.
- the weighting calculation process is shown in formulas (4) and (5):
- Pred = Pred_planar × w0 + Pred_mode1 × w1 + Pred_mode2 × w2 (5)
- w0, w1, and w2 are the weights assigned to the planar mode, the angle mode with the highest amplitude value, and the angle mode with the second highest amplitude value
- Pred planar is the predicted value corresponding to the planar mode
- Pred_mode1 is the prediction value corresponding to the angle mode with the highest amplitude value
- Pred_mode2 is the prediction value corresponding to the angle mode with the second highest amplitude value
- Pred is the weighted prediction value corresponding to DIMD
- amp1 is the amplitude value of the angle mode with the highest amplitude value
- amp2 is the amplitude value of the angle mode with the second highest amplitude value.
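Formula (5) can be sketched numerically. The fixed planar weight used below (21/64) and splitting the remainder in proportion to the amplitudes amp1 and amp2 are assumptions about the ECM weighting, and `dimd_blend` is a hypothetical helper:

```python
import numpy as np

def dimd_blend(pred_planar, pred_mode1, pred_mode2, amp1, amp2, w_planar=21/64):
    # Formula (5): fixed planar weight w0 (the constant is an assumption),
    # remainder split between the two angle modes by amplitude, amp1 >= amp2.
    w1 = (1 - w_planar) * amp1 / (amp1 + amp2)
    w2 = (1 - w_planar) * amp2 / (amp1 + amp2)
    return pred_planar * w_planar + pred_mode1 * w1 + pred_mode2 * w2

p0 = np.full((4, 4), 100.0)   # planar prediction
p1 = np.full((4, 4), 120.0)   # highest-amplitude angle mode prediction
p2 = np.full((4, 4), 80.0)    # second-highest-amplitude angle mode prediction
blended = dimd_blend(p0, p1, p2, amp1=3.0, amp2=1.0)
```

Since amp1 is three times amp2, the highest-amplitude mode receives three times the weight of the runner-up, pulling the blended prediction toward it.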
- TIMD is basically the same as DIMD, and also uses the surrounding reconstructed pixel values to derive the intra prediction mode.
- TIMD derives the Most Probable Mode (MPM for short) list through the intra prediction modes selected by the five blocks above, left, upper left, lower left, and upper right of the current block.
- MPM: Most Probable Mode.
- using the reconstructed values in the reference region of the template, prediction results are generated for the template area with each candidate mode.
- SATD: sum of absolute transformed differences.
- angle mode 1 is the angle mode with the smallest SATD, and its SATD cost is cost1; angle mode 2 is the angle mode with the second smallest SATD, and its SATD cost is cost2.
- Pred = Pred_mode1 × w1 + Pred_mode2 × w2 (7)
- Pred mode1 is the predicted value corresponding to the smallest angle mode of SATD
- Pred mode2 is the predicted value corresponding to the second smallest angle mode of SATD
- Pred is the weighted predicted value corresponding to TIMD
- w1 is the weighted weight corresponding to the smallest angle mode of SATD
- w2 is the weighted weight corresponding to the SATD second smallest angle mode.
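The cost-based weighting behind formula (7) can be sketched as below: the mode with the smaller SATD cost gets the larger weight. The exact ECM formula is an assumption here, and `timd_weights` is a hypothetical helper:

```python
def timd_weights(cost1: float, cost2: float):
    # cost1 is the smallest SATD, cost2 the second smallest; the weight of
    # each mode is proportional to the OTHER mode's cost, so the cheaper
    # (better-fitting) mode dominates the blend.
    total = cost1 + cost2
    return cost2 / total, cost1 / total

w1, w2 = timd_weights(cost1=10.0, cost2=30.0)
blended = 120.0 * w1 + 80.0 * w2   # formula (7) on two scalar predictions
```

With cost1 three times smaller than cost2, angle mode 1 receives weight 0.75 and the blended value sits much closer to its prediction.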
- the transformation kernel of the secondary transformation corresponding to the angle mode with the highest amplitude value is defaulted as the target transformation kernel of the current block.
- the transformation kernel of the secondary transformation corresponding to the SATD smallest angle mode is defaulted as the target transformation kernel of the current block.
- the transform kernel of the secondary transform is fixedly chosen to correspond to a certain intra prediction mode, which makes the selection of the secondary transform kernel inflexible and the transform effect poor, leading to unsatisfactory overall encoding and decoding performance.
- the present application selects the target transform kernel corresponding to the current block from among the preset M classes of secondary transform kernels, instead of fixedly selecting the transform kernel corresponding to a certain mode, thereby improving the flexibility of transform kernel selection and the transform effect.
- video encoding and decoding methods provided in the embodiments of the present application can be applied to any video encoding and decoding scenarios that allow weighted prediction mode and secondary transformation mode in addition to the above-mentioned TIMD and DIMD technologies.
- the video decoding method provided in the embodiment of the present application is introduced by taking the decoding end as an example.
- FIG. 9 is a schematic flowchart of a video decoding method provided by an embodiment of the present application
- FIG. 10 is a schematic diagram of a video decoding process involved in an embodiment of the present application.
- the embodiment of the present application is applied to the video decoder shown in FIG. 1 and FIG. 2 .
- the method of the embodiment of the present application includes:
- the current block may also be referred to as a current decoding block, a current decoding unit, a decoding block, a block to be decoded, a current block to be decoded, and the like.
- when the current block includes a chroma component but does not include a luma component, the current block may be called a chroma block.
- when the current block includes a luma component but does not include a chroma component, the current block may be called a luma block.
- the second transform coefficients are the transform coefficients formed by the encoding end through the secondary transform of the residual block of the current block. Specifically, the encoding end performs the basic transform on the residual block of the current block to obtain the basic transform coefficients, and then performs the secondary transform on the basic transform coefficients to obtain the second transform coefficients of the current block.
- the basic transform is also referred to as the first transform, the initial transform, or the primary transform, etc.
- the secondary transformation is also referred to as a second transformation or the like.
- the basic transform coefficients are also referred to as initial transform coefficients or primary transform coefficients or first transform coefficients or coefficients after the first transform, and the like.
- the second transform coefficients are also referred to as the coefficients after the second transform, and the like.
- the ways in which the decoder decodes the code stream in S401 to obtain the second transform coefficient of the current block include but are not limited to the following:
- Mode 1: the encoder does not quantize the second transform coefficients during encoding, but directly encodes the second transform coefficients to obtain the code stream. In this case, the decoding end decodes the code stream and can directly obtain the second transform coefficients of the current block from it.
- Mode 2: during encoding, the encoding end quantizes the second transform coefficients to obtain quantized coefficients, and then encodes the quantized coefficients to obtain the code stream. In this case, the decoding end decodes the code stream to obtain the quantized coefficients of the current block, and dequantizes them to obtain the second transform coefficients of the current block.
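The two parsing paths above can be sketched as one decoder-side helper. The `stream` dict is a hypothetical stand-in for a parsed bitstream (not a real codec API), and the step-size formula is the same illustrative QP rule used earlier:

```python
def decode_second_transform_coeffs(stream: dict, qp: int):
    # Mode 1: coefficients were coded directly -> return them as-is.
    # Mode 2: quantized levels were coded -> dequantize with the QP step.
    qstep = 2.0 ** ((qp - 4) / 6.0)   # illustrative step size
    if "coeffs" in stream:
        return stream["coeffs"]
    return [level * qstep for level in stream["levels"]]

direct = decode_second_transform_coeffs({"coeffs": [5.0, -2.0]}, qp=22)
dequantized = decode_second_transform_coeffs({"levels": [2, -1]}, qp=22)
```

Either way, the output of this step is the second transform coefficients that the inverse secondary transform then consumes.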
- the type of the intra prediction mode of the current block in the embodiment of the present application includes a single prediction mode (that is, a non-weighted prediction mode, such as an angle prediction mode or a non-angle prediction mode), and may also include a weighted prediction mode.
- the weighted prediction mode refers to using at least two intra prediction modes to predict the current block respectively, and weighting the prediction values corresponding to the at least two prediction modes, and using the weighted result as the prediction value of the current block.
- the weighted prediction mode includes angle prediction mode 1 and angle prediction mode 2, using angle prediction mode 1 to predict the current block to obtain a prediction value 1, using angle prediction mode 2 to predict the current block to obtain a prediction value 2, for The prediction value 1 and the prediction value 2 are weighted, and the weighted result is determined as the prediction value of the current block.
- the code stream includes a flag a indicating whether the current block is allowed to use the weighted prediction mode.
- the decoder obtains the flag a by decoding the code stream, and judges whether the current block is allowed to use the weighted prediction mode according to the value of the flag a.
- for example, when flag a takes a first value (for example, 1), it is determined that the current block is allowed to adopt the weighted prediction mode; when flag a takes a second value (for example, 0), it is determined that the current block is not allowed to use the weighted prediction mode.
- the flag a may be a flag at the sequence level, frame level, macroblock level or coded block level.
- in the second way, flag a is not carried in the code stream; instead, whether the weighted prediction mode is allowed for the current block is determined by default. For example, if by default the current sequence, current frame, current macroblock or current block does not allow the weighted prediction mode, it is determined that the weighted prediction mode is not allowed for the current block; if by default the weighted prediction mode is allowed for the current sequence, current frame, current macroblock or current block, it is determined that the weighted prediction mode is allowed for the current block.
- the implementation of selecting the target transform kernel corresponding to the current block from the preset M classes of secondary transform kernels in the above S402 includes but is not limited to the following:
- in a first way, the code stream includes an indication flag, and the target transform kernel corresponding to the current block is selected according to this flag. Specifically, in the above S402, selecting the target transform kernel corresponding to the current block from the preset M classes of secondary transform kernels includes the following steps S402-A1 and S402-A2:
- after determining the target transform kernel corresponding to the current block and using the target transform kernel to perform the secondary transform on the residual block of the current block, the encoding end writes a first flag into the code stream; the first flag is used to indicate that the target transform kernel is the transform kernel corresponding to the target prediction mode.
- the decoding end receives the code stream, obtains the first flag by decoding it, and, according to the first flag, determines the transform kernel corresponding to the target prediction mode among the M classes of secondary transform kernels as the target transform kernel.
- the first flag indicates that the target transformation kernel of the secondary transformation corresponding to the current block is the transformation kernel corresponding to the prediction mode 1, and the decoding end queries the transformation kernel 1 corresponding to the prediction mode 1 from the transformation kernels of the M type of secondary transformation.
- the transformation kernel 1 is used as a target transformation kernel corresponding to the current block, and the second transformation coefficient of the current block is inversely transformed by using the target transformation kernel to obtain the basic transformation coefficient of the current block.
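The lookup-and-inverse-transform step above can be sketched as follows. The `kernels` dict mapping prediction modes to kernel matrices is hypothetical, and the kernel is assumed to have orthonormal rows so the inverse is its transpose (real LFNST kernels are trained integer matrices):

```python
import numpy as np

def inverse_secondary_transform(second_coeffs, kernels, target_mode):
    # Look up the kernel bound to the target prediction mode among the
    # M kernel classes and apply its transpose to recover the basic
    # transform coefficients.
    k = kernels[target_mode]
    return k.T @ second_coeffs

rng = np.random.default_rng(3)
q, _ = np.linalg.qr(rng.standard_normal((16, 8)))
kernels = {1: q[:, :8].T}                 # one 8x16 kernel for "mode 1"
basic = inverse_secondary_transform(rng.standard_normal(8), kernels, target_mode=1)
```

The recovered 16 basic transform coefficients then go through the inverse basic transform to yield the residual block.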
- the above-mentioned target prediction mode may belong to the intra prediction mode of the current block; for example, the intra prediction mode of the current block is a weighted prediction mode, and the weighted prediction mode includes N intra prediction modes, for example two angle modes and the Planar mode, and the target prediction mode is one of the N intra prediction modes.
- alternatively, the above target prediction mode does not belong to the intra prediction mode of the current block; for example, the intra prediction mode of the current block is a non-weighted prediction mode such as the Planar mode, and the target prediction mode is a non-Planar mode such as the DC mode or an angle mode.
- the above S402-A2 includes the following steps from S402-A21 to S402-A23:
- the intra prediction mode of the current block includes a weighted prediction mode and a non-weighted prediction mode.
- the intra prediction mode of the current block is a weighted prediction mode
- the current block only includes one intra prediction mode, for example, from the Planar mode, DC mode and 65 angle modes, select the mode with the smallest rate-distortion cost as the mode The intra prediction mode of the current block.
- S402-A22 Determine a target prediction mode according to the intra prediction mode of the current block and the first flag.
- the manner of determining the target prediction mode based on the first flag is different.
- S402-A221. Determine the prediction mode indicated by the first flag among the N types of intra prediction modes as the target prediction mode.
- the weighted prediction mode of the current block includes two angle modes with the highest and second highest amplitude values and a planar mode, assuming that the first flag indicates that the target transformation kernel is the transformation kernel corresponding to the planar mode, then the planar mode is determined as the target prediction mode, And the transformation kernel corresponding to the planar mode among the transformation kernels of the M-type secondary transformation is determined as the target transformation kernel corresponding to the current block.
- the above-mentioned target prediction mode is one prediction mode among N intra-frame prediction modes.
- the target prediction mode is one of the P types of intra-frame prediction modes
- the P types of intra-frame prediction modes are the modes, among the N types of intra-frame prediction modes, that are related to the selection of the transform kernel for the secondary transform of the current block
- P is a positive integer greater than 1 and less than or equal to N.
- the N types of intra prediction modes include mode 1, mode 2, and mode 3, but the intra prediction modes related to the selection of the transformation kernel for the secondary transformation of the current block are mode 1 and mode 2, so the target prediction mode can be determined as one of mode 1 and mode 2.
- the P types of intra-frame prediction modes are the angle prediction modes in the N types of intra-frame prediction modes.
- the weighted prediction modes corresponding to DIMD include two angle modes with the highest and second highest amplitude values and planar mode
- the P types of intra prediction modes include two angle modes with the highest and second highest amplitude values.
- the P types of intra-frame prediction modes are the N types of intra-frame prediction modes.
- the weighted prediction modes corresponding to DIMD include two angle modes with the highest and second highest amplitude values and planar mode
- the P types of intra prediction modes include two angle modes with the highest and second highest amplitude values and planar mode.
- the weighting mode is not selected for the current block; for example, in ECM, if the amplitude value of each angle in the amplitude-value histogram derived from the template is 0, DIMD does not perform weighting, and the prediction result of Planar is used directly.
- the current block is predicted using one prediction mode, such as the Planar prediction mode.
- for the Planar prediction mode, in order to increase the selection flexibility of the target transformation kernel, different values are assigned to the first flag to indicate that the decoder may select a transformation kernel other than the transformation kernel corresponding to the Planar prediction.
- implementation methods of the above S402-A222 include but are not limited to the following methods:
- Way 1: if the value of the first flag is the first value, determine that the target prediction mode is the intra prediction mode of the current block.
- for example, if the intra prediction mode of the current block is the Planar mode, and the value of the first flag is the first value, it is determined that the target prediction mode is the Planar mode, and the transformation kernel corresponding to the Planar mode is then used as the target transformation kernel corresponding to the current block.
- Way 2: if the value of the first flag is the second value, determine that the target prediction mode is the second intra-frame prediction mode, where the second intra-frame prediction mode is different from the intra-frame prediction mode of the current block.
- the specific type of the second intra-frame prediction mode is not limited, as long as it is different from the intra-frame prediction mode of the current block.
- the second intra-frame prediction mode is the DC mode
- the selection probability of the DC mode during prediction is second only to that of the Planar mode.
- the transform kernel corresponding to the second intra prediction mode is used as the target transform kernel corresponding to the current block.
- the second intra-frame prediction mode is a Planar mode.
- the second intra-frame prediction mode is Planar mode.
- the transform kernel corresponding to the second intra prediction mode is used as the target transform kernel corresponding to the current block.
- before executing S402-A1 to decode the code stream to obtain the first flag, the decoder first judges whether the selection of the target transformation kernel adopts the method provided in the embodiments of the present application. Specifically, the decoding end determines that the intra-frame prediction mode of the current block is a weighted prediction mode, where the weighted prediction mode includes N types of intra-frame prediction modes, and then performs the prediction process, that is, determines the predictive value corresponding to each of the N types of intra-frame prediction modes, and determines the weighting weights for weighting the predictive values corresponding to the N intra prediction modes. If the ratio between the minimum weighting weight and the maximum weighting weight corresponding to the N intra prediction modes is greater than a preset value, the above S402-A1 is executed to decode the code stream to obtain the first flag.
- mode1 and mode2 are the two modes with the lowest cost, and their cost values are cost1 and cost2 respectively.
- cost1/cost2 > α, where α is a preset value and is a positive number close to 1, such as 0.8.
- if cost1/cost2 > α, the decoding end selects the target transformation kernel indicated by the first flag, and executes the above S402-A1 to decode the code stream to obtain the first flag. This is because when cost1 is much smaller than cost2, the two weighting weights differ greatly, so the prediction result is almost the same as the prediction result of mode1, and there is no need to further consider the transformation kernel corresponding to mode2.
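The cost-ratio gate described above can be sketched as follows. This is an illustrative sketch, not the normative ECM logic; the function name is hypothetical, and `alpha` stands for the preset value α such as 0.8:

```python
def should_parse_first_flag(cost1: float, cost2: float, alpha: float = 0.8) -> bool:
    """cost1 and cost2 are the costs of the two cheapest candidate modes
    (cost1 <= cost2). When the ratio is close to 1, the weighting weights are
    close and the kernel choice is worth signalling with the first flag."""
    return cost1 / cost2 > alpha
```

When `cost1` is much smaller than `cost2`, the function returns False and the flag is neither written nor parsed, matching the rationale in the text.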
- the method of selecting the target transformation kernel corresponding to the current block from the preset transformation kernels of M types of secondary transformations may, in addition to determining the target transformation kernel according to the first flag as described in the above method 1, also be realized in the following second and third manners.
- Mode 2: if the intra prediction mode of the current block is a weighted prediction mode, and the weighted prediction mode includes N types of intra prediction modes, determine, from the preset transformation kernels of M types of secondary transformations, the transformation kernels of the secondary transformation corresponding to the N types of intra prediction modes; determine the rate-distortion cost when decoding with the transformation kernel of the secondary transformation corresponding to each of the N intra prediction modes; and determine the transformation kernel of the secondary transformation with the smallest rate-distortion cost as the target transformation kernel corresponding to the current block.
- Mode 3: if the intra prediction mode of the current block is a weighted prediction mode, and the weighted prediction mode includes N intra prediction modes, determine the rate-distortion costs corresponding to the N intra prediction modes; search, among the transformation kernels of the M types of secondary transformations, for the first transformation kernel corresponding to the intra prediction mode with the smallest rate-distortion cost; and determine the first transformation kernel as the target transformation kernel corresponding to the current block.
- the manner of selecting the target transformation kernel corresponding to the current block from the preset transformation kernels of M types of secondary transformations includes but is not limited to the above three manners.
- the decoding end of the embodiments of the present application selects the target transformation kernel corresponding to the current block from the preset transformation kernels of M types of secondary transformations, instead of fixedly using the transformation kernel corresponding to a certain mode, thereby improving the selection flexibility of the transformation kernel, improving the transformation effect, and improving the overall performance of the codec.
- the second transform coefficient of the current block is used to perform inverse secondary transformation using the target transform kernel to obtain the basic transform coefficient of the current block.
- the embodiment of the present application does not limit the manner of performing inverse secondary transformation on the second transformation coefficient according to the transformation kernel of the secondary transformation.
- the inverse secondary transformation may be expressed as a matrix product, where T is the transformation kernel of the secondary transformation (the transformation kernel is a transformation matrix), and the product of T and the second transformation coefficient is the coefficient after the inverse secondary transformation. In this way, the inverse secondary transformation can be performed on the second transformation coefficient to obtain the coefficient after the inverse secondary transformation.
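The matrix form described above can be sketched as a plain matrix-vector product. This is a toy illustration under the assumption of an n×n kernel T applied to a coefficient vector; real secondary-transform (e.g. LFNST-style) kernels are fixed integer matrices of specific sizes, which this sketch does not reproduce:

```python
def inverse_secondary_transform(T, coeffs):
    """Multiply the kernel matrix T by the vector of second transform
    coefficients to recover the coefficients after the inverse secondary
    transformation (i.e. the basic transform coefficients)."""
    n = len(T)
    return [sum(T[i][j] * coeffs[j] for j in range(n)) for i in range(n)]
```

With the identity matrix as kernel, the coefficients pass through unchanged, which is a quick sanity check of the indexing.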
- for some angular modes, the corresponding transformation and inverse transformation do not require transposition, while for others they do.
- modes 13 to 23 and modes 45 to 55 all correspond to transformation kernel class 2, but modes 13 to 23 are clearly close to the horizontal mode, while modes 45 to 55 are clearly close to the vertical mode.
- the transformation and inverse transformation corresponding to modes 45 to 55 therefore need to be matched by transposition.
- the decoder needs to further determine, according to the target prediction mode, whether to transpose the coefficients after the inverse secondary transformation.
- the way for the decoder to judge whether to transpose the coefficient after the inverse secondary transformation includes but not limited to the following:
- Mode 1: the decoder judges, according to Table 1, whether the coefficients after the inverse secondary transformation corresponding to the target prediction mode need to be transposed. For example, if the target prediction mode is one of modes 45 to 55, it is determined that the coefficients after the inverse secondary transformation need to be transposed.
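For the specific example given (modes 45 to 55 transpose, other modes in the same kernel class do not), the Table 1 lookup reduces to a range check. A hypothetical sketch, not the actual table:

```python
def needs_transpose(target_mode: int) -> bool:
    # Per the example in the text: modes 45..55 share kernel class 2 with
    # modes 13..23 but are near-vertical, so their coefficients are transposed.
    return 45 <= target_mode <= 55
```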
- if the intra prediction mode of the current block is the weighted prediction mode, it may be judged according to the first flag whether to transpose the coefficients after the inverse secondary transformation.
- the decoder first determines the transformation kernel categories of the secondary transformation corresponding to the P types of intra prediction modes, and judges the indication information of the first flag according to the transformation kernel categories of the secondary transformation corresponding to the P intra prediction modes.
- the first flag is used to indicate that the target transformation kernel is the transformation kernel corresponding to the target prediction mode.
- the P types of intra prediction modes include mode 1 and mode 2, and the target prediction mode is mode 1, then the first flag indicates that the target transform kernel is the transform kernel corresponding to mode 1.
- the specific value of the first flag may be the mode index of the target prediction mode.
- the first flag is used, in addition to indicating that the target transformation kernel is the transformation kernel corresponding to the target prediction mode, to indicate whether the transform coefficients after the inverse secondary transformation are transposed.
- the P types of intra prediction modes include angular mode 2 and angular mode 66, which both correspond to the third class of transformation kernel, but differ in whether the coefficients need to be transposed before the transformation and after the inverse transformation; in this case, the role of the first flag is not to switch the class of the transformation kernel, but to indicate whether to transpose.
- S403 includes the following S403-A:
- Example 1: if it is determined that the transformation kernel classes corresponding to the P types of intra prediction modes are not the same, use the target transformation kernel indicated by the first flag to perform inverse secondary transformation on the second transform coefficient, and determine the transformed coefficients as the basic transform coefficients of the current block.
- Example 2: if it is determined that the transformation kernel classes corresponding to the P intra prediction modes are the same, and the first flag indicates to transpose the transform coefficients after the inverse secondary transformation, use the target transformation kernel to perform inverse secondary transformation on the second transform coefficient, transpose the transform coefficients after the inverse secondary transformation, and determine the transposed transform coefficients as the basic transform coefficients of the current block.
- Example 3: if it is determined that the transformation kernel classes corresponding to the P types of intra prediction modes are the same, and the first flag indicates not to transpose the transform coefficients after the inverse secondary transformation, use the target transformation kernel to perform inverse secondary transformation on the second transform coefficient, and determine the transform coefficients after the inverse secondary transformation as the basic transform coefficients of the current block.
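Examples 1 to 3 amount to a single decision rule: always inverse-transform, and transpose only when all P modes share a kernel class and the first flag says so. A sketch under the assumption that the inverse transform is supplied by the caller and blocks are lists of rows (function and parameter names are hypothetical):

```python
def decode_second_coeffs(kernel_classes, flag_transpose, inv_transform, coeffs):
    # Example 1: kernel classes differ -> just inverse-transform with the
    # kernel indicated by the first flag (inv_transform closes over it).
    block = inv_transform(coeffs)
    # Examples 2 and 3: classes identical -> the first flag decides transpose.
    if len(set(kernel_classes)) == 1 and flag_transpose:
        block = [list(row) for row in zip(*block)]
    return block
```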
- the target transform kernel is used to perform inverse secondary transform on the second transform coefficient, and after obtaining the basic transform coefficient of the current block, the following step S404 is performed.
- the target transformation kernel is used to perform inverse secondary transformation on the second transformation coefficient to obtain the basic transformation coefficient of the current block, and then perform inverse basic transformation on the basic transformation coefficient to obtain the residual block of the current block.
- the basic transformation coefficients are subjected to inverse basic transformation to obtain the residual block of the current block.
- the decoding end performs inverse basic transformation on the above basic transformation coefficients in the DCT-II transformation mode shown in the above formula (1) to obtain the residual block of the current block.
- the decoding end performs inverse basic transformation on the above basic transformation coefficients in the DCT-VIII transformation mode shown in the above formula (2) to obtain the residual block of the current block.
- the decoding end performs inverse basic transformation on the above basic transformation coefficients in the DST-VII transformation mode shown in the above formula (3) to obtain the residual block of the current block.
- intra prediction is performed on the current block according to the intra prediction mode of the current block to obtain a prediction block of the current block. For example, if the intra prediction mode of the current block is a weighted prediction mode, weighted prediction is performed using the N intra prediction modes included in the weighted prediction mode, that is, the prediction values corresponding to the N intra prediction modes are weighted to obtain the prediction block.
- the prediction block and the residual block are added to obtain the reconstructed block of the current block.
- the second transformation coefficient of the current block is obtained by decoding the code stream, where the second transformation coefficient is the transformation coefficient formed by the encoding end performing secondary transformation on the residual block of the current block; if it is determined that the current block is allowed to use the weighted prediction mode, the target transformation kernel corresponding to the current block is selected from the preset transformation kernels of M types of secondary transformations, where M is a positive integer greater than 1; the target transformation kernel is used to perform inverse secondary transformation on the second transformation coefficient to obtain the basic transformation coefficient of the current block; and inverse basic transformation is performed on the basic transformation coefficient to obtain the residual block of the current block.
- the decoding end of the present application selects the target transformation kernel corresponding to the current block from the preset transformation kernels of M types of secondary transformations, instead of fixedly using the transformation kernel corresponding to a certain mode, thereby improving the selection flexibility of the transformation kernel, improving the transformation effect, and improving the overall decoding performance.
- FIG. 11 is another schematic flow chart of the video decoding method provided by the embodiment of the present application
- FIG. 12 is a schematic diagram of the decoding process involved in the embodiment of the present application
- FIG. 11 shows a detailed decoding process involved in the embodiments of the present application. As shown in FIG. 11 and FIG. 12, the process includes:
- the decoder decodes the code stream to obtain the quantization coefficient of the current block.
- a quantization mode is determined, and the quantization coefficient is dequantized using the determined quantization mode to obtain a second transform coefficient of the current block.
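A minimal sketch of the dequantization step above, assuming a uniform quantizer with a single step size; actual codecs use scaling lists, shifts and rounding offsets that this sketch omits:

```python
def dequantize(levels, qstep):
    """Uniform-reconstruction sketch: scale each quantized level back by the
    quantization step size to recover the second transform coefficients."""
    return [lv * qstep for lv in levels]
```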
- the method for determining the quantization method at the decoder can be:
- the decoding end obtains the indication information of the quantization mode by decoding the code stream, and determines the quantization mode of the current block according to the indication information.
- Method 2 The decoder adopts the default quantization method.
- the decoding end determines the quantization mode of the current block in the same manner as the encoding end.
- if the intra prediction mode of the current block is allowed to use a weighted prediction mode, derive the intra prediction mode of the current block, and use the intra prediction mode of the current block to perform intra prediction on the current block to obtain a prediction block of the current block.
- the intra prediction mode of the current block includes a weighted prediction mode and a non-weighted prediction mode.
- the intra prediction mode of the current block is a weighted prediction mode
- derive the N types of intra prediction modes included in the weighted prediction mode, where N is a positive integer greater than 1.
- the current block is predicted by using each of the above N types of intra-frame prediction modes, and a prediction value corresponding to each type of intra-frame prediction mode is obtained.
- the prediction value corresponding to each intra prediction mode is weighted to obtain the prediction block of the current block.
- for DIMD, the above formulas (4) and (5) are used to weight the prediction values to obtain the prediction block of the current block.
- the above formulas (6) and (7) are used to weight the prediction value to obtain the prediction block of the current block.
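The weighting of per-mode prediction values can be illustrated generically as below. The actual weights come from formulas (4)-(7), which this generic blend does not reproduce; the function name is hypothetical and blocks are lists of rows:

```python
def weighted_prediction(preds, weights):
    """Blend N per-mode prediction blocks pixel-by-pixel with their weighting
    weights (assumed to sum to 1) to form the prediction block."""
    h, w = len(preds[0]), len(preds[0][0])
    return [[sum(wt * p[i][j] for p, wt in zip(preds, weights)) for j in range(w)]
            for i in range(h)]
```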
- the implementation process of the above S504 is basically the same as the implementation process of the above S402, refer to the description of the above S402, and will not be repeated here.
- the prediction block of the current block is added to the residual block to obtain the reconstructed block of the current block.
- the quantization coefficient of the current block is obtained by decoding the code stream; the quantization coefficient is dequantized to obtain the second transformation coefficient of the current block; if it is determined that the intra prediction mode of the current block is allowed to use the weighted prediction mode, the intra prediction mode of the current block is derived, and the intra prediction mode of the current block is used to perform intra prediction on the current block to obtain the prediction block of the current block; if it is determined that the weighted prediction mode is allowed for the current block, the target transformation kernel corresponding to the current block is selected from the preset transformation kernels of M types of secondary transformations; the target transformation kernel is used to perform inverse secondary transformation on the second transformation coefficient to obtain the basic transformation coefficient of the current block; inverse basic transformation is performed on the basic transformation coefficient to obtain the residual block of the current block; and the reconstructed block of the current block is obtained according to the prediction block and the residual block of the current block.
- the decoding end of the present application selects the target transformation kernel corresponding to the current block from the preset transformation kernels of M types of secondary transformations, instead of fixedly using the transformation kernel corresponding to a certain mode, thereby improving the selection flexibility of the transformation kernel, improving the transformation effect, and improving the overall decoding performance.
- FIG. 13 is a schematic flowchart of a video encoding method provided by an embodiment of the present application
- FIG. 14 is a schematic diagram of a video encoding process involved in an embodiment of the present application.
- the embodiment of the present application is applied to the video encoder shown in FIG. 1 and FIG. 2 .
- the method of the embodiment of the present application includes:
- the video encoder receives a video stream composed of a series of image frames, performs video encoding for each frame of image in the video stream, and divides the image frame into blocks to obtain the current block.
- the current block is also referred to as a current coding block, a current image block, a coding block, a current coding unit, a current block to be coded, a current image block to be coded, and the like.
- the block divided by the traditional method includes not only the chrominance component of the current block position, but also the luminance component of the current block position.
- the separation tree technology can divide separate component blocks, such as a separate luma block and a separate chroma block, where the luma block can be understood as containing only the luma component of the current block position, and the chroma block can be understood as containing only the chroma component of the current block position. In this way, the luma component and the chroma component at the same position can belong to different blocks, and the division can have greater flexibility. If the separation tree is used in CU partitioning, some CUs contain both luma and chroma components, some CUs contain only luma components, and some CUs contain only chroma components.
- the current block in the embodiment of the present application only includes chroma components, which may be understood as a chroma block.
- the current block in this embodiment of the present application only includes a luma component, which may be understood as a luma block.
- the current block includes both luma and chroma components.
- the prediction mode of the current block in this embodiment of the present application includes a single prediction mode (that is, a non-weighted prediction mode), and may also include a weighted prediction mode.
- the code stream includes a flag a indicating whether the current block is allowed to use the weighted prediction mode.
- the decoder obtains the flag a by decoding the code stream, and judges whether the current block is allowed to use the weighted prediction mode according to the value of the flag a.
- for example, when the value of the flag a is a first value (for example, 1), it is determined that the current block is allowed to use the weighted prediction mode; when the value of the flag a is a second value (for example, 0), it is determined that the current block is not allowed to use the weighted prediction mode.
- the flag a may be a flag at the sequence level, frame level, macroblock level or coded block level.
- the second way is not to carry the flag a in the code stream, but to determine by default whether the weighted prediction mode is allowed for the current block. For example, if by default the current sequence, the current frame, the current macroblock or the current block is not allowed to use the weighted prediction mode, it is determined that the weighted prediction mode is not allowed for the current block. If by default the current sequence, the current frame, the current macroblock or the current block is allowed to use the weighted prediction mode, it is determined that the weighted prediction mode is allowed for the current block.
- a default transform kernel is determined as the target transform kernel corresponding to the current block.
- the above S601 includes the following steps:
- S601-A1 include but are not limited to the following:
- an encoding cost for example, a rate-distortion cost
- the prediction mode with the smallest encoding cost is determined as the intra-frame prediction mode of the current block.
- Method 2: first, roughly select, from the preset K kinds of intra-frame prediction modes, Q kinds of intra-frame prediction modes with the lowest cost, where K and Q are both positive integers; then, from the Q kinds of intra-frame prediction modes and the preset weighted prediction mode, finely screen out the intra-frame prediction mode with the least cost as the intra-frame prediction mode of the current block.
- for each of the K intra-frame prediction modes, determine the predicted value of the current block when the intra-frame prediction mode is used to encode the current block; calculate the distortion D1 between the predicted value corresponding to the intra-frame prediction mode and the original value of the current block, and at the same time count the number of bits R1 consumed when encoding the flag of the intra-frame prediction mode.
- the first rate-distortion cost J1 corresponding to each intra prediction mode among the K intra prediction modes can be determined.
- Q intra-frame prediction modes with the lowest cost are roughly selected.
- the intra-frame prediction mode with the lowest cost is finely selected as the intra-frame prediction mode of the current block.
- determine the reconstruction value of the current block when the current block is encoded using the intra prediction mode; calculate the distortion D2 between the reconstruction value and the original value of the current block, and count the number of bits R2 consumed when the intra prediction mode is used to encode the current block.
- the second rate-distortion cost J2 corresponding to each intra prediction mode among the Q intra prediction modes and the preset weighted prediction mode can be determined.
- the intra-frame prediction mode with the smallest second rate-distortion cost J2 is determined as the intra-frame prediction mode of the current block.
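The two-pass selection above (rough cost J1 based on D1 and R1, fine cost J2 based on D2 and R2) can be sketched generically, with the cost functions supplied by the caller. Function and parameter names are hypothetical:

```python
def select_intra_mode(candidates, rough_cost, fine_cost, q):
    """Two-pass mode decision: keep the q cheapest modes under the rough cost
    J1 (prediction distortion D1 plus signalling bits R1), then return the
    mode minimising the fine cost J2 (reconstruction distortion D2 plus full
    coding bits R2)."""
    shortlist = sorted(candidates, key=rough_cost)[:q]
    return min(shortlist, key=fine_cost)
```

In practice the preset weighted prediction mode is appended to the shortlist before the fine pass, as the text describes.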
- the intra prediction mode of the current block includes N types of intra prediction modes, where N is a positive integer greater than 1. If it is determined that the intra prediction mode of the current block is a non-weighted prediction mode, the intra prediction mode of the current block is a single intra prediction mode.
- the implementation methods of the above S601-A2 include but are not limited to the following:
- Method 1: if the intra prediction mode of the current block is a weighted prediction mode, and the weighted prediction mode includes N types of intra prediction modes, determine the transformation kernels of the secondary transformation corresponding to the N types of intra prediction modes; determine the rate-distortion cost when decoding with the transformation kernel of the secondary transformation corresponding to each of the N intra prediction modes; and determine the transformation kernel of the secondary transformation with the smallest rate-distortion cost as the target transformation kernel corresponding to the current block.
- Method 2: if the intra prediction mode of the current block is a weighted prediction mode, and the weighted prediction mode includes N types of intra prediction modes, determine the rate-distortion costs corresponding to the N types of intra prediction modes; query, among the transformation kernels of the M types of secondary transformations, the first transformation kernel corresponding to the intra prediction mode with the smallest rate-distortion cost; and determine the first transformation kernel as the target transformation kernel corresponding to the current block.
- Method 3: if the intra prediction mode of the current block is a non-weighted prediction mode, determine the transformation kernel, among the transformation kernels of the M types of secondary transformations, corresponding to the intra prediction mode of the current block as the target transformation kernel corresponding to the current block.
- Method 4: if the intra prediction mode of the current block is a non-weighted prediction mode, determine the transformation kernel corresponding to the second intra prediction mode among the transformation kernels of the M types of secondary transformations as the target transformation kernel corresponding to the current block.
- the second intra-frame prediction mode is different from the intra-frame prediction mode of the current block.
- the second intra-frame prediction mode is a DC mode.
- the second intra-frame prediction mode is a Planar mode.
- S602. Encode the current block to obtain a basic transform coefficient of the current block.
- the intra prediction mode of the current block determined in the above steps is used to predict the current block to obtain the prediction block of the current block; then, the residual block of the current block is obtained according to the prediction block and the current block. For example, the pixel value of the prediction block is subtracted from the pixel value of the current block to obtain the residual block of the current block.
- the basic transformation coefficient of the current block is obtained.
- the encoder uses the DCT-VIII transformation shown in the above formula (2) to perform basic transformation on the residual block of the current block, the basic transformation coefficient of the current block is obtained.
- the encoding end uses the DST-VII transformation method shown in the above formula (3) to perform basic transformation on the residual block of the current block, the basic transformation coefficient of the current block is obtained.
- the target transformation kernel is used to perform secondary transformation on the basic transformation coefficient to obtain the second transformation coefficient of the current block, that is, the product of the target transformation kernel and the basic transformation coefficient is used as the second transformation coefficient of the current block.
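The forward secondary transformation as the product of the kernel and the basic coefficients can be sketched with an orthogonal toy kernel, so that applying the transposed kernel round-trips back to the input (an assumption for illustration; real kernels are integer approximations):

```python
def transpose(T):
    # Matrix transpose, used here to form the matching inverse kernel.
    return [list(row) for row in zip(*T)]

def secondary_transform(T, basic):
    """Second transform coefficients = product of the target kernel T and the
    vector of basic transform coefficients."""
    n = len(T)
    return [sum(T[i][j] * basic[j] for j in range(n)) for i in range(n)]
```

For an orthogonal T, `secondary_transform(transpose(T), secondary_transform(T, x))` recovers `x`, mirroring the forward/inverse pairing of S402/S403.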
- the second transform coefficient of the current block is directly encoded without being quantized to obtain a code stream.
- the second transform coefficient of the current block is quantized to obtain a quantized coefficient, and the quantized coefficient is encoded to obtain a code stream.
- the encoding end may indicate to the decoding end which type of transformation kernel is used by the encoding end.
- the indication information can be carried in the code stream.
- the encoding end may also write a first flag in the code stream, where the first flag is used to indicate that the target transformation kernel is a transformation kernel corresponding to the target prediction mode.
- the encoder before writing the first flag in the code stream, the encoder first determines the weighting weights for weighting the prediction values corresponding to the N intra prediction modes. If the ratio between the minimum weighted weight and the maximum weighted weight corresponding to the N intra-frame prediction modes is greater than a preset value, write the first flag in the code stream.
- mode1 and mode2 are the two modes with the lowest cost, and their cost values are cost1 and cost2 respectively.
- cost1/cost2 > α, where α is a preset value and is a positive number close to 1, such as 0.8.
- if cost1/cost2 > α, the encoding end writes the first flag in the code stream, so that the decoding end selects the target transformation kernel indicated by the first flag. This is because when cost1 is much smaller than cost2, the two weighting weights differ greatly, so the prediction result is almost the same as the prediction result of mode1, and there is no need to further consider the transformation kernel corresponding to mode2.
- the intra prediction mode of the current block is a weighted prediction mode
- the weighted prediction mode includes N intra prediction modes
- the target prediction mode is one of the P intra prediction modes
- the P intra-frame prediction modes are the intra-frame prediction modes among the N intra-frame prediction modes that are related to the selection of the transformation kernel for the secondary transformation of the current block
- P is a positive integer greater than 1 and less than or equal to N.
- the P intra-frame prediction modes are the angle prediction modes among the N intra-frame prediction modes;
- the P types of intra-frame prediction modes are the N types of intra-frame prediction modes.
- the first flag is also used to indicate whether to transpose the transformation coefficient after the inverse secondary transformation.
- the method further includes: determining the value of the first flag according to the target prediction mode.
- the value of the first flag is the first value.
- the value of the first flag is the second value.
- the second intra-frame prediction mode is DC mode.
- the second intra-frame prediction mode is a Planar mode.
- when the encoding end determines that the intra-frame prediction mode of the current block allows the weighted prediction mode, it selects the target transformation kernel corresponding to the current block from the preset M types of secondary transformation kernels, where M is a positive integer greater than 1; encodes the current block to obtain the basic transformation coefficients of the current block; uses the target transformation kernel to perform a secondary transformation on the basic transformation coefficients to obtain the second transformation coefficients of the current block; and obtains a code stream according to the second transformation coefficients of the current block.
- the encoding end of this application selects the target transformation kernel corresponding to the current block from the preset M types of secondary transformation kernels instead of always using the transformation kernel corresponding to one particular mode, which improves the flexibility of transformation-kernel selection, improves the transformation effect, and improves overall encoding performance.
- FIG. 15 is a schematic flowchart of a video encoding method provided by an embodiment of the present application
- FIG. 16 is a schematic diagram of a video encoding process involved in an embodiment of the present application.
- the embodiment of the present application is applied to the video encoder shown in FIG. 1 and FIG. 2 .
- the method of the embodiment of the present application includes:
- the implementation of the above S702 is the same as that of the above S601-A3; refer to the description of S601-A3, which will not be repeated here.
- the above S703 and S702 are not ordered with respect to each other; that is, S703 can be executed before S702, after S702, or simultaneously with S702, which is not limited in this application.
- the pixel value of the current block is subtracted from the pixel value of the predicted block to obtain the residual block of the current block.
- the encoding end determines the intra prediction mode of the current block; selects, according to the intra prediction mode of the current block, the target transformation kernel corresponding to the current block from the preset M types of secondary transformation kernels; performs intra prediction on the current block using the intra prediction mode of the current block to obtain the prediction block of the current block; obtains the residual block of the current block according to the prediction block and the current block; performs a basic transformation on the residual block to obtain the basic transformation coefficients of the current block; performs a secondary transformation on the basic transformation coefficients using the target transformation kernel to obtain the second transformation coefficients of the current block; quantizes the second transformation coefficients to obtain quantized coefficients; and encodes the quantized coefficients to obtain a code stream.
- the encoding end of this application selects the target transformation kernel corresponding to the current block from the preset M types of secondary transformation kernels instead of always using the transformation kernel corresponding to one particular mode, which improves the flexibility of transformation-kernel selection, improves the transformation effect, and improves overall encoding performance.
- FIGS. 9 to 16 are only examples of the present application, and should not be construed as limiting the present application.
- the sequence numbers of the above-mentioned processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
- the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships are possible. Specifically, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone.
- the character "/" in this article generally indicates that the contextual objects are an "or" relationship.
- Fig. 17 is a schematic block diagram of a video decoder provided by an embodiment of the present application.
- the video decoder 10 includes:
- the decoding unit 11 is configured to decode the code stream to obtain a second transformation coefficient of the current block, and the second transformation coefficient is a transformation coefficient formed by the encoding end through secondary transformation on the residual block of the current block;
- the selection unit 12 is configured to select a target transform kernel corresponding to the current block from the preset transform kernels of M types of secondary transform if it is determined that the intra prediction mode of the current block allows the weighted prediction mode, where M is a positive integer greater than 1;
- the transformation unit 13 is configured to use the target transformation kernel to perform an inverse secondary transform on the second transformation coefficients to obtain the basic transform coefficients of the current block, and to perform an inverse basic transform on the basic transform coefficients to obtain the residual block of the current block.
- the selection unit 12 is specifically configured to decode the code stream to obtain a first flag, and the first flag is used to indicate that the target transformation kernel is a transformation kernel corresponding to the target prediction mode; according to the first flag , determining a transform kernel corresponding to the target prediction mode among the M types of secondary transform transform kernels as the target transform kernel.
- the selection unit 12 is specifically configured to determine the intra prediction mode of the current block; determine the target prediction mode according to the intra prediction mode of the current block and the first flag; The transformation kernel corresponding to the target prediction mode among the transformation kernels of the M types of secondary transformation is determined as the target transformation kernel.
- the selecting unit 12 is specifically configured to: if the intra prediction mode of the current block is a weighted prediction mode, and the weighted prediction mode includes N types of intra prediction modes, the N types of intra prediction modes The prediction mode indicated by the first flag in the prediction mode is determined as the target prediction mode, and the N is a positive integer greater than 1.
- the target prediction mode is one of the P types of intra-frame prediction modes, and the P types of intra-frame prediction modes are the intra-frame prediction modes among the N types that are related to the selection of the transform kernel for the secondary transform of the current block, where P is a positive integer greater than 1 and less than or equal to N.
- the P types of intra-frame prediction modes are angle prediction modes among the N types of intra-frame prediction modes;
- the P types of intra-frame prediction modes are the N types of intra-frame prediction modes.
- the decoding unit 11 is further configured to determine the transformation kernel types of the secondary transformation corresponding to the P types of intra prediction modes; if the transformation kernel types of the secondary transformation corresponding to the P types of intra prediction modes are the same, the first flag is also used to indicate whether to transpose the transformation coefficients after the inverse secondary transformation;
- the above-mentioned transformation unit 13 is specifically configured to use the target transformation kernel to perform an inverse secondary transform on the second transformation coefficients, according to the transformation kernel category of the secondary transformation corresponding to the P types of intra prediction modes and the first flag, to obtain the basic transform coefficients of the current block.
- the above transform unit 13 is specifically configured to, if it is determined that the transform kernel types corresponding to the P types of intra prediction modes are not the same, use the target transform kernel indicated by the first flag to perform an inverse secondary transform on the second transform coefficients, and determine the transform coefficients after the inverse secondary transform as the basic transform coefficients of the current block;
- if the first flag indicates that the transform coefficients after the inverse secondary transform are to be transposed, the target transformation kernel is used to perform an inverse secondary transform on the second transform coefficients, the transform coefficients after the inverse secondary transform are transposed, and the transposed transform coefficients are determined as the basic transform coefficients of the current block;
- if the first flag indicates that the transform coefficients after the inverse secondary transform are not to be transposed, the target transformation kernel is used to perform an inverse secondary transform on the second transform coefficients, and the transform coefficients after the inverse secondary transform are determined as the basic transform coefficients of the current block.
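The transpose branch described above can be sketched as follows. This is a hedged illustration: the inverse secondary transform is stood in for by a plain matrix product with an identity kernel, purely to show how the flag selects between the transposed and untransposed results.

```python
def matmul(a, b):
    """Plain matrix product of two row-major matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def inverse_secondary(second, kernel_inv, transpose_flag):
    """Inverse secondary transform (stubbed as matmul) followed by an
    optional transpose, as signalled by the first flag."""
    coeffs = matmul(kernel_inv, second)
    return transpose(coeffs) if transpose_flag else coeffs

identity = [[1, 0], [0, 1]]      # stand-in inverse kernel (hypothetical)
second = [[1, 2], [3, 4]]
inverse_secondary(second, identity, transpose_flag=False)  # [[1, 2], [3, 4]]
inverse_secondary(second, identity, transpose_flag=True)   # [[1, 3], [2, 4]]
```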
- the selecting unit 12 is specifically configured to determine the target prediction mode according to the value of the first flag if the intra prediction mode of the current block is a non-weighted prediction mode.
- the selecting unit 12 is specifically configured to determine that the target prediction mode is the intra prediction mode of the current block if the value of the first flag is a first value;
- the target prediction mode is a second intra-frame prediction mode, and the second intra-frame prediction mode is different from the intra-frame prediction mode of the current block.
- the second intra-frame prediction mode is a DC mode
- the second intra-frame prediction mode is a Planar mode.
- the decoding unit 11 is further configured to determine the weighting weights used to weight the predicted values corresponding to the N intra-frame prediction modes; if the ratio between the minimum weighting weight and the maximum weighting weight corresponding to the N intra-frame prediction modes is greater than the preset value, the code stream is decoded to obtain the first flag.
- the selection unit 12 is specifically configured to determine, from the preset M types of secondary transform kernels, the secondary transform kernels corresponding to the N types of intra prediction modes; determine the rate-distortion cost when decoding with the secondary transform kernel corresponding to each of the N intra prediction modes; and determine the secondary transform kernel with the smallest rate-distortion cost as the target transform kernel corresponding to the current block.
- the selection unit 12 is specifically configured to determine the rate-distortion cost corresponding to each of the N types of intra prediction modes; query, among the M types of secondary transform kernels, the first transform kernel corresponding to the intra prediction mode with the smallest rate-distortion cost; and determine the first transform kernel as the target transform kernel corresponding to the current block.
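The rate-distortion selection described above amounts to an argmin over the candidate kernels. A minimal sketch, with a stubbed cost table standing in for the real distortion-plus-rate measurement an encoder or decoder would perform:

```python
def select_target_kernel(candidate_kernels, rd_cost):
    """candidate_kernels maps each intra mode to its secondary-transform
    kernel; rd_cost returns the rate-distortion cost of using a kernel.
    Returns the (mode, kernel) pair with the smallest cost."""
    best_mode = min(candidate_kernels, key=lambda m: rd_cost(candidate_kernels[m]))
    return best_mode, candidate_kernels[best_mode]

# Hypothetical kernels and costs, for illustration only.
kernels = {"mode1": "kernelA", "mode2": "kernelB"}
costs = {"kernelA": 12.0, "kernelB": 9.5}
select_target_kernel(kernels, costs.get)   # ('mode2', 'kernelB')
```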
- the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
- the video decoder 10 shown in FIG. 17 can execute the decoding method of the embodiment of the present application, and the aforementioned and other operations and/or functions of the various units in the video decoder 10 are for realizing the above-mentioned decoding method and other methods. For the sake of brevity, the corresponding process will not be repeated here.
- Fig. 18 is a schematic block diagram of a video encoder provided by an embodiment of the present application.
- the video encoder 20 may include:
- the selection unit 21 is configured to select a target transformation kernel corresponding to the current block from the preset transformation kernels of M types of secondary transformation if it is determined that the intra prediction mode of the current block allows the weighted prediction mode, and the M is a positive integer greater than 1;
- An encoding unit 22 configured to encode the current block to obtain a basic transform coefficient of the current block
- a transform unit 23 configured to use the target transform kernel to perform secondary transform on the basic transform coefficients to obtain second transform coefficients of the current block;
- the encoding unit 22 is further configured to obtain a code stream according to the second transform coefficient of the current block.
- the selection unit 21 is specifically configured to determine the intra-frame prediction mode of the current block; according to the intra-frame prediction mode of the current block, select from the preset transformation kernels of M types of secondary transformation The target transform kernel corresponding to the current block.
- the selection unit 21 is specifically configured to roughly screen out the Q types of intra-frame prediction modes with the lowest cost from the preset K types of intra-frame prediction modes, where K and Q are both positive integers and the K intra-frame prediction modes include a weighted prediction mode; and to finely select, from the Q intra-frame prediction modes and the preset weighted prediction modes, the intra-frame prediction mode with the lowest cost as the intra-frame prediction mode of the current block.
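The two-stage search above can be sketched as a rough pass that keeps the Q cheapest of K candidates under a cheap cost, followed by a fine pass (including the weighted prediction mode) under a more expensive cost. Mode names and both cost tables below are hypothetical stubs.

```python
def rough_then_fine(k_modes, q, rough_cost, extra_modes, fine_cost):
    """Rough screening keeps the q cheapest modes, then fine selection
    picks the overall cheapest from the shortlist plus extra candidates."""
    shortlist = sorted(k_modes, key=rough_cost)[:q]      # rough screening
    return min(shortlist + extra_modes, key=fine_cost)   # fine selection

k_modes = ["dc", "planar", "angular18", "angular50"]
rough = {"dc": 3, "planar": 1, "angular18": 4, "angular50": 2}
fine = {"planar": 5, "angular50": 4, "weighted": 3}
rough_then_fine(k_modes, 2, rough.get, ["weighted"], fine.get)  # 'weighted'
```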
- the selection unit 21 is specifically configured to: if the intra prediction mode of the current block is a weighted prediction mode and the weighted prediction mode includes N types of intra prediction modes, determine, from the preset M types of secondary transform kernels, the secondary transform kernels corresponding to the N types of intra prediction modes; determine the rate-distortion cost when decoding with the secondary transform kernel corresponding to each of the N intra prediction modes; and determine the secondary transform kernel with the smallest rate-distortion cost as the target transform kernel corresponding to the current block.
- the selection unit 21 is specifically configured to: if the intra prediction mode of the current block is a weighted prediction mode and the weighted prediction mode includes N types of intra prediction modes, determine the rate-distortion cost corresponding to each of the N types of intra prediction modes among the M types of secondary transform kernels; query the first transform kernel corresponding to the intra prediction mode with the smallest rate-distortion cost; and determine the first transform kernel as the target transform kernel corresponding to the current block.
- the encoding unit 22 is further configured to write a first flag in the code stream, where the first flag is used to indicate that the target transformation kernel is a transformation kernel corresponding to the target prediction mode.
- the target prediction mode is one of the P types of intra-frame prediction modes
- the P types of intra-frame prediction modes are intra-frame prediction modes related to the transformation kernel selection of the secondary transformation of the current block among the N types of intra-frame prediction modes
- P is a positive integer greater than 1 and less than or equal to N.
- the P types of intra-frame prediction modes are angle prediction modes among the N types of intra-frame prediction modes;
- the P types of intra-frame prediction modes are the N types of intra-frame prediction modes.
- the first flag is also used to indicate whether to transpose the transformation coefficient after the inverse secondary transformation.
- the selection unit 21 is specifically configured to determine the transformation kernel corresponding to the intra-frame prediction mode of the current block among the M types of secondary transformation kernels as the target transformation kernel corresponding to the current block; or to determine the transformation kernel corresponding to the second intra prediction mode among the M types of secondary transformation kernels as the target transformation kernel corresponding to the current block
- the value of the first flag is a first value
- the value of the first flag is a second value.
- the second intra-frame prediction mode is a DC mode
- the second intra-frame prediction mode is a Planar mode.
- the encoding unit 22 is further configured to determine the weighting weights used to weight the predicted values corresponding to the N intra-frame prediction modes; if the ratio between the minimum weighting weight and the maximum weighting weight corresponding to the N intra-frame prediction modes is greater than the preset value, the first flag is written in the code stream.
- the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
- the video encoder 20 shown in FIG. 18 may correspond to the corresponding subject executing the encoding method of the embodiment of the present application, and the aforementioned and other operations and/or functions of each unit in the video encoder 20 are for realizing the corresponding processes in the encoding method; for the sake of brevity, they will not be repeated here.
- the functional unit may be implemented in the form of hardware, may also be implemented by instructions in the form of software, and may also be implemented by a combination of hardware and software units.
- each step of the method embodiments in the embodiments of the present application can be completed by integrated logic circuits of hardware in the processor and/or instructions in the form of software, and the steps of the methods disclosed in the embodiments of the present application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software units in a decoding processor.
- the software unit may be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, and registers.
- the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
- Fig. 19 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
- the electronic device 30 may be the video encoder or video decoder described in the embodiment of the present application, and the electronic device 30 may include:
- a memory 33 and a processor 32; the memory 33 is used to store a computer program 34 and to transmit the program code 34 to the processor 32.
- the processor 32 can call and run the computer program 34 from the memory 33 to implement the method in the embodiment of the present application.
- the processor 32 can be used to execute the steps in the above-mentioned method 200 according to the instructions in the computer program 34 .
- the processor 32 may include, but is not limited to:
- a Digital Signal Processor (DSP)
- an Application Specific Integrated Circuit (ASIC)
- a Field Programmable Gate Array (FPGA)
- the memory 33 includes but is not limited to:
- non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
- the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
- Static Random Access Memory (SRAM)
- Dynamic Random Access Memory (DRAM)
- Synchronous Dynamic Random Access Memory (SDRAM)
- Double Data Rate Synchronous Dynamic Random Access Memory (Double Data Rate SDRAM, DDR SDRAM)
- Enhanced Synchronous Dynamic Random Access Memory (Enhanced SDRAM, ESDRAM)
- Synchlink Dynamic Random Access Memory (SLDRAM)
- Direct Rambus Random Access Memory (Direct Rambus RAM, DR RAM)
- the computer program 34 can be divided into one or more units, and the one or more units are stored in the memory 33 and executed by the processor 32 to complete the present application.
- the one or more units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program 34 in the electronic device 30 .
- the electronic device 30 may also include:
- a transceiver 33; the transceiver 33 can be connected to the processor 32 or the memory 33.
- the processor 32 can control the transceiver 33 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
- Transceiver 33 may include a transmitter and a receiver.
- the transceiver 33 may further include antennas, and the number of antennas may be one or more.
- the bus system includes not only a data bus but also a power bus, a control bus, and a status signal bus.
- Fig. 20 is a schematic block diagram of a video encoding and decoding system provided by an embodiment of the present application.
- the video codec system 40 may include: a video encoder 41 and a video decoder 42, wherein the video encoder 41 is used to execute the video encoding method involved in the embodiment of the present application, and the video decoder 42 is used to execute The video decoding method involved in the embodiment of the present application.
- the present application also provides a computer storage medium, on which a computer program is stored, and when the computer program is executed by a computer, the computer can execute the methods of the above method embodiments.
- the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
- the present application also provides a code stream, which is generated by the above coding method.
- the code stream includes the first flag.
- the computer program product includes one or more computer instructions.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- the computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transferred from a website, computer, server, or data center by wire (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, wireless, microwave, etc.) to another website site, computer, server or data center.
- the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
- the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (digital video disc, DVD)), or a semiconductor medium (such as a solid state disk (solid state disk, SSD)), etc.
- the disclosed systems, devices and methods may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division, and in actual implementation there may be other division methods; for example, multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
- a unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
Abstract
The present application provides a video encoding/decoding method, device, system, and storage medium. A code stream is decoded to obtain second transform coefficients of a current block, where the second transform coefficients are transform coefficients formed by the encoding end through a secondary transform of the residual block of the current block. If it is determined that the intra prediction mode of the current block allows a weighted prediction mode, a target transform kernel corresponding to the current block is selected from M preset types of secondary-transform kernels, where M is a positive integer greater than 1. The target transform kernel is used to perform an inverse secondary transform on the second transform coefficients to obtain the basic transform coefficients of the current block, and an inverse basic transform is performed on the basic transform coefficients to obtain the residual block of the current block. That is, the present application selects the target transform kernel corresponding to the current block from M preset types of secondary-transform kernels instead of always using the transform kernel corresponding to one fixed mode, which improves the flexibility of transform-kernel selection, improves the transform effect, and improves overall codec performance.
Description
The present application relates to the technical field of video encoding and decoding, and in particular to a video encoding/decoding method, device, system, and storage medium.
Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers, and video players. With the development of video technology, the amount of data contained in video is large; to facilitate transmission, video devices apply video compression technology so that video data can be transmitted or stored more efficiently.
Video is compressed through encoding, and the encoding process includes prediction, transform, and quantization. For example, the prediction block of the current block is determined through intra prediction and/or inter prediction; the prediction block is subtracted from the current block to obtain a residual block; the residual block is transformed to obtain transform coefficients; the transform coefficients are quantized to obtain quantized coefficients; and the quantized coefficients are encoded to form a code stream.
In some cases, the intra prediction mode of the current block may be a weighted prediction mode, and the transform mode of the current block may be a secondary transform mode. In this case, the secondary transform currently uses a fixed transform kernel corresponding to one particular intra prediction mode, making the selection of the secondary-transform kernel inflexible and the transform effect poor, which in turn makes overall codec performance unsatisfactory.
Summary of the Invention
Embodiments of the present application provide a video encoding/decoding method, device, system, and storage medium. By selecting the target transform kernel corresponding to the current block from M preset types of secondary-transform kernels, instead of always selecting the transform kernel corresponding to one fixed mode, the flexibility of transform-kernel selection is improved, the transform effect is improved, and overall codec performance is improved.
In a first aspect, the present application provides a video decoding method, including:
decoding a code stream to obtain second transform coefficients of a current block, where the second transform coefficients are transform coefficients formed by an encoding end through a secondary transform of a residual block of the current block;
if it is determined that the intra prediction mode of the current block allows a weighted prediction mode, selecting a target transform kernel corresponding to the current block from M preset types of secondary-transform kernels, where M is a positive integer greater than 1;
performing an inverse secondary transform on the second transform coefficients using the target transform kernel to obtain basic transform coefficients of the current block;
performing an inverse basic transform on the basic transform coefficients to obtain the residual block of the current block.
In a second aspect, an embodiment of the present application provides a video encoding method, including:
if it is determined that the intra prediction mode of the current block allows a weighted prediction mode, selecting a target transform kernel corresponding to the current block from M preset types of secondary-transform kernels, where M is a positive integer greater than 1;
encoding the current block to obtain basic transform coefficients of the current block;
performing a secondary transform on the basic transform coefficients using the target transform kernel to obtain second transform coefficients of the current block;
obtaining a code stream according to the second transform coefficients of the current block.
In a third aspect, the present application provides a video encoder for executing the method in the above first aspect or its implementations. Specifically, the encoder includes functional units for executing the method in the above first aspect or its implementations.
In a fourth aspect, the present application provides a video decoder for executing the method in the above second aspect or its implementations. Specifically, the decoder includes functional units for executing the method in the above second aspect or its implementations.
In a fifth aspect, a video encoder is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method in the above first aspect or its implementations.
In a sixth aspect, a video decoder is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method in the above second aspect or its implementations.
In a seventh aspect, a video encoding/decoding system is provided, including a video encoder and a video decoder. The video encoder is used to execute the method in the above first aspect or its implementations, and the video decoder is used to execute the method in the above second aspect or its implementations.
In an eighth aspect, a chip is provided for implementing the method in any one of the above first and second aspects or their implementations. Specifically, the chip includes a processor for calling and running a computer program from a memory, so that a device installed with the chip executes the method in any one of the above first and second aspects or their implementations.
In a ninth aspect, a computer-readable storage medium is provided for storing a computer program that causes a computer to execute the method in any one of the above first and second aspects or their implementations.
In a tenth aspect, a computer program product is provided, including computer program instructions that cause a computer to execute the method in any one of the above first and second aspects or their implementations.
In an eleventh aspect, a computer program is provided that, when run on a computer, causes the computer to execute the method in any one of the above first and second aspects or their implementations.
In a twelfth aspect, a code stream is provided, the code stream being generated by any one of the above first aspect or its implementations.
Based on the above technical solutions, the code stream is decoded to obtain the second transform coefficients of the current block, which are transform coefficients formed by the encoding end through a secondary transform of the residual block of the current block; if it is determined that the intra prediction mode of the current block allows a weighted prediction mode, a target transform kernel corresponding to the current block is selected from M preset types of secondary-transform kernels, where M is a positive integer greater than 1; the target transform kernel is used to perform an inverse secondary transform on the second transform coefficients to obtain the basic transform coefficients of the current block; and an inverse basic transform is performed on the basic transform coefficients to obtain the residual block of the current block. That is, the present application selects the target transform kernel corresponding to the current block from M preset types of secondary-transform kernels instead of always using the transform kernel corresponding to one fixed mode, which improves the flexibility of transform-kernel selection, improves the transform effect, and improves overall codec performance.
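The decode path summarized above (decode, select kernel, inverse secondary transform, inverse basic transform) can be sketched end to end. All of the transforms below are stand-in stubs chosen only to show the order of operations; they are not any codec's actual mathematics, and the kernel table is hypothetical.

```python
def decode_residual(second_coeffs, kernels, mode):
    """Order-of-operations sketch of the decoding side: pick the target
    kernel for the selected mode, apply a stub inverse secondary
    transform, then a stub inverse basic transform."""
    kernel = kernels[mode]                       # select target transform kernel
    basic = [kernel * c for c in second_coeffs]  # inverse secondary transform (stub)
    residual = [c / 2 for c in basic]            # inverse basic transform (stub)
    return residual

kernels = {"weighted": 1, "dc": -1}              # hypothetical kernel table
decode_residual([2.0, 4.0], kernels, "weighted")  # [1.0, 2.0]
```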
FIG. 1 is a schematic block diagram of a video encoding/decoding system involved in an embodiment of the present application;
FIG. 2 is a schematic block diagram of a video encoder involved in an embodiment of the present application;
FIG. 3 is a schematic block diagram of a video decoder involved in an embodiment of the present application;
FIG. 4 is a schematic diagram of the LFNST transform;
FIG. 5 is a schematic diagram of one set of intra prediction modes;
FIG. 6 is a schematic diagram of another set of intra prediction modes;
FIG. 7A is a schematic diagram of DIMD prediction;
FIG. 7B is a histogram of DIMD amplitude values and prediction modes;
FIG. 8 is a schematic diagram of TIMD prediction;
FIG. 9 is a schematic flowchart of a video decoding method provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a video decoding process involved in an embodiment of the present application;
FIG. 11 is another schematic flowchart of a video decoding method provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a decoding process involved in an embodiment of the present application;
FIG. 13 is a schematic flowchart of a video encoding method provided by an embodiment of the present application;
FIG. 14 is a schematic diagram of a video encoding process involved in an embodiment of the present application;
FIG. 15 is a schematic flowchart of a video encoding method provided by an embodiment of the present application;
FIG. 16 is a schematic diagram of a video encoding process involved in an embodiment of the present application;
FIG. 17 is a schematic block diagram of a video decoder provided by an embodiment of the present application;
FIG. 18 is a schematic block diagram of a video encoder provided by an embodiment of the present application;
FIG. 19 is a schematic block diagram of an electronic device provided by an embodiment of the present application;
FIG. 20 is a schematic block diagram of a video encoding/decoding system provided by an embodiment of the present application.
The present application can be applied to the fields of image coding, video coding, hardware video coding, dedicated-circuit video coding, real-time video coding, and the like. For example, the solutions of the present application can be combined with audio video coding standards (AVS), such as the H.264/AVC (audio video coding) standard, the H.265/HEVC (high efficiency video coding) standard, and the H.266/VVC (versatile video coding) standard. Alternatively, the solutions of the present application can operate in combination with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multi-view video coding (MVC) extensions. It should be understood that the techniques of the present application are not limited to any particular coding standard or technology.
For ease of understanding, the video encoding/decoding system involved in the embodiments of the present application is first introduced with reference to FIG. 1.
FIG. 1 is a schematic block diagram of a video encoding/decoding system involved in an embodiment of the present application. It should be noted that FIG. 1 is only an example, and the video encoding/decoding system of the embodiments of the present application includes but is not limited to what is shown in FIG. 1. As shown in FIG. 1, the video encoding/decoding system 100 includes an encoding device 110 and a decoding device 120. The encoding device is used to encode (which can be understood as compressing) video data to generate a code stream, and to transmit the code stream to the decoding device. The decoding device decodes the code stream generated by the encoding device to obtain decoded video data.
The encoding device 110 of the embodiments of the present application can be understood as a device with a video encoding function, and the decoding device 120 can be understood as a device with a video decoding function; that is, the embodiments of the present application encompass a broad range of devices as the encoding device 110 and the decoding device 120, including, for example, smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, and in-vehicle computers.
In some embodiments, the encoding device 110 can transmit the encoded video data (such as a code stream) to the decoding device 120 via a channel 130. The channel 130 can include one or more media and/or devices capable of transmitting the encoded video data from the encoding device 110 to the decoding device 120.
In one example, the channel 130 includes one or more communication media that enable the encoding device 110 to transmit the encoded video data directly to the decoding device 120 in real time. In this example, the encoding device 110 can modulate the encoded video data according to a communication standard and transmit the modulated video data to the decoding device 120. The communication media include wireless communication media, such as the radio-frequency spectrum; optionally, the communication media can also include wired communication media, such as one or more physical transmission lines.
In another example, the channel 130 includes a storage medium that can store the video data encoded by the encoding device 110. Storage media include a variety of locally-accessible data storage media, such as optical discs, DVDs, and flash memory. In this example, the decoding device 120 can obtain the encoded video data from the storage medium.
In another example, the channel 130 can include a storage server that can store the video data encoded by the encoding device 110. In this example, the decoding device 120 can download the stored encoded video data from the storage server. Optionally, the storage server can store the encoded video data and transmit it to the decoding device 120, for example a web server (e.g., for a website) or a file transfer protocol (FTP) server.
In some embodiments, the encoding device 110 includes a video encoder 112 and an output interface 113. The output interface 113 can include a modulator/demodulator (modem) and/or a transmitter.
In some embodiments, in addition to the video encoder 112 and the output interface 113, the encoding device 110 can also include a video source 111.
The video source 111 can include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface, and a computer graphics system, where the video input interface is used to receive video data from a video content provider and the computer graphics system is used to generate video data.
The video encoder 112 encodes the video data from the video source 111 to generate a code stream. The video data can include one or more pictures or a sequence of pictures. The code stream contains the encoded information of the picture or picture sequence in the form of a bitstream. The encoded information can include encoded picture data and associated data. The associated data can include a sequence parameter set (SPS), a picture parameter set (PPS), and other syntax structures. An SPS can contain parameters applied to one or more sequences. A PPS can contain parameters applied to one or more pictures. A syntax structure refers to a set of zero or more syntax elements arranged in a specified order in the code stream.
The video encoder 112 transmits the encoded video data directly to the decoding device 120 via the output interface 113. The encoded video data can also be stored on a storage medium or storage server for subsequent reading by the decoding device 120.
In some embodiments, the decoding device 120 includes an input interface 121 and a video decoder 122.
In some embodiments, in addition to the input interface 121 and the video decoder 122, the decoding device 120 can also include a display device 123.
The input interface 121 includes a receiver and/or a modem. The input interface 121 can receive the encoded video data through the channel 130.
The video decoder 122 is used to decode the encoded video data to obtain decoded video data, and to transmit the decoded video data to the display device 123.
The display device 123 displays the decoded video data. The display device 123 can be integrated with the decoding device 120 or external to it. The display device 123 can include a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or another type of display device.
In addition, FIG. 1 is only an example, and the technical solutions of the embodiments of the present application are not limited to FIG. 1; for example, the techniques of the present application can also be applied to one-sided video encoding or one-sided video decoding.
下面对本申请实施例涉及的视频编码框架进行介绍。
图2是本申请实施例涉及的视频编码器的示意性框图。应理解,该视频编码器200可用于对图像进行有损压缩(lossy compression),也可用于对图像进行无损压缩(lossless compression)。该无损压缩可以是视觉无损压缩(visually lossless compression),也可以是数学无损压缩(mathematically lossless compression)。
该视频编码器200可应用于亮度色度(YCbCr,YUV)格式的图像数据上。例如,YUV比例可以为4:2:0、4:2:2或者4:4:4,Y表示明亮度(Luma),Cb(U)表示蓝色色度,Cr(V)表示红色色度,U和V表示为色度(Chroma)用于描述色彩及饱和度。例如,在颜色格式上,4:2:0表示每4个像素有4个亮度分量,2个色度分量(YYYYCbCr),4:2:2表示每4个像素有4个亮度分量,4个色度分量(YYYYCbCrCbCr),4:4:4表示全像素显示(YYYYCbCrCbCrCbCrCbCr)。
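上述各采样格式下亮度与色度分量的采样数可以直接换算。以下为一个示意性的Python片段(函数名为说明用途而自行假设,并非任何标准或参考软件中的接口),用于计算给定分辨率与色度采样格式下各分量的采样点数:

```python
def plane_sizes(width, height, chroma_format):
    """计算一帧图像在给定色度采样格式下各分量的采样点数。

    chroma_format: "4:2:0"、"4:2:2" 或 "4:4:4"。
    返回 (亮度采样数, 单个色度分量采样数)。
    """
    luma = width * height
    if chroma_format == "4:2:0":
        chroma = (width // 2) * (height // 2)  # 水平、竖直方向均下采样一半
    elif chroma_format == "4:2:2":
        chroma = (width // 2) * height         # 仅水平方向下采样一半
    elif chroma_format == "4:4:4":
        chroma = luma                          # 不做色度下采样
    else:
        raise ValueError("未知的色度采样格式")
    return luma, chroma
```

例如对1920×1080的4:2:0图像,亮度分量有2073600个采样,每个色度分量有518400个采样,即色度采样数为亮度的四分之一。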
例如,该视频编码器200读取视频数据,针对视频数据中的每帧图像,将一帧图像划分成若干个编码树单元(coding tree unit,CTU),在一些例子中,CTU可被称作“树型块”、“最大编码单元”(Largest Coding unit,简称LCU)或“编码树型块”(coding tree block,简称CTB)。每一个CTU可以与图像内的具有相等大小的像素块相关联。每一像素可对应一个亮度(luminance或luma)采样及两个色度(chrominance或chroma)采样。因此,每一个CTU可与一个亮度采样块及两个色度采样块相关联。一个CTU大小例如为128×128、64×64、32×32等。一个CTU又可以继续被划分成若干个编码单元(Coding Unit,CU)进行编码,CU可以为矩形块也可以为方形块。CU可以进一步划分为预测单元(prediction Unit,简称PU)和变换单元(transform unit,简称TU),进而使得编码、预测、变换分离,处理的时候更灵活。在一种示例中,CTU以四叉树方式划分为CU,CU以四叉树方式划分为TU、PU。
视频编码器及视频解码器可支持各种PU大小。假定特定CU的大小为2N×2N,视频编码器及视频解码器可支持2N×2N或N×N的PU大小以用于帧内预测,且支持2N×2N、2N×N、N×2N、N×N或类似大小的对称PU以用于帧间预测。视频编码器及视频解码器还可支持2N×nU、2N×nD、nL×2N及nR×2N的不对称PU以用于帧间预测。
在一些实施例中,如图2所示,该视频编码器200可包括:预测单元210、残差单元220、变换/量化单元230、反变换/量化单元240、重建单元250、环路滤波单元260、解码图像缓存270和熵编码单元280。需要说明的是,视频编码器200可包含更多、更少或不同的功能组件。
可选的,在本申请中,当前块(current block)可以称为当前编码单元(CU)或当前预测单元(PU)等。预测块也可称为预测图像块或图像预测块,重建图像块也可称为重建块或图像重建块。
在一些实施例中,预测单元210包括帧间预测单元211和帧内估计单元212。由于视频的一个帧中的相邻像素之间存在很强的相关性,在视频编解码技术中使用帧内预测的方法消除相邻像素之间的空间冗余。由于视频中的相邻帧之间存在着很强的相似性,在视频编解码技术中使用帧间预测方法消除相邻帧之间的时间冗余,从而提高编码效率。
帧间预测单元211可用于帧间预测,帧间预测可以参考不同帧的图像信息,帧间预测使用运动信息从参考帧中找到参考块,根据参考块生成预测块,用于消除时间冗余;帧间预测所使用的帧可以为P帧和/或B帧,P帧指的是向前预测帧,B帧指的是双向预测帧。运动信息包括参考帧所在的参考帧列表,参考帧索引,以及运动矢量。运动矢量可以是整像素的或者是分像素的,如果运动矢量是分像素的,那么需要在参考帧中使用插值滤波做出所需的分像素的块,这里把根据运动矢量找到的参考帧中的整像素或者分像素的块叫参考块。有的技术会直接把参考块作为预测块,有的技术会在参考块的基础上再处理生成预测块。在参考块的基础上再处理生成预测块也可以理解为把参考块作为预测块然后再在预测块的基础上处理生成新的预测块。
帧内估计单元212只参考同一帧图像的信息,预测当前图像块内的像素信息,用于消除空间冗余。帧内预测所使用的帧可以为I帧。例如对于4×4的当前块,当前块左边一行和上面一列的像素为当前块的参考像素,帧内预测使用这些参考像素对当前块进行预测。这些参考像素可能已经全部可得,即全部已经编解码。也可能有部分不可得,比如当前块是整帧的最左侧,那么当前块的左边的参考像素不可得。或者编解码当前块时,当前块左下方的部分还没有编解码,那么左下方的参考像素也不可得。对于参考像素不可得的情况,可以使用可得的参考像素或某些值或某些方法进行填充,或者不进行填充。
帧内预测有多种预测模式,以国际数字视频编码标准H系列为例,H.264/AVC标准有8种角度预测模式和1种非角度预测模式,H.265/HEVC扩展到33种角度预测模式和2种非角度预测模式。HEVC使用的帧内预测模式有平面模式(Planar)、DC和33种角度模式,共35种预测模式。VVC使用的帧内模式有Planar、DC和65种角度模式,共67种预测模式。对于亮度分量有基于训练得到的预测矩阵(Matrix based intra prediction,MIP)预测模式,对于色度分量,有CCLM预测模式。
需要说明的是,随着角度模式的增加,帧内预测将会更加精确,也更加符合对高清以及超高清数字视频发展的需求。
残差单元220可基于CU的像素块及CU的PU的预测块来产生CU的残差块。举例来说,残差单元220可产生CU的残差块,使得残差块中的每一采样具有等于以下两者之间的差的值:CU的像素块中的采样,及CU的PU的预测块中的对应采样。
变换/量化单元230可量化变换系数。变换/量化单元230可基于与CU相关联的量化参数(QP)值来量化与CU的TU相关联的变换系数。视频编码器200可通过调整与CU相关联的QP值来调整应用于与CU相关联的变换系数的量化程度。
反变换/量化单元240可分别将逆量化及逆变换应用于量化后的变换系数,以从量化后的变换系数重建残差块。
重建单元250可将重建后的残差块的采样加到预测单元210产生的一个或多个预测块的对应采样,以产生与TU相关联的重建图像块。通过此方式重建CU的每一个TU的采样块,视频编码器200可重建CU的像素块。
环路滤波单元260可执行消块滤波操作以减少与CU相关联的像素块的块效应。
在一些实施例中,环路滤波单元260包括去块滤波单元和样点自适应补偿/自适应环路滤波(SAO/ALF)单元,其中去块滤波单元用于去方块效应,SAO/ALF单元用于去除振铃效应。
解码图像缓存270可存储重建后的像素块。帧间预测单元211可使用含有重建后的像素块的参考图像来对其它图像的PU执行帧间预测。另外,帧内估计单元212可使用解码图像缓存270中的重建后的像素块来对在与CU相同的图像中的其它PU执行帧内预测。
熵编码单元280可接收来自变换/量化单元230的量化后的变换系数。熵编码单元280可对量化后的变换系数执行一个或多个熵编码操作以产生熵编码后的数据。
图3是本申请实施例涉及的视频解码器的示意性框图。
如图3所示,视频解码器300包含:熵解码单元310、预测单元320、逆量化变换单元330、重建单元340、环路滤波单元350及解码图像缓存360。需要说明的是,视频解码器300可包含更多、更少或不同的功能组件。
视频解码器300可接收码流。熵解码单元310可解析码流以从码流提取语法元素。作为解析码流的一部分,熵解码单元310可解析码流中的经熵编码后的语法元素。预测单元320、逆量化变换单元330、重建单元340及环路滤波单元350可根据从码流中提取的语法元素来解码视频数据,即产生解码后的视频数据。
在一些实施例中,预测单元320包括帧内估计单元321和帧间预测单元322。
帧内估计单元321(也称为帧内预测单元)可执行帧内预测以产生PU的预测块。帧内估计单元321可使用帧内预测模式以基于空间相邻PU的像素块来产生PU的预测块。帧内估计单元321还可根据从码流解析的一个或多个语法元素来确定PU的帧内预测模式。
帧间预测单元322可根据从码流解析的语法元素来构造第一参考图像列表(列表0)及第二参考图像列表(列表1)。此外,如果PU使用帧间预测编码,则熵解码单元310可解析PU的运动信息。帧间预测单元322可根据PU的运动信息来确定PU的一个或多个参考块。帧间预测单元322可根据PU的一个或多个参考块来产生PU的预测块。
逆量化变换单元330(也称为反量化/变换单元)可逆量化(即,解量化)与TU相关联的变换系数。逆量化变换单元330可使用与TU的CU相关联的QP值来确定量化程度。
在逆量化变换系数之后,逆量化变换单元330可将一个或多个逆变换应用于逆量化变换系数,以便产生与TU相关联的残差块。
重建单元340使用与CU的TU相关联的残差块及CU的PU的预测块以重建CU的像素块。例如,重建单元340可将残差块的采样加到预测块的对应采样以重建CU的像素块,得到重建图像块。
环路滤波单元350可执行消块滤波操作以减少与CU相关联的像素块的块效应。
视频解码器300可将CU的重建图像存储于解码图像缓存360中。视频解码器300可将解码图像缓存360中的重建图像作为参考图像用于后续预测,或者,将重建图像传输给显示装置呈现。
由上述图2和图3可知,视频编解码的基本流程如下:在编码端,将一帧图像划分成块,对当前块,预测单元210使用帧内预测或帧间预测产生当前块的预测块。残差单元220可基于预测块与当前块的原始块计算残差块,例如将当前块的原始块减去预测块得到残差块,该残差块也可称为残差信息。该残差块经由变换/量化单元230变换与量化等过程,可以去除人眼不敏感的信息,以消除视觉冗余。可选的,经过变换/量化单元230变换与量化之前的残差块可称为时域残差块,经过变换/量化单元230变换与量化之后的时域残差块可称为频率残差块或频域残差块。熵编码单元280接收到变换量化单元230输出的量化后的变换系数,可对该量化后的变换系数进行熵编码,输出码流。例如,熵编码单元280可根据目标上下文模型以及二进制码流的概率信息消除字符冗余。
在解码端,熵解码单元310可解析码流得到当前块的预测信息、量化系数矩阵等,预测单元320基于预测信息对当前块使用帧内预测或帧间预测产生当前块的预测块。逆量化变换单元330使用从码流得到的量化系数矩阵,对量化系数矩阵进行反量化、反变换得到残差块。重建单元340将预测块和残差块相加得到重建块。重建块组成重建图像,环路滤波单元350基于图像或基于块对重建图像进行环路滤波,得到解码图像。编码端同样需要和解码端类似的操作获得解码图像。该解码图像也可以称为重建图像,重建图像可以为后续的帧作为帧间预测的参考帧。
需要说明的是,编码端确定的块划分信息,以及预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息等在必要时携带在码流中。解码端通过解析码流及根据已有信息进行分析确定与编码端相同的块划分信息,预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息,从而保证编码端获得的解码图像和解码端获得的解码图像相同。
当前块(current block)可以是当前编码单元(CU)或当前预测单元(PU)等。
上述是基于块的混合编码框架下的视频编解码器的基本流程,随着技术的发展,该框架或流程的一些模块或步骤可能会被优化,本申请适用于该基于块的混合编码框架下的视频编解码器的基本流程,但不限于该框架及流程。
由上述可知编码时,通用的混合编码框架会先进行预测,预测利用空间或者时间上的相关性,得到一个跟当前块相同或相似的图像。对一个块来说,预测块和当前块是完全相同的情况是有可能出现的,但是很难保证一个视频中的所有块都如此,特别是对自然视频,或者说相机拍摄的视频,因为有噪音的存在。而且视频中不规则的运动,扭曲形变,遮挡,亮度等的变化,很难被完全预测。所以混合编码框架会将当前块的原始图像减去预测图像得到残差图像,或者说当前块减去预测块得到残差块。残差块通常要比原始图像简单很多,因而预测可以显著提升压缩效率。对残差块也不是直接进行编码,而是通常先进行变换。变换是把残差图像从空间域变换到频率域,去除残差图像的相关性。残差图像变换到频率域以后,由于能量大多集中在低频区域,变换后的非零系数大多集中在左上角。接下来利用量化来进一步压缩。而且由于人眼对高频不敏感,高频区域可以使用更大的量化步长。
原始图像经过离散余弦变换(Discrete Cosine Transform,简称DCT)后,在左上角区域存在非零系数。
DCT是视频压缩标准中最常用的变换,常用的变换类型包括DCT-II、DCT-VIII以及DST-VII等。
其中,DCT-II变换的基本公式如公式(1)所示:

$$T_i(j)=\omega_0\cdot\sqrt{\frac{2}{N}}\cdot\cos\left(\frac{\pi\cdot i\cdot(2j+1)}{2N}\right)\qquad(1)$$

其中,$T_i(j)$为变换后的系数,$N$为原始信号的点数,$i,j=0,1,\ldots,N-1$,$\omega_0$为补偿系数:当$i=0$时,$\omega_0=\sqrt{1/2}$;否则$\omega_0=1$。

DCT-VIII变换的基本公式如公式(2)所示:

$$T_i(j)=\sqrt{\frac{4}{2N+1}}\cdot\cos\left(\frac{\pi\cdot(2i+1)\cdot(2j+1)}{4N+2}\right)\qquad(2)$$

DST-VII变换的基本公式如公式(3)所示:

$$T_i(j)=\sqrt{\frac{4}{2N+1}}\cdot\sin\left(\frac{\pi\cdot(2i+1)\cdot(j+1)}{2N+1}\right)\qquad(3)$$
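公式(1)至公式(3)所定义的变换矩阵可以按定义直接构造,并验证其正交性。以下为一个示意性的Python实现(纯浮点实现,未做标准中的整数化近似,函数名为自行假设):

```python
import math

def dct2_matrix(N):
    """按公式(1)构造N点DCT-II变换矩阵,T[i][j]为第i个基向量在第j点的取值。"""
    T = []
    for i in range(N):
        w0 = math.sqrt(0.5) if i == 0 else 1.0  # 补偿系数,保证各基向量归一化
        T.append([w0 * math.sqrt(2.0 / N) *
                  math.cos(math.pi * i * (2 * j + 1) / (2 * N)) for j in range(N)])
    return T

def dct8_matrix(N):
    """按公式(2)构造N点DCT-VIII变换矩阵。"""
    return [[math.sqrt(4.0 / (2 * N + 1)) *
             math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * N + 2))
             for j in range(N)] for i in range(N)]

def dst7_matrix(N):
    """按公式(3)构造N点DST-VII变换矩阵。"""
    return [[math.sqrt(4.0 / (2 * N + 1)) *
             math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * N + 1))
             for j in range(N)] for i in range(N)]

def is_orthonormal(T, eps=1e-9):
    """校验变换矩阵的各行(基向量)两两正交且模为1。"""
    N = len(T)
    for a in range(N):
        for b in range(N):
            dot = sum(T[a][k] * T[b][k] for k in range(N))
            if abs(dot - (1.0 if a == b else 0.0)) > eps:
                return False
    return True
```

三种变换的基向量均两两正交且模为1,因此反变换只需与变换矩阵的转置相乘即可。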
由于图像都是2维的,而直接进行二维的变换运算量和内存开销都是当时硬件条件所不能接受的,因而在标准中使用上述DCT-II,DCT-VIII,DST-VII进行变换时,均是拆分成水平方向和竖直方向,进行两步一维变换。如先进行水平方向的变换再进行竖直方向的变换,或者先进行竖直方向的变换再进行水平方向的变换。
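上述“两步一维变换”的做法可以示意如下:先对每一行做一维变换(水平方向),再对每一列做一维变换(竖直方向)。以下Python片段以公式(1)的DCT-II为例(示意性实现,矩阵按定义构造,非标准中的整数化实现):

```python
import math

def dct2_matrix(N):
    """按公式(1)构造N点DCT-II变换矩阵。"""
    T = []
    for i in range(N):
        w0 = math.sqrt(0.5) if i == 0 else 1.0
        T.append([w0 * math.sqrt(2.0 / N) *
                  math.cos(math.pi * i * (2 * j + 1) / (2 * N)) for j in range(N)])
    return T

def transform_2d_separable(block):
    """对方形残差块先做水平方向、再做竖直方向的一维DCT-II。"""
    N = len(block)
    T = dct2_matrix(N)
    # 水平方向:对每一行做一维变换
    h = [[sum(T[u][j] * block[i][j] for j in range(N)) for u in range(N)]
         for i in range(N)]
    # 竖直方向:对每一列做一维变换
    return [[sum(T[v][i] * h[i][u] for i in range(N)) for u in range(N)]
            for v in range(N)]
```

对一个完全平坦的残差块,两步一维变换后能量全部集中在左上角的直流系数上,体现了上文所述“能量集中到左上角”的效果。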
上述变换方法对水平方向和竖直方向的纹理比较有效,对提升压缩效率非常有用。但是上述变换方法对斜向的纹理效果就会差一些,因此随着对压缩效率需求的不断提高,如果斜向的纹理能够更有效地处理,可以进一步提升压缩效率。
为了更有效地处理斜向纹理的残差,目前使用了二次变换,即在上述DCT-II,DCT-VIII,DST-VII等基础变换(Primary transform)之后,对频域信号进行第二次变换,将信号从一个变换域转换至另外一个变换域,之后再进行量化,熵编码等操作,其目的是进一步去除统计冗余。
低频不可分离变换(low frequency non-separable transform,简称LFNST)是一种缩减的二次变换。在编码端,LFNST用于基础变换(primary transform)之后量化之前。在解码端,LFNST用于反量化之后,反基础变换之前。
如图4所示,在编码端,LFNST对基础变换后的左上角的低频系数进行二次变换。基础变换通过对图像进行去相关性,把能量集中到左上角。而二次变换对基础变换的低频系数再去相关性。在编码端,16个系数输入到4x4的LFNST变换核,输出是8个系数;64个系数输入到8x8的LFNST变换核,输出是16个系数。在解码端,8个系数输入到4x4的反LFNST变换核,输出是16个系数;16个系数输入到8x8的反LFNST变换核,输出是64个系数。
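上述“16入8出/64入16出”的缩减二次变换,本质上是用一个行数小于列数的变换核矩阵与低频系数向量相乘;解码端的反二次变换则与该矩阵的转置相乘。以下为一个示意性Python片段(变换核取值为演示而假设,并非标准中的实际LFNST核):

```python
def secondary_transform(coeffs, kernel):
    """缩减二次变换:kernel为R×L矩阵(R<L),输入L个低频基础变换系数,输出R个系数。"""
    R, L = len(kernel), len(kernel[0])
    assert len(coeffs) == L
    return [sum(kernel[r][k] * coeffs[k] for k in range(L)) for r in range(R)]

def inverse_secondary_transform(sec_coeffs, kernel):
    """反二次变换:与变换核的转置相乘,将R个二次变换系数还原为L个基础变换系数。"""
    R, L = len(kernel), len(kernel[0])
    assert len(sec_coeffs) == R
    return [sum(kernel[r][k] * sec_coeffs[r] for r in range(R)) for k in range(L)]
```

例如4x4的LFNST核为8×16(16个系数输入、8个系数输出),8x8的LFNST核为16×64(64个系数输入、16个系数输出)。由于输出维度小于输入维度,缩减变换是一种有损的投影,只保留变换核所张成子空间内的信息。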
可选的,LFNST包括4类变换核,这4类变换核对应的基图像如图6所示,可以看出一些明显的斜向纹理。在一些实施例中,一类变换核也称为一组变换核。
目前,LFNST只应用于帧内编码的块。帧内预测使用当前块周边已重建的像素作为参考对当前块进行预测,由于目前视频都是从左向右从上向下编码的,因而当前块可使用的参考像素通常在左侧和上侧。
如图5所示,VVC有67种帧内预测模式,其中除0Planar,1DC外,有65种角度预测模式。Planar通常处理一些渐变的纹理,DC顾名思义通常处理一些平坦区域,而对于角度纹理比较明显的块通常会使用帧内角度预测。当然VVC中对非正方形的块还可以使用宽角度预测模式,宽角度预测模式使得预测的角度会比正方形的块的角度范围更大。如图6所示,2~66为正方形块的预测模式对应的角度,-1~-14以及67~80代表宽角度预测模式下扩展的角度。
角度预测按照指定的角度将参考像素平铺到当前块作为预测值,这意味着预测块会有明显的方向纹理,而当前块经过角度预测后的残差在统计上也会体现出明显的角度特性。因而LFNST所选用的变换核可以跟帧内预测模式进行绑定,即确定了帧内预测模式以后,LFNST只能使用帧内预测模式对应的一组(set)变换核或一类变换核,即在一些实施例中,一类变换核也称为一组变换核,一类变换核中包括至少两个变换核。
具体地,VVC中的LFNST总共有4类变换核,每类变换核至少包括2个变换核,帧内预测模式和变换核类的对应关系如表1所示:
表1
IntraPredMode | Tr.set index |
IntraPredMode<0 | 1 |
0<=IntraPredMode<=1 | 0 |
2<=IntraPredMode<=12 | 1 |
13<=IntraPredMode<=23 | 2 |
24<=IntraPredMode<=44 | 3 |
45<=IntraPredMode<=55 | 2 |
56<=IntraPredMode<=80 | 1 |
81<=IntraPredMode<=83 | 0 |
表1中,IntraPredMode表示帧内预测模式,Tr.set index表示一类变换核的索引。
需要注意的是,色度帧内预测使用的跨分量预测模式为81到83,亮度帧内预测并没有这几种模式。
LFNST的变换核可以通过转置来用一个变换核组对应处理更多的角度,例如表1中,13到23和45到55的模式都对应变换核类2,但是13到23明显是接近于水平的模式,而45到55明显是接近于竖直的模式,45到55的模式对应的变换和反变换后需要通过转置来进行匹配。
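表1的映射关系以及“接近竖直的模式需转置”的规则可以直接写成查表函数。以下为按表1实现的示意性Python片段;其中转置判断按“模式34右侧(接近竖直)的角度模式需要转置”这一常见规则作了简化假设,仅作示意:

```python
def lfnst_tr_set(intra_pred_mode):
    """按表1将帧内预测模式IntraPredMode映射到LFNST变换核类索引Tr.set index。"""
    if intra_pred_mode < 0:          # 宽角度扩展(负方向模式)
        return 1
    if intra_pred_mode <= 1:         # Planar(0)与DC(1)
        return 0
    if intra_pred_mode <= 12:
        return 1
    if intra_pred_mode <= 23:
        return 2
    if intra_pred_mode <= 44:
        return 3
    if intra_pred_mode <= 55:
        return 2
    if intra_pred_mode <= 80:
        return 1
    return 0                          # 81~83:色度跨分量预测模式

def lfnst_need_transpose(intra_pred_mode):
    """示意性规则(假设):模式34右侧的角度模式在变换前/反变换后需要转置。"""
    return 35 <= intra_pred_mode <= 80
```

例如模式18(接近水平)与模式50(接近竖直)同样映射到变换核类2,但只有模式50一侧需要通过转置来复用这类变换核。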
VVC的LFNST共有4类变换核。
在一些场景中,变换核的种类扩展到了更多,例如扩展到35类变换核,每类各3个变换核。其中,Planar模式对应第0类变换核,DC模式对应第1类变换核,角度2和角度66对应第2类变换核,角度3和角度65对应第3类变换核,……角度33和角度35对应第33类变换核,角度34对应第34类变换核。当角度大于等于角度35时,输入二次变换前的系数需要转置,同理对于输出也需要对二次反变换后的系数进行转置。
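上述扩展到35类变换核的映射同样可以写成简单函数。以下为按“角度x与角度68-x共用一类、角度大于等于35需转置”的描述实现的示意性Python片段(函数名为自行假设):

```python
def extended_tr_set(intra_pred_mode):
    """扩展方案:Planar→第0类,DC→第1类,角度x与角度68-x(x=2..33)共用一类,角度34单独一类。

    返回 (变换核类索引, 是否需要转置)。
    """
    if intra_pred_mode == 0:      # Planar
        return 0, False
    if intra_pred_mode == 1:      # DC
        return 1, False
    if intra_pred_mode <= 34:     # 角度2~34直接对应第2~34类
        return intra_pred_mode, False
    # 角度35~66:与对称角度68-x共用一类,变换前/反变换后需对系数转置
    return 68 - intra_pred_mode, True
```

例如角度66与角度2共用第2类变换核,但角度66一侧需要转置。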
由上述可知,通过将帧内预测模式和变换核的类别进行绑定,可以根据帧内预测模式指定使用哪一类变换核。这样做利用了帧内预测模式和变换核之间的相关性,从而减少了变换核的选择信息在码流中的传输,进而节约码字。而当前块是否会使用二次变换,以及如果使用二次变换,是使用一类变换核中的第一个还是第二个,是需要通过码流和一些条件来确定的。
ECM是进一步提高VVC性能的工具以及工具间的组合的参考软件,它基于VTM-10.0,集成EE采纳的工具和技术。
在ECM的帧内编码中,与VTM(VVC的参考软件测试平台)类似的,有传统的帧内预测,残差的变换等过程。与VVC不同的是在帧内预测环节中,采纳了两项用于导出帧内预测模式的技术,分别是解码器端帧内模式推导(Decoder-side Intra Mode Derivation,DIMD)和基于模板的帧内模式推导(Template-based Intra Mode Derivation,TIMD)。
DIMD和TIMD技术可在解码端导出帧内预测模式,从而省去编码帧内预测模式的索引,以达到节省码字的作用。
如图7A所示,DIMD以当前块周边已重建的像素值为模板,通过sobel算子在模板(Template)上的每个3x3区域上扫描,计算得到水平方向和竖直方向的梯度Dx和Dy。再根据Dx和Dy求得每个位置上的幅度值amp=Dx+Dy,以及角度值angular=arctan(Dy/Dx)。根据模板上每个位置的角度对应到传统的角度预测模式,累加相同角度模式的幅度值,得到如图7B所示的幅度值与角度模式的直方图。
将图7B中幅度值最高和次高的两种角度模式,以及planar模式的预测值通过加权可得到DIMD最终的预测结果。在现在的ECM中,加权计算的过程如公式(4)和(5)所示:

$$w0=\frac{21}{64},\quad w1=\frac{43}{64}\times\frac{amp1}{amp1+amp2},\quad w2=\frac{43}{64}\times\frac{amp2}{amp1+amp2}\qquad(4)$$

$$Pred=Pred_{planar}\times w0+Pred_{mode1}\times w1+Pred_{mode2}\times w2\qquad(5)$$

其中,w0、w1、w2分别是分配到planar模式、幅度值最高的角度模式和幅度值次高的角度模式的权重,$Pred_{planar}$为planar模式对应的预测值,$Pred_{mode1}$为幅度值最高的角度模式对应的预测值,$Pred_{mode2}$为幅度值次高的角度模式对应的预测值,Pred为DIMD对应的加权预测值,amp1为幅度值最高的角度模式的幅度值,amp2为幅度值次高的角度模式的幅度值。
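上述DIMD的加权过程可以用如下Python片段示意。其中预测值以标量代替预测块;planar的固定权重planar_share默认取21/64,这是按ECM中常见取值所做的假设,其余权重按幅度值比例分配:

```python
def dimd_blend(pred_planar, pred_mode1, pred_mode2, amp1, amp2,
               planar_share=21.0 / 64.0):
    """对planar模式与幅度值最高/次高的两种角度模式的预测值加权。

    planar_share为planar模式的固定权重(假设取值);
    剩余权重按幅度值amp1、amp2的比例分配给两种角度模式。
    """
    rest = 1.0 - planar_share
    w1 = rest * amp1 / (amp1 + amp2)   # 幅度值最高的角度模式权重
    w2 = rest * amp2 / (amp1 + amp2)   # 幅度值次高的角度模式权重
    w0 = planar_share
    return pred_planar * w0 + pred_mode1 * w1 + pred_mode2 * w2
```

由于w0+w1+w2=1,当三种模式的预测值相同时,加权结果与预测值一致。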
TIMD与DIMD基本相同,也是用周边已重建的像素值导出帧内预测模式。
如图8所示,TIMD通过当前块上方、左侧、左上、左下和右上五个块选中的帧内预测模式,导出最可能模式(Most Probable Mode,简称MPM)列表,并以参考模板(reference of the template)中的重建值作为参考行,分别对模板(template)区域做出预测结果。分别计算template的重建值与每种模式预测结果之间的绝对变换差之和(Sum of absolute transformed differences,简称SATD),SATD最小和次小的两种模式将作为当前块的预测模式进行预测并加权。
假设角度模式1为SATD最小的角度模式,其SATD为cost1,角度模式2为SATD次小的角度模式,其SATD为cost2。
当cost1×2>cost2时,加权的方法如公式(6)和(7)所示:

$$w1=\frac{cost2}{cost1+cost2},\quad w2=\frac{cost1}{cost1+cost2}\qquad(6)$$

$$Pred=Pred_{mode1}\times w1+Pred_{mode2}\times w2\qquad(7)$$

当cost1×2≤cost2时,则确定当前块不采用加权预测模式。

其中,$Pred_{mode1}$为SATD最小的角度模式对应的预测值,$Pred_{mode2}$为SATD次小的角度模式对应的预测值,Pred为TIMD对应的加权预测值,w1为SATD最小的角度模式对应的加权权重,w2为SATD次小的角度模式对应的加权权重。
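上述TIMD的判断与加权可以示意如下(Python片段,预测值以标量代替预测块,仅作示意):

```python
def timd_blend(pred_mode1, pred_mode2, cost1, cost2):
    """TIMD加权:cost1、cost2为SATD最小/次小的两种模式的代价(cost1<=cost2)。

    当cost1*2>cost2时按比例加权,否则不加权,直接使用cost最小的模式的预测值。
    """
    if cost1 * 2 > cost2:
        w1 = cost2 / (cost1 + cost2)   # 代价越小的模式,权重越大
        w2 = cost1 / (cost1 + cost2)
        return pred_mode1 * w1 + pred_mode2 * w2
    return pred_mode1                   # 不加权,退化为单模式预测
```

可以看到,权重与对方模式的代价正相关:mode1的权重由cost2决定,mode2的权重由cost1决定。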
在一些实施例中,在DIMD技术中,将幅度值最高的角度模式对应的二次变换的变换核,默认为当前块的目标变换核。在TIMD技术中,将SATD最小的角度模式对应的二次变换的变换核,默认为当前块的目标变换核。
但是,目前二次变换的变换核固定选用某个帧内预测模式对应的变换核,使得二次变换的变换核的选择不灵活,变换效果差,进而导致编解码整体性能不理想。
为了解决上述技术问题,本申请通过在预设的M类二次变换的变换核中,选择当前块对应的目标变换核,而不是固定选择某一模式对应的变换核,进而提高了变换核的选择灵活性,提升变换效果,提高了编解码整体性能。
需要说明的是,本申请实施例提供的视频编解码方法,除了可以应用于上述TIMD与DIMD技术中外,还可以应用于任意允许采用加权预测模式和二次变换模式的视频编解码场景中。
下面结合具体的实施例,对本申请实施例提供的视频编解码方法进行介绍。
首先结合图9,以解码端为例,对本申请实施例提供的视频解码方法进行介绍。
图9为本申请实施例提供的视频解码方法的一种流程示意图,图10为本申请实施例涉及的视频解码过程示意图。本申请实施例应用于图1所示的视频编解码系统和图3所示的视频解码器。如图9和图10所示,本申请实施例的方法包括:
S401、解码码流,得到当前块的第二变换系数。
在一些实施例中,当前块也可以称为当前解码块、当前解码单元、解码块、待解码块、待解码的当前块等。
在一些实施例中,当前块包括色度分量不包括亮度分量时,当前块可以称为色度块。
在一些实施例中,当前块包括亮度分量不包括色度分量时,当前块可以称为亮度块。
该第二变换系数为编码端对当前块的残差块经过二次变换形成的变换系数,具体是,编码端对当前块的残差块进行基础变换,得到基础变换系数,再对基础变换系数进行二次变换,得到当前块的第二变换系数。
在一些实施例中,基础变换也称为第一次变换或初始变换或初次变换等。
在一些实施例中,二次变换也称为第二次变换等。
在一些实施例中,基础变换系数也称为初始变换系数或初次变换系数或第一变换系数或第一次变换后的系数等。
在一些实施例中,第二变换系数也称为二次变换系数或二次变换后的系数等。
本实施例中,上述S401中解码端解码码流,得到当前块的第二变换系数的方式包括但不限于如下几种:
方式一,编码端在编码时对第二变换系数不进行量化,而是直接对第二变换系数进行编码,得到码流。这样,解码端解码码流,可以从码流中直接得到当前块的第二变换系数。
方式二,编码端在编码时,对第二变换系数进行量化,得到量化系数,再对量化系数进行编码,得到码流。这样,解码端解码码流,得到当前块的量化系数,对量化系数进行反量化,得到当前块的第二变换系数。
S402、若确定当前块的帧内预测模式允许采用加权预测模式,则从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,M为大于1的正整数。
本申请实施例的当前块的帧内预测模式的类型包括单预测模式(即非加权预测模式,例如角度预测模式或非角度预测模式),也可以包括加权预测模式。
其中,加权预测模式是指使用至少两种帧内预测模式分别对当前块进行预测,并对该至少两种预测模式对应的预测值进行加权,将加权结果作为当前块的预测值。例如,加权预测模式包括角度预测模式1和角度预测模式2,使用角度预测模式1对当前块进行预测,得到预测值1,使用角度预测模式2对当前块进行预测,得到预测值2,对预测值1和预测值2进行加权,将加权结果确定为当前块的预测值。
在一些实施例中,可以通过如下方式确定当前块是否允许采用加权预测模式:
方式一,码流中包括指示当前块是否允许采用加权预测模式的标志a,此时,解码端通过解码码流,得到该标志a,并根据该标志a的取值,判断当前块是否允许采用加权预测模式,例如,该标志a的取值为数值1(例如1)时,确定当前块允许采用加权预测模式,若该标志a的取值为数值2(例如0)时,确定当前块不允许采用加权预测模式。可选的,该标志a可以是序列级、帧级、宏块级或编码块级的标志。
方式二,在码流中不携带标志a,而是通过默认方式,来确定当前块是否允许采用加权预测模式,例如,若默认当前序列、或当前帧或当前宏块或当前块不允许采用加权预测模式时,则确定当前块不允许采用加权预测模式。若默认当前序列、或当前帧或当前宏块或当前块允许采用加权预测模式时,则确定当前块允许采用加权预测模式。
若确定当前块的帧内预测模式允许采用加权预测模式时,上述S402中从预设的M类二次变换的变换核中,选择当前块对应的目标变换核的实现方式包括但不限于如下几种:
方式一,码流中包括指示标志,根据指示标志来选择当前块对应的目标变换核,具体的,上述S402中从预设的M类二次变换的变换核中,选择当前块对应的目标变换核包括如下步骤S402-A1和S402-A2:
S402-A1、解码码流,得到第一标志,第一标志用于指示目标变换核为目标预测模式对应的变换核;
S402-A2、根据第一标志,将M类二次变换的变换核中目标预测模式对应的变换核,确定为目标变换核。
在该方式一中,编码端在确定出当前块对应的目标变换核,并使用该目标变换核对当前块的残差块进行二次变换后,在码流中写入第一标志,该第一标志用于指示目标变换核为目标预测模式对应的变换核。这样,解码端接收到码流后,通过解码码流得到第一标志,并根据该第一标志,将M类二次变换的变换核中目标预测模式对应的变换核确定为目标变换核。例如,第一标志指示当前块对应的二次变换的目标变换核为预测模式1对应的变换核,解码端从M类二次变换的变换核中,查询预测模式1对应的变换核1,将该变换核1作为当前块对应的目标变换核,并使用该目标变换核对当前块的第二变换系数进行反二次变换,得到当前块的基础变换系数。
本申请实施例对M的具体取值不做限制,例如M=4,或M=35。
在一些实施例中,上述目标预测模式属于当前块的帧内预测模式,例如当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式,例如包括两种角度模式和Planar模式,则目标预测模式为N种帧内预测模式中的一种模式。
在一些实施例中,上述目标预测模式不属于当前块的帧内预测模式,例如当前块的帧内预测模式为非加权预测模式,例如当前块的帧内预测模式为Planar模式,而目标预测模式为DC模式或角度模式等非Planar模式。
基于此,上述S402-A2中根据第一标志确定目标变换核时,还需要确定当前块的帧内预测模式,即上述S402-A2包括如下S402-A21至S402-A23的步骤:
S402-A21、确定当前块的帧内预测模式。
当前块的帧内预测模式包括加权预测模式和非加权预测模式。
若当前块的帧内预测模式为加权预测模式时,继续确定该加权预测模式所包括的N种帧内预测模式,其中N为大于1的正整数。例如,DIMD对应的加权预测模式包括幅度值最高和次高的两种角度模式以及planar模式,即N=3。TIMD加权预测模式包括SATD最小和次小的两种模式,即N=2。
若当前块的帧内预测模式为非加权预测模式,则当前块只包括一种帧内预测模式,例如从Planar模式,DC模式和65种角度模式中,选出率失真代价最小的模式作为该当前块的帧内预测模式。
S402-A22、根据当前块的帧内预测模式和第一标志,确定目标预测模式。
S402-A23、将M类二次变换的变换核中目标预测模式对应的变换核,确定为目标变换核。
本申请中根据当前块的帧内预测模式是否为加权预测模式,则基于第一标志确定目标预测模式的方式不同。
情况1,若当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,则上述S402-A22包括S402-A221:
S402-A221、将N种帧内预测模式中第一标志所指示的预测模式,确定为目标预测模式。
例如,当前块的加权预测模式包括幅度值最高和次高的两种角度模式以及planar模式,假设第一标志指示目标变换核为planar模式对应的变换核,则将planar模式确定为目标预测模式,并在M类二次变换的变换核中planar模式对应的变换核,确定为当前块对应的目标变换核。
在一种示例中,上述目标预测模式为N种帧内预测模式中的一种预测模式。
在另一种示例中,目标预测模式为P种帧内预测模式中的一种帧内预测模式,P种帧内预测模式为N种帧内预测模式中与当前块的二次变换的变换核选择相关的帧内预测模式,P为大于1且小于或等于N的正整数。例如,N种帧内预测模式包括模式1、模式2和模式3,但与当前块的二次变换的变换核选择相关的帧内预测模式为模式1和模式2,这样可以确定目标预测模式为模式1和模式2中的一种模式。
可选的,若当前块的二次变换的变换核的选择与角度预测模式相关,则P种帧内预测模式为N种帧内预测模式中的角度预测模式。以DIMD为例,DIMD对应的加权预测模式包括幅度值最高和次高的两种角度模式以及planar模式,P种帧内预测模式包括幅度值最高和次高的两种角度模式。
可选的,若当前块的二次变换的变换核的选择与当前块的N种帧内预测模式相关,则P种帧内预测模式为N种帧内预测模式。继续以DIMD为例,DIMD对应的加权预测模式包括幅度值最高和次高的两种角度模式以及planar模式,P种帧内预测模式包括幅度值最高和次高的两种角度模式以及planar模式。
情况2,若当前块的帧内预测模式为非加权预测模式,即当前块包括一种帧内预测模式,则上述S402-A22包括S402-A222:
S402-A222、根据第一标志的取值,确定目标预测模式。
在该情况2中,若当前块没有选中加权模式时,例如在ECM中,如果template导出的幅度值直方图里每个角度的幅度值都是0,那么DIMD不加权,直接使用planar的预测结果。此时当前块使用一种预测模式,例如planar预测模式进行预测。此时为了增加目标变换核的选择灵活性,则通过为第一标志赋不同的值,来指示解码端可以选择除planar预测对应的变换核之外的变换核。
基于此,上述S402-A222的实现方式包括但不限于如下几种方式:
方式一,若第一标志的值为第一数值时,则确定目标预测模式为当前块的帧内预测模式。
例如,若所述当前块的帧内预测模式为Planar模式,且第一标志的值为第一数值时,则确定目标预测模式为Planar模式,进而将Planar模式对应的变换核作为当前块对应的目标变换核。
方式二,若第一标志的值为第二数值时,则确定目标预测模式为第二帧内预测模式,第二帧内预测模式与当前块的帧内预测模式不同。
在该方式二中,对第二帧内预测模式的具体类型不做限制,只有与当前块的帧内预测模式不同即可。
示例性的,若当前块的帧内预测模式为Planar模式时,则第二帧内预测模式为DC模式,该DC模式在预测时的选中概率仅次于Planar模式。将该第二帧内预测模式对应的变换核作为当前块对应的目标变换核。
示例性的,若当前块的帧内预测模式为非Planar模式时,则第二帧内预测模式为Planar模式。例如,当前块的帧内预测模式为DC模式时,第二帧内预测模式为Planar模式。将该第二帧内预测模式对应的变换核作为当前块对应的目标变换核。
在一些实施例中,在执行S402-A1解码码流,得到第一标志之前,解码端先判断目标变换核的选择是否采用本申请实施例提供的方式。具体是,解码端确定当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式,接着,执行预测过程,即确定N种帧内预测模式中每一种帧内预测模式对应的预测值,并确定N种帧内预测模式对应的预测值进行加权时的加权权重。若N种帧内预测模式对应的最小加权权重与最大加权权重之间的比例大于预设值,执行上述S402-A1解码码流,得到第一标志。
例如对于TIMD,假设mode1和mode2为cost最小的两种模式,它们的cost值分别为cost1和cost2,由上述公式(6)和公式(7)可知,mode1的加权权重与cost2正相关,mode2的加权权重与cost1正相关,当cost1/cost2>α时(α为预设值,且为接近1的正数,例如0.8等),解码端的变换核选择第一标志指示的目标变换核,进而执行上述S402-A1解码码流,得到第一标志。这是因为当cost1远小于cost2时,两者加权时的权重相差巨大,导致预测结果几乎与mode1的预测结果相同,此时没有必要进一步考虑mode2对应的变换核。
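上述“权重比例门限”的判断可以示意如下(Python片段,α的取值仅为示例,函数名为自行假设):

```python
def should_parse_kernel_flag(cost1, cost2, alpha=0.8):
    """当两种模式的代价足够接近(cost1/cost2>α)时,才解析/编码变换核选择标志。

    cost1为最小代价,cost2为次小代价;两者悬殊时mode2的加权权重可忽略,
    无需再为其变换核传输选择信息。
    """
    return cost1 / cost2 > alpha
```

例如cost1=9、cost2=10时比例为0.9,超过门限,需要解析第一标志;而cost1=1、cost2=10时则无需解析。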
上述S402中从预设的M类二次变换的变换核中,选择当前块对应的目标变换核的方式除了上述方式一所述的根据第一标志,确定目标变换核外,还可以通过如下方式二和方式三实现。
方式二,若当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,从预设的M类二次变换的变换核中,确定N种帧内预测模式对应的二次变换的变换核;确定使用N种帧内预测模式对应的二次变换的变换核进行解码时的率失真代价;将率失真代价最小的二次变换的变换核,确定为当前块对应的目标变换核。
方式三,若当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,确定N种帧内预测模式对应的率失真代价;在M类二次变换的变换核中,查询率失真代价最小的帧内预测模式对应的第一变换核;将第一变换核确定为当前块对应的目标变换核。
需要说明的是,上述S402中从预设的M类二次变换的变换核中,选择当前块对应的目标变换核的方式包括但不限于上述三种方式。
由上述可知,本申请实施例解码端是从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,而不是固定使用某一种模式对应的变换核,进而提高了变换核的选择灵活性,提升变换效果,提高了编解码整体性能。
S403、使用目标变换核对第二变换系数进行反二次变换,得到当前块的基础变换系数。
根据上述方式,确定出当前块对应的目标变换核后,使用该目标变换核对当前块的第二变换系数进行反二次变换,得到当前块的基础变换系数。
本申请实施例对根据二次变换的变换核,对第二变换系数进行反二次变换的方式不做限制。
例如,根据如下公式(8),得到当前块的反二次变换后的系数:

$$F'=T^{\mathrm{T}}\times F_2\qquad(8)$$

其中,$F_2$为按扫描顺序排成向量的第二变换系数,$T$为二次变换的目标变换核,$T^{\mathrm{T}}$为其转置,$F'$为反二次变换后的系数。根据上述公式(8)可以基于二次变换的变换核,对第二变换系数进行反二次变换,得到反二次变换后的系数。
需要说明的是,上述公式(8)只是一种示例,上述S403的实现方式包括但不限于上述公式(8)所示。
由上述表1可知,有些角度模式对应的变换和反变换后不进行转置,有些需要转置,例如,13到23和45到55的模式都对应变换核类2,但是13到23明显是接近于水平的模式,而45到55明显是接近于竖直的模式,45到55的模式对应的变换和反变换后需要通过转置来进行匹配。基于此,解码端使用目标变换核对第二变换系数进行反二次变换后,需要根据目标预测模式进一步确定是否对反二次变换后的系数进行转置。其中,解码端判断是否对反二次变换后的系数进行转置的方式包括但不限于如下几种:
方式一,解码端根据表1,判断目标预测模式对应的反二次变换后的系数是否需要进行转置。例如,若目标预测模式为45到55的模式中的一种,则确定需要对反二次变换后的系数进行转置。
方式二,若当前块的帧内预测模式为加权预测模式,则可以根据第一标志判断是否对反二次变换后的系数进行转置。
在该方式二中,解码端首先确定P种帧内预测模式对应的二次变换的变换核类别,并根据P种帧内预测模式对应的二次变换的变换核类别,判断第一标志的指示信息。
例如,若P种帧内预测模式对应的二次变换的变换核类别不相同,则第一标志用于指示目标变换核为目标预测模式对应的变换核。例如,P种帧内预测模式包括模式1和模式2,目标预测模式为模式1,则第一标志指示目标变换核为模式1对应的变换核。可选的,第一标志的具体取值可以是目标预测模式的模式索引。
再例如,若P种帧内预测模式对应的二次变换的变换核类别相同,则第一标志除了指示目标变换核为目标预测模式对应的变换核外,还用于指示是否对反二次变换后的变换系数进行转置。例如LFNST中,P种帧内预测模式包括角度模式2和角度模式66,它们都对应着第3类的变换核,但不同的是变换前和反变换后系数是否需要转置,那么此时第一标志起到的作用不是切换变换核的类,而是指示是否需要转置。
基于上述方式二,则S403包括如下S403-A:
S403-A、根据P种帧内预测模式对应的二次变换的变换核类别,以及第一标志,使用目标变换核对第二变换系数进行反二次变换,得到当前块的基础变换系数。
示例1,若确定P种帧内预测模式对应的变换核类别不相同,则使用第一标志指示的目标变换核,对第二变换系数进行反二次变换,并将反二次变换后的变换系数确定为当前块的基础变换系数。
示例2,若确定P种帧内预测模式对应的变换核类别相同,且第一标志指示对反二次变换后的变换系数进行转置,则使用目标变换核对第二变换系数进行反二次变换,并对反二次变换后的变换系数进行转置,将转置后的变换系数确定为当前块的基础变换系数。
示例3,若确定P种帧内预测模式对应的变换核类别相同,且第一标志指示对反二次变换后的变换系数不进行转置,则使用目标变换核对第二变换系数进行反二次变换,并将反二次变换后的变换系数确定为当前块的基础变换系数。
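上述示例1至示例3的判断逻辑可以概括为如下Python片段(反二次变换与转置操作以可调用参数抽象,属示意性实现):

```python
def decode_base_coeffs(sec_coeffs, kernel, same_kernel_class, flag_transpose,
                       inv_secondary, transpose):
    """按示例1~3,由第二变换系数得到当前块的基础变换系数。

    same_kernel_class: P种帧内预测模式对应的变换核类别是否相同;
    flag_transpose: 类别相同时,第一标志指示是否需要转置;
    inv_secondary/transpose: 反二次变换与转置操作(由调用方提供)。
    """
    base = inv_secondary(sec_coeffs, kernel)
    # 仅当变换核类别相同时,第一标志才承担“是否转置”的语义
    if same_kernel_class and flag_transpose:
        base = transpose(base)
    return base
```

类别不相同时(示例1),第一标志只用于切换变换核,反二次变换后的系数即为基础变换系数。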
根据上述方式,使用目标变换核对第二变换系数进行反二次变换,得到当前块的基础变换系数后,执行如下S404的步骤。
S404、对基础变换系数进行反基础变换,得到当前块的残差块。
如图10所示,使用目标变换核对第二变换系数进行反二次变换,得到当前块的基础变换系数,再对基础变换系数进行反基础变换,得到当前块的残差块。
具体的,根据基础变换所采用的方式,对基础变换系数进行反基础变换,得到当前块的残差块。
在一种示例中,若编码端采用上述公式(1)所示的DCT-II变换方式对当前块的残差块进行基础变换,则解码端采用上述公式(1)所示的DCT-II变换方式,对上述基础变换系数进行反基础变换,得到当前块的残差块。
在另一种示例中,若编码端采用上述公式(2)所示的DCT-VIII变换方式对当前块的残差块进行基础变换,则解码端采用上述公式(2)所示的DCT-VIII变换方式,对上述基础变换系数进行反基础变换,得到当前块的残差块。
在另一种示例中,若编码端采用上述公式(3)所示的DST-VII变换方式对当前块的残差块进行基础变换,则解码端采用上述公式(3)所示的DST-VII变换方式,对上述基础变换系数进行反基础变换,得到当前块的残差块。
在一些实施例中,根据当前块的帧内预测模式对当前块进行帧内预测,得到当前块的预测块。例如,若当前块的帧内预测模式为加权预测模式,则使用加权预测模式包括的N种帧内预测模式进行加权预测,即将N种帧内预测模式对应的预测值进行加权,得到当前块的预测块。
根据上述步骤,得到当前块的残差块和预测块后,将预测块与残差块相加,得到当前块的重建块。
本申请实施例的解码方法,通过解码码流,得到当前块的第二变换系数,第二变换系数为编码端对当前块的残差块经过二次变换形成的变换系数;若确定当前块的帧内预测模式允许采用加权预测模式时,从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,M为大于1的正整数;使用目标变换核对第二变换系数进行反二次变换,得到当前块的基础变换系数;对基础变换系数进行反基础变换,得到当前块的残差块。即本申请解码端从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,而不是固定使用某一种模式对应的变换核,进而提高了变换核的选择灵活性,提升变换效果,提高了解码整体性能。
图11为本申请实施例提供的视频解码方法的另一流程示意图,图12为本申请实施例涉及的解码过程示意图,图11为本申请实施例提供涉及的一种详细的解码过程。如图11和图12所示,包括:
S501、解码码流,得到当前块的量化系数。
若编码端对二次变换变换的第二变换系数进行量化,形成量化系数,并对量化系数进行编码,形成码流。对应的,解码端接收到码流后,解码码流,得到当前块的量化系数。
S502、对量化系数进行反量化,得到当前块的第二变换系数。
具体的,确定量化方式,并使用确定的量化方式,对量化系数进行反量化,得到当前块的第二变换系数。
解码端确定量化方式的方式可以是:
方式一,若码流中包括量化方式的指示信息,这样解码端通过解码码流,得到量化方式的指示信息,根据该指示信息,确定当前块的量化方式。
方式二,解码端采用默认的量化方式。
方式三,解码端采用与编码端相同的方式,确定当前块的量化方式。
S503、若确定当前块的帧内预测模式允许采用加权预测模式,则导出当前块的帧内预测模式,并使用当前块的帧内预测模式对当前块进行帧内预测,得到当前块的预测块。
当前块的帧内预测模式包括加权预测模式和非加权预测模式。
若当前块的帧内预测模式为加权预测模式时,则导出该加权预测模式所包括的N种帧内预测模式,其中N为大于1的正整数。例如,DIMD对应的加权预测模式包括幅度值最高和次高的两种角度模式以及planar模式,即N=3。TIMD加权预测模式包括SATD最小和次小的两种模式,即N=2。
接着,使用上述N种帧内预测模式中的每一个帧内预测模式对当前块进行预测,得到每一种帧内预测模式对应的预测值。对每一种帧内预测模式对应的预测值进行加权,得到当前块的预测块。例如,对于DIMD,采用上述公式(4)和(5)进行预测值加权,得到当前块的预测块。例如,对于TIMD,采用上述公式(6)和(7)进行预测值加权,得到当前块的预测块。
S504、从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,M为大于1的正整数。
上述S504的实现过程与上述S402的实现过程基本一致,参照上述S402的描述,在此不再赘述。
S505、使用目标变换核对第二变换系数进行反二次变换,得到当前块的基础变换系数。
S506、对基础变换系数进行反基础变换,得到当前块的残差块。
上述S505和S506的实现过程与上述S403和S404的实现过程一致,参照上述S403和S404的描述。
S507、根据当前块的预测块和残差块,得到当前块的重建块。
例如,当前块的预测块和残差块相加,得到当前块的重建块。
本申请实施例的解码方法,通过解码码流,得到当前块的量化系数;对量化系数进行反量化,得到当前块的第二变换系数;若确定当前块的帧内预测模式允许采用加权预测模式,则导出当前块的帧内预测模式,并使用当前块的帧内预测模式对当前块进行帧内预测,得到当前块的预测块;若确定当前块允许采用加权预测模式时,从预设的M类二次变换的变换核中,选择当前块对应的目标变换核;使用目标变换核对第二变换系数进行反二次变换,得到当前块的基础变换系数;对基础变换系数进行反基础变换,得到当前块的残差块;根据当前块的预测块和残差块,得到当前块的重建块。即本申请解码端从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,而不是固定使用某一种模式对应的变换核,进而提高了变换核的选择灵活性,提升变换效果,提高了解码整体性能。
上文对本申请实施例的解码方法进行介绍,在此基础上,下面对本申请实施例提供的编码方法进行介绍。
图13为本申请实施例提供的视频编码方法的一种流程示意图,图14为本申请实施例涉及的视频编码过程的示意图。本申请实施例应用于图1和图2所示视频编码器。如图13和图14所示,本申请实施例的方法包括:
S601、若确定当前块的帧内预测模式允许采用加权预测模式时,从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,M为大于1的正整数。
在视频编码过程中,视频编码器接收视频流,该视频流由一系列图像帧组成,针对视频流中的每一帧图像进行视频编码,视频编码器对图像帧进行块划分,得到当前块。
在一些实施例中,当前块也称为当前编码块、当前图像块、编码块、当前编码单元、当前待编码块、当前待编码的图像块等。
在块划分时,传统方法划分后的块既包含了当前块位置的色度分量,又包含了当前块位置的亮度分量。而分离树技术(dual tree)可以划分单独分量块,例如单独的亮度块和单独的色度块,其中亮度块可以理解为只包含当前块位置的亮度分量,色度块理解为只包含当前块位置的色度分量。这样相同位置的亮度分量和色度分量可以属于不同的块,划分可以有更大的灵活性。如果分离树用在CU划分中,那么有的CU既包含亮度分量又包含色度分量,有的CU只包含亮度分量,有的CU只包含色度分量。
在一些实施例中,本申请实施例的当前块只包括色度分量,可以理解为色度块。
在一些实施例中,本申请实施例的当前块只包括亮度分量,可以理解为亮度块。
在一些实施例中,该当前块即包括亮度分量又包括色度分量。
本申请实施例的当前块的预测模式包括单预测模式(即非加权预测模式),也可以包括加权预测模式。
在一些实施例中,可以通过如下方式确定当前块的帧内预测模式是否允许采用加权预测模式:
方式一,码流中包括指示当前块是否允许采用加权预测模式的标志a,此时,解码端通过解码码流,得到该标志a,并根据该标志a的取值,判断当前块是否允许采用加权预测模式,例如,该标志a的取值为数值1(例如1)时,确定当前块允许采用加权预测模式,若该标志a的取值为数值2(例如0)时,确定当前块不允许采用加权预测模式。可选的,该标志a可以是序列级、帧级、宏块级或编码块级的标志。
方式二,在码流中不携带标志a,而是通过默认方式,来确定当前块是否允许采用加权预测模式,例如,若默认当前序列、或当前帧或当前宏块或当前块不允许采用加权预测模式时,则确定当前块不允许采用加权预测模式。若默认当前序列、或当前帧或当前宏块或当前块允许采用加权预测模式时,则确定当前块允许采用加权预测模式。
在一些实施例中,将M类二次变换的变换核中,默认的某一个变换核确定为当前块对应的目标变换核。
在一些实施例中,上述S601包括如下步骤:
S601-A1、确定当前块的帧内预测模式。
S601-A2、根据当前块的帧内预测模式,从预设的M类二次变换的变换核中,选择当前块对应的目标变换核。
其中,S601-A1的实现方式包括但不限于如下几种:
方式一,对于预设的K种帧内预测模式和预设的加权预测模式中的每一种帧内预测模式,确定每一种帧内预测模式对应的编码代价(例如率失真代价)。将这K种帧内预测模式和预设的加权预测模式中,编码代价最小的预测模式,确定为当前块的帧内预测模式。
方式二,首先,从预设的K种帧内预测模式,粗筛选出代价最小的Q种帧内预测模式,K、Q均为正整数;接着,从Q个帧内预测模式和预设的加权预测模式中,细筛选出代价最小的帧内预测模式,作为当前块的帧内预测模式。
具体的,对于K种帧内预测模式中的每一种帧内预测模式,确定使用该帧内预测模式对当前块进行编码时,当前块的预测值;计算该帧内预测模式对应的预测值与当前块的原始值之间的失真D1,同时计算编码该帧内预测模式的标志位(flag)时所消耗的比特数R1。根据该失真D1和比特数R1,确定该帧内预测模式对应的第一率失真代价J1,例如,J1=D1+R1。根据上述方式,可以确定出K种帧内预测模式中,每一个帧内预测模式对应的第一率失真代价J1。最后,从K个帧内预测模式中,粗筛选出代价最小的Q种帧内预测模式。
接着,从Q个帧内预测模式和预设的加权预测模式中,细筛选出代价最小的帧内预测模式,作为当前块的帧内预测模式。具体是,针对Q个帧内预测模式和预设的加权预测模式中的每一种帧内预测模式,确定使用该帧内预测模式对当前块进行编码时,当前块的重建值;计算该重建值与当前块的原始值之间的失真D2,以及统计使用该帧内预测模式对当前块进行编码时所消耗的比特数R2,根据失真D2和比特数R2,确定该帧内预测模式对应的第二率失真代价J2,例如,J2=D2+R2。根据上述方式,可以确定出Q个帧内预测模式和预设的加权预测模式中,每个帧内预测模式对应的第二率失真代价J2。最后,将Q个帧内预测模式和预设的加权预测模式中,第二率失真代价J2最小的帧内预测模式,确定为当前块的帧内预测模式。
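上述“粗筛+细筛”的两阶段率失真选择过程可以示意如下(Python片段,粗/细代价函数以可调用参数抽象,J=D+R的形式与上文一致,属示意性实现):

```python
def select_intra_mode(modes, weighted_mode, rough_cost, fine_cost, Q):
    """两阶段帧内模式选择。

    第一阶段:按粗代价J1=D1+R1从候选模式中筛出代价最小的Q种;
    第二阶段:在这Q种模式与加权预测模式中,按细代价J2=D2+R2选出最优模式。
    rough_cost/fine_cost: 接收模式、返回代价的可调用对象。
    """
    shortlist = sorted(modes, key=rough_cost)[:Q]   # 粗筛
    candidates = shortlist + [weighted_mode]        # 加入加权预测模式参与细筛
    return min(candidates, key=fine_cost)           # 细筛
```

粗筛只计算预测失真与模式标志位开销,代价低;细筛才做完整的重建与码率统计,因此只对少量候选执行。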
需要说明的是,若确定该当前块的帧内预测模式为加权预测模式时,则当前块的帧内预测模式包括N种帧内预测模式,N为大于1的正整数。若确定该当前块的帧内预测模式为非加权预测模式时,则当前块的帧内预测模式为一种帧内预测模式。
根据上述步骤,确定出当前块的帧内预测模式后,执行S601-A2,即根据当前块的帧内预测模式,从预设的M类二次变换的变换核中,选择当前块对应的目标变换核。
上述S601-A2的实现方式包括但不限于如下几种:
方式一,若当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,则从预设的M类二次变换的变换核中,确定N种帧内预测模式对应的二次变换的变换核;确定使用N种帧内预测模式对应的二次变换的变换核进行解码时的率失真代价;将率失真代价最小的二次变换的变换核,确定为当前块对应的目标变换核。
方式二,若当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,则确定N种帧内预测模式对应的率失真代价;在M类二次变换的变换核中,查询率失真代价最小的帧内预测模式对应的第一变换核;将第一变换核确定为当前块对应的目标变换核。
方式三,若所述当前块的帧内预测模式为非加权预测模式,将所述M类二次变换的变换核中所述当前块的帧内预测模式对应的变换核,确定为所述当前块对应的目标变换核。
方式四,若所述当前块的帧内预测模式为非加权预测模式,将所述M类二次变换的变换核中第二帧内预测模式对应的变换核,确定为所述当前块对应的目标变换核。
其中所述第二帧内预测模式与所述当前块的帧内预测模式不同。
例如,若所述当前块的帧内预测模式为平面Planar模式时,则所述第二帧内预测模式为DC模式。
再例如,若所述当前块的帧内预测模式为非Planar模式时,则所述第二帧内预测模式为Planar模式。
S602、对当前块进行编码,得到当前块的基础变换系数。
具体是,使用上述步骤确定的当前块的帧内预测模式,对当前块进行预测,得到当前块的预测块,接着,根据当前块的预测块和当前块,得到当前块的残差块。例如,当前块的像素值减去预测块的像素值,得到当前块的残差块。
接着,对残差块进行基础变换,得到当前块的基础变换系数。
在一种示例中,若编码端采用上述公式(1)所示的DCT-II变换方式对当前块的残差块进行基础变换,得到当前块的基础变换系数。
在另一种示例中,若编码端采用上述公式(2)所示的DCT-VIII变换方式对当前块的残差块进行基础变换,得到当前块的基础变换系数。
在另一种示例中,若编码端采用上述公式(3)所示的DST-VII变换方式对当前块的残差块进行基础变换,得到当前块的基础变换系数。
S603、使用目标变换核对基础变换系数进行二次变换,得到当前块的第二变换系数。
本申请实施例对使用目标变换核对基础变换系数进行二次变换,得到当前块的第二变换系数的方式不做限制。
例如,根据上述公式(8),使用目标变换核,对基础变换系数进行二次变换,得到当前块的第二变换系数,即将目标变换核与基础变换系数的乘积,作为当前块的第二变换系数。
S604、根据当前块的第二变换系数,得到码流。
在一种示例中,对当前块的第二变换系数不进行量化,直接进行编码,得到码流。
在一种示例中,对当前块的第二变换系数进行量化,得到量化系数,对量化系数进行编码,得到码流。
在一些实施例中,若当前块对应的二次变换的变换核为一类变换核时,则编码端可以向解码端指示编码端具体使用了该一类变换核中的哪一个变换核,可选的,该指示信息可以携带在码流中。
在一些实施例中,编码端还可以在码流中写入第一标志,该第一标志用于指示目标变换核为目标预测模式对应的变换核。
在一些实施例中,编码端在码流中写入第一标志之前,先确定N种帧内预测模式对应的预测值进行加权时的加权权重。若N种帧内预测模式对应的最小加权权重与最大加权权重之间的比例大于预设值,则在码流中写入第一标志。
例如对于TIMD,假设mode1和mode2为cost最小的两种模式,它们的cost值分别为cost1和cost2,由上述公式(6)和公式(7)可知,mode1的加权权重与cost2正相关,mode2的加权权重与cost1正相关,当cost1/cost2>α时(α为预设值,且为接近1的正数,例如0.8等),编码端在码流中写入第一标志,解码端进而可以解码码流得到第一标志,并选择第一标志指示的目标变换核。这是因为当cost1远小于cost2时,两者加权时的权重相差巨大,导致预测结果几乎与mode1的预测结果相同,此时没有必要进一步考虑mode2对应的变换核。
在一些实施例中,若当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,目标预测模式为P种帧内预测模式中的一种帧内预测模式,P种帧内预测模式为N种帧内预测模式中与当前块的二次变换的变换核选择相关的帧内预测模式,P为大于1且小于或等于N的正整数。
在一些实施例中,若当前块的二次变换的变换核的选择与角度预测模式相关,则P种帧内预测模式为N种帧内预测模式中的角度预测模式;或者,
若当前块的二次变换的变换核的选择与当前块的N种帧内预测模式相关,则P种帧内预测模式为N种帧内预测模式。
在一些实施例中,若P种帧内预测模式对应的二次变换的变换核类别相同,则第一标志还用于指示是否对反二次变换后的变换系数进行转置。
在一些实施例中,若当前块的帧内预测模式为非加权预测模式,则方法还包括:根据目标预测模式,确定第一标志的取值。
例如,若当前块对应的目标变换核为当前块的帧内预测模式对应的变换核,则第一标志的取值为第一数值。
再例如,若当前块对应的目标变换核为第二帧内预测模式对应的变换核,则第一标志的取值为第二数值。
可选的,若当前块的帧内预测模式为平面Planar模式时,则第二帧内预测模式为DC模式。
可选的,若当前块的帧内预测模式为非Planar模式时,则第二帧内预测模式为Planar模式。
本申请实施例的编码方法,编码端若确定当前块的帧内预测模式允许采用加权预测模式时,从预设的M类二次变换的变换核中,选择所述当前块对应的目标变换核,所述M为大于1的正整数;对所述当前块进行编码,得到所述当前块的基础变换系数;使用所述目标变换核对所述基础变换系数进行二次变换,得到所述当前块的第二变换系数;根据所述当前块的第二变换系数,得到码流。即本申请编码端从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,而不是固定使用某一种模式对应的变换核,进而提高了变换核的选择灵活性,提升变换效果,提高了编码整体性能。
图15为本申请实施例提供的视频编码方法的一种流程示意图,图16为本申请实施例涉及的视频编码过程的示意图。本申请实施例应用于图1和图2所示视频编码器。如图15和图16所示,本申请实施例的方法包括:
S701、若确定当前块的帧内预测模式允许采用加权预测模式,则确定当前块的帧内预测模式。
上述S701的实现方式与上述S601-A1的实现方式相同,参照上述S601-A1的描述,在此不再赘述。
S702、根据当前块的帧内预测模式,从预设的M类二次变换的变换核中,选择当前块对应的目标变换核。
上述S702的实现方式与上述S601-A2的实现方式相同,参照上述S601-A2的描述,在此不再赘述。
S703、使用当前块的帧内预测模式对当前块进行帧内预测,得到当前块的预测块。
需要说明的是,上述S703与上述S702在实现时没有先后顺序,即上述S703可以在上述S702之前执行,也可以在上述S702之后执行,或者与上述S702同步执行,本申请对此不做限制。
S704、根据当前块的预测块和当前块,得到当前块的残差块。
例如,当前块的像素值减去预测块的像素值,得到当前块的残差块。
S705、对残差块进行基础变换,得到当前块的基础变换系数。
上述S705的实现过程与上述S602一致,参照上述S602的描述,在此不再赘述。
S706、使用目标变换核对基础变换系数进行二次变换,得到当前块的第二变换系数。
上述S706的实现过程与上述S603一致,参照上述S603的描述,在此不再赘述。
S707、对当前块的第二变换系数进行量化,得到量化系数。
S708、对量化系数进行编码,得到码流。
上述S707和S708的实现过程与上述S604一致,参照上述S604的描述,在此不再赘述。
本申请实施例提供的视频编码方法,编码端若确定当前块的帧内预测模式允许采用加权预测模式时,确定当前块的帧内预测模式;根据当前块的帧内预测模式,从预设的M类二次变换的变换核中,选择当前块对应的目标变换核;使用当前块的帧内预测模式对当前块进行帧内预测,得到当前块的预测块;根据当前块的预测块和当前块,得到当前块的残差块;对残差块进行基础变换,得到当前块的基础变换系数;使用目标变换核对基础变换系数进行二次变换,得到当前块的第二变换系数;对当前块的第二变换系数进行量化,得到量化系数;对量化系数进行编码,得到码流。即本申请编码端从预设的M类二次变换的变换核中,选择当前块对应的目标变换核,而不是固定使用某一种模式对应的变换核,进而提高了变换核的选择灵活性,提升变换效果,提高了编码整体性能。
应理解,图9至图16仅为本申请的示例,不应理解为对本申请的限制。
以上结合附图详细描述了本申请的优选实施方式,但是,本申请并不限于上述实施方式中的具体细节,在本申请的技术构思范围内,可以对本申请的技术方案进行多种简单变型,这些简单变型均属于本申请的保护范围。例如,在上述具体实施方式中所描述的各个具体技术特征,在不矛盾的情况下,可以通过任何合适的方式进行组合,为了避免不必要的重复,本申请对各种可能的组合方式不再另行说明。又例如,本申请的各种不同的实施方式之间也可以进行任意组合,只要其不违背本申请的思想,其同样应当视为本申请所公开的内容。
还应理解,在本申请的各种方法实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。另外,本申请实施例中,术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系。具体地,A和/或B可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
上文结合图9至图16,详细描述了本申请的方法实施例,下文结合图17至图19,详细描述本申请的装置实施例。
图17是本申请一实施例提供的视频解码器的示意性框图。
如图17所示,视频解码器10包括:
解码单元11,用于解码码流,得到当前块的第二变换系数,所述第二变换系数为编码端对所述当前块的残差块经过二次变换形成的变换系数;
选择单元12,用于若确定所述当前块的帧内预测模式允许采用加权预测模式时,从预设的M类二次变换的变换核中,选择所述当前块对应的目标变换核,所述M为大于1的正整数;
变换单元13,用于使用所述目标变换核对所述第二变换系数进行反二次变换,得到所述当前块的基础变换系数;并对所述基础变换系数进行反基础变换,得到所述当前块的残差块。
在一些实施例中,选择单元12,具体用于解码码流,得到第一标志,所述第一标志用于指示所述目标变换核为目标预测模式对应的变换核;根据所述第一标志,将所述M类二次变换的变换核中所述目标预测模式对应的变换核,确定为所述目标变换核。
在一些实施例中,选择单元12,具体用于确定所述当前块的帧内预测模式;根据所述当前块的帧内预测模式和所述第一标志,确定所述目标预测模式;将所述M类二次变换的变换核中所述目标预测模式对应的变换核,确定为所述目标变换核。
在一些实施例中,选择单元12,具体用于若所述当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,则将所述N种帧内预测模式中所述第一标志所指示的预测模式,确定为所述目标预测模式,所述N为大于1的正整数。
在一些实施例中,所述目标预测模式为P种帧内预测模式中的一种帧内预测模式,所述P种帧内预测模式为所述N种帧内预测模式中与所述当前块的二次变换的变换核选择相关的帧内预测模式,所述P为大于1且小于或等于N的正整数。
在一些实施例中,若所述当前块的二次变换的变换核的选择与角度预测模式相关,则所述P种帧内预测模式为所述N种帧内预测模式中的角度预测模式;或者,
若所述当前块的二次变换的变换核的选择与所述当前块的N种帧内预测模式相关,则所述P种帧内预测模式为所述N种帧内预测模式。
在一些实施例中,若所述当前块的帧内预测模式为加权预测模式,解码单元11,还用于确定所述P种帧内预测模式对应的二次变换的变换核类别,若所述P种帧内预测模式对应的二次变换的变换核类别相同,则所述第一标志还用于指示是否对反二次变换后的变换系数进行转置;
上述变换单元13,具体用于根据所述P种帧内预测模式对应的二次变换的变换核类别,以及所述第一标志,使用所述目标变换核对所述第二变换系数进行反二次变换,得到所述当前块的基础变换系数。
在一些实施例中,上述变换单元13,具体用于若确定所述P种帧内预测模式对应的变换核类别不相同,则使用所述第一标志指示的所述目标变换核,对所述第二变换系数进行反二次变换,并将反二次变换后的变换系数确定为所述当前块的基础变换系数;
若确定所述P种帧内预测模式对应的变换核类别相同,且所述第一标志指示对反二次变换后的变换系数进行转置,则使用所述目标变换核对所述第二变换系数进行反二次变换,并对反二次变换后的变换系数进行转置,将转置后的变换系数确定为所述当前块的基础变换系数;
若确定所述P种帧内预测模式对应的变换核类别相同,且所述第一标志指示对反二次变换后的变换系数不进行转置,则使用所述目标变换核对所述第二变换系数进行反二次变换,并将反二次变换后的变换系数确定为所述当前块的基础变换系数。
在一些实施例中,选择单元12,具体用于若所述当前块的帧内预测模式为非加权预测模式,则根据所述第一标志的取值,确定所述目标预测模式。
在一些实施例中,选择单元12,具体用于若所述第一标志的值为第一数值时,则确定所述目标预测模式为所述当前块的帧内预测模式;
若所述第一标志的值为第二数值时,则确定所述目标预测模式为第二帧内预测模式,所述第二帧内预测模式与所述当前块的帧内预测模式不同。
可选的,若所述当前块的帧内预测模式为平面Planar模式时,则所述第二帧内预测模式为DC模式;
若所述当前块的帧内预测模式为非Planar模式时,则所述第二帧内预测模式为Planar模式。
在一些实施例中,解码单元11,还用于确定所述N种帧内预测模式对应的预测值进行加权时的加权权重;若所述N种帧内预测模式对应的最小加权权重与最大加权权重之间的比例大于预设值,则解码所述码流,得到所述第一标志。
在一些实施例中,若所述当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,则选择单元12,具体用于从预设的M类二次变换的变换核中,确定所述N种帧内预测模式对应的二次变换的变换核;确定使用所述N种帧内预测模式对应的二次变换的变换核进行解码时的率失真代价;将率失真代价最小的二次变换的变换核,确定为所述当前块对应的目标变换核。
在一些实施例中,若所述当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,则选择单元12,具体用于确定所述N种帧内预测模式对应的率失真代价;在所述M类二次变换的变换核中,查询率失真代价最小的帧内预测模式对应的第一变换核;将所述第一变换核确定为所述当前块对应的目标变换核。
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图17所示的视频解码器10可以执行本申请实施例的解码方法,并且视频解码器10中的各个单元的前述和其它操作和/或功能分别为了实现上述解码方法等各个方法中的相应流程,为了简洁,在此不再赘述。
图18是本申请一实施例提供的视频编码器的示意性框图。
如图18所示,该视频编码器20可包括:
选择单元21,用于若确定当前块的帧内预测模式允许采用加权预测模式时,从预设的M类二次变换的变换核中,选择所述当前块对应的目标变换核,所述M为大于1的正整数;
编码单元22,用于对所述当前块进行编码,得到所述当前块的基础变换系数;
变换单元23,用于使用所述目标变换核对所述基础变换系数进行二次变换,得到所述当前块的第二变换系数;
编码单元22,还用于根据所述当前块的第二变换系数,得到码流。
在一些实施例中,选择单元21,具体用于确定所述当前块的帧内预测模式;根据所述当前块的帧内预测模式,从预设的M类二次变换的变换核中,选择所述当前块对应的目标变换核。
在一些实施例中,选择单元21,具体用于从预设的K种帧内预测模式,粗筛选出代价最小的Q种帧内预测模式,所述K、Q均为正整数,所述K种帧内预测模式中包括加权预测模式;从所述Q种帧内预测模式和预设的加权预测模式中,细筛选出代价最小的帧内预测模式,作为所述当前块的帧内预测模式。
在一些实施例中,选择单元21,具体用于若所述当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,则从预设的M类二次变换的变换核中,确定所述N种帧内预测模式对应的二次变换的变换核;确定使用所述N种帧内预测模式对应的二次变换的变换核进行解码时的率失真代价;将率失真代价最小的二次变换的变换核,确定为所述当前块对应的目标变换核。
在一些实施例中,选择单元21,具体用于若所述当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,则确定所述N种帧内预测模式对应的率失真代价;在所述M类二次变换的变换核中,查询率失真代价最小的帧内预测模式对应的第一变换核;将所述第一变换核确定为所述当前块对应的目标变换核。
在一些实施例中,编码单元22,还用于在所述码流中写入第一标志,所述第一标志用于指示所述目标变换核为目标预测模式对应的变换核。
在一些实施例中,若所述当前块的帧内预测模式为加权预测模式,且该加权预测模式包括N种帧内预测模式时,所述目标预测模式为P种帧内预测模式中的一种帧内预测模式,所述P种帧内预测模式为所述N种帧内预测模式中与所述当前块的二次变换的变换核选择相关的帧内预测模式,所述P为大于1且小于或等于N的正整数。
在一些实施例中,若所述当前块的二次变换的变换核的选择与角度预测模式相关,则所述P种帧内预测模式为所述N种帧内预测模式中的角度预测模式;或者,
若所述当前块的二次变换的变换核的选择与所述当前块的N种帧内预测模式相关,则所述P种帧内预测模式为所述N种帧内预测模式。
在一些实施例中,若所述P种帧内预测模式对应的二次变换的变换核类别相同,则所述第一标志还用于指示是否对反二次变换后的变换系数进行转置。
在一些实施例中,若所述当前块的帧内预测模式为非加权预测模式,则选择单元21,具体用于将所述M类二次变换的变换核中所述当前块的帧内预测模式对应的变换核,确定为所述当前块对应的目标变换核;或者,将所述M类二次变换的变换核中第二帧内预测模式对应的变换核,确定为所述当前块对应的目标变换核,其中所述第二帧内预测模式与所述当前块的帧内预测模式不同。
在一些实施例中,若所述当前块对应的目标变换核为所述当前块的帧内预测模式对应的变换核,则所述第一标志的取值为第一数值;
若所述当前块对应的目标变换核为所述第二帧内预测模式对应的变换核,则所述第一标志的取值为第二数值。
可选的,若所述当前块的帧内预测模式为平面Planar模式时,则所述第二帧内预测模式为DC模式;
若所述当前块的帧内预测模式为非Planar模式时,则所述第二帧内预测模式为Planar模式。
在一些实施例中,编码单元22,还用于确定所述N种帧内预测模式对应的预测值进行加权时的加权权重;若所述N种帧内预测模式对应的最小加权权重与最大加权权重之间的比例大于预设值,则在所述码流中写入所述第一标志。
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图18所示的视频编码器20可以对应于执行本申请实施例的编码方法中的相应主体,并且视频解码器20中的各个单元的前述和其它操作和/或功能分别为了实现编码方法等各个方法中的相应流程,为了简洁,在此不再赘述。
上文中结合附图从功能单元的角度描述了本申请实施例的装置和系统。应理解,该功能单元可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过硬件和软件单元组合实现。具体地,本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件单元组合执行完成。可选地,软件单元可以位于随机存储器,闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图19是本申请实施例提供的电子设备的示意性框图。
如图19所示,该电子设备30可以为本申请实施例所述的视频编码器,或者视频解码器,该电子设备30可包括:
存储器33和处理器32,该存储器33用于存储计算机程序34,并将该计算机程序34传输给该处理器32。换言之,该处理器32可以从存储器33中调用并运行计算机程序34,以实现本申请实施例中的方法。
例如,该处理器32可用于根据该计算机程序34中的指令执行上述方法实施例中的步骤。
在本申请的一些实施例中,该处理器32可以包括但不限于:
通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等等。
在本申请的一些实施例中,该存储器33包括但不限于:
易失性存储器和/或非易失性存储器。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synch link DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。
在本申请的一些实施例中,该计算机程序34可以被分割成一个或多个单元,该一个或者多个单元被存储在该存储器33中,并由该处理器32执行,以完成本申请提供的方法。该一个或多个单元可以是能够完成特定功能的一系列计算机程序指令段,该指令段用于描述该计算机程序34在该电子设备30中的执行过程。
如图19所示,该电子设备30还可包括:
收发器33,该收发器33可连接至该处理器32或存储器33。
其中,处理器32可以控制该收发器33与其他设备进行通信,具体地,可以向其他设备发送信息或数据,或接收其他设备发送的信息或数据。收发器33可以包括发射机和接收机。收发器33还可以进一步包括天线,天线的数量可以为一个或多个。
应当理解,该电子设备30中的各个组件通过总线系统相连,其中,总线系统除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。
图20是本申请实施例提供的视频编解码系统的示意性框图。
如图20所示,该视频编解码系统40可包括:视频编码器41和视频解码器42,其中视频编码器41用于执行本申请实施例涉及的视频编码方法,视频解码器42用于执行本申请实施例涉及的视频解码方法。
本申请还提供了一种计算机存储介质,其上存储有计算机程序,该计算机程序被计算机执行时使得该计算机能够执行上述方法实施例的方法。或者说,本申请实施例还提供一种包含指令的计算机程序产品,该指令被计算机执行时使得计算机执行上述方法实施例的方法。
本申请还提供了一种码流,该码流是通过上述编码方式生成的。可选的,该码流中包括第一标志。
当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。该计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行该计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。该计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。该计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,该计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。该计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。该可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如数字视频光盘(digital video disc,DVD))、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is merely a logical functional division, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. For example, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
The foregoing is merely the specific implementation of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed in the present application, and such changes or substitutions shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (34)
- A video decoding method, comprising: decoding a bitstream to obtain a second transform coefficient of a current block, the second transform coefficient being a transform coefficient formed by an encoding side performing a secondary transform on a residual block of the current block; if it is determined that the intra prediction mode of the current block is allowed to use a weighted prediction mode, selecting, from M preset classes of secondary-transform kernels, a target transform kernel corresponding to the current block, where M is a positive integer greater than 1; performing an inverse secondary transform on the second transform coefficient using the target transform kernel to obtain base transform coefficients of the current block; and performing an inverse base transform on the base transform coefficients to obtain the residual block of the current block.
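The decode path in the claim above (inverse secondary transform with the selected kernel, then inverse base transform) can be sketched as follows. This is only an illustrative sketch: real codecs apply the secondary transform to a vectorized low-frequency sub-block with fixed integer kernels, whereas here both kernels are generic orthogonal matrices and all names (`decode_residual`, `encode_residual`) are hypothetical.

```python
import numpy as np

def encode_residual(residual, secondary_kernel, primary_kernel):
    """Forward path (encoder side), shown only for round-trip checking."""
    # Base (primary) transform of the residual block.
    base_coeffs = primary_kernel @ residual @ primary_kernel.T
    # Secondary transform of the base transform coefficients.
    return secondary_kernel @ base_coeffs

def decode_residual(second_coeffs, secondary_kernel, primary_kernel):
    """Invert the secondary transform, then the base transform.

    Both kernels are assumed orthogonal, so the inverse is the transpose.
    """
    # Inverse secondary transform: recover the base transform coefficients.
    base_coeffs = secondary_kernel.T @ second_coeffs
    # Inverse base transform: recover the residual block.
    return primary_kernel.T @ base_coeffs @ primary_kernel
```

With orthogonal kernels the decoder exactly recovers the residual the encoder transformed, which is the round-trip property the claim relies on.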
- The method according to claim 1, wherein the selecting, from the M preset classes of secondary-transform kernels, the target transform kernel corresponding to the current block comprises: decoding the bitstream to obtain a first flag, the first flag being used to indicate that the target transform kernel is the transform kernel corresponding to a target prediction mode; and determining, according to the first flag, the transform kernel corresponding to the target prediction mode among the M classes of secondary-transform kernels as the target transform kernel.
- The method according to claim 2, wherein the determining, according to the first flag, the transform kernel corresponding to the target prediction mode among the M classes of secondary-transform kernels as the target transform kernel comprises: determining the intra prediction mode of the current block; determining the target prediction mode according to the intra prediction mode of the current block and the first flag; and determining the transform kernel corresponding to the target prediction mode among the M classes of secondary-transform kernels as the target transform kernel.
- The method according to claim 3, wherein the determining the target prediction mode according to the intra prediction mode of the current block and the first flag comprises: if the intra prediction mode of the current block is a weighted prediction mode and the weighted prediction mode includes N intra prediction modes, determining the prediction mode indicated by the first flag among the N intra prediction modes as the target prediction mode, where N is a positive integer greater than 1.
- The method according to claim 4, wherein the target prediction mode is one of P intra prediction modes, the P intra prediction modes being those of the N intra prediction modes that are related to the selection of the secondary-transform kernel of the current block, where P is a positive integer greater than 1 and less than or equal to N.
- The method according to claim 5, wherein if the selection of the secondary-transform kernel of the current block is related to angular prediction modes, the P intra prediction modes are the angular prediction modes among the N intra prediction modes; or, if the selection of the secondary-transform kernel of the current block is related to the N intra prediction modes of the current block, the P intra prediction modes are the N intra prediction modes.
- The method according to claim 5 or 6, wherein if the intra prediction mode of the current block is a weighted prediction mode, the method further comprises: determining the secondary-transform kernel classes corresponding to the P intra prediction modes, wherein if the secondary-transform kernel classes corresponding to the P intra prediction modes are the same, the first flag is further used to indicate whether to transpose the transform coefficients after the inverse secondary transform; and the performing the inverse secondary transform on the second transform coefficient using the target transform kernel to obtain the base transform coefficients of the current block comprises: performing the inverse secondary transform on the second transform coefficient using the target transform kernel according to the secondary-transform kernel classes corresponding to the P intra prediction modes and the first flag, to obtain the base transform coefficients of the current block.
- The method according to claim 7, wherein the performing the inverse secondary transform on the second transform coefficient using the target transform kernel according to the secondary-transform kernel classes corresponding to the P intra prediction modes and the first flag, to obtain the base transform coefficients of the current block, comprises: if it is determined that the transform kernel classes corresponding to the P intra prediction modes are not the same, performing the inverse secondary transform on the second transform coefficient using the target transform kernel indicated by the first flag, and determining the transform coefficients after the inverse secondary transform as the base transform coefficients of the current block; if it is determined that the transform kernel classes corresponding to the P intra prediction modes are the same and the first flag indicates that the transform coefficients after the inverse secondary transform are to be transposed, performing the inverse secondary transform on the second transform coefficient using the target transform kernel, transposing the transform coefficients after the inverse secondary transform, and determining the transposed transform coefficients as the base transform coefficients of the current block; and if it is determined that the transform kernel classes corresponding to the P intra prediction modes are the same and the first flag indicates that the transform coefficients after the inverse secondary transform are not to be transposed, performing the inverse secondary transform on the second transform coefficient using the target transform kernel, and determining the transform coefficients after the inverse secondary transform as the base transform coefficients of the current block.
- The method according to claim 3, wherein the determining the target prediction mode according to the intra prediction mode of the current block and the first flag comprises: if the intra prediction mode of the current block is a non-weighted prediction mode, determining the target prediction mode according to the value of the first flag.
- The method according to claim 9, wherein the determining the target prediction mode according to the value of the first flag comprises: if the value of the first flag is a first value, determining that the target prediction mode is the intra prediction mode of the current block; and if the value of the first flag is a second value, determining that the target prediction mode is a second intra prediction mode, the second intra prediction mode being different from the intra prediction mode of the current block.
- The method according to claim 10, wherein if the intra prediction mode of the current block is the Planar mode, the second intra prediction mode is the DC mode; and if the intra prediction mode of the current block is a non-Planar mode, the second intra prediction mode is the Planar mode.
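The flag-to-mode mapping of the two claims above can be written out directly. The mode numbering (0 = Planar, 1 = DC) and flag values here are assumptions for illustration only; the actual indices are codec-specific.

```python
PLANAR, DC = 0, 1          # hypothetical mode indices
FIRST_VALUE, SECOND_VALUE = 0, 1  # hypothetical first-flag values

def target_prediction_mode(block_mode, first_flag):
    """Claims 10-11: for a non-weighted block, the first flag selects between
    the block's own intra mode and a fixed alternative (Planar <-> DC)."""
    if first_flag == FIRST_VALUE:
        return block_mode           # first value: use the block's own mode
    # Second value: Planar maps to DC, any non-Planar mode maps to Planar.
    return DC if block_mode == PLANAR else PLANAR
```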
- The method according to claim 4, further comprising: determining the weighting weights used when weighting the prediction values corresponding to the N intra prediction modes; wherein the decoding the bitstream to obtain the first flag comprises: if the ratio between the minimum weighting weight and the maximum weighting weight corresponding to the N intra prediction modes is greater than a preset value, decoding the bitstream to obtain the first flag.
- The method according to claim 2, wherein if the intra prediction mode of the current block is a weighted prediction mode and the weighted prediction mode includes N intra prediction modes, the selecting, from the M preset classes of secondary-transform kernels, the target transform kernel corresponding to the current block comprises: determining, from the M preset classes of secondary-transform kernels, the secondary-transform kernels corresponding to the N intra prediction modes; determining the rate-distortion costs of decoding with the secondary-transform kernels corresponding to the N intra prediction modes; and determining the secondary-transform kernel with the minimum rate-distortion cost as the target transform kernel corresponding to the current block.
- The method according to claim 2, wherein if the intra prediction mode of the current block is a weighted prediction mode and the weighted prediction mode includes N intra prediction modes, the selecting, from the M preset classes of secondary-transform kernels, the target transform kernel corresponding to the current block comprises: determining the rate-distortion costs corresponding to the N intra prediction modes; looking up, among the M classes of secondary-transform kernels, a first transform kernel corresponding to the intra prediction mode with the minimum rate-distortion cost; and determining the first transform kernel as the target transform kernel corresponding to the current block.
- A video encoding method, comprising: if it is determined that the intra prediction mode of a current block is allowed to use a weighted prediction mode, selecting, from M preset classes of secondary-transform kernels, a target transform kernel corresponding to the current block, where M is a positive integer greater than 1; encoding the current block to obtain base transform coefficients of the current block; performing a secondary transform on the base transform coefficients using the target transform kernel to obtain a second transform coefficient of the current block; and obtaining a bitstream according to the second transform coefficient of the current block.
- The method according to claim 15, further comprising: writing a first flag into the bitstream, the first flag being used to indicate that the target transform kernel is the transform kernel corresponding to a target prediction mode.
- The method according to claim 16, wherein if the intra prediction mode of the current block is a weighted prediction mode and the weighted prediction mode includes N intra prediction modes, the target prediction mode is one of P intra prediction modes, the P intra prediction modes being those of the N intra prediction modes that are related to the selection of the secondary-transform kernel of the current block, where P is a positive integer greater than 1 and less than or equal to N.
- The method according to claim 17, wherein if the selection of the secondary-transform kernel of the current block is related to angular prediction modes, the P intra prediction modes are the angular prediction modes among the N intra prediction modes; or, if the selection of the secondary-transform kernel of the current block is related to the N intra prediction modes of the current block, the P intra prediction modes are the N intra prediction modes.
- The method according to claim 17 or 18, wherein if the secondary-transform kernel classes corresponding to the P intra prediction modes are the same, the first flag is further used to indicate whether to transpose the transform coefficients after the inverse secondary transform.
- The method according to claim 16, wherein the selecting, from the M preset classes of secondary-transform kernels, the target transform kernel corresponding to the current block comprises: determining the intra prediction mode of the current block; and selecting, from the M preset classes of secondary-transform kernels, the target transform kernel corresponding to the current block according to the intra prediction mode of the current block.
- The method according to claim 20, wherein the determining the intra prediction mode of the current block comprises: coarsely screening, from K preset intra prediction modes, the Q intra prediction modes with the minimum costs, where K and Q are both positive integers and the K intra prediction modes include a weighted prediction mode; and finely screening, from the Q intra prediction modes and the preset weighted prediction mode, the intra prediction mode with the minimum cost as the intra prediction mode of the current block.
- The method according to claim 20, wherein the selecting, from the M preset classes of secondary-transform kernels, the target transform kernel corresponding to the current block according to the intra prediction mode of the current block comprises: if the intra prediction mode of the current block is a weighted prediction mode and the weighted prediction mode includes N intra prediction modes, determining, from the M preset classes of secondary-transform kernels, the secondary-transform kernels corresponding to the N intra prediction modes; determining the rate-distortion costs of decoding with the secondary-transform kernels corresponding to the N intra prediction modes; and determining the secondary-transform kernel with the minimum rate-distortion cost as the target transform kernel corresponding to the current block.
- The method according to claim 20, wherein the selecting, from the M preset classes of secondary-transform kernels, the target transform kernel corresponding to the current block according to the intra prediction mode of the current block comprises: if the intra prediction mode of the current block is a weighted prediction mode and the weighted prediction mode includes N intra prediction modes, determining the rate-distortion costs corresponding to the N intra prediction modes; looking up, among the M classes of secondary-transform kernels, a first transform kernel corresponding to the intra prediction mode with the minimum rate-distortion cost; and determining the first transform kernel as the target transform kernel corresponding to the current block.
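The two encoder-side selection strategies in the claims above differ only in what the rate-distortion cost is attached to: the kernel itself, or the prediction mode whose kernel is then looked up. A minimal sketch, assuming hypothetical cost callables (`rd_cost_with_kernel`, `mode_rd_cost`) and a mode-to-kernel table that stand in for the encoder's actual RD evaluation:

```python
def select_kernel_by_kernel_cost(candidate_kernels, rd_cost_with_kernel):
    """Claim 22: evaluate the RD cost of each candidate kernel directly
    and keep the cheapest one."""
    return min(candidate_kernels, key=rd_cost_with_kernel)

def select_kernel_by_mode_cost(modes, mode_rd_cost, kernel_for_mode):
    """Claim 23: find the intra mode with the minimum RD cost first,
    then take the kernel associated with that mode."""
    best_mode = min(modes, key=mode_rd_cost)
    return kernel_for_mode[best_mode]
```

The second strategy avoids a full transform trial per kernel, trading some accuracy for speed; both return one of the M preset kernels as the target.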
- The method according to claim 20, wherein if the intra prediction mode of the current block is a non-weighted prediction mode, the selecting, from the M preset classes of secondary-transform kernels, the target transform kernel corresponding to the current block according to the intra prediction mode of the current block comprises: determining the transform kernel corresponding to the intra prediction mode of the current block among the M classes of secondary-transform kernels as the target transform kernel corresponding to the current block; or determining the transform kernel corresponding to a second intra prediction mode among the M classes of secondary-transform kernels as the target transform kernel corresponding to the current block, wherein the second intra prediction mode is different from the intra prediction mode of the current block.
- The method according to claim 24, wherein if the target transform kernel corresponding to the current block is the transform kernel corresponding to the intra prediction mode of the current block, the value of the first flag is a first value; and if the target transform kernel corresponding to the current block is the transform kernel corresponding to the second intra prediction mode, the value of the first flag is a second value.
- The method according to claim 25, wherein if the intra prediction mode of the current block is the Planar mode, the second intra prediction mode is the DC mode; and if the intra prediction mode of the current block is a non-Planar mode, the second intra prediction mode is the Planar mode.
- The method according to claim 16, further comprising: determining the weighting weights used when weighting the prediction values corresponding to the N intra prediction modes included in the weighted prediction mode; wherein the writing the first flag into the bitstream comprises: if the ratio between the minimum weighting weight and the maximum weighting weight corresponding to the N intra prediction modes is greater than a preset value, writing the first flag into the bitstream.
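The signalling condition in the claim above (write the first flag only when the minimum and maximum weighting weights are close enough) can be sketched as a single predicate. The threshold value below is an assumption; the patent leaves the preset value unspecified.

```python
def should_write_first_flag(weights, threshold=0.5):
    """Claim-27 gate: signal the first flag only when the ratio of the
    minimum to the maximum weighting weight exceeds a preset threshold,
    i.e. when the N modes contribute comparably to the prediction.
    `threshold=0.5` is an illustrative assumption, not the codec's value."""
    return min(weights) / max(weights) > threshold
```

Intuitively, when one mode's weight dominates, the blended prediction behaves like that single mode, so signalling a kernel choice among the modes buys little and the flag bit is saved.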
- A video decoder, comprising: a decoding unit, configured to decode a bitstream to obtain a second transform coefficient of a current block, the second transform coefficient being a transform coefficient formed by an encoding side performing a secondary transform on a residual block of the current block; a selection unit, configured to select, from M preset classes of secondary-transform kernels, a target transform kernel corresponding to the current block if it is determined that the intra prediction mode of the current block is allowed to use a weighted prediction mode, where M is a positive integer greater than 1; and a transform unit, configured to perform an inverse secondary transform on the second transform coefficient using the target transform kernel to obtain base transform coefficients of the current block, and to perform an inverse base transform on the base transform coefficients to obtain the residual block of the current block.
- A video encoder, comprising: a selection unit, configured to select, from M preset classes of secondary-transform kernels, a target transform kernel corresponding to a current block if it is determined that the intra prediction mode of the current block is allowed to use a weighted prediction mode, where M is a positive integer greater than 1; an encoding unit, configured to encode the current block to obtain base transform coefficients of the current block; and a transform unit, configured to perform a secondary transform on the base transform coefficients using the target transform kernel to obtain a second transform coefficient of the current block; wherein the encoding unit is further configured to obtain a bitstream according to the second transform coefficient of the current block.
- A video decoder, comprising a processor and a memory, wherein the memory is configured to store a computer program, and the processor is configured to invoke and run the computer program stored in the memory to implement the method according to any one of claims 1 to 14.
- A video encoder, comprising a processor and a memory, wherein the memory is configured to store a computer program, and the processor is configured to invoke and run the computer program stored in the memory to implement the method according to any one of claims 15 to 27.
- A video encoding and decoding system, comprising: the video encoder according to claim 31; and the video decoder according to claim 30.
- A computer-readable storage medium, configured to store a computer program, wherein the computer program causes a computer to perform the method according to any one of claims 1 to 14 or 15 to 27.
- A bitstream, wherein the bitstream is generated by the method according to any one of claims 15 to 27.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/120773 WO2023044868A1 (zh) | 2021-09-26 | 2021-09-26 | Video encoding and decoding method, device, system, and storage medium |
CN202180102521.6A CN117981307A (zh) | 2021-09-26 | 2021-09-26 | Video encoding and decoding method, device, system, and storage medium |
EP21957986.9A EP4383709A1 (en) | 2021-09-26 | 2021-09-26 | Video encoding method, video decoding method, device, system, and storage medium |
US18/614,082 US20240236371A1 (en) | 2021-09-26 | 2024-03-22 | Video encoding method, video decoding method, device, system, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/120773 WO2023044868A1 (zh) | 2021-09-26 | 2021-09-26 | Video encoding and decoding method, device, system, and storage medium |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/614,082 Continuation US20240236371A1 (en) | 2021-09-26 | 2024-03-22 | Video encoding method, video decoding method, device, system, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023044868A1 true WO2023044868A1 (zh) | 2023-03-30 |
Family
ID=85719867
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/120773 WO2023044868A1 (zh) | 2021-09-26 | 2021-09-26 | 视频编解码方法、设备、系统、及存储介质 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240236371A1 (zh) |
EP (1) | EP4383709A1 (zh) |
CN (1) | CN117981307A (zh) |
WO (1) | WO2023044868A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116760976A (zh) * | 2023-08-21 | 2023-09-15 | Tencent Technology (Shenzhen) Co., Ltd. | Affine prediction decision method, apparatus, device, and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101719261A (zh) * | 2009-12-18 | 2010-06-02 | South China University of Technology | Transform-kernel-based method for controlling digital watermark embedding strength |
CN110519601A (zh) * | 2019-09-02 | 2019-11-29 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Digital video encoding method and apparatus |
CN111355955A (zh) * | 2020-03-06 | 2020-06-30 | Central South University | Fast multi-transform-kernel skip algorithm based on a pre-selection layer |
CN112422991A (zh) * | 2019-08-23 | 2021-02-26 | Hangzhou Hikvision Digital Technology Co., Ltd. | Encoding method, decoding method, and apparatus |
CN112740684A (zh) * | 2018-09-19 | 2021-04-30 | Electronics and Telecommunications Research Institute | Method and apparatus for encoding/decoding an image, and recording medium for storing a bitstream |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116760976A (zh) * | 2023-08-21 | 2023-09-15 | Tencent Technology (Shenzhen) Co., Ltd. | Affine prediction decision method, apparatus, device, and storage medium |
CN116760976B (zh) * | 2023-08-21 | 2023-12-08 | Tencent Technology (Shenzhen) Co., Ltd. | Affine prediction decision method, apparatus, device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN117981307A (zh) | 2024-05-03 |
EP4383709A1 (en) | 2024-06-12 |
US20240236371A1 (en) | 2024-07-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI745594B (zh) | Intra filtering applied together with transform processing in video coding | |
US11611757B2 (en) | Position dependent intra prediction combination extended with angular modes | |
EP3593531A1 (en) | Intra filtering flag in video coding | |
JP7277586B2 (ja) | Method and apparatus for mode- and size-dependent block-level restrictions | |
US20240236371A1 (en) | Video encoding method, video decoding method, device, system, and storage medium | |
CN114205582A (zh) | Loop filtering method, apparatus and device for video encoding and decoding | |
US20230319267A1 (en) | Video coding method and video decoder | |
WO2023044919A1 (zh) | Video encoding and decoding method, device, system, and storage medium | |
WO2024192733A1 (zh) | Video encoding and decoding method, apparatus, device, system, and storage medium | |
WO2024216632A1 (zh) | Video encoding and decoding method, apparatus, device, system, and storage medium | |
WO2023173255A1 (zh) | Image encoding and decoding method, apparatus, device, system, and storage medium | |
WO2022116054A1 (zh) | Image processing method and system, video encoder, and video decoder | |
WO2023122969A1 (zh) | Intra prediction method, device, system, and storage medium | |
WO2023236113A1 (zh) | Video encoding and decoding method, apparatus, device, system, and storage medium | |
WO2023220946A1 (zh) | Video encoding and decoding method, apparatus, device, system, and storage medium | |
WO2023184747A1 (zh) | Video encoding and decoding method, apparatus, device, system, and storage medium | |
WO2023184248A1 (zh) | Video encoding and decoding method, apparatus, device, system, and storage medium | |
WO2022155922A1 (zh) | Video encoding and decoding method and system, and video encoder and video decoder | |
WO2023122968A1 (zh) | Intra prediction method, device, system, and storage medium | |
WO2022179394A1 (zh) | Method for determining prediction samples of an image block, and encoding/decoding device | |
WO2024183007A1 (zh) | Video encoding and decoding method, apparatus, device, system, and storage medium | |
CN114979628A (zh) | Method for determining prediction samples of an image block, and encoding/decoding device | |
CN116760976A (zh) | Affine prediction decision method, apparatus, device, and storage medium | |
CN115412729A (zh) | Loop filtering method, apparatus, device, and storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21957986 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2021957986 Country of ref document: EP Effective date: 20240304 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202180102521.6 Country of ref document: CN |
|
NENP | Non-entry into the national phase |
Ref country code: DE |