WO2022155922A1 - Video encoding method and system, video decoding method and system, video encoder, and video decoder
- Publication number: WO2022155922A1
- Application number: PCT/CN2021/073409
- Authority: WO, WIPO (PCT)
Classifications
- H04N19/117: Filters, e.g. for pre-processing or post-processing
- H04N19/124: Quantisation
- H04N19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/82: Details of filtering operations specially adapted for video compression, involving filtering within a prediction loop
Definitions
- The present application relates to the technical field of video encoding and decoding, and in particular to a video encoding and decoding method and system, a video encoder, and a video decoder.
- Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers, and video players. As video technology develops, the amount of data in video data has become relatively large. To facilitate the transmission of video data, video devices implement video compression technology so that video data can be transmitted or stored more efficiently.
- Errors are introduced in the video compression process.
- To reduce these errors, the reconstructed image is filtered; for example, the reconstructed image is filtered using an adaptive correction filter (ALF) to minimize the mean square error between the reconstructed image and the original image.
- Currently, a fixed quantization scale is used to quantize the floating-point coefficients of the ALF into integer coefficients.
- However, the floating-point ALF coefficients derived for different regions of a reconstructed image, or for reconstructed images of different frames, may have different ranges and may yield different filter gains.
- When a fixed quantization scale is used to quantize the floating-point coefficients, the coding overhead of the filter coefficients and the gain brought by the filter cannot be balanced.
- Embodiments of the present application provide a video encoding and decoding method and system, as well as a video encoder and a video decoder, to balance the coding overhead of the filter coefficients against the gain brought by the filter.
- the present application provides a video encoding method, including:
- the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
- according to the maximum quantization scale, determining the target quantization scale of the floating-point ALF coefficients of the first ALF, where the target quantization scale is less than or equal to the maximum quantization scale;
- an embodiment of the present application provides a video decoding method, including:
- the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
- decoding the code stream to obtain the ALF coefficient information of the first adaptive correction filter (ALF) corresponding to the target area in the reconstructed image under the first component;
- the target area in the reconstructed image under the first component is filtered using the first ALF.
- the present application provides a video encoder for performing the method in the first aspect or each of its implementations.
- the encoder includes a functional unit for executing the method in the above-mentioned first aspect or each of its implementations.
- the present application provides a video decoder for executing the method in the second aspect or each of its implementations.
- the decoder includes functional units for performing the methods in the second aspect or the respective implementations thereof.
- a video encoder including a processor and a memory.
- the memory is used for storing a computer program
- the processor is used for calling and running the computer program stored in the memory, so as to execute the method in the above-mentioned first aspect or each implementation manner thereof.
- a video decoder including a processor and a memory.
- the memory is used for storing a computer program
- the processor is used for calling and running the computer program stored in the memory, so as to execute the method in the above-mentioned second aspect or each implementation manner thereof.
- a video encoding and decoding system including a video encoder and a video decoder.
- the video encoder is used to perform the method in the first aspect or each of its implementations
- the video decoder is used to perform the method in the above-mentioned second aspect or each of its implementations.
- a chip for implementing the method in any one of the above-mentioned first to second aspects or in each implementation thereof.
- the chip includes a processor for invoking and running a computer program from a memory, so that a device on which the chip is installed executes the method in any one of the above-mentioned first to second aspects or in each implementation thereof.
- a computer-readable storage medium for storing a computer program, the computer program causing a computer to execute the method in any one of the above-mentioned first aspect to the second aspect or each of its implementations.
- a computer program product comprising computer program instructions, the computer program instructions causing a computer to perform the method in any one of the above-mentioned first to second aspects or the implementations thereof.
- a computer program which, when run on a computer, causes the computer to perform the method in any one of the above-mentioned first to second aspects or the respective implementations thereof.
- The reconstructed image of the current image is obtained; when the target area in the reconstructed image under the first component is filtered using the first ALF, the floating-point ALF coefficients of the first ALF are determined; the maximum quantization scale of the floating-point ALF coefficients of the first ALF is determined according to those coefficients; the target quantization scale of the floating-point ALF coefficients of the first ALF is determined according to the maximum quantization scale; the target quantization scale is used to quantize the floating-point ALF coefficients of the first ALF into integer ALF coefficients; and the integer ALF coefficients of the first ALF are encoded to obtain a code stream.
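- The quantization steps summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: it assumes the quantization scale is a power-of-two shift (integer coefficient = round(float coefficient × 2^scale)), that the maximum scale is the largest shift for which the largest coefficient still fits in `coeff_bits` magnitude bits, and that the target scale is the smallest scale whose reconstruction error stays under a tolerance. The names `max_quant_scale`, `choose_target_scale`, `coeff_bits`, `scale_cap`, and `tolerance` are hypothetical.

```python
import math


def _round_half_away(x):
    """Round to nearest integer, ties away from zero (avoids banker's rounding)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(float_coeffs, scale):
    """Quantize floating-point coefficients to integers with a power-of-two scale."""
    return [_round_half_away(c * (1 << scale)) for c in float_coeffs]


def dequantize(int_coeffs, scale):
    """Map integer coefficients back to floating point for the same scale."""
    return [q / (1 << scale) for q in int_coeffs]


def max_quant_scale(float_coeffs, coeff_bits=7, scale_cap=12):
    """Assumed rule: largest shift whose quantized peak fits in coeff_bits bits."""
    limit = (1 << coeff_bits) - 1
    peak = max(abs(c) for c in float_coeffs)
    scale = scale_cap
    while scale > 0 and _round_half_away(peak * (1 << scale)) > limit:
        scale -= 1
    return scale


def choose_target_scale(float_coeffs, max_scale, tolerance=0.01):
    """Assumed rule: smallest scale <= max_scale with error within tolerance."""
    for scale in range(max_scale + 1):
        recon = dequantize(quantize(float_coeffs, scale), scale)
        worst = max(abs(a - b) for a, b in zip(float_coeffs, recon))
        if worst <= tolerance:
            return scale
    return max_scale


coeffs = [0.30, -0.12, 0.05, 0.77]
max_scale = max_quant_scale(coeffs)
target_scale = choose_target_scale(coeffs, max_scale)
int_coeffs = quantize(coeffs, target_scale)
```

- A smaller target scale yields smaller integers, which are cheaper to encode, while a larger one preserves more of the filter's precision; the selection rule here is only one way to trade these off.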
- The present application determines the target quantization scale of the ALF coefficients according to the magnitude of the floating-point ALF coefficients of the first ALF, and uses that target quantization scale to quantize them. When the floating-point ALF coefficients are large, the determined target quantization scale is relatively large, and the filter gain of the ALF coefficients quantized with it is correspondingly large.
- When the floating-point ALF coefficients are small, the determined target quantization scale is relatively small, so the coding overhead is small. This balances the coding overhead of the filter coefficients against the gain brought by the filter and thereby improves the effect of video encoding and decoding.
- FIG. 1 is a schematic block diagram of a video encoding and decoding system 100 involved in an embodiment of the present application.
- FIG. 2 is a schematic block diagram of a video encoder 200 provided by an embodiment of the present application.
- FIG. 3 is a schematic block diagram of a decoding framework 300 provided by an embodiment of the present application.
- FIG. 4 is a schematic flowchart of a video encoding method 400 provided by an embodiment of the present application.
- FIG. 5A is a schematic diagram of an ALF shape involved in an embodiment of the present application.
- FIG. 5B is a schematic diagram of another ALF shape involved in an embodiment of the present application.
- FIG. 6 is another schematic flowchart of a video encoding method 600 provided by an embodiment of the present application.
- FIG. 7 is another schematic flowchart of a video encoding method 700 provided by an embodiment of the present application.
- FIG. 8 is a schematic flowchart of a video decoding method 800 provided by an embodiment of the present application.
- FIG. 9 is a schematic flowchart of a video decoding method 900 provided by an embodiment of the present application.
- FIG. 10 is a schematic flowchart of a video decoding method 1000 provided by an embodiment of the present application.
- FIG. 11 is a schematic block diagram of a video encoder 10 provided by an embodiment of the present application.
- FIG. 12 is a schematic block diagram of a video decoder 20 provided by an embodiment of the present application.
- FIG. 13 is a schematic block diagram of an electronic device 30 provided by an embodiment of the present application.
- FIG. 14 is a schematic block diagram of a video encoding and decoding system 40 provided by an embodiment of the present application.
- the present application can be applied to the field of image encoding and decoding, the field of video encoding and decoding, the field of hardware video encoding and decoding, the field of dedicated circuit video encoding and decoding, the field of real-time video encoding and decoding, and the like.
- For example, the schemes of the present application may be combined with the audio video coding standard (AVS), the H.264/advanced video coding (AVC) standard, the H.265/high efficiency video coding (HEVC) standard, and the H.266/versatile video coding (VVC) standard.
- The schemes of the present application may also operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions.
- For ease of understanding, the video encoding and decoding system involved in the embodiments of the present application is first introduced with reference to FIG. 1.
- FIG. 1 is a schematic block diagram of a video encoding and decoding system 100 according to an embodiment of the present application. It should be noted that FIG. 1 is only an example, and the video encoding and decoding systems in the embodiments of the present application include, but are not limited to, those shown in FIG. 1 .
- the video codec system 100 includes an encoding device 110 and a decoding device 120 .
- the encoding device is used to encode the video data (which can be understood as compression) to generate a code stream, and transmit the code stream to the decoding device.
- the decoding device decodes the code stream encoded by the encoding device to obtain decoded video data.
- The encoding device 110 in this embodiment of the present application may be understood as a device with a video encoding function, and the decoding device 120 may be understood as a device with a video decoding function. That is, the encoding device 110 and the decoding device 120 cover a wide range of devices, for example smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, and in-vehicle computers.
- The encoding device 110 may transmit the encoded video data (e.g., a code stream) to the decoding device 120 via the channel 130.
- Channel 130 may include one or more media and/or devices capable of transmitting encoded video data from encoding device 110 to decoding device 120 .
- channel 130 includes one or more communication media that enables encoding device 110 to transmit encoded video data directly to decoding device 120 in real-time.
- encoding apparatus 110 may modulate the encoded video data according to a communication standard and transmit the modulated video data to decoding apparatus 120 .
- The communication medium includes a wireless communication medium, such as the radio frequency spectrum. Optionally, the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
- channel 130 includes a storage medium that can store video data encoded by encoding device 110 .
- Storage media include a variety of locally accessible data storage media such as optical discs, DVDs, flash memory, and the like.
- the decoding apparatus 120 may obtain the encoded video data from the storage medium.
- channel 130 may include a storage server that may store video data encoded by encoding device 110 .
- the decoding device 120 may download the stored encoded video data from the storage server.
- The storage server may store the encoded video data and transmit it to the decoding device 120; examples include a web server (e.g., for a website) and a file transfer protocol (FTP) server.
- encoding apparatus 110 includes video encoder 112 and output interface 113 .
- the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
- Encoding device 110 may include a video source 111 in addition to the video encoder 112 and the output interface 113.
- The video source 111 may include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface for receiving video data from a video content provider, and a computer graphics system for generating video data.
- the video encoder 112 encodes the video data from the video source 111 to generate a code stream.
- Video data may include one or more pictures or a sequence of pictures.
- The code stream contains the encoding information of the image or image sequence in the form of a bitstream.
- the encoded information may include encoded image data and associated data.
- The associated data may include a sequence parameter set (SPS), a picture parameter set (PPS), and other syntax structures.
- An SPS may contain parameters that apply to one or more sequences.
- a PPS may contain parameters that apply to one or more images.
- a syntax structure refers to a set of zero or more syntax elements in a codestream arranged in a specified order.
- the video encoder 112 directly transmits the encoded video data to the decoding device 120 via the output interface 113 .
- the encoded video data may also be stored on a storage medium or a storage server for subsequent reading by the decoding device 120 .
- decoding device 120 includes input interface 121 and video decoder 122 .
- the decoding device 120 may include a display device 123 in addition to the input interface 121 and the video decoder 122 .
- the input interface 121 includes a receiver and/or a modem.
- the input interface 121 may receive the encoded video data through the channel 130 .
- the video decoder 122 is configured to decode the encoded video data, obtain the decoded video data, and transmit the decoded video data to the display device 123 .
- the display device 123 displays the decoded video data.
- the display device 123 may be integrated with the decoding apparatus 120 or external to the decoding apparatus 120 .
- the display device 123 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
- FIG. 1 is only an example, and the technical solutions of the embodiments of the present application are not limited to FIG. 1 .
- the technology of the present application may also be applied to single-side video encoding or single-side video decoding.
- FIG. 2 is a schematic block diagram of a video encoder 200 provided by an embodiment of the present application. It should be understood that the video encoder 200 can be used to perform lossy compression on images, and can also be used to perform lossless compression on images.
- The lossless compression may be visually lossless compression or mathematically lossless compression.
- the video encoder 200 can be applied to image data in luminance chrominance (YCbCr, YUV) format.
- The YUV ratio can be 4:2:0, 4:2:2, or 4:4:4, where Y represents luminance (luma), Cb (U) represents blue chrominance, and Cr (V) represents red chrominance; U and V together are the chrominance (chroma) components, which describe color and saturation.
- 4:2:0 means that every 4 pixels have 4 luma components and 2 chroma components (YYYYCbCr).
- 4:2:2 means that every 4 pixels have 4 luma components and 4 chroma components (YYYYCbCrCbCr).
- 4:4:4 means full-pixel display (YYYYCbCrCbCrCbCrCbCr).
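- The sampling ratios above translate directly into plane dimensions. The sketch below computes them for a given frame size; the function names and the round-up handling of odd dimensions are illustrative assumptions rather than anything specified in the application.

```python
# Horizontal and vertical chroma subsampling factors for each format.
SUBSAMPLING = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}


def plane_sizes(width, height, chroma_format):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for one frame.

    Each of the two chroma planes (Cb and Cr) has the returned chroma size.
    Odd dimensions are rounded up (an assumption for this sketch).
    """
    sx, sy = SUBSAMPLING[chroma_format]
    chroma = ((width + sx - 1) // sx, (height + sy - 1) // sy)
    return (width, height), chroma


def chroma_samples_per_4_pixels(chroma_format):
    """Total Cb + Cr samples carried for every 4 luma samples."""
    sx, sy = SUBSAMPLING[chroma_format]
    return 2 * 4 // (sx * sy)


luma_420, chroma_420 = plane_sizes(1920, 1080, "4:2:0")
```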
- The video encoder 200 reads video data and, for each frame of image in the video data, divides the frame into several coding tree units (CTUs).
- A CTU may also be referred to as a "tree block", a "largest coding unit" (LCU), or a "coding tree block" (CTB).
- Each CTU may be associated with a block of pixels of equal size within the image.
- Each pixel may correspond to one luminance (luma) sample and two chrominance (chrominance or chroma) samples.
- each CTU may be associated with one block of luma samples and two blocks of chroma samples.
- The size of one CTU is, for example, 128×128, 64×64, or 32×32.
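- As a small worked example of the CTU division, the sketch below counts how many CTU columns and rows cover a frame. It assumes that CTUs at the right and bottom edges may be only partially covered by the picture; `ctu_grid` is a hypothetical helper name.

```python
def ctu_grid(width, height, ctu_size=128):
    """Return (columns, rows) of CTUs needed to cover a width x height frame."""
    cols = (width + ctu_size - 1) // ctu_size   # ceiling division
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows


cols, rows = ctu_grid(1920, 1080, ctu_size=128)
total_ctus = cols * rows
```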
- a CTU can be further divided into several coding units (Coding Unit, CU) for coding, and the CU can be a rectangular block or a square block.
- the CU can be further divided into a prediction unit (PU for short) and a transform unit (TU for short), so that coding, prediction, and transformation are separated and processing is more flexible.
- the CTU is divided into CUs in a quadtree manner, and the CUs are divided into TUs and PUs in a quadtree manner.
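- The quadtree division of a CTU into CUs can be sketched recursively. The split decision is encoder-specific (typically a rate-distortion choice), so it is passed in here as a hypothetical callback `should_split`; `min_size` is likewise an assumed parameter.

```python
def quadtree_split(x, y, size, min_size, should_split):
    """Recursively split a square block; return leaf blocks as (x, y, size)."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    quadtree_split(x + dx, y + dy, half, min_size, should_split)
                )
        return leaves
    return [(x, y, size)]


# Toy decision rule: always split blocks larger than 32, and keep splitting
# the block anchored at the top-left corner down to the minimum size.
def toy_rule(x, y, size):
    return size > 32 or (x == 0 and y == 0)


cus = quadtree_split(0, 0, 64, 8, toy_rule)
```

- Whatever rule is used, the leaf CUs tile the CTU exactly, so their areas always sum to the CTU area.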
- Video encoders and video decoders may support various PU sizes. Assuming the size of a particular CU is 2Nx2N, video encoders and video decoders may support PU sizes of 2Nx2N or NxN for intra prediction, and support 2Nx2N, 2NxN, Nx2N, NxN or similar sized symmetric PUs for inter prediction. Video encoders and video decoders may also support 2NxnU, 2NxnD, nLx2N, and nRx2N asymmetric PUs for inter prediction.
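- The PU shapes named above can be enumerated for a 2N×2N CU as follows. The text does not give the dimensions of the asymmetric shapes; this sketch assumes the HEVC convention in which the CU is split 1:3 (for 2NxnU and nLx2N) or 3:1 (for 2NxnD and nRx2N). The helper name `pu_partitions` is hypothetical.

```python
def pu_partitions(n, mode):
    """Return the (width, height) of each PU for a 2N x 2N CU in the given mode."""
    side = 2 * n
    quarter = side // 4          # short side of an asymmetric partition (assumed 1:3 split)
    rest = side - quarter
    table = {
        "2Nx2N": [(side, side)],
        "2NxN": [(side, n), (side, n)],
        "Nx2N": [(n, side), (n, side)],
        "NxN": [(n, n)] * 4,
        "2NxnU": [(side, quarter), (side, rest)],   # small part on top
        "2NxnD": [(side, rest), (side, quarter)],   # small part at the bottom
        "nLx2N": [(quarter, side), (rest, side)],   # small part on the left
        "nRx2N": [(rest, side), (quarter, side)],   # small part on the right
    }
    return table[mode]


parts = pu_partitions(16, "2NxnU")
```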
- The video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filtering unit 260, a decoded image buffer 270, and an entropy encoding unit 280. It should be noted that the video encoder 200 may include more, fewer, or different functional components.
- a current block may be referred to as a current coding unit (CU) or a current prediction unit (PU), or the like.
- A prediction block may also be referred to as a predicted image block or an image prediction block, and a reconstructed image block may also be referred to as a reconstructed block or an image reconstruction block.
- prediction unit 210 includes an inter prediction unit 211 and an intra prediction unit 212 . Since there is a strong correlation between adjacent pixels in a frame of a video, the method of intra-frame prediction is used in video coding and decoding technology to eliminate the spatial redundancy between adjacent pixels. Due to the strong similarity between adjacent frames in the video, the inter-frame prediction method is used in the video coding and decoding technology to eliminate the temporal redundancy between adjacent frames, thereby improving the coding efficiency.
- The inter-frame prediction unit 211 performs inter-frame prediction, which can refer to image information of different frames: it uses motion information to find a reference block in a reference frame and generates a prediction block from the reference block in order to eliminate temporal redundancy.
- Frames used for inter-frame prediction may be P frames and/or B frames, where P frames refer to forward predicted frames, and B frames refer to bidirectional predicted frames.
- the motion information includes the reference frame list where the reference frame is located, the reference frame index, and the motion vector.
- The motion vector can have integer-pixel or sub-pixel precision. If the motion vector has sub-pixel precision, interpolation filtering must be applied in the reference frame to produce the required sub-pixel block.
- The integer-pixel or sub-pixel block found in the reference frame according to the motion vector is called the reference block.
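- To illustrate how a sub-pixel reference block can be produced, the sketch below fetches a block at a quarter-pixel motion vector using bilinear interpolation. Bilinear weights are an assumption made for brevity; practical codecs use longer separable interpolation filters. The function name, the quarter-pixel motion vector units, and the edge clamping are all hypothetical choices for this example.

```python
def fetch_reference_block(ref, x, y, w, h, mv_x, mv_y):
    """Return the w x h reference block for a block at (x, y).

    ref is a 2D list of samples; mv_x and mv_y are in quarter-pixel units.
    Samples outside the frame are clamped to the nearest edge sample.
    """
    rows, cols = len(ref), len(ref[0])
    int_x, frac_x = divmod(mv_x, 4)   # floor division also handles negative vectors
    int_y, frac_y = divmod(mv_y, 4)

    def sample(r, c):
        return ref[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    block = []
    for r in range(h):
        row = []
        for c in range(w):
            r0, c0 = y + r + int_y, x + c + int_x
            # Bilinear blend of the four surrounding integer samples,
            # with weights summing to 16 and rounding before the shift.
            value = (
                (4 - frac_x) * (4 - frac_y) * sample(r0, c0)
                + frac_x * (4 - frac_y) * sample(r0, c0 + 1)
                + (4 - frac_x) * frac_y * sample(r0 + 1, c0)
                + frac_x * frac_y * sample(r0 + 1, c0 + 1)
                + 8
            ) >> 4
            row.append(value)
        block.append(row)
    return block


# Horizontal ramp: each sample equals 4 * column index.
reference = [[4 * c for c in range(8)] for _ in range(8)]
half_pel = fetch_reference_block(reference, 1, 1, 2, 2, mv_x=2, mv_y=0)
full_pel = fetch_reference_block(reference, 1, 1, 2, 2, mv_x=4, mv_y=0)
```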
- Some technologies use the reference block directly as the prediction block, while others process the reference block further to generate the prediction block.
- Reprocessing to generate a prediction block on the basis of the reference block can also be understood as taking the reference block as a prediction block and then processing it on the basis of the prediction block to generate a new prediction block.
- Inter-frame prediction methods include the geometric partitioning mode (GPM) in the VVC video codec standard and angular weighted prediction (AWP) in the AVS3 video codec standard. These two inter prediction modes have something in common in principle.
- The intra-frame prediction unit 212 refers only to information in the same frame image to predict the pixel information in the current coded image block, thereby eliminating spatial redundancy.
- Frames used for intra prediction may be I-frames.
- The white 4×4 block is the current block, and the gray pixels in the column to the left of and the row above the current block are its reference pixels; intra prediction uses these reference pixels to predict the current block.
- These reference pixels may all be available, i.e., all already encoded and decoded, or some of them may be unavailable. For example, if the current block is at the leftmost edge of the frame, the reference pixels to the left of the current block are not available.
- Or, when the lower-left part of the current block has not yet been encoded or decoded, the reference pixels at the lower left are also unavailable.
- Unavailable reference pixels can be padded using the available reference pixels, certain values, or certain methods, or they can be left unpadded.
- The intra prediction method further includes multiple reference line (MRL) intra prediction, which can use more reference pixels to improve coding efficiency.
- Mode 0 copies the pixels above the current block into the current block in the vertical direction as the predicted value.
- Mode 1 copies the reference pixels on the left into the current block in the horizontal direction as the predicted value.
- Mode 2 (DC) uses the average value of the 8 points A to D and I to L as the predicted value of all points.
- Modes 3 to 8 each copy the reference pixels to the corresponding positions of the current block along a certain angle. Because some positions of the current block cannot correspond exactly to reference pixels, it may be necessary to use a weighted average of the reference pixels, or interpolated sub-pixels of the reference pixels.
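- Modes 0 to 2 described above can be sketched for a 4×4 block as follows. Here `above` holds the four reference pixels A to D and `left` holds I to L; the rounding used for the DC average (adding 4 before dividing by 8) is an assumption of this sketch, and the angular modes 3 to 8 are omitted.

```python
def intra4x4_predict(above, left, mode):
    """Predict a 4x4 block from reference pixels.

    above: the 4 pixels A..D directly above the block.
    left:  the 4 pixels I..L directly to the left of the block.
    mode:  0 = vertical, 1 = horizontal, 2 = DC.
    """
    if mode == 0:
        # Every row repeats the pixels above the block.
        return [list(above) for _ in range(4)]
    if mode == 1:
        # Every row is filled with its left neighbour.
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:
        # All 16 samples take the rounded mean of the 8 reference pixels.
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0, 1 and 2 are sketched here")


above = [10, 20, 30, 40]
left = [50, 60, 70, 80]
vertical = intra4x4_predict(above, left, 0)
horizontal = intra4x4_predict(above, left, 1)
dc_block = intra4x4_predict(above, left, 2)
```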
- the intra-frame prediction modes used by HEVC include Planar mode, DC and 33 angle modes, for a total of 35 prediction modes.
- the intra-frame modes used by VVC are Planar, DC, and 65 angular modes, for a total of 67 prediction modes.
- the intra-frame modes used by AVS3 are DC, Plane, Bilinear and 63 angle modes, a total of 66 prediction modes.
- As the number of angular modes increases, intra-frame prediction becomes more accurate and better meets the demands of high-definition and ultra-high-definition digital video.
- Residual unit 220 may generate a residual block of the CU based on the pixel block of the CU and the prediction blocks of the PUs of the CU. For example, residual unit 220 may generate the residual block such that each sample in it has a value equal to the difference between a sample in the CU's pixel block and the corresponding sample in the prediction block of the CU's PU.
- Transform/quantization unit 230 may quantize transform coefficients. Transform/quantization unit 230 may quantize transform coefficients associated with TUs of the CU based on quantization parameter (QP) values associated with the CU. Video encoder 200 may adjust the degree of quantization applied to transform coefficients associated with the CU by adjusting the QP value associated with the CU.
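- The effect of the QP on the degree of quantization can be illustrated with a simple scalar quantizer. The mapping from QP to step size is not given in the text; this sketch assumes the H.264/HEVC-style relation in which the step size doubles for every increase of 6 in QP, with a step size of 1 at QP 4. Real encoders use integer arithmetic and scaling tables instead, and the function names here are hypothetical.

```python
import math


def qstep(qp):
    """Assumed step size: doubles every 6 QP, equals 1.0 at QP 4."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize_coeffs(coeffs, qp):
    """Map transform coefficients to integer levels (round half away from zero)."""
    step = qstep(qp)
    return [int(math.copysign(math.floor(abs(c) / step + 0.5), c)) for c in coeffs]


def dequantize_coeffs(levels, qp):
    """Reconstruct approximate coefficient values from integer levels."""
    step = qstep(qp)
    return [level * step for level in levels]


coeffs = [101, -37, 5]
levels_qp22 = quantize_coeffs(coeffs, 22)   # step size 8
levels_qp28 = quantize_coeffs(coeffs, 28)   # step size 16: coarser levels
```

- Raising the QP produces smaller levels that cost fewer bits, at the price of a larger reconstruction error after dequantization.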
- Inverse transform/quantization unit 240 may apply inverse quantization and inverse transform, respectively, to the quantized transform coefficients to reconstruct a residual block from the quantized transform coefficients.
- Reconstruction unit 250 may add the samples of the reconstructed residual block to corresponding samples of the one or more prediction blocks generated by prediction unit 210 to generate a reconstructed image block associated with the TU. By reconstructing the block of samples for each TU of the CU in this manner, video encoder 200 may reconstruct the block of pixels of the CU.
- In-loop filtering unit 260 may perform deblocking filtering operations to reduce blocking artifacts for pixel blocks associated with the CU.
- the loop filtering unit 260 includes a deblocking filtering unit, a sample adaptive offset (SAO) unit, and an adaptive loop filtering (ALF) unit.
- the decoded image buffer 270 may store the reconstructed pixel blocks.
- Inter-prediction unit 211 may use the reference picture containing the reconstructed pixel block to perform inter-prediction on PUs of other pictures.
- intra-prediction unit 212 may use the reconstructed pixel blocks in decoded picture buffer 270 to perform intra-prediction on other PUs in the same picture as the CU.
- Entropy encoding unit 280 may receive the quantized transform coefficients from transform/quantization unit 230 . Entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy encoded data.
- the basic flow of video coding involved in the present application is as follows: at the coding end, the current image is divided into blocks, and for the current block, the prediction unit 210 uses intra-frame prediction or inter-frame prediction to generate a prediction block of the current block.
- the residual unit 220 may calculate a residual block based on the predicted block and the original block of the current block, that is, the difference between the predicted block and the original block of the current block, and the residual block may also be referred to as residual information.
- the residual block can be transformed and quantized by the transform/quantization unit 230 to remove information insensitive to human eyes, so as to eliminate visual redundancy.
- the residual block before being transformed and quantized by the transform/quantization unit 230 may be referred to as a time-domain residual block, and the time-domain residual block after being transformed and quantized by the transform/quantization unit 230 may be referred to as a frequency residual block or a frequency-domain residual block.
- the entropy encoding unit 280 receives the quantized transform coefficients output by the transform/quantization unit 230, and can perform entropy coding on the quantized transform coefficients to output a code stream. For example, the entropy encoding unit 280 may eliminate character redundancy according to the target context model and the probability information of the binary code stream.
- in addition, the video encoder performs inverse quantization and inverse transform on the quantized transform coefficients output by the transform/quantization unit 230 to obtain the residual block of the current block, and then adds the residual block of the current block to the prediction block of the current block to obtain the reconstructed block of the current block.
- as encoding proceeds, reconstructed blocks corresponding to other image blocks in the current image can be obtained, and these reconstructed blocks are spliced to obtain a reconstructed image of the current image.
- the reconstructed image is filtered; for example, ALF is used to filter the reconstructed image to reduce the difference between the pixel values of the pixels in the reconstructed image and the original pixel values of the pixels in the current image.
- the filtered reconstructed image is stored in the decoded image buffer 270, and can be used as a reference frame for inter-frame prediction for subsequent frames.
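The encoder-side loop described above (residual, quantization, inverse quantization, reconstruction) can be sketched as follows. This is only an illustrative sketch: the transform stage is omitted (treated as identity), and the uniform step size `qstep` and the helper name `encode_reconstruct` are assumptions introduced here, not part of the present application.

```python
def encode_reconstruct(original, prediction, qstep):
    """Toy residual -> quantize -> dequantize -> reconstruct loop for one block.

    Assumption: the transform is skipped (identity) and quantization is a
    plain uniform quantizer with step `qstep`; a real codec uses a DCT-like
    transform and QP-dependent scaling.
    """
    # Residual block: original sample minus predicted sample.
    residual = [o - p for o, p in zip(original, prediction)]
    # Quantization discards precision (the lossy step).
    levels = [round(r / qstep) for r in residual]
    # Inverse quantization, performed identically by encoder and decoder.
    recon_residual = [lv * qstep for lv in levels]
    # Reconstruction: prediction plus reconstructed residual.
    reconstructed = [p + rr for p, rr in zip(prediction, recon_residual)]
    return levels, reconstructed


levels, recon = encode_reconstruct([101, 105, 97, 90], [98, 98, 98, 98], qstep=4)
print(levels)  # [1, 2, 0, -2]
print(recon)   # [102, 106, 98, 90]
```

The reconstructed samples differ slightly from the originals; loop filters such as ALF aim to reduce exactly this kind of difference.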
- the block division information determined by the coding end, and mode information or parameter information such as prediction, transformation, quantization, entropy coding, and loop filtering, etc. are carried in the code stream when necessary.
- the decoding end determines the same block division information and the same mode information or parameter information (prediction, transformation, quantization, entropy coding, loop filtering, etc.) as the encoding end by parsing the code stream and analyzing the existing information, so as to ensure that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
- FIG. 3 is a schematic block diagram of a decoding framework 300 provided by an embodiment of the present application.
- the video decoder 300 includes an entropy decoding unit 310, a prediction unit 320, an inverse quantization/transformation unit 330, a reconstruction unit 340, a loop filtering unit 350, and a decoded image buffer 360. It should be noted that the video decoder 300 may include more, fewer, or different functional components.
- the video decoder 300 may receive the code stream.
- Entropy decoding unit 310 may parse the codestream to extract syntax elements from the codestream. As part of parsing the codestream, entropy decoding unit 310 may parse the entropy-encoded syntax elements in the codestream.
- the prediction unit 320, the inverse quantization/transform unit 330, the reconstruction unit 340, and the in-loop filtering unit 350 may decode the video data according to the syntax elements extracted from the code stream, i.e., generate decoded video data.
- prediction unit 320 includes intra prediction unit 321 and inter prediction unit 322 .
- Intra-prediction unit 321 may perform intra-prediction to generate prediction blocks for the PU. Intra-prediction unit 321 may use an intra-prediction mode to generate prediction blocks for a PU based on pixel blocks of spatially neighboring PUs. Intra-prediction unit 321 may also determine an intra-prediction mode for the PU from one or more syntax elements parsed from the codestream.
- Inter-prediction unit 322 may construct a first reference picture list (List 0) and a second reference picture list (List 1) from the syntax elements parsed from the codestream. Furthermore, if the PU is encoded using inter-prediction, entropy decoding unit 310 may parse the motion information for the PU. Inter-prediction unit 322 may determine one or more reference blocks for the PU according to the motion information of the PU. Inter-prediction unit 322 may generate a prediction block for the PU from one or more reference blocks of the PU.
- the inverse quantization/transform unit 330 inversely quantizes (i.e., dequantizes) the transform coefficients associated with the TUs.
- Inverse quantization/transform unit 330 may use the QP value associated with the CU of the TU to determine the degree of quantization.
- inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients to generate a residual block associated with the TU.
- Reconstruction unit 340 uses the residual blocks associated with the TUs of the CU and the prediction blocks of the PUs of the CU to reconstruct the pixel blocks of the CU. For example, reconstruction unit 340 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the pixel block of the CU, resulting in a reconstructed image block.
- In-loop filtering unit 350 may perform deblocking filtering operations to reduce blocking artifacts for pixel blocks associated with the CU.
- the loop filtering unit 350 includes a deblocking filtering unit, a sample adaptive offset (SAO) unit, and an adaptive loop filtering (ALF) unit.
- Video decoder 300 may store the reconstructed images of the CU in decoded image buffer 360 .
- the video decoder 300 may use the reconstructed image in the decoded image buffer 360 as a reference image for subsequent prediction, or transmit the reconstructed image to a display device for presentation.
- the entropy decoding unit 310 can parse the code stream to obtain the prediction information, quantized coefficient matrix, etc. of the current block, and the prediction unit 320 uses intra prediction or inter prediction on the current block to generate the prediction block of the current block based on the prediction information.
- the inverse quantization/transform unit 330 performs inverse quantization and inverse transformation on the quantized coefficient matrix obtained from the code stream to obtain a residual block.
- the reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block.
- the reconstructed blocks form a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image, either per image or per block, to obtain a decoded image.
- the decoded image may also be referred to as a reconstructed image.
- on the one hand, the reconstructed image may be displayed by a display device; on the other hand, the reconstructed image may be stored in the decoded image buffer 360 to serve as a reference frame for inter-frame prediction of subsequent frames.
- the above is the basic process of the video codec under the block-based hybrid coding framework. With the development of technology, some modules or steps of the framework or process may be optimized. This application is applicable to the basic process of the video codec under the block-based hybrid coding framework, but is not limited to this framework and process.
- the encoding end will be introduced below with reference to FIG. 4 .
- FIG. 4 is a schematic flowchart of a video encoding method 400 provided by an embodiment of the present application, and the embodiment of the present application is applied to the video encoder shown in FIG. 1 and FIG. 2 .
- the method of the embodiment of the present application includes:
- the video encoder receives a video stream, which consists of a series of image frames, and performs video encoding for each frame of image in the video stream.
- for ease of description, this application refers to the frame of image currently to be encoded as the current image.
- the video encoder divides the current image into one or more image blocks to be encoded. For each image block to be encoded, the prediction unit 210 in the video encoder generates a prediction block of the image block using inter-frame prediction or intra-frame prediction, and then sends the prediction block to the residual unit 220. The residual unit 220 can be understood as a summer that includes one or more components performing a subtraction operation.
- the residual unit 220 subtracts the prediction block from the image block to be encoded to form a residual block, and sends the residual block to the transform and quantization unit 230 .
- the transform and quantization unit 230 transforms the residual block using, for example, discrete cosine transform (DCT) or the like, to obtain transform coefficients.
- the transform and quantization unit 230 further quantizes the transform coefficients to obtain quantized transform coefficients.
- the transform and quantization unit 230 forwards the quantized transform coefficients to the entropy encoding unit 280 .
- the entropy encoding unit 280 entropy encodes the quantized transform coefficients.
- entropy encoding unit 280 may use context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or other coding methods to entropy encode the quantized transform coefficients and obtain a code stream.
- the transform and quantization unit 230 forwards the quantized transform coefficients to the inverse transform and quantization unit 240 .
- the inverse transform and quantization unit 240 inversely quantizes and inversely transforms the quantized transform coefficients to reconstruct the residual block in the pixel domain.
- Reconstruction unit 250 can be understood as a summer that includes one or more components performing an addition operation.
- the reconstruction unit 250 adds the reconstructed residual block to the prediction block generated by the prediction unit 210 to generate a partial or complete reconstructed image of the current image, the partial or complete reconstructed image including one or more reconstructed image blocks.
- the reconstructed image includes a first component, and the first component may be a luminance component or a chrominance component.
- S402. Determine the ALF coefficients of the floating point type of the first ALF when the target area in the reconstructed image under the first component is filtered using the first ALF.
- the video encoder in this embodiment of the present application can be used for images in different formats, such as YUV format or YCbCr format.
- In AVS3, each channel in the YUV or YCbCr format has an independent adaptive correction filter (ALF for short).
- for example, the first chrominance component of a frame of image (such as the U component or Cb component) corresponds to one ALF, and the second chrominance component (such as the V component or Cr component) corresponds to one ALF.
- for the luminance component, a frame of image is divided into multiple regions, for example, into 16 regions or 64 regions, and each region corresponds to one ALF. Based on this, when the first component is a chrominance component, the present application uses the entire reconstructed image under the first component as the target area to be filtered.
- when the first component is a luminance component, the present application divides the reconstructed image under the first component into regions, for example, into 16 regions or 64 regions, and takes one of the regions to be filtered as the target area.
- the ALF that performs adaptive correction filtering on the target area is denoted as the first ALF.
- the form of the first ALF may be as shown in FIG. 5A: a 7×7 cross plus a 3×3 square. In this case, a set of ALF coefficients of the first ALF includes 9 ALF coefficients, namely C0, C1, C2, C3, C4, C5, C6, C7, and C8, where C8 is the ALF coefficient corresponding to the center point of the first ALF, and the other ALF coefficients are symmetric about the center point of the first ALF.
- alternatively, the form of the first ALF may be as shown in FIG. 5B: a 7×7 cross plus a 5×5 square. In this case, a set of ALF coefficients of the first ALF includes 15 ALF coefficients, namely C0, C1, C2, ..., C13, and C14, where C14 is the ALF coefficient corresponding to the center point of the first ALF, and the other ALF coefficients are symmetric about the center point of the first ALF.
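The coefficient counts of the two shapes follow from point symmetry, which the following sketch checks by enumerating tap positions. The helper names `alf_shape` and `num_coefficients` are hypothetical, and the sketch does not attempt to reproduce which coefficient index is assigned to which position in FIG. 5A or FIG. 5B.

```python
def alf_shape(square):
    """Tap offsets (dy, dx) of a 7x7 cross plus a `square` x `square` block."""
    half = square // 2
    # 7x7 cross: the vertical and horizontal arms through the center.
    cross = {(d, 0) for d in range(-3, 4)} | {(0, d) for d in range(-3, 4)}
    # Centered square block.
    block = {(dy, dx)
             for dy in range(-half, half + 1)
             for dx in range(-half, half + 1)}
    return cross | block


def num_coefficients(offsets):
    """Point-symmetric taps share one coefficient; the center has its own."""
    return (len(offsets) - 1) // 2 + 1


print(len(alf_shape(3)), num_coefficients(alf_shape(3)))  # 17 9
print(len(alf_shape(5)), num_coefficients(alf_shape(5)))  # 29 15
```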
- the Wiener-Hopf method is used to derive the floating-point ALF coefficients of the first ALF.
- the ALF coefficients of the floating-point type of the first ALF are derived:
- where |R| represents the number of pixels in the target area R, s[r] represents the original pixel value of pixel r, and t[r] represents the reconstructed pixel value of pixel r, that is, the pixel value of pixel r in the reconstructed image.
- C0, C1, ..., CN-1 are the floating-point ALF coefficients of the first ALF, and {P0, P1, P2, ..., PN-1} are position offsets relative to r.
- for the filter shape shown in FIG. 5A, N is 8; for the filter shape shown in FIG. 5B, N is 14.
- in this way, the floating-point ALF coefficients C0, C1, ..., CN-1 of the first ALF can be derived.
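As a hedged illustration of the Wiener-Hopf derivation named above, the standard least-squares formulation is sketched below. The exact formula of the present application may differ (for example, in how the center tap and the symmetric tap pairs are grouped); here $\hat t_i[r]$ is an assumed shorthand for the reconstructed sample, or sum of symmetric samples, that is multiplied by coefficient $C_i$ at pixel $r$.

```latex
\begin{aligned}
\mathrm{MSE}(C) &= \frac{1}{|R|}\sum_{r\in R}\Bigl(s[r]-\sum_{i} C_i\,\hat t_i[r]\Bigr)^{2},\\
\frac{\partial\,\mathrm{MSE}}{\partial C_j}=0
\;&\Longrightarrow\;
\sum_{i} C_i \sum_{r\in R}\hat t_i[r]\,\hat t_j[r]
=\sum_{r\in R}s[r]\,\hat t_j[r]\qquad\text{for every } j .
\end{aligned}
```

Solving this linear system (autocorrelation matrix on the left, cross-correlation vector on the right) yields real-valued coefficients, which is why their value range is not bounded in advance.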
- the value range of ALF coefficients of floating-point type is almost unlimited.
- S403. Determine the maximum quantization scale of the ALF coefficients of the floating-point number type of the first ALF according to the ALF coefficients of the floating-point number type of the first ALF.
- for example, for the filter shape shown in FIG. 5A, the integer-type coefficients obtained by quantizing the first 8 ALF coefficients of the first ALF (i.e., C0, C1, C2, C3, C4, C5, C6 and C7) are limited to the range -64 to 63, and the integer-type coefficient obtained by quantizing the ninth ALF coefficient (i.e., C8) is limited to the range 0 to 127.
- for the filter shape shown in FIG. 5B, the integer-type coefficients obtained by quantizing the first 14 ALF coefficients of the first ALF are limited to the range -64 to 63, and the integer-type coefficient obtained by quantizing the 15th ALF coefficient (i.e., C14) is limited to the range 0 to 127.
- the maximum quantization scale of the floating-point ALF coefficient of the first ALF can be determined according to the floating-point ALF coefficient of the first ALF and the integer coefficient threshold corresponding to the quantization of the floating-point ALF coefficient to the integer type.
- the above S403 includes the following S403-A1 and S403-A2:
- S403-A2 Determine the maximum quantization scale of the floating-point type ALF coefficient of the first ALF according to the maximum shift value.
- the ALF coefficient corresponding to the center position of the first ALF (for example, C8 or C14) is the largest, and the ALF coefficients corresponding to the non-center positions of the first ALF are smaller.
- the above-mentioned first coefficient may be a floating-point type ALF coefficient corresponding to the center position of the first ALF, or may be a floating-point type ALF coefficient corresponding to a non-center position of the first ALF.
- the maximum integer coefficient threshold corresponding to the ALF coefficient W min(x, y) is b1. In this way, according to the ALF coefficients W min(x, y) and b1, the maximum shift value allowed when the floating-point type ALF coefficient W min(x, y) is quantized into an integer type can be determined.
- the maximum shift value bitshift allowed when quantizing the ALF coefficient of the floating-point type into an integer type is determined:
- W max(x, y) is the largest ALF coefficient among the floating-point type ALF coefficients corresponding to the non-center position of the first ALF, and neither x nor y is equal to 0.
- d1 is the maximum integer coefficient threshold corresponding to W max(x, y) , for example, d1 is 63.
- when the first coefficient is the floating-point coefficient corresponding to the center position of the first ALF, the maximum shift value bitshift allowed when quantizing the floating-point ALF coefficients into the integer type can be determined according to the following formula (3):
- W f(0,0) is the floating-point type coefficient corresponding to the center position of the first ALF
- d is the maximum integer coefficient threshold corresponding to W f(0,0) , for example, d is 127.
- the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF is determined:
- Scale max is the maximum quantization scale.
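The idea behind the maximum shift value can be sketched as a simple search: find the largest shift for which the quantized coefficient still fits under its integer threshold. This is an illustrative stand-in rather than the closed-form formulas of the present application; the function name `max_bitshift`, the search cap `limit`, and the sample coefficient values are assumptions.

```python
def max_bitshift(coeff, threshold, limit=16):
    """Largest shift p in 0..limit with round(|coeff| * 2**p) <= threshold.

    Assumption: a brute-force search replaces the application's closed-form
    expression; `limit` is a hypothetical upper bound on the search.
    Returns 0 if even p = 0 does not fit.
    """
    best = 0
    for p in range(limit + 1):
        if round(abs(coeff) * (1 << p)) <= threshold:
            best = p
        else:
            break
    return best


# Hypothetical center coefficient with threshold 127.
bitshift = max_bitshift(0.46, threshold=127)
scale_max = 1 << bitshift          # maximum quantization scale 2**bitshift
print(bitshift, scale_max)         # 8 256

# Hypothetical largest non-center coefficient with threshold 63.
print(max_bitshift(0.11, threshold=63))  # 9
```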
- the following S404 is performed to determine the target quantization scale of the ALF coefficients of the floating point type of the first ALF.
- the target quantization scale is less than or equal to the maximum quantization scale.
- for example, if the maximum quantization scale is 2^8, the target quantization scale may be any power of 2 that is less than or equal to 2^8 and greater than 1, such as 2^7, 2^6 or 2^5.
- the methods for determining the target quantization scale of the ALF coefficients of the floating point type of the first ALF include but are not limited to the following:
- the second quantization scale is determined as the target quantization scale of the floating-point type ALF coefficient of the second ALF, and the second quantization scale is less than or equal to the maximum quantization scale.
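- the third quantization scale is determined as the target quantization scale of the floating-point type ALF coefficients of the third ALF, where the third ALF is the filter used when performing ALF filtering on the reconstructed image under the chrominance component, and the third quantization scale is smaller than the second quantization scale.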
- the eye is more sensitive to luminance, so higher quantization precision is used for the reconstructed image under the luminance component to improve the luminance information of the reconstructed image.
- the eye is less sensitive to chrominance, so a lower quantization precision is used for the reconstructed image under the chrominance component to improve coding efficiency.
- the larger second quantization scale is determined as the target quantization scale of the floating-point type ALF coefficient of the second ALF.
- the second ALF quantized by using the second quantization scale has higher quantization accuracy, and when the reconstructed image under the luminance component is filtered by using the second ALF, the filtering accuracy can be improved.
- for example, if the maximum quantization scale is 2^8, the second quantization scale is determined to be 2^8, 2^7, 2^6, and so on.
- the third quantization scale smaller than the second quantization scale is directly determined as the target quantization scale corresponding to the chrominance component, that is, the third ALF is used to quantify the chrominance component.
- the third quantization scale can be determined as the target quantization scale of the floating-point type ALF coefficient of the third ALF, and there is no need to additionally calculate the target quantization scale corresponding to the chrominance component, reducing the amount of calculation and improving the coding efficiency.
- for example, if the maximum quantization scale is 2^8 and the second quantization scale is 2^8, 2^7 or 2^6, the third quantization scale is 2^5, 2^4, and so on.
- the fourth quantization scale is determined as the target quantization scale of the floating-point type ALF coefficient of the second ALF, and the fourth quantization scale is smaller than the maximum quantization scale.
- the fifth quantization scale is determined as the target quantization scale of the floating-point type ALF coefficient of the fourth ALF, where the fourth ALF is the filter used when performing ALF filtering on the reconstructed image under the luminance component, and the fifth quantization scale is larger than the fourth quantization scale.
- a smaller fourth quantization scale is determined as the target quantization scale of the ALF coefficients of the floating point type of the second ALF.
- the ALF coefficients of the second ALF quantized with the fourth quantization scale have a shorter coded length, which is convenient for coding. For example, if the maximum quantization scale is 2^8, the fourth quantization scale is determined to be 2^5, 2^4, and so on.
- the fifth quantization scale larger than the fourth quantization scale is directly determined as the target quantization scale corresponding to the luminance component, so there is no need to additionally calculate the target quantization scale corresponding to the luminance component, which reduces the amount of calculation and improves the coding efficiency.
- for example, if the maximum quantization scale is 2^8 and the fourth quantization scale is 2^5 or 2^4, etc., the fifth quantization scale is 2^8, 2^7, 2^6, etc.
- S404 includes the following S404-A1 and S404-A2:
- S404-A2 Determine, according to the quantization cost of each first quantization scale, the target quantization scale of the floating-point type ALF coefficient of the first ALF.
- the maximum quantization scale and the preset minimum quantization scale constitute a quantization interval.
- each quantization scale in the quantization interval is the first quantization scale.
- the quantization cost of each first quantization scale in the quantization interval is determined, and according to the quantization cost of each first quantization scale, the target quantization scale of the floating-point type ALF coefficients of the first ALF is determined.
- for example, the first quantization scale with the smallest quantization cost is determined as the target quantization scale of the floating-point type ALF coefficients of the first ALF.
- the average value of the quantization costs of the first quantization scales is determined as the target quantization cost of the floating-point type ALF coefficients of the first ALF, where the average value may be an arithmetic average value or a weighted average value.
- the above preset minimum quantization scale is 2^0, that is, 1.
- alternatively, the above preset minimum quantization scale is 2^1, 2^2, or the like.
- the quantization cost is determined in the same manner.
- the present application takes a first quantization scale as an example for description.
- the encoding cost of the first quantization scale is determined as the quantization cost of the first quantization scale. For example, the floating-point type ALF coefficients of the first ALF are quantized with the first quantization scale into integer-type first ALF coefficients, and the number of bits occupied by encoding the integer-type first ALF coefficients is determined as the quantization cost of the first quantization scale.
- determining the quantization cost of the first quantization scale in the above S404-A1 includes the following steps S404-A11 to S404-A13:
- S404-A13 Determine the quantization cost of the first quantization scale according to the quantization distortion result corresponding to the first ALF coefficient and the number of bits consumed by encoding the first ALF coefficient.
- the first quantization scale is used to quantize the ALF coefficients of the floating point type of the first ALF into the first ALF coefficients of the integer type.
- for example, according to the following formula (5), the floating-point type ALF coefficients of the first ALF are quantized into the integer-type first ALF coefficients:
- $$\mathrm{AlfCoeff}_{\mathrm{integer}} = \mathrm{round}\left(\mathrm{AlfCoeff}_{\mathrm{float}} \times 2^{p}\right) \qquad (5)$$
- where AlfCoeff_integer is the integer-type first ALF coefficient, AlfCoeff_float is the floating-point type ALF coefficient of the first ALF, round is the rounding operation, and 2^p is the first quantization scale.
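Formula (5) can be illustrated with a few lines of code. The coefficient values below are hypothetical, and Python's built-in `round` (which rounds exact halves to the nearest even integer) is only an assumption about the rounding rule; an actual encoder may round differently.

```python
def quantize_alf_coeffs(float_coeffs, p):
    """Quantize floating-point ALF coefficients with the scale 2**p."""
    scale = 1 << p
    # round() here is Python's built-in; the codec's rounding rule may differ.
    return [round(c * scale) for c in float_coeffs]


float_coeffs = [0.012, -0.031, 0.118, 0.46]   # hypothetical values
print(quantize_alf_coeffs(float_coeffs, 6))   # [1, -2, 8, 29]
print(quantize_alf_coeffs(float_coeffs, 8))   # [3, -8, 30, 118]
```

A larger scale keeps more precision (0.012 becomes 3/256 instead of 1/64) at the price of larger integers to encode.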
- the manners of determining the quantization distortion result corresponding to the first ALF coefficient include but are not limited to the following:
- Method 1: Determine the autocorrelation coefficient of the pixels in the target area according to the reconstructed pixel values of the pixels in the target area; determine the cross-correlation coefficient of the pixels in the target area according to the original pixel values of the pixels in the target area; and determine the quantization distortion result corresponding to the first ALF coefficient according to the product of the autocorrelation coefficient of the pixels in the target area and the first ALF coefficient, and the cross-correlation coefficient of the pixels in the target area.
- the autocorrelation coefficient E of the pixels in the target area can be determined:
- where |R| represents the number of pixels in the target area R, s[r] represents the original pixel value of pixel r, t[r] represents the reconstructed pixel value of pixel r (that is, the pixel value of pixel r in the reconstructed image), and {P0, P1, P2, ..., PN-1} are position offsets relative to r.
- the cross-correlation coefficient Y of pixels in the target area can be determined:
- s[r] represents the original pixel value of the pixel point r.
- D is the quantization distortion result corresponding to the first ALF coefficient
- C is the first ALF coefficient
- the autocorrelation coefficient and cross-correlation coefficient of the target area are respectively denoted as E[15][15] and y[15], where 15 is the number of ALF coefficients of the first ALF, and coeff[i] is the first ALF coefficient.
- the value of dist is D.
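The correlation-based computation of Method 1 can be sketched as follows. It relies on the standard least-squares expansion, in which the sum of squared errors equals c^T E c - 2 c^T y plus a constant (the sum of squared original samples); whether the application's own formula has exactly this form is an assumption. The layout `taps[r][i]` (the reconstructed sample seen by coefficient i at pixel r) and the function names are hypothetical, and the rescaling of integer coefficients by the quantization scale is omitted.

```python
def correlations(taps, original):
    """Build the autocorrelation matrix E and cross-correlation vector y."""
    n = len(taps[0])
    E = [[sum(t[i] * t[j] for t in taps) for j in range(n)] for i in range(n)]
    y = [sum(s * t[i] for s, t in zip(original, taps)) for i in range(n)]
    return E, y


def distortion(E, y, coeff):
    """dist = c^T E c - 2 c^T y (the constant sum of s[r]**2 is dropped)."""
    n = len(coeff)
    quad = sum(coeff[i] * E[i][j] * coeff[j]
               for i in range(n) for j in range(n))
    cross = sum(coeff[i] * y[i] for i in range(n))
    return quad - 2 * cross


taps = [[10, 12], [11, 9], [8, 10]]     # hypothetical two-tap filter inputs
original = [11, 10, 9]
E, y = correlations(taps, original)
dist = distortion(E, y, [1, 0])
# Adding back the dropped constant recovers the true sum of squared errors.
print(dist + sum(s * s for s in original))  # 3
```

Because E and y are computed once per target area, the distortion of every candidate quantization scale can be evaluated without filtering the pixels again.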
- Method 2 Use the first ALF coefficient to filter the target area of the reconstructed image; determine the quantization distortion result corresponding to the first ALF coefficient according to the difference between the filtered pixel value and the original pixel value of the pixel in the target area.
- the first ALF is used to filter the target area of the reconstructed image under the first component to obtain the filtered target area.
- the filtered pixel values of the pixels in the filtered target area are compared with the original pixel values to obtain the pixel value differences, and the quantization distortion result corresponding to the first ALF coefficient is determined according to these differences. For example, the larger the difference, the larger the quantization distortion corresponding to the first ALF coefficient, and the smaller the difference, the smaller the quantization distortion corresponding to the first ALF coefficient.
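Method 2 can be sketched as a direct filter-and-compare loop. The fixed-point normalization (add half of the scale, then shift right) and the use of the sum of squared differences as the distortion measure are assumptions typical of integer filters, not details confirmed by the text; `taps[r][i]` is the same hypothetical layout of reconstructed samples per coefficient.

```python
def filtered_ssd(taps, original, int_coeffs, shift):
    """Sum of squared differences between filtered and original samples."""
    rounding = (1 << shift) >> 1        # half of the quantization scale
    ssd = 0
    for t, s in zip(taps, original):
        acc = sum(c * v for c, v in zip(int_coeffs, t))
        filtered = (acc + rounding) >> shift   # normalize back to pixel range
        ssd += (filtered - s) ** 2
    return ssd


taps = [[10, 12], [11, 9], [8, 10]]     # hypothetical two-tap filter inputs
original = [11, 10, 9]
print(filtered_ssd(taps, original, [64, 0], shift=6))   # 3
print(filtered_ssd(taps, original, [32, 32], shift=6))  # 0
```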
- the quantization distortion result corresponding to the first ALF coefficient is determined.
- the quantization cost of the first quantization scale is determined according to the quantization distortion result corresponding to the first ALF coefficient and the number of bits consumed for encoding the first ALF coefficient.
- J is the quantization cost of the first quantization scale
- D is the quantization distortion result corresponding to the first ALF coefficient
- R is the number of bits consumed for encoding the first ALF coefficient
- λ is a variable value.
- R is an estimate of the number of bits consumed to encode the first ALF coefficient.
- Table 1

| Coefficient value | Codeword | Bits consumed |
|---|---|---|
| 0 | 1 | 1 |
| 1 | 010 | 3 |
| -1 | 011 | 3 |
| 2 | 00100 | 5 |
| -2 | 00101 | 5 |
| 3 | 00110 | 5 |
| -3 | 00111 | 5 |
- for example, if the filter coefficient is 0, the number of bits consumed is 1 bit, and if the filter coefficient is 3, the number of bits consumed is 5 bits. Table 1 is looked up to obtain the number of bits consumed by encoding each ALF coefficient among the first ALF coefficients. The sum of the bits consumed by these coefficients is taken as the value of R and substituted into the above formula (9) to obtain the quantization cost of the first quantization scale.
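The table lookup and the cost computation can be sketched as follows. Two assumptions are made: the bit counts follow the signed order-0 Exp-Golomb pattern (which matches all seven rows listed in Table 1, and is extrapolated here to larger values), and the cost takes the usual Lagrangian form J = D + λ·R suggested by the definitions of J, D, R and λ. The function names are hypothetical.

```python
def coeff_bits(v):
    """Bits consumed by one integer coefficient, reproducing Table 1.

    Assumption: signed order-0 Exp-Golomb; values outside Table 1 are
    extrapolated with the same rule.
    """
    k = 2 * v - 1 if v > 0 else -2 * v      # signed value -> code number
    return 2 * ((k + 1).bit_length() - 1) + 1


def quantization_cost(distortion, int_coeffs, lam):
    """J = D + lambda * R, with R the summed bit count of all coefficients."""
    rate = sum(coeff_bits(c) for c in int_coeffs)
    return distortion + lam * rate


print([coeff_bits(v) for v in (0, 1, -1, 2, -2, 3, -3)])  # [1, 3, 3, 5, 5, 5, 5]
print(quantization_cost(10.0, [0, 3, -1], lam=2.0))        # 28.0
```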
- the quantization cost corresponding to each first quantization scale in the quantization interval formed by the maximum quantization scale and the preset minimum quantization scale can be determined.
- according to the quantization cost of each first quantization scale, the target quantization scale of the floating-point type ALF coefficient of the first ALF is determined.
- for example, the first quantization scale with the smallest quantization cost is determined as the target quantization scale of the floating-point type ALF coefficients of the first ALF.
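Selecting the scale with the smallest cost amounts to a short search over the quantization interval. In the sketch below the cost function is passed in as a callable; `toy_cost` is a purely hypothetical stand-in (squared coefficient error as distortion, total magnitude as rate) used only to make the example runnable, not the cost defined by the present application.

```python
def choose_target_shift(float_coeffs, min_shift, max_shift, cost_fn):
    """Return (shift, cost) for the shift in [min_shift, max_shift] with the smallest cost."""
    best_shift, best_cost = None, None
    for p in range(min_shift, max_shift + 1):
        int_coeffs = [round(c * (1 << p)) for c in float_coeffs]
        cost = cost_fn(int_coeffs, p)
        if best_cost is None or cost < best_cost:
            best_shift, best_cost = p, cost
    return best_shift, best_cost


float_coeffs = [0.25, -0.125, 0.5]      # hypothetical values


def toy_cost(int_coeffs, p):
    # Hypothetical stand-in cost: coefficient error plus coefficient magnitude.
    err = sum((q / (1 << p) - c) ** 2 for q, c in zip(int_coeffs, float_coeffs))
    rate = sum(abs(q) for q in int_coeffs)
    return 1000 * err + rate


print(choose_target_shift(float_coeffs, 0, 8, toy_cost))  # (3, 7.0)
```

With these toy values, shift 3 is the smallest scale that represents every coefficient exactly, so larger scales only add rate.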
- the ALF coefficients of the floating point type of the first ALF are quantized into the ALF coefficients of the integer type:
- $$\mathrm{AlfCoeff}_{\mathrm{integer}} = \mathrm{round}\left(\mathrm{AlfCoeff}_{\mathrm{float}} \times 2^{\mathrm{Alfshift}}\right) \qquad (10)$$
- where AlfCoeff_integer is the integer-type ALF coefficient, AlfCoeff_float is the floating-point type ALF coefficient of the first ALF, round is the rounding operation, and 2^Alfshift is the target quantization scale.
- S406 Encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
- the ALF coefficients of the integer type of the first ALF are encoded, carried in the code stream, and sent to the decoding end.
- the decoding end decodes the residual value of the current image from the code stream, determines the reconstructed image of the current image according to the residual value, and decodes the integer type ALF coefficient of the first ALF from the code stream.
- the integer-type ALF coefficients are used to filter the target area in the reconstructed image under the first component using the first ALF to improve the accuracy of the reconstructed image.
- all integer-type ALF coefficients of the first ALF are included in the codestream.
- the decoding end can directly decode all integer-type ALF coefficients of the first ALF from the code stream, which are used for later filtering without the need to calculate the amount of ALF coefficients, thereby improving the filtering efficiency of the decoding end.
- the code stream includes integer-type second ALF coefficients of the first ALF and first information
- the second ALF coefficients are integer-type ALF coefficients corresponding to non-central points in the first ALF
- the first information is used to indicate the target quantization scale.
- the decoding end needs to determine the first ALF coefficient of the first ALF according to the second ALF coefficients and the first information, and then filter the target area in the reconstructed image under the first component using the first ALF according to the first ALF coefficient and the second ALF coefficients. For how the first ALF coefficient of the first ALF is determined according to the second ALF coefficients and the first information, refer to the specific description of the decoding end, which is not repeated here.
- the first information includes a target shift value corresponding to the target quantization scale. For example, if the target quantization scale is $2^{\mathrm{Alfshift}}$, the target shift value is Alfshift.
- the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value. For example, if the target shift value corresponding to the target quantization scale is Alfshift and the first value is a, the first information includes $|\mathrm{Alfshift} - a|$.
- the first value is 6, 5, or 4, etc.
- adaptive_filter_shift_enable_flag is the adaptive modification filter shape quantization scale flag, which is a binary variable.
- a value of '1' indicates that the adaptive correction filtering can use a 7×7 cross plus a 5×5 square filter shape; a value of '0' indicates that the adaptive correction filtering uses a 7×7 cross plus a 3×3 square filter shape.
- adaptive_filter_shift_enable_flag is the adaptive correction filter shift enable flag, which is a binary variable.
- a value of '1' indicates that the solution of the present application is adopted and the shift value of the adaptive correction filter may change; a value of '0' indicates that the solution of the present application is not adopted and the shift value of the adaptive correction filter does not change, for example, it remains 6.
- AlfShiftEnableFlag is equal to the value of adaptive_filter_shift_enable_flag.
- the first information in the code stream includes the shift information corresponding to the luminance component and the shift information corresponding to the chrominance component, as shown in Table 3:
- alf_luma_shift_num_minus6[i] is the coefficient shift of the adaptive correction filter corresponding to the image luminance component sample.
- the luminance component may correspond to multiple adaptive correction filters, for example, 16 or 64.
- alf_coeff_luma[i][j] represents the j-th ALF coefficient of the i-th adaptive correction filter corresponding to the luminance component, and the j-th ALF coefficient can be understood as the above-mentioned second ALF coefficient.
- alf_luma_shift_num_minus6[i] represents the quantization scale shift value of the i-th adaptive correction filter corresponding to the luminance component minus 6, where 6 can be understood as the above-mentioned first value.
- AlfLumaShift[i] = alf_luma_shift_num_minus6[i] + 6.
- alf_coeff_chroma[0][j] represents the jth ALF parameter of the adaptive correction filter corresponding to the Cb component.
- the jth ALF coefficient can be understood as the above-mentioned second ALF coefficient.
- alf_chroma_shift_num_minus6[0] represents the quantization scale shift value of the adaptive correction filter corresponding to the Cb component minus 6.
- AlfChromaShift[0] = alf_chroma_shift_num_minus6[0] + 6.
- alf_coeff_chroma[1][j] represents the jth ALF parameter of the adaptive correction filter corresponding to the Cr component.
- the jth ALF coefficient can be understood as the above-mentioned second ALF coefficient.
- alf_chroma_shift_num_minus6[1] represents the quantization scale shift value of the adaptive correction filter corresponding to the Cr component minus 6.
- AlfChromaShift[1] = alf_chroma_shift_num_minus6[1] + 6.
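The derivation of the shift values from the syntax elements above can be sketched in Python as follows. The function name `derive_shifts` and the list-based representation of the parsed syntax elements are assumptions made for illustration; the code only mirrors the relations stated in the text (shift = transmitted value + 6, and a constant shift of 6 when AlfShiftEnableFlag is 0).

```python
FIRST_VALUE = 6  # the "first value" used in the *_shift_num_minus6 syntax elements


def derive_shifts(alf_shift_enable_flag,
                  alf_luma_shift_num_minus6,
                  alf_chroma_shift_num_minus6):
    """Return (AlfLumaShift, AlfChromaShift) as lists of shift values."""
    if not alf_shift_enable_flag:
        # Variable quantization scale not used: every filter keeps the shift of 6.
        return ([FIRST_VALUE] * len(alf_luma_shift_num_minus6),
                [FIRST_VALUE] * len(alf_chroma_shift_num_minus6))
    luma = [v + FIRST_VALUE for v in alf_luma_shift_num_minus6]
    chroma = [v + FIRST_VALUE for v in alf_chroma_shift_num_minus6]
    return luma, chroma


# Example: three luma filters and the two chroma filters (Cb, Cr).
luma_shift, chroma_shift = derive_shifts(1, [0, 1, 2], [0, 1])
```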
- the present application further includes: filtering the target region of the reconstructed image under the first component using the integer-type ALF coefficients of the first ALF.
- $I'_{rec(0,0)}$ is the filtered reconstructed pixel value of the current point (0, 0) in the target area
- (x, y) is the position relative to the current point
- $W_j$ is the j-th integer-type ALF coefficient of the first ALF
- $W_{n-1}$ is the integer-type ALF coefficient corresponding to the center point of the first ALF
- $I_{rec(0,0)}$ is the reconstructed pixel value of the current point
- Alfshift is the target shift value corresponding to the target quantization scale
- n is the number of ALF coefficients included in the first ALF.
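The quantities defined above can be put together in a short Python sketch of filtering a single pixel. The filtering formula itself and the (x, y)-to-$W_j$ tables are not reproduced in this passage, so the sketch makes several assumptions that should be checked against them: the taps are point-symmetric (each $W_j$ with j < n-1 weights the pair at (x, y) and (-x, -y)), a rounding offset of half the scale is added before the right shift, and the order of `OFFSETS_9` is hypothetical. Border handling and clipping are omitted.

```python
# Hypothetical tap order for the 7x7 cross plus 3x3 square shape (9 coefficients).
OFFSETS_9 = [(0, -3), (0, -2), (-1, -1), (0, -1), (1, -1), (-3, 0), (-2, 0), (-1, 0)]


def alf_filter_pixel(img, cx, cy, offsets, weights, alf_shift):
    """Filter the pixel at (cx, cy); weights[-1] is the center coefficient W_{n-1}."""
    acc = weights[-1] * img[cy][cx]
    for (x, y), w in zip(offsets, weights[:-1]):
        # Assumed point symmetry: one coefficient serves two opposite taps.
        acc += w * (img[cy + y][cx + x] + img[cy - y][cx - x])
    # Assumed rounding offset, then division by the quantization scale.
    return (acc + (1 << (alf_shift - 1))) >> alf_shift


# Example: on a flat area, coefficients whose taps sum to 2**6 leave the pixel unchanged.
flat = [[100] * 7 for _ in range(7)]
w = [1, 2, 3, 4, 5, -1, -2, -3, 46]  # 2 * 9 + 46 == 64
filtered = alf_filter_pixel(flat, 3, 3, OFFSETS_9, w, 6)
```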
- when the shape of the first ALF is a 7×7 cross plus a 3×3 square as shown in FIG. 5A, the first ALF includes 9 ALF coefficients, and the correspondence between the values of (x, y) and $W_j$ is shown in Table 4:
- when the shape of the first ALF is a 7×7 cross plus a 5×5 square as shown in FIG. 5B, the first ALF includes 15 ALF coefficients, and the correspondence between the values of (x, y) and $W_j$ is shown in Table 5:
- the encoder can first encode the integer-type ALF coefficients of the first ALF and then use them to filter the target area of the reconstructed image under the first component; or first use the integer-type ALF coefficients of the first ALF to filter the target area of the reconstructed image under the first component and then encode them; or encode the integer-type ALF coefficients of the first ALF while using them to filter the target area of the reconstructed image under the first component.
- the floating-point ALF coefficients of the first ALF are determined, and the maximum quantization scale of the floating-point type ALF coefficients of the first ALF is determined according to these floating-point ALF coefficients.
- according to the maximum quantization scale, the target quantization scale of the floating-point type ALF coefficients of the first ALF is determined.
- using the target quantization scale, the ALF coefficients of the floating point type of the first ALF are quantized into ALF coefficients of the integer type.
- compared with the current use of a fixed quantization scale to quantize the floating-point ALF coefficients of the first ALF, the present application takes into account that the range of floating-point ALF coefficients derived from different regions or different frames may differ, so that the filter gain may also differ. A variable quantization scale is therefore used to quantize the ALF coefficients corresponding to different target regions, so that the quantized ALF coefficients achieve a better balance between the overhead of coding the filter coefficients and the filter gain.
- Tongjia 4K represents the test video with 4K resolution, whose resolution is 3840x2160
- Tongyi 1080p represents the test video with 1080p resolution, whose resolution is 1920x1080
- Tong Bing 720p represents the test video with 720p resolution, whose resolution is 1280x720.
- EncT is the ratio of the encoding time of the application to the encoding time of the original algorithm
- DecT is the ratio of the decoding time of the application to the decoding time of the original algorithm.
- BD-Rate is one of the main parameters for evaluating the performance of a video coding algorithm. It indicates the change in bit rate and PSNR (Peak Signal to Noise Ratio) of the video encoded by the new algorithm (that is, the technical solution of the present application) relative to the original algorithm, that is, the change in bit rate between the new algorithm and the original algorithm at the same signal-to-noise ratio.
- "-" indicates performance improvement, such as bit rate and PSNR performance improvement.
- as shown in Table 6, under the All Intra configuration, the average changes of BD-rate on the Y, Cb, and Cr components are -0.14%, 0.01%, and -0.00%, respectively.
- the test sequences required by AVS3 are tested under the Random Access configuration (both intra-frame and inter-frame), and the test results are shown in Table 7:
- the encoding end adds very little computational complexity and the complexity of the decoding end remains unchanged, which brings additional performance gain to the ALF design.
- FIG. 6 is another schematic flowchart of a video encoding method 600 provided by an embodiment of the present application. Taking the first component as a luminance component as an example, as shown in FIG. 6 , it includes:
- according to a preset area division rule, perform area division on the reconstructed image under the luminance component to obtain a target area to be filtered.
- the preset area division rule is to divide the reconstructed image under the luminance component into 16 areas, so that the reconstructed image under the luminance component is divided into 16 areas according to a preset division method (eg, uniform division).
- the preset area division rule is to divide the reconstructed image under the luminance component into 64 areas, so that the reconstructed image under the luminance component is divided into 64 areas according to a preset division method (eg, uniform division).
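The uniform division mentioned above can be illustrated with a minimal Python sketch that maps a pixel position to the index of its region. It assumes a plain pixel-level grid of `grid` × `grid` equal cells (4 for 16 areas, 8 for 64 areas); an actual codec may divide in units of coding blocks instead, and the function name `region_index` is illustrative.

```python
def region_index(x, y, width, height, grid=4):
    """Return the index (row-major) of the uniform grid cell containing pixel (x, y)."""
    col = min(x * grid // width, grid - 1)
    row = min(y * grid // height, grid - 1)
    return row * grid + col


# Example on a 1920x1080 luminance plane.
top_left = region_index(0, 0, 1920, 1080)                    # first of 16 areas
bottom_right = region_index(1919, 1079, 1920, 1080)          # last of 16 areas
bottom_right_64 = region_index(1919, 1079, 1920, 1080, grid=8)  # last of 64 areas
```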
- S604 Determine the maximum quantization scale of the ALF coefficients of the floating-point number type of the first ALF according to the ALF coefficients of the floating-point number type of the first ALF.
- S605. Determine, according to the maximum quantization scale, the target quantization scale of the ALF coefficients of the floating point type of the first ALF.
- S607 Encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
- S608 and S607 need not be executed in a fixed order: S608 may be executed before S607, after S607, or simultaneously with S607.
- the present application may only use the variable quantization scale of the present application to perform ALF coefficient quantization on the luminance component of the reconstructed image, while the chrominance component of the reconstructed image still uses the current fixed quantization scale, such as $2^6$.
- the encoder carries the variable quantization scale information corresponding to the luminance component in the code stream and sends it to the decoder, but does not carry the quantization scale corresponding to the chrominance component in the code stream.
- the definition of adaptive correction filtering parameters is shown in Table 8.
- the decoding end parses the variable quantization scale information corresponding to the luminance component according to Table 8. This information can be the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value.
- in this way, the decoding end can determine the first ALF coefficient of the first ALF according to the second ALF coefficients corresponding to the luminance component (i.e., alf_coeff_luma[i][j]) and this absolute value, and then use the first ALF to filter the target area in the reconstructed image under the luminance component according to the first ALF coefficient and the second ALF coefficients of the first ALF, so as to improve the filtering accuracy.
- because the decoding end does not parse variable quantization scale information corresponding to the chrominance component according to Table 8, the decoding end uses fixed quantization scale information, such as a fixed shift value of 6, to filter the reconstructed image under the chrominance component.
- for example, the decoding end can determine the first ALF coefficient of the first ALF according to the second ALF coefficients corresponding to the chrominance component in Table 8 (i.e., alf_coeff_chroma[0][j] or alf_coeff_chroma[1][j]) and the fixed shift value 6, and then use the first ALF to filter the target area in the reconstructed image under the chrominance component according to the first ALF coefficient and the second ALF coefficients of the first ALF, so as to improve the filtering accuracy.
- FIG. 7 is another schematic flowchart of a video encoding method 700 provided by an embodiment of the present application. Taking the first component as a chrominance component as an example, as shown in FIG. 7, the method includes:
- the reconstructed image includes chrominance components, wherein the chrominance components include a first chrominance component and/or a second chrominance component, the first chrominance component being, for example, U or Cb, and the second chrominance component being, for example, V or Cr.
- S703 Determine the ALF coefficient of the floating point type of the first ALF when the target area under the chrominance component is filtered using the first ALF.
- S704. Determine the maximum quantization scale of the ALF coefficients of the floating-point number type of the first ALF according to the ALF coefficients of the floating-point number type of the first ALF.
- S705. Determine, according to the maximum quantization scale, the target quantization scale of the ALF coefficients of the floating point type of the first ALF.
- S707 Encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
- S708 and S707 need not be executed in a fixed order: S708 may be executed before S707, after S707, or simultaneously with S707.
- the present application may only use the variable quantization scale of the present application to perform ALF coefficient quantization on the first chrominance component of the reconstructed image, while the second chrominance component and the luminance component of the reconstructed image still use the current fixed quantization scale, e.g. $2^6$.
- the encoder carries the variable quantization scale information corresponding to the first chrominance component in the code stream and sends it to the decoder, but does not carry the quantization scale corresponding to the second chrominance component and the luminance component in the code stream.
- the definition of adaptive correction filtering parameters is shown in Table 9:
- the decoding end parses the variable quantization scale information corresponding to the first chrominance component according to Table 9, determines the ALF coefficient of the first ALF according to this information, and uses the first ALF with the determined ALF coefficient to filter the reconstructed image under the first chrominance component, so as to improve the filtering precision.
- the decoding end uses an existing method to determine the ALF coefficient corresponding to the second chrominance component and/or the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the second chrominance component and/or the luminance component.
- the present application may only use the variable quantization scale of the present application to perform ALF coefficient quantization on the second chrominance component of the reconstructed image, while the first chrominance component and the luminance component of the reconstructed image still use the current fixed quantization scale, e.g. $2^6$.
- the encoder carries the variable quantization scale information corresponding to the second chrominance component in the code stream and sends it to the decoder, but does not carry the quantization scale corresponding to the first chrominance component and the luminance component in the code stream.
- the definition of adaptive correction filtering parameters is shown in Table 10:
- the decoding end parses the variable quantization scale information corresponding to the second chrominance component according to Table 10, determines the ALF coefficient of the first ALF according to this information, and uses the first ALF with the determined ALF coefficient to filter the reconstructed image under the second chrominance component, so as to improve the filtering precision.
- the decoding end uses an existing method to determine the ALF coefficient corresponding to the first chrominance component and/or the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the first chrominance component and/or the luminance component.
- the present application may only perform ALF coefficient quantization on the chrominance components of the reconstructed image (including the first chrominance component and the second chrominance component) using the variable quantization scale of the present application, while the luminance component of the reconstructed image still uses the current fixed quantization scale, for example, $2^6$.
- the encoder carries the variable quantization scale information corresponding to the chrominance component in the code stream and sends it to the decoder, but does not carry the quantization scale corresponding to the luminance component in the code stream.
- the definition of adaptive correction filtering parameters is shown in Table 11.
- the decoding end parses out the variable quantization scale information corresponding to the first chrominance component and the second chrominance component from Table 11.
- the ALF coefficient corresponding to the first chrominance component is determined according to the variable quantization scale information corresponding to the first chrominance component, and the reconstructed image under the first chrominance component is filtered according to the determined ALF coefficient.
- the ALF coefficient corresponding to the second chrominance component is determined according to the variable quantization scale information corresponding to the second chrominance component, and the reconstructed image under the second chrominance component is filtered according to the determined ALF coefficient.
- the decoding end uses the existing method to determine the ALF coefficient corresponding to the luminance component, and uses the determined ALF coefficient to filter the reconstructed image under the luminance component.
- the video encoding method involved in the embodiments of the present application is described above. Based on this, the following describes the video decoding method involved in the present application for the decoding end.
- FIG. 8 is a schematic flowchart of a video decoding method 800 provided by an embodiment of the present application. As shown in FIG. 8 , the method of the embodiment of the present application includes:
- the entropy decoding unit 310 in the decoder can parse the code stream to obtain the prediction information, quantization coefficient matrix, etc. of the current block in the current image, and the prediction unit 320 uses intra prediction or inter prediction to produce a prediction block for the current block.
- the inverse quantization/transform unit 330 performs inverse quantization and inverse transformation on the quantization coefficient matrix obtained from the code stream to obtain a residual block.
- the reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block.
- in the same way, the reconstructed blocks of the other image blocks in the current image can be obtained, and the reconstructed blocks together constitute a reconstructed image.
- the reconstructed image includes a first component, which is a luminance component or a chrominance component.
- when the adaptive correction filter switch corresponding to the target area of the reconstructed image under the first component is on, for example, the control signal of the switch is 1, the target area of the reconstructed image under the first component is filtered using the first ALF. If the adaptive correction filter switch corresponding to the target area of the reconstructed image under the first component is off, for example, the control signal of the switch is 0, the target area of the reconstructed image under the first component is not subjected to adaptive correction filtering.
- the code stream is decoded to obtain the ALF coefficient information of the first ALF carried in the code stream.
- the target region in the reconstructed image under the first component is filtered using the first ALF according to the ALF coefficient information of the first ALF.
- the ALF coefficient information of the first ALF includes all integer-type ALF coefficients of the first ALF.
- the decoding end directly uses the first ALF to filter the target region in the reconstructed image under the first component according to all integer-type ALF coefficients of the first ALF decoded, and the filtering process is simple.
- the ALF coefficient information of the first ALF includes the integer-type second ALF coefficients of the first ALF and the first information, wherein the second ALF coefficients are the integer-type ALF coefficients corresponding to non-center points in the first ALF, and the first information is used to indicate the target quantization scale.
- the first information includes a target shift value corresponding to the target quantization scale.
- the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value.
- the above S804 includes the following steps S804-A1 and S804-A2:
- the code stream includes the second ALF coefficients of the first ALF and the target quantization scale information, so that the ALF coefficient at the center position of the first ALF can be determined according to the second ALF coefficients of the first ALF and the target quantization scale information.
- the first ALF coefficient of the first ALF can be determined according to the following formula (12):
- AlfCoeffLuma[i][k] is the first ALF coefficient of the first ALF, and k is the number of ALF coefficients included in the first ALF minus one.
- AlfLumaShift[i] is the target quantization scale corresponding to the first ALF
- AlfCoeffLuma[i][j ] is the jth second ALF coefficient in the first ALF.
- the decoder decodes the second ALF coefficients of the first ALF from the code stream, determines the first ALF coefficient according to formula (12), and then uses the first ALF with the first ALF coefficient and the second ALF coefficients to filter the target area in the reconstructed image under the first component, thereby improving the accuracy of the reconstructed image.
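The derivation of the center coefficient at the decoding end can be illustrated with a minimal Python sketch. Formula (12) is not reproduced in this passage, so the sketch rests on an assumption that should be checked against it: the filter taps are point-symmetric and sum to the quantization scale, so that the center coefficient equals $2^{\mathrm{AlfLumaShift}[i]}$ minus twice the sum of the non-center coefficients. The function name `derive_center_coeff` is illustrative.

```python
def derive_center_coeff(second_coeffs, alf_shift):
    """Assumed rule: center = 2**alf_shift - 2 * sum(non-center coefficients)."""
    return (1 << alf_shift) - 2 * sum(second_coeffs)


# Example: eight decoded non-center coefficients of a 9-coefficient filter.
second = [1, 2, 3, 4, 5, -1, -2, -3]
center_at_6 = derive_center_coeff(second, 6)  # scale 64
center_at_7 = derive_center_coeff(second, 7)  # scale 128
```

Under this assumption only the non-center coefficients and the shift need to be transmitted, which matches the text's statement that the code stream carries the second ALF coefficients and the first information.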
- the decoder filters the target region in the reconstructed image under the first component according to the following formula (13):
- $I'_{rec(0,0)}$ is the filtered reconstructed pixel value of the current point (0, 0) in the target area
- (x, y) is the position relative to the current point
- $W_j$ is the j-th integer-type ALF coefficient of the first ALF
- $W_{n-1}$ is the integer-type ALF coefficient corresponding to the center point of the first ALF
- $I_{rec(0,0)}$ is the reconstructed pixel value of the current point
- Alfshift is the target shift value corresponding to the target quantization scale
- n is the number of ALF coefficients included in the first ALF.
- on the one hand, the filtered reconstructed image can be sent to the display device for display; on the other hand, the filtered reconstructed image can be stored in the decoded image buffer to serve as a reference frame for inter-frame prediction of subsequent frames.
- FIG. 9 is a schematic flowchart of a video decoding method 900 provided by an embodiment of the present application.
- the method of the embodiment of the present application includes:
- S903 Divide the reconstructed image under the luminance component into regions to obtain a target area of the reconstructed image under the luminance component, where the target area is an area using adaptive correction filtering.
- the ALF coefficient information of the first ALF includes an integer-type second ALF coefficient of the first ALF and the first information, wherein the second ALF coefficient is an integer-type ALF coefficient corresponding to a non-center point in the first ALF,
- the first information is used to indicate the target quantization scale.
- obtain the shape information of the first ALF by parsing the code stream, and determine the type of the first ALF according to the shape information of the first ALF. If the shape of the first ALF is a 7×7 cross plus a 3×3 square, the first ALF includes 9 ALF coefficients; if the shape of the first ALF is a 7×7 cross plus a 5×5 square, the first ALF includes 15 ALF coefficients.
- S906 Determine the first ALF coefficient of the first ALF according to the second ALF coefficient of the first ALF and the target quantization scale, where the first ALF coefficient is an integer type ALF coefficient corresponding to the center point of the first ALF.
- AlfCoeffLuma[i][8] is the first ALF coefficient of the first ALF
- i is the number of ALFs corresponding to the luminance component, such as 16 or 64
- AlfCoeffLuma[i][j] is the j-th second ALF coefficient of the i-th first ALF
- alf_filter_num_minus1 is the number of ALFs corresponding to the luminance component minus one
- alf_filter_num_minus1 can be understood as p.
- the bit width of AlfCoeffLuma[i][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[i][8] is 0 to 127.
- the decoder filters the target area in the reconstructed image under the first component according to the following formula (16):
- $I'_{rec(0,0)}$ is the filtered reconstructed pixel value of the current point (0, 0) in the target area
- (x, y) is the position relative to the current point
- $W_j$ is the j-th second ALF coefficient of the first ALF
- $W_{n-1}$ is the first ALF coefficient
- $I_{rec(0,0)}$ is the reconstructed pixel value of the current point
- Alfshift is the target shift value corresponding to the target quantization scale
- n is the number of ALF coefficients included in the first ALF.
- FIG. 10 is a schematic flowchart of a video decoding method 1000 provided by an embodiment of the present application.
- the method of the embodiment of the present application includes:
- the ALF coefficient information of the first ALF includes an integer-type second ALF coefficient of the first ALF and the first information, wherein the second ALF coefficient is an integer-type ALF coefficient corresponding to a non-center point in the first ALF,
- the first information is used to indicate the target quantization scale.
- S160 Determine the first ALF coefficient of the first ALF according to the second ALF coefficient of the first ALF and the target quantization scale, where the first ALF coefficient is an integer type ALF coefficient corresponding to the center point of the first ALF.
- the first ALF coefficient of the first ALF is determined according to the following method.
- AlfCoeffLuma[0][8] is the first ALF coefficient of the first ALF
- AlfCoeffLuma[0][j] is the jth ALF coefficient of the first ALF
- j = 0 to 7.
- the bit width of AlfCoeffLuma[0][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[0][8] is 0 to 127.
- the first ALF coefficient of the first ALF is determined according to the following method.
- AlfCoeffLuma[1][8] is the first ALF coefficient of the first ALF
- AlfCoeffLuma[1][j] is the jth ALF coefficient of the first ALF
- j = 0 to 7.
- the bit width of AlfCoeffLuma[1][j] is 7 bits, the value range is -64 to 63, and the value range of AlfCoeffLuma[1][8] is 0 to 127.
- on the one hand, the filtered reconstructed image can be sent to the display device for display; on the other hand, the filtered reconstructed image can be stored in the decoded image buffer to serve as a reference frame for inter-frame prediction of subsequent frames.
- FIG. 4 to FIG. 10 are only examples of the present application, and should not be construed as limiting the present application.
- the present application determines that the quantization scale corresponding to the luminance component is larger than the quantization scale corresponding to the chrominance component. For example, the quantization scale corresponding to the luminance component is fixed to $2^7$ and the quantization scale corresponding to the chrominance component is fixed to $2^6$. Adjusting the fixed quantization scales in this way can also improve filtering performance, and no bit overhead is required to encode a variable quantization scale.
- the sequence numbers of the above-mentioned processes do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
- the term "and/or" is only an association relationship for describing associated objects, indicating that there may be three kinds of relationships. Specifically, A and/or B can represent three situations: A exists alone, A and B exist at the same time, and B exists alone.
- the character "/" in this document generally indicates that the related objects are an "or" relationship.
- FIG. 11 is a schematic block diagram of a video encoder 10 provided by an embodiment of the present application.
- the video encoder 10 includes:
- an obtaining unit 110 configured to obtain a reconstructed image of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
- a first determining unit 120 configured to determine, when filtering the target region in the reconstructed image under the first component using the first adaptive correction filter ALF, an ALF coefficient of the floating point type of the first ALF ;
- a second determining unit 130 configured to determine the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF according to the ALF coefficients of the floating point type of the first ALF;
- a third determining unit 140 configured to determine, according to the maximum quantization scale, a target quantization scale of the floating-point type ALF coefficient of the first ALF, where the target quantization scale is less than or equal to the maximum quantization scale;
- a quantization unit 150 configured to use the target quantization scale to quantize the floating-point ALF coefficients of the first ALF into integer-type ALF coefficients
- the encoding unit 160 is configured to encode the integer-type ALF coefficients of the first ALF to obtain a code stream.
- the second determining unit 130 is specifically configured to determine, according to a first coefficient in the floating-point type ALF coefficients of the first ALF and a maximum integer coefficient threshold corresponding to the first coefficient, the maximum shift value allowed when the ALF coefficients of the floating point type are quantized into the integer type; and to determine, according to the maximum shift value, the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF.
- the first coefficient is a floating-point type coefficient corresponding to the center position of the first ALF.
- the second determining unit 130 is specifically configured to determine the maximum shift value according to the following formula:
- the d is 127.
- the second determining unit 130 is specifically configured to determine the maximum quantization scale of the ALF coefficients of the floating point type of the first ALF according to the following formula:
- Scale max is the maximum quantization scale.
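The two formulas referred to above are not reproduced in this text, so the following is only a minimal sketch of one plausible reading: the maximum shift is assumed to be the largest shift for which the scaled center coefficient still fits within the threshold d, and the maximum quantization scale is assumed to be 2 raised to that shift. The function names and the exact rule are illustrative assumptions, not the formulas of the application.

```python
def max_shift_value(center_coeff: float, max_int_coeff: int = 127) -> int:
    """Largest shift s such that center_coeff * 2**s <= max_int_coeff.

    Assumed rule (the source formula is missing); max_int_coeff plays the
    role of the threshold d, which the text gives as 127.
    """
    if center_coeff <= 0:
        raise ValueError("center coefficient must be positive")
    shift = 0
    # Multiplying by a power of two is exact in floating point, so this
    # comparison avoids the rounding issues of a log2-based formula.
    while center_coeff * (1 << (shift + 1)) <= max_int_coeff:
        shift += 1
    return shift


def max_quantization_scale(center_coeff: float, max_int_coeff: int = 127) -> int:
    """Maximum quantization scale, assumed here to be 2 ** (maximum shift)."""
    return 1 << max_shift_value(center_coeff, max_int_coeff)
```

For instance, under these assumptions a center coefficient of 1.0 allows a shift of at most 6, because 1.0 × 128 would exceed 127.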
- the third determining unit 140 is specifically configured to determine the first quantization scale as the target quantization scale of the floating-point ALF coefficients of the first ALF, where the first quantization scale is smaller than or equal to the maximum quantization scale.
- the third determining unit 140 is further configured to determine the second quantization scale as the target quantization scale of the floating-point ALF coefficients of the second ALF, where the second ALF is a filter used when performing ALF filtering on the reconstructed image under the chrominance component, and the second quantization scale is smaller than the first quantization scale.
- the third determining unit 140 is specifically configured to determine the third quantization scale as the target quantization scale of the floating-point ALF coefficients of the first ALF, where the third quantization scale is smaller than the maximum quantization scale.
- the third determining unit 140 is further configured to determine the fourth quantization scale as the target quantization scale of the floating-point ALF coefficients of the third ALF, where the third ALF is a filter used when performing ALF filtering on the reconstructed image, and the fourth quantization scale is larger than the third quantization scale.
- the third determining unit 140 is specifically configured to determine, for each first quantization scale in the quantization interval formed by the maximum quantization scale and the preset minimum quantization scale, the quantization cost of that first quantization scale, where each first quantization scale is a positive integer power of 2; and to determine, according to the quantization cost of each first quantization scale, the target quantization scale of the floating-point ALF coefficients of the first ALF.
- the third determining unit 140 is specifically configured to determine the first quantization scale with the smallest quantization cost as the target quantization scale of the floating-point ALF coefficients of the first ALF.
- the third determining unit 140 is specifically configured to use the first quantization scale to quantize the floating-point ALF coefficients of the first ALF into first ALF coefficients of integer type; determine the quantization distortion result corresponding to the first ALF coefficients when the first ALF coefficients are used to encode the target area of the reconstructed image; and determine the quantization cost of the first quantization scale according to the quantization distortion result corresponding to the first ALF coefficients and the number of bits consumed by encoding the first ALF coefficients.
- the third determining unit 140 is specifically configured to determine the autocorrelation coefficient of the pixels in the target area according to the reconstructed pixel values of the pixels in the target area; determine the cross-correlation coefficient of the pixels in the target area according to the original pixel values of the pixels in the target area; and determine the quantization distortion result corresponding to the first ALF coefficients according to the product of the autocorrelation coefficient of the pixels in the target area and the first ALF coefficients, and the cross-correlation coefficient of the pixels in the target area.
- the third determining unit 140 is specifically configured to use the first ALF coefficients to filter the target area of the reconstructed image; and determine the quantization distortion result corresponding to the first ALF coefficients according to the difference between the filtered pixel values and the original pixel values of the pixels in the target area.
- the third determining unit 140 is specifically configured to determine the quantization cost of the first quantization scale according to the following formula: $J = D + \lambda R$
- J is the quantization cost of the first quantization scale
- D is the quantization distortion result corresponding to the first ALF coefficients
- R is the number of bits consumed by encoding the first ALF coefficients
- λ is a variable value.
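The scale search described above can be sketched as follows. This is an illustration under stated assumptions rather than the encoder of the application: the distortion is modelled as c^T E c - 2 c^T y, where E is the autocorrelation matrix and y the cross-correlation vector (a common Wiener-filter form, assumed here), the bit count uses a signed Exp-Golomb length as a stand-in for the real entropy coder, and all function names are hypothetical.

```python
import math


def quantize(float_coeffs, scale):
    """Quantize floating-point coefficients to integers (round half up)."""
    return [math.floor(c * scale + 0.5) for c in float_coeffs]


def se_golomb_bits(value):
    """Bit length of a signed order-0 Exp-Golomb code (assumed bit estimate)."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return 2 * (code_num + 1).bit_length() - 1


def distortion(int_coeffs, scale, autocorr, crosscorr):
    """Assumed distortion model: c^T E c - 2 c^T y with c = int_coeffs / scale."""
    c = [w / scale for w in int_coeffs]
    n = len(c)
    quad = sum(c[i] * autocorr[i][j] * c[j] for i in range(n) for j in range(n))
    lin = sum(c[i] * crosscorr[i] for i in range(n))
    return quad - 2.0 * lin


def select_target_scale(float_coeffs, autocorr, crosscorr,
                        max_scale, min_scale=1, lam=1.0):
    """Try every power-of-two scale in [min_scale, max_scale]; keep min J = D + lam * R."""
    best = None
    scale = min_scale
    while scale <= max_scale:
        q = quantize(float_coeffs, scale)
        d = distortion(q, scale, autocorr, crosscorr)
        r = sum(se_golomb_bits(w) for w in q)
        cost = d + lam * r
        if best is None or cost < best[1]:
            best = (scale, cost, q)
        scale *= 2
    return best  # (target scale, its cost J, integer coefficients)
```

With a small λ the search favours the finest scale, because distortion dominates; with a large λ it falls back to a coarse scale whose coefficients are cheaper to code, which is the trade-off the quantization cost is meant to capture.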
- the codestream includes all integer-type ALF coefficients of the first ALF.
- the code stream includes integer-type second ALF coefficients of the first ALF and first information, where the second ALF coefficients are the integer-type ALF coefficients corresponding to the non-center points in the first ALF, and the first information is used to indicate the target quantization scale.
- the first information includes a target shift value corresponding to the target quantization scale.
- the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value.
- the first numerical value is 6, 5 or 4.
- the encoding unit 160 is further configured to use the integer-type ALF coefficients of the first ALF to filter the target area of the reconstructed image under the first component.
- the encoding unit 160 is specifically configured to filter the target area of the reconstructed image under the first component according to the following formula:
- $I'_{rec(0,0)}$ is the filtered reconstructed pixel value of the current point (0,0) in the target area
- (x, y) is the position relative to the current point
- $W_j$ is the j-th integer-type ALF coefficient of the first ALF
- $W_{n-1}$ is the integer-type ALF coefficient corresponding to the center point of the first ALF
- $I_{rec(0,0)}$ is the reconstructed pixel value of the current point
- Alfshif is the target shift value corresponding to the target quantization scale
- n is the number of ALF coefficients included in the first ALF.
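The filtering formula itself is not reproduced in this text. The sketch below shows how a filter with the symbols defined above is commonly applied, assuming a point-symmetric filter in which each non-center coefficient weights the pair of pixels at (x, y) and (-x, -y), followed by a rounding offset and a right shift by the target shift value. The pairing, the rounding offset, and the function name are assumptions; clipping and picture-boundary handling are omitted.

```python
def alf_filter_pixel(img, cx, cy, offsets, weights, shift):
    """Filter one pixel with integer ALF coefficients (illustrative sketch).

    img     : 2-D list of reconstructed pixel values, indexed img[y][x]
    offsets : (dx, dy) positions of the n-1 non-center taps
    weights : n integer coefficients; weights[-1] is the center coefficient
    shift   : target shift value corresponding to the target quantization scale
    The caller must keep every tap inside the image (no border handling here).
    """
    if len(offsets) != len(weights) - 1:
        raise ValueError("expected one offset per non-center coefficient")
    acc = 0
    for (dx, dy), w in zip(offsets, weights[:-1]):
        # Assumed symmetry: one coefficient covers two mirrored positions.
        acc += w * (img[cy + dy][cx + dx] + img[cy - dy][cx - dx])
    acc += weights[-1] * img[cy][cx]
    if shift > 0:
        acc += 1 << (shift - 1)  # rounding offset (assumed)
    return acc >> shift
```

When the coefficients sum to the quantization scale (twice the non-center coefficients plus the center coefficient equals 2 to the power of the shift), a flat region passes through unchanged, which is a quick sanity check for any chosen scale.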
- the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here.
- the video encoder 10 shown in FIG. 11 can perform the methods of the embodiments of the present application, and the aforementioned and other operations and/or functions of the various units in the video encoder 10 are for implementing the methods 400, 600, and 700, respectively. For the sake of brevity, the corresponding processes in the method will not be repeated here.
- FIG. 12 is a schematic block diagram of a video decoder 20 provided by an embodiment of the present application.
- the video decoder 20 may include:
- the first decoding unit 210 is used for decoding the code stream to obtain the residual value of the current image
- a determining unit 220 configured to determine a reconstructed image of the current image according to the residual value of the current image, where the reconstructed image includes a first component, and the first component is a luminance component or a chrominance component;
- the second decoding unit 230 is configured to decode the code stream to obtain the ALF coefficient information of the first adaptive correction filter ALF corresponding to the target area in the reconstructed image under the first component;
- the filtering unit 240 is configured to use the first ALF to filter the target area in the reconstructed image under the first component according to the ALF coefficient information of the first ALF.
- the ALF coefficient information of the first ALF includes all integer-type ALF coefficients of the first ALF.
- the ALF coefficient information of the first ALF includes integer-type second ALF coefficients of the first ALF and first information, where the second ALF coefficients are the integer-type ALF coefficients corresponding to the non-center points in the first ALF, and the first information is used to indicate the target quantization scale.
- the first information includes a target shift value corresponding to the target quantization scale.
- the first information includes the absolute value of the difference between the target shift value corresponding to the target quantization scale and the first numerical value.
- the first numerical value is 6, 5 or 4.
- the filtering unit 240 is specifically configured to determine the first ALF coefficient of the first ALF according to the second ALF coefficients of the first ALF and the target quantization scale, where the first ALF coefficient is the integer-type ALF coefficient corresponding to the center point of the first ALF; and, according to the second ALF coefficients and the first ALF coefficient of the first ALF, use the first ALF to filter the target region in the reconstructed image under the first component.
- the filtering unit 240 is specifically configured to determine the first ALF coefficient of the first ALF according to the following formula:
- the AlfCoeffLuma[i][k] is the first ALF coefficient of the i-th first ALF
- the k is the number of ALF coefficients included in the first ALF minus one
- the AlfLumaShift[i] is the target quantization scale corresponding to the first ALF
- the AlfCoeffLuma[i][j] is the jth second ALF coefficient in the first ALF.
- the i value ranges from 0 to p
- the p is the number of ALFs corresponding to the luminance component minus one.
- the p is 15 or 63.
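The formula for the center coefficient is not reproduced in this text. A minimal sketch of how a decoder could derive it is given below, assuming that the filter is point-symmetric with unit gain, so that twice the sum of the non-center coefficients plus the center coefficient equals 2 to the power of the signalled shift. That constraint, the interpretation of AlfLumaShift[i] as a shift value, and the function name are assumptions made for illustration.

```python
def derive_center_coeff(non_center_coeffs, alf_shift):
    """Derive the center (k-th) coefficient from the decoded non-center ones.

    Assumed constraint: 2 * sum(non-center) + center == 1 << alf_shift,
    so the center coefficient does not have to be transmitted.
    """
    return (1 << alf_shift) - 2 * sum(non_center_coeffs)


def complete_filter(non_center_coeffs, alf_shift):
    """Return the full coefficient list with the center coefficient appended last."""
    return list(non_center_coeffs) + [derive_center_coeff(non_center_coeffs, alf_shift)]
```

Deriving the center coefficient this way keeps the filter gain tied to the chosen quantization scale, which is why the code stream only needs the non-center coefficients and the first information.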
- the i is equal to the first value
- the first component is the Cr component
- the first value is 0 and the second value is 1.
- the second decoding unit 230 is specifically configured to parse and obtain the shape information of the first ALF from the code stream; and determine the type of the first ALF according to the shape information of the first ALF .
- the first ALF includes 9 ALF coefficients; if the shape of the first ALF is a 7x7 cross plus a 5x5 square filter shape, the first ALF includes 15 ALF coefficients.
- the filtering unit 240 is specifically configured to filter the target area in the reconstructed image under the first component according to the following formula:
- $I'_{rec(0,0)}$ is the filtered reconstructed pixel value of the current point (0,0) in the target area, and (x, y) is the position relative to the current point
- $W_j$ is the j-th integer-type ALF coefficient of the first ALF
- $W_{n-1}$ is the integer-type ALF coefficient corresponding to the center point of the first ALF
- $I_{rec(0,0)}$ is the reconstructed pixel value of the current point
- Alfshi is the target shift value corresponding to the target quantization scale
- n is the number of ALF coefficients included in the first ALF.
- the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here.
- the video decoder 20 shown in FIG. 12 may correspond to the subject that performs the method 800, 900 or 1000 of the embodiments of the present application, and the aforementioned and other operations and/or functions of the respective units in the video decoder 20 are for implementing the corresponding processes in the methods 800, 900 and 1000, respectively. For the sake of brevity, details are not repeated here.
- the functional unit may be implemented in the form of hardware, may also be implemented by an instruction in the form of software, or may be implemented by a combination of hardware and software units.
- the steps of the method embodiments in the embodiments of the present application may be completed by hardware integrated logic circuits in the processor and/or instructions in the form of software, and the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software units in the decoding processor.
- the software unit may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and other storage media mature in the art.
- the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
- FIG. 13 is a schematic block diagram of an electronic device 30 provided by an embodiment of the present application.
- the electronic device 30 may be the video encoder or the video decoder described in this embodiment of the application, and the electronic device 30 may include:
- the processor 32 can call and run the computer program 34 from the memory 33 to implement the method in the embodiment of the present application.
- the processor 32 may be configured to perform the steps of the method 200 described above according to instructions in the computer program 34 .
- the processor 32 may include, but is not limited to:
- Digital Signal Processor (DSP)
- Application Specific Integrated Circuit (ASIC)
- Field Programmable Gate Array (FPGA)
- the memory 33 includes but is not limited to:
- Non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory. Volatile memory may be random access memory (Random Access Memory, RAM), which acts as an external cache.
- static random access memory (Static RAM, SRAM)
- dynamic random access memory (Dynamic RAM, DRAM)
- synchronous dynamic random access memory (Synchronous DRAM, SDRAM)
- double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM)
- enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM)
- synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM)
- direct Rambus random access memory (Direct Rambus RAM, DR RAM)
- the computer program 34 may be divided into one or more units, and the one or more units are stored in the memory 33 and executed by the processor 32 to complete the procedures provided by the present application.
- the one or more units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program 34 in the electronic device 30 .
- the electronic device 30 may further include:
- a transceiver 33 which can be connected to the processor 32 or the memory 33 .
- the processor 32 can control the transceiver 33 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
- the transceiver 33 may include a transmitter and a receiver.
- the transceiver 33 may further include antennas, and the number of the antennas may be one or more.
- each component in the electronic device 30 is connected through a bus system, wherein the bus system includes a power bus, a control bus and a status signal bus in addition to a data bus.
- FIG. 14 is a schematic block diagram of a video encoding and decoding system 40 provided by an embodiment of the present application.
- the video encoding and decoding system 40 may include: a video encoder 41 and a video decoder 42 , wherein the video encoder 41 is used for executing the video encoding method involved in the embodiments of the present application, and the video decoder 42 is used for executing The video decoding method involved in the embodiments of the present application.
- the present application also provides a computer storage medium on which a computer program is stored, and when the computer program is executed by a computer, enables the computer to execute the methods of the above method embodiments.
- the embodiments of the present application further provide a computer program product including instructions, when the instructions are executed by a computer, the instructions cause the computer to execute the methods of the above method embodiments.
- the computer program product includes one or more computer instructions.
- the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
- the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media.
- the available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, digital video disc (DVD)), or semiconductor media (eg, solid state disk (SSD)), and the like.
- the disclosed system, apparatus and method may be implemented in other manners.
- the device embodiments described above are only illustrative.
- the division of the unit is only a logical function division.
- multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
- Units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Provided are a video encoding method and system, a video decoding method and system, a video encoder and a video decoder. The video encoding method comprises: determining, from the floating-point ALF coefficients of a first ALF, the maximum quantization scale of the floating-point ALF coefficients of the first ALF; determining, according to the maximum quantization scale, a target quantization scale of the floating-point ALF coefficients of the first ALF; using the target quantization scale to quantize the floating-point ALF coefficients of the first ALF into integer-type ALF coefficients; and encoding the integer-type ALF coefficients of the first ALF to obtain a code stream. A balance is achieved between the coding overhead of the filter coefficients and the gain brought by the filter, thereby improving video encoding and decoding performance.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311245327.0A CN117082239A (zh) | 2021-01-22 | 2021-01-22 | 视频编解码方法与系统、及视频编码器与视频解码器 |
CN202180090973.7A CN116746152A (zh) | 2021-01-22 | 2021-01-22 | 视频编解码方法与系统、及视频编码器与视频解码器 |
PCT/CN2021/073409 WO2022155922A1 (fr) | 2021-01-22 | 2021-01-22 | Procédé et système de codage vidéo, procédé et système de décodage vidéo, codeur vidéo et décodeur vidéo |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/073409 WO2022155922A1 (fr) | 2021-01-22 | 2021-01-22 | Procédé et système de codage vidéo, procédé et système de décodage vidéo, codeur vidéo et décodeur vidéo |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022155922A1 (fr) | 2022-07-28 |
Family
ID=82548401
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/073409 WO2022155922A1 (fr) | 2021-01-22 | 2021-01-22 | Procédé et système de codage vidéo, procédé et système de décodage vidéo, codeur vidéo et décodeur vidéo |
Country Status (2)
Country | Link |
---|---|
CN (2) | CN117082239A (fr) |
WO (1) | WO2022155922A1 (fr) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107211154A (zh) * | 2015-02-11 | 2017-09-26 | 高通股份有限公司 | 译码树单元级自适应环路滤波器 |
WO2020094154A1 (fr) * | 2018-11-09 | 2020-05-14 | Beijing Bytedance Network Technology Co., Ltd. | Améliorations apportées à un filtre à boucle adaptatif basé sur des régions |
WO2020126411A1 (fr) * | 2018-12-21 | 2020-06-25 | Canon Kabushiki Kaisha | Filtrage à boucle adaptatif (alf) avec écrêtage non linéaire |
CN111742552A (zh) * | 2019-06-25 | 2020-10-02 | 北京大学 | 环路滤波的方法与装置 |
CN111801941A (zh) * | 2018-03-09 | 2020-10-20 | 华为技术有限公司 | 用于利用自适应乘数系数进行图像滤波的方法及装置 |
-
2021
- 2021-01-22 WO PCT/CN2021/073409 patent/WO2022155922A1/fr active Application Filing
- 2021-01-22 CN CN202311245327.0A patent/CN117082239A/zh active Pending
- 2021-01-22 CN CN202180090973.7A patent/CN116746152A/zh active Pending
Non-Patent Citations (1)
Title |
---|
L.-H. XU (FUJITSU), J. YAO (FUJITSU), J.-Q. ZHU, K. KAZUI (FUJITSU): "Non-CE5: Adaptive precision for CCALF coefficients", 17. JVET MEETING; 20200107 - 20200117; BRUSSELS; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 31 December 2019 (2019-12-31), XP030223092 * |
Also Published As
Publication number | Publication date |
---|---|
CN117082239A (zh) | 2023-11-17 |
CN116746152A (zh) | 2023-09-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110720218B (zh) | 与视频译码中的变换处理一起应用的帧内滤波 | |
CN111226438B (zh) | 视频解码的方法及解码器 | |
CN110393010B (zh) | 视频译码中的帧内滤波旗标 | |
CN111327904B (zh) | 图像重建方法和装置 | |
JP7401542B2 (ja) | ピクチャのブロックのイントラ予測の方法 | |
JP7277586B2 (ja) | モードおよびサイズに依存したブロックレベル制限の方法および装置 | |
CN114424567B (zh) | 使用基于矩阵的帧内预测进行组合的帧间-帧内预测的方法和装置 | |
WO2023039859A1 (fr) | Procédé de codage vidéo, procédé de décodage vidéo, et dispositif, système et support de stockage | |
WO2019010305A1 (fr) | Remappage de couleurs pour contenu de vidéo de format non 4:4:4 | |
KR20210107131A (ko) | 이미지 예측 방법, 장치 및 시스템, 디바이스 및 저장 매체 | |
EP3890322A1 (fr) | Codeur-décodeur vidéo, et procédé correspondant | |
WO2024050723A1 (fr) | Procédé et appareil de prédiction d'image, et support d'enregistrement lisible par ordinateur | |
WO2022174475A1 (fr) | Procédé et système de codage vidéo, procédé et système de décodage vidéo, codeur vidéo et décodeur vidéo | |
WO2023044868A1 (fr) | Procédé de codage vidéo, procédé de décodage vidéo, dispositif, système et support de stockage | |
WO2022116105A1 (fr) | Procédé et système de codage vidéo, procédé et appareil de décodage vidéo, codeur vidéo et décodeur vidéo | |
EP3993420A1 (fr) | Codeur vidéo et procédé de réglage qp | |
WO2022155922A1 (fr) | Procédé et système de codage vidéo, procédé et système de décodage vidéo, codeur vidéo et décodeur vidéo | |
WO2023184250A1 (fr) | Procédé, appareil et système de codage/décodage vidéo, dispositif et support de stockage | |
WO2023122968A1 (fr) | Procédé, dispositif et système de prédiction intratrame, et support d'enregistrement | |
WO2023236113A1 (fr) | Procédés, appareils et dispositifs de codage et de décodage vidéo, système et support de stockage | |
WO2022217447A1 (fr) | Procédé et système de codage et de décodage vidéo, et codec vidéo | |
WO2023122969A1 (fr) | Procédé de prédiction intra-trame, dispositif, système et support de stockage | |
WO2022116054A1 (fr) | Procédé et système de traitement d'images, codeur vidéo et décodeur vidéo | |
WO2022193390A1 (fr) | Procédé et système de codage et de décodage vidéo, et codeur vidéo et décodeur vidéo | |
WO2022193389A1 (fr) | Procédé et système de codage vidéo, procédé et système de décodage vidéo, et codeur et décodeur vidéo |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21920320; Country of ref document: EP; Kind code of ref document: A1 |
 | WWE | Wipo information: entry into national phase | Ref document number: 202180090973.7; Country of ref document: CN |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 21920320; Country of ref document: EP; Kind code of ref document: A1 |