WO2021121080A1 - Video decoding method, video encoding method, apparatus, device, and storage medium - Google Patents
- Publication number: WO2021121080A1 (application PCT/CN2020/134581)
- Authority: WO, WIPO (PCT)
- Prior art keywords: effective, size, parameter set, luminance, chrominance
- Legal status: Ceased (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- All entries fall under H (Electricity), H04 (Electric communication technique), H04N (Pictorial communication, e.g. television), H04N19/00 (Methods or arrangements for coding, decoding, compressing or decompressing digital video signals)
- H04N19/124: Quantisation
- H04N19/44: Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
- H04N19/126: Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
- H04N19/122: Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
- H04N19/159: Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- H04N19/172: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a picture, frame or field
- H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/196: Adaptive coding characterised by the adaptation method, being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
- H04N19/70: Syntax aspects related to video coding, e.g. related to compression standards
- H04N19/96: Tree coding, e.g. quad-tree coding
Definitions
- the embodiments of the present application relate to the technical field of video encoding and decoding, and in particular to a video decoding method, a video encoding method, an apparatus, a device, and a storage medium.
- H.266 is a new generation of video coding technology improved on the basis of H.265/HEVC (High Efficiency Video Coding). It has been officially named VVC (Versatile Video Coding) and is continually being updated and improved under the organization and guidance of the Joint Video Experts Team (JVET).
- VVC: Versatile Video Coding
- the decoding method of the quantization matrix adopted by the VVC has a relatively high computational complexity at the decoder side.
- the embodiments of the present application provide a video decoding method, a video encoding method, an apparatus, a device, and a storage medium, which can reduce the computational complexity on the decoder side.
- the technical solution is as follows:
- an embodiment of the present application provides a video decoding method, and the method includes:
- the first parameter set includes a parameter set used to define syntax elements related to the QM (quantization matrix);
- an embodiment of the present application provides a video encoding method, and the method includes:
- the syntax elements used to determine the effective QM, together with the effective QM, are encoded to generate a bitstream corresponding to a first parameter set, wherein the first parameter set includes a parameter set used to define QM-related syntax elements.
- an embodiment of the present application provides a video decoding device, the device including:
- a parameter acquisition module configured to acquire a first parameter set corresponding to a video frame to be decoded, the first parameter set including a parameter set used to define QM-related syntax elements;
- a QM determination module configured to determine the effective QM according to the syntax elements contained in the first parameter set, the effective QM being the QM actually used when the quantized transform coefficients are inverse-quantized during decoding of the video frame to be decoded;
- a QM decoding module configured to decode the effective QM.
- an embodiment of the present application provides a video encoding device, the device including:
- a QM determination module configured to determine an effective QM corresponding to a video frame to be encoded, where the effective QM refers to the QM actually used when transform coefficients are quantized in the encoding process of the video frame to be encoded;
- a QM encoding module configured to encode the syntax elements used to determine the effective QM, together with the effective QM, to generate a bitstream corresponding to a first parameter set, wherein the first parameter set includes a parameter set used to define QM-related syntax elements.
- an embodiment of the present application provides a computer device, the computer device includes a processor and a memory, at least one instruction, at least one program, a code set, or an instruction set is stored in the memory, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the above-mentioned video decoding method or the above-mentioned video encoding method.
- an embodiment of the present application provides a computer-readable storage medium that stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the above-mentioned video decoding method or the above-mentioned video encoding method.
- an embodiment of the present application provides a computer program product, which is used to implement the above-mentioned video decoding method or the above-mentioned video encoding method when the computer program product is executed by a processor.
- the effective QM is determined according to the syntax elements contained in the first parameter set, the effective QM being the QM actually used to quantize the transform coefficients when the video frame to be decoded was encoded, and the effective QM is then decoded. In this way, the decoder only needs to decode the effective QM, which reduces the computational complexity on the decoder side.
- FIG. 1 is a schematic diagram of video coding shown by way of example in this application.
- FIG. 2 is a simplified block diagram of a communication system provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of the placement of a video encoder and a video decoder in a streaming environment, shown by way of example in this application.
- FIG. 4 is a schematic diagram of encoding in an inter-frame prediction mode provided by an embodiment of the present application.
- FIG. 5 is a schematic diagram of encoding in an intra-frame prediction mode provided by an embodiment of the present application.
- FIG. 6 is a schematic diagram of functional modules of a video encoder provided by an embodiment of the present application.
- FIG. 7 is a schematic diagram of functional modules of a video decoder provided by an embodiment of the present application.
- FIG. 8 is a schematic diagram of generating a QM through down-sampling and copying according to an embodiment of the present application.
- FIG. 9 is a schematic diagram of a diagonal scanning order provided by an embodiment of the present application.
- FIG. 10 is a flowchart of a video decoding method provided by an embodiment of the present application.
- FIG. 11 is a flowchart of a video encoding method provided by an embodiment of the present application.
- FIG. 12 is a block diagram of a video decoding device provided by an embodiment of the present application.
- FIG. 13 is a block diagram of a video decoding device provided by another embodiment of the present application.
- FIG. 14 is a block diagram of a video encoding device provided by an embodiment of the present application.
- FIG. 15 is a structural block diagram of a computer device provided by an embodiment of the present application.
- the current block 101 includes samples that the encoder has found, during the motion search process, to be predictable from a previous block of the same size that has been spatially shifted.
- MV: motion vector
- instead of encoding the MV directly, the MV may be derived from metadata associated with one or more reference pictures, for example from the metadata of the most recent reference picture (in decoding order), using the MV associated with any of the five surrounding samples A0, A1 and B0, B1, B2 (corresponding to 102 to 106 respectively).
- the communication system 200 includes a plurality of devices, which can communicate with each other via a network 250, for example.
- the communication system 200 includes a first device 210 and a second device 220 interconnected through a network 250.
- the first device 210 and the second device 220 perform one-way data transmission.
- the first device 210 may encode video data, such as a video picture stream collected by the first device 210, for transmission to the second device 220 via the network 250.
- the encoded video data is transmitted in the form of one or more encoded video streams.
- the second device 220 may receive encoded video data from the network 250, decode the encoded video data to restore the video data, and display video pictures according to the restored video data.
- One-way data transmission is more common in applications such as media services.
- the communication system 200 includes a third device 230 and a fourth device 240 that perform two-way transmission of encoded video data, which can occur, for example, during a video conference.
- each of the third device 230 and the fourth device 240 may encode video data (for example, a video picture stream collected by the device) for transmission through the network 250 to the other of the third device 230 and the fourth device 240.
- Each of the third device 230 and the fourth device 240 may also receive the encoded video data transmitted by the other of the third device 230 and the fourth device 240, decode the encoded video data to restore the video data, and display video pictures on an accessible display device according to the restored video data.
- the first device 210, the second device 220, the third device 230, and the fourth device 240 may be computer devices such as servers, personal computers, and smart phones, but the principles disclosed in the present application may not be limited thereto.
- the embodiments of this application are applicable to PCs (Personal Computers), mobile phones, tablet computers, media players and/or dedicated video conferencing equipment.
- the network 250 represents any number of networks that transmit encoded video data between the first device 210, the second device 220, the third device 230, and the fourth device 240, including, for example, wired and/or wireless communication networks.
- the communication network 250 may exchange data in circuit-switched and/or packet-switched channels.
- the network may include a telecommunications network, a local area network, a wide area network, and/or the Internet.
- the architecture and topology of the network 250 may be insignificant to the operations disclosed in this application.
- FIG. 3 shows the placement of video encoders and video decoders in a streaming environment.
- the subject matter disclosed in this application is equally applicable to other video-supporting applications, including, for example, video conferencing, digital TV (television), and the storage of compressed video on digital media such as CD (Compact Disc), DVD (Digital Versatile Disc), and memory sticks.
- the streaming system may include an acquisition subsystem 313, which may include a video source 301 such as a digital camera, and the video source creates an uncompressed video picture stream 302.
- the video picture stream 302 includes samples taken by a digital camera. Compared with the encoded video data 304 (or encoded video stream), the video picture stream 302 is depicted as a thick line to emphasize the video picture stream with high data volume.
- the video picture stream 302 can be processed by the electronic device 320.
- the electronic device 320 includes a video encoder 303 coupled to a video source 301.
- the video encoder 303 may include hardware, software, or a combination of software and hardware to enable or implement aspects of the disclosed subject matter as described in more detail below.
- the encoded video data 304 (or the encoded video bitstream 304) is depicted as a thin line to emphasize its lower data volume compared with the video picture stream 302, and can be stored on the streaming server 305 for future use.
- One or more streaming client subsystems such as client subsystem 306 and client subsystem 308 in FIG. 3, can access streaming server 305 to retrieve copies 307 and 309 of encoded video data 304.
- the client subsystem 306 may include, for example, the video decoder 310 in the electronic device 330.
- the video decoder 310 decodes the incoming copy 307 of the encoded video data and generates an output video picture stream 311 that can be presented on a display 312 (e.g., a display screen) or another rendering device (not depicted).
- the encoded video data 304, video data 307, and video data 309 can be encoded according to certain video encoding/compression standards.
- the electronic device 320 and the electronic device 330 may include other components (not shown).
- the electronic device 320 may include a video decoder (not shown), and the electronic device 330 may also include a video encoder (not shown).
- the video decoder is used to decode the received encoded video data; the video encoder is used to encode the video data.
- an inter-frame prediction mode or an intra-frame prediction mode can be used to generate a prediction block based on one or more encoded reference blocks.
- the prediction block may be an estimated version of the original block.
- the residual block can be generated by subtracting the prediction block from the original block, or vice versa; the residual block represents the prediction residual (also called the prediction error). Since the amount of data needed to represent the prediction residual is generally less than the amount needed to represent the original block, encoding the residual block achieves a higher compression ratio.
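As a minimal sketch of the residual relationship just described, the snippet below subtracts a prediction block from an original block and then adds the residual back to recover the original. The function names and the sample values are hypothetical and chosen only for illustration.

```python
def residual_block(original, prediction):
    """Element-wise difference (original - prediction) of two equal-sized blocks."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, prediction)]


def reconstruct_block(prediction, residual):
    """Inverse step: prediction + residual gives back the original samples."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]


# Hypothetical 2x2 sample values.
original = [[52, 55], [61, 59]]
prediction = [[50, 54], [60, 60]]

residual = residual_block(original, prediction)
restored = reconstruct_block(prediction, residual)
```

The residual values are small compared with the samples themselves, which is why they are cheaper to represent.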
- the coded reference block 41 and the to-be-coded block 42 are located in two different video frames.
- the coded reference block 51 and the to-be-coded block 52 are located in the same video frame.
- the residual value of the residual block in the spatial domain can be converted into transform coefficients in the frequency domain.
- This conversion can be implemented by a two-dimensional transformation such as a discrete cosine transform (DCT).
- DCT: discrete cosine transform
- low-index transform coefficients: e.g., those located in the upper-left area
- high-index transform coefficients: e.g., those located in the lower-right area
- a quantization matrix containing quantization coefficients can be applied to the matrix of transform coefficients, thereby quantizing all transform coefficients into quantized transform coefficients.
- the scale or magnitude of transform coefficients may be reduced.
- Some high-index transform coefficients can be reduced to zero, and then may be skipped in subsequent scanning and encoding steps.
- FIG. 6 shows a portion of an exemplary video encoder 60 including a transform module 62, a quantization module 64, and an entropy encoding module 66.
- the video encoder 60 may also include other modules such as a prediction module, a dequantization module, a reconstruction module, and the like.
- the video encoder 60 may obtain a video frame, and the video frame may include a plurality of image blocks.
- the encoding of one image block is taken as an example here.
- a prediction block can be first generated as an estimate of the image block.
- the prediction block may be generated by the prediction module through inter prediction or intra prediction mode.
- the difference between the image block and the prediction block can be calculated to generate a residual block.
- the residual block may be transformed into transform coefficients by the transform module 62.
- the residual values in the spatial domain, which include large features and small features, are converted into transform coefficients in the frequency domain, which include high frequency bands and low frequency bands.
- the quantization module 64 may use QM to quantize the transform coefficients, thereby generating quantized transform coefficients.
- the quantized transform coefficient may be encoded by the entropy encoding module 66, and finally sent from the video encoder 60 as a part of the bitstream.
- FIG. 7 shows a portion of an exemplary video decoder 70 including an entropy decoding module 72, a dequantization (inverse quantization) module 74, and an inverse transform module 76.
- the video decoder 70 may also include other modules such as a prediction module, a transformation module, a quantization module, and the like.
- the video decoder 70 may receive the bitstream output from the video encoder 60, perform decoding on the bitstream according to an inter prediction or intra prediction mode, and output a reconstructed video frame.
- the entropy decoding module 72 can generate quantized transform coefficients by performing entropy decoding on the input bitstream.
- the dequantization module 74 may dequantize the quantized transform coefficients based on QM to obtain the dequantized transform coefficients.
- the inverse transform module 76 inversely transforms the inversely quantized transform coefficients to generate a reconstructed residual block. Then, based on the reconstructed residual block and the predicted block, a reconstructed image block is generated.
- QM is an indispensable part of the video encoding and decoding process.
- the configuration of the QM determines how much of the information in the transform coefficients is retained or filtered out, so the QM affects coding performance and coding quality.
- QM is required in both the encoder and the decoder. Specifically, in order to decode the image correctly, the information about the quantization coefficients in the QM must be encoded in the encoder and sent from the encoder to the decoder.
- QM may sometimes be referred to as a scaling matrix or a weight matrix. Therefore, the term "QM" used herein may be a general term covering quantization matrix, scaling matrix, weight matrix, and other equivalent terms.
- VTM: VVC Test Model
- Non-square QMs are not present in the VVC bitstream; they are obtained on the decoder side by down-sampling the corresponding square QMs. More specifically, a 32×4 QM is obtained by copying rows 0, 8, 16, and 24 of a 32×32 QM. As shown in FIG. 8, the rows 0, 8, 16, and 24 filled with diagonal lines are copied from the 32×32 QM into the 32×4 QM.
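The row-copying step can be sketched as follows. This is an illustration only: the function name is hypothetical, and the QM is modelled as a plain list of rows whose entries encode their own position so the copied rows are easy to identify.

```python
def subsample_rows(square_qm, out_rows):
    """Keep every (N / out_rows)-th row of an N x N matrix, starting at row 0."""
    n = len(square_qm)
    if out_rows <= 0 or n % out_rows != 0:
        raise ValueError("out_rows must evenly divide the matrix size")
    step = n // out_rows
    return [list(square_qm[r]) for r in range(0, n, step)]


# Hypothetical 32x32 QM whose entry at (row, col) is row * 32 + col.
qm32 = [[r * 32 + c for c in range(32)] for r in range(32)]

# 4 rows of 32 entries each, taken from rows 0, 8, 16 and 24.
qm32x4 = subsample_rows(qm32, 4)
copied_rows = [row[0] // 32 for row in qm32x4]
```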
- the corresponding QM size in VTM7 is constrained to be 8×8.
- Up-sampling is applied to these 8×8 QMs to create the 16×16, 32×32, and 64×64 QMs. More specifically, to create a 16×16 QM, each element of the corresponding 8×8 QM is up-sampled and copied into a 2×2 area; to create a 32×32 QM, each element of the corresponding 8×8 QM is up-sampled and copied into a 4×4 area.
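A minimal sketch of this replication-based up-sampling is given below. The function name and the test matrix are hypothetical, and any separate handling of the DC coefficient is deliberately left out.

```python
def upsample_by_replication(base_qm, factor):
    """Copy each element of base_qm into a factor x factor area."""
    out = []
    for row in base_qm:
        # Repeat each element horizontally, then repeat the widened row vertically.
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 8x8 QM whose entry at (row, col) is row * 8 + col.
base = [[r * 8 + c for c in range(8)] for r in range(8)]

qm16 = upsample_by_replication(base, 2)     # each element fills a 2x2 area
qm32_up = upsample_by_replication(base, 4)  # each element fills a 4x4 area
```

With this layout, the element at position (r, c) of the enlarged matrix equals the base element at (r // factor, c // factor).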
- Table 1 determines the QM identifier variable (id) according to the variables sizeId and matrixId specified in Table 2 and Table 3, respectively. Among them, sizeId represents the size of the quantization matrix, and matrixId is a QM type identifier based on the prediction mode (predMode) and the color component (cIdx).
- MODE_INTRA represents the intra prediction mode
- MODE_INTER represents the inter prediction mode
- MODE_IBC represents the IBC (Intra Block Copy) prediction mode
- Y represents luma (luminance)
- Cb and Cr represent chroma (chrominance).
- VTM7 uses intra-frame and inter-frame predictive coding to encode 28 QMs.
- DPCM: Differential Pulse Code Modulation
- the DPCM intra-frame residual also needs to be transmitted in the bitstream.
- the diagonal scanning order is (0,0), (1,0), (0,1), (2,0), (1,1),...,(2,3),(3,3).
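The listed ordering can be reproduced programmatically. The sketch below assumes a 4×4 matrix (suggested by the final position (3,3)) and walks each anti-diagonal with the first coordinate decreasing, which matches the positions given above; the function name is hypothetical, and the sketch makes no claim about which coordinate is the row and which the column.

```python
def diagonal_scan(n):
    """Return the positions of an n x n matrix in diagonal scanning order."""
    order = []
    for d in range(2 * n - 1):              # d = sum of the two coordinates
        first_hi = min(d, n - 1)
        first_lo = max(0, d - n + 1)
        for a in range(first_hi, first_lo - 1, -1):
            order.append((a, d - a))
    return order


scan4 = diagonal_scan(4)
```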
- There are two inter-frame prediction modes, namely copy mode and prediction mode.
- In the copy mode, the current QM to be encoded is exactly the same as an already decoded QM, called the reference QM. This also means that the copy mode has zero inter-frame residuals, so there is no need to signal residuals.
- the encoder should transmit the incremental ID between the current QM and its reference, so that the decoder can reconstruct the current QM by directly copying the reference QM.
- the prediction mode is similar to the copy mode, but with additional inter-frame residuals. DPCM coding is applied to the inter-frame residuals in diagonal scanning order, and the encoder needs to transmit the DPCM inter-frame residuals in the bitstream.
- the up-sampling algorithm is applied to copy each element in the QM into a large square area. Since the DC coefficient at the position (0, 0) is the most important for reconstructing the video, VTM7 encodes it directly, rather than copying from the corresponding elements of other QMs.
- the mode decision calculates the bit cost of the 3 candidate modes of the QM (that is, the copy mode of the inter prediction mode, the prediction mode of the inter prediction mode, and the intra prediction mode) and selects the one with the smallest bit cost as the final optimal mode. The QM is then encoded using the optimal mode.
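The selection step amounts to taking the minimum over three candidate costs. In the sketch below, the mode labels and bit costs are hypothetical placeholders; how the costs are measured is outside the scope of the example.

```python
def choose_qm_mode(bit_costs):
    """Return the candidate mode with the smallest bit cost."""
    return min(bit_costs, key=bit_costs.get)


# Hypothetical bit costs for one QM.
costs = {"inter_copy": 12, "inter_pred": 40, "intra": 95}
best_mode = choose_qm_mode(costs)
```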
- VVC supports frequency-dependent quantization of transform blocks. Let the QM be W, where W[x][y] is the QM weight of the transform coefficient at position (x, y) in the transform block (TB). For a transform coefficient coeff[x][y], the quantized transform coefficient level[x][y] is calculated using Formula 1:
- QP is a quantization parameter (also called a quantization step size)
- offset is an offset value.
- when all elements of the QM are equal to 16, the effect is the same as not using a QM.
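Formula 1 is not reproduced in this text, so the sketch below does not claim to be it. It assumes a generic weighted quantizer of the form level = sign(coeff) · floor(|coeff| · 16 / (W · qstep) + offset), chosen only because it makes a weight of 16 neutral, as stated above. The names `quantize`, `quantize_block`, and `qstep`, the default offset of 0.5, and all sample values are assumptions.

```python
import math


def quantize(coeff, weight, qstep, offset=0.5):
    """Assumed form: sign(coeff) * floor(|coeff| * 16 / (weight * qstep) + offset)."""
    magnitude = math.floor(abs(coeff) * 16 / (weight * qstep) + offset)
    return magnitude if coeff >= 0 else -magnitude


def quantize_block(coeffs, qm, qstep, offset=0.5):
    """Apply quantize() element-wise using the matching QM weight."""
    return [[quantize(c, w, qstep, offset) for c, w in zip(crow, wrow)]
            for crow, wrow in zip(coeffs, qm)]


# Hypothetical 2x2 transform coefficients and quantization step.
coeffs = [[400, -120], [90, 30]]
flat_qm = [[16, 16], [16, 16]]    # all weights 16: same as using no QM
steep_qm = [[16, 32], [32, 64]]   # larger weights for higher-index coefficients

levels_flat = quantize_block(coeffs, flat_qm, qstep=10)
levels_steep = quantize_block(coeffs, steep_qm, qstep=10)
levels_no_qm = [[quantize(c, 16, 10) for c in row] for row in coeffs]
```

Under this assumed form, larger weights shrink the higher-index coefficients more strongly while the flat matrix leaves the plain step-size quantization unchanged.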
- the SPS (Sequence Parameter Set) syntax element sps_scaling_list_enable_flag indicates whether QM is enabled for the pictures whose picture header (PH) refers to the SPS.
- when this flag is enabled, that is, when sps_scaling_list_enable_flag is enabled, an additional flag in the PH controls whether to use the default QM with all elements equal to 16 or a user-defined QM.
- in VTM7, user-defined QMs are signaled in an APS (Adaptation Parameter Set). If user-defined QM is enabled in the SPS and the PH, one APS index can be sent in the PH to specify the QM set for the pictures that refer to this PH.
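The three-level signaling described above can be summarized as a small decision function. Only sps_scaling_list_enable_flag is a name taken from the text; `ph_user_defined_qm_flag`, `ph_aps_id`, and the returned labels are hypothetical stand-ins for the PH-level flag and APS index, whose actual syntax element names are not given here.

```python
def select_qm_source(sps_scaling_list_enable_flag,
                     ph_user_defined_qm_flag=False,
                     ph_aps_id=None):
    """Return (source, aps_id) describing where the QMs for a picture come from."""
    if not sps_scaling_list_enable_flag:
        return ("qm_disabled", None)        # QM not enabled at the sequence level
    if not ph_user_defined_qm_flag:
        return ("default_flat_16", None)    # default QM, all elements equal to 16
    if ph_aps_id is None:
        raise ValueError("a user-defined QM needs an APS index in the PH")
    return ("user_defined_from_aps", ph_aps_id)
```

For example, `select_qm_source(True, True, 3)` would point the decoder at the QM set carried in the APS with index 3.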
- Δid: incremental id
- the AC and DC coefficients should be signaled.
- 28 groups of QM are coded and decoded in ascending order of id.
- scaling_list_copy_mode_flag[id] equal to 1 means that the element values of the current QM are the same as those of its reference QM.
- the reference QM is indicated by scaling_list_pred_id_delta[id].
- scaling_list_copy_mode_flag[id] equal to 0 means that scaling_list_pred_mode_flag is present.
- scaling_list_pred_mode_flag[id] equal to 1 indicates that the current QM can be predicted from the reference QM, which is indicated by scaling_list_pred_id_delta[id].
- scaling_list_pred_mode_flag[id] equal to 0 means that the element values of the current QM are explicitly signaled. When not present, the value of scaling_list_pred_mode_flag[id] is inferred to be equal to 0.
- scaling_list_pred_id_delta[id] indicates the reference QM used to derive the predicted QM, that is, ScalingMatrixPred[id]. When not present, the value of scaling_list_pred_id_delta[id] is inferred to be equal to 0. The value of scaling_list_pred_id_delta[id] should be in the range of 0 to maxIdDelta, where maxIdDelta is derived from id as shown in the following formula 2:
- the matrixSize×matrixSize QM prediction matrix is expressed as ScalingMatrixPred[x][y], where x ∈ [0, matrixSize-1] and y ∈ [0, matrixSize-1], and the variable ScalingMatrixDCPred is the predicted value of the DC coefficient. They are calculated as follows:
- when scaling_list_copy_mode_flag[id] and scaling_list_pred_mode_flag[id] are both equal to 0, all elements of ScalingMatrixPred are set equal to 8, and the value of ScalingMatrixDCPred is set equal to 8.
- otherwise, when scaling_list_pred_id_delta[id] is equal to 0, all elements of ScalingMatrixPred are set equal to 16, and the value of ScalingMatrixDCPred is set equal to 16.
- otherwise, ScalingMatrixPred is set equal to ScalingMatrixRec[refId], and the value of ScalingMatrixDCPred is derived as follows: when refId is greater than 13, it is set equal to ScalingMatrixDCRec[refId-14]; otherwise (that is, refId is less than or equal to 13), it is set equal to ScalingMatrixPred[0][0].
- scaling_list_dc_coef[id-14] is used to calculate the value of the variable ScalingMatrixDCRec[id-14] when id is greater than 13, as shown in the following formula 5:
- ScalingMatrixDCRec[id-14] = (ScalingMatrixDCPred + scaling_list_dc_coef[id-14] + 256) % 256 (Formula 5)
- when scaling_list_dc_coef[id-14] is not present, its value is inferred to be equal to 0.
- the value of scaling_list_dc_coef[id-14] should be in the range of -128 to 127 (including -128 and 127).
- the value of ScalingMatrixDCRec[id-14] should be greater than 0.
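As a worked illustration of formula 5 and its two constraints, the following minimal sketch reconstructs the DC value from its prediction and the signaled difference. The function name and the choice to raise `ValueError` on a violated constraint are assumptions made for the example; they are not part of the described method.

```python
def reconstruct_dc(dc_pred, dc_coef=0):
    """Apply formula 5: (ScalingMatrixDCPred + scaling_list_dc_coef + 256) % 256.

    dc_coef defaults to 0, matching the inferred value when the
    syntax element is not present.
    """
    if not -128 <= dc_coef <= 127:
        raise ValueError("scaling_list_dc_coef must be in [-128, 127]")
    dc_rec = (dc_pred + dc_coef + 256) % 256
    if dc_rec <= 0:
        raise ValueError("ScalingMatrixDCRec must be greater than 0")
    return dc_rec


# A prediction of 16 with a signaled difference of -4 gives 12.
example = reconstruct_dc(16, -4)
```

The modulo keeps the result in the 8-bit range, so a prediction of 250 with a difference of 10 wraps around to 4.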
- scaling_list_delta_coef[id][i] represents the difference between the current matrix coefficient ScalingList[id][i] and the previous matrix coefficient ScalingList[id][i-1] when scaling_list_copy_mode_flag[id] is equal to 0.
- the value of scaling_list_delta_coef[id][i] should be in the range of -128 to 127 (including -128 and 127).
- the matrixSize×matrixSize QM ScalingMatrixRec[id] can be calculated using the following formula 6:
- ScalingMatrixRec[id][x][y] = (ScalingMatrixPred[x][y] + ScalingList[id][k] + 256) % 256 (Formula 6)
- ScalingMatrixRec[id][x][y] should be greater than 0.
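Formula 6 can be sketched for a single QM as follows. The mapping from the coefficient index k to the position (x, y) is not given in this passage, so the sketch takes it as a `scan` argument, and the raster order used in the example is purely an assumption for demonstration. The function name is also hypothetical.

```python
def reconstruct_matrix(pred, scaling_list, scan):
    """Apply formula 6 to every position of one QM.

    pred         -- ScalingMatrixPred as a square list of lists
    scaling_list -- ScalingList[id], one value per scanned position
    scan         -- list of (x, y) positions; scan[k] is where index k lands
                    (assumed input, the real scan order is not stated here)
    """
    size = len(pred)
    rec = [[0] * size for _ in range(size)]
    for k, (x, y) in enumerate(scan):
        value = (pred[x][y] + scaling_list[k] + 256) % 256
        if value <= 0:
            raise ValueError("ScalingMatrixRec[id][x][y] must be greater than 0")
        rec[x][y] = value
    return rec


# 2x2 example with a raster scan (an assumption for illustration).
raster = [(x, y) for x in range(2) for y in range(2)]
rec = reconstruct_matrix([[16, 16], [16, 16]], [0, 1, -2, 3], raster)
```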
- sps_max_luma_transform_size_64_flag equal to 1 means that the maximum transform block size in luma samples is equal to 64.
- sps_max_luma_transform_size_64_flag equal to 0 means that the maximum transform block size in luma samples is equal to 32.
- chroma_format_idc represents the chroma sample corresponding to the luminance sample, as shown in Table 6:
- SubWidthC and SubHeightC respectively represent the width and height of the CTU (Coding Tree Unit) corresponding to the chrominance component, and Monochrome represents no chrominance component.
- separate_colour_plane_flag equal to 1 means that the three color components of the 4:4:4 chroma format are coded separately.
- separate_colour_plane_flag equal to 0 means that the color components are not coded separately.
- each component is composed of coded samples of a color plane (Y, Cb or Cr), and uses monochrome coding syntax.
- each color plane is associated with a specific colour_plane_id value.
- colour_plane_id specifies the color plane associated with the slice associated with PH.
- when separate_colour_plane_flag is equal to 1, the value of colour_plane_id should be in the range of 0 to 2, inclusive.
- the values of colour_plane_id 0, 1, and 2 correspond to the Y, Cb and Cr planes, respectively. It should be noted that there is no dependency between the decoding process of images with different colour_plane_id values.
- sps_log2_ctu_size_minus5 plus 5 indicates the luma coding tree block size of each CTU.
- it is a requirement of bitstream conformance that the value of sps_log2_ctu_size_minus5 be less than or equal to 2.
- the maximum luminance coding block size can be calculated:
- CtbSizeY represents the largest luminance coding block size
- CtbLog2SizeY represents the base-2 logarithm of CtbSizeY
- << is the left shift operator
- log2_min_luma_coding_block_size_minus2 plus 2 represents the smallest luma coding block size.
- the value of log2_min_luma_coding_block_size_minus2 should be in the range of 0 to sps_log2_ctu_size_minus5+3, inclusive.
- MinCbLog2SizeY = log2_min_luma_coding_block_size_minus2 + 2
- MinCbSizeY = 1 << MinCbLog2SizeY
- MinCbSizeY represents the smallest luminance coding block size
- MinCbLog2SizeY represents the base-2 logarithm of MinCbSizeY
- VSize represents the largest luminance coding block size
- << is the left shift operator.
- the value of MinCbSizeY should be less than or equal to VSize.
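The luma size derivations above can be sketched as two left shifts. The function name is hypothetical, the lower bound of 0 on sps_log2_ctu_size_minus5 is an assumption (the text only states the upper bound of 2), and the check of MinCbSizeY against VSize is omitted because VSize is not derived in this passage.

```python
def luma_block_sizes(sps_log2_ctu_size_minus5, log2_min_luma_coding_block_size_minus2):
    """Return (CtbSizeY, MinCbSizeY) from the two syntax element values."""
    if not 0 <= sps_log2_ctu_size_minus5 <= 2:
        raise ValueError("sps_log2_ctu_size_minus5 must be in [0, 2]")
    upper = sps_log2_ctu_size_minus5 + 3
    if not 0 <= log2_min_luma_coding_block_size_minus2 <= upper:
        raise ValueError("log2_min_luma_coding_block_size_minus2 out of range")
    ctb_log2_size_y = sps_log2_ctu_size_minus5 + 5
    min_cb_log2_size_y = log2_min_luma_coding_block_size_minus2 + 2
    return 1 << ctb_log2_size_y, 1 << min_cb_log2_size_y


# sps_log2_ctu_size_minus5 = 2 and log2_min_luma_coding_block_size_minus2 = 0
ctb_size_y, min_cb_size_y = luma_block_sizes(2, 0)
```

In this example the luma coding tree block is 128 samples wide and the smallest luma coding block is 4.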
- the width and height of each chroma CTB (Coding Tree Block), that is, the variables CtbWidthC and CtbHeightC, are determined as follows:
- when chroma_format_idc is equal to 0 (monochrome) or separate_colour_plane_flag is equal to 1, CtbWidthC and CtbHeightC are both equal to 0.
- otherwise, CtbWidthC and CtbHeightC are calculated using the following formula:
- CtbSizeY represents the size of the luma CTB.
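A minimal sketch of the chroma CTB derivation follows. Two things are assumptions here because neither Table 6 nor the formula survives in this passage: the SubWidthC/SubHeightC values are the usual ones for the 4:2:0, 4:2:2, and 4:4:4 formats, and the chroma size is taken to be the luma CTB size divided by those factors. The function and table names are hypothetical.

```python
# Assumed (SubWidthC, SubHeightC) per chroma_format_idc:
# 0 = monochrome, 1 = 4:2:0, 2 = 4:2:2, 3 = 4:4:4.
SUBSAMPLING = {0: (1, 1), 1: (2, 2), 2: (2, 1), 3: (1, 1)}


def chroma_ctb_size(ctb_size_y, chroma_format_idc, separate_colour_plane_flag=0):
    """Return (CtbWidthC, CtbHeightC) for the given luma CTB size."""
    if chroma_format_idc == 0 or separate_colour_plane_flag == 1:
        # No chroma CTB exists in these two cases.
        return 0, 0
    sub_width_c, sub_height_c = SUBSAMPLING[chroma_format_idc]
    return ctb_size_y // sub_width_c, ctb_size_y // sub_height_c


# 4:2:0 with a 128x128 luma CTB.
width_c, height_c = chroma_ctb_size(128, 1)
```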
- in the current quantization matrix encoding method adopted by VVC, all 28 QMs are encoded and transmitted in the APS, so QM signaling occupies more codewords, incurs a high bit overhead, and increases the computational complexity on the decoder side.
- the effective QM is determined according to the syntax elements contained in the first parameter set, where the effective QM refers to the QM actually used to quantize the transform coefficients in the process of encoding and generating the video frame to be decoded; the effective QM is then decoded.
- the encoder end only encodes and transmits the effective QM, which helps to save the codewords required for QM signaling and reduces bit overhead, and the decoder end only needs to decode the effective QM, thereby reducing the computational complexity on the decoder end.
- the execution subject of each step is the decoding end device
- the execution subject of each step of the video encoding method provided in the embodiment of the application is the encoding end device
- the decoding end device may be a computer device, which refers to an electronic device with data calculation, processing, and storage capabilities, such as a PC, mobile phone, tablet, media player, dedicated video conferencing device, or server.
- the methods provided in this application can be used alone or combined with other methods in any order.
- the encoder and decoder based on the method provided in this application can be implemented by one or more processors or one or more integrated circuits.
- FIG. 10 shows a flowchart of a video decoding method provided by an embodiment of the present application.
- the description takes the application of the method to the decoding end device introduced above as an example.
- the method can include the following steps (1001-1003):
- Step 1001 Obtain a first parameter set corresponding to a video frame to be decoded.
- the video frame to be decoded may be any video frame to be decoded (or called an image frame) in the video to be decoded.
- the first parameter set includes a parameter set used to define syntax elements related to QM.
- the decoding end device can decode to obtain QM according to the syntax elements in the first parameter set.
- the first parameter set is APS.
- the first parameter set may also be a parameter set other than the APS, such as the SPS, which is not limited in the embodiments of the present application.
- Step 1002 Determine an effective QM according to the syntax elements included in the first parameter set.
- the effective QM refers to the QM actually used when inversely quantizing the quantized transform coefficients in the decoding process of the to-be-decoded video frame.
- the number of effective QMs may be less than n, or may be equal to n, where n is a positive integer.
- when all n QMs are actually used for inverse quantization of the quantized transform coefficients, the number of effective QMs is n; when only some of the n QMs are actually used (for example, m QMs, where m is a positive integer less than n), the number of effective QMs is m.
- by reading the syntax elements, the decoding end device can determine which QMs are effective and which are not. QMs that are not effective (which may be referred to as invalid QMs), that is, QMs that are not actually used when transform coefficients are quantized in the process of encoding and generating the video frame to be decoded, do not need to be decoded by the decoding end device.
- all of its elements are predefined as default values.
- the default value is 16, combined with formula 1.
- Step 1003 Decode the effective QM.
- the decoder device needs to decode each effective QM separately. Taking any effective QM as an example, when decoding the effective QM, the encoding mode corresponding to the effective QM can be determined, and then the effective QM can be decoded according to the encoding mode.
- the number of QMs that may be used when quantizing transform coefficients is 28. Assuming that 12 of them are determined to be effective QMs, the decoding end device only needs to decode the 12 effective QMs, without decoding the remaining 16 invalid QMs.
- the effective QM is determined according to the syntax elements contained in the first parameter set, and the effective QM, that is, the QM actually used when the transform coefficients are quantized, is then decoded. In this way, the decoder only needs to decode the effective QM, thereby reducing the computational complexity of the decoder.
- determining the effective QM according to the syntax elements included in the first parameter set includes the following sub-steps:
- the effective size range of the QM defines the minimum size and the maximum size of the QM that is actually used when inversely quantizing the quantized transform coefficients in the decoding process.
- the QM size is a power of 2, such as 2, 4, 8, 16, 32, or 64.
- when the effective size range of the QM is [4, 32], the effective QMs include the 4×4, 8×8, 16×16, and 32×32 QMs.
- when the effective size range of the QM is [8, 16], the effective QMs include the 8×8 and 16×16 QMs.
- the sizeId corresponding to the 8×8 QM is 3
- the sizeId corresponding to the 16×16 QM is 4.
- the following method is adopted to determine the effective size range of the QM according to the syntax elements included in the first parameter set:
- a first syntax element is defined in the first parameter set and indicates the smallest luminance coding block size; a second syntax element is defined in the first parameter set and indicates the block size of the luminance coding tree; a third syntax element is defined in the first parameter set and indicates the maximum luminance TB size.
- the decoding end device reads the first syntax element, the second syntax element, and the third syntax element from the first parameter set, and determines the minimum luminance coding block size, the block size of the luminance coding tree, and the maximum luminance TB size.
- the decoding end device determines the minimum size of the luminance QM according to the minimum luminance coding block size.
- the minimum luminance coding block size is determined as the minimum size of the luminance QM.
- the decoding end device determines the larger of the block size of the luminance coding tree and the maximum luminance TB size as the maximum size of the luminance QM.
- when the block size of the luminance coding tree is larger than the maximum luminance TB size, the block size of the luminance coding tree is determined as the maximum size of the luminance QM; when the block size of the luminance coding tree is smaller than the maximum luminance TB size, the maximum luminance TB size is determined as the maximum size of the luminance QM; when the two are equal, either one may be determined as the maximum size of the luminance QM, and the result is the same.
- a fourth syntax element is defined in the first parameter set, and the fourth syntax element is used to indicate the sampling rate of the chrominance component relative to the luminance component.
- the decoding end device calculates the minimum size of the chrominance QM according to the minimum size of the luminance QM and the sampling rate of the chrominance component relative to the luminance component, and calculates the maximum size of the chrominance QM according to the maximum size of the luminance QM and the sampling rate of the chrominance component relative to the luminance component.
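The range derivation described in the steps above can be sketched as follows, taking the text as written: the luminance minimum is the minimum luminance coding block size, and the luminance maximum is the larger of the coding tree block size and the maximum TB size. The chrominance division by SubWidthC (minimum) and SubHeightC (maximum), with 0 for monochrome, mirrors formulas 22 and 23 later in the text. The function name and the `monochrome` parameter are hypothetical.

```python
def effective_qm_size_ranges(min_cb_size_y, ctb_size_y, max_tb_size_y,
                             sub_width_c, sub_height_c, monochrome=False):
    """Return ((minQMSizeY, maxQMSizeY), (minQMSizeUV, maxQMSizeUV))."""
    min_qm_size_y = min_cb_size_y
    max_qm_size_y = max(ctb_size_y, max_tb_size_y)
    if monochrome:
        # No chrominance component, so the chrominance range is empty.
        return (min_qm_size_y, max_qm_size_y), (0, 0)
    min_qm_size_uv = min_qm_size_y // sub_width_c
    max_qm_size_uv = max_qm_size_y // sub_height_c
    return (min_qm_size_y, max_qm_size_y), (min_qm_size_uv, max_qm_size_uv)


# 4:2:0 example: smallest coding block 4, coding tree block 32, largest TB 64.
luma_range, chroma_range = effective_qm_size_ranges(4, 32, 64, 2, 2)
```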
- aps_qm_size_info_present_flag indicates whether syntax elements related to the QM size are present in the bitstream.
- the value of 1 indicates that the syntax elements related to the QM size will appear in the bitstream, and the effective size range of the QM can be determined based on this, so as to determine which size of QM needs to be decoded.
- a value of 0 means that the syntax elements related to the QM size will not exist in the bitstream, and QMs of all sizes need to be decoded.
- aps_log2_ctu_size_minus5: its value plus 5 indicates the block size of the luma coding tree. Its value is specified to be the same as that of the syntax element sps_log2_ctu_size_minus5.
- aps_log2_min_luma_coding_block_size_minus2: its value plus 2 indicates the smallest luma coding block size. Its value is specified to be the same as that of the syntax element sps_log2_min_luma_coding_block_size_minus2.
- aps_max_luma_transform_size_64_flag 1
- aps_max_luma_transform_size_64_flag 1
- sps_max_luma_transform_size_64_flag 1
- aps_chroma_format_idc specifies the sampling rate of the chroma component relative to the luminance component, as shown in Table 6. It is specified that its value is the same as the value of the syntax element chroma_format_idc.
- the derivation process of the variables minQMSizeY (representing the minimum size of the luminance QM) and maxQMSizeY (representing the maximum size of the luminance QM) is as follows:
- << is the left shift operator, and ? : is the ternary conditional operator.
- minQMSizeUV representing the minimum size of chroma QM
- maxQMSizeUV representing the maximum size of chroma QM
- the variable cIdx represents the color component corresponding to the current QM.
- for the luminance component Y, its value is 0; for chrominance Cb, its value is 1; for chrominance Cr, its value is 2.
- the variable matrixSize represents the actual coding size of the current QM, which is indicated in the third column of Table 2.
- the variable matrixQMSize represents the TB size corresponding to the current QM, which is indicated by Table 1 and Table 2.
- the decoding end device first evaluates the two conditions proposed in this application and then decides whether to decode the current QM. For example, to determine whether a first QM is an effective QM (the first QM can be any available QM, that is, any one of the above 28 QMs): if the first QM satisfies either the first condition or the second condition, the first QM is determined to be an effective QM.
- the second condition is cIdx!
- the second condition indicates that the first QM belongs to a chrominance component, that is, it is used in the quantization process of the chrominance TB, and that the first QM is within the effective size range of the chrominance QM [MinQMSizeUV, MaxQMSizeUV], where MinQMSizeUV represents the minimum size of the chrominance QM and MaxQMSizeUV represents the maximum size of the chrominance QM.
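The effective-QM test can be sketched as below. The second condition follows the description directly. The first condition is not spelled out in this passage, so the sketch assumes it is the luminance counterpart (cIdx equal to 0 and the size within the luminance range); that symmetry and the function name are assumptions.

```python
def is_effective_qm(c_idx, matrix_qm_size, luma_range, chroma_range):
    """Return True if the QM satisfies the first or the second condition.

    c_idx          -- 0 for Y, 1 for Cb, 2 for Cr
    matrix_qm_size -- TB size corresponding to the QM
    luma_range     -- (min, max) effective size of the luminance QM
    chroma_range   -- (min, max) effective size of the chrominance QM
    """
    # Assumed first condition: luminance QM inside the luminance range.
    first = c_idx == 0 and luma_range[0] <= matrix_qm_size <= luma_range[1]
    # Second condition: chrominance QM inside the chrominance range.
    second = c_idx != 0 and chroma_range[0] <= matrix_qm_size <= chroma_range[1]
    return first or second


# Luminance range [8, 16] and chrominance range [4, 8].
luma_ok = is_effective_qm(0, 8, (8, 16), (4, 8))
```

A QM that fails both conditions would be skipped during decoding.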
- the decoding end device needs to calculate the effective size range of the QM based on the syntax elements included in the first parameter set, and then determine the effective QM according to the effective size range.
- the syntax elements for the effective size range of the luminance QM can also be defined directly in the first parameter set.
- the decoding end device can then directly obtain the effective size range of the luminance QM and combine it with the chroma format to determine the effective size range of the QM. The details are as follows:
- the following method is adopted to determine the effective size range of the QM according to the syntax elements contained in the first parameter set:
- a fifth syntax element is defined in the first parameter set and indicates the minimum size of the luminance QM; a sixth syntax element is defined in the APS and indicates the maximum size of the luminance QM.
- the decoding end device reads the fifth syntax element and the sixth syntax element from the first parameter set, and determines the minimum size and maximum size of the luminance QM.
- a fourth syntax element is defined in the first parameter set, and the fourth syntax element is used to indicate the sampling rate of the chrominance component relative to the luminance component.
- the decoding end device calculates the minimum size of the chrominance QM according to the minimum size of the luminance QM and the sampling rate of the chrominance component relative to the luminance component, and calculates the maximum size of the chrominance QM according to the maximum size of the luminance QM and the sampling rate of the chrominance component relative to the luminance component.
- aps_qm_size_info_present_flag indicates whether syntax elements related to the QM size are present in the bitstream.
- the value of 1 indicates that the syntax elements related to the QM size will appear in the bitstream, and the effective size range of the QM can be determined based on this, so as to determine which size of QM needs to be decoded.
- a value of 0 means that the syntax elements related to the QM size will not exist in the bitstream, and QMs of all sizes need to be decoded.
- aps_log2_min_luma_qm_size_minus2: its value plus 2 indicates the minimum size of the luminance QM.
- aps_log2_max_luma_qm_size_minus5: its value plus 5 indicates the maximum size of the luminance QM.
- the derivation process of the variables minQMSizeY (representing the minimum size of the luminance QM) and maxQMSizeY (representing the maximum size of the luminance QM) is as follows:
- << is the left shift operator.
- minQMSizeY and maxQMSizeY have the same values as the TB size variables MinCbSizeY and VSize calculated through SPS syntax elements.
- aps_chroma_format_idc specifies the sampling rate of the chroma component relative to the luminance component, as shown in Table 6. It is specified that its value is the same as the value of the syntax element chroma_format_idc.
- minQMSizeUV representing the minimum size of chroma QM
- maxQMSizeUV representing the maximum size of chroma QM
- minQMSizeUV = (!aps_chroma_format_idc) ? 0 : minQMSizeY / SubWidthC (Formula 18)
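When the size range is signaled directly, the derivation might look like the sketch below. Only formula 18 is available in this passage, so three things are assumptions: the luminance sizes are taken as 1 << (value + 2) and 1 << (value + 5), in line with the left shift operator mentioned above; the chrominance maximum is taken as maxQMSizeY / SubHeightC by analogy with formula 23; and the SubWidthC/SubHeightC table uses the usual values for each chroma format. The function and table names are hypothetical.

```python
# Assumed (SubWidthC, SubHeightC) per aps_chroma_format_idc.
SUBSAMPLING = {0: (1, 1), 1: (2, 2), 2: (2, 1), 3: (1, 1)}


def qm_ranges_from_aps(aps_log2_min_luma_qm_size_minus2,
                       aps_log2_max_luma_qm_size_minus5,
                       aps_chroma_format_idc):
    """Return ((minQMSizeY, maxQMSizeY), (minQMSizeUV, maxQMSizeUV))."""
    min_qm_size_y = 1 << (aps_log2_min_luma_qm_size_minus2 + 2)
    max_qm_size_y = 1 << (aps_log2_max_luma_qm_size_minus5 + 5)
    sub_width_c, sub_height_c = SUBSAMPLING[aps_chroma_format_idc]
    # Formula 18: the chrominance size is 0 when there is no chrominance component.
    min_qm_size_uv = 0 if not aps_chroma_format_idc else min_qm_size_y // sub_width_c
    max_qm_size_uv = 0 if not aps_chroma_format_idc else max_qm_size_y // sub_height_c
    return (min_qm_size_y, max_qm_size_y), (min_qm_size_uv, max_qm_size_uv)


# 4:2:0 with both signaled values equal to 0.
ranges_420 = qm_ranges_from_aps(0, 0, 1)
```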
- the decoder device may also determine the effective QM according to the syntax elements contained in the SPS. Specifically, the decoder device may calculate the effective size range of the luminance QM [MinQMSizeY, MaxQMSizeY] and the effective size range of the chrominance QM [MinQMSizeUV, MaxQMSizeUV] according to the syntax elements contained in the SPS. Among them, the variable MinQMSizeY represents the minimum size of the luminance QM, the variable MaxQMSizeY represents the maximum size of the luminance QM, the variable MinQMSizeUV represents the minimum size of the chrominance QM, and the variable MaxQMSizeUV represents the maximum size of the chrominance QM.
- MinQMSizeUV = (!chroma_format_idc) ? 0 : MinQMSizeY / SubWidthC (Formula 22)
- MaxQMSizeUV = (!chroma_format_idc) ? 0 : MaxQMSizeY / SubHeightC (Formula 23)
- << is the left shift operator, ! represents the logical negation operation, and ? : is the ternary conditional operator.
- the parsing dependency between the APS and SPS code streams can be eliminated, so that decoding the APS does not need to rely on the syntax elements of the SPS.
- determining the effective QM according to the syntax elements included in the first parameter set includes the following sub-steps:
- if the value of the flag syntax element corresponding to the first QM is the first value, it is determined that the first QM is an effective QM;
- a flag syntax element is defined in the APS, and the flag syntax element is used to indicate whether the QM is a valid QM.
- the descriptor of the flag syntax element can be u(1), which represents a 1-bit unsigned integer.
- the value of the flag syntax element is 1, which indicates that the QM is a valid QM and needs to be decoded; the value of the flag syntax element is 0, which indicates that the QM is not a valid QM and does not need to be decoded.
- all its elements are predefined as default values.
- the default value is 16, combined with formula 1. At this time, since the scaling and quantization coefficients of all transform coefficients in the TB are 1, the effect is the same as that of not using QM.
- the first QM may be any available QM, that is, any one of the above 28 QMs.
- the first parameter set is APS.
- the first parameter set may not be APS, which is not limited in the embodiment of the present application.
- a scaling_matrix_present_flag value of 1 means that the current QM needs to be decoded; a value of 0 means that the current QM does not need to be decoded, and the decoding end device can infer that all elements of the QM are 16.
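The flag-based variant can be sketched as follows: a QM whose flag is 1 is taken from the decoded data, and a QM whose flag is 0 is filled with the default value 16 without any decoding. The function name and the way decoded matrices and sizes are passed in are assumptions made for the example.

```python
DEFAULT_QM_VALUE = 16


def build_qm_set(present_flags, decoded, sizes):
    """Assemble the full QM set from per-QM flags.

    present_flags -- scaling_matrix_present_flag value (0 or 1) per QM id
    decoded       -- dict mapping QM id to its decoded matrix (effective QMs only)
    sizes         -- matrix size per QM id, used to build the default matrices
    """
    qms = []
    for qm_id, flag in enumerate(present_flags):
        if flag == 1:
            qms.append(decoded[qm_id])
        else:
            n = sizes[qm_id]
            # Not an effective QM: infer every element as the default value.
            qms.append([[DEFAULT_QM_VALUE] * n for _ in range(n)])
    return qms


# Two 2x2 QMs: the first is signaled, the second is inferred.
qm_set = build_qm_set([1, 0], {0: [[20, 21], [22, 23]]}, [2, 2])
```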
- the luminance QM corresponds to 1 flag syntax element, which indicates whether the luminance QM needs to be decoded.
- the same flag syntax element indicates whether the first chrominance QM and the second chrominance QM need to be decoded. That is, the first chrominance QM and the second chrominance QM do not each need a separate flag syntax element, which helps to further save the bit overhead of QM coding signaling.
- [sizeId] 1
- the value of [sizeId] 1
- the luminance QM is encoded in APS
- when a chrominance QM is decoded, the flag refers to the QMs of the same size in the APS that correspond to the chrominance components Cb and Cr and whose coding prediction mode is predMode.
- when this syntax element is 0, the luminance QM or the two chrominance QMs do not need to be decoded, and the decoding end device can infer that their elements are all 16.
- when setting the value of the flag syntax element corresponding to each QM, that is, when determining which QMs need to be encoded and which do not, the encoding end device can base the decision on the size of the QM.
- the decision can also be based on the coding prediction mode corresponding to the QM, on the YUV color component corresponding to the QM, or on a combination of several of these factors (the size of the QM, the coding prediction mode, and the YUV color component), which is not limited in the embodiments of the present application.
- the flag syntax element is used to indicate whether the QM is a valid QM, so that it can be more flexible to indicate whether each QM needs to be decoded.
- FIG. 11 shows a flowchart of a video encoding method provided by an embodiment of the present application.
- the description takes the application of the method to the encoding end device introduced above as an example.
- the method can include the following steps (1101-1102):
- Step 1101 Determine the effective QM corresponding to the video frame to be encoded, where the effective QM refers to the QM actually used when the transform coefficient is quantized in the encoding process of the video frame to be encoded.
- the video frame to be encoded may be any video frame to be encoded (or called an image frame) in the video to be encoded.
- the number of effective QMs may be less than n, or may be equal to n, where n is a positive integer.
- when all n QMs are actually used to quantize the transform coefficients, the number of effective QMs is n; when only some of the n QMs are actually used (for example, m QMs, where m is a positive integer less than n), the number of effective QMs is m.
- all of its elements are predefined as default values.
- the default value is 16, combined with formula 1.
- Step 1102 Encode the syntax element used to determine the effective QM and the effective QM, and generate a code stream corresponding to the first parameter set; wherein the first parameter set includes a parameter set used to define syntax elements related to the QM.
- the encoding end device needs to encode each effective QM separately.
- the optimal mode corresponding to the effective QM can be determined, and the effective QM can then be encoded according to the optimal mode, where the optimal mode is the one with the smallest bit cost among the three candidate modes above: the copy mode of the inter-frame prediction mode, the prediction mode of the inter-frame prediction mode, and the intra-frame prediction mode.
- the number of QMs that may be used when quantizing transform coefficients is 28. Assuming that 12 of them are determined to be effective QMs, the encoding end device only needs to encode the 12 effective QMs, without encoding the remaining 16 invalid QMs.
- in addition to encoding the effective QM, the encoding end device also needs to encode the syntax elements used to determine the effective QM, so that the decoding end device can determine the effective QM according to those syntax elements.
- the encoding terminal device encodes the syntax element used to determine the effective QM and the effective QM, and generates a code stream corresponding to the first parameter set.
- the first parameter set may be the APS or another parameter set used to define QM-related syntax elements, which is not limited in the embodiments of this application.
- the effective QM corresponding to the video frame to be encoded is determined, where the effective QM refers to the QM actually used when the transform coefficients are quantized in the encoding process of the video frame to be encoded; the syntax elements used to determine the effective QM and the effective QM are then encoded to generate a code stream corresponding to the first parameter set.
- the encoder end only encodes and transmits the effective QM, which helps to save the codewords required for QM signaling and reduces bit overhead, and the decoder end only needs to decode the effective QM, thereby reducing the computational complexity on the decoder end.
- the encoding process of the encoding end device corresponds to the decoding process of the decoding end device.
- FIG. 12 shows a block diagram of a video decoding device provided by an embodiment of the present application.
- the device has the function of realizing the above example of the video decoding method, and the function can be realized by hardware, or by hardware executing corresponding software.
- the device can be the decoder device described above, or it can be set on the decoder device.
- the device 1200 may include: a parameter acquisition module 1210, a QM determination module 1220, and a QM decoding module 1230.
- the parameter obtaining module 1210 is configured to obtain a first parameter set corresponding to a video frame to be decoded, and the first parameter set is a parameter set used to define syntax elements related to QM.
- the QM determining module 1220 is configured to determine an effective QM according to the syntax elements contained in the first parameter set, where the effective QM refers to the QM actually used when inverse quantization is performed on the quantized transform coefficients in the decoding process of the video frame to be decoded.
- the QM decoding module 1230 is used to decode the effective QM.
- the QM determining module 1220 includes: a range determining unit 1221 and a QM determining unit 1222.
- the range determining unit 1221 is configured to determine the effective size range of the QM according to the syntax elements included in the first parameter set.
- the QM determining unit 1222 is configured to determine the QM that belongs to the effective size range as the effective QM.
- the range determining unit 1221 is configured to:
- determine the effective size range of the chrominance QM according to the effective size range of the luminance QM and the sampling rate of the chrominance component relative to the luminance component, wherein the effective size range of the chrominance QM includes the minimum size and the maximum size of the chrominance QM.
- the range determining unit 1221 is configured to:
- the larger of the block size of the luminance coding tree and the maximum luminance TB size is determined as the maximum size of the luminance QM.
- the range determining unit 1221 is configured to:
- determine the effective size range of the chrominance QM according to the effective size range of the luminance QM and the sampling rate of the chrominance component relative to the luminance component, wherein the effective size range of the chrominance QM includes the minimum size and the maximum size of the chrominance QM.
- the range determining unit 1221 is configured to:
- the QM determining unit 1222 is configured to:
- if the first QM satisfies either the first condition or the second condition, it is determined that the first QM is the effective QM
- the QM determination module 1220 includes: an element reading unit 1223 and a QM determination unit 1224.
- the element reading unit 1223 is configured to read the value of the flag syntax element corresponding to the first QM from the first parameter set.
- the QM determining unit 1224 is configured to determine that the first QM belongs to the effective QM if the value of the flag syntax element corresponding to the first QM is a first value, and to determine that the first QM does not belong to the effective QM if the value of the flag syntax element corresponding to the first QM is a second value.
- the first chroma QM and the second chroma QM having the same prediction mode and the same size share the same flag syntax element.
- the flag syntax element is scaling_matrix_present_flag.
- the first parameter set is APS.
- all elements of other QMs that do not belong to the effective QM are predefined as default values.
- the default value is 16.
- the effective QM is determined according to the syntax elements contained in the first parameter set, and the effective QM, that is, the QM actually used when the transform coefficients are quantized, is then decoded. In this way, the decoder only needs to decode the effective QM, thereby reducing the computational complexity of the decoder.
- FIG. 14 shows a block diagram of a video encoding device provided by an embodiment of the present application.
- the device has the function of realizing the foregoing example of the video encoding method, and the function can be realized by hardware, or by hardware executing corresponding software.
- the device can be the encoding end device described above, or it can be set on the encoding end device.
- the device 1400 may include: a QM determination module 1410 and a QM encoding module 1420.
- the QM determination module 1410 is configured to determine the effective QM corresponding to the video frame to be encoded, and the effective QM refers to the QM actually used when the transform coefficient is quantized in the encoding process of the video frame to be encoded.
- the QM encoding module 1420 is configured to encode the syntax elements used to determine the effective QM, together with the effective QM, to generate a code stream corresponding to a first parameter set, where the first parameter set is a parameter set used to define QM-related syntax elements.
- the effective QM corresponding to the video frame to be encoded is determined, and the effective QM refers to the QM actually used when the transform coefficients are quantized in the encoding process of the video frame to be encoded.
- the syntax elements used to determine the effective QM, together with the effective QM, are then encoded to generate a code stream corresponding to the first parameter set.
- the encoder end only encodes and transmits the effective QM, which saves the codewords required for QM signaling and reduces bit overhead, and the decoder end only needs to decode the effective QM, which reduces the computational complexity at the decoder end.
- FIG. 15 shows a structural block diagram of a computer device provided by an embodiment of the present application.
- the computer device may be the encoding end device described above, or the decoding end device described above.
- the computer device 150 may include: a processor 151, a memory 152, a communication interface 153, an encoder/decoder 154, and a bus 155.
- the processor 151 includes one or more processing cores, and the processor 151 executes various functional applications and information processing by running software programs and modules.
- the memory 152 may be used to store a computer program, and the processor 151 is used to execute the computer program to implement the foregoing video encoding method or the foregoing video decoding method.
- the communication interface 153 can be used to communicate with other devices, such as receiving audio and video data.
- the encoder/decoder 154 can be used to implement encoding and decoding functions, such as encoding and decoding audio and video data.
- the memory 152 is connected to the processor 151 through a bus 155.
- the memory 152 can be implemented by any type of volatile or non-volatile storage device or a combination thereof.
- the volatile or non-volatile storage device includes, but is not limited to: magnetic disks or optical disks, EEPROM (Electrically Erasable Programmable Read-Only Memory), EPROM (Erasable Programmable Read-Only Memory), SRAM (Static Random-Access Memory), ROM (Read-Only Memory), magnetic memory, flash memory, and PROM (Programmable Read-Only Memory).
- the structure shown in FIG. 15 does not constitute a limitation on the computer device 150, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
- a computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set, when executed by a processor, implements the foregoing video decoding method or the foregoing video encoding method.
- a computer program product is also provided.
- when the computer program product is executed by a processor, it is used to implement the above-mentioned video decoding method or the above-mentioned video encoding method.
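The flag-based signaling described for the QM encoding module 1420 and the QM determination unit 1224 can be illustrated with a minimal Python sketch. It is not the codec's actual syntax: the flat symbol list stands in for the code stream of the first parameter set, the identifiers `(name, size)` are hypothetical, the flag values 1 and 0 are assumed for the "first value" and "second value", and matrix elements are written as raw integers rather than entropy-coded. The shared flag for the two chroma QMs is not modeled. Only the default value 16 for non-effective QMs is taken from the text.

```python
from typing import Dict, List, Tuple

DEFAULT_VALUE = 16      # default element value for non-effective QMs (from the text)
PRESENT, ABSENT = 1, 0  # assumed "first value" / "second value" of the flag

QmId = Tuple[str, int]  # hypothetical identifier: (name, size)
Matrix = List[List[int]]


def encode_parameter_set(all_ids: List[QmId],
                         effective: Dict[QmId, Matrix]) -> List[int]:
    """Write one flag per QM, followed by the elements of effective QMs only."""
    symbols: List[int] = []
    for qm_id in all_ids:
        if qm_id in effective:
            symbols.append(PRESENT)
            for row in effective[qm_id]:
                symbols.extend(row)
        else:
            symbols.append(ABSENT)  # nothing else is transmitted for this QM
    return symbols


def decode_parameter_set(all_ids: List[QmId],
                         symbols: List[int]) -> Dict[QmId, Matrix]:
    """Read each flag; decode effective QMs and fill the rest with the default."""
    qms: Dict[QmId, Matrix] = {}
    pos = 0
    for qm_id in all_ids:
        _, size = qm_id
        flag = symbols[pos]
        pos += 1
        if flag == PRESENT:
            values = symbols[pos:pos + size * size]
            pos += size * size
            qms[qm_id] = [values[r * size:(r + 1) * size] for r in range(size)]
        else:
            qms[qm_id] = [[DEFAULT_VALUE] * size for _ in range(size)]
    return qms


# Usage: only one of three QMs is effective, so only its elements are sent.
all_ids = [("intra_luma", 2), ("intra_chroma", 2), ("inter_luma", 4)]
effective = {("intra_luma", 2): [[16, 20], [20, 28]]}
stream = encode_parameter_set(all_ids, effective)
decoded = decode_parameter_set(all_ids, stream)
```

In this sketch the two non-effective QMs cost one flag symbol each, which mirrors the bit saving and the reduced decoding work that the text attributes to coding only the effective QM.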
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Discrete Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Priority Applications (7)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022515567A JP7744023B2 (ja) | 2019-12-18 | 2020-12-08 | Video decoding method, video encoding method, apparatus, device, and storage medium |
| KR1020247026780A KR20240127485A (ko) | 2019-12-18 | 2020-12-08 | Video decoding method, video coding method, device and apparatus, storage medium |
| KR1020227008146A KR102695010B1 (ko) | 2019-12-18 | 2020-12-08 | Video decoding method, video coding method, device and apparatus, storage medium |
| EP20902574.1A EP3975558A4 (en) | 2019-12-18 | 2020-12-08 | VIDEO DECODING METHOD, VIDEO ENCODING METHOD, DEVICE AND FACILITIES, AND STORAGE MEDIA |
| US17/506,784 US12034950B2 (en) | 2019-12-18 | 2021-10-21 | Video decoding method and apparatus, video encoding method and apparatus, device, and storage medium |
| US18/663,766 US12382077B2 (en) | 2019-12-18 | 2024-05-14 | Video decoding method and apparatus, video encoding method and apparatus, device, and storage medium |
| JP2024119640A JP2024138096A (ja) | 2019-12-18 | 2024-07-25 | Video decoding method, video encoding method, apparatus, device, and storage medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911309768.6 | 2019-12-18 | | |
| CN201911309768.6A CN111050171B (zh) | 2019-12-18 | 2019-12-18 | Video decoding method, apparatus, device, and storage medium |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/506,784 Continuation US12034950B2 (en) | 2019-12-18 | 2021-10-21 | Video decoding method and apparatus, video encoding method and apparatus, device, and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021121080A1 (zh) | 2021-06-24 |
Family
ID=70237824
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2020/134581 Ceased WO2021121080A1 (zh) | 2019-12-18 | 2020-12-08 | Video decoding method, video encoding method, apparatus, device, and storage medium |
Country Status (6)
| Country | Link |
|---|---|
| US (2) | US12034950B2 |
| EP (1) | EP3975558A4 |
| JP (2) | JP7744023B2 |
| KR (2) | KR102695010B1 |
| CN (2) | CN115720265B |
| WO (1) | WO2021121080A1 |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
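| WO2021060846A1 (ko) * | 2019-09-23 | 2021-04-01 | 엘지전자 주식회사 | Image encoding/decoding method and apparatus using a quantization matrix, and method for transmitting a bitstream |
| CN115720265B (zh) | 2019-12-18 | 2024-08-09 | 腾讯科技(深圳)有限公司 | Video encoding and decoding method, apparatus, device, and storage medium |
| CN111147858B (zh) * | 2019-12-31 | 2024-02-27 | 腾讯科技(深圳)有限公司 | Video decoding method, apparatus, device, and storage medium |
| CN115552900B (zh) * | 2020-02-21 | 2025-12-23 | 阿里巴巴(中国)有限公司 | Method for signaling maximum transform size and residual coding |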
| AU2020203330B2 (en) | 2020-05-21 | 2022-12-01 | Canon Kabushiki Kaisha | Method, apparatus and system for encoding and decoding a block of video samples |
| WO2025212860A1 (en) * | 2024-04-03 | 2025-10-09 | Bytedance Inc. | Method, apparatus, and medium for video processing |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101835039A (zh) * | 2009-03-09 | 2010-09-15 | 联发科技股份有限公司 | Method for video processing, and electronic apparatus for dequantization and quantization |
| US20140079329A1 (en) * | 2012-09-18 | 2014-03-20 | Panasonic Corporation | Image decoding method and image decoding apparatus |
| US20190149823A1 (en) * | 2017-11-13 | 2019-05-16 | Electronics And Telecommunications Research Institute | Method and apparatus for quantization |
| CN111050171A (zh) * | 2019-12-18 | 2020-04-21 | 腾讯科技(深圳)有限公司 | Video decoding method, apparatus, device, and storage medium |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2010288166A (ja) * | 2009-06-15 | 2010-12-24 | Panasonic Corp | Moving picture encoding apparatus, broadcast wave recording apparatus, and program |
| CN102577387B (zh) * | 2009-10-30 | 2015-01-14 | 松下电器(美国)知识产权公司 | Decoding method, decoding apparatus, encoding method, and encoding apparatus |
| JP5966932B2 (ja) * | 2011-02-10 | 2016-08-10 | ソニー株式会社 | Image processing apparatus, image processing method, program, and medium |
| WO2012134046A2 (ko) * | 2011-04-01 | 2012-10-04 | 주식회사 아이벡스피티홀딩스 | Method for encoding video |
| WO2013032794A1 (en) * | 2011-08-23 | 2013-03-07 | Mediatek Singapore Pte. Ltd. | Method and system of transform block processing according to quantization matrix in video coding |
| BR122020017515B1 (pt) * | 2012-01-20 | 2022-11-22 | Electronics And Telecommunications Research Institute | Video decoding method |
| JPWO2013154028A1 (ja) | 2012-04-13 | 2015-12-17 | ソニー株式会社 | Image processing apparatus and method |
| US20130272391A1 (en) * | 2012-04-16 | 2013-10-17 | Futurewei Technologies, Inc. | Method and Apparatus of Quantization Matrix Coding |
| EP4550782A1 (en) * | 2018-03-28 | 2025-05-07 | Sony Group Corporation | Image processing devices and image processing methods |
| CN110536133B (zh) | 2018-05-24 | 2021-11-19 | 华为技术有限公司 | Video data decoding method and apparatus |
2019
- 2019-12-18: CN application CN202211386430.2A, patent CN115720265B (zh), status Active
- 2019-12-18: CN application CN201911309768.6A, patent CN111050171B (zh), status Active

2020
- 2020-12-08: KR application KR1020227008146A, patent KR102695010B1 (ko), status Active
- 2020-12-08: JP application JP2022515567A, patent JP7744023B2 (ja), status Active
- 2020-12-08: KR application KR1020247026780A, patent KR20240127485A (ko), status Pending
- 2020-12-08: WO application PCT/CN2020/134581, patent WO2021121080A1 (zh), status Ceased
- 2020-12-08: EP application EP20902574.1A, patent EP3975558A4 (en), status Pending

2021
- 2021-10-21: US application US17/506,784, patent US12034950B2 (en), status Active

2024
- 2024-05-14: US application US18/663,766, patent US12382077B2 (en), status Active
- 2024-07-25: JP application JP2024119640A, patent JP2024138096A (ja), status Pending
Non-Patent Citations (5)
| Title |
|---|
| P. DE LAGRANGE (INTERDIGITAL), F. LE LÉANNEC, E. FRANÇOIS, K. NASER (INTERDIGITAL): "Non-CE7: Quantization matrices with single identifier and prediction from larger ones", 15. JVET MEETING; 20190703 - 20190712; GOTHENBURG; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-O0223 ; m48333, 24 June 2019 (2019-06-24), XP030218892 * |
| P. DE LAGRANGE (INTERDIGITAL), F. LELÉANNEC (INTERDIGITAL), E. FRANÇOIS (INTERDIGITAL), K. NASER (INTERDIGITAL): "AHG15: Quantization matrices with single identifier and enhanced prediction", 128. MPEG MEETING; 20191007 - 20191011; GENEVA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. m50060, 21 September 2019 (2019-09-21), XP030205961 * |
| See also references of EP3975558A4 |
| T. HASHIMOTO, E. SASAKI, T. IKAI (SHARP): "AHG15: Signaling scaling matrix for LFNST case", 128. MPEG MEETING; 20191007 - 20191011; GENEVA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. m50258, 25 September 2019 (2019-09-25), XP030206268 * |
| T. TOMA (PANASONIC), K. ABE (PANASONIC), S.-C. LIM, J. KANG (ETRI): "AHG18: Support of quantization matrices", 14. JVET MEETING; 20190319 - 20190327; GENEVA; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-N0204, 12 March 2019 (2019-03-12), XP030202687 * |
Also Published As
| Publication number | Publication date |
|---|---|
| EP3975558A4 (en) | 2023-04-19 |
| US20220046260A1 (en) | 2022-02-10 |
| CN115720265B (zh) | 2024-08-09 |
| KR20240127485A (ko) | 2024-08-22 |
| JP2024138096A (ja) | 2024-10-07 |
| CN111050171A (zh) | 2020-04-21 |
| KR102695010B1 (ko) | 2024-08-12 |
| US12382077B2 (en) | 2025-08-05 |
| CN115720265A (zh) | 2023-02-28 |
| US20240305804A1 (en) | 2024-09-12 |
| US12034950B2 (en) | 2024-07-09 |
| JP7744023B2 (ja) | 2025-09-25 |
| EP3975558A1 (en) | 2022-03-30 |
| KR20220044352A (ko) | 2022-04-07 |
| CN111050171B (zh) | 2022-10-11 |
| JP2022548354A (ja) | 2022-11-18 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US12382077B2 (en) | Video decoding method and apparatus, video encoding method and apparatus, device, and storage medium | |
| CN104041035B (zh) | Lossless coding and associated signaling methods for compound video | |
| CN113545091B (zh) | Decoding method and apparatus for performing maximum transform size control on a video sequence | |
| US12137234B2 (en) | Video encoding and decoding methods and apparatuses, device, and storage medium | |
| US12335524B2 (en) | Frequency-dependent joint component secondary transform | |
| CN116744003A (zh) | Encoding and decoding method, storage medium, and transmission method | |
| US11968366B2 (en) | Video encoding method and apparatus, video decoding method and apparatus, device, and storage medium | |
| US11575937B2 (en) | Methods for efficient application of LGT | |
| HK40080503B (zh) | Video encoding and decoding method, apparatus, device, and storage medium | |
| HK40080503A (en) | Video encoding and decoding method, device, equipment and storage medium | |
| HK40021740A (en) | Video decoding method and device, apparatus and storage medium | |
| HK40021740B (en) | Video decoding method and device, apparatus and storage medium | |
| US20250358402A1 (en) | Adaptive frame padding | |
| US20260032262A1 (en) | Systems and methods for implicit derivation in a recursive intra region | |
| US20250227232A1 (en) | Secondary transform set selection | |
| CN120692399A (zh) | Video encoding/decoding method, video bitstream processing method, computing system, and storage medium | |
| KR20250169174A (ko) | Systems and methods for dependent quantization based on current dequantization states | |
| CN119948872A (zh) | Systems and methods for extended multi-residual block coding | |
| HK40048752B (zh) | Video decoding method, apparatus, device, and storage medium | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20902574; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2020902574; Country of ref document: EP; Effective date: 20211221 |
| | ENP | Entry into the national phase | Ref document number: 2022515567; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 20227008146; Country of ref document: KR; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |