WO2023004590A1 - Video decoding and encoding methods and devices, and storage medium - Google Patents

Video decoding and encoding methods and devices, and storage medium Download PDF

Info

Publication number
WO2023004590A1
WO2023004590A1 (application PCT/CN2021/108723)
Authority
WO
WIPO (PCT)
Prior art keywords
quantization
image
encoded
scalar
quantization mode
Prior art date
Application number
PCT/CN2021/108723
Other languages
French (fr)
Chinese (zh)
Inventor
唐桐
Original Assignee
Oppo广东移动通信有限公司 (Guangdong OPPO Mobile Telecommunications Corp., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 (Guangdong OPPO Mobile Telecommunications Corp., Ltd.)
Priority to PCT/CN2021/108723 (WO2023004590A1)
Priority to CN202180099003.3A (CN117426089A)
Publication of WO2023004590A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H04N19/126 Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers

Definitions

  • Embodiments of the present disclosure relate to, but are not limited to, the technical field of video data processing, and in particular to a video decoding method, an encoding method, a device, a system, a storage medium, and a code stream.
  • FIG. 1 is a structural block diagram of a video codec system that can be used in an embodiment of the present disclosure
  • Internationally, mainstream video coding standards include H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), H.266/Versatile Video Coding (VVC), the standards of MPEG (Moving Picture Experts Group), AOM (Alliance for Open Media) and AVS (Audio Video coding Standard), as well as extensions of these standards or any other custom standards. These standards use video compression technology to reduce the amount of transmitted and stored data, so as to achieve more efficient video encoding, decoding, transmission, and storage.
  • FIG. 1 is a block diagram of a video encoding and decoding system applicable to an embodiment of the present disclosure.
  • the system is divided into an encoding-side device 1 and a decoding-side device 2 , and the encoding-side device 1 encodes video images to generate a code stream.
  • the device 2 on the decoding side can decode the code stream to obtain a reconstructed video image.
  • the encoding side device 1 and the decoding side device 2 may include one or more processors and a memory coupled to the one or more processors, such as random access memory, electrically erasable programmable read-only memory, flash memory, or other media.
  • the encoding side device 1 and the decoding side device 2 can be implemented with various devices, such as desktop computers, mobile computing devices, notebook computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, vehicle-mounted computers, or other similar devices.
  • the decoding side device 2 includes an input interface 21 , a decoder 23 and a display device 25 .
  • input interface 21 includes at least one of a receiver and a modem.
  • the input interface 21 can receive the code stream via the link 3 or from a storage device.
  • the decoder 23 decodes the received code stream.
  • the display device 25 is used for displaying the decoded data, and the display device 25 may be integrated with other devices of the decoding side device 2 or provided separately.
  • the display device 25 may be, for example, a liquid crystal display, a plasma display, an organic light emitting diode display or other types of display devices.
  • the device 2 on the decoding side may not include the display device 25 , or may include other devices or devices for applying the decoded data.
  • The encoder 13 and the decoder 23 of FIG. 1 may be implemented using any one, or any combination, of the following circuits: one or more microprocessors, digital signal processors, application-specific integrated circuits, field-programmable gate arrays, discrete logic, or hardware. If the present disclosure is implemented partially in software, the instructions for the software may be stored in a suitable non-transitory computer-readable storage medium and executed in hardware using one or more processors to implement the methods of the present disclosure.
  • the video encoder 20 is used to encode video data to generate code streams.
  • the video encoder 20 includes a prediction processing unit 100, a partitioning unit 101, a prediction residual generation unit 102, a transform processing unit 104, a quantization unit 106, an inverse quantization unit 108, an inverse transform processing unit 110, a reconstruction unit 112, a filter unit 113, a decoded picture buffer 114, and an entropy coding unit 116.
  • the prediction processing unit 100 includes an inter prediction processing unit 121 and an intra prediction processing unit 126 .
  • video encoder 20 may contain more, fewer or different functional components than this example. Both the prediction residual generation unit 102 and the reconstruction unit 112 are represented by circles with plus signs in the figure.
  • the inter prediction processing unit 121 may perform inter prediction on the PU to generate prediction data of the PU, the prediction data including the prediction block of the PU, motion information of the PU and various syntax elements.
  • the quantization unit 106 can quantize the coefficients in the coefficient block based on a selected quantization parameter (QP). Quantization may cause quantization losses. By adjusting the QP value, the degree of quantization of the coefficient block can be adjusted.
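  • As a rough illustration of how the QP controls the degree of quantization, the sketch below uses the approximate HEVC/VVC relationship in which the quantization step size doubles for every increase of 6 in QP; the rounding offset and the scaling are simplifications for illustration, not the exact behavior of any particular encoder.

```python
# Minimal sketch: scalar quantization of a coefficient block controlled by QP.
# Assumes the approximate HEVC/VVC-style relation Qstep ~ 2^((QP - 4) / 6);
# the rounding offset (0.5) is a simplification of a real encoder's behavior.

def qp_to_step(qp: int) -> float:
    return 2.0 ** ((qp - 4) / 6.0)

def quantize_block(coeffs, qp: int):
    step = qp_to_step(qp)
    return [int(c / step + (0.5 if c >= 0 else -0.5)) for c in coeffs]

def dequantize_block(levels, qp: int):
    step = qp_to_step(qp)
    return [l * step for l in levels]

coeffs = [52.0, -7.3, 3.1, 0.4]
for qp in (22, 32, 42):
    levels = quantize_block(coeffs, qp)
    print(qp, levels, [round(x, 1) for x in dequantize_block(levels, qp)])
```

  • A higher QP means a larger step size, coarser levels, and larger quantization losses, which is the trade-off the bullet above describes.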
  • QP quantization parameter
  • the inverse quantization unit 108 and the inverse transformation unit 110 may respectively apply inverse quantization and inverse transformation to the coefficient blocks to obtain TU-associated reconstructed prediction residual blocks.
  • the reconstruction unit 112 may generate a reconstructed block of the CU based on the reconstructed prediction residual block and the prediction block generated by the prediction processing unit 100 .
  • the filter unit 113 performs loop filtering on the reconstructed block and stores it in the decoded picture buffer 114 .
  • the intra prediction processing unit 126 may extract the reconstructed reference information adjacent to the PU from the reconstructed blocks cached in the decoded picture buffer 114 to perform intra prediction on the PU.
  • Inter prediction processing unit 121 may perform inter prediction on PUs of other pictures using reference pictures cached by decoded picture buffer 114 that contain reconstructed blocks.
  • the entropy coding unit 116 can perform entropy coding operations on the received data (such as syntax elements, quantized coefficient blocks, motion information, etc.), for example context-adaptive variable-length coding (CAVLC: Context Adaptive Variable Length Coding) or context-based adaptive binary arithmetic coding (CABAC: Context-based Adaptive Binary Arithmetic Coding), and output the code stream (that is, the encoded video code stream).
  • CAVLC Context Adaptive Variable Length Coding
  • CABAC Context-based Adaptive Binary Arithmetic Coding
  • FIG. 3 is a structural block diagram of an exemplary video decoder.
  • the description is mainly based on the terminology and block partitioning of the H.265/HEVC standard, but the structure of the video decoder can also be used for decoding video under H.264/AVC, H.266/VVC, and other similar standards.
  • the entropy decoding unit 150 may perform entropy decoding on the received code stream to extract information such as syntax elements, quantized coefficient blocks, and PU motion information.
  • the prediction processing unit 152 , the inverse quantization unit 154 , the inverse transform processing unit 156 , the reconstruction unit 158 and the filter unit 159 can all perform corresponding operations based on the syntax elements extracted from the code stream.
  • the inverse quantization unit 154 may inverse quantize the quantized TU-associated coefficient blocks.
  • Inverse transform processing unit 156 may apply one or more inverse transforms to the inverse quantized coefficient block in order to generate a reconstructed prediction residual block for the TU.
  • the reconstruction unit 158 may obtain the reconstruction block of the CU based on the reconstruction prediction residual block associated with the TU and the prediction block of the PU generated by the prediction processing unit 152 (ie intra prediction data or inter prediction data).
  • the filter unit 159 may perform loop filtering on the reconstructed block of the CU to obtain a reconstructed picture.
  • the reconstructed pictures are stored in the picture buffer 160 .
  • the picture buffer 160 can provide reference pictures for subsequent motion compensation, intra prediction, inter prediction, etc., and can also output the reconstructed video data as decoded video data for presentation on a display device.
  • Encoding at the encoder side and decoding at the decoder side may also be referred to collectively as coding. From the context of the relevant steps, those skilled in the art can tell whether the coding (decoding) mentioned later refers to encoding at the encoder side or decoding at the decoder side.
  • CTU is the abbreviation of Coding Tree Unit, which is equivalent to the macroblock in H.264/AVC.
  • a coding tree unit contains one luma coding tree block (CTB) and two co-located chroma coding tree blocks (Cb and Cr).
  • the coding unit CU (Coding Unit) is the basic unit for various types of coding operations or decoding operations in the video coding and decoding process, such as CU-based prediction, transformation, entropy coding, and other operations.
  • CU refers to a two-dimensional sampling point array, which may be a square array or a rectangular array.
  • a 4x8 CU can be regarded as a rectangular array of 4x8 = 32 sampling points.
  • a CU may also be referred to as an image block.
  • a CU may include a coding block and corresponding syntax elements.
  • the residual block refers to the residual image block formed by subtracting the prediction block from the current block to be coded, after the prediction block of the current block has been generated by inter prediction and/or intra prediction; it may also be called residual data.
  • the coefficient block is either a transform block containing transform coefficients, obtained by transforming the residual block, or, if the residual block is not transformed, the residual block itself containing the residual data (residual signal).
  • the coefficients are either the coefficients of the transform block obtained by transforming the residual block or the coefficients of the residual block; performing entropy coding on the coefficients means entropy coding the quantized coefficients of the transform block or, if no transform is applied to the residual data, entropy coding the quantized coefficients of the residual block.
  • the untransformed residual signal and the transformed residual signal may also be referred to collectively as coefficients. For effective compression, the coefficients generally need to be quantized, and the quantized coefficients may also be called levels.
  • dependent scalar quantization is realized by: (a) defining two scalar quantizers with different reconstruction levels; and (b) defining the switching rule between the two scalar quantizers.
  • the transition between the two scalar quantizers is implemented by a state machine with 4 states, as shown in Figure 5.
  • The state can take four different values: 0, 1, 2, and 3. It is uniquely determined by the parity of the transform coefficient levels preceding the current transform coefficient in encoding/reconstruction order.
  • The state is set to 0 at the start of inverse quantization of a transform block. Transform coefficients are reconstructed in scan order (i.e., in the same order as entropy decoding). After the current transform coefficient is reconstructed, the state is updated as shown above, where k denotes the transform coefficient level value.
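  • The patent's FIG. 4 (the two quantizers) and FIG. 5 (the 4-state machine) are not reproduced in this text, so the sketch below assumes the dependent-quantization design used in H.266/VVC: states 0 and 1 select quantizer Q0 (reconstruction levels at even multiples of the step size), states 2 and 3 select Q1 (odd multiples plus zero), and the next state is driven by the parity of the current level k.

```python
# Sketch of decoder-side dependent scalar quantization (DSQ), assuming the
# VVC-style design: two quantizers Q0/Q1 and a 4-state machine driven by the
# parity of the preceding transform coefficient levels.

# next_state[state][k & 1] -- transition table as used in H.266/VVC.
NEXT_STATE = [
    [0, 2],  # from state 0
    [2, 0],  # from state 1
    [1, 3],  # from state 2
    [3, 1],  # from state 3
]

def reconstruct(levels, step):
    """Reconstruct coefficients from levels in scan order."""
    state = 0                      # state is 0 at the start of the block
    out = []
    for k in levels:
        if state < 2:              # quantizer Q0: t' = 2 * k * step
            t = 2 * k * step
        else:                      # quantizer Q1: t' = (2*k - sgn(k)) * step
            sgn = (k > 0) - (k < 0)
            t = (2 * k - sgn) * step
        out.append(t)
        state = NEXT_STATE[state][k & 1]   # update state from parity of k
    return out

print(reconstruct([3, -2, 0, 1, -1], step=1.0))
```

  • Because the quantizer chosen for each coefficient depends on the levels decoded before it, the same level value can reconstruct to different values depending on the state, which is what distinguishes DSQ from independent scalar quantization.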
  • At the encoding end, dependent scalar quantization (DSQ) involves the quantization unit 106 and the inverse quantization unit 108 shown in FIG. 2; at the decoding end, DSQ involves the inverse quantization unit 154 shown in FIG. 3.
  • An implementation of an encoder using DSQ is as follows. First, the input image is partitioned into non-overlapping CTU blocks. Then each CTU is processed in raster-scan order: the CTU is divided into several blocks (CU1, CU2, ..., CUi, ...) according to the optimal mode determined by intra/inter prediction, and the corresponding prediction residual blocks (resB1, resB2, ..., resBi, ...) are obtained. The residual blocks are then transformed and quantized. The main steps of quantization are as follows:
  • An implementation of a decoder using DSQ is as follows. First, entropy decoding is performed on the input code stream to obtain the transformed and quantized residual blocks (resB*1, resB*2, ..., resB*i, ...). The residual blocks are then inverse quantized. The main steps are as follows:
  • An inverse transform is performed on the inverse-quantized residual block, and the inverse-transformed residual values and the prediction values are added to obtain a reconstructed CU, until the reconstruction of the current CTU is completed.
  • the DSQ technology applied to the video codec chip can further improve the coding performance.
  • an embodiment of the present disclosure proposes an adaptive quantization technology based on spatial continuity analysis.
  • the technical scheme first analyzes the video content, calculates the continuity of the image content, and then designs an adaptive quantization technology based on this. When the video image content shows strong discontinuity, non-dependent scalar quantization is adopted; otherwise, dependent scalar quantization is adopted.
  • An embodiment of the present disclosure provides a video coding method, as shown in FIG. 6 , including:
  • Step 601: determine the spatial continuity feature value of the image to be encoded;
  • Step 602: determine, according to the spatial continuity feature value, whether to quantize the image to be encoded using dependent scalar quantization (DSQ) or non-dependent scalar quantization.
  • the video coding method includes:
  • Step 601: determine the spatial continuity feature value of the image to be encoded;
  • Step 6021: if the spatial continuity feature value is within the feature value threshold range, quantize the image to be encoded using dependent scalar quantization;
  • Step 6022: if the spatial continuity feature value is outside the feature value threshold range, quantize the image to be encoded using non-dependent scalar quantization.
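  • A minimal sketch of this selection is given below; following the later example in this disclosure, "within the threshold range" is taken to mean that the feature value is smaller than a preset threshold, and both the threshold value and the feature computation are placeholders rather than values given by the patent.

```python
# Sketch of the adaptive choice between dependent scalar quantization (DSQ)
# and non-dependent scalar quantization for one image to be encoded.
# THRESHOLD is a hypothetical placeholder value, not one from the patent.

THRESHOLD = 20.0   # preset feature threshold T (value is an assumption)

def choose_quantization(feature_value: float):
    """Return (mode, flag) for the image: flag 1 -> DSQ, flag 0 -> non-DSQ."""
    if feature_value < THRESHOLD:          # strong spatial continuity
        return "dependent_scalar_quantization", 1
    return "non_dependent_scalar_quantization", 0

print(choose_quantization(7.5))    # -> DSQ, flag 1
print(choose_quantization(42.0))   # -> non-dependent SQ, flag 0
```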
  • the spatial continuity feature value is a value used to characterize the spatial continuity of the image content to be encoded.
  • Quantizing the image to be encoded using the dependent/non-dependent scalar quantization mode means that the dependent/non-dependent scalar quantization method is applied to the coefficient blocks corresponding to the image to be encoded.
  • the coefficient block includes transforming the residual block to obtain a transform block containing transform coefficients; or not transforming the residual block, including a residual block containing residual data (residual signal).
  • In step 6021, quantizing the image to be encoded using dependent scalar quantization includes sequentially performing the following steps on each residual block obtained after partitioning and predicting the image to be encoded:
  • the quantization of the image to be coded using a dependent scalar quantization method includes: sequentially performing the following steps on each residual block obtained after dividing and predicting the image to be coded:
  • the method further includes: performing entropy coding on the quantized coefficients to form an output code stream.
  • the image to be encoded is a frame to be encoded (Frame) or a slice to be encoded (Slice).
  • determining the spatial continuity feature value of the image to be encoded in step 601 includes:
  • the feature value of spatial continuity is determined according to the gradient map of the image to be encoded.
  • determining the spatial continuity feature value of the image to be encoded in step 601 includes:
  • the first gradient map is a gradient map corresponding to the image to be encoded;
  • the second gradient map is a gradient map corresponding to the first gradient map.
  • The first gradient Gmap(x, y) of each pixel in the first gradient map is determined from the image to be encoded, where (x, y) denotes a pixel in the image to be encoded, I(x, y) is the pixel value of pixel (x, y), and Gmap(x, y) is the first gradient of pixel (x, y).
  • In an embodiment, the pixel value I(x, y) is the luma component Y at pixel (x, y).
  • The second gradient GGmap(x, y) of each pixel in the second gradient map is determined from the first gradient map, where Gmap(x, y) is the first gradient of pixel (x, y) and GGmap(x, y) is the second gradient of pixel (x, y).
  • Calculating the average value according to the second gradient map includes calculating the average of the second gradients of all pixels in the second gradient map.
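  • The exact formulas for Gmap and GGmap are given in the original filing and are not reproduced in this extract; the sketch below therefore assumes a simple absolute-difference gradient operator for both gradient maps and uses the luma plane as I(x, y), purely to illustrate the first-gradient / second-gradient / average pipeline.

```python
import numpy as np

# Sketch of a spatial-continuity feature value computed from a first gradient
# map (gradient of the luma image) and a second gradient map (gradient of the
# first gradient map), averaged over all pixels. The gradient operator used
# here (horizontal + vertical absolute differences) is an assumption; the
# patent defines its own formulas for Gmap and GGmap.

def gradient_map(img: np.ndarray) -> np.ndarray:
    gx = np.abs(np.diff(img, axis=1, append=img[:, -1:]))
    gy = np.abs(np.diff(img, axis=0, append=img[-1:, :]))
    return gx + gy

def spatial_continuity_feature(luma: np.ndarray) -> float:
    gmap = gradient_map(luma.astype(np.float64))   # first gradient map
    ggmap = gradient_map(gmap)                     # second gradient map
    return float(ggmap.mean())                     # average over all pixels

luma = np.tile(np.arange(16, dtype=np.float64), (16, 1))  # smooth ramp image
print(spatial_continuity_feature(luma))  # small value -> strong continuity
```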
  • In step 6021, the spatial continuity feature value being within the feature threshold range includes: the spatial continuity feature value is smaller than a preset threshold, in which case it is determined to be within the feature threshold range. That is, the spatial continuity feature value of the image to be encoded is calculated according to the above method; when the feature value is smaller than the preset threshold, the spatial continuity of the image to be encoded is determined to be strong, and subsequent quantization and/or inverse quantization is performed using dependent scalar quantization (DSQ); when the feature value is greater than or equal to the preset threshold, the spatial continuity of the image to be encoded is determined to be not strong, and subsequent quantization and/or inverse quantization is performed using non-dependent scalar quantization. That is, in this embodiment, the feature value threshold range includes feature values smaller than the preset threshold.
  • The spatial continuity feature value is used to characterize the spatial continuity of the image content to be encoded; it is not limited to the method of calculating the first gradient map and the second gradient map in the above example.
  • Those skilled in the art can also choose other similar or equivalent methods to calculate the spatial continuity feature value of the image to be encoded.
  • Different calculation schemes have different corresponding feature value threshold ranges.
  • In all cases, an image to be encoded whose feature value is within the threshold range is an image with strong spatial continuity, and an image whose feature value is outside the threshold range is an image whose spatial continuity is not strong.
  • the specific calculation method of the spatial continuity feature value and the corresponding feature value threshold range are not limited to the aspects of the above examples of the present disclosure.
  • the method further includes:
  • Step 603: write a quantization mode identifier into the encoded code stream, where the quantization mode identifier indicates either the dependent scalar quantization mode or the non-dependent scalar quantization mode.
  • Writing the quantization mode identifier into the encoded code stream in step 603 includes writing the quantization mode identifier into one of the following syntax elements of the encoded code stream: sequence-level syntax elements, frame-level syntax elements, or slice-level syntax elements.
  • After the encoding end determines the quantization modes for different images, it writes the quantization mode identifier corresponding to the determined quantization mode into the coded code stream.
  • The quantization mode identifier may be carried in syntax elements at different levels of the code stream.
  • After the decoder parses the code stream and obtains the identifier, it executes the corresponding inverse quantization method according to the identifier.
  • The quantization mode identifier written by the encoding end is parsed and obtained by the decoding end, so at the decoding end the identifier can also be understood as an inverse quantization mode identifier.
  • In step 603, the quantization mode identifiers of the images to be encoded may also be set jointly as a combination of multiple identifiers; according to the preset combination rules, the decoder can parse out each individual quantization mode identifier.
  • For example, if the quantization mode of the first 5 images is marked as 1 and the quantization mode of the last 11 images is marked as 0, the 16-bit data 1111100000000000 can be set in the sequence-level syntax unit, indicating for each of the 16 images in the sequence whether its quantization mode is dependent scalar quantization (DSQ) or non-dependent scalar quantization; alternatively, 1, 5, 1; 6, 11, 0 can be set in the sequence-level syntax unit,
  • meaning that the quantization mode of the 5 consecutive images starting from the first image is marked as 1 (dependent scalar quantization, DSQ) and the quantization mode of the 11 consecutive images starting from the 6th image is marked as 0 (non-dependent scalar quantization); or, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 can be set in 16 image-level syntax units respectively.
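  • The following sketch illustrates the two combined representations described above for a 16-image sequence: a per-image bit string and run-length triples (start index, run length, flag). The actual syntax-element packaging (sequence/frame/slice level) is not modeled, and the helper functions are hypothetical.

```python
# Sketch: two ways to combine per-image quantization-mode flags (1 = DSQ,
# 0 = non-dependent scalar quantization), following the 16-image example
# above. Real bitstream syntax is not modeled; this only shows the mapping.

flags = [1] * 5 + [0] * 11          # first 5 images DSQ, last 11 non-DSQ

# Representation 1: one bit per image in the sequence-level syntax.
bit_string = "".join(str(f) for f in flags)          # "1111100000000000"

# Representation 2: run-length triples (first image index, run length, flag).
def to_runs(flags):
    runs, start = [], 0
    for i in range(1, len(flags) + 1):
        if i == len(flags) or flags[i] != flags[start]:
            runs.append((start + 1, i - start, flags[start]))  # 1-based index
            start = i
    return runs

def from_runs(runs):
    out = []
    for _, count, flag in runs:
        out.extend([flag] * count)
    return out

runs = to_runs(flags)               # [(1, 5, 1), (6, 11, 0)]
assert from_runs(runs) == flags
print(bit_string, runs)
```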
  • the solution provided by the embodiments of the present disclosure can indicate the image quantization or inverse quantization mode by extending the data carried in the encoded code stream.
  • The specific position of the carried data and/or the coding method of the carried data are not limited to the content of the examples disclosed above, and those skilled in the art may select different implementations based on the above examples.
  • An embodiment of the present disclosure also provides a video coding method, the steps related to quantization are shown in FIG. 8 , including:
  • Step 801: calculate the spatial continuity feature value ss of the i-th frame image;
  • Step 802: judge whether the feature value ss is less than the preset feature threshold T; if it is less than T, perform step 803; if it is greater than or equal to T, perform step 804;
  • Step 803: set the quantization mode flag pic_DSQ_enable_flag of the i-th frame image to 1;
  • the quantization mode identifier indicates the dependent scalar quantization mode
  • the following steps are performed in turn on each transformed-and-quantized residual block obtained by entropy decoding the code stream:
  • the decoding method described in the embodiments of the present disclosure focuses on the steps related to inverse quantization in the overall decoding process.
  • The implementation details of the other decoding steps can be implemented according to relevant specifications or schemes; they do not limit the scope of protection of the present disclosure and are not repeated here.
  • An embodiment of the present disclosure also provides a video decoding device, as shown in FIG. 10, including a processor and a memory storing a computer program that can run on the processor, wherein the processor, when executing the computer program, implements the video decoding method described in any embodiment of the present disclosure.
  • the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit.
  • Computer-readable media may include computer-readable storage media that correspond to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, eg, according to a communication protocol. In this manner, a computer-readable medium may generally correspond to a non-transitory tangible computer-readable storage medium or a communication medium such as a signal or carrier wave.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure.
  • a computer program product may comprise a computer readable medium.
  • As used herein, disk and disc include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Provided are video decoding and encoding methods, devices, and systems, and a storage medium. The decoding method comprises: parsing a code stream to obtain a quantization mode identifier of an image to be decoded (901); and inversely quantizing the image according to the quantization mode indicated by the quantization mode identifier (902), wherein the quantization mode identifier is an identifier which is used for indicating a dependent scalar quantization mode or a non-dependent scalar quantization mode and which is determined according to a spatial continuity feature value of the original image. The encoding method comprises: determining a spatial continuity feature value of an image to be encoded (601); and determining, according to the spatial continuity feature value, whether to quantize the image to be encoded using the dependent scalar quantization mode or the non-dependent scalar quantization mode (602). Using different quantization modes according to the spatial continuity features of the image to be encoded reduces encoding complexity without lowering encoding performance.

Description

Video decoding and encoding methods and devices, and storage medium
Technical Field
Embodiments of the present disclosure relate to, but are not limited to, the technical field of video data processing, and in particular to a video decoding method, an encoding method, a device, a system, a storage medium, and a code stream.
Background
Digital video compression technology mainly compresses huge amounts of digital image and video data to facilitate transmission and storage. With the proliferation of Internet video and ever-higher requirements for video clarity, although the existing digital video compression standards can already save a lot of video data, better digital video compression technology is still needed to reduce the bandwidth and traffic pressure of digital video transmission and to achieve more efficient video coding, decoding, transmission, and storage.
Summary of the Invention
The following is an overview of the subject matter described in detail herein. This summary is not intended to limit the scope of protection of the claims.
An embodiment of the present disclosure provides a video decoding method, including:
parsing a code stream to obtain a quantization mode identifier of an image to be decoded;
inversely quantizing the image to be decoded according to the quantization mode indicated by the quantization mode identifier;
wherein the quantization mode identifier is an identifier, determined according to a spatial continuity feature value of the original image, for indicating a dependent scalar quantization mode or a non-dependent scalar quantization mode.
An embodiment of the present disclosure also provides a video encoding method, including:
determining a spatial continuity feature value of an image to be encoded;
determining, according to the spatial continuity feature value, whether to quantize the image to be encoded using a dependent scalar quantization mode or a non-dependent scalar quantization mode.
In an embodiment of the present disclosure, when the spatial continuity feature value is within a feature value threshold range, the image to be encoded is quantized using the dependent scalar quantization mode;
when the spatial continuity feature value is outside the feature value threshold range, the image to be encoded is quantized using the non-dependent scalar quantization mode.
An embodiment of the present disclosure also provides a video decoding device, including a processor and a memory storing a computer program that can run on the processor, wherein the processor, when executing the computer program, implements the decoding method described in any embodiment of the present disclosure.
An embodiment of the present disclosure also provides a video encoding device, including a processor and a memory storing a computer program that can run on the processor, wherein the processor, when executing the computer program, implements the encoding method described in any embodiment of the present disclosure.
An embodiment of the present disclosure also provides a video encoding and decoding system, which includes the decoding device according to any embodiment of the present disclosure and/or the encoding device according to any embodiment of the present disclosure.
An embodiment of the present disclosure also provides a non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the video decoding method or the encoding method described in any embodiment of the present disclosure.
An embodiment of the present disclosure also provides a code stream, wherein the code stream is generated according to the encoding method described in any embodiment of the present disclosure, and the code stream includes a quantization mode identifier used to indicate the manner in which the encoding end quantizes the image.
Other aspects will be apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings
The accompanying drawings are used to provide an understanding of the embodiments of the present disclosure and constitute a part of the description; together with the embodiments of the present disclosure, they serve to explain the technical solutions of the present disclosure and do not constitute a limitation on the technical solutions of the present disclosure.
FIG. 1 is a structural block diagram of a video encoding and decoding system that can be used in an embodiment of the present disclosure;
FIG. 2 is a structural block diagram of a video encoder that can be used in an embodiment of the present disclosure;
FIG. 3 is a structural block diagram of a video decoder that can be used in an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of dependent quantizers that can be used in an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the state transitions determining which quantizer is used for a transform coefficient, which can be used in an embodiment of the present disclosure;
FIG. 6 is a flowchart of a video encoding method according to an embodiment of the present disclosure;
FIG. 7 is a flowchart of a video encoding method according to another embodiment of the present disclosure;
FIG. 8 is a flowchart of a video encoding method according to another embodiment of the present disclosure;
FIG. 9 is a flowchart of a video decoding method according to an embodiment of the present disclosure;
FIG. 10 is a structural block diagram of a video encoding device or decoding device according to an embodiment of the present disclosure.
Detailed Description
The present disclosure describes a number of embodiments, but the description is illustrative rather than restrictive, and it will be apparent to those of ordinary skill in the art that there can be many more embodiments and implementations within the scope encompassed by the embodiments described in the present disclosure.
In this disclosure, the words "exemplary" or "for example" are used to mean serving as an example, instance, or illustration. Any embodiment described in this disclosure as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments.
In describing representative exemplary embodiments, the specification may have presented a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not depend on the specific order of the steps described herein, the method or process should not be limited to the specific order of steps described. Other sequences of steps are also possible, as will be appreciated by those of ordinary skill in the art. Therefore, the specific order of the steps set forth in the specification should not be construed as a limitation on the claims. Furthermore, claims directed to the method and/or process should not be limited to performing their steps in the order written; those skilled in the art can readily appreciate that the order may be varied while still remaining within the spirit and scope of the disclosed embodiments.
Internationally, mainstream video coding standards include H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), H.266/Versatile Video Coding (VVC), the standards of MPEG (Moving Picture Experts Group), AOM (Alliance for Open Media) and AVS (Audio Video coding Standard), as well as extensions of these standards or any other custom standards. These standards use video compression technology to reduce the amount of transmitted and stored data, so as to achieve more efficient video encoding, decoding, transmission, and storage.
In H.264/AVC, the input image is divided into blocks of fixed size as the basic unit of encoding, called a macroblock (MB), which includes one luma block and two chroma blocks; the luma block size is 16×16. If 4:2:0 sampling is used, the chroma block size is half of the luma block size. In the prediction stage, according to the prediction mode, the macroblock is further divided into smaller blocks for prediction. In intra prediction, the macroblock can be divided into 16×16, 8×8, or 4×4 blocks, and intra prediction is performed on each small block separately. In the transform and quantization stage, the macroblock is divided into 4×4 or 8×8 blocks, and the prediction residual in each small block is transformed and quantized to obtain the quantized coefficients.
Compared with H.264/AVC, H.265/HEVC introduces improvements in multiple coding stages. In H.265/HEVC, an image is partitioned into coding tree units (CTU); the CTU is the basic unit of coding (corresponding to the macroblock in H.264/AVC). A CTU contains one luma coding tree block (CTB) and two chroma coding tree blocks; the maximum CU size in the H.265/HEVC standard is generally 64×64. To adapt to diverse video content and characteristics, the CTU is iteratively divided by a quadtree (QT) into a series of coding units (CU), the basic unit of intra/inter coding. A CU contains one luma coding block (CB) and two chroma coding blocks together with the related syntax structures; the maximum CU size is the CTU, and the minimum CU size is 8×8. The leaf-node CUs obtained by coding tree division can be classified into three types according to the prediction method: intra CUs using intra prediction, inter CUs using inter prediction, and skipped CUs. A skipped CU can be regarded as a special case of an inter CU that contains neither motion information nor prediction residual information. A leaf-node CU contains one or more prediction units (PU); H.265/HEVC supports PUs from 4×4 to 64×64 with eight partition modes in total. For intra coding there are two possible partition modes: Part_2Nx2N and Part_NxN. For the prediction residual signal, the CU is divided into transform units (TU) using a prediction residual quadtree. A TU contains one luma transform block (TB) and two chroma transform blocks. Only square partitioning is allowed, dividing one CB into 1 or 4 PBs. The same TU shares the same transform and quantization process, with supported sizes from 4×4 to 32×32. Unlike previous coding standards, in inter prediction a TB may cross PB boundaries to further maximize the coding efficiency of inter coding.
In H.266/VVC, the video image is first partitioned into coding tree units (CTU) similar to H.265/HEVC, but the maximum size is increased from 64×64 to 128×128. H.266/VVC introduces quadtree plus nested multi-type tree (MTT) partitioning, where the MTT includes binary tree (BT) and ternary tree (TT) splits; it unifies the concepts of CU, PU, and TU from H.265/HEVC and supports more flexible CU partition shapes. The CTU is partitioned according to a quadtree structure, and the leaf nodes are further partitioned by the MTT. The leaf nodes of the multi-type tree become coding units (CU); when a CU is not larger than the maximum transform unit (64×64), no further partitioning is performed for subsequent prediction and transform. In most cases the CU, PU, and TU have the same size. Considering the different characteristics of luma and chroma and the parallelism of practical implementations, in H.266/VVC the chroma components may use a separate partition tree structure instead of following the luma partition tree. In H.266/VVC, the chroma partitioning of I frames uses a chroma separate tree, while the chroma partitioning of P and B frames is consistent with the luma partitioning.
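As a simplified illustration of the recursive partitioning described above, the following sketch splits a CTU by quadtree only (the nested multi-type tree of H.266/VVC is omitted), using a placeholder split decision; it is not a rate-distortion search, and the minimum CU size used here is an assumption for the example.

```python
# Simplified sketch of recursive quadtree partitioning of a CTU into CUs.
# Only H.265/HEVC-style quadtree splitting is modeled; the nested multi-type
# tree (binary/ternary splits) of H.266/VVC is omitted, and the split
# decision below is a placeholder, not a rate-distortion search.

MIN_CU = 8

def split_ctu(x, y, size, should_split):
    """Return a list of leaf CUs as (x, y, size) tuples."""
    if size <= MIN_CU or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    cus = []
    for dy in (0, half):
        for dx in (0, half):
            cus.extend(split_ctu(x + dx, y + dy, half, should_split))
    return cus

# Example: split everything down to 32x32 inside a 128x128 CTU.
leaves = split_ctu(0, 0, 128, lambda x, y, s: s > 32)
print(len(leaves), leaves[:4])   # 16 CUs of size 32x32
```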
FIG. 1 is a block diagram of a video encoding and decoding system applicable to an embodiment of the present disclosure. As shown in FIG. 1, the system is divided into an encoding-side device 1 and a decoding-side device 2. The encoding-side device 1 encodes video images to generate a code stream, and the decoding-side device 2 can decode the code stream to obtain reconstructed video images. The encoding-side device 1 and the decoding-side device 2 may include one or more processors and a memory coupled to the one or more processors, such as random access memory, electrically erasable programmable read-only memory, flash memory, or other media. The encoding-side device 1 and the decoding-side device 2 may be implemented with various devices, such as desktop computers, mobile computing devices, notebook computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, vehicle-mounted computers, or other similar devices.
The decoding-side device 2 may receive the code stream from the encoding-side device 1 via a link 3. The link 3 includes one or more media or devices capable of moving the code stream from the encoding-side device 1 to the decoding-side device 2. In one example, the link 3 includes one or more communication media that enable the encoding-side device 1 to send the code stream directly to the decoding-side device 2. The encoding-side device 1 may modulate the code stream according to a communication standard (for example, a wireless communication protocol) and send the modulated code stream to the decoding-side device 2. The one or more communication media may include wireless and/or wired communication media, such as the radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (for example, the Internet). The one or more communication media may include routers, switches, base stations, or other equipment that facilitates communication from the encoding-side device 1 to the decoding-side device 2. In another example, the code stream may also be output from the output interface 15 to a storage device, and the decoding-side device 2 may read the stored data from the storage device via streaming or downloading. The storage device may include any of a variety of distributed-access or locally-accessed data storage media, such as a hard disk drive, a Blu-ray disc, a digital versatile disc, a compact disc read-only memory, flash memory, volatile or non-volatile memory, a file server, and so on.
In the example shown in FIG. 1, the encoding-side device 1 includes a data source 11, an encoder 13, and an output interface 15. In some examples, the data source 11 may include a video capture device (for example, a camera), an archive containing previously captured data, a feed interface for receiving data from a content provider, a computer graphics system for generating data, or a combination of these sources. The encoder 13 may encode the data from the data source 11 and output it to the output interface 15, which may include at least one of a regulator, a modem, and a transmitter.
In the example shown in FIG. 1, the decoding-side device 2 includes an input interface 21, a decoder 23, and a display device 25. In some examples, the input interface 21 includes at least one of a receiver and a modem. The input interface 21 may receive the code stream via the link 3 or from a storage device. The decoder 23 decodes the received code stream. The display device 25 is used to display the decoded data; it may be integrated with the other devices of the decoding-side device 2 or provided separately. The display device 25 may be, for example, a liquid crystal display, a plasma display, an organic light-emitting diode display, or another type of display device. In other examples, the decoding-side device 2 may not include the display device 25, or may include other devices or apparatuses that make use of the decoded data.
The encoder 13 and the decoder 23 of FIG. 1 may be implemented using any one, or any combination, of the following circuits: one or more microprocessors, digital signal processors, application-specific integrated circuits, field-programmable gate arrays, discrete logic, or hardware. If the present disclosure is implemented partially in software, the instructions for the software may be stored in a suitable non-transitory computer-readable storage medium and executed in hardware using one or more processors to implement the methods of the present disclosure.
FIG. 2 is a structural block diagram of an exemplary video encoder. In this example, the description is mainly based on the terminology and block partitioning of the H.265/HEVC standard, but the structure of the video encoder can also be used for encoding video under H.264/AVC, H.266/VVC, and other similar standards.
As shown in the figure, the video encoder 20 is used to encode video data and generate a code stream. The video encoder 20 includes a prediction processing unit 100, a partitioning unit 101, a prediction residual generation unit 102, a transform processing unit 104, a quantization unit 106, an inverse quantization unit 108, an inverse transform processing unit 110, a reconstruction unit 112, a filter unit 113, a decoded picture buffer 114, and an entropy coding unit 116. The prediction processing unit 100 includes an inter prediction processing unit 121 and an intra prediction processing unit 126. In other embodiments, the video encoder 20 may contain more, fewer, or different functional components than in this example. The prediction residual generation unit 102 and the reconstruction unit 112 are both represented in the figure by circles with plus signs.
The partitioning unit 101 cooperates with the prediction processing unit 100 to divide the received video data into slices, CTUs, or other larger units. The video data received by the partitioning unit 101 may be a video sequence including video frames such as I frames, P frames, or B frames.
The prediction processing unit 100 may divide a CTU into CUs and perform intra predictive coding or inter predictive coding on the CUs. When intra coding a CU, a 2N×2N CU may be divided into 2N×2N or N×N prediction units (PU) for intra prediction. When inter predicting a CU, a 2N×2N CU may be divided into PUs of 2N×2N, 2N×N, N×2N, N×N, or other sizes for inter prediction, and asymmetric PU partitioning may also be supported.
The inter prediction processing unit 121 may perform inter prediction on a PU to generate prediction data of the PU, including the prediction block of the PU, the motion information of the PU, and various syntax elements.
The intra prediction processing unit 126 may perform intra prediction on a PU to generate prediction data of the PU. The prediction data of a PU may include the prediction block of the PU and various syntax elements. The intra prediction processing unit 126 may try multiple candidate intra prediction modes and select the intra prediction mode with the least cost to perform intra prediction on the PU.
The prediction residual generation unit 102 may generate the prediction residual block of a CU based on the original block of the CU and the prediction blocks of the PUs into which the CU is divided.
The transform processing unit 104 may divide a CU into one or more transform units (TU); the prediction residual block associated with a TU is a sub-block obtained by dividing the prediction residual block of the CU. A TU-associated coefficient block is generated by applying one or more transforms to the TU-associated prediction residual block. For example, the transform processing unit 104 may apply a discrete cosine transform (DCT), a directional transform, or another transform to the prediction residual block associated with the TU, converting the prediction residual block from the pixel domain to the frequency domain.
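As an illustration of the pixel-domain to frequency-domain conversion mentioned above, the following sketch applies an orthonormal floating-point DCT-II to a small residual block; real codecs such as H.265/HEVC and H.266/VVC use scaled integer approximations of such transforms, so this is only a conceptual example.

```python
import numpy as np

# Conceptual sketch: 2-D DCT-II of a 4x4 prediction residual block.
# The floating-point orthonormal DCT here is only meant to show how the
# transform concentrates residual energy into a few low-frequency terms.

def dct_matrix(n: int) -> np.ndarray:
    m = np.zeros((n, n))
    for k in range(n):
        for i in range(n):
            m[k, i] = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] *= np.sqrt(1.0 / n)
    m[1:, :] *= np.sqrt(2.0 / n)
    return m

residual = np.array([[5, 4, 4, 3],
                     [4, 4, 3, 3],
                     [4, 3, 3, 2],
                     [3, 3, 2, 2]], dtype=float)

D = dct_matrix(4)
coeffs = D @ residual @ D.T        # forward 2-D transform
print(np.round(coeffs, 2))         # most energy in the top-left (DC) term
```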
The quantization unit 106 may quantize the coefficients in the coefficient block based on a selected quantization parameter (QP). Quantization may introduce quantization losses, and the degree of quantization of the coefficient block can be adjusted by adjusting the QP value.
The inverse quantization unit 108 and the inverse transform unit 110 may respectively apply inverse quantization and an inverse transform to the coefficient block to obtain the TU-associated reconstructed prediction residual block.
The reconstruction unit 112 may generate the reconstructed block of the CU based on the reconstructed prediction residual block and the prediction block generated by the prediction processing unit 100.
The filter unit 113 performs loop filtering on the reconstructed block and stores it in the decoded picture buffer 114. The intra prediction processing unit 126 may extract reconstructed reference information adjacent to a PU from the reconstructed blocks cached in the decoded picture buffer 114 to perform intra prediction on the PU. The inter prediction processing unit 121 may use reference pictures containing reconstructed blocks cached in the decoded picture buffer 114 to perform inter prediction on PUs of other pictures.
The entropy coding unit 116 may perform entropy coding operations on the received data (such as syntax elements, quantized coefficient blocks, motion information, etc.), for example context-adaptive variable-length coding (CAVLC) or context-based adaptive binary arithmetic coding (CABAC), and output the code stream (that is, the encoded video code stream).
FIG. 3 is a structural block diagram of an exemplary video decoder. The description in this example is mainly based on the terminology and block partitioning of the H.265/HEVC standard, but the structure of the video decoder can also be used for decoding video under H.264/AVC, H.266/VVC and other similar standards.
视频解码器30可对接收的码流解码,输出已解码视频数据。如图所示,视频解码器30包含熵解码单元150、预测处理单元152、反量化单元154、反变换处理单元156、重建单元158(图中用带加号的圆圈表示)、滤波器单元159,以及图片缓冲器160。在其它实施例中,视频解码器30可以包含更多、更少或不同的功能组件。The video decoder 30 can decode the received code stream and output decoded video data. As shown in the figure, the video decoder 30 includes an entropy decoding unit 150, a prediction processing unit 152, an inverse quantization unit 154, an inverse transformation processing unit 156, a reconstruction unit 158 (indicated by a circle with a plus sign in the figure), a filter unit 159 , and the picture buffer 160. In other embodiments, video decoder 30 may contain more, fewer or different functional components.
熵解码单元150可对接收的码流进行熵解码,提取语法元素、量化后的系数块和PU的运动信息等信息。预测处理单元152、反量化单元154、反变换处理单元156、重建单元158以及滤波器单元159均可基于从码流提取的语法元素来执行相应的操作。The entropy decoding unit 150 may perform entropy decoding on the received code stream to extract information such as syntax elements, quantized coefficient blocks, and PU motion information. The prediction processing unit 152 , the inverse quantization unit 154 , the inverse transform processing unit 156 , the reconstruction unit 158 and the filter unit 159 can all perform corresponding operations based on the syntax elements extracted from the code stream.
作为执行重建操作的功能组件,反量化单元154可对量化后的TU关联的系数块进行反量化。反变换处理单元156可将一种或多种反变换应用于反量化后的系数块以便产生TU的重建预测残差块。As a functional component performing a reconstruction operation, the inverse quantization unit 154 may inverse quantize the quantized TU-associated coefficient blocks. Inverse transform processing unit 156 may apply one or more inverse transforms to the inverse quantized coefficient block in order to generate a reconstructed prediction residual block for the TU.
The prediction processing unit 152 includes an inter prediction processing unit 162 and an intra prediction processing unit 164. If the PU is coded using intra prediction, the intra prediction processing unit 164 may determine the intra prediction mode of the PU based on the syntax elements parsed from the bitstream, and perform intra prediction according to the determined intra prediction mode and the reconstructed reference information adjacent to the PU obtained from the picture buffer 160, producing the prediction block of the PU. If the PU is coded using inter prediction, the inter prediction processing unit 162 may determine one or more reference blocks of the PU based on the motion information of the PU and the corresponding syntax elements, and generate the prediction block of the PU based on the reference blocks.
重建单元158可基于TU关联的重建预测残差块和预测处理单元152产生的PU的预测块(即帧内预测数据或帧间预测数据),得到CU的重建块。The reconstruction unit 158 may obtain the reconstruction block of the CU based on the reconstruction prediction residual block associated with the TU and the prediction block of the PU generated by the prediction processing unit 152 (ie intra prediction data or inter prediction data).
滤波器单元159可对CU的重建块执行环路滤波,得到重建的图片。重建的图片存储在图片缓冲器160中。图片缓冲器160可提供参考图片以用于后续运动补偿、帧内预测、帧间预测等,也可将重建的视频数据作为已解码视频数据输出,在显示装置上的呈现。The filter unit 159 may perform loop filtering on the reconstructed block of the CU to obtain a reconstructed picture. The reconstructed pictures are stored in the picture buffer 160 . The picture buffer 160 can provide reference pictures for subsequent motion compensation, intra prediction, inter prediction, etc., and can also output the reconstructed video data as decoded video data for presentation on a display device.
Because video coding includes both encoding and decoding, for convenience of the following description, encoding at the encoder side and decoding at the decoder side may also be collectively referred to as coding. From the context of the relevant steps, those skilled in the art can tell whether the coding mentioned later refers to encoding at the encoder side or decoding at the decoder side. The terms "coding block" or "video block" may be used in this application to refer to one or more blocks of samples, as well as the syntax structure used to code one or more such blocks of samples; instances of a coding block or video block may include a CTU, CU, PU, TU or subblock in H.265/HEVC, or a macroblock or macroblock partition in other video coding standards.
Some concepts involved in the embodiments of the present disclosure are introduced below. The descriptions of the embodiments of the present disclosure use terminology from H.265/HEVC or H.266/VVC for ease of explanation. However, the solutions provided by the embodiments of the present disclosure are not limited to H.265/HEVC or H.266/VVC; in fact, the technical solutions provided by the embodiments of the present disclosure can also be implemented in H.264/AVC, MPEG, AOM, AVS, etc., as well as in successors and extensions of these standards.
CTU is the abbreviation of Coding Tree Unit, which corresponds to the macroblock in H.264/AVC. Depending on the YUV sampling format, a coding tree unit (CTU) contains one luma coding tree block (CTB) and two chroma coding tree blocks (Cr and Cb) at the same position.
A coding unit (CU) is the basic unit on which various types of encoding or decoding operations are performed in the video coding process, such as CU-based prediction, transform, entropy coding and so on. A CU refers to a two-dimensional array of samples, which may be a square array or a rectangular array. For example, a CU of size 4x8 can be regarded as a rectangular array of 4x8 = 32 samples. A CU may also be called an image block. A CU may include a coding block and the corresponding syntax elements.
A residual block refers to the residual image block formed by subtracting the prediction block from the current block to be encoded after the prediction block of the current block has been generated by inter prediction and/or intra prediction; it may also be called residual data.
A coefficient block is either a transform block containing transform coefficients obtained by transforming a residual block, or, if the residual block is not transformed, a residual block containing residual data (the residual signal). In the embodiments of the present disclosure, the coefficients include the coefficients of a transform block obtained by transforming a residual block, or the coefficients of a residual block. Entropy coding the coefficients means entropy coding the quantized coefficients of the transform block or, if no transform is applied to the residual data, entropy coding the quantized coefficients of the residual block. The untransformed residual signal and the transformed residual signal may also be collectively referred to as coefficients. For effective compression, the coefficients generally need to be quantized, and the quantized coefficients may also be called levels.
量化通常被用于降低系数的动态范围,从而达到用更少的码字表达视频的目的。量化后的数值通常称为级别(level)。量化的操作通常是用系数除以量化步长,量化步长由在码流传递的量化因子决定。反量化则是通过级别乘以量化步长来完成。对于一个N×M大小的块,所有系数的量化可以独立的完成,这一技术被广泛地应用在很多国际视频压缩标准,例如H.265/HEVC、H.266/VVC等。特定的扫描顺序可以把一个二维的系数块变换成一维系数流。扫描顺序可以是Z型,水平,垂直或者其它任何一种顺序的扫描。在国际视频压缩标准中,量化操作可以利用系数间的相关性,利用已量化系数的特性来选择更优的量化方式,从而达到优化量化的目的。Quantization is usually used to reduce the dynamic range of coefficients, so as to achieve the purpose of expressing video with fewer codewords. The quantized value is usually called a level. The operation of quantization is usually to divide the coefficient by the quantization step size, and the quantization step size is determined by the quantization factor transmitted in the code stream. Inverse quantization is done by multiplying the level by the quantization step. For a block of N×M size, the quantization of all coefficients can be done independently. This technology is widely used in many international video compression standards, such as H.265/HEVC, H.266/VVC, etc. A specific scan order can transform a two-dimensional coefficient block into a one-dimensional coefficient stream. The scan sequence can be Z-type, horizontal, vertical or any other sequential scan. In the international video compression standard, the quantization operation can use the correlation between coefficients and the characteristics of the quantized coefficients to select a better quantization method, so as to achieve the purpose of optimizing quantization.
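For illustration only, a minimal sketch of the divide-by-step quantization and multiply-by-step inverse quantization described above follows; the rounding rule and the example step size are assumptions chosen for demonstration and are not taken from any particular standard.

```python
def quantize(coeff: float, step: float) -> int:
    # Encoder side: divide the coefficient by the quantization step size and
    # round to obtain the level.
    return int(round(coeff / step))

def dequantize(level: int, step: float) -> float:
    # Decoder side: multiply the level by the quantization step size.
    return level * step

# Example: with step size 8, a coefficient of 37.0 maps to level 5 and is
# reconstructed as 40.0, so the quantization loss is 3.0.
level = quantize(37.0, 8.0)
print(level, dequantize(level, 8.0))
```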
对变换系数进行量化和反量化是编码和解码中的重要一环,包括非依赖标量量化和依赖标量量化(Dependent Scalar Quantization,DSQ)。Quantization and dequantization of transform coefficients is an important part of encoding and decoding, including non-dependent scalar quantization and dependent scalar quantization (Dependent Scalar Quantization, DSQ).
In the embodiments of the present disclosure, non-dependent (independent) scalar quantization is a relative concept: any scalar quantization method that is not dependent scalar quantization (DSQ) is referred to as non-dependent scalar quantization. For example, RDOQ (rate-distortion optimized quantization) is also a kind of non-dependent scalar quantization. The non-dependent scalar quantization involved in the embodiments of the present disclosure is not limited to a specific method, and the corresponding scalar quantization methods in the relevant standards may be used.
依赖标量量化DSQ是指变换系数的一组可容许的重构值依赖于重建顺序在当前变换系数级 (transform coefficient level)之前的变换系数级的值。该方法的主要影响是相比于H.265/HEVC中非依赖标量量化,它的可允许的重建向量在N维向量空间(N表示变换块中变换系数的数量)中密度更大。Dependent scalar quantization DSQ means that a set of admissible reconstruction values of transform coefficients depends on the value of the transform coefficient level whose reconstruction order is before the current transform coefficient level (transform coefficient level). The main impact of this approach is that its admissible reconstruction vectors are denser in the N-dimensional vector space (where N denotes the number of transform coefficients in a transform block) compared to non-dependent scalar quantization in H.265/HEVC.
依赖标量量化实现过程为:(a)定义两个不同重建水平的标量量化器;(b)定义两个标量量化器间的转换方式。The realization process of dependent scalar quantization is: (a) define two scalar quantizers with different reconstruction levels; (b) define the conversion mode between the two scalar quantizers.
The two scalar quantizers are denoted Q0 and Q1, as shown in FIG. 4. The positions of the reconstruction levels are uniquely determined by the quantization step size △. The scalar quantizer used (Q0 or Q1) is not explicitly signaled in the bitstream; instead, it is determined by the parity of the transform coefficient level that precedes the current transform coefficient in coding/reconstruction order.
The switching between the two scalar quantizers (Q0 and Q1) is realized by a state machine with 4 states, as shown in FIG. 5. The state can take 4 different values: 0, 1, 2, 3. It is uniquely determined by the parities of the transform coefficient levels that precede the current transform coefficient in coding/reconstruction order. At the start of inverse quantization of a transform block, the state is set to 0. The transform coefficients are reconstructed in scan order (for example, the same order as entropy decoding). After the current transform coefficient is reconstructed, the state is updated as shown in the figure, where k denotes the transform coefficient level value.
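Since FIG. 5 is not reproduced here, the following sketch assumes the commonly cited VVC-style four-state transition table to illustrate how the state machine described above could be realized; the concrete table values and the state-to-quantizer mapping are assumptions that should be checked against the figure or the relevant standard.

```python
# Assumed VVC-style transition table: NEXT_STATE[state][k & 1], where k is the
# level of the coefficient just coded/reconstructed.
NEXT_STATE = [
    [0, 2],  # from state 0: even level -> state 0, odd level -> state 2
    [2, 0],  # from state 1
    [1, 3],  # from state 2
    [3, 1],  # from state 3
]

def quantizer_for_state(state: int) -> str:
    # Assumed convention: states 0 and 1 select Q0, states 2 and 3 select Q1.
    return "Q0" if state < 2 else "Q1"

def next_state(state: int, k: int) -> int:
    # Advance the state machine using the parity of the level k just processed.
    return NEXT_STATE[state][k & 1]
```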
在编码端,依赖标量量化DSQ涉及如图2所示的量化单元106和反量化单元108;在解码端,依赖标量量化DSQ涉及如图3所示的反量化单元154。At the encoding end, the scalar-dependent quantization DSQ involves the quantization unit 106 and the inverse quantization unit 108 as shown in FIG. 2 ; at the decoding end, the scalar-dependent quantization DSQ involves the inverse quantization unit 154 as shown in FIG. 3 .
In an embodiment of the present disclosure, an implementation of an encoder using DSQ is as follows. First, the input image is partitioned into multiple non-overlapping CTUs. Then, each CTU is processed in raster scan order; the CTU is divided into several blocks (CU1, CU2, ..., CUi, ...) according to the optimal mode determined by intra/inter prediction, and the corresponding prediction residual blocks (resB1, resB2, ..., resBi, ...) are obtained. The residual blocks are then transformed and quantized; the main steps of quantization are as follows:
(1)输入第i个残差块resBi,初始化状态state=0,j=0,依照光栅扫描顺序处理当前块的M个系数;(1) Input the i-th residual block resBi, initialize the state state=0, j=0, process the M coefficients of the current block according to the raster scanning order;
(2) Input the j-th coefficient, and determine the quantization step size △ and the level k of the transform coefficient coded before the current transform coefficient;
(3)根据△和k计算两个量化器Q0和Q1;(3) Calculate two quantizers Q0 and Q1 according to △ and k;
(4)根据k和state选择量化器进行量化;(4) Select a quantizer for quantization according to k and state;
(5)更新state,若j<M,j++,跳至步骤(2),否则i++,跳至步骤(1)。(5) Update state, if j<M, j++, skip to step (2), otherwise i++, skip to step (1).
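A hedged sketch of steps (1)-(5) for a single residual block follows. The reconstruction rules assumed for Q0 (even multiples of the step size) and Q1 (zero and the odd multiples) follow the JVET-K0071 description cited later in this disclosure, and the nearest-level decision is a simplification: a real encoder would normally make rate-distortion optimized level decisions rather than this greedy choice.

```python
# Assumed VVC-style state transitions, repeated here so the sketch is
# self-contained: NEXT_STATE[state][k & 1]; states 0/1 use Q0, states 2/3 use Q1.
NEXT_STATE = [[0, 2], [2, 0], [1, 3], [3, 1]]

def q0_level(c: float, step: float) -> int:
    # Q0 reconstructs at even multiples of the step size: t' = 2*k*step.
    return int(round(c / (2.0 * step)))

def q1_level(c: float, step: float) -> int:
    # Q1 reconstructs at zero and odd multiples of the step size:
    # t' = (2*k - sgn(k)) * step; pick the nearest reconstruction level.
    x = abs(c) / step
    k = 0 if x < 0.5 else int(round((x + 1.0) / 2.0))
    return k if c >= 0 else -k

def dsq_quantize_block(coeffs, step):
    """Steps (1)-(5) for one residual block: the state starts at 0, the M
    coefficients are scanned in raster order, Q0 or Q1 is selected from the
    state, and the state is updated with the parity of each produced level."""
    state = 0
    levels = []
    for c in coeffs:                        # step (2): the j-th coefficient
        if state < 2:                       # steps (3)-(4): select the quantizer
            k = q0_level(c, step)
        else:
            k = q1_level(c, step)
        levels.append(k)
        state = NEXT_STATE[state][k & 1]    # step (5): update the state
    return levels
```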
最后,对量化后的系数进行熵编码,输出码流等待传输。Finally, entropy encoding is performed on the quantized coefficients, and the output code stream is waiting for transmission.
在本公开一实施例中,一种使用DSQ的解码器的实施方法如下。首先,输入码流进行熵解码,并得到变换量化后的残差块(resB*1,resB*2,…,resB*i,…),接着,对残差块做反量化,主要步骤如下:In an embodiment of the present disclosure, an implementation method of a decoder using DSQ is as follows. First, entropy decoding is performed on the input code stream, and transformed and quantized residual blocks (resB*1, resB*2, ..., resB*i, ...) are obtained. Then, the residual block is dequantized. The main steps are as follows:
(1)输入第i个残差块resB*i,初始化状态state=0,j=0,依照光栅扫描顺序处理当前块的M个系数;(1) Input the i-th residual block resB*i, initialize the state state=0, j=0, process the M coefficients of the current block according to the raster scanning order;
(2) Input the j-th coefficient, and determine the quantization step size △ and the level k of the transform coefficient reconstructed before the current transform coefficient;
(3)根据△和k计算两个量化器Q0和Q1;(3) Calculate two quantizers Q0 and Q1 according to △ and k;
(4)根据k和state选择量化器进行反量化;(4) Select a quantizer for inverse quantization according to k and state;
(5)更新state,若j<M,j++,跳至步骤(2),否则i++,跳至步骤(1)。(5) Update state, if j<M, j++, skip to step (2), otherwise i++, skip to step (1).
然后,对反量化后的残差块做反变换,将反变换后的残差值和预测值叠加,得到重建CU,直到当前CTU完成重建。Then, inverse transform is performed on the inversely quantized residual block, and the inversely transformed residual value and predicted value are superimposed to obtain a reconstructed CU until the reconstruction of the current CTU is completed.
最后,将重建图像送入DBF/SAO/ALF滤波器,滤波后的图像送入缓冲区,等待视频播放。Finally, the reconstructed image is sent to the DBF/SAO/ALF filter, and the filtered image is sent to the buffer, waiting for the video to play.
可以看到,DSQ技术应用于视频编解码芯片中,可以进一步提升编码性能。It can be seen that the DSQ technology applied to the video codec chip can further improve the coding performance.
It has been found that the motivation for dependent scalar quantization (DSQ) can be simply understood as taking the spatial correlation of the image into account during quantization, so that the reconstruction vectors of each block are more compact. This is appropriate for natural-content video, since natural content exhibits strong continuity in both the temporal and spatial domains; as shown in proposal JVET-K0071, DSQ can therefore achieve performance gains of more than 1.5%.
However, the content of some video sequences (such as screen content video) exhibits strong spatial discontinuity, which runs counter to the premise of DSQ. For such content, using DSQ does not improve coding efficiency; instead, it increases coding complexity, produces a large amount of unnecessary overhead, and wastes computing resources.
为了进一步优化视频编解码性能,本公开实施例提出一种基于空域连续性分析的自适应量化技术。 该技术方案首先对视频内容进行分析,计算图像内容的连续性,然后基于此设计了自适应量化技术。当视频图像内容表现为较强的非连续性时,采用非依赖标量量化,否则,采用依赖标量量化。In order to further optimize video encoding and decoding performance, an embodiment of the present disclosure proposes an adaptive quantization technology based on spatial continuity analysis. The technical scheme first analyzes the video content, calculates the continuity of the image content, and then designs an adaptive quantization technology based on this. When the video image content shows strong discontinuity, non-dependent scalar quantization is adopted; otherwise, dependent scalar quantization is adopted.
本公开实施例提供一种视频编码方法,如图6所示,包括,An embodiment of the present disclosure provides a video coding method, as shown in FIG. 6 , including:
步骤601,确定待编码图像的空域连续性特征值;Step 601, determine the spatial continuity feature value of the image to be encoded;
Step 602: Determine, according to the spatial continuity feature value, whether to quantize the image to be encoded using dependent scalar quantization (DSQ) or non-dependent scalar quantization.
在本公开一实施例中,所述视频编码方法,如图7所示,包括,In an embodiment of the present disclosure, the video coding method, as shown in FIG. 7 , includes:
步骤601,确定待编码图像的空域连续性特征值;Step 601, determine the spatial continuity feature value of the image to be encoded;
步骤6021,在所述空域连续性特征值位于特征值阈值范围内的情况下,对所述待编码图像采用依赖标量量化方式进行量化;Step 6021, in the case that the feature value of the spatial continuity is within the feature value threshold range, quantize the image to be encoded using scalar-dependent quantization;
步骤6022,在所述空域连续性特征值位于特征值阈值范围外的情况下,对所述待编码图像采用非依赖标量量化方式进行量化。Step 6022: In the case that the spatial continuity feature value is outside the feature value threshold range, quantize the image to be coded using scalar-independent quantization.
其中,所述空域连续性特征值为用于表征所述待编码图像内容的空间连续性的值。Wherein, the spatial continuity feature value is a value used to characterize the spatial continuity of the image content to be encoded.
需要说明的是,步骤6021/6022中对所述待编码图像采用依赖标量\非依赖标量量化方式进行量化,是指:对待编码图像对应的系数块采用依赖标量\非依赖标量量化方式进行量化。其中,系数块包括对残差块进行变换得到含有变换系数的变换块;或者对残差块不进行变换,包括含有残差数据(残差信号)的残差块。It should be noted that, in step 6021/6022, the scalar-dependent/scalar-independent quantization method for the image to be encoded refers to: the scalar-dependent/scalar-independent quantization method is used for quantization of the coefficient block corresponding to the image to be encoded. Wherein, the coefficient block includes transforming the residual block to obtain a transform block containing transform coefficients; or not transforming the residual block, including a residual block containing residual data (residual signal).
在本公开一实施例中,步骤6022中对所述待编码图像采用非依赖标量量化方式进行量化,包括:In an embodiment of the present disclosure, in step 6022, the image to be encoded is quantized using a non-dependent scalar quantization method, including:
依次对根据所述待编码图像进行划分和预测后得到每一个残差块执行以下步骤:Perform the following steps for each residual block obtained after dividing and predicting the image to be encoded in turn:
Determine the quantizer according to the quantization step size, and perform quantization;
在本公开一实施例中,步骤6021中对所述待编码图像采用依赖标量量化方式进行量化,包括:依次对根据所述待编码图像进行划分和预测后得到每一个残差块执行以下步骤:In an embodiment of the present disclosure, in step 6021, quantization is performed on the image to be encoded using a scalar-dependent quantization method, including: sequentially performing the following steps on each residual block obtained after dividing and predicting the image to be encoded:
Determine the quantization step size △ and the level k of the transform coefficient coded before the current transform coefficient;
Calculate the two quantizers Q0 and Q1 according to the quantization step size △ and the coefficient level k;
Select a quantizer according to the coefficient level k and the state-machine state, and perform quantization.
在本公开一实施例中,对所述待编码图像采用依赖标量量化方式进行量化,包括:依次对根据所述待编码图像进行划分和预测后得到每一个残差块执行以下步骤:In an embodiment of the present disclosure, the quantization of the image to be coded using a dependent scalar quantization method includes: sequentially performing the following steps on each residual block obtained after dividing and predicting the image to be coded:
(1)初始化状态机state的初始值为0,j=0,依照光栅扫描顺序处理当前块的M个系数;(1) The initial value of the initialization state machine state is 0, j=0, and the M coefficients of the current block are processed according to the raster scanning order;
(2)输入第j个系数,确定量化步长△和编码在当前变换系数之前的变化系数层级k;(2) Input the jth coefficient, determine the quantization step size △ and the change coefficient level k encoded before the current transform coefficient;
(3)根据△和k计算两个量化器Q0和Q1;(3) Calculate two quantizers Q0 and Q1 according to △ and k;
(4)根据k和状态机state选择量化器进行量化;(4) Select a quantizer for quantization according to k and state machine state;
(5)更新状态机state,在j<M的情况下,j++,并跳至步骤(2);在j>=M的情况下,跳至步骤(1)处理下一个残差块。(5) Update the state machine state, in the case of j<M, j++, and skip to step (2); in the case of j>=M, skip to step (1) to process the next residual block.
在本公开一实施例中,所述方法还包括:对量化后的系数进行熵编码,形成输出码流。In an embodiment of the present disclosure, the method further includes: performing entropy coding on the quantized coefficients to form an output code stream.
在本公开一实施例中,所述待编码图像为待编码帧(Frame)或待编码片(Slice)。In an embodiment of the present disclosure, the image to be encoded is a frame to be encoded (Frame) or a slice to be encoded (Slice).
在本公开一实施例中,步骤601中确定待编码图像的空域连续性特征值,包括:In an embodiment of the present disclosure, determining the spatial continuity feature value of the image to be encoded in step 601 includes:
根据所述待编码图像的梯度图,确定所述空域连续性特征值。The feature value of spatial continuity is determined according to the gradient map of the image to be encoded.
在本公开一实施例中,步骤601中确定待编码图像的空域连续性特征值,包括:In an embodiment of the present disclosure, determining the spatial continuity feature value of the image to be encoded in step 601 includes:
根据所述待编码图像确定第一梯度图;determining a first gradient map according to the image to be encoded;
根据所述第一梯度图确定第二梯度图;determining a second gradient map based on the first gradient map;
根据所述第二梯度图计算平均值,以所述平均值作为所述待编码图像的空域连续性特征值;calculating an average value according to the second gradient map, and using the average value as the spatial continuity feature value of the image to be encoded;
其中,所述第一梯度图为所述待编码图像对应的梯度图;所述第二梯度图为所述第一梯度图对应 的梯度图。在本公开一实施例中,所述第一梯度图中各像素点的第一梯度根据以下方式确定:Wherein, the first gradient map is a gradient map corresponding to the image to be encoded; the second gradient map is a gradient map corresponding to the first gradient map. In an embodiment of the present disclosure, the first gradient of each pixel in the first gradient map is determined in the following manner:
Gmap(x,y)=|I(x,y)-I(x+1,y)|+|I(x,y)-I(x,y+1)|;
where (x, y) denotes a pixel in the image to be encoded, I(x, y) is the pixel value at pixel (x, y), and Gmap(x, y) is the first gradient at pixel (x, y). In an embodiment of the present disclosure, in YUV space the pixel value I(x, y) is the luma component Y at pixel (x, y).
所述第二梯度图中各像素点的第二梯度根据以下方式确定:The second gradient of each pixel in the second gradient map is determined in the following manner:
GGmap(x,y)=|Gmap(x,y)-Gmap(x+1,y)|+|Gmap(x,y)-Gmap(x,y+1)|;
其中,Gmap(x,y)为像素点(x,y)的第一梯度,GGmap(x,y)为像素点(x,y)的第二梯度。Wherein, Gmap(x, y) is the first gradient of the pixel point (x, y), and GGmap(x, y) is the second gradient of the pixel point (x, y).
可选地,本领域技术人员可以采用图像的其他等效或相似的属性数据,以体现待编码图像的内容的空间连续性,不限于本公开实施例记载的特定示例。Optionally, those skilled in the art may use other equivalent or similar attribute data of the image to reflect the spatial continuity of the content of the image to be encoded, which is not limited to the specific examples described in the embodiments of the present disclosure.
在本公开一实施例中,所述根据所述第二梯度图计算平均值,包括:根据所述第二梯度图中全部像素点的第二梯度计算平均值。In an embodiment of the present disclosure, the calculating the average value according to the second gradient map includes: calculating the average value according to the second gradients of all pixels in the second gradient map.
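A minimal numpy sketch of the feature value computation described above follows: the first gradient map Gmap of the image, the second gradient map GGmap of Gmap, and their mean as the spatial continuity feature value ss. Cropping the last row and column at the image border is an implementation assumption, since the formulas above are not defined at the right and bottom edges.

```python
import numpy as np

def gradient_map(img: np.ndarray) -> np.ndarray:
    # Gmap(x, y) = |I(x, y) - I(x+1, y)| + |I(x, y) - I(x, y+1)|,
    # evaluated where both neighbours exist (the last row/column is cropped).
    d0 = np.abs(img[:-1, :-1] - img[1:, :-1])   # difference along the first axis
    d1 = np.abs(img[:-1, :-1] - img[:-1, 1:])   # difference along the second axis
    return d0 + d1

def spatial_continuity(luma: np.ndarray) -> float:
    # Feature value ss: mean of the second gradient map GGmap, computed from
    # the luma (Y) component of the image to be encoded.
    gmap = gradient_map(luma.astype(np.float64))
    ggmap = gradient_map(gmap)
    return float(ggmap.mean())
```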
In an embodiment of the present disclosure, the spatial continuity feature value in step 6021 being within the feature value threshold range includes: the spatial continuity feature value being smaller than a preset threshold, in which case it is determined to be within the feature value threshold range. That is, the spatial continuity feature value of the image to be encoded is calculated according to the above method; when the spatial continuity feature value is smaller than the preset threshold, the spatial continuity of the image to be encoded is determined to be strong, and dependent scalar quantization (DSQ) is subsequently used for quantization and/or inverse quantization; when the spatial continuity feature value is greater than or equal to the preset feature threshold, the spatial continuity of the image to be encoded is determined to be weak, and non-dependent scalar quantization is subsequently used for quantization and/or inverse quantization. In other words, in this embodiment the feature value threshold range consists of feature values smaller than the preset threshold.
It should be noted that, in the solutions provided by the embodiments of the present disclosure, the spatial continuity feature value is used to characterize the spatial continuity of the content of the image to be encoded. Besides the above example of computing it from the first gradient map and the second gradient map, those skilled in the art may also choose other similar or equivalent methods to calculate the spatial continuity feature value of the image to be encoded. Different calculation schemes have different corresponding feature value threshold ranges: an image to be encoded whose feature value lies within the threshold range is an image with strong spatial continuity, and an image to be encoded whose feature value lies outside the threshold range is an image with weak spatial continuity. The specific calculation method of the spatial continuity feature value and the corresponding feature value threshold range are not limited to the above examples of the present disclosure.
在本公开一实施例中,所述方法还包括:In an embodiment of the present disclosure, the method further includes:
步骤603,将量化方式标识写入编码码流中;其中,所述量化方式标识指示依赖标量量化方式或非依赖标量量化方式。Step 603, write the quantization mode identifier into the encoded code stream; wherein, the quantization mode identifier indicates a scalar-dependent quantization mode or a scalar-independent quantization mode.
在本公开一实施例中,步骤603中将量化方式标识写入编码码流中,包括:将所述量化方式标识写入所述编码码流的以下语法元素之一中:序列(Sequence)级语法元素、帧(Frame)级语法元素和片(Slice)级语法元素。In an embodiment of the present disclosure, writing the quantization mode identifier into the encoded code stream in step 603 includes: writing the quantization mode identifier into one of the following syntax elements of the encoded code stream: sequence (Sequence) level syntax elements, frame (Frame) level syntax elements and slice (Slice) level syntax elements.
It can be seen that, after the encoder determines the quantization mode for each image, it writes the quantization mode identifier corresponding to the determined quantization mode into the coded bitstream. The quantization mode identifier is carried in syntax elements at different levels of the bitstream. After the decoder parses the bitstream and obtains the identifier, it performs the corresponding inverse quantization method as indicated by the identifier. Those skilled in the art will understand that the quantization mode identifier written by the encoder will be parsed and obtained by the decoder, and at the decoder this identifier may also be understood as an inverse quantization mode identifier.
在本公开一实施例中,量化方式标识为1,指示依赖标量量化方式;量化方式标识为0,指示非依赖标量量化方式。可选地,还可以是其他取值,例如true,false,或者其他的取值方式,在这里不做限定。In an embodiment of the present disclosure, the quantization mode is marked as 1, indicating a dependent scalar quantization mode; the quantization mode is marked as 0, indicating a non-scalar quantization mode. Optionally, it can also be other values, such as true, false, or other value methods, which are not limited here.
In an embodiment of the present disclosure, the quantization mode identifiers of the images to be encoded in step 603 may also be set jointly as multiple merged identifiers; according to a preset merging rule, the decoder can recover each quantization mode identifier. For example, in a video sequence of 16 images where the quantization mode identifier of the first 5 images is 1 and that of the last 11 images is 0, the 16-bit data 1111100000000000 may be set in a sequence-level syntax unit, indicating that the quantization modes of the 16 images in the sequence are dependent scalar quantization (DSQ) or non-dependent scalar quantization, respectively. Alternatively, 1,5,1; 6,11,0 may be set in a sequence-level syntax unit, indicating that the quantization mode identifier of 5 consecutive images starting from the first image is 1 (indicating dependent scalar quantization DSQ) and that of 11 consecutive images starting from the 6th image is 0 (non-dependent scalar quantization). Alternatively, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 may be set in 16 picture-level syntax units. It can be seen that the solutions provided by the embodiments of the present disclosure can indicate the image quantization or inverse quantization mode by extending the data carried in the coded bitstream; the specific position of the carried data and/or the coding of the carried data are not limited to the examples disclosed here, and those skilled in the art may choose different implementations based on the above examples.
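As an illustration of the merged-identifier example above, the following sketch packs per-picture flags into a run-length form (start index, run length, flag value) and unpacks them again. The representation and the helper names are assumptions for demonstration only; the actual syntax carrying such data is not specified here.

```python
def pack_flags(flags):
    # Merge consecutive identical flags into (start_index, run_length, value)
    # triples (0-based), e.g. [1]*5 + [0]*11 -> [(0, 5, 1), (5, 11, 0)].
    runs, start = [], 0
    for i in range(1, len(flags) + 1):
        if i == len(flags) or flags[i] != flags[start]:
            runs.append((start, i - start, flags[start]))
            start = i
    return runs

def unpack_flags(runs):
    # Recover the per-picture quantization mode identifiers from the runs.
    flags = []
    for _, length, value in runs:
        flags.extend([value] * length)
    return flags

flags = [1] * 5 + [0] * 11
assert unpack_flags(pack_flags(flags)) == flags
```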
本公开实施例还提供一种视频编码方法,其量化相关步骤如图8所示,包括:An embodiment of the present disclosure also provides a video coding method, the steps related to quantization are shown in FIG. 8 , including:
获取待编码视频序列,从头(i=0)开始,根据以下步骤依次处理每一帧(第i帧)图像:Obtain the video sequence to be encoded, start from the beginning (i=0), and process each frame (frame i) sequentially according to the following steps:
步骤801,计算第i帧图像的空间连续性特征值ss;Step 801, calculating the spatial continuity feature value ss of the i-th frame image;
步骤802,判断特征值ss是否小于预设的特征阈值T;如果小于,则执行步骤803;如果大于或等于,则执行步骤804;Step 802, judging whether the feature value ss is less than the preset feature threshold T; if it is less, then perform step 803; if it is greater than or equal to, then perform step 804;
步骤803,确定第i帧图像的量化方式标识pic_DSQ_enable_flag为1;Step 803, determine that the quantization mode flag pic_DSQ_enable_flag of the i-th frame image is 1;
步骤804,确定第i帧图像的量化方式标识pic_DSQ_enable_flag为0;Step 804, determine that the quantization mode flag pic_DSQ_enable_flag of the i-th frame image is 0;
Step 805, increment i by 1;
步骤806,判断当前序列是否已完成编码,如果已完成,则该视频序列编码结束;如果未完成,则获取下一帧图像,继续执行步骤801-806。Step 806, judging whether the encoding of the current sequence has been completed, if it is completed, the encoding of the video sequence ends; if not, the next frame of image is obtained, and the execution of steps 801-806 is continued.
其中,步骤801包括:Wherein, step 801 includes:
步骤8011,计算当前帧图像的梯度图Gmap;Step 8011, calculate the gradient map Gmap of the current frame image;
步骤8012,计算梯度图Gmap的梯度图GGmap,根据GGmap计算平均值ss。Step 8012, calculate the gradient map GGmap of the gradient map Gmap, and calculate the average value ss according to the GGmap.
其中,Gmap也称为第一梯度图,根据以下方式确定:Among them, Gmap is also called the first gradient map, which is determined according to the following method:
Gmap(x,y)=|I(x,y)-I(x+1,y)|+|I(x,y)-I(x,y+1)|;
其中,(x,y)表示所述待编码图像中的像素点,I(x,y)为像素点(x,y)的像素值,即为像素点(x,y)上的亮度分量Y;Wherein, (x, y) represents the pixel in the image to be encoded, and I(x, y) is the pixel value of the pixel (x, y), which is the luminance component Y on the pixel (x, y) ;
GGmap也称为第二梯度图,根据以下方式确定:GGmap, also known as the second gradient map, is determined according to:
GGmap(x,y)=|Gmap(x,y)-Gmap(x+1,y)|+|Gmap(x,y)-Gmap(x,y+1)|;
其中,Gmap(x,y)为像素点(x,y)的第一梯度。Wherein, Gmap(x, y) is the first gradient of the pixel point (x, y).
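Putting steps 801-806 together, a hedged sketch of the per-frame decision loop follows; the threshold T is a configuration parameter whose value this embodiment does not fix, and the gradient helper simply repeats the Gmap/GGmap formulas above so that the sketch is self-contained.

```python
import numpy as np

def choose_quantization_modes(frames, T):
    """For each frame, compute the spatial continuity feature value ss and set
    pic_DSQ_enable_flag = 1 (dependent scalar quantization) when ss < T,
    otherwise 0 (non-dependent scalar quantization)."""
    def grad(a):
        # Gmap / GGmap formula from above, cropped at the borders.
        return (np.abs(a[:-1, :-1] - a[1:, :-1]) +
                np.abs(a[:-1, :-1] - a[:-1, 1:]))

    flags = []
    for luma in frames:                      # luma (Y) plane of each frame
        ss = grad(grad(luma.astype(np.float64))).mean()   # steps 8011-8012
        flags.append(1 if ss < T else 0)     # steps 802-804
    return flags
```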
在本公开一实施例中,还提供一种视频编码方法,包括:In an embodiment of the present disclosure, a video coding method is also provided, including:
首先,根据如图8所述方法确定每一帧图像的量化方式标识pic_DSQ_enable_flag;输入图像被划分成不重叠的多个CTU块。First, determine the quantization flag pic_DSQ_enable_flag of each frame of image according to the method as shown in Figure 8; the input image is divided into multiple non-overlapping CTU blocks.
Then, each CTU is processed in raster scan order; the CTU is divided into several blocks (CU1, CU2, ..., CUi, ...) according to the optimal mode determined by intra/inter prediction, and the corresponding prediction residual blocks (resB1, resB2, ..., resBi, ...) are obtained.
然后,对残差块做变换和量化,若pic_DSQ_enable_flag=1,进行依赖标量量化,主要步骤如下:Then, transform and quantize the residual block. If pic_DSQ_enable_flag=1, perform dependent scalar quantization. The main steps are as follows:
(1)输入第i个残差块resBi,初始化状态state=0,j=0,依照光栅扫描顺序处理当前块的M个系数;(1) Input the i-th residual block resBi, initialize the state state=0, j=0, process the M coefficients of the current block according to the raster scanning order;
(2)输入第j个系数,确定量化步长△和编码在当前变换系数之前的变化系数层级k;(2) Input the jth coefficient, determine the quantization step size △ and the change coefficient level k encoded before the current transform coefficient;
(3)根据△和k计算两个量化器Q0和Q1;(3) Calculate two quantizers Q0 and Q1 according to △ and k;
(4)根据k和state选择量化器进行量化;(4) Select a quantizer for quantization according to k and state;
(5)更新state,若j<M,j++,跳至步骤(2),否则i++,跳至步骤(1)。(5) Update state, if j<M, j++, skip to step (2), otherwise i++, skip to step (1).
若pic_DSQ_enable_flag=0,进行非依赖标量量化,直接根据量化步长确定量化器进行量化。If pic_DSQ_enable_flag=0, non-dependent scalar quantization is performed, and the quantizer is directly determined according to the quantization step size for quantization.
最后,对量化后的系数进行熵编码,输出码流等待传输。Finally, entropy encoding is performed on the quantized coefficients, and the output code stream is waiting for transmission.
在本公开一实施例中,所述方法还包括:将每一帧图像的量化方式标识pic_DSQ_enable_flag写入编码码流;写入所述编码码流中的以下语法元素之一中:序列(Sequence)级语法元素、帧(Frame)级语法元素和片(Slice)级语法元素。In an embodiment of the present disclosure, the method further includes: writing the quantization mode identifier pic_DSQ_enable_flag of each frame of image into the coded code stream; writing into one of the following syntax elements in the coded code stream: Sequence level syntax elements, frame (Frame) level syntax elements and slice (Slice) level syntax elements.
It can be seen that, using the solutions provided by the embodiments of the present disclosure, a different quantization mode can be used for each image to be encoded in the video sequence, and an identifier indicating the quantization mode of each image can be written into the syntax elements, so as to instruct the decoder to perform the corresponding inverse quantization on each image according to that identifier.
It should be noted that the encoding method described in the embodiments of the present disclosure focuses on the quantization-related steps of the overall encoding process; implementation details of the other encoding steps can follow the relevant specifications or schemes and do not fall within the scope defined or protected by the present disclosure, so they are not described one by one here.
本公开实施例还提供一种解码方法,如图9所示,包括:An embodiment of the present disclosure also provides a decoding method, as shown in FIG. 9 , including:
步骤901,解析码流获得待解码图像的量化方式标识;Step 901, analyzing the code stream to obtain the quantization mode identification of the image to be decoded;
步骤902,根据所述量化方式标识所指示的量化方式,对所述待解码图像进行反量化;Step 902, perform inverse quantization on the image to be decoded according to the quantization mode indicated by the quantization mode identifier;
其中,所述量化方式标识为根据原始图像的空域连续性特征值所确定的用于指示依赖标量量化方式或非依赖标量量化方式的标识。Wherein, the quantization mode identification is an identification determined according to the spatial continuity feature value of the original image to indicate a scalar-dependent quantization mode or a scalar-independent quantization mode.
It should be noted that the quantization mode identifier obtained by parsing the bitstream in step 901 is the quantization mode identifier written by the encoder in step 603. The manner of inverse quantizing the image to be decoded corresponds to the quantization mode determined by the encoder according to the spatial continuity feature value of the original image (i.e., the image before encoding); the spatial continuity feature value is a value used to characterize the spatial continuity of the content of the original image.
在本公开一实施例中,步骤902中根据所述量化方式标识所指示的量化方式,对所述待解码图像进行反量化,包括:In an embodiment of the present disclosure, in step 902, inverse quantization is performed on the image to be decoded according to the quantization mode indicated by the quantization mode identifier, including:
在所述量化方式标识指示非依赖标量量化方式的情况下,依次对所述码流进行熵解码并得到的变换量化后的每一个残差块执行以下步骤:In the case where the quantization mode identifier indicates an independent scalar quantization mode, perform entropy decoding on the code stream in turn and perform the following steps for each residual block obtained after transformation and quantization:
根据量化步长确定量化器进行反量化;Determine the quantizer for inverse quantization according to the quantization step size;
在所述量化方式标识指示依赖标量量化方式的情况下,依次对所述码流进行熵解码并得到的变换量化后的每一个残差块执行以下步骤:In the case where the quantization mode identifier indicates that it depends on the scalar quantization mode, perform entropy decoding on the code stream in turn and perform the following steps for each residual block obtained after transformation and quantization:
Determine the quantization step size △ and the level k of the transform coefficient reconstructed before the current transform coefficient;
Calculate the two quantizers Q0 and Q1 according to the quantization step size △ and the coefficient level k;
Select a quantizer according to the coefficient level k and the state-machine state, and perform inverse quantization.
在本公开一实施例中,在所述量化方式标识指示依赖标量量化方式的情况下,依次对所述码流进行熵解码并得到的变换量化后的每一个残差块执行以下步骤:In an embodiment of the present disclosure, in the case where the quantization mode identifier indicates that it depends on the scalar quantization mode, the following steps are performed for each residual block obtained by entropy decoding the code stream in turn after transformation and quantization:
(1)初始化状态机state的初始值为0,j=0,依照光栅扫描顺序处理当前块的M个系数;(1) The initial value of the initialization state machine state is 0, j=0, and the M coefficients of the current block are processed according to the raster scanning order;
(2)输入第j个系数,确定量化步长△和重建在当前变换系数之前的变化系数层级k;(2) Input the jth coefficient, determine the quantization step size △ and reconstruct the change coefficient level k before the current transform coefficient;
(3)根据△和k计算两个量化器Q0和Q1;(3) Calculate two quantizers Q0 and Q1 according to △ and k;
(4)根据k和状态机state选择量化器进行反量化;(4) Select a quantizer for inverse quantization according to k and state machine state;
(5)更新状态机state,在j<M的情况下,j++,并跳至步骤(2);在j>=M的情况下,跳至步骤(1)处理下一个残差块。(5) Update the state machine state, in the case of j<M, j++, and skip to step (2); in the case of j>=M, skip to step (1) to process the next residual block.
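A hedged sketch of the decoder-side dispatch and of steps (1)-(5) above follows. The Q0/Q1 reconstruction formulas and the state transition table are assumptions consistent with the JVET-K0071 description cited earlier, and the flag value is passed in directly since the exact syntax parsing is out of scope here.

```python
NEXT_STATE = [[0, 2], [2, 0], [1, 3], [3, 1]]   # assumed VVC-style transitions

def dequantize_block(levels, step, dsq_enabled):
    """Inverse quantization of one block of levels, dispatching on the parsed
    quantization mode identifier (dsq_enabled = pic_DSQ_enable_flag)."""
    if not dsq_enabled:
        # Non-dependent scalar inverse quantization: level times step size.
        return [k * step for k in levels]

    # Dependent scalar inverse quantization: Q0 reconstructs t' = 2*k*step,
    # Q1 reconstructs t' = (2*k - sgn(k))*step; states 0/1 use Q0, 2/3 use Q1.
    state, recon = 0, []
    for k in levels:
        if state < 2:                               # Q0
            recon.append(2 * k * step)
        else:                                       # Q1
            sgn = (k > 0) - (k < 0)
            recon.append((2 * k - sgn) * step)
        state = NEXT_STATE[state][k & 1]            # parity-driven state update
    return recon
```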
在本公开一实施例中,所述方法还包括:In an embodiment of the present disclosure, the method further includes:
对反量化后的残差块进行反变换,将反变换后的残差值和预测值叠加,得到重建编码单元CU,直到当前编码树单元CTU完成重建;Perform inverse transformation on the inversely quantized residual block, superimpose the inversely transformed residual value and the predicted value, and obtain the reconstructed coding unit CU until the reconstruction of the current coding tree unit CTU is completed;
将重建后的图像进行滤波后存入缓冲区,等待视频播放。Filter the reconstructed image and store it in the buffer, waiting for the video to play.
在本公开一实施例中,步骤901中解析码流获得所述待解码图像的量化方式标识,包括:解析所述码流,从以下语法元素之一中获得所述待解码图像的量化方式标识:In an embodiment of the present disclosure, parsing the code stream in step 901 to obtain the quantization mode identification of the image to be decoded includes: parsing the code stream, and obtaining the quantization mode identification of the image to be decoded from one of the following syntax elements :
序列(Sequence)级语法元素、帧(Frame)级语法元素和片(Slice)级语法元素。Sequence (Sequence) level syntax elements, frame (Frame) level syntax elements and slice (Slice) level syntax elements.
即针对待解码图像的量化方式标识可以从多个层级的语法元素中解析获得。That is, the quantization mode identifier for the image to be decoded can be obtained by parsing syntax elements of multiple levels.
可以看到,解码端能够从接收到的码流中解析得到待解码图像的量化方式标识,根据该标识所指示的量化方式进行对应的反量化,以实现解码。It can be seen that the decoder can parse the received code stream to obtain the quantization mode identifier of the image to be decoded, and perform corresponding inverse quantization according to the quantization mode indicated by the identifier to realize decoding.
需要说明的是,步骤902中根据所述量化方式标识所指示的量化方式,对所述待解码图像进行反量化,是指:对待解码图像对应的量化块根据量化标识所指示的量化方式进行对应的反量化。It should be noted that, in step 902, performing inverse quantization on the image to be decoded according to the quantization method indicated by the quantization method identifier means: corresponding to the quantization block corresponding to the image to be decoded according to the quantization method indicated by the quantization identifier dequantization.
本公开实施例还提供一种解码方法,包括:An embodiment of the present disclosure also provides a decoding method, including:
首先,输入码流进行熵解码,并得到变换量化后的残差块(resB*1,resB*2,…,resB*i,…);First, entropy decoding is performed on the input code stream, and the transformed and quantized residual blocks (resB*1, resB*2, ..., resB*i, ...) are obtained;
接着,对残差块做反量化,获取量化方式标识pic_DSQ_enable_flag;若pic_DSQ_enable_flag=1, 进行依赖标量反量化,主要步骤如下:Next, dequantize the residual block to obtain the quantization mode flag pic_DSQ_enable_flag; if pic_DSQ_enable_flag=1, perform dependent scalar dequantization, the main steps are as follows:
(1)输入第i个残差块resB*i,初始化状态state=0,j=0,依照光栅扫描顺序处理当前块的M个系数;(1) Input the i-th residual block resB*i, initialize the state state=0, j=0, process the M coefficients of the current block according to the raster scanning order;
(2)输入第j个系数,确定量化步长△和重建在当前变换系数之前的变化系数层级k;(2) Input the jth coefficient, determine the quantization step size △ and reconstruct the change coefficient level k before the current transform coefficient;
(3)根据△和k计算两个量化器Q0和Q1;(3) Calculate two quantizers Q0 and Q1 according to △ and k;
(4)根据k和state选择量化器进行反量化;(4) Select a quantizer for inverse quantization according to k and state;
(5)更新state,若j<M,j++,跳至步骤(2),否则i++,跳至步骤(1)。(5) Update state, if j<M, j++, skip to step (2), otherwise i++, skip to step (1).
若pic_DSQ_enable_flag=0,进行非依赖标量反量化,直接根据量化步长确定量化器进行反量化。If pic_DSQ_enable_flag=0, non-dependent scalar inverse quantization is performed, and the quantizer is directly determined according to the quantization step size to perform inverse quantization.
然后,对反量化后的残差块做反变换,将反变换后的残差值和预测值叠加,得到重建CU,直到当前CTU完成重建。Then, inverse transform is performed on the inversely quantized residual block, and the inversely transformed residual value and predicted value are superimposed to obtain a reconstructed CU until the reconstruction of the current CTU is completed.
最后,将重建图像送入DBF/SAO/ALF滤波器,滤波后的图像送入缓冲区,等待视频播放。Finally, the reconstructed image is sent to the DBF/SAO/ALF filter, and the filtered image is sent to the buffer, waiting for the video to play.
It should be noted that the decoding method described in the embodiments of the present disclosure focuses on the inverse-quantization-related steps of the overall decoding process; implementation details of the other decoding steps can follow the relevant specifications or schemes and do not fall within the scope defined or protected by the present disclosure, so they are not described one by one here.
An embodiment of the present disclosure further provides a video encoding device, as shown in FIG. 10, including a processor and a memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the video encoding method according to any embodiment of the present disclosure.
An embodiment of the present disclosure further provides a video decoding device, as shown in FIG. 10, including a processor and a memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the video decoding method according to any embodiment of the present disclosure.
本公开一实施例还提供了一种视频编解码系统,包括如本公开任一实施所述的视频编码设备和/或本公开任一实施所述的视频解码设备。An embodiment of the present disclosure further provides a video encoding and decoding system, including the video encoding device described in any implementation of the present disclosure and/or the video decoding device described in any implementation of the present disclosure.
An embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the video decoding method or encoding method according to any embodiment of the present disclosure.
An embodiment of the present disclosure further provides a bitstream, wherein the bitstream is generated according to the video encoding method of any embodiment of the present disclosure, and the bitstream includes a quantization mode identifier used to indicate the mode in which the encoder quantizes an image.
可以看到,本公开实施例提供的编解码方法,可以在几乎不降低编码性能的前提下减少编码复杂度。相比于现有量化技术,本技术对于空域内容非连续的图像,采用非依赖标量量化技术,由于这些图像的内容非连续,统一采用原本的依赖标量量化并不能起到预期的编码优化效果,因此对于这些图像采用非依赖标量量化,不仅不会损失编码性能,还可减少总编码时间。总之,本公开实施例提供的方案,可以在保持和相关技术基本相当的编码性能的情况下显著减少编码复杂度。It can be seen that the encoding and decoding methods provided by the embodiments of the present disclosure can reduce encoding complexity without reducing encoding performance. Compared with the existing quantization technology, this technology uses non-dependent scalar quantization technology for images with discontinuous spatial content. Since the content of these images is discontinuous, the original dependent scalar quantization cannot achieve the expected coding optimization effect. Therefore, using scalar-independent quantization for these images not only does not lose coding performance, but also reduces the total coding time. In a word, the solutions provided by the embodiments of the present disclosure can significantly reduce the coding complexity while maintaining the coding performance substantially equivalent to that of the related art.
在一或多个示例性实施例中,所描述的功能可以硬件、软件、固件或其任一组合来实施。如果以软件实施,那么功能可作为一个或多个指令或代码存储在计算机可读介质上或经由计算机可读介质传输,且由基于硬件的处理单元执行。计算机可读介质可包含对应于例如数据存储介质等有形介质的计算机可读存储介质,或包含促进计算机程序例如根据通信协议从一处传送到另一处的任何介质的通信介质。以此方式,计算机可读介质通常可对应于非暂时性的有形计算机可读存储介质或例如信号或载波等通信介质。数据存储介质可为可由一或多个计算机或者一或多个处理器存取以检索用于实施本公开中描述的技术的指令、代码和/或数据结构的任何可用介质。计算机程序产品可包含计算机可读介质。In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media that correspond to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, eg, according to a communication protocol. In this manner, a computer-readable medium may generally correspond to a non-transitory tangible computer-readable storage medium or a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may comprise a computer readable medium.
By way of example and not limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection may properly be termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. As used herein, disk and disc include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuits. Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
本公开实施例的技术方案可在广泛多种装置或设备中实施,包含无线手机、集成电路(IC)或一组IC(例如,芯片组)。本公开实施例中描各种组件、模块或单元以强调经配置以执行所描述的技术的装置的功能方面,但不一定需要通过不同硬件单元来实现。而是,如上所述,各种单元可在编解码器硬件单元中组合或由互操作硬件单元(包含如上所述的一个或多个处理器)的集合结合合适软件和/或固件来提供。The technical solutions of the embodiments of the present disclosure may be implemented in a wide variety of devices or devices, including a wireless handset, an integrated circuit (IC), or a set of ICs (eg, a chipset). Various components, modules, or units are described in the disclosed embodiments to emphasize functional aspects of devices configured to perform the described techniques, but do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units (comprising one or more processors as described above) in combination with suitable software and/or firmware.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and devices, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Claims (22)

  1. A video decoding method, comprising:
    parsing a bitstream to obtain a quantization mode identifier of an image to be decoded;
    performing inverse quantization on the image to be decoded according to the quantization mode indicated by the quantization mode identifier;
    wherein the quantization mode identifier is an identifier, determined according to a spatial continuity feature value of an original image, that indicates a dependent scalar quantization mode or an independent scalar quantization mode.
  2. The decoding method according to claim 1, wherein
    the method further comprises:
    performing an inverse transform on the inverse-quantized residual block, and superimposing the inverse-transformed residual values and the prediction values to obtain a reconstructed coding unit (CU), until reconstruction of the current coding tree unit (CTU) is completed;
    filtering the reconstructed image and storing it in a buffer.
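(Illustrative note, not part of the claims.) A minimal sketch of the per-block reconstruction step described in claim 2: the inverse-transformed residual is added to the prediction and clipped to the sample range. The 8-bit clipping range and the plain 2-D vector representation are assumptions made only for illustration; loop filtering and buffering of the finished picture are not shown.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Add the inverse-transformed residual to the prediction and clip to
    // the valid sample range (8-bit samples assumed here).
    std::vector<std::vector<int>> reconstructBlock(
            const std::vector<std::vector<int>>& pred,
            const std::vector<std::vector<int>>& residual) {
        std::vector<std::vector<int>> recon = pred;
        for (std::size_t y = 0; y < pred.size(); ++y)
            for (std::size_t x = 0; x < pred[y].size(); ++x)
                recon[y][x] = std::clamp(pred[y][x] + residual[y][x], 0, 255);
        return recon;
    }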
  3. The decoding method according to claim 1 or 2, wherein
    parsing the bitstream to obtain the quantization mode identifier of the image to be decoded comprises: parsing the bitstream and obtaining the quantization mode identifier of the image to be decoded from one of the following syntax elements:
    a sequence-level syntax element, a frame-level syntax element, or a slice-level syntax element.
  4. The decoding method according to claim 1 or 2, wherein
    performing inverse quantization on the image to be decoded according to the quantization mode indicated by the quantization mode identifier comprises:
    when the quantization mode identifier indicates the independent scalar quantization mode, performing the following step, in turn, for each transformed and quantized residual block obtained by entropy-decoding the bitstream:
    determining a quantizer according to the quantization step size and performing inverse quantization;
    when the quantization mode identifier indicates the dependent scalar quantization mode, performing the following steps, in turn, for each transformed and quantized residual block obtained by entropy-decoding the bitstream:
    determining the quantization step size Δ and the transform coefficient levels k reconstructed before the current transform coefficient;
    calculating two quantizers Q0 and Q1 according to the quantization step size Δ and the transform coefficient levels k;
    selecting a quantizer according to the transform coefficient level k and the state of the state machine, and performing inverse quantization.
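(Illustrative note, not part of the claims.) A minimal sketch of the dequantization branches of claim 4. The claim only requires that Q0 and Q1 be derived from the step size Δ and the previously reconstructed levels k, and that the quantizer be selected from k and a state machine; the concrete reconstruction rules and the four-state transition table below are assumptions modeled on the widely used two-quantizer dependent-quantization design, not necessarily the exact design of this disclosure.

    #include <cstddef>
    #include <cstdlib>
    #include <vector>

    // Independent scalar quantization: each level simply reconstructs k * delta.
    double dequantizeIndependent(int k, double delta) { return k * delta; }

    // Dependent scalar quantization of one block of coefficient levels.
    // Q0 reconstructs even multiples of delta, Q1 reconstructs odd multiples
    // of delta (plus zero); the quantizer used for each coefficient follows
    // from the parity of the previously reconstructed levels via a
    // four-state machine.
    std::vector<double> dequantizeDependent(const std::vector<int>& levels, double delta) {
        static const int nextState[4][2] = { {0, 2}, {2, 0}, {1, 3}, {3, 1} };
        std::vector<double> recon(levels.size());
        int state = 0;                                        // state machine starts in state 0
        for (std::size_t i = 0; i < levels.size(); ++i) {
            int k = levels[i];
            int sgn = (k > 0) - (k < 0);
            recon[i] = (state < 2) ? 2.0 * k * delta          // states 0,1 -> quantizer Q0
                                   : (2.0 * k - sgn) * delta; // states 2,3 -> quantizer Q1
            state = nextState[state][std::abs(k) & 1];        // advance by level parity
        }
        return recon;
    }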
  5. A video encoding method, comprising:
    determining a spatial continuity feature value of an image to be encoded;
    determining, according to the spatial continuity feature value, whether to quantize the image to be encoded using a dependent scalar quantization mode or an independent scalar quantization mode.
  6. The encoding method according to claim 5, wherein
    determining, according to the spatial continuity feature value, whether to quantize the image to be encoded using a dependent scalar quantization mode or an independent scalar quantization mode comprises:
    when the spatial continuity feature value is within a feature value threshold range, quantizing the image to be encoded using the dependent scalar quantization mode;
    when the spatial continuity feature value is outside the feature value threshold range, quantizing the image to be encoded using the independent scalar quantization mode.
  7. The encoding method according to claim 5 or 6, wherein
    the spatial continuity feature value is a value that characterizes the spatial continuity of the content of the image to be encoded.
  8. The encoding method according to claim 5 or 6, wherein
    the method further comprises:
    performing entropy encoding on the quantized coefficients to form an output bitstream.
  9. The encoding method according to claim 5 or 6, wherein
    quantizing the image to be encoded using the independent scalar quantization mode comprises:
    performing the following step, in turn, for each residual block obtained after partitioning and predicting the image to be encoded:
    determining a quantizer according to the quantization step size and performing quantization;
    quantizing the image to be encoded using the dependent scalar quantization mode comprises: performing the following steps, in turn, for each residual block obtained after partitioning and predicting the image to be encoded:
    determining the quantization step size Δ and the transform coefficient levels k encoded before the current transform coefficient;
    calculating two quantizers Q0 and Q1 according to the quantization step size Δ and the transform coefficient levels k;
    selecting a quantizer according to the transform coefficient level k and the state of the state machine, and performing quantization.
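(Illustrative note, not part of the claims.) A hard-decision sketch of the encoder-side dependent quantization in claim 9, mirroring the decoder sketch after claim 4: each coefficient is snapped to the nearest admissible reconstruction level of the quantizer selected by the current state. A practical encoder would normally choose the levels jointly by a rate-distortion optimized (trellis) search over the state machine; that refinement, the Q0/Q1 rules, and the transition table are assumptions for illustration. Independent scalar quantization would simply round each coefficient, e.g. std::lround(t / delta).

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Nearest admissible level of the selected quantizer.
    // Q0 levels: 2*k*delta; Q1 levels: (2*k - sgn(k))*delta and 0.
    static long nearestLevel(double t, double delta, bool useQ1) {
        if (!useQ1)
            return std::lround(t / (2.0 * delta));
        int s = (t > 0) - (t < 0);
        double u = std::fabs(t) / delta;
        return s * ((u <= 0.5) ? 0 : std::lround((u + 1.0) / 2.0));
    }

    std::vector<long> quantizeDependent(const std::vector<double>& coeffs, double delta) {
        static const int nextState[4][2] = { {0, 2}, {2, 0}, {1, 3}, {3, 1} };
        std::vector<long> levels(coeffs.size());
        int state = 0;
        for (std::size_t i = 0; i < coeffs.size(); ++i) {
            long k = nearestLevel(coeffs[i], delta, /*useQ1=*/state >= 2);
            levels[i] = k;
            state = nextState[state][static_cast<int>((k < 0 ? -k : k) & 1)];
        }
        return levels;
    }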
  10. The encoding method according to claim 5 or 6, wherein
    the image to be encoded is a frame to be encoded or a slice to be encoded.
  11. The encoding method according to claim 5 or 6, wherein
    determining the spatial continuity feature value of the image to be encoded comprises:
    determining the spatial continuity feature value according to a gradient map of the image to be encoded.
  12. The encoding method according to claim 11, wherein
    determining the spatial continuity feature value according to the gradient map of the image to be encoded comprises:
    determining a first gradient map according to the image to be encoded;
    determining a second gradient map according to the first gradient map;
    calculating an average value according to the second gradient map, and using the average value as the spatial continuity feature value of the image to be encoded;
    wherein the first gradient map is the gradient map corresponding to the image to be encoded, and the second gradient map is the gradient map corresponding to the first gradient map.
  13. The encoding method according to claim 12, wherein
    the first gradient of each pixel in the first gradient map is determined as follows:
    Gmap(x,y) = |I(x,y) - I(x+1,y)| + |I(x,y) - I(x,y+1)|;
    where (x,y) denotes a pixel in the image to be encoded, I(x,y) is the pixel value of pixel (x,y), and Gmap(x,y) is the first gradient of pixel (x,y);
    the second gradient of each pixel in the second gradient map is determined as follows:
    GGmap(x,y) = |Gmap(x,y) - Gmap(x+1,y)| + |Gmap(x,y) - Gmap(x,y+1)|;
    where Gmap(x,y) is the first gradient of pixel (x,y), and GGmap(x,y) is the second gradient of pixel (x,y).
  14. The encoding method according to claim 13, wherein
    calculating the average value according to the second gradient map comprises:
    calculating the average value from the second gradients of all pixels in the second gradient map.
  15. The encoding method according to claim 6, wherein
    the spatial continuity feature value being within the feature value threshold range comprises:
    when the spatial continuity feature value is less than a preset threshold, determining that the spatial continuity feature value is within the feature value threshold range.
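(Illustrative note, not part of the claims.) A minimal sketch of claims 11 to 15: a first gradient map is computed from the picture with the Gmap formula of claim 13, a second gradient map is computed from the first with the GGmap formula, the mean of the second map serves as the spatial continuity feature value, and comparing it against a threshold selects the quantization mode. The border handling (samples clamped at the right and bottom picture edges) and the threshold supplied by the caller are assumptions for illustration; these claims do not fix them.

    #include <cstdlib>
    #include <vector>

    using Plane = std::vector<std::vector<int>>;

    // Gmap(x,y) = |I(x,y)-I(x+1,y)| + |I(x,y)-I(x,y+1)|, applied once for the
    // first gradient map and again (on the first map) for the second one.
    static Plane gradientMap(const Plane& src) {
        if (src.empty()) return {};
        int h = static_cast<int>(src.size());
        int w = static_cast<int>(src[0].size());
        Plane g(h, std::vector<int>(w, 0));
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x) {
                int right = (x + 1 < w) ? src[y][x + 1] : src[y][x];  // clamp at the border
                int below = (y + 1 < h) ? src[y + 1][x] : src[y][x];
                g[y][x] = std::abs(src[y][x] - right) + std::abs(src[y][x] - below);
            }
        return g;
    }

    static double meanOf(const Plane& p) {
        double sum = 0.0;
        long n = 0;
        for (const auto& row : p) { for (int v : row) sum += v; n += static_cast<long>(row.size()); }
        return n ? sum / n : 0.0;
    }

    // Returns true if dependent scalar quantization should be used for this picture.
    bool useDependentQuantization(const Plane& picture, double threshold) {
        Plane gmap  = gradientMap(picture);   // first gradient map  (claim 13, Gmap)
        Plane ggmap = gradientMap(gmap);      // second gradient map (claim 13, GGmap)
        double feat = meanOf(ggmap);          // spatial continuity feature value (claim 14)
        return feat < threshold;              // below threshold -> dependent mode (claims 6, 15)
    }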
  16. The encoding method according to claim 5 or 6, wherein
    the method further comprises: writing a quantization mode identifier into the encoded bitstream, wherein the quantization mode identifier indicates the dependent scalar quantization mode or the independent scalar quantization mode.
  17. The encoding method according to claim 16, wherein
    writing the quantization mode identifier into the encoded bitstream comprises:
    writing the quantization mode identifier into one of the following syntax elements of the encoded bitstream:
    a sequence-level syntax element, a frame-level syntax element, or a slice-level syntax element.
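(Illustrative note, not part of the claims.) A minimal sketch of how the quantization mode identifier of claims 16 and 17 might be written as a single bit of a header; on the decoder side, claim 3 would read the same bit back. The flag name and the one-bit coding are assumptions for illustration; the claims only require that the identifier be carried in a sequence-, frame-, or slice-level syntax element.

    #include <cstdint>
    #include <vector>

    // Simple MSB-first bit writer for building a header payload.
    class BitWriter {
    public:
        void writeBit(int b) {
            if (nbits_ % 8 == 0) bytes_.push_back(0);
            bytes_.back() = static_cast<uint8_t>(bytes_.back() | ((b & 1) << (7 - nbits_ % 8)));
            ++nbits_;
        }
        const std::vector<uint8_t>& bytes() const { return bytes_; }
    private:
        std::vector<uint8_t> bytes_;
        std::size_t nbits_ = 0;
    };

    void writeSliceHeader(BitWriter& bw, bool dependentQuantization) {
        // ... other slice-level syntax elements ...
        bw.writeBit(dependentQuantization ? 1 : 0);   // quantization mode identifier
        // ... remaining syntax elements ...
    }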
  18. A video decoding device, comprising a processor and a memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the decoding method according to any one of claims 1 to 4.
  19. A video encoding device, comprising a processor and a memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the encoding method according to any one of claims 5 to 17.
  20. A video encoding and decoding system, comprising the decoding device according to claim 18 and/or the encoding device according to claim 19.
  21. A non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 17.
  22. A bitstream, wherein the bitstream is generated according to the encoding method of any one of claims 5 to 17, and the bitstream comprises a quantization mode identifier, the quantization mode identifier being used to indicate the manner in which the encoding end quantizes the image.
PCT/CN2021/108723 2021-07-27 2021-07-27 Video decoding and encoding methods and devices, and storage medium WO2023004590A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/108723 WO2023004590A1 (en) 2021-07-27 2021-07-27 Video decoding and encoding methods and devices, and storage medium
CN202180099003.3A CN117426089A (en) 2021-07-27 2021-07-27 Video decoding and encoding method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/108723 WO2023004590A1 (en) 2021-07-27 2021-07-27 Video decoding and encoding methods and devices, and storage medium

Publications (1)

Publication Number Publication Date
WO2023004590A1 true WO2023004590A1 (en) 2023-02-02

Family

ID=85086175

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/108723 WO2023004590A1 (en) 2021-07-27 2021-07-27 Video decoding and encoding methods and devices, and storage medium

Country Status (2)

Country Link
CN (1) CN117426089A (en)
WO (1) WO2023004590A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110944179A (en) * 2018-11-18 2020-03-31 北京达佳互联信息技术有限公司 Video data decoding method and device
CN111131819A (en) * 2018-10-31 2020-05-08 北京字节跳动网络技术有限公司 Quantization parameter under coding tool of dependent quantization
WO2020106668A1 (en) * 2018-11-22 2020-05-28 Interdigital Vc Holdings, Inc. Quantization for video encoding and decoding
US20200244995A1 (en) * 2019-01-28 2020-07-30 Mediatek Inc. Methods and Apparatuses for Coding Transform Blocks
CN112236999A (en) * 2018-03-29 2021-01-15 弗劳恩霍夫应用研究促进协会 Dependency quantization
CN112840656A (en) * 2018-08-24 2021-05-25 三星电子株式会社 Method and apparatus for image encoding and method and apparatus for image decoding
WO2021123250A1 (en) * 2019-12-20 2021-06-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Decoder, encoder and method for supporting adaptive dependent quantization of transform coefficient levels
US20210218966A1 (en) * 2020-01-10 2021-07-15 Mediatek Inc. Signaling Quantization Related Parameters

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
H. SCHWARZ (FRAUNHOFER), T. NGUYEN (FRAUNHOFER), D. MARPE (FRAUNHOFER), T. WIEGAND (FRAUNHOFER HHI): "CE7: Transform coefficient coding and dependent quantization (Tests 7.1.2, 7.2.1)", 123. MPEG MEETING; 20180716 - 20180720; LJUBLJANA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), 3 July 2018 (2018-07-03), XP030195734 *
LIU DAN, CHEN GUISHENG, SONG CHUANMING, HE XING, WANG XIANGHAI: "Research Advances in Screen Content Coding Methods", JOURNAL OF COMPUTER RESEARCH AND DEVELOPMENT, KEXUE CHUBANSHE, BEIJING, CN, vol. 54, no. 9, 30 September 2017 (2017-09-30), CN , pages 2059 - 2076, XP093030128, ISSN: 1000-1239, DOI: 10.7544/issn1000-1239.2017.20160649 *

Also Published As

Publication number Publication date
CN117426089A (en) 2024-01-19

Similar Documents

Publication Publication Date Title
US10609375B2 (en) Sample adaptive offset (SAO) adjustment method and apparatus and SAO adjustment determination method and apparatus
WO2018001207A1 (en) Coding and decoding method and apparatus
RU2678490C2 (en) Determining palette size, palette entries and filtering of palette coded blocks in video coding
JP6266605B2 (en) Unified signaling for lossless coding mode and pulse code modulation (PCM) mode in video coding
AU2016211519B2 (en) Clipping for cross-component prediction and adaptive color transform for video coding
JP6046235B2 (en) Coded block flag inference in video coding
JP6396439B2 (en) Residual differential pulse code modulation (DPCM) expansion and harmony with conversion skip, rotation, and scanning
TWI527465B (en) Coded block flag coding
RU2580066C2 (en) Memory efficient context modelling
RU2645291C2 (en) Quantization parameter (qp) encoding when encoding video
TWI520583B (en) Method, device, and computer-readable storage medium for decoding and encoding video data
TWI686080B (en) Contexts for large coding tree units
TW201841503A (en) Intra filtering flag in video coding
TW201804806A (en) Signaling of quantization information in non-quadtree-only partitioned video coding
TWI827606B (en) Trellis coded quantization coefficient coding
KR20110071231A (en) Encoding method, decoding method and apparatus thereof
TW201408076A (en) Sign hiding techniques for quantized transform coefficients in video coding
KR20160070808A (en) Systems and methods for inter-layer rps derivation based on sub-layer reference prediction dependency
JP2018511232A (en) Optimization for encoding video data using non-square sections
US9961351B2 (en) Palette mode coding
US20220385903A1 (en) Image or video coding based on signaling of transform skip - and palette coding related information
US20220385925A1 (en) Method and device for coding transform coefficient
WO2023004590A1 (en) Video decoding and encoding methods and devices, and storage medium
WO2022257142A1 (en) Video decoding and coding method, device and storage medium
WO2023039856A1 (en) Video decoding method and device, video encoding method and device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21951206

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE