CN111901596B - Video hybrid encoding and decoding method, device and medium based on deep learning - Google Patents

Video hybrid encoding and decoding method, device and medium based on deep learning

Info

Publication number
CN111901596B
Authority
CN
China
Prior art keywords
frame image
decoding
video
bottleneck layer
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010604772.1A
Other languages
Chinese (zh)
Other versions
CN111901596A (en)
Inventor
贾川民
马思伟
王苫社
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN202010604772.1A
Publication of CN111901596A
Application granted
Publication of CN111901596B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/124 Quantisation
    • H04N19/184 Adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
    • H04N19/60 Transform coding
    • H04N19/82 Details of filtering operations specially adapted for video compression, involving filtering within a prediction loop

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract


The invention discloses a deep-learning-based video hybrid encoding and decoding method, device and medium. The encoding method includes: extracting bottleneck layer features from a specified frame image; reconstructing a first frame image according to the bottleneck layer features; quantizing and entropy-encoding the bottleneck layer features to obtain intra-frame coded data; and compensating, transforming, quantizing and entropy-encoding a first subsequent frame image of the current video to obtain first prediction residual data. The decoding method includes: entropy-decoding the intra-frame coded data to obtain the bottleneck layer features and decode the specified frame image; and entropy-decoding, inverse-quantizing, inverse-transforming and compensating the first prediction residual data, then loop-filtering the compensated data to decode the first subsequent frame image. The encoding and decoding devices correspond to the respective encoding and decoding methods. The invention provides a brand-new video encoding and decoding scheme that achieves efficient compression and fast decoding of video and greatly improves video compression performance.


Description

Video hybrid coding and decoding method, device and medium based on deep learning
Technical Field
The present invention relates to the field of video coding technologies, and in particular, to a video hybrid coding and decoding method and apparatus, and a computer storage medium.
Background
At present, the traditional hybrid coding framework mainly performs predictive transform coding on image blocks of different sizes, and improvement schemes often focus on local rate-distortion optimization of individual coding tools. For example, entropy estimation models are continually improved: probability estimation models based on Gaussian mixture models and Gaussian hyper-prior entropy estimation models have been proposed, and context models built on autoregressive coding frameworks help end-to-end image coding frameworks obtain higher coding gain. However, existing video hybrid coding schemes often struggle to meet higher video compression requirements.
Therefore, how to further improve the video compression rate through hybrid coding has long been a key technical problem for those skilled in the art.
Disclosure of Invention
To address the difficulty of further improving the compression rate of existing video hybrid coding schemes, the invention provides a deep-learning-based video hybrid encoding and decoding method, device and medium that better overcome the shortcomings of existing schemes.
To achieve the technical purpose, the present invention specifically discloses a video hybrid coding method based on deep learning, which includes, but is not limited to, the following processes.
Extracting bottleneck layer features from a specified frame image of the current video.
Reconstructing a first frame image according to the bottleneck layer features, and sequentially quantizing and entropy-encoding the bottleneck layer features to obtain intra-frame coded data for writing into a code stream.
Compensating a first subsequent frame image of the current video by using the first frame image, and sequentially transforming, quantizing and entropy-encoding the compensated image to obtain first prediction residual data for writing into the code stream.
Further, the hybrid encoding method further includes:
compensating a second subsequent frame image of the current video by using the first frame image or the first subsequent frame image, and then transforming, quantizing and entropy-encoding the compensated image to obtain second prediction residual data for writing into the code stream.
Wherein the designated frame image, the first subsequent frame image, and the second subsequent frame image appear in the current video in a front-to-back order.
Further, the process of compensating the second subsequent frame image by using the first subsequent frame image comprises:
sequentially performing transformation, quantization, inverse quantization, inverse transformation, secondary compensation and loop filtering on the first subsequent frame image, taking the loop-filtered image as a reconstructed second frame image, and compensating the second subsequent frame image by using the second frame image.
Further, the process of extracting the bottleneck layer feature from the specified frame image of the current video includes:
sequentially grouping all frame images of the current video in order from front to back to obtain a plurality of groups of images, and taking the first frame image of each group as the specified frame image, wherein each group of images includes the first subsequent frame image and the second subsequent frame image.
Extracting the bottleneck layer features from the first frame image of the current group of images.
Further, before the quantization and entropy coding are sequentially performed on the bottleneck layer features, the method further includes:
performing rate estimation on the bottleneck layer features to be quantized by using a hyper-prior network.
In order to achieve the technical purpose, the invention also discloses a video decoding method based on deep learning, which is used for decoding data generated by the video hybrid coding method according to any embodiment of the invention; the video decoding method based on deep learning comprises the following steps:
entropy decoding intra-frame coded data in the received code stream to obtain bottleneck layer characteristics, and decoding a specified frame image according to the bottleneck layer characteristics.
performing entropy decoding, inverse quantization and inverse transformation on first prediction residual data in the received code stream, compensating by using the specified frame image, and performing loop filtering on the compensated data, so as to decode a first subsequent frame image by means of loop filtering.
The deep-learning-based video decoding method further comprises:
performing entropy decoding, inverse quantization and inverse transformation on second prediction residual data in the received code stream, compensating by using the first subsequent frame image or the specified frame image, and performing loop filtering on the compensated image, so as to decode a second subsequent frame image by means of loop filtering.
In order to achieve the technical object, the present invention also specifically discloses a video hybrid coding device based on deep learning, which includes, but is not limited to, the following structure.
The analysis network module is configured to extract bottleneck layer features from a specified frame image of the current video.
The encoding-end generation network module is configured to reconstruct a first frame image according to the bottleneck layer features.
The quantization module quantizes the bottleneck layer features.
The entropy coding module is configured to entropy-encode the quantized bottleneck layer features to obtain intra-frame coded data for writing into the code stream.
The encoding-end compensation module is configured to compensate a first subsequent frame image of the current video by using the first frame image.
The transformation module is configured to transform the compensated image.
The quantization module is further configured to quantize the transformed image.
The entropy coding module is further configured to entropy-encode the quantized image to obtain first prediction residual data for writing into the code stream.
In order to achieve the technical object, the invention also discloses a video decoding device based on deep learning, which is used for decoding the data generated by the video hybrid coding device of any embodiment of the invention; the video decoding apparatus based on the deep learning includes, but is not limited to, the following structure.
The entropy decoding module is configured to entropy-decode intra-frame coded data in the received code stream to obtain bottleneck layer features, and is further configured to entropy-decode first prediction residual data in the received code stream.
The decoding-end generation network module is configured to decode the specified frame image according to the bottleneck layer features.
The decoding-end inverse quantization module is configured to inverse-quantize the entropy-decoded first prediction residual data.
The decoding-end inverse transformation module is configured to inverse-transform the inverse-quantized first prediction residual data.
The decoding-end compensation module is configured to compensate the inverse-transformed first prediction residual data by using the specified frame image.
The decoding-end loop filtering module is configured to perform loop filtering on the compensated first prediction residual data, so as to decode a first subsequent frame image by means of loop filtering.
To achieve the above technical object, the present invention further provides a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements a deep learning based video hybrid encoding method in any embodiment of the present invention or a deep learning based video decoding method in any embodiment of the present invention.
The invention has the following beneficial effects: it provides a novel video encoding and decoding scheme that achieves efficient compression and fast decoding of video, greatly improves video compression performance, and better addresses the difficulties of existing video hybrid coding schemes, such as hard-to-improve compression ratios and slow decoding.
Compared with conventional techniques, and in particular with schemes that generate intra-frame prediction pixels from neighboring pixels, the method does not need to encode residual information during intra-frame encoding, so its coding efficiency is far higher than that of conventional coding schemes and video coding performance is greatly improved.
In addition, the weight data of the analysis network and the generation network in the deep-learning self-encoder are stored offline and need not be transmitted in the code stream, so the self-encoder effectively reduces the bitstream size.
Drawings
Fig. 1 is a flow chart of a video hybrid coding method based on deep learning according to some embodiments of the present invention.
Fig. 2 illustrates a schematic diagram of a coding end framework in some embodiments of the invention.
Fig. 3 shows a decoding end frame diagram in some embodiments of the invention.
FIG. 4 illustrates a schematic diagram of the operational status of an analysis network module and a generation network module in some embodiments of the invention.
Fig. 5 shows a rate-distortion performance comparison between the invention and various conventional coding algorithms.
Detailed Description
The following describes the deep-learning-based video hybrid encoding and decoding method and apparatus, and the computer-readable storage medium, in detail with reference to the drawings of the specification.
Example one:
As shown in figs. 1 and 2, this embodiment provides a deep-learning-based video hybrid coding method in which the intra-frame coding work is performed by a deep-learning self-encoder. In particular, the video hybrid encoding method may include, but is not limited to, the following steps.
First, an intra-frame coding process is performed: bottleneck layer features are extracted from a specified frame image of the current video. In some preferred embodiments of the present invention, this proceeds as follows. All frame images of the current video are sequentially grouped in order from front to back to obtain multiple groups of images, and the first frame image of each group is taken as the specified frame image; for example, a video of 1600 frames can be divided into 100 groups of 16 frames each. Each group of images includes a first subsequent frame image and a second subsequent frame image. The bottleneck layer features are extracted from the first frame image of the current group. This extraction can be realized by a deep-learning self-encoder (auto-encoder) built on a convolutional neural network; the self-encoder has a bottleneck layer, a different self-encoder can be trained for each rate point (quantization parameter value), and different network weights can be configured during training to realize coding at different rates. Continuing the example above, if a group of images is numbered 1 to 16 from front to back, image 1 is the specified frame image, image 2 is a first subsequent frame image, and each of images 3-16 may be a first subsequent frame image or a second subsequent frame image. A minimal sketch of the grouping and of such a self-encoder follows.
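For concreteness, the sketch below shows a frame-grouping helper and a convolutional self-encoder with a bottleneck layer, written in PyTorch. The layer counts, channel widths, kernel sizes and GOP size are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn

class IntraAutoEncoder(nn.Module):
    """Self-encoder whose narrowest layer serves as the bottleneck layer."""

    def __init__(self, bottleneck_channels=192):
        super().__init__()
        # Analysis network: maps the specified frame image to bottleneck features.
        self.analysis = nn.Sequential(
            nn.Conv2d(3, 128, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(128, 128, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(128, bottleneck_channels, 5, stride=2, padding=2),
        )
        # Generation network: reconstructs the first frame image from the features.
        self.generation = nn.Sequential(
            nn.ConvTranspose2d(bottleneck_channels, 128, 5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 128, 5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 3, 5, stride=2,
                               padding=2, output_padding=1),
        )

    def forward(self, frame):
        features = self.analysis(frame)       # bottleneck layer features
        recon = self.generation(features)     # reconstructed intra frame
        return features, recon

def group_frames(num_frames, gop_size=16):
    """E.g. 1600 frames -> 100 groups of 16; frame 0 of each group is intra."""
    return [list(range(s, min(s + gop_size, num_frames)))
            for s in range(0, num_frames, gop_size)]
```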
As shown in fig. 2, before extracting the bottleneck layer features, this embodiment determines whether the current frame is the specified frame Fn. That is, the encoding mode must be selected as either an intra mode or a non-intra mode; in intra mode, bottleneck layer feature extraction is performed on the first frame of each group of pictures. Non-intra modes in this embodiment include, but are not limited to, an inter mode, a skip mode, a merge mode, an affine motion mode, and the like.
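A toy sketch of this mode decision follows, under the assumption (consistent with the 16-frame grouping example above) that the first frame of each group is coded in intra mode and every other frame in some non-intra mode, abstracted here as "inter".

```python
def select_mode(frame_index, gop_size=16):
    # Frame 0 of each group is the specified frame Fn and is coded intra;
    # the actual non-intra sub-mode (inter/skip/merge/affine) is chosen later.
    return "intra" if frame_index % gop_size == 0 else "inter"

assert select_mode(0) == "intra" and select_mode(5) == "inter"
```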
Second, the first frame image is reconstructed according to the bottleneck layer features, and the bottleneck layer features are sequentially quantized and entropy-coded to obtain intra-frame coded data for writing into the code stream and transmitting through it. The invention can thus improve overall video coding efficiency by improving intra-frame coding efficiency. In some preferred embodiments of the present invention, before the bottleneck layer features are sequentially quantized and entropy-coded, rate estimation is performed on the features to be quantized using a hyper-prior network. With this rate estimation, the compression rate can be further improved, yielding a smaller code stream and faster transmission under the same conditions.
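The patent does not spell out the hyper-prior network itself. The sketch below assumes, as is common in learned image coding, rounding quantization and a conditional Gaussian entropy model whose mean and scale would be produced by the hyper-prior branch (PyTorch assumed; tensor shapes are illustrative).

```python
import torch

def quantize(features):
    # Hard rounding at inference; training would typically substitute
    # additive uniform noise or a straight-through estimator.
    return torch.round(features)

def estimate_rate_bits(q, mu, sigma):
    # Probability mass of each integer symbol under N(mu, sigma^2),
    # integrated over the quantization bin [q - 0.5, q + 0.5].
    dist = torch.distributions.Normal(mu, sigma)
    p = dist.cdf(q + 0.5) - dist.cdf(q - 0.5)
    return -torch.log2(p.clamp_min(1e-9)).sum()

features = torch.randn(1, 192, 16, 16) * 3   # toy bottleneck features
mu = torch.zeros_like(features)               # mean and scale would come
sigma = torch.ones_like(features)             # from the hyper-prior branch
q = quantize(features)
print(f"estimated rate: {estimate_rate_bits(q, mu, sigma).item():.0f} bits")
```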
Third, an inter-frame motion-compensated prediction coding process is performed: the first frame image is used to compensate a first subsequent frame image of the current video, and the compensated image is sequentially transformed, quantized and entropy-coded to obtain first prediction residual data for writing into the code stream and transmitting through it; any group of images of the current video may generate first prediction residual data one or more times. The inter coding mode of this embodiment may be a block-based inter motion-compensated predictive coding mode. The transform may be an orthogonal transform, including but not limited to the discrete cosine transform (DCT), wavelet transform, or Hadamard transform; the transform exposes signals to which the human eye is insensitive. Quantization achieves compression by removing non-critical or unimportant signals, such as those insensitive to the human eye, and entropy coding achieves further compression.
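As an illustration of the transform and quantization steps, here is a self-contained 8x8 orthonormal DCT with uniform quantization in NumPy. The quantization step size is a placeholder, not a value from the patent.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis: rows are frequencies, columns are samples.
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C = dct_matrix(8)

def transform_quantize(residual_block, qstep=16.0):
    coeffs = C @ residual_block @ C.T     # forward 2-D DCT
    return np.round(coeffs / qstep)       # uniform quantization

def dequantize_inverse(levels, qstep=16.0):
    return C.T @ (levels * qstep) @ C     # inverse quantize + inverse 2-D DCT

block = np.random.randn(8, 8) * 10        # toy motion-compensated residual
levels = transform_quantize(block)
recon = dequantize_inverse(levels)
print("max reconstruction error:", np.abs(block - recon).max())
```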
Finally, this embodiment compensates a second subsequent frame image of the current video by using the first frame image or the first subsequent frame image, then transforms, quantizes and entropy-codes the compensated image to obtain second prediction residual data for writing into the code stream. Compensating the second subsequent frame image with the first frame image is similar to compensating the first subsequent frame image with the first frame image and is not repeated here. In some preferred embodiments of the present invention, compensating the second subsequent frame image with the first subsequent frame image proceeds as follows: the first subsequent frame image is sequentially subjected to transformation, quantization, inverse quantization, inverse transformation, secondary compensation and loop filtering (Loop Filter), and the loop-filtered image is taken as the reconstructed second frame image, which is then used to compensate the second subsequent frame image. The specified frame image, the first subsequent frame image and the second subsequent frame image appear in the current video in front-to-back order; it should be understood that a preceding second subsequent frame image may be used to compensate a following one. This embodiment can be executed in a loop: after one group of frame images is encoded, the bottleneck layer feature extraction step above is restarted for the next group.
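The reference-frame reconstruction loop sketched below reuses the 8x8 DCT helpers from the previous example. The simple blur standing in for the loop filter and the function name are illustrative assumptions, since the patent does not specify the filter.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def reconstruct_reference(frame, prediction, qstep=16.0):
    """frame, prediction: HxW arrays with H and W multiples of 8."""
    residual = frame - prediction                 # motion-compensated residual
    recon = np.empty_like(residual)
    h, w = residual.shape
    for y in range(0, h, 8):                      # blockwise transform/quantize,
        for x in range(0, w, 8):                  # then inverse quantize/transform
            levels = transform_quantize(residual[y:y+8, x:x+8], qstep)
            recon[y:y+8, x:x+8] = dequantize_inverse(levels, qstep)
    compensated = prediction + recon              # secondary compensation
    return uniform_filter(compensated, size=3)    # stand-in loop filter
```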
As shown in fig. 5, which compares the rate-distortion performance of the invention with that of various conventional coding algorithms, the deep-learning-based video coding method provided by the invention performs better.
Example two:
As shown in figs. 3 and 4, this embodiment provides a deep-learning-based video decoding method, which corresponds to the video encoding method of any embodiment of the present invention and is used to decode the data generated by the video hybrid encoding method of any embodiment of the present invention.
Specifically, the video decoding method based on deep learning of the present embodiment includes, but is not limited to, the following steps.
First, entropy decoding is performed on the intra-frame coded data in the received code stream to obtain the bottleneck layer features, and the specified frame image is decoded according to them.
Then, entropy decoding, inverse quantization and inverse transformation are performed on the first prediction residual data in the received code stream, compensation is performed using the specified frame image, and loop filtering is applied to the compensated data, so that the first subsequent frame image is decoded by means of loop filtering.
Finally, entropy decoding, inverse quantization and inverse transformation can be performed on the second prediction residual data in the received code stream, compensation is performed using the first subsequent frame image or the specified frame image, and loop filtering is applied to the compensated image, so that the second subsequent frame image is decoded by means of loop filtering.
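Mirroring the encoder, here is a hedged sketch of decoding one subsequent frame from already entropy-decoded residual levels, again reusing the helpers above. The identity "motion prediction" (copying the reference) and the blur loop filter are stand-ins for components the patent leaves unspecified.

```python
def decode_subsequent_frame(levels_blocks, reference, qstep=16.0):
    """levels_blocks: dict mapping (y, x) -> 8x8 array of quantized coefficients."""
    prediction = reference.copy()                 # stand-in motion compensation
    recon = prediction.copy()
    for (y, x), levels in levels_blocks.items():  # inverse quantize + inverse
        recon[y:y+8, x:x+8] = (                   # transform, then compensate
            prediction[y:y+8, x:x+8] + dequantize_inverse(levels, qstep)
        )
    return uniform_filter(recon, size=3)          # stand-in loop filter
```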
Example three:
the present embodiment is based on the same inventive concept as the first embodiment, and can specifically provide a video hybrid coding device based on deep learning. The device can provide a video coding framework fusing a depth intra-frame self-encoder and inter-frame motion compensation prediction, an intra-frame mode or a non-intra-frame mode can be selected at a coding end, efficient coding of video is achieved, and further compression of the video is completed.
The analysis network module takes the original signal as input and outputs the bottleneck layer features; that is, it is used to extract the bottleneck layer features from the specified frame image of the current video. In this embodiment, all frame images of the current video are sequentially grouped in order from front to back to obtain multiple groups of images, and the first frame image of each group is taken as the specified frame image; each group of images may include a first subsequent frame image and a second subsequent frame image, and the bottleneck layer features are extracted from the first frame image of the current group.
The encoding-end generation network module is used to reconstruct the first frame image according to the bottleneck layer features, thereby obtaining P illustrated in fig. 2. At the encoding end, the analysis network module and the encoding-end generation network module of this embodiment together form a deep-learning self-encoder built on a convolutional neural network, and this self-encoder has a bottleneck layer.
The quantization module quantizes the bottleneck layer features.
The entropy coding module is used to entropy-encode the quantized bottleneck layer features to obtain intra-frame coded data for writing into the code stream.
The encoding end compensation module is used for compensating a first subsequent frame image of the current video by using the first frame image P; and is also used for compensating a second subsequent frame image in the current video by using the first frame image P or the first subsequent frame image.
The transformation module is used for transforming the compensated image; the transformation module may also be used to transform the first subsequent frame image.
The hyper-prior coding module can be used to perform rate estimation, via a hyper-prior network, on the bottleneck layer features to be quantized.
The quantization module is also used for quantizing the transformed image; and the quantization module may also be used to quantize the transformed first subsequent frame image.
The encoding-end inverse quantization module is used to inverse-quantize the quantized first subsequent frame image.
The encoding-end inverse transformation module is used to inverse-transform the inverse-quantized first subsequent frame image to obtain Dn′ shown in fig. 2.
The encoding-end secondary compensation module can be used to secondarily compensate the inverse-transformed first subsequent frame image to obtain uFn′ shown in fig. 2.
The encoding-end loop filtering module is configured to perform loop filtering on the secondarily compensated first subsequent frame image, so as to use the loop-filtered image as a reconstructed second frame image (also denoted by P in fig. 2) and to compensate the second subsequent frame image with it.
The entropy coding module can also be used to entropy-encode the quantized image to obtain the first prediction residual data for writing into the code stream, and to entropy-encode the transformed and quantized second subsequent frame image to obtain the second prediction residual data for writing into the code stream. The specified frame image, the first subsequent frame image and the second subsequent frame image in this embodiment may appear in the current video in order from front to back.
Example four:
the video decoding method in the second embodiment can be based on the same inventive concept, and this embodiment specifically provides a video decoding apparatus based on deep learning, which can be used for decoding data generated by the video hybrid encoding apparatus in any embodiment of the present invention.
As shown in fig. 3 and 4, the video decoding apparatus based on deep learning in the present embodiment includes, but is not limited to, the following modules. In addition, the decoding end of the invention does not need to analyze the network, so the structure of the decoding end of the invention is more simplified.
The entropy decoding module is used to entropy-decode intra-frame coded data in the received code stream to obtain the bottleneck layer features; it is also used to entropy-decode the first prediction residual data and the second prediction residual data in the received code stream.
The decoding-end generation network module takes the entropy-decoded bottleneck layer features as input and outputs a reconstructed intra-coded image; that is, it decodes the corresponding specified frame image P from the bottleneck layer features.
The decoding-end inverse quantization module is used to inverse-quantize the entropy-decoded first and second prediction residual data.
The decoding-end inverse transformation module is configured to inverse-transform the inverse-quantized first and second prediction residual data to obtain Dn′ in fig. 3.
The decoding-end compensation module is configured to compensate the inverse-transformed first prediction residual data with the specified frame image to obtain uFn′ in fig. 3, and to compensate the inverse-transformed second prediction residual data with the first subsequent frame image (also denoted by P) or the specified frame image.
The decoding-end loop filtering module is used to perform loop filtering on the compensated first and second prediction residual data, so as to decode the first and second subsequent frame images by means of loop filtering.
Example five:
the present embodiment provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements a deep learning based video hybrid encoding method or a deep learning based video decoding method according to any one of the embodiments of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable storage medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber device, and a portable Compact Disc Read-Only Memory (CD-ROM). Additionally, the computer-readable storage medium may even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having appropriate combinational logic gates, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "the present embodiment," "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and simplifications made in the spirit of the present invention are intended to be included in the scope of the present invention.

Claims (10)

1. A deep-learning-based video hybrid encoding method, characterized by comprising:
extracting bottleneck layer features from a specified frame image of a current video, a first frame image being taken as the specified frame image;
reconstructing a first frame image according to the bottleneck layer features, and sequentially quantizing and entropy-encoding the bottleneck layer features to obtain intra-frame coded data for writing into a code stream;
performing inter-frame motion compensation on a first subsequent frame image of the current video by using the first frame image, and sequentially transforming, quantizing and entropy-encoding the compensated image to obtain first prediction residual data for writing into the code stream.

2. The deep-learning-based video hybrid encoding method according to claim 1, further comprising:
compensating a second subsequent frame image of the current video by using the first frame image or the first subsequent frame image, and then transforming, quantizing and entropy-encoding the compensated image to obtain second prediction residual data for writing into the code stream;
wherein the specified frame image, the first subsequent frame image and the second subsequent frame image appear in the current video in order from front to back.

3. The deep-learning-based video hybrid encoding method according to claim 2, wherein the process of compensating the second subsequent frame image by using the first subsequent frame image comprises:
sequentially performing transformation, quantization, inverse quantization, inverse transformation, secondary compensation and loop filtering on the first subsequent frame image, and taking the loop-filtered image as a reconstructed second frame image, so as to compensate the second subsequent frame image by using the second frame image.

4. The deep-learning-based video hybrid encoding method according to any one of claims 1 to 3, wherein the process of extracting the bottleneck layer features from the specified frame image of the current video comprises:
sequentially grouping all frame images of the current video in order from front to back to obtain a plurality of groups of images, and taking the first frame image of each group of images as the specified frame image, wherein each group of images includes the first subsequent frame image and the second subsequent frame image;
extracting the bottleneck layer features from the first frame image of the current group of images.

5. The deep-learning-based video hybrid encoding method according to claim 1, further comprising, before the bottleneck layer features are sequentially quantized and entropy-encoded:
performing rate estimation on the bottleneck layer features to be quantized by using a hyper-prior network.

6. A deep-learning-based video decoding method, characterized by being used to decode data generated by the video hybrid encoding method according to any one of claims 2 to 5, the deep-learning-based video decoding method comprising:
performing entropy decoding on intra-frame coded data in a received code stream to obtain bottleneck layer features, and decoding a specified frame image according to the bottleneck layer features;
performing entropy decoding, inverse quantization and inverse transformation on first prediction residual data in the received code stream and compensating by using the specified frame image, and then performing loop filtering on the compensated data, so as to decode a first subsequent frame image by means of loop filtering.

7. The deep-learning-based video decoding method according to claim 6, further comprising:
performing entropy decoding, inverse quantization and inverse transformation on second prediction residual data in the received code stream and compensating by using the first subsequent frame image or the specified frame image, and then performing loop filtering on the compensated image, so as to decode a second subsequent frame image by means of loop filtering.

8. A deep-learning-based video hybrid encoding device, characterized by comprising:
an analysis network module, configured to extract bottleneck layer features from a specified frame image of a current video, a first frame image being taken as the specified frame image;
an encoding-end generation network module, configured to reconstruct a first frame image according to the bottleneck layer features;
a quantization module, configured to quantize the bottleneck layer features;
an entropy coding module, configured to entropy-encode the quantized bottleneck layer features to obtain intra-frame coded data for writing into a code stream;
an encoding-end compensation module, configured to perform inter-frame motion compensation on a first subsequent frame image of the current video by using the first frame image;
a transformation module, configured to transform the compensated image;
the quantization module being further configured to quantize the transformed image;
the entropy coding module being further configured to entropy-encode the quantized image to obtain first prediction residual data for writing into the code stream.

9. A deep-learning-based video decoding device, characterized by being used to decode data generated by the video hybrid encoding device according to claim 8, the deep-learning-based video decoding device comprising:
an entropy decoding module, configured to entropy-decode intra-frame coded data in a received code stream to obtain bottleneck layer features, and further configured to entropy-decode first prediction residual data in the received code stream;
a decoding-end generation network module, configured to decode a specified frame image according to the bottleneck layer features;
a decoding-end inverse quantization module, configured to inverse-quantize the entropy-decoded first prediction residual data;
a decoding-end inverse transformation module, configured to inverse-transform the inverse-quantized first prediction residual data;
a decoding-end compensation module, configured to compensate the inverse-transformed first prediction residual data by using the specified frame image;
a decoding-end loop filtering module, configured to perform loop filtering on the compensated first prediction residual data, so as to decode a first subsequent frame image by means of loop filtering.

10. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the computer program implements the video hybrid encoding method according to any one of claims 1 to 5 or the video decoding method according to claim 6 or 7.
CN202010604772.1A 2020-06-29 2020-06-29 Video hybrid encoding and decoding method, device and medium based on deep learning Active CN111901596B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010604772.1A CN111901596B (en) 2020-06-29 2020-06-29 Video hybrid encoding and decoding method, device and medium based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010604772.1A CN111901596B (en) 2020-06-29 2020-06-29 Video hybrid encoding and decoding method, device and medium based on deep learning

Publications (2)

Publication Number Publication Date
CN111901596A (en) 2020-11-06
CN111901596B (en) 2021-10-22

Family

ID=73206537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010604772.1A Active CN111901596B (en) 2020-06-29 2020-06-29 Video hybrid encoding and decoding method, device and medium based on deep learning

Country Status (1)

Country Link
CN (1) CN111901596B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12170786B2 (en) 2021-09-17 2024-12-17 Samsung Electronics Co., Ltd. Device and method for encoding and decoding image using AI

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113873248A (en) * 2021-09-26 2021-12-31 深圳市万利翔实业有限公司 A kind of digital video data encoding and decoding method and device
CN113766237B (en) * 2021-09-30 2024-07-02 咪咕文化科技有限公司 Encoding method, decoding method, device, equipment and readable storage medium
CN114125443B (en) * 2021-11-19 2024-12-24 展讯通信(上海)有限公司 Video bit rate control method, device and electronic device
WO2023126568A1 (en) * 2021-12-27 2023-07-06 Nokia Technologies Oy A method, an apparatus and a computer program product for video encoding and video decoding
CN115022637A (en) * 2022-04-26 2022-09-06 华为技术有限公司 A kind of image coding method, image decompression method and device
WO2024049627A1 (en) * 2022-09-02 2024-03-07 Interdigital Vc Holdings, Inc. Video compression for both machine and human consumption using a hybrid framework
CN115529457B (en) * 2022-09-05 2024-05-14 清华大学 Video compression method and device based on deep learning
CN115209147B (en) * 2022-09-15 2022-12-27 深圳沛喆微电子有限公司 Camera video transmission bandwidth optimization method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108632527A (en) * 2017-03-24 2018-10-09 安讯士有限公司 Controller, video camera and the method for controlling video camera
CN110677651A (en) * 2019-09-02 2020-01-10 合肥图鸭信息科技有限公司 Video compression method
CN110753225A (en) * 2019-11-01 2020-02-04 合肥图鸭信息科技有限公司 Video compression method and device and terminal equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8493499B2 (en) * 2010-04-07 2013-07-23 Apple Inc. Compression-quality driven image acquisition and processing system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108632527A (en) * 2017-03-24 2018-10-09 安讯士有限公司 Controller, video camera and the method for controlling video camera
CN110677651A (en) * 2019-09-02 2020-01-10 合肥图鸭信息科技有限公司 Video compression method
CN110753225A (en) * 2019-11-01 2020-02-04 合肥图鸭信息科技有限公司 Video compression method and device and terminal equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
贾川民 (Jia Chuanmin) et al.; "基于神经网络的图像视频编码" [Image and Video Coding Based on Neural Networks]; 电信科学 [Telecommunications Science]; 2019-05-31; full text *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12170786B2 (en) 2021-09-17 2024-12-17 Samsung Electronics Co., Ltd. Device and method for encoding and decoding image using AI

Also Published As

Publication number Publication date
CN111901596A (en) 2020-11-06

Similar Documents

Publication Publication Date Title
CN111901596B (en) Video hybrid encoding and decoding method, device and medium based on deep learning
US11159789B2 (en) Generative adversarial network based intra prediction for video coding
Golinski et al. Feedback recurrent autoencoder for video compression
JP5606591B2 (en) Video compression method
CN110798690B (en) Video decoding method, and method, device and equipment for training loop filtering model
CN112715027A (en) Neural network driven codec
CN113766249B (en) Loop filtering method, device, equipment and storage medium in video coding and decoding
JP2011515981A (en) Method and apparatus for encoding or decoding video signal
KR20080018469A (en) Image conversion method and apparatus, inverse conversion method and apparatus
KR102245682B1 (en) Apparatus for compressing image, learning apparatus and method thereof
CN113259671B (en) Loop filtering method, device, equipment and storage medium in video coding and decoding
KR101739603B1 (en) Method and apparatus for reusing tree structures to encode and decode binary sets
EP4018410A1 (en) Watermark-based image reconstruction
CN111757109A (en) High-real-time parallel video coding and decoding method, system and storage medium
CN112954350B (en) Video post-processing optimization method and device based on frame classification
KR100679027B1 (en) Method and apparatus for coding an image without loss of DC components
Belyaev et al. Error concealment for 3-D DWT based video codec using iterative thresholding
CN115883851A (en) Filtering, encoding and decoding methods and devices, computer readable medium and electronic equipment
WO2021263251A1 (en) State transition for dependent quantization in video coding
CN115883842B (en) Filtering and encoding and decoding method, device, computer readable medium and electronic device
WO2024140951A1 (en) A neural network based image and video compression method with integer operations
WO2024140849A1 (en) Method, apparatus, and medium for visual data processing
US20230239470A1 (en) Video encoding and decoding methods, encoder, decoder, and storage medium
WO2024149392A1 (en) Method, apparatus, and medium for visual data processing
WO2024149308A1 (en) Method, apparatus, and medium for video processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant