CN108174218B - Video coding and decoding system based on learning - Google Patents

Video coding and decoding system based on learning

Info

Publication number
CN108174218B
CN108174218B (application CN201810064012.9A)
Authority
CN
China
Prior art keywords
iterative
time domain
space
coding
analyzer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810064012.9A
Other languages
Chinese (zh)
Other versions
CN108174218A (en)
Inventor
陈志波
何天宇
金鑫
刘森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN201810064012.9A
Publication of CN108174218A
Application granted
Publication of CN108174218B
Legal status: Active

Classifications

    • H04N 19/00: methods or arrangements for coding, decoding, compressing or decompressing digital video signals (H: electricity; H04: electric communication technique; H04N: pictorial communication, e.g. television)
    • H04N 19/503: predictive coding involving temporal prediction
    • H04N 19/147: data rate or code amount at the encoder output according to rate-distortion criteria
    • H04N 19/192: adaptive coding in which the adaptation method, tool or type is iterative or recursive
    • H04N 19/593: predictive coding involving spatial prediction techniques
    • H04N 19/70: syntax aspects related to video coding, e.g. related to compression standards

Abstract

The invention discloses a learning-based video coding and decoding framework, comprising: a space-time domain reconstruction memory for storing reconstructed video content that has been encoded and decoded; a space-time domain prediction network that exploits the space-time correlation of the reconstructed video content, models it with convolutional and recurrent neural networks, and outputs a predicted value for the current coding block; the predicted value is subtracted from the original value to form a residual; an iterative analyzer and an iterative synthesizer that encode and decode the input residual stage by stage; a binarizer for converting the output of the iterative analyzer into a binary representation; an entropy encoder for entropy-coding the quantized coded output into the output bitstream; and an entropy decoder for entropy-decoding the output bitstream and passing the result to the iterative synthesizer. The coding framework realizes space-time domain prediction with a learning-based VoxelCNN (the space-time domain prediction network) and controls the rate-distortion optimization of video coding with a residual iterative coding method.

Description

Video coding and decoding system based on learning
Technical Field
The invention relates to the technical field of video coding and decoding, in particular to a video coding and decoding framework based on learning.
Background
Existing image and video coding standards, such as JPEG, H.261, MPEG-2, H.264 and H.265, are all based on a hybrid coding framework. Over years of development, gains in coding performance have been accompanied by steadily increasing complexity, so further improving coding performance under the existing hybrid architecture faces growing challenges.
Moreover, the hybrid coding framework generally optimizes image and video coding with heuristic methods, making it increasingly difficult to meet the requirements of complex, intelligent media applications such as face recognition, target tracking and image retrieval.
Disclosure of Invention
The invention aims to provide a learning-based video coding and decoding framework that can control the rate-distortion optimization of video coding.
This aim is achieved by the following technical scheme:
a learning-based video coding/decoding framework, comprising: an encoding end and a decoding end; wherein the encoding end includes: the device comprises a space-time domain reconstruction memory, a space-time domain prediction network, an iterative analyzer, an iterative synthesizer, a binarizer, an entropy encoder and an entropy decoder;
the space-time domain reconstruction memory is used for storing video content that has been encoded, decoded and reconstructed;
the space-time domain prediction network is used for exploiting the space-time correlation of the reconstructed video content, modeling it with convolutional and recurrent neural networks, and outputting a predicted value for the current coding block;
the iterative analyzer comprises convolutional and recurrent neural network structures; its input is the residual formed by subtracting the predicted value output by the space-time domain prediction network from the original value, and its output is a compressed representation of the residual;
the iterative synthesizer comprises convolutional and recurrent neural network structures; it receives the compressed representation of the residual decoded by the entropy decoder and superimposes the predicted value output by the space-time domain prediction network to form the reconstructed video content;
the iterative analyzer and the iterative synthesizer encode and decode the input residual stage by stage, gradually reducing the residual distortion as the bitstream grows, thereby realizing coding at different distortion levels under both high and low bitrate conditions;
the binarizer is used for converting the output of the iterative analyzer into a binary representation;
the entropy encoder is used for entropy-coding the quantized coded output to obtain the output bitstream;
and the entropy decoder is used for entropy-decoding the output bitstream and passing the result to the iterative synthesizer.
As the above technical scheme shows, the invention integrates space-time domain prediction with residual iterative coding: space-time domain prediction is realized by the learning-based VoxelCNN (the space-time domain prediction network), and the rate-distortion optimization of video coding is controlled with the residual iterative coding method.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a learning-based video encoding and decoding framework according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a main processing procedure of a video codec framework according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a motion interpolation process according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a motion extension process provided by an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only a part of the embodiments of the present invention, not all of them; all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present invention.
The embodiment of the invention provides a learning-based video coding and decoding framework, which mainly comprises an encoding end and a decoding end. As shown in fig. 1, the encoding end mainly includes a space-time domain reconstruction memory, a space-time domain prediction network, an iterative analyzer, an iterative synthesizer, a binarizer, an entropy encoder and an entropy decoder.
The space-time domain reconstruction memory stores the encoded-and-decoded reconstructed video content, including previously decoded frames and the already-decoded blocks of the current frame. Coding and decoding typically proceed either forward along the video timeline (P-frames) or bi-directionally (B-frames), and each frame is typically coded and decoded block by block in left-to-right, top-to-bottom order.
The space-time domain prediction network (VoxelCNN) exploits the space-time correlation of the reconstructed video content, models it with convolutional and recurrent neural networks, and outputs a predicted value for the current coding block; the predicted value is subtracted from the original value to form a residual, which is iteratively coded by the iterative analyzer and the iterative synthesizer to realize rate-distortion optimization.
The iterative analyzer comprises convolutional and recurrent neural network structures; its input is the residual formed by subtracting the predicted value output by the space-time domain prediction network from the original value, and its output is a compressed representation of the residual.
The iterative synthesizer comprises convolutional and recurrent neural network structures; it receives the compressed representation of the residual decoded by the entropy decoder and superimposes the predicted value output by the space-time domain prediction network to form the reconstructed video content.
The iterative analyzer and the iterative synthesizer encode and decode the input residual stage by stage, gradually reducing the residual distortion as the bitstream grows, thereby realizing coding at different distortion levels under both high and low bitrate conditions.
The binarizer converts the output of the iterative analyzer into a binary representation.
The entropy encoder entropy-codes the quantized coded output to obtain the output bitstream.
The entropy decoder entropy-decodes the output bitstream and passes the result to the iterative synthesizer.
In the embodiment of the invention, the entropy encoder and entropy decoder can be implemented with methods such as context-based arithmetic coding, i.e., an arithmetic encoder/decoder serves as the entropy encoder/decoder.
In the embodiment of the invention, the space-time domain reconstruction memory, the space-time domain prediction network, the iterative synthesizer and the entropy decoder together form the decoder inside the encoding end.
Those skilled in the art will appreciate that, since the decoding side can only obtain the reconstructed video content rather than the original video content, the encoding end includes decoding functionality so that the encoder references the same reconstructed content as the decoder.
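For orientation, one coding step can be traced through these components in a minimal Python sketch, with every learned module reduced to a stub. All names and stub behaviors below are illustrative assumptions for exposition, not the patent's implementation:

    import numpy as np

    def predict(memory):
        # Stub for the space-time domain prediction network (VoxelCNN):
        # here simply the mean of the stored reconstructions.
        return np.mean(np.stack(memory), axis=0)

    def analyze(residual):
        # Stub for the iterative analyzer (identity); the real analyzer is a
        # convolutional/recurrent auto-encoder producing a compressed code.
        return residual

    def binarize(code):
        # Binarizer: quantize the analyzer output to a binary representation.
        return np.sign(code)

    def synthesize(bits):
        # Stub for the iterative synthesizer: map the binary code back to a
        # coarse residual estimate.
        return 0.1 * bits

    memory = [np.zeros((64, 64)), np.zeros((64, 64))]  # space-time reconstruction memory
    original = np.random.rand(64, 64)                  # current content to encode

    prediction = predict(memory)                    # space-time domain prediction
    residual = original - prediction                # residual = original - predicted
    bits = binarize(analyze(residual))              # compressed, binarized representation
    # `bits` would now be entropy-coded (e.g. context-based arithmetic coding)
    reconstruction = prediction + synthesize(bits)  # decoder: prediction + synthesized residual
    memory.append(reconstruction)                   # reference for later blocks/frames

In the actual framework, the analyze-binarize-synthesize step is repeated over the remaining residual several times, as detailed later.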
For ease of understanding, the main processing in the video codec framework is described in detail below with reference to the specific example shown in fig. 2.
In the embodiment of the invention, the space-time domain prediction network calculates the predicted value of a coding block through two processes: motion synthesis and hybrid prediction.
1. Motion synthesis.
Motion synthesis includes motion interpolation and motion extension, corresponding to two different coding modes; one of the two is selected in operation. A combined code sketch of both modes follows the two mode descriptions below.
1) Motion interpolation derives the object motion trajectory from two adjacent frames in the reconstructed video content and interpolates it between those two frames to form an interpolated frame. As shown in fig. 3, the motion interpolation process is as follows. Let $v_x, v_y, x, y \in \mathbb{Z}$, where $(v_x, v_y)$ denotes a motion vector and $\mathbb{Z}$ denotes the set of integers. Let the interpolated frame be $\hat{f}_t$, and denote the two adjacent frames in the reconstructed video content by $\hat{f}_{t-1}$ and $\hat{f}_{t+1}$. With block size $m$, a motion compensation operation determines the motion vector $(v_x, v_y)$ of the block centered at coordinates $(x, y)$; the coding block of the interpolated frame $\hat{f}_t$ centered at $(x, y)$ is then assigned a copy of the motion-compensated block of $\hat{f}_{t-1}$ centered at $(x - v_x, y - v_y)$. Applying this to every block yields the complete interpolated frame $\hat{f}_t$, which is the output of the motion interpolation operation.
2) Motion extension derives the object motion trajectory from the first two frames in the reconstructed video content and extrapolates it to obtain an extended frame $\hat{f}_t$. As shown in fig. 4, the motion extension process is as follows: over the first two frames $\hat{f}_{t-2}$ and $\hat{f}_{t-1}$, a motion compensation operation with coding block size $m$ determines the motion vector $(v_x, v_y)$ of the coding block centered at coordinates $(x, y)$; the coding block of the extended frame $\hat{f}_t$ centered at $(x, y)$ is then assigned a copy of the block of $\hat{f}_{t-1}$ centered at $(x - v_x, y - v_y)$. Applying this to every block yields the complete extended frame $\hat{f}_t$, which is the output of the motion extension operation.
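Both motion synthesis modes can be sketched with plain block matching. The following minimal numpy illustration assumes exhaustive-search block matching as the motion compensation operation and, for interpolation, a halved two-frame motion vector; neither detail is fixed by the text above:

    import numpy as np

    def block(frame, x, y, m):
        # m x m block of `frame` centered at (x, y); assumed to lie in range
        h = m // 2
        return frame[y - h:y - h + m, x - h:x - h + m]

    def motion_vector(ref, cur, x, y, m=8, search=4):
        # Exhaustive block matching, one possible motion compensation operation:
        # the (vx, vy) minimizing SAD between cur's block at (x, y) and ref's
        # block at (x - vx, y - vy).
        target = block(cur, x, y, m).astype(np.float64)
        best_cost, best = np.inf, (0, 0)
        for vy in range(-search, search + 1):
            for vx in range(-search, search + 1):
                cost = np.abs(block(ref, x - vx, y - vy, m) - target).sum()
                if cost < best_cost:
                    best_cost, best = cost, (vx, vy)
        return best

    def synthesize_frame(ref_a, ref_b, mode, m=8, search=4):
        # mode "extend": ref_a, ref_b = f_{t-2}, f_{t-1}; the estimated motion is
        #   carried one step forward, copying ref_b's block at (x-vx, y-vy).
        # mode "interpolate": ref_a, ref_b = f_{t-1}, f_{t+1}; the two-frame motion
        #   is halved (an assumption) to land midway between the references.
        H, W = ref_b.shape
        out = np.zeros_like(ref_b)
        h = m // 2
        for y in range(h + search, H - h - search, m):  # margins keep blocks in range
            for x in range(h + search, W - h - search, m):
                vx, vy = motion_vector(ref_a, ref_b, x, y, m, search)
                if mode == "extend":
                    src, sx, sy = ref_b, x - vx, y - vy
                else:
                    src, sx, sy = ref_a, x - vx // 2, y - vy // 2
                out[y - h:y - h + m, x - h:x - h + m] = block(src, sx, sy, m)
        return out

For example, synthesize_frame(f_prev2, f_prev1, "extend") plays the role of motion extension, while synthesize_frame(f_prev, f_next, "interpolate") plays the role of motion interpolation.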
2. Hybrid prediction.
The hybrid prediction comprises convolution and convolutional LSTM structures. Depending on the motion synthesis mode (in fig. 2, the motion extension mode, so the synthesized frame is an extended frame), its inputs are the interpolated frame together with the frames before and after it ($\hat{f}_{t-1}$ and $\hat{f}_{t+1}$), or the extended frame together with the two frames preceding it ($\hat{f}_{t-2}$ and $\hat{f}_{t-1}$), plus the already-decoded blocks above and to the left of the current coding block in the current frame. By learning to model the video's space-time domain information, it generates the predicted value of the current coding block in the current frame; through iterative computation, one coding block's predicted value is generated at a time in left-to-right, top-to-bottom order, and the predictions are finally stitched into the whole frame.
As shown in fig. 2, assume the motion extension coding mode is used: the two frames preceding the extended frame ($\hat{f}_{t-2}$ and $\hat{f}_{t-1}$) and the already-decoded blocks above and to the left of the current coding block in the current frame (each frame is coded and decoded in top-to-bottom, left-to-right order) serve as input. In the motion interpolation mode, the inputs are instead the frames before and after the interpolated frame ($\hat{f}_{t-1}$ and $\hat{f}_{t+1}$) together with the already-decoded blocks above and to the left of the current coding block. The hybrid prediction generates the predicted value of the current coding block by learning to model the video's space-time domain information; through iterative computation, one predicted value is generated at a time in top-to-bottom, left-to-right order, and the whole frame is finally stitched together. In the embodiment of the invention, the predicted value output by the space-time domain prediction network is subtracted from the original value to form the residual, which is iteratively coded by the iterative analyzer and the iterative synthesizer; the optimization target of the space-time domain prediction network is:

$$\min \sum_{i=1}^{B} \sum_{j=1}^{J} \left\| x_j^{(i)} - \hat{x}_j^{(i)} \right\|^2,$$

where $B$ is the total number of frames involved in the optimization, $J$ is the total number of coding blocks per frame in the reconstructed video content, and $x_j^{(i)}$ and $\hat{x}_j^{(i)}$ are the original and predicted values of the $j$-th coding block in the $i$-th frame.
In the embodiment of the invention, this optimization target serves as the loss function: the role of the space-time domain prediction network is to generate predicted values that are close to the original values.
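To make the structure concrete, here is a minimal PyTorch sketch of such a conv + ConvLSTM hybrid predictor trained with the squared-error objective above. The channel counts, the input packing (three reference planes plus one causal-context plane), and the single-cell design are illustrative assumptions, not the patent's architecture:

    import torch
    import torch.nn as nn

    class ConvLSTMCell(nn.Module):
        # Standard convolutional LSTM cell: one convolution over the
        # concatenated input and hidden state produces the four gates.
        def __init__(self, in_ch, hid_ch, k=3):
            super().__init__()
            self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

        def forward(self, x, state):
            h, c = state
            i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
            return h, c

    class HybridPredictor(nn.Module):
        # Convolutional features of the motion-synthesized frame, its reference
        # frames, and the causal context (already-decoded blocks above/left of
        # the current block) are fused by a ConvLSTM and mapped to a prediction.
        def __init__(self, ch=32):
            super().__init__()
            self.feat = nn.Conv2d(4, ch, 3, padding=1)  # 3 reference planes + context
            self.lstm = ConvLSTMCell(ch, ch)
            self.out = nn.Conv2d(ch, 1, 3, padding=1)

        def forward(self, refs, context, state=None):
            x = torch.relu(self.feat(torch.cat([refs, context], dim=1)))
            if state is None:
                state = (torch.zeros_like(x), torch.zeros_like(x))
            h, c = self.lstm(x, state)
            return self.out(h), (h, c)

    model = HybridPredictor()
    refs = torch.rand(1, 3, 64, 64)     # synthesized frame + two reference frames
    context = torch.rand(1, 1, 64, 64)  # current frame, decoded blocks only
    target = torch.rand(1, 1, 64, 64)   # original values
    pred, state = model(refs, context)
    loss = torch.mean((pred - target) ** 2)  # the squared-error objective above

The recurrent state lets the predictor carry information across the block-by-block iteration described above.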
In the embodiment of the invention, the iterative analyzer and the iterative synthesizer comprise $S$ coding stages built from $S$ convolution-based auto-encoders; the reconstructed value and the target value are continuously, iteratively analyzed and synthesized to realize a variable compression ratio. Each stage of the iterative analyzer generates a compressed representation of its input residual, which is quantized to form the output bitstream. The optimization target of the iterative analyzer and the iterative synthesizer is:

$$\min_{\theta} \sum_{n=1}^{S} \left\| r_n - \mathcal{S}_n\big(\mathcal{A}_n(r_n)\big) \right\|, \qquad r_{n+1} = r_n - \mathcal{S}_n\big(\mathcal{A}_n(r_n)\big),$$

where $r_1$ is the residual input at the initial (first) stage, $r_n$ denotes the residual input at the $n$-th stage, and $\mathcal{A}_n(r_n)$ denotes the output of the $n$-th stage (the compressed representation of that stage's input residual).
In the embodiment of the invention, the iterative analyzer and the iterative synthesizer are jointly optimized: the parameter set $\theta$ in the formula collects all the parameters of both the iterative analyzer and the iterative synthesizer.
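A minimal PyTorch sketch of this progressive residual loop follows. Each stage is a small convolutional auto-encoder standing in for the patent's conv + ConvLSTM stages, and the binarizer is a plain sign function (a straight-through estimator would replace it during training); all sizes are illustrative assumptions:

    import torch
    import torch.nn as nn

    class Stage(nn.Module):
        # One analyze/synthesize pair; the patent's stages also contain
        # recurrent (ConvLSTM) structure, omitted here for brevity.
        def __init__(self, ch=32, code_ch=8):
            super().__init__()
            self.analyze = nn.Sequential(
                nn.Conv2d(1, ch, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(ch, code_ch, 3, padding=1), nn.Tanh())
            self.synthesize = nn.Sequential(
                nn.ConvTranspose2d(code_ch, ch, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(ch, 1, 3, padding=1))

    def iterative_code(residual, stages):
        # Each stage codes what the previous stages failed to reconstruct;
        # stopping after n stages yields an n-stage (lower-rate) bitstream.
        bits, r = [], residual
        for stage in stages:
            b = torch.sign(stage.analyze(r))  # binarized compressed representation
            bits.append(b)                    # to be entropy-coded into the stream
            r = r - stage.synthesize(b)       # residual left for the next stage
        return bits, r                        # r is the distortion still uncoded

    stages = [Stage() for _ in range(4)]  # S = 4 stages (an arbitrary choice)
    residual = torch.randn(1, 1, 64, 64)  # x - x_hat from the prediction network
    bits, remaining = iterative_code(residual, stages)

Training would jointly minimize the norm of each stage's reconstruction error over all parameters, matching the objective above; truncating the bitstream after fewer stages trades distortion for rate.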
The scheme provided by the embodiment of the invention addresses the difficulty of realizing motion prediction through integrated training within a neural network: it proposes VoxelCNN to jointly model the space-time domain priors of video content and integrates the iterative analyzer/synthesizer, the binarizer, and the entropy encoder/decoder to realize learning-based video coding and decoding. In experimental verification, even without the entropy encoder/decoder, the method outperforms an MPEG-2 standard encoder and achieves performance similar to H.264.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A learning-based video coding/decoding system, comprising an encoding end and a decoding end, wherein the encoding end includes a space-time domain reconstruction memory, a space-time domain prediction network, an iterative analyzer, an iterative synthesizer, a binarizer, an entropy encoder and an entropy decoder;
the space-time domain reconstruction memory is used for storing video content that has been encoded, decoded and reconstructed;
the space-time domain prediction network is used for exploiting the space-time correlation of the reconstructed video content, modeling it with convolutional and recurrent neural networks, and outputting a predicted value for the current coding block;
the iterative analyzer comprises convolutional and recurrent neural network structures; its input is the residual formed by subtracting the predicted value output by the space-time domain prediction network from the original value, and its output is a compressed representation of the residual;
the iterative synthesizer comprises convolutional and recurrent neural network structures; it receives the compressed representation of the residual decoded by the entropy decoder and superimposes the predicted value output by the space-time domain prediction network to form the reconstructed video content;
the iterative analyzer and the iterative synthesizer, in cooperation with the entropy encoder and the entropy decoder, encode and decode the input residual stage by stage, gradually reducing the residual distortion as the bitstream grows, thereby realizing coding at different distortion levels under both high and low bitrate conditions;
the binarizer is used for converting the output of the iterative analyzer into a binary representation;
the entropy encoder is used for entropy-coding the quantized coded output to obtain the output bitstream;
the entropy decoder is used for entropy-decoding the output bitstream and passing the result to the iterative synthesizer;
the space-time domain prediction network calculates the predicted value of a coding block through two processes, motion synthesis and hybrid prediction, wherein:
motion interpolation obtains the object motion trajectory from two adjacent frames of the reconstructed video content and interpolates it between the two adjacent frames to form an interpolated frame, while motion extension obtains the object motion trajectory from the first two frames of the reconstructed video content and extrapolates it to obtain an extended frame;
the hybrid prediction comprises convolution and convolutional LSTM structures; taking as input the interpolated or extended frame, the frames before and after the interpolated frame or the two frames preceding the extended frame, and the already-decoded blocks above and to the left of the current coding block in the current frame, it generates the predicted value of the current coding block in the current frame by learning to model the video's space-time domain information; the predicted value of each coding block is finally obtained through iterative calculation.
2. The learning-based video coding and decoding system according to claim 1, wherein the space-time domain reconstruction memory, the space-time domain prediction network, the iterative synthesizer and the entropy decoder constitute a decoder inside the encoding end.
3. The learning-based video coding and decoding system of claim 1, wherein the motion interpolation process is as follows: let the interpolated frame be $\hat{f}_t$ and denote the two adjacent frames in the reconstructed video content by $\hat{f}_{t-1}$ and $\hat{f}_{t+1}$; with block size $m$, a motion compensation operation determines the motion vector $(v_x, v_y)$ of the block centered at coordinates $(x, y)$; the coding block of $\hat{f}_t$ centered at $(x, y)$ is assigned a copy of the motion-compensated block of $\hat{f}_{t-1}$ centered at $(x - v_x, y - v_y)$, whereby the complete interpolated frame $\hat{f}_t$ is obtained.
4. The learning-based video codec system of claim 1, wherein the motion extension process is as follows: over the first two frames of the reconstructed video content, $\hat{f}_{t-2}$ and $\hat{f}_{t-1}$, a motion compensation operation with coding block size $m$ determines the motion vector $(v_x, v_y)$ of the coding block centered at coordinates $(x, y)$; the coding block of the extended frame $\hat{f}_t$ centered at $(x, y)$ is assigned a copy of the block of $\hat{f}_{t-1}$ centered at $(x - v_x, y - v_y)$; in this way, the complete extended frame $\hat{f}_t$ is obtained.
5. The learning-based video coding and decoding system of claim 1, wherein the predicted value output by the space-time domain prediction network is subtracted from the original value to form a residual, which is iteratively coded by the iterative analyzer and the iterative synthesizer in cooperation with the entropy encoder and the entropy decoder, and the optimization target of the space-time domain prediction network is:

$$\min \sum_{i=1}^{B} \sum_{j=1}^{J} \left\| x_j^{(i)} - \hat{x}_j^{(i)} \right\|^2,$$

where $B$ is the total number of frames involved in the optimization, $J$ is the total number of coding blocks per frame in the reconstructed video content, and $x_j^{(i)}$ and $\hat{x}_j^{(i)}$ are the original and predicted values of the $j$-th coding block in the $i$-th frame.
6. The learning-based video coding and decoding system according to claim 5, wherein the iterative analyzer and the iterative synthesizer comprise $S$ coding stages built from $S$ convolution-based auto-encoders; the reconstructed value (the reconstructed video content) and the target value are continuously, iteratively analyzed and synthesized to realize a variable compression ratio; each stage of the iterative analyzer generates a compressed representation of its input residual, which is quantized to form the output bitstream; and the optimization target of the iterative analyzer and the iterative synthesizer is:

$$\min_{\theta} \sum_{n=1}^{S} \left\| r_n - \mathcal{S}_n\big(\mathcal{A}_n(r_n)\big) \right\|, \qquad r_{n+1} = r_n - \mathcal{S}_n\big(\mathcal{A}_n(r_n)\big),$$

where $r_1$ is the residual input at the initial stage, $r_n$ denotes the residual input at the $n$-th stage, and $\mathcal{A}_n(r_n)$ denotes the output of the $n$-th stage iterative analyzer.
CN201810064012.9A 2018-01-23 2018-01-23 Video coding and decoding system based on learning Active CN108174218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810064012.9A CN108174218B (en) 2018-01-23 2018-01-23 Video coding and decoding system based on learning

Publications (2)

Publication Number Publication Date
CN108174218A CN108174218A (en) 2018-06-15
CN108174218B (en) 2020-02-07

Family

ID=62515681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810064012.9A Active CN108174218B (en) 2018-01-23 2018-01-23 Video coding and decoding system based on learning

Country Status (1)

Country Link
CN (1) CN108174218B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109451308B (en) 2018-11-29 2021-03-09 北京市商汤科技开发有限公司 Video compression processing method and device, electronic equipment and storage medium
CN110493596B (en) * 2019-09-02 2021-09-17 西北工业大学 Video coding system and method based on neural network
CN111222532B (en) * 2019-10-23 2024-04-02 西安交通大学 Training method for edge cloud collaborative deep learning model with classification precision maintenance and bandwidth protection
CN111050174A (en) * 2019-12-27 2020-04-21 清华大学 Image compression method, device and system
CN111669601B (en) * 2020-05-21 2022-02-08 天津大学 Intelligent multi-domain joint prediction coding method and device for 3D video
CN111898638B (en) * 2020-06-29 2022-12-02 北京大学 Image processing method, electronic device and medium fusing different visual tasks
CN115118972A (en) * 2021-03-17 2022-09-27 华为技术有限公司 Video image coding and decoding method and related equipment
CN113473149A (en) * 2021-05-14 2021-10-01 北京邮电大学 Semantic channel joint coding method and device for wireless image transmission


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1857001A (en) * 2003-05-20 2006-11-01 Amt先进多媒体科技公司 Hybrid video compression method
CN105163121A (en) * 2015-08-24 2015-12-16 西安电子科技大学 Large-compression-ratio satellite remote sensing image compression method based on deep self-encoding network
CN105430415A (en) * 2015-12-02 2016-03-23 宁波大学 Fast intraframe coding method of 3D-HEVC depth videos
CN107105278A (en) * 2017-04-21 2017-08-29 中国科学技术大学 Video coding and decoding framework with automatically generated motion vectors

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An End-to-End Compression Framework Based on Convolutional Neural Networks; Feng Jiang et al.; IEEE Transactions on Circuits and Systems for Video Technology; 2017-08-01; entire document *
Pixel Recurrent Neural Networks; A. v. d. Oord et al.; International Conference on Machine Learning; 2016-08-19; pp. 3007-3018 *

Also Published As

Publication number Publication date
CN108174218A (en) 2018-06-15

Similar Documents

Publication Publication Date Title
CN108174218B (en) Video coding and decoding system based on learning
Baig et al. Learning to inpaint for image compression
US8625682B2 (en) Nonlinear, prediction filter for hybrid video compression
CN112866694B (en) Intelligent image compression optimization method combining asymmetric convolution block and condition context
US9609324B2 (en) Image encoding/decoding method and device using coefficients of adaptive interpolation filter
JPH1093972A (en) Outline encoding method
CN111294604B (en) Video compression method based on deep learning
Islam et al. Image compression with recurrent neural network and generalized divisive normalization
US20100158131A1 (en) Iterative dvc decoder based on adaptively weighting of motion side information
Akbari et al. Learned multi-resolution variable-rate image compression with octave-based residual blocks
Zhou et al. Distributed video coding using interval overlapped arithmetic coding
CN105556850A (en) Encoder and decoder, and method of operation
US8594196B2 (en) Spatial Wyner Ziv coding
CN105794116A (en) Encoder, decoder and method of operation using interpolation
CN111343458A (en) Sparse gray image coding and decoding method and system based on reconstructed residual
JPH08116542A (en) Image coder, image decoder and motion vector detector
CN111131834B (en) Reversible self-encoder, encoding and decoding method, image compression method and device
CN112437300B (en) Distributed video coding method based on self-adaptive interval overlapping factor
KR101500300B1 (en) Selective Low-Power Video Codec with Interaction Between Encoder and Decoder, and an Encoding/Decoding Method Thereof
CN1848960A (en) Residual coding in compliance with a video standard using non-standardized vector quantization coder
Wang et al. Transform skip inspired end-to-end compression for screen content image
CN113938687A (en) Multi-reference inter-frame prediction method, system, device and storage medium
CN106791864A (en) A kind of implementation method based on raising video code conversion speed under HEVC standard
WO2022067806A1 (en) Video encoding and decoding methods, encoder, decoder, and storage medium
Tian et al. Effortless cross-platform video codec: A codebook-based method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant