CN110602494A - Image coding and decoding system and method based on deep learning - Google Patents

Image coding and decoding system and method based on deep learning

Info

Publication number
CN110602494A
Authority
CN
China
Prior art keywords
image
super
prior
module
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910705904.7A
Other languages
Chinese (zh)
Inventor
王培�
Other inventors have requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Pico Pico Technology Co ltd
Original Assignee
Hangzhou Pico Pico Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Pico Pico Technology Co ltd
Priority to CN201910705904.7A
Publication of CN110602494A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/149 Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep learning-based image coding and decoding system and a coding and decoding method. The coding system comprises a deep learning-based forward transformation network module, a deep learning-based conditional probability super-prior analysis module, and an entropy coding module; the forward transformation network module is used for obtaining feature coefficients; the super-prior analysis module is used for obtaining super-prior feature values; the entropy coding module is used for entropy coding. The decoding system comprises an entropy decoding module, a deep learning-based reconstruction module, and a deep learning-based inverse transformation network module; the entropy decoding module is used for entropy decoding; the reconstruction module is used for obtaining a conditional probability model; the inverse transformation network module is used for reconstructing image pixel values. With the invention, the codec is trained in an unsupervised manner, and its performance exceeds that of various traditional coding standards.

Description

Image coding and decoding system and method based on deep learning
Technical Field
The invention relates to the technical field of image coding, in particular to an image coding and decoding system and a coding and decoding method based on deep learning.
Background
With the rapid development of multimedia technology and network communication technology, image multimedia applications have come to cover many aspects of human life. The large number of image applications generates a huge amount of data that would be impractical to store and transmit without compression. Image compression coding technology can effectively remove redundant information from the data and enables fast transmission and offline storage of image data on the Internet. Therefore, image compression coding technology is a key technology in video applications.
Over the past decades, a series of image coding standards have come into wide use. Existing image compression standards include JPEG and JPEG2000, proposed by the Joint Photographic Experts Group; PNG, published by the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC); WebP, released by Google; and BPG, created by Fabrice Bellard in 2014. Although conventional coding standards are numerous and continue to advance, the coding framework has not changed significantly. For example, image coding standards basically follow the framework of transform coding, and the development trend of conventional coding standards is to trade finer and more complex algorithms for higher coding performance. As image coding standards have iterated to the present day, further performance gains have become increasingly difficult to obtain.
In recent years, deep learning techniques have achieved major breakthroughs in many image processing and machine vision tasks and have received extensive attention from researchers. Deep learning can learn prior knowledge about the data and adaptive transformation operations from large amounts of data, which also makes it suitable for the image coding task. Research on deep learning for image coding began with a recurrent neural network-based image coding method published by Google in 2015. Recently, several studies have shown that deep learning-based image coding methods outperform many conventional image coding techniques. Although such methods have not been under development for long, they have achieved performance comparable to the best current conventional coding technique (BPG, which is based on HEVC intra coding, is currently the best-performing conventional image codec). These results show that deep learning-based image coding has great potential and may achieve coding performance that comprehensively exceeds that of traditional methods. In addition, compared with traditional methods, which depend on expert knowledge and feature engineering, deep learning is highly adaptive and can be trained on specific data in practical applications to obtain higher coding efficiency. Establishing and releasing a new generation of traditional video coding standard often requires 10 years, so research on deep learning-based image coding is expected to improve coding performance significantly and has important academic and practical value.
However, the recurrent neural network-based image coding method published by Google, mentioned above, is too computationally expensive, which hinders practical use. Therefore, an image coding method with high computational efficiency and excellent coding performance is urgently needed.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a deep learning-based image coding and decoding system and a coding and decoding method, together with a training strategy that allows the whole coding network to be optimized end to end. The training is unsupervised, and the performance of the resulting codec exceeds that of various traditional coding standards.
In order to solve the technical problems, the invention is realized by the following technical scheme:
The invention provides an image coding system based on deep learning, which comprises:
the deep learning-based forward transformation network module, which is used for passing the image through a forward transformation network to obtain feature coefficients representing image information;
the deep learning-based conditional probability super-prior analysis module, which is used for analyzing the feature coefficients to obtain super-prior feature values representing the conditional probability of the feature coefficients;
and the entropy coding module, which is used for entropy coding the quantized feature coefficients under the guidance of the super-prior conditional probability to obtain a feature coefficient code stream, and is also used for entropy coding the quantized super-prior feature values with a conditional probability model estimated statistically on a training set to obtain a super-prior feature value code stream.
Preferably, the entropy coding module is further configured to perform bypass entropy coding on the image meta-information to obtain an image meta-information code stream;
wherein the image meta information includes: the length and width of the image, and the model number used by the image.
Preferably, the forward transform network module is constructed based on a deep convolutional neural network;
the forward transformation network module comprises N convolutional layers and N-1 normalization layers; the module starts with a convolutional layer, and the convolutional layers and normalization layers alternate.
Preferably, the super-prior analysis module is constructed based on a deep convolutional neural network;
the analysis network of the super-prior analysis module comprises six layers; the first layer is an absolute value operation layer, the second layer is a convolution layer, the third layer is an activation layer, the fourth layer is a convolution layer, the fifth layer is an activation layer, and the sixth layer is a convolution layer.
Preferably, the quantization in the entropy coding module is approximated by adding random uniform noise.
The present invention also provides an image decoding system based on deep learning, which is an image decoding system corresponding to the above image encoding system, and includes:
the entropy decoding module, which is used for entropy decoding the super-prior feature value code stream to obtain a reconstructed super-prior feature value matrix;
the deep learning-based reconstruction module, which is used for obtaining, through training and according to the super-prior feature value matrix, a conditional probability model of the feature coefficients based on a Laplace-distribution super-prior; the entropy decoding module is further used for entropy decoding the feature coefficient code stream according to the conditional probability model to obtain a reconstructed feature coefficient matrix;
and the deep learning-based inverse transformation network module, which is used for passing the reconstructed feature coefficient matrix through an inverse transformation network to reconstruct the image pixel values.
Preferably, the entropy decoding module is further configured to perform entropy decoding on the image meta-information code stream to obtain image meta-information;
wherein the image meta information includes: the length and width of the image, and the model number used by the image.
Preferably, the inverse transformation network module is constructed based on a deep convolutional neural network;
the inverse transformation network module and the forward transformation network module are in a symmetrical structure;
the inverse transformation network module comprises N deconvolution layers and N-1 inverse normalization layers; the module starts with a deconvolution layer, and the deconvolution layers and inverse normalization layers alternate.
Preferably, the super-prior reconstruction module is constructed based on a deep convolutional neural network;
the super-prior reconstruction module and the super-prior analysis module are in a symmetrical structure;
the reconstruction network of the super-prior reconstruction module also comprises six layers: the first layer is a deconvolution layer, the second layer is an activation layer, the third layer is a deconvolution layer, the fourth layer is an activation layer, the fifth layer is a deconvolution layer, and the sixth layer is an output layer that applies the exponential function to each input feature value.
The invention also provides an image coding method based on deep learning, which comprises the following steps:
S101: carrying out forward transformation on an input image to obtain a feature coefficient matrix representing image information;
S102: inputting the feature coefficient matrix into the super-prior analysis module, which outputs a super-prior feature value matrix representing the probability of the feature coefficients;
S103: quantizing the super-prior feature value matrix from S102, and entropy coding the quantized super-prior feature value matrix to obtain a super-prior feature value code stream;
S104: obtaining, through training and according to the quantized super-prior feature value matrix from S103, a conditional probability model of the feature coefficients based on a Laplace-distribution super-prior;
S105: quantizing the feature coefficient matrix from S101, and entropy coding the quantized feature coefficient matrix using the conditional probability model from S104 to obtain a feature coefficient code stream;
S106: packing and outputting the code stream of the image, which includes: the super-prior feature value code stream from S103 and the feature coefficient code stream from S105.
Preferably, between S105 and S106, further comprising:
S111: carrying out bypass entropy coding on the image meta-information to obtain an image meta-information code stream, wherein the image meta-information comprises: the length and width of the image, and the model serial number adopted by the image; further,
the output code stream of the image in S106 further includes: the image meta-information code stream from S111.
Preferably, the quantization in S103 and/or S105 is approximate quantization, and the approximate quantization is performed by adding random uniform noise.
Preferably, the value range of the random uniform noise is [ -0.5,0.5 ].
Preferably, the S104 includes:
taking minimization of the loss function $J = R + \lambda D$ as the target, adopting MS-SSIM or PSNR as the measurement index, and approximating the code rate by the information entropy; wherein:
the information entropy is obtained from the conditional probability function of the feature coefficients, namely $n = \sum \left(-p \log_2 p\right)$;
the conditional probability density is modeled based on the Laplace distribution, the mean is assumed to be 0, and the variance is given by the super-prior conditional probability model obtained by training;
where $R$ is the code rate, $D$ is the distortion, $\lambda$ is the weighting factor between the two, $n$ is the information entropy, and $p$ is the conditional probability function.
The present invention also provides an image decoding method based on deep learning, which is an image decoding method corresponding to the above image encoding method, and which includes the steps of:
S141: entropy decoding to obtain the image meta-information, including the length and width of the image and the model serial number adopted by the image;
S142: decoding the super-prior feature value code stream, using the super-prior feature value entropy coding model that corresponds to the model serial number, to obtain the super-prior feature value matrix; constructing and initializing the corresponding network model according to the model serial number;
S143: sending the super-prior feature value matrix decoded in S142 into the super-prior reconstruction module, which outputs the conditional probability of the feature coefficients;
S144: decoding the feature coefficient code stream, using the conditional probability of the feature coefficients from S143, to obtain the feature coefficient matrix of the image;
S145: sending the feature coefficient matrix from S144 to the inverse transformation network module and reconstructing the pixel values.
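The data dependencies between steps S141 to S145 can be sketched as a short pipeline. This is only an illustration of the ordering: the `TraceCodec` class, its method names, and the dictionary keys are hypothetical stand-ins that record calls and return dummy values, not real entropy decoders or networks.

```python
class TraceCodec:
    """Hypothetical stand-in codec that only records the order of calls."""

    def __init__(self):
        self.calls = []

    def decode_meta(self, meta_bytes):
        self.calls.append("S141 meta")
        return {"width": 8, "height": 8, "model_index": 0}

    def load_model(self, model_index):
        self.calls.append("S142 model")
        return {"index": model_index}

    def decode_super_prior(self, super_prior_bytes, model):
        self.calls.append("S142 super-prior")
        return [[1.0]]

    def reconstruct_probabilities(self, super_prior, model):
        self.calls.append("S143 probabilities")
        return [[0.5]]

    def decode_features(self, feature_bytes, probabilities):
        self.calls.append("S144 features")
        return [[0]]

    def inverse_transform(self, features, model):
        self.calls.append("S145 pixels")
        return [[128]]


def decode_image(stream, codec):
    """Run the decoding steps in the order their inputs become available."""
    meta = codec.decode_meta(stream["meta"])                               # S141
    model = codec.load_model(meta["model_index"])                          # S142
    super_prior = codec.decode_super_prior(stream["super_prior"], model)   # S142
    probabilities = codec.reconstruct_probabilities(super_prior, model)    # S143
    features = codec.decode_features(stream["features"], probabilities)    # S144
    return codec.inverse_transform(features, model)                        # S145
```

The point of the ordering is that the feature coefficient code stream cannot be decoded until the super-prior has been decoded and passed through the reconstruction module, because the entropy decoder needs the resulting conditional probabilities.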
The invention also provides an image coding terminal, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor can be used for executing the image coding method based on deep learning when executing the program.
The invention also provides an image decoding terminal, which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor can be used for executing the image coding and decoding method based on the deep learning when executing the program.
The present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the above-described deep learning-based image encoding method.
The present invention also provides a computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, is operable to perform the above-mentioned deep learning-based image coding and decoding method.
Compared with the prior art, the invention has the following advantages:
(1) the deep learning-based image coding and decoding system and coding and decoding method are constructed based on neural networks whose parameters need to be trained, and a training strategy is provided that allows the whole coding network to be optimized end to end; being based on neural networks, the calculation efficiency is high; with unsupervised training, the performance of the trained codec exceeds that of a variety of conventional coding standards, such as JPEG and JPEG2000;
(2) in the training stage, the deep learning-based image coding and decoding system and coding and decoding method model the feature coefficients of an image with a conditional probability based on a Laplace-distribution super-prior; this modeling is differentiable, so the code rate loss term can be expressed by a continuously differentiable function, and the network parameters can therefore be updated by gradient backpropagation;
(3) in the training stage, the deep learning-based image coding and decoding system and coding and decoding method approximate the quantization operation by adding random uniform noise, so that the coding and decoding process becomes differentiable.
Of course, it is not necessary for any product that implements the invention to achieve all of the above-described advantages at the same time.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings:
FIG. 1 is a flowchart of an image coding method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a block diagram of an image coding method based on deep learning according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an output code stream of an image coding method based on deep learning according to an embodiment of the present invention;
FIG. 4 is a flowchart of an image decoding method based on deep learning according to an embodiment of the present invention.
Detailed Description
The following examples are described in detail, which are carried out on the premise of the technical solution of the present invention, and detailed embodiments and specific procedures are provided, but the scope of the present invention is not limited to the following examples.
Example 1:
The deep learning-based image encoding system of the present embodiment includes: a deep learning-based forward transformation network module, a deep learning-based conditional probability super-prior analysis module, and an entropy coding module. The forward transformation network module is used for passing the image through a forward transformation network to obtain feature coefficients representing image information; the conditional probability super-prior analysis module is used for analyzing the feature coefficients to obtain super-prior feature values representing the conditional probability of the feature coefficients; the entropy coding module is used for entropy coding the quantized feature coefficients under the guidance of the super-prior conditional probability to obtain a feature coefficient code stream, and is also used for entropy coding the quantized super-prior feature values with a conditional probability model estimated statistically on a training set to obtain a super-prior feature value code stream.
In a preferred embodiment, the entropy coding module is further configured to perform bypass entropy coding on the image meta-information to obtain an image meta-information code stream; wherein the image meta information includes: the length and width of the image, the model number used by the image (used to determine the network parameters of the deep learning network used).
In a preferred embodiment, the forward transformation network module is constructed based on a deep convolutional neural network; the module starts with a convolutional layer, and the convolutional layers and normalization layers alternate. In one embodiment, the convolution kernels of each convolutional layer are all 5 × 5 in size, the number of convolution kernels is 192, and the spatial length and width of the feature coefficients after each convolution are both reduced to half of the original size. The normalization layers use the generalized divisive normalization (GDN) operation proposed by Ballé et al., which acts as a form of local gain control.
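The layer ordering and the normalization step can be sketched as follows. This is a minimal illustration under stated assumptions: the layer labels are hypothetical, halving of the spatial size is taken to mean stride 2, and the normalization uses the commonly cited GDN form $y_i = x_i / \sqrt{\beta_i + \sum_j \gamma_{ij} x_j^2}$ applied across channels at one spatial position, with `beta` and `gamma` standing in for learned parameters that the source does not specify.

```python
import math

def forward_transform_layers(n):
    """N convolutional layers alternating with N-1 normalization layers."""
    layers = []
    for i in range(n):
        layers.append("conv5x5_stride2_192")  # hypothetical label: 5x5 kernel, 192 kernels
        if i < n - 1:                         # no normalization after the last convolution
            layers.append("gdn")
    return layers

def gdn(x, beta, gamma):
    """Divide each channel by a pooled energy of all channels (assumed GDN form)."""
    out = []
    for i, xi in enumerate(x):
        norm = beta[i] + sum(gamma[i][j] * xj * xj for j, xj in enumerate(x))
        out.append(xi / math.sqrt(norm))
    return out

def output_size(height, width, n):
    """Spatial size after n layers that each halve length and width."""
    return height // 2 ** n, width // 2 ** n
```

For example, with N = 4 a 256 × 256 input would yield 16 × 16 feature maps under these assumptions.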
In a preferred embodiment, the super-prior analysis module is constructed based on a deep convolutional neural network. In one embodiment, the analysis network of the super-prior analysis module comprises six layers: the first layer takes absolute values, which is a pointwise operation; the second layer is a convolutional layer with a 3 × 3 kernel and 192 kernels, and the spatial size of the features is unchanged after this layer; the third layer is an activation layer using the Leaky ReLU function; the fourth layer is a convolutional layer with a 5 × 5 kernel and 192 kernels, and the spatial length and width of the features are halved after the convolution; the fifth layer is an activation layer using the Leaky ReLU function; the last layer is a convolutional layer with a 5 × 5 kernel and 192 kernels, and the spatial length and width of the features are again halved after the convolution.
In a preferred embodiment, the quantization in the entropy coding module is approximated by adding random uniform noise, which makes the coding and decoding process differentiable. In one embodiment, the value range of the uniform noise is [-0.5, 0.5].
In a preferred embodiment, the quantization used in the entropy coding module is scalar quantization with the quantization function $y = \mathrm{round}(x)$, i.e. the input is rounded and the output is the nearest integer.
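The two quantization variants can be compared in a few lines. This is a minimal sketch with hypothetical function names; it assumes the noisy variant is the training-time surrogate and the rounding variant is what the entropy coder sees, which matches the stated advantages but is not spelled out per stage in this embodiment.

```python
import random

def noisy_quantize(values, rng):
    """Approximate quantization: add noise drawn uniformly from [-0.5, 0.5]."""
    return [v + rng.uniform(-0.5, 0.5) for v in values]

def hard_quantize(values):
    """Scalar quantization y = round(x).

    Note: Python's round() sends exact .5 ties to the nearest even integer.
    """
    return [round(v) for v in values]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
x = [0.2, 1.7, -3.4]
x_noisy = noisy_quantize(x, rng)
x_hard = hard_quantize(x)
```

Both variants perturb each value by at most 0.5, but only the additive-noise form has a nonzero derivative with respect to its input, which is why it can stand in for rounding during gradient-based training.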
Example 2:
The deep learning-based image decoding system of the present embodiment corresponds to the image encoding system of the above-described embodiment and includes: an entropy decoding module, a deep learning-based reconstruction module, and a deep learning-based inverse transformation network module. The entropy decoding module is used for entropy decoding the super-prior feature value code stream to obtain a reconstructed super-prior feature value matrix; the deep learning-based reconstruction module is used for obtaining, through training and according to the super-prior feature value matrix, a conditional probability model of the feature coefficients based on a Laplace-distribution super-prior; the entropy decoding module is also used for entropy decoding the feature coefficient code stream according to the conditional probability model to obtain a reconstructed feature coefficient matrix; and the deep learning-based inverse transformation network module is used for passing the reconstructed feature coefficient matrix through an inverse transformation network to reconstruct the image pixel values.
In a preferred embodiment, the entropy decoding module is further configured to perform entropy decoding on the image meta-information code stream to obtain image meta-information; wherein the image meta information includes: the length and width of the image, the model sequence number used by the image.
In a preferred embodiment, the inverse transformation network module is constructed based on a deep convolutional neural network; the inverse transformation network module is symmetrical to the forward transformation network module and comprises N deconvolution layers and N-1 inverse normalization layers; the module starts with a deconvolution layer, and the deconvolution layers and inverse normalization layers alternate. In one embodiment, the convolution kernels of each deconvolution layer are all 5 × 5 in size, the number of convolution kernels is 192, and the spatial length and width of the feature coefficients after each deconvolution are both increased to twice the original size. The inverse normalization layers use the inverse generalized divisive normalization (IGDN) operation.
In a preferred embodiment, the super-prior reconstruction module is constructed based on a deep convolutional neural network, and the super-prior reconstruction module is symmetrical to the super-prior analysis module. In one embodiment, the reconstruction network of the super-prior reconstruction module includes six layers: the first layer is a deconvolution layer with a 5 × 5 kernel and 192 kernels, and the spatial length and width of the features are doubled after the deconvolution; the second layer is an activation layer using the Leaky ReLU function; the third layer is a deconvolution layer with a 5 × 5 kernel and 192 kernels, and the spatial length and width of the features are doubled after the deconvolution; the fourth layer is an activation layer using the Leaky ReLU function; the fifth layer is a deconvolution layer with a 3 × 3 kernel and 192 kernels, and the spatial length and width of the features are unchanged after the deconvolution; the last layer applies the exponential function to each input feature value and outputs the result.
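The symmetry between the super-prior analysis network of Example 1 and this reconstruction network can be checked by tracking spatial sizes through both. The sketch below is illustrative only: the layer labels are hypothetical, each layer is reduced to the factor by which it scales length and width, and the input size is assumed to be divisible by 4.

```python
# (label, scale factor) per layer, following the six-layer descriptions in the text.
ANALYSIS = [
    ("abs", 1), ("conv3x3", 1), ("leaky_relu", 1),
    ("conv5x5", 2), ("leaky_relu", 1), ("conv5x5", 2),
]
RECONSTRUCTION = [
    ("deconv5x5", 2), ("leaky_relu", 1), ("deconv5x5", 2),
    ("leaky_relu", 1), ("deconv3x3", 1), ("exp", 1),
]

def downsample(size, layers):
    """Spatial size after the analysis layers (each factor divides the size)."""
    h, w = size
    for _, factor in layers:
        h, w = h // factor, w // factor
    return h, w

def upsample(size, layers):
    """Spatial size after the reconstruction layers (each factor multiplies the size)."""
    h, w = size
    for _, factor in layers:
        h, w = h * factor, w * factor
    return h, w

feature_size = (16, 24)
super_prior_size = downsample(feature_size, ANALYSIS)
restored_size = upsample(super_prior_size, RECONSTRUCTION)
```

Under these assumptions the super-prior is a quarter of the feature map in each dimension, and the reconstruction network brings it back to the feature map size so that one probability parameter is available per feature coefficient. The final exponential keeps every output positive, which suits a parameter used as a variance.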
Example 3:
The present invention also provides an image coding method based on deep learning, whose flowchart is shown in FIG. 1 and whose block diagram is shown in FIG. 2, and which includes the following steps:
S101: carrying out forward transformation on an input image to obtain a feature coefficient matrix representing image information;
S102: inputting the feature coefficient matrix into the super-prior analysis module, which outputs a super-prior feature value matrix representing the probability of the feature coefficients;
S103: quantizing the super-prior feature value matrix from step S102, and entropy coding the quantized super-prior feature value matrix to obtain a super-prior feature value code stream (namely an auxiliary information code stream);
S104: obtaining, through training and according to the quantized super-prior feature value matrix from step S103, a conditional probability model of the feature coefficients based on a Laplace-distribution super-prior;
S105: quantizing the feature coefficient matrix from step S101, and entropy coding the quantized feature coefficient matrix using the conditional probability model from step S104 to obtain a feature coefficient code stream;
S106: packing and outputting the code stream of the image, which includes: the super-prior feature value code stream from step S103 and the feature coefficient code stream from step S105.
In a preferred embodiment, the following step is further included between step S105 and step S106:
S111: carrying out bypass entropy coding on the image meta-information to obtain an image meta-information code stream, wherein the image meta-information comprises: the length and width of the image, and the model serial number adopted by the image; further, the output code stream of the image in step S106 also includes the image meta-information code stream from step S111. FIG. 3 shows the structure of the output code stream of this embodiment.
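One way to picture the packed output is a small container holding the three code streams. The layout below is purely hypothetical, since the exact structure of FIG. 3 is not reproduced in the text: it stores the meta-information as fixed-width big-endian integers (a simple stand-in for bypass entropy coding), followed by the length of the super-prior stream, then the super-prior and feature coefficient streams.

```python
import struct

# Hypothetical header: width (uint16), height (uint16), model index (uint8),
# super-prior stream length in bytes (uint32); big-endian, no padding.
HEADER = struct.Struct(">HHBI")

def pack_stream(width, height, model_index, super_prior_bytes, feature_bytes):
    """Concatenate meta-information, super-prior stream, and feature stream."""
    header = HEADER.pack(width, height, model_index, len(super_prior_bytes))
    return header + super_prior_bytes + feature_bytes

def unpack_stream(data):
    """Split a packed stream back into its five parts."""
    width, height, model_index, sp_len = HEADER.unpack_from(data, 0)
    start = HEADER.size
    super_prior_bytes = data[start:start + sp_len]
    feature_bytes = data[start + sp_len:]
    return width, height, model_index, super_prior_bytes, feature_bytes

packed = pack_stream(1920, 1080, 3, b"\x01\x02", b"\x0a\x0b\x0c")
```

Placing the meta-information first lets a decoder select the network parameters by model serial number before it touches either entropy-coded stream.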
In a preferred embodiment, the quantization in step S103 and/or step S105 is approximate quantization, performed by adding random uniform noise. In one embodiment, the random uniform noise takes values in the range [-0.5, 0.5].
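The approximate quantization can be sketched as follows. The function name `quantize` and the `training` switch are assumptions for illustration; the text only states that uniform noise in [-0.5, 0.5] is added. Using hard rounding outside of training is the usual practice in learned compression and is assumed here rather than taken from the text.

```python
import numpy as np


def quantize(y, training, rng=None):
    """Approximate quantization by additive uniform noise (training),
    or hard rounding to the nearest integer (assumed for actual coding)."""
    y = np.asarray(y, dtype=float)
    if training:
        if rng is None:
            rng = np.random.default_rng()
        # Noise of width 1 mimics the rounding error while keeping the
        # operation differentiable with respect to y.
        return y + rng.uniform(-0.5, 0.5, size=y.shape)
    return np.round(y)
```

Rounding has zero gradient almost everywhere, so it cannot be trained through directly; the additive noise has the same error range as rounding but leaves the gradient with respect to the input equal to one.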
In a preferred embodiment, step S104 specifically includes: taking minimization of the loss function J = R + λD as the target, where R is the code rate and D is the distortion; MS-SSIM or PSNR is adopted as the measurement index, and the code rate is approximated by the information entropy. The information entropy is computed from the conditional probability function of the characteristic coefficients; the conditional probability density is modeled as a Laplace distribution whose mean is assumed to be 0 and whose variance is given by the trained super-prior conditional probability model.
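A minimal numerical sketch of this loss, under stated assumptions: the rate is estimated as the sum of -log2 p over the quantized coefficients (the usual cross-entropy style estimate, whereas the claims write the entropy as a sum of -p·log2 p), the probability of an integer value is the Laplace mass on the unit-width bin around it, and the distortion is mean squared error as a stand-in for a PSNR-oriented metric. The function names and the probability floor `1e-9` are illustrative choices, not taken from the text.

```python
import numpy as np


def laplace_cdf(x, b):
    """CDF of a zero-mean Laplace distribution with scale b."""
    x = np.asarray(x, dtype=float)
    return 0.5 + 0.5 * np.sign(x) * (1.0 - np.exp(-np.abs(x) / b))


def estimated_bits(y_hat, variance):
    """Estimated code length of integer coefficients y_hat.

    A Laplace distribution with scale b has variance 2*b**2, so the
    predicted variance is converted to a scale first.
    """
    y_hat = np.asarray(y_hat, dtype=float)
    b = np.sqrt(np.asarray(variance, dtype=float) / 2.0)
    # Probability mass of the bin [y_hat - 0.5, y_hat + 0.5].
    p = laplace_cdf(y_hat + 0.5, b) - laplace_cdf(y_hat - 0.5, b)
    p = np.maximum(p, 1e-9)  # assumed floor to avoid log(0)
    return float(np.sum(-np.log2(p)))


def rd_loss(x, x_rec, y_hat, variance, lam):
    """J = R + lambda * D with MSE as the (assumed) distortion."""
    rate = estimated_bits(y_hat, variance)
    diff = np.asarray(x, dtype=float) - np.asarray(x_rec, dtype=float)
    distortion = float(np.mean(diff ** 2))
    return rate + lam * distortion
```

With this model, a coefficient equal to zero costs few bits when the predicted variance is small and more bits when it is large, which is how the super-prior steers the entropy coder.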
Example 4:
The present invention also provides an image decoding method based on deep learning, whose flowchart is shown in fig. 4. The image decoding method corresponds to the image coding method described above and includes the following steps:
S141: entropy decoding the image meta-information, which includes the length and width of the image and the serial number of the model adopted for the image;
S142: according to the model serial number, decoding the super-prior eigenvalue code stream with the corresponding super-prior eigenvalue entropy coding model to obtain a super-prior eigenvalue matrix; and, according to the model serial number, constructing and initializing the corresponding network model (including the network parameters of the adopted deep learning network);
S143: sending the super-prior eigenvalue matrix decoded in step S142 into a super-prior reconstruction module, which outputs the conditional probability of the characteristic coefficients;
S144: decoding the characteristic coefficient code stream using the conditional probability of the characteristic coefficients from step S143 to obtain the characteristic coefficient matrix of the image;
S145: sending the characteristic coefficient matrix of step S144 to an inverse transformation network module to reconstruct the pixel values.
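The order of the decoding steps matters: the characteristic coefficients cannot be entropy decoded until the super-prior eigenvalues have been decoded and turned into conditional probabilities. The skeleton below sketches that dependency chain. The `Model` bundle, the `registry` lookup keyed by model serial number, and the final crop to the stored image size are assumptions for illustration; the four callables are placeholders for the trained components, which the text does not specify in code form.

```python
from dataclasses import dataclass
from typing import Callable, Dict

import numpy as np


@dataclass
class Model:
    """Hypothetical bundle of components selected by the model serial number."""
    decode_hyper: Callable[[bytes], np.ndarray]
    hyper_reconstruct: Callable[[np.ndarray], np.ndarray]
    decode_coeffs: Callable[[bytes, np.ndarray], np.ndarray]
    inverse_transform: Callable[[np.ndarray], np.ndarray]


def decode_image(meta: dict, hyper_stream: bytes, coeff_stream: bytes,
                 registry: Dict[int, Model]) -> np.ndarray:
    """Run steps S142 to S145 on already-parsed meta-information (S141)."""
    # S142: select the model by serial number, then decode the hyperprior.
    model = registry[meta["model_id"]]
    z_hat = model.decode_hyper(hyper_stream)
    # S143: the reconstruction module maps z_hat to per-coefficient
    # probability parameters.
    probs = model.hyper_reconstruct(z_hat)
    # S144: entropy decode the coefficients under those parameters.
    y_hat = model.decode_coeffs(coeff_stream, probs)
    # S145: the inverse transformation network reconstructs pixel values.
    pixels = model.inverse_transform(y_hat)
    # Assumption: the stored length and width are used to crop any padding.
    return pixels[: meta["height"], : meta["width"]]
```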
Example 5:
An image coding terminal comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, performs the image coding method based on deep learning of embodiment 3.
Example 6:
An image decoding terminal comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, performs the image decoding method based on deep learning of embodiment 4.
Example 7:
A computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the image coding method based on deep learning of embodiment 3.
Example 8:
A computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the image decoding method based on deep learning of embodiment 4.
The embodiments of the invention train the codec in an unsupervised manner; the trained codec is stated to outperform various traditional coding standards and to be computationally efficient.
The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, and not to limit the invention. Any modifications and variations within the scope of the description, which may occur to those skilled in the art, are intended to be within the scope of the invention.

Claims (18)

1. An image coding system based on deep learning, comprising:
a forward transformation network module, configured to pass the image through a forward transformation network to obtain feature coefficients representing image information;
a deep-learning-based conditional-probability super-prior analysis module, configured to analyze the feature coefficients to obtain super-prior feature values representing the conditional probability of the feature coefficients;
and an entropy coding module, configured to entropy code the quantized feature coefficients under the guidance of the super-prior conditional probability to obtain a feature coefficient code stream, and further configured to entropy code the quantized super-prior feature values with a conditional probability model whose statistics are collected on a training set, to obtain a super-prior feature value code stream.
2. The deep learning based image coding system of claim 1, wherein the entropy coding module is further configured to perform bypass entropy coding on the image meta-information to obtain an image meta-information code stream;
wherein the image meta information includes: the length and width of the image, and the model number used by the image.
3. The deep learning based image coding system of claim 1, wherein the forward transform network module is constructed based on a deep convolutional neural network;
the forward transformation network module comprises N convolutional layers and N-1 normalization layers, wherein the forward transformation network module starts with a convolutional layer, and the convolutional layers and the normalization layers alternate.
4. The deep learning based image coding system according to claim 1, wherein the super-prior analysis module is constructed based on a deep convolutional neural network;
the analysis network of the super-prior analysis module comprises six layers; the first layer is an absolute value operation layer, the second layer is a convolution layer, the third layer is an activation layer, the fourth layer is a convolution layer, the fifth layer is an activation layer, and the sixth layer is a convolution layer.
5. The deep learning based image coding system of claim 1, wherein the quantization in the entropy coding module is approximate quantization performed by adding random uniform noise.
6. An image decoding system based on deep learning, which is an image decoding system corresponding to the image encoding system of any one of claims 1 to 5, comprising:
an entropy decoding module, configured to entropy decode the super-prior eigenvalue code stream to obtain a reconstructed super-prior eigenvalue matrix;
a deep-learning-based super-prior reconstruction module, configured to obtain, by training according to the super-prior eigenvalue matrix, a conditional probability model of the characteristic coefficients based on a Laplace-distribution super-prior; wherein the entropy decoding module is further configured to entropy decode the characteristic coefficient code stream according to the conditional probability model to obtain a reconstructed characteristic coefficient matrix;
and a deep-learning-based inverse transformation network module, configured to pass the reconstructed characteristic coefficient matrix through an inverse transformation network to reconstruct the image pixel values.
7. The deep learning based image decoding system of claim 6, wherein the entropy decoding module is further configured to perform entropy decoding on the image meta-information code stream to obtain image meta-information;
wherein the image meta information includes: the length and width of the image, and the model number used by the image.
8. The deep learning based image decoding system of claim 6, wherein the inverse transform network module is constructed based on a deep convolutional neural network;
the inverse transformation network module and the forward transformation network module are in a symmetrical structure;
the inverse transformation network module comprises N deconvolution layers and N-1 inverse normalization layers, wherein the inverse transformation network module starts with a deconvolution layer, and the deconvolution layers and the inverse normalization layers alternate.
9. The deep learning based image decoding system of claim 6, wherein the super-prior reconstruction module is constructed based on a deep convolutional neural network;
the super-prior reconstruction module and the super-prior analysis module are in a symmetrical structure;
the reconstruction network of the super-prior reconstruction module also comprises six layers, wherein the first layer is a deconvolution layer, the second layer is an activation layer, the third layer is a deconvolution layer, the fourth layer is an activation layer, the fifth layer is a deconvolution layer, and the sixth layer is an output layer that applies an exponential function to each input characteristic value.
10. An image coding method based on deep learning, comprising:
S101: performing a forward transformation on an input image to obtain a characteristic coefficient matrix representing the image information;
S102: inputting the characteristic coefficient matrix into a super-prior analysis module, and outputting a super-prior eigenvalue matrix representing the probability of the characteristic coefficients;
S103: quantizing the super-prior eigenvalue matrix of S102, and entropy coding the quantized super-prior eigenvalue matrix to obtain a super-prior eigenvalue code stream;
S104: training according to the quantized super-prior eigenvalue matrix of S103 to obtain a conditional probability model of the characteristic coefficients based on a Laplace-distribution super-prior;
S105: quantizing the characteristic coefficient matrix of S101, and entropy coding the quantized characteristic coefficient matrix using the conditional probability model of S104 to obtain a characteristic coefficient code stream;
S106: packing and outputting the code stream of the image, which includes: the super-prior eigenvalue code stream of S103 and the characteristic coefficient code stream of S105.
11. The deep learning based image coding method of claim 10, further comprising, between S105 and S106:
S111: performing bypass entropy coding on the image meta-information to obtain an image meta-information code stream, wherein the image meta-information comprises: the length and width of the image, and the serial number of the model adopted for the image; further,
the code stream of the image output in S106 further includes: the image meta-information code stream of S111.
12. The deep learning based image coding method according to claim 10, wherein the quantization in S103 and/or S105 is approximate quantization, performed by adding random uniform noise.
13. The deep learning based image coding method according to claim 10, wherein the S104 comprises:
taking minimization of the loss function J = R + λD as the target, adopting MS-SSIM or PSNR as the measurement index, and approximating the code rate by the information entropy; wherein:
the information entropy is obtained from the conditional probability function of the characteristic coefficients, namely n = Σ(-p·log2(p));
the conditional probability density is modeled as a Laplace distribution whose mean is assumed to be 0 and whose variance is given by the trained super-prior conditional probability model;
where R is the code rate, D is the distortion, n is the information entropy, and p is the conditional probability function.
14. An image decoding method based on deep learning, which is an image decoding method corresponding to the image encoding method according to any one of claims 10 to 13, comprising:
S141: entropy decoding the image meta-information, which includes the length and width of the image and the serial number of the model adopted for the image;
S142: according to the model serial number, decoding the super-prior eigenvalue code stream with the corresponding super-prior eigenvalue entropy coding model to obtain a super-prior eigenvalue matrix; and constructing and initializing the corresponding network model according to the model serial number;
S143: sending the super-prior eigenvalue matrix decoded in S142 into a super-prior reconstruction module, which outputs the conditional probability of the characteristic coefficients;
S144: decoding the characteristic coefficient code stream using the conditional probability of the characteristic coefficients from S143 to obtain the characteristic coefficient matrix of the image;
S145: sending the characteristic coefficient matrix of S144 to an inverse transformation network module to reconstruct the pixel values.
15. An image encoding terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor is operable to execute the image encoding method based on deep learning according to any one of claims 10 to 13 when executing the program.
16. An image decoding terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor is operable to execute the method of claim 14 when executing the program.
17. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method for deep learning based image encoding according to any one of claims 10 to 13.
18. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, is adapted to perform the deep learning based image decoding method as set forth in claim 14.
CN201910705904.7A 2019-08-01 2019-08-01 Image coding and decoding system and method based on deep learning Pending CN110602494A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910705904.7A CN110602494A (en) 2019-08-01 2019-08-01 Image coding and decoding system and method based on deep learning

Publications (1)

Publication Number Publication Date
CN110602494A true CN110602494A (en) 2019-12-20

Family

ID=68853337

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910705904.7A Pending CN110602494A (en) 2019-08-01 2019-08-01 Image coding and decoding system and method based on deep learning

Country Status (1)

Country Link
CN (1) CN110602494A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111524114A (en) * 2020-04-17 2020-08-11 哈尔滨理工大学 Steel plate surface defect detection method based on deep learning
CN112019865A (en) * 2020-07-26 2020-12-01 杭州皮克皮克科技有限公司 Cross-platform entropy coding method and decoding method for deep learning coding
CN112203093A (en) * 2020-10-12 2021-01-08 苏州天必佑科技有限公司 Signal processing method based on deep neural network
CN113259676A (en) * 2020-02-10 2021-08-13 北京大学 Image compression method and device based on deep learning
CN113766237A (en) * 2021-09-30 2021-12-07 咪咕文化科技有限公司 Encoding method, decoding method, device, equipment and readable storage medium
CN113949880A (en) * 2021-09-02 2022-01-18 北京大学 Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method
WO2022028197A1 (en) * 2020-08-06 2022-02-10 华为技术有限公司 Image processing method and device thereof
CN114449276A (en) * 2022-01-06 2022-05-06 北京工业大学 Super-prior side information compensation image compression method based on learning
CN114554205A (en) * 2020-11-26 2022-05-27 华为技术有限公司 Image coding and decoding method and device
WO2022253088A1 (en) * 2021-05-29 2022-12-08 华为技术有限公司 Encoding method and apparatus, decoding method and apparatus, device, storage medium, and computer program and product
WO2023169303A1 (en) * 2022-03-10 2023-09-14 华为技术有限公司 Encoding and decoding method and apparatus, device, storage medium, and computer program product
EP4207766A4 (en) * 2020-09-30 2024-03-06 Huawei Technologies Co., Ltd. Entropy encoding/decoding method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101159136A (en) * 2007-11-13 2008-04-09 中国传媒大学 Low bit rate music signal coding method
CN101771868A (en) * 2008-12-31 2010-07-07 华为技术有限公司 Method and device for processing images in quantification
JP2013167698A (en) * 2012-02-14 2013-08-29 Nippon Telegr & Teleph Corp <Ntt> Apparatus and method for estimating spectral shape feature quantity of signal for every sound source, and apparatus, method and program for estimating spectral feature quantity of target signal
CN106991411A (en) * 2017-04-17 2017-07-28 中国科学院电子学研究所 Remote Sensing Target based on depth shape priori becomes more meticulous extracting method
CN107121926A (en) * 2017-05-08 2017-09-01 广东产品质量监督检验研究院 A kind of industrial robot Reliability Modeling based on deep learning
CN109543745A (en) * 2018-11-20 2019-03-29 江南大学 Feature learning method and image-recognizing method based on condition confrontation autoencoder network
CN109889839A (en) * 2019-03-27 2019-06-14 上海交通大学 ROI Image Coding, decoding system and method based on deep learning
CN109996071A (en) * 2019-03-27 2019-07-09 上海交通大学 Variable bit rate image coding, decoding system and method based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JOHANNES BALLÉ ET AL: "VARIATIONAL IMAGE COMPRESSION WITH A SCALE HYPERPRIOR", 《ARXIV:1802.01436V2》 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113259676B (en) * 2020-02-10 2023-01-17 北京大学 Image compression method and device based on deep learning
CN113259676A (en) * 2020-02-10 2021-08-13 北京大学 Image compression method and device based on deep learning
CN111524114A (en) * 2020-04-17 2020-08-11 哈尔滨理工大学 Steel plate surface defect detection method based on deep learning
CN112019865A (en) * 2020-07-26 2020-12-01 杭州皮克皮克科技有限公司 Cross-platform entropy coding method and decoding method for deep learning coding
WO2022028197A1 (en) * 2020-08-06 2022-02-10 华为技术有限公司 Image processing method and device thereof
CN114071141A (en) * 2020-08-06 2022-02-18 华为技术有限公司 Image processing method and equipment
EP4207766A4 (en) * 2020-09-30 2024-03-06 Huawei Technologies Co., Ltd. Entropy encoding/decoding method and device
JP7500873B2 (en) 2020-09-30 2024-06-17 華為技術有限公司 Entropy encoding/decoding method and apparatus
CN112203093B (en) * 2020-10-12 2022-07-01 苏州天必佑科技有限公司 Signal processing method based on deep neural network
CN112203093A (en) * 2020-10-12 2021-01-08 苏州天必佑科技有限公司 Signal processing method based on deep neural network
CN114554205A (en) * 2020-11-26 2022-05-27 华为技术有限公司 Image coding and decoding method and device
CN114554205B (en) * 2020-11-26 2023-03-10 华为技术有限公司 Image encoding and decoding method and device
WO2022253088A1 (en) * 2021-05-29 2022-12-08 华为技术有限公司 Encoding method and apparatus, decoding method and apparatus, device, storage medium, and computer program and product
CN113949880A (en) * 2021-09-02 2022-01-18 北京大学 Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method
CN113766237A (en) * 2021-09-30 2021-12-07 咪咕文化科技有限公司 Encoding method, decoding method, device, equipment and readable storage medium
CN114449276B (en) * 2022-01-06 2024-04-02 北京工业大学 Super prior side information compensation image compression method based on learning
CN114449276A (en) * 2022-01-06 2022-05-06 北京工业大学 Super-prior side information compensation image compression method based on learning
WO2023169303A1 (en) * 2022-03-10 2023-09-14 华为技术有限公司 Encoding and decoding method and apparatus, device, storage medium, and computer program product

Similar Documents

Publication Publication Date Title
CN110602494A (en) Image coding and decoding system and method based on deep learning
CN109996071B (en) Variable code rate image coding and decoding system and method based on deep learning
Hu et al. Learning end-to-end lossy image compression: A benchmark
CN109889839B (en) Region-of-interest image coding and decoding system and method based on deep learning
CN112203093B (en) Signal processing method based on deep neural network
CN111641832B (en) Encoding method, decoding method, device, electronic device and storage medium
CN114449276B (en) Super prior side information compensation image compression method based on learning
Zhang et al. Lossless image compression using a multi-scale progressive statistical model
CN113822147A (en) Deep compression method for semantic task of cooperative machine
CN111246206A (en) Optical flow information compression method and device based on self-encoder
CN113132735A (en) Video coding method based on video frame generation
Li et al. Multiple description coding based on convolutional auto-encoder
CN117354523A (en) Image coding, decoding and compressing method for frequency domain feature perception learning
CN111050170A (en) Image compression system construction method, compression system and method based on GAN
CN114792347A (en) Image compression method based on multi-scale space and context information fusion
CN112702600B (en) Image coding and decoding neural network layered fixed-point method
Yadav et al. Flow-MotionNet: A neural network based video compression architecture
CN112188217A (en) JPEG compressed image decompression effect removing method combining DCT domain and pixel domain learning
CN111080729B (en) Training picture compression network construction method and system based on Attention mechanism
AU2002366676A1 (en) Method, apparatus and software for lossy data compression and function estimation
CN114422802B (en) Self-encoder image compression method based on codebook
CN111107377A (en) Depth image compression method, device, equipment and storage medium
Gao et al. Volumetric end-to-end optimized compression for brain images
Li et al. 3D tensor auto-encoder with application to video compression
Sun et al. Hlic: Harmonizing optimization metrics in learned image compression by reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191220