CN110956671A - Image compression method based on multi-scale feature coding - Google Patents

Image compression method based on multi-scale feature coding

Info

Publication number
CN110956671A
CN110956671A
Authority
CN
China
Prior art keywords
image
resolution
layer
convolutional
size
Prior art date
Legal status
Granted
Application number
CN201911290877.8A
Other languages
Chinese (zh)
Other versions
CN110956671B (en)
Inventor
吴庆波
吴晨豪
李宏亮
孟凡满
许林峰
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201911290877.8A
Publication of CN110956671A
Application granted
Publication of CN110956671B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00 Image coding
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4046 Scaling of whole images or parts thereof using neural networks
    • G06T 3/4053 Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses an image compression method based on multi-scale feature coding. A selection vector is obtained by averaging the absolute values of the gradient maps of the image features over a training set, and this selection vector guides the features of different channels to select their coding resolution. At the decoding end, the features coded at low resolution are restored by a super-resolution network and finally recombined with the features coded at high resolution into a complete feature map, which is mapped back to the original image. The invention processes image features differentially according to their characteristics: features that are easy to recover from context information are transmitted at low resolution, saving bit rate, while complex, fine features are transmitted at high resolution, reducing the degree of loss.

Description

Image compression method based on multi-scale feature coding
Technical Field
The invention belongs to the technical field of image compression, and particularly relates to a design of an image compression method based on multi-scale feature coding.
Background
At present, many deep-learning-based methods have appeared in the field of image compression. For example, the feature extraction capability of a convolutional neural network is used to map an image to a feature space, the resulting features are quantized and entropy coded, and at the decoding end the code stream is entropy decoded and the features are mapped back to the original image with transposed convolutions. However, the features of different channels differ in complexity, and processing them identically wastes a large amount of bit rate on smooth features while damaging the fineness of complex features.
Disclosure of Invention
The invention aims to provide an image compression method based on multi-scale feature coding that processes image features differentially according to their characteristics: features that are easy to recover from context information are transmitted at low resolution, saving bit rate, while complex, fine features are transmitted at high resolution, reducing the degree of loss.
The technical scheme of the invention is as follows: an image compression method based on multi-scale feature coding comprises the following steps (a minimal end-to-end sketch follows this list):
S1: perform feature extraction on the input image to obtain image features.
S2: select channels according to the image features to obtain a high-resolution feature channel and a low-resolution feature channel.
S3: encode and decode the image features in the high-resolution feature channel and in the low-resolution feature channel respectively, obtaining first high-resolution image features and low-resolution image features.
S4: input the low-resolution image features into a super-resolution network for recovery, obtaining second high-resolution image features.
S5: synthesize the output image from the first high-resolution image features and the second high-resolution image features.
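For orientation, the following toy walk-through shows only the data flow of steps S1 to S5 in PyTorch. Every component is a deliberately crude stand-in (one random convolution instead of the learned GDN analysis transform, bilinear upsampling instead of the super-resolution network, 8 feature channels, and no hyperprior model or arithmetic coding), so it is an illustrative assumption rather than the claimed method:

```python
import torch
import torch.nn.functional as F

def compress_decompress(image: torch.Tensor) -> torch.Tensor:
    """Toy end-to-end walk-through of steps S1-S5 (data flow only)."""
    # S1: feature extraction (stand-in: one 5x5 stride-2 convolution instead of four GDN layers).
    weight = torch.randn(8, 3, 5, 5) * 0.1
    features = F.conv2d(image, weight, stride=2, padding=2)          # (N, 8, H/2, W/2)

    # S2: channel selection by mean absolute gradient (complex half vs. smooth half).
    grad = features[..., 1:] - features[..., :-1]                    # crude horizontal gradient
    complexity = grad.abs().mean(dim=(0, 2, 3))
    order = torch.argsort(complexity, descending=True)
    hi_idx, lo_idx = order[:4], order[4:]

    # S3: "encode/decode": high-resolution channels are kept at full resolution,
    # low-resolution channels are downsampled by 2; quantization/entropy coding omitted.
    first_hi = torch.round(features[:, hi_idx])
    lo = torch.round(F.avg_pool2d(features[:, lo_idx], 2))

    # S4: super-resolution recovery of the low-resolution channels (stand-in: bilinear upsampling).
    second_hi = F.interpolate(lo, scale_factor=2, mode="bilinear", align_corners=False)

    # S5: recombine both groups into a complete feature map and map it back to an image
    # (stand-in: one transposed convolution instead of four IGDN layers).
    full = torch.empty_like(features)
    full[:, hi_idx], full[:, lo_idx] = first_hi, second_hi
    weight_t = torch.randn(8, 3, 5, 5) * 0.1
    return F.conv_transpose2d(full, weight_t, stride=2, padding=2, output_padding=1)

print(compress_decompress(torch.rand(1, 3, 64, 64)).shape)           # torch.Size([1, 3, 64, 64])
```

The remainder of the disclosure replaces each stand-in with the learned component it abbreviates.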
Further, step S1 is specifically: at the image encoding end, perform feature extraction on the input image through 4 sequentially connected downsampling convolutional layers to obtain the image features; the convolution kernel size of each downsampling convolutional layer is 5 × 5, the stride is 2, and the activation function is a GDN function.
Further, step S2 includes the following substeps:
and S21, extracting the characteristic spectrum of the image characteristic, and calculating by using a Sobel gradient operator to obtain the gradient spectrum.
And S22, averaging the absolute values of the gradient spectrum of each characteristic channel to obtain a one-dimensional vector for describing the complexity of the characteristic channel.
And S23, setting the channel corresponding to the half one-dimensional vector with larger complexity as a high-resolution characteristic channel, and setting the channel corresponding to the half one-dimensional vector with smaller complexity as a low-resolution characteristic channel, wherein the channel corresponding to the half one-dimensional vector with larger complexity is set as a 1.
Further, the specific method for encoding and decoding the image features in the high-resolution feature channel in step S3 is as follows:
and A1, quantizing the image features in the high-resolution feature channel.
And A2, estimating the probability distribution of the quantized image features through a hyper-pilot network.
And A3, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream.
And A4, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain a first high-resolution image characteristic.
The specific method for encoding and decoding the image features in the low-resolution feature channel in step S3 is as follows:
and B1, down-sampling the image features in the low-resolution feature channel.
And B2, quantizing the image features after down sampling.
And B3, estimating the probability distribution of the quantized image features through a hyper-pilot network.
And B4, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream.
And B5, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain the low-resolution image characteristics.
Further, step A1 is specifically: during training, the quantization result is approximated by adding uniform noise to the image features in the high-resolution feature channel; during testing, the image features in the high-resolution feature channel are quantized by rounding.
Step B2 is specifically: during training, the quantization result is approximated by adding uniform noise to the downsampled image features; during testing, the downsampled image features are quantized by rounding.
Further, in step B1, the image features in the low-resolution feature channel are downsampled by a downsampling convolutional layer; the convolution kernel size of the downsampling convolutional layer is 5 × 5, the stride is 2, and the activation function is a GDN function.
Further, the hyperprior network extracts variances from the image features as side information, and the side information is encoded and decoded with a fixed probability distribution.
The encoding end of the hyperprior network comprises three sequentially connected convolutional layers; the convolution kernel size of each convolutional layer is 5 × 5, the stride is 2, and the activation function is a ReLU function.
The decoding end of the hyperprior network comprises three sequentially connected transposed convolutional layers; the convolution kernel size of each transposed convolutional layer is 5 × 5, the stride is 2, and the activation function is a ReLU function.
Further, the super-resolution network in step S4 comprises a first convolutional layer, a GDN function, a second convolutional layer, a first concatenation layer, a third convolutional layer, a GDN function, a fourth convolutional layer, a GDN function, a first residual block, a second residual block, a fifth convolutional layer, a GDN function, a second concatenation layer, and a transposed convolutional layer, connected in sequence.
The first high-resolution image features are fed to the input end of the first convolutional layer, the low-resolution image features are fed to the input end of the first concatenation layer, the input end of the second concatenation layer is also connected with the output end of the first concatenation layer, and the output end of the transposed convolutional layer outputs the second high-resolution image features.
The number of filters of the first convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
The number of filters of the second convolutional layer is 96, the convolution kernel size is 1 × 1, and it performs 2× downsampling.
The number of filters of the third convolutional layer is 384, the convolution kernel size is 3 × 3, and the sampling factor is 1.
The number of filters of the fourth convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
The number of filters of the fifth convolutional layer is 384, the convolution kernel size is 1 × 1, and the sampling factor is 1.
The number of filters of the transposed convolutional layer is 96, the convolution kernel size is 3 × 3, and it performs 2× upsampling.
The first residual block and the second residual block have the same structure: each comprises a sixth convolutional layer, a GDN function, a seventh convolutional layer, and an adder connected in sequence; the input end of the sixth convolutional layer serves as the input end of the residual block and is also connected with an input end of the adder, and the output end of the adder serves as the output end of the residual block.
The number of filters of the sixth convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
The number of filters of the seventh convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
Further, during training the super-resolution network in step S4 is mainly constrained by a rate-distortion loss, with a super-resolution loss as an auxiliary constraint.
The rate-distortion loss is calculated as:
L = R + λD
wherein L represents the rate-distortion loss, R represents the bit rate, D represents the distortion, and λ is a weight.
The super-resolution loss is the mean square error between the image features of the two feature channels at the image encoding end, taken before downsampling, and the corresponding image features after super-resolution at the image decoding end.
Further, in step S5, the first high-resolution image features and the second high-resolution image features are synthesized into the output image through 4 transposed convolutional layers; each transposed convolutional layer has a 5 × 5 convolution kernel, and the activation function is an IGDN function, the inverse of the GDN function.
The beneficial effects of the invention are as follows:
(1) The invention encodes the image features of different channels at different resolutions, so that the bit rate allocated to each group of features matches its degree of fineness.
(2) The invention processes the low-resolution-coded features at the decoding end through a super-resolution network, making full use of the capability of neural networks in image restoration; lost information is inferred from the context, which reduces the degree of image loss.
Drawings
Fig. 1 is a flowchart of an image compression method based on multi-scale feature coding according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a super-resolution network structure according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It is to be understood that the embodiments shown and described in the drawings are merely exemplary and are intended to illustrate the principles and spirit of the invention, not to limit the scope of the invention.
The embodiment of the invention provides an image compression method based on multi-scale feature coding. As shown in Fig. 1, the method comprises the following steps S1 to S5.
S1: perform feature extraction on the input image to obtain image features.
In the embodiment of the invention, the image encoding end performs feature extraction on the input image through 4 sequentially connected downsampling convolutional layers to obtain the image features. Each downsampling convolutional layer has a 5 × 5 convolution kernel, a stride of 2, and a GDN (Generalized Divisive Normalization) activation function.
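A minimal PyTorch-style sketch of such a four-layer analysis transform is given below; the output channel count of 192 and the simplified GDN implementation (without the positivity reparameterization a production implementation would use) are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGDN(nn.Module):
    """Simplified GDN: y_c = x_c / sqrt(beta_c + sum_k gamma_{c,k} * x_k^2).
    Real implementations constrain beta and gamma to remain positive during training."""
    def __init__(self, channels: int):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(channels))
        self.gamma = nn.Parameter(0.1 * torch.eye(channels))

    def forward(self, x):
        # gamma acts as a 1x1 convolution over the squared activations.
        norm = F.conv2d(x * x, self.gamma.view(*self.gamma.shape, 1, 1), self.beta)
        return x * torch.rsqrt(norm)

class AnalysisTransform(nn.Module):
    """Step S1: four 5x5, stride-2 downsampling convolutions, each followed by GDN."""
    def __init__(self, in_ch: int = 3, num_features: int = 192):
        super().__init__()
        layers, ch = [], in_ch
        for _ in range(4):
            layers += [nn.Conv2d(ch, num_features, kernel_size=5, stride=2, padding=2),
                       SimpleGDN(num_features)]
            ch = num_features
        self.net = nn.Sequential(*layers)

    def forward(self, img):
        return self.net(img)                 # 16x total spatial downsampling

features = AnalysisTransform()(torch.rand(1, 3, 256, 256))
print(features.shape)                        # torch.Size([1, 192, 16, 16])
```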
S2: select channels according to the image features to obtain a high-resolution feature channel and a low-resolution feature channel.
Step S2 includes the following substeps S21 to S23 (a sketch of the channel selection follows the list):
S21: extract the feature map of the image features and compute its gradient map with a Sobel gradient operator.
S22: average the absolute values of the gradient map of each feature channel to obtain a one-dimensional vector describing the complexity of the feature channels. In the embodiment of the invention, the larger a value in the one-dimensional vector, the higher the complexity of the corresponding feature channel.
S23: set the channels corresponding to the half of the one-dimensional vector with larger values (higher complexity) as the high-resolution feature channel, and the channels corresponding to the half with smaller values (lower complexity) as the low-resolution feature channel.
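A sketch of the channel selection under these substeps, assuming PyTorch feature maps. Note that the patent computes the selection vector over a training set, whereas this simplified version works on a single batch, and combining the two Sobel directions by summing their absolute responses is an assumption:

```python
import torch
import torch.nn.functional as F

def select_channels(features: torch.Tensor):
    """Steps S21-S23: split feature channels into high- and low-resolution groups
    according to the mean absolute value of their Sobel gradient maps."""
    c = features.shape[1]
    # Depthwise 3x3 Sobel filtering, one kernel copy per channel (S21).
    sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    sobel_y = sobel_x.transpose(2, 3)
    gx = F.conv2d(features, sobel_x.repeat(c, 1, 1, 1), padding=1, groups=c)
    gy = F.conv2d(features, sobel_y.repeat(c, 1, 1, 1), padding=1, groups=c)
    # S22: mean absolute gradient per channel gives the one-dimensional complexity vector.
    complexity = (gx.abs() + gy.abs()).mean(dim=(0, 2, 3))          # shape (C,)
    # S23: the more complex half is coded at high resolution, the smoother half at low resolution.
    order = torch.argsort(complexity, descending=True)
    return order[: c // 2], order[c // 2:]

# high_idx, low_idx = select_channels(features)
```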
S3: encode and decode the image features in the high-resolution feature channel and in the low-resolution feature channel respectively, obtaining first high-resolution image features and low-resolution image features.
The image features in the high-resolution feature channel are encoded and decoded as follows:
A1: quantize the image features in the high-resolution feature channel.
In the embodiment of the invention, quantization cannot be back-propagated through, so an alternative is used during training: the quantization result is approximated by adding uniform noise to the image features in the high-resolution feature channel; during testing, the image features in the high-resolution feature channel are quantized by rounding.
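A minimal sketch of this train/test quantizer, assuming PyTorch tensors:

```python
import torch

def quantize(y: torch.Tensor, training: bool) -> torch.Tensor:
    """Steps A1 / B2: additive uniform noise in [-0.5, 0.5) approximates rounding during
    training (so gradients can flow); actual rounding is used at test time."""
    if training:
        return y + torch.empty_like(y).uniform_(-0.5, 0.5)
    return torch.round(y)
```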
A2: estimate the probability distribution of the quantized image features through a hyperprior network.
A3: arithmetically encode the quantized image features according to the probability distribution to obtain a binary code stream.
A4: arithmetically decode the binary code stream according to the probability distribution to obtain the first high-resolution image features.
The image features in the low-resolution feature channel are encoded and decoded as follows:
B1: downsample the image features in the low-resolution feature channel.
In the embodiment of the invention, the image features in the low-resolution feature channel are downsampled through a downsampling convolutional layer with a 5 × 5 convolution kernel, a stride of 2, and a GDN activation function.
B2: quantize the downsampled image features.
In the embodiment of the invention, quantization cannot be back-propagated through, so during training the quantization result is approximated by adding uniform noise to the downsampled image features, and during testing the downsampled image features are quantized by rounding (the same quantizer sketched above).
B3: estimate the probability distribution of the quantized image features through the hyperprior network.
B4: arithmetically encode the quantized image features according to the probability distribution to obtain a binary code stream.
B5: arithmetically decode the binary code stream according to the probability distribution to obtain the low-resolution image features.
In the embodiment of the invention, the coding and decoding part uses arithmetic coding, which requires a probability distribution shared by the encoder and the decoder. Here the distribution is modeled as a zero-mean Gaussian mixture model, and a hyperprior network extracts the variance from the features as side information. The arithmetic encoders of the two feature-channel branches encode the features into binary code streams according to the same probability distribution parameters, and the arithmetic decoders decode them back to the original features. The side information of the hyperprior network is encoded and decoded with a fixed probability distribution.
The encoding end of the hyperprior network comprises three sequentially connected convolutional layers; each convolutional layer has a 5 × 5 convolution kernel, a stride of 2, and a ReLU activation function. The decoding end of the hyperprior network comprises three sequentially connected transposed convolutional layers corresponding to the convolutional layers at the encoding end; each transposed convolutional layer has a 5 × 5 convolution kernel, a stride of 2, and a ReLU activation function.
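A sketch of such a hyperprior network is shown below. The channel counts, the use of the decoder output as the scale (standard deviation) of a single zero-mean Gaussian per element, and the small epsilon keeping the scale positive are illustrative assumptions; the description above only fixes the layer shapes, activations, and the role of the side information:

```python
import torch
import torch.nn as nn

class Hyperprior(nn.Module):
    """Encoder: three 5x5, stride-2 convolutions with ReLU; decoder: three matching
    transposed convolutions with ReLU. Maps the image features to side information z
    and predicts the scale of the zero-mean Gaussian used by the arithmetic coder."""
    def __init__(self, ch: int = 192):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(ch, ch, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 5, stride=2, padding=2), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, ch, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, ch, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
        )

    def forward(self, y: torch.Tensor):
        z = self.encoder(y)              # side information, coded with a fixed distribution
        sigma = self.decoder(z) + 1e-6   # predicted Gaussian scale for each element of y
        return z, sigma
```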
S4: input the low-resolution image features into a super-resolution network for recovery, obtaining second high-resolution image features.
As shown in Fig. 2, in the embodiment of the invention the super-resolution network comprises a first convolutional layer, a GDN function, a second convolutional layer, a first concatenation layer, a third convolutional layer, a GDN function, a fourth convolutional layer, a GDN function, a first residual block, a second residual block, a fifth convolutional layer, a GDN function, a second concatenation layer, and a transposed convolutional layer, connected in sequence (a sketch of this network follows the layer list below).
Since there is a certain correlation between the low-resolution image features and the first high-resolution image features, the first high-resolution image features are also used as an input when super-resolving the low-resolution image features. In the embodiment of the invention, the first high-resolution image features are fed to the input end of the first convolutional layer, the low-resolution image features are fed to the input end of the first concatenation layer, the input end of the second concatenation layer is also connected with the output end of the first concatenation layer, and the output end of the transposed convolutional layer outputs the second high-resolution image features.
The number of filters of the first convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
The number of filters of the second convolutional layer is 96, the convolution kernel size is 1 × 1, and it performs 2× downsampling.
The number of filters of the third convolutional layer is 384, the convolution kernel size is 3 × 3, and the sampling factor is 1.
The number of filters of the fourth convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
The number of filters of the fifth convolutional layer is 384, the convolution kernel size is 1 × 1, and the sampling factor is 1.
The number of filters of the transposed convolutional layer is 96, the convolution kernel size is 3 × 3, and it performs 2× upsampling.
The first residual block and the second residual block have the same structure: each comprises a sixth convolutional layer, a GDN function, a seventh convolutional layer, and an adder connected in sequence; the input end of the sixth convolutional layer serves as the input end of the residual block and is also connected with an input end of the adder, and the output end of the adder serves as the output end of the residual block.
The number of filters of the sixth convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
The number of filters of the seventh convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
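A PyTorch-style sketch of this super-resolution network is given below. It reuses the SimpleGDN module from the analysis-transform sketch in step S1; the assumption that the 192 encoder channels are split into two groups of 96 (which fixes the input channel counts and the concatenation widths) is illustrative and not stated explicitly in the text:

```python
import torch
import torch.nn as nn
# SimpleGDN is the simplified GDN module defined in the analysis-transform sketch (step S1).

class ResidualBlock(nn.Module):
    """Conv -> GDN -> Conv with a skip connection (the first/second residual blocks)."""
    def __init__(self, ch: int = 192):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), SimpleGDN(ch),
                                  nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class SuperResolutionNet(nn.Module):
    """Step S4 network; layer widths follow the description above."""
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(                       # processes the first high-resolution features
            nn.Conv2d(96, 192, 3, padding=1), SimpleGDN(192),
            nn.Conv2d(192, 96, 1, stride=2),             # second conv layer: 1x1, 2x downsampling
        )
        self.body = nn.Sequential(                       # after the first concatenation layer
            nn.Conv2d(96 + 96, 384, 3, padding=1), SimpleGDN(384),
            nn.Conv2d(384, 192, 3, padding=1), SimpleGDN(192),
            ResidualBlock(192), ResidualBlock(192),
            nn.Conv2d(192, 384, 1), SimpleGDN(384),
        )
        self.tail = nn.ConvTranspose2d(384 + 192, 96, 3, stride=2,
                                       padding=1, output_padding=1)   # 2x upsampling

    def forward(self, first_high_res, low_res):
        h = self.head(first_high_res)                    # down to the low-resolution grid
        cat1 = torch.cat([h, low_res], dim=1)            # first concatenation layer
        cat2 = torch.cat([self.body(cat1), cat1], dim=1) # second concatenation layer (skip from cat1)
        return self.tail(cat2)                           # second high-resolution features

# second_high_res = SuperResolutionNet()(first_high_res, low_res)
```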
In the embodiment of the invention, the super-resolution network is trained mainly under a rate-distortion loss, with a super-resolution loss as an auxiliary constraint.
The rate-distortion loss is a weighted combination of the bit rate and the distortion:
L = R + λD
wherein L is the rate-distortion loss; R is the bit rate, taken directly as the information entropy under the current probability distribution after entropy coding; D is the distortion, measured in the embodiment of the invention as the mean square error (MSE) between the original image and the decoded image; and λ is a weight, set manually in the embodiment of the invention, whose value controls the compression ratio of the image.
The super-resolution loss is the mean square error between the image features of the two feature channels at the image encoding end, taken before downsampling, and the corresponding image features after super-resolution at the image decoding end.
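A sketch of both training losses, assuming the quantized features y_hat, the Gaussian scales sigma from the hyperprior sketch, the original and decoded images x and x_hat, and the encoder-side features before downsampling are available; the λ value and the way the rate is estimated (integrating the Gaussian density over each quantization bin) are illustrative choices:

```python
import torch
import torch.nn.functional as F

def rate_distortion_loss(y_hat, sigma, x, x_hat, lam=0.01):
    """L = R + lambda * D: R is the information entropy (in bits) of the quantized
    features under the zero-mean Gaussian model, D is the MSE between the original
    and decoded images, and lam trades off bit rate against distortion."""
    gaussian = torch.distributions.Normal(0.0, sigma)
    # Probability mass of each quantized value over its bin [y_hat - 0.5, y_hat + 0.5].
    p = gaussian.cdf(y_hat + 0.5) - gaussian.cdf(y_hat - 0.5)
    rate = -torch.log2(p.clamp(min=1e-9)).sum() / x.shape[0]        # bits per image
    distortion = F.mse_loss(x_hat, x)
    return rate + lam * distortion

def super_resolution_loss(features_before_downsampling, super_resolved_features):
    """Auxiliary constraint: MSE between the encoder-side features (before downsampling)
    and the decoder-side super-resolved features."""
    return F.mse_loss(super_resolved_features, features_before_downsampling)
```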
S5: synthesize the output image from the first high-resolution image features and the second high-resolution image features.
In the embodiment of the invention, the first high-resolution image features and the second high-resolution image features are synthesized into the output image through 4 transposed convolutional layers; each transposed convolutional layer has a 5 × 5 convolution kernel, and the activation function is IGDN, the inverse of the GDN function.
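A sketch of such a synthesis transform follows. The description above fixes the number of layers (4), the 5 × 5 kernels, and the IGDN activation; the stride of 2 per layer (mirroring the analysis transform), the channel counts, and the simplified IGDN implementation are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleIGDN(nn.Module):
    """Simplified inverse GDN: y_c = x_c * sqrt(beta_c + sum_k gamma_{c,k} * x_k^2)."""
    def __init__(self, channels: int):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(channels))
        self.gamma = nn.Parameter(0.1 * torch.eye(channels))

    def forward(self, x):
        norm = F.conv2d(x * x, self.gamma.view(*self.gamma.shape, 1, 1), self.beta)
        return x * torch.sqrt(norm)

class SynthesisTransform(nn.Module):
    """Step S5: four 5x5 transposed convolutions with IGDN activations map the
    recombined feature map back to an RGB image."""
    def __init__(self, num_features: int = 192, out_ch: int = 3):
        super().__init__()
        chs = [num_features, num_features, num_features, num_features, out_ch]
        layers = []
        for i in range(4):
            layers += [nn.ConvTranspose2d(chs[i], chs[i + 1], kernel_size=5,
                                          stride=2, padding=2, output_padding=1),
                       SimpleIGDN(chs[i + 1])]
        self.net = nn.Sequential(*layers)

    def forward(self, recombined_features):
        # recombined_features: the first and second high-resolution features reassembled
        # into a complete feature map (channels restored to their original order).
        return self.net(recombined_features)

# reconstructed_image = SynthesisTransform()(full_feature_map)
```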
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and that the scope of the invention is not limited to the specifically described embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.

Claims (10)

1. An image compression method based on multi-scale feature coding is characterized by comprising the following steps:
S1, performing feature extraction on the input image to obtain image features;
S2, selecting channels according to the image features to obtain a high-resolution feature channel and a low-resolution feature channel;
S3, respectively encoding and decoding the image features in the high-resolution feature channel and the image features in the low-resolution feature channel to obtain a first high-resolution image feature and a low-resolution image feature;
S4, inputting the low-resolution image feature into a super-resolution network for recovery to obtain a second high-resolution image feature;
and S5, carrying out image synthesis on the first high-resolution image feature and the second high-resolution image feature to obtain an output image.
2. The image compression method according to claim 1, wherein the step S1 is specifically: performing feature extraction on an input image through 4 sequentially connected downsampling convolutional layers at an image encoding end to obtain the image features; the convolution kernel size of each downsampling convolutional layer is 5 × 5, the stride is 2, and the activation function is a GDN function.
3. The image compression method according to claim 1, wherein the step S2 includes the following substeps:
S21, extracting a feature map of the image features, and calculating a gradient map of the image features with a Sobel gradient operator;
S22, averaging the absolute values of the gradient map of each feature channel to obtain a one-dimensional vector describing the complexity of the feature channels;
and S23, setting the channels corresponding to the half of the one-dimensional vector with larger values as the high-resolution feature channel, and setting the channels corresponding to the half with smaller values as the low-resolution feature channel.
4. The image compression method according to claim 1, wherein the specific method for encoding and decoding the image features in the high-resolution feature channel in step S3 is as follows:
A1, quantizing the image features in the high-resolution feature channel;
A2, estimating a probability distribution of the quantized image features through a hyperprior network;
A3, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream;
A4, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain the first high-resolution image feature;
the specific method for encoding and decoding the image features in the low-resolution feature channel in step S3 is as follows:
B1, downsampling the image features in the low-resolution feature channel;
B2, quantizing the downsampled image features;
B3, estimating a probability distribution of the quantized image features through the hyperprior network;
B4, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream;
and B5, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain the low-resolution image feature.
5. The image compression method according to claim 4, wherein the step A1 is specifically: during training, approximating the quantization result by adding uniform noise to the image features in the high-resolution feature channel; during testing, quantizing the image features in the high-resolution feature channel by rounding;
the step B2 is specifically: during training, approximating the quantization result by adding uniform noise to the downsampled image features; during testing, quantizing the downsampled image features by rounding.
6. The image compression method according to claim 4, wherein in step B1, the image features in the low-resolution feature channel are downsampled by a downsampling convolutional layer, the convolution kernel size of the downsampling convolutional layer is 5 × 5, the stride is 2, and the activation function is a GDN function.
7. The image compression method according to claim 4, wherein the hyperprior network extracts variances from the image features as side information, and the side information is encoded and decoded with a fixed probability distribution;
the encoding end of the hyperprior network comprises three sequentially connected convolutional layers, the convolution kernel size of each convolutional layer is 5 × 5, the stride is 2, and the activation function is a ReLU function;
the decoding end of the hyperprior network comprises three sequentially connected transposed convolutional layers, the convolution kernel size of each transposed convolutional layer is 5 × 5, the stride is 2, and the activation function is a ReLU function.
8. The image compression method according to claim 1, wherein the super-resolution network in step S4 comprises a first convolutional layer, a GDN function, a second convolutional layer, a first concatenation layer, a third convolutional layer, a GDN function, a fourth convolutional layer, a GDN function, a first residual block, a second residual block, a fifth convolutional layer, a GDN function, a second concatenation layer, and a transposed convolutional layer, which are connected in sequence;
the input end of the first convolutional layer receives the first high-resolution image feature, the input end of the first concatenation layer receives the low-resolution image feature, the input end of the second concatenation layer is also connected with the output end of the first concatenation layer, and the output end of the transposed convolutional layer outputs the second high-resolution image feature;
the number of filters of the first convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1;
the number of filters of the second convolutional layer is 96, the convolution kernel size is 1 × 1, and it performs 2× downsampling;
the number of filters of the third convolutional layer is 384, the convolution kernel size is 3 × 3, and the sampling factor is 1;
the number of filters of the fourth convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1;
the number of filters of the fifth convolutional layer is 384, the convolution kernel size is 1 × 1, and the sampling factor is 1;
the number of filters of the transposed convolutional layer is 96, the convolution kernel size is 3 × 3, and it performs 2× upsampling;
the first residual block and the second residual block have the same structure: each comprises a sixth convolutional layer, a GDN function, a seventh convolutional layer, and an adder connected in sequence, wherein the input end of the sixth convolutional layer serves as the input end of the residual block and is connected with an input end of the adder, and the output end of the adder serves as the output end of the residual block;
the number of filters of the sixth convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1;
the number of filters of the seventh convolutional layer is 192, the convolution kernel size is 3 × 3, and the sampling factor is 1.
9. The image compression method according to claim 8, wherein during training the super-resolution network in step S4 is mainly constrained by a rate-distortion loss, with a super-resolution loss as an auxiliary constraint;
the rate-distortion loss is calculated as:
L = R + λD
wherein L represents the rate-distortion loss, R represents the bit rate, D represents the distortion, and λ is a weight;
the super-resolution loss is the mean square error between the image features of the two feature channels at the image encoding end, taken before downsampling, and the corresponding image features after super-resolution at the image decoding end.
10. The image compression method according to claim 1, wherein in step S5, the first high-resolution image feature and the second high-resolution image feature are synthesized into the output image through 4 transposed convolutional layers, the convolution kernel size of each transposed convolutional layer is 5 × 5, and the activation function is an IGDN function, which is the inverse of the GDN function.
CN201911290877.8A 2019-12-12 2019-12-12 Image compression method based on multi-scale feature coding Active CN110956671B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911290877.8A CN110956671B (en) 2019-12-12 2019-12-12 Image compression method based on multi-scale feature coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911290877.8A CN110956671B (en) 2019-12-12 2019-12-12 Image compression method based on multi-scale feature coding

Publications (2)

Publication Number Publication Date
CN110956671A (en) 2020-04-03
CN110956671B CN110956671B (en) 2022-08-02

Family

ID=69981810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911290877.8A Active CN110956671B (en) 2019-12-12 2019-12-12 Image compression method based on multi-scale feature coding

Country Status (1)

Country Link
CN (1) CN110956671B (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105392009A (en) * 2015-11-27 2016-03-09 四川大学 Low bit rate image coding method based on block self-adaptive sampling and super-resolution reconstruction
CN107018422A (en) * 2017-04-27 2017-08-04 四川大学 Still image compression method based on depth convolutional neural networks
WO2019145767A1 (en) * 2018-01-25 2019-08-01 King Abdullah University Of Science And Technology Deep-learning based structure reconstruction method and apparatus
CN109146788A (en) * 2018-08-16 2019-01-04 广州视源电子科技股份有限公司 Super-resolution image reconstruction method and device based on deep learning
CN109741256A (en) * 2018-12-13 2019-05-10 西安电子科技大学 Image super-resolution rebuilding method based on rarefaction representation and deep learning
CN110087092A (en) * 2019-03-11 2019-08-02 西安电子科技大学 Low bit-rate video decoding method based on image reconstruction convolutional neural networks
CN109996071A (en) * 2019-03-27 2019-07-09 上海交通大学 Variable bit rate image coding, decoding system and method based on deep learning

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021258529A1 (en) * 2020-06-22 2021-12-30 北京大学深圳研究生院 Image resolution reduction and restoration method, device, and readable storage medium
CN114363624A (en) * 2020-10-13 2022-04-15 北京大学 Sensitivity-based code rate allocation characteristic compression method
CN114363624B (en) * 2020-10-13 2023-03-31 北京大学 Sensitivity-based code rate allocation characteristic compression method
CN112149652A (en) * 2020-11-27 2020-12-29 南京理工大学 Space-spectrum joint depth convolution network method for lossy compression of hyperspectral image
CN113014927A (en) * 2021-03-02 2021-06-22 三星(中国)半导体有限公司 Image compression method and image compression device
CN113014927B (en) * 2021-03-02 2024-01-09 三星(中国)半导体有限公司 Image compression method and image compression device
CN113393377A (en) * 2021-05-18 2021-09-14 电子科技大学 Single-frame image super-resolution method based on video coding
CN113393377B (en) * 2021-05-18 2022-02-01 电子科技大学 Single-frame image super-resolution method based on video coding
CN115866252A (en) * 2023-02-09 2023-03-28 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Image compression method, device, equipment and storage medium
CN115866252B (en) * 2023-02-09 2023-05-02 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Image compression method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110956671B (en) 2022-08-02

Similar Documents

Publication Publication Date Title
CN110956671B (en) Image compression method based on multi-scale feature coding
Cheng et al. Learned image compression with discretized gaussian mixture likelihoods and attention modules
Gao et al. Neural image compression via attentional multi-scale back projection and frequency decomposition
US8223837B2 (en) Learning-based image compression
CN110290387B (en) Image compression method based on generative model
WO2020237646A1 (en) Image processing method and device, and computer-readable storage medium
CN108737823B (en) Image coding method and device and decoding method and device based on super-resolution technology
CN113259676A (en) Image compression method and device based on deep learning
CN103607591A (en) Image compression method combining super-resolution reconstruction
WO2020238439A1 (en) Video quality-of-service enhancement method under restricted bandwidth of wireless ad hoc network
CN112149652A (en) Space-spectrum joint depth convolution network method for lossy compression of hyperspectral image
CN113079378B (en) Image processing method and device and electronic equipment
US8737753B2 (en) Image restoration by vector quantization utilizing visual patterns
CN115131675A (en) Remote sensing image compression method and system based on reference image texture migration
Akbari et al. Learned multi-resolution variable-rate image compression with octave-based residual blocks
CN112702600B (en) Image coding and decoding neural network layered fixed-point method
CN111080729B (en) Training picture compression network construction method and system based on Attention mechanism
CN115866252B (en) Image compression method, device, equipment and storage medium
CN115776571B (en) Image compression method, device, equipment and storage medium
Tan et al. Image compression algorithms based on super-resolution reconstruction technology
CN116630448A (en) Image compression method based on neural data dependent transformation of window attention
CN110730347A (en) Image compression method and device and electronic equipment
CN115294222A (en) Image encoding method, image processing method, terminal, and medium
CN115361555A (en) Image encoding method, image encoding device, and computer storage medium
CN115150628A (en) Coarse-to-fine depth video coding method with super-prior guiding mode prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant