CN110956671A - Image compression method based on multi-scale feature coding - Google Patents
- Publication number
- CN110956671A (application number CN201911290877.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- resolution
- layer
- convolutional
- size
- Prior art date
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof using neural networks
- G06T3/4053—Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
Abstract
The invention discloses an image compression method based on multi-scale feature coding. A selection vector is obtained by averaging the absolute values of the gradient maps of image features over a training set, and this vector guides the features of each channel to an encoding resolution. At the decoding end, the features encoded at low resolution are restored by a super-resolution network; the restored features and the features encoded at high resolution are then recombined into a complete feature map and mapped back to the original image. The invention treats image features differently according to their characteristics: features that are easy to recover from context information are transmitted at low resolution, saving bitrate, while complex fine features are transmitted at high resolution, reducing the degree of loss.
Description
Technical Field
The invention belongs to the technical field of image compression, and particularly relates to the design of an image compression method based on multi-scale feature coding.
Background
At present, many deep-learning-based methods have appeared in the field of image compression: for example, the feature extraction capability of a convolutional neural network is used to map an image into a feature space, the resulting features are quantized and entropy coded, and after entropy decoding at the decoding end, transposed convolutions map the features back to the original image. However, the features of different channels differ in complexity; treating them all identically wastes a large amount of bitrate on smooth features and damages the fineness of complex features.
Disclosure of Invention
The invention aims to provide an image compression method based on multi-scale feature coding, which treats image features differently according to their characteristics: features that are easy to recover from context information are transmitted at low resolution, saving bitrate, while complex fine features are transmitted at high resolution, reducing the degree of loss.
The technical scheme of the invention is as follows: an image compression method based on multi-scale feature coding comprises the following steps:
and S1, performing feature extraction on the input image to obtain image features.
And S2, selecting channels according to the image characteristics to obtain a high-resolution characteristic channel and a low-resolution characteristic channel.
And S3, respectively coding and decoding the image features in the high-resolution feature channel and the image features in the low-resolution feature channel to obtain first high-resolution image features and low-resolution image features.
And S4, inputting the low-resolution image features into a super-resolution network for recovery to obtain second high-resolution image features.
And S5, carrying out image synthesis on the first high-resolution image characteristic and the second high-resolution image characteristic to obtain an output image.
Further, step S1 specifically is: at the image encoding end, feature extraction is performed on the input image through 4 sequentially connected downsampling convolutional layers to obtain the image features; each downsampling convolutional layer has a 5 × 5 convolution kernel, a stride of 2, and a GDN activation function.
Further, step S2 includes the following substeps:
and S21, extracting the characteristic spectrum of the image characteristic, and calculating by using a Sobel gradient operator to obtain the gradient spectrum.
And S22, averaging the absolute values of the gradient spectrum of each characteristic channel to obtain a one-dimensional vector for describing the complexity of the characteristic channel.
And S23, setting the channel corresponding to the half one-dimensional vector with larger complexity as a high-resolution characteristic channel, and setting the channel corresponding to the half one-dimensional vector with smaller complexity as a low-resolution characteristic channel, wherein the channel corresponding to the half one-dimensional vector with larger complexity is set as a 1.
Further, the specific method for encoding and decoding the image features in the high-resolution feature channel in step S3 is as follows:
and A1, quantizing the image features in the high-resolution feature channel.
And A2, estimating the probability distribution of the quantized image features through a hyper-pilot network.
And A3, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream.
And A4, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain a first high-resolution image characteristic.
The specific method for encoding and decoding the image features in the low-resolution feature channel in step S3 is as follows:
and B1, down-sampling the image features in the low-resolution feature channel.
And B2, quantizing the image features after down sampling.
And B3, estimating the probability distribution of the quantized image features through a hyper-pilot network.
And B4, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream.
And B5, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain the low-resolution image characteristics.
Further, step A1 specifically is: during training, the image features in the high-resolution feature channel are quantized by adding uniform noise to approximate the quantization result; during testing, they are quantized by rounding.
Step B2 specifically is: during training, the downsampled image features are quantized by adding uniform noise to approximate the quantization result; during testing, they are quantized by rounding.
Further, in step B1 the image features in the low-resolution feature channel are downsampled by a downsampling convolutional layer with a 5 × 5 convolution kernel, a stride of 2, and a GDN activation function.
Further, the hyperprior network extracts the variance from the image features as side information, and the side information uses a fixed probability distribution when encoded and decoded.
The encoding end of the hyperprior network comprises three sequentially connected convolutional layers, each with a 5 × 5 convolution kernel, a stride of 2, and a ReLU activation function.
The decoding end of the hyperprior network comprises three sequentially connected transposed convolutional layers, each with a 5 × 5 convolution kernel, a stride of 2, and a ReLU activation function.
Further, the super-resolution network in step S4 includes a first convolutional layer, a GDN function, a second convolutional layer, a first cascaded layer, a third convolutional layer, a GDN function, a fourth convolutional layer, a GDN function, a first residual block, a second residual block, a fifth convolutional layer, a GDN function, a second cascaded layer, and a transposed convolutional layer, which are connected in sequence.
The input end of the first convolutional layer receives the first high-resolution image features, the input end of the first cascade layer receives the low-resolution image features, the input end of the second cascade layer is also connected with the output end of the first cascade layer, and the output end of the transposed convolutional layer outputs the second high-resolution image features.
The number of filters of the first convolutional layer is 192, the size of the convolutional kernel is 3 × 3, and the sampling multiple is 1.
The number of filters of the second convolutional layer is 96, the size of the convolutional kernel is 1 × 1, and the sampling multiple is 2 times of down-sampling.
The number of filters of the third convolutional layer is 384, the convolutional kernel size is 3 × 3, and the sampling multiple is 1.
The number of filters of the fourth convolutional layer is 192, the size of the convolutional kernel is 3 × 3, and the sampling multiple is 1.
The number of filters of the fifth convolutional layer is 384, the convolutional kernel size is 1 × 1, and the sampling multiple is 1.
The number of filters of the transposed convolutional layer is 96, the convolution kernel size is 3 × 3, and the sampling multiple is 2× upsampling.
The first residual block and the second residual block have the same structure and respectively comprise a sixth convolution layer, a GDN function, a seventh convolution layer and an adder which are sequentially connected, wherein the input end of the sixth convolution layer is used as the input end of the first residual block or the second residual block and is connected with the input end of the adder, and the output end of the adder is used as the output end of the first residual block or the second residual block.
The number of filters of the sixth convolutional layer is 192, the size of the convolutional kernel is 3 × 3, and the sampling multiple is 1.
The number of filters of the seventh convolutional layer is 192, the size of the convolutional kernel is 3 × 3, and the sampling multiple is 1.
Further, the main constraint on the super-resolution network in step S4 during training is the rate-distortion loss, and the auxiliary constraint is the super-resolution loss.
The rate-distortion loss is calculated by the formula:
L=R+λD
wherein L represents rate distortion loss, R represents code rate, D represents distortion degree, and lambda is weight.
The super-resolution loss is the mean squared error between the image features (of the low-resolution feature channels) before downsampling at the image encoding end and the image features after super-resolution at the image decoding end.
Further, in step S5 the first high-resolution image features and the second high-resolution image features are synthesized into the output image by 4 transposed convolutional layers; each transposed convolutional layer has a 5 × 5 convolution kernel, and the activation function is the IGDN function, the inverse of the GDN function.
The invention has the following beneficial effects:
(1) The invention encodes the image features of different channels at different resolutions, so that features of different degrees of fineness are allocated correspondingly different bitrates.
(2) The invention processes the features encoded at low resolution through a super-resolution network at the decoding end, making full use of the ability of neural networks in image restoration; lost information is inferred from the context, which reduces the degree of loss in the image.
Drawings
Fig. 1 is a flowchart of an image compression method based on multi-scale feature coding according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a super-resolution network structure according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It is to be understood that the embodiments shown and described in the drawings are merely exemplary and are intended to illustrate the principles and spirit of the invention, not to limit the scope of the invention.
The embodiment of the invention provides an image compression method based on multi-scale feature coding, as shown in fig. 1, the method comprises the following steps of S1-S5:
and S1, performing feature extraction on the input image to obtain image features.
In the embodiment of the invention, the image encoding end performs feature extraction on the input image through 4 sequentially connected downsampling convolutional layers to obtain the image features. Each downsampling convolutional layer has a 5 × 5 convolution kernel, a stride of 2, and a GDN (Generalized Divisive Normalization) activation function.
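As a rough illustration, the spatial size after the 4 stride-2 layers can be checked with the standard convolution output-size formula; the 256 × 256 input resolution and the padding of 2 below are assumptions, not stated in the patent:

```python
def conv_out(size, kernel=5, stride=2, pad=2):
    """Output size of one conv layer; pad=2 makes a 5x5 stride-2 layer halve the input."""
    return (size + 2 * pad - kernel) // stride + 1

h = w = 256  # hypothetical input resolution
for _ in range(4):  # four stride-2 downsampling layers
    h, w = conv_out(h), conv_out(w)
print(h, w)  # 16 16
```

So a 256 × 256 image would yield a 16 × 16 feature map per channel before quantization.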
S2, selecting channels according to the image features to obtain a high-resolution feature channel and a low-resolution feature channel.
The step S2 includes the following substeps S21-S23:
and S21, extracting the characteristic spectrum of the image characteristic, and calculating by using a Sobel gradient operator to obtain the gradient spectrum.
And S22, averaging the absolute values of the gradient spectrum of each characteristic channel to obtain a one-dimensional vector for describing the complexity of the characteristic channel. In the embodiment of the invention, the larger the numerical value in the one-dimensional vector is, the larger the complexity of the corresponding characteristic channel is.
And S23, setting the channel corresponding to the half one-dimensional vector with larger complexity as a high-resolution characteristic channel, and setting the channel corresponding to the half one-dimensional vector with smaller complexity as a low-resolution characteristic channel, wherein the channel corresponding to the half one-dimensional vector with larger complexity is set as a 1.
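The channel-selection rule of steps S21–S23 can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation: the 2-channel toy features, the 3 × 3 Sobel kernels, and the valid-region handling of the borders are assumptions.

```python
import numpy as np

def sobel_complexity(features):
    """Mean absolute Sobel gradient per channel -> 1-D complexity vector.
    features: (C, H, W) array of feature maps."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T  # vertical Sobel kernel
    c, h, w = features.shape
    out = np.zeros(c)
    for i, f in enumerate(features):
        gx = np.zeros_like(f)
        gy = np.zeros_like(f)
        # valid-region 2-D correlation with the 3x3 Sobel kernels
        for dy in range(3):
            for dx in range(3):
                gx[1:-1, 1:-1] += kx[dy, dx] * f[dy:h - 2 + dy, dx:w - 2 + dx]
                gy[1:-1, 1:-1] += ky[dy, dx] * f[dy:h - 2 + dy, dx:w - 2 + dx]
        out[i] = np.mean(np.abs(gx) + np.abs(gy))
    return out

rng = np.random.default_rng(0)
feats = np.stack([rng.normal(size=(8, 8)),  # noisy channel -> complex
                  np.ones((8, 8))])         # flat channel  -> smooth
v = sobel_complexity(feats)
order = np.argsort(v)           # ascending complexity
low_res = order[:len(v) // 2]   # smoother half  -> low-resolution channels
high_res = order[len(v) // 2:]  # complex half   -> high-resolution channels
```

The flat channel has zero gradient everywhere, so it lands in the low-resolution half, while the noisy channel is selected for high-resolution coding.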
S3, respectively encoding and decoding the image features in the high-resolution feature channel and the image features in the low-resolution feature channel to obtain the first high-resolution image features and the low-resolution image features.
The specific method for coding and decoding the image features in the high-resolution feature channel comprises the following steps:
and A1, quantizing the image features in the high-resolution feature channel.
In the embodiment of the invention, in the training process, because the quantization operation can not be reversely propagated, an alternative mode is adopted to quantize the image characteristics in the high-resolution characteristic channel by increasing a mode of approximately representing the quantization result by uniform noise; in the testing process, the image features in the high-resolution feature channel are quantized in a rounding mode.
And A2, estimating the probability distribution of the quantized image features through a hyper-pilot network.
And A3, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream.
And A4, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain a first high-resolution image characteristic.
The specific method for coding and decoding the image features in the low-resolution feature channel comprises the following steps:
and B1, down-sampling the image features in the low-resolution feature channel.
In the embodiment of the invention, the image features in the low-resolution feature channel are downsampled through a downsampling convolutional layer, the convolutional kernel size of the downsampling convolutional layer is 5 multiplied by 5, the step size is 2, and the activation function is a GDN function.
And B2, quantizing the image features after down sampling.
In the embodiment of the invention, in the training process, because the quantization operation can not be carried out with back propagation, an alternative mode is adopted, and the image characteristics after down sampling are quantized in a mode of increasing uniform noise to approximately represent the quantization result; in the testing process, the image characteristics after down sampling are quantized by adopting a rounding mode.
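A minimal sketch of the two quantization modes used in steps A1 and B2 (uniform-noise surrogate during training, hard rounding during testing); the feature values below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.normal(scale=3.0, size=1000)  # hypothetical feature values

# Training: rounding is non-differentiable, so add uniform noise in [-0.5, 0.5)
# as a differentiable approximation of the quantization result.
y_train = y + rng.uniform(-0.5, 0.5, size=y.shape)

# Testing: quantize by hard rounding to the nearest integer.
y_test = np.round(y)
```

Both versions stay within half a quantization step of the original values, which is why the noisy surrogate is a reasonable stand-in during training.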
B3, estimating the probability distribution of the quantized image features through a hyperprior network.
B4, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream.
B5, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain the low-resolution image features.
In the embodiment of the invention, the codec uses arithmetic coding and decoding, which requires a probability distribution shared by the encoder and the decoder. Here the distribution is modeled as a zero-mean Gaussian mixture model, and a hyperprior network extracts the variance from the features as side information. In both feature channels, arithmetic encoding turns the features into a binary code stream according to the same probability distribution parameters, and arithmetic decoding turns the code stream back into the original features. The side information of the hyperprior network uses a fixed probability distribution when encoded and decoded.
The encoding end of the hyperprior network comprises three sequentially connected convolutional layers, each with a 5 × 5 convolution kernel, a stride of 2, and a ReLU activation function. The decoding end of the hyperprior network comprises three sequentially connected transposed convolutional layers corresponding to the convolutional layers at the encoding end, each with a 5 × 5 convolution kernel, a stride of 2, and a ReLU activation function.
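To see why the variance side information matters, the cost of arithmetic coding a quantized symbol can be estimated from the Gaussian model. The sketch below uses a single zero-mean Gaussian per symbol (a simplification of the mixture model above) with hypothetical standard deviations:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bits_for_symbol(q, sigma):
    """Approximate arithmetic-coding cost (in bits) of integer symbol q under a
    zero-mean Gaussian with std sigma: integrate the density over [q-0.5, q+0.5]."""
    p = phi((q + 0.5) / sigma) - phi((q - 0.5) / sigma)
    return -math.log2(p)

# Smooth channels (small sigma) concentrate probability mass near 0,
# so their symbols are cheap to code; complex channels cost more bits.
cheap = bits_for_symbol(0, sigma=0.5)
costly = bits_for_symbol(0, sigma=8.0)
```

With an accurate per-symbol variance the entropy coder assigns short codes where the model is confident, which is exactly the bitrate saving the hyperprior provides.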
S4, inputting the low-resolution image features into the super-resolution network for recovery to obtain the second high-resolution image features.
As shown in fig. 2, in the embodiment of the present invention, the super-resolution network includes a first convolutional layer, a GDN function, a second convolutional layer, a first cascaded layer, a third convolutional layer, a GDN function, a fourth convolutional layer, a GDN function, a first residual block, a second residual block, a fifth convolutional layer, a GDN function, a second cascaded layer, and a transposed convolutional layer, which are sequentially connected.
Since there is a certain correlation between the low-resolution image features and the first high-resolution image features, the first high-resolution image features are also used as an input when super-resolving the low-resolution image features. In the embodiment of the invention, the input end of the first convolutional layer receives the first high-resolution image features, the input end of the first cascade layer receives the low-resolution image features, the input end of the second cascade layer is also connected with the output end of the first cascade layer, and the output end of the transposed convolutional layer outputs the second high-resolution image features.
The number of filters of the first convolutional layer is 192, the size of the convolutional kernel is 3 × 3, and the sampling multiple is 1.
The number of filters of the second convolutional layer is 96, the size of the convolutional kernel is 1 × 1, and the sampling multiple is 2 times of down-sampling.
The number of filters of the third convolutional layer is 384, the convolutional kernel size is 3 × 3, and the sampling multiple is 1.
The number of filters of the fourth convolutional layer is 192, the size of the convolutional kernel is 3 × 3, and the sampling multiple is 1.
The number of filters of the fifth convolutional layer is 384, the convolutional kernel size is 1 × 1, and the sampling multiple is 1.
The number of filters of the transposed convolutional layer is 96, the convolution kernel size is 3 × 3, and the sampling multiple is 2× upsampling.
The first residual block and the second residual block have the same structure and respectively comprise a sixth convolution layer, a GDN function, a seventh convolution layer and an adder which are sequentially connected, wherein the input end of the sixth convolution layer is used as the input end of the first residual block or the second residual block and is connected with the input end of the adder, and the output end of the adder is used as the output end of the first residual block or the second residual block.
The number of filters of the sixth convolutional layer is 192, the size of the convolutional kernel is 3 × 3, and the sampling multiple is 1.
The number of filters of the seventh convolutional layer is 192, the size of the convolutional kernel is 3 × 3, and the sampling multiple is 1.
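Under the filter counts above, the tensor shapes through the super-resolution network can be traced as follows. The input sizes (96-channel features, assumed 16 × 16 for the first high-resolution features and 8 × 8 for the low-resolution features) and "same" padding behaviour are assumptions; GDN activations are omitted since they do not change shapes:

```python
def conv(shape, filters, sample=1):
    """Shape after a conv layer: sample=1 keeps spatial size, 2 halves it."""
    c, h, w = shape
    if sample == 2:
        h, w = h // 2, w // 2
    return (filters, h, w)

def tconv(shape, filters):
    """Shape after the 2x-upsampling transposed conv layer."""
    c, h, w = shape
    return (filters, h * 2, w * 2)

def concat(a, b):
    """Channel-wise concatenation (the cascade layers)."""
    assert a[1:] == b[1:], "concatenation needs matching spatial size"
    return (a[0] + b[0], a[1], a[2])

y_hr = (96, 16, 16)              # first high-resolution image features (assumed)
y_lr = (96, 8, 8)                # low-resolution image features (assumed)
x = conv(y_hr, 192)              # first conv layer, 3x3
x = conv(x, 96, sample=2)        # second conv layer, 1x1, 2x down
cascade1 = concat(x, y_lr)       # first cascade layer -> (192, 8, 8)
x = conv(cascade1, 384)          # third conv layer
x = conv(x, 192)                 # fourth conv layer
x = conv(x, 192); x = conv(x, 192)  # two residual blocks keep (192, 8, 8)
x = conv(x, 384)                 # fifth conv layer, 1x1
x = concat(x, cascade1)          # second cascade layer
out = tconv(x, 96)               # transposed conv -> (96, 16, 16)
```

The trace shows the low-resolution features being merged at half resolution and restored to the full feature-map resolution at the output.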
In the embodiment of the invention, the main constraint condition of the super-resolution network in the training process is rate distortion loss, and the auxiliary constraint condition is super-resolution loss.
The rate distortion loss is obtained by weighting the code rate and the distortion, and the calculation formula is as follows:
L=R+λD
wherein L denotes the rate-distortion loss; R denotes the bitrate, for which the information entropy under the current probability distribution is used directly (the entropy is the expected code length after entropy coding); D denotes the distortion, described in the embodiment of the invention by the mean squared error (MSE) between the original image and the decoded image; and λ is a weight, set manually in the embodiment of the invention, whose value can be adjusted to change the compression ratio of the image.
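A toy computation of the rate-distortion loss L = R + λD, using the information content of the coded symbols as R and MSE as D; the symbol probabilities, pixel values, and λ below are made-up illustrative values:

```python
import numpy as np

def rd_loss(orig, decoded, probs, lam=0.01):
    """Rate-distortion loss L = R + lambda * D.
    R: total information content of the coded symbols in bits (ideal entropy-coded length);
    D: mean squared error between the original and decoded images."""
    rate = float(np.sum(-np.log2(probs)))           # total bits under the model
    distortion = float(np.mean((orig - decoded) ** 2))
    return rate + lam * distortion

orig = np.array([0.2, 0.5, 0.9])
decoded = np.array([0.25, 0.5, 0.8])
probs = np.array([0.5, 0.25, 0.25])  # model probability of each coded symbol
loss = rd_loss(orig, decoded, probs)
```

Raising λ penalizes distortion more heavily, pushing training toward higher-quality (and higher-bitrate) reconstructions, which is how the compression ratio is controlled.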
The super-resolution loss is the mean squared error between the image features (of the low-resolution feature channels) before downsampling at the image encoding end and the image features after super-resolution at the image decoding end.
S5, performing image synthesis on the first high-resolution image features and the second high-resolution image features to obtain the output image.
In the embodiment of the invention, the first high-resolution image features and the second high-resolution image features are synthesized into the output image through 4 transposed convolutional layers; each transposed convolutional layer has a 5 × 5 convolution kernel, and the activation function is the IGDN function, the inverse of the GDN function.
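Symmetrically to the encoder, the 4 stride-2 transposed convolutional layers double the spatial size each time. The formula below assumes a padding of 2 and an output padding of 1 (neither is specified in the patent) so that each layer exactly doubles its input:

```python
def tconv_out(size, kernel=5, stride=2, pad=2, out_pad=1):
    """Transposed-conv output size; these settings make each layer double its input."""
    return (size - 1) * stride - 2 * pad + kernel + out_pad

h = 16  # hypothetical feature resolution arriving at the decoder
for _ in range(4):  # four stride-2 transposed convolutional layers
    h = tconv_out(h)
print(h)  # 256
```

A 16 × 16 feature map is thus mapped back to a 256 × 256 output image, undoing the 4 halvings performed at the encoding end.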
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and that the scope of the invention is not limited to the specific embodiments and examples recited above. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.
Claims (10)
1. An image compression method based on multi-scale feature coding is characterized by comprising the following steps:
S1, performing feature extraction on the input image to obtain image features;
S2, selecting channels according to the image features to obtain a high-resolution feature channel and a low-resolution feature channel;
S3, respectively encoding and decoding the image features in the high-resolution feature channel and the image features in the low-resolution feature channel to obtain a first high-resolution image feature and a low-resolution image feature;
S4, inputting the low-resolution image feature into a super-resolution network for recovery to obtain a second high-resolution image feature;
S5, performing image synthesis on the first high-resolution image feature and the second high-resolution image feature to obtain an output image.
2. The image compression method according to claim 1, wherein step S1 specifically is: at an image encoding end, feature extraction is performed on the input image through 4 sequentially connected downsampling convolutional layers to obtain the image features; each downsampling convolutional layer has a 5 × 5 convolution kernel, a stride of 2, and a GDN activation function.
3. The image compression method according to claim 1, wherein the step S2 includes the following substeps:
S21, extracting the feature map of the image features, and computing the gradient map of the image features with a Sobel gradient operator;
S22, averaging the absolute values of the gradient map of each feature channel to obtain a one-dimensional vector describing the complexity of the feature channels;
S23, setting the half of the channels with larger complexity values in the one-dimensional vector as high-resolution feature channels and the half with smaller values as low-resolution feature channels, the entries corresponding to the high-resolution feature channels being set to 1 in a selection vector.
4. The image compression method according to claim 1, wherein the specific method for coding and decoding the image features in the high resolution feature channel in step S3 is as follows:
A1, quantizing the image features in the high-resolution feature channel;
A2, estimating the probability distribution of the quantized image features through a hyperprior network;
A3, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream;
A4, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain the first high-resolution image feature;
the specific method for encoding and decoding the image features in the low-resolution feature channel in step S3 is as follows:
B1, downsampling the image features in the low-resolution feature channel;
B2, quantizing the downsampled image features;
B3, estimating the probability distribution of the quantized image features through a hyperprior network;
B4, performing arithmetic coding on the quantized image features according to the probability distribution to obtain a binary code stream;
B5, performing arithmetic decoding on the binary code stream according to the probability distribution to obtain the low-resolution image feature.
5. The image compression method according to claim 4, wherein step A1 specifically is: during training, the image features in the high-resolution feature channel are quantized by adding uniform noise to approximate the quantization result; during testing, they are quantized by rounding;
step B2 specifically is: during training, the downsampled image features are quantized by adding uniform noise to approximate the quantization result; during testing, they are quantized by rounding.
6. The image compression method according to claim 4, wherein in step B1 the image features in the low-resolution feature channel are downsampled by a downsampling convolutional layer with a 5 × 5 convolution kernel, a stride of 2, and a GDN activation function.
7. The image compression method according to claim 4, wherein the hyperprior network extracts the variance from the image features as side information, and the side information is encoded and decoded with a fixed probability distribution;
the encoding end of the hyperprior network comprises three sequentially connected convolutional layers; the convolution kernel size of each convolutional layer is 5 × 5, the stride is 2, and the activation function is the ReLU function;
the decoding end of the hyperprior network comprises three sequentially connected transposed convolutional layers; the convolution kernel size of each transposed convolutional layer is 5 × 5, the stride is 2, and the activation function is the ReLU function.
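Three stride-2 layers shrink each spatial dimension by a factor of 8, and the three transposed layers undo it. A sketch of the size bookkeeping, assuming "same"-style padding of 2 for a 5 × 5 kernel (the padding value is an assumption, not stated in the claims):

```python
def conv_out(size, kernel=5, stride=2, pad=2):
    # Output spatial size of one padded strided convolution.
    return (size + 2 * pad - kernel) // stride + 1

def hyper_encoder_sizes(size):
    # Spatial size after each of the three 5x5, stride-2 conv layers.
    sizes = [size]
    for _ in range(3):
        sizes.append(conv_out(sizes[-1]))
    return sizes
```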
8. The image compression method according to claim 1, wherein the super-resolution network in step S4 comprises a first convolutional layer, a GDN function, a second convolutional layer, a first concatenation layer, a third convolutional layer, a GDN function, a fourth convolutional layer, a GDN function, a first residual block, a second residual block, a fifth convolutional layer, a GDN function, a second concatenation layer, and a transposed convolutional layer, which are connected in sequence;
the input end of the first convolutional layer receives the first high-resolution image features, the input end of the first concatenation layer receives the low-resolution image features, the input end of the second concatenation layer is also connected with the output end of the first concatenation layer, and the output end of the transposed convolutional layer outputs the second high-resolution image features;
the first convolutional layer has 192 filters, a 3 × 3 convolution kernel, and a sampling factor of 1;
the second convolutional layer has 96 filters, a 1 × 1 convolution kernel, and a sampling factor of 2× down-sampling;
the third convolutional layer has 384 filters, a 3 × 3 convolution kernel, and a sampling factor of 1;
the fourth convolutional layer has 192 filters, a 3 × 3 convolution kernel, and a sampling factor of 1;
the fifth convolutional layer has 384 filters, a 1 × 1 convolution kernel, and a sampling factor of 1;
the transposed convolutional layer has 96 filters, a 3 × 3 convolution kernel, and a sampling factor of 2× up-sampling;
the first residual block and the second residual block have the same structure, each comprising a sixth convolutional layer, a GDN function, a seventh convolutional layer, and an adder which are sequentially connected; the input end of the sixth convolutional layer serves as the input end of the residual block and is also connected to the input end of the adder, and the output end of the adder serves as the output end of the residual block;
the sixth convolutional layer has 192 filters, a 3 × 3 convolution kernel, and a sampling factor of 1;
the seventh convolutional layer has 192 filters, a 3 × 3 convolution kernel, and a sampling factor of 1.
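A sketch that records the claimed layer sequence as data and checks that the single 2× down-sampling (second convolutional layer) is cancelled by the final 2× up-sampling (transposed convolutional layer), so the branch preserves spatial resolution; layer names are illustrative:

```python
# (name, filters, kernel, spatial scale factor) as listed in the claim
SR_LAYERS = [
    ("conv1",   192, "3x3", 1.0),
    ("conv2",    96, "1x1", 0.5),  # 2x down-sampling
    ("concat1", None, None, 1.0),  # joins the low-resolution features
    ("conv3",   384, "3x3", 1.0),
    ("conv4",   192, "3x3", 1.0),
    ("res1",    192, None,  1.0),  # residual block, resolution preserved
    ("res2",    192, None,  1.0),
    ("conv5",   384, "1x1", 1.0),
    ("concat2", None, None, 1.0),  # also fed by concat1's output
    ("tconv",    96, "3x3", 2.0),  # 2x up-sampling
]

def overall_scale(layers):
    # Product of per-layer spatial scale factors.
    scale = 1.0
    for _, _, _, s in layers:
        scale *= s
    return scale
```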
9. The image compression method according to claim 8, wherein during training the super-resolution network in step S4 uses the rate-distortion loss as the main constraint and the super-resolution loss as an auxiliary constraint;
the rate-distortion loss is calculated as:
L = R + λD
wherein L represents the rate-distortion loss, R represents the code rate, D represents the degree of distortion, and λ is a weight;
the super-resolution loss is the mean square error between the image features before down-sampling of the two feature channels at the image encoding end and the image features after super-resolution at the image decoding end.
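Both constraints are simple to state in code; a hedged sketch with illustrative names, treating the features as flat lists:

```python
def rate_distortion_loss(rate, distortion, lam):
    # L = R + lambda * D: code rate traded against distortion.
    return rate + lam * distortion

def super_resolution_loss(pre_downsample, post_sr):
    # Auxiliary constraint: MSE between encoder-side features (before
    # down-sampling) and decoder-side features (after super-resolution).
    n = len(pre_downsample)
    return sum((a - b) ** 2 for a, b in zip(pre_downsample, post_sr)) / n
```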
10. The image compression method according to claim 1, wherein in step S5, the first high-resolution image features and the second high-resolution image features are synthesized into an image by 4 transposed convolutional layers; the convolution kernel size of each transposed convolutional layer is 5 × 5, and the activation function is the IGDN function, which is the inverse function of the GDN function.
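IGDN replaces the division in GDN with a multiplication, so it approximately inverts the normalization at the decoding end; a per-pixel sketch with illustrative parameter names mirroring the GDN formula:

```python
import math

def igdn(x, beta, gamma):
    # y_i = x_i * sqrt(beta_i + sum_j gamma_ij * x_j ** 2)
    return [
        xi * math.sqrt(beta[i] + sum(gamma[i][j] * xj ** 2
                                     for j, xj in enumerate(x)))
        for i, xi in enumerate(x)
    ]
```

With γ = 0 it is an exact inverse of GDN; with learned γ the inversion is only approximate, since IGDN sees the already-normalized values.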
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911290877.8A CN110956671B (en) | 2019-12-12 | 2019-12-12 | Image compression method based on multi-scale feature coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110956671A true CN110956671A (en) | 2020-04-03 |
CN110956671B CN110956671B (en) | 2022-08-02 |
Family
ID=69981810
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911290877.8A Active CN110956671B (en) | 2019-12-12 | 2019-12-12 | Image compression method based on multi-scale feature coding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110956671B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112149652A (en) * | 2020-11-27 | 2020-12-29 | 南京理工大学 | Space-spectrum joint depth convolution network method for lossy compression of hyperspectral image |
CN113014927A (en) * | 2021-03-02 | 2021-06-22 | 三星(中国)半导体有限公司 | Image compression method and image compression device |
CN113393377A (en) * | 2021-05-18 | 2021-09-14 | 电子科技大学 | Single-frame image super-resolution method based on video coding |
WO2021258529A1 (en) * | 2020-06-22 | 2021-12-30 | 北京大学深圳研究生院 | Image resolution reduction and restoration method, device, and readable storage medium |
CN114363624A (en) * | 2020-10-13 | 2022-04-15 | 北京大学 | Sensitivity-based code rate allocation characteristic compression method |
CN115866252A (en) * | 2023-02-09 | 2023-03-28 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Image compression method, device, equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105392009A (en) * | 2015-11-27 | 2016-03-09 | 四川大学 | Low bit rate image coding method based on block self-adaptive sampling and super-resolution reconstruction |
CN107018422A (en) * | 2017-04-27 | 2017-08-04 | 四川大学 | Still image compression method based on depth convolutional neural networks |
CN109146788A (en) * | 2018-08-16 | 2019-01-04 | 广州视源电子科技股份有限公司 | Super-resolution image reconstruction method and device based on deep learning |
CN109741256A (en) * | 2018-12-13 | 2019-05-10 | 西安电子科技大学 | Image super-resolution rebuilding method based on rarefaction representation and deep learning |
CN109996071A (en) * | 2019-03-27 | 2019-07-09 | 上海交通大学 | Variable bit rate image coding, decoding system and method based on deep learning |
WO2019145767A1 (en) * | 2018-01-25 | 2019-08-01 | King Abdullah University Of Science And Technology | Deep-learning based structure reconstruction method and apparatus |
CN110087092A (en) * | 2019-03-11 | 2019-08-02 | 西安电子科技大学 | Low bit-rate video decoding method based on image reconstruction convolutional neural networks |
Also Published As
Publication number | Publication date |
---|---|
CN110956671B (en) | 2022-08-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||