CN115767096A - Image compression method, apparatus, device and medium - Google Patents

Image compression method, apparatus, device and medium Download PDF

Info

Publication number
CN115767096A
CN115767096A CN202211307249.8A CN202211307249A CN115767096A CN 115767096 A CN115767096 A CN 115767096A CN 202211307249 A CN202211307249 A CN 202211307249A CN 115767096 A CN115767096 A CN 115767096A
Authority
CN
China
Prior art keywords
image compression
deep learning
network structure
network
quantization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211307249.8A
Other languages
Chinese (zh)
Inventor
王荣刚
李立天
高文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School filed Critical Peking University Shenzhen Graduate School
Priority to CN202211307249.8A priority Critical patent/CN115767096A/en
Publication of CN115767096A publication Critical patent/CN115767096A/en
Pending legal-status Critical Current

Links

Images

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses an image compression method, an image compression device, image compression equipment and an image compression medium, and aims to solve the problem of quantization drift of an end-to-end image encoder in a continuous compression process. In order to further enhance the reversibility of the transformation network and reduce the generation loss, a channel relaxation strategy designed for a network structure is invented, and the channel relaxation strategy is adopted in the network structure design to ensure that the channel number of the characteristic vector is in a reasonable range. Therefore, the technical problems that an end-to-end image compression model is unstable in a continuous compression process and the multi-generation robustness of the model is poor in the prior art are solved, and the multi-generation robustness of the prior end-to-end image compression model can be greatly improved on the premise that the first-time coding performance is hardly lost. And the method is simple and effective, does not need to change the network structure, and is easy to expand to other end-to-end image compression models.

Description

Image compression method, apparatus, device and medium
Technical Field
The present invention relates to the field of image compression technologies, and in particular, to an end-to-end image compression method based on deep learning, an end-to-end image compression device based on deep learning, an end-to-end image compression apparatus based on deep learning, and a computer readable storage medium.
Background
Image compression is one of the most basic techniques in the multimedia field. Efficient image compression techniques provide support for the storage and transmission of large amounts of image data. The traditional coding standard relies on a manually designed module to remove spatial redundancy and statistical redundancy in image data to achieve the purpose of compression. In recent years, an image compression method based on deep learning becomes a research hotspot in the coding field, the network framework can be integrally optimized in an end-to-end mode under the guidance of a loss function, and the compression performance of the method exceeds the most advanced traditional coding method.
Multi-generation coding is a process of repeatedly compressing and decompressing images or videos, which often occurs in multimedia application scenarios such as image editing, transcoding, and network distribution. For lossy compression methods, the first compression causes distortion, but when the decoded picture is re-encoded with the same settings, the distortion does not increase in the ideal case. However, the existing end-to-end image encoder based on deep learning has poor multi-generation robustness. The image quality will drop sharply after successive compression, leading to distortion problems such as blurring and color cast, and high frequency noise that may be introduced during iterative encoding can significantly increase the bit rate of the image, which is particularly disadvantageous for practical applications.
Disclosure of Invention
The invention mainly aims to provide an end-to-end image compression method based on deep learning, an end-to-end image compression device based on deep learning, end-to-end image compression equipment based on deep learning and a computer readable storage medium, and aims to solve the technical problems that an end-to-end image compression model is unstable in a continuous compression process and the multi-generation robustness of the model is poor in the prior art.
In order to achieve the above object, the present invention provides an end-to-end image compression method based on deep learning, which includes:
in the quantization stage, a direct quantization method is adopted to quantize the transformed feature vector;
and setting the channel number of the characteristic vector within a preset reasonable range by adopting a channel relaxation strategy on the network structure design.
Optionally, the step of quantizing the transformed feature vector by using a direct quantization method in a quantization stage includes:
rounding the value to be quantized in the feature vector;
entropy coding is performed on the result of the rounding operation.
Optionally, the rounding the value to be quantized in the feature vector includes:
rounding the value to be quantized to a discrete value;
the step of entropy coding the result of the rounding operation comprises:
entropy encoding the discrete value.
Optionally, the step of setting the number of channels of the feature vector in a preset reasonable range by using a channel relaxation strategy in the network structure design includes:
increasing the hyper-parameters of the network structure of the image compression model based on the information representation capability of the image compression model at the coding bottleneck, and setting the channel number of the characteristic vector in a preset reasonable range;
wherein, the hyper-parameter of the network structure of the image compression model is an output channel of the last convolution layer in the transformation network.
Optionally, after the step of setting the number of channels of the feature vector in a preset reasonable range by using a channel relaxation strategy in the network structure design, the method further includes:
and optimizing a target loss function by adopting a joint code rate, distortion and reversibility loss function in a model training stage.
Optionally, before the step of optimizing the target loss function by using the joint code rate, the distortion and the invertibility loss function in the model training stage, the method further includes:
and introducing reversible items determined based on the original image, the transformation network and the inverse transformation network in the training process of the network structure, and constructing the reversible loss function.
Optionally, the step of introducing a reversible term determined based on the original image, the transformation network and the inverse transformation network in the training process of the network structure includes:
and determining the reversible item according to the distortion between the original image and the image passing through a transformation network and an inverse transformation network.
In addition, to achieve the above object, the present invention further provides an end-to-end image compression apparatus based on deep learning, including:
the quantization module is used for quantizing the transformed feature vectors by adopting a direct quantization method in a quantization stage;
and the setting module is used for setting the channel number of the characteristic vector in a preset reasonable range by adopting a channel relaxation strategy on the network structure design.
Further, to achieve the above object, the present invention also provides an end-to-end image compression apparatus based on deep learning, comprising: memory, a processor, and a computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, performs the steps of the deep learning based end-to-end image compression method as claimed in any one of the above.
Furthermore, to achieve the above object, the present invention also provides a computer readable storage medium, on which a computer program is stored, the computer program, when executed by a processor, implements the steps of the deep learning based end-to-end image compression method as described in any one of the above.
In the end-to-end image compression method based on deep learning, the end-to-end image compression device based on deep learning, the end-to-end image compression equipment based on deep learning and the computer readable storage medium provided by the embodiment of the invention, a direct quantization method is adopted in a quantization stage to quantize the transformed feature vector; and setting the channel number of the characteristic vector within a preset reasonable range by adopting a channel relaxation strategy on the network structure design.
In order to solve the problem of quantization drift of an end-to-end image encoder in the continuous compression process, a direct quantization method is used for replacing the original quantization method, and the direct quantization method is adopted in the quantization stage to quantize the transformed feature vector. In order to further enhance the reversibility of the transformation network and reduce the generation loss, a channel relaxation strategy designed for the network structure is invented, and the channel relaxation strategy is adopted in the network structure design to ensure that the channel number of the feature vector is in a reasonable range. Therefore, the technical problems that an end-to-end image compression model is unstable in a continuous compression process and the multi-generation robustness of the model is poor in the prior art are solved, and the multi-generation robustness of the prior end-to-end image compression model can be greatly improved on the premise that the first-time coding performance is hardly lost. And the method is simple and effective, does not need to change the network structure, and is easy to expand to other end-to-end image compression models.
Drawings
Fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating an embodiment of an end-to-end image compression method based on deep learning according to the present invention;
FIG. 3 is a schematic diagram of a compression framework of an embodiment of an end-to-end image compression method based on deep learning according to the present invention;
FIG. 4 is a schematic diagram of an entropy model and a quantization mode of an embodiment of an end-to-end image compression method based on deep learning according to the present invention;
FIG. 5 is a schematic diagram of an embodiment of an apparatus for deep learning-based end-to-end image compression according to the present invention.
The implementation, functional features and advantages of the present invention will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an operating device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the operation device may include: the processor 1001 is, for example, a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. Wherein a communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a display screen (Di sp ay), an input unit such as a Keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface, a wireless interface. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., a WI re l ess-fde l ity, WI-fi) interface). The Memory 1005 may be a Random Access Memory (RAM) Memory, or a Non-Vo l at i e Memory (NVM), such as a disk Memory. The memory 1005 may alternatively be a storage device separate from the processor 1001 described previously.
Those skilled in the art will appreciate that the configuration shown in FIG. 1 does not constitute a limitation of the operating device and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, the memory 1005, which is a storage medium, may include therein an operating system, a data storage module, a network communication module, a user interface module, and a computer program.
In the operating device shown in fig. 1, the network interface 1004 is mainly used for data communication with other devices; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 in the execution apparatus of the present invention may be provided in an execution apparatus that calls a computer program stored in the memory 1005 by the processor 1001 and performs the following operations:
in the quantization stage, a direct quantization method is adopted to quantize the transformed feature vector;
and setting the channel number of the characteristic vector within a preset reasonable range by adopting a channel relaxation strategy on the network structure design.
Further, the processor 1001 may call the computer program stored in the memory 1005, and also perform the following operations:
the step of quantizing the transformed feature vector by using a direct quantization method in a quantization stage comprises the following steps:
rounding up the value to be quantized in the feature vector;
entropy coding is performed on the result of the rounding operation.
Further, the processor 1001 may call the computer program stored in the memory 1005, and also perform the following operations:
the step of rounding the value to be quantized in the feature vector includes:
rounding the value to be quantized to a discrete value;
the step of entropy coding the result of the rounding operation comprises:
entropy encoding the discrete value.
Further, the processor 1001 may call the computer program stored in the memory 1005, and also perform the following operations:
the step of setting the channel number of the characteristic vector in a preset reasonable range by adopting a channel relaxation strategy on the network structure design comprises the following steps:
increasing the hyper-parameters of the network structure of the image compression model based on the information representation capability of the image compression model at the coding bottleneck, and setting the channel number of the characteristic vector in a preset reasonable range;
wherein, the hyper-parameter of the network structure of the image compression model is an output channel of the last convolution layer in the transformation network.
Further, the processor 1001 may call the computer program stored in the memory 1005, and also perform the following operations:
after the step of setting the number of channels of the feature vector within a preset reasonable range by adopting a channel relaxation strategy in the network structure design, the method further comprises the following steps:
and optimizing a target loss function by adopting a joint code rate, distortion and reversibility loss function in the model training stage.
Further, the processor 1001 may call the computer program stored in the memory 1005, and also perform the following operations:
before the step of optimizing the target loss function by adopting the joint code rate, the distortion and the reversible loss function in the model training stage, the method further comprises the following steps:
and introducing reversible items determined based on the original image, the transformation network and the inverse transformation network in the training process of the network structure, and constructing the reversible loss function.
Further, the processor 1001 may call the computer program stored in the memory 1005, and also perform the following operations:
the step of introducing a reversible term determined based on an original image, a transformation network and an inverse transformation network in the training process of the network structure comprises:
and determining the reversible item according to the distortion between the original image and the image passing through a transformation network and an inverse transformation network.
Referring to fig. 2, the present invention provides an end-to-end image compression method based on deep learning, which includes:
and step S10, adopting a direct quantization method in a quantization stage to quantize the transformed feature vector.
Optionally, the step of quantizing the transformed feature vector by using a direct quantization method in the quantization stage includes:
rounding up the value to be quantized in the feature vector;
entropy coding is performed on the result of the rounding operation.
Optionally, the rounding operation on the value to be quantized in the feature vector includes:
rounding the value to be quantized into a discrete value;
the step of entropy coding the result of the rounding operation comprises:
entropy encoding the discrete value.
In this embodiment, a direct quantization method is used in the quantization stage to quantize the transformed feature vector. After the image is transformed into a feature vector through a convolution operation, the vector needs to be quantized into discrete values to perform an entropy coding process. Existing commonly used end-to-end image compression methods typically model each element of the feature vector as a variable conforming to a gaussian distribution, i.e., y i ~Nμ ii ). In order to accurately estimate the gaussian distribution parameters μ and σ of the feature vector, the most advanced end-to-end model generally employs a joint super-prior and contextual entropy model as shown in fig. 4. The super-check part adopts a convolution network to capture global information of the characteristic vector y and transmits the global information as side information to a decoding end by a code stream so as to eliminate spatial redundancy in the characteristic vector y; the context part uses masked convolution from the coded element y <i Capture local information and use as coded current element y i The purpose of which is to eliminate the correlation between adjacent elements.
The existing model generally adopts a correction quantization method to discretize the feature vector y. As shown in FIG. 4, the mean μ is first subtracted before quantization i Then rounding and rounding are performed, and the corresponding mean value mu is added when inverse quantization is performed at the decoding end i . This process can be formulated as
Figure BDA0003905425480000071
Figure BDA0003905425480000072
And (4) showing. Experiments show that the correction quantization can cause the instability of an end-to-end image compression model in the continuous compression process because the value after each quantization can not be kept in a constant range, and quantization drift and quantization deviation are causedResulting in a continuous degradation of the image quality.
The direct quantization method proposed in the embodiment removes the step of correction, directly quantizing the value y to be quantized i Rounding and rounding operations are performed, and entropy coding is performed on the basis of y without subtracting the mean value i The direct quantization process can be formulated
Figure BDA0003905425480000073
And (4) showing. By direct quantization, the feature vectors only need to fall in the same quantization interval in the process of two consecutive compressions, the vectors subjected to inverse transformation will be the same, and the decoded images will be the same.
And S20, setting the channel number of the characteristic vector within a preset reasonable range by adopting a channel relaxation strategy on the network structure design.
Optionally, the step of setting the number of channels of the feature vector within a preset reasonable range by using a channel relaxation policy in the network structure design includes:
increasing the hyper-parameters of the network structure of the image compression model based on the information representation capability of the image compression model at the encoding bottleneck, and setting the channel number of the feature vector within a preset reasonable range;
wherein, the hyper-parameter of the network structure of the image compression model is an output channel of the last convolution layer in the transformation network.
Referring to fig. 3, in the existing end-to-end image compression model, the transformation portion typically employs a convolutional neural network. Wherein the encoding transformation part comprises 4 down-sampling operations, and the decoding transformation part comprises 4 up-sampling operations. The original image has dimensions h × w × 3, and the dimensions in the transformation process are
Figure BDA0003905425480000081
The dimension of the feature vector is
Figure BDA0003905425480000082
Where h, w, d represent the height, width of the image to be encoded and the depth of the down-sampled layer to be encoded, respectively. Now thatModels to get more compact feature vectors, M is usually set to a smaller value, e.g., 192 or 320. Experiments show that smaller M limits the information representation capability of the feature vector, and the lost information can cause continuous reduction of image quality in the continuous compression process.
In order to enhance the reversibility of the network in the transformation part and thus reduce the accumulated information loss, a channel relaxation strategy is proposed in the embodiment. The strategy improves the information representation capability of the image compression model at the coding bottleneck position by reasonably increasing the hyper-parameter M of the model, thereby reducing the quality loss of the image in the continuous compression process. The change of the hyper-parameter M will only affect the two convolutional layer parameters closest to the encoding bottleneck, and has no effect on the overall network structure.
In this embodiment, a direct quantization method is adopted in the quantization stage to quantize the transformed eigenvector; and setting the channel number of the characteristic vector within a preset reasonable range by adopting a channel relaxation strategy on the network structure design. In order to solve the quantization drift problem of an end-to-end image encoder in the continuous compression process, a direct quantization method is used for replacing the original quantization method, and the direct quantization method is adopted in the quantization stage to quantize the transformed eigenvector. In order to further enhance the reversibility of the transformation network and reduce the generation loss, a channel relaxation strategy designed for a network structure is invented, and the channel relaxation strategy is adopted in the network structure design to ensure that the channel number of the characteristic vector is in a reasonable range. Therefore, the technical problems that an end-to-end image compression model is unstable in a continuous compression process and the multi-generation robustness of the model is poor in the prior art are solved, and the multi-generation robustness of the prior end-to-end image compression model can be greatly improved on the premise that the first-time coding performance is hardly lost. And the method is simple and effective, does not need to change the network structure, and is easy to expand to other end-to-end image compression models.
Further, in another embodiment of the end-to-end image compression method based on deep learning of the present invention, after the step of setting the number of channels of the feature vector in a preset reasonable range by using a channel relaxation strategy on the network structure design, the method further includes:
and optimizing a target loss function by adopting a joint code rate, distortion and reversibility loss function in the model training stage.
Optionally, before the step of optimizing the target loss function by using the joint code rate, the distortion and the invertibility loss function in the model training stage, the method further includes:
and introducing reversible items determined based on the original image, the transformation network and the inverse transformation network in the training process of the network structure, and constructing the reversible loss function.
Optionally, the step of introducing a reversible term determined based on the original image, the transformation network, and the inverse transformation network in the training process of the network structure includes:
and determining the reversible item according to the distortion between the original image and the image which passes through a transformation network and an inverse transformation network.
In this embodiment, joint code rate, distortion and invertibility loss function are used for optimization during the model training phase. In the training process of a conventional end-to-end image compression model, optimization is often performed by combining a code rate and a distortion function, so that ideal rate distortion performance is achieved. The reversibility of the transformation is an important influence factor of multi-generation robustness, and the orthogonal reversible transformation adopted by the traditional JPEG coding ensures that the orthogonal reversible transformation has good multi-generation robustness. In the training process of the conventional end-to-end model, the reversibility of the transformation network and the inverse transformation network is not explicitly restricted, so that the instability of the transformation network and the inverse transformation network in the continuous compression process is caused.
In this embodiment, it is proposed to perform optimization by using a joint code rate, distortion and reversible loss function in the model training stage, where the reversible loss is applied to the original image and the distortion of the image after the original image is subjected to transform inverse transform. To constrain reversibility, we introduce a reversible term in the training process of the network, which can be formulated as d (x, g) s (g a (x) In which g) of a And g s Respectively, a transform network and an inverse transform network.
Further, in another embodiment of the deep learning-based end-to-end image compression method of the present invention, in order to solve the quantization drift problem of the end-to-end image encoder during the continuous compression process, it is proposed to use a direct quantization method instead of the original quantization method. The method is used for improving the problem that the quality of the image is rapidly reduced in the continuous compression process. In an end-to-end image compression method based on deep learning, encoding and decoding of an image generally go through four steps of transformation, quantization, entropy coding and inverse transformation, wherein the quantization and transformation parts are crucial to the multi-generation robustness of a model. In order to further enhance the reversibility of the transformation network and reduce the generation loss, a channel relaxation strategy designed aiming at the network structure and a reversibility loss function in the training process are invented. Further, the method comprises the following steps:
step one, quantizing the transformed characteristic vector by adopting a direct quantization method in a quantization stage; secondly, enhancing reversibility of the transformation network and the inverse transformation network by adopting the method in the third step or the fourth step; thirdly, adopting a channel relaxation strategy to ensure that the channel number of the characteristic vector is in a reasonable range on the network structure design; and fourthly, optimizing by adopting a joint code rate, distortion and reversible loss function in the model training stage. Wherein, the direct quantization method comprises the following steps: rounding the value to be quantized to a discrete value and directly entropy coding the discrete value. The reversibility is as follows: and the similarity between the image after the transformation network and the inverse transformation network and the original image is higher, and the reversibility of the transformation is stronger when the similarity is higher. The channel relaxation strategy is characterized in that: the output channel M of the last convolutional layer in the transformation network needs to be set within a reasonable range, so that the feature vector is guaranteed to have enough channels to keep information, information loss in the transformation process is reduced, and the reversibility of the transformation network is increased. The loss of reversibility is: distortion between the original image and the original image through the transformation and inverse transformation network.
As shown in fig. 3, the end-to-end image compression framework with enhanced multi-generation robustness has the following features. S1, in the encoding stage, a transformation network g_a comprising 4 convolutional down-sampling operations converts the original image x into a feature vector y; the direct quantization method rounds y to the quantized feature vector ŷ; using the symbol probability table obtained from the entropy model, the arithmetic coder encodes ŷ into a binary code stream. S2, in the decoding stage, ŷ is recovered from the binary code stream by entropy decoding; ŷ then passes through an inverse transformation network g_s comprising 4 deconvolution up-sampling operations, and the result is rounded and truncated to obtain the final decoded image x̂. S3, the end-to-end image compression framework with enhanced multi-generation robustness enhances the reversibility of the transformation network by the method set forth in S4 or S5. S4, in the transformation network and the inverse transformation network, the intermediate convolutional layers have N vector channels, and the feature vector at the coding bottleneck has M channels; the channel relaxation strategy sets M large enough to reduce information loss in the transformation and thus enhance its reversibility. S5, in the training process, in addition to the code rate R and the distortion D(x, x̂), the optimized target loss function additionally includes the reversibility loss D(x, x̃), where x̃ is the image obtained by passing the original image x directly through g_a and g_s.
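The final rounding-and-truncation step of the decoding stage S2 can be sketched as follows. The 8-bit pixel range [0, 255] is an assumption for illustration, since the patent does not state a bit depth, and the sample values are hypothetical.

```python
import numpy as np

def to_decoded_image(x_rec):
    # Round the output of the inverse transformation network and truncate
    # (clip) it to the valid pixel range to obtain the final decoded image.
    return np.clip(np.rint(x_rec), 0, 255).astype(np.uint8)

x_rec = np.array([-3.2, 12.6, 254.4, 260.0])   # illustrative inverse-transform output
x_hat = to_decoded_image(x_rec)                # -> [0, 13, 254, 255]
```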
In addition, referring to fig. 5, an embodiment of the present invention further provides an end-to-end image compression apparatus based on deep learning, where the end-to-end image compression apparatus based on deep learning includes:
a quantization module M1, configured to quantize the transformed feature vector by using a direct quantization method in a quantization stage;
and the setting module M2 is used for setting the channel number of the characteristic vector within a preset reasonable range by adopting a channel relaxation strategy on the network structure design.
Optionally, the quantization module is further configured to perform a rounding operation on the value to be quantized in the feature vector, and to entropy code the result of the rounding operation.
Optionally, the quantization module is further configured to round the value to be quantized to a discrete value;
the step of entropy coding the result of the rounding operation comprises:
entropy encoding the discrete value.
Optionally, the setting module is further configured to increase a hyper-parameter of the network structure of the image compression model based on an information representation capability of the image compression model at a coding bottleneck, and set the number of channels of the feature vector within a preset reasonable range;
wherein the hyper-parameter of the network structure of the image compression model is an output channel of a last convolutional layer in a transform network.
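The channel relaxation strategy above amounts to widening one hyper-parameter of the network structure: the output channel count M of the last convolutional layer. A minimal configuration sketch follows; the layer count and the specific values N = 192 and M = 320 are illustrative assumptions (the patent only requires M to be "large enough"), not values taken from the source.

```python
def encoder_channels(num_layers=4, N=192, M=320):
    # Channel relaxation: intermediate convolutional layers keep N channels,
    # while the last (coding-bottleneck) layer is widened to M output channels
    # so the feature vector retains enough information for a near-invertible
    # transform, reducing information loss across repeated compressions.
    return [N] * (num_layers - 1) + [M]

channels = encoder_channels()   # -> [192, 192, 192, 320]
```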
Optionally, the deep learning-based end-to-end image compression apparatus further includes an optimization module, configured to optimize a target loss function by using a joint code rate, distortion and invertibility loss function in a model training stage.
Optionally, the optimization module is further configured to introduce a reversible term determined based on the original image, the transformation network, and the inverse transformation network in a training process of the network structure, and construct the reversible loss function.
Optionally, the optimization module is further configured to determine the reversible term according to distortion between the original image and an image passing through a transformation network and an inverse transformation network.
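The joint rate-distortion-reversibility objective built from these modules can be sketched as below. MSE as the distortion measure and the trade-off weights `lam` and `beta` are assumptions for illustration; the patent specifies only that the target combines code rate, distortion, and a reversibility term measured between the original image and the image passed through the transformation and inverse transformation networks.

```python
import numpy as np

def mse(a, b):
    # Mean squared error as an illustrative distortion measure D(a, b).
    return float(np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2))

def joint_loss(rate, x, x_hat, x_tilde, lam=0.01, beta=0.01):
    # rate:    estimated code rate R of the quantized features
    # x_hat:   decoded image
    # x_tilde: image obtained by passing x directly through the transformation
    #          and inverse transformation networks (the reversible term)
    return rate + lam * mse(x, x_hat) + beta * mse(x, x_tilde)
```

With a perfectly reversible transform, the last term vanishes and the objective reduces to the usual rate-distortion trade-off.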
The deep learning-based end-to-end image compression apparatus provided by the invention adopts the deep learning-based end-to-end image compression method of the above embodiment, and solves the technical problems in the prior art that an end-to-end image compression model is unstable during successive compression and has poor multi-generation robustness. Compared with the prior art, the beneficial effects of the apparatus provided by this embodiment are the same as those of the deep learning-based end-to-end image compression method provided by the above embodiment, and its other technical features are the same as those disclosed in the method embodiment, which are not repeated here.
In addition, an embodiment of the present invention further provides an end-to-end image compression device based on deep learning, where the end-to-end image compression device based on deep learning includes: memory, a processor, and a computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, performs the steps of the deep learning based end-to-end image compression method as claimed in any one of the above.
Furthermore, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the deep learning-based end-to-end image compression method according to any one of the above items.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are only for description, and do not represent the advantages and disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An end-to-end image compression method based on deep learning, which is characterized in that the end-to-end image compression method based on deep learning comprises the following steps:
in the quantization stage, a direct quantization method is adopted to quantize the transformed feature vector;
and setting the number of the channels of the characteristic vector in a preset reasonable range by adopting a channel relaxation strategy on the design of a network structure.
2. The end-to-end image compression method based on deep learning of claim 1, wherein the step of quantizing the transformed feature vector by using a direct quantization method in the quantization stage comprises:
rounding the value to be quantized in the feature vector;
entropy coding is performed on the result of the rounding operation.
3. The end-to-end image compression method based on deep learning of claim 2, wherein the step of rounding the value to be quantized in the feature vector comprises:
rounding the value to be quantized to a discrete value;
the step of entropy coding the result of the rounding operation comprises:
entropy encoding the discrete value.
4. The end-to-end image compression method based on deep learning of claim 1, wherein the step of setting the number of channels of the feature vector within a preset reasonable range by using a channel relaxation strategy on the network structure design comprises:
increasing the hyper-parameters of the network structure of the image compression model based on the information representation capability of the image compression model at the encoding bottleneck, and setting the channel number of the feature vector within a preset reasonable range;
wherein, the hyper-parameter of the network structure of the image compression model is an output channel of the last convolution layer in the transformation network.
5. The end-to-end image compression method based on deep learning of claim 1, wherein after the step of setting the number of channels of the feature vector within a preset reasonable range by using a channel relaxation strategy on the network structure design, the method further comprises:
and optimizing a target loss function by adopting a joint code rate, distortion and reversibility loss function in a model training stage.
6. The deep learning-based end-to-end image compression method of claim 5, wherein before the step of optimizing the objective loss function by using the joint rate, distortion and invertibility loss functions in the model training stage, the method further comprises:
and introducing reversible items determined based on the original image, the transformation network and the inverse transformation network in the training process of the network structure, and constructing the reversible loss function.
7. The deep learning-based end-to-end image compression method as claimed in claim 6, wherein the step of introducing reversible terms determined based on an original image, a transformation network and an inverse transformation network in the training process of the network structure comprises:
and determining the reversible item according to the distortion between the original image and the image passing through a transformation network and an inverse transformation network.
8. An end-to-end image compression apparatus based on deep learning, the apparatus comprising:
the quantization module is used for quantizing the transformed feature vectors by adopting a direct quantization method in a quantization stage;
and the setting module is used for setting the channel number of the characteristic vector in a preset reasonable range by adopting a channel relaxation strategy on the network structure design.
9. An end-to-end image compression device based on deep learning, characterized in that the end-to-end image compression device based on deep learning comprises: memory, a processor, and a computer program stored on the memory and executable on the processor, the computer program when executed by the processor implementing the steps of the deep learning based end-to-end image compression method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps of the deep learning based end-to-end image compression method according to any one of claims 1 to 7.
CN202211307249.8A 2022-10-24 2022-10-24 Image compression method, apparatus, device and medium Pending CN115767096A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211307249.8A CN115767096A (en) 2022-10-24 2022-10-24 Image compression method, apparatus, device and medium


Publications (1)

Publication Number Publication Date
CN115767096A true CN115767096A (en) 2023-03-07

Family

ID=85352988

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211307249.8A Pending CN115767096A (en) 2022-10-24 2022-10-24 Image compression method, apparatus, device and medium

Country Status (1)

Country Link
CN (1) CN115767096A (en)

Similar Documents

Publication Publication Date Title
Minnen et al. Joint autoregressive and hierarchical priors for learned image compression
JP4700491B2 (en) Adaptive coefficient scan ordering
CN115514978B (en) Method and apparatus for mixing probabilities of entropy coding in video compression
JP4906855B2 (en) Efficient coding and decoding of transform blocks
US11949868B2 (en) Method and device for selecting context model of quantization coefficient end flag bit
JP5076150B2 (en) Image coding apparatus, image coding method, and image coding program
US7835582B2 (en) Image encoding apparatus and control method thereof
US20220215595A1 (en) Systems and methods for image compression at multiple, different bitrates
CN103947206A (en) Region-based image compression
CN112821894A (en) Lossless compression method and lossless decompression method based on weighted probability model
Maleki et al. Blockcnn: A deep network for artifact removal and image compression
KR20070053098A (en) Decoding apparatus, inverse quantization method, and computer readable medium
JP2008527809A (en) Process for image compression and decompression acceleration
CN110738666A (en) discrete cosine transform-based image semantic segmentation method and device
KR102020220B1 (en) Method and apparatus for compressing images
CN110730347A (en) Image compression method and device and electronic equipment
JP2006270737A (en) Decoder, distribution estimating method, decoding method and their programs
CN115767096A (en) Image compression method, apparatus, device and medium
CN110234011B (en) Video compression method and system
JP4784386B2 (en) Decoding device, inverse quantization method, and program
WO2020172908A1 (en) Decoding method and device for quantization block, and electronic device
JP2004310735A (en) Method and device for fast inverse discrete cosine transform
CN115358954B (en) Attention-guided feature compression method
CN114449277B (en) Method and apparatus for context derivation for coefficient coding
CN117376576A (en) Multi-level super-resolution JPEG lossless transcoding method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination