CN111131834A - Reversible self-encoder, encoding and decoding method, image compression method and device - Google Patents

Publication number
CN111131834A
Authority
CN
China
Prior art keywords
reversible
module
decoding
sub
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911391009.9A
Other languages
Chinese (zh)
Other versions
CN111131834B (en
Inventor
戴文睿
李劭辉
邹君妮
李成林
姚斌
朱照远
李飞飞
熊红凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN201911391009.9A priority Critical patent/CN111131834B/en
Publication of CN111131834A publication Critical patent/CN111131834A/en
Application granted granted Critical
Publication of CN111131834B publication Critical patent/CN111131834B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • H04N19/426 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements using memory downsizing methods

Abstract

The invention provides a reversible self-encoder, an encoding and decoding method, an image compression method and a device. The reversible self-encoder comprises an encoded signal separation module, a cascade reversible coding module, an encoded signal synthesis module, a decoded signal separation module, a cascade reversible decoding module and a decoded signal synthesis module, wherein: the signal separation module separates an input image into two signals, the cascade reversible coding module and the cascade reversible decoding module process the two signals, and the signal synthesis module synthesizes the two processed signals. In addition to the reversible self-encoder itself, the invention also provides a method for applying the reversible self-encoder to image compression; compared with a reference neural network, the method can halve the number of parameters and the amount of computation while achieving the same compression performance.

Description

Reversible self-encoder, encoding and decoding method, image compression method and device
Technical Field
The invention belongs to the field of digital image processing and the field of image compression, and particularly relates to a reversible self-encoder based on a lifting structure, a reversible self-encoding method based on the lifting structure, an image compression method and an image compression device adopting the reversible self-encoding method.
Background
In recent years, deep neural networks have excelled in the image processing field and have achieved impressive results on a variety of image processing tasks. In image compression, some works use deep neural networks to replace the transform, quantization and entropy coding modules of the traditional method, thereby realizing end-to-end image compression. Such methods perform better under subjective image evaluation indexes: at the same MS-SSIM value, an end-to-end image compression method generally saves 30%-50% of the bitstream overhead compared with the intra-frame coding method of HEVC. Under objective indexes represented by the peak signal-to-noise ratio (PSNR), end-to-end image compression can also achieve a bitstream overhead similar to that of the intra-frame coding method of HEVC.
An existing self-encoder (AutoEncoder) generally comprises an encoding unit and a decoding unit. Although end-to-end methods bring a great improvement in performance, they generally contain a large number of neural network parameters that need to be trained, and therefore occupy a large amount of storage and computing resources. Taking one end-to-end image compression method as an example, compared with the intra-frame coding method of HEVC, it takes about 667 times more storage space and about 176 times more computation overhead to decode the same image. Therefore, a new approach is needed to reduce the storage and computation overhead of end-to-end image compression.
A search of the prior-art literature shows that Fabian Mentzer, Eirikur Agustsson, Michael Tschannen, Radu Timofte and Luc Van Gool, in "Conditional Probability Models for Deep Image Compression", published at the 2018 IEEE Conference on Computer Vision and Pattern Recognition, proposed a method that optimizes the neural network parameters end to end to compress images, and uses multiple layers of connected residual units to perform nonlinear transformation of image features. Compared with traditional image compression methods, this method achieves end-to-end optimization and joint optimization across module parameters, but the number of parameters grows greatly as performance improves; compared with traditional methods, the required storage space increases by hundreds of times.
Disclosure of Invention
To address the above problems, the invention provides a reversible self-encoder, an encoding and decoding method, an image compression method and an image compression device, which can replace the parameter-heavy self-encoder structure in an end-to-end image compression method, realize parameter multiplexing between the encoder end and the decoder end, and save half of the storage cost.
According to a first aspect of the present invention, there is provided an encoding unit of a lifting structure-based reversible self-encoder, comprising:
a coded signal separation module, which separates an input image or a high-dimensional signal into two sub-signals;
a cascade reversible coding module, which comprises a plurality of stages of reversible coding sub-modules based on a lifting structure and decomposes the two sub-signals produced by the coded signal separation module, the two outputs of the reversible coding sub-module at each stage being used as the inputs of the reversible coding sub-module at the next stage;
and a coded signal synthesis module, which re-synthesizes the two signals processed by the cascade reversible coding module into one signal.
Optionally, the reversible encoding submodule includes:
a prediction operator based on a convolutional network, which adaptively transforms one of the two input signals using the nonlinear fitting capability of the convolutional network, and sums the result with the other input to obtain the output corresponding to that other input, this output serving as one of the two new output signals;
an update operator based on a convolutional network, which takes the output produced by the prediction operator and the summation as its input, adaptively transforms it using the nonlinear fitting capability of the convolutional network, and sums the result with the remaining input signal to form the second of the two new output signals.
Optionally, the encoding unit of the lifting structure-based reversible self-encoder further comprises:
a network regulation and control module, which controls the number of stages of the cascade reversible coding module and the transformation attribute of each reversible coding submodule.
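The prediction and update steps of one reversible coding submodule can be sketched as follows. This is a minimal illustration, not the patented network: `conv1d_tanh` is a hypothetical stand-in for a convolutional network (a single 1-D convolution followed by tanh), the kernels are arbitrary, and the choice of which path feeds the prediction operator is an assumption.

```python
import math

def conv1d_tanh(signal, kernel):
    """Hypothetical stand-in for a small conv net: zero-padded 1-D convolution, then tanh."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(math.tanh(acc))
    return out

def coding_submodule(a0, b0, predict_kernel, update_kernel):
    # Prediction step: transform path b0 and add the result to path a0.
    a1 = [x + p for x, p in zip(a0, conv1d_tanh(b0, predict_kernel))]
    # Update step: transform the new output a1 and add the result to path b0.
    b1 = [y + u for y, u in zip(b0, conv1d_tanh(a1, update_kernel))]
    return a1, b1

# Arbitrary example inputs and kernels (assumptions, for illustration only).
a1, b1 = coding_submodule([1.0, 2.0, 3.0], [0.5, -0.5, 0.25],
                          [0.2, 0.5, 0.2], [0.1, -0.3, 0.1])
```

Because each step only adds a function of the other path, the structure stays invertible whatever the operators compute.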
According to a second aspect of the present invention, there is provided an encoding method of a lifting structure-based reversible self-encoder, comprising:
separating an input image or a high-dimensional signal;
decomposing the separated signals by adopting a cascade reversible coding module, wherein the cascade reversible coding module comprises a plurality of stages of reversible coding sub-modules based on a lifting structure, and the two-way output of the reversible coding sub-module at the previous stage is used as the input of the reversible coding sub-module at the next stage;
and recombining the two paths of signals processed by the cascade reversible coding submodule into one path of signal.
Optionally, the reversible self-encoding method further comprises: controlling the number of stages of the cascade reversible coding module and the transformation attribute of each reversible coding submodule.
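The three steps of the encoding method (separate, cascade, re-synthesize) might look like the sketch below. The even/odd split and the simple linear operators are assumptions chosen so the arithmetic is easy to follow; in the invention the operators are convolutional networks.

```python
def split(signal):
    """Separate a signal into two equal-length paths (even and odd positions)."""
    return signal[0::2], signal[1::2]

def merge(a, b):
    """Interleave two paths back into one signal; the inverse of split."""
    out = [0.0] * (len(a) + len(b))
    out[0::2] = a
    out[1::2] = b
    return out

def lifting_stage(a, b, predict, update):
    a = [x + p for x, p in zip(a, predict(b))]   # prediction step
    b = [y + u for y, u in zip(b, update(a))]    # update step
    return a, b

def encode(signal, stages):
    a, b = split(signal)
    # The two outputs of each stage are the inputs of the next stage.
    for predict, update in stages:
        a, b = lifting_stage(a, b, predict, update)
    return merge(a, b)

# One stage with hypothetical linear operators (a Haar-like difference/average).
haar_like = [(lambda v: [-x for x in v], lambda v: [0.5 * x for x in v])]
code = encode([1.0, 2.0, 3.0, 4.0], haar_like)
```

Adding more `(predict, update)` pairs to the list lengthens the cascade without changing the surrounding code.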
According to a third aspect of the present invention, there is provided a decoding unit of a lifting structure-based reversible self-encoder, comprising:
a decoded signal separation module which separates an input signal into two sub-signals;
the cascade reversible decoding module comprises a plurality of stages of reversible decoding submodules based on a lifting structure, reconstructs a coded and decomposed signal, and takes the output of the previous stage of reversible decoding submodule as the input of the next stage of reversible decoding submodule;
and the decoding signal synthesis module is used for re-synthesizing the two paths of sub-signals processed by the cascade reversible decoding module into one path of signal.
Optionally, the reversible decoding submodule includes:
an update operator based on a convolutional network, which adaptively transforms one of the two input signals using the nonlinear fitting capability of the convolutional network, and takes the difference between the other input signal and the result as one of the two new output signals;
and a prediction operator based on a convolutional network, which adaptively transforms that newly generated signal using the nonlinear fitting capability of the convolutional network, and takes the difference between the remaining input and the result as the second of the two new output signals.
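The decoding submodule undoes the coding submodule by applying the same two operators in the opposite order, with differences in place of sums. The sketch below assumes arbitrary element-wise functions as stand-ins for the convolutional operators; they are deliberately non-invertible to show that the lifting structure, not the operators, provides reversibility.

```python
import math

def predict(values):
    # Hypothetical stand-in for the convolutional prediction operator.
    return [math.tanh(0.7 * x) for x in values]

def update(values):
    # Hypothetical stand-in for the convolutional update operator (not invertible itself).
    return [0.3 * x * x for x in values]

def coding_submodule(a0, b0):
    a1 = [x + p for x, p in zip(a0, predict(b0))]
    b1 = [y + u for y, u in zip(b0, update(a1))]
    return a1, b1

def decoding_submodule(a1, b1):
    # Undo the update step first: a1 is available, so update(a1) can be recomputed.
    b0 = [y - u for y, u in zip(b1, update(a1))]
    # Then undo the prediction step using the recovered b0.
    a0 = [x - p for x, p in zip(a1, predict(b0))]
    return a0, b0

a1, b1 = coding_submodule([1.0, -2.0], [0.5, 3.0])
a0, b0 = decoding_submodule(a1, b1)
```

The recovered `a0` and `b0` match the inputs up to floating-point rounding.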
According to a fourth aspect of the present invention, there is provided a decoding method for a lifting structure-based reversible self-encoder, comprising:
separating the high-dimensional signal into two sub-signals;
reconstructing the coded and decomposed signals by adopting a cascade reversible decoding module, wherein the cascade reversible decoding module comprises a plurality of stages of reversible decoding submodules based on a lifting structure, and the two-way output of the previous stage of reversible decoding submodule is used as the input of the next stage of reversible decoding submodule;
and recombining the two paths of signals processed by the cascade reversible decoding module into one path of signal.
According to a fifth aspect of the present invention, there is provided a reversible self-encoder comprising:
an encoded signal separation module which separates an input image or a high-dimensional signal;
the cascade reversible coding module comprises a plurality of stages of reversible coding sub-modules based on a lifting structure, the signals obtained by the separation of the signal separation module are decomposed, and the two-way output of the reversible coding sub-module at the previous stage is used as the input of the reversible coding sub-module at the next stage;
the coded signal synthesis module is used for recombining the two paths of signals processed by the cascade reversible coding module into one path of signal;
a decoded signal separation module which decomposes the synthesized signal of the encoded signal synthesis module into two sub-signals;
the cascade reversible decoding module is used for reconstructing a double-path sub-signal obtained after the decoding signal separation module is decomposed, the module comprises a plurality of stages of reversible decoding sub-modules based on a lifting structure, and double-path output of a previous stage of reversible decoding sub-module is used as input of a next stage of reversible decoding sub-module; the number of stages of the cascaded reversible decoding module is the same as the number of reversible coding sub-modules in the cascaded reversible coding module, the reversible decoding sub-modules correspond to the reversible coding sub-modules one by one, and the corresponding pair of reversible decoding sub-modules has the same parameter as the reversible coding sub-modules, wherein the arrangement sequence of the reversible decoding sub-modules in the cascaded reversible decoding module is opposite to the arrangement sequence of the reversible coding sub-modules in the cascaded reversible coding module;
and the decoding signal synthesis module is used for re-synthesizing the two paths of signals processed by the cascade reversible decoding module into one path of signal.
Optionally, the reversible encoding submodule includes:
a prediction operator based on a convolutional network, which adaptively transforms one of the two input signals using the nonlinear fitting capability of the convolutional network, and sums the result with the other input to obtain the output corresponding to that other input, this output serving as one of the two new output signals;
an update operator based on a convolutional network, which takes the output produced by the prediction operator and the summation as its input, adaptively transforms it using the nonlinear fitting capability of the convolutional network, and sums the result with the remaining input signal to form the second of the two new output signals;
the reversible decoding submodule multiplexes a prediction operator based on a convolution network and an updating operator based on the convolution network in the reversible coding submodule; wherein: and through parameter sharing, the prediction operator based on the convolutional network in the reversible decoding submodule and the update operator based on the convolutional network call the parameters of the prediction operator and the update operator in the reversible coding submodule.
Optionally, the reversible self-encoder further includes:
a network regulation and control module, which controls the number of stages of the cascade reversible coding module and the cascade reversible decoding module, and the transformation attribute of each pair of reversible coding submodule and reversible decoding submodule.
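Parameter sharing between the two cascades can be illustrated with a small class that stores one parameter set per stage and uses it in both directions, traversing the stages in reverse order when decoding. The class name, the scalar parameters `mu` and `theta`, and the tanh operators are all hypothetical simplifications of the convolutional networks described above.

```python
import math

class ReversibleAutoencoder:
    """Sketch: a single list of stage parameters serves both encoder and decoder."""

    def __init__(self, stage_params):
        # Each entry is (mu, theta): scalars standing in for conv-net parameter sets.
        self.stage_params = list(stage_params)

    @staticmethod
    def _predict(values, mu):
        return [math.tanh(mu * x) for x in values]

    @staticmethod
    def _update(values, theta):
        return [math.tanh(theta * x) for x in values]

    def encode(self, a, b):
        for mu, theta in self.stage_params:
            a = [x + p for x, p in zip(a, self._predict(b, mu))]
            b = [y + u for y, u in zip(b, self._update(a, theta))]
        return a, b

    def decode(self, a, b):
        # Same parameters, opposite stage order, differences instead of sums.
        for mu, theta in reversed(self.stage_params):
            b = [y - u for y, u in zip(b, self._update(a, theta))]
            a = [x - p for x, p in zip(a, self._predict(b, mu))]
        return a, b

model = ReversibleAutoencoder([(0.5, -0.3), (1.2, 0.8), (-0.7, 0.4)])
code = model.encode([1.0, 2.0], [3.0, 4.0])
recon = model.decode(*code)
```

Since the decoder holds no parameters of its own, the stored parameter count is that of the encoder alone, which is the source of the halved storage cost described in the text.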
According to a sixth aspect of the present invention, there is provided a reversible self-coding and decoding method, comprising:
separating an input image or a high-dimensional signal into two sub-signals;
decomposing the two paths of sub-signals obtained by separation by adopting a cascade reversible coding module, wherein the cascade reversible coding module comprises a plurality of stages of reversible coding modules based on a lifting structure, and the two paths of output of the reversible coding module at the previous stage are used as the input of the reversible coding module at the next stage;
re-synthesizing the two paths of signals processed by the cascade reversible coding module into one path of signal to obtain a synthesized high-dimensional signal;
separating the synthesized high-dimensional signal into two paths of sub-signals, and adapting to the input of a cascade decoding module;
reconstructing the two-path sub-signals after the synthesized high-dimensional signals are separated by adopting a cascade reversible decoding module, wherein the cascade reversible decoding module comprises a plurality of stages of reversible decoding modules based on a lifting structure, and the two-path output of the previous stage of reversible decoding module is used as the input of the next stage of reversible decoding module; wherein: the number of stages of the cascaded reversible decoding modules is the same as the number of the reversible coding modules in the cascaded reversible coding modules, the reversible decoding modules correspond to the reversible coding modules one by one, and the corresponding pair of the reversible decoding modules has the same parameter as the reversible coding modules, wherein the arrangement sequence of the reversible decoding modules in the cascaded reversible decoding modules is opposite to the arrangement sequence of the reversible coding modules in the cascaded reversible coding modules;
and recombining the two paths of signals processed by the cascade reversible coding module and the cascade reversible decoding module into one path of signal.
Optionally, the reversible self-coding and decoding method further includes:
controlling the number of levels of the cascaded reversible encoding modules and the cascaded reversible decoding modules and the transformation properties of each pair of the reversible decoding modules and the reversible encoding modules.
According to a seventh aspect of the present invention, there is provided an image compression method using any one of the reversible self-coding and decoding methods described above.
Optionally, the image compression method includes:
up-sampling the input image through a convolutional layer and a ReLU layer of a convolutional neural network, and then performing signal separation to obtain the original separated signals;
forward-coding the original separated signals with a cascade reversible coding module to generate two feature maps;
synthesizing the two feature maps into a single feature map;
quantizing and entropy-coding the resulting feature map to obtain a binary bitstream;
decoding the binary bitstream to obtain a reconstructed feature map;
separating the reconstructed feature map into two feature maps;
reverse-decoding the two separated feature maps with a cascade reversible decoding module to obtain two reconstructed signals;
and synthesizing the two reconstructed signals obtained by reverse decoding, and down-sampling through a convolutional layer and a ReLU layer to obtain the reconstructed image.
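The order of operations in this pipeline can be sketched end to end on a toy 1-D signal. Several simplifications are assumptions made only for illustration: the convolution and ReLU layers before and after the transform are omitted, the operators are simple linear functions, quantization is plain rounding with a step size, and `zlib` stands in for the entropy coder.

```python
import struct
import zlib

def split(x):
    return x[0::2], x[1::2]

def merge(a, b):
    out = [0.0] * (len(a) + len(b))
    out[0::2] = a
    out[1::2] = b
    return out

def predict(values):            # hypothetical stand-in for the prediction network
    return [-x for x in values]

def update(values):             # hypothetical stand-in for the update network
    return [0.5 * x for x in values]

def forward(x):
    a, b = split(x)
    a = [p + q for p, q in zip(a, predict(b))]
    b = [p + q for p, q in zip(b, update(a))]
    return merge(a, b)

def inverse(y):
    a, b = split(y)
    b = [p - q for p, q in zip(b, update(a))]
    a = [p - q for p, q in zip(a, predict(b))]
    return merge(a, b)

def compress(x, step=1.0):
    # Transform, quantize to integers, then pack and compress into a byte stream.
    q = [round(v / step) for v in forward(x)]
    return zlib.compress(struct.pack(f"<{len(q)}h", *q))

def decompress(stream, step=1.0):
    raw = zlib.decompress(stream)
    q = struct.unpack(f"<{len(raw) // 2}h", raw)
    # Dequantize, then run the lifting stages backwards.
    return inverse([v * step for v in q])

x = [10.0, 12.0, 11.0, 13.0]
lossless = decompress(compress(x, step=1.0), step=1.0)
lossy = decompress(compress(x, step=4.0), step=4.0)
```

With a step of 1.0 the transform coefficients of this particular input are already integers, so the round trip is exact; a coarser step discards information, and the reconstruction differs from the input.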
According to an eighth aspect of the present invention, there is provided an image compression apparatus comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor being operable to perform any of the image compression methods described above when executing the program.
Compared with the prior art, the invention has at least one of the following beneficial effects:
the coding unit, the decoding unit and the method of the reversible self-encoder based on the lifting structure realize the one-to-one correspondence of the coding sub-modules and the decoding sub-modules in the coding unit and the decoding unit, and provide possibility for further parameter multiplexing.
The reversible self-encoder and the encoding and decoding method based on the lifting structure realize parameter multiplexing of the encoding end and the decoding end. The reversible self-encoder comprises an encoding signal separation module, a cascade reversible encoding module, an encoding signal synthesis module, a decoding signal separation module, a cascade reversible decoding module and a decoding signal synthesis module, and can replace a self-encoder structure occupying a large number of parameters in an end-to-end image compression method, realize the parameter multiplexing of an encoder end and a decoder end, and save half of storage cost.
Compared with the reference neural network, the image compression method and the device adopting the reversible self-coding and decoding method can reduce half of parameter quantity and greatly reduce storage cost on the premise of achieving the same compression effect. In addition, compared with a reference neural network, the image compression method and the image compression device reduce about half of learnable parameters in the training process, reduce the calculated amount in the training process and improve the processing speed.
Drawings
FIG. 1 is a block diagram of an encoding unit of a reversible self-encoder based on a lifting structure according to an embodiment of the present invention;
FIG. 2 is a flow chart of an encoding method of a reversible self-encoder according to an embodiment of the present invention;
FIG. 3 is a block diagram of a decoding unit of a reversible self-encoder based on lifting structure according to an embodiment of the present invention;
FIG. 4 is a flowchart of a decoding method of the reversible self-encoder according to an embodiment of the present invention;
FIG. 5 is a block diagram of a reversible self-encoder module according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating a reversible self-coding/decoding method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a reversible self-coding and decoding method according to an embodiment of the present invention;
FIG. 8 is a flowchart of an image compression method according to an embodiment of the invention;
FIG. 9 is a detailed schematic diagram of an encoding unit of a reversible self-encoder used in an image compression method according to an embodiment of the present invention;
FIG. 10 is a detailed schematic diagram of a decoding unit of a reversible self-encoder used in an image compression method according to an embodiment of the present invention;
FIG. 11 is a diagram illustrating the effect of an image compression method according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments and the accompanying drawings. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that persons skilled in the art can make variations and modifications without departing from the spirit of the invention, and all such variations and modifications fall within the scope of protection of the present invention.
Fig. 1 is a block diagram of an encoding unit of a lifting structure-based reversible self-encoder according to an embodiment of the present invention. The encoding unit of the lifting structure-based reversible self-encoder comprises: the device comprises a coded signal separation module, a cascade reversible coding module and a coded signal synthesis module, wherein: the coded signal separation module separates the input image or high-dimensional signal; the cascade reversible coding module comprises a plurality of stages of reversible coding sub-modules based on a lifting structure, signals obtained by separation of the coded signal separation module are decomposed, and two-way output of a previous stage of reversible coding sub-module is used as input of a next stage of reversible coding sub-module; and the coded signal synthesis module re-synthesizes the two paths of signals processed by the cascade reversible coding module into one path of signal.
The reversible coding submodule in this embodiment includes a prediction operator based on a convolutional network and an update operator based on a convolutional network, wherein: the prediction operator adaptively transforms one of the two input signals using the nonlinear fitting capability of the convolutional network, and sums the result with the other input to obtain the output corresponding to that other input, this output serving as one of the two new output signals; and the update operator takes the output produced by the prediction operator and the summation as its input, adaptively transforms it using the nonlinear fitting capability of the convolutional network, and sums the result with the remaining input signal to form the second of the two new output signals.
The encoding unit of the lifting structure-based reversible self-encoder transforms the input signal into a sparse transform domain for representation, which prepares the ground for the subsequent quantization and coding stages; at the same time, the lifting structure contained in the encoding unit makes parameter multiplexing possible.
In a preferred embodiment, the encoding unit of the reversible self-encoder comprises a coded signal separation module, a cascade reversible coding module, a coded signal synthesis module and a network regulation and control module. The coded signal separation module, the cascade reversible coding module and the coded signal synthesis module function as in the embodiment above, and the network regulation and control module controls the number of stages of the cascade reversible coding module and the transformation attribute of each reversible coding submodule, where the transformation attribute includes the upper and lower bounds of the transformation corresponding to the network. Adjusting the parameters in this module adjusts the number of parameters that the encoding unit of the reversible self-encoder needs to train; increasing the number of parameters within a certain range can effectively improve compression performance. In addition, by adjusting the upper and lower bounds of the transformation, the network regulation and control module can make the whole transformation robust to noise, so that the coding is more stable.
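One possible reading of "upper and lower bounds of the transformation" is that the output of each operator is restricted to a fixed range. The sketch below takes that reading as an assumption (the text does not specify how the bounds are enforced) and implements it by clipping; the helper `bounded` and the example operators are hypothetical.

```python
def bounded(operator, lower, upper):
    """Wrap an operator so every output value is clipped to [lower, upper]."""
    def wrapped(values):
        return [min(max(x, lower), upper) for x in operator(values)]
    return wrapped

def encode_stage(a, b, predict, update):
    a = [x + p for x, p in zip(a, predict(b))]
    b = [y + u for y, u in zip(b, update(a))]
    return a, b

def decode_stage(a, b, predict, update):
    b = [y - u for y, u in zip(b, update(a))]
    a = [x - p for x, p in zip(a, predict(b))]
    return a, b

# Hypothetical operators with large gain, tamed by the bounds.
predict = bounded(lambda v: [10.0 * x for x in v], -1.0, 1.0)
update = bounded(lambda v: [x * x for x in v], 0.0, 2.0)

a0, b0 = [0.5, -2.0], [3.0, 0.01]
a1, b1 = encode_stage(a0, b0, predict, update)
a_rec, b_rec = decode_stage(a1, b1, predict, update)
```

Clipping limits how much any single stage can change a path, and because the decoder applies the same clipped operators, the stage remains exactly invertible.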
In the above embodiment, the coded signal separation module divides the input signal into two signals with equal dimensions according to the position. In implementation, multi-channel signals such as a hyperspectral image are divided in channel dimensions; and for the natural image, dividing according to the pixel position.
In the above embodiment, the coded signal synthesis module is the inverse transform of the coded signal separation module; that is, if the output of the coded signal separation module is fed directly into the coded signal synthesis module, the original signal is recovered.
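The two separation strategies and their inverse synthesis steps can be sketched with nested lists. The specific choices here are assumptions: channels are split into a first and second half, and pixels are split into even and odd columns; the text only requires two signals of equal dimension.

```python
def split_by_channel(channels):
    """Multi-channel signals: first half of the channels vs. second half."""
    half = len(channels) // 2
    return channels[:half], channels[half:]

def merge_by_channel(first, second):
    return first + second

def split_by_position(image):
    """Natural images: even-indexed columns vs. odd-indexed columns of each row."""
    even = [row[0::2] for row in image]
    odd = [row[1::2] for row in image]
    return even, odd

def merge_by_position(even, odd):
    merged = []
    for even_row, odd_row in zip(even, odd):
        row = [0] * (len(even_row) + len(odd_row))
        row[0::2] = even_row
        row[1::2] = odd_row
        merged.append(row)
    return merged

image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
even, odd = split_by_position(image)
bands = ["band0", "band1", "band2", "band3"]
first, second = split_by_channel(bands)
```

Feeding the output of either split directly into its matching merge returns the original signal, which is the inverse relationship the embodiment describes.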
Fig. 2 is a flowchart illustrating an encoding method of a reversible self-encoder according to an embodiment of the present invention. The method can be used in the lifting structure-based reversible self-coding structure shown in fig. 1. Referring to fig. 2, the encoding method based on the reversible self-encoder includes the following steps: separating an input image or a high-dimensional signal; decomposing the separated signals by adopting a cascade reversible coding module, wherein the cascade reversible coding module comprises a plurality of stages of reversible coding sub-modules based on a lifting structure, and the two-way output of the reversible coding sub-module at the previous stage is used as the input of the reversible coding sub-module at the next stage; and recombining the two paths of signals processed by the cascade reversible coding submodule into one path of signal.
Correspondingly, the reversible coding submodule comprises a prediction operator based on a convolutional network and an update operator based on a convolutional network. The two inputs are denoted a_0 and b_0, and the prediction operator and the update operator based on the convolutional network are denoted respectively as
[Formula image BDA0002344958470000081]
and
[Formula image BDA0002344958470000082]
The specific operations in the reversible coding submodule are as follows:
[Formula image BDA0002344958470000083]
[Formula image BDA0002344958470000084]
In the above formulas, a_1 and b_1 are the two outputs of the lifting structure-based reversible coding submodule, and μ and θ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator, respectively.
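The formulas themselves survive only as image placeholders in this text. One form consistent with the verbal description of the prediction and update steps is sketched below; the operator symbols P_μ and U_θ and the choice of which path feeds the prediction operator are assumptions, not the notation of the original figures.

```latex
\begin{aligned}
a_1 &= a_0 + P_{\mu}(b_0), \\
b_1 &= b_0 + U_{\theta}(a_1),
\end{aligned}
\qquad\text{with inverse}\qquad
\begin{aligned}
b_0 &= b_1 - U_{\theta}(a_1), \\
a_0 &= a_1 - P_{\mu}(b_0).
\end{aligned}
```

Under this form the inverse reuses the same μ and θ, so no additional parameters are needed to undo the stage.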
The encoding method of the reversible self-encoder of the embodiment of the invention can realize that the input signal is transformed to a sparse transform domain to be represented, and the subsequent quantization and encoding stages are padded; at the same time, the lifting structure comprised by the coding unit provides the possibility for multiplexing parameters.
In some preferred embodiments, the encoding method based on the reversible self-encoder may further include, on the basis of the flow shown in fig. 2: controlling the number of stages of the cascaded reversible coding module and the transformation attributes of each reversible coding submodule, where the transformation attributes include the upper and lower bounds of the transformation realized by the corresponding network. Controlling the number of stages adjusts the number of trainable parameters required by the coding unit of the self-encoder; increasing the parameters within a certain range can effectively improve compression performance. In addition, by adjusting the upper and lower bounds of the transformation, the network regulation and control module can make the overall transformation robust to noise, so that the coding is more stable.
Fig. 3 is a block diagram of a decoding unit of a lifting structure-based reversible self-encoder according to an embodiment of the present invention, which can be used with the encoding unit shown in fig. 1. As shown in fig. 3, the decoding unit of the lifting structure-based reversible self-encoder includes: the decoding signal separation module separates input signals into two paths of sub signals; the cascade reversible decoding module comprises a plurality of stages of reversible decoding submodules based on a lifting structure, and is used for reconstructing two paths of sub-signals, and the two paths of output of the previous stage of reversible decoding submodule are used as the input of the next stage of reversible decoding submodule; and the decoding signal synthesis module re-synthesizes the two paths of signals processed by the cascade reversible decoding module into one path of signal.
Specifically, the reversible decoding submodule in the cascaded reversible decoding module includes a convolutional-network-based prediction operator and a convolutional-network-based update operator, wherein: the update operator adaptively transforms one of the two input signals using the nonlinear fitting capability of the convolutional network, and the result is subtracted from the other input signal to give the first of the two new output signals; the prediction operator adaptively transforms this newly generated signal using the nonlinear fitting capability of the convolutional network, and the result is subtracted from the input signal that the update operator transformed, giving the second of the two new output signals. Wherein:
Figure BDA0002344958470000091
Figure BDA0002344958470000092
The convolutional-network-based prediction operator and update operator are each implemented as a multilayer convolutional neural network, where
Figure BDA0002344958470000093
and
Figure BDA0002344958470000094
is a two-way input of a certain reversible decoding submodule,
Figure BDA0002344958470000095
and
Figure BDA0002344958470000096
is a double-path output of a certain reversible decoding submodule,
Figure BDA0002344958470000097
and
Figure BDA0002344958470000098
respectively perform the lifting and prediction transformations on $b_0$ and $a_1$, and $\mu$ and $\theta$ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator.
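As a minimal sketch of why such a step is exactly invertible, the following Python code pairs one lifting step with its inverse. The function names, the toy operators standing in for the convolutional networks, and the assumed ordering (prediction acts on b0, update acts on a1) are illustrative assumptions, not the patent's implementation.

```python
def predict(b):
    # Toy stand-in for the convolutional prediction operator (itself not invertible).
    return [abs(x) for x in b]

def update(a):
    # Toy stand-in for the convolutional update operator (itself not invertible).
    return [0.5 * x * x for x in a]

def encode_step(a0, b0):
    # Forward lifting (assumed order): a1 = a0 + P(b0), then b1 = b0 + U(a1).
    a1 = [x + p for x, p in zip(a0, predict(b0))]
    b1 = [y + u for y, u in zip(b0, update(a1))]
    return a1, b1

def decode_step(a1, b1):
    # Inverse lifting reuses the same operators and subtracts in reverse order.
    b0 = [y - u for y, u in zip(b1, update(a1))]
    a0 = [x - p for x, p in zip(a1, predict(b0))]
    return a0, b0

a1, b1 = encode_step([1.0, -2.0], [0.5, -3.0])
a0, b0 = decode_step(a1, b1)  # recovers the inputs up to floating-point rounding
```

Because the decoder calls the same `predict` and `update` functions as the encoder, no separate decoder parameters are needed, which is the parameter multiplexing the text describes.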
The decoding unit of the reversible self-encoder of the above embodiment of the present invention can transform the signal that is forward encoded by the encoding unit back to the original signal domain; meanwhile, parameters in the decoding unit can be multiplexed with parameters in the encoding unit in the above embodiment, and reversible encoding sub-modules and reversible decoding sub-modules are in one-to-one correspondence, so that the picture is restored and half of the parameter overhead is saved.
Fig. 4 is a flow chart of a decoding method of the reversible self-encoder in a preferred embodiment of the present invention. As shown in fig. 4, the decoding method based on the reversible self-encoder includes: separating the high-dimensional signal into two sub-signals; reconstructing the coded and decomposed signals by adopting a cascade reversible decoding module, wherein the cascade reversible decoding module comprises a plurality of stages of reversible decoding submodules based on a lifting structure, and the two-way output of the previous stage of reversible decoding submodule is used as the input of the next stage of reversible decoding submodule; and recombining the two paths of signals processed by the cascade reversible decoding module into one path of signal.
Specifically, the reversible decoding submodule includes a prediction operator based on a convolutional network and an update operator based on the convolutional network, and specifically operates as follows:
Figure BDA0002344958470000101
Figure BDA0002344958470000102
The convolutional-network-based prediction operator and update operator are each implemented as a multilayer convolutional neural network, where
Figure BDA0002344958470000103
and
Figure BDA0002344958470000104
is a two-way input of a certain reversible decoding module,
Figure BDA0002344958470000105
and
Figure BDA0002344958470000106
is a dual output of a certain reversible decoding module,
Figure BDA0002344958470000107
and
Figure BDA0002344958470000108
respectively perform the lifting and prediction transformations on $b_0$ and $a_1$, and $\mu$ and $\theta$ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator.
The decoding method based on the reversible self-encoder of the embodiment of the invention can realize the conversion of the signal coded by the coding method back to the original signal domain; meanwhile, the parameters in the decoding method can be multiplexed with the parameters in the encoding method of the above embodiment, so that the picture is restored and half of the parameter overhead is saved.
Fig. 5 is a block diagram of a reversible self-encoder according to an embodiment of the present invention. As shown in fig. 5, the reversible self-encoder includes: the encoding signal separation module, the cascade reversible encoding module, the encoding signal synthesis module, the decoding signal separation module, the cascade reversible decoding module and the decoding signal synthesis module, wherein: the coded signal separation module separates the input image or high-dimensional signal; the cascade reversible coding module comprises a plurality of stages of reversible coding sub-modules based on a lifting structure, signals obtained by separation of the coded signal separation module are decomposed, and two-way output of a previous stage of reversible coding sub-module is used as input of a next stage of reversible coding sub-module; the coded signal synthesis module re-synthesizes the two paths of signals processed by the cascade reversible coding module into one path of signal; the decoding signal separation module separates the synthesized high-dimensional signal into two-path signals and adapts to the input of the cascade decoding module; reconstructing a coded and decomposed signal by a cascade reversible decoding module, wherein the module comprises a plurality of stages of reversible decoding submodules based on a lifting structure, and the two-way output of the previous stage of reversible decoding submodule is used as the input of the next stage of reversible decoding submodule; and the decoding signal synthesis module is used for re-synthesizing the two paths of signals processed by the cascade reversible coding module and the cascade reversible decoding module into one path of signal.
In the reversible self-encoder in the above embodiment, the number of stages of the cascaded reversible decoding module is the same as the number of reversible encoding sub-modules in the cascaded reversible encoding module, the reversible decoding sub-modules correspond to the reversible encoding sub-modules one to one, the parameters of the corresponding pair of reversible decoding sub-modules are the same as the parameters of the reversible encoding sub-modules, and the order of the reversible decoding sub-modules in the cascaded reversible decoding module is opposite to the order of the reversible encoding sub-modules in the cascaded reversible encoding module.
Specifically, the reversible coding submodule includes: a convolutional-network-based prediction operator, which adaptively transforms one of the two input signals using the nonlinear fitting capability of the convolutional network and sums the result with the other input to obtain the output corresponding to that input, which serves as one of the two new output signals; and a convolutional-network-based update operator, which takes the output generated by the prediction operator and the summation as its input, adaptively transforms it using the nonlinear fitting capability of the convolutional network, and sums the result with the other input signal to give the second of the two new output signals. The two-way input is denoted $a_0$ and $b_0$, and the prediction operator and the update operator are respectively denoted
Figure BDA0002344958470000111
And
Figure BDA0002344958470000112
the specific operation in the reversible coding submodule is as follows:
Figure BDA0002344958470000113
Figure BDA0002344958470000114
In the above formulas, $a_1$ and $b_1$ are the two-way output of the lifting-structure-based reversible coding submodule, and $\mu$ and $\theta$ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator, respectively;
the lifting-structure-based reversible decoding submodule multiplexes the convolutional-network-based prediction operator and update operator of the corresponding coding submodule, and operates as follows:
Figure BDA0002344958470000115
Figure BDA0002344958470000116
The convolutional-network-based prediction operator and update operator are each implemented as a multilayer convolutional neural network, where
Figure BDA0002344958470000117
and
Figure BDA0002344958470000118
is a two-way input of a certain reversible decoding module,
Figure BDA0002344958470000119
and
Figure BDA00023449584700001110
is a dual output of a certain reversible decoding module,
Figure BDA00023449584700001111
and
Figure BDA00023449584700001112
respectively perform the lifting and prediction transformations on $b_0$ and $a_1$, and $\mu$ and $\theta$ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator.
Specifically, in the above embodiment, the reversible decoding sub-modules and the reversible coding sub-modules are in one-to-one correspondence and arranged in the following order:
Denote the set of reversible coding sub-modules in the N-stage cascaded reversible coding module as $\{A_i(\cdot,\cdot;\theta_i,\mu_i)\}_{1\le i\le N}$, and write the operation realized by the i-th module as:
$[a_i, b_i] = A_i(a_{i-1}, b_{i-1}; \theta_i, \mu_i)$
The N-stage cascaded reversible coding module then implements:
$[a_N, b_N] = A_N(A_{N-1}(\cdots A_1(a_0, b_0; \theta_1, \mu_1)\cdots; \theta_{N-1}, \mu_{N-1}); \theta_N, \mu_N)$
Denote the set of reversible decoding sub-modules as $\{B_i(\cdot,\cdot;\theta_i,\mu_i)\}_{1\le i\le N}$, and write the operation realized by the i-th module as:
Figure BDA00023449584700001113
The N-stage cascaded reversible decoding module then implements:
Figure BDA00023449584700001114
where $\{\theta_i, \mu_i\}_{1\le i\le N}$ are the learnable parameters shared (multiplexed) by the reversible decoding sub-modules and the reversible coding sub-modules.
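The two decoding formulas in this passage survive only as image placeholders. A reading that mirrors the encoding formulas, assuming (as a notational choice not confirmed by the text) that hats mark decoder-side signals and that each $B_i$ inverts the corresponding $A_i$, would be:

```latex
\begin{aligned}
[\hat{a}_{i-1}, \hat{b}_{i-1}] &= B_i(\hat{a}_i, \hat{b}_i; \theta_i, \mu_i), \\
[\hat{a}_0, \hat{b}_0] &= B_1(B_2(\cdots B_N(\hat{a}_N, \hat{b}_N; \theta_N, \mu_N)\cdots; \theta_2, \mu_2); \theta_1, \mu_1).
\end{aligned}
```

The nesting makes the reversed order explicit: the decoder applies $B_N$ first and $B_1$ last, each with the same parameters as its coding counterpart.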
The lifting-structure-based reversible self-encoder of this embodiment realizes parameter multiplexing between the encoding end and the decoding end. It can replace the self-encoder structures that occupy a large number of parameters in end-to-end image compression methods, and can save half of the storage cost.
In some preferred embodiments, the reversible self-encoder may further include a network regulation and control module, which controls the number of stages of the cascaded reversible coding module and the cascaded reversible decoding module, as well as the transformation attributes of each pair of reversible decoding and reversible coding submodules. The transformation attributes include the upper and lower bounds of the transformation realized by the corresponding network. Furthermore, the network regulation and control module controls the upper and lower transformation bounds of each pair of coding/decoding sub-modules through the upper and lower transformation bounds of the convolutional-neural-network-based prediction operator and update operator. Specifically, penalty terms proportional to the two-norms of the convolution kernels of the convolutional layers in the predictor and the update operator are added to the loss function during training.
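The penalty described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the two-norm is squared or how it is computed for a four-dimensional kernel, so the sketch treats each kernel as a flat list of weights and uses its Euclidean norm; `kernel_l2_norm`, `regularized_loss`, and the weight `lam` are hypothetical names.

```python
import math

def kernel_l2_norm(kernel):
    # Euclidean norm of a kernel given as a flat list of weights (assumed reading of "two-norm").
    return math.sqrt(sum(w * w for w in kernel))

def regularized_loss(base_loss, kernels, lam):
    # Training loss plus a penalty proportional to the sum of kernel norms.
    return base_loss + lam * sum(kernel_l2_norm(k) for k in kernels)

# Two toy kernels with norms 5 and 0; lam is a hypothetical penalty weight.
loss = regularized_loss(1.0, [[3.0, 4.0], [0.0, 0.0]], lam=0.1)
```

A larger `lam` pushes the kernels toward smaller norms, which is one way to tighten the bounds of the transformation realized by each operator.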
In the above embodiment, the encoding or decoding signal separation module divides the input signal into two signals with equal dimensions according to the position. In implementation, multi-channel signals such as a hyperspectral image are divided in channel dimensions; and for the natural image, dividing according to the pixel position.
In the above embodiment, the encoding or decoding signal synthesizing module is an inverse transform of the signal separating module, that is, the input signal is directly input to the encoding or decoding signal synthesizing module after passing through the encoding or decoding signal separating module, so as to obtain the original signal.
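A minimal sketch of the separation and synthesis modules, assuming a signal represented as a Python list: channel-wise separation takes the first and second halves of the channel list, and position-wise separation is illustrated here with even and odd indices along one row (the exact pixel pattern is not specified in the text, so this choice is an assumption). All function names are hypothetical.

```python
def split_channels(channels):
    # Channel-wise separation: first half and second half of the channel list.
    half = len(channels) // 2
    return channels[:half], channels[half:]

def merge_channels(a, b):
    # Synthesis is the inverse of separation: concatenate the two halves.
    return a + b

def split_positions(row):
    # Position-wise separation: even-indexed and odd-indexed samples.
    return row[0::2], row[1::2]

def merge_positions(even, odd):
    # Interleave the two branches back into their original positions.
    out = [None] * (len(even) + len(odd))
    out[0::2] = even
    out[1::2] = odd
    return out

a, b = split_channels(list(range(128)))
restored_channels = merge_channels(a, b)
even, odd = split_positions([10, 11, 12, 13])
restored_row = merge_positions(even, odd)
```

Both pairs satisfy the property stated above: passing a signal through separation and then synthesis returns the original signal, provided the input has an even number of channels or samples.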
Fig. 6 is a flowchart of a reversible self-encoding and decoding method according to an embodiment of the invention. The method can be used in the reversible self-encoder shown in fig. 5.
Specifically, as shown in the embodiment of fig. 6, a reversible self-encoding and decoding method based on a lifting structure includes the following steps: separating an input image or a high-dimensional signal; decomposing the separated signals by adopting a cascade reversible coding module, wherein the cascade reversible coding module comprises a plurality of stages of reversible coding modules based on a lifting structure, and the two-way output of the previous stage of reversible coding module is used as the input of the next stage of reversible coding module; recombining the two paths of signals processed by the cascade reversible coding module into one path of signal; separating the synthesized high-dimensional signal into two paths of signals, and adapting to the input of a cascade decoding module; reconstructing the coded and decomposed signals by adopting a cascade reversible decoding module, wherein the cascade reversible decoding module comprises a plurality of stages of reversible decoding modules based on a lifting structure, and the two-way output of the previous stage of reversible decoding module is used as the input of the next stage of reversible decoding module; and recombining the two paths of signals processed by the cascade reversible coding module and the cascade reversible decoding module into one path of signal.
Referring to fig. 7, in the above embodiment, the number of stages of the cascaded reversible decoding modules is the same as the number of reversible encoding modules in the cascaded reversible encoding modules, the reversible decoding modules correspond to the reversible encoding modules one to one, and a pair of corresponding reversible decoding modules have the same parameter as the reversible encoding modules, wherein an arrangement order of the reversible decoding modules in the cascaded reversible decoding modules is opposite to an arrangement order of the reversible encoding modules in the cascaded reversible encoding modules;
specifically, the reversible coding submodule comprises a convolutional-network-based prediction operator and a convolutional-network-based update operator; the two-way input is denoted $a_0$ and $b_0$, and the prediction operator and the update operator are respectively denoted
Figure BDA0002344958470000131
And
Figure BDA0002344958470000132
the specific operation in the reversible coding submodule is as follows:
Figure BDA0002344958470000133
Figure BDA0002344958470000134
In the above formulas, $a_1$ and $b_1$ are the two-way output of the lifting-structure-based reversible coding submodule, and $\mu$ and $\theta$ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator, respectively;
in the above embodiment, the reversible decoding submodule based on the lifting structure multiplexes the prediction operator based on the convolutional network and the update operator based on the convolutional network in the coding submodule, and specifically the operations are as follows:
Figure BDA0002344958470000135
Figure BDA0002344958470000136
The convolutional-network-based prediction operator and update operator are each implemented as a multilayer convolutional neural network, where
Figure BDA0002344958470000137
and
Figure BDA0002344958470000138
is a two-way input of a certain reversible decoding module,
Figure BDA0002344958470000139
and
Figure BDA00023449584700001310
is a dual output of a reversible decoding module,
Figure BDA00023449584700001311
and
Figure BDA00023449584700001312
respectively perform the lifting and prediction transformations on $b_0$ and $a_1$, and $\mu$ and $\theta$ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator.
In some preferred embodiments, the reversible self-coding and decoding method may further include: controlling the number of stages of the cascaded reversible coding module and the cascaded reversible decoding module, and the transformation attributes of each pair of reversible decoding and reversible coding modules. The transformation attributes include the upper and lower bounds of the transformation realized by the corresponding network. Furthermore, the upper and lower transformation bounds of each pair of coding/decoding sub-modules are controlled through the upper and lower transformation bounds of the convolutional-neural-network-based update operator and prediction operator. Specifically, a penalty term proportional to the two-norm of the convolution kernels of the convolutional layers in the update operator and the predictor is added to the loss function during training.
The reversible self-coding and decoding method of this embodiment realizes parameter multiplexing between the encoding end and the decoding end. It can replace the self-encoder structures that occupy a large number of parameters in end-to-end image compression methods, and can save half of the storage cost.
Fig. 8 is a flowchart of an image compression method in an embodiment of the present invention. As shown in fig. 8, compared with a reference neural network, an image compression method using the reversible self-encoding and decoding method can halve the number of parameters and the amount of computation while achieving the same compression effect, thereby greatly reducing storage overhead and increasing processing speed.
Specifically, the image compression method using the reversible self-coding and decoding method may be implemented according to the following steps:
S1, performing up-sampling on the input image through a convolutional layer and a ReLU layer of a convolutional neural network, and then performing signal separation to obtain an original separated signal;
S2, performing a forward transformation on the original separated signal with the cascaded reversible coding module to generate feature maps;
S3, synthesizing the two feature maps into a single feature map;
S4, quantizing and entropy coding the generated feature map to obtain a binary code stream;
S5, decoding the binary code stream to obtain a reconstructed feature map;
S6, separating the reconstructed feature map into two feature maps;
S7, performing an inverse transformation on the two reconstructed feature maps with the cascaded reversible decoding module to obtain two signals;
S8, synthesizing the two signals obtained by the inverse transformation, and reconstructing the image by down-sampling through the convolutional layer and the ReLU layer, thereby completing image compression.
In some embodiments, the quantization mode is bit-plane coding and the entropy coding is adaptive arithmetic coding or context-based adaptive arithmetic coding. In some embodiments, the manner of context modeling is a neural network or a feature extraction method.
In the embodiment of the image compression method, regarding technical features in the reversible self-coding and decoding method, reference may be made to the descriptions and the prior art in the embodiment, and further description is omitted.
Based on the image compression method, in another embodiment, an image compression apparatus is correspondingly provided, where the apparatus includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor is configured to execute the image compression method according to any one of the above descriptions.
As shown in fig. 8, 9 and 10, in order to describe the above-mentioned image compression method and the encoding and decoding techniques involved therein in more detail, the steps of the above-mentioned image compression method are expanded one by one, and it should be understood that this is only for better understanding of the technical solution of the present invention, and the embodiments of the present invention are not limited to the following specific cases. As shown in fig. 8, the overall steps of the image compression method involve the following sections.
1. Upsampling and signal separation
The natural image is the projection of a high-dimensional object on a two-dimensional plane, and in order to fully utilize the processing advantage of a convolutional neural network on a high-dimensional data structure, the up-sampling of an input image can be realized by using a convolutional layer and a ReLU layer before a signal separation step is carried out.
One convolution layer and the ReLU layer are used as a group, and two groups of the structure are used in the embodiment to realize up-sampling. Wherein, each convolution layer realizes down-sampling with the step length of 2 on the length and the width of the image and up-sampling on the channel. Specifically, the number of channels of the feature map becomes 64 after passing through the first convolutional layer, and the number of channels of the feature map becomes 128 after passing through the second convolutional layer.
After up-sampling, the feature maps are separated according to the channels, the front 64 channels form a group of feature maps, which are marked as a, and the back 64 channels form another path of feature map, which is marked as b. The two groups of characteristic graphs are the two-way input of the cascade reversible coding module.
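The shape bookkeeping of this stage can be sketched as below. It assumes the height and width are divisible by 4 and that padding is chosen so each stride-2 layer exactly halves the spatial size; neither detail is stated in the text, and `upsample_shapes` is a hypothetical helper name.

```python
def upsample_shapes(height, width, channels=3):
    # Track (H, W, C) through two stride-2 conv + ReLU groups with 64 and 128 output channels.
    shapes = [(height, width, channels)]
    for out_channels in (64, 128):
        h, w, _ = shapes[-1]
        shapes.append((h // 2, w // 2, out_channels))
    return shapes

shapes = upsample_shapes(256, 256)
# After the channel split, each branch (a and b) keeps half of the final channels.
branch_channels = shapes[-1][2] // 2
```

For a 256 x 256 color image this gives a 64 x 64 x 128 feature map, which is then split into two 64-channel branches.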
2. Cascaded reversible coding modules
As shown in fig. 9, the cascaded reversible coding module is formed by cascading reversible coding sub-modules as units. Each unit comprises a convolutional-network-based prediction operator and a convolutional-network-based update operator. The two-way input is denoted $a_0$ and $b_0$, and the prediction operator and the update operator are respectively denoted
Figure BDA0002344958470000151
And
Figure BDA0002344958470000152
the specific operation in the reversible coding submodule is as follows:
Figure BDA0002344958470000153
Figure BDA0002344958470000154
in particular, the convolution networks used in the predictors and update operators may contain convolutional layers, nonlinear layers, and Batch-Normalization layers. In this embodiment, the structure of the predictor and the update operator in each reversible coding sub-module is the same, and both the predictor and the update operator are composed of convolution layers and ReLU layers. Assuming that the input signal is x, the specific implementation of the ReLU layer is:
$\mathrm{ReLU}(x) = \max(0, x)$
accordingly, the structure of the predictor is as follows:
1) convolution layer, the size of four-dimensional convolution kernel is 3 × 3 × 64 × 64, and the step size is 1;
2) a ReLU layer;
3) convolution layer, the size of four-dimensional convolution kernel is 3 × 3 × 64 × 64, and the step size is 1;
4) a ReLU layer;
5) convolution layer, the size of four-dimensional convolution kernel is 3 × 3 × 64 × 64, and the step size is 1;
The parameters of all convolutional layers are made available for multiplexing, so that the decoding module can call them directly.
The update operator adopts the same structure. In some embodiments, the parameters of the update operator and the predictor are multiplexed to realize an orthogonal transformation, but this embodiment does not multiplex them. In this embodiment, 8 reversible coding submodules are cascaded to construct the cascaded reversible coding module. Specifically, the 8 modules are denoted in sequence as $A_1(\cdot,\cdot;\theta_1,\mu_1), A_2(\cdot,\cdot;\theta_2,\mu_2), \ldots, A_8(\cdot,\cdot;\theta_8,\mu_8)$, where the parameters of a coding sub-module are independent of its input, so "$\cdot$" is used to represent an arbitrary input. The whole cascaded reversible coding module can be written as
$[a_8, b_8] = A_8(A_7(\cdots A_1(a_0, b_0; \theta_1, \mu_1)\cdots; \theta_7, \mu_7); \theta_8, \mu_8)$
The parameter sets $\{\theta_i, \mu_i\}_{1\le i\le 8}$ involved are also used in the cascaded reversible decoding module.
3. Quantization and coding process
The quantization and coding process converts the multi-dimensional feature map into a binary code stream. Various schemes may be employed in implementations, including bit-plane coding, adaptive arithmetic coding, and the like. In this embodiment, the quantization method used is a clustering method: the feature map output by the cascaded coding module has N symbols in total (denoted $\{z_i \mid 1 \le i \le N\}$), and the N symbols are clustered to 8 center points (denoted $\{c_j \mid 1 \le j \le 8\}$); the symbol $z_i$ is then quantized as follows
Figure BDA0002344958470000167
Figure BDA0002344958470000161
To allow end-to-end training, the following soft quantization is substituted during backpropagation:
Figure BDA0002344958470000162
where σ is the soft-quantization parameter, which can be tuned so that the distribution of
Figure BDA0002344958470000168
approximates the distribution of $z_i$, thereby improving coding efficiency. The specific encoding process uses neural-network-based context coding: the conditional probabilities among symbols are predicted by a neural network and used to update the probabilities in the arithmetic coding.
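Since the quantization formulas are only image placeholders here, the following sketch shows one common way to realize clustering-based hard quantization with a differentiable soft surrogate: nearest-center assignment, and a softmax-weighted average of the centers with sharpness `sigma`. This specific softmax form, the evenly spaced centers, and the function names are assumptions, not taken from the patent.

```python
import math

def hard_quantize(z, centers):
    # Map a symbol to its nearest cluster center.
    return min(centers, key=lambda c: abs(z - c))

def soft_quantize(z, centers, sigma):
    # Softmax-weighted average of centers; approaches hard_quantize as sigma grows.
    logits = [-sigma * abs(z - c) for c in centers]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in logits]
    return sum(w * c for w, c in zip(weights, centers)) / sum(weights)

# Eight hypothetical, evenly spaced centers.
centers = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
hard = hard_quantize(0.3, centers)
soft_sharp = soft_quantize(0.3, centers, sigma=50.0)
soft_smooth = soft_quantize(0.3, centers, sigma=1.0)
```

With a large `sigma` the soft value is numerically indistinguishable from the hard one, while a small `sigma` gives a smoother function whose gradient is more useful during training.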
4. Inverse quantization and inverse coding process
The inverse quantization and inverse coding process is the inverse process of the quantization and coding process, and the recovered two-way reconstruction characteristic diagram is obtained
Figure BDA0002344958470000163
5. Cascaded reversible decoding module
As shown in fig. 10, the cascaded reversible decoding module is the inverse process of the cascaded reversible coding module. It takes the reversible decoding sub-module as its unit and multiplexes the convolutional-network-based prediction operator and update operator of the coding module. The specific operations are as follows:
Figure BDA0002344958470000164
Figure BDA0002344958470000165
Corresponding to the cascaded reversible coding module comprising 8 units, the 8 reversible decoding submodules are denoted, in the order in which the signal passes through them, as $B_8(\cdot,\cdot;\theta_8,\mu_8), \ldots, B_2(\cdot,\cdot;\theta_2,\mu_2), B_1(\cdot,\cdot;\theta_1,\mu_1)$, and the entire cascaded reversible decoding module can then be written as
Figure BDA0002344958470000166
The above formula shows the one-to-one correspondence of 8 units.
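The reversed ordering and shared parameters can be illustrated with a small Python sketch of an 8-stage cascade. The toy operator pair produced by `make_ops`, its single `scale` parameter, and the ordering inside each stage (prediction on the b branch first) are illustrative assumptions standing in for the convolutional networks.

```python
import math

def make_ops(scale):
    # Toy operator pair for one stage; one shared object serves encoder and decoder.
    predict = lambda b: [scale * math.tanh(x) for x in b]
    update = lambda a: [scale * abs(x) for x in a]
    return predict, update

def encode(a, b, stages):
    # Apply A_1, ..., A_N in order.
    for predict, update in stages:
        a = [x + p for x, p in zip(a, predict(b))]
        b = [y + u for y, u in zip(b, update(a))]
    return a, b

def decode(a, b, stages):
    # Apply B_N, ..., B_1: the same operators, in reversed order, with subtraction.
    for predict, update in reversed(stages):
        b = [y - u for y, u in zip(b, update(a))]
        a = [x - p for x, p in zip(a, predict(b))]
    return a, b

stages = [make_ops(0.1 * (i + 1)) for i in range(8)]
a8, b8 = encode([1.0, -2.0], [0.5, 3.0], stages)
a0, b0 = decode(a8, b8, stages)  # matches the inputs up to floating-point rounding
```

The same `stages` list is passed to both functions, so the decoder stores no parameters of its own.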
6. Signal synthesis and down-sampling
In order to restore the reconstructed image, the two feature maps are synthesized along the channel dimension, that is, the two groups of feature maps are concatenated by channel into a new feature map. This feature map has 128 channels, and its length and width are each $\frac{1}{4}$ of those of the original image, since two stride-2 layers were applied during up-sampling.
Corresponding to the up-sampling process, the down-sampling process uses deconvolution layers and ReLU layers. One deconvolution layer and one ReLU layer form a group, and two such groups are used in this embodiment to realize down-sampling. Each deconvolution layer up-samples the length and width of the image with a stride of 2 and down-samples the channels. Specifically, the number of channels of the feature map becomes 64 after the first deconvolution layer, and after the second deconvolution layer it becomes 3 for a color image or 1 for a grayscale image.
7. The experimental results are as follows:
the encoding end and the decoding end in this embodiment are separated, and the intermediate binary code stream is stored as a compressed file. Assuming that the input image is H × W × C and the intermediate binary code stream is B bits, the compression rate of the image is
Figure BDA0002344958470000172
bits per pixel (bpp), and image quality is measured by MS-SSIM, an index designed to reflect subjective (perceptual) quality. Averaging these two indexes over the test set yields the rate-distortion (compression rate versus loss) curve of the image compression method.
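As a worked example of the rate measure, the sketch below computes bits per pixel under the usual convention of dividing the code-stream length by the number of pixels H x W; because the formula itself is an image placeholder in this text, that normalization is an assumption, and `bits_per_pixel` is a hypothetical helper.

```python
def bits_per_pixel(num_bits, height, width):
    # Compressed size in bits divided by the number of pixels (assumed normalization).
    return num_bits / (height * width)

# A 256 x 256 image compressed to 65536 bits (8 KiB).
rate = bits_per_pixel(65536, 256, 256)
```

Here the rate is exactly 1 bpp, since 65536 = 256 x 256.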
To verify the effectiveness of the method of this embodiment, a reference network based on a residual network is included in the experimental comparison and compared with the lifting-structure-based reversible self-encoder provided by this embodiment of the invention. Specifically, each of its units may be implemented as
Figure BDA0002344958470000173
where
Figure BDA0002344958470000174
the structure is as follows:
1) convolution layer, the size of four-dimensional convolution kernel is 3 × 3 × 90 × 90, and the step size is 1;
2) a ReLU layer;
3) convolution layer, the size of four-dimensional convolution kernel is 3 × 3 × 90 × 90, and the step size is 1;
4) a ReLU layer;
5) convolution layer, the size of four-dimensional convolution kernel is 3 × 3 × 90 × 90, and the step size is 1;
At the encoding end, 8 residual blocks replace the cascaded reversible coding module, and at the decoding end, 8 residual blocks replace the cascaded reversible decoding module; the residual blocks do not multiplex parameters. In this way, the parameter count of the method of this embodiment is roughly half that of the reference model: the lifting-structure-based reversible self-encoder has about 1.24 million parameters, while the reference network used for comparison has about 2.91 million. The experimental results are shown in fig. 11. Experiments show that the encoder and the decoder in this embodiment of the invention can replace the self-encoder and decoder structures that occupy a large number of parameters in end-to-end image compression methods, realize parameter multiplexing between the encoder end and the decoder end, and save about half of the storage cost.
It should be noted that the steps of the method provided by the present invention may be implemented by the corresponding modules, devices, units, and the like in the system, and those skilled in the art may refer to the technical solution of the system to implement the step flow of the method; that is, the embodiments of the system may be understood as preferred examples for implementing the method, and details are not repeated here.
Those skilled in the art will appreciate that, besides implementing the system and its devices provided by the present invention purely as computer-readable program code, the method steps can be logically programmed so that the system and its devices realize the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, the system and its devices provided by the present invention can be regarded as a hardware component, and the devices included therein for realizing various functions can also be regarded as structures within the hardware component; the devices for realizing various functions can also be regarded both as software modules for performing the method and as structures within the hardware component.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The above-described preferred features may be used in any combination without conflict with each other.

Claims (20)

1. An encoding unit of a lifting structure-based reversible self-encoder, comprising:
a coded signal separation module which separates an input image or a high-dimensional signal into two sub-signals;
a cascade reversible coding module, which comprises a plurality of stages of reversible coding sub-modules based on a lifting structure and decomposes the two sub-signals obtained by the coded signal separation module, the output of the reversible coding sub-module at the previous stage being used as the input of the reversible coding sub-module at the next stage;
and a coded signal synthesis module, which re-synthesizes the two sub-signals processed by the cascade reversible coding module into one signal.
2. The coding unit of the lifting structure-based reversible self-encoder according to claim 1, characterized in that the reversible coding submodule comprises:
a prediction operator based on a convolutional network, which adaptively transforms one path of the two-path input signal by utilizing the nonlinear fitting capacity of the convolutional network and sums the result with the other input path, the sum being used as one path of the new two-path output signal;
and an update operator based on a convolutional network, which takes as input the output path generated by the prediction operator and the summation, adaptively transforms it by utilizing the nonlinear fitting capacity of the convolutional network, and sums the result with the remaining input path, the sum being used as the second path of the new two-path output signal.
3. The encoding unit of the lifting structure-based reversible self-encoder according to claim 1 or 2, further comprising:
a network regulation and control module, which controls the number of stages of the cascade reversible coding module and the transformation attributes of each reversible coding submodule.
4. A method of encoding with a reversible self-encoder, comprising:
separating an input image or a high-dimensional signal into two sub-signals;
decomposing the two paths of sub-signals obtained by separation by adopting a cascade reversible coding module, wherein the cascade reversible coding module comprises a plurality of stages of reversible coding sub-modules based on a lifting structure, and the two paths of output of the reversible coding sub-module at the previous stage are used as the input of the reversible coding sub-module at the next stage;
and recombining the two paths of sub-signals processed by the cascade reversible coding module into one path of signal.
5. The encoding method of a reversible self-encoder according to claim 4, characterized in that the reversible coding submodule comprises a convolutional network based predictor and a convolutional network based update operator; the two-way input is denoted as $a_0$ and $b_0$, and the predictor and the update operator based on the convolutional network are respectively denoted as
Figure FDA0002344958460000021
and
Figure FDA0002344958460000022
the specific operation in the reversible coding submodule is as follows:
Figure FDA0002344958460000023
Figure FDA0002344958460000024
in the above formulas, $a_1$ and $b_1$ are the two-way output of the reversible coding submodule based on the lifting structure, and μ and θ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator.
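The encoding step of claim 5 can be sketched as follows. This is only an illustration under stated assumptions: the formula images are not reproduced in this text, so the choice of which path is predicted first is assumed, and `make_toy_operator` is a hypothetical stand-in (a per-pixel channel mix followed by tanh) for the multilayer convolutional networks.

```python
import numpy as np


def make_toy_operator(channels, seed):
    """Hypothetical stand-in for a small convolutional network:
    a fixed per-pixel channel mix followed by tanh."""
    weight = np.random.default_rng(seed).standard_normal((channels, channels)) * 0.1

    def operator(x):
        # x has shape (channels, height, width)
        return np.tanh(np.einsum("oc,chw->ohw", weight, x))

    return operator


def lifting_encode_step(a0, b0, predict, update):
    """One lifting step built only from additions (assumed path order)."""
    b1 = b0 + predict(a0)  # prediction: transform one path, add it to the other
    a1 = a0 + update(b1)   # update: transform the new output, add it to the remaining input
    return a1, b1


rng = np.random.default_rng(0)
a0 = rng.standard_normal((4, 8, 8))
b0 = rng.standard_normal((4, 8, 8))
predict = make_toy_operator(4, seed=1)
update = make_toy_operator(4, seed=2)
a1, b1 = lifting_encode_step(a0, b0, predict, update)
```

Because each output is an input plus a function of the other path, the operators themselves never need to be inverted.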
6. The encoding method of a reversible self-encoder according to claim 4 or 5, characterized in that it further comprises: controlling the number of stages of the cascade reversible coding module and the transformation attributes of each reversible coding submodule.
7. A decoding unit of a lifting structure-based reversible self-encoder, comprising:
a decoded signal separation module which separates an input signal into two sub-signals;
a cascade reversible decoding module, which comprises a plurality of stages of reversible decoding submodules based on a lifting structure and reconstructs the coded and decomposed signal, the output of the reversible decoding submodule at the previous stage being used as the input of the reversible decoding submodule at the next stage;
and a decoded signal synthesis module, which re-synthesizes the two sub-signals processed by the cascade reversible decoding module into one signal.
8. The decoding unit of the lifting structure-based reversible self-encoder according to claim 7, wherein said reversible decoding submodule comprises:
an update operator based on a convolutional network, which adaptively transforms one path of the two-path input signal by utilizing the nonlinear fitting capacity of the convolutional network and takes its difference with the other input path, the difference being used as one path of the new two-path output signal;
and a prediction operator based on a convolutional network, which adaptively transforms the newly generated path by utilizing the nonlinear fitting capacity of the convolutional network and takes its difference with the remaining input path, the difference being used as the second path of the new two-path output signal.
9. A decoding method of a reversible self-encoder, comprising:
separating the high-dimensional signal into two sub-signals;
reconstructing the high-dimensional signal by adopting a cascade reversible decoding module, wherein the cascade reversible decoding module comprises a plurality of stages of reversible decoding submodules based on a lifting structure, and the two-way output of the previous stage of reversible decoding submodule is used as the input of the next stage of reversible decoding submodule;
and recombining the two paths of sub-signals processed by the cascade reversible decoding module into one path of signal.
10. The decoding method of a reversible self-encoder according to claim 9, characterized in that said reversible decoding submodule comprises a convolutional network based predictor and a convolutional network based update operator, and the operations are as follows:
Figure FDA0002344958460000025
Figure FDA0002344958460000031
the prediction operator based on the convolutional network and the update operator based on the convolutional network are implemented as multilayer convolutional neural networks; wherein
Figure FDA0002344958460000032
and
Figure FDA0002344958460000033
are the two-way input of a given reversible decoding submodule,
Figure FDA0002344958460000034
and
Figure FDA0002344958460000035
are the two-way output of that reversible decoding submodule,
Figure FDA0002344958460000036
and
Figure FDA0002344958460000037
respectively denote the lifting and prediction transforms applied to
Figure FDA0002344958460000038
and
Figure FDA0002344958460000039
and
Figure FDA00023449584600000310
and
Figure FDA00023449584600000311
are the learnable parameter sets of the convolutional networks corresponding to the predictor and the update operator.
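The decoding step can be illustrated by undoing an encoding step with subtractions in the opposite order. The sketch below assumes the same path ordering as the summation-based encoding step and uses two deliberately non-invertible toy functions (`predict`, `update`, both hypothetical) in place of the convolutional networks, to show that reconstruction does not depend on inverting the operators.

```python
import numpy as np


def predict(x):
    # Hypothetical non-invertible operator standing in for the predictor network.
    return np.sin(x)


def update(x):
    # Hypothetical non-invertible operator standing in for the update network.
    return 0.5 * x ** 2


def encode_step(a0, b0):
    b1 = b0 + predict(a0)
    a1 = a0 + update(b1)
    return a1, b1


def decode_step(a1, b1):
    a0 = a1 - update(b1)   # undo the update branch first
    b0 = b1 - predict(a0)  # then undo the prediction branch
    return a0, b0


rng = np.random.default_rng(0)
a0, b0 = rng.standard_normal((2, 3, 5, 5))
a_rec, b_rec = decode_step(*encode_step(a0, b0))
# Largest absolute reconstruction error; limited only by floating-point rounding.
max_error = max(np.abs(a_rec - a0).max(), np.abs(b_rec - b0).max())
```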
11. A reversible self-encoder, characterized in that it comprises:
a coded signal separation module which separates an input image or a high-dimensional signal into two sub-signals;
a cascade reversible coding module, which comprises a plurality of stages of reversible coding sub-modules based on a lifting structure and decomposes the signals obtained by the coded signal separation module, the two-way output of the reversible coding sub-module at the previous stage being used as the input of the reversible coding sub-module at the next stage;
a coded signal synthesis module, which recombines the two paths of sub-signals processed by the cascade reversible coding module into one path of signal;
a decoded signal separation module which decomposes the synthesized signal of the encoded signal synthesis module into two sub-signals;
a cascade reversible decoding module, which reconstructs the two-way sub-signals obtained by the decoded signal separation module and comprises a plurality of stages of reversible decoding sub-modules based on a lifting structure, the two-way output of the reversible decoding sub-module at the previous stage being used as the input of the reversible decoding sub-module at the next stage; the number of stages of the cascade reversible decoding module is the same as the number of reversible coding sub-modules in the cascade reversible coding module, the reversible decoding sub-modules correspond one-to-one to the reversible coding sub-modules, and each corresponding pair of reversible decoding and reversible coding sub-modules has the same parameters, wherein the reversible decoding sub-modules are arranged in the cascade reversible decoding module in the reverse order of the reversible coding sub-modules in the cascade reversible coding module;
and a decoded signal synthesis module, which re-synthesizes the two paths of sub-signals processed by the cascade reversible decoding module into one path of signal.
12. The reversible self-encoder according to claim 11, characterized in that said reversible encoding submodule comprises:
a prediction operator based on a convolutional network, which adaptively transforms one path of the two-path input signal by utilizing the nonlinear fitting capacity of the convolutional network and sums the result with the other input path, the sum being used as one path of the new two-path output signal;
an update operator based on a convolutional network, which takes as input the output path generated by the prediction operator and the summation, adaptively transforms it by utilizing the nonlinear fitting capacity of the convolutional network, and sums the result with the remaining input path, the sum being used as the second path of the new two-path output signal;
the reversible decoding submodule multiplexes the convolutional-network-based prediction operator and update operator of the reversible coding submodule; through parameter sharing, the parameters of the predictor and the update operator in the reversible coding submodule are called to realize the convolutional-network-based predictor and update operator in the reversible decoding submodule.
13. The reversible self-encoder according to claim 11 or 12, characterized in that it further comprises:
a network regulation and control module, which controls the number of stages of the cascade reversible coding module and the cascade reversible decoding module and the transformation attributes of each pair of reversible decoding and reversible coding sub-modules.
14. The reversible self-encoder according to claim 13, wherein the reversible decoding sub-modules and the reversible encoding sub-modules are in one-to-one correspondence and arranged in the following order:
denoting the set of reversible coding sub-modules in the N-stage cascade reversible coding module as $\{A_i(\cdot,\cdot;\theta_i,\mu_i)\}_{1\le i\le N}$, the operation realized by the i-th sub-module is written as:
$[a_i, b_i] = A_i(a_{i-1}, b_{i-1}; \theta_i, \mu_i)$
so that the N-stage cascade reversible coding module realizes:
$[a_N, b_N] = A_N(A_{N-1}(\cdots A_1(a_0, b_0; \theta_1, \mu_1) \cdots; \theta_{N-1}, \mu_{N-1}); \theta_N, \mu_N)$
in addition, denoting the set of reversible decoding sub-modules as $\{B_i(\cdot,\cdot;\theta_i,\mu_i)\}_{1\le i\le N}$, the operation realized by the i-th sub-module is written as:
Figure FDA0002344958460000041
then the N-stage cascade reversible decoding module realizes:
Figure FDA0002344958460000042
wherein $\{\theta_i,\mu_i\}_{1\le i\le N}$ are the learnable parameters multiplexed by the reversible decoding sub-modules and the reversible coding sub-modules;
the transformation attributes in the network regulation and control module comprise the upper and lower bounds of the transformation corresponding to the network.
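The one-to-one correspondence and reversed ordering of claim 14 can be sketched with a single parameter list that both cascades read. The scalar parameters, the tanh operators, the path ordering inside each stage, and the choice of N = 8 are illustrative assumptions; in the claims each $(\theta_i, \mu_i)$ is the weight set of a convolutional network.

```python
import numpy as np


def predict(x, mu):
    # Toy predictor parameterised by a scalar mu (assumption for illustration).
    return np.tanh(mu * x)


def update(x, theta):
    # Toy update operator parameterised by a scalar theta (assumption for illustration).
    return np.tanh(theta * x)


def cascade_encode(a, b, params):
    for theta, mu in params:            # stages 1..N
        b = b + predict(a, mu)
        a = a + update(b, theta)
    return a, b


def cascade_decode(a, b, params):
    for theta, mu in reversed(params):  # stages N..1, same parameters
        a = a - update(b, theta)
        b = b - predict(a, mu)
    return a, b


rng = np.random.default_rng(0)
params = [(rng.normal(), rng.normal()) for _ in range(8)]  # one (theta_i, mu_i) per stage
a0, b0 = rng.standard_normal((2, 16))
aN, bN = cascade_encode(a0, b0, params)
a_rec, b_rec = cascade_decode(aN, bN, params)
```

Only `params` needs to be stored, since the decoder holds no weights of its own; this is the parameter multiplexing the claims describe.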
15. A reversible self-coding and decoding method is characterized by comprising the following steps:
separating an input image or a high-dimensional signal into two sub-signals;
decomposing the two paths of sub-signals obtained by separation by adopting a cascade reversible coding module, wherein the cascade reversible coding module comprises a plurality of stages of reversible coding sub-modules based on a lifting structure, and the two paths of output of the reversible coding sub-module at the previous stage are used as the input of the reversible coding sub-module at the next stage;
re-synthesizing the two paths of sub-signals processed by the cascade reversible coding module into one path of signal to obtain a synthesized high-dimensional signal;
separating the synthesized high-dimensional signal into two paths of sub-signals, and adapting to the input of a cascade reversible decoding module;
reconstructing, by a cascade reversible decoding module, the two-way sub-signals separated from the synthesized high-dimensional signal, wherein the cascade reversible decoding module comprises a plurality of stages of reversible decoding sub-modules based on a lifting structure, and the two-way output of the reversible decoding sub-module at the previous stage is used as the input of the reversible decoding sub-module at the next stage; wherein: the number of reversible decoding sub-modules is the same as the number of reversible coding sub-modules in the cascade reversible coding module, the reversible decoding sub-modules correspond one-to-one to the reversible coding sub-modules, and each corresponding pair of reversible decoding and reversible coding sub-modules has the same parameters, wherein the reversible decoding sub-modules are arranged in the cascade reversible decoding module in the reverse order of the reversible coding sub-modules in the cascade reversible coding module;
and recombining the two paths of sub-signals processed by the cascade reversible decoding module into one path of signal.
16. The reversible self-coding and decoding method according to claim 15, wherein the reversible coding sub-module comprises a convolutional network based predictor and a convolutional network based update operator; the two-way input is denoted as $a_0$ and $b_0$, and the predictor and the update operator based on the convolutional network are respectively denoted as
Figure FDA0002344958460000051
and
Figure FDA0002344958460000052
the specific operation in the reversible coding submodule is as follows:
Figure FDA0002344958460000053
Figure FDA0002344958460000054
in the above formulas, $a_1$ and $b_1$ are the two-way output of the reversible coding submodule based on the lifting structure, and μ and θ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator;
the reversible decoding submodule multiplexes the convolutional-network-based prediction operator and update operator of the reversible coding submodule, and operates as follows:
Figure FDA0002344958460000055
Figure FDA0002344958460000056
the prediction operator based on the convolutional network and the update operator based on the convolutional network are specifically implemented as multilayer convolutional neural networks; wherein
Figure FDA0002344958460000057
and
Figure FDA0002344958460000058
are the two-way input of a given reversible decoding submodule,
Figure FDA0002344958460000059
and
Figure FDA00023449584600000510
are the two-way output of that reversible decoding submodule,
Figure FDA00023449584600000511
and
Figure FDA00023449584600000512
respectively denote the lifting and prediction transforms applied to
Figure FDA00023449584600000513
and
Figure FDA00023449584600000514
and μ and θ are the learnable parameter sets of the convolutional networks corresponding to the prediction operator and the update operator multiplexed from the reversible coding submodule.
17. The reversible self-coding and decoding method according to claim 15 or 16, further comprising:
controlling the number of levels of the cascaded reversible encoding modules and the cascaded reversible decoding modules and the transformation properties of each pair of the reversible decoding modules and the reversible encoding modules.
18. An image compression method, characterized in that the reversible self-coding and decoding method according to any one of claims 15 to 17 is used.
19. The image compression method according to claim 18, comprising:
up-sampling an input image through a convolutional layer and a ReLU layer of a convolutional neural network, and then performing signal separation to obtain original separated signals;
forward coding the original separated signals with a cascade reversible coding module to generate a two-way feature map;
synthesizing the obtained two-way feature map into a one-way feature map;
quantizing and entropy coding the generated one-way feature map to obtain a binary code stream;
decoding the binary code stream to obtain a reconstructed feature map;
separating the reconstructed feature map into two feature maps;
reversely decoding the two reconstructed feature maps with a cascade reversible decoding module to obtain two reconstructed signals;
and synthesizing the two reconstructed signals obtained by reverse decoding, and obtaining the reconstructed image by down-sampling through a convolution layer and a ReLU layer.
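The core of the pipeline in claim 19 (separate, forward code, synthesize, quantize, separate, reverse decode, synthesize) can be sketched as below. Everything here is an assumption made for illustration: the channel-wise `split`, the scalar toy operators, the uniform quantizer with step 0.1, and the omission of the convolution/ReLU pre- and post-processing and of the entropy coder (which is lossless and therefore does not change the feature values).

```python
import numpy as np


def split(x):
    """Separate a (channels, H, W) feature map into two halves along channels."""
    half = x.shape[0] // 2
    return x[:half], x[half:]


def merge(a, b):
    return np.concatenate([a, b], axis=0)


def predict(x, mu):
    return np.tanh(mu * x)


def update(x, theta):
    return np.tanh(theta * x)


def analysis(features, params):
    a, b = split(features)
    for theta, mu in params:
        b = b + predict(a, mu)
        a = a + update(b, theta)
    return merge(a, b)


def synthesis(latent, params):
    a, b = split(latent)
    for theta, mu in reversed(params):
        a = a - update(b, theta)
        b = b - predict(a, mu)
    return merge(a, b)


def quantize(latent, step=0.1):
    """Uniform scalar quantization; the only lossy operation in this sketch."""
    return np.round(latent / step) * step


rng = np.random.default_rng(0)
params = [(rng.normal(), rng.normal()) for _ in range(4)]
features = rng.standard_normal((6, 8, 8))
latent = analysis(features, params)
lossless = synthesis(latent, params)          # no quantization: exact up to rounding
lossy = synthesis(quantize(latent), params)   # with quantization: approximate
```

In this sketch the transform itself loses nothing, so any reconstruction error in `lossy` comes from the quantizer alone.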
20. An image compression apparatus comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, performs the image compression method of claim 18 or 19.
CN201911391009.9A 2019-12-30 2019-12-30 Reversible self-encoder, encoding and decoding method, image compression method and device Active CN111131834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911391009.9A CN111131834B (en) 2019-12-30 2019-12-30 Reversible self-encoder, encoding and decoding method, image compression method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911391009.9A CN111131834B (en) 2019-12-30 2019-12-30 Reversible self-encoder, encoding and decoding method, image compression method and device

Publications (2)

Publication Number Publication Date
CN111131834A true CN111131834A (en) 2020-05-08
CN111131834B CN111131834B (en) 2021-07-06

Family

ID=70504723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911391009.9A Active CN111131834B (en) 2019-12-30 2019-12-30 Reversible self-encoder, encoding and decoding method, image compression method and device

Country Status (1)

Country Link
CN (1) CN111131834B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023082107A1 (en) * 2021-11-10 2023-05-19 Oppo广东移动通信有限公司 Decoding method, encoding method, decoder, encoder, and encoding and decoding system

Also Published As

Publication number Publication date
CN111131834B (en) 2021-07-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant