CN114286113A - Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder - Google Patents

Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder

Info

Publication number
CN114286113A
CN114286113A (application number CN202111605004.9A)
Authority
CN
China
Prior art keywords
image
heterogeneous
original image
original
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111605004.9A
Other languages
Chinese (zh)
Other versions
CN114286113B (en)
Inventor
吴靖
刘超
陈爽
白朝晖
魏江
王浩
张艳
王幸同
常宏周
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yanfu Technology Co ltd
State Grid Shaanxi Electric Power Co Ltd Xixian New Area Power Supply Co
Global Energy Interconnection Research Institute
Original Assignee
Beijing Yanfu Technology Co ltd
State Grid Shaanxi Electric Power Co Ltd Xixian New Area Power Supply Co
Global Energy Interconnection Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yanfu Technology Co ltd, State Grid Shaanxi Electric Power Co Ltd Xixian New Area Power Supply Co, Global Energy Interconnection Research Institute filed Critical Beijing Yanfu Technology Co ltd
Priority to CN202111605004.9A priority Critical patent/CN114286113B/en
Publication of CN114286113A publication Critical patent/CN114286113A/en
Application granted granted Critical
Publication of CN114286113B publication Critical patent/CN114286113B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention discloses an image compression recovery method and system based on a multi-head heterogeneous convolutional self-encoder, comprising the following steps: processing an input original image with heterogeneous transformation methods to obtain heterogeneous images; encoding the original image and the heterogeneous images with a deep learning method based on convolutional self-encoders to obtain the original image encoding and the heterogeneous image encodings; fusing and quantizing the original image encoding and the heterogeneous image encodings based on an attention mechanism to obtain a compressed image; decoding the compressed image with a decoder-based deep learning method to obtain a restored image; and constructing a loss function from the difference between the restored image and the original image, and training with the loss function, iterating until convergence, to obtain the optimal restored image. By applying heterogeneous transformations to the image and processing it with an attention mechanism, the method improves image compression quality and has high application value for image transmission.

Description

Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder
Technical Field
The invention belongs to the field of image processing, and relates to an image compression recovery method and system based on a multi-head heterogeneous convolution self-encoder.
Background
Image compression algorithms are mainly divided into lossy and lossless compression. Because lossless compression ratios are generally very small, lossy compression algorithms are mainly used. Image compression is important for fast image transmission: the higher the compression ratio, the faster the transmission, but a trade-off is usually required between compression ratio and image fidelity. In recent years, deep learning has been increasingly applied to the field of image compression, but how to make the compressed content restore the image itself well at a fixed compression ratio remains a problem to be solved.
Disclosure of Invention
The invention aims to solve the above problems in the prior art and provides an image compression recovery method and system based on a multi-head heterogeneous convolution self-encoder.
To achieve this purpose, the invention adopts the following technical scheme:
the image compression recovery method based on the multi-head heterogeneous convolution self-encoder comprises the following steps:
processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
encoding the original image and the heterogeneous images, each with an independent convolutional self-encoder, based on a deep learning method of convolutional self-encoders, to obtain the original image encoding and the heterogeneous image encodings;
fusing and quantizing the original image code and the heterogeneous image code based on an attention mechanism to obtain a compressed image;
decoding the compressed image based on a deep learning method of a decoder to obtain a restored image;
and constructing a loss function based on the difference between the restored image and the original image, and training with the loss function, iterating continuously until convergence, to obtain the optimal restored image.
The invention is further improved in that:
processing an input original image based on heterogeneous transformation methods, specifically:
an original image I0 with dimensions [H0, W0, 3] is input, and different heterogeneous transformation methods are applied to it respectively to obtain different heterogeneous images;
the heterogeneous transformation methods comprise random brightness adjustment, random hue adjustment and random contrast adjustment, and three different heterogeneous images I1, I2, I3 are obtained based on these three methods.
Encoding the original image and the heterogeneous images with independent convolutional self-encoders, based on a deep learning method of convolutional self-encoders, to obtain the original image encoding and the heterogeneous image encodings, specifically:
the original image I0 and the heterogeneous images I1, I2, I3 are each loaded in Python to obtain their respective mean and variance var; the original image I0 and the heterogeneous images I1, I2, I3 are each normalized and then processed by independent convolutional self-encoders that perform convolution, downsampling and feature extraction, yielding the original image encoding f0 and the heterogeneous image encodings f1, f2, f3.
The normalization operation is shown in equation (1):
I_i' = (I_i − mean_i) / sqrt(var_i)   (1)
where i takes the values 0, 1, 2 and 3, mean_i and var_i are the mean and variance of I_i, and I_i' is the normalized image.
Fusing and quantizing the original image encoding and the heterogeneous image encodings based on an attention mechanism to obtain a compressed image, specifically:
spatial attention is added to the original image encoding f0 and the heterogeneous image encodings f1, f2, f3: global pooling and average pooling are applied to the original image encoding f0 and the heterogeneous image encodings f1, f2, f3 respectively, and the pooled results are concatenated with the corresponding encoding matrix; a convolution operation is then applied to each to obtain a feature map of dimension [H, W, 1], from which a sigmoid generates the spatial attention weight w0; the original image encoding f0 and the heterogeneous image encodings f1, f2, f3 are each multiplied (matrix multiplication) by the spatial attention weight w0 to obtain the spatially attended features.
Channel attention is then added to the spatially attended features: each of them is globally pooled over the [H, W, C] dimensions to obtain a pooled feature vector, which is passed through a fully connected layer and a sigmoid to generate the weights z0; multiplying the spatially attended features by z0 and summing them gives the feature f with channel attention added, and quantizing f yields the compressed image fq with dimensions [H, W, C].
The matrix dimensions of the original image encoding and the heterogeneous image encodings are [H, W, C].
Decoding the compressed image with a decoder-based deep learning method to obtain the restored image, specifically: the compressed image fq is inverse-quantized, the decoder upsamples it with deconvolution operations, enlarging the H and W dimensions of the inverse-quantized image, and finally outputs the restored image, whose dimensions [H0, W0, 3] are the same as those of the original image I0.
The loss function is shown in equation (2):
loss = (1/m) Σ (x1 − x2)²   (2)
where m is the number of points in the original image (the original image and the restored image have the same number of points), and x1 and x2 are the values of the original image and the restored image at the same point.
The image compression recovery system based on the multi-head heterogeneous convolution self-encoder comprises:
the image processing module is used for processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
the encoding module is used for respectively encoding the original image and the heterogeneous image by using independent convolution self-encoders based on a deep learning method of the convolution self-encoders to acquire the original image encoding and the heterogeneous image encoding;
the fusion quantization module fuses and quantizes the original image code and the heterogeneous image code based on an attention mechanism to obtain a compressed image;
a decoding module for decoding the compressed image with a decoder-based deep learning method to obtain a restored image;
and the loss function optimization module, which constructs a loss function based on the difference between the restored image and the original image and obtains the optimal restored image by training with the loss function and iterating continuously until convergence.
A terminal device comprising a memory, a processor and a computer program stored in said memory and executable on said processor, said processor implementing the steps of the above method when executing said computer program.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
Compared with the prior art, the invention has the following beneficial effects:
the invention makes the convolution self-encoder focus on the characteristics of different aspects of the image by carrying out isomerism on the image. Meanwhile, an attention mechanism is used for processing the image, the image compression quality is improved, meanwhile, the method can be more suitable for image compression under different shooting conditions, the image fidelity can be improved under a certain compression ratio, and the method has a high application value in the aspect of image transmission.
Drawings
In order to explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a general flowchart of an image compression recovery method based on a multi-head heterogeneous convolutional auto-encoder according to an embodiment of the present invention;
FIG. 2 is another flowchart of an image compression recovery method based on a multi-head heterogeneous convolutional auto-encoder according to an embodiment of the present invention;
fig. 3 is a block diagram of an image compression recovery system based on a multi-head heterogeneous convolutional auto-encoder according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the embodiments of the present invention, it should be noted that terms such as "upper", "lower", "horizontal" and "inner", if used, indicate orientations or positional relationships based on those shown in the drawings or those usually adopted when the product of the invention is used; they are used merely for convenience and simplicity of description and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation, and therefore cannot be understood as limiting the present invention. Furthermore, the terms "first", "second" and the like are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Furthermore, the term "horizontal", if present, does not mean that the component is required to be absolutely horizontal, but may be slightly inclined. For example, "horizontal" merely means that the direction is more horizontal than "vertical" and does not mean that the structure must be perfectly horizontal, but may be slightly inclined.
In the description of the embodiments of the present invention, it should be further noted that, unless otherwise explicitly stated or limited, the terms "disposed", "mounted", "connected" and "coupled" should be interpreted broadly: they may indicate, for example, a fixed connection, a detachable connection or an integral connection; a mechanical or electrical connection; a direct connection or an indirect connection through an intermediate medium; or an internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation.
The invention is described in further detail below with reference to the accompanying drawings:
referring to fig. 1 and fig. 2, the invention discloses an image compression recovery method based on a multi-head heterogeneous convolutional self-encoder, comprising the following steps:
step 1, processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image.
The input image is denoted I0, with dimensions [H0, W0, 3]. The heterogeneous transformation methods r1, r2, r3 used in this embodiment randomly adjust the brightness, hue and contrast respectively, and the images obtained by these transformations are I1, I2, I3.
The input image is assumed to be in RGB format. The random brightness adjustment is
I1 = floor(min(rand(0.8, 1.2) × I0, 255))
where rand(0.8, 1.2) is a random number generated between 0.8 and 1.2 each time the brightness transformation is performed, min(rand(0.8, 1.2) × I0, 255) caps each value of rand(0.8, 1.2) × I0 at an upper limit of 255, and floor(·) rounds the values down;
for the random hue adjustment, the image first needs to be converted from RGB format to HSV format. The maximum and minimum of the three RGB channels are obtained, MAX = max(R, G, B) and MIN = min(R, G, B), where R, G and B are the matrices of the three channels of I0; the hue H is then computed from MAX, MIN and the channel values. If the channel containing the minimum value is R, H becomes H + 120; if the channel containing the minimum value is G, H becomes H + 120. The hue is then randomly transformed by r2: H = (H + rand(0, 30)) % 360;
For the random contrast transformation, the average of the maximum and minimum values of image I0 is first obtained, avg = 0.5 × (max(I0) + min(I0)), together with half of their difference, diff = 0.5 × (max(I0) − min(I0)). The new value range is then (max(avg − rand(0.8, 1.2) × diff, 0), min(avg + rand(0.8, 1.2) × diff, 255)), and I0 is rescaled into this new range using the generated random numbers to obtain I3.
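To make step 1 concrete, the following Python sketch implements the three heterogeneous transformations roughly as described above. It is a minimal illustration, assuming NumPy arrays in RGB uint8 format; the function names, the use of OpenCV for the RGB/HSV conversion, and the linear rescaling used for the contrast transform are assumptions, not the patent's exact implementation.

```python
import numpy as np
import cv2  # assumed helper for the RGB <-> HSV conversion; not prescribed by the patent

def random_brightness(img: np.ndarray) -> np.ndarray:
    # I1 = floor(min(rand(0.8, 1.2) * I0, 255))
    r = np.random.uniform(0.8, 1.2)
    return np.floor(np.minimum(r * img.astype(np.float32), 255)).astype(np.uint8)

def random_hue(img: np.ndarray) -> np.ndarray:
    # Convert RGB -> HSV, shift the hue by rand(0, 30) modulo 360 degrees, convert back.
    hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV).astype(np.float32)
    h = hsv[..., 0] * 2.0                          # OpenCV stores hue in [0, 180)
    hsv[..., 0] = ((h + np.random.uniform(0, 30)) % 360.0) / 2.0
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2RGB)

def random_contrast(img: np.ndarray) -> np.ndarray:
    # avg = 0.5*(max + min), diff = 0.5*(max - min); rescale values into the randomized range.
    x = img.astype(np.float32)
    lo, hi = x.min(), x.max()
    avg, diff = 0.5 * (hi + lo), 0.5 * (hi - lo)
    r = np.random.uniform(0.8, 1.2)
    new_lo, new_hi = max(avg - r * diff, 0.0), min(avg + r * diff, 255.0)
    scaled = (x - lo) / max(hi - lo, 1e-6) * (new_hi - new_lo) + new_lo
    return scaled.astype(np.uint8)

if __name__ == "__main__":
    I0 = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)  # stand-in for the original image
    I1, I2, I3 = random_brightness(I0), random_hue(I0), random_contrast(I0)
```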
Step 2, encoding the original image and the heterogeneous images respectively with independent convolutional self-encoders, based on a deep learning method of convolutional self-encoders, to obtain the original image encoding and the heterogeneous image encodings.
The original image I0 and the heterogeneous images I1, I2, I3 are each loaded in Python to obtain their respective mean and variance var. The original image I0 and the heterogeneous images I1, I2, I3 are each normalized and then processed by independent convolutional self-encoders that perform convolution, downsampling and feature extraction, yielding the original image encoding f0 and the heterogeneous image encodings f1, f2, f3.
The normalization operation is shown in equation (1):
I_i' = (I_i − mean_i) / sqrt(var_i)   (1)
where i takes the values 0, 1, 2 and 3, mean_i and var_i are the mean and variance of I_i, and I_i' is the normalized image.
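A minimal PyTorch sketch of step 2 is given below, assuming a small stride-2 convolutional encoder. The layer counts, channel widths and kernel sizes are illustrative assumptions; the patent only specifies per-image normalization (equation (1)) followed by convolution, downsampling and feature extraction with an independent convolutional self-encoder per image.

```python
import torch
import torch.nn as nn

def normalize(img: torch.Tensor) -> torch.Tensor:
    # Equation (1): subtract the image's own mean and divide by the square root of its variance.
    mean, var = img.mean(), img.var()
    return (img - mean) / torch.sqrt(var + 1e-8)

class ConvEncoder(nn.Module):
    """One independent convolutional encoder; four copies are used, one per input image."""
    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),   # downsample H and W by 2
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),  # downsample again
            nn.Conv2d(64, out_channels, kernel_size=3, stride=1, padding=1),   # feature extraction
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# One encoder per image: f0 for I0 and f1, f2, f3 for the heterogeneous images.
encoders = nn.ModuleList([ConvEncoder() for _ in range(4)])
images = [torch.rand(1, 3, 128, 128) for _ in range(4)]           # stand-ins for I0, I1, I2, I3
f0, f1, f2, f3 = [enc(normalize(img)) for enc, img in zip(encoders, images)]
```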
Step 3, fusing and quantizing the original image encoding and the heterogeneous image encodings based on an attention mechanism to obtain a compressed image.
Taking the original image encoding f0 as an example, let the matrix dimension of each image encoding be [H, W, C]. Spatial attention is added first, giving different weights to different spatial locations of each feature. Global pooling and average pooling are applied to f0, and the pooled results are concatenated with the original encoding to obtain a feature of dimension [H, W, C+2]; a 5×5 convolution then produces a feature map of dimension [H, W, 1], which is passed through a sigmoid to generate the spatial attention weight w0. Multiplying f0 by w0 (matrix multiplication) yields the spatially attended feature, and the same operation on the heterogeneous encodings yields their spatially attended features.
Channel attention is then added so that more attention is paid to the content of the image itself. There are three heterogeneous transformations in total, so together with the original encoding there are four spatially attended features. These are globally pooled over the [H, W, C] dimensions to obtain a feature vector of dimension [4], which is passed through a fully connected layer and a sigmoid to generate the weight z0 corresponding to each feature, also of dimension [4]. Multiplying the spatially attended features by z0 and summing them gives the feature f with channel attention added; quantizing f yields the image compression result fq, whose dimensions are likewise [H, W, C].
The image compression at this point consists of two parts: the reduction of the spatial dimensions due to downsampling, and the compression due to quantization. The image compression ratio is therefore determined by the change from the original size [H0, W0, 3] at q0 bits per value to the compressed size [H, W, C] at q1 bits per value, where q1 is the number of bits used when quantizing f and q0 is the number of bits of the original image I0 itself, generally 8.
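The PyTorch sketch below illustrates step 3 as interpreted above: spatial attention, channel attention, fusion and a simple uniform quantizer. It is a sketch under stated assumptions; in particular, a single shared 5×5 convolution is used for all four encodings for brevity, and the quantize function is an assumed placeholder for the quantization step, which the patent does not specify in detail.

```python
import torch
import torch.nn as nn

class SpatialChannelFusion(nn.Module):
    """Fuse four encodings [f0, f1, f2, f3], each of shape [N, C, H, W], with spatial then channel attention."""
    def __init__(self, channels: int):
        super().__init__()
        # 5x5 convolution mapping the [C+2]-channel concatenation to a single-channel spatial map.
        self.spatial_conv = nn.Conv2d(channels + 2, 1, kernel_size=5, padding=2)
        self.fc = nn.Linear(4, 4)  # fully connected layer producing one channel-attention weight per encoding

    def spatial_attention(self, f: torch.Tensor) -> torch.Tensor:
        max_pool = f.max(dim=1, keepdim=True).values            # global (max) pooling over channels
        avg_pool = f.mean(dim=1, keepdim=True)                   # average pooling over channels
        cat = torch.cat([f, max_pool, avg_pool], dim=1)          # [N, C+2, H, W]
        w = torch.sigmoid(self.spatial_conv(cat))                # spatial attention weight, [N, 1, H, W]
        return f * w                                             # spatially attended feature

    def forward(self, feats):
        attended = [self.spatial_attention(f) for f in feats]
        pooled = torch.stack([f.mean(dim=(1, 2, 3)) for f in attended], dim=-1)  # [N, 4]
        z = torch.sigmoid(self.fc(pooled))                                       # channel weights z0, [N, 4]
        stacked = torch.stack(attended, dim=-1)                                  # [N, C, H, W, 4]
        return (stacked * z[:, None, None, None, :]).sum(dim=-1)                 # weighted sum -> feature f

def quantize(f: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # Assumed uniform quantizer: squash to (0, 1) and round to 2**bits - 1 levels.
    levels = 2 ** bits - 1
    return torch.round(torch.sigmoid(f) * levels)

fusion = SpatialChannelFusion(channels=64)
feats = [torch.rand(1, 64, 32, 32) for _ in range(4)]   # stand-ins for f0, f1, f2, f3
fq = quantize(fusion(feats))                            # compressed representation with dimensions [H, W, C]
```

In this illustrative configuration the two stride-2 downsampling stages and the 4-bit codes together reduce both the spatial size and the bit depth, which are the two components of the compression ratio discussed above.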
Step 4, decoding the compressed image with a decoder-based deep learning method to obtain the restored image.
fq is first converted back into floating-point numbers by inverse quantization; the decoder then uses multiple deconvolution operations to upsample, gradually enlarging the H and W dimensions, and finally outputs the restored image, whose dimensions [H0, W0, 3] are the same as those of the original image. During neural network training, the loss function is computed between the restored image output by the network and the original image I0. The upsampling uses bilinear interpolation: for example, given two points (x0, y0) and (x1, y1) in the image with values A and B respectively, if a new point is to be inserted at an intermediate position (x2, y2) with x1 > x2 > x0, its value is
A + (B − A) × (x2 − x0) / (x1 − x0).
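A matching PyTorch sketch of step 4: inverse quantization back to floating point, followed by a deconvolution-based decoder that restores the [H0, W0, 3] resolution. The layer configuration mirrors the assumed encoder above and is illustrative only; bilinear_1d simply restates the interpolation formula given in the text.

```python
import torch
import torch.nn as nn

def dequantize(fq: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # Map the integer codes back to floating-point values in [0, 1] (inverse of the assumed quantizer).
    return fq / (2 ** bits - 1)

class DeconvDecoder(nn.Module):
    """Decoder that upsamples the compressed feature back toward the original [H0, W0, 3] size."""
    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(in_channels, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # H, W x2
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),           # H, W x2
            nn.Conv2d(32, 3, kernel_size=3, padding=1),   # final 3-channel restored image
        )

    def forward(self, fq: torch.Tensor) -> torch.Tensor:
        return self.net(dequantize(fq))

def bilinear_1d(a: float, b: float, x0: float, x1: float, x2: float) -> float:
    # Value inserted at x2 between (x0, A) and (x1, B): A + (B - A) * (x2 - x0) / (x1 - x0).
    return a + (b - a) * (x2 - x0) / (x1 - x0)

decoder = DeconvDecoder()
restored = decoder(torch.rand(1, 64, 32, 32) * 15)   # stand-in for fq; output has shape [1, 3, 128, 128]
```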
Step 5, constructing a loss function based on the difference between the restored image and the original image, and training with the loss function, iterating continuously until convergence, to obtain the optimal restored image.
The loss function is shown in equation (2):
loss = (1/m) Σ (x1 − x2)²   (2)
where m is the number of points in the original image (the original image and the restored image have the same number of points), and x1 and x2 are the values of the original image and the restored image at the same point.
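Equation (2) is a mean-squared-error objective over corresponding points. The short sketch below shows how it would drive one training step, assuming the encoder, fusion and decoder modules from the earlier sketches; the optimizer choice and the decision to skip the non-differentiable rounding during backpropagation are assumptions made for illustration.

```python
import torch

def compression_loss(restored: torch.Tensor, original: torch.Tensor) -> torch.Tensor:
    # Equation (2): mean of the squared differences over the m points of the image.
    return ((restored - original) ** 2).mean()

def train_step(images, encoders, fusion, decoder, normalize, optimizer):
    # `images` holds I0 followed by its three heterogeneous versions, all as [N, 3, H, W] tensors.
    I0 = images[0]
    feats = [enc(normalize(img)) for enc, img in zip(encoders, images)]
    fused = fusion(feats)              # rounding is skipped here because it is not differentiable
    restored = decoder.net(fused)      # decode the fused feature directly in this simplified step
    loss = compression_loss(restored, I0)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example wiring (assumed): optimizer = torch.optim.Adam(
#     list(encoders.parameters()) + list(fusion.parameters()) + list(decoder.parameters()), lr=1e-4)
```

Training repeats such steps until the loss converges, at which point decoding fq gives the optimal restored image.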
Referring to fig. 3, the invention discloses an image compression recovery system based on a multi-head heterogeneous convolutional auto-encoder, comprising:
the image processing module is used for processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
the encoding module is used for respectively encoding the original image and the heterogeneous image by using independent convolution self-encoders based on a deep learning method of the convolution self-encoders to acquire the original image encoding and the heterogeneous image encoding;
the fusion quantization module fuses and quantizes the original image code and the heterogeneous image code based on an attention mechanism to obtain a compressed image;
a decoding module for decoding the compressed image with a decoder-based deep learning method to obtain a restored image;
and the loss function optimization module, which constructs a loss function based on the difference between the restored image and the original image and obtains the optimal restored image by training with the loss function and iterating continuously until convergence.
An embodiment of the invention further provides a terminal device. The terminal device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. The processor implements the steps of the above method embodiments when executing the computer program; alternatively, the processor implements the functions of the modules/units in the above device embodiments when executing the computer program.
The computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention.
The terminal device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The terminal device may include, but is not limited to, a processor, a memory.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc.
The memory may be used for storing the computer programs and/or modules, and the processor may implement various functions of the terminal device by executing or executing the computer programs and/or modules stored in the memory and calling data stored in the memory.
The terminal device integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer memory, Read-only memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, etc. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder is characterized by comprising the following steps:
processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
encoding the original image and the heterogeneous images, each with an independent convolutional self-encoder, based on a deep learning method of convolutional self-encoders, to obtain the original image encoding and the heterogeneous image encodings;
fusing and quantizing the original image code and the heterogeneous image code based on an attention mechanism to obtain a compressed image;
decoding the compressed image based on a deep learning method of a decoder to obtain a restored image;
and constructing a loss function based on the difference between the restored image and the original image, and training with the loss function, iterating continuously until convergence, to obtain the optimal restored image.
2. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 1, wherein the heterogeneous transformation method is used for processing an input original image, and specifically comprises:
an original image I0 with dimensions [H0, W0, 3] is input, and different heterogeneous transformation methods are applied to it respectively to obtain different heterogeneous images;
the heterogeneous transformation methods comprise random brightness adjustment, random hue adjustment and random contrast adjustment, and three different heterogeneous images I1, I2, I3 are obtained based on these three methods.
3. The image compression recovery method based on the multi-head heterogeneous convolutional self-encoder as claimed in claim 2, wherein encoding the original image and the heterogeneous images with independent convolutional self-encoders, based on a deep learning method of convolutional self-encoders, to obtain the original image encoding and the heterogeneous image encodings specifically comprises:
the original image I0 and the heterogeneous images I1, I2, I3 are each loaded in Python to obtain their respective mean and variance var; the original image I0 and the heterogeneous images I1, I2, I3 are each normalized and then processed by independent convolutional self-encoders that perform convolution, downsampling and feature extraction, yielding the original image encoding f0 and the heterogeneous image encodings f1, f2, f3;
the normalization operation is shown in equation (1):
I_i' = (I_i − mean_i) / sqrt(var_i)   (1)
where i takes the values 0, 1, 2 and 3, mean_i and var_i are the mean and variance of I_i, and I_i' is the normalized image.
4. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 3, wherein the original image encoding and the heterogeneous image encoding are fused and quantized based on an attention mechanism to obtain a compressed image, specifically:
spatial attention is added to the original image encoding f0 and the heterogeneous image encodings f1, f2, f3: global pooling and average pooling are applied to the original image encoding f0 and the heterogeneous image encodings f1, f2, f3 respectively, and the pooled results are concatenated with the corresponding encoding matrix; a convolution operation is then applied to each to obtain a feature map of dimension [H, W, 1], from which a sigmoid generates the spatial attention weight w0; the original image encoding f0 and the heterogeneous image encodings f1, f2, f3 are each multiplied (matrix multiplication) by the spatial attention weight w0 to obtain the spatially attended features;
channel attention is then added to the spatially attended features: each of them is globally pooled over the [H, W, C] dimensions to obtain a pooled feature vector, which is passed through a fully connected layer and a sigmoid to generate the weights z0; multiplying the spatially attended features by z0 and summing them gives the feature f with channel attention added, and quantizing f yields the compressed image fq with dimensions [H, W, C];
and the matrix dimensions of the original image encoding and the heterogeneous image encodings are [H, W, C].
5. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 4, wherein decoding the compressed image with a decoder-based deep learning method to obtain the restored image specifically comprises: the compressed image fq is inverse-quantized, the decoder upsamples it with deconvolution operations, enlarging the H and W dimensions of the inverse-quantized image, and finally outputs the restored image, whose dimensions [H0, W0, 3] are the same as those of the original image I0.
6. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 5, wherein the loss function is shown in equation (2):
loss = (1/m) Σ (x1 − x2)²   (2)
where m is the number of points in the original image (the original image and the restored image have the same number of points), and x1 and x2 are the values of the original image and the restored image at the same point.
7. An image compression recovery system based on a multi-head heterogeneous convolutional auto-encoder, comprising:
the image processing module is used for processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
the encoding module is used for respectively encoding the original image and the heterogeneous image by using independent convolution self-encoders based on a deep learning method of the convolution self-encoders to acquire the original image encoding and the heterogeneous image encoding;
the fusion quantization module fuses and quantizes the original image code and the heterogeneous image code based on an attention mechanism to obtain a compressed image;
a decoding module for decoding the compressed image with a decoder-based deep learning method to obtain a restored image;
and the loss function optimization module, which constructs a loss function based on the difference between the restored image and the original image and obtains the optimal restored image by training with the loss function and iterating continuously until convergence.
8. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-6 when executing the computer program.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202111605004.9A 2021-12-24 2021-12-24 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder Active CN114286113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111605004.9A CN114286113B (en) 2021-12-24 2021-12-24 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111605004.9A CN114286113B (en) 2021-12-24 2021-12-24 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder

Publications (2)

Publication Number Publication Date
CN114286113A (en) 2022-04-05
CN114286113B (en) 2023-05-30

Family

ID=80875568

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111605004.9A Active CN114286113B (en) 2021-12-24 2021-12-24 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder

Country Status (1)

Country Link
CN (1) CN114286113B (en)

Citations (4)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334765A (en) * 2019-07-05 2019-10-15 西安电子科技大学 Remote Image Classification based on the multiple dimensioned deep learning of attention mechanism
US10593021B1 (en) * 2019-09-11 2020-03-17 Inception Institute of Artificial Intelligence, Ltd. Motion deblurring using neural network architectures
CN113240589A (en) * 2021-04-01 2021-08-10 重庆兆光科技股份有限公司 Image defogging method and system based on multi-scale feature fusion
CN113095439A (en) * 2021-04-30 2021-07-09 东南大学 Heterogeneous graph embedding learning method based on attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
苗壮 (Miao Zhuang) et al.: "Compressed sensing image restoration algorithm using GPU acceleration", 微电子学与计算机 (Microelectronics & Computer) *

Also Published As

Publication number Publication date
CN114286113B (en) 2023-05-30

Similar Documents

Publication Publication Date Title
EP3275190B1 (en) Chroma subsampling and gamut reshaping
CN108022212B (en) High-resolution picture generation method, generation device and storage medium
KR102165155B1 (en) Adaptive interpolation for spatially scalable video coding
US10909728B1 (en) Learned lossy image compression codec
CN114581544A (en) Image compression method, computer device and computer storage medium
CN113781320A (en) Image processing method and device, terminal equipment and storage medium
CN116636217A (en) Method and apparatus for encoding image and decoding code stream using neural network
JP2014521275A (en) Adaptive upsampling method, program and computer system for spatially scalable video coding
US20200366938A1 (en) Signal encoding
US20240048738A1 (en) Methods, apparatuses, computer programs and computer-readable media for processing configuration data
Xing et al. Scale-arbitrary invertible image downscaling
TWI805085B (en) Handling method of chroma subsampled formats in machine-learning-based video coding
CN114943643A (en) Image reconstruction method, image coding and decoding method and related equipment
TW202228439A (en) Method for chroma subsampled formats handling in machine-learning-based picture coding
US20200169742A1 (en) Single-channel inverse mapping for image/video processing
KR102312338B1 (en) AI encoding apparatus and operating method for the same, and AI decoding apparatus and operating method for the same
CN114286113A (en) Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder
Hasnat et al. Luminance approximated vector quantization algorithm to retain better image quality of the decompressed image
US11490102B2 (en) Resilient image compression and decompression
WO2019023202A1 (en) Single-channel inverse mapping for image/video processing
US8582906B2 (en) Image data compression and decompression
EP4252423A1 (en) Video encoding using pre-processing
CN114450692A (en) Neural network model compression using block partitioning
CN113068033B (en) Multimedia inverse quantization processing method, device, equipment and storage medium
US20240013446A1 (en) Method and apparatus for encoding or decoding a picture using a neural network comprising sub-networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant