CN114286113B - Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder - Google Patents

Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder

Info

Publication number
CN114286113B
CN114286113B
Authority
CN
China
Prior art keywords
image
heterogeneous
original image
encoder
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111605004.9A
Other languages
Chinese (zh)
Other versions
CN114286113A (en)
Inventor
吴靖
刘超
陈爽
白朝晖
魏江
王浩
张艳
王幸同
常宏周
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yanfu Technology Co ltd
State Grid Shaanxi Electric Power Co Ltd Xixian New Area Power Supply Co
Global Energy Interconnection Research Institute
Original Assignee
Beijing Yanfu Technology Co ltd
State Grid Shaanxi Electric Power Co Ltd Xixian New Area Power Supply Co
Global Energy Interconnection Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yanfu Technology Co ltd, State Grid Shaanxi Electric Power Co Ltd Xixian New Area Power Supply Co, Global Energy Interconnection Research Institute
Priority to CN202111605004.9A
Publication of CN114286113A
Application granted
Publication of CN114286113B
Legal status: Active (current)
Anticipated expiration


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Abstract

The invention discloses an image compression recovery method and system based on a multi-head heterogeneous convolution self-encoder, comprising the following steps: processing an input original image based on a heterogeneous transformation method to obtain heterogeneous images; encoding the original image and the heterogeneous images based on a deep learning method of convolution self-encoders to obtain an original image code and heterogeneous image codes; fusing and quantizing the original image code and the heterogeneous image codes based on an attention mechanism to obtain a compressed image; decoding the compressed image based on a deep learning method of a decoder to obtain a restored image; and constructing a loss function based on the difference between the restored image and the original image, and training the loss function iteratively until convergence to obtain an optimal restored image. By applying heterogeneous processing to the image and processing it with an attention mechanism, the invention improves image compression quality and has high application value for image transmission.

Description

Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder
Technical Field
The invention belongs to the field of image processing, and relates to an image compression recovery method and system based on a multi-head heterogeneous convolution self-encoder.
Background
Image compression algorithms are mainly classified into lossy and lossless compression. Because lossless compression generally achieves only a small compression ratio, lossy compression algorithms are mainly used. Image compression is important for fast image transmission: the higher the compression ratio, the faster the image is transmitted, but a trade-off must often be made between the compression ratio and image fidelity. In recent years, deep learning has been increasingly applied in the field of image compression, but how to make the compressed content restore the image itself well at a given compression ratio remains a problem to be solved.
Disclosure of Invention
The invention aims to solve the above problems in the prior art and provides an image compression recovery method and system based on a multi-head heterogeneous convolution self-encoder. The images are transformed by a plurality of heterogeneous encoders and processed in combination with an attention mechanism, so that image fidelity is improved at a given compression ratio; the method and system therefore have high application value for image compression in image transmission.
In order to achieve the purpose, the invention is realized by adopting the following technical scheme:
an image compression recovery method based on a multi-head heterogeneous convolution self-encoder comprises the following steps:
processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
the method comprises the steps of respectively encoding an original image and a heterogeneous image by using independent convolution self-encoders based on a deep learning method of the convolution self-encoders to obtain an original image code and a heterogeneous image code;
fusing and quantizing the original image code and the heterogeneous image code based on an attention mechanism to obtain a compressed image;
decoding the compressed image based on a deep learning method of a decoder to obtain a restored image;
based on the difference between the recovered image and the original image, constructing a loss function, and continuously iterating to be converged through training the loss function to obtain an optimal recovered image.
The invention is further improved in that:
the input original image is processed based on the heterogeneous transformation method, specifically:
an original image I_0 with dimensions [H_0, W_0, 3] is input, and heterogeneous transformation is carried out on the original image with different heterogeneous transformation methods respectively to obtain different heterogeneous images;
the heterogeneous transformation methods comprise a random brightness increase/decrease method, a random hue increase/decrease method and a random contrast increase/decrease method, and three different heterogeneous images I_1, I_2, I_3 are obtained based on these three methods.
The deep learning method based on the convolution self-encoder encodes the original image and the heterogeneous images respectively with independent convolution self-encoders to obtain an original image code and heterogeneous image codes, specifically:
the original image I_0 and the heterogeneous images I_1, I_2, I_3 are respectively input into Python software to obtain the mean and the variance var of the original image I_0 and of the heterogeneous images I_1, I_2, I_3; normalization is carried out on the original image I_0 and the heterogeneous images I_1, I_2, I_3 respectively, and the normalized original image I_0 and heterogeneous images I_1, I_2, I_3 are each passed through an independent convolution self-encoder for convolution, down-sampling and feature extraction, yielding the original image code f_0 and the heterogeneous image codes f_1, f_2, f_3.
The normalization operation is as shown in formula (1):

$$\hat{I}_i = \frac{I_i - \mathrm{mean}(I_i)}{\mathrm{var}(I_i)} \tag{1}$$

where i takes the values 0, 1, 2 and 3.
The original image code and the heterogeneous image codes are fused and quantized based on an attention mechanism to obtain a compressed image, specifically:
spatial attention is added to the original image code f_0 and the heterogeneous image codes f_1, f_2, f_3: global pooling and average pooling are respectively applied to each code, the pooled results are respectively concatenated with the original image code, and a convolution operation on each concatenated result yields a feature of dimension [H, W, 1]; a spatial attention weight w_0 is generated from this feature by a sigmoid, and the original image code f_0 and the heterogeneous image codes f_1, f_2, f_3 are each multiplied with the spatial attention weight w_0 to obtain the spatially attended features f̂_0, f̂_1, f̂_2, f̂_3;
channel attention is added to the acquired spatially attended features f̂_0, f̂_1, f̂_2, f̂_3: each is globally pooled over its [H, W, C] dimensions, the pooled features are passed through a fully connected layer and a sigmoid to generate a weight z_0, and z_0 and the spatially attended features are multiplied and summed to obtain the channel-attended feature f; f is quantized to obtain the compressed image f_q, whose dimension is [H, W, C];
the matrix dimensions of the original image code and of the heterogeneous image codes are [H, W, C].
The compressed image is decoded based on a deep learning method of a decoder to obtain a restored image, specifically: inverse quantization is performed on the compressed image f_q, the decoder up-samples the inverse-quantized image with deconvolution operations, enlarging its H and W dimensions, and the dimensions of the output restored image are finally [H_0, W_0, 3], the same as those of the original image I_0.
The loss function is as shown in formula (2):

$$L = \frac{1}{m}\sum_{j=1}^{m}\left(x_1^{(j)} - x_2^{(j)}\right)^2 \tag{2}$$

where m is the number of pixels in the original image (the original image and the restored image have the same number of pixels), and x_1 and x_2 are the values of the original image and the restored image at the same pixel.
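Taken together, the five steps above form a single model that is trained end to end with the loss just described. The outline below is a minimal, non-authoritative sketch in Python/PyTorch (the patent only refers to "Python software", so the PyTorch API and the names MultiHeadCompressor, transforms, encoders, fusion and decoder are assumptions made for illustration); concrete sketches of the individual components are given alongside the corresponding steps of the detailed description.

```python
import torch
import torch.nn as nn


class MultiHeadCompressor(nn.Module):
    """Wires the five steps together: heterogeneous transforms, four
    independent convolutional encoders, attention fusion with quantization,
    and a decoder that restores the image."""

    def __init__(self, transforms, encoders, fusion, decoder):
        super().__init__()
        self.transforms = transforms              # step 1: produce I1, I2, I3
        self.encoders = nn.ModuleList(encoders)   # step 2: codes f0 .. f3
        self.fusion = fusion                      # step 3: compressed code f_q
        self.decoder = decoder                    # step 4: restored image

    def forward(self, original: torch.Tensor) -> torch.Tensor:
        images = [original] + [t(original) for t in self.transforms]
        codes = [enc(img) for enc, img in zip(self.encoders, images)]
        return self.decoder(self.fusion(codes))   # step 5 compares this to `original`
```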
An image compression recovery system based on a multi-headed heterogeneous convolution self-encoder, comprising:
the image processing module is used for processing the input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
the coding module is used for respectively coding the original image and the heterogeneous image by using an independent convolution self-encoder based on a deep learning method of the convolution self-encoder to obtain an original image code and a heterogeneous image code;
the fusion quantization module is used for fusing and quantizing the original image codes and the heterogeneous image codes based on an attention mechanism to obtain compressed images;
a decoding module for decoding the compressed image based on a deep learning method of the decoder to obtain a restored image,
and the loss function optimization module is used for constructing a loss function based on the difference between the restored image and the original image, and obtaining the optimal restored image by continuously iterating to be converged through training the loss function.
A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above method when the computer program is executed.
A computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of the method described above.
Compared with the prior art, the invention has the following beneficial effects:
the present invention focuses the convolution self-encoder on the features of different aspects of the image by isomerising the image. Meanwhile, the attention mechanism is used for processing the image, so that the image compression quality is improved, the image compression method is more suitable for image compression under different shooting conditions, the image fidelity can be improved under a certain compression ratio, and the image compression method has higher application value in the aspect of image transmission.
Drawings
For a clearer description of the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a general flow chart of a multi-head heterogeneous convolution self-encoder based image compression recovery method according to an embodiment of the present invention;
FIG. 2 is a flowchart of another method for image compression recovery based on a multi-head heterogeneous convolutional self-encoder according to an embodiment of the present invention;
fig. 3 is a block diagram of an image compression recovery system based on a multi-head heterogeneous convolution self-encoder according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
In the description of the embodiments of the present invention, it should be noted that, if the terms "upper," "lower," "horizontal," "inner," and the like indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, or the azimuth or the positional relationship in which the inventive product is conventionally put in use, it is merely for convenience of describing the present invention and simplifying the description, and does not indicate or imply that the apparatus or element to be referred to must have a specific azimuth, be configured and operated in a specific azimuth, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like, are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.
Furthermore, the term "horizontal" if present does not mean that the component is required to be absolutely horizontal, but may be slightly inclined. As "horizontal" merely means that its direction is more horizontal than "vertical", and does not mean that the structure must be perfectly horizontal, but may be slightly inclined.
In the description of the embodiments of the present invention, it should also be noted that, unless explicitly specified and limited otherwise, the terms "disposed," "mounted," "connected," and "connected" should be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
The invention is described in further detail below with reference to the attached drawing figures:
referring to fig. 1 and 2, the invention discloses an image compression recovery method based on a multi-head heterogeneous convolution self-encoder, which comprises the following steps:
and step 1, processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image.
Specifically, let the input image be I_0 with dimensions [H_0, W_0, 3]. The heterogeneous transformation methods r_1, r_2, r_3 used in this embodiment randomly increase or decrease the brightness, hue and contrast respectively, and the images obtained by the heterogeneous transformation methods are I_1, I_2, I_3.
Let the input image be in RGB format. Then

$$I_1 = \left\lfloor \min\!\big(\mathrm{rand}(0.8,1.2) \times I_0,\ 255\big) \right\rfloor$$

where rand(0.8, 1.2) generates a random number between 0.8 and 1.2 each time the brightness transform is performed, min(rand(0.8, 1.2) × I_0, 255) caps each value of rand(0.8, 1.2) × I_0 at 255, and the outer brackets denote rounding down.
For the random hue change, the image is first converted from RGB format to HSV format. The maximum and minimum of the three channels are obtained, MAX = max(R, G, B) and MIN = min(R, G, B), where R, G, B denote the three channels of the matrix corresponding to I_0. The hue H is then computed from MAX and MIN, with an offset added according to the channel in which the minimum lies (for example H = H + 120 when the minimum lies in the R channel, with corresponding offsets for the other channels), and the random hue transform r_2: H = (H + rand(0, 30)) % 360 is applied to H to obtain I_2.
For contrast random transformation, first find image I 0 Average avg=of the maximum and minimum values of (a)
Figure BDA0003433402000000065
And one of the difference value of the twoHalf=0.5 x (max (I 0 )-min(I 0 ) Then the new range is (max (avg-rand (0.8,1.2) ×diff, 0), min (avg+rand (0.8,1.2) ×diff, 255)), and the generated random number is used to determine I 0 Mapping to a new range, then
Figure BDA0003433402000000064
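By way of illustration, the three random transforms described above can be sketched in Python/NumPy roughly as follows. The function names are hypothetical, the hue shift is performed here with matplotlib's RGB/HSV conversion, and the exact hue and contrast formulas of the embodiment are only approximated, so this is a sketch under stated assumptions rather than the patented implementation.

```python
import numpy as np
from matplotlib.colors import hsv_to_rgb, rgb_to_hsv


def random_brightness(img: np.ndarray) -> np.ndarray:
    """I1: scale by a random factor in [0.8, 1.2], cap at 255, round down."""
    factor = np.random.uniform(0.8, 1.2)
    return np.floor(np.minimum(factor * img.astype(np.float64), 255.0))


def random_hue(img: np.ndarray) -> np.ndarray:
    """I2: convert RGB -> HSV, shift the hue by rand(0, 30) degrees mod 360."""
    hsv = rgb_to_hsv(img.astype(np.float64) / 255.0)          # H scaled to [0, 1]
    hsv[..., 0] = (hsv[..., 0] + np.random.uniform(0, 30) / 360.0) % 1.0
    return np.floor(hsv_to_rgb(hsv) * 255.0)


def random_contrast(img: np.ndarray) -> np.ndarray:
    """I3: linearly remap pixel values onto a randomly scaled range centred
    on the average of the image maximum and minimum."""
    lo, hi = float(img.min()), float(img.max())
    avg, half = 0.5 * (hi + lo), 0.5 * (hi - lo)
    scale = np.random.uniform(0.8, 1.2)
    new_lo, new_hi = max(avg - scale * half, 0.0), min(avg + scale * half, 255.0)
    return new_lo + (img.astype(np.float64) - lo) / (hi - lo + 1e-8) * (new_hi - new_lo)
```

The inputs are assumed to be RGB arrays of shape [H0, W0, 3] with values in 0-255, matching the value caps used in the embodiment.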
Step 2: the original image and the heterogeneous images are respectively encoded with independent convolution self-encoders, based on a deep learning method of the convolution self-encoder, to obtain an original image code and heterogeneous image codes.
The original image I_0 and the heterogeneous images I_1, I_2, I_3 are respectively input into Python software to obtain the mean and the variance var of the original image I_0 and of the heterogeneous images I_1, I_2, I_3; normalization is carried out on the original image I_0 and the heterogeneous images I_1, I_2, I_3 respectively, and the normalized original image I_0 and heterogeneous images I_1, I_2, I_3 are each passed through an independent convolution self-encoder for convolution, down-sampling and feature extraction, yielding the original image code f_0 and the heterogeneous image codes f_1, f_2, f_3.
The normalization operation is as shown in formula (1):

$$\hat{I}_i = \frac{I_i - \mathrm{mean}(I_i)}{\mathrm{var}(I_i)} \tag{1}$$

where i takes the values 0, 1, 2 and 3.
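A minimal sketch of the normalization of formula (1) and of one independent convolutional encoder head is given below in PyTorch. The patent does not specify the number of layers, the channel width or the stride schedule, so the three stride-2 convolutions and 64 channels here are illustrative assumptions.

```python
import torch
import torch.nn as nn


def normalize(img: torch.Tensor) -> torch.Tensor:
    """Formula (1): normalize one image by its own mean and variance."""
    return (img - img.mean()) / (img.var() + 1e-8)


class ConvEncoderHead(nn.Module):
    """One independent convolutional self-encoder head: convolution and
    down-sampling that map a [3, H0, W0] image to a [C, H, W] feature code."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1),
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return self.net(normalize(img))


# Four independent heads: one for I0 and one for each heterogeneous image.
encoders = nn.ModuleList([ConvEncoderHead() for _ in range(4)])
```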
Step 3: the original image code and the heterogeneous image codes are fused and quantized based on an attention mechanism to obtain a compressed image.
Take the feature f_0 obtained by encoding the original image as an example, and let the matrix dimension of an image code be [H, W, C]. Spatial attention is added first, giving different weights to different spatial positions of each feature. Global pooling and average pooling are applied to f_0, the pooled results are concatenated with the original image code, giving a feature of dimension [H, W, C+2], and a 5×5 convolution operation then produces a feature of dimension [H, W, 1], from which the spatial attention weight w_0 is generated by a sigmoid. f_0 and w_0 are then multiplied to obtain the spatially attended feature f̂_0. The same procedure is carried out on the heterogeneous features to obtain f̂_1, f̂_2, f̂_3.
Channel attention is then added, so that the content of the image itself receives more attention. Since there are 3 heterogeneous operations, the four features f̂_0, f̂_1, f̂_2, f̂_3 are each globally pooled over their [H, W, C] dimensions to obtain a feature of dimension [4]; this feature is passed through a fully connected layer and a sigmoid to generate a weight z_0 of dimension [4], one component for each heterogeneous feature. z_0 and the features f̂_0, f̂_1, f̂_2, f̂_3 are multiplied and summed to obtain the channel-attended feature f, and f is quantized to obtain the image compression result f_q, whose dimension is also [H, W, C].
The image compression at this point consists of two parts: one part is the spatial dimension reduction caused by down-sampling, and the other part is the compression caused by quantization. The image compression rate is

$$\mathrm{rate} = \frac{H \times W \times C \times q_1}{H_0 \times W_0 \times 3 \times q_0}$$

where q_1 is the number of bits used when quantizing f and q_0 is the number of bits of the original image I_0 itself, generally 8.
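The spatial-attention and channel-attention fusion together with the quantization can be sketched as a single PyTorch module as below. This is a hedged reconstruction: the patent does not state which pooling pair feeds the spatial attention (channel-wise max pooling is assumed here for the "global pooling"), nor how f is quantized, so the uniform straight-through quantizer and the layer sizes are illustrative assumptions consistent with the description.

```python
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    """Fuses the four codes f0..f3: per-head spatial attention (5x5 conv over
    the code concatenated with its pooled maps), head-wise channel attention,
    then uniform quantization of the fused feature."""

    def __init__(self, channels: int = 64, heads: int = 4, q_bits: int = 4):
        super().__init__()
        self.spatial = nn.ModuleList(
            [nn.Conv2d(channels + 2, 1, kernel_size=5, padding=2) for _ in range(heads)]
        )
        self.fc = nn.Linear(heads, heads)   # channel attention: one weight per head
        self.q_levels = 2 ** q_bits

    def forward(self, codes):                            # list of 4 tensors [B, C, H, W]
        attended = []
        for f, conv in zip(codes, self.spatial):
            avg = f.mean(dim=1, keepdim=True)            # average pooling over channels
            mx = f.amax(dim=1, keepdim=True)             # "global" (max) pooling over channels
            w = torch.sigmoid(conv(torch.cat([f, avg, mx], dim=1)))   # [B, 1, H, W]
            attended.append(f * w)                       # spatially attended feature
        pooled = torch.stack([a.mean(dim=(1, 2, 3)) for a in attended], dim=1)  # [B, 4]
        z = torch.sigmoid(self.fc(pooled))                                      # [B, 4]
        fused = sum(z[:, i, None, None, None] * a for i, a in enumerate(attended))
        # Uniform quantization with a straight-through estimator so training can proceed.
        soft = torch.sigmoid(fused)                      # squash to (0, 1)
        hard = torch.round(soft * (self.q_levels - 1)) / (self.q_levels - 1)
        return soft + (hard - soft).detach()             # compressed code f_q
```

As an illustrative calculation under the rate expression above: with three stride-2 down-sampling stages (H = H_0/8, W = W_0/8), C = 64 and q_1 = 4 bits against q_0 = 8, the rate evaluates to (1/64) × (64/3) × (1/2) = 1/6; the figures are examples only.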
Step 4: the compressed image is decoded based on a deep learning method of a decoder to obtain a restored image.
The quantized f_q is first converted back to floating-point numbers by inverse quantization; the decoder then up-samples with several deconvolution operations, gradually enlarging the H and W dimensions, and finally outputs the restored image, whose dimensions [H_0, W_0, 3] are the same as those of the original image. During training of the neural network, the difference between the restored image and the original image I_0 serves as the loss function. The up-sampling uses bilinear interpolation: for example, for two points (x_0, y_0) and (x_1, y_1) with values A and B respectively, if a new point (x_2, y_2) with x_1 > x_2 > x_0 is inserted during up-sampling, its value is

$$A + (B - A)\,\frac{x_2 - x_0}{x_1 - x_0}.$$
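A matching decoder sketch in PyTorch is shown below; the three transposed convolutions mirror the three down-sampling stages assumed in the encoder sketch, and torch.nn.functional.interpolate(..., mode="bilinear") could equally be used for the bilinear up-sampling step described above. Layer counts and widths are again illustrative assumptions.

```python
import torch
import torch.nn as nn


class Decoder(nn.Module):
    """Maps the de-quantized code [C, H, W] back to an image [3, H0, W0] by
    gradually enlarging the H and W dimensions with transposed convolutions."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(channels, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, f_q: torch.Tensor) -> torch.Tensor:
        # f_q is the de-quantized, floating-point compressed code [B, C, H, W].
        return self.net(f_q)
```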
Step 5: based on the difference between the recovered image and the original image, constructing a loss function, and continuously iterating to be converged through training the loss function to obtain an optimal recovered image.
The loss function is as shown in formula (2):

$$L = \frac{1}{m}\sum_{j=1}^{m}\left(x_1^{(j)} - x_2^{(j)}\right)^2 \tag{2}$$

where m is the number of pixels in the original image (the original image and the restored image have the same number of pixels), and x_1 and x_2 are the values of the original image and the restored image at the same pixel.
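For completeness, a minimal training loop that iterates the formula (2) loss to convergence might look as follows; the optimizer, learning rate and epoch count are not fixed by the patent and are assumptions, and `model` stands for the end-to-end compressor sketched earlier.

```python
import torch


def train(model, images, epochs: int = 100, lr: float = 1e-4):
    """Minimize the formula (2) loss (mean squared error between the restored
    and the original image) by iterating over the training images."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for img in images:                         # img: [1, 3, H0, W0] in [0, 1]
            restored = model(img)
            loss = ((img - restored) ** 2).mean()  # formula (2)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```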
Referring to fig. 3, the invention discloses an image compression recovery system based on a multi-head heterogeneous convolution self-encoder, which comprises:
the image processing module is used for processing the input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
the coding module is used for respectively coding the original image and the heterogeneous image by using an independent convolution self-encoder based on a deep learning method of the convolution self-encoder to obtain an original image code and a heterogeneous image code;
the fusion quantization module is used for fusing and quantizing the original image codes and the heterogeneous image codes based on an attention mechanism to obtain compressed images;
a decoding module for decoding the compressed image based on a deep learning method of the decoder to obtain a restored image,
and the loss function optimization module is used for constructing a loss function based on the difference between the restored image and the original image, and obtaining the optimal restored image by continuously iterating to be converged through training the loss function.
The embodiment of the invention provides terminal equipment. The terminal device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. The steps of the various method embodiments described above are implemented when the processor executes the computer program. Alternatively, the processor may implement the functions of the modules/units in the above-described device embodiments when executing the computer program.
The computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to accomplish the present invention.
The terminal equipment can be computing equipment such as a desktop computer, a notebook computer, a palm computer, a cloud server and the like. The terminal device may include, but is not limited to, a processor, a memory.
The processor may be a central processing unit (Central Processing Unit, CPU), but may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), an off-the-shelf field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
The memory may be used to store the computer program and/or module, and the processor may implement various functions of the terminal device by running or executing the computer program and/or module stored in the memory and invoking data stored in the memory.
The modules/units integrated in the terminal device may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the present invention may implement all or part of the flow of the method of the above embodiment by instructing related hardware through a computer program, which may be stored in a computer readable storage medium; when executed by a processor, the computer program implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be appropriately adjusted according to the requirements of legislation and patent practice in each jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder is characterized by comprising the following steps of:
processing an input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
the method comprises the steps of respectively encoding an original image and a heterogeneous image by using independent convolution self-encoders based on a deep learning method of the convolution self-encoders to obtain an original image code and a heterogeneous image code;
fusing and quantizing the original image code and the heterogeneous image code based on an attention mechanism to obtain a compressed image;
decoding the compressed image based on a deep learning method of a decoder to obtain a restored image;
based on the difference between the recovered image and the original image, constructing a loss function, and continuously iterating to be converged through training the loss function to obtain an optimal recovered image.
2. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 1, wherein the processing of the input original image based on the heterogeneous transformation method is specifically:
an original image I_0 with dimensions [H_0, W_0, 3] is input, and heterogeneous transformation is carried out on the original image with different heterogeneous transformation methods respectively to obtain different heterogeneous images;
the heterogeneous transformation methods comprise a random brightness increase/decrease method, a random hue increase/decrease method and a random contrast increase/decrease method, and three different heterogeneous images I_1, I_2, I_3 are obtained based on these three methods.
3. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 2, wherein the deep learning method based on the convolution self-encoder encodes the original image and the heterogeneous images respectively with independent convolution self-encoders to obtain an original image code and heterogeneous image codes, specifically:
the original image I_0 and the heterogeneous images I_1, I_2, I_3 are respectively input into Python software to obtain the mean and the variance var of the original image I_0 and of the heterogeneous images I_1, I_2, I_3; normalization is carried out on the original image I_0 and the heterogeneous images I_1, I_2, I_3 respectively, and the normalized original image I_0 and heterogeneous images I_1, I_2, I_3 are each passed through an independent convolution self-encoder for convolution, down-sampling and feature extraction, yielding the original image code f_0 and the heterogeneous image codes f_1, f_2, f_3;
the normalization operation is as shown in formula (1):

$$\hat{I}_i = \frac{I_i - \mathrm{mean}(I_i)}{\mathrm{var}(I_i)} \tag{1}$$

where i takes the values 0, 1, 2 and 3.
4. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 3, wherein the fusing and quantizing of the original image code and the heterogeneous image codes based on the attention mechanism to obtain a compressed image is specifically:
global pooling and average pooling are respectively carried out on the original image code f_0 and the heterogeneous image codes f_1, f_2, f_3, and the pooled results are respectively concatenated with the original image code, giving a feature of dimension [H, W, C+2]; a 5×5 convolution operation is then used to obtain a feature of dimension [H, W, 1], from which a spatial attention weight w_0 is generated by a sigmoid; the original image code f_0 and the heterogeneous image codes f_1, f_2, f_3 are respectively multiplied with the spatial attention weight w_0 to obtain the spatially attended features f̂_0, f̂_1, f̂_2, f̂_3;
channel attention is added to the acquired spatially attended features f̂_0, f̂_1, f̂_2, f̂_3: each is globally pooled over its [H, W, C] dimensions, the pooled features are passed through a fully connected layer and a sigmoid to generate a weight z_0, and z_0 and the spatially attended features are multiplied and summed to obtain the channel-attended feature f; f is quantized to obtain the compressed image f_q, whose dimension is [H, W, C];
the matrix dimensions of the original image code and of the heterogeneous image codes are [H, W, C].
5. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 4, wherein the decoding of the compressed image based on the deep learning method of the decoder to obtain the restored image is specifically: inverse quantization is carried out on the compressed image f_q, the decoder up-samples the inverse-quantized image with deconvolution operations, enlarging its H and W dimensions, and the dimensions of the output restored image are finally [H_0, W_0, 3], the same as those of the original image I_0.
6. The image compression recovery method based on the multi-head heterogeneous convolution self-encoder according to claim 5, wherein the loss function is as shown in formula (2):

$$L = \frac{1}{m}\sum_{j=1}^{m}\left(x_1^{(j)} - x_2^{(j)}\right)^2 \tag{2}$$

where m is the number of pixels in the original image, the number of pixels in the original image and in the restored image being the same, and x_1 and x_2 are the values of the original image and the restored image at the same pixel.
7. An image compression recovery system based on a multi-head heterogeneous convolution self-encoder, comprising:
the image processing module is used for processing the input original image based on a heterogeneous transformation method to obtain a heterogeneous image;
the coding module is used for respectively coding the original image and the heterogeneous image by using an independent convolution self-encoder based on a deep learning method of the convolution self-encoder to obtain an original image code and a heterogeneous image code;
the fusion quantization module is used for fusing and quantizing the original image codes and the heterogeneous image codes based on an attention mechanism to obtain compressed images;
a decoding module for decoding the compressed image based on a deep learning method of the decoder to obtain a restored image,
and the loss function optimization module is used for constructing a loss function based on the difference between the restored image and the original image, and obtaining the optimal restored image by continuously iterating to be converged through training the loss function.
8. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-6 when the computer program is executed.
9. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any of claims 1-6.
CN202111605004.9A 2021-12-24 2021-12-24 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder Active CN114286113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111605004.9A CN114286113B (en) 2021-12-24 2021-12-24 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111605004.9A CN114286113B (en) 2021-12-24 2021-12-24 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder

Publications (2)

Publication Number Publication Date
CN114286113A CN114286113A (en) 2022-04-05
CN114286113B true CN114286113B (en) 2023-05-30

Family

ID=80875568

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111605004.9A Active CN114286113B (en) 2021-12-24 2021-12-24 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder

Country Status (1)

Country Link
CN (1) CN114286113B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334765A (en) * 2019-07-05 2019-10-15 西安电子科技大学 Remote Image Classification based on the multiple dimensioned deep learning of attention mechanism
US10593021B1 (en) * 2019-09-11 2020-03-17 Inception Institute of Artificial Intelligence, Ltd. Motion deblurring using neural network architectures
CN113095439A (en) * 2021-04-30 2021-07-09 东南大学 Heterogeneous graph embedding learning method based on attention mechanism
CN113240589A (en) * 2021-04-01 2021-08-10 重庆兆光科技股份有限公司 Image defogging method and system based on multi-scale feature fusion

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334765A (en) * 2019-07-05 2019-10-15 西安电子科技大学 Remote Image Classification based on the multiple dimensioned deep learning of attention mechanism
US10593021B1 (en) * 2019-09-11 2020-03-17 Inception Institute of Artificial Intelligence, Ltd. Motion deblurring using neural network architectures
CN113240589A (en) * 2021-04-01 2021-08-10 重庆兆光科技股份有限公司 Image defogging method and system based on multi-scale feature fusion
CN113095439A (en) * 2021-04-30 2021-07-09 东南大学 Heterogeneous graph embedding learning method based on attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A compressed sensing image recovery algorithm accelerated by GPU (采用GPU加速的压缩感知图像恢复算法); Miao Zhuang et al.; Microelectronics & Computer, (12), pp. 125-129 *

Also Published As

Publication number Publication date
CN114286113A (en) 2022-04-05

Similar Documents

Publication Publication Date Title
CN108022212B (en) High-resolution picture generation method, generation device and storage medium
EP3275190B1 (en) Chroma subsampling and gamut reshaping
US20200145692A1 (en) Video processing method and apparatus
US10909728B1 (en) Learned lossy image compression codec
KR20160021417A (en) Adaptive interpolation for spatially scalable video coding
CN114581544A (en) Image compression method, computer device and computer storage medium
CN108921801B (en) Method and apparatus for generating image
CN113888410A (en) Image super-resolution method, apparatus, device, storage medium, and program product
CN116636217A (en) Method and apparatus for encoding image and decoding code stream using neural network
CN107220934B (en) Image reconstruction method and device
TWI807491B (en) Method for chroma subsampled formats handling in machine-learning-based picture coding
Xing et al. Scale-arbitrary invertible image downscaling
CN114125454A (en) Video image coding system and method
CN110717864A (en) Image enhancement method and device, terminal equipment and computer readable medium
CN113628115A (en) Image reconstruction processing method and device, electronic equipment and storage medium
CN114286113B (en) Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder
CN112399069B (en) Image encoding method and apparatus, storage medium, and electronic device
CN115866253B (en) Inter-channel conversion method, device, terminal and medium based on self-modulation
CN116547969A (en) Processing method of chroma subsampling format in image decoding based on machine learning
CN110267038A (en) Coding method and device, coding/decoding method and device
KR20200044668A (en) AI encoding apparatus and operating method for the same, and AI decoding apparatus and operating method for the same
Wang et al. A customized deep network based encryption-then-lossy-compression scheme of color images achieving arbitrary compression ratios
CN113096019B (en) Image reconstruction method, image reconstruction device, image processing equipment and storage medium
CN112637609B (en) Image real-time transmission method, sending end and receiving end
Ayyoubzadeh et al. Lossless compression of mosaic images with convolutional neural network prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant