CN116823656B - Image blind deblurring method and system based on frequency domain local feature attention mechanism - Google Patents
- Publication number
- CN116823656B (application CN202310764762.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- frequency domain
- local feature
- attention mechanism
- feature attention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses an image blind deblurring method and system based on a frequency domain local feature attention mechanism, wherein the method comprises the following steps: acquiring an image deblurring data set, and preprocessing the image deblurring data set to obtain a training set; training the initial frequency domain local feature attention mechanism network based on the blurred images and the clear images in the training set to obtain a target frequency domain local feature attention mechanism network; and inputting the blurred image to be processed into the target frequency domain local feature attention mechanism network to perform image blind deblurring processing, obtaining a target clear image. The invention combines the frequency domain information and the spatial information of the feature maps, and the frequency domain information of the feature maps effectively helps the convolutional neural network restore the blurred image to a clearer image.
Description
Technical Field
The invention belongs to the technical field of computer vision and image processing, and particularly relates to an image blind deblurring method and system based on a frequency domain local feature attention mechanism.
Background
Image deblurring aims at eliminating the blurred features to restore a sharp image. Many factors can cause blurring, such as irregular movement of the camera or object, optical defocus, etc. Low quality blurred images present significant challenges for subsequent advanced visual tasks such as medical diagnosis, object recognition, etc.
Driven by the wave of global feature learning methods, image restoration has made significant progress. Among existing MLP-based methods, MAXIM, shown in FIG. 5 (a), sparsely decomposes global MLP operations into window-MLP and grid-MLP. Beyond MLP-based approaches, recent work such as Restormer, Uformer and Stripformer has demonstrated the capability of the attention mechanism in image deblurring tasks. The self-attention (SA) mechanism is the key to the Transformer model's ability to capture long-range dependencies, but its computational complexity grows quadratically with the number of pixels, which makes it unsuitable for high-resolution image deblurring. To make the computation feasible, existing methods reduce the number of pixels that SA processes in the spatial domain in various ways, which fall into three categories. (1) Local spatial-wise SA (Spa-LS). As in FIG. 5 (b), Uformer proposes a locally-enhanced window Transformer block to capture local context information, which makes it difficult to model long-range information efficiently. (2) Global SA. As in FIG. 5 (c), Stripformer explores horizontal and vertical intra-strip and inter-strip SA, which relies on the strong assumption that image blur is generally region-oriented. (3) Coarse-grained global SA. As in FIG. 5 (d), Restormer captures long-range interactions through global channel-wise SA (Spa-GC). Although Spa-GC can learn global information about the features, it inevitably focuses more on extracting the low-frequency components of an image, because (1) the energy of an image is mainly concentrated at low frequencies and (2) in practice the high-frequency part is generally harder to learn than the low-frequency part. The low-frequency part carries coarse-grained information, namely the basic structure of objects; the high-frequency part carries fine-grained information, namely texture details.
Therefore, a coarse-grained global SA such as Spa-GC models fine-grained correlations insufficiently.
Disclosure of Invention
In order to model long-range dependencies without compromising fine-grained detail, the present invention proposes a frequency domain local feature attention mechanism network (LoFormer) for image deblurring, as shown in FIG. 2. In particular, the present invention proposes a frequency domain local channel self-attention structure (Freq-LC), as shown in FIG. 3. First, the present invention converts features to the frequency domain through the discrete cosine transform (DCT). The DCT represents the original features as coefficients of different basis images. As shown in FIG. 5 (e), the basis images can be arranged in a rectangular grid with the low-frequency components at the upper left corner and the high-frequency components at the lower right corner. The top-left basis image represents the average intensity of the entire image, while the remaining basis images capture progressively finer detail and texture. The coefficient of every frequency therefore carries global information. To give coarse-grained structure and fine-grained detail equal learning opportunities, the invention designs a window-based frequency feature extraction paradigm, namely splitting the frequency coefficients into non-overlapping windows.
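The DCT properties described above can be checked numerically. The following is a minimal NumPy sketch, not part of the invention: the helper names (`dct_matrix`, `dct2`, `split_windows`), the 16×16 example size, and the orthonormal DCT-II scaling are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k holds the k-th 1D cosine basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def dct2(x):
    """2D DCT of an (H, W) array: transform along height, then along width."""
    return dct_matrix(x.shape[0]) @ x @ dct_matrix(x.shape[1]).T

def split_windows(coeffs, w=8):
    """Split an (H, W) coefficient grid into non-overlapping w x w windows."""
    h, wd = coeffs.shape
    blocks = coeffs.reshape(h // w, w, wd // w, w).transpose(0, 2, 1, 3)
    return blocks.reshape(-1, w, w)

rng = np.random.default_rng(0)
image = rng.random((16, 16))
coeffs = dct2(image)

# The top-left coefficient equals the mean intensity up to a constant scale factor.
dc_matches_mean = bool(np.isclose(coeffs[0, 0], image.mean() * np.sqrt(image.size)))

# A single basis image (here the highest-frequency one) is non-zero at every
# pixel, which is why each coefficient carries global information.
d = dct_matrix(16)
basis = np.outer(d[15], d[15])
basis_is_global = bool(np.all(np.abs(basis) > 0))

# Non-overlapping windows over the coefficient grid: 4 windows of 8 x 8.
windows = split_windows(coeffs)
```

The first window covers the lowest frequencies and the last window the highest, so each window groups coefficients of similar frequency while every coefficient still summarizes the whole image.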
In order to achieve the above object, the present invention provides the following solutions:
An image blind deblurring method based on a frequency domain local feature attention mechanism comprises the following steps:
acquiring an image deblurring dataset;
Preprocessing the image deblurring data set to obtain a training set of the image deblurring data set;
Training the initial frequency domain local feature attention mechanism network based on the blurred image and the clear image in the training set to obtain a target frequency domain local feature attention mechanism network;
And inputting the blurred image to be detected into the target frequency domain local feature attention mechanism network to perform image blind deblurring processing, and obtaining a target clear image.
Preferably, the image deblurring dataset comprises: the GoPro dataset, the HIDE dataset, the RealBlur dataset and the REDS dataset.
Preferably, the method for training the initial frequency domain local feature attention mechanism network based on the blurred image and the clear image in the training set comprises the following steps:
S1: inputting the blurred image into an initial frequency domain local feature attention mechanism network to obtain an output image;
S2: calculating a loss based on the difference between the output image and the clear image, back-propagating the gradient, and updating the parameters of the initial frequency domain local feature attention mechanism network;
S3: and repeating the step S1 and the step S2 until the training times reach the preset number, and obtaining the target frequency domain local feature attention mechanism network.
Preferably, the method for inputting the blurred image into the initial frequency domain local feature attention mechanism network to obtain an output image comprises the following steps:
inputting the blurred image into the initial frequency domain local feature attention mechanism network, and processing the blurred image through a U-shaped coder and decoder to obtain corresponding blurred features;
and adding the blurred image and the corresponding blurred feature to obtain an output image.
Preferably, the method of calculating the loss based on the difference between the output image and the clear image comprises:
Comparing the clear image in the training set with the output image of the frequency domain local feature attention mechanism network to obtain the difference;
And calculating the loss from the difference.
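The training procedure described above (a forward pass whose prediction is added to the blurred input, loss computation, gradient back-propagation, parameter update, repeated a preset number of times) can be sketched with a deliberately tiny stand-in model. Everything in this sketch is a hypothetical illustration: the one-parameter `toy_network`, the fixed feature map `feat`, the learning rate, and the plain L1 loss are assumptions chosen so the example runs with NumPy alone; they are not the LoFormer network.

```python
import numpy as np

def toy_network(blur, feat, theta):
    """Stand-in for the U-shaped network: a predicted residual added to the input."""
    residual = theta * feat          # toy "blurred feature" prediction
    return blur + residual           # output image = blurred image + prediction

def l1_loss(output, sharp):
    return np.mean(np.abs(output - sharp))

def train(blur, feat, sharp, steps=200, lr=0.01):
    theta = 0.0                      # initial network parameter
    history = []
    for _ in range(steps):           # S3: repeat a preset number of times
        output = toy_network(blur, feat, theta)           # S1: forward pass
        history.append(l1_loss(output, sharp))            # S2: loss
        grad = np.mean(np.sign(output - sharp) * feat)    # S2: d(loss)/d(theta)
        theta -= lr * grad                                # S2: parameter update
    return theta, history

rng = np.random.default_rng(0)
blur = rng.random((32, 32))
feat = rng.standard_normal((32, 32))
sharp = blur + 0.5 * feat            # synthetic target, reachable at theta = 0.5
theta, history = train(blur, feat, sharp)
```

In a real setting the hand-written gradient would be replaced by automatic differentiation in a deep learning framework; the structure of the loop stays the same.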
The invention also provides an image blind deblurring system based on the frequency domain local feature attention mechanism, which comprises: the system comprises a data set acquisition module, a training set acquisition module, a network acquisition module and an image acquisition module;
the data set acquisition module is used for acquiring an image deblurring data set;
The training set acquisition module is used for preprocessing the image deblurring data set to acquire a training set of the image deblurring data set;
the network acquisition module is used for training the initial frequency domain local feature attention mechanism network based on the blurred image and the clear image in the training set to obtain a target frequency domain local feature attention mechanism network;
The image acquisition module is used for inputting the blurred image to be detected into the target frequency domain local feature attention mechanism network to perform image blind deblurring processing, and a target clear image is obtained.
Preferably, the image deblurring dataset comprises: the GoPro dataset, the HIDE dataset, the RealBlur dataset and the REDS dataset.
Preferably, the network acquisition module includes: an output image obtaining unit, a calculating unit, and a network obtaining unit;
The output image obtaining unit is used for inputting the blurred image into an initial frequency domain local feature attention mechanism network to obtain an output image;
The calculating unit is used for calculating a loss based on the difference between the output image and the clear image, back-propagating the gradient, and updating the parameters of the initial frequency domain local feature attention mechanism network;
The network obtaining unit is used for repeating the training steps of the output image obtaining unit and the calculating unit until the training times reach the preset number, and obtaining the target frequency domain local feature attention mechanism network.
Preferably, in the output image obtaining unit, the process of inputting the blurred image into an initial frequency domain local feature attention mechanism network to obtain an output image includes:
inputting the blurred image into the initial frequency domain local feature attention mechanism network, and processing the blurred image through a U-shaped coder and decoder to obtain corresponding blurred features;
and adding the blurred image and the corresponding blurred feature to obtain an output image.
Preferably, in the calculating unit, the process of calculating the loss comprises:
Comparing the clear image in the training set with the output image of the frequency domain local feature attention mechanism network to obtain the difference;
And calculating the loss from the difference.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides an image blind deblurring method based on a frequency domain local feature attention mechanism, wherein a frequency domain local feature attention mechanism network comprises a frequency domain normalization (DCT-LN) module, and from the perspective of an image frequency domain, the training stability is improved in a frequency domain information normalization mode. Furthermore, the frequency domain local feature attention mechanism (Freq-LC) module utilizes the frequency domain space to realize the decomposition of different fine granularity features, thereby achieving the balanced utilization of high and low frequency information. Further, the MLP gating mechanism (MGate) enhances the aggregation capability of global information, effectively restoring blurred images to clearer images.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that are needed in the embodiments are briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of an embodiment of the present invention;
FIG. 2 is a schematic diagram of the frequency domain local feature attention mechanism network (LoFormer) in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the frequency domain local channel self-attention structure (Freq-LC) in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a frequency domain normalization process according to an embodiment of the present invention;
FIG. 5 is a diagram showing a comparison of a frequency domain local feature attention method and other global learning methods in an embodiment of the present invention;
FIG. 6 is a diagram of an example of GoPro test data in an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
Example 1
As shown in fig. 1, the present embodiment provides an image blind deblurring method based on a frequency domain local feature attention mechanism, which includes two stages, namely a network training stage and a network prediction stage.
The training phase of the network and the prediction phase of the network comprise the following steps:
1. Preparing image deblurring standard datasets, the selected datasets being the GoPro dataset, the HIDE dataset, the RealBlur dataset and the REDS dataset; preprocessing the datasets, namely, before the experimental data are input into the model for training, randomly cropping the data to 384×384 and applying random horizontal and vertical flips, to obtain the training set of the image deblurring dataset;
2. Training the initial frequency domain local feature attention mechanism network based on the blurred image and the clear image in the training set to obtain a target frequency domain local feature attention mechanism network;
In this embodiment, a blurred image is input into the initial frequency domain local feature attention mechanism network to obtain an output image; based on the difference between the output image and the clear image, the loss is calculated, the gradient is back-propagated, and the parameters of the initial frequency domain local feature attention mechanism network are updated; the training steps are repeated until the number of training iterations reaches the preset number, yielding the target frequency domain local feature attention mechanism network;
The method for inputting the blurred image into the initial frequency domain local feature attention mechanism network and obtaining the output image comprises: inputting the blurred image into the initial frequency domain local feature attention mechanism network, and processing the blurred image through a U-shaped encoder-decoder to obtain the corresponding blurred feature prediction; adding the blurred image and the corresponding blurred feature prediction to obtain the output image;
The method for calculating the loss based on the difference between the output image and the clear image comprises: comparing the clear image in the training set with the output image of the frequency domain local feature attention mechanism network to obtain the difference, and calculating the loss from the difference;
3. Inputting blurred images in the training set of the image deblurring standard dataset into the target frequency domain local feature attention mechanism network (Local Frequency Transformer) to obtain the estimated clear images;
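The preprocessing in step 1 (a random 384×384 crop plus random flips) has to be applied identically to each blurred image and its clear counterpart so the pair stays aligned. A minimal NumPy sketch follows; the function name `paired_augment`, the 0.5 flip probability, and the assumption that both images are at least 384 pixels on each side are illustrative choices, not details stated in the description.

```python
import numpy as np

def paired_augment(blur, sharp, size=384, rng=None):
    """Apply the same random crop and random flips to a blurred/clear image pair.

    Both inputs are (H, W, C) arrays with H >= size and W >= size (assumed).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = blur.shape[:2]
    top = int(rng.integers(0, h - size + 1))     # random crop origin
    left = int(rng.integers(0, w - size + 1))
    blur = blur[top:top + size, left:left + size]
    sharp = sharp[top:top + size, left:left + size]
    if rng.random() < 0.5:                       # random horizontal flip
        blur, sharp = blur[:, ::-1], sharp[:, ::-1]
    if rng.random() < 0.5:                       # random vertical flip
        blur, sharp = blur[::-1], sharp[::-1]
    return blur, sharp
```

Drawing the crop origin and flip decisions once and reusing them for both images is what keeps the blurred input and its ground truth pixel-aligned.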
Further, the loss is calculated from the difference between the network output image and the clear image in the training set, and the gradient is back-propagated to update the parameters of the network; this is repeated until the number of training iterations reaches the preset number. Let $\hat{S}$ and $S$ denote the reconstruction result output by the network and the real clear data, respectively. The loss function for network training is a combination of the following two loss functions:
1) L1 reconstruction loss function (L1 loss), denoted $L_1$;
2) Frequency domain reconstruction loss function (FR loss), denoted $L_{fr}$.
The final loss function is as follows:
$L = L_1 + a L_{fr}$
where $a$ is a hyperparameter, set to 0.01.
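The combined loss can be sketched as follows. The description names the two terms but their formulas are not reproduced in the text, so this sketch makes explicit assumptions: the L1 term is taken as the mean absolute difference, and the frequency domain term is assumed to be an L1 distance between the 2D FFTs of the two images (real and imaginary parts compared separately). The function names are hypothetical.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error between the reconstruction and the clear image."""
    return float(np.mean(np.abs(pred - target)))

def fr_loss(pred, target):
    """Assumed frequency domain loss: L1 distance between 2D FFT spectra."""
    fp = np.fft.fft2(pred, axes=(0, 1))
    ft = np.fft.fft2(target, axes=(0, 1))
    diff = fp - ft
    return float(np.mean(np.abs(diff.real) + np.abs(diff.imag)))

def total_loss(pred, target, a=0.01):
    """L = L1 + a * Lfr, with a = 0.01 as stated in the description."""
    return l1_loss(pred, target) + a * fr_loss(pred, target)
```

The small weight keeps the spatial L1 term dominant while the frequency term adds a penalty on spectral differences.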
Referring to FIG. 2, the frequency domain local feature attention mechanism network (Local Frequency Transformer, LoFormer) adopts a U-shaped blurred image restoration network structure (UNet) and comprises an encoder, a latent layer feature processing module and a decoder, wherein the encoder and the decoder each comprise 3 scales, and three lateral (skip) connections exist between the encoder and the decoder. The experimental data first pass through the encoder and the latent layer feature processing module and then through the decoder to obtain a blurred feature prediction, which is added to the corresponding input to obtain the restored image. The invention designs two models, LoFormer-S and LoFormer-B, by changing the number of frequency domain local feature attention modules (Local Frequency Transformer Block, LoFT block) contained in the encoder, the latent layer feature processing module and the decoder at different scales (Stages 1-4 and the Refinement stage in FIG. 2). In LoFormer-S, Stages 1-4 of the encoder and decoder contain 2, 4, 6 and 14 LoFT blocks, respectively, and the Refinement stage contains 2 LoFT blocks. In LoFormer-B, Stages 1-4 of the encoder and decoder contain 2, 4, 12 and 18 LoFT blocks, respectively, and the Refinement stage contains 2 LoFT blocks.
Referring to fig. 3, the frequency domain local feature attention module process is as follows:
(1) Calculating the 2D discrete cosine transform of $X_{in}$ to obtain $X_{dct} \in \mathbb{R}^{H \times W \times C}$;
(2) Applying LayerNorm along the channel dimension of $X_{dct}$ to obtain $X_{LN} = \mathrm{LN}(X_{dct})$;
(3) Passing $X_{LN}$ through a 1×1 Conv followed by a 3×3 DConv and partitioning the result into 8×8 blocks to obtain $Q, K, V \in \mathbb{R}^{K \times n \times C}$, where $n = 64$ and $K = H \times W / n$;
(4) Obtaining the attention matrix $A \in \mathbb{R}^{K \times C \times C}$ from $Q$ and $K$ through a matrix multiplication operation;
(5) Multiplying $V$ by the attention matrix $A$ to obtain $V_{attn}$, performing an MLP-GeLU operation along the second dimension of $V$ to obtain $V_{MGate}$, and multiplying $V_{attn}$ and $V_{MGate}$ to obtain $V_{out} = V_{attn} \times V_{MGate}$;
(6) Performing inverse blocking on $V_{out}$ to obtain $Z'_{dct} \in \mathbb{R}^{H \times W \times C}$, and applying a 1×1 Conv to obtain $Z_{dct} \in \mathbb{R}^{H \times W \times C}$;
(7) Calculating the 2D inverse discrete cosine transform of $Z_{dct}$ to obtain $Z \in \mathbb{R}^{H \times W \times C}$.
Through the above operations, the invention models the local correlation of the frequency domain information. The final output is calculated as $Y = X_{in} + Z$.
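Steps (1) to (7) can be traced end to end in a simplified NumPy sketch. It is a sketch under stated assumptions rather than the patented implementation: the 1×1 convolutions are modeled as per-pixel linear maps, the 3×3 DConv (presumably a depthwise convolution) is omitted, a softmax over the channel attention scores is assumed although the description only mentions matrix multiplication, LayerNorm has no learnable affine parameters, and all names (`loft_block`, the keys of `p`) are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of shape (n, n)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def dct2(x):
    """2D DCT over the first two axes of an (H, W, C) array."""
    return np.einsum("ah,hwc,bw->abc", dct_matrix(x.shape[0]), x, dct_matrix(x.shape[1]))

def idct2(x):
    """Inverse of dct2 (the DCT matrices are orthonormal)."""
    return np.einsum("ah,abc,bw->hwc", dct_matrix(x.shape[0]), x, dct_matrix(x.shape[1]))

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def gelu(x):
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def to_windows(x, w=8):
    """(H, W, C) -> (K, n, C) with n = w * w and K = H * W / n."""
    h, wd, c = x.shape
    x = x.reshape(h // w, w, wd // w, w, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, w * w, c)

def from_windows(x, h, wd, w=8):
    """Inverse of to_windows: (K, n, C) -> (H, W, C)."""
    c = x.shape[-1]
    x = x.reshape(h // w, wd // w, w, w, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(h, wd, c)

def loft_block(x_in, p, w=8):
    h, wd, _ = x_in.shape
    x_dct = dct2(x_in)                                    # step (1)
    x_ln = layer_norm(x_dct)                              # step (2)
    q, k, v = (to_windows(x_ln @ p[name], w)              # step (3), DConv omitted
               for name in ("wq", "wk", "wv"))
    attn = softmax(np.einsum("knc,knd->kcd", q, k))       # step (4): (K, C, C)
    v_attn = np.einsum("knd,kcd->knc", v, attn)           # step (5): attention on V
    v_gate = gelu(np.einsum("mn,knc->kmc", p["w_gate"], v))  # step (5): MLP gate
    v_out = v_attn * v_gate
    z_dct = from_windows(v_out, h, wd, w) @ p["wo"]       # step (6)
    z = idct2(z_dct)                                      # step (7)
    return x_in + z                                       # Y = X_in + Z
```

Because attention is computed between channels inside each 8×8 window of coefficients, the attention matrix per window is only C×C, independent of the image resolution.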
Referring to fig. 4, the frequency domain features after LN are distributed more uniformly than the frequency domain features before LN.
Referring to fig. 5, the difference between the frequency domain local feature attention mechanism proposed by the present invention and other mainstream MLP methods and attention mechanism methods is that the frequency domain local feature attention mechanism proposed by the present invention uses features contained in a local window of a frequency domain feature to implement global modeling of spatial information.
Referring to FIG. 6, the large image is the original sharp image, and the small images are, from top left to bottom right, the sharp image, the blurred image, and the results of MIMO-UNet+, MPRNet, DeepRFT+, NAFNet, Restormer and LoFormer-B, respectively. Comparing the Restormer result with the LoFormer-B result shows that the invention effectively improves the deblurring capability of the convolutional neural network, and that the frequency domain local feature attention mechanism network (LoFormer) deblurs better than current mainstream deblurring neural networks.
Example two
The invention also provides an image blind deblurring system based on the frequency domain local feature attention mechanism, which comprises: the system comprises a data set acquisition module, a training set acquisition module, a network acquisition module and an image acquisition module;
The data set acquisition module is used for acquiring an image deblurring data set;
The training set acquisition module is used for preprocessing the image deblurring data set to obtain a training set of the image deblurring data set;
The network acquisition module is used for training the initial frequency domain local feature attention mechanism network based on the blurred image and the clear image in the training set to obtain a target frequency domain local feature attention mechanism network;
The image acquisition module is used for inputting the blurred image to be detected into the target frequency domain local feature attention mechanism network to perform image blind deblurring processing, and obtaining a target clear image.
In this embodiment, the image deblurring dataset comprises: the GoPro dataset, the HIDE dataset, the RealBlur dataset and the REDS dataset.
In this embodiment, the network acquisition module includes: an output image obtaining unit, a calculating unit, and a network obtaining unit;
the output image obtaining unit is used for inputting the blurred image into an initial frequency domain local feature attention mechanism network to obtain an output image;
The calculating unit is used for calculating a loss based on the difference between the output image and the clear image, back-propagating the gradient, and updating the parameters of the initial frequency domain local feature attention mechanism network;
The network obtaining unit is used for repeating the training steps of the output image obtaining unit and the calculating unit until the number of training iterations reaches the preset number, obtaining the target frequency domain local feature attention mechanism network.
In this embodiment, in the output image obtaining unit, the process of inputting the blurred image into the initial frequency domain local feature attention mechanism network to obtain the output image includes:
Inputting the blurred image into an initial frequency domain local feature attention mechanism network, and processing the blurred image through a U-shaped coder and decoder to obtain corresponding blurred features;
And adding the blurred image and the corresponding blurred features to obtain an output image.
In this embodiment, in the calculating unit, the process of calculating the loss based on the difference between the output image and the clear image comprises:
Comparing the clear image in the training set with the output image of the frequency domain local feature attention mechanism network to obtain the difference;
And calculating the loss from the difference.
The main feature of the invention is a frequency domain local feature attention mechanism network (LoFormer) for the image deblurring task. Unlike previous Transformer-based methods, which learn either local self-attention or coarse-grained global self-attention to reduce computational complexity, LoFormer models both coarse-grained and fine-grained long-range dependencies simply by performing channel self-attention within each local window of the frequency domain features. To filter out invalid features and enhance the global learning ability, the invention further designs an MLP gating mechanism that strengthens the aggregation of frequency domain information. LoFormer addresses the insufficient fitting of high-frequency (detail) information by conventional coarse-grained global self-attention and effectively improves the deblurring performance of the model.
The above embodiments are merely illustrative of the preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, but various modifications and improvements made by those skilled in the art to which the present invention pertains are made without departing from the spirit of the present invention, and all modifications and improvements fall within the scope of the present invention as defined in the appended claims.
Claims (10)
1. The image blind deblurring method based on the frequency domain local feature attention mechanism is characterized by comprising the following steps of:
acquiring an image deblurring dataset;
Preprocessing the image deblurring data set to obtain a training set of the image deblurring data set;
Training the initial frequency domain local feature attention mechanism network based on the blurred image and the clear image in the training set to obtain a target frequency domain local feature attention mechanism network;
inputting the blurred image to be detected into the target frequency domain local feature attention mechanism network to perform image blind deblurring processing to obtain a target clear image;
The target frequency domain local feature attention mechanism network adopts a U-shaped blurred image restoration network structure UNet, and comprises an encoder, a latent layer feature processing module and a decoder, wherein the encoder and the decoder each comprise 3 scales, and three lateral connections exist between the encoder and the decoder; the experimental data first pass through the encoder and the latent layer feature processing module and then through the decoder to obtain a blurred feature prediction, which is added to the corresponding input to obtain a restored image; two models, LoFormer-S and LoFormer-B, are designed by changing the number of frequency domain local feature attention modules (LoFT blocks) contained in the encoder, the latent layer feature processing module and the decoder at different scales; wherein in LoFormer-S, Stages 1-4 of the encoder and decoder contain 2, 4, 6 and 14 LoFT blocks, respectively, and the Refinement stage contains 2 LoFT blocks; in LoFormer-B, Stages 1-4 of the encoder and decoder contain 2, 4, 12 and 18 LoFT blocks, respectively, and the Refinement stage contains 2 LoFT blocks;
The frequency domain local feature attention module processing procedure is as follows:
Calculating the 2D discrete cosine transform of $X_{in}$ to obtain $X_{dct} \in \mathbb{R}^{H \times W \times C}$;
Applying LayerNorm along the channel dimension of $X_{dct}$ to obtain $X_{LN} = \mathrm{LN}(X_{dct})$;
Passing $X_{LN}$ through a 1×1 Conv followed by a 3×3 DConv and partitioning the result into 8×8 blocks to obtain $Q, K, V \in \mathbb{R}^{K \times n \times C}$, where $n = 64$ and $K = H \times W / n$;
Obtaining the attention matrix $A \in \mathbb{R}^{K \times C \times C}$ from $Q$ and $K$ through a matrix multiplication operation;
Multiplying $V$ by the attention matrix $A$ to obtain $V_{attn}$, performing an MLP-GeLU operation along the second dimension of $V$ to obtain $V_{MGate}$, and multiplying $V_{attn}$ and $V_{MGate}$ to obtain $V_{out} = V_{attn} \times V_{MGate}$;
Performing inverse blocking on $V_{out}$ to obtain $Z'_{dct} \in \mathbb{R}^{H \times W \times C}$, and applying a 1×1 Conv to obtain $Z_{dct} \in \mathbb{R}^{H \times W \times C}$;
Calculating the 2D inverse discrete cosine transform of $Z_{dct}$ to obtain $Z \in \mathbb{R}^{H \times W \times C}$;
Modeling the local correlation of the frequency domain information, and finally obtaining the output through the calculation $Y = X_{in} + Z$;
the frequency domain features after LN are distributed more uniformly than the frequency domain features before LN.
2. The method of image blind deblurring based on a frequency domain local feature attention mechanism of claim 1, wherein the image deblurring dataset comprises: the GoPro dataset, the HIDE dataset, the RealBlur dataset and the REDS dataset.
3. The method for blind deblurring of images based on frequency domain local feature attention mechanisms of claim 1, wherein the method for training the initial frequency domain local feature attention mechanism network based on blurred images and sharp images in the training set comprises:
S1: inputting the blurred image into an initial frequency domain local feature attention mechanism network to obtain an output image;
S2: calculating a loss based on the difference between the output image and the clear image, back-propagating the gradient, and updating the parameters of the initial frequency domain local feature attention mechanism network;
S3: and repeating the step S1 and the step S2 until the training times reach the preset number, and obtaining the target frequency domain local feature attention mechanism network.
4. A method of blind deblurring of an image based on a frequency domain local feature attention mechanism according to claim 3, wherein inputting the blurred image into an initial frequency domain local feature attention mechanism network, the method of obtaining an output image comprises:
inputting the blurred image into the initial frequency domain local feature attention mechanism network, and processing the blurred image through a U-shaped coder and decoder to obtain corresponding blurred features;
and adding the blurred image and the corresponding blurred feature to obtain an output image.
5. The method for blind deblurring of an image based on a frequency domain local feature attention mechanism according to claim 3, wherein the method for calculating a loss based on the difference between the output image and the clear image comprises:
comparing the clear image in the training set with the output image of the frequency domain local feature attention mechanism network to obtain the difference;
and calculating the loss from the difference.
6. An image blind deblurring system based on a frequency domain local feature attention mechanism, comprising: a data set acquisition module, a training set acquisition module, a network acquisition module and an image acquisition module;
the data set acquisition module is used for acquiring an image deblurring data set;
The training set acquisition module is used for preprocessing the image deblurring data set to acquire a training set of the image deblurring data set;
the network acquisition module is used for training the initial frequency domain local feature attention mechanism network based on the blurred image and the clear image in the training set to obtain a target frequency domain local feature attention mechanism network;
the image acquisition module is used for inputting a blurred image to be processed into the target frequency domain local feature attention mechanism network to perform image blind deblurring processing to obtain a target clear image;
the target frequency domain local feature attention mechanism network adopts a U-shaped blurred-image restoration network structure (UNet) comprising an encoder, a latent layer feature processing module and a decoder, wherein the encoder and the decoder each comprise 3 scales, and three lateral connections exist between the encoder and the decoder; the input data first pass through the encoder and the latent layer feature processing module, and finally through the decoder to obtain a prediction of the blurred features, which is then added to the corresponding input to obtain the restored image; two models, LoFormer-S and LoFormer-B, are designed by varying the number of frequency domain local feature attention modules (LoFT blocks) contained in the encoder, the latent layer feature processing module and the decoder at different scales; wherein, in LoFormer-S, the encoder and decoder of Stages 1 to 4 contain 2, 4, 6 and 14 LoFT blocks respectively, and the refinement stage contains 2 LoFT blocks; in LoFormer-B, the encoder and decoder of Stages 1 to 4 contain 2, 4, 12 and 18 LoFT blocks respectively, and the refinement stage contains 2 LoFT blocks;
the processing procedure of the frequency domain local feature attention module is as follows:
calculating the 2D discrete cosine transform of $X_{in}$ to obtain $X_{dct} \in \mathbb{R}^{H \times W \times C}$;
performing LayerNorm (LN) on the channel dimension of $X_{dct}$ to obtain $X_{LN} = \mathrm{LN}(X_{dct})$;
passing $X_{LN}$ through a 1×1 Conv followed by a 3×3 DConv and partitioning the result into 8×8 blocks to obtain $Q, K, V \in \mathbb{R}^{K \times n \times C}$, where $n = 64$ and $K = H \times W / n$;
obtaining an attention matrix $A \in \mathbb{R}^{K \times C \times C}$ from $Q$ and $K$ through a matrix multiplication operation;
performing a matrix multiplication of $V$ and the attention matrix $A$ to obtain $V_{attn}$, performing an MLP-GeLU operation on the second dimension of $V$ to obtain $V_{MGate}$, and multiplying $V_{attn}$ and $V_{MGate}$ to obtain $V_{out} = V_{attn} \times V_{MGate}$;
performing inverse blocking on $V_{out}$ to obtain $Z'_{dct} \in \mathbb{R}^{H \times W \times C}$, and applying a 1×1 Conv to obtain $Z_{dct} \in \mathbb{R}^{H \times W \times C}$;
calculating the 2D inverse discrete cosine transform of $Z_{dct}$ to obtain $Z \in \mathbb{R}^{H \times W \times C}$;
the local correlation of the frequency domain information is thereby modeled, and the output is finally obtained by calculating $Y = X_{in} + Z$;
the frequency domain features after LN are distributed more uniformly than the frequency domain features before LN.
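The processing procedure of the frequency domain local feature attention module recited above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the patented implementation: DConv is taken to be a zero-padded depthwise convolution, LN has no learnable affine terms, a softmax is applied to the attention matrix (the claim only recites a matrix multiplication), the MLP is a single linear layer over the n = 64 positions, the final product is elementwise, and the weights are random. The parameter names `w_qkv`, `k_dw`, `w_gate` and `w_out` and all helper names are hypothetical. H and W must be multiples of 8.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II matrix D: D @ x is the DCT, D.T @ y its inverse.
    k, i = np.arange(n)[:, None], np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def dct2(x):   # 2D DCT over the two spatial axes of an (H, W, C) array
    return np.einsum("ah,hwc,bw->abc", dct_matrix(x.shape[0]), x, dct_matrix(x.shape[1]))

def idct2(x):  # 2D inverse DCT
    return np.einsum("ah,abc,bw->hwc", dct_matrix(x.shape[0]), x, dct_matrix(x.shape[1]))

def layer_norm(x, eps=1e-6):  # LN over the channel dimension, no affine terms
    mu, var = x.mean(-1, keepdims=True), x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def depthwise_conv3x3(x, k):  # x: (H, W, C), k: (3, 3, C), zero padding
    h, w, _ = x.shape
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    return sum(p[i:i + h, j:j + w] * k[i, j] for i in range(3) for j in range(3))

def to_blocks(x, b=8):        # (H, W, C) -> (K, n, C) with n = b*b
    h, w, c = x.shape
    return x.reshape(h // b, b, w // b, b, c).transpose(0, 2, 1, 3, 4).reshape(-1, b * b, c)

def from_blocks(x, h, w, b=8):  # inverse blocking: (K, n, C) -> (H, W, C)
    c = x.shape[-1]
    return x.reshape(h // b, w // b, b, b, c).transpose(0, 2, 1, 3, 4).reshape(h, w, c)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gelu(x):  # tanh approximation of GeLU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def init_params(c, n=64, seed=0):  # random weights, for shape checking only
    rng = np.random.default_rng(seed)
    return {"w_qkv": rng.normal(0, 0.1, (c, 3 * c)), "k_dw": rng.normal(0, 0.1, (3, 3, 3 * c)),
            "w_gate": rng.normal(0, 0.1, (n, n)), "w_out": rng.normal(0, 0.1, (c, c))}

def loft_block(x_in, p):
    h, w, _ = x_in.shape
    x_ln = layer_norm(dct2(x_in))                              # DCT, then LN
    qkv = depthwise_conv3x3(x_ln @ p["w_qkv"], p["k_dw"])      # 1x1 Conv, 3x3 DConv
    q, k, v = (to_blocks(t) for t in np.split(qkv, 3, axis=-1))  # each (K, n, C)
    attn = softmax(np.einsum("knc,knd->kcd", q, k))            # A: (K, C, C); softmax assumed
    v_attn = np.einsum("knc,kcd->knd", v, attn)                # V times A
    v_gate = gelu(np.einsum("knc,nm->kmc", v, p["w_gate"]))    # MLP-GeLU over dimension n
    z_dct = from_blocks(v_attn * v_gate, h, w) @ p["w_out"]    # inverse blocking, 1x1 Conv
    return x_in + idct2(z_dct)                                 # Y = X_in + Z

x = np.random.default_rng(1).normal(size=(16, 16, 4))
y = loft_block(x, init_params(c=4))
print(y.shape)  # (16, 16, 4)
```

Because the attention matrix is C×C per 8×8 block, its cost grows with the number of blocks rather than with the square of the number of pixels.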
7. The image blind deblurring system based on a frequency domain local feature attention mechanism of claim 6, wherein the image deblurring dataset comprises: the GoPro dataset, the HIDE dataset, the RealBlur dataset and the REDS dataset.
8. The image blind deblurring system based on a frequency domain local feature attention mechanism of claim 6, wherein said network acquisition module comprises: an output image obtaining unit, a calculating unit, and a network obtaining unit;
The output image obtaining unit is used for inputting the blurred image into an initial frequency domain local feature attention mechanism network to obtain an output image;
The calculating unit is used for calculating a loss based on the difference between the output image and the clear image, performing gradient backpropagation, and updating the parameters of the initial frequency domain local feature attention mechanism network;
The network obtaining unit is used for repeating the training steps of the output image obtaining unit and the calculating unit until the number of training iterations reaches a preset number, to obtain the target frequency domain local feature attention mechanism network.
9. The image blind deblurring system based on the frequency domain local feature attention mechanism according to claim 8, wherein the process of inputting the blurred image into the initial frequency domain local feature attention mechanism network in the output image obtaining unit, to obtain an output image, comprises:
inputting the blurred image into the initial frequency domain local feature attention mechanism network, and processing the blurred image through a U-shaped encoder-decoder to obtain corresponding blurred features;
and adding the blurred image and the corresponding blurred feature to obtain an output image.
10. The image blind deblurring system based on the frequency domain local feature attention mechanism according to claim 8, wherein the process of calculating the loss based on the difference between the output image and the clear image in the calculating unit comprises:
comparing the clear image in the training set with the output image of the frequency domain local feature attention mechanism network to obtain the difference;
and calculating the loss from the difference.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310764762.8A CN116823656B (en) | 2023-06-27 | 2023-06-27 | Image blind deblurring method and system based on frequency domain local feature attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310764762.8A CN116823656B (en) | 2023-06-27 | 2023-06-27 | Image blind deblurring method and system based on frequency domain local feature attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116823656A CN116823656A (en) | 2023-09-29 |
CN116823656B CN116823656B (en) | 2024-06-28 |
Family
ID=88140666
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310764762.8A Active CN116823656B (en) | 2023-06-27 | 2023-06-27 | Image blind deblurring method and system based on frequency domain local feature attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116823656B (en) |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11966839B2 (en) * | 2017-10-25 | 2024-04-23 | Deepmind Technologies Limited | Auto-regressive neural network systems with a soft attention mechanism using support data patches |
CN111709895B (en) * | 2020-06-17 | 2023-05-16 | 中国科学院微小卫星创新研究院 | Image blind deblurring method and system based on attention mechanism |
CN114913565B (en) * | 2021-01-28 | 2023-11-17 | 腾讯科技(深圳)有限公司 | Face image detection method, model training method, device and storage medium |
CN114240764B (en) * | 2021-11-12 | 2024-04-23 | 清华大学 | De-blurring convolutional neural network training method, device, equipment and storage medium |
US20230196520A1 (en) * | 2021-12-20 | 2023-06-22 | POSTECH Research and Business Development Foundation | Inverse kernel-based defocus deblurring method and apparatus |
CN114677304B (en) * | 2022-03-28 | 2024-08-23 | 东南大学 | Image deblurring algorithm based on knowledge distillation and deep neural network |
CN114998296A (en) * | 2022-06-24 | 2022-09-02 | 常州大学 | Thyroid nodule segmentation method based on improved Unet network |
2023
- 2023-06-27 CN CN202310764762.8A patent/CN116823656B/en active Active
Non-Patent Citations (2)
Title |
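---|
"Research on Key Technologies of Image Deblurring Based on Prior Learning"; Chen Lei; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2023-01-15; full text * |
Xintian Mao et al. "Intriguing Findings of Frequency Selection for Image Deblurring". AAAI-23. Full text. * |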
Also Published As
Publication number | Publication date |
---|---|
CN116823656A (en) | 2023-09-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bai et al. | Graph-based blind image deblurring from a single photograph | |
Zhang | The mean field theory in EM procedures for blind Markov random field image restoration | |
CN110443768B (en) | Single-frame image super-resolution reconstruction method based on multiple consistency constraints | |
Matakos et al. | Accelerated edge-preserving image restoration without boundary artifacts | |
CN115409733B (en) | Low-dose CT image noise reduction method based on image enhancement and diffusion model | |
Ma et al. | Efficient and fast real-world noisy image denoising by combining pyramid neural network and two-pathway unscented Kalman filter | |
CN104134196B (en) | Split Bregman weight iteration image blind restoration method based on non-convex higher-order total variation model | |
Wang et al. | A hybrid model for image denoising combining modified isotropic diffusion model and modified Perona-Malik model | |
CN104835130A (en) | Multi-exposure image fusion method | |
CN108932699B (en) | Three-dimensional matching harmonic filtering image denoising method based on transform domain | |
Min et al. | Blind deblurring via a novel recursive deep CNN improved by wavelet transform | |
CN114897741B (en) | Image blind deblurring method based on depth residual Fourier transform | |
Guidotti et al. | Image restoration with a new class of forward-backward-forward diffusion equations of Perona--Malik type with applications to satellite image enhancement | |
Liu et al. | Image denoising based on improved bidimensional empirical mode decomposition thresholding technology | |
CN113570516A (en) | Image blind motion deblurring method based on CNN-Transformer hybrid self-encoder | |
Deng | Guided wavelet shrinkage for edge-aware smoothing | |
Jiang et al. | Enhanced frequency fusion network with dynamic hash attention for image denoising | |
CN117726540A (en) | Image denoising method for enhanced gate control converter | |
Shukla et al. | Adaptive fractional masks and super resolution based approach for image enhancement | |
An et al. | Image super-resolution reconstruction algorithm based on significant network connection-collaborative migration structure | |
Wu et al. | A convex variational approach for image deblurring with multiplicative structured noise | |
Ally et al. | Diffusion-driven image denoising model with texture preservation capabilities | |
CN116823656B (en) | Image blind deblurring method and system based on frequency domain local feature attention mechanism | |
CN117078780A (en) | Deep learning-based micro-fossil CT image preprocessing method and device | |
Tun et al. | Joint Training of Noisy Image Patch and Impulse Response of Low-Pass Filter in CNN for Image Denoising |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||