CN114022506B - Image restoration method for edge prior fusion multi-head attention mechanism - Google Patents
- Publication number
- CN114022506B (application CN202111356234.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T7/13 — Image analysis; Segmentation; Edge detection
- G06F18/214 — Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2415 — Classification techniques based on parametric or probabilistic models
- G06N3/045 — Neural networks; Combinations of networks
- G06N3/08 — Neural networks; Learning methods
- G06T3/4038 — Image mosaicing, e.g. composing plane images from plane sub-images
- G06T2207/20081 — Indexing scheme for image analysis; Training; Learning
Abstract
The invention relates to the technical field of image restoration and discloses an image restoration method with an edge-prior fused multi-head attention mechanism, comprising: step S1: acquiring experimental data comprising a training set and a test set, preprocessing the data, and extracting an edge map from each preprocessed image; step S2: constructing an edge-prior fused multi-head attention repair model, which comprises an edge repair model and an image repair model, wherein the edge repair model takes the extracted edge map, the original image and a mask image as inputs and outputs a repaired edge image, and the image repair model is trained with the repaired edge image and the defective image as inputs. By fusing a multi-head attention mechanism, the method extracts richer long-distance pixel dependencies and thereby improves the image restoration effect.
Description
Technical Field
The invention relates to the technical field of image restoration, in particular to an image restoration method of an edge prior fusion multi-head attention mechanism.
Background
In the information society, images are among the most important sources of information. How to obtain more complete and clearer images has become a hotspot in the field of computer vision; related applications include image restoration and super-resolution. Image restoration refers to the technique of recovering a complete image from the remaining image information in a corrupted image. For the human eye this is not a laborious task, but for computer vision it is rather challenging. The technique has many practical applications, such as image restoration proper (removing photo scratches and text occlusions), photo editing (removing unwanted objects), and image coding and transmission (restoring image blocks lost to data-packet loss during network transmission). Image restoration has therefore been a very active research field in recent years.
At present, image restoration based on generative adversarial networks has become mainstream. A generative adversarial network lets a network model generate images that resemble the training data but do not actually exist, achieving a convincingly realistic effect, and in recent years improvements to generative adversarial networks that exploit this generative property have continuously advanced image restoration. In the prior art, however, convolutional neural networks attend only to the pixel values of local areas when learning features and ignore the influence that correlations between pixels in remote areas have on image generation and restoration.
Disclosure of Invention
Aiming at the defects existing in the prior art, the invention aims to provide an image restoration method with an edge priori fusion multi-head attention mechanism.
In order to achieve the above object, the present invention provides the following technical solutions:
An image restoration method with an edge-prior fused multi-head attention mechanism comprises the following steps:
Step S1: acquiring experimental data and preprocessing the data, wherein the experimental data comprises a training set and a testing set, and extracting an edge map of an image from the preprocessed image;
Step S2: the method comprises the steps of constructing an edge first-fusion multi-attention mechanism repair model, wherein the edge repair model comprises an edge repair model and an image repair model, the edge repair model takes an extracted edge image, an original image and a mask image as inputs, outputs the extracted edge image and the original image as repaired edge images, and the image repair model trains by taking the repaired edge images and the defect images as inputs;
The image restoration model comprises an image restoration device, wherein the image restoration device generates restoration pictures after repeated sampling of restored edge images, repeated residual convolution based on expansion convolution, one multi-head attention network and two deconvolutions;
step S3: and evaluating the result of fusing the multi-attention mechanism repair model on the edge through the test set.
In the invention, further, the edge repair model comprises an edge restorer, which downsamples the extracted edge map, the original image and the mask image, and converts the feature map into a single-channel edge map after multiple dilated-convolution-based residual convolutions and two deconvolutions.
In the invention, further, the edge repair model repair method comprises the following steps:
Step S20: obtaining a predicted edge restoration result of the edge restoration device, obtaining a generation result of an edge restoration model according to the predicted edge restoration result, reserving an image edge of an already-regional area, and filling an edge part needing restoration in the missing region, wherein the generation result comprises the following steps:
Cp=Ge(M,C,Igray)
C+=C·(1-M)+Cp·M
Where Cp represents the predicted edge repair image, Ge represents the edge restorer, M represents the mask image, C represents the edge map of the image to be repaired, Igray represents the gray-scale map of the image to be repaired, and C+ represents the repaired edge image generated by the edge repair model.
In the invention, further, the repairing method of the edge repair model further comprises the following steps:
Step S21, calculating a loss function of the edge restorer, wherein the loss function is a weighted summation of the generated edge countermeasure loss and the edge characteristic loss;
step S22: and optimizing the generation result of the edge restoration model to obtain a restored edge image.
In the present invention, further, the method for repairing the image repairing model includes:
Step S23: obtaining a predicted repair image by using tensors spliced by the repaired edge image and the damaged image as input, and obtaining the repair image according to the predicted repair image:
Ip=Gi(M,C+IM)
I+=I·(1-M)+Ip·M
Wherein, I p is a predicted repair image, I is a real image, G i is an image healer C + is a repair edge image, I M;
step S24: calculating an image restoration loss function, and optimizing a restoration result of an image restoration model, wherein the image restoration loss function comprises image contrast loss, style loss and perception loss, and the calculation method comprises the following steps:
wherein, lambda 3, lambda 4, lambda 5 and lambda 6 are custom super-parameters, The contrast loss generated for the image restoration model,For style loss,/>Is a perceived loss.
In the present invention, further, the image restorer generating the repaired picture from the repaired edge image after multiple downsamplings, multiple dilated-convolution-based residual convolutions, one multi-head attention network and two deconvolutions comprises: Step S2-1: applying different convolutional transformations to the feature map obtained through the convolution layers and the residual network to obtain several groups of query, key and value feature maps;
step S2-2: acquiring a reconstructed feature map;
Step S2-3: splicing the reconstructed feature images according to the channel dimension to obtain a plurality of attention combination results;
Step S2-4: after the original input feature size is converted by the convolution network conversion, the restored reconstructed feature map and the original feature map are added to be used as a final output restoration picture result.
In the present invention, further, the step S2-2 of obtaining the reconstructed feature map includes:
Step S2-2-1: converting the key feature map into a rank, and performing dot product operation on groups between the query feature map and the key feature map after converting the rank to obtain a plurality of groups of correlation attention matrixes;
Step S2-2-2: normalizing the correlation attention moment array;
step S2-2-3: and carrying out matrix multiplication operation on the normalized self-attention matrix of each group of correlations and the value feature map of the group to obtain a reconstructed feature map of the group.
Step S2-3 splices the reconstructed feature maps along the channel dimension; obtaining the multi-head attention combination result comprises the following steps:
step S2-3-1: obtaining the attention result of the i-th head:

headi = Attention(Qi, Ki, Vi) = softmax(Qi·KiT/√dk)·Vi

wherein Qi, Ki and Vi represent the query, key and value feature-map matrices of the i-th head, and dk is the key channel dimension;
step S2-3-2: splicing the self-attention results of all heads, and using a WO matrix to fuse and project the several feature subspaces back to the original matrix size, finally obtaining the multi-head self-attention combination result:

MultiHead = Concat(head1, head2, ..., headh)·WO

wherein WO is a learnable projection matrix.
The style loss calculation method is:

Lstyle = Ei[ ‖Gri(Ip) − Gri(I)‖1 ]

wherein Gri(Ip) represents the Gram matrix of the predicted-image activation inner products, Gri(I) represents the Gram matrix of the true-image activation inner products, and cihiwi represents the dimension of the i-th activation feature, used to normalize the Gram matrix.
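A small NumPy sketch of the Gram-matrix style loss above; the random activations and the ℓ1 distance are illustrative stand-ins for the pretrained-network features used in practice:

```python
import numpy as np

def gram(feat):
    """Gram matrix of an (h, w, c) activation, normalized by c*h*w."""
    h, w, c = feat.shape
    f = feat.reshape(h * w, c)
    return f.T @ f / (c * h * w)

def style_loss(feat_pred, feat_true):
    """Mean absolute difference between Gram matrices (one feature layer)."""
    return np.abs(gram(feat_pred) - gram(feat_true)).mean()

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8, 4))   # mock activation of the predicted image
```

For identical activations the loss is exactly zero, which is the sanity check one would run before wiring this into a training loop.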
In the present invention, further, in step S2-2-3 the elements of the value feature map are weighted and reconstructed pixel by pixel using the group's correlation attention matrix, the weights of the other elements in the weighted reconstruction being the corresponding values in the correlation attention matrix.
Compared with the prior art, the invention has the beneficial effects that:
According to the invention, a multi-head attention network capable of capturing richer long-distance correlations between pixel regions is added after the last residual layer of the image repair model. To let the model learn information in different subspaces, several parallel attention computations are used; each head processes different information, so features of different parts can be handled and richer long-distance correlations extracted. The multi-head self-attention network can learn correlation matrices of different patterns, which plays a very important role in improving repair results and thus the restoration effect of the image.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a general flow chart of an image restoration method of the edge prior fusion multi-head attention mechanism of the present invention;
FIG. 2 is a flowchart of step S2 in an image restoration method of an edge prior fusion multi-head attention mechanism of the present invention;
FIG. 3 is a workflow diagram of step S2-2 and step S2-3 in an image restoration method of an edge prior fusion multi-head attention mechanism of the present invention;
FIG. 4 is a flowchart of an implementation of a method for repairing an edge repair model in an image repair method of an edge prior fusion multi-head attention mechanism of the present invention;
FIG. 5 is a schematic diagram of acquiring a query, key, value feature map in an image restoration method of an edge prior fusion multi-head attention mechanism of the present invention;
FIG. 6 is a schematic diagram of a correlation attention matrix acquisition flow in an image restoration method of an edge prior fusion multi-head attention mechanism according to the present invention
FIG. 7 is a schematic flow chart of a reconstructed feature map in an image restoration method of an edge prior fusion multi-head attention mechanism of the invention;
FIG. 8 is a diagram of a multi-head self-attention layer network architecture in an image restoration method of the edge prior fusion multi-head attention mechanism of the present invention;
FIG. 9 is a schematic diagram of an edge restoration model construction framework in an image restoration method of an edge prior fusion multi-head attention mechanism of the invention;
FIG. 10 is a schematic diagram of an edge image restoration model construction framework in an image restoration method of an edge prior fusion multi-head attention mechanism of the invention;
Fig. 11 is a schematic diagram of experimental results of an image restoration method of the edge prior fusion multi-head attention mechanism of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It will be understood that when an element is referred to as being "fixed to" another element, it can be directly on the other element or intervening elements may also be present. When a component is considered to be "connected" to another component, it can be directly connected to the other component or intervening components may also be present. When an element is referred to as being "disposed on" another element, it can be directly on the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like are used herein for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Referring to fig. 1, a preferred embodiment of the present invention provides an image restoration method for edge prior fusion multi-head attention mechanism, comprising
Step S1: acquiring experimental data and preprocessing the data, wherein the experimental data comprises a training set and a testing set, and extracting an edge map of an image from the preprocessed image;
Step S2: the method comprises the steps of constructing an edge first-fusion multi-attention mechanism repair model, wherein the edge repair model comprises an edge repair model and an image repair model, the edge repair model takes an extracted edge image, an original image and a mask image as inputs and outputs the extracted edge image and the original image and the mask image as repaired edge images, and the image repair model trains by taking the repaired edge images and defect images as inputs;
The image repair model comprises an image restorer, which generates the repaired picture from the repaired edge image after multiple downsamplings, multiple dilated-convolution-based residual convolutions, one multi-head attention network and two deconvolutions;
step S3: and evaluating the result of fusing the multi-attention mechanism repair model on the edge through the test set.
Specifically, the method collects a sufficient number of good-quality related images according to the experimental requirements to complete data acquisition, then performs preliminary preprocessing to obtain data meeting the standard, divides the dataset into a training set and a test set, builds the image repair model step by step according to the algorithm design, trains the model with the training set, and evaluates the model's effect with the test set. In this scheme, the multi-head attention mechanism is fused into the edge-prior repair model so that richer long-distance pixel dependencies are extracted, improving the image repair effect.
In the invention, the CelebA public dataset is adopted, and the pictures are resized to 256×256 before being used in experiments. Since the dataset is not pre-divided into training, validation and test sets, the first 180,000 pictures are selected for training the model, and 4,000 pictures are selected as the test set for analysis and comparison of experimental results. In addition, the mask images used during model training are taken from an irregular mask dataset, whose irregular masks are divided into six groups according to the ratio of the missing area to the whole image, namely 0–10%, 10–20%, 20–30%, 30–40%, 40–50% and 50–60%. Each group contains 2,000 images: 1,000 mask images represent cases where the image boundary is missing, and the other 1,000 represent cases where the image boundary is intact.
For the original input images of the training set, the edge map is extracted by Canny edge detection, which comprises four steps: Gaussian filtering; computing gradient magnitudes and directions; non-maximum suppression; and edge detection with upper and lower (hysteresis) thresholds. The resulting edge map of the original image is then masked with the binary mask, and the repaired edge image is generated by the adversarial network. In the test task, 4,000 pictures are taken from the CelebA dataset as the test set, and masks with random missing regions are used to simulate the defects; the hand-drawn masks used for testing are divided into four groups according to the ratio of the missing area, from small to large, each group containing 1,000 masks, 4,000 in total.
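The four Canny stages can be approximated in a few lines. The sketch below is a simplified stand-in, not the exact detector used in the experiments: it performs Gaussian-like smoothing, Sobel-style gradients and a double threshold, and omits the non-maximum-suppression and hysteresis-linking bookkeeping:

```python
import numpy as np

def sobel_edges(img, lo=0.1, hi=0.3):
    """Tiny Canny-like sketch: blur, gradient magnitude, double threshold.
    `img` is a 2-D float array in [0, 1]; returns a boolean edge map."""
    # 3x3 Gaussian-like blur
    k = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float) / 16.0
    pad = np.pad(img, 1, mode="edge")
    blur = sum(k[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3))
    # central-difference gradients (Sobel-style)
    gx = np.zeros_like(blur)
    gy = np.zeros_like(blur)
    gx[:, 1:-1] = blur[:, 2:] - blur[:, :-2]
    gy[1:-1, :] = blur[2:, :] - blur[:-2, :]
    mag = np.hypot(gx, gy)
    strong = mag >= hi
    weak = (mag >= lo) & ~strong
    return strong | weak   # no NMS / hysteresis linking in this sketch

img = np.zeros((16, 16))
img[:, 8:] = 1.0              # vertical step edge at column 8
edges = sobel_edges(img)
```

A production pipeline would instead call a full Canny implementation; the point here is only the flow of the four stages named in the text.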
In the present invention, further, as shown in fig. 9, the edge repair model comprises an edge restorer, which downsamples the extracted edge map, the original image and the mask image, and converts the feature map into a single-channel edge map after multiple dilated-convolution-based residual convolutions and two deconvolutions. Specifically, the scheme converts the feature map into a single-channel edge map after 3 downsamplings, 8 dilated-convolution-based residual convolutions and 2 deconvolutions.
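An illustrative NumPy sketch of one dilated-convolution residual block of the kind stacked eight times above; single channel, 3×3 kernel, and the kernel values and dilation rate are placeholders:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=2):
    """'Same'-size 2-D convolution of a single-channel map with a dilated
    3x3 kernel (zero padding at the borders)."""
    d = dilation
    pad = np.pad(x, d)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * pad[i*d:i*d + x.shape[0],
                                      j*d:j*d + x.shape[1]]
    return out

def residual_block(x, kernel, dilation=2):
    """Residual connection around dilated convolution + ReLU."""
    return x + np.maximum(dilated_conv2d(x, kernel, dilation), 0.0)

x = np.ones((8, 8))
identity_k = np.zeros((3, 3))
identity_k[1, 1] = 1.0          # identity kernel, so the block computes x + relu(x)
y = residual_block(x, identity_k)
```

Real models use multi-channel learned kernels; this only shows how dilation enlarges the receptive field while the residual add preserves the feature size.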
Specifically, in the present invention, as shown in fig. 4, the repairing method of the edge repairing model is as follows:
step S20: the edge restorer concatenates the edge map of the image to be repaired, the mask image and the gray-scale map of the image to be repaired into a tensor as input, thereby obtaining the predicted edge repair result of the edge restorer; the generated result of the edge repair model is then obtained from the predicted edge repair result, preserving the image edges of the intact region and filling in the edge portion to be repaired in the missing region:
Cp=Ge(M,C,Igray)
C+=C·(1-M)+Cp·M
Where Cp represents the predicted edge repair image, Ge represents the edge restorer, M represents the mask image, C represents the edge map of the image to be repaired, Igray represents the gray-scale map of the image to be repaired, and C+ represents the repaired edge image generated by the edge repair model.
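The composition Cp → C+ above is a simple mask blend; a NumPy sketch (the generator output Cp is mocked, and m == 1 marks missing pixels):

```python
import numpy as np

def compose_edges(c, c_pred, m):
    """C+ = C*(1-M) + Cp*M: keep known-region edges from c, take the
    predicted edges c_pred inside the missing region marked by mask m."""
    return c * (1 - m) + c_pred * m

c = np.array([[1., 0.], [0., 1.]])       # edge map of the damaged image
c_pred = np.array([[0., 1.], [1., 0.]])  # edges predicted by the generator (mock)
m = np.array([[0., 1.], [1., 0.]])       # mask of the missing region
c_plus = compose_edges(c, c_pred, m)
```

With an all-zero mask the function returns the original edge map unchanged, which matches the intent of preserving intact-region edges.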
In the present invention, further, the repairing method of the edge repairing model further includes:
Step S21, calculating a loss function of the edge restorer, wherein the loss function is a weighted summation of the generated edge countermeasure loss and the edge characteristic loss;
Step S22: and optimizing the generation result of the edge restoration model according to the loss function to obtain a restored edge image finally output by the edge restoration model.
In particular, the loss function of the edge repair model is a mixed loss function whose purpose is to constrain the results processed by the edge discriminator, the mixed function being a weighted sum of the generated-edge adversarial loss and the edge feature loss. The generated-edge adversarial loss is a form of binary cross-entropy, which can be written as:

Ladv,1 = E(C,Igray)[log De(C, Igray)] + EIgray[log(1 − De(Cp, Igray))]

wherein Ladv,1 represents the edge adversarial loss, the first expectation is taken over real edge maps and gray-scale maps, the second over generated edge maps and gray-scale maps, and De represents the edge discriminator.
Secondly, the edge feature loss is a distance function defined on the feature layers of the edge discriminator; its main function is to compute the sum of the distances between the features of the generated edges and of the edges actually detected with Canny, as extracted by the different layers. The feature loss formula is:

LFM = E[ Σi=1..n (1/Ni)·‖De(i)(C) − De(i)(Cp)‖1 ]

wherein LFM represents the edge feature loss, n represents the number of activation layers of the edge discriminator, Ni is the number of elements in the i-th activation layer, and De(i) denotes the activations of the i-th layer of De.
Finally, the optimization objective of the edge model can be written as:

minGe maxDe LGe = minGe( λ1·maxDe(Ladv,1) + λ2·LFM )

wherein minGe denotes minimization over the edge restorer, maxDe denotes maximization over the edge discriminator, and λ1 and λ2 are regularization weights.
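A toy NumPy rendering of the two edge-loss terms; the discriminator outputs and feature activations are mocked arrays, not the trained networks:

```python
import numpy as np

def adversarial_loss(d_real, d_fake, eps=1e-8):
    """Binary cross-entropy form of the edge adversarial loss (batch means)."""
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1 - d_fake + eps))

def feature_match_loss(feats_real, feats_fake):
    """Mean L1 distance between discriminator activations, summed over layers."""
    return sum(np.abs(fr - ff).mean() for fr, ff in zip(feats_real, feats_fake))

d_real = np.array([0.9, 0.8])   # mock D_e scores on real edges
d_fake = np.array([0.2, 0.1])   # mock D_e scores on generated edges
adv = adversarial_loss(d_real, d_fake)
fm = feature_match_loss([np.ones((4, 4))], [np.zeros((4, 4))])
```

A perfect discriminator (scores 1 on real, 0 on fake) drives the adversarial term toward 0, its maximum; any miss makes it negative.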
In the invention, as shown in fig. 10, the image repair model comprises an image restorer, which generates the repaired picture from the repaired edge image after multiple downsamplings, multiple dilated-convolution-based residual convolutions, one multi-head attention network and two deconvolutions.
A convolutional neural network attends only to the pixel values of a local area when learning features and ignores the influence of correlations between pixels in remote areas on image generation and restoration, so many attention-mechanism models have been designed to better capture long-range dependencies. The multi-head self-attention network used here is an extension of the self-attention network, which can effectively capture long-distance relationships among pixels in an image. However, each region does not have only one set of long-distance pixel relationships, and a single self-attention network is insufficient to learn multiple such relationships, so we employ a multi-head attention network that can capture richer long-distance relationships between pixel regions. The multi-head self-attention network can learn correlation matrices of different patterns, which plays a very important role in improving repair results.
Specifically, as shown in fig. 2, the scheme adds a multi-head self-attention layer network after the last residual layer, and the specific scheme is as follows:
Step S2-1: the obtained feature images passing through the convolution layer and the residual error network are subjected to different convolution changes to obtain a plurality of groups query, key, value of feature images;
Specifically, as shown in fig. 5, the size of the query feature map is Bg × Wf × Hf × Cq, wherein Bg is the batch size of the hidden variables input by the generator, Wf is the width of the query feature map, Hf is its height, and Cq is its channel dimension. The key feature map has size Bg × Wf × Hf × Ck, its other parameters being the same as those of the query feature map, with Ck the channel dimension of the key feature map. The value feature map has size Bg × Wf × Hf × Cv, its other parameters being the same as those of key and query, with Cv its channel dimension.
Step S2-2: the reconstructed feature map is obtained, as shown in fig. 3, and the specific method comprises the following steps:
Step S2-2-1: converting the key feature map into a rank, and performing dot product operation on groups between the query feature map and the key feature map after converting the rank, as shown in fig. 6, to obtain a plurality of groups of correlation attention matrixes;
Step S2-2-2: the correlation attention moment array is normalized, and in the step, the dot-integration matrix is normalized by a method such as Softmax.
Step S2-2-3: the normalized self-attention matrix of each group of correlations is matrix multiplied with the value feature map of the group, as shown in fig. 7, to obtain a reconstructed feature map of the group. And the value feature map elements in the value feature map are used for carrying out weighted reconstruction on the pixels by using the group of correlation attention matrixes, and the weight values of other elements in the weighted reconstruction process are pixel values corresponding to the correlation attention moment matrixes.
Further, after the reconstructed feature map is obtained, step S2-3 is performed, as shown in fig. 8:
Step S2-3: and splicing the reconstructed feature images according to the channel dimension to obtain a plurality of attention combination results.
Step S2-4: the combined result is transformed back to the original input feature size by a convolution network, and the restored reconstructed feature map is added to the original feature map as the final output of the attention layer.
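Putting steps S2-1 through S2-4 together, a hedged NumPy sketch of the whole multi-head self-attention layer, including the per-head split, the concatenation, the output projection and the residual add that keeps the output size equal to the input size (the head count and weight shapes are illustrative assumptions):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """Steps S2-1..S2-4: project, attend per head, concat, project with Wo, residual add."""
    B, W, H, C = x.shape
    N = W * H

    def split_heads(t):
        # (B, N, C) -> (B, n_heads, N, C // n_heads)
        return t.reshape(B, N, n_heads, C // n_heads).transpose(0, 2, 1, 3)

    q = split_heads((x @ Wq).reshape(B, N, C))
    k = split_heads((x @ Wk).reshape(B, N, C))
    v = split_heads((x @ Wv).reshape(B, N, C))

    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    heads = attn @ v                                          # per-head reconstruction

    concat = heads.transpose(0, 2, 1, 3).reshape(B, N, C)     # Concat(head_1..head_h)
    out = (concat @ Wo).reshape(B, W, H, C)                   # fuse back to original size
    return x + out                                            # residual: size unchanged
```

Because of the residual add, the layer can be dropped into the restorer without changing any downstream tensor shapes, which matches the statement that the output feature size is not changed.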
In one embodiment provided by the invention, a specific method for obtaining multiple attention combination results is as follows:
step S2-3-1: obtain the attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / √d_k) V_i

wherein Q_i, K_i and V_i denote the query, key and value feature-map matrices of the i-th head, and d_k is the per-head key dimension;
step S2-3-2: the self-attention results of all heads are concatenated, and a matrix W^O is used to fuse and project the multiple feature spaces back to the original matrix size, finally giving the multi-head self-attention combination result:

MultiHead = Concat(head_1, head_2, ..., head_h) W^O

wherein head_i is the attention result of the i-th head and W^O is the learned output projection matrix.
In summary, the scheme adds a multi-head self-attention layer without changing the output feature size, so that long-range information processed by the multiple heads participates more fully in the reconstruction, further improving the image restoration effect.
In the invention, further, the image restoration model repairs the image that has been processed by the edge restoration model; the specific repair method comprises the following steps:
Step S23: using the tensor formed by concatenating the repaired edge image and the damaged image as input to obtain a predicted repair image, and obtaining the repair image from the predicted repair image:
I_p = G_i(M, C_+, I_M)

I_+ = I·(1-M) + I_p·M

where I_p is the predicted repair image, I is the real image, G_i is the image restorer, C_+ is the repaired edge map, I_M is the defect image, and I_+ is the repair image.
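The compositing formula I_+ = I·(1-M) + I_p·M, which keeps the known pixels of the real image and pastes the predicted pixels into the masked hole, can be sketched directly (a minimal illustration; array shapes are hypothetical):

```python
import numpy as np

def composite_repair(I, Ip, M):
    """I_+ = I*(1-M) + I_p*M: keep known pixels, paste predictions where the mask is 1."""
    return I * (1 - M) + Ip * M
```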
Step S24: calculating an image restoration loss function, and optimizing the restoration result of the image restoration model, wherein the image restoration loss function comprises the image adversarial loss, the style loss and the perceptual loss.
For example, the image adversarial loss is analogous to the generated-edge adversarial loss of the edge restoration model, and the adversarial loss L_adv of the image restoration model takes the same form.
Furthermore, style loss was first proposed for the task of image style transfer; in a subsequent improvement, the introduction of the Gram matrix (Gram Matrix) alleviated the artifact problem that exists in deconvolution. The model here adopts a style loss based on the Gram matrix, with the loss function L_style expressed as follows:

L_style = E_i[ ||Gr_i(I_p) - Gr_i(I_M)||_1 ]
wherein Gr_i(I_p) represents the Gram matrix of the predicted-image vector inner products, Gr_i(I_M) represents the Gram matrix of the true-image vector inner products, and c_i h_i w_i represents the dimension of the activation features. Four activation layers of the VGG19 network, relu2_2, relu3_4, relu4_4 and relu5_2, are selected here.
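The Gram-matrix style loss can be sketched in NumPy; the c_i·h_i·w_i normalisation is folded into the Gram matrix here, and the activations would in practice come from the selected VGG19 layers (this sketch takes them as given arrays):

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a (C, H, W) activation, normalised by C*H*W."""
    C, H, W = feat.shape
    f = feat.reshape(C, H * W)
    return (f @ f.T) / (C * H * W)

def style_loss(feats_pred, feats_true):
    """L1 distance between Gram matrices, summed over the chosen activation layers."""
    return sum(np.abs(gram_matrix(a) - gram_matrix(b)).mean()
               for a, b in zip(feats_pred, feats_true))
```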
In addition, the perceptual loss penalizes generated images that do not match the perception of the real image by defining a distance measure between activation layers of a pre-trained network; it can be defined as:

L_perc = E[ Σ_i w_i ||φ_i(I) - φ_i(I_p)||_1 ]
wherein φ_i in the formula corresponds to the five activation layers of the pre-trained VGG19 network, relu1_1, relu2_1, relu3_1, relu4_1 and relu5_1, respectively, and w_i represents the weight parameter (in this scheme all w_i are set to 1).
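Similarly, a minimal sketch of the perceptual loss as a weighted L1 distance between activation maps, with the scheme's default of all w_i = 1; the activations would come from the five VGG19 layers, but here they are passed in as plain arrays:

```python
import numpy as np

def perceptual_loss(feats_pred, feats_true, weights=None):
    """Weighted L1 distance between corresponding activation maps (w_i = 1 by default)."""
    if weights is None:
        weights = [1.0] * len(feats_pred)   # the scheme sets all w_i to 1
    return sum(w * np.abs(a - b).mean()
               for w, a, b in zip(weights, feats_pred, feats_true))
```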
In summary, the loss function of the image restoration model combines multiple loss functions and can be calculated jointly as:
wherein λ_3, λ_4, λ_5 and λ_6 are custom hyperparameters, L_adv is the adversarial loss generated by the image restoration model, L_style is the style loss, and L_perc is the perceptual loss.
In the invention, further, after the edge prior fusion multi-attention mechanism repair model is trained, the model's repair results are tested and evaluated on the test set; this part is mainly implemented with the PyTorch learning framework on two 1080 Ti GPUs. The quality and repair effect of the model are evaluated with four metrics: peak signal-to-noise ratio (PSNR), structural similarity (SSIM), ℓ1 error and Fréchet Inception Distance (FID).
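Of the four evaluation metrics, PSNR and the ℓ1 error are simple enough to sketch directly in NumPy (SSIM and FID need substantially more machinery and are omitted here):

```python
import numpy as np

def psnr(img_true, img_pred, max_val=255.0):
    """Peak signal-to-noise ratio in dB for images in [0, max_val]."""
    mse = np.mean((img_true.astype(np.float64) - img_pred.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                 # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def l1_error(img_true, img_pred):
    """Mean absolute pixel difference."""
    return np.mean(np.abs(img_true.astype(np.float64) - img_pred.astype(np.float64)))
```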
In addition, as shown in fig. 11, for the repair results of the edge prior fusion multi-attention mechanism repair model, from left to right the first image is the original, the second is the image to be repaired covered by a binary mask, the third is the image repaired by the edge repair model, and the fourth and fifth are result pictures repaired by the image repair model. It can be intuitively observed that the pictures repaired by the edge prior fusion multi-attention mechanism repair model are very similar to the originals; the repair of certain completely missing parts still differs from the original, but the difference is imperceptible to human observation. The scheme thus achieves a good repair effect and can reasonably repair the missing parts. The results show that the network with the fused multi-head attention mechanism performs better than expected in image restoration.
The foregoing description is directed to the preferred embodiments of the present invention, but the embodiments are not intended to limit the scope of the invention, and all equivalent changes or modifications made under the technical spirit of the present invention should be construed to fall within the scope of the present invention.
Claims (7)
1. An image restoration method of an edge prior fusion multi-head attention mechanism, characterized by comprising the following steps:
Step S1: acquiring experimental data and preprocessing the data, wherein the experimental data comprises a training set and a testing set, and extracting an edge map of an image from the preprocessed image;
Step S2: constructing an edge prior fusion multi-attention mechanism repair model, wherein the repair model comprises an edge repair model and an image repair model; the edge repair model takes the extracted edge map, the original image and a mask image as inputs and outputs a repaired edge image, and the image repair model is trained with the repaired edge image and the defect image as inputs;
The image repair model comprises an image restorer, and the image restorer generates a repair picture from the repaired edge image after multiple down-sampling operations, multiple residual convolutions based on dilated convolution, one multi-head attention network and two deconvolutions;
step S3: evaluating the result of the edge prior fusion multi-attention mechanism repair model through the test set;
The image restorer generating a repair picture from the repaired edge image after multiple down-sampling operations, multiple residual convolutions based on dilated convolution, one multi-head attention network and two deconvolutions comprises the following steps:
Step S2-1: the feature maps obtained through the convolution layers and the residual network are subjected to different convolution transformations to obtain multiple groups of query, key and value feature maps;
step S2-2: acquiring a reconstructed feature map;
Step S2-3: splicing the reconstructed feature images according to the channel dimension to obtain a plurality of attention combination results;
Step S2-4: transforming the combined result back to the original input feature size through a convolution network, and adding the restored reconstructed feature map to the original feature map to obtain the final output repair picture result;
The step S2-2 of obtaining the reconstructed feature map comprises the following steps:
Step S2-2-1: transposing the key feature map, and performing group-wise dot-product operations between the query feature map and the transposed key feature map to obtain multiple groups of correlation attention matrices;
Step S2-2-2: normalizing the correlation attention matrices;
step S2-2-3: performing matrix multiplication operation on the normalized self-attention matrix of each group of correlations and the value feature map of the group to obtain a reconstructed feature map of the group;
In step S2-3, splicing the reconstructed feature maps according to the channel dimension to obtain the attention combination results comprises the following steps:
step S2-3-1: obtain the attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / √d_k) V_i

wherein Q_i, K_i and V_i denote the query, key and value feature-map matrices of the i-th head, and d_k is the per-head key dimension;
Step S2-3-2: splicing the self-attention results of the individual heads, and using the W^O matrix to fuse and project the multiple feature spaces back to the original matrix size, finally obtaining the multi-head self-attention combination result:

MultiHead = Concat(head_1, head_2, ..., head_h) W^O.
2. The image restoration method of an edge prior fusion multi-head attention mechanism according to claim 1, wherein the edge repair model comprises an edge restorer; the edge restorer down-samples the extracted edge map, the original image and the mask image, and converts the feature map into a single-channel edge map after multiple residual convolutions based on dilated convolution and two deconvolutions.
3. The image restoration method of an edge prior fusion multi-head attention mechanism according to claim 2, wherein the restoration method of the edge restoration model is as follows:
Step S20: obtaining a predicted edge repair result from the edge restorer, and obtaining the generation result of the edge repair model from the predicted edge repair result, which retains the image edges of the known region and fills in the edge portions to be repaired in the missing region:
C_p = G_e(M, C, I_gray)

C_+ = C·(1-M) + C_p·M
where C_p represents the predicted edge repair image, G_e represents the edge restorer, M represents the mask image, C represents the edge map of the image to be repaired, I_gray represents the gray-scale map of the image to be repaired, and C_+ represents the repaired edge image generated by the edge repair model.
4. An image restoration method according to claim 3, wherein said repair method of said edge repair model further comprises:
step S21: calculating a loss function of the edge restorer, wherein the loss function is a weighted sum of the generated edge adversarial loss and the edge feature loss;
step S22: and optimizing the generation result of the edge restoration model to obtain a restored edge image.
5. The image restoration method of an edge prior fusion multi-head attention mechanism according to claim 1, wherein the restoration method of the image restoration model comprises the following steps:
Step S23: obtaining a predicted repair image by using tensors spliced by the repaired edge image and the damaged image as input, and obtaining the repair image according to the predicted repair image:
I_p = G_i(M, C_+, I_M)

I_+ = I·(1-M) + I_p·M
wherein I_p is the predicted repair image, I is the real image, G_i is the image restorer, C_+ is the repaired edge image, I_M is the defect image, and I_+ is the repair image;
step S24: calculating an image restoration loss function and optimizing the restoration result of the image restoration model, wherein the image restoration loss function comprises the image adversarial loss, the style loss and the perceptual loss, and the calculation method is as follows:
wherein λ_3, λ_4, λ_5 and λ_6 are custom hyperparameters, L_adv is the adversarial loss generated by the image restoration model, L_style is the style loss, and L_perc is the perceptual loss.
6. The image restoration method of an edge prior fusion multi-head attention mechanism according to claim 5, wherein the style loss calculation method is as follows:
wherein Gr_i(I_p) represents the Gram matrix of the predicted-image vector inner products, Gr_i(I_M) represents the Gram matrix of the true-image vector inner products, and c_i h_i w_i represents the dimension of the activation features.
7. The method for repairing an image with an edge prior fusion multi-head attention mechanism according to claim 1, wherein each value-feature-map element in step S2-2-3 is reconstructed as a weighted combination of the other elements using the group's correlation attention matrix, the weights in the weighted reconstruction process being the corresponding values of the correlation attention matrix.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111356234.6A CN114022506B (en) | 2021-11-16 | 2021-11-16 | Image restoration method for edge prior fusion multi-head attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114022506A CN114022506A (en) | 2022-02-08 |
CN114022506B true CN114022506B (en) | 2024-05-17 |
Family
ID=80065024
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111356234.6A Active CN114022506B (en) | 2021-11-16 | 2021-11-16 | Image restoration method for edge prior fusion multi-head attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114022506B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116188875B (en) * | 2023-03-29 | 2024-03-01 | 北京百度网讯科技有限公司 | Image classification method, device, electronic equipment, medium and product |
CN117649365A (en) * | 2023-11-16 | 2024-03-05 | 西南交通大学 | Paper book graph digital restoration method based on convolutional neural network and diffusion model |
CN117351015B (en) * | 2023-12-05 | 2024-03-19 | 中国海洋大学 | Tamper detection method and system based on edge supervision and multi-domain cross correlation |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111127346A (en) * | 2019-12-08 | 2020-05-08 | 复旦大学 | Multi-level image restoration method based on partial-to-integral attention mechanism |
CN113112411A (en) * | 2020-01-13 | 2021-07-13 | 南京信息工程大学 | Human face image semantic restoration method based on multi-scale feature fusion |
CN113240613A (en) * | 2021-06-07 | 2021-08-10 | 北京航空航天大学 | Image restoration method based on edge information reconstruction |
CN113379655A (en) * | 2021-05-18 | 2021-09-10 | 电子科技大学 | Image synthesis method for generating antagonistic network based on dynamic self-attention |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11398015B2 (en) * | 2020-04-29 | 2022-07-26 | Adobe Inc. | Iterative image inpainting with confidence feedback |
Non-Patent Citations (2)
Title |
---|
Generative high-resolution image inpainting based on parallel adversarial learning and multi-condition fusion; Shao Hang; Wang Yongxiong; Pattern Recognition and Artificial Intelligence; 2020-04-15 (No. 04); full text *
Research on image inpainting technology based on generative adversarial networks; Li Ju; Huang Wenpei; Computer Applications and Software; 2019-12-12 (No. 12); full text *
Also Published As
Publication number | Publication date |
---|---|
CN114022506A (en) | 2022-02-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114022506B (en) | Image restoration method for edge prior fusion multi-head attention mechanism | |
CN114092330B (en) | Light-weight multi-scale infrared image super-resolution reconstruction method | |
CN113240580A (en) | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation | |
CN110136062B (en) | Super-resolution reconstruction method combining semantic segmentation | |
CN107977932A (en) | It is a kind of based on can differentiate attribute constraint generation confrontation network face image super-resolution reconstruction method | |
CN111787187B (en) | Method, system and terminal for repairing video by utilizing deep convolutional neural network | |
CN112183637A (en) | Single-light-source scene illumination re-rendering method and system based on neural network | |
CN115018727A (en) | Multi-scale image restoration method, storage medium and terminal | |
CN114897742B (en) | Image restoration method with texture and structural features fused twice | |
CN110490968A (en) | Based on the light field axial direction refocusing image super-resolution method for generating confrontation network | |
CN113793261A (en) | Spectrum reconstruction method based on 3D attention mechanism full-channel fusion network | |
CN111882516B (en) | Image quality evaluation method based on visual saliency and deep neural network | |
CN117095287A (en) | Remote sensing image change detection method based on space-time interaction transducer model | |
Ma et al. | Multi-task interaction learning for spatiospectral image super-resolution | |
CN116757986A (en) | Infrared and visible light image fusion method and device | |
CN113888399B (en) | Face age synthesis method based on style fusion and domain selection structure | |
CN114998667A (en) | Multispectral target detection method, multispectral target detection system, computer equipment and storage medium | |
CN116523985B (en) | Structure and texture feature guided double-encoder image restoration method | |
CN117456330A (en) | MSFAF-Net-based low-illumination target detection method | |
CN116993639A (en) | Visible light and infrared image fusion method based on structural re-parameterization | |
CN115984949A (en) | Low-quality face image recognition method and device with attention mechanism | |
CN113205005B (en) | Low-illumination low-resolution face image reconstruction method | |
KR102340387B1 (en) | Method of learning brain connectivity and system threrfor | |
CN113888417A (en) | Human face image restoration method based on semantic analysis generation guidance | |
CN114331821A (en) | Image conversion method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||