CN114463209A - Image restoration method based on deep multi-feature collaborative learning - Google Patents
- Publication number
- CN114463209A (application CN202210089664.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/77—Retouching; Inpainting; Scratch removal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention relates to the field of image processing, and in particular to an image restoration method based on deep multi-feature collaborative learning, comprising the following steps: S1, inputting an image to be restored into a preset image feature encoder, which extracts effective features from the image through deep neural network encoding to form an effective image feature set; S2, decoding and repairing the effective image feature set through a preset image decoder, and forming the repaired image after a local discriminator and a global discriminator. The image feature encoder consists of six convolutional layers: three shallow convolutional layers recombine texture features, and three deep convolutional layers recombine structural features, yielding a structural feature set and a texture feature set. The image decoder comprises a soft-gated dual-feature fusion module for fusing the structural and texture features, and a bilateral propagation feature aggregation module for balancing features among channel information, contextual attention, and feature space. The technique effectively suppresses artifacts in the repaired image, so that the result has detailed texture and a better overall appearance.
Description
Technical Field
The invention relates to the field of image processing, in particular to an image restoration method based on depth multi-feature collaborative learning.
Background
With the advancement of information technology and the arrival of the digital age, digital images have become ubiquitous in human life as carriers for recording and transferring visual data, and their volume has grown at a remarkable rate. However, digital images are often damaged during capture, storage, processing, and transmission, or the integrity of the information they store is lost due to occlusion. To recover the lost portion of a damaged digital image, current techniques restore the missing information as plausibly as possible from the characteristics of the image data that remain undamaged or unoccluded; this is commonly called image restoration (inpainting) technology.
Image restoration aims to reconstruct damaged areas or remove unwanted areas of an image while improving its visual quality. It is widely used for low-level visual tasks such as restoring damaged photographs or removing target regions. Conventional restoration methods fall into two classes: diffusion-based methods and block-based methods.
For example, the mutual encoder-decoder restoration method based on feature equalization proposed by Liu Hongyu uses deep and shallow convolutional feature layers as the structure and texture of the image, respectively. The deep features are sent to a structure branch and the shallow features to a texture branch; in each branch, holes are filled at multiple scales. Features from the two branches are then concatenated for channel equalization and feature equalization: channel equalization uses a squeeze-and-excitation network (SENet), while feature equalization re-balances channel attention with a bilateral propagation activation function to achieve spatial equalization. Finally, the output image is generated through skip connections.
Another technique proposes a two-stage network image restoration algorithm based on a bidirectional cascade edge detection network (BDCN) and U-Net incomplete-edge generation. In the first stage, image edge information is extracted with the BDCN network, replacing the Canny operator for extracting edges of the damaged region; each network layer learns edge features at a specific scale, and multi-scale edge features are obtained by fusion. The contracting path of a U-Net architecture then extracts the edge features of the damaged image, and the expanding path restores the image's edge-texture information. In the second stage, dilated convolutions are used for downsampling and upsampling, and the missing image is reconstructed with rich detail through a residual network.
The cascaded generative adversarial network restoration algorithm proposed by He connects a coarsening generation sub-network and an optimization generation sub-network in series. A parallel convolution module in the coarsening network connects three shallow convolution paths and one deep convolution path in parallel, which mitigates the vanishing-gradient problem as the convolution depth grows. A cascaded residual module in the deep convolution path cross-cascades the double-layer convolutions of four channels to effectively enhance feature reuse; the convolution result is added element-wise to the module's input feature map, and local residual learning improves the network's expressive power.
Existing diffusion-based methods propagate appearance information from adjacent content to fill in missing areas; because they rely only on a search over neighboring content, they produce obvious artifacts when repairing pictures with large missing areas. Block-based methods fill missing regions by searching for the most similar blocks in the undamaged region; although this can exploit distant information, the lack of high-level structural understanding makes it difficult to generate semantically reasonable images. Deep-learning-based methods can understand high-level semantics and generate plausible content, but for lack of an effective multi-feature fusion technique, the actual results of existing repair methods still look unnatural and imperfect.
Disclosure of Invention
The invention provides an image restoration method based on depth multi-feature collaborative learning, aiming at the technical problems of artifacts, unnatural structures and textures and the like in the existing image restoration technology.
The image restoration method based on the depth multi-feature collaborative learning comprises the following steps:
S1, inputting an image to be restored into a preset image feature encoder, which extracts effective features from the image through deep neural network encoding to form an effective image feature set;
S2, decoding and repairing the effective image feature set through a preset image decoder, and forming the repaired image after a local discriminator and a global discriminator;
the image feature encoder consists of six convolutional layers, wherein three shallow convolutional layers recombine texture features to represent image details, and three deep convolutional layers recombine structural features to represent image semantics, yielding a structural feature set and a texture feature set;
the image decoder comprises a soft-gated dual-feature fusion module for fusing the structural and texture features, and a bilateral propagation feature aggregation module for balancing features among channel information, contextual attention, and feature space.
Preferably, the texture features and structural features first fill the damaged area using three parallel streams with different kernel sizes; the three streams are combined into an output feature map, which is then mapped back to the same size as the input features.
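As an illustration of the three parallel filling streams described above, the following PyTorch sketch merges the streams and maps them back to the input channel count with a 1×1 convolution. The channel width and the kernel sizes 3/5/7 are assumptions; the patent only states that the streams use different kernels.

```python
import torch
import torch.nn as nn

class MultiScaleFilling(nn.Module):
    """Sketch of the three parallel streams with different kernel sizes.

    Hypothetical layer sizes: the patent only states that three streams
    with different kernels fill the hole and are merged back to the
    input size.
    """
    def __init__(self, channels=64):
        super().__init__()
        self.streams = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2)
            for k in (3, 5, 7)          # assumed kernel sizes
        ])
        # 1x1 convolution maps the concatenated streams back to `channels`
        self.merge = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x):
        out = torch.cat([s(x) for s in self.streams], dim=1)
        return self.merge(out)

f = MultiScaleFilling(64)
y = f(torch.randn(1, 64, 32, 32))
```

The merged map keeps the spatial size of the input, which is what lets the two branches be compared and fused later.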
Further, the output of the structural features and the texture features meets the following requirements:
L_rst = ||g(F_cst) - I_st||_1 (1-1)
L_rte = ||g(F_cte) - I_gt||_1 (1-2)
wherein F_cst and F_cte denote the output features of structure and texture generated by concatenating the multi-scale filling stages, L_rst and L_rte denote the reconstruction losses of structure and texture, respectively, g(·) is a convolution operation with kernel size 1 that maps F_cst and F_cte to color images, and I_gt and I_st denote the real image and its structure image, respectively, I_st being generated with an edge-preserving image smoothing method.
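Equations (1-1) and (1-2) amount to two L1 losses on color-mapped features. A minimal PyTorch sketch, where `g_st`/`g_te` stand for the 1×1 mapping convolutions and the tensor sizes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def reconstruction_losses(F_cst, F_cte, g_st, g_te, I_st, I_gt):
    """L1 losses of equations (1-1)/(1-2). g(.) is a 1x1 convolution that
    maps the features to 3-channel color images; g_st / g_te are assumed
    module instances supplied by the caller."""
    L_rst = F.l1_loss(g_st(F_cst), I_st)   # structure vs. smoothed structure image
    L_rte = F.l1_loss(g_te(F_cte), I_gt)   # texture vs. ground-truth image
    return L_rst, L_rte

g = torch.nn.Conv2d(64, 3, kernel_size=1)   # assumed 64-channel features
feat = torch.randn(1, 64, 32, 32)
img = torch.randn(1, 3, 32, 32)
L_rst, L_rte = reconstruction_losses(feat, feat, g, g, img, img)
```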
Preferably, the soft-gated dual feature fusion module comprises a structure-guided texture feature unit for executing an algorithm,
G_te = σ(SE(h([F_cst, F_cte]))) (2-1)
F′_cte = α(β(G_te ⊙ F_cte) ⊙ F_cst) ⊕ F_cte (2-2)
wherein F_cst and F_cte denote the output features of structure and texture generated by concatenating the multi-scale filling stages, h(·) is a convolution operation with kernel size 3, SE(·) is a compression-and-activation operation that captures important channel information, σ(·) is the Sigmoid activation function, G_te controls the degree of refinement of the texture information, F′_cte denotes the texture feature with structure perception, α and β are learnable parameters, ⊙ denotes element-wise product, and ⊕ denotes element-wise addition.
Preferably, the soft-gated dual feature fusion module comprises a texture-guided structural feature unit for executing an algorithm,
G_st = σ(SE(k([F_cst, F_cte]))) (2-3)
F′_cst = γ(G_st ⊙ F_cte) ⊕ F_cst (2-4)
wherein F_cst and F_cte denote the output features of structure and texture generated by concatenating the multi-scale filling stages, k(·) is a convolution operation with kernel size 3, SE(·) is a compression-and-activation operation that captures important channel information, σ(·) is the Sigmoid activation function, G_st controls the degree of refinement of the structural information, F′_cst denotes the structural feature with texture perception, γ is a learnable parameter, ⊙ denotes element-wise product, and ⊕ denotes element-wise addition.
F_fu = v([F′_cst, F′_cte]) (2-5)
wherein F′_cte and F′_cst denote the texture feature with structure perception and the structural feature with texture perception, respectively, v(·) is a convolution operation with kernel size 1, and F_fu is the final output feature of the soft-gated dual-feature fusion module.
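A compact PyTorch sketch of the soft-gated fusion in equations (2-1) through (2-5). The channel width, the SE reduction ratio, and the use of separate SE blocks per gate are assumptions not fixed by the text:

```python
import torch
import torch.nn as nn

class SoftGatedDualFeatureFusion(nn.Module):
    """Sketch of the SDFF module, eqs. (2-1)-(2-5).
    Channel width `c` and SE reduction `r` are illustrative assumptions."""
    def __init__(self, c=64, r=4):
        super().__init__()
        def se_block():
            # compression-and-activation: produces a channel-wise gate
            return nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                 nn.Conv2d(c, c // r, 1), nn.ReLU(),
                                 nn.Conv2d(c // r, c, 1))
        self.h = nn.Conv2d(2 * c, c, 3, padding=1)   # h(.), texture-gate path
        self.k = nn.Conv2d(2 * c, c, 3, padding=1)   # k(.), structure-gate path
        self.se_te, self.se_st = se_block(), se_block()
        self.v = nn.Conv2d(2 * c, c, 1)              # v(.), final 1x1 fusion
        self.alpha = nn.Parameter(torch.ones(1))
        self.beta = nn.Parameter(torch.ones(1))
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, F_cst, F_cte):
        cat = torch.cat([F_cst, F_cte], dim=1)
        G_te = torch.sigmoid(self.se_te(self.h(cat)))                        # (2-1)
        F_cte2 = self.alpha * (self.beta * (G_te * F_cte) * F_cst) + F_cte   # (2-2)
        G_st = torch.sigmoid(self.se_st(self.k(cat)))                        # (2-3)
        F_cst2 = self.gamma * (G_st * F_cte) + F_cst                         # (2-4)
        return self.v(torch.cat([F_cst2, F_cte2], dim=1))                    # (2-5)

sdff = SoftGatedDualFeatureFusion()
F_fu = sdff(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```

The residual additions (⊕ F_cte, ⊕ F_cst) keep each branch's own information intact while the gated term injects the other branch's cues.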
Preferably, the bilateral propagation feature aggregation module includes a channel-information fusion unit, which captures channel information through an adaptive kernel selection method using a dynamic kernel selection network, obtaining a feature map F′_fu.
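The adaptive kernel selection can be pictured as an SKNet-style unit that softly chooses between convolution branches with different kernels. The two-branch layout, kernel sizes, and reduction ratio below are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SelectiveKernel(nn.Module):
    """Minimal selective-kernel unit (SKNet-style) as an illustration of
    adaptive kernel selection; branch kernels and reduction are assumptions."""
    def __init__(self, c=64, r=4):
        super().__init__()
        self.branch3 = nn.Conv2d(c, c, 3, padding=1)
        self.branch5 = nn.Conv2d(c, c, 5, padding=2)
        self.fc = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(),
                                nn.Linear(c // r, 2 * c))
        self.c = c

    def forward(self, x):
        b3, b5 = self.branch3(x), self.branch5(x)
        s = (b3 + b5).mean(dim=(2, 3))            # squeeze: global average pool
        a = self.fc(s).view(-1, 2, self.c)        # per-branch attention logits
        a = torch.softmax(a, dim=1)[:, :, :, None, None]
        return a[:, 0] * b3 + a[:, 1] * b5        # softly select between kernels

sk = SelectiveKernel()
out = sk(torch.randn(1, 64, 32, 32))
```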
Further, the bilateral propagation feature aggregation module includes a context attention fusion unit, configured to capture a relationship between input image blocks, and calculate a cosine similarity, and specifically execute the following algorithm:
s_(i,j) = ⟨p_i/||p_i||_2, p_j/||p_j||_2⟩
s'_(i,j) = exp(s_(i,j)) / Σ_(n=1..N) exp(s_(i,n))
p'_i = Σ_(j=1..N) s'_(i,j) · p_j
wherein the feature F′_fu is divided into non-overlapping blocks of 3 × 3 pixels, s_(i,j) denotes the cosine similarity between feature blocks, s'_(i,j) denotes the attention score obtained by the Softmax function, p_i and p_j are the i-th and j-th blocks of the input feature F′_fu, N is the total number of blocks of F′_fu, and p'_i denotes the feature map reconstructed from the attention scores.
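The cosine-similarity / Softmax / reconstruction steps above can be sketched in NumPy over flattened blocks; the block count and dimensions are illustrative:

```python
import numpy as np

def context_attention(patches):
    """Cosine-similarity attention over non-overlapping feature blocks.
    `patches`: (N, d) array, each row one flattened 3x3 block of F'_fu.
    Returns the attention-reconstructed blocks (a sketch of the CA step)."""
    norm = patches / (np.linalg.norm(patches, axis=1, keepdims=True) + 1e-8)
    sim = norm @ norm.T                              # cosine similarity s_ij
    sim -= sim.max(axis=1, keepdims=True)            # stabilise the softmax
    attn = np.exp(sim)
    attn /= attn.sum(axis=1, keepdims=True)          # attention scores (softmax)
    return attn @ patches                            # reconstruct each block

rng = np.random.default_rng(0)
blocks = rng.random((16, 9))                         # 16 blocks of 3*3 pixels
recon = context_attention(blocks)
```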
Preferably, the bilateral propagation feature aggregation module includes a spatial information fusion unit, and specifically executes the following algorithm:
x'_i = (1/C(x)) Σ_(j∈s) G(i, j) · f(x_i, x_j) · x_j
wherein F_s and F_r denote the spatial and range similarity feature maps, x_i is the i-th feature channel of the input feature, x_j are the neighboring feature channels at positions j around channel i, G(i, j) is a Gaussian function for adjusting the spatial contributions of neighboring feature channels, C(x) is the number of positions in the input feature, and f(·) is a dot-product operation.
Further, the output feature channels are computed by applying a convolutional layer q with kernel size 1 to the spatial and range similarity feature maps F_s and F_r. Each channel feature is then aggregated to obtain the reconstructed feature map F~; F′_fu and F~ are fused by a concatenated convolution to give F_sc:
F_sc = z([F′_fu, F~])
wherein F~ is the recombined multi-channel feature, F′_fu is the feature obtained after re-weighing the channel information, F_sc is the final fused repair feature, and z is a convolution operation with kernel size 1.
Preferably, the global and local discriminators each consist of five convolutional layers with convolution kernel size 4 and stride 2, and all layers except the last use LeakyReLU with slope 0.2.
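The stated discriminator layout (five convolutional layers, kernel 4, stride 2, LeakyReLU with slope 0.2 on all but the last layer) can be sketched as follows; the channel widths are assumptions:

```python
import torch
import torch.nn as nn

def make_discriminator(in_ch=3, base=64):
    """Five-layer discriminator as stated in the claim: 4x4 kernels,
    stride 2, LeakyReLU(0.2) on all but the last layer.
    Channel widths are assumptions."""
    chans = [in_ch, base, base * 2, base * 4, base * 8]
    layers = []
    for i in range(4):
        layers += [nn.Conv2d(chans[i], chans[i + 1], 4, stride=2, padding=1),
                   nn.LeakyReLU(0.2)]
    # final layer: no activation, produces the real/fake score map
    layers += [nn.Conv2d(chans[-1], 1, 4, stride=2, padding=1)]
    return nn.Sequential(*layers)

D = make_discriminator()
score = D(torch.randn(1, 3, 64, 64))   # each stride-2 layer halves the size
```

With a 64×64 input, five stride-2 layers leave a 2×2 score map; the local discriminator would simply receive the cropped hole region instead of the full image.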
Compared with the prior art, the image restoration method based on the depth multi-feature collaborative learning has the following beneficial effects:
compared with the prior art, the method has the advantages that not only the relation between the image structure and the texture is considered, but also the relation between the image contexts is considered. The method adopts a single-stage network, and uses double branches to respectively learn the structure and the texture of the image, so that the generated structure and the texture are more consistent. And the image structure information is fully utilized, so that the generated image structure is more reasonable, and the visual image result is more real. Specifically, the consistency of the structure and the texture is enhanced through a soft gating dual-feature fusion (SDFF) module, and the blurring and the artifacts around the hole area can be effectively reduced through a switching and recombination mode. The connection from local features to overall consistency is enhanced through a Bilateral Propagation Feature Aggregation (BPFA) module, and the connection between context attention, channel information and feature space is considered, so that the repaired image has detailed textures and better image appearance.
Drawings
The present invention is further described with reference to the accompanying drawings, but the embodiments in the drawings do not limit the present invention in any way, and for those skilled in the art, other drawings may be obtained according to the following drawings without creative efforts.
FIG. 1 is a block diagram of the multi-feature collaborative learning network provided by the present invention;
FIG. 2 is a schematic diagram of a soft-gated dual feature fusion module;
FIG. 3 is a schematic diagram of a bilateral propagation feature aggregation module;
FIG. 4 is a comparison graph of the repairing effect of the present invention on irregular holes and the image repairing technique based on deep learning in the prior art;
FIG. 5 is a comparison graph of the repairing effect of the present invention on the central hole and the existing image repairing technology based on deep learning;
fig. 6 is a graph of the results of an image repair ablation experiment of the present invention.
Detailed Description
The image restoration method based on deep multi-feature collaborative learning provided by the present invention is further described below with reference to the accompanying drawings, and it should be noted that the technical solution and the design principle of the present invention are explained in detail below only by an optimized technical solution.
The core of the image restoration method based on deep multi-feature collaborative learning provided by the invention is to provide a multi-feature collaborative learning network for restoring damaged images. First, this patent proposes a soft-gated dual feature fusion (SDFF) module that enables the coordinated information exchange of image structure and texture, thereby enabling them to strengthen the connection between each other. Second, the patent uses a Bilateral Propagation Feature Aggregation (BPFA) module to further refine the generated structure and texture by enhancing the connection from local features to global consistency through collaborative learning of context attention, channel information, and feature space. In addition, the invention uses an end-to-end single-stage network training mode, and adopts double branches to respectively learn the image structure and the texture in a single stage, so that the image artifact can be effectively reduced and a more real image result can be generated.
Specifically, the overall backbone model of the image inpainting method based on deep multi-feature collaborative learning is shown in fig. 1, and the method includes the following steps: (1) the encoder consists of six convolutional layers; the three shallow feature layers are recombined into texture features to represent image details, while the three deep feature layers are recombined into structural features to represent image semantics; (2) two branches learn the structural and texture features separately; (3) a soft-gated dual-feature fusion module fuses the structural and texture features generated by the two branches, see fig. 2; (4) a bilateral propagation feature aggregation module equalizes the features among channel information, contextual attention, and feature space, see fig. 3. Specifically, a dynamic kernel selection network (SKNet) captures channel information through adaptive convolution kernel selection, a contextual attention (CA) module captures context relations within the image, and a bilateral propagation activation (BPA) module captures the relations between space and range; (5) finally, the decoder is given guidance information through skip connections, synthesizing the structure and texture branches to produce more complex images; (6) local and global discriminators make the generated image more realistic.
Specifically, the image restoration method based on the depth multi-feature collaborative learning comprises the following steps:
S1, inputting an image to be restored into a preset image feature encoder, which extracts effective features from the image through deep neural network encoding to form an effective image feature set;
S2, decoding and repairing the effective image feature set through a preset image decoder, and forming the repaired image after a local discriminator and a global discriminator;
the image feature encoder consists of six convolutional layers, wherein three shallow convolutional layers recombine texture features to represent image details, and three deep convolutional layers recombine structural features to represent image semantics, yielding a structural feature set and a texture feature set;
the image decoder comprises a soft-gated dual-feature fusion module for fusing the structural and texture features, and a bilateral propagation feature aggregation module for balancing features among channel information, contextual attention, and feature space.
Preferably, the texture features and structural features first fill the damaged area using three parallel streams with different kernel sizes; the three streams are combined into an output feature map, which is then mapped back to the same size as the input features.
Further, the output of the structural features and the texture features meets the following requirements:
L_rst = ||g(F_cst) - I_st||_1 (1-1)
L_rte = ||g(F_cte) - I_gt||_1 (1-2)
wherein F_cst and F_cte denote the output features of structure and texture generated by concatenating the multi-scale filling stages, L_rst and L_rte denote the reconstruction losses of structure and texture, respectively, g(·) is a convolution operation with kernel size 1 that maps F_cst and F_cte to color images, and I_gt and I_st denote the real image and its structure image, respectively, I_st being generated with an edge-preserving image smoothing method.
Preferably, the soft-gated dual feature fusion module comprises a structure-guided texture feature unit for executing an algorithm,
G_te = σ(SE(h([F_cst, F_cte]))) (2-1)
F′_cte = α(β(G_te ⊙ F_cte) ⊙ F_cst) ⊕ F_cte (2-2)
wherein F_cst and F_cte denote the output features of structure and texture generated by concatenating the multi-scale filling stages, h(·) is a convolution operation with kernel size 3, SE(·) is a compression-and-activation operation that captures important channel information, σ(·) is the Sigmoid activation function, G_te controls the degree of refinement of the texture information, F′_cte denotes the texture feature with structure perception, α and β are learnable parameters, ⊙ denotes element-wise product, and ⊕ denotes element-wise addition.
Preferably, the soft-gated dual feature fusion module comprises a texture-guided structural feature unit for executing an algorithm,
G_st = σ(SE(k([F_cst, F_cte]))) (2-3)
F′_cst = γ(G_st ⊙ F_cte) ⊕ F_cst (2-4)
wherein F_cst and F_cte denote the output features of structure and texture generated by concatenating the multi-scale filling stages, k(·) is a convolution operation with kernel size 3, SE(·) is a compression-and-activation operation that captures important channel information, σ(·) is the Sigmoid activation function, G_st controls the degree of refinement of the structural information, F′_cst denotes the structural feature with texture perception, γ is a learnable parameter, ⊙ denotes element-wise product, and ⊕ denotes element-wise addition.
F_fu = v([F′_cst, F′_cte]) (2-5)
wherein F′_cte and F′_cst denote the texture feature with structure perception and the structural feature with texture perception, respectively, v(·) is a convolution operation with kernel size 1, and F_fu is the final output feature of the soft-gated dual-feature fusion module.
Preferably, the bilateral propagation feature aggregation module includes a channel-information fusion unit, which captures channel information through an adaptive kernel selection method using a dynamic kernel selection network, obtaining a feature map F′_fu.
Further, the bilateral propagation feature aggregation module includes a context attention fusion unit, configured to capture a relationship between input image blocks, and calculate a cosine similarity, and specifically execute the following algorithm:
s_(i,j) = ⟨p_i/||p_i||_2, p_j/||p_j||_2⟩
s'_(i,j) = exp(s_(i,j)) / Σ_(n=1..N) exp(s_(i,n))
p'_i = Σ_(j=1..N) s'_(i,j) · p_j
wherein the feature F′_fu is divided into non-overlapping blocks of 3 × 3 pixels, s_(i,j) denotes the cosine similarity between feature blocks, s'_(i,j) denotes the attention score obtained by the Softmax function, p_i and p_j are the i-th and j-th blocks of the input feature F′_fu, N is the total number of blocks of F′_fu, and p'_i denotes the feature map reconstructed from the attention scores.
Preferably, the bilateral propagation feature aggregation module includes a spatial information fusion unit, and specifically executes the following algorithm:
x'_i = (1/C(x)) Σ_(j∈s) G(i, j) · f(x_i, x_j) · x_j
wherein F_s and F_r denote the spatial and range similarity feature maps, x_i is the i-th feature channel of the input feature, x_j are the neighboring feature channels at positions j around channel i, G(i, j) is a Gaussian function for adjusting the spatial contributions of neighboring feature channels, C(x) is the number of positions in the input feature, and f(·) is a dot-product operation.
Further, the output feature channels are computed by applying a convolutional layer q with kernel size 1 to the spatial and range similarity feature maps F_s and F_r. Each channel feature is then aggregated to obtain the reconstructed feature map F~; F′_fu and F~ are fused by a concatenated convolution to give F_sc:
F_sc = z([F′_fu, F~])
wherein F~ is the recombined multi-channel feature, F′_fu is the feature obtained after re-weighing the channel information, F_sc is the final fused repair feature, and z is a convolution operation with kernel size 1.
Preferably, the global and local discriminators each consist of five convolutional layers with convolution kernel size 4 and stride 2, and all layers except the last use LeakyReLU with slope 0.2.
The following points describe the core technical process in detail:
(1) structural and texture branching
The texture features recombined by the shallow convolutions are denoted F_te, and the structural features recombined by the deep convolutions are denoted F_st. In each branch, three parallel streams are used to fill the damaged area at different scales, the kernel sizes of the streams being different. Finally, the output feature maps of the three streams are combined, and the combined features are mapped back to the same size as the input features. Here, F_cst and F_cte denote the outputs of the structure and texture branches, respectively. To ensure that each branch focuses on structure and texture respectively, we use two reconstruction losses, denoted L_rst and L_rte. The pixel-level losses are defined as:
L_rst = ||g(F_cst) - I_st||_1 (1-1)
L_rte = ||g(F_cte) - I_gt||_1 (1-2)
where g(·) is a convolution operation with kernel size 1 whose purpose is to map F_cst and F_cte to color images. I_gt and I_st denote the real image and its structure image, respectively; I_st is generated using an edge-preserving smoothing method.
(2) Soft-gated dual feature fusion module
In this algorithm, the structural features F_cst and the texture features F_cte generated by the two branches are combined more effectively. By exchanging the two types of information, soft gating dynamically controls the mixing ratio to achieve dynamic combination. Specifically, to construct structure-guided texture features, a soft gate G_te controls the refinement of the texture information. It is defined as:
Gte=σ(SE(h([Fcst,Fcte]))) (2-1)
where h (-) is a convolution operation with a kernel size of 3. SE (-) is a compression and activation operation to capture important channel information. σ (-) is a Sigmoid activation function, using soft gating GteThis can dynamically couple FcstIs fused into Fcte:
F′_cte = α(β(G_te ⊙ F_cte) ⊙ F_cst) ⊕ F_cte    (2-2)
where α and β are learnable parameters, ⊙ denotes element-wise multiplication, and ⊕ denotes element-wise addition.
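A hedged sketch of the gating algebra in (2-1)–(2-2) only: the convolution h and the squeeze-and-excitation step are replaced by a precomputed `gate_logits` array, so this shows just how the sigmoid gate mixes structure into texture (names, shapes, and the scalar α, β values are assumed for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_gated_texture_fusion(F_cst, F_cte, gate_logits, alpha=1.0, beta=1.0):
    """Eq. (2-2): F'_cte = alpha*(beta*(G_te ⊙ F_cte) ⊙ F_cst) ⊕ F_cte.

    gate_logits stands in for SE(h([F_cst, F_cte])) from Eq. (2-1)."""
    G_te = sigmoid(gate_logits)  # soft gate, values in (0, 1)
    return alpha * (beta * (G_te * F_cte) * F_cst) + F_cte

# toy example: zero logits give a gate of exactly 0.5 everywhere
F_cst = np.ones((2, 2))
F_cte = np.full((2, 2), 2.0)
out = soft_gated_texture_fusion(F_cst, F_cte, np.zeros((2, 2)))
print(out)  # every entry is 0.5*2*1 + 2 = 3.0
```

The texture-guided structural path (2-3)–(2-4) follows the same pattern with its own gate G_st and learnable scalar γ.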
Likewise, the texture-guided structural feature F′_cst is defined as:
G_st = σ(SE(k([F_cst, F_cte])))    (2-3)
F′_cst = γ(G_st ⊙ F_cte) ⊕ F_cst    (2-4)
where k(·) is a convolution operation with the same configuration as h(·), and γ is a learnable parameter.
Finally, F′_cte and F′_cst are concatenated, and the fused feature F_fu is generated using a convolution operation v with kernel size 1:
F_fu = v([F′_cst, F′_cte])    (2-5)
(3) Bilateral propagation feature aggregation module
This module is proposed to re-weight the channels and spatial locations so that the image representation is more consistent. First, channel information is captured with a dynamic kernel-selection network in an adaptive kernel-selection manner to obtain the feature map F′_fu; this enhances the correlation between channels and maintains the consistency of the whole image. A Context Awareness (CA) module is then introduced to capture the associations between image blocks. Specifically, for the input feature F′_fu, we extract blocks of 3 × 3 pixels and compute the cosine similarity:

s_{i,j} = ⟨ p_i / ||p_i||, p_j / ||p_j|| ⟩

where p_i and p_j are the i-th and j-th blocks of the input feature F′_fu, respectively.
We use the Softmax function to obtain the attention score between each pair of blocks:

ŝ_{i,j} = exp(s_{i,j}) / Σ_{n=1}^{N} exp(s_{i,n})

where N is the total number of blocks of the input feature F′_fu. Next, the feature map F̂ is reconstructed by recombining the blocks according to the attention scores:

F̂_i = Σ_{j=1}^{N} ŝ_{i,j} p_j
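A small NumPy sketch of the block-attention step just described: cosine similarity between flattened blocks, a row-wise Softmax, then an attention-weighted recombination. Block extraction from the feature map is omitted, and the input is assumed to already be a stack of flattened 3×3 blocks:

```python
import numpy as np

def block_attention(blocks):
    """blocks: (N, d) array of flattened feature blocks.

    Returns the (N, N) attention scores and the recombined blocks."""
    normed = blocks / np.linalg.norm(blocks, axis=1, keepdims=True)
    sim = normed @ normed.T                            # cosine similarity s_{i,j}
    e = np.exp(sim - sim.max(axis=1, keepdims=True))   # numerically stable Softmax
    scores = e / e.sum(axis=1, keepdims=True)          # each row sums to 1
    recombined = scores @ blocks                       # attention-weighted recombination
    return scores, recombined

rng = np.random.default_rng(0)
blocks = rng.standard_normal((16, 9))   # 16 blocks of 3x3 pixels
scores, recombined = block_attention(blocks)
print(scores.shape, recombined.shape)   # (16, 16) (16, 9)
```

Each output block is a convex combination of all input blocks, which is what lets distant but similar patches fill in the damaged region.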
In the spatial and range domains, we introduce a Bilateral Propagation Activation (BPA) module to generate response values based on range and spatial distance. The response values are computed as:

F̂_s(i) = (1 / C(x)) Σ_{j∈s} g_αs(i, j) f(x_i, x_j) x_j    (spatial domain)
F̂_r(i) = (1 / C(x)) Σ_{j∈v} g_αs(i, j) f(x_i, x_j) x_j    (range domain)

where x_i is the i-th feature channel of the input F′_fu, x_j are the neighboring feature channels at positions j around channel i, g_αs(·) is a Gaussian function that adjusts the spatial contributions from neighboring feature channels, C(x) is the number of positions in the neighborhood, and f(·) is a dot-product operation. In the spatial domain, j is explored in the neighborhood s for global propagation; in the experiments, s is set to the same size as the input features. In the range domain, v is a neighborhood of position i whose size is set to 3 × 3. The spatial and range similarity measures thus yield the feature maps F̂_s and F̂_r, respectively, and each feature channel is computed as:

x̂_i = q([F̂_s(i), F̂_r(i)])

where q denotes a convolutional layer with kernel size 1.
Next, each channel is aggregated to obtain the reconstructed feature map F̂. Finally, we concatenate and then convolve F′_fu and F̂ to obtain F_sc:

F_sc = z([F′_fu, F̂])

where z is a convolution operation with kernel size 1.
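The bilateral idea above (a Gaussian spatial weight times a feature-affinity weight, normalized by the neighborhood size C(x)) can be sketched in one dimension. This is a toy illustration under assumed window size and sigma, not the module itself:

```python
import numpy as np

def bilateral_response_1d(x, sigma=1.0, window=1):
    """Each position aggregates its neighbors, weighted by a spatial Gaussian
    (standing in for g_as) and a product affinity (standing in for the dot
    product f), normalized by the neighborhood size C(x)."""
    n = len(x)
    out = np.zeros(n)
    for i in range(n):
        js = list(range(max(0, i - window), min(n, i + window + 1)))
        g = np.exp(-np.array([(i - j) ** 2 for j in js]) / (2 * sigma ** 2))
        f = np.array([x[i] * x[j] for j in js])   # scalar stand-in for f(x_i, x_j)
        out[i] = np.sum(g * f * np.array([x[j] for j in js])) / len(js)
    return out

x = np.ones(5)
print(bilateral_response_1d(x))  # interior entries equal (1 + 2*exp(-0.5)) / 3
```

In the patented module the same pattern runs twice, once over the large spatial neighborhood s and once over the 3×3 range neighborhood v, and the two responses are fused by the 1×1 convolution q.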
(4) Discriminators
The invention introduces global and local discriminators to make the local and global image content more consistent. Each consists of five convolutional layers with kernel size 4 and stride 2; all layers except the last use a Leaky ReLU with slope 0.2. In addition, spectral normalization is employed to stabilize training.
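Two small helpers illustrate the discriminator arithmetic: the Leaky ReLU used in all but the last layer, and how five kernel-4, stride-2 convolutions shrink the input. The padding of 1 and the 256×256 input resolution are assumptions for illustration; the patent does not state them:

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    """Leaky ReLU with negative slope 0.2, as used in all but the last layer."""
    return np.where(x >= 0, x, slope * x)

def conv_out_size(size, kernel=4, stride=2, pad=1):
    """Output spatial size of one conv layer (floor convention)."""
    return (size + 2 * pad - kernel) // stride + 1

size = 256  # assumed input resolution
for _ in range(5):
    size = conv_out_size(size)
print(size)  # 8
```

So under these assumptions the five-layer stack halves the resolution at every layer, taking 256 → 128 → 64 → 32 → 16 → 8.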
The above is only a preferred embodiment of the present invention, and it should be noted that the above preferred embodiment should not be considered as limiting the present invention, and the protection scope of the present invention should be subject to the scope defined by the claims. It will be apparent to those skilled in the art that several modifications, substitutions, improvements and embellishments of the steps can be made without departing from the spirit and scope of the invention, and these modifications, substitutions, improvements and embellishments should also be construed as the scope of the invention.
Claims (10)
1. An image restoration method based on deep multi-feature collaborative learning, comprising the following steps:
s1, inputting an image to be restored into a preset image feature encoder, and performing effective feature extraction on the image to be restored through deep neural network encoding to form an effective image feature set;
s2, decoding and repairing the effective image characteristic set through a preset image decoder, and forming a repaired image after passing through a local discriminator and a global discriminator;
the image feature encoder is characterized by comprising six convolutional layers, wherein three shallow convolutional layers are used for reorganizing texture features to represent image details, and three deep convolutional layers are used for reorganizing structural features to represent image semantics to obtain a structural feature set and a texture feature set;
the image decoder comprises a soft gate control dual-feature fusion module used for fusing the structural features and the texture features, and a double-side propagation feature aggregation module used for balancing the features among channel information, context attention and feature space.
2. An image inpainting method as claimed in claim 1, characterized in that the structural features and the texture features are each first filled into the damaged area using three parallel streams with different kernel sizes, the three streams are combined to form an output feature map, the output feature map is then mapped to the same size as the input features, and the outputs of the structure and texture branches satisfy:

L_rst = ||g(F_cst) - I_st||_1
L_rte = ||g(F_cte) - I_gt||_1

wherein F_cst and F_cte are respectively the structure and texture output features generated by concatenation in the multi-scale filling stage, L_rst and L_rte denote the reconstruction losses of structure and texture respectively, g(·) is a convolution operation with kernel size 1 that maps F_cst and F_cte to color images respectively, I_gt and I_st represent the real image and its structure image respectively, and an edge-preserving image smoothing method is used to generate I_st.
3. The image inpainting method of claim 1, wherein the soft-gated dual feature fusion module comprises a structure-guided texture feature unit for performing an algorithm,
G_te = σ(SE(h([F_cst, F_cte])))    (2-1)
F′_cte = α(β(G_te ⊙ F_cte) ⊙ F_cst) ⊕ F_cte    (2-2)

wherein F_cst and F_cte are respectively the structure and texture output features generated by concatenation in the multi-scale filling stage, h(·) is a convolution operation with kernel size 3, SE(·) is a compression and activation operation that captures important channel information, σ(·) is the Sigmoid activation function, G_te controls the degree of refinement of the texture information, F′_cte denotes the structure-aware texture feature, α and β are learnable parameters, ⊙ denotes element-wise multiplication, and ⊕ denotes element-wise addition.
4. The image inpainting method of claim 1, wherein the soft-gated dual feature fusion module comprises a texture-guided structural feature unit configured to perform an algorithm,
G_st = σ(SE(k([F_cst, F_cte])))    (2-3)
F′_cst = γ(G_st ⊙ F_cte) ⊕ F_cst    (2-4)

wherein F_cst and F_cte are respectively the structure and texture output features generated by concatenation in the multi-scale filling stage, k(·) is a convolution operation with kernel size 3, SE(·) is a compression and activation operation that captures important channel information, σ(·) is the Sigmoid activation function, G_st controls the degree of refinement of the structural information, F′_cst denotes the texture-aware structural feature, γ is a learnable parameter, ⊙ denotes element-wise multiplication, and ⊕ denotes element-wise addition.
F_fu = v([F′_cst, F′_cte])    (2-5)
wherein F′_cte and F′_cst represent the structure-aware texture feature and the texture-aware structural feature respectively, v(·) is a convolution operation with kernel size 1, and F_fu is the final output feature of the soft-gated dual-feature fusion module.
5. The image inpainting method of claim 1, wherein the bilateral propagation feature aggregation module comprises a channel-information fusion unit for capturing channel information by an adaptive kernel selection method using a dynamic kernel selection network to obtain the feature map F′_fu.
6. The image inpainting method of claim 5, wherein the bilateral propagation feature aggregation module includes a context attention fusion unit configured to capture a relationship between input image blocks and calculate a cosine similarity, and specifically executes the following algorithm:
s_{i,j} = ⟨ p_i / ||p_i||, p_j / ||p_j|| ⟩
ŝ_{i,j} = exp(s_{i,j}) / Σ_{n=1}^{N} exp(s_{i,n})

wherein the feature F′_fu is divided into non-overlapping blocks, s_{i,j} represents the cosine similarity between feature blocks, ŝ_{i,j} denotes the attention score obtained by the Softmax function, p_i and p_j are respectively the i-th and j-th blocks of the input feature F′_fu, N is the total number of blocks of the input feature F′_fu, and F̂ denotes the feature map obtained by recombining the feature blocks according to the attention scores.
7. The image inpainting method of claim 1, wherein the bilateral propagation feature aggregation module includes a spatial information fusion unit that specifically executes the following algorithm:
F̂_s(i) = (1 / C(x)) Σ_{j∈s} g_αs(i, j) f(x_i, x_j) x_j
F̂_r(i) = (1 / C(x)) Σ_{j∈v} g_αs(i, j) f(x_i, x_j) x_j

wherein F̂_s and F̂_r represent the spatial and range similarity feature maps, x_i is the i-th feature channel of the input F′_fu, x_j are the neighboring feature channels at positions j around channel i, g_αs is a Gaussian function for adjusting the spatial contributions from neighboring feature channels, C(x) is the number of positions in the neighborhood, and f(·) is a dot-product operation; in the spatial domain, j is explored in the neighborhood s for global propagation; in the range domain, v is a neighborhood of position i, and its size is set to 3 × 3.
9. An image inpainting method as claimed in claim 8, wherein each channel feature is aggregated to obtain the reconstructed feature map F̂, and F′_fu and F̂ are then fused by concatenation and convolution to obtain F_sc.
10. The image inpainting method of claim 1, wherein the global and local discriminators each consist of five convolutional layers with kernel size 4 and stride 2, all layers except the last use a Leaky ReLU with slope 0.2, and spectral normalization is used to achieve stable training.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210089664.4A CN114463209B (en) | 2022-01-25 | 2022-01-25 | Image restoration method based on deep multi-feature collaborative learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114463209A true CN114463209A (en) | 2022-05-10 |
CN114463209B CN114463209B (en) | 2022-12-16 |
Family
ID=81410572
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210089664.4A Active CN114463209B (en) | 2022-01-25 | 2022-01-25 | Image restoration method based on deep multi-feature collaborative learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114463209B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114897742A (en) * | 2022-06-10 | 2022-08-12 | 重庆师范大学 | Image restoration method with texture and structural features fused twice |
CN115082743A (en) * | 2022-08-16 | 2022-09-20 | 之江实验室 | Full-field digital pathological image classification system considering tumor microenvironment and construction method |
CN115841625A (en) * | 2023-02-23 | 2023-03-24 | 杭州电子科技大学 | Remote sensing building image extraction method based on improved U-Net model |
CN116681980A (en) * | 2023-07-31 | 2023-09-01 | 北京建筑大学 | Deep learning-based large-deletion-rate image restoration method, device and storage medium |
WO2023225808A1 (en) * | 2022-05-23 | 2023-11-30 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Learned image compress ion and decompression using long and short attention module |
CN117196981A (en) * | 2023-09-08 | 2023-12-08 | 兰州交通大学 | Bidirectional information flow method based on texture and structure reconciliation |
CN117422911A (en) * | 2023-10-20 | 2024-01-19 | 哈尔滨工业大学 | Collaborative learning driven multi-category full-slice digital pathological image classification system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108460746A (en) * | 2018-04-10 | 2018-08-28 | 武汉大学 | A kind of image repair method predicted based on structure and texture layer |
CN112365422A (en) * | 2020-11-17 | 2021-02-12 | 重庆邮电大学 | Irregular missing image restoration method and system based on deep aggregation network |
US20210125313A1 (en) * | 2019-10-25 | 2021-04-29 | Samsung Electronics Co., Ltd. | Image processing method, apparatus, electronic device and computer readable storage medium |
CN113298733A (en) * | 2021-06-09 | 2021-08-24 | 华南理工大学 | Implicit edge prior based scale progressive image completion method |
WO2021232589A1 (en) * | 2020-05-21 | 2021-11-25 | 平安国际智慧城市科技股份有限公司 | Intention identification method, apparatus and device based on attention mechanism, and storage medium |
Non-Patent Citations (4)
Title |
---|
XIANG LI et al.: "Selective Kernel Networks", 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) *
XIEFAN GUO et al.: "Image Inpainting via Conditional Texture and Structure Dual Generation", 2021 IEEE/CVF International Conference on Computer Vision (ICCV) *
SONG Zhongshan et al.: "Remote sensing scene classification based on bidirectional gated scale feature fusion", Journal of Computer Applications *
FAN Chunqi et al.: "Recent advances in deep-learning-based digital image inpainting algorithms", Journal of Signal Processing *
Also Published As
Publication number | Publication date |
---|---|
CN114463209B (en) | 2022-12-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114463209B (en) | Image restoration method based on deep multi-feature collaborative learning | |
CN109447907B (en) | Single image enhancement method based on full convolution neural network | |
CN111242238B (en) | RGB-D image saliency target acquisition method | |
CN110689495B (en) | Image restoration method for deep learning | |
CN111787187B (en) | Method, system and terminal for repairing video by utilizing deep convolutional neural network | |
CN110223251B (en) | Convolution neural network underwater image restoration method suitable for artificial and natural light sources | |
CN112991231B (en) | Single-image super-image and perception image enhancement joint task learning system | |
CN114897742B (en) | Image restoration method with texture and structural features fused twice | |
CN110349087A (en) | RGB-D image superior quality grid generation method based on adaptability convolution | |
CN112422870B (en) | Deep learning video frame insertion method based on knowledge distillation | |
CN114820341A (en) | Image blind denoising method and system based on enhanced transform | |
CN113989129A (en) | Image restoration method based on gating and context attention mechanism | |
CN115239564B (en) | Mine image super-resolution reconstruction method combining semantic information | |
CN115170915A (en) | Infrared and visible light image fusion method based on end-to-end attention network | |
CN116958534A (en) | Image processing method, training method of image processing model and related device | |
CN116485741A (en) | No-reference image quality evaluation method, system, electronic equipment and storage medium | |
CN115829876A (en) | Real degraded image blind restoration method based on cross attention mechanism | |
CN115829880A (en) | Image restoration method based on context structure attention pyramid network | |
CN116109510A (en) | Face image restoration method based on structure and texture dual generation | |
CN112785502A (en) | Light field image super-resolution method of hybrid camera based on texture migration | |
CN116523985A (en) | Structure and texture feature guided double-encoder image restoration method | |
CN115035170A (en) | Image restoration method based on global texture and structure | |
CN116167920A (en) | Image compression and reconstruction method based on super-resolution and priori knowledge | |
JPS62131383A (en) | Method and apparatus for evaluating movement of image train | |
CN114820316A (en) | Video image super-resolution recovery system based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||