CN116862784B - Single image defogging method based on multi-teacher knowledge distillation - Google Patents
- Publication number
- CN116862784B CN116862784B CN202310681883.6A CN202310681883A CN116862784B CN 116862784 B CN116862784 B CN 116862784B CN 202310681883 A CN202310681883 A CN 202310681883A CN 116862784 B CN116862784 B CN 116862784B
- Authority
- CN
- China
- Prior art keywords
- feature map
- scale
- network model
- computer
- decoding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 51
- 238000013140 knowledge distillation Methods 0.000 title claims abstract description 22
- 238000000605 extraction Methods 0.000 claims abstract description 104
- 238000012549 training Methods 0.000 claims abstract description 84
- 230000006870 function Effects 0.000 claims description 66
- 238000005070 sampling Methods 0.000 claims description 57
- 230000004927 fusion Effects 0.000 claims description 52
- 230000004913 activation Effects 0.000 claims description 32
- 238000010586 diagram Methods 0.000 claims description 28
- 238000010606 normalization Methods 0.000 claims description 26
- 238000012545 processing Methods 0.000 claims description 20
- 230000008569 process Effects 0.000 claims description 13
- 238000005457 optimization Methods 0.000 claims description 9
- 238000012546 transfer Methods 0.000 claims description 4
- 238000004821 distillation Methods 0.000 claims description 3
- 230000008447 perception Effects 0.000 claims description 3
- 230000000694 effects Effects 0.000 abstract description 8
- 238000013461 design Methods 0.000 description 5
- 238000013135 deep learning Methods 0.000 description 3
- 239000011159 matrix material Substances 0.000 description 2
- 230000003044 adaptive effect Effects 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000013508 migration Methods 0.000 description 1
- 230000005012 migration Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a single image defogging method based on multi-teacher knowledge distillation, which comprises the following steps: 1. acquiring a training set image; 2. establishing a student network model; 3. extracting features of the foggy training images; 4. establishing a total loss function; 5. training the student network model by the foggy training image; 6. defogging the single image by using the trained student network model. According to the invention, the student network model is guided and trained through the EPDN teacher network model and the PSD teacher network model, so that the feature extraction capability of the student network is effectively improved, the student network model realizes the extraction of multi-scale information of defogging images through the encoding and decoding of four scales, the global and local features of the defogging images are effectively fused, and the defogging effect of the images is further improved.
Description
Technical Field
The invention belongs to the technical field of image defogging processing, and particularly relates to a single image defogging method based on multi-teacher knowledge distillation.
Background
At present, teacher models in image defogging methods mainly fall into two classes: defogging methods based on prior information and defogging methods based on deep learning. The prior-information-based methods have advantages in recovering the visibility, contrast and texture structure of the image, while the deep-learning-based methods perform better in improving the authenticity and color fidelity of the image. However, at present, the knowledge learned by a single teacher model is generally transferred to a student model so that the student model approaches the performance of the teacher model; because a single teacher model performs only one-way knowledge transfer to the student network, the trained student model is often limited by the performance of that teacher model.
Therefore, what is currently lacking is a single image defogging method based on multi-teacher knowledge distillation that is simple in structure and reasonable in design, in which a student network model is trained by an EPDN teacher network model and a PSD teacher network model to effectively improve the feature extraction capability of the student network, the student network model extracts multi-scale information of the defogged image through four-scale encoding and decoding, and global and local features of the defogged image are effectively fused, thereby improving the image defogging effect.
Disclosure of Invention
Aiming at the above defects in the prior art, the invention provides a single image defogging method based on multi-teacher knowledge distillation with simple steps and reasonable design: a student network model is guided and trained by an EPDN teacher network model and a PSD teacher network model, which effectively improves the feature extraction capability of the student network; the student network model extracts multi-scale information of the defogged image through four-scale encoding and decoding, effectively fuses global and local features of the defogged image, and further improves the image defogging effect.
In order to solve the technical problems, the invention adopts the following technical scheme: a single image defogging method based on multi-teacher knowledge distillation, which is characterized by comprising the following steps:
step one, acquiring a training set image:
selecting an indoor training set from the foggy day image database RESIDE; the indoor training set comprises foggy training images and the fog-free training images corresponding to the foggy training images, wherein the numbers of foggy training images and fog-free training images are the same;
step two, establishing a student network model:
The method for establishing the student network model comprises the following specific processes:
Step 201, establishing an encoder model of a student network by adopting a computer; the encoder model of the student network comprises a first scale network model, a second scale network model, a third scale network model and a fourth scale network model, wherein the first scale network model comprises a first convolution layer and two RDB modules based on PA, and the second scale network model comprises a second convolution layer, two RDB modules based on PA and a feature fusion module; the third scale network model comprises a third convolution layer, two RDB modules based on PA and a feature fusion module; the fourth scale network model comprises a fourth convolution layer, two RDB modules based on PA and a feature fusion module;
Step 202, adopting a computer to establish a decoder model of the student network; the decoder model of the student network comprises a first decoding network model, a second decoding network model, a third decoding network model, a fourth decoding network model and a fifth convolution layer, wherein the first decoding network model comprises two RDB modules based on PA, and the second decoding network model comprises a first transfer convolution layer, two RDB modules based on PA and a feature fusion module; the third decoding network model comprises a second transpose convolution layer, two RDB modules based on PA and a feature fusion module; the fourth decoding network model comprises a third transpose convolution layer, two RDB modules based on PA and a feature fusion module;
step three, extracting features of the foggy training images:
Step 301, extracting features of the foggy training image I through a first scale network model by adopting a computer to obtain a first scale feature map F e1;
Step 302, extracting features of the first scale feature map F e1 through a second scale network model by using a computer to obtain a second scale feature map F e2;
step 303, extracting features of the second scale feature map F e2 through a third scale network model by using a computer to obtain a third scale feature map F e3;
step 304, extracting features of the third scale feature map F e3 through a fourth scale network model by using a computer to obtain a fourth scale feature map F e4;
step 305, performing feature extraction on the fourth scale feature map F e4 through the first decoding network model by using a computer to obtain a first decoding feature map F d1;
Step 306, extracting features of the first decoding feature map F d1 through a second decoding network model by using a computer to obtain a second decoding feature map F d2;
Step 307, performing feature extraction on the second decoding feature map F d2 through a third decoding network model by using a computer to obtain a third decoding feature map F d3;
Step 308, performing feature extraction on the third decoding feature map F d3 through a fourth decoding network model by using a computer to obtain a fourth decoding feature map F d4; and performing feature extraction on the fourth decoding feature map F d4 through the fifth convolution layer by adopting a computer to obtain an output defogging image out;
Step 309, processing the foggy training image I with the EPDN teacher network model by using a computer to obtain the EPDN teacher network output defogging image out EP, and recording the feature map output by the global sub-generator in the EPDN teacher network model as the EPDN teacher network intermediate output feature map EP 1;
processing the foggy training image I with the PSD teacher network model by using a computer to obtain the PSD teacher network output defogging image out PS, and recording the feature map output by the backbone network in the PSD teacher network model as the PSD teacher network intermediate output feature map PS 2;
Step four, establishing a total loss function:
Step 401, obtaining the perception loss function L per by adopting a computer according to
L per = Σ i=1..n [1/(C i·H i·W i)]·(Φ i(gt), Φ i(out)) L1;
wherein i is a positive integer with 1 ≤ i ≤ n and n = 5, Φ i(gt) represents the feature map output by the Relu i_1 layer of the VGG19 network model for the fog-free training image gt corresponding to the foggy training image I, and Φ i(out) represents the feature map output by the Relu i_1 layer of the VGG19 network model for the output defogging image out of the student network model; C i, H i and W i represent the number of channels, height and width of the feature map output by the Relu i_1 layer, respectively; (Φ i(gt), Φ i(out)) L1 represents the Manhattan distance between the two feature maps output by the Relu i_1 layer of the VGG19 network model;
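The perception loss of step 401 can be sketched numerically. The sketch below is a minimal pure-Python illustration, not the actual implementation: the VGG19 Relu i_1 activations are replaced by small hypothetical flat lists, since only the per-layer L1 distance normalized by C i·H i·W i matters here.

```python
# Sketch of L_per: for each of n VGG19 layers, the Manhattan (L1) distance
# between ground-truth and output feature maps is divided by C_i * H_i * W_i,
# and the terms are summed. Feature values below are hypothetical stand-ins.

def manhattan(a, b):
    """Sum of absolute element-wise differences (flattened L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def perceptual_loss(feats_gt, feats_out, shapes):
    """feats_*: flattened feature maps per layer; shapes: list of (C, H, W)."""
    loss = 0.0
    for f_gt, f_out, (c, h, w) in zip(feats_gt, feats_out, shapes):
        loss += manhattan(f_gt, f_out) / (c * h * w)
    return loss

# Two hypothetical "layers" with shapes (1, 2, 2) and (2, 1, 2).
feats_gt  = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0]]
feats_out = [[1.0, 2.0, 2.0, 4.0], [0.5, 0.0, 1.0, 0.5]]
shapes = [(1, 2, 2), (2, 1, 2)]
print(perceptual_loss(feats_gt, feats_out, shapes))  # 1/4 + 1/4 = 0.5
```

In the actual method n = 5 and the feature maps come from the Relu 1_1 through Relu 5_1 layers of a pretrained VGG19.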
Step 402, obtaining the distillation loss function L dist by adopting a computer according to
L dist = (out, out EP) L1 + (out, out PS) L1 + 0.25·(EP 1, F d2) L1 + 0.5·(PS 2, F d3) L1;
wherein (out, out EP) L1 represents the Manhattan distance between the output defogging image out of the student network model and the EPDN teacher network output defogging image out EP, (out, out PS) L1 represents the Manhattan distance between the output defogging image out of the student network model and the PSD teacher network output defogging image out PS, (EP 1, F d2) L1 represents the Manhattan distance between the EPDN teacher network intermediate output feature map EP 1 and the second decoding feature map F d2 of the student network model, and (PS 2, F d3) L1 represents the Manhattan distance between the PSD teacher network intermediate output feature map PS 2 and the third decoding feature map F d3 of the student network model;
Step 403, obtaining the total loss function L loss by adopting a computer according to L loss = 0.1·L per + L dist;
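Steps 402 and 403 can be sketched the same way. This is a hedged pure-Python illustration with tiny hypothetical tensors; in the real method the operands are full images and intermediate feature maps, but the weighting (1, 1, 0.25, 0.5 for distillation, then 0.1 on the perception term) is exactly as stated above.

```python
# Sketch of steps 402-403: L_dist combines four Manhattan distances with
# weights 1, 1, 0.25 and 0.5, and L_loss = 0.1 * L_per + L_dist.

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def distillation_loss(out, out_ep, out_ps, ep1, fd2, ps2, fd3):
    return (l1(out, out_ep) + l1(out, out_ps)
            + 0.25 * l1(ep1, fd2) + 0.5 * l1(ps2, fd3))

def total_loss(l_per, l_dist):
    return 0.1 * l_per + l_dist

# Hypothetical tiny tensors, just to exercise the weighting.
out    = [0.2, 0.4]
out_ep = [0.2, 0.5]          # L1 = 0.1
out_ps = [0.3, 0.4]          # L1 = 0.1
ep1, fd2 = [1.0], [0.6]      # 0.25 * 0.4 = 0.1
ps2, fd3 = [0.0], [0.2]      # 0.5  * 0.2 = 0.1
l_dist = distillation_loss(out, out_ep, out_ps, ep1, fd2, ps2, fd3)
print(round(l_dist, 6))                    # 0.4
print(round(total_loss(0.5, l_dist), 6))   # 0.1*0.5 + 0.4 = 0.45
```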
Training the student network model by the foggy training image:
step 501, a computer adopts the Adam optimization algorithm and iteratively optimizes the student network model with the total loss function L loss until the whole training set has been traversed, completing one iteration of training;
step 502, repeating the iterative training of step 501 until the preset number of training iterations is reached, obtaining the trained student network model;
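Step 501 names the Adam optimization algorithm. The following is a minimal self-contained sketch of the Adam update rule, applied to a toy scalar quadratic rather than the student network (whose gradients would come from backpropagating L loss); the learning rate and step count are hypothetical.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moment estimates with bias correction."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy loss L(theta) = (theta - 3)^2 with gradient 2*(theta - 3); the real
# method would instead backpropagate L_loss through the student network.
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 20001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.01)
print(round(theta, 3))  # converges to about 3.0
```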
Step six, defogging the single image by using the trained student network model:
And inputting any foggy image into the trained student network model by adopting a computer for defogging processing, so as to obtain the corresponding fog-free image.
The single image defogging method based on multi-teacher knowledge distillation is characterized by comprising the following steps of: in step 201, the number of convolution kernels in the first convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
The number of convolution kernels in the second convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, and the padding is 1;
The PA-based RDB module in step 201 includes a first conv+relu layer, a Conv1 convolution layer, an RDB module, a Conv2 convolution layer, and a Sigmoid activation function layer; the number of convolution kernels in the first Conv+ReLU layer is 32, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1; the number of convolution kernels in the Conv1 convolution layer is 32, the size of the convolution kernels is 1 multiplied by 1, the sliding step length is 1, and the padding is 0; the number of convolution kernels in the Conv2 convolution layer is 32, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1;
the number of convolution kernels in the third convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step length is 2, and the padding is 1;
The number of convolution kernels in the fourth convolution layer is 256, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 2, and the padding is 1;
The feature fusion module in step 201 includes a first conv+ InstanceNorm normalization+relu activation function layer and a second conv+ InstanceNorm normalization+relu activation function layer;
in step 202, the number of convolution kernels in the first transpose convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is1, and the out_padding is 1;
the number of convolution kernels in the second transpose convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the third transpose convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
The number of convolution kernels in the fifth convolution layer is 3, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
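The hyperparameters above imply that each stride-2 convolution in the encoder halves the spatial size and each stride-2 transposed convolution in the decoder doubles it. The sketch below checks this with the standard output-size formulas; the 256×256 input size is a hypothetical example, not stated in the source.

```python
# Standard output-size formulas for convolution and transposed convolution,
# applied with the kernel/stride/padding values stated above.

def conv_out(size, k=3, s=1, p=1):
    return (size + 2 * p - k) // s + 1

def tconv_out(size, k=3, s=2, p=1, out_p=1):
    return (size - 1) * s - 2 * p + k + out_p

h = 256
enc = []
for stride in (1, 2, 2, 2):          # first..fourth convolution layers
    h = conv_out(h, s=stride)
    enc.append(h)
print(enc)                           # [256, 128, 64, 32]

for _ in range(3):                   # first..third transposed conv layers
    h = tconv_out(h)
print(h)                             # back to 256
```

Together with the channel counts 32, 64, 128 and 256, this gives the four encoder scales, and the three transposed convolutions restore the input resolution before the fifth (3-channel) convolution produces the output image.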
The single image defogging method based on multi-teacher knowledge distillation is characterized by comprising the following steps of: in step 301, a computer is used to perform feature extraction on the foggy training image I through a first scale network model, so as to obtain a first scale feature map F e1, which specifically includes the following steps:
Step 3011, performing feature extraction on the foggy training image I through a first convolution layer by adopting a computer to obtain an input feature map F in;
Step 3012, inputting the input feature map F in into a PA-based RDB module by a computer to perform feature extraction, so as to obtain an intermediate output feature map F out;
Step 3013, according to the method described in step 3012, the computer inputs the intermediate output feature map F out into another PA-based RDB module for feature extraction, to obtain a first scale feature map F e1.
The single image defogging method based on multi-teacher knowledge distillation is characterized by comprising the following steps of: in step 302, a computer is used to perform feature extraction on the first scale feature map F e1 through a second scale network model to obtain a second scale feature map F e2, which specifically includes the following steps:
Step 3021, performing feature extraction on the first scale feature map F e1 through a second convolution layer by using a computer to obtain a second input feature map;
Step 3022, inputting the second input feature map into a PA-based RDB module in the second scale network model by the computer to perform feature extraction, so as to obtain a second scale first coding feature map;
Step 3023, inputting the second-scale first coding feature map into another PA-based RDB module in the second-scale network model by the computer for feature extraction, so as to obtain a second-scale second coding feature map;
Step 3024, the computer downsamples the first scale feature map F e1 by 0.5 times to obtain a first downsampled feature map;
step 3025, calling a splicing cat function module by a computer to splice the first downsampling feature map and the second-scale second coding feature map to obtain a first spliced feature map;
Step 3026, inputting the first spliced feature map into a feature fusion module in the second scale network model by using a computer to obtain a second scale feature map F e2;
In step 303, a computer is used to perform feature extraction on the second scale feature map F e2 through a third scale network model to obtain a third scale feature map F e3, which specifically includes the following steps:
step 3031, a computer is adopted to conduct feature extraction on the second scale feature map F e2 through a third convolution layer to obtain a third input feature map;
step 3032, the computer inputs the third input feature map into a PA-based RDB module in the third-scale network model to perform feature extraction to obtain a third-scale first coding feature map;
step 3033, the computer inputs the third-scale first coding feature map into another PA-based RDB module in the third-scale network model to perform feature extraction, so as to obtain a third-scale second coding feature map;
step 3034, the computer performs 0.5 times downsampling on the second scale feature map F e2 to obtain a second downsampled feature map;
The computer performs 0.25 times downsampling on the first scale feature map F e1 to obtain a third downsampled feature map;
Step 3035, a computer is adopted to call a splicing cat function module to splice the second downsampling feature map, the third downsampling feature map and the third-scale second coding feature map to obtain a second spliced feature map;
Step 3036, inputting the second spliced feature map into a feature fusion module in the third-scale network model by adopting a computer to obtain a third-scale feature map F e3;
In step 304, a computer is adopted to extract the features of the third scale feature map F e3 through a fourth scale network model to obtain a fourth scale feature map F e4, which specifically comprises the following steps:
step 3041, performing feature extraction on the third scale feature map F e3 through a fourth convolution layer by adopting a computer to obtain a fourth input feature map;
Step 3042, inputting the fourth input feature map into a PA-based RDB module in a fourth-scale network model by a computer to perform feature extraction, so as to obtain a fourth-scale first coding feature map;
step 3043, inputting the fourth-scale first coding feature map into another RDB module based on PA in the fourth-scale network model by a computer for feature extraction to obtain a fourth-scale second coding feature map;
step 3044, the computer performs 0.5 times downsampling on the third scale feature map F e3 to obtain a fourth downsampled feature map;
The computer performs 0.25 times downsampling on the second scale feature map F e2 to obtain a fifth downsampled feature map;
The computer performs 0.125 times downsampling on the first scale feature map F e1 to obtain a sixth downsampled feature map;
Step 3045, calling a splicing cat function module by a computer to splice the fourth downsampling feature map, the fifth downsampling feature map, the sixth downsampling feature map and the fourth-scale second coding feature map to obtain a third spliced feature map;
And step 3046, inputting the third spliced feature map into a feature fusion module in the fourth-scale network model by adopting a computer to obtain a fourth-scale feature map F e4.
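Steps 3024, 3034 and 3044 downsample earlier-scale feature maps by 0.5×, 0.25× and 0.125× before the cat splicing. The interpolation method is not specified in the source; repeated 2×2 average pooling is one plausible choice, sketched here on a single-channel map.

```python
# Hedged sketch: 0.5x / 0.25x / 0.125x downsampling via repeated 2x2 average
# pooling (the source does not state which interpolation is used).

def avg_pool_2x(fm):
    """Halve height and width by averaging non-overlapping 2x2 blocks."""
    h, w = len(fm), len(fm[0])
    return [[(fm[i][j] + fm[i][j + 1] + fm[i + 1][j] + fm[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)] for i in range(0, h, 2)]

def downsample(fm, factor):
    """factor is 0.5, 0.25 or 0.125: apply 2x pooling 1, 2 or 3 times."""
    times = {0.5: 1, 0.25: 2, 0.125: 3}[factor]
    for _ in range(times):
        fm = avg_pool_2x(fm)
    return fm

fm = [[1.0, 3.0, 5.0, 7.0],
      [1.0, 3.0, 5.0, 7.0],
      [2.0, 2.0, 6.0, 6.0],
      [2.0, 2.0, 6.0, 6.0]]
print(downsample(fm, 0.5))    # [[2.0, 6.0], [2.0, 6.0]]
print(downsample(fm, 0.25))   # [[4.0]]
```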
The single image defogging method based on multi-teacher knowledge distillation is characterized by comprising the following steps of: in step 3012, the computer inputs the input feature map F in into a PA-based RDB module to perform feature extraction to obtain an intermediate feature map F out, which specifically includes the following steps:
Step A, a computer performs feature extraction on an input feature map F in through a first Conv+ReLU layer to obtain a feature map F pre;
Step B, a computer inputs a characteristic diagram F pre into a Conv1 convolution layer and an RDB module to perform characteristic extraction to obtain a characteristic diagram F RDB, and simultaneously, inputs a characteristic diagram F pre into a Conv2 convolution layer to perform convolution processing and normalizes the characteristic diagram through a Sigmoid activation function to obtain a space weight diagram F s;
Step C, the computer obtains the feature map F mid according to F mid = F RDB ⊙ F s; wherein ⊙ represents the Hadamard product operation between feature map matrices and ⊕ represents the addition operation between feature map matrices;
Step D, the computer obtains the intermediate feature map F out according to F out = F pre ⊕ F mid;
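Steps C and D can be illustrated with tiny single-channel maps. This is a hedged sketch: the exact operand order of the two equations is reconstructed from context (the sigmoid weight map F s rescaling F RDB element-wise, then a residual addition of F pre), and the values are hypothetical.

```python
# Sketch of the PA attention: F_mid = F_RDB (Hadamard) F_s, then the residual
# F_out = F_pre + F_mid. 2x2 nested lists stand in for real feature tensors.

def hadamard(a, b):
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

f_pre = [[1.0, 2.0], [3.0, 4.0]]
f_rdb = [[2.0, 2.0], [2.0, 2.0]]
f_s   = [[0.5, 1.0], [0.0, 0.5]]   # sigmoid output, values in (0, 1)

f_mid = hadamard(f_rdb, f_s)       # element-wise spatial re-weighting
f_out = add(f_pre, f_mid)          # residual connection back to F_pre
print(f_out)                       # [[2.0, 4.0], [3.0, 5.0]]
```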
The single image defogging method based on multi-teacher knowledge distillation is characterized by comprising the following steps of: in step 3026, step 3036 and step 3046, the first spliced feature map, the second spliced feature map and the third spliced feature map are each recorded as a spliced feature map, and the second scale feature map F e2, the third scale feature map F e3 and the fourth scale feature map F e4 are each recorded as a fused scale feature map; the computer then inputs the spliced feature map into the feature fusion module to obtain the fused scale feature map through the following specific steps:
A1, performing feature processing on the spliced feature map through a first Conv+ InstanceNorm normalization and ReLU activation function layer by adopting a computer to obtain a fusion coding feature map;
and A2, performing feature processing on the fusion coding feature map through a second Conv+ InstanceNorm normalization and ReLU activation function layer by adopting a computer to obtain a fused scale feature map.
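The fusion module applies Conv + InstanceNorm + ReLU twice. The convolutions are omitted in this hedged pure-Python sketch; it shows only the instance normalization (per-channel mean/variance over H×W) followed by ReLU on one hypothetical channel.

```python
import math

# Instance normalization on a single channel: subtract the channel mean and
# divide by the channel standard deviation, then apply ReLU.

def instance_norm(ch, eps=1e-5):
    vals = [v for row in ch for v in row]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in ch]

def relu(ch):
    return [[max(0.0, v) for v in row] for row in ch]

ch = [[1.0, 3.0], [5.0, 7.0]]
normed = instance_norm(ch)
print(relu(normed))   # negatives clipped to 0, positives kept
```

Unlike batch normalization, the statistics are computed per sample and per channel, which keeps normalization independent of the batch during both training and single-image inference.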
The single image defogging method based on multi-teacher knowledge distillation is characterized by comprising the following steps of: in step 305, a computer is used to extract features of the fourth scale feature map F e4 through the first decoding network model to obtain a first decoding feature map F d1, which specifically includes the following steps:
step 3051, performing feature extraction on the fourth scale feature map F e4 by using a computer through a PA-based RDB module in the first decoding network model to obtain a first pre-decoding feature map;
Step 3052, inputting the first pre-decoding feature map into another PA-based RDB module in the first decoding network model by the computer to perform feature extraction, so as to obtain a first decoding feature map F d1;
step 306, the specific process is as follows:
Step 3061, performing feature extraction on the first decoded feature map F d1 through a first transposed convolutional layer by using a computer to obtain a first decoded first upsampled feature map;
Step 3062, performing feature extraction on the first decoded first upsampled feature map through two PA-based RDB modules in the second decoding network model by using a computer to obtain a first intermediate feature map;
step 3063, performing 2 times up-sampling processing on the first decoding feature map F d1 by adopting a computer to obtain a first decoding second up-sampling feature map;
Step 3064, a computer is adopted to call a splicing cat function module to splice the first intermediate feature map and the first decoding second up-sampling feature map, so as to obtain a first decoding splicing feature map;
step 3065, inputting the first decoding spliced feature map into a feature fusion module in a second decoding network model by adopting a computer to obtain a second decoding feature map F d2;
step 307, the specific process is as follows:
Step 3071, using a computer to perform feature extraction on the second decoded feature map F d2 through a second transposed convolutional layer to obtain a second decoded first upsampled feature map;
step 3072, performing feature extraction on the second decoded first upsampled feature map by using a computer through two PA-based RDB modules in the third decoding network model to obtain a second intermediate feature map;
Step 3073, performing 4 times up-sampling on the first decoding feature map F d1 by adopting a computer to obtain a second decoding second up-sampling feature map;
performing 2 times up-sampling on the second decoding feature map F d2 to obtain a second decoding third up-sampling feature map;
Step 3074, calling a splicing cat function module by a computer to splice the second intermediate feature map, the second decoding second up-sampling feature map and the second decoding third up-sampling feature map to obtain a second decoding splicing feature map;
Step 3075, inputting the second decoding spliced feature map into a feature fusion module in a third decoding network model by adopting a computer to obtain a third decoding feature map F d3;
Step 308, the specific process is as follows:
Step 3081, performing feature extraction on the third decoded feature map F d3 through a third transposed convolutional layer by using a computer to obtain a third decoded first upsampled feature map;
Step 3082, performing feature extraction on the third decoded first upsampled feature map by using a computer through two PA-based RDB modules in the fourth decoding network model to obtain a third intermediate feature map;
Step 3083, performing 8 times up-sampling on the first decoding feature map F d1 by adopting a computer to obtain a third decoding second up-sampling feature map;
performing 4 times up-sampling on the second decoding feature map F d2 to obtain a third decoding third up-sampling feature map;
Performing 2 times up-sampling on the third decoding feature map F d3 to obtain a third decoding fourth up-sampling feature map;
Step 3084, calling a splicing cat function module by a computer to splice the third intermediate feature map, the third decoding second up-sampling feature map, the third decoding third up-sampling feature map and the third decoding fourth up-sampling feature map to obtain a third decoding splicing feature map;
Step 3085, inputting the third decoding splicing feature map into the feature fusion module in the fourth decoding network model by adopting a computer to obtain a fourth decoding feature map F d4.
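The decoder steps above upsample earlier decoder outputs by 2×, 4× and 8× before the cat splicing. As with the encoder downsampling, the interpolation method is not specified in the source; nearest-neighbor replication is the simplest plausible choice and is sketched here.

```python
# Hedged sketch: 2x / 4x / 8x upsampling via repeated nearest-neighbor
# replication (the source does not state which interpolation is used).

def upsample_2x(fm):
    """Double height and width by replicating each value into a 2x2 block."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def upsample(fm, factor):
    """factor is 2, 4 or 8: apply 2x replication 1, 2 or 3 times."""
    for _ in range({2: 1, 4: 2, 8: 3}[factor]):
        fm = upsample_2x(fm)
    return fm

fm = [[1.0, 2.0]]
print(upsample(fm, 2))   # [[1.0, 1.0, 2.0, 2.0], [1.0, 1.0, 2.0, 2.0]]
up8 = upsample(fm, 8)
print(len(up8), len(up8[0]))   # 8 rows, 16 columns
```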
Compared with the prior art, the invention has the following advantages:
1. The method has simple steps and reasonable design, and firstly, the training set image is acquired; secondly, establishing a student network model, extracting features of a foggy training image, establishing a total loss function, training the student network model by the foggy training image, defogging a single image by using the trained student network model, and improving the defogging effect of the image.
2. The student network model adopts feature-attention residual dense blocks to carry out multi-scale feature extraction and generate the defogged image end to end, thereby exploiting the advantages of neural networks and providing stronger generalization capability.
3. The encoder model in the student network model comprises a first scale network model, a second scale network model, a third scale network model and a fourth scale network model, the decoder model comprises a first decoding network model, a second decoding network model, a third decoding network model and a fourth decoding network model, the four-scale downsampling characteristic extraction is realized through the encoder model, the four-scale upsampling characteristic extraction is realized through the decoder model, the extraction of multi-scale information of a defogging image is realized, the global and local characteristics of the defogging image are effectively fused, and the defogging effect of the image is further improved.
4. According to the invention, by adopting EPDN teacher network model and PSD teacher network model, knowledge migration from the teacher network to the student network is realized in a multi-teacher knowledge distillation mode, so that the student network can combine the complementary advantages of the image defogging method based on prior information and the image defogging method based on deep learning.
In summary, the method has simple steps and reasonable design, the student network model is guided and trained through the EPDN teacher network model and the PSD teacher network model, the feature extraction capability of the student network is effectively improved, the student network model realizes the extraction of multi-scale information of the defogging image through the encoding and decoding of four scales, the global and local features of the defogging image are effectively fused, and the defogging effect of the image is further improved.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a block flow diagram of the method of the present invention.
Fig. 2 is a schematic diagram of the structure of the student network model of the present invention.
Fig. 3 is a schematic structural diagram of the feature-attention residual dense block (PA-based RDB module) of the present invention.
Fig. 4 is a schematic structural diagram of a feature fusion module according to the present invention.
Detailed Description
As shown in fig. 1 to 4, the single image defogging method based on multi-teacher knowledge distillation of the present invention comprises the following steps:
step one, acquiring a training set image:
selecting an indoor training set from the foggy day image database RESIDE; the indoor training set comprises foggy training images and fog-free training images corresponding to the foggy training images, wherein the number of foggy training images is the same as the number of fog-free training images;
step two, establishing a student network model:
The method for establishing the student network model comprises the following specific processes:
Step 201, establishing an encoder model of a student network by adopting a computer; the encoder model of the student network comprises a first scale network model, a second scale network model, a third scale network model and a fourth scale network model, wherein the first scale network model comprises a first convolution layer and two RDB modules based on PA, and the second scale network model comprises a second convolution layer, two RDB modules based on PA and a feature fusion module; the third scale network model comprises a third convolution layer, two RDB modules based on PA and a feature fusion module; the fourth scale network model comprises a fourth convolution layer, two RDB modules based on PA and a feature fusion module;
Step 202, adopting a computer to establish a decoder model of the student network; the decoder model of the student network comprises a first decoding network model, a second decoding network model, a third decoding network model, a fourth decoding network model and a fifth convolution layer, wherein the first decoding network model comprises two RDB modules based on PA, and the second decoding network model comprises a first transpose convolution layer, two RDB modules based on PA and a feature fusion module; the third decoding network model comprises a second transpose convolution layer, two RDB modules based on PA and a feature fusion module; the fourth decoding network model comprises a third transpose convolution layer, two RDB modules based on PA and a feature fusion module;
step three, extracting features of the foggy training images:
Step 301, extracting features of the foggy training image I through a first scale network model by adopting a computer to obtain a first scale feature map F e1;
Step 302, extracting features of the first scale feature map F e1 through a second scale network model by using a computer to obtain a second scale feature map F e2;
step 303, extracting features of the second scale feature map F e2 through a third scale network model by using a computer to obtain a third scale feature map F e3;
step 304, extracting features of the third scale feature map F e3 through a fourth scale network model by using a computer to obtain a fourth scale feature map F e4;
step 305, performing feature extraction on the fourth scale feature map F e4 through the first decoding network model by using a computer to obtain a first decoding feature map F d1;
Step 306, extracting features of the first decoding feature map F d1 through a second decoding network model by using a computer to obtain a second decoding feature map F d2;
Step 307, performing feature extraction on the second decoding feature map F d2 through a third decoding network model by using a computer to obtain a third decoding feature map F d3;
Step 308, performing feature extraction on the third decoding feature map F d3 through a fourth decoding network model by using a computer to obtain a fourth decoding feature map F d4; performing feature extraction on the fourth decoding feature map F d4 through the fifth convolution layer by adopting a computer to obtain an output defogging image out;
Step 309, processing the foggy training image I by using an EPDN teacher network model with a computer to obtain an EPDN teacher network output defogging image out EP, and recording the feature map output by the global sub-generator in the EPDN teacher network model as the EPDN teacher network intermediate output feature map EP 1;
processing the foggy training image I with the computer by using a PSD teacher network model to obtain a PSD teacher network output defogging image out PS, and recording the feature map output by the backbone network in the PSD teacher network model as the PSD teacher network intermediate output feature map PS 2;
Step four, establishing a total loss function:
Step 401, adopting a computer to obtain a perception loss function L per according to L_per = Σ_{i=1}^{n} (1/(C_i·H_i·W_i))·(Φ_i(gt), Φ_i(out))_L1; wherein i is a positive integer with 1 ≤ i ≤ 5 and n = 5, Φ i (gt) represents the feature map, output by the Relu i _1 layer in the VGG19 network model, of the defogging training image gt corresponding to the foggy training image I, and Φ i (out) represents the feature map, output by the Relu i _1 layer in the VGG19 network model, of the output defogging image out of the student network model; C i, H i and W i represent the number of channels, the length and the width of the feature map output by the Relu i _1 layer, respectively; (Φ i (gt), Φ i (out))_L1 represents the Manhattan distance between the two feature maps output by the Relu i _1 layer in the VGG19 network model;
Step 402, adopting a computer to obtain a distillation loss function L dist according to L_dist = (out, out EP)_L1 + (out, out PS)_L1 + 0.25·(EP 1, F d2)_L1 + 0.5·(PS 2, F d3)_L1; wherein (out, out EP)_L1 represents the Manhattan distance between the output defogging image out of the student network model and the EPDN teacher network output defogging image out EP, (out, out PS)_L1 represents the Manhattan distance between the output defogging image out of the student network model and the PSD teacher network output defogging image out PS, (EP 1, F d2)_L1 represents the Manhattan distance between the EPDN teacher network intermediate output feature map EP 1 and the second decoding feature map F d2 of the student network model, and (PS 2, F d3)_L1 represents the Manhattan distance between the PSD teacher network intermediate output feature map PS 2 and the third decoding feature map F d3 of the student network model;
Step 403, obtaining a total loss function L loss by adopting a computer according to L loss = 0.1·L per + L dist;
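The three loss terms of steps 401 to 403 can be sketched with NumPy arrays standing in for the network outputs and feature maps (an illustrative sketch only; the function names are not part of the patent):

```python
import numpy as np

def l1_dist(a, b):
    # Manhattan (L1) distance between two equally-shaped maps, averaged over
    # all elements (the 1/(C*H*W) normalization used in L_per).
    return np.abs(a - b).mean()

def perceptual_loss(feats_gt, feats_out):
    # feats_gt / feats_out: the 5 VGG19 Relu_i_1 feature maps (i = 1..5).
    return sum(l1_dist(g, o) for g, o in zip(feats_gt, feats_out))

def distillation_loss(out, out_ep, out_ps, ep1, fd2, ps2, fd3):
    # L_dist = (out,out_EP)_L1 + (out,out_PS)_L1
    #          + 0.25*(EP1,Fd2)_L1 + 0.5*(PS2,Fd3)_L1
    return (l1_dist(out, out_ep) + l1_dist(out, out_ps)
            + 0.25 * l1_dist(ep1, fd2) + 0.5 * l1_dist(ps2, fd3))

def total_loss(l_per, l_dist):
    # L_loss = 0.1 * L_per + L_dist
    return 0.1 * l_per + l_dist
```

In training, the feature-map arguments would come from the VGG19 layers and the two teacher networks; here any equally-shaped arrays demonstrate the arithmetic.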
Training the student network model by the foggy training image:
Step 501, adopting the Adam optimization algorithm on a computer, and performing iterative optimization on the student network model with the total loss function L loss until the whole training set has been trained, completing one iteration of training;
Step 502, repeating the iterative training in step 501 until the preset number of iterative trainings is reached, obtaining a trained student network model;
Step six, defogging the single image by using the trained student network model:
And inputting any foggy image into the trained student network model by adopting a computer to perform defogging processing, so as to obtain a fog-free image.
In this embodiment, in step 201, the number of convolution kernels in the first convolution layer is 32, the size of the convolution kernel is 3×3, the sliding step size is 1, and the padding is 1;
The number of convolution kernels in the second convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, and the padding is 1;
The PA-based RDB module in step 201 includes a first conv+relu layer, a Conv1 convolution layer, an RDB module, a Conv2 convolution layer, and a Sigmoid activation function layer; the number of convolution kernels in the first Conv+ReLU layer is 32, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1; the number of convolution kernels in the Conv1 convolution layer is 32, the size of the convolution kernels is 1 multiplied by 1, the sliding step length is 1, and the padding is 0; the number of convolution kernels in the Conv2 convolution layer is 32, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1;
the number of convolution kernels in the third convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step length is 2, and the padding is 1;
The number of convolution kernels in the fourth convolution layer is 256, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 2, and the padding is 1;
The feature fusion module in step 201 includes a first conv+ InstanceNorm normalization+relu activation function layer and a second conv+ InstanceNorm normalization+relu activation function layer;
in step 202, the number of convolution kernels in the first transpose convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is1, and the out_padding is 1;
the number of convolution kernels in the second transpose convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the third transpose convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
The number of convolution kernels in the fifth convolution layer is 3, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
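The layer hyperparameters listed above fix the spatial resolutions along the encoder and decoder paths. A small arithmetic check (standard convolution and transposed-convolution output-size formulas; an illustrative sketch, not part of the claimed method) confirms the 256→128→64→32 encoding and 32→64→128→256 decoding resolutions stated later in this embodiment:

```python
def conv_out(size, k=3, s=1, p=1):
    # Convolution output size: floor((size + 2p - k) / s) + 1
    return (size + 2 * p - k) // s + 1

def tconv_out(size, k=3, s=2, p=1, out_p=1):
    # Transposed convolution output size: (size - 1)*s - 2p + k + out_p
    return (size - 1) * s - 2 * p + k + out_p

# Encoder: first conv has stride 1, the second to fourth convs have stride 2.
sizes = [256]
for stride in (1, 2, 2, 2):
    sizes.append(conv_out(sizes[-1], s=stride))

# Decoder: three stride-2 transposed convolutions restore the resolution.
up = [32]
for _ in range(3):
    up.append(tconv_out(up[-1]))
```

With kernel 3, padding 1 and out_padding 1, each stride-2 stage exactly halves (or doubles) the spatial size while leaving stride-1 stages size-preserving.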
In this embodiment, in step 301, a computer is used to perform feature extraction on the foggy training image I through a first scale network model to obtain a first scale feature map F e1, which specifically includes the following steps:
Step 3011, performing feature extraction on the foggy training image I through a first convolution layer by adopting a computer to obtain an input feature map F in;
Step 3012, inputting the input feature map F in into a PA-based RDB module by a computer to perform feature extraction, so as to obtain an intermediate output feature map F out;
Step 3013, according to the method described in step 3012, the computer inputs the intermediate output feature map F out into another PA-based RDB module for feature extraction, to obtain a first scale feature map F e1.
In this embodiment, in step 302, a computer is used to perform feature extraction on the first scale feature map F e1 through a second scale network model to obtain a second scale feature map F e2, which specifically includes the following steps:
Step 3021, performing feature extraction on the first scale feature map F e1 through a second convolution layer by using a computer to obtain a second input feature map;
Step 3022, inputting the second input feature map into a PA-based RDB module in the second scale network model by the computer to perform feature extraction, so as to obtain a second scale first coding feature map;
Step 3023, inputting the second-scale first coding feature map into another PA-based RDB module in the second-scale network model by the computer for feature extraction, so as to obtain a second-scale second coding feature map;
Step 3024, the computer downsamples the first scale feature map F e1 by 0.5 times to obtain a first downsampled feature map;
step 3025, calling a splicing cat function module by a computer to splice the first downsampling feature map and the second-scale second coding feature map to obtain a first spliced feature map;
Step 3026, inputting the first spliced feature map into a feature fusion module in the second scale network model by using a computer to obtain a second scale feature map F e2;
In step 303, a computer is used to perform feature extraction on the second scale feature map F e2 through a third scale network model to obtain a third scale feature map F e3, which specifically includes the following steps:
step 3031, a computer is adopted to conduct feature extraction on the second scale feature map F e2 through a third convolution layer to obtain a third input feature map;
step 3032, the computer inputs the third input feature map into a PA-based RDB module in the third-scale network model to perform feature extraction to obtain a third-scale first coding feature map;
step 3033, the computer inputs the third-scale first coding feature map into another RDB module based on PA in the third-scale network model to perform feature extraction, so as to obtain a third-scale second coding feature map;
step 3034, the computer performs 0.5 times downsampling on the second scale feature map F e2 to obtain a second downsampled feature map;
The computer performs 0.25 times downsampling on the first scale feature map F e1 to obtain a third downsampled feature map;
Step 3035, a computer is adopted to call a splicing cat function module to splice the second downsampling feature map, the third downsampling feature map and the third-scale second coding feature map to obtain a second spliced feature map;
Step 3036, inputting the second spliced feature map into a feature fusion module in the third-scale network model by adopting a computer to obtain a third-scale feature map F e3;
In step 304, a computer is adopted to extract the features of the third scale feature map F e3 through a fourth scale network model to obtain a fourth scale feature map F e4, which specifically comprises the following steps:
step 3041, performing feature extraction on the third scale feature map F e3 through a fourth convolution layer by adopting a computer to obtain a fourth input feature map;
Step 3042, inputting the fourth input feature map into a PA-based RDB module in a fourth-scale network model by a computer to perform feature extraction, so as to obtain a fourth-scale first coding feature map;
step 3043, inputting the fourth-scale first coding feature map into another RDB module based on PA in the fourth-scale network model by a computer for feature extraction to obtain a fourth-scale second coding feature map;
step 3044, the computer performs 0.5 times downsampling on the third scale feature map F e3 to obtain a fourth downsampled feature map;
The computer performs 0.25 times downsampling on the second scale feature map F e2 to obtain a fifth downsampled feature map;
The computer performs 0.125 times downsampling on the first scale feature map F e1 to obtain a sixth downsampled feature map;
Step 3045, calling a splicing cat function module by a computer to splice the fourth downsampling feature map, the fifth downsampling feature map, the sixth downsampling feature map and the fourth-scale second coding feature map to obtain a third spliced feature map;
And step 3046, inputting the third spliced feature map into a feature fusion module in the fourth-scale network model by adopting a computer to obtain a fourth-scale feature map F e4.
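Steps 3044 to 3046 (downsample, splice, fuse) can be illustrated in NumPy, using strided slicing as a stand-in for nearest-neighbour downsampling; the channel counts and sizes follow those given later in this embodiment, and the function name is illustrative:

```python
import numpy as np

def nn_downsample(x, factor):
    # Nearest-neighbour downsampling of a (C, H, W) map by an integer factor:
    # keep every `factor`-th pixel; the channel count is unchanged.
    return x[:, ::factor, ::factor]

# Feature maps at the sizes given in the embodiment (channels x H x W).
fe1 = np.zeros((32, 256, 256))   # first scale feature map F_e1
fe2 = np.zeros((64, 128, 128))   # second scale feature map F_e2
fe3 = np.zeros((128, 64, 64))    # third scale feature map F_e3
enc4 = np.zeros((256, 32, 32))   # fourth-scale second coding feature map

# 0.5x, 0.25x and 0.125x downsampling bring all maps to 32x32, then the
# splicing cat function concatenates them along the channel axis.
spliced = np.concatenate([
    nn_downsample(fe3, 2),   # fourth downsampled map: 128 x 32 x 32
    nn_downsample(fe2, 4),   # fifth downsampled map:   64 x 32 x 32
    nn_downsample(fe1, 8),   # sixth downsampled map:   32 x 32 x 32
    enc4,
], axis=0)                   # third spliced map: 480 x 32 x 32
```

The 480-channel spliced map is what the fourth-scale feature fusion module then reduces back to 256 channels.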
In this embodiment, in step 3012, the computer inputs the input feature map F in into a PA-based RDB module to perform feature extraction to obtain an intermediate feature map F out, which specifically includes the following steps:
Step A, a computer performs feature extraction on an input feature map F in through a first Conv+ReLU layer to obtain a feature map F pre;
Step B, a computer inputs a characteristic diagram F pre into a Conv1 convolution layer and an RDB module to perform characteristic extraction to obtain a characteristic diagram F RDB, and simultaneously, inputs a characteristic diagram F pre into a Conv2 convolution layer to perform convolution processing and normalizes the characteristic diagram through a Sigmoid activation function to obtain a space weight diagram F s;
Step C, the computer obtains a feature map F mid according to F mid = F RDB ⊙ F s; wherein ⊙ represents the Hadamard product operation between feature map matrices, and ⊕ represents the addition operation between feature map matrices;
Step D, the computer obtains the intermediate feature map F out according to F out = F mid ⊕ F pre;
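Assuming the reconstruction of steps B to D above (F mid as the Hadamard product of F RDB and the spatial weight map F s, and F out as the residual addition of F mid and F pre), the combination can be sketched in NumPy; `pa_combine` is an illustrative name, not part of the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pa_combine(f_rdb, f_conv2, f_pre):
    # Step B (tail): spatial weight map F_s is the Sigmoid-normalized
    # output of the Conv2 convolution layer.
    f_s = sigmoid(f_conv2)
    # Step C: F_mid is the Hadamard (element-wise) product of F_RDB and F_s.
    f_mid = f_rdb * f_s
    # Step D (assumed residual form): F_out = F_mid + F_pre.
    return f_mid + f_pre
```

All four maps share the same 32×256×256 shape in this embodiment, so the element-wise product and sum are well defined.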
In this embodiment, in step 3026, step 3036 and step 3046, the first spliced feature map, the second spliced feature map and the third spliced feature map are each recorded as a spliced feature map, and the second scale feature map F e2, the third scale feature map F e3 and the fourth scale feature map F e4 are each recorded as a fused scale feature map; the spliced feature map is input into the feature fusion module by the computer to obtain the fused scale feature map, the specific process being as follows:
A1, performing feature processing on the spliced feature map through a first Conv+ InstanceNorm normalization and ReLU activation function layer by adopting a computer to obtain a fusion coding feature map;
and A2, performing feature processing on the fusion coding feature map through a second Conv+ InstanceNorm normalization and ReLU activation function layer by adopting a computer to obtain a fused scale feature map.
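The InstanceNorm+ReLU portion of each fusion stage in steps A1 and A2 can be sketched in NumPy (the convolution is stubbed out here as an identity function and no affine parameters are used; this is an illustrative sketch, not the claimed module):

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # Normalize each channel of a (C, H, W) map by its own spatial mean and
    # variance, independently per sample (InstanceNorm semantics).
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    return np.maximum(x, 0.0)

# Each fusion stage is Conv -> InstanceNorm -> ReLU; with the convolution
# stubbed out, one stage applied to a spliced feature map looks like:
def fusion_stage(x, conv=lambda x: x):
    return relu(instance_norm(conv(x)))
```

In the actual module the two convolutions (Conv3, Conv4) also reduce the channel count of the spliced map, which this stub omits.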
In this embodiment, in step 305, a computer is used to perform feature extraction on the fourth scale feature map F e4 through the first decoding network model to obtain a first decoding feature map F d1, which specifically includes the following steps:
step 3051, performing feature extraction on the fourth scale feature map F e4 by using a computer through a PA-based RDB module in the first decoding network model to obtain a first pre-decoding feature map;
Step 3052, inputting the first pre-decoding feature map into another PA-based RDB module in the first decoding network model by the computer to perform feature extraction, so as to obtain a first decoding feature map F d1;
step 306, the specific process is as follows:
Step 3061, performing feature extraction on the first decoded feature map F d1 through a first transposed convolutional layer by using a computer to obtain a first decoded first upsampled feature map;
Step 3062, performing feature extraction on the first decoded first upsampled feature map through two PA-based RDB modules in the second decoding network model by using a computer to obtain a first intermediate feature map;
step 3063, performing 2 times up-sampling processing on the first decoding feature map F d1 by adopting a computer to obtain a first decoding second up-sampling feature map;
Step 3064, a computer is adopted to call a splicing cat function module to splice the first intermediate feature map and the first decoding second up-sampling feature map, so as to obtain a first decoding splicing feature map;
step 3065, inputting the first decoding spliced feature map into a feature fusion module in a second decoding network model by adopting a computer to obtain a second decoding feature map F d2;
step 307, the specific process is as follows:
Step 3071, using a computer to perform feature extraction on the second decoded feature map F d2 through a second transposed convolutional layer to obtain a second decoded first upsampled feature map;
step 3072, performing feature extraction on the second decoded first upsampled feature map by using a computer through two PA-based RDB modules in the third decoding network model to obtain a second intermediate feature map;
Step 3073, performing 4 times up-sampling on the first decoding feature map F d1 by adopting a computer to obtain a second decoding second up-sampling feature map;
performing 2 times up-sampling on the second decoding feature map F d2 to obtain a second decoding third up-sampling feature map;
Step 3074, calling a splicing cat function module by a computer to splice the second intermediate feature map, the second decoding second up-sampling feature map and the second decoding third up-sampling feature map to obtain a second decoding splicing feature map;
Step 3075, inputting the second decoding spliced feature map into a feature fusion module in a third decoding network model by adopting a computer to obtain a third decoding feature map F d3;
Step 308, the specific process is as follows:
Step 3081, performing feature extraction on the third decoded feature map F d3 through a third transposed convolutional layer by using a computer to obtain a third decoded first upsampled feature map;
Step 3082, performing feature extraction on the third decoded first upsampled feature map by using a computer through two PA-based RDB modules in the fourth decoding network model to obtain a third intermediate feature map;
Step 3083, performing 8 times up-sampling on the first decoding feature map F d1 by adopting a computer to obtain a third decoding second up-sampling feature map;
performing 4 times up-sampling on the second decoding feature map F d2 to obtain a third decoding third up-sampling feature map;
performing 2 times up-sampling on the third decoding feature map F d3 to obtain a third decoding fourth up-sampling feature map;
Step 3084, calling a splicing cat function module by a computer to splice the third intermediate feature map, the third decoding second up-sampling feature map, the third decoding third up-sampling feature map and the third decoding fourth up-sampling feature map to obtain a third decoding spliced feature map;
Step 3085, inputting the third decoding spliced feature map into the feature fusion module in the fourth decoding network model by adopting a computer to obtain a fourth decoding feature map F d4.
In this embodiment, it should be noted that the structures of the feature fusion modules in the second scale network model, the third scale network model and the fourth scale network model are the same and only the number of convolution kernels is different.
In this embodiment, it should be noted that the structures of the feature fusion modules in the second decoding network model, the third decoding network model, and the fourth decoding network model are the same and only the number of convolution kernels is different.
In this embodiment, the convolution layer in the first conv+ InstanceNorm normalization+relu activation function layer in the second scale network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 96, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
The convolution layer in the second Conv+ InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+ InstanceNorm normalization+ReLU activation function layer in the third scale network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 224, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1;
The convolution layer in the second Conv+ InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+ InstanceNorm normalization+ReLU activation function layer in the fourth scale network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 480, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1;
The convolution layer in the second Conv+ InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 256, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+ InstanceNorm normalization+ReLU activation function layer in the second decoding network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 384, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
The convolution layer in the second Conv+ InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 128, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1;
The convolution layer in the first Conv+ InstanceNorm normalization+ReLU activation function layer in the third decoding network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 448, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
The convolution layer in the second Conv+ InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+ InstanceNorm normalization+ReLU activation function layer in the fourth decoding network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 480, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1;
The convolution layer in the second Conv+ InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
In this embodiment, the PA module is a spatial attention mechanism module, and the RDB is a residual dense block (Residual Dense Block).
In this embodiment, it should be noted that the Adam optimization algorithm, that is, adaptive momentum optimization algorithm, is a first-order optimization algorithm that can replace the conventional random gradient descent process, and can iteratively update the model parameters based on training data.
In this embodiment, the number of foggy training images and the number of fog-free training images are both 13990.
In this embodiment, it should be noted that, in actual use, ⊙ refers to the Hadamard product between image matrices. For example, let the element in the i-th row and j-th column of matrix A be a_ij and the element in the i-th row and j-th column of matrix B be b_ij; then for C = A ⊙ B, the element in the i-th row and j-th column of C is c_ij = a_ij × b_ij, where A, B and C are matrices of the same dimensions.
In this embodiment, it should be noted that the preset number of iterative training in step 502 is 30.
In this embodiment, it should be noted that, when i=1, Φ 1 (gt) represents a feature map of the defogging training image gt output through Relu1_1 layer in the VGG19 network model, and Φ 1 (out) represents a feature map of the defogging image out output through Relu1_1 layer in the VGG19 network model;
When i=2, Φ 2 (gt) represents a feature map of the defogging training image gt output through Relu2_1 layer in the VGG19 network model, Φ 2 (out) represents a feature map of the defogging image out output through Relu2_1 layer in the VGG19 network model;
When i=3, Φ 3 (gt) represents a feature map of the defogging training image gt output through Relu3_1 layer in the VGG19 network model, Φ 3 (out) represents a feature map of the defogging image out output through Relu3_1 layer in the VGG19 network model;
When i=4, Φ 4 (gt) represents a feature map of the defogging training image gt output through Relu4_1 layer in the VGG19 network model, and Φ 4 (out) represents a feature map of the defogging image out output through Relu4_1 layer in the VGG19 network model;
When i=5, Φ 5 (gt) represents a feature map of the defogging training image gt output through Relu5_1 layer in the VGG19 network model, and Φ 5 (out) represents a feature map of the defogging image out output through Relu5_1 layer in the VGG19 network model.
In this embodiment, the downsampling is nearest neighbor downsampling, and the upsampling is nearest neighbor upsampling.
In this embodiment, 0.5-times downsampling means that the number of channels of the image is unchanged and the size of the image becomes 1/2 of the original; 0.25-times downsampling means that the number of channels is unchanged and the size becomes 1/4 of the original; 0.125-times downsampling means that the number of channels is unchanged and the size becomes 1/8 of the original.
In this embodiment, it should be noted that 2-times up-sampling means that the number of channels of the image is unchanged and the size of the image becomes 2 times the original; 4-times up-sampling means that the number of channels is unchanged and the size becomes 4 times the original; 8-times up-sampling means that the number of channels is unchanged and the size becomes 8 times the original.
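Nearest-neighbour up-sampling with these semantics can be sketched in NumPy by repeating pixels along the spatial axes (an illustrative sketch; the function name is not part of the patent):

```python
import numpy as np

def nn_upsample(x, factor):
    # Nearest-neighbour upsampling of a (C, H, W) feature map: every pixel is
    # repeated `factor` times along both spatial axes; channels are unchanged.
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

# The first decoding feature map F_d1 (256 x 32 x 32) upsampled 8 times
# reaches 256 x 256 x 256, matching the third decoding second up-sampling map.
fd1 = np.zeros((256, 32, 32))
up8 = nn_upsample(fd1, 8)
```

The same function with factors 2 and 4 produces the other decoder-side up-sampled maps.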
In this embodiment, the foggy training image I is a three-channel RGB color image, i.e., 3×256×256. The size of the fog-free training image is also 3×256×256.
In this embodiment, the size of the EPDN teacher network output defogging image out EP and the size of the PSD teacher network output defogging image out PS are both 3×256×256, the size of the EPDN teacher network intermediate output feature map EP 1 is 128×64×64, and the size of the PSD teacher network intermediate output feature map PS 2 is 64×128×128.
In this embodiment, the size of the feature map F pre is 32×256×256, the size of the feature map F RDB is 32×256×256, the size of the feature map F mid is 32×256×256, and the size of the feature map F s is 32×256×256.
In the present embodiment, the size of the feature map is expressed by the number of channels×length×width, the size of the input feature map F in is 32×256×256, the size of the output feature map F out is 32×256×256, and the size of the first scale feature map F e1 is 32×256×256;
the second input feature map has a size of 64×128×128, the second-scale first coding feature map has a size of 64×128×128, the second-scale second coding feature map has a size of 64×128×128, the first downsampled feature map has a size of 32×128×128, the first post-splice feature map has a size of 96×128×128, and the second-scale feature map F e2 has a size of 64×128×128;
The third input feature map has a size of 128×64×64, the third-scale first coding feature map has a size of 128×64×64, the third-scale second coding feature map has a size of 128×64×64, the second downsampled feature map has a size of 64×64×64, the third downsampled feature map has a size of 32×64×64, the second spliced feature map has a size of 224×64×64, and the third scale feature map F e3 has a size of 128×64×64;
The fourth input feature map has a size of 256×32×32, the fourth-scale first coding feature map has a size of 256×32×32, the fourth-scale second coding feature map has a size of 256×32×32, the fourth downsampled feature map has a size of 128×32×32, the fifth downsampled feature map has a size of 64×32×32, the sixth downsampled feature map has a size of 32×32×32, the third spliced feature map has a size of 480×32×32, and the fourth scale feature map F e4 has a size of 256×32×32.
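The spliced channel counts in the embodiment (96, 224, 480) are simply the sums of the concatenated maps' channels. A short NumPy sketch verifies this; the (C, H, W) layout is an assumption, and convolutions are omitted:

```python
import numpy as np

def cat(*maps):
    """Splice (cat) along the channel axis: channel counts add, H and W
    of all inputs must already match."""
    return np.concatenate(maps, axis=0)   # (C, H, W) layout, channel axis first

down1 = np.zeros((32, 128, 128))   # first scale feature map F_e1 downsampled 0.5x
enc2  = np.zeros((64, 128, 128))   # second-scale second coding feature map
print(cat(down1, enc2).shape)      # (96, 128, 128): the first spliced feature map
```

The 224- and 480-channel splices follow the same arithmetic (64+32+128 and 128+64+32+256).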
In this embodiment, the first pre-decoding feature map and the first decoding feature map F d1 both have a size of 256×32×32;
The size of the first decoded first up-sampled feature map is 128×64×64, the size of the first intermediate feature map is 128×64×64, the size of the first decoded second up-sampled feature map is 256×64×64, the size of the first decoded splice feature map is 384×64×64, and the size of the second decoded feature map F d2 is 128×64×64;
The second decoded first up-sampled feature map has a size of 64×128×128, the second intermediate feature map has a size of 64×128×128, the second decoded second up-sampled feature map has a size of 256×128×128, the second decoded third up-sampled feature map has a size of 128×128×128, the second decoded spliced feature map has a size of 448×128×128, and the third decoding feature map F d3 has a size of 64×128×128;
the third decoded first up-sampled feature map has a size of 32×256×256, the third intermediate feature map has a size of 32×256×256, the third decoded second up-sampled feature map has a size of 256×256×256, the third decoded third up-sampled feature map has a size of 128×256×256, the third decoded fourth up-sampled feature map has a size of 64×256×256, the third decoded spliced feature map has a size of 480×256×256, and the fourth decoding feature map F d4 has a size of 32×256×256;
the size of the output defogging image out is 3×256×256.
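The decoder-side splice sizes listed above obey the same channel bookkeeping; a minimal check of the three sums (all other architectural details are omitted):

```python
# Each decoded spliced map's channel count is the sum of its inputs' channels;
# the groupings below restate the embodiment's decoder splices.
decoder_splices = {
    "first":  [128, 256],            # first intermediate map + 2x-upsampled F_d1
    "second": [64, 256, 128],        # second intermediate map + upsampled F_d1, F_d2
    "third":  [32, 256, 128, 64],    # third intermediate map + upsampled F_d1, F_d2, F_d3
}
print({name: sum(chs) for name, chs in decoder_splices.items()})
# {'first': 384, 'second': 448, 'third': 480}
```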
In this embodiment, it should be noted that the first decoded spliced feature map, the second decoded spliced feature map and the third decoded spliced feature map are respectively processed by the first Conv + InstanceNorm normalization + ReLU activation function layer and the second Conv + InstanceNorm normalization + ReLU activation function layer of the corresponding feature fusion module, so as to obtain the second decoding feature map, the third decoding feature map and the fourth decoding feature map.
In summary, the method has simple steps and a reasonable design. The student network model is trained under the guidance of the EPDN teacher network model and the PSD teacher network model, which effectively improves the feature-extraction capability of the student network; by encoding and decoding at four scales, the student network model extracts multi-scale information from the foggy image and effectively fuses its global and local features, further improving the defogging effect.
The foregoing description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and any simple modification, variation and equivalent structural changes made to the above embodiment according to the technical substance of the present invention still fall within the scope of the technical solution of the present invention.
Claims (7)
1. A single image defogging method based on multi-teacher knowledge distillation, which is characterized by comprising the following steps:
step one, acquiring a training set image:
selecting an indoor training set from the foggy-day image database RESIDE; the indoor training set comprises foggy training images and corresponding fog-free training images, the number of foggy training images being the same as the number of fog-free training images;
step two, establishing a student network model:
The method for establishing the student network model comprises the following specific processes:
Step 201, establishing an encoder model of a student network by adopting a computer; the encoder model of the student network comprises a first scale network model, a second scale network model, a third scale network model and a fourth scale network model, wherein the first scale network model comprises a first convolution layer and two RDB modules based on PA, and the second scale network model comprises a second convolution layer, two RDB modules based on PA and a feature fusion module; the third scale network model comprises a third convolution layer, two RDB modules based on PA and a feature fusion module; the fourth scale network model comprises a fourth convolution layer, two RDB modules based on PA and a feature fusion module;
Step 202, adopting a computer to establish a decoder model of the student network; the decoder model of the student network comprises a first decoding network model, a second decoding network model, a third decoding network model, a fourth decoding network model and a fifth convolution layer, wherein the first decoding network model comprises two RDB modules based on PA, and the second decoding network model comprises a first transpose convolution layer, two RDB modules based on PA and a feature fusion module; the third decoding network model comprises a second transpose convolution layer, two RDB modules based on PA and a feature fusion module; the fourth decoding network model comprises a third transpose convolution layer, two RDB modules based on PA and a feature fusion module;
step three, extracting features of the foggy training images:
Step 301, extracting features of the foggy training image I through a first scale network model by adopting a computer to obtain a first scale feature map F e1;
Step 302, extracting features of the first scale feature map F e1 through a second scale network model by using a computer to obtain a second scale feature map F e2;
step 303, extracting features of the second scale feature map F e2 through a third scale network model by using a computer to obtain a third scale feature map F e3;
step 304, extracting features of the third scale feature map F e3 through a fourth scale network model by using a computer to obtain a fourth scale feature map F e4;
step 305, performing feature extraction on the fourth scale feature map F e4 through the first decoding network model by using a computer to obtain a first decoding feature map F d1;
Step 306, extracting features of the first decoding feature map F d1 through a second decoding network model by using a computer to obtain a second decoding feature map F d2;
Step 307, performing feature extraction on the second decoding feature map F d2 through a third decoding network model by using a computer to obtain a third decoding feature map F d3;
Step 308, performing feature extraction on the third decoding feature map F d3 through a fourth decoding network model by using a computer to obtain a fourth decoding feature map F d4; performing feature extraction on the fourth decoding feature map F d4 through the fifth convolution layer by adopting a computer to obtain an output defogging image out;
Step 309, processing the foggy training image I with the EPDN teacher network model by using a computer to obtain the EPDN teacher network output defogging image out EP, and recording the feature map output by the global sub-generator in the EPDN teacher network model as the EPDN teacher network intermediate output feature map EP 1;
processing the foggy training image I with the PSD teacher network model by using a computer to obtain the PSD teacher network output defogging image out PS, and recording the feature map output by the backbone network in the PSD teacher network model as the PSD teacher network intermediate output feature map PS 2;
Step four, establishing a total loss function:
Step 401, obtaining a perception loss function L per by adopting a computer according to L per = Σ_{i=1}^{n} (1/(C i·H i·W i))·(Φ i(gt), Φ i(out)) L1; wherein i is a positive integer, n=5, Φ i(gt) represents the feature map, output by the Relu i _1 layer of the VGG19 network model, of the fog-free training image gt corresponding to the foggy training image I, Φ i(out) represents the feature map, output by the Relu i _1 layer of the VGG19 network model, of the output defogging image out of the student network model, and 1 ≤ i ≤ 5; C i, H i and W i represent the number of channels, the length and the width of the feature map output by the Relu i _1 layer, respectively; (Φ i(gt), Φ i(out)) L1 represents the Manhattan distance between the two feature maps output by the Relu i _1 layer of the VGG19 network model;
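The perception loss of step 401 can be sketched as follows; the VGG19 feature extraction itself is assumed and replaced by stand-in arrays, so only the normalized-L1 accumulation is shown:

```python
import numpy as np

def l1(a, b):
    """Manhattan (L1) distance between two feature maps."""
    return np.abs(a - b).sum()

def perceptual_loss(feats_gt, feats_out):
    """L_per = sum_i (1/(C_i*H_i*W_i)) * ||phi_i(gt) - phi_i(out)||_1,
    where phi_i are the Relu i_1 activations of VGG19 (not computed here)."""
    loss = 0.0
    for fg, fo in zip(feats_gt, feats_out):
        c, h, w = fg.shape
        loss += l1(fg, fo) / (c * h * w)
    return loss

# Five stand-in "layers" with a constant difference of 1 per element:
feats_gt  = [np.ones((4, 8, 8)) for _ in range(5)]
feats_out = [np.zeros((4, 8, 8)) for _ in range(5)]
print(perceptual_loss(feats_gt, feats_out))   # 5.0: each layer contributes 1.0
```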
Step 402, obtaining a distillation loss function L dist by adopting a computer according to L dist = (out, out EP) L1 + (out, out PS) L1 + 0.25·(EP 1, F d2) L1 + 0.5·(PS 2, F d3) L1; wherein (out, out EP) L1 represents the Manhattan distance between the output defogging image out of the student network model and the EPDN teacher network output defogging image out EP, (out, out PS) L1 represents the Manhattan distance between the output defogging image out of the student network model and the PSD teacher network output defogging image out PS, (EP 1, F d2) L1 represents the Manhattan distance between the EPDN teacher network intermediate output feature map EP 1 and the second decoding feature map F d2 of the student network model, and (PS 2, F d3) L1 represents the Manhattan distance between the PSD teacher network intermediate output feature map PS 2 and the third decoding feature map F d3 of the student network model;
Step 403, obtaining a total loss function L loss by adopting a computer according to L loss = 0.1·L per + L dist;
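Steps 402-403 can be sketched together; the teacher and student maps are stand-in arrays of the matching shapes, and only the weighting of the four L1 terms plus the 0.1 factor on L per is demonstrated:

```python
import numpy as np

def l1(a, b):
    """Manhattan (L1) distance, written (a, b)_L1 in the claims."""
    return np.abs(a - b).sum()

def total_loss(out, out_ep, out_ps, ep1, f_d2, ps2, f_d3, l_per):
    """L_dist = (out,out_EP)_L1 + (out,out_PS)_L1 + 0.25*(EP_1,F_d2)_L1
              + 0.5*(PS_2,F_d3)_L1;  L_loss = 0.1*L_per + L_dist."""
    l_dist = (l1(out, out_ep) + l1(out, out_ps)
              + 0.25 * l1(ep1, f_d2) + 0.5 * l1(ps2, f_d3))
    return 0.1 * l_per + l_dist

out    = np.zeros((3, 2, 2))
ones3  = np.ones((3, 2, 2))
ones1  = np.ones((1, 2, 2))
zeros1 = np.zeros((1, 2, 2))
print(total_loss(out, ones3, ones3, ones1, zeros1, ones1, zeros1, l_per=10.0))
# 28.0 = 0.1*10 + (12 + 12 + 0.25*4 + 0.5*4)
```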
Training the student network model by the foggy training image:
Step 501, adopting the Adam optimization algorithm by a computer and iteratively optimizing the student network model with the total loss function L loss until the whole training set has been traversed, thereby completing one iteration of training;
Step 502, repeating the iterative training of step 501 until the preset number of training iterations is reached, so as to obtain a trained student network model;
Step six, defogging the single image by using the trained student network model:
inputting any foggy image into the trained student network model by adopting a computer for defogging processing, so as to obtain a defogged image;
The PA-based RDB module in step 201 includes a first conv+relu layer, a Conv1 convolution layer, an RDB module, a Conv2 convolution layer, and a Sigmoid activation function layer.
2. A single image defogging method based on multi-teacher knowledge distillation according to claim 1, wherein: in step 201, the number of convolution kernels in the first convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
The number of convolution kernels in the second convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, and the padding is 1;
The number of convolution kernels in the first Conv+ReLU layer is 32, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1; the number of convolution kernels in the Conv1 convolution layer is 32, the size of the convolution kernels is 1 multiplied by 1, the sliding step length is 1, and the padding is 0; the number of convolution kernels in the Conv2 convolution layer is 32, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1;
The number of convolution kernels in the third convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step length is 2, and the padding is 1;
the number of convolution kernels in the fourth convolution layer is 256, the size of the convolution kernels is 3×3, the sliding step length is 2, and the padding is 1;
The feature fusion module in step 201 includes a first Conv + InstanceNorm normalization + ReLU activation function layer and a second Conv + InstanceNorm normalization + ReLU activation function layer;
in step 202, the number of convolution kernels in the first transpose convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the second transpose convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the third transpose convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the fifth convolution layer is 3, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
3. A single image defogging method based on multi-teacher knowledge distillation according to claim 1, wherein: in step 301, a computer is used to perform feature extraction on the foggy training image I through a first scale network model, so as to obtain a first scale feature map F e1, which specifically includes the following steps:
Step 3011, performing feature extraction on the foggy training image I through a first convolution layer by adopting a computer to obtain an input feature map F in;
Step 3012, inputting the input feature map F in into a PA-based RDB module by a computer to perform feature extraction, so as to obtain an intermediate output feature map F out;
Step 3013, according to the method described in step 3012, the computer inputs the intermediate output feature map F out into another PA-based RDB module for feature extraction, to obtain a first scale feature map F e1.
4. A single image defogging method based on multi-teacher knowledge distillation according to claim 1, wherein: in step 302, a computer is used to perform feature extraction on the first scale feature map F e1 through a second scale network model to obtain a second scale feature map F e2, which specifically includes the following steps:
Step 3021, performing feature extraction on the first scale feature map F e1 through a second convolution layer by using a computer to obtain a second input feature map;
Step 3022, inputting the second input feature map into a PA-based RDB module in the second scale network model by the computer to perform feature extraction, so as to obtain a second scale first coding feature map;
Step 3023, inputting the second-scale first coding feature map into another PA-based RDB module in the second-scale network model by the computer for feature extraction, so as to obtain a second-scale second coding feature map;
Step 3024, the computer downsamples the first scale feature map F e1 by 0.5 times to obtain a first downsampled feature map;
step 3025, calling a splicing cat function module by a computer to splice the first downsampling feature map and the second-scale second coding feature map to obtain a first spliced feature map;
Step 3026, inputting the first spliced feature map into a feature fusion module in the second scale network model by using a computer to obtain a second scale feature map F e2;
In step 303, a computer is used to perform feature extraction on the second scale feature map F e2 through a third scale network model to obtain a third scale feature map F e3, which specifically includes the following steps:
step 3031, a computer is adopted to conduct feature extraction on the second scale feature map F e2 through a third convolution layer to obtain a third input feature map;
step 3032, the computer inputs the third input feature map into a PA-based RDB module in the third-scale network model to perform feature extraction to obtain a third-scale first coding feature map;
step 3033, the computer inputs the third-scale first coding feature map into another PA-based RDB module in the third-scale network model to perform feature extraction, so as to obtain a third-scale second coding feature map;
step 3034, the computer performs 0.5 times downsampling on the second scale feature map F e2 to obtain a second downsampled feature map;
The computer performs 0.25 times downsampling on the first scale feature map F e1 to obtain a third downsampled feature map;
Step 3035, a computer is adopted to call a splicing cat function module to splice the second downsampling feature map, the third downsampling feature map and the third-scale second coding feature map to obtain a second spliced feature map;
Step 3036, inputting the second spliced feature map into a feature fusion module in the third-scale network model by adopting a computer to obtain a third-scale feature map F e3;
In step 304, a computer is adopted to extract the features of the third scale feature map F e3 through a fourth scale network model to obtain a fourth scale feature map F e4, which specifically comprises the following steps:
step 3041, performing feature extraction on the third scale feature map F e3 through a fourth convolution layer by adopting a computer to obtain a fourth input feature map;
Step 3042, inputting the fourth input feature map into a PA-based RDB module in a fourth-scale network model by a computer to perform feature extraction, so as to obtain a fourth-scale first coding feature map;
step 3043, inputting the fourth-scale first coding feature map into another RDB module based on PA in the fourth-scale network model by a computer for feature extraction to obtain a fourth-scale second coding feature map;
step 3044, the computer performs 0.5 times downsampling on the third scale feature map F e3 to obtain a fourth downsampled feature map;
The computer performs 0.25 times downsampling on the second scale feature map F e2 to obtain a fifth downsampled feature map;
The computer performs 0.125 times downsampling on the first scale feature map F e1 to obtain a sixth downsampled feature map;
Step 3045, calling a splicing cat function module by a computer to splice the fourth downsampling feature map, the fifth downsampling feature map, the sixth downsampling feature map and the fourth-scale second coding feature map to obtain a third spliced feature map;
And step 3046, inputting the third spliced feature map into a feature fusion module in the fourth-scale network model by adopting a computer to obtain a fourth-scale feature map F e4.
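The multi-scale splice of steps 3044-3046 (bring every earlier scale down to 32×32, then cat) can be sketched in NumPy; strided subsampling stands in for the patent's 0.5x/0.25x/0.125x downsampling, whose exact operator is not specified:

```python
import numpy as np

def down(x, factor):
    """Strided subsampling of a (C, H, W) map by 1/factor (a stand-in;
    the patent does not name the downsampling operator)."""
    return x[:, ::factor, ::factor]

f_e1 = np.zeros((32, 256, 256))   # first scale feature map
f_e2 = np.zeros((64, 128, 128))   # second scale feature map
f_e3 = np.zeros((128, 64, 64))    # third scale feature map
enc4 = np.zeros((256, 32, 32))    # fourth-scale second coding feature map

# Steps 3044-3045: downsample F_e3 by 0.5x, F_e2 by 0.25x, F_e1 by 0.125x, then splice
spliced = np.concatenate([down(f_e3, 2), down(f_e2, 4), down(f_e1, 8), enc4], axis=0)
print(spliced.shape)   # (480, 32, 32): the third spliced feature map
```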
5. A single image defogging method based on multi-teacher knowledge distillation according to claim 3, characterized in that: in step 3012, the computer inputs the input feature map F in into a PA-based RDB module to perform feature extraction to obtain the intermediate output feature map F out, which specifically includes the following steps:
Step A, a computer performs feature extraction on an input feature map F in through a first Conv+ReLU layer to obtain a feature map F pre;
Step B, a computer inputs a characteristic diagram F pre into a Conv1 convolution layer and an RDB module to perform characteristic extraction to obtain a characteristic diagram F RDB, and simultaneously, inputs a characteristic diagram F pre into a Conv2 convolution layer to perform convolution processing and normalizes the characteristic diagram through a Sigmoid activation function to obtain a space weight diagram F s;
Step C, the computer obtains a feature map F mid according to F mid = F RDB ⊗ F s ⊕ F pre; wherein ⊗ represents the Hadamard product operation between the matrices of the feature maps, and ⊕ represents the addition operation between the feature map matrices;
Step D, the computer obtains the intermediate output feature map F out according to F out = F mid ⊕ F in.
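The data flow of claim 5 can be sketched as follows. The three callables stand in for the first Conv+ReLU layer, the Conv1-plus-RDB branch and the Conv2 branch (real convolutions are assumed, not implemented), and the residual over F in in the last line is an assumption about the step-D formula:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pa_rdb(f_in, conv_relu, conv1_rdb, conv2):
    """Sketch of the PA-based RDB module of claim 5."""
    f_pre = conv_relu(f_in)            # step A: first Conv+ReLU layer
    f_rdb = conv1_rdb(f_pre)           # step B: Conv1 + RDB feature branch
    f_s   = sigmoid(conv2(f_pre))      # step B: Conv2 + Sigmoid spatial weight map
    f_mid = f_rdb * f_s + f_pre        # step C: Hadamard product, then addition
    return f_mid + f_in                # step D: residual over the input (assumed)

identity = lambda x: x                 # placeholder for the learned convolutions
x = np.full((32, 4, 4), 0.5)
print(pa_rdb(x, identity, identity, identity).shape)   # (32, 4, 4)
```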
6. A single image defogging method based on multi-teacher knowledge distillation according to claim 4, characterized in that: in step 3026, step 3036 and step 3046, the first spliced feature map, the second spliced feature map and the third spliced feature map are each recorded as a spliced feature map, and the second scale feature map F e2, the third scale feature map F e3 and the fourth scale feature map F e4 are each recorded as a fused scale feature map; the computer then inputs the spliced feature map into the feature fusion module to obtain the fused scale feature map, with the following specific steps:
Step A1, performing feature processing on the spliced feature map through the first Conv + InstanceNorm normalization + ReLU activation function layer by adopting a computer to obtain a fusion coding feature map;
Step A2, performing feature processing on the fusion coding feature map through the second Conv + InstanceNorm normalization + ReLU activation function layer by adopting a computer to obtain the fused scale feature map.
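The two-stage fusion of claim 6 can be sketched as below; the convolution layers are stand-in callables, and only the InstanceNorm + ReLU behaviour is actually computed:

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Per-channel instance normalization over a (C, H, W) feature map."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var  = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def fusion(spliced, conv_a, conv_b):
    """Feature fusion module of claim 6: two Conv + InstanceNorm + ReLU stages.
    `conv_a` and `conv_b` stand in for the two learned convolution layers."""
    h = np.maximum(instance_norm(conv_a(spliced)), 0.0)   # step A1
    return np.maximum(instance_norm(conv_b(h)), 0.0)      # step A2

x = np.random.randn(96, 8, 8)          # e.g. the first spliced feature map
y = fusion(x, lambda t: t, lambda t: t)
print(y.shape, y.min() >= 0.0)         # shape preserved; ReLU output non-negative
```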
7. A single image defogging method based on multi-teacher knowledge distillation according to claim 1, wherein: in step 305, a computer is used to extract features of the fourth scale feature map F e4 through the first decoding network model to obtain a first decoding feature map F d1, which specifically includes the following steps:
step 3051, performing feature extraction on the fourth scale feature map F e4 by using a computer through a PA-based RDB module in the first decoding network model to obtain a first pre-decoding feature map;
Step 3052, inputting the first pre-decoding feature map into another PA-based RDB module in the first decoding network model by the computer to perform feature extraction, so as to obtain a first decoding feature map F d1;
step 306, the specific process is as follows:
Step 3061, performing feature extraction on the first decoded feature map F d1 through a first transposed convolutional layer by using a computer to obtain a first decoded first upsampled feature map;
Step 3062, performing feature extraction on the first decoded first upsampled feature map through two PA-based RDB modules in the second decoding network model by using a computer to obtain a first intermediate feature map;
step 3063, performing 2 times up-sampling processing on the first decoding feature map F d1 by adopting a computer to obtain a first decoding second up-sampling feature map;
Step 3064, a computer is adopted to call a splicing cat function module to splice the first intermediate feature map and the first decoding second up-sampling feature map, so as to obtain a first decoding splicing feature map;
step 3065, inputting the first decoding spliced feature map into a feature fusion module in a second decoding network model by adopting a computer to obtain a second decoding feature map F d2;
step 307, the specific process is as follows:
Step 3071, using a computer to perform feature extraction on the second decoded feature map F d2 through a second transposed convolutional layer to obtain a second decoded first upsampled feature map;
step 3072, performing feature extraction on the second decoded first upsampled feature map by using a computer through two PA-based RDB modules in the third decoding network model to obtain a second intermediate feature map;
Step 3073, performing 4 times up-sampling on the first decoding feature map F d1 by adopting a computer to obtain a second decoded second up-sampled feature map;
performing 2 times up-sampling on the second decoding feature map F d2 to obtain a second decoding third up-sampling feature map;
Step 3074, calling a splicing cat function module by a computer to splice the second intermediate feature map, the second decoding second up-sampling feature map and the second decoding third up-sampling feature map to obtain a second decoding splicing feature map;
Step 3075, inputting the second decoding spliced feature map into a feature fusion module in a third decoding network model by adopting a computer to obtain a third decoding feature map F d3;
Step 308, the specific process is as follows:
Step 3081, performing feature extraction on the third decoded feature map F d3 through a third transposed convolutional layer by using a computer to obtain a third decoded first upsampled feature map;
Step 3082, performing feature extraction on the third decoded first up-sampled feature map by using a computer through the two PA-based RDB modules in the fourth decoding network model to obtain a third intermediate feature map;
Step 3083, performing 8 times up-sampling on the first decoding feature map F d1 by adopting a computer to obtain a third decoded second up-sampled feature map;
performing 4 times up-sampling on the second decoding feature map F d2 to obtain a third decoded third up-sampled feature map;
Performing 2 times up-sampling on the third decoding feature map F d3 to obtain a third decoding fourth up-sampling feature map;
Step 3084, calling the splicing cat function module by a computer to splice the third intermediate feature map, the third decoded second up-sampled feature map, the third decoded third up-sampled feature map and the third decoded fourth up-sampled feature map to obtain a third decoded spliced feature map;
Step 3085, inputting the third decoded spliced feature map into the feature fusion module in the fourth decoding network model by adopting a computer to obtain a fourth decoding feature map F d4.
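The decoder-side splice of step 308 (upsample F d1, F d2 and F d3 to 256×256, then cat with the third intermediate feature map) can be sketched in NumPy; nearest-neighbour repetition stands in for the unspecified interpolation mode:

```python
import numpy as np

def up(x, factor):
    """Nearest-neighbour up-sampling stand-in for the fixed 8x/4x/2x factors."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

f_d1 = np.zeros((256, 32, 32))     # first decoding feature map
f_d2 = np.zeros((128, 64, 64))     # second decoding feature map
f_d3 = np.zeros((64, 128, 128))    # third decoding feature map
mid3 = np.zeros((32, 256, 256))    # third intermediate feature map

spliced = np.concatenate([mid3, up(f_d1, 8), up(f_d2, 4), up(f_d3, 2)], axis=0)
print(spliced.shape)   # (480, 256, 256): the third decoded spliced feature map
```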
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310681883.6A CN116862784B (en) | 2023-06-09 | 2023-06-09 | Single image defogging method based on multi-teacher knowledge distillation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116862784A CN116862784A (en) | 2023-10-10 |
CN116862784B true CN116862784B (en) | 2024-06-04 |
Family
ID=88218024
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2020100274A4 (en) * | 2020-02-25 | 2020-03-26 | Huang, Shuying DR | A Multi-Scale Feature Fusion Network based on GANs for Haze Removal |
CN111833277A (en) * | 2020-07-27 | 2020-10-27 | 大连海事大学 | Marine image defogging method with non-paired multi-scale hybrid coding and decoding structure |
WO2021056043A1 (en) * | 2019-09-23 | 2021-04-01 | Presagen Pty Ltd | Decentralised artificial intelligence (ai)/machine learning training system |
CN113066025A (en) * | 2021-03-23 | 2021-07-02 | 河南理工大学 | Image defogging method based on incremental learning and feature and attention transfer |
CN113379613A (en) * | 2020-03-10 | 2021-09-10 | 三星电子株式会社 | Image denoising system and method using deep convolutional network |
CN113744146A (en) * | 2021-08-23 | 2021-12-03 | 山东师范大学 | Image defogging method based on contrast learning and knowledge distillation |
CN115358942A (en) * | 2022-08-09 | 2022-11-18 | 山西大学 | Image defogging method combining course learning and teacher-student learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11508037B2 (en) * | 2020-03-10 | 2022-11-22 | Samsung Electronics Co., Ltd. | Systems and methods for image denoising using deep convolutional networks |
Non-Patent Citations (2)
Title |
---|
Hang Dong et al.; "Multi-Scale Boosted Dehazing Network With Dense Feature Fusion"; 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); August 5, 2020; pp. 2154-2164 *
Wu Jiawei; Yu Zhaochai; Li Zuoyong; Liu Weina; Zhang Zuchang; "A Two-Stage Image Dehazing Network Based on Deep Learning"; Computer Applications and Software (计算机应用与软件); April 2020; Vol. 37, No. 4; pp. 197-202 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||