CN116862784A - Single image defogging method based on multi-teacher knowledge distillation

Info

Publication number: CN116862784A (application CN202310681883.6A; granted as CN116862784B)
Original language: Chinese (zh)
Prior art keywords: feature map, scale, computer, network model, decoding
Inventors: 兰云伟, 崔智高, 苏延召, 马铮, 蔡艳平, 王涛, 曹继平
Applicant and current assignee: Rocket Force University of Engineering of PLA
Legal status: Active (granted)

Classifications

    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/08 Learning methods
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • Y02A 90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The invention discloses a single image defogging method based on multi-teacher knowledge distillation, which comprises the following steps: 1. acquiring the training set images; 2. establishing a student network model; 3. extracting features from the foggy training images; 4. establishing a total loss function; 5. training the student network model with the foggy training images; 6. defogging a single image with the trained student network model. In the invention, an EPDN teacher network model and a PSD teacher network model guide the training of the student network model, which effectively improves the feature extraction capability of the student network; through four-scale encoding and decoding, the student network model extracts multi-scale information of the image to be defogged and effectively fuses its global and local features, further improving the defogging effect.

Description

Single image defogging method based on multi-teacher knowledge distillation
Technical Field
The invention belongs to the technical field of image defogging processing, and particularly relates to a single image defogging method based on multi-teacher knowledge distillation.
Background
The teacher models available for image defogging fall mainly into two categories: defogging methods based on prior information and defogging methods based on deep learning. Prior-information-based methods have advantages in recovering the visibility, contrast and texture structure of the image, while deep-learning-based methods perform better at improving the authenticity and color fidelity of the image. At present, however, the knowledge learned by a single teacher model is generally transferred to a student model so that the student model approaches the teacher's performance; because a single teacher model performs one-way knowledge transfer to the student network, the trained student model is often limited by the performance of that teacher model.
Therefore, what is currently lacking is a single image defogging method based on multi-teacher knowledge distillation that is simple in structure and reasonable in design, in which an EPDN teacher network model and a PSD teacher network model jointly train the student network model to effectively improve the feature extraction capability of the student network, and in which the student network model extracts multi-scale information of the image through four-scale encoding and decoding, effectively fusing its global and local features and thereby improving the defogging effect.
Disclosure of Invention
Aiming at the above defects in the prior art, the invention provides a single image defogging method based on multi-teacher knowledge distillation with simple steps and a reasonable design: an EPDN teacher network model and a PSD teacher network model guide the training of a student network model, effectively improving the feature extraction capability of the student network, and the student network model extracts multi-scale information of the image through four-scale encoding and decoding, effectively fusing its global and local features and thereby improving the defogging effect of the image.
In order to solve the technical problems, the invention adopts the following technical scheme: a single image defogging method based on multi-teacher knowledge distillation, which is characterized by comprising the following steps:
Step one, acquiring a training set image:
selecting an indoor training set from the foggy day image database RESIDE; the indoor training set comprises foggy training images and the corresponding fog-free training images, and the number of foggy training images equals the number of fog-free training images;
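For context, the RESIDE indoor training set stores each foggy image alongside the fog-free image it was synthesized from. A minimal PyTorch dataset sketch for such pairs follows; the file-naming rule (foggy '<id>_<k>_<beta>.png' matched to fog-free '<id>.png') is an assumption about the local copy of the dataset, not something the patent specifies.

```python
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class ResideITSPairs(Dataset):
    """(foggy, fog-free) pairs from a local copy of the RESIDE indoor set.

    Assumes foggy files are named '<id>_<k>_<beta>.png' and the matching
    fog-free image is '<id>.png'; adjust the pairing rule to the local data.
    """
    def __init__(self, hazy_dir, clear_dir):
        self.hazy_dir, self.clear_dir = hazy_dir, clear_dir
        self.hazy_files = sorted(os.listdir(hazy_dir))
        self.to_tensor = transforms.ToTensor()

    def __len__(self):
        return len(self.hazy_files)

    def __getitem__(self, idx):
        hazy_name = self.hazy_files[idx]
        clear_name = hazy_name.split('_')[0] + '.png'
        hazy = Image.open(os.path.join(self.hazy_dir, hazy_name)).convert('RGB')
        clear = Image.open(os.path.join(self.clear_dir, clear_name)).convert('RGB')
        return self.to_tensor(hazy), self.to_tensor(clear)
```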
step two, establishing a student network model:
the method for establishing the student network model comprises the following specific processes:
step 201, establishing an encoder model of a student network by adopting a computer; the encoder model of the student network comprises a first scale network model, a second scale network model, a third scale network model and a fourth scale network model, wherein the first scale network model comprises a first convolution layer and two RDB modules based on PA, and the second scale network model comprises a second convolution layer, two RDB modules based on PA and a feature fusion module; the third scale network model comprises a third convolution layer, two RDB modules based on PA and a feature fusion module; the fourth scale network model comprises a fourth convolution layer, two RDB modules based on PA and a feature fusion module;
step 202, adopting a computer to establish a decoder model of the student network; the decoder model of the student network comprises a first decoding network model, a second decoding network model, a third decoding network model, a fourth decoding network model and a fifth convolution layer, wherein the first decoding network model comprises two RDB modules based on PA, and the second decoding network model comprises a first transpose convolution layer, two RDB modules based on PA and a feature fusion module; the third decoding network model comprises a second transpose convolution layer, two RDB modules based on PA and a feature fusion module; the fourth decoding network model comprises a third transpose convolution layer, two RDB modules based on PA and a feature fusion module;
Step three, extracting features of the foggy training images:
step 301, extracting features of the foggy training image I through a first scale network model by using a computer to obtain a first scale feature map F e1
Step 302, adopting a computer to make the first scale feature map F e1 Feature extraction is carried out through a second scale network model, and a second scale feature map F is obtained e2
Step 303, adopting a computer to make the second scale feature map F e2 Feature extraction is carried out through a third-scale network model, and a third-scale feature map F is obtained e3
Step 304, computer-integrating the third scale feature map F e3 Feature extraction is carried out through a fourth-scale network model, and a fourth-scale feature map F is obtained e4
Step 305, using a computer to map the fourth scale feature map F e4 Extracting features through the first decoding network model to obtain a first decoding feature map F d1
Step 306, using a computer to decode the first decoding feature map F d1 Feature extraction is carried out through a second decoding network model to obtain a second decoding feature map F d2
Step 307, using a computer to decode the feature map F d2 Extracting features through a third decoding network model to obtain a third decoding feature map F d3
Step 308, using a computer to decode the third decoding feature map F d3 Feature extraction is carried out through a fourth decoding network model to obtain a fourth decoding feature map F d4 The method comprises the steps of carrying out a first treatment on the surface of the Computer-implemented fourth decoding feature map F d4 Feature extraction is carried out through fifth convolution, and an output defogging image out is obtained;
Step 309, using a computer, the foggy training image I is processed by the EPDN teacher network model to obtain the defogging image out_EP output by the EPDN teacher network, and the feature map output by the global sub-generator in the EPDN teacher network model is recorded as the EPDN teacher network intermediate output feature map EP_1;
using a computer, the foggy training image I is processed by the PSD teacher network model to obtain the defogging image out_PS output by the PSD teacher network, and the feature map output by the main network in the PSD teacher network model is recorded as the PSD teacher network intermediate output feature map PS_2;
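Steps 301 to 309 amount to a four-scale encoder, a four-stage decoder, a final three-channel convolution, plus two frozen teacher passes (shown later in the training sketch). The PyTorch skeleton below sketches that data flow under the assumption that the scale and decoding network models of steps 201 and 202 are available as nn.Module instances; all class and attribute names are hypothetical.

```python
import torch.nn as nn

class StudentNet(nn.Module):
    """Data flow of steps 301-308; submodule implementations are elided."""
    def __init__(self, scale1, scale2, scale3, scale4,
                 dec1, dec2, dec3, dec4, final_conv):
        super().__init__()
        self.scale1, self.scale2, self.scale3, self.scale4 = scale1, scale2, scale3, scale4
        self.dec1, self.dec2, self.dec3, self.dec4 = dec1, dec2, dec3, dec4
        self.final_conv = final_conv  # the fifth convolution layer

    def forward(self, hazy):
        f_e1 = self.scale1(hazy)              # step 301
        f_e2 = self.scale2(f_e1)              # step 302 (0.5x skip handled inside)
        f_e3 = self.scale3(f_e2, f_e1)        # step 303 (0.5x and 0.25x skips)
        f_e4 = self.scale4(f_e3, f_e2, f_e1)  # step 304
        f_d1 = self.dec1(f_e4)                # step 305
        f_d2 = self.dec2(f_d1)                # step 306 (2x skip handled inside)
        f_d3 = self.dec3(f_d2, f_d1)          # step 307
        f_d4 = self.dec4(f_d3, f_d2, f_d1)    # step 308
        out = self.final_conv(f_d4)           # output defogging image
        return out, f_d2, f_d3                # f_d2 / f_d3 feed the distillation loss
```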
Step four, establishing a total loss function:
Step 401, using a computer, the perceptual loss function L_per is obtained according to L_per = Σ_{i=1}^{n} [1/(C_i·H_i·W_i)]·(Φ_i(gt), Φ_i(out))_L1; where i is a positive integer, n = 5, Φ_i(gt) denotes the feature map output by the Relu i_1 layer of the VGG19 network model for the fog-free training image gt corresponding to the foggy training image I, Φ_i(out) denotes the feature map output by the Relu i_1 layer of the VGG19 network model for the output defogging image out of the student network model, and 1 ≤ i ≤ 5; C_i, H_i and W_i denote the channel number, length and width of the feature map output by the Relu i_1 layer, respectively; (Φ_i(gt), Φ_i(out))_L1 denotes the Manhattan distance between the two feature maps output by the Relu i_1 layer of the VGG19 network model;
Step 402, using a computer, the distillation loss function L_dist is obtained according to L_dist = (out, out_EP)_L1 + (out, out_PS)_L1 + 0.25·(EP_1, F_d2)_L1 + 0.5·(PS_2, F_d3)_L1; where (out, out_EP)_L1 denotes the Manhattan distance between the output defogging image out of the student network model and the defogging image out_EP output by the EPDN teacher network, (out, out_PS)_L1 denotes the Manhattan distance between the output defogging image out of the student network model and the defogging image out_PS output by the PSD teacher network, (EP_1, F_d2)_L1 denotes the Manhattan distance between the EPDN teacher network intermediate output feature map EP_1 and the second decoding feature map F_d2 of the student network model, and (PS_2, F_d3)_L1 denotes the Manhattan distance between the PSD teacher network intermediate output feature map PS_2 and the third decoding feature map F_d3 of the student network model;
Step 403, using a computer, the total loss function L_loss is obtained according to L_loss = 0.1·L_per + L_dist.
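A hedged PyTorch rendering of steps 401 to 403 is sketched below. The Relu i_1 layer indices are assumptions based on the standard torchvision VGG19 layout, and F.l1_loss with mean reduction stands in for the per-layer normalization by C_i·H_i·W_i (it additionally averages over the batch).

```python
import torch.nn.functional as F
from torchvision.models import vgg19

# Indices of the Relu i_1 activations in torchvision's vgg19().features
# (assumed from its standard layout: relu1_1 .. relu5_1).
RELU_I1 = [1, 6, 11, 20, 29]
_vgg = vgg19(weights='DEFAULT').features.eval()
for p in _vgg.parameters():
    p.requires_grad_(False)

def vgg_feats(x):
    feats, h = [], x
    for i, layer in enumerate(_vgg):
        h = layer(h)
        if i in RELU_I1:
            feats.append(h)
    return feats

def perceptual_loss(out, gt):
    # Step 401: mean-reduced L1 per layer ~ Manhattan distance / (C_i*H_i*W_i)
    return sum(F.l1_loss(a, b) for a, b in zip(vgg_feats(out), vgg_feats(gt)))

def distill_loss(out, out_ep, out_ps, ep1, f_d2, ps2, f_d3):
    # Step 402: the 0.25 and 0.5 weights are the patent's
    return (F.l1_loss(out, out_ep) + F.l1_loss(out, out_ps)
            + 0.25 * F.l1_loss(ep1, f_d2) + 0.5 * F.l1_loss(ps2, f_d3))

def total_loss(out, gt, out_ep, out_ps, ep1, f_d2, ps2, f_d3):
    # Step 403: L_loss = 0.1 * L_per + L_dist
    return 0.1 * perceptual_loss(out, gt) + distill_loss(
        out, out_ep, out_ps, ep1, f_d2, ps2, f_d3)
```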
Training the student network model by the foggy training image:
Step 501, using a computer, the student network model is iteratively optimized with the Adam optimization algorithm and the total loss function L_loss until the whole training set has been traversed, completing one iteration of training;
Step 502, the iterative training of step 501 is repeated until the preset number of training iterations is reached, obtaining the trained student network model;
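Step five could look as follows in PyTorch. The Adam optimizer and the 30-iteration budget come from the patent (see the embodiment notes below); the learning rate, device handling and the convention that each teacher returns its defogged output together with its intermediate feature map are assumptions.

```python
import torch

def train(student, epdn, psd, loader, iterations=30, lr=1e-4, device='cuda'):
    """Step five: Adam plus L_loss; lr and device handling are assumptions."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    student.to(device).train()
    for _ in range(iterations):            # preset number of training iterations
        for hazy, clear in loader:         # one pass over the whole training set
            hazy, clear = hazy.to(device), clear.to(device)
            out, f_d2, f_d3 = student(hazy)
            with torch.no_grad():          # frozen teachers only supply targets
                out_ep, ep1 = epdn(hazy)   # assumed to return (image, EP_1)
                out_ps, ps2 = psd(hazy)    # assumed to return (image, PS_2)
            loss = total_loss(out, clear, out_ep, out_ps, ep1, f_d2, ps2, f_d3)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student
```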
step six, defogging the single image by using the trained student network model:
Using a computer, any foggy image is input into the trained student network model for defogging to obtain the corresponding fog-free image.
The above single image defogging method based on multi-teacher knowledge distillation is further characterized in that: in step 201, the number of convolution kernels in the first convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the number of convolution kernels in the second convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, and the padding is 1;
the PA-based RDB module in step 201 comprises a first Conv+ReLU layer, a Conv1 convolution layer, an RDB module, a Conv2 convolution layer and a Sigmoid activation function layer; the number of convolution kernels in the first Conv+ReLU layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1; the number of convolution kernels in the Conv1 convolution layer is 32, the size of the convolution kernels is 1×1, the sliding step size is 1, and the padding is 0; the number of convolution kernels in the Conv2 convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
The number of convolution kernels in the third convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step length is 2, and the padding is 1;
the number of convolution kernels in the fourth convolution layer is 256, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 2, and the padding is 1;
the feature fusion module in step 201 comprises a first Conv+InstanceNorm normalization+ReLU activation function layer and a second Conv+InstanceNorm normalization+ReLU activation function layer;
in step 202, the number of convolution kernels in the first transpose convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the second transpose convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the third transpose convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the fifth convolution layer is 3, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
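The hyperparameters above map directly onto nn.Conv2d and nn.ConvTranspose2d definitions ('out_padding' corresponds to PyTorch's output_padding). In the sketch below the output channel counts, kernel sizes, strides and paddings are the claimed values, while the input channel counts are inferred from the surrounding architecture rather than stated in the patent.

```python
import torch.nn as nn

# Encoder convolutions: claimed kernel counts, 3x3 kernels, strides, paddings;
# the input channel counts are inferences, not claimed values.
conv1 = nn.Conv2d(3,   32,  3, stride=1, padding=1)   # first convolution layer
conv2 = nn.Conv2d(32,  64,  3, stride=2, padding=1)   # second (downsamples 2x)
conv3 = nn.Conv2d(64,  128, 3, stride=2, padding=1)   # third
conv4 = nn.Conv2d(128, 256, 3, stride=2, padding=1)   # fourth

# Decoder transpose convolutions ('out_padding' -> output_padding)
tconv1 = nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1)
tconv2 = nn.ConvTranspose2d(128, 64,  3, stride=2, padding=1, output_padding=1)
tconv3 = nn.ConvTranspose2d(64,  32,  3, stride=2, padding=1, output_padding=1)

# Fifth convolution layer: back to a 3-channel image
conv5 = nn.Conv2d(32, 3, 3, stride=1, padding=1)
```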
The above single image defogging method based on multi-teacher knowledge distillation is further characterized in that: in step 301, a computer is used to perform feature extraction on the foggy training image I through the first scale network model to obtain the first scale feature map F_e1; the specific process is as follows:
Step 3011, using a computer, feature extraction is performed on the foggy training image I through the first convolution layer to obtain the input feature map F_in;
Step 3012, the computer inputs the input feature map F_in into one PA-based RDB module for feature extraction to obtain the intermediate output feature map F_out;
Step 3013, following the method described in step 3012, the computer inputs the intermediate output feature map F_out into the other PA-based RDB module for feature extraction to obtain the first scale feature map F_e1.
The above single image defogging method based on multi-teacher knowledge distillation is further characterized in that: in step 302, the first scale feature map F_e1 is subjected to feature extraction through the second scale network model to obtain the second scale feature map F_e2; the specific process is as follows:
Step 3021, using a computer, the first scale feature map F_e1 is subjected to feature extraction through the second convolution layer to obtain a second input feature map;
step 3022, inputting the second input feature map into a PA-based RDB module in the second scale network model by the computer to perform feature extraction, so as to obtain a second scale first coding feature map;
step 3023, inputting the second-scale first coding feature map into another PA-based RDB module in the second-scale network model by the computer for feature extraction, so as to obtain a second-scale second coding feature map;
Step 3024, the computer maps the first scale feature map F e1 Performing 0.5 times downsampling to obtain a first downsampling feature map;
step 3025, calling a splicing cat function module by a computer to splice the first downsampling feature map and the second-scale second coding feature map to obtain a first spliced feature map;
step 3026, inputting the first spliced feature map into a feature fusion module in the second scale network model by using a computer to obtain a second scale feature map F e2
Computer-implemented step 303 of mapping the second scale feature map F e2 Feature extraction is carried out through a third-scale network model to obtain a third-scale feature map F e3 The specific process is as follows:
step 3031, using a computer to map the second scale feature map F e2 Extracting features through a third convolution layer to obtain a third input feature map;
step 3032, the computer inputs the third input feature map into a PA-based RDB module in the third-scale network model to perform feature extraction to obtain a third-scale first coding feature map;
Step 3033, the computer inputs the third-scale first coding feature map into the other PA-based RDB module in the third-scale network model for feature extraction to obtain a third-scale second coding feature map;
Step 3034, the computer performs 0.5× downsampling on the second scale feature map F_e2 to obtain a second downsampling feature map;
the computer performs 0.25× downsampling on the first scale feature map F_e1 to obtain a third downsampling feature map;
step 3035, a computer is adopted to call a splicing cat function module to splice the second downsampling feature map, the third downsampling feature map and the third-scale second coding feature map to obtain a second spliced feature map;
Step 3036, using a computer, the second spliced feature map is input into the feature fusion module in the third-scale network model to obtain the third scale feature map F_e3;
In step 304, the third scale feature map F_e3 is subjected to feature extraction through the fourth scale network model to obtain the fourth scale feature map F_e4; the specific process is as follows:
Step 3041, using a computer, the third scale feature map F_e3 is subjected to feature extraction through the fourth convolution layer to obtain a fourth input feature map;
step 3042, inputting the fourth input feature map into a PA-based RDB module in a fourth-scale network model by a computer to perform feature extraction, so as to obtain a fourth-scale first coding feature map;
step 3043, inputting the fourth-scale first coding feature map into another RDB module based on PA in the fourth-scale network model by a computer for feature extraction to obtain a fourth-scale second coding feature map;
Step 3044, the computer performs 0.5× downsampling on the third scale feature map F_e3 to obtain a fourth downsampling feature map;
the computer performs 0.25× downsampling on the second scale feature map F_e2 to obtain a fifth downsampling feature map;
the computer performs 0.125× downsampling on the first scale feature map F_e1 to obtain a sixth downsampling feature map;
step 3045, calling a splicing cat function module by a computer to splice the fourth downsampling feature map, the fifth downsampling feature map, the sixth downsampling feature map and the fourth-scale second coding feature map to obtain a third spliced feature map;
Step 3046, using a computer, the third spliced feature map is input into the feature fusion module in the fourth-scale network model to obtain the fourth scale feature map F_e4.
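Steps 3041 to 3046 downsample every earlier scale to the current resolution, concatenate along the channel axis, and fuse. A sketch of the fourth-scale case follows; bilinear interpolation is an assumed choice of downsampling, and all module arguments are hypothetical handles to the layers defined in step 201.

```python
import torch
import torch.nn.functional as F

def fourth_scale_forward(f_e1, f_e2, f_e3, conv4, pa_rdb_a, pa_rdb_b, fusion):
    x = conv4(f_e3)                          # step 3041: fourth input feature map
    x = pa_rdb_b(pa_rdb_a(x))                # steps 3042-3043: two PA-based RDBs
    d4 = F.interpolate(f_e3, scale_factor=0.5, mode='bilinear', align_corners=False)
    d5 = F.interpolate(f_e2, scale_factor=0.25, mode='bilinear', align_corners=False)
    d6 = F.interpolate(f_e1, scale_factor=0.125, mode='bilinear', align_corners=False)
    cat = torch.cat([d4, d5, d6, x], dim=1)  # step 3045: channel concatenation
    return fusion(cat)                       # step 3046: fourth scale feature map F_e4
```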
The above single image defogging method based on multi-teacher knowledge distillation is further characterized in that: in step 3012, the computer inputs the input feature map F_in into the PA-based RDB module for feature extraction to obtain the intermediate output feature map F_out; the specific process is as follows:
Step A, the computer performs feature extraction on the input feature map F_in through the first Conv+ReLU layer to obtain the feature map F_pre;
Step B, the computer inputs the feature map F_pre into the Conv1 convolution layer and the RDB module for feature extraction to obtain the feature map F_RDB; at the same time, the feature map F_pre is input into the Conv2 convolution layer for convolution processing and normalized by the Sigmoid activation function to obtain the spatial weight map F_s;
Step C, the computer is according toObtaining a characteristic diagram F mid The method comprises the steps of carrying out a first treatment on the surface of the Wherein (1)>Hadamard product operation between matrices representing feature maps,/->Representing addition operations between feature map matrices;
Step D, the computer obtains the intermediate output feature map F_out according to F_out = F_in ⊕ F_mid.
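Steps A to D could be realized as the following PyTorch module. The gating in step C follows the formula above; the final skip from F_in in step D is an assumption, since the patent's original formula image for that step is not reproduced here.

```python
import torch.nn as nn

class PABasedRDB(nn.Module):
    """Steps A-D: Conv+ReLU stem, an RDB branch gated by a Sigmoid spatial
    weight map, then residual additions. The 32-channel width follows the
    claimed kernel counts; the F_in skip in step D is an assumption."""
    def __init__(self, rdb, channels=32):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(channels, channels, 3, 1, 1), nn.ReLU())
        self.conv1 = nn.Conv2d(channels, channels, 1, 1, 0)  # 1x1, feeds the RDB
        self.rdb = rdb                                       # residual dense block
        self.conv2 = nn.Conv2d(channels, channels, 3, 1, 1)  # weight-map branch
        self.sigmoid = nn.Sigmoid()

    def forward(self, f_in):
        f_pre = self.stem(f_in)                 # step A
        f_rdb = self.rdb(self.conv1(f_pre))     # step B, main branch
        f_s = self.sigmoid(self.conv2(f_pre))   # step B, spatial weight map
        f_mid = f_rdb * f_s + f_pre             # step C: Hadamard product, then add
        return f_in + f_mid                     # step D (assumed residual skip)
```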
The above single image defogging method based on multi-teacher knowledge distillation is further characterized in that: in step 3026, step 3036 and step 3046, the first spliced feature map, the second spliced feature map and the third spliced feature map are each recorded as the spliced feature map, and the second scale feature map F_e2, the third scale feature map F_e3 and the fourth scale feature map F_e4 are each recorded as the fused scale feature map; the computer inputs the spliced feature map into the feature fusion module to obtain the fused scale feature map, the specific process being as follows:
Step A1, using a computer, the spliced feature map is subjected to feature processing through the first Conv+InstanceNorm normalization+ReLU activation function layer to obtain a fusion coding feature map;
Step A2, using a computer, the fusion coding feature map is subjected to feature processing through the second Conv+InstanceNorm normalization+ReLU activation function layer to obtain the fused scale feature map.
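Steps A1 and A2 describe a two-stage Conv + InstanceNorm + ReLU block. A possible factory function is sketched below; the convention that Conv3 keeps the concatenated channel count and Conv4 projects down to the scale width matches the kernel counts listed in the embodiment (e.g. 96 then 64 at the second scale).

```python
import torch.nn as nn

def make_fusion(cat_ch, out_ch):
    """Two Conv + InstanceNorm + ReLU stages (steps A1-A2): Conv3 keeps the
    concatenated width, Conv4 projects down to the scale width."""
    return nn.Sequential(
        nn.Conv2d(cat_ch, cat_ch, 3, 1, 1), nn.InstanceNorm2d(cat_ch), nn.ReLU(),
        nn.Conv2d(cat_ch, out_ch, 3, 1, 1), nn.InstanceNorm2d(out_ch), nn.ReLU(),
    )

fuse2 = make_fusion(96, 64)  # second-scale fusion: 32 + 64 concatenated channels
```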
The above single image defogging method based on multi-teacher knowledge distillation is further characterized in that: in step 305, the fourth scale feature map F_e4 is subjected to feature extraction through the first decoding network model to obtain the first decoding feature map F_d1; the specific process is as follows:
Step 3051, using a computer, the fourth scale feature map F_e4 is subjected to feature extraction through one PA-based RDB module in the first decoding network model to obtain a first pre-decoding feature map;
Step 3052, the computer inputs the first pre-decoding feature map into the other PA-based RDB module in the first decoding network model for feature extraction to obtain the first decoding feature map F_d1;
Step 306, the specific process is as follows:
Step 3061, using a computer, the first decoding feature map F_d1 is subjected to feature extraction through the first transpose convolution layer to obtain a first decoding first up-sampling feature map;
Step 3062, using a computer, the first decoding first up-sampling feature map is subjected to feature extraction through the two PA-based RDB modules in the second decoding network model to obtain a first intermediate feature map;
Step 3063, using a computer, the first decoding feature map F_d1 is subjected to 2× up-sampling to obtain a first decoding second up-sampling feature map;
Step 3064, the computer calls the splicing cat function module to splice the first intermediate feature map and the first decoding second up-sampling feature map to obtain a first decoding spliced feature map;
Step 3065, using a computer, the first decoding spliced feature map is input into the feature fusion module in the second decoding network model to obtain the second decoding feature map F_d2;
Step 307, the specific process is as follows:
Step 3071, using a computer, the second decoding feature map F_d2 is subjected to feature extraction through the second transpose convolution layer to obtain a second decoding first up-sampling feature map;
Step 3072, using a computer, the second decoding first up-sampling feature map is subjected to feature extraction through the two PA-based RDB modules in the third decoding network model to obtain a second intermediate feature map;
Step 3073, using a computer, the first decoding feature map F_d1 is subjected to 4× up-sampling to obtain a second decoding second up-sampling feature map;
the second decoding feature map F_d2 is subjected to 2× up-sampling to obtain a second decoding third up-sampling feature map;
Step 3074, the computer calls the splicing cat function module to splice the second intermediate feature map, the second decoding second up-sampling feature map and the second decoding third up-sampling feature map to obtain a second decoding spliced feature map;
Step 3075, using a computer, the second decoding spliced feature map is input into the feature fusion module in the third decoding network model to obtain the third decoding feature map F_d3;
Step 308, the specific process is as follows:
Step 3081, using a computer, the third decoding feature map F_d3 is subjected to feature extraction through the third transpose convolution layer to obtain a third decoding first up-sampling feature map;
Step 3082, using a computer, the third decoding first up-sampling feature map is subjected to feature extraction through the two PA-based RDB modules in the fourth decoding network model to obtain a third intermediate feature map;
Step 3083, using a computer, the first decoding feature map F_d1 is subjected to 8× up-sampling to obtain a third decoding second up-sampling feature map;
the second decoding feature map F_d2 is subjected to 4× up-sampling to obtain a third decoding third up-sampling feature map;
the third decoding feature map F_d3 is subjected to 2× up-sampling to obtain a third decoding fourth up-sampling feature map;
Step 3084, the computer calls the splicing cat function module to splice the third intermediate feature map, the third decoding second up-sampling feature map, the third decoding third up-sampling feature map and the third decoding fourth up-sampling feature map to obtain a third decoding spliced feature map;
Step 3085, using a computer, the third decoding spliced feature map is input into the feature fusion module in the fourth decoding network model to obtain the fourth decoding feature map F_d4.
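The decoding stages mirror the encoder: a transpose convolution, two PA-based RDB modules, up-sampling of all earlier decoding feature maps, concatenation and fusion. A sketch of the fourth decoding stage (steps 3081 to 3085) follows, again with bilinear interpolation as an assumed up-sampling and hypothetical module handles.

```python
import torch
import torch.nn.functional as F

def fourth_decoding_forward(f_d1, f_d2, f_d3, tconv3, pa_rdb_a, pa_rdb_b, fusion):
    x = pa_rdb_b(pa_rdb_a(tconv3(f_d3)))     # steps 3081-3082
    u1 = F.interpolate(f_d1, scale_factor=8, mode='bilinear', align_corners=False)
    u2 = F.interpolate(f_d2, scale_factor=4, mode='bilinear', align_corners=False)
    u3 = F.interpolate(f_d3, scale_factor=2, mode='bilinear', align_corners=False)
    cat = torch.cat([x, u1, u2, u3], dim=1)  # step 3084: channel concatenation
    return fusion(cat)                       # step 3085: fourth decoding feature map F_d4
```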
Compared with the prior art, the invention has the following advantages:
1. The method has simple steps and a reasonable design: first, the training set images are acquired; then the student network model is established, features are extracted from the foggy training images, the total loss function is established, and the student network model is trained with the foggy training images; finally, a single image is defogged using the trained student network model, improving the defogging effect of the image.
2. The student network model adopts feature-attention residual dense blocks for multi-scale feature extraction and generates the defogged image end to end, thereby exploiting the advantages of neural networks and providing stronger generalization capability.
3. The encoder model in the student network model comprises the first, second, third and fourth scale network models, and the decoder model comprises the first, second, third and fourth decoding network models. Four-scale downsampling feature extraction is realized by the encoder model and four-scale upsampling feature extraction by the decoder model, so that multi-scale information of the image is extracted, global and local features are effectively fused, and the defogging effect is further improved.
4. The invention adopts the EPDN teacher network model and the PSD teacher network model to realize knowledge migration from the teacher networks to the student network through multi-teacher knowledge distillation, so that the student network combines the complementary advantages of prior-information-based and deep-learning-based image defogging methods.
In summary, the method has simple steps and reasonable design, the EPDN teacher network model and the PSD teacher network model are used for guiding and training the student network model, the feature extraction capability of the student network is effectively improved, the student network model is used for extracting multi-scale information of the defogging image through four-scale encoding and decoding, the global and local features of the defogging image are effectively fused, and the defogging effect of the image is further improved.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a block flow diagram of the method of the present invention.
Fig. 2 is a schematic diagram of the structure of the student network model of the present invention.
Fig. 3 is a schematic structural diagram of the feature-attention residual dense block of the present invention.
Fig. 4 is a schematic structural diagram of a feature fusion module according to the present invention.
Detailed Description
As shown in fig. 1 to 4, the single image defogging method based on multi-teacher knowledge distillation of the present invention comprises the following steps:
step one, acquiring a training set image:
selecting an indoor training set from the foggy day image database RESIDE; the indoor training set comprises foggy training images and the corresponding fog-free training images, and the number of foggy training images equals the number of fog-free training images;
step two, establishing a student network model:
the method for establishing the student network model comprises the following specific processes:
step 201, establishing an encoder model of a student network by adopting a computer; the encoder model of the student network comprises a first scale network model, a second scale network model, a third scale network model and a fourth scale network model, wherein the first scale network model comprises a first convolution layer and two RDB modules based on PA, and the second scale network model comprises a second convolution layer, two RDB modules based on PA and a feature fusion module; the third scale network model comprises a third convolution layer, two RDB modules based on PA and a feature fusion module; the fourth scale network model comprises a fourth convolution layer, two RDB modules based on PA and a feature fusion module;
Step 202, adopting a computer to establish a decoder model of the student network; the decoder model of the student network comprises a first decoding network model, a second decoding network model, a third decoding network model, a fourth decoding network model and a fifth convolution layer, wherein the first decoding network model comprises two RDB modules based on PA, and the second decoding network model comprises a first transpose convolution layer, two RDB modules based on PA and a feature fusion module; the third decoding network model comprises a second transpose convolution layer, two RDB modules based on PA and a feature fusion module; the fourth decoding network model comprises a third transpose convolution layer, two RDB modules based on PA and a feature fusion module;
step three, extracting features of the foggy training images:
Step 301, using a computer, the foggy training image I is subjected to feature extraction through the first scale network model to obtain the first scale feature map F_e1;
Step 302, using a computer, the first scale feature map F_e1 is subjected to feature extraction through the second scale network model to obtain the second scale feature map F_e2;
Step 303, using a computer, the second scale feature map F_e2 is subjected to feature extraction through the third scale network model to obtain the third scale feature map F_e3;
Step 304, using a computer, the third scale feature map F_e3 is subjected to feature extraction through the fourth scale network model to obtain the fourth scale feature map F_e4;
Step 305, using a computer, the fourth scale feature map F_e4 is subjected to feature extraction through the first decoding network model to obtain the first decoding feature map F_d1;
Step 306, using a computer, the first decoding feature map F_d1 is subjected to feature extraction through the second decoding network model to obtain the second decoding feature map F_d2;
Step 307, using a computer, the second decoding feature map F_d2 is subjected to feature extraction through the third decoding network model to obtain the third decoding feature map F_d3;
Step 308, using a computer, the third decoding feature map F_d3 is subjected to feature extraction through the fourth decoding network model to obtain the fourth decoding feature map F_d4; using a computer, the fourth decoding feature map F_d4 is subjected to feature extraction through the fifth convolution layer to obtain the output defogging image out;
Step 309, using a computer, the foggy training image I is processed by the EPDN teacher network model to obtain the defogging image out_EP output by the EPDN teacher network, and the feature map output by the global sub-generator in the EPDN teacher network model is recorded as the EPDN teacher network intermediate output feature map EP_1;
using a computer, the foggy training image I is processed by the PSD teacher network model to obtain the defogging image out_PS output by the PSD teacher network, and the feature map output by the main network in the PSD teacher network model is recorded as the PSD teacher network intermediate output feature map PS_2;
Step four, establishing a total loss function:
Step 401, using a computer, the perceptual loss function L_per is obtained according to L_per = Σ_{i=1}^{n} [1/(C_i·H_i·W_i)]·(Φ_i(gt), Φ_i(out))_L1; where i is a positive integer, n = 5, Φ_i(gt) denotes the feature map output by the Relu i_1 layer of the VGG19 network model for the fog-free training image gt corresponding to the foggy training image I, Φ_i(out) denotes the feature map output by the Relu i_1 layer of the VGG19 network model for the output defogging image out of the student network model, and 1 ≤ i ≤ 5; C_i, H_i and W_i denote the channel number, length and width of the feature map output by the Relu i_1 layer, respectively; (Φ_i(gt), Φ_i(out))_L1 denotes the Manhattan distance between the two feature maps output by the Relu i_1 layer of the VGG19 network model;
Step 402, using a computer, the distillation loss function L_dist is obtained according to L_dist = (out, out_EP)_L1 + (out, out_PS)_L1 + 0.25·(EP_1, F_d2)_L1 + 0.5·(PS_2, F_d3)_L1; where (out, out_EP)_L1 denotes the Manhattan distance between the output defogging image out of the student network model and the defogging image out_EP output by the EPDN teacher network, (out, out_PS)_L1 denotes the Manhattan distance between the output defogging image out of the student network model and the defogging image out_PS output by the PSD teacher network, (EP_1, F_d2)_L1 denotes the Manhattan distance between the EPDN teacher network intermediate output feature map EP_1 and the second decoding feature map F_d2 of the student network model, and (PS_2, F_d3)_L1 denotes the Manhattan distance between the PSD teacher network intermediate output feature map PS_2 and the third decoding feature map F_d3 of the student network model;
Step 403, using a computer, the total loss function L_loss is obtained according to L_loss = 0.1·L_per + L_dist.
Training the student network model by the foggy training image:
Step 501, using a computer, the student network model is iteratively optimized with the Adam optimization algorithm and the total loss function L_loss until the whole training set has been traversed, completing one iteration of training;
Step 502, the iterative training of step 501 is repeated until the preset number of training iterations is reached, obtaining the trained student network model;
step six, defogging the single image by using the trained student network model:
Using a computer, any foggy image is input into the trained student network model for defogging to obtain the corresponding fog-free image.
In this embodiment, in step 201, the number of convolution kernels in the first convolution layer is 32, the size of the convolution kernel is 3×3, the sliding step size is 1, and the padding is 1;
the number of convolution kernels in the second convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, and the padding is 1;
The PA-based RDB module in step 201 comprises a first Conv+ReLU layer, a Conv1 convolution layer, an RDB module, a Conv2 convolution layer and a Sigmoid activation function layer; the number of convolution kernels in the first Conv+ReLU layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1; the number of convolution kernels in the Conv1 convolution layer is 32, the size of the convolution kernels is 1×1, the sliding step size is 1, and the padding is 0; the number of convolution kernels in the Conv2 convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the number of convolution kernels in the third convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step length is 2, and the padding is 1;
the number of convolution kernels in the fourth convolution layer is 256, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 2, and the padding is 1;
the feature fusion module in step 201 comprises a first Conv+InstanceNorm normalization+ReLU activation function layer and a second Conv+InstanceNorm normalization+ReLU activation function layer;
in step 202, the number of convolution kernels in the first transpose convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the second transpose convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
The number of convolution kernels in the third transpose convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the fifth convolution layer is 3, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
In this embodiment, in step 301, a computer is used to perform feature extraction on the foggy training image I through the first scale network model to obtain the first scale feature map F_e1; the specific process is as follows:
Step 3011, using a computer, feature extraction is performed on the foggy training image I through the first convolution layer to obtain the input feature map F_in;
Step 3012, the computer inputs the input feature map F_in into one PA-based RDB module for feature extraction to obtain the intermediate output feature map F_out;
Step 3013, following the method described in step 3012, the computer inputs the intermediate output feature map F_out into the other PA-based RDB module for feature extraction to obtain the first scale feature map F_e1.
In this embodiment, in step 302, the first scale feature map F_e1 is subjected to feature extraction through the second scale network model to obtain the second scale feature map F_e2; the specific process is as follows:
Step 3021, using a computer, the first scale feature map F_e1 is subjected to feature extraction through the second convolution layer to obtain a second input feature map;
step 3022, inputting the second input feature map into a PA-based RDB module in the second scale network model by the computer to perform feature extraction, so as to obtain a second scale first coding feature map;
step 3023, inputting the second-scale first coding feature map into another PA-based RDB module in the second-scale network model by the computer for feature extraction, so as to obtain a second-scale second coding feature map;
Step 3024, the computer performs 0.5× downsampling on the first scale feature map F_e1 to obtain a first downsampling feature map;
Step 3025, the computer calls the splicing cat function module to splice the first downsampling feature map and the second-scale second coding feature map to obtain a first spliced feature map;
Step 3026, using a computer, the first spliced feature map is input into the feature fusion module in the second scale network model to obtain the second scale feature map F_e2;
In step 303, the second scale feature map F_e2 is subjected to feature extraction through the third scale network model to obtain the third scale feature map F_e3; the specific process is as follows:
Step 3031, using a computer, the second scale feature map F_e2 is subjected to feature extraction through the third convolution layer to obtain a third input feature map;
step 3032, the computer inputs the third input feature map into a PA-based RDB module in the third-scale network model to perform feature extraction to obtain a third-scale first coding feature map;
Step 3033, the computer inputs the third-scale first coding feature map into the other PA-based RDB module in the third-scale network model for feature extraction to obtain a third-scale second coding feature map;
Step 3034, the computer performs 0.5× downsampling on the second scale feature map F_e2 to obtain a second downsampling feature map;
the computer performs 0.25× downsampling on the first scale feature map F_e1 to obtain a third downsampling feature map;
step 3035, a computer is adopted to call a splicing cat function module to splice the second downsampling feature map, the third downsampling feature map and the third-scale second coding feature map to obtain a second spliced feature map;
Step 3036, using a computer, the second spliced feature map is input into the feature fusion module in the third-scale network model to obtain the third scale feature map F_e3;
In step 304, the third scale feature map F_e3 is subjected to feature extraction through the fourth scale network model to obtain the fourth scale feature map F_e4; the specific process is as follows:
Step 3041, using a computer, the third scale feature map F_e3 is subjected to feature extraction through the fourth convolution layer to obtain a fourth input feature map;
step 3042, inputting the fourth input feature map into a PA-based RDB module in a fourth-scale network model by a computer to perform feature extraction, so as to obtain a fourth-scale first coding feature map;
step 3043, inputting the fourth-scale first coding feature map into another RDB module based on PA in the fourth-scale network model by a computer for feature extraction to obtain a fourth-scale second coding feature map;
Step 3044, the computer performs 0.5× downsampling on the third scale feature map F_e3 to obtain a fourth downsampling feature map;
the computer performs 0.25× downsampling on the second scale feature map F_e2 to obtain a fifth downsampling feature map;
the computer performs 0.125× downsampling on the first scale feature map F_e1 to obtain a sixth downsampling feature map;
step 3045, calling a splicing cat function module by a computer to splice the fourth downsampling feature map, the fifth downsampling feature map, the sixth downsampling feature map and the fourth-scale second coding feature map to obtain a third spliced feature map;
Step 3046, using a computer, the third spliced feature map is input into the feature fusion module in the fourth-scale network model to obtain the fourth scale feature map F_e4.
In this embodiment, in step 3012 the computer inputs the input feature map F_in into the PA-based RDB module for feature extraction to obtain the intermediate output feature map F_out; the specific process is as follows:
Step A, the computer performs feature extraction on the input feature map F_in through the first Conv+ReLU layer to obtain the feature map F_pre;
Step B, the computer inputs the feature map F_pre into the Conv1 convolution layer and the RDB module for feature extraction to obtain the feature map F_RDB; at the same time, the feature map F_pre is input into the Conv2 convolution layer for convolution processing and normalized by the Sigmoid activation function to obtain the spatial weight map F_s;
Step C, the computer is according toObtaining a characteristic diagram F mid The method comprises the steps of carrying out a first treatment on the surface of the Wherein (1)>Hadamard product operation between matrices representing feature maps,/->Representing addition operations between feature map matrices;
Step D, the computer obtains the intermediate output feature map F_out according to F_out = F_in ⊕ F_mid.
In this embodiment, in step 3026, step 3036 and step 3046, the first spliced feature map, the second spliced feature map and the third spliced feature map are each recorded as the spliced feature map, and the second scale feature map F_e2, the third scale feature map F_e3 and the fourth scale feature map F_e4 are each recorded as the fused scale feature map; the computer inputs the spliced feature map into the feature fusion module to obtain the fused scale feature map, the specific process being as follows:
Step A1, using a computer, the spliced feature map is subjected to feature processing through the first Conv+InstanceNorm normalization+ReLU activation function layer to obtain a fusion coding feature map;
Step A2, using a computer, the fusion coding feature map is subjected to feature processing through the second Conv+InstanceNorm normalization+ReLU activation function layer to obtain the fused scale feature map.
In this embodiment, in step 305, the fourth scale feature map F_e4 is subjected to feature extraction through the first decoding network model to obtain the first decoding feature map F_d1; the specific process is as follows:
Step 3051, using a computer, the fourth scale feature map F_e4 is subjected to feature extraction through one PA-based RDB module in the first decoding network model to obtain a first pre-decoding feature map;
Step 3052, the computer inputs the first pre-decoding feature map into the other PA-based RDB module in the first decoding network model for feature extraction to obtain the first decoding feature map F_d1;
Step 306, the specific process is as follows:
Step 3061, using a computer, the first decoding feature map F_d1 is subjected to feature extraction through the first transpose convolution layer to obtain a first decoding first up-sampling feature map;
Step 3062, using a computer, the first decoding first up-sampling feature map is subjected to feature extraction through the two PA-based RDB modules in the second decoding network model to obtain a first intermediate feature map;
Step 3063, using a computer, the first decoding feature map F_d1 is subjected to 2× up-sampling to obtain a first decoding second up-sampling feature map;
Step 3064, the computer calls the splicing cat function module to splice the first intermediate feature map and the first decoding second up-sampling feature map to obtain a first decoding spliced feature map;
Step 3065, using a computer, the first decoding spliced feature map is input into the feature fusion module in the second decoding network model to obtain the second decoding feature map F_d2;
Step 307, the specific process is as follows:
Step 3071, using a computer, the second decoding feature map F_d2 is subjected to feature extraction through the second transpose convolution layer to obtain a second decoding first up-sampling feature map;
Step 3072, using a computer, the second decoding first up-sampling feature map is subjected to feature extraction through the two PA-based RDB modules in the third decoding network model to obtain a second intermediate feature map;
Step 3073, using a computer, the first decoding feature map F_d1 is subjected to 4× up-sampling to obtain a second decoding second up-sampling feature map;
the second decoding feature map F_d2 is subjected to 2× up-sampling to obtain a second decoding third up-sampling feature map;
Step 3074, the computer calls the splicing cat function module to splice the second intermediate feature map, the second decoding second up-sampling feature map and the second decoding third up-sampling feature map to obtain a second decoding spliced feature map;
Step 3075, using a computer, the second decoding spliced feature map is input into the feature fusion module in the third decoding network model to obtain the third decoding feature map F_d3;
Step 308, the specific process is as follows:
Step 3081, using a computer, the third decoding feature map F_d3 is subjected to feature extraction through the third transpose convolution layer to obtain a third decoding first up-sampling feature map;
Step 3082, using a computer, the third decoding first up-sampling feature map is subjected to feature extraction through the two PA-based RDB modules in the fourth decoding network model to obtain a third intermediate feature map;
Step 3083, using a computer, the first decoding feature map F_d1 is subjected to 8× up-sampling to obtain a third decoding second up-sampling feature map;
the second decoding feature map F_d2 is subjected to 4× up-sampling to obtain a third decoding third up-sampling feature map;
the third decoding feature map F_d3 is subjected to 2× up-sampling to obtain a third decoding fourth up-sampling feature map;
Step 3084, the computer calls the splicing cat function module to splice the third intermediate feature map, the third decoding second up-sampling feature map, the third decoding third up-sampling feature map and the third decoding fourth up-sampling feature map to obtain a third decoding spliced feature map;
Step 3085, using a computer, the third decoding spliced feature map is input into the feature fusion module in the fourth decoding network model to obtain the fourth decoding feature map F_d4.
In this embodiment, it should be noted that the structures of the feature fusion modules in the second scale network model, the third scale network model and the fourth scale network model are the same and only the number of convolution kernels is different.
In this embodiment, it should be noted that the structures of the feature fusion modules in the second decoding network model, the third decoding network model, and the fourth decoding network model are the same and only the number of convolution kernels is different.
In this embodiment, the convolution layer in the first conv+instancenorm normalization+relu activation function layer in the second scale network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 96, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the convolution layer in the second Conv+InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+InstanceNorm normalization+ReLU activation function layer in the third scale network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 224, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the convolution layer in the second Conv+InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+InstanceNorm normalization+ReLU activation function layer in the fourth scale network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 480, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the convolution layer in the second Conv+InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 256, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+InstanceNorm normalization+ReLU activation function layer in the second decoding network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 384, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the convolution layer in the second Conv+InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+InstanceNorm normalization+ReLU activation function layer in the third decoding network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 448, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the convolution layer in the second Conv+InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
The convolution layer in the first Conv+InstanceNorm normalization+ReLU activation function layer in the fourth decoding network model is a Conv3 convolution layer, the number of convolution kernels in the Conv3 convolution layer is 480, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the convolution layer in the second Conv+InstanceNorm normalization+ReLU activation function layer is a Conv4 convolution layer, the number of convolution kernels in the Conv4 convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
In this embodiment, the PA module is a spatial attention mechanism module, and RDB denotes a residual dense block (Residual Dense Block).
In this embodiment, it should be noted that the Adam optimization algorithm (Adaptive Moment Estimation) is a first-order optimization algorithm that can replace the conventional stochastic gradient descent process and iteratively updates the model parameters based on the training data.
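As a minimal sketch of how such an optimizer is typically driven in step 501, assuming a hypothetical `student` network, a `train_loader` of paired images, and a `total_loss` implementing the L_loss of step 403; the learning rate and betas below are illustrative defaults, not values taken from this embodiment:

```python
import torch

# Hypothetical student network and paired (foggy, fog-free) data loader.
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4, betas=(0.9, 0.999))

for hazy, clear in train_loader:
    out = student(hazy)              # forward pass of the student network
    loss = total_loss(out, clear)    # L_loss = 0.1*L_per + L_dist (see step 403)
    optimizer.zero_grad()
    loss.backward()                  # first-order gradients
    optimizer.step()                 # adaptive moment update of the parameters
```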
In this embodiment, the number of foggy training images and the number of corresponding fog-free training images are both 13990.
In this embodiment, it should be noted that, in actual use, the symbol ⊙ refers to the Hadamard product between image matrices; for example, if the element in row i and column j of matrix A is a_ij, and the element in row i and column j of matrix B is b_ij, then C = A ⊙ B has elements c_ij = a_ij × b_ij, where A, B and C are matrices of the same dimensions.
In this embodiment, it should be noted that the preset number of iterative training rounds in step 502 is 30.
In this embodiment, when i = 1, Φ_1(gt) represents the feature map of the fog-free training image gt output by the Relu1_1 layer in the VGG19 network model, and Φ_1(out) represents the feature map of the defogged image out of the student network model output by the Relu1_1 layer in the VGG19 network model;
when i = 2, Φ_2(gt) and Φ_2(out) represent the corresponding feature maps output by the Relu2_1 layer in the VGG19 network model;
when i = 3, Φ_3(gt) and Φ_3(out) represent the corresponding feature maps output by the Relu3_1 layer;
when i = 4, Φ_4(gt) and Φ_4(out) represent the corresponding feature maps output by the Relu4_1 layer;
when i = 5, Φ_5(gt) and Φ_5(out) represent the corresponding feature maps output by the Relu5_1 layer.
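By way of non-limiting illustration, the Relu i_1 feature maps used above can be extracted from torchvision's VGG19 roughly as follows; the layer indices and the `weights` argument reflect common torchvision conventions and are assumptions of this sketch, not part of the described method:

```python
import torch
import torchvision

# Assumed positions of relu1_1 .. relu5_1 inside torchvision's vgg19().features.
RELU_I_1 = [1, 6, 11, 20, 29]

vgg = torchvision.models.vgg19(weights='IMAGENET1K_V1').features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)          # frozen loss network, no gradient updates

def vgg_features(x):
    """Return [Phi_1(x), ..., Phi_5(x)] for the perceptual loss."""
    feats = []
    for idx, layer in enumerate(vgg):
        x = layer(x)
        if idx in RELU_I_1:
            feats.append(x)
    return feats
```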
In this embodiment, the downsampling is nearest neighbor downsampling, and the upsampling is nearest neighbor upsampling.
In this embodiment, 0.5-times downsampling leaves the number of channels of the image unchanged and reduces the size of the image to 1/2 of the original; 0.25-times downsampling leaves the number of channels unchanged and reduces the size to 1/4 of the original; 0.125-times downsampling leaves the number of channels unchanged and reduces the size to 1/8 of the original.
In this embodiment, it should be noted that 2-times upsampling leaves the number of channels of the image unchanged and enlarges the size of the image to 2 times the original; 4-times upsampling leaves the number of channels unchanged and enlarges the size to 4 times the original; and 8-times upsampling leaves the number of channels unchanged and enlarges the size to 8 times the original.
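As a small illustrative check, nearest-neighbour resampling with these factors behaves exactly as described; `F.interpolate` is one common way to realize it (a sketch, not the embodiment's required implementation):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 32, 256, 256)   # batch x channels x height x width

# Channel count is preserved; only the spatial size changes.
d2 = F.interpolate(x, scale_factor=0.5,   mode='nearest')   # -> 1x32x128x128
d4 = F.interpolate(x, scale_factor=0.25,  mode='nearest')   # -> 1x32x64x64
d8 = F.interpolate(x, scale_factor=0.125, mode='nearest')   # -> 1x32x32x32
u2 = F.interpolate(x, scale_factor=2,     mode='nearest')   # -> 1x32x512x512
```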
In this embodiment, the foggy training image I is a three-channel RGB color image of size 3×256×256, and the corresponding fog-free training image is likewise of size 3×256×256.
In this embodiment, the EPDN teacher network output defogged image out_EP and the PSD teacher network output defogged image out_PS are both of size 3×256×256; the EPDN teacher network intermediate output feature map EP_1 has a size of 128×64×64, and the PSD teacher network intermediate output feature map PS_2 has a size of 64×128×128.
In the present embodiment, the feature maps F_pre, F_RDB, F_mid and F_s each have a size of 32×256×256.
In the present embodiment, the size of a feature map is expressed as the number of channels × length × width. The input feature map F_in has a size of 32×256×256, the output feature map F_out has a size of 32×256×256, and the first scale feature map F_e1 has a size of 32×256×256;
the second input feature map has a size of 64×128×128, the second-scale first coding feature map 64×128×128, the second-scale second coding feature map 64×128×128, the first downsampled feature map 32×128×128, the first spliced feature map 96×128×128, and the second scale feature map F_e2 64×128×128;
the third input feature map has a size of 128×64×64, the third-scale first coding feature map 128×64×64, the third-scale second coding feature map 128×64×64, the second downsampled feature map 64×64×64, the third downsampled feature map 32×64×64, the second spliced feature map 224×64×64, and the third scale feature map F_e3 128×64×64;
the fourth input feature map has a size of 256×32×32, the fourth-scale first coding feature map 256×32×32, the fourth-scale second coding feature map 256×32×32, the fourth downsampled feature map 128×32×32, the fifth downsampled feature map 64×32×32, the sixth downsampled feature map 32×32×32, the third spliced feature map 480×32×32, and the fourth scale feature map F_e4 256×32×32.
In the present embodiment, the first pre-decoding feature map and the first decoding feature map F_d1 each have a size of 256×32×32;
the first decoded first upsampled feature map has a size of 128×64×64, the first intermediate feature map 128×64×64, the first decoding second up-sampling feature map 256×64×64, the first decoding spliced feature map 384×64×64, and the second decoding feature map F_d2 128×64×64;
the second decoded first upsampled feature map has a size of 64×128×128, the second intermediate feature map 64×128×128, the second decoding second up-sampling feature map 256×128×128, the second decoding third up-sampling feature map 128×128×128, the second decoding spliced feature map 448×128×128, and the third decoding feature map F_d3 64×128×128;
the third decoded first upsampled feature map has a size of 32×256×256, the third intermediate feature map 32×256×256, the third decoding second up-sampling feature map 256×256×256, the third decoding third up-sampling feature map 128×256×256, the third decoding fourth up-sampling feature map 64×256×256, the third decoding spliced feature map 480×256×256, and the fourth decoding feature map F_d4 32×256×256;
the output defogged image out has a size of 3×256×256.
In this embodiment, it should be noted that the first decoding spliced feature map, the second decoding spliced feature map and the third decoding spliced feature map are each subjected to feature processing by the first Conv+InstanceNorm normalization+ReLU activation function layer and the second Conv+InstanceNorm normalization+ReLU activation function layer in the corresponding feature fusion module to obtain the second decoding feature map, the third decoding feature map and the fourth decoding feature map, respectively.
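A minimal sketch of such a feature fusion module, assuming it consists of exactly the two Conv+InstanceNorm+ReLU layers described above; the class name and constructor arguments are placeholders introduced here:

```python
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Two Conv + InstanceNorm + ReLU layers; mid_ch corresponds to the
    Conv3 kernel count and out_ch to the Conv4 kernel count listed above."""
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=3, stride=1, padding=1),
            nn.InstanceNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.InstanceNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)
```

For example, FeatureFusion(480, 480, 32) would match the fourth decoding network model (480-channel spliced input, 32-channel F_d4), and FeatureFusion(96, 96, 64) the second scale network model.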
In summary, the method has simple steps and a reasonable design. The EPDN teacher network model and the PSD teacher network model jointly guide the training of the student network model, effectively improving the feature extraction capability of the student network; the student network model extracts multi-scale information of the foggy image through four-scale encoding and decoding, effectively fusing the global and local features of the image and further improving the image defogging effect.
The foregoing description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and any simple modification, variation and equivalent structural changes made to the above embodiment according to the technical substance of the present invention still fall within the scope of the technical solution of the present invention.

Claims (7)

1. A single image defogging method based on multi-teacher knowledge distillation, which is characterized by comprising the following steps:
step one, acquiring a training set image:
selecting an indoor training set from the foggy-day image database RESIDE; the indoor training set comprises foggy training images and fog-free training images corresponding to the foggy training images, wherein the number of foggy training images and the number of fog-free training images are the same;
Step two, establishing a student network model:
the method for establishing the student network model comprises the following specific processes:
step 201, establishing an encoder model of a student network by adopting a computer; the encoder model of the student network comprises a first scale network model, a second scale network model, a third scale network model and a fourth scale network model, wherein the first scale network model comprises a first convolution layer and two RDB modules based on PA, and the second scale network model comprises a second convolution layer, two RDB modules based on PA and a feature fusion module; the third scale network model comprises a third convolution layer, two RDB modules based on PA and a feature fusion module; the fourth scale network model comprises a fourth convolution layer, two RDB modules based on PA and a feature fusion module;
step 202, adopting a computer to establish a decoder model of the student network; the decoder model of the student network comprises a first decoding network model, a second decoding network model, a third decoding network model, a fourth decoding network model and a fifth convolution layer, wherein the first decoding network model comprises two PA-based RDB modules, and the second decoding network model comprises a first transpose convolution layer, two PA-based RDB modules and a feature fusion module; the third decoding network model comprises a second transpose convolution layer, two PA-based RDB modules and a feature fusion module; and the fourth decoding network model comprises a third transpose convolution layer, two PA-based RDB modules and a feature fusion module;
Step three, extracting features of the foggy training images:
step 301, using a computer to subject the foggy training image I to feature extraction through the first scale network model to obtain a first scale feature map F_e1;
step 302, using a computer to subject the first scale feature map F_e1 to feature extraction through the second scale network model to obtain a second scale feature map F_e2;
step 303, using a computer to subject the second scale feature map F_e2 to feature extraction through the third scale network model to obtain a third scale feature map F_e3;
step 304, using a computer to subject the third scale feature map F_e3 to feature extraction through the fourth scale network model to obtain a fourth scale feature map F_e4;
step 305, using a computer to subject the fourth scale feature map F_e4 to feature extraction through the first decoding network model to obtain a first decoding feature map F_d1;
step 306, using a computer to subject the first decoding feature map F_d1 to feature extraction through the second decoding network model to obtain a second decoding feature map F_d2;
step 307, using a computer to subject the second decoding feature map F_d2 to feature extraction through the third decoding network model to obtain a third decoding feature map F_d3;
step 308, using a computer to subject the third decoding feature map F_d3 to feature extraction through the fourth decoding network model to obtain a fourth decoding feature map F_d4, and to subject the fourth decoding feature map F_d4 to feature extraction through the fifth convolution layer to obtain an output defogged image out;
step 309, using a computer to process the foggy training image I with the EPDN teacher network model to obtain the EPDN teacher network output defogged image out_EP, and recording the feature map output by the global sub-generator in the EPDN teacher network model as the EPDN teacher network intermediate output feature map EP_1;
using a computer to process the foggy training image I with the PSD teacher network model to obtain the PSD teacher network output defogged image out_PS, and recording the feature map output by the main network in the PSD teacher network model as the PSD teacher network intermediate output feature map PS_2;
Step four, establishing a total loss function:
step 401, adopting a computer according to L_per = Σ_{i=1}^{n} (1/(C_i × H_i × W_i)) · (Φ_i(gt), Φ_i(out))_{L1} to obtain the perception loss function L_per; where i is a positive integer, n = 5, Φ_i(gt) represents the feature map of the fog-free training image gt corresponding to the foggy training image I output by the Relu i_1 layer in the VGG19 network model, Φ_i(out) represents the feature map of the defogged image out of the student network model output by the Relu i_1 layer in the VGG19 network model, and 1 ≤ i ≤ 5; C_i, H_i and W_i respectively represent the channel number, length and width of the feature map output by the Relu i_1 layer; (Φ_i(gt), Φ_i(out))_{L1} represents the Manhattan distance between the two feature maps output by the Relu i_1 layer in the VGG19 network model;
step 402, adopting a computer according to L_dist = (out, out_EP)_{L1} + (out, out_PS)_{L1} + 0.25·(EP_1, F_d2)_{L1} + 0.5·(PS_2, F_d3)_{L1} to obtain the distillation loss function L_dist; wherein (out, out_EP)_{L1} represents the Manhattan distance between the defogged image out output by the student network model and the EPDN teacher network output defogged image out_EP, (out, out_PS)_{L1} represents the Manhattan distance between the defogged image out output by the student network model and the PSD teacher network output defogged image out_PS, (EP_1, F_d2)_{L1} represents the Manhattan distance between the EPDN teacher network intermediate output feature map EP_1 and the second decoding feature map F_d2 of the student network model, and (PS_2, F_d3)_{L1} represents the Manhattan distance between the PSD teacher network intermediate output feature map PS_2 and the third decoding feature map F_d3 of the student network model;
step 403, adopting a computer according to L_loss = 0.1·L_per + L_dist to obtain the total loss function L_loss;
Training the student network model by the foggy training image:
step 501, adopting Adam optimization algorithm by computer and utilizing total loss function L loss Performing iterative optimization on the student network model until the training set is completely trained, and completing one-time iterative training;
Step 502, repeating the iterative training in step 501 until the iterative training preset times are met, and obtaining a trained student network model;
step six, defogging the single image by using the trained student network model:
and inputting any one foggy image into the trained student network model by adopting a computer for defogging processing, so as to obtain a fog-free image.
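Purely as a non-limiting illustration of steps 401 to 403, the loss terms could be assembled as below; `vgg_features` refers to the hypothetical extractor sketched earlier in this document, and the choice of L1 reductions is an assumption of this sketch:

```python
import torch.nn.functional as F

def perceptual_loss(out, gt):
    """L_per: Manhattan distance between VGG19 Relu i_1 feature maps,
    normalised by C_i * H_i * W_i (uses the vgg_features sketch above)."""
    loss = 0.0
    for f_out, f_gt in zip(vgg_features(out), vgg_features(gt)):
        c, h, w = f_out.shape[1:]
        loss = loss + F.l1_loss(f_gt, f_out, reduction='sum') / (c * h * w)
    return loss

def distillation_loss(out, out_ep, out_ps, ep1, f_d2, ps2, f_d3):
    """L_dist of step 402, with the 0.25 / 0.5 feature-level weights."""
    return (F.l1_loss(out, out_ep) + F.l1_loss(out, out_ps)
            + 0.25 * F.l1_loss(ep1, f_d2) + 0.5 * F.l1_loss(ps2, f_d3))

def total_loss(out, gt, out_ep, out_ps, ep1, f_d2, ps2, f_d3):
    """L_loss = 0.1 * L_per + L_dist (step 403)."""
    return 0.1 * perceptual_loss(out, gt) + distillation_loss(
        out, out_ep, out_ps, ep1, f_d2, ps2, f_d3)
```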
2. A single image defogging method based on multi-teacher knowledge distillation according to claim 1, wherein: in step 201, the number of convolution kernels in the first convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1;
the number of convolution kernels in the second convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, and the padding is 1;
the PA-based RDB module in step 201 includes a first conv+relu layer, a Conv1 convolution layer, an RDB module, a Conv2 convolution layer, and a Sigmoid activation function layer; the number of convolution kernels in the first Conv+ReLU layer is 32, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1; the number of convolution kernels in the Conv1 convolution layer is 32, the size of the convolution kernels is 1 multiplied by 1, the sliding step length is 1, and the padding is 0; the number of convolution kernels in the Conv2 convolution layer is 32, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 1, and the padding is 1;
The number of convolution kernels in the third convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step length is 2, and the padding is 1;
the number of convolution kernels in the fourth convolution layer is 256, the size of the convolution kernels is 3 multiplied by 3, the sliding step length is 2, and the padding is 1;
the feature fusion module in step 201 comprises a first Conv+InstanceNorm normalization+ReLU activation function layer and a second Conv+InstanceNorm normalization+ReLU activation function layer;
in step 202, the number of convolution kernels in the first transpose convolution layer is 128, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the second transpose convolution layer is 64, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the third transpose convolution layer is 32, the size of the convolution kernels is 3×3, the sliding step size is 2, the padding is 1, and the out_padding is 1;
the number of convolution kernels in the fifth convolution layer is 3, the size of the convolution kernels is 3×3, the sliding step size is 1, and the padding is 1.
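By way of non-limiting illustration, the layers whose hyper-parameters are recited in this claim can be written out as below; the input channel counts are not recited in the claim and are inferred from the feature-map sizes in the description, so they are assumptions of this sketch:

```python
import torch.nn as nn

# Encoder/decoder convolutions (kernel counts, 3x3 kernels, strides, padding as claimed).
first_conv  = nn.Conv2d(3,   32,  kernel_size=3, stride=1, padding=1)   # first convolution layer
second_conv = nn.Conv2d(32,  64,  kernel_size=3, stride=2, padding=1)   # second convolution layer
third_conv  = nn.Conv2d(64,  128, kernel_size=3, stride=2, padding=1)   # third convolution layer
fourth_conv = nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1)   # fourth convolution layer
fifth_conv  = nn.Conv2d(32,  3,   kernel_size=3, stride=1, padding=1)   # fifth convolution layer

# Transposed convolutions of the decoder (out_padding -> output_padding in PyTorch).
first_tconv  = nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1, output_padding=1)
second_tconv = nn.ConvTranspose2d(128, 64,  kernel_size=3, stride=2, padding=1, output_padding=1)
third_tconv  = nn.ConvTranspose2d(64,  32,  kernel_size=3, stride=2, padding=1, output_padding=1)
```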
3. A single image defogging method based on multi-teacher knowledge distillation according to claim 1, wherein: in step 301, a computer is used to perform feature extraction on the foggy training image I through the first scale network model to obtain the first scale feature map F_e1, and the specific process is as follows:
step 3011, performing feature extraction on the foggy training image I through the first convolution layer by a computer to obtain an input feature map F_in;
step 3012, inputting, by the computer, the input feature map F_in into one PA-based RDB module for feature extraction to obtain an intermediate output feature map F_out;
step 3013, inputting, by the computer, the intermediate output feature map F_out into the other PA-based RDB module according to the method described in step 3012 for feature extraction to obtain the first scale feature map F_e1.
4. A single image defogging method based on multi-teacher knowledge distillation according to claim 1, wherein: in step 302, the first scale feature map F_e1 is subjected to feature extraction through the second scale network model by a computer to obtain the second scale feature map F_e2, and the specific process is as follows:
step 3021, using a computer to map the first scale feature map F e1 Extracting features through a second convolution layer to obtain a second input feature map;
step 3022, inputting the second input feature map into a PA-based RDB module in the second scale network model by the computer to perform feature extraction, so as to obtain a second scale first coding feature map;
step 3023, inputting the second-scale first coding feature map into another PA-based RDB module in the second-scale network model by the computer for feature extraction, so as to obtain a second-scale second coding feature map;
Step 3024, the computer maps the first scale feature map F e1 Performing 0.5 times downsampling to obtain a first downsampling feature map;
step 3025, calling a splicing cat function module by a computer to splice the first downsampling feature map and the second-scale second coding feature map to obtain a first spliced feature map;
step 3026, inputting the first spliced feature map into a feature fusion module in the second scale network model by using a computer to obtain a second scale feature map F e2
In step 303, the second scale feature map F_e2 is subjected to feature extraction through the third scale network model by a computer to obtain the third scale feature map F_e3, and the specific process is as follows:
step 3031, using a computer to map the second scale feature map F e2 Extracting features through a third convolution layer to obtain a third input feature map;
step 3032, the computer inputs the third input feature map into a PA-based RDB module in the third-scale network model to perform feature extraction to obtain a third-scale first coding feature map;
step 3033, the computer inputs the third-scale first coding feature map into another PA-based RDB module in the third-scale network model for feature extraction to obtain a third-scale second coding feature map;
Step 3034, the computer maps the second scale feature map F e2 Performing 0.5 times downsampling to obtain a second downsampling feature map;
the computer maps the first scale characteristic map F e1 Performing 0.25 times downsampling to obtain a third downsampling feature map;
step 3035, a computer is adopted to call a splicing cat function module to splice the second downsampling feature map, the third downsampling feature map and the third-scale second coding feature map to obtain a second spliced feature map;
step 3036, inputting the second spliced feature map into a feature fusion module in the third-scale network model by adopting a computer to obtain a third-scale feature map F e3
In step 304, the third scale feature map F_e3 is subjected to feature extraction through the fourth scale network model by a computer to obtain the fourth scale feature map F_e4, and the specific process is as follows:
step 3041, computer-implemented third scale feature map F e3 Extracting features through a fourth convolution layer to obtain a fourth input feature map;
step 3042, inputting the fourth input feature map into a PA-based RDB module in a fourth-scale network model by a computer to perform feature extraction, so as to obtain a fourth-scale first coding feature map;
step 3043, inputting the fourth-scale first coding feature map into another RDB module based on PA in the fourth-scale network model by a computer for feature extraction to obtain a fourth-scale second coding feature map;
Step 3044 computer maps the third scale feature map F e3 Performing 0.5 times downsampling to obtain a fourth downsampling feature map;
the computer maps the second scale characteristic map F e2 Performing 0.25 times downsampling to obtain a fifth downsampling characteristic map;
the computer maps the first scale characteristic map F e1 Performing 0.125 times downsampling to obtain a sixth downsampled feature map;
step 3045, calling a splicing cat function module by a computer to splice the fourth downsampling feature map, the fifth downsampling feature map, the sixth downsampling feature map and the fourth-scale second coding feature map to obtain a third spliced feature map;
step 3046, inputting the third spliced feature map into a feature fusion module in the fourth-scale network model by using a computer to obtain a fourth-scale feature map F e4
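By way of non-limiting illustration, the fourth-scale stage of steps 3041 to 3046 can be sketched as follows; `fourth_conv`, `rdb_pair` and `fusion` are hypothetical stand-ins for the fourth convolution layer, the two PA-based RDB modules and the feature fusion module of the fourth scale network model:

```python
import torch
import torch.nn.functional as F

def fourth_scale_stage(f_e1, f_e2, f_e3, fourth_conv, rdb_pair, fusion):
    """Sketch of steps 3041-3046 of claim 4 (placeholder modules)."""
    x = fourth_conv(f_e3)                                         # fourth input feature map, 256x32x32
    enc = rdb_pair(x)                                             # fourth-scale second coding feature map
    d4 = F.interpolate(f_e3, scale_factor=0.5,   mode='nearest')  # fourth downsampled map, 128x32x32
    d5 = F.interpolate(f_e2, scale_factor=0.25,  mode='nearest')  # fifth downsampled map, 64x32x32
    d6 = F.interpolate(f_e1, scale_factor=0.125, mode='nearest')  # sixth downsampled map, 32x32x32
    cat = torch.cat([d4, d5, d6, enc], dim=1)                     # third spliced map, 128+64+32+256 = 480 channels
    return fusion(cat)                                            # fourth scale feature map F_e4
```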
5. A single image defogging method based on multi-teacher knowledge distillation according to claim 3, characterized in that: in step 3012, the computer inputs the input feature map F_in into a PA-based RDB module for feature extraction to obtain the intermediate output feature map F_out, and the specific process is as follows:
step A, the computer subjects the input feature map F_in to feature extraction through the first Conv+ReLU layer to obtain a feature map F_pre;
step B, the computer inputs the feature map F_pre into the Conv1 convolution layer and the RDB module for feature extraction to obtain a feature map F_RDB; at the same time, the feature map F_pre is input into the Conv2 convolution layer for convolution processing and normalized by the Sigmoid activation function to obtain a spatial weight map F_s;
step C, the computer obtains a feature map F_mid according to F_mid = F_RDB ⊙ F_s ⊕ F_pre, wherein ⊙ represents the Hadamard product operation between feature map matrices, and ⊕ represents the addition operation between feature map matrices;
step D, the computer obtains the intermediate feature map F_out according to F_out = F_mid ⊕ F_in.
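As a non-limiting sketch of the PA-based RDB module of this claim, assuming the readings of the step C and step D formulas given above; the `rdb` argument stands in for any 32-channel residual dense block, whose internal layout the claim does not fix:

```python
import torch.nn as nn

class PARDB(nn.Module):
    """PA-based RDB module (sketch): steps A-D of claim 5."""
    def __init__(self, ch=32, rdb=None):
        super().__init__()
        self.head  = nn.Sequential(nn.Conv2d(ch, ch, 3, 1, 1),
                                   nn.ReLU(inplace=True))   # first Conv+ReLU layer
        self.conv1 = nn.Conv2d(ch, ch, 1, 1, 0)             # Conv1: 1x1, stride 1, padding 0
        self.rdb   = rdb if rdb is not None else nn.Identity()  # placeholder RDB
        self.conv2 = nn.Conv2d(ch, ch, 3, 1, 1)             # Conv2: 3x3, stride 1, padding 1
        self.sigmoid = nn.Sigmoid()

    def forward(self, f_in):
        f_pre = self.head(f_in)                 # step A
        f_rdb = self.rdb(self.conv1(f_pre))     # step B: feature branch
        f_s   = self.sigmoid(self.conv2(f_pre)) # step B: spatial weight map
        f_mid = f_rdb * f_s + f_pre             # step C: Hadamard product plus skip
        return f_mid + f_in                     # step D: residual output F_out
```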
6. A single image defogging method based on multi-teacher knowledge distillation according to claim 3, characterized in that: in step 3026, step 3036, and step 3046, the first post-stitching feature map, the second post-stitching feature map, and the third post-stitching feature map are recorded as post-stitching feature maps, and the second scale feature map F e2 Third scale feature map F e3 And fourth scale feature map F e4 And (3) respectively recording the characteristic images as fused scale characteristic images, and inputting the spliced characteristic images into a characteristic fusion module by adopting a computer to obtain the scale characteristic images, wherein the specific process is as follows:
a1, performing feature processing on the spliced feature map through a first Conv+InstanceNorm normalization+ReLU activation function layer by adopting a computer to obtain a fusion coding feature map;
And A2, performing feature processing on the fusion coding feature map through a second Conv+InstanceNorm normalization+ReLU activation function layer by adopting a computer to obtain a fused scale feature map.
7. A single image defogging method based on multi-teacher knowledge distillation according to claim 1, wherein: in step 305, the fourth scale feature map F_e4 is subjected to feature extraction through the first decoding network model by a computer to obtain the first decoding feature map F_d1, and the specific process is as follows:
step 3051, using a computer to subject the fourth scale feature map F_e4 to feature extraction through one PA-based RDB module in the first decoding network model to obtain a first pre-decoding feature map;
step 3052, inputting the first pre-decoding feature map into another PA-based RDB module in the first decoding network model by the computer to perform feature extraction, thereby obtaining a first decoding feature map F d1
Step 306, the specific process is as follows:
step 3061, using a computer to subject the first decoding feature map F_d1 to feature extraction through the first transpose convolution layer to obtain a first decoded first upsampled feature map;
step 3062, performing feature extraction on the first decoded first upsampled feature map through two PA-based RDB modules in the second decoding network model by using a computer to obtain a first intermediate feature map;
Step 3063, using a computer to decode the first feature map F d1 2 times of up-sampling processing is carried out to obtain a first decoding second up-sampling feature map;
step 3064, a computer is adopted to call a splicing cat function module to splice the first intermediate feature map and the first decoding second up-sampling feature map, so as to obtain a first decoding splicing feature map;
step 3065, inputting the first decoding spliced feature map into a feature fusion module in the second decoding network model by using a computer to obtain a second decoding feature map F d2
Step 307, the specific process is as follows:
step 3071, using a computer to subject the second decoding feature map F_d2 to feature extraction through the second transpose convolution layer to obtain a second decoded first upsampled feature map;
step 3072, performing feature extraction on the second decoded first upsampled feature map by using a computer through two PA-based RDB modules in the third decoding network model to obtain a second intermediate feature map;
step 3073, using a computer to subject the first decoding feature map F_d1 to 4-times upsampling to obtain a second decoding second up-sampling feature map;
subjecting the second decoding feature map F_d2 to 2-times upsampling to obtain a second decoding third up-sampling feature map;
step 3074, calling a splicing cat function module by a computer to splice the second intermediate feature map, the second decoding second up-sampling feature map and the second decoding third up-sampling feature map to obtain a second decoding splicing feature map;
Step 3075, inputting the second decoding spliced feature map into a feature fusion module in the third decoding network model by using a computer to obtain a third decoding feature map F d3
Step 308, the specific process is as follows:
step 3081, using a computer to subject the third decoding feature map F_d3 to feature extraction through the third transpose convolution layer to obtain a third decoded first upsampled feature map;
step 3082, performing feature extraction on the third decoded first upsampled feature map by using a computer through the two PA-based RDB modules in the fourth decoding network model to obtain a third intermediate feature map;
step 3083, using a computer to subject the first decoding feature map F_d1 to 8-times upsampling to obtain a third decoding second up-sampling feature map;
subjecting the second decoding feature map F_d2 to 4-times upsampling to obtain a third decoding third up-sampling feature map;
subjecting the third decoding feature map F_d3 to 2-times upsampling to obtain a third decoding fourth up-sampling feature map;
step 3084, calling the splicing cat function module by a computer to splice the third intermediate feature map, the third decoding second up-sampling feature map, the third decoding third up-sampling feature map and the third decoding fourth up-sampling feature map to obtain a third decoding spliced feature map;
step 3085, inputting the third decoding spliced feature map into the feature fusion module in the fourth decoding network model by using a computer to obtain the fourth decoding feature map F_d4.
GR01 Patent grant